Increasing the number of checkable steps would improve usability #1571

danwt · 2022-03-28T15:47:58Z

danwt
Mar 28, 2022

Problem description

In my experience, it can be hard to get the most from Apalache for model-based testing (MBT) because, for large specs, the number of steps it can check is quite limited. A very large, non-trivial model can be limited to as few as 2-3 steps when checking invariants, and medium to large models are often limited to 5-7 steps. 8+ steps can only be achieved in some cases or with very small or trivial models.

Apalache has many advantages for MBT, such as being able to handle unbounded integers and being able to use View and Trace invariants to check and generate traces for very interesting behaviors. These features far surpass what TLC offers. It is therefore unfortunate to be limited in the number of steps (where TLC is usually not as limited).

As an example, please consider these versions of a spec written for MBT of the staking module: Apalache and TLC. The spec is large, and I wrote it over a period of time. Early on, when the spec was smaller, I experimented and achieved better results with TLC, so after that I wrote a lot of the spec with TLC in mind. Still, in the Apalache version I tried, to the extent possible, to avoid antipatterns (there are no recursive operators, @@ chains, or CHOOSE, and I use Nat instead of small sets of Ints). There are admittedly a lot of uses of Fold for computing sums over sets, but I'm not sure what I could do to remove those.

The TLC version of the model can be checked up to 10+ steps quite easily, whereas Apalache cannot really get past 3 steps (or 2 if checking an invariant). This rules out using Apalache, because testing any meaningful behavior requires several actions.

Of course, the small scope hypothesis is a thing, and there are many instances where 5-7 steps are sufficient to test meaningful behavior. Still, being able to get there consistently, or to check 8+ steps more often, would be great.

I would like to add that the example I gave is concrete and fresh in memory, but I have noticed this as a trend when using Apalache. I think @andrey-kuprianov has had similar experiences.

The solution I'd like

N/A

Alternatives I've considered

N/A

Additional context

I spoke with @gabrielamafra about the phenomenon of the step limitation being a bigger problem than state explosion, and I'm happy that the Apalache team is discussing this.

danwt · 2022-03-28T15:50:37Z

danwt
Mar 28, 2022
Author

I could add to this issue in future when I have more concrete examples if that would be useful.

2 replies

konnov Mar 29, 2022
Maintainer

I have converted your issue into a discussion. This way it will not get lost among hundreds of issues, and we get a better discussion format here.

konnov Mar 29, 2022
Maintainer

Yes, please add more examples in this discussion. We need more case studies, if we want to improve the tool towards our use cases.

konnov · 2022-03-29T07:10:37Z

konnov
Mar 29, 2022
Maintainer

True. A large number of symbolic transitions seriously affects performance. As a temporary solution, I recommend using a transition filter, if you know how to restrict the scope of your execution space: https://apalache.informal.systems/docs/apalache/tuning.html
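As a rough illustration of how such a filter narrows the search, the Python sketch below mimics a regular expression over words of the form `step->transition`, which is the shape the tuning page describes. The filter string, the number of transitions, and the helper `allowed_transitions` are hypothetical, and full-match semantics is an assumption; the tuning page remains the authority on the actual option syntax.

```python
import re

def allowed_transitions(filter_regex: str, step: int, num_transitions: int) -> list[int]:
    """Return the transition indices that the filter admits at the given step."""
    pattern = re.compile(filter_regex)
    # Assumption: a transition is admitted when the whole word "step->t" matches.
    return [t for t in range(num_transitions)
            if pattern.fullmatch(f"{step}->{t}")]

# Hypothetical filter: step 0 must take transition 0, step 1 must take
# transition 5, step 2 may take transition 2 or 3, later steps are unrestricted.
FILTER = r"(0->0|1->5|2->[23]|[3-9]->.*|[1-9][0-9]+->.*)"

if __name__ == "__main__":
    for step in range(4):
        print(step, allowed_transitions(FILTER, step, num_transitions=8))
```

The point of the sketch is only that fixing the early steps leaves far fewer transitions to consider at each of those steps, at the cost of exploring a smaller part of the execution space.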

In the long run, we are looking for ways to parallelize the search by partitioning the execution space. @thpani, this is a very interesting conversation that is relevant for your work.

0 replies

konnov · 2022-03-29T07:20:42Z

konnov
Mar 29, 2022
Maintainer

Another thought is whether it's possible to decompose your spec into two or three smaller ones? It's probably non-trivial to do for MBT though.

1 reply

danwt Mar 29, 2022
Author

Going forward I will be trying to use a greater number of smaller specs rather than fewer large ones. I'm not sure how easy it is to decompose directly though, especially for MBT, as you say.

Kukovec · 2022-03-29T08:47:29Z

Kukovec
Mar 29, 2022
Maintainer

I know this is more of a general discussion, but in the case of this particular spec you've posted, there are a few things that you can do to improve performance. For example:
https://github.com/danwt/cosmos-sdk/blob/3ef8375c5521b921830cd3136bdf2132c6db0caa/x/staking/mbt/staking_apa.tla#L195-L200
is a pretty clear instance of an Apalache antipattern. I'd also try https://apalache.informal.systems/docs/apalache/profiling.html#profiling-your-specification.

3 replies

thpani Mar 29, 2022
Maintainer

Edit: turned into an issue: #1578

danwt Mar 29, 2022
Author

Sweet, thanks. I was not aware of the particular impact of ITE and EXCEPT in the context of folds.

konnov Apr 15, 2022
Maintainer

This same advice has also propagated into the manual: https://apalache.informal.systems/docs/tutorials/trail-tips.html#tip-1-use-tla-constructs-instead-of-explicit-iteration
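The linked spec lines are not reproduced in this thread, so the following Python analogy uses a made-up example rather than the actual spec. It contrasts building a map by folding conditional point updates (the shape that ITE and EXCEPT take inside a fold) with defining the same map in one declarative expression, which is roughly what a TLA+ function constructor does. All names and data are hypothetical.

```python
from functools import reduce

# Hypothetical data: validators and the delegations made to them.
VALIDATORS = {"v1", "v2", "v3"}
DELEGATIONS = [("v1", 10), ("v2", 5), ("v1", 7)]

def totals_by_fold():
    """Iterative style: start from an all-zero map and update one key per step."""
    def step(acc, delegation):
        validator, amount = delegation
        # Analogue of: IF validator \in DOMAIN acc THEN [acc EXCEPT ...] ELSE acc
        if validator in acc:
            return {**acc, validator: acc[validator] + amount}
        return acc
    return reduce(step, DELEGATIONS, {v: 0 for v in VALIDATORS})

def totals_declarative():
    """Declarative style: define each value directly, with no accumulator map."""
    return {v: sum(a for (w, a) in DELEGATIONS if w == v) for v in VALIDATORS}

if __name__ == "__main__":
    print(totals_by_fold() == totals_declarative())
```

Both functions compute the same map. The fold creates a fresh intermediate map at every step, and in a symbolic encoding each of those copies presumably has to be constrained separately, which would explain why the manual recommends the declarative form.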

konnov · 2022-06-03T12:30:31Z

konnov
Jun 3, 2022
Maintainer

To address this slowdown, we have added the random simulation mode, which randomly picks an enabled action at each step and does symbolic execution for the sequence of picked actions:

https://apalache.informal.systems/docs/apalache/running.html#12-simulator-command-line-parameters

In this mode, Apalache finds invariant violations for your spec in 10-20 seconds for P0, P1, ..., P8. But it gets stuck on P9. This is a good indicator of the need to simplify the specification. If it gets stuck in a random symbolic execution, the spec complexity is immense.

If you only need the model checker for test generation (even without checking an invariant), the simulation mode is reasonably fast. It can also save examples for all runs with the option --save-runs.
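The idea of picking one enabled action per step can be sketched on a toy model. The Python code below is a concrete random walk over a hypothetical counter model, not Apalache's implementation (which executes the picked sequence symbolically); the actions, the invariant, and the function names are all made up for illustration.

```python
import random

# Toy counter model: each action is (name, guard, effect) over an integer state.
ACTIONS = [
    ("Inc", lambda x: x < 10, lambda x: x + 1),
    ("Double", lambda x: 0 < x < 6, lambda x: 2 * x),
    ("Reset", lambda x: x >= 8, lambda x: 0),
]

def invariant(x):
    """Hypothetical property under test: the counter never reaches 7."""
    return x != 7

def simulate_run(length, rng):
    """Pick one enabled action at random per step; return (trace, violated)."""
    state, trace = 0, []
    for _step in range(length):
        enabled = [a for a in ACTIONS if a[1](state)]
        if not enabled:
            break  # deadlock: no action can fire
        name, _guard, effect = rng.choice(enabled)
        state = effect(state)
        trace.append((name, state))
        if not invariant(state):
            return trace, True
    return trace, False

def simulate(runs, length, seed=0):
    """Repeat independent runs; return the first violating trace, or None."""
    rng = random.Random(seed)
    for _run in range(runs):
        trace, violated = simulate_run(length, rng)
        if violated:
            return trace
    return None

if __name__ == "__main__":
    print(simulate(runs=1000, length=20))
```

Each run follows a single sequence of actions instead of considering every action at every step, which is why a run stays cheap even when the model has many actions. The trade-off is that a violation can be missed if no run happens to pick the right sequence.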

However, we should always remember that it is easy to write expressions of enormous complexity, such as Cardinality([SUBSET S -> SUBSET T]) > Cardinality(S). It takes some practice to avoid writing such busy-beaver expressions.
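To make the blow-up concrete: SUBSET S has 2^|S| elements, and a function set [A -> B] has |B|^|A| elements, so [SUBSET S -> SUBSET T] has (2^|T|)^(2^|S|) elements. The short Python sketch below just evaluates this formula; the function name is made up for illustration.

```python
def function_set_size(s_size: int, t_size: int) -> int:
    """Number of functions in [SUBSET S -> SUBSET T] for |S| = s_size, |T| = t_size."""
    domain = 2 ** s_size        # |SUBSET S|
    codomain = 2 ** t_size      # |SUBSET T|
    return codomain ** domain   # |[A -> B]| = |B| ^ |A|

if __name__ == "__main__":
    print(function_set_size(2, 2))  # 256
    print(function_set_size(3, 3))  # 16777216
    print(function_set_size(4, 4))  # 2 ** 64
```

Even with only four elements in each of S and T, the function set already has 2^64 members, so an innocent-looking one-line expression can describe an object far too large to reason about.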

0 replies
