e2e: `run-multiple.sh` script should save the dump of failed tests #4115

cason · 2024-09-18T06:59:58Z

This is a helper script to run multiple executions, one per manifest.

If an execution fails, it dumps some data to the standard output and then cleans the directory with the actual execution data:

cometbft/test/e2e/run-multiple.sh

Lines 24 to 35 in 276996a

    
    if ! ./build/runner -f "$MANIFEST"; then
        echo "==> Testnet $MANIFEST failed, dumping manifest..."
        cat "$MANIFEST"
        echo "==> Dumping container logs for $MANIFEST..."
        ./build/runner -f "$MANIFEST" logs
        echo "==> Cleaning up failed testnet $MANIFEST..."
        ./build/runner -f "$MANIFEST" cleanup
        FAILED+=("$MANIFEST")
    fi

It should instead create a temporary directory with the information of the failed execution, containing:

The network/${manifest%.toml} execution directory
The log of the current execution (log action)
It would be great to have the logs of every node that was part of the execution (as in e2e/docker: enable saving the produced logs to a file #4113)
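A minimal sketch of what the failure branch could call before `cleanup`, assuming the testnet directory is the manifest path without its `.toml` suffix. The names `save_failed_run`, `RUNNER`, `DUMP_ROOT`, `runner-logs.txt` and the `testnet` subdirectory are hypothetical and do not exist in `run-multiple.sh`; only the `logs` action is taken from the script above.

```shell
# Sketch only: none of these names are part of run-multiple.sh.
RUNNER="${RUNNER:-./build/runner}"          # runner binary used by the script
DUMP_ROOT="${DUMP_ROOT:-${TMPDIR:-/tmp}}"   # assumption: parent directory for dumps

# save_failed_run MANIFEST
# Copies the manifest, the testnet directory, and the output of the runner's
# "logs" action into a fresh temporary directory, then prints that directory.
save_failed_run() {
	local manifest="$1"
	local testnet_dir="${manifest%.toml}"   # e.g. networks/ci.toml -> networks/ci
	local dump_dir
	dump_dir="$(mktemp -d "$DUMP_ROOT/e2e-failed-$(basename "$testnet_dir").XXXXXX")"

	cp "$manifest" "$dump_dir/manifest.toml"
	if [ -d "$testnet_dir" ]; then
		cp -R "$testnet_dir" "$dump_dir/testnet"
	fi
	# Capture the logs action; keep going even if it fails.
	"$RUNNER" -f "$manifest" logs >"$dump_dir/runner-logs.txt" 2>&1 || true

	echo "$dump_dir"
}
```

In the loop, this would be called as `DUMP="$(save_failed_run "$MANIFEST")"` right before the `cleanup` action, so the data is copied out while it still exists.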


andynog · 2024-09-20T14:01:55Z

@cason I've already implemented this logic. It's the concept of runner executions, and I was using it to better test the e2e nightly failures.

https://github.com/cometbft/cometbft/tree/andy/e2e-preserve-logs

You can run with "executions" enabled:

./build/runner -f networks/nightly/gen-group01-0004.toml --execution

I[2024-09-20|13:53:42.069] Starting initial network nodes...            
I[2024-09-20|13:53:42.070] load                                         msg="Starting transaction load (16 workers)..."
....
I[2024-09-20|13:55:17.123] wait until                                   msg="Waiting for all nodes to reach height 51..."
I[2024-09-20|13:55:26.540] Running all tests in ./tests/...             
ok      github.com/cometbft/cometbft/test/e2e/tests     46.244s
I[2024-09-20|13:56:15.085] saving execution                             msg="saving e2e network execution information"

It saves all the logs and the manifest used in each run (I could add logic to save the node data/config too). The nice thing is that you can run the same manifest multiple times, maybe modifying a parameter, and still keep the old results, because each run is saved in its own timestamped folder:

./networks_executions/gen-group01-0004
└── 20240920_095615
    ├── manifest.toml
    └── nodes
        ├── validator01
        │   ├── docker-errors.log
        │   └── docker.log
        ├── validator02
        │   ├── docker-errors.log
        │   └── docker.log
        ├── validator03
        │   ├── docker-errors.log
        │   └── docker.log
        └── validator04
            ├── docker-errors.log
            └── docker.log
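The layout above can be approximated in shell as follows. This is only an illustration of the per-run timestamped folder, not the code on the `andy/e2e-preserve-logs` branch; the function name `new_execution_dir` and its optional `ROOT` argument are hypothetical.

```shell
# Sketch only: new_execution_dir is a hypothetical helper, not part of the runner.
# new_execution_dir MANIFEST [ROOT]
# Creates ROOT/<manifest name>/<YYYYMMDD_HHMMSS>/nodes, copies the manifest
# next to it as manifest.toml, and prints the new directory.
new_execution_dir() {
	local manifest="$1"
	local root="${2:-./networks_executions}"
	local name
	name="$(basename "${manifest%.toml}")"
	local dir
	dir="$root/$name/$(date +%Y%m%d_%H%M%S)"

	mkdir -p "$dir/nodes"
	cp "$manifest" "$dir/manifest.toml"
	echo "$dir"
}
```

Because the timestamp is part of the path, re-running the same manifest a second or more later lands in a sibling folder and leaves earlier results untouched.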

This logic can be expanded on many fronts. Some ideas: parse the results and generate automated reports to cut the time it currently takes to comb through them, or eventually diff the results of runs that differ only by a tweak 😉

I just didn't have time to port this to main because I was busy investigating the nightly failures, but if you think this is the right direction and should be prioritized, I'd be happy to work on it.

cason · 2024-09-23T08:19:21Z

@andynog, this is great. Should we open a PR for this?

(could add logic to save the node data/config too)

Yes, I would keep everything.

andynog · 2024-09-23T14:20:17Z

@cason, I'll try to find time this week to push a PR with this logic 👍

cason · 2024-09-24T16:55:56Z

Thank you for this code, @andynog, great work.

I was wondering, however, whether we should add a command to the runner to dump all this information. That way we can run the multiple actions one by one and, after stopping the network but before cleaning up the execution, "manually" dump the relevant data.

OK, it is true that the Comet home directories for every node are under networks/<manifest> and won't change once we invoke the stop command. But this is not true for the logs, is it?

cason · 2024-11-08T09:07:44Z

This issue can be associated with #4113.

andynog · 2024-12-12T15:33:44Z

PR #4165 was closed since the solution to this problem needs additional design considerations that must be addressed and defined before any logic can be implemented.

cason added the enhancement (New feature or request), e2e (Related to our end-to-end tests), and testing (related to unit testing in general) labels Sep 18, 2024

andynog added this to CometBFT Sep 20, 2024

github-project-automation bot moved this to Todo in CometBFT Sep 20, 2024

andynog self-assigned this Sep 23, 2024

andynog moved this from Todo to In Progress in CometBFT Sep 23, 2024

andynog added a commit that referenced this issue Sep 23, 2024

add changelog entry, README info, shell script change (#4115)

21645be

andynog mentioned this issue Sep 23, 2024

test(e2e): option to save runner executions #4165

Closed

3 tasks

andynog removed their assignment Dec 12, 2024
