Hey Everyone,
I talked to my boss on Friday and he gave me the OK to work with the community to create a method of parallel execution (this goes well beyond the abilities of pabot). We already have this internally, but it was somewhat hacked together, so I'm starting over from scratch. If anyone is interested in contributing, I would be more than grateful.
I've created a repository here on GitHub.
So let me outline what we've accomplished and what we're hoping to accomplish over the next few weeks.
Old Setup
I work for a networking company, and we wanted the ability to run a test on all the units in a given topology. The goal was as follows:
1.) Open a topology file
2.) Parse all the units in a topology and make them available to RF
3.) Run specified tests in parallel, with each being given a single unit from the list of units
4.) Wait for all parallel instances to finish execution
5.) Join the results seamlessly to the results file at the end of execution
Our method for implementing this was to utilize the listener interface. We tag any test cases that should be run concurrently with "parallel." We also define which topology should be used as metadata in the top-level test suite. I'll give a brief example of what one of our tests could look like.
Example Test Suite
*** Settings ***
Metadata          Topology    /path/to/topology.yaml
Resource          IPv6_Resource

*** Test Cases ***
Some serial test
    Do serial stuff

Verify IPv6
    [Tags]    parallel
    Connect to unit
    Configure IPv6
    Test IPv6 Connectivity
    Disconnect from unit

Another serial test
    Do serial stuff
Our listener interface does the following.
Listener Flow
suite_start:
For the topmost suite
load in the topology data
start_test:
if test has parallel tag
for every unit in topology (minus one that will run in the main process)
call a special Robot Framework runner script which uses the Robot API
* Uses the API to strip off the suite setup and teardown
* Only includes the particular test to be run in parallel
end_test:
Wait for parallel tests to finish before continuing
output_file:
parse the output file and join all the results in a compact way (more on this in a second)
So that's basically the flow of it. There are some other things going on behind the scenes, but they aren't important for the moment.
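To make that flow a bit more concrete, here's a minimal sketch of what such a listener could look like, assuming the v2 listener API. This is not our internal code; load_topology and run_test_on_unit are placeholder helpers standing in for the topology parser and the separate runner script.

    import multiprocessing

    def load_topology(path):
        # Placeholder: would parse the topology YAML and return unit objects.
        return []

    def run_test_on_unit(test_name, unit):
        # Placeholder: would invoke the separate runner script that uses the
        # Robot API to run just this one test against the given unit.
        pass

    class ParallelListener(object):
        ROBOT_LISTENER_API_VERSION = 2

        def __init__(self):
            self.units = []
            self.processes = []

        def start_suite(self, name, attrs):
            # Only the topmost suite carries the topology metadata
            # ('Topology' matches the metadata name in the example above).
            topology = attrs['metadata'].get('Topology')
            if topology and not self.units:
                self.units = load_topology(topology)

        def start_test(self, name, attrs):
            if 'parallel' in attrs['tags']:
                # One unit stays with the main RF process; the rest each
                # get their own runner process.
                for unit in self.units[1:]:
                    proc = multiprocessing.Process(target=run_test_on_unit,
                                                   args=(name, unit))
                    proc.start()
                    self.processes.append(proc)

        def end_test(self, name, attrs):
            # Block here until every parallel instance has finished.
            for proc in self.processes:
                proc.join()
            self.processes = []

        def output_file(self, path):
            # Hand the main output.xml to the custom merger (more on this below).
            pass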
Looking back at our example test suite, I'll demonstrate how that suite would execute.
Test Execution
+ Some serial test runs
+ Verify IPv6 hits start_test
1st parallel instance of Verify IPv6 begins
2nd instance begins
...
+ Verify IPv6 actually begins*
+ Verify IPv6 hits end_test
Waits for all parallel instances to complete
+ Execution continues as normal (you know the rest)
* We let the main RF instance run a single test because we can't actually control the flow of execution from a listener.
When running this from RIDE you'd only see that Verify IPv6 ran once, and the pass or fail would be based on that run. This is another restriction of the listener interface: it can't alter the test status (PASS/FAIL). Not a big deal, but it definitely would be nice.
The last thing to talk about is what the actual output looks like for a test case. I regret that I don't have a screenshot of it to show, but maybe I can post one on Monday. The output merging is a really hacky effort on my part to get things to look nice; RF's default result merging just wasn't up to snuff. We have a situation where there's one main XML file (containing all the serial runs) and any number of other XML files that were created by the parallel tests. Combining these all with rebot was a nightmare, so I wrote a custom merger to do the following:
Output Merger Flow
1.) Find the serial test case that matches a parallel test
2.) Create a new suite around the serial test with the same name as the test, except with "(parallel)" appended at the end
3.) Append the unit name to the end of the serial test case name
4.) Scrape the test case data from the parallel output file and append that below the serial test case
5.) Repeat 4 for all other parallel results of this test
6.) Repeat all for any other parallel tests
The results would look something like this:
Example Results
[-] Test Case Some serial test
[-] ... some content
[-] Test Suite Verify IPv6 (parallel)
[-] Test Case Verify IPv6 (NV7100)
[-] Test Case Verify IPv6 (NV5350)
... etc
[-] Test Case Another serial test
[-] ... some content
So the dummy suite acts as a container for all the parallel runs. This way it's easy to collapse all the instances of that parallel test case without having them clutter your log.
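For anyone curious, here's a rough sketch of that merging idea using robot.api.ExecutionResult. This isn't our internal merger: it assumes a parallel_outputs mapping (test name to the output.xml files from that test's parallel runs) is built elsewhere, assumes each parallel output file contains a single flat suite, and skips the unit-name renaming step.

    from robot.api import ExecutionResult

    def merge(main_output, parallel_outputs, merged_output):
        # main_output: output.xml from the main (serial) run
        # parallel_outputs: {test name: [output.xml paths from its parallel runs]}
        result = ExecutionResult(main_output)
        top = result.suite
        serial_tests = []
        for test in top.tests:
            extra_files = parallel_outputs.get(test.name, [])
            if not extra_files:
                serial_tests.append(test)
                continue
            # Dummy suite named after the test with "(parallel)" appended.
            container = top.suites.create(name='%s (parallel)' % test.name)
            # The serial run plus every parallel run go inside the container.
            container.tests.append(test)
            for path in extra_files:
                for parallel_test in ExecutionResult(path).suite.tests:
                    if parallel_test.name == test.name:
                        container.tests.append(parallel_test)
        # Only the tests that stayed serial remain directly on the top suite.
        top.tests = serial_tests
        result.save(merged_output)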
As you can tell this is a rather complex system... but, it gives us the ability to run tests in parallel. To our team that's huge.
Proposed Setup
With the new interface I'm looking to use the 2.8.5 feature of having a conjoined library/listener. Also, I'd like to add the ability to run entire suites in parallel as opposed to only test cases. Here's what I envision the interface looking like.
Parallel Suite
*** Settings ***
Library           Parallel    suite    ${topology_units}

*** Test Cases ***
Blah
    you get the idea
Here you simply specify what level of parallelization you desire and what you would like to iterate over. '${topology_units}' would be a list variable defined elsewhere (more on that under Data Availability below).
Parallel Tests
*** Settings ***
Library           Parallel    ${topology_units}    # The first argument (scope of parallelization) is optional, with the default value 'test'

*** Test Cases ***
My Parallel Test
    Run Parallel
    Do this
    Do that
The parameter passed into the library here would propagate to all the tests that were specified to run in parallel.
Alternate Parallel Test
*** Settings ***
Library           Parallel

*** Test Cases ***
My New Parallel Test
    Run Parallel    ${topology_units}
    Do stuff
In this form a list of items to iterate over is passed directly into the Run Parallel keyword. It would override anything passed in at the suite level (i.e. any arguments passed as library parameters).
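Here's a very early sketch of what the library/listener hybrid could look like, leaning on the RF 2.8.5 feature that lets a test library register itself as its own listener. Nothing here is final, and the actual spawning/waiting logic is deliberately omitted.

    class Parallel(object):
        ROBOT_LIBRARY_SCOPE = 'GLOBAL'
        ROBOT_LISTENER_API_VERSION = 2

        def __init__(self, *args):
            # The library instance doubles as its own listener (new in 2.8.5).
            self.ROBOT_LIBRARY_LISTENER = self
            args = list(args)
            # First argument may be the scope ('test' or 'suite'); anything
            # after that is the default list of items to iterate over.
            self._scope = args.pop(0) if args and args[0] in ('test', 'suite') else 'test'
            self._items = args.pop(0) if args else []
            self._runners = []

        def run_parallel(self, *items):
            # Items given to the keyword override the library-level default.
            items = list(items) or self._items
            # ...spawn one parallel runner per item here (details TBD)...

        def _end_test(self, name, attrs):
            # ...block until any spawned runners have finished...
            self._runners = []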
Data Availability
One of the big objectives of being able to run parallel tests in this way is to let certain tests access certain data. For our usage, we want to be able to access unit objects from a topology. For this part, I'm really open to ideas. One method would be to make the list variable (topology_units in the examples) a list of dictionaries. Each dictionary would be passed to one parallel run and turned into variables, i.e. [ { 'unit1': <obj>, ... } ] would translate into ${unit1} for the first parallel run.
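As a sketch of that "dictionary becomes variables" idea, each parallel run could have its dictionary injected as test variables before the test body executes, e.g. with BuiltIn (inject_unit_variables is just an illustrative helper name):

    from robot.libraries.BuiltIn import BuiltIn

    def inject_unit_variables(data):
        # data would be one dictionary from the topology_units list, e.g.
        # {'unit1': <obj>, 'unit2': <obj>}; each key becomes a test variable
        # (${unit1}, ${unit2}, ...) visible only to that parallel run.
        for name, value in data.items():
            BuiltIn().set_test_variable('${%s}' % name, value)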
Conclusion
While I haven't laid out the exact implementation details for the new interface, I hope it excites some of you about the possibility of having parallel tests. Also, if any of the NSN team knows of an easier way to go about this, then by all means let me know. Hopefully I'll have a prototype by the end of next week, so be sure to keep an eye on the repository if you're interested. If anyone has any detailed questions I'd be more than happy to answer them.
Thanks everyone!