automated testing of Dataverse


Philip Durbin

Dec 3, 2015, 1:18:28 PM
to dataverse...@googlegroups.com
Hello! I'm writing to share with the community the state of automated testing of Dataverse as I see it and explain some recent developments.

I recently created https://github.com/IQSS/dataverse/issues/2746 to jot down some ideas for improving our automated testing and this email will make more sense if you take a quick peek at that issue. I expect that some of those bullet points might need more explanation so please feel free to ask if anything is unclear! :)

That issue is really just a brain dump at this point. There are *lots* of directions we could go with automated testing, and I'd love to hear your ideas as well. Please feel free to reply-all to this email, make some noise in http://chat.dataverse.org, or comment on that issue directly.

Just to set expectations: that issue already contains a good number of ideas (again, it's a brain dump), but it represents a *ton* of work. Bit by bit we hope to improve our automated testing, but it's a potentially long road. And the road will only grow longer as new ideas come to me (hopefully from some of you!) and I add more bullets to the description. :)

The main test automation task I've been working on is setting up a server at http://phoenix.dataverse.org so named because I drop the database and delete all the uploaded files on every build, allowing it to rise from the ashes. :)

I'm using Jenkins to automate the testing of that phoenix server, and I thought I'd give a quick walkthrough of how it works.

The first Jenkins job builds from the release branch we're currently working on (4.2.2 as of this writing). A war file is produced and copied to the phoenix server, along with some testing and setup scripts. If you want to really dig into this first job, you can see its console output at https://build.hmdc.harvard.edu:8443/job/phoenix.dataverse.org-build/

The second job takes the latest war file and the latest test scripts that were copied over from the first job and uses them on the phoenix server. As of this writing, this is roughly what happens on the phoenix server:

- database dropped and recreated as fresh/empty
- Solr emptied
- new war file deployed
- setup-all.sh run
- reference_data.sql applied
- test users, dataverses, and datasets created
- test file uploaded
- test dataset with a file published


Then, the second job does a simple test to download the published file. If it succeeds, it tells us that all of the steps above succeeded. Here's the second job (the console output is potentially interesting): https://build.hmdc.harvard.edu:8443/job/phoenix.dataverse.org-deploy/
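For the curious, that final download check is conceptually tiny. Here's a rough sketch in Java of what it boils down to (the file id is made up; the real job just fetches whichever file was published in the steps above and expects an HTTP 200):

import java.net.HttpURLConnection;
import java.net.URL;

public class DownloadSmokeTest {

    public static void main(String[] args) throws Exception {
        // Made-up file id; the real check targets whichever file was just published.
        URL url = new URL("http://phoenix.dataverse.org/api/access/datafile/42");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        int status = conn.getResponseCode();
        if (status != 200) {
            // A non-zero exit status is what marks the Jenkins build as failed.
            System.err.println("Download failed: HTTP " + status);
            System.exit(1);
        }
        System.out.println("Published file downloaded: HTTP " + status);
    }
}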

(Please don't worry about recent failures in builds for both jobs. I was testing that Jenkins correctly indicates failures!)

This is really just the tip of the iceberg of automated testing. What I've been describing by way of the phoenix example is mostly *integration* testing of a running server via API calls. The code also has some amount of *unit* testing that gets executed every time a war file is compiled. (Code coverage reports of unit tests are available at https://coveralls.io/github/IQSS/dataverse .)
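If you've never seen a unit test, here's a minimal, self-contained JUnit example of the general shape (illustrative only, not lifted from the Dataverse code base). Maven runs tests like this automatically every time the war file is built:

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class ExampleUnitTest {

    // Executed via `mvn test` as part of every build; no running server required.
    @Test
    public void persistentIdIsFormattedAsExpected() {
        String persistentId = String.format("doi:%s/%s/%s", "10.5072", "FK2", "ABCDEF");
        assertEquals("doi:10.5072/FK2/ABCDEF", persistentId);
    }
}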

I'd love to hear feedback on any of this! I hope you're as excited about automated testing as I am. If anyone would like to assist with this effort, please get in touch and I'll explain how you can help!

Again, please check out https://github.com/IQSS/dataverse/issues/2746

Thanks! Phew!

Phil

Ben Companjen

Dec 4, 2015, 6:37:40 AM
to dataverse...@googlegroups.com
Hi Phil,

I commented on the issue, but I wanted to applaud your efforts more widely. Thanks for sharing! 
I hope the rest of the team joins these efforts (I know they do, to some degree, but it can always be better), because manual quality assurance and quality control are labour- and time-intensive and, in many cases (no offence to QA staff), dull. Automated testing should also prevent regressions like the critical issues that have arisen recently (and that the team is addressing in upcoming fix releases).

Your approach of setting up a fresh instance and putting it through its paces is a good one, I think. Now all you need are representative tests and edge cases that let you check that all requirements are met. If the number of tests grows quickly, perhaps it's enough to run them once or twice a day rather than on every build, but you'll figure that out :)

Generally speaking, testing your software is something that distinguishes 'programming' from 'software engineering'. If you test your software well, you don't need to rely on a brand. So yes, more tests please!

Ben


Philip Durbin

Dec 4, 2015, 9:11:09 AM
to dataverse...@googlegroups.com
Thanks, Ben. I appreciate the comment and it was a pleasure chatting with you at http://irclog.iq.harvard.edu/dataverse/2015-12-03

There we discussed http://www.se-radio.net/2015/11/se-radio-episode-242-dave-thomas-on-innovating-legacy-systems/ and I may very well re-listen to that episode. I was shocked that Dave Thomas seemed to be down on unit tests, but it turns out he's really, really into testing; he just prioritizes user acceptance testing. (I'm sure there are lots of opinions out there about all of this!) As you pointed out, he's also really into QuickCheck, which is neat. Pretty fancy. I wouldn't know which Java library to try first.
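That said, the core QuickCheck idea (generate lots of random inputs and assert that an invariant holds for every one of them) can be sketched in plain JUnit without committing to any particular library. A toy example, just to show the shape:

import static org.junit.Assert.assertEquals;

import java.util.Random;

import org.junit.Test;

public class ReversePropertyTest {

    // Property: reversing a string twice yields the original string.
    @Test
    public void reversingTwiceYieldsTheOriginal() {
        Random random = new Random(42); // fixed seed so failures are reproducible
        for (int i = 0; i < 1000; i++) {
            String original = randomLowercaseString(random);
            String roundTripped = new StringBuilder(original).reverse().reverse().toString();
            assertEquals(original, roundTripped);
        }
    }

    private static String randomLowercaseString(Random random) {
        int length = random.nextInt(20);
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + random.nextInt(26)));
        }
        return sb.toString();
    }
}

A real QuickCheck-style library would also shrink failing inputs and generate smarter data, but the idea is the same.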

Anyway, we developers will need to pick and choose what to focus on when it comes to writing automated tests. We want the most bang for our buck. I'm glad you like the phoenix idea. :)

Please keep the feedback coming!

Phil



Philip Durbin

Dec 7, 2015, 4:52:43 PM
to dataverse...@googlegroups.com
Just a quick update that I've added a third Jenkins job to the mix. After the second job deploys the war file and gets everything set up, the new third job runs API tests using https://github.com/jayway/rest-assured against the running system at http://phoenix.dataverse.org

Here's an example of me adding a quick SWORD test to create a dataset: https://github.com/IQSS/dataverse/commit/967354f
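To show the shape of these tests without walking through the whole commit, here's a rough REST-assured sketch that creates a dataset via the native JSON API rather than SWORD (the API token and dataset JSON below are placeholders; substitute real values):

import static com.jayway.restassured.RestAssured.given;
import static org.junit.Assert.assertEquals;

import com.jayway.restassured.RestAssured;
import com.jayway.restassured.response.Response;

import org.junit.Test;

public class DatasetsIT {

    @Test
    public void testCreateDataset() {
        RestAssured.baseURI = "http://phoenix.dataverse.org";
        String apiToken = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"; // placeholder
        String datasetJson = "{ ... }"; // placeholder: a full datasetVersion JSON document
        Response response = given()
                .header("X-Dataverse-key", apiToken)
                .contentType("application/json")
                .body(datasetJson)
                .post("/api/dataverses/root/datasets");
        // 201 Created means the dataset landed in the root dataverse.
        assertEquals(201, response.getStatusCode());
    }
}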

What's neat is that I can see the trend line going up here as more of these API tests are added:

https://build.hmdc.harvard.edu:8443/job/phoenix.dataverse.org-apitest-4.2.3/

I'll attach a screenshot. These Jenkins URLs may change but please feel free to click into the test results and console log if you're interested in the gory details. :)

Happy testing!

Phil
[Attachment: Test_[Jenkins]_-_2015-12-07_16.40.49.png]

Philip Durbin

Jan 12, 2016, 4:46:40 PM
to dataverse...@googlegroups.com
I just gave a talk about API testing and thought folks on this list might be interested in the slides: http://bl.ocks.org/pdurbin/raw/814fd29916749523db9a

I'm still fixing typos at https://gist.github.com/pdurbin/814fd29916749523db9a, so please let me know if you spot any more. :)

Phil
[Attachment: 2016-01-12_Dataverse_API_Testing_-_2016-01-12_16.45.26.png]