SilverStripe, Testing and Continuous Integration


Gordon Anderson

Jan 28, 2016, 4:01:55 AM
to silverst...@googlegroups.com
Dear fellow developers,

I've recently been experimenting with testing SilverStripe modules using third party continuous integration services, and feel it would be useful to document what I've found and learned, as well as ask for feedback on some issues that either require fixes or discussion.


The main issue that previously stopped me writing tests was speed.  If a test takes 30 seconds to a minute to run it's easy to lose focus.  The documentation said to use SQLite, but this is still slow unless you add the tweak to use an in-memory database rather than the file system.  I've had a pull request accepted documenting the in-memory config change, see https://github.com/silverstripe/silverstripe-framework/blob/3/docs/en/02_Developer_Guides/06_Testing/00_Unit_Testing.md - the section "Use SQLite In Memory".  I had to ask around on IRC to get this info initially.  Tests will now run around 20 to 30 times faster; feedback in 2 seconds is far better than feedback in 40...
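
For reference, the change amounts to pointing the SQLite path at ':memory:' - a rough sketch only, the linked "Use SQLite In Memory" section is authoritative and the exact keys may vary with the version of the sqlite3 module you have installed:

// mysite/_config.php - guard this so it only applies to dev/test environments
global $databaseConfig;
$databaseConfig['type'] = 'SQLite3Database';
// ':memory:' keeps the test database in RAM rather than on disk,
// which is where the 20-30x speed-up comes from
$databaseConfig['path'] = ':memory:';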

I initially started writing tests for a search module using Elasticsearch, https://github.com/gordonbanderson/silverstripe-elastica.  My aim was to get as close as possible to 100% test coverage.  This eventually meant running the test suite overnight to generate coverage reports, as it ended up taking around 12 hours on my laptop.  Whilst this is an extreme example (more than trivial fixture data is needed to test searching), it can still take in the region of 30 to 60 minutes for a 'typical' SilverStripe module.  It should be noted that running without generating coverage is roughly 30 times faster.

As I did not wish to tie up my laptop generating coverage reports, I decided to look into third party services to do this for me.  As SilverStripe uses Travis I read up on it and, after some experimentation, created a config file that is reasonably generic.  This file, https://github.com/gordonbanderson/SilverStripeExampeModuleCI/blob/master/.travis.yml, can simply be copied; the only change required is editing the ENV variable 'MODULE_PATH' to the path the module is installed at by Composer (note, this may differ from the GitHub project name).  It does the following:
* Creates a working SilverStripe project with your module
* Executes the module's test suite against various versions of PHP and SilverStripe (the various combinations of versions are known as a matrix in Travis speak)
* For one combination only, runs the tests and generates a coverage report.  This is triggered by the ENV variable COVERAGE being set to 1 (https://github.com/gordonbanderson/SilverStripeExampeModuleCI/blob/master/.travis.yml#L37).  All of the other combinations in the matrix run with COVERAGE=0, and are thus much faster.  (A trimmed sketch follows after this list.)
* After a successful run (this can be changed to also work with test failures), uploads the code coverage report to Scrutinizer and codecov.io
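
To give a flavour without reproducing the whole file, here is a trimmed, hypothetical sketch of the shape of that config - the linked .travis.yml is the authoritative version, and the package names, paths and version numbers below are only illustrative:

language: php

php:
  - 5.6

env:
  global:
    - MODULE_PATH=example-module   # composer install path, may differ from the GitHub repo name
  matrix:
    - DB=MYSQL CORE_RELEASE=3.1 COVERAGE=0
    - DB=MYSQL CORE_RELEASE=3.2 COVERAGE=0
    - DB=MYSQL CORE_RELEASE=3.2 COVERAGE=1   # only this job pays the coverage penalty

before_script:
  - git clone git://github.com/silverstripe-labs/silverstripe-travis-support.git ~/travis-support
  - php ~/travis-support/travis_setup.php --source `pwd` --target ~/builds/ss
  - cd ~/builds/ss

script:
  - if [ "$COVERAGE" = "0" ]; then vendor/bin/phpunit $MODULE_PATH/tests/; fi
  - if [ "$COVERAGE" = "1" ]; then vendor/bin/phpunit --coverage-clover=coverage.xml $MODULE_PATH/tests/; fi

after_success:
  - if [ "$COVERAGE" = "1" ]; then bash <(curl -s https://codecov.io/bash); fi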

Scrutinizer is a third party service that uses static analysis tools to give your code a score based on various metrics, such as the number of paths through if-then-else spaghetti, unused variable declarations and compliance with coding standards.  I'm in two minds about using the coverage upload here as it seems to fail a lot of the time, and it only provides a coverage badge, not full code coverage reports.  It is worth using for highlighting logic hotspots and messy/lazy coding practices though.

I tried uploading code coverage to Coveralls but there seemed to be an extra configuration step required to get the source code display working, and there is a lot of anecdotal evidence to suggest it's been a problem for many people.  I couldn't get it working either... :(  I tried CodeCov and it 'just worked'.  CodeCov also has a browser add-on that overlays code coverage whilst browsing your code on GitHub.

Example reports:
- Scrutinizer report for my version of the Mappable module https://scrutinizer-ci.com/g/gordonbanderson/Mappable/
- CodeCov report for the same module https://codecov.io/github/gordonbanderson/Mappable.  Looking at an example file that lacks 100% coverage, the untested lines are shown with a light red background: https://codecov.io/github/gordonbanderson/Mappable/code/LatLongField.php?ref=56fd6dce7f2d8d5ba97a0633450e46b1dbde5915 (in this case the last method in the class).

CodeCov is not perfect; two issues I've yet to resolve:
1) I cannot see how to change the default branch after I've changed it in GitHub.
2) I've not figured out the criteria for a class appearing in a CodeCov report when it has 0% coverage.  Sometimes I see them, sometimes I do not.  This can result in either a falsely high or falsely low coverage percentage being reported.  I'm not sure if this is a CodeCov or a phpunit configuration issue (see the sketch below) - if anyone can help resolve this, or at least identify the cause, that would be great.
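
If it does turn out to be a phpunit issue, my guess is it comes down to the coverage whitelist: with a whitelist defined and uncovered files added to it, classes with 0% coverage should always be listed.  A hedged phpunit.xml sketch (directory names are illustrative, not from any particular module):

<phpunit bootstrap="framework/tests/bootstrap.php" colors="true">
    <testsuites>
        <testsuite name="Default">
            <directory>mymodule/tests/</directory>
        </testsuite>
    </testsuites>
    <filter>
        <!-- with addUncoveredFilesFromWhitelist enabled, every whitelisted file
             is reported, even ones no test touches (i.e. 0% coverage) -->
        <whitelist addUncoveredFilesFromWhitelist="true">
            <directory suffix=".php">mymodule/code/</directory>
            <exclude>
                <directory>mymodule/tests/</directory>
            </exclude>
        </whitelist>
    </filter>
</phpunit>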

Where CodeCov really does win is that it can join multiple coverage reports for any given Travis build.  Many modules optionally rely on others, and make checks like the following:

if (class_exists('Translatable')) {
    // do the task for every configured locale
} else {
    // do the task for the single default locale
}

If test coverage is generated with the Translatable module either installed or not installed, only one branch of the above if statement will ever be executed.  However, if both cases are catered for - one build with Translatable and one without - it's possible to get 100% coverage once the reports are joined.

With prodding from @tractorcow, I have changed the Travis setup tool to allow packages to be installed beyond those defined in composer.json.  This is most useful for installing suggested modules for the scenario above.  Multiple packages can now be installed with either multiple '--require provider/package_name' options or a single option split by commas, '--require provider1/package1,provider2/package2'.  See https://github.com/silverstripe/silverstripe-comments/blob/master/.travis.yml#L47 for a working example.  I've added a lot of tests to the SilverStripe comments module, hopefully as an example for others to follow, increasing coverage from 54% to 92%.  One of the suggested packages, HTML Purifier, which was not previously part of the Travis build (some tests were skipped because the module was not installed), triggered a now-fixed bug.
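
For illustration, the relevant setup line ends up looking something like this (the package names here are just examples):

before_script:
  # install optional/suggested dependencies alongside the module under test
  - php ~/travis-support/travis_setup.php --source `pwd` --target ~/builds/ss --require silverstripe/translatable,ezyang/htmlpurifier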


With the above tools in place, and your project published on Packagist, it is possible to add badges to your README file.  These give an indication of code quality and care: does the test suite pass, the percentage of coverage, the Scrutinizer code quality score, the number of times a package has been downloaded from Packagist, the number of references in Packagist, etc.  I *may* have gone a bit over the top, but have a look at these examples:

- https://github.com/gordonbanderson/wot-faq (this example has no tests, but I was quite surprised it has been downloaded over 300 times.  No README either!)

And where I have managed to infiltrate a couple of SilverStripe owned modules...

My choice of badges might not be the ideal one, but I think it would be good to see consistency across SilverStripe modules in this regard.  With this consistency in mind, and because I found changing the badges for different branches/owners tedious, I wrote a tool to add the badges to a README.md file, see https://github.com/gordonbanderson/Badger

[BTW, just an idea I've had whilst writing this: some kind of badge for whether the module has no, some, or lots of documentation?  A hard one for a computer to calculate though.]


If your module has no testing or continuous integration whatsoever, as many of mine didn't, you can use the two scripts at https://github.com/gordonbanderson/SilverStripeTestTools to get up and running.  Instructions are included in the README.


When I tested some of the SilverStripe modules using the techniques outlined above, coverage was lower than expected:
* Comments 54%
* Blog 57% (and only half of the files were covered, due to the CodeCov issue previously mentioned)

Testing larger projects such as the SilverStripe framework is trickier, in that tests have to be run in parallel to avoid exceeding the 50-minute limit for Travis jobs.  The technique is to set an environment variable in the matrix and use it as a switch to run a subsection of the test suite.  Have a look at https://github.com/gordonbanderson/silverstripe-framework/blob/TESTEVAL/.travis.yml which splits the framework tests into 25 parallel runs, one for each test directory.  The number 25 could be reduced by checking the execution times for each directory of tests in the Travis build, e.g. https://travis-ci.org/gordonbanderson/silverstripe-framework/builds/105353194 - groups of shorter tests could be run in sequence, which would reduce the number of times travis_setup is executed.
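
The switch itself is nothing fancy - a much-reduced, hypothetical sketch of the idea (the linked .travis.yml has the full 25-way version):

env:
  matrix:
    - TESTDIR=core
    - TESTDIR=model
    - TESTDIR=forms     # ...one entry per test directory

script:
  - vendor/bin/phpunit framework/tests/$TESTDIR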

The coverage report for the framework can be found at https://codecov.io/github/gordonbanderson/silverstripe-framework?ref=9864567618cf7b0531a41e2db0510bfb171ef113 , currently reported as 50.32%.  However this figure is artificially low: it appears that anything matching *Test.php is excluded from coverage logging, yet those files still appear in the CodeCov report as untested code.  If anyone can figure out why that is, please let me know :)

Using the number of lines reported by CodeCov, and removing the test directory from consideration, the maths becomes 100*(31295-110)/(62186-12853), or 63.2%.  Whilst the coverage percentage may currently be slightly off, the report is still useful for determining where tests are missing.  For example, critical core code such as DataObject and DataList does not have 100% test coverage.


Testing isn't too hard once the above steps are taken, and being able to get a coverage report and an indication of code quality after every push to GitHub is extremely useful.  With the above information there really is no excuse for not adding tests :)


A few questions
* Is there a definitive .editorconfig file for modules?  There really ought to be one.
* Same question regarding Scrutinizer configuration.
* Is there a definitive list of SilverStripe and PHP versions to test against when using Travis?
* What are a suitable selection of badges to standardize on for modules?
* Should there be a concerted effort to set up code coverage for all SilverStripe owned modules, the CMS and the Framework, and then aim to get test coverage to, say, 90% plus?  I realize of course that budgeting issues may prevent this.


Look forward to your feedback.

Regards, and happy testing :)

Cheers

Gordon [Anderson]

(nontgor on IRC)

Christopher Pitt

Jan 28, 2016, 12:19:27 PM
to SilverStripe Core Development
Hey Gordon,

Thanks for posting this - I definitely think it would make a good blog post! Would you be keen to write one for the silverstripe.org blog?


"Is there a definitive .editorconfig file for modules? There really ought to be one."

By the look of things, a little robot seems to be recommending the same .editorconfig file for all modules.

"Same question regarding Scrutinizer configuration."

I haven't seen a helpfulrobot pull request, for Scrutinizer config, in a long time. I expect it may start happening before long though...

"Is there a definitive list of SilverStripe and PHP versions to test again when using Travis?"

What usually works well (at least for me) is testing the latest stable minor version (like 3.2) with every minor version of PHP that SilverStripe supports. Then I also include PHP 7 (but allow it to fail), and a test for PGSQL and a test for SilverStripe master branches. This covers quite a few scenarios...


"What are a suitable selection of badges to standardize on for modules?"

Depends on the CI services I usually connect. I usually have "build", "code quality", "version", and "license". Sometimes I will have "coverage", but I've had trouble connecting that up in the past.

"Should there be a concerted effort to setup code coverage for all SilverStripe owned modules, the CMS and the Framework, and then aim to get test coverage to say 90% plus?"

That would definitely be cool, but also a massive time-sink. Getting most SilverStripe modules to have at least 1 test (for the most critical feature) is easier to achieve IMO. There's something like 1.4k SilverStripe modules on Packagist...

Gordon Anderson

Jan 28, 2016, 10:43:17 PM
to silverst...@googlegroups.com
hi Chris

On Fri, Jan 29, 2016 at 12:19 AM, Christopher Pitt <cgp...@gmail.com> wrote:
Hey Gordon,

Thanks for posting this - I definitely think it would make a good blog post! Would you be keen to write one for the silverstripe.org blog?

If you can send me any relevant guidelines please do so off list.  The email did get a bit longer than I'd originally intended...
 


"Is there a definitive .editorconfig file for modules? There really ought to be one."

By the look of things, a little robot seems to be recommending the same .editorconfig file for all modules.

An example of this would be https://github.com/silverstripe/silverstripe-blog/blob/master/.editorconfig  - checking older versions of the framework's .editorconfig, it would appear that the following is missing:

[*.md]
trim_trailing_whitespace = false

Also, composer.json needs to be taken into account.  Actually, re-reading, all .json files.  But that's easy enough to fix.


"Same question regarding Scrutinizer configuration."

I haven't seen a helpfulrobot pull request, for Scrutinizer config, in a long time. I expect it may start happening before long though...

 
"Is there a definitive list of SilverStripe and PHP versions to test again when using Travis?"

What usually works well (at least for me) is testing the latest stable minor version (like 3.2) with every minor version of PHP that SilverStripe supports.

OK, that seems sensible.

 
Then I also include PHP 7 (but allow it to fail), and a test for PGSQL and a test for SilverStripe master branches. This covers quite a few scenarios...


That gist is useful, thanks.  Looks like I have some Travis files to change :)
 


"What are a suitable selection of badges to standardize on for modules?"

Depends on the CI services I usually connect. I usually have "build", "code quality", "version", and "license". Sometimes I will have "coverage", but I've had trouble connecting that up in the past.
I agree that Scrutinizer can be flaky sometimes.

"Should there be a concerted effort to setup code coverage for all SilverStripe owned modules, the CMS and the Framework, and then aim to get test coverage to say 90% plus?"

That would definitely be cool, but also a massive time-sink. Getting most SilverStripe modules to have at least 1 test (for the most critical feature) is easier to achieve IMO. There's something like 1.4k SilverStripe modules on Packagist...

I agree that some tests are better than no tests :)

Regards

Gordon

Nicolaas Thiemen Francken - Sunny Side Up

Jan 31, 2016, 3:55:57 PM
to silverstripe-dev
This is awesome.  Inspirational. I am going to read it a few times!  THANK YOU Gordon and Chris.

I have a ton of modules with hardly any tests and so I would love to automate the creation of tests for all of the modules. For that, this module seems very useful: https://github.com/gordonbanderson/SilverStripeTestTools (as far as I can tell). 

One thing I wonder is: what is the value of code coverage in tests?  I mean, if you had three hours left on a module that is basically in working order, what would be the best way to spend that time:

a. write documentation
b. write tests (up to what coverage?)
c. improve CMS fields
d. increase speed through caching and other techniques

That would be a great question to answer.  I guess it can not be answered because it all depends, but a few rules of thumb might still be useful. 

What are the most important things to test?  Are there any rules of thumb for that? 

I would be really keen to improve the quality of all my modules so I'd love to see a list of things to do that would be easy to follow and implement. For such a list, it would be good to have the to do items ordered by importance, from the most crucial, to the least interesting fluff. 

Thank you again

Nicolaas

Gordon Anderson

Jan 31, 2016, 8:57:10 PM
to silverst...@googlegroups.com
hi Nicolaas

On Mon, Feb 1, 2016 at 3:55 AM, Nicolaas Thiemen Francken - Sunny Side Up <nfra...@gmail.com> wrote:
This is awesome.  Inspirational. I am going to read it a few times!  THANK YOU Gordon and Chris.
:) 

I have a ton of modules with hardly any tests and so I would love to automate the creation of tests for all of the modules. For that, this module seems very useful: https://github.com/gordonbanderson/SilverStripeTestTools (as far as I can tell). 

This is not a module as such but a Ruby script, UNIX only, that will grep out function names from PHP code and create a valid but empty test suite that marks all tests as skipped.  You still have to write the actual tests :)
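
For illustration, the generated stubs look roughly like this (class and method names are just examples):

<?php

class LatLongFieldTest extends SapphireTest
{
    public function testSetValue()
    {
        // generated as skipped; replace with real assertions when the test is written
        $this->markTestSkipped('TODO: write this test');
    }
}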

One thing I wonder is: what is the value of code coverage in tests?  I mean, if you had three hours left on a module that is basically in working order, what would be the best way to spend that time:

a. write documentation
b. write tests (up to what coverage?)
c. improve CMS fields
d. increase speed through caching and other techniques

Good question.  In a sense tests are documentation: they document, at the code level, how some given piece of software should work.  So I would go with a) and b) in this case.  Option c) is hard to test (currently I am only checking the names of fields, depending on whatever criteria exist, e.g. 'Show comments').  And as for d), you know things are not broken if you've written tests :)
 
That would be a great question to answer.  I guess it can not be answered because it all depends, but a few rules of thumb might still be useful. 
I think tests take highest precedence where you have a module that is popular and receiving a lot of pull requests.  Having these automatically run against the test suite prior to a merge is most definitely useful.

What are the most important things to test?  Are there any rules of thumb for that? 

If test-writing time is limited I'd go for the most logically complex sections; you can use Scrutinizer to identify hotspots in your code.  A hotspot is a method that has lots of execution paths (basically lots of if-then-else statements, possibly nested).  And perhaps some low-hanging fruit that is quick to write tests for.
 
I would be really keen to improve the quality of all my modules so I'd love to see a list of things to do that would be easy to follow and implement. For such a list, it would be good to have the to do items ordered by importance, from the most crucial, to the least interesting fluff. 

That's not something I've pondered to be honest.  I will have a look through my commits and see if there is a pattern or not.
 
Thank you again

:)

Gordon
 
Nicolaas


Nicolaas Thiemen Francken - Sunny Side Up

Feb 1, 2016, 8:46:58 PM
to silverstripe-dev
Thank you Gordon,

This is not a module as such but a Ruby script, UNIX only, that will grep out function names from PHP code and create a valid but empty test suite that marks all tests as skipped.  You still have to write the actual tests :)

I figured something along those lines; that is a good beginning. I wonder how I could write a few generic tests that I could apply to all my modules (a dev/build might be a good start ;-))
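
Something along these lines is what I have in mind - only a rough sketch, the class name is made up, and since SapphireTest already builds a temporary database this is close to a dev/build check:

<?php

class ModuleSmokeTest extends SapphireTest
{
    public function testDataObjectDefinitionsAreSane()
    {
        // Walking every DataObject subclass and asking for its db fields is a
        // cheap way to surface broken $db definitions - roughly the sort of
        // thing dev/build would complain about.
        foreach (ClassInfo::subclassesFor('DataObject') as $class) {
            if ($class === 'DataObject') {
                continue;
            }
            $reflection = new ReflectionClass($class);
            if ($reflection->isAbstract()) {
                continue;
            }
            $this->assertInternalType('array', singleton($class)->db());
        }
    }
}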

Also, I have added a small module for the basic, basic checks (is it on packagist, does it have a README file, etc...):  https://github.com/sunnysideup/silverstripe-modulechecks.


Gordon Anderson

Feb 7, 2016, 12:40:56 PM
to silverst...@googlegroups.com
hi Nicolaas,

Most of the basic checks are effectively covered by setting up continuous integration on a module.  The 'exists in addons' check is not one I had thought of - I imagine that process is probably automated, but hey, it could be one more badge.


On a separate note, I have resolved one of the CodeCov issues: it is possible to remove folders via configuration in the web interface.  So in the case of the SilverStripe framework, I now have the report without the test folders included.  See https://codecov.io/github/gordonbanderson/silverstripe-framework?branch=TESTEVAL


The link that says 'Learn how to *ignore folders*' was the key :)
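
For anyone who would rather keep that setting in the repository, I believe the same exclusion can be expressed in a committed codecov.yml file - a hedged sketch, folder names illustrative:

ignore:
  - "tests"        # exclude test folders from the coverage report
  - "thirdparty"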

Cheers

Gordon

Ingo Schommer

Feb 7, 2016, 6:55:32 PM
to SilverStripe Core Development
Hey Gordon, that's awesome stuff - looks like we're getting closer to code coverage actually being viable in CI.

Regarding the COVERAGE=1 flags in your travis.yml, have you considered using the fast_finish=1 flag?
It allows builds marked as allowed_failures to take longer than others, meaning faster feedback loops for pull requests while code coverage is still running.

Also, using phpdbg with PHPUnit 4.8 and PHP7 seems to be a 10x speed improvement on code coverage, compared to PHP5+Xdebug.

Gordon Anderson

Feb 8, 2016, 11:57:26 AM
to silverst...@googlegroups.com

Hey Gordon, that's awesome stuff - looks like we're getting closer to code coverage actually being viable in CI.

For modules I think it's there now, but for larger pieces of code such as the framework it is probably too slow, with the coverage reports taking too long and delaying test feedback on pull requests on GitHub.  [Or at least that was my view - it changed as a result of writing this email.]
 
Regarding the COVERAGE=1 flags in your travis.yml, have you considered using the fast_finish=1 flag?
It allows builds marked as allowed_failures to take longer than others, meaning faster feedback loops for pull requests while code coverage is still running.

I wasn't aware of that parameter, thanks.  Are you suggesting that the version of PHP used for the coverage run be marked as allowed to fail, so that with the fast_finish flag set to true the normal feedback on GitHub regarding successful builds remains as it is now, with the addition of delayed code coverage?  If so, that is a darned good idea :)

The only potential problem I can see is how to allow builds both to succeed and fail with the same version of PHP.

As a test I tried a Travis config with an allowed failure keyed on COVERAGE=1, instead of on a version of PHP.

The resulting build report did *not* show an allowed failure for COVERAGE=1, only for the versions of PHP, suggesting that using environment variables in this manner will not work.

Checking support for specific point releases, it appears that PHP 5.5.9 *is* supported, as it's the version shipped with Ubuntu 14.04.  As such this exact version number can be used as the necessary flag instead of the environment variable COVERAGE.  It's not an ideal solution, as the version of PHP used for coverage cannot then be freely chosen, but it's better than no solution.
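
In .travis.yml terms the workaround boils down to something like this (a sketch only):

matrix:
  fast_finish: true
  allow_failures:
    # key the slow, allowed-to-fail coverage job on an exact PHP version,
    # since keying allow_failures on an env variable did not work
    - php: 5.5.9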

So we do indeed have a win: the build is marked as passing whilst the coverage is still being generated - build report at https://travis-ci.org/gordonbanderson/ss3gallery/builds/107797350 and the deliberately delayed code coverage at https://codecov.io/github/gordonbanderson/ss3gallery/code?ref=a66001a8ab30fa0389532a15029d9bd3f31a3d8c

Thanks Ingo for the idea, not one that I'd thought of.


Also, using phpdbg with PHPUnit 4.8 and PHP7 seems to be a 10x speed improvement on code coverage, compared to PHP5+Xdebug.

Oooh, that is interesting.  Ironically, if it were that fast locally I might never have started looking at CI, but my SilverStripe Elastica module was taking around 12 hours on my laptop to generate test coverage.  The module I've used as an example earlier in this email is passing with PHP 7, so it's a suitable candidate for trying to get phpdbg working in Travis with a SilverStripe module.
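
If I get that working, I'd guess the coverage job's script line ends up looking roughly like this (untested on my part, and the paths are illustrative):

script:
  # run PHPUnit through phpdbg so coverage is gathered without Xdebug
  - phpdbg -qrr vendor/bin/phpunit --coverage-clover=coverage.xml $MODULE_PATH/tests/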


BTW, before you came up with the idea of delayed coverage report generation, my thinking was to create a parallel branch from a relevant one such as master or 3.1, where only the .travis.yml file differed, and use that for the CodeCov badge and graph, merging into it periodically to regenerate the coverage report and thus update the badges.  Your idea, and what appears to be an implementable solution of it, looks like the better next step towards code coverage viability.

Cheers,

Gordon