[Announcement] New testing infrastructure for STP

Dan Liew

Mar 17, 2014, 1:13:04 PM
to stp-...@googlegroups.com
Hi All,

As people are probably aware, STP didn't really have any proper testing infrastructure (there were tests, but what's the point of having tests if you can't run them conveniently?). I've started working towards better testing infrastructure and I've now pushed my initial work into the master branch. This testing is disabled by default (you need to set the ENABLE_TESTING CMake option to enable it) because it requires a few extra components and I don't want to break people's builds.
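For anyone unfamiliar with CMake cache options, enabling the tests from a fresh build directory should look roughly like this (the paths are just placeholders; see [2] below for the real instructions):

    $ mkdir build && cd build
    $ cmake -DENABLE_TESTING=ON /path/to/stp
    $ make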

We basically have two sorts of tests at the moment (rough sketches of both are given just after the list):

* Query file tests. These are tests where the result of running STP on a query file (e.g. a .smt2 file) is checked. This is driven using the lit and OutputCheck Python tools.
* Unit tests. These link against libstp and try to check various behaviours. This is driven using the GoogleTest framework.
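
To make those two categories a bit more concrete, here is a rough sketch of what a query file test looks like. The RUN line uses lit-style substitutions (%solver, %OutputCheck); the exact names depend on how our lit configuration ends up being written, so treat them as placeholders:

    ; RUN: %solver %s | %OutputCheck %s
    (set-logic QF_BV)
    (declare-fun x () (_ BitVec 8))
    (assert (= x #x2a))
    (check-sat)
    ; CHECK: ^sat

lit finds the RUN line, runs STP on the file, and OutputCheck then checks the solver's output against the CHECK regex.

A unit test is just an ordinary GoogleTest case linked against libstp. The header path and the vc_* calls below are based on my reading of the C interface, so do double check them against c_interface.h before copying:

    // trivial_query_test.cpp -- built against libstp and gtest
    #include <gtest/gtest.h>
    #include "stp/c_interface.h"

    TEST(CInterface, TrivialValidity) {
      VC vc = vc_createValidityChecker();
      Type bv8 = vc_bvType(vc, 8);
      Expr x = vc_varExpr(vc, "x", bv8);
      // x = x should be reported as valid; vc_query is expected to
      // return 1 for VALID and 0 for INVALID.
      EXPECT_EQ(1, vc_query(vc, vc_eqExpr(vc, x, x)));
      vc_Destroy(vc);
    }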

Unfortunately there is still **a lot of work to do** because many tests are currently broken (this is why Travis CI is not currently running our tests).

I've put some of these issues on our issue tracker on GitHub [1] under the "Proper testing" milestone, but I'm sure there are many more.

What I'd like to happen next is for us to fix all the tests so that we have some sort of safety net before starting a badly needed clean-up of STP's codebase. That unfortunately is not going to happen on its own, so I'm hoping the community will step up to the challenge of helping us fix the tests.
I've written some documentation (admittedly incomplete) [2] on how to get started with testing in STP.

[1] https://github.com/stp/stp/issues?milestone=1&page=1&state=open
[2] https://github.com/stp/stp/wiki/Testing

Cheers,
Dan Liew.

Stephen McCamant

Mar 17, 2014, 3:25:56 PM
to stp-...@googlegroups.com
>>>>> "DL" == Dan Liew <delc...@gmail.com> writes:

DL> Hi All,
DL> As people are probably aware STP didn't really have any proper
DL> testing infrastructure (there were tests, but what's the point of
DL> having tests if you can't run them conveniently?). I've started
DL> working towards better testing infrastructure and I've now pushed
DL> my initial work into the master branch. This testing is disabled
DL> by default (you need to set the ENABLE_TESTING CMake option to
DL> enable it) because it requires a few extra components and I don't
DL> want to break people's builds.

DL> Unfortunately there is still **a lot of work to do** because many
DL> tests are currently broken (it is because of this that TravisCI is
DL> currently not running our tests)

DL> I've put some of these issues on our issue tracker on GitHub [1]
DL> under the "Proper testing" milestone but I'm sure there are many
DL> more issues.

DL> What I'd like to happen next is for us to fix all the tests so we
DL> have some sort of safety net before starting a badly needed clean
DL> up of STP's codebase. This unfortunately is not going to happen on
DL> its own so I'm hoping the community will step up to the challenge
DL> of helping us fix the tests. I've written some documentation
DL> (admittedly incomplete) [2] on how to get started with testing in
DL> STP.

DL> [1] https://github.com/stp/stp/issues?milestone=1&page=1&state=open
DL> [2] https://github.com/stp/stp/wiki/Testing

Thanks for all of your work in setting this up. IMO STP did have a
somewhat usable infrastructure for query tests up until the point of
the CMake switchover, but it had a lot of custom code with a murky
origin and hadn't been well maintained in a while, so switching to
something based on standard tools and with good integration with a
system like Travis should put us on much better footing.

I would like to point out a different possibility for one big
strategic point, though. I think we'd be better off setting up a test
suite that passes, and enabling Travis, based on the subset of the old
tests that you already have working, rather than waiting to make the
best possible modern tests out of all of the old ones. There's
definitely value in a longstanding regression suite, so it makes
plenty of sense to keep the old tests around in the repository. But I
think you're right in pointing out that there would be a lot of work
needed to get all the old tests in an ideal shape, and I'm less
sanguine than you seem to be about the effect of calling on "the
community" to do this large batch of work, especially on any
particular time frame.
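
One relatively cheap way to get there, if we do go that route, would be to keep the broken old tests in the tree but mark them as expected failures so the suite stays green. lit's ShTest format understands an XFAIL line for exactly this purpose (I'm assuming here that our lit setup ends up using that format):

    ; XFAIL: *
    ; RUN: %solver %s | %OutputCheck %s
    ...

That way a red Travis build always means a genuine regression, and un-XFAILing tests becomes a nicely incremental chore rather than one big batch of work.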

Part of why I mention this now is that, while I don't know what the
original intent was, it feels to me like something similar happened
when we turned on a large batch of compiler warnings recently. It
seemed like the thought might have been "enabling compiler warnings is
good; let's turn them on and then hope someone fixes them". But
compiler warnings are only really useful once the existing noisy
warnings have been fixed, so that a new problem you introduce actually stands out. And
similarly, a test suite is only practically useful if it's passing
with the baseline version of the code, so you can run it to see if you
broke anything.

There's no question that the more tests we have the better off we
are. And I think there's some unique value in the old tests, because
they are better than any new tests at catching regressions and at
ensuring that STP still works the way old external code expects.
But in terms of a safety net for future development, any kind of test
suite that developers will run is light years better than nothing at
all. Given the limited total resources of the whole community to work
on STP, I think this may be a point where the perfect is the enemy of
the good.

-- Stephen
