how to automatically test pull requests with sympy-bot

Ondřej Čertík

unread,

Jun 24, 2012, 4:20:40 PM6/24/12

to sympy

Hi,

I need some help with figuring out how to implement robust pull
requests testing with sympy-bot.

The webapp at reviews.sympy.org (the source code of which is in the
"web" directory of sympy-bot)
has a database of all pull requests, and I can quite easily implement
almost any interaction between
the sympy-bot and the app. The idea is that the webapp's database has
and up-to-date information
about all pull requests and since it is under our control, we can
easily implement any interaction
with the sympy-bot. That seems like a very robust model to me.

The problem is that the database is currently updated by polling the
github API periodically by recursively
going over all pull requests and comparing any changes against our
database (so it's a lot of traffic,
lots of github API requests and a lot of database accesses). Recently
we were running over various app engine quota, so I had to eventually
lower the polling period to
once or twice a day. Not very good.

What is the best way to notify our app, that

a) a new pull request was created
b) a new commit was pushed into the pull request

? I was hoping to use github webhooks
(https://help.github.com/articles/post-receive-hooks), but it seems
that
only notifies us of pushing into the "master" of the sympy repository.
Another option are "Events"
(http://developer.github.com/v3/events/), for example here is a list
of such events for the sympy repo:

http://hurl.it/hurls/999dedbb7a1221912b5dcbea69d8b8b5b235a955/f6e36af68fd5b39d97b4d84cd12a067184f3ec92

it shows all recent changes in issues. It's a simple request, so we
can poll it let's say every 5 minutes
and keep an updated list of changes in our database. This should be
very lightweight.
From this list, we should be able to determine both a) and b) above.

Update: so actually one should be able to register these "hooks" here:

http://developer.github.com/v3/repos/hooks/

I am not quite sure currently how this is related to the old "hooks"
above, but at least from the docs this
should do exactly what we need --- we would listen to these hooks and
simply trigger a build whenever
there is a push of a new commit to the pull request

And so the way I see it is that the first iteration would be to
implement "sympy-bot work", which would
connect to our webapp, and whenever there is a new pull request or any
new commit, or
a *comment* with a specific command like "sympy-bot test", it would
simply run all tests on the pull request
(just like it does now).

Anyone is then free to run such sympy-bot in the "work" mode. I don't
think we have to worry about spam here,
because you still have to use your github credentials to post any comments.

------------------------------------------

So here is my proposed step by step TODO list:

1) Implement "sympy-bot work" with the following features:
It would poll the github API directly (as often as the rate limiting
of the github API allows) and it would
listen to 3 types of events:

1a) new pull request created
1b) new commit pushed into the pull request
1c) A comment (by anybody) of the exact text "sympy-bot test"

It would trigger a test and report to the pull request just as it is
now. The 1c) feature allows us to
easily test older pull requests manually, while 1a) and 1b) will
automatically test new pull requests.

2) I will leave such sympy-bot running for Python 3.2 and Python 2.7
on my linode. Other people
can keep it running on their hardware.

3) After the above works well, we can setup automatic hooks to our
webapp, and manage all such
changes centrally by the webapp, and the sympy-bot would simply stay
connected to the webapp
without having to periodically poll the github API.

-----------------

If anybody wants to help out with 1), that would be a huge help. I
think it should be relatively easy to implement
and it would allow us to automatically test all pull requests. I can
then setup 2). Finally, maybe later
we can implement 3), but that's not a big deal now.

Ondrej

krastano...@gmail.com

unread,

Jun 24, 2012, 4:38:28 PM6/24/12

to sy...@googlegroups.com

This will be quite an effort. Is it worth it? There is already
TravisCI for the master branch (and it is capable of testing PR if we
want). There is also the admittedly-very-ugly script that I use for
testing - it fails occasionally, however one rarely waits for more
than half a day to receive 3 tests (2 architectures and 3 versions of
python). It also tests after each change to the PR.

If we want more architectures we just need more people running the bot.

One thing that would be a definite step forward is if the bot is smart
enough to rewrite its own comments (at the moment it floods the page
with comments). Other than that, I think it is best to focus on moving
to TravisCI. When they update their python versions to include hash
randomization there will be no reason to stick to the bot.

Ondřej Čertík

unread,

Jun 24, 2012, 5:03:19 PM6/24/12

to sy...@googlegroups.com

On Sun, Jun 24, 2012 at 1:38 PM, krastano...@gmail.com
<krastano...@gmail.com> wrote:
> This will be quite an effort. Is it worth it? There is already
> TravisCI for the master branch (and it is capable of testing PR if we
> want). There is also the admittedly-very-ugly script that I use for
> testing - it fails occasionally, however one rarely waits for more
> than half a day to receive 3 tests (2 architectures and 3 versions of
> python). It also tests after each change to the PR.

Can you post your script? If it already implements the above, that
would of course be great!
I didn't know about it.

Ondrej

krastano...@gmail.com

unread,

Jun 24, 2012, 5:13:50 PM6/24/12

to sy...@googlegroups.com

> Can you post your script? If it already implements the above, that
> would of course be great!
> I didn't know about it.

I have posted a version of it some time ago as a pull request:
https://github.com/sympy/sympy-bot/pull/74

The current version is a bit more robust, however this is only a
**throwaway script**.

It downloads the PR webpage, checks in the comments whether an
up-to-date test was already ran and if not it starts sympy-bot. The
current version rearranged a bit the control flow, because the one in
the PR74 was prone to redo unnecessary tests.

If it is decided that sympy-bot really needs beefing-up I advise
against using this script.

Ondřej Čertík

unread,

Jun 24, 2012, 5:37:56 PM6/24/12

to sy...@googlegroups.com

I think it works well. Would you mind posting your current script?
So that I can set it up on my server as well.

I think that your sympy-bot installation is using
randomized hashes, right? It's important to have one such set of tests
to discover the hash() bugs.

However, given that such bugs are already in sympy, it is also equally
important to have some tests that
simply use some hash/platform combination which works (for example the
one at Travis CI works),
so that we can quickly determine whether the given pull request breaks anything.

I just donated $70 to Travis CI for sympy, so they will probably
enable the pull request testing
soon.

Ondrej

Ondřej Čertík

unread,

Jun 24, 2012, 5:55:54 PM6/24/12

to sy...@googlegroups.com

On Sun, Jun 24, 2012 at 2:37 PM, Ondřej Čertík <ondrej...@gmail.com> wrote:
> I just donated $70 to Travis CI for sympy, so they will probably
> enable the pull request testing
> soon.

Looks like it's working:

http://travis-ci.org/#!/sympy/sympy/pull_requests

Ondrej

krastano...@gmail.com

unread,

Jun 24, 2012, 5:56:04 PM6/24/12

to sy...@googlegroups.com

> I think it works well. Would you mind posting your current script?
> So that I can set it up on my server as well.

OK, however it will not do much good - what is tested by your server
will not be tested by mine and vice-versa. So running it on two
machines would be useful only if there are too many pull requests to
be tested.

https://gist.github.com/2985162

> I think that your sympy-bot installation is using
> randomized hashes, right? It's important to have one such set of tests
> to discover the hash() bugs.

It is using randomized hashes because the test runner uses them by
default (since yesterday, thanks to a pull request done by Aaron).

In detail:
- python2.5, 32bit, without additional installed modules, without hash
randomization because it is not supported in 2.5
- python2.7.3, 64bit, with numpy, gmpy, matplotlib, with hash randomization
- python3.2.3, 64bit, without additional installed modules, with hash
randomization

So we have both tests with and without hash randomization at the
moment on both architectures.
Off topic: Just because of how silly it would be I would also start
testing on a RasberyPie ARM processor when I receive it in a month or
two.

> I just donated $70 to Travis CI for sympy, so they will probably
> enable the pull request testing
> soon.

This is great. Is there actually a way to donate to sympy in order to
pay for stuff like this.

Ondřej Čertík

unread,

Jun 24, 2012, 6:06:35 PM6/24/12

to sy...@googlegroups.com

On Sun, Jun 24, 2012 at 2:56 PM, krastano...@gmail.com
<krastano...@gmail.com> wrote:
>> I think it works well. Would you mind posting your current script?
>> So that I can set it up on my server as well.
> OK, however it will not do much good - what is tested by your server
> will not be tested by mine and vice-versa. So running it on two
> machines would be useful only if there are too many pull requests to
> be tested.
>
> https://gist.github.com/2985162

I'll try to provide a better implementation of the
really_ugly_solution_for_finding_the_last_test_sha()
function to only take into account comments produced by you (or me).
Then there will be no clashes,
as long as both you and me use the new version.

>
>> I think that your sympy-bot installation is using
>> randomized hashes, right? It's important to have one such set of tests
>> to discover the hash() bugs.
> It is using randomized hashes because the test runner uses them by
> default (since yesterday, thanks to a pull request done by Aaron).
>
> In detail:
> - python2.5, 32bit, without additional installed modules, without hash
> randomization because it is not supported in 2.5
> - python2.7.3, 64bit, with numpy, gmpy, matplotlib, with hash randomization
> - python3.2.3, 64bit, without additional installed modules, with hash
> randomization

This is very good. Yes.

>
> So we have both tests with and without hash randomization at the
> moment on both architectures.
> Off topic: Just because of how silly it would be I would also start
> testing on a RasberyPie ARM processor when I receive it in a month or
> two.

:)

>
>> I just donated $70 to Travis CI for sympy, so they will probably
>> enable the pull request testing
>> soon.
>
> This is great. Is there actually a way to donate to sympy in order to
> pay for stuff like this.

Aaron and I are setting it up through http://numfocus.org/.
Eventually we'll have a donate button on the webpage.

Ondrej

Ondřej Čertík

unread,

Jun 24, 2012, 6:44:14 PM6/24/12

to sy...@googlegroups.com

Seems to be working pretty well, except that now the overall status of
the sympy repository
is equivalent to the latest build status (be it master or a pull
request), so in particular,
right now the master passes, the latest pull request fails and the
overall status is fail.
It is a known problem:

https://groups.google.com/d/topic/travis-ci/xcK7glACQ30/discussion
https://github.com/travis-ci/travis-ci/issues/567

So hopefully it will be fixed eventually.

Ondrej

Aaron Meurer

unread,

Jun 24, 2012, 10:25:31 PM6/24/12

to sy...@googlegroups.com

On Jun 24, 2012, at 4:06 PM, "Ondřej Čertík" <ondrej...@gmail.com> wrote:

> On Sun, Jun 24, 2012 at 2:56 PM, krastano...@gmail.com
> <krastano...@gmail.com> wrote:
>>> I think it works well. Would you mind posting your current script?
>>> So that I can set it up on my server as well.
>> OK, however it will not do much good - what is tested by your server
>> will not be tested by mine and vice-versa. So running it on two
>> machines would be useful only if there are too many pull requests to
>> be tested.
>>
>> https://gist.github.com/2985162
>
> I'll try to provide a better implementation of the
> really_ugly_solution_for_finding_the_last_test_sha()
> function to only take into account comments produced by you (or me).
> Then there will be no clashes,
> as long as both you and me use the new version.

How will this scale? I will also have spare machines to dedicate to
this, once I get my new laptop.

I think the idea behind sympy-bot work is that the app-engine server
would keep track of what needs to be tested and distribute the work so
that there is no duplication.

Aaron Meurer

> --
> You received this message because you are subscribed to the Google Groups "sympy" group.
> To post to this group, send email to sy...@googlegroups.com.
> To unsubscribe from this group, send email to sympy+un...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/sympy?hl=en.
>

krastano...@gmail.com

unread,

Jun 25, 2012, 4:33:16 AM6/25/12

to sy...@googlegroups.com

> How will this scale? I will also have spare machines to dedicate to
> this, once I get my new laptop.
>
> I think the idea behind sympy-bot work is that the app-engine server
> would keep track of what needs to be tested and distribute the work so
> that there is no duplication.

I may be wrong, but I think that Certik actually wants the duplication
so more architectures are tested.

And if we actually do not want duplication, the easiest solution is
just to leave one single computer doing all the tests - the PR number
is low enough for one computer to handle them all even if it is
testing numerous configurations. I do not think it is worth it to make
sympy-bot any smarter.

Ondřej Čertík

unread,

Jun 26, 2012, 8:27:39 PM6/26/12

to sy...@googlegroups.com

Hi Stefan,

On Mon, Jun 25, 2012 at 1:33 AM, krastano...@gmail.com
<krastano...@gmail.com> wrote:
>> How will this scale? I will also have spare machines to dedicate to
>> this, once I get my new laptop.
>>
>> I think the idea behind sympy-bot work is that the app-engine server
>> would keep track of what needs to be tested and distribute the work so
>> that there is no duplication.
>
> I may be wrong, but I think that Certik actually wants the duplication
> so more architectures are tested.

Right.

Btw, my first name is Ondrej. :)

>
> And if we actually do not want duplication, the easiest solution is
> just to leave one single computer doing all the tests - the PR number
> is low enough for one computer to handle them all even if it is
> testing numerous configurations. I do not think it is worth it to make
> sympy-bot any smarter.

If Travis can do everything that we need, I am certainly ok to only use Travis.
They will eventually update the Python to support "-R", so that's not
a big deal.
Based on our discussions, it seems that we really don't need to test
various platforms
as long as we use randomized hashing.

My intuition tells me that there will be occasions to use sympy-bot, for example
right now the "-R" testing, after Travis supports it, maybe some
customized testing.
It'd be nice if most of the complexity can be managed by the web app, and then
the sympy-bot can stay "simple stupid". And I don't think it's that much work,
but I don't have time for it at the moment. So I'll leave it as it is for now,
I am personally very happy with Travis, and we'll see how it goes.

However, I have a big problem with the randomized hashes.
Our current status is that we can only pass tests without randomizing the hash.
As such, at the moment, this is the only way that we can tell whether
some pull request breaks something or not.
So before we fix sympy, we should keep our main test runner
(Travis) with a definite seed for the hashes.

At the moment, your buildbot is running randomized hashes and so it
almost always fails.
This is useful for fixing sympy to actually work with randomized hashes, but
the failures usually have nothing to do with the pull request at hand,
and as such
when working on a particular pull request (and not on fixing hash
bugs), it's not very useful.

So our main test running should not use random hashes, so that we can
make sure that
all tests actually pass on at least some configuration.

Ondrej

Aaron Meurer

unread,

Jun 26, 2012, 9:37:05 PM6/26/12

to sy...@googlegroups.com

I disagree. One of the reasons that I pushed this pull request through
was so exactly this would happen. I'm hoping that it will get more
people to help out fixing these problems. Otherwise, it's just me who
really cares about them, and I don't have the resources to fix them
all in a reasonable time frame. I don't want to release unless the
tests pass in Python 3.3, and I do want to release, because it's been
almost a year again, and all the GSoC stuff from last summer is new
from the last release.

I do understand that this makes it hard to tell if a pull request
breaks things or not. For now, the simplest solution is to rerun the
tests on master with the same hash seed. The better way is to send
pull requests fixing the errors.

Aaron Meurer

krastano...@gmail.com

unread,

Jun 27, 2012, 5:33:19 AM6/27/12

to sy...@googlegroups.com

As per Aaron's suggestion I will keep testing with the defaults (with
randomization). I am always doing 3 tests and the first one is in a
version of python that does not support randomization, so you can
check that. If there is interested I can start testing the other
version with both randomized and not randomized hashes?

Check the pull request enabling randomization for the original
discussion on this topic.

Obviously, if someone finds some part (or all) of the testing
configuration irritating, I will immediately change it.

Marchael

unread,

Jun 29, 2012, 6:03:25 AM6/29/12

to sy...@googlegroups.com

Hey,

I could try to help with implementing "sympy-bot work" mode.

I like idea with github hooks, but I think that first we should rid from pollings on review site, there is my issue about that https://github.com/sympy/sympy-bot/issues/112

понедельник, 25 июня 2012 г., 2:20:40 UTC+6 пользователь Ondřej Čertík написал:

Ondřej Čertík

unread,

Jun 29, 2012, 5:17:24 PM6/29/12

to sy...@googlegroups.com

Hi Michael,

On Fri, Jun 29, 2012 at 3:03 AM, Marchael <marc...@kb.csu.ru> wrote:
> Hey,
> I could try to help with implementing "sympy-bot work" mode.
> I like idea with github hooks, but I think that first we should rid from
> pollings on review site, there is my issue about
> that https://github.com/sympy/sympy-bot/issues/112

That would be really cool. I don't have time to implement it,
but I do have time to timely review and test any pull requests
if you send any.

Ondrej

Ondřej Čertík

unread,

Jun 29, 2012, 5:21:49 PM6/29/12

to sy...@googlegroups.com

I can see your point --- by making everybody to see the failures, you
want to get more attention to it and get it fixed.
Why not, let's get it fixed.

I just don't want people's pull requests to get stalled by this,
i.e. I want somehow to very robustly see if they break anything or not.
The Travis tests do that currently, so that's an ok solution for me.

Stefan, yes, please do run the tests as you do now. That is very valuable.

Ondrej

Reply all

Reply to author

Forward