If somebody wishes to spam the mailing list, they could issue pull
requests by the dozen and let the list overflow.
I'm not sure whether we prefer to deal with it when it happens, or be
proactive. Being proactive means we might invest work in something that
never happens on the current infrastructure. Dealing with it when it
happens might mean having to react quickly, without the ability to
prioritize it.
I think that's a question the project leader needs to decide.
Thanks for your comment!
I guess auth will be useful anyway; just look at
http://reviews.sympy.org/report/agZzeW1weTNyDAsSBFRhc2sYwYsRDA
http://reviews.sympy.org/report/agZzeW1weTNyDAsSBFRhc2sYvuQQDA
http://reviews.sympy.org/report/agZzeW1weTNyDAsSBFRhc2sY14MRDA
I'm not a security specialist, but I think that isn't good in any case.
--
With best regards,
Mayorov Michael
See also https://developers.google.com/appengine/docs/python/config/dos.
My biggest concern with this would be with the UI. If we use GitHub
authentication, will we be able to use the API token? If not, then it
won't be helpful to those of us who put the token in our sympy-bot
config file. If we use Google authentication, can we generate an
authentication token? That way, we would not have to put our Google
password in our config files, but we could still keep it from asking
for our password every time.
Aaron Meurer
SymPy bot could exchange data with the web application in these ways
(some examples in pseudo-Python code):

1. Using pickle to store a dict

   On the web application:

       import pickle
       # pull_number -> (date, commit hash, labels, number of comments)
       pull_info = {pull_number: (date, "commit_hash", [label1, label2], comments_count)}
       pickled_info = pickle.dumps(pull_info)

2. Using a class

       class IncomingPull(object):
           def __init__(self, pull_number, date, commit_hash, labels, comment_count):
               self.pull_number = pull_number
               ....

   On the web-server side:

       import pickle
       pulls_for_review = [pull_data1, pull_data2, ...]
       pulls_for_bot = []
       for pull in pulls_for_review:
           pulls_for_bot.append(IncomingPull(value1, value2, ...))
       picklestring = pickle.dumps(pulls_for_bot)
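
For reference, the dict variant can be turned into a small runnable round trip. This is only a sketch: the pull number, date, hash and labels are made-up sample values, and pack_pulls/unpack_pulls are hypothetical helper names, not existing sympy-bot functions. Keep in mind that unpickling data from an untrusted source can execute arbitrary code, so the bot should only unpickle what it receives from a server it trusts.

```python
import pickle


def pack_pulls(pull_info):
    """Web application side: turn the dict of pulls into a byte string."""
    return pickle.dumps(pull_info)


def unpack_pulls(payload):
    """Bot side: restore the dict (only do this with trusted data)."""
    return pickle.loads(payload)


# Made-up sample data: pull_number -> (date, commit_hash, labels, comment_count)
pull_info = {1234: ("2012-04-03", "abc123", ["core", "printing"], 5)}

restored = unpack_pulls(pack_pulls(pull_info))
print(restored[1234][1])  # commit hash stored for pull 1234
```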
We also should split up and prioritize the various testing
configurations. For example, if only Python 2 tests have been run,
then Python 3 tests should be prioritized. We could also go for even
fancier things.
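
A minimal sketch of that rule, assuming the bot knows which configurations it supports and which ones already have results for a pull request. The function name and the interpreter names are made up for illustration.

```python
def next_configuration(supported, already_tested):
    """Return the configuration to test next, or None if all have results.

    Configurations without results come first, in the order given by
    `supported`.
    """
    for config in supported:
        if config not in already_tested:
            return config
    return None


# Hypothetical interpreter names; only the Python 2 runs have results so far.
supported = ["python2.5", "python2.7", "python3.2"]
print(next_configuration(supported, {"python2.5", "python2.7"}))
```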
>
> SymPy bot will be able to run tests on different versions of Python (the
> user could add the full path to each interpreter to the config).
> The web application will wait some time (about 30 minutes) before moving
> a new pull request to the queue.
> If the web application reports more than one commit for a pull request,
> then review only the latest of them; that will prevent spam.
> "Modified since"/"Not modified": SymPy bot will send the web application
> the last time it ran tests, and the application will answer "Not
> modified" if nothing new has happened since then, or with the pull
> request info if something has.
> I'm not sure about this idea, but in some cases we could skip tests that
> are unrelated to the current changes. What I mean:
>
> If we look at the SymPy code tree, we see that the code has a hierarchy.
> So if we change one of the core modules, the change will affect most of
> the other code that depends on that module.
> But if we change some leaf functions that lie much deeper in the
> hierarchy, then in most cases the change will not affect functions that
> live in other modules and do not depend on them.
> The weak point of this idea is: how do we know which other
> modules/classes/functions use the changed code? We could search the
> source code to find out, but is that really a good idea?
If what you're suggesting is to only run the tests on the module that
was changed by the pull request, then I would recommend against that.
SymPy-Bot makes it easy to run the full test suite, and this is the
only way we can be assured that the code changes do not break
anything. The parts of SymPy are very interlinked, and changing one
thing can lead to unexpected changes elsewhere. So we should always
run the full test suite.
What we should do is have SymPy-Bot run tests on master, and not
report test failures that exist in master. Well, actually, they
should be reported somewhere, but not on unrelated pull requests.
Aaron Meurer
> We also should split up and prioritize the various testing
> configurations. For example, if only Python 2 tests have been run,
> then Python 3 tests should be prioritized. We could also go for even
> more fancy things as well.
Could those strategies work together or not? How will we switch
between strategies? Will that require a restart?
I guess if the bot could daemonize itself ("./sympy-bot work"), then
it could keep track of the config and reload it each time it
changes.
And also, I forgot about logs! The bot should be able to log its
actions while working as a daemon, so the admin could keep track of
its state.
If you do prioritizing (which is a good thing if it can be done), consider
giving tests with older versions of Python priority. For the code where
the Python version matters, this has a higher chance of detecting bugs.
On skipping: The bot could prioritize the tests on changed code, running
the full tests with a lower priority.
I'm not sure that this would be useful though. I see two use cases for
the bot:
a) Keep a record about which pull requests pass the smoke test of not
triggering an error. For these, the full test suite needs to be run:
bin/test, bin/test --slow, bin/doctest. If possible, for all Python
versions that we support.
b) For those developers with a machine that's too weak to run the full
test suite routinely, do that testing in the background. If they want to
run just the tests for the code they're hacking, they could run them
locally.
> I guess if the bot could daemonize itself ("./sympy-bot work"), then it
> could keep track of the config and reload it each time it changes.
That's a standard technique on Unixoid systems. I bet there's a Perl
module for this on CPAN, though a Python module would probably be
better. (Rolling your own is probably not a good idea; catching signals
is nontrivial.)
> And also, I forgot about logs! The bot should be able to log its actions
> while working as a daemon, so the admin could keep track of its state.
You'll want some log rotation scheme to keep log size under control.
Again, use a library if at all possible; it's not hard, but there are
lots of details to attend to.
>> What we should do is have SymPy-Bot run tests on master, and not
>> report test failures that exist in master. Well, actually, they
>> should be reported somewhere, but not on unrelated pull requests.
> So, bot should somehow know about failed tests in master.
Probably by running the tests on master and keeping a hash of the
results, indexed by test name.
And complain only about those things that diverge from master.
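
A rough sketch of that comparison, assuming results are kept as a dict from test name to "passed" or "failed". The function name and the test file names are made up for illustration.

```python
def split_failures(master_results, branch_results):
    """Separate branch failures into new ones and ones already on master.

    Both arguments map a test name to "passed" or "failed".
    """
    new, known = [], []
    for name in sorted(branch_results):
        if branch_results[name] != "failed":
            continue
        if master_results.get(name) == "failed":
            known.append(name)  # also fails on master, don't blame the pull
        else:
            new.append(name)  # diverges from master, report it
    return new, known


# Made-up results for illustration.
master = {"test_basic.py": "passed", "test_solvers.py": "failed"}
branch = {"test_basic.py": "failed", "test_solvers.py": "failed"}
new, known = split_failures(master, branch)
print("report:", new)
print("failed in master:", known)
```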
> I suppose that the bot could ask the user about testing the master
> branch, and if the user agrees, it runs the tests on master and then
> runs the tests on the specific branch.
I'd want to keep the web interface out of the standard way of using the
bot. By default, I want to push to my pull requests, and get a report
from the bot later without any additional steps. At least that's what
I'd find valuable in such a testing bot from my perspective, YMMV :-)
The web interface may be useful for one-off activity:
- Special-cased tests (but running them locally may be easier).
- Configuring bot settings for my pull requests, maybe at the levels of
  "github user", "user's repository", "user's branch", "user's pull
  request". (Find a way for bot to authenticate a user against github
  without seeing their passwords or other secret information. SSH
  authentication might work. github could provide an API for this.)
> If the user's tests have the same failures as master, then mark these
> failures as "Failed in master".
>
> Also, we could use yellow to color these failures.
>
> So the "test report page" in the web UI will have a table with:
> -------------------------------------------
> | test_name.py | Failed in master |
> -------------------------------------------
> Coloring the table rows could be done simply with jQuery; I have already
> worked with that library some time ago.
No Javascript if you don't need fancy interaction stuff!
Just to color the output, a CSS style is fully sufficient.
I have Javascript off by default. Turning JS on requires that I trust
(a) the browser not to have security holes, something that's patently
untrue in the days of zero-day exploits; (b) the owner of the web site
not to plan on infecting my machine with malware, which is usually not a
problem; (c) the owner of the web site to be competent enough to fend off
any and all hacking attempts, something that, again, is patently untrue
if even the Debian guys had successful break-ins.
>>> What we should do is have SymPy-Bot run tests on master, and not
>>> report test failures that exist in master. Well, actually, they
>>> should be reported somewhere, but not on unrelated pull requests.
>> So, bot should somehow know about failed tests in master.
>
> Probably by running the tests on master and keeping a hash of the
> results, indexed by test name.
> And complain only about those things that diverge from master.
Where will we store it? I already suggested using SQLite for that.
>
> > I suppose that
>> bot could ask user about testing master branch, and if user agrees, then
>> it run tests in master and then will try run tests in specific branch.
>
> I'd want to keep the web interface out of the standard way of using
> the bot. By default, I want to push to my pull requests, and get a
> report from the bot later without any additional steps. At least
> that's what I'd find valuable in such a testing bot from my
> perspective, YMMV :-)
>
No, I meant the case where the bot asks the user about checking the
master repo before going to the background ("./sympy-bot work"). So if
you just want to review your pull request, it won't ask you about that.
> The web interface may be useful for one-off activity.
> Special-cased tests (but running them locally may be easier).
Do you mean that the bot could run only some part of the tests? That
could be good for one-off use, of course, but I'm not sure anyone will
want it when the bot is daemonized.
> Configuring bot settings my pull requests, maybe at the levels of
> "github user", "user's repository", "user's branch", "user's pull
> request". (Find a way for bot to authenticate a user against github
> without seeing their passwords or other secret information. SSH
> authentication might work. github could provide an API for this.)
Hm, your words give me an interesting idea. As you know, any user can
currently upload test results to reviews.sympy.org. That's not good,
because anyone could use it for spamming or other bad things. But in
the case where a user doesn't have a github account or just wants to
stay anonymous, SymPy bot could paste the results to a public paste
service like http://paste.pocoo.org/ or http://pastebin.com/
And also, the bot should inform the user about that, like "Your results
were pasted to http://example.com/paste. To send your review to
reviews.sympy.org and the github pull-request discussion, please
authenticate yourself."
So, until you specify your credentials for access to github, the bot
will paste results to a public paste service.
I also thought about authentication with a Google Account or OAuth, and
decided that it would be useless, because the bot will be more closely
integrated with github and will support additional features, e.g.
posting results in the discussion.
>> If user tests have same failures as master then mark this failures like
>> "Failed in master".
>>
>> Also we could use yellow for coloring thees failures
>>
>> So "test report page" in web-ui wil have table with:
>> -------------------------------------------
>> | test_name.py | Failed in master |
>> -------------------------------------------
>> Coloring table lines could be simply done wth jquery, I already working
>> with that lib some time ago.
>
> No Javascript if you don't need fancy interaction stuff!
> Just to color the output, a CSS style is fully sufficient.
>
Yep, but currently the main page of review.sympy.org uses jQuery to
draw that fancy window with the search box. And I also want to make a
small window with information about the test environment, which will
appear when the user clicks.
> I have Javascript off by default. Turning JS on requires that I trust
> (a) the browser not to have security holes, something that's patently
> untrue in the days of zero-day exploits; (b) the owner of the web site
> not to plan on infecting my machine with malware, which is usually not
> a problem; (c) the owner of the web site to be competent enough to
> fend of any and all hacking attempts, something that, again, is
> patently untrue if even the Debian guys had successful break-ins.
>
Why don't you just install NoScript for that? I already tried some
tricks against GAE, but without success:
http://reviews.sympy.org/report/agZzeW1weTNyDAsSBFRhc2sY14MRDA ;-)
The signal module in the standard library is operating system
independent (see http://docs.python.org/library/signal.html).
> Actually, I don't know how to reload the config on Windows without a
> restart. Any suggestions?
Just read in the file again. I don't see what the issue is.
>
>>
>>
>>> And also, I forgot about logs! Bot should be able to log your actions
>>> while working as daemon, so admin could keep track on it state.
>>
>>
>> You'll want some log rotation scheme to keep log size under control.
>> Again, use a library if at all possible, it's not hard but lots of details
>> to attend to.
>
> I'm not sure that's a good idea, because e.g. on *nix-like OSes you could
> use a rather flexible tool called "logrotate", which can split logs into
> several files and archive them, so implementing this functionality in SymPy
> bot would look like reinventing the bicycle. The admin should keep track of
> the log files himself.
There are a ton of logging modules for Python, including one in the
standard library. This is definitely a place where we don't need to
reinvent the wheel.
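
For instance, the standard library's logging.handlers.RotatingFileHandler already does size-based rotation on every platform. The following is only a sketch: the logger name, the size limit, the backup count and the make_logger helper are assumptions, not anything sympy-bot currently has.

```python
import logging
import logging.handlers


def make_logger(path, max_bytes=1000000, backup_count=5):
    """Return a logger that rotates `path` once it grows past `max_bytes`.

    Up to `backup_count` old files are kept as path.1, path.2, and so on.
    """
    logger = logging.getLogger("sympy-bot")  # assumed logger name
    logger.setLevel(logging.INFO)
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backup_count
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


# Usage (hypothetical file name):
#     log = make_logger("sympy-bot.log")
#     log.info("started testing pull request")
```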
As I noted on IRC, I think this is a bad idea. pastehtml is
unreliable. You don't know this because you weren't around when we
used it, but sometimes the report would just not upload, or sometimes
instead of the report it would just give spam. The site went down a
lot too, as I seem to remember.
I don't see why we can't require github authentication. Anyone who
uses sympy-bot should have a github account. This is required to post
a comment on the pull request. And not requiring this will make it
easier to send spam or false reports.
I agree that just colors can be done in CSS, but if you turn off
javascript in your browser, you can't expect any modern webpage to
work.
Aaron Meurer
This is grossly untrue.
From a security point of view, it hasn't been necessary, but that could
easily change if someone decides to take advantage of it.
It could be useful to have authentication to store some user
information on the server (depending on what you're doing, of course).
I think it would be best to avoid requiring additional authentication
by default unless it becomes necessary, as that is just a further
annoyance to people who want to use SymPy bot.
Aaron Meurer
I already suggested using SQLite to store information about pulls,
which will help the bot analyze previous pulls and build a prioritization
strategy.
I guess we could add an option to the config like "test_master = 30",
which will test the master branch every 30 minutes and store information
about failed tests in the database, so the bot will always know about
these tests.
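
Something along these lines, using the sqlite3 module from the standard library. It is only a sketch: the table layout, the function names and the way the "test_master" interval is checked are assumptions made for illustration.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and make sure the table for master failures exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS master_failures ("
        "test_name TEXT PRIMARY KEY, recorded_at TEXT)"
    )
    return conn


def record_master_failures(conn, failed_tests, recorded_at):
    """Replace the stored master failures with those of the latest run."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM master_failures")
        conn.executemany(
            "INSERT INTO master_failures (test_name, recorded_at) VALUES (?, ?)",
            [(name, recorded_at) for name in sorted(set(failed_tests))],
        )


def known_master_failures(conn):
    """Return the names of the tests that failed in the latest master run."""
    rows = conn.execute(
        "SELECT test_name FROM master_failures ORDER BY test_name"
    )
    return [row[0] for row in rows]


def master_run_due(last_run_ts, now_ts, test_master=30):
    """True if master was never tested or `test_master` minutes have passed."""
    return last_run_ts is None or now_ts - last_run_ts >= test_master * 60
```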
Good luck :-)
> There are a ton of logging modules for Python, including one in the
> standard library. This is definitely a place where we don't need to
> reinvent the wheel.
>
> I just meant that the admin could do log rotation with third-party tools
> (like logrotate). It's usual practice in system administration.
Log rotation would definitely be a good idea.
Depending on a service that is not available on Windows is a no-go, so
logrotate itself is not possible. I agree that something with similar
functionality would be a good idea, provided that logging is an issue.
@Aaron: Are external dependencies okay for Bot?
Probably it's best to just store the information in the app engine.
>
> I already suggested using SQLite to store information about pulls, which
> will help the bot analyze previous pulls and build a prioritization strategy.
> I guess we could add an option to the config like "test_master = 30", which
> will test the master branch every 30 minutes and store information about
> failed tests in the database, so the bot will always know about these tests.
Again, store it in the app engine. The basic functionality will be
./sympy-bot review master, which will review master and upload it. I
suppose this could be incorporated into sympy-bot work, as a request
to be served out.
When the bot tries to upload test with failures to the app-engine, it
will first check against the app engine for the latest master tests,
and compare against those.
Aaron Meurer
Yeah, though the ability to just download and run the bot is nice.
But if we need it for some functionality, it's fine. The bot is a
development tool, and dependencies are fine for development tools (as
long as they are reasonable and truly necessary).
Aaron Meurer
Given Python's dynamic nature, it should be possible to detect whether
the logging library is available, and to log to stdout if it isn't.
Running the bot interactively doesn't require logging anyway. I.e. bot
could log to stderr if it detects that stdout is a console. (Is it
possible to detect whether stdout is a console? There's a system call
for that on POSIX systems, but I don't know if that's (a) available in
Python and (b) present on Windows.)
It's possible in Python [0]:

    import sys

    if sys.stdout.isatty():
        pass  # You're running in a real terminal
    else:
        pass  # You're being piped or redirected

And in [1] (yeah, it's old) an Apparently Knowledgeable Guy says:
fileobject.isatty() exists on all platforms.
which probably means that we have a solution. I don't really have an
opportunity to test it on Windows now though.
Sergiu
[1] http://mail.python.org/pipermail/python-dev/2003-March/034131.html
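
Putting the two ideas together, a sketch of how the bot might pick its log destination: stderr when stdout is a console, a log file otherwise. The pick_handler name and the log file name are hypothetical.

```python
import logging
import sys


def pick_handler(log_path, stream=None):
    """Log to stderr when running interactively, to a file otherwise.

    `stream` is the stream checked for being a console; it defaults to
    sys.stdout.
    """
    stream = sys.stdout if stream is None else stream
    if stream.isatty():
        return logging.StreamHandler(sys.stderr)
    # delay=True postpones opening the file until the first message.
    return logging.FileHandler(log_path, delay=True)


# Usage (hypothetical file name):
#     logging.getLogger("sympy-bot").addHandler(pick_handler("sympy-bot.log"))
```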
+1
"Record" in the sense of "keep it somewhere so people can pick it up".
Leaving a comment on the pull request at github would satisfy that.
> why do some specific pulls need to be run with "bin/test, bin/test
> --slow, bin/doctest", and how will the bot guess which Pythons the
> system has?
A pull request is supposed to pass all three tests in all supported
versions of Python before it can be pulled.
In practice, the full suite isn't run, usually for lack of time or
possibly due to oversight. This can end in unnecessary extra rounds of
review because some test combination isn't run until the day before the
request would have been pulled, delaying the whole process by a week or two.
I haven't contributed much yet, and I already had two experiences of
that kind. It's frustrating - and we want to reduce sources of
frustration, of course.
>> b) For those developers with a machine that's too weak to run the full
>> test suite routinely, do that testing in the background. If they want
>> to run just the tests for the code they're hacking, they could run
>> them locally.
> If a developer just runs some specific test(s) for his own code on a weak
> machine, then why does he need a daemonized bot and prioritization?
They can push to their pull request and have bot run the tests while
they continue work locally.
> Actually, I don't know how to reload the config on Windows without a
> restart. Any suggestions?
Actually, that's an easy one and you don't need signals.
Here's how:
Assuming that bot is mainly useful for the long-running test suites, the
milliseconds to read and parse a configuration file are negligible. So
just reread the configuration before deciding what test to run next.
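
In code, that could look roughly like this. It is a sketch that assumes an INI-style config file read with configparser; the section and option names and the work/read_config helpers are made up.

```python
import configparser


def read_config(path):
    """Parse the configuration file from scratch."""
    parser = configparser.ConfigParser()
    parser.read(path)
    return parser


def work(path, jobs, run_job):
    """Run each job, re-reading the configuration before every one."""
    for job in jobs:
        config = read_config(path)  # picks up edits made since the last job
        run_job(job, config)


# Usage (hypothetical names):
#     work("sympy-bot.ini", pull_requests, run_tests)
```

No signals and no file watching are needed, so this behaves the same on Windows and on Unixoid systems.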
> I'm not sure that's a good idea, because e.g. on *nix-like OSes you could
> use a rather flexible tool called "logrotate", which can split logs into
> several files and archive them, so implementing this functionality in
> SymPy bot would look like reinventing the bicycle. The admin should keep
> track of the log files himself.
Logrotate is designed for system logs. It can be made to work for user
logs, but it requires an extra configuration file that must be given on
the command line.
A lot depends on the environment bot is supposed to run in - just
github? On a sponsored server? Windows server or Linux/*BSD/otherUnixoid
server? (BTW I could sponsor a bot server. I'm administering an
eight-core machine; I'd let the bot run whenever system load is below
50%, which is most of the time.)
On a non-Linux server, you might not even have logrotate. A Python
library requires the least amount of external dependencies, which would
be helpful if bot is supposed to run in varying environments.
Heck, even if it is supposed to run on a single Linux machine where we
could verify that logrotate exists and is configured the way we need it,
we'd still want to be independent of that - after all, we might want to
move the bot elsewhere.
>>>> What we should do is have SymPy-Bot run tests on master, and not
>>>> report test failures that exist in master. Well, actually, they
>>>> should be reported somewhere, but not on unrelated pull requests.
>>> So, bot should somehow know about failed tests in master.
>>
>> Probably by running the tests on master and keeping a hash of the
>> results, indexed by test name.
>> And complain only about those things that diverge from master.
> Where will we store it? I already suggested to use SQLite for that.
That's one possibility.
I'd consider serializing the data out to a text file as an alternative.
Pros:
+ No external dependency.
Cons:
- We can't do long-term data series for statistics. (Just a nice-to-have
from my perspective, but anyway.)
Non-issues:
o RAM usage: negligible, we only need to store the test results from
master plus configuration data. All other test results go to github as
comments.
o Access speed: Python hashes are faster than even SQLite.
o Ad-hoc SQL queries: Irrelevant, all data fits into a few screen pages;
a text editor search would be faster than writing SQL.
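
To illustrate the text-file alternative, here is a sketch that keeps the master results in a JSON file and loads them back into a plain dict. JSON is just one possible text format, and the function names and file layout are assumptions.

```python
import json


def save_results(path, results):
    """Write the test-name -> outcome mapping as human-readable JSON."""
    with open(path, "w") as handle:
        json.dump(results, handle, indent=2, sort_keys=True)


def load_results(path):
    """Read the mapping back; an absent file means no master run yet."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {}


# Usage (hypothetical file name):
#     save_results("master_results.json", {"test_basic.py": "failed"})
#     master = load_results("master_results.json")
```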
>> > I suppose that
>>> bot could ask user about testing master branch, and if user agrees, then
>>> it run tests in master and then will try run tests in specific branch.
>>
>> I'd want to keep the web interface out of the standard way of using
>> the bot. By default, I want to push to my pull requests, and get a
>> report from the bot later without any additional steps. At least
>> that's what I'd find valuable in such a testing bot from my
>> perspective, YMMV :-)
>>
> No, I meant the case where the bot asks the user about checking the
> master repo before going to the background ("./sympy-bot work").
I'd want something that requires the bare minimum of configuration.
Preferably configuration that the user has to do anyway. Possibly along
the lines of "if bin/test or bin/doctest detect that they are running in
a git branch that's configured to push to github, publish the results".
Or maybe give the test scripts a command-line option that makes them
publish the test results.
You're thinking along the lines of lots of user interaction. Testing
needs to be as automatic as possible; having to wait until a test run
finishes is disruptive enough, requiring people to visit a website, or
anything else will just make them not use bot.
> So if you just want to review
> your pull request, it won't ask you about that.
I don't need bot to do that.
I do bin/doctest && bin/test && bin/test --slow
Possibly in a separate console window so I can do other stuff (but
that's limited, I can't reasonably git checkout, I'd have to prepare
another workdir for that).
>> The web interface may be useful for one-off activity.
>> Special-cased tests (but running them locally may be easier).
> Do you mean that the bot could run only some part of the tests? That
> could be good for one-off use, of course, but I'm not sure anyone will
> want it when the bot is daemonized.
I still don't understand what a daemonized bot would be good for.
I can't do anything with my workdir while a test is running anyway.
And whatever you do for bot, it would be as useful for the test scripts
themselves. Somehow I fail to see distinct use cases for "local bot" and
"test script" (other than a results upload).
For a results upload, I can
- add command line switches to the test scripts to make them upload
- have the test scripts generate output files that can be uploaded with
a separate script
- create a separate bin/logtest script that will run all tests and pipe
the output to an upload
No "bot" there.
>> Configuring bot settings my pull requests, maybe at the levels of
>> "github user", "user's repository", "user's branch", "user's pull
>> request". (Find a way for bot to authenticate a user against github
>> without seeing their passwords or other secret information. SSH
>> authentication might work. github could provide an API for this.)
> Hm, your words give me an interesting idea. As you know, any user can
> currently upload test results to reviews.sympy.org. That's not good,
> because anyone could use it for spamming or other bad things. But in
> the case where a user doesn't have a github account or just wants to
> stay anonymous, SymPy bot could paste the results to a public paste
> service like http://paste.pocoo.org/ or http://pastebin.com/
They still need to announce that a new result is available. So a griefer
could use the announcement service for spamming.
(Commercial spamming to the devs of a relatively small project does not
pay off, so we don't need to worry about UCE, only about griefers.)
> And also, the bot should inform the user about that, like "Your results
> were pasted to http://example.com/paste. To send your review to
> reviews.sympy.org and the github pull-request discussion, please
> authenticate yourself."
I'd avoid additional user interaction like the plague.
People want to fire and forget. If you want them to authenticate, they
will not fire, just forget.
The most that people are willing to endure is some additional
configuration step that needs to be done once.
>>> If user tests have same failures as master then mark this failures like
>>> "Failed in master".
>>>
>>> Also we could use yellow for coloring thees failures
>>>
>>> So "test report page" in web-ui wil have table with:
>>> -------------------------------------------
>>> | test_name.py | Failed in master |
>>> -------------------------------------------
>>> Coloring table lines could be simply done wth jquery, I already working
>>> with that lib some time ago.
>>
>> No Javascript if you don't need fancy interaction stuff!
>> Just to color the output, a CSS style is fully sufficient.
>>
> Yep, but currently the main page of review.sympy.org
That server doesn't exist...
> uses jQuery to
> draw that fancy window with the search box.
Sorry, but it wouldn't be too hard to slip malware past the review
process. Just bury it in some JS library and declare it as an update to
the website, and we wouldn't notice.
That's why I don't allow Javascript for sympy.org, and would recommend
everybody to do likewise. (I do allow it for github, under the
assumption that they have fulltime administrators who are dedicated to
detecting and preventing such attacks. Relying on the same assumption on
sympy.org would be misguided in my eyes. I may be paranoid, but in these
times of security holes and zero-day exploits, being rational would
probably require a far higher degree of paranoia.)
> And I also want to make a small
> window with information about the test environment, which will appear
> when the user clicks.
I will never ever see it.
Use a CSS mouseover for that.
Don't try to solve everything via Javascript just because JQuery is such
a shiny tool.
Just as you don't hammer everything into place just because you happen
to have such a shiny hammer.
>> I have Javascript off by default. Turning JS on requires that I trust
>> (a) the browser not to have security holes, something that's patently
>> untrue in the days of zero-day exploits; (b) the owner of the web site
>> not to plan on infecting my machine with malware, which is usually not
>> a problem; (c) the owner of the web site to be competent enough to
>> fend of any and all hacking attempts, something that, again, is
>> patently untrue if even the Debian guys had successful break-ins.
>>
> Why don't you just install NoScript for that?
I have it installed.
But with JS switched off on sympy.org, I won't ever see all the fancy
stuff you're planning.
> I already tried some
> tricks against GAE, but without success:
> http://reviews.sympy.org/report/agZzeW1weTNyDAsSBFRhc2sY14MRDA ;-)
Just because you failed doesn't mean that others will.
On 12.04.2012 03:19, Joachim Durchholz wrote:
> On 03.04.2012 14:34, Mayorov Michael wrote:
>>> I'm not sure that this would be useful though. I see two use cases for
>>> the bot:
>>> a) Keep a record about which pull requests pass the smoke test of not
>>> triggering an error. For these, the full test suite needs to be run:
>>> bin/test, bin/test --slow, bin/doctest. If possible, for all Python
>>> versions that we support.
>> I am not sure that I understand you. What do you mean by record,
>
> "Record" in the sense of "keep it somewhere so people can pick it up".
> Leaving a comment on the pull request at github would satisfy that.
>
>> why do some specific pulls need to be run with "bin/test, bin/test
>> --slow, bin/doctest", and how will the bot guess which Pythons the
>> system has?
>
> A pull request is supposed to pass all three tests in all supported
> versions of Python before it can be pulled.
>
> In practice, the full suite isn't run, usually for lack of time or
> possibly due to oversight. This can end in unnecessary extra rounds of
> review because some test combination isn't run until the day before
> the request would have been pulled, delaying the whole process by a
> week or two.
> I haven't contributed much yet, and I already had two experiences of
> that kind. It's frustrating - and we want to reduce sources of
> frustration, of course.
Then we probably want to add a switch --test-cases [ all | doctest |
usual (./bin/test by default) ] to the bot.
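A switch like the one proposed above could be sketched with argparse. This is only an illustration: the option name `--test-cases` and its three values come from the proposal, while `TEST_COMMANDS`, `build_parser` and `commands_for` are hypothetical names and not part of the existing sympy-bot code.

```python
import argparse

# Hypothetical mapping from a --test-cases value to the commands to run.
# The command lines themselves are the ones mentioned in this thread.
TEST_COMMANDS = {
    "usual": ["bin/test"],
    "doctest": ["bin/doctest"],
    "all": ["bin/test", "bin/test --slow", "bin/doctest"],
}


def build_parser():
    """Build a parser that knows only the proposed --test-cases switch."""
    parser = argparse.ArgumentParser(prog="sympy-bot")
    parser.add_argument(
        "--test-cases",
        choices=sorted(TEST_COMMANDS),
        default="usual",
        help="which test scripts to run (default: usual, i.e. bin/test)",
    )
    return parser


def commands_for(argv):
    """Return the list of test commands selected by the given arguments."""
    args = build_parser().parse_args(argv)
    return TEST_COMMANDS[args.test_cases]
```

Calling `commands_for(["--test-cases", "all"])` would select all three commands, and an empty argument list falls back to `bin/test`.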
>
>>> b) For those developers with a machine that's too weak to run the full
>>> test suite routinely, do that testing in the background. If they want
>>> to run just the tests for the code they're hacking, they could run
>>> them locally.
>> If a developer just runs some specific test(s) for his own code on a weak
>> machine, then why does he need a daemonized bot and prioritization?
>
> They can push to their pull request and have bot run the tests while
> they continue work locally.
>
>> Actually, I don't know how to reload the config on Windows without a
>> restart. Any suggestions?
>
> Actually, that's an easy one and you don't need signals.
> Here's how:
> Assuming that bot is mainly useful for the long-running test suites,
> the milliseconds to read and parse a configuration file are
> negligible. So just reread the configuration before deciding what test
> to run next.
Yeah, that's a good idea.
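The "just reread the configuration before the next test" approach could look roughly like the sketch below. It assumes an INI-style file with a `[bot]` section; the section name, `load_config` and `run_queue` are made up for illustration and do not describe the bot's actual config format.

```python
import configparser


def load_config(path):
    """Read the configuration file afresh and return it as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read(path)  # a missing file is silently skipped
    if parser.has_section("bot"):
        return dict(parser["bot"])
    return {}


def run_queue(config_path, pending, run_test):
    """Run queued jobs, rereading the config before each one.

    No signal handling is needed: edits to the file take effect at the
    next job, which also works on Windows.
    """
    results = []
    for job in pending:
        config = load_config(config_path)  # picks up edits made meanwhile
        results.append(run_test(job, config))
    return results
```

Since a test run takes minutes, the cost of parsing the file once per job is negligible, which is the point made above.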
>
>> I'm not sure that's a good idea, because e.g. on *nix-like OSes you could
>> use a rather flexible tool called "logrotate", which allows splitting logs
>> into several files and archiving them, so implementing this functionality
>> in SymPy bot would look like reinventing the wheel. The admin should keep
>> track of the log files himself.
>
> Logrotate is designed for system logs. It can be made to work for user
> logs, but it requires an extra configuration file that must be given
> on the command line.
>
> A lot depends on the environment bot is supposed to run in - just
> github? On a sponsored server? Windows server or
> Linux/*BSD/otherUnixoid server? (BTW I could sponsor a bot server. I'm
> administering an eight-core machine; I'd let the bot run whenever
> system load is below 50%, which is most of the time.)
> On a non-Linux server, you might not even have logrotate. A Python
> library requires the least amount of external dependencies, which
> would be helpful if bot is supposed to run in varying environments.
> Heck, even if it is supposed to run on a single Linux machine where we
> could verify that logrotate exists and is configured the way we need
> it, we'd still want to be independent of that - after all, we might
> want to move the bot elsewhere.
I'm not sure that many of the project's developers use Windows, but I agree
with you: a universal log rotation scheme would be useful. So what could we
use for it? The first thing I found on Google was http://pypi.python.org/pypi/logrotate
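One option that needs no external dependency is the rotation support in Python's standard `logging.handlers` module, which behaves the same on Linux, *BSD and Windows. The sketch below is an assumption about how the bot might use it; the logger name, size limit and backup count are arbitrary illustrative values.

```python
import logging
import logging.handlers


def make_logger(log_path, max_bytes=1000000, backups=5):
    """Return a logger that rotates log_path once it exceeds max_bytes.

    Old logs are kept as log_path.1, log_path.2, ... up to `backups` files.
    Call this once per process; repeated calls would add duplicate handlers.
    """
    logger = logging.getLogger("sympy-bot")
    logger.setLevel(logging.INFO)
    handler = logging.handlers.RotatingFileHandler(
        log_path, maxBytes=max_bytes, backupCount=backups
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

With this, `make_logger("bot.log").info("finished pull request")` appends to `bot.log` and rolls it over automatically, independent of any system-level logrotate configuration.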
>
>
>>>>> What we should do is have SymPy-Bot run tests on master, and not
>>>>> report test failures that exist in master. Well, actually, they
>>>>> should be reported somewhere, but not on unrelated pull requests.
>>>> So, bot should somehow know about failed tests in master.
>>>
>>> Probably by running the tests on master and keeping a hash of the
>>> results, indexed by test name.
>>> And complain only about those things that diverge from master.
>> Where will we store it? I already suggested to use SQLite for that.
>
> That's one possibility.
>
> I'd consider serializing the data out to a text file as an alternative.
>
> Pros:
> + No external dependency.
> Cons:
> - We can't do long-term data series for statistics. (Just a
> nice-to-have from my perspective, but anyway.)
> Non-issues:
> o RAM usage: negligible, we only need to store the test results from
> master plus configuration data. All other test results go to github as
> comments.
> o Access speed: Python hashes are faster than even SQLite.
> o Ad-hoc SQL queries: Irrelevant, all data fits into a few screen
> pages; a text editor search would be faster than writing SQL.
@asmeurer already suggested that I use GAE models; I guess that would be more
relevant.
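For comparison, the text-file alternative discussed above (master's results kept as a hash indexed by test name, with complaints only about what diverges) could be sketched as follows. The JSON format, the "pass"/"fail" outcome strings and all function names are assumptions made for this example, not existing bot code.

```python
import json


def save_baseline(path, results):
    """Store master's results, a dict of test name -> outcome, as JSON text."""
    with open(path, "w") as f:
        json.dump(results, f, indent=2, sort_keys=True)


def load_baseline(path):
    """Load the dict previously written by save_baseline."""
    with open(path) as f:
        return json.load(f)


def new_failures(baseline, branch_results):
    """Return non-passing branch outcomes that differ from master's.

    A test that already fails the same way in master is not reported,
    so unrelated pull requests are not blamed for it.
    """
    return {
        name: outcome
        for name, outcome in branch_results.items()
        if outcome != "pass" and baseline.get(name) != outcome
    }
```

The same comparison would work unchanged if the baseline dict were loaded from SQLite or a GAE model instead of a file.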
>
>>>> I suppose that the bot could ask the user about testing the master
>>>> branch, and if the user agrees, it runs the tests in master and then
>>>> tries to run the tests in the specific branch.
>>>
>>> I'd want to keep the web interface out of the standard way of using
>>> the bot. By default, I want to push to my pull requests, and get a
>>> report from the bot later without any additional steps. At least
>>> that's what I'd find valuable in such a testing bot from my
>>> perspective, YMMV :-)
>>>
>> No, I meant the case when the bot asks the user about checking the master
>> repo before going to the background ("./sympy-bot work").
>
> I'd want something that requires the bare minimum of configuration.
> Preferably configuration that the user has to do anyway. Possibly
> along the lines of "if bin/test or bin/doctest detect that they are
> running in a git branch that's configured to push to github, publish
> the results".
> Or maybe give the test scripts a command-line option that makes them
> publish the test results.
>
> You're thinking along the lines of lots of user interaction. Testing
> needs to be as automatic as possible; having to wait until a test run
> finishes is disruptive enough, requiring people to visit a website, or
> anything else will just make them not use bot.
Interrogating the developer won't be required :) The config file is a good
place for storing all the answers.
>
>> So if you just want to review
>> your pull, it won't ask you about that.
>
> I don't need bot to do that.
> I do bin/doctest && bin/test && bin/test --slow
> Possibly in a separate console window so I can do other stuff (but
> that's limited, I can't reasonably git checkout, I'd have to prepare
> another workdir for that).
>
>>> The web interface may be useful for one-off activity.
>>> Special-cased tests (but running them locally may be easier).
>> Do you mean that the bot could run only some part of the tests? That could
>> be good for one-off use of course, but I'm not sure anyone will want it
>> when the bot is daemonized.
>
> I still don't understand what a daemonized bot would be good for.
Maybe daemonizing really isn't a good idea. Then I guess it would be better if
the bot could run in a terminal and print some useful information, so you
could kill it with Ctrl+C at any time.
> I can't do anything with my workdir while a test is running anyway.
By default, each time the bot picks up a new pull, it creates a new workdir in
/tmp (or maybe in C:\WINDOWS\TEMP).
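A per-pull scratch directory of this kind can be created portably with the standard `tempfile` module, which picks the platform's temp location by itself. The helper below is a hypothetical sketch; the function name and the directory prefix are invented for illustration.

```python
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def pull_workdir(pull_number):
    """Create a fresh scratch directory for one pull request, then remove it."""
    path = tempfile.mkdtemp(prefix="sympy-bot-pull-%d-" % pull_number)
    try:
        yield path  # the caller clones and tests inside this directory
    finally:
        shutil.rmtree(path, ignore_errors=True)
```

Because every run gets its own directory, the developer's own checkout stays free while the tests are running.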
> And whatever you do for bot, it would be as useful for the test
> scripts themselves. Somehow I fail to see distinct use cases for
> "local bot" and "test script" (other than a results upload).
>
> For a results upload, I can
> - add command line switches to the test scripts to make them upload
> - have the test scripts generate output files that can be uploaded
> with a separate script
> - create a separate bin/logtest script that will run all tests and
> pipe the output to an upload
>
> No "bot" there.
OK, but where will you publish your results, on a public paste? Then how
will other users remember the links to these pastes? Of course you could
manually collect and paste all the information you want, and it would
work, but it's routine and could be automated.
>>> Configuring bot settings for my pull requests, maybe at the levels of
>>> "github user", "user's repository", "user's branch", "user's pull
>>> request". (Find a way for bot to authenticate a user against github
>>> without seeing their passwords or other secret information. SSH
>>> authentication might work. github could provide an API for this.)
>> Hm, your words give me an interesting idea. As you know, any user can
>> currently upload test results to reviews.sympy.org. That's not good,
>> because anyone could use it for spamming or other bad things. But in
>> case a user doesn't have an account on github or just wants to stay
>> anonymous, SymPy bot could just paste the results to a public paste
>> service like http://paste.pocoo.org/ or http://pastebin.com/
>
> They still need to announce that a new result is available. So a
> griefer could use the announcement service for spamming.
> (Commercial spamming to the devs of a relatively small project does
> not pay off, so we don't need to worry about UCE, only about griefers.)
Griefers could exceed the sympy-bot quota by adding large messages. I plan
to add github auth to reviews.sympy.org to prevent this. It won't
require additional interaction between the developer and the server: SymPy
bot will authenticate the developer with the credentials specified in the
config.
>
>> And also, bot should inform user about that, like "Your results were
>> pasted in http://example.com/paste , for sending your review on
>> reviews.sympy.org and github pull-request discussion, please
>> authenticate yourself"
>
> I'd avoid additional user interaction like the plague.
> People want to fire and forget. If you want them to authenticate, they
> will not fire, just forget.
> The most that people are willing to endure is some additional
> configuration step that needs to be done once.
Mm, the point of my thought was something like "I want to share test results
with others, but I don't want to add a github token to the config". @asmeurer
pointed out to me that most developers have a github account, so that
functionality would be useless. As before, developers will set up the config
only once and use it every time.
I didn't say that I want to have JS in each place I see. And btw, I keep
no-script turned on by default too :)
>
>
>>> I have Javascript off by default. Turning JS on requires that I trust
>>> (a) the browser not to have security holes, something that's patently
>>> untrue in the days of zero-day exploits; (b) the owner of the web site
>>> not to plan on infecting my machine with malware, which is usually not
>>> a problem; (c) the owner of the web site to be competent enough to
>>> fend off any and all hacking attempts, something that, again, is
>>> patently untrue if even the Debian guys had successful break-ins.
>>>
>> Why don't you just install NoScript for that?
>
> I have it installed.
> But with JS switched off on sympy.org, I won't ever see all the fancy
> stuff you're planning.
It's not a problem. I don't plan to make the review site unusable if the user
turns off JS in the browser. The main purpose of my project is to improve
usability, not beauty.
>
>> I already tried some
>> tricks against GAE, but they were useless:
>> http://reviews.sympy.org/report/agZzeW1weTNyDAsSBFRhc2sY14MRDA ;-)
>
> Just because you failed doesn't mean that others will.
>
I'm not a security specialist (at least not in web stuff), but the review site
doesn't seem to be an attractive target for JS injections attacking users'
browsers.
We already have the --testcommand flag. 99% of the time, the default
(run all tests) is what you want.
A few do. When we did Google Code In, it was somewhat painful to not
have SymPy Bot working on Windows, because quite a few of the students
were on Windows, and couldn't test their own pull requests. So
support for at least one-off testing on that platform is essential
(sympy-bot work not so much, but if it can work, then it should).
Not only that, but you can determine a lot of things just by guessing.
For example, it could easily search the PATH for all versions of
Python.
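Searching the PATH for Python interpreters could be done along the lines of the sketch below. It only matches by executable name (for example python2.7 or python3.exe); the naming pattern and the function name `find_pythons` are assumptions for illustration, and interpreters installed under other names would be missed.

```python
import os
import re

# Matches names such as python, python2.7, python3 or python3.2.exe.
PYTHON_NAME = re.compile(r"^python(\d+(\.\d+)?)?(\.exe)?$", re.IGNORECASE)


def find_pythons(path_env=None):
    """Return sorted full paths of Python executables found on the PATH."""
    if path_env is None:
        path_env = os.environ.get("PATH", "")
    found = set()
    for directory in path_env.split(os.pathsep):
        if not os.path.isdir(directory):
            continue
        try:
            names = os.listdir(directory)
        except OSError:
            continue  # unreadable directory, skip it
        for name in names:
            full = os.path.join(directory, name)
            if (PYTHON_NAME.match(name) and os.path.isfile(full)
                    and os.access(full, os.X_OK)):
                found.add(full)
    return sorted(found)
```

The bot could then run the test suite once per interpreter returned by `find_pythons()`, without the user having to list them in a config file.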
And one-off users can just enter their password on the command line.
The config file is just a convenience; it is not necessary to run.
Well I for one don't get the point of this. If you can make a better
interface by using Javascript, which is a web standard, then do it.
If you are too paranoid to enable javascript in your browser, then you
should not expect modern webpages to work.
That being said, if you can do things without javascript (for example,
with html5), then that would probably be better.
Aaron Meurer
Thanks, that's useful advice. I'll switch off my browser now.
NOT.
The only pages that really do not work, in my experience, are company pages.
And asserting that "modern" pages won't work without JS is (a) vague
(what's "modern", and are the concrete properties of a "modern" page
even desirable?), (b) untrue: even with a vague definition, WP, phpBB
and Google should be considered "modern", yet I find them working very
well without JS.
Third, in these days, paranoia is really the only rational reaction to
the threat levels of drive-by downloads. Between patchdays, Windows
systems get infected with a 1:50 to 1:400 probability, depending on
Windows version and country. New malware technology can create outbreaks
with far higher infection rates.
Javascript is the first and foremost transmission vehicle for malware,
simple as that. (Flash is on the rise, but nobody is proposing to use
flash.)
> That being said, if you can do things without javascript (for example,
> with html5), then that would probably be better.
Exactly my point.
Okay, that's what MS says.
As I did some more googling, I found http://tinyurl.com/ca4ub2l which
claims a 1:4 infection rate even in the lowest-infected countries.
I don't believe either number, actually. MS has a vested interest in
downplaying infection rates, and Norman has a vested interest in
exaggeration (and I see signs of that on both reports).
I already mentioned the auth issue: the user must be authenticated before doing
this. But anyway, if someone makes changes in master by mistake and
fails some tests, then the bot will confuse other developers.
To prevent this, we could use a "trusted users list" on the web UI side.
--
With best regards,
Mayorov Michael