I am most of the way finished with adding database storage of result
sets to multi-mechanize (I should have it completed later today). I
have a stats db where I aggregate the logs from our web servers and
application for analysis, and sending the results straight to this
database will allow me to perform some deeper analysis of our
application.
I have made the database storage configurable through the multi-
mechanize config file. The code allows for storing in the database
while still writing to the normal multi-mechanize results file. I
would be more than glad to contribute this code back if anyone is
interested in this capability; it would save you the trouble of
writing it yourself.
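To give a rough idea of the shape of it, here is a simplified sketch
(class and column names are illustrative, and I'm using sqlite3 only
for the example -- the real code is not tied to a specific database):

    # simplified sketch -- names are illustrative, not the actual patch;
    # sqlite3 is used here only as an example backend
    import csv
    import sqlite3

    class ResultWriter(object):
        def __init__(self, results_path, db_path=None):
            self.csv_writer = csv.writer(open(results_path, 'a'))
            self.conn = sqlite3.connect(db_path) if db_path else None
            if self.conn is not None:
                self.conn.execute(
                    'CREATE TABLE IF NOT EXISTS results '
                    '(elapsed REAL, epoch REAL, user_group TEXT, trans_time REAL)')

        def write(self, elapsed, epoch, user_group, trans_time):
            # always write the normal multi-mechanize results file
            self.csv_writer.writerow([elapsed, epoch, user_group, trans_time])
            # mirror the row into the database only if one is configured
            if self.conn is not None:
                self.conn.execute(
                    'INSERT INTO results VALUES (?, ?, ?, ?)',
                    (elapsed, epoch, user_group, trans_time))
                self.conn.commit()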
With much appreciation,
Brian
> The code allows for storing in the database
> while still writing to the normal multi-mechanize results file.
> would be more than glad to contribute this code back
what type of database did you use? this might be a feature that
others would want as well. perhaps you can explain a little more
about what data is stored, and then how you are using the database
afterwards. did you build any reporting?
regards,
-Corey
that would be great. I'll gladly take a look. this could be really
useful. I especially like that it's not tied to a specific database
type.
I just did a release a few mins ago, so you might want to take a look
at the new directory structure and base your code off that.
-Corey
This will make sure that the solution stays scalable instead of
injecting database latency into the test run.
Granted, database latencies will not show up in trivial tests, but
for anything beyond that you'll see an impact.
Roland
On Feb 24, 1:02 pm, Brian Knox <taote...@gmail.com> wrote:
> Up and running on it now. The new projects directory structure is
> definitely handy. I'll try to get something working your way by this
> evening, tomorrow at the latest, for the db storage.
>
> Brian
>
Roland,
that's a great point. All of the output is written asynchronously and
from a different OS process than the load generator, so doing db
writes during the test isn't a huge concern... But during large
tests, we would definitely take a hit in disk i/o and it would make
multi-mechanize less scalable.
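For context, the arrangement is roughly like this (a simplified
sketch, not the actual source):

    # simplified sketch of the async writer arrangement -- not the
    # actual multi-mechanize source
    import multiprocessing

    def writer_proc(queue, results_path):
        # runs in its own OS process, so result i/o never blocks the
        # load-generating processes
        f = open(results_path, 'a')
        for line in iter(queue.get, None):  # None is the shutdown sentinel
            f.write(line + '\n')
        f.close()

    if __name__ == '__main__':
        q = multiprocessing.Queue()
        p = multiprocessing.Process(target=writer_proc, args=(q, 'results.csv'))
        p.start()
        q.put('0.25,1267000000.0,user_group-1,0.25')  # load gens enqueue rows
        q.put(None)
        p.join()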
> when I get a little time later today I'll look at working up a version
> that loads the CSV data up after the tests are completed
Brian,
that would be great. so basically, once the test completes, the db
gets populated by parsing the results.csv file and inserting all of
the data. This can be done during the results analysis phase, so we
don't interfere with load generation by doing additional i/o.
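something like this, maybe (a rough sketch -- it assumes sqlite3 and
guesses at the results.csv column layout, so adjust to the real file):

    # post-processing sketch -- assumes sqlite3 and guesses at the
    # results.csv column layout; adjust to the real file format
    import csv
    import sqlite3

    def load_results(csv_path, db_path):
        conn = sqlite3.connect(db_path)
        conn.execute('CREATE TABLE IF NOT EXISTS results '
                     '(elapsed REAL, epoch REAL, user_group TEXT, trans_time REAL)')
        for row in csv.reader(open(csv_path)):
            conn.execute('INSERT INTO results VALUES (?, ?, ?, ?)', row[:4])
        conn.commit()
        conn.close()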
I received your initial code and will start playing with it.
thanks!
-Corey
My 2 cents: how about using memcached as an in-memory store for the
intermediate results? Then we flush it to a DB when the test run is
over. I've seen this done successfully in other projects.
Grig
since results are already written to disk as the test runs (in csv),
we don't need to store all intermediate results in memory. I think it
makes more sense to decouple the db totally from a test run and have
it just populated as a post-processing step.
however, memcached is an interesting idea and would bring up the
possibility of not doing *any* disk i/o during the test. Everything
could be saved in memory and then later flushed to a db and the
results.csv file could be generated from memcached data also (or from
the db). This would eliminate disk i/o during the run entirely, but
add some additional memory/processing overhead. I don't think we need
to make a change like this yet, though.
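just to sketch the idea (this assumes the python-memcached client and
a local memcached; the key names are invented for illustration):

    # rough sketch -- assumes the python-memcached client and a local
    # memcached instance; key names are invented for illustration
    import json
    import memcache

    mc = memcache.Client(['127.0.0.1:11211'])
    mc.add('mm_result_count', '0')  # no-op if the key already exists

    def store_result(elapsed, epoch, user_group, trans_time):
        # store each result row under a numbered key instead of writing csv
        n = mc.incr('mm_result_count')
        mc.set('mm_result:%d' % n,
               json.dumps([elapsed, epoch, user_group, trans_time]))

    def flush_results():
        # after the run: hand everything back to the db/csv writers
        count = int(mc.get('mm_result_count'))
        return [json.loads(mc.get('mm_result:%d' % i))
                for i in range(1, count + 1)]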
thanks for the input,
-Corey
BTW, unless memcached runs on your own server, you will incur network
latency again. It may be small, but small things will bite when
volume hits. Network access is not a zero-latency proposition.
On Feb 25, 8:59 am, "Grig Gheorghiu" <grig.gheorg...@gmail.com> wrote:
> I agree that simple is good. Maybe have a back-end storage plugin that would default to storing CSV flat files on disk, but could replace the storage mechanism with a database, or memcached + database, or whatever the end user wants?
>
> Grig
>
> -----Original Message-----
> From: Brian Knox <taote...@gmail.com>
> Date: Thu, 25 Feb 2010 11:48:47
> To: <multi-m...@googlegroups.com>
> Subject: Re: [multi-mech] Re: Feature: Database Storage of Results
>
> So my thought as a multi-mechanize user is, if multi-mechanize were extended
> to allow things like memcached or redis for intermediate results, I
> personally would like to see those as options and not as requirements.
>
> That being said I think it's an interesting idea, and my mind fills with
> visions of multi-mechanize client processes running across multiple servers
> all happily sending their results to a central memcache server *laughs*.
>
> I just think there is great value in the simplicity of the tool and its
> lack of infrastructure requirements.
>
> For now my personal goal is to continue with the database code, trying to
> keep it as unobtrusive as possible (making it an option that does not depend
> on additional requirements unless you are using it), and to continue sending
> updates to Corey to incorporate or use as he wishes if he likes them.
>
> Brian
I absolutely agree. I want to keep external dependencies as minimal
as possible... so everything we are talking about here would be
optional and off by default, letting you run multi-mechanize with
just a standard python setup (+ matplotlib). I see multi-mechanize
working "out of the box" just like it does now, with an optional
switch you can configure to also send results to a db.
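the switch might look something like this in a project's config.cfg
(the two database options are hypothetical, invented just to show the
idea):

    # sketch of a project's config.cfg -- the two results_database
    # options are hypothetical, invented to show the idea
    [global]
    run_time = 300
    rampup = 0
    results_ts_interval = 10
    results_database_enabled = on
    results_database = sqlite:///results.db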
> For now my personal goal is to continue with the database code, trying to
> keep it as unobtrusive as possible (making it an option that does not depend
> on additional requirements unless you are using it), and to continue sending
> updates to Corey to incorporate or use as he wishes if he likes them.
sounds good. keep 'em coming :)
-Corey
I also have some thoughts about turning it into a distributed system
that uses multiple load generating nodes. I don't wanna get too far
ahead of myself, so I haven't started working in that area yet. feel
free to start a new thread with ideas for how to get there.
-Corey