GSoC: Effortless Model Testing

Jason Ledbetter

Apr 1, 2008, 11:58:22 AM
to Django developers
I'm looking for feedback on my SoC application which I call
"Effortless Model Testing". EMT would add sample instance generation,
fuzz testing, and (if still necessary) model-level validation to
Django.

I have my ideas explained in detail on my web page:

http://sarcasticzombie.net

Anyone willing to take a look and give me feedback would be most
appreciated. I'm new to open source contribution but I'm really
excited about Django in general, so I'll gladly change anything
that'll take me a step closer to being accepted for SoC.

Thanks everyone!

-Jason L.

Malcolm Tredinnick

Apr 2, 2008, 5:00:47 PM
to django-d...@googlegroups.com

Note that model-level validation is already pretty much taken care of.
Honza Kraal has done quite a bit of work there recently, building on the
discussions a bunch of people had at PyCon. Check the recent archives of
this list for details and a pointer to the relevant ticket. So you can
remove that portion of your application, since I don't think we need yet
another design for that.

The main part of your application is something I'll think about some
more, but I'm not really sure what development effort is involved there.
Everything you've mentioned is already possible and not particularly
hard: generating test data from a Yaml-like description, or even a short
Python script, isn't too hard. Perhaps you could flesh out a bit more
what the involved work is. What pieces do you see as currently missing
and needing to be implemented?

Malcolm

--
Works better when plugged in.
http://www.pointy-stick.com/blog/

Jason Ledbetter

Apr 3, 2008, 12:32:36 PM
to Django developers
Malcolm,

Regarding model-level validation, apparently I'm bad at tracking down
information. I tried on a couple of occasions to search for
information on validation's progress and only found references to
discussions that weren't recorded therein. My mistake. I'm glad that's
already so far along; I'll remove that part of my proposal and
rebalance it.


As for Batch creation:

To address Yaml; yeah, it's easy to do, but it's also not a terribly
pythonic way to handle things. While my theoretical scripting language
isn't python, it's close enough in structure and simple to use. Why
have django's db layer? So you don't have to use SQL. Why have a batch
creator? So you don't have to use Yaml, or json, etc. But I could drop
the scripting idea and instead work off of a "batch options" object
and keep everything pure python. Either way, you don't have to go far
afield of the current language space to use it.

However, I've done a bad job in my writeup of explaining the real
benefit of my proposal.

I work for an insurance underwriting company and I've used Django very
successfully to automate finances, policy issuance, our contact books,
etc. It works great, but I'm working with what's probably an unusually
complex set of data (compared to most users of Django). So my idea for
the project comes from working in that environment.

There are other ways to generate test data, yes. But they don't *scale
well*.

If we're taking a test data set and shoving it into the database, Yaml
or anything else is perfectly suited to the task. But that's assuming
we *already have test data*. What if you're writing a new system? What
if your data is in another format you'll have to translate, and you
need a prototype to justify that expense to your bosses?

You end up cobbling together scripts to generate valid but random data
in order to populate a given set of instances. I have such a script
here at work, to make sample insurance policies for our issuance
system. It's hundreds of lines of ugly, hard-to-read, boilerplate
code.

Anyone can quickly throw together a script that makes a set of authors
who have a set of articles in five minutes, so my chosen example in my
proposal was a poor one. What if you need 15 sample companies with 30
sample prefixes each with 5000 sample policies which connect to an
underwriter, an insured, an agent (who connects to an agency), etc
etc. Then the agents can bind for certain companies, the companies
have adjusters who work with certain agents, and on top of all that
some policies have brokers who act as intermediaries for *all* of
that.

In situations of sufficient complexity, the boilerplate code becomes
huge and you wear out the keys even *near* your copy and paste
shortcuts. That inevitably leads to bugs and with a sufficiently large
generating script, you fall into a "run it until it works" trudge. You
get to step 15 of the generation process and find out you typed
"underwriter" as "unserwriter" and you just wasted 25 minutes creating
an incomplete set.

So the first major value in my system is the ability to automatically
not only fill in the data, but to manage complex relationships. All
the django relationships with a stable API would be covered. And this
is why I'm leaning toward a pseudo-language instead of pure python: I
can simplify expressing those relationships if I'm not completely
bound to Python's syntax.
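
For comparison, here is a rough sketch of how the relationship handling
might look if it stayed in pure Python. Everything in it is made up for
illustration: BATCH and build_batch are hypothetical names, plain dicts
stand in for model instances, and it assumes there are no circular
relationships. It is not an existing Django API.

```python
import random

# Hypothetical batch description: model name -> (how many, {field: parent model}).
# Plain dicts stand in for Django model instances in this sketch.
BATCH = {
    "company": (3, {}),
    "agency": (2, {}),
    "agent": (5, {"agency": "agency"}),
    "policy": (20, {"company": "company", "agent": "agent"}),
}


def build_batch(batch, rng=random):
    """Create records so that every parent exists before its children."""
    # Validate every relationship up front, so a typo in a model name
    # fails before any data has been generated.
    for name, (_, relations) in batch.items():
        for parent in relations.values():
            if parent not in batch:
                raise KeyError(f"{name} refers to unknown model {parent!r}")

    created = {}

    def build(name):
        if name in created:
            return
        count, relations = batch[name]
        for parent in relations.values():
            build(parent)  # assumes no circular relationships
        created[name] = [
            {"id": i, **{field: rng.choice(created[parent])["id"]
                         for field, parent in relations.items()}}
            for i in range(1, count + 1)
        ]

    for name in batch:
        build(name)
    return created
```

The up-front check is the part that matters for the "unserwriter"
problem: the misspelling is reported immediately instead of at step 15.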

The second major time saver here is the series of automatic data
generators that I'd include. I want this batch maker to be very
"batteries included". Have you ever written an algorithm to generate
valid, realistic pseudo-random names, addresses, titles, etc? What if
you want to only generate valid zip codes? What if you intentionally
want to generate invalid data? I want these generators to be fleshed
out swiss-army-knives of their given data set.
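
As a small illustration of the valid/invalid switch, a lazy generator
could look roughly like the sketch below. The function names are
hypothetical, and "valid" is assumed to mean only "well-formed, five
digits"; checking that a code is actually assigned would need a postal
data set that this sketch does not include.

```python
import random
from itertools import islice


def is_well_formed_zip(value):
    """Assumption: 'valid' means exactly five digits, nothing more."""
    return len(value) == 5 and value.isdigit()


def zip_codes(valid=True, rng=random):
    """Lazily yield ZIP-like strings, well-formed or deliberately broken."""
    while True:
        if valid:
            yield f"{rng.randrange(100000):05d}"
        else:
            yield rng.choice([
                f"{rng.randrange(10000):04d}",     # too short
                f"{rng.randrange(1000000):06d}",   # too long
                f"{rng.randrange(10000):04d}X",    # right length, stray letter
            ])


# Take only as many values as needed; nothing is generated ahead of time.
sample_good = list(islice(zip_codes(valid=True), 3))
sample_bad = list(islice(zip_codes(valid=False), 3))
```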

I'm sure you've dealt with this before, so you know how much time is
lost to rolling these little pieces together when you need sample
data. My approach would be fairly simple: I'll spend time trolling
every django app I can find, looking at the models, and compile a
master list of types of data I see being stored. Then I'll create lazy
generators, tune them for performance, and present them to the world.
Each of those generators will also have associated template tags,
which will in turn make them an awesome tool for testing layouts, etc.

If you're throwing together a contact page, for example, you can have
your page showing a new name, address, etc etc on each page refresh,
allowing you to see first hand how your layout reacts to different
sets of data. So there's a good bit of synergy there for template
developers.

In short, I want these generators to have *quality*. They'll be
written as a stand-alone set of libraries which are then wrapped to
work with Django. I can see such a tool set as being useful to any
developer of any type of application, not just Django.

(On a side note: As with most of what I want to do here, there are
fast ways to implement these methods on the cheap for a given narrow
problem space. I want high quality, superbly written, easy-to-use
tools that can adapt to any problem space.)


Another major benefit: automated performance testing. Imagine a simple
interface where you tell django, "I want a performance test on view X.
Start with 100 records, and add 200 at a time and reload this view,
measuring speeds. When the view degrades to the point that it takes 5
seconds or more to return a result or we reach 500,000 iterations,
halt generation. Save to log 'perform.txt'."

When you run this test, first it automatically creates temporary
tables. Then it begins populating those tables, calling the given
function over and over and seeing how long it takes to return its
data, logging all the while. The system has to be able to look at a
view or method and figure out what parameters to pass it; when it
can't figure things out, it has to exit and request more information
from the programmer (à la when syncdb asks you to add a related_name).
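
The core loop of such a test could be sketched as below. This is only
an outline under stated assumptions: ramp_test is a hypothetical name,
and add_records and call_view are callables the caller supplies, so
nothing here creates tables or inspects a view's parameters.

```python
import time


def ramp_test(add_records, call_view, start=100, step=200,
              max_seconds=5.0, max_iterations=500_000, log_path=None):
    """Grow the data set and time call_view() after each growth step.

    add_records(n) and call_view() are hypothetical callables supplied by
    the caller. Returns a list of (record_count, seconds) pairs.
    """
    results = []
    total = 0
    to_add = start
    for _ in range(max_iterations):
        add_records(to_add)          # setup cost: deliberately not timed
        total += to_add
        to_add = step
        began = time.perf_counter()
        call_view()
        elapsed = time.perf_counter() - began
        results.append((total, elapsed))
        if elapsed >= max_seconds:   # the view has degraded far enough
            break
    if log_path is not None:
        with open(log_path, "w") as log:
            for count, seconds in results:
                log.write(f"{count}\t{seconds:.6f}\n")
    return results
```

Calling ramp_test(add, view, log_path="perform.txt") would correspond
to the request described above, with the defaults standing in for the
100/200/5 seconds/500,000 figures.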

All applications eventually break down as they scale up; figuring out
when your application will break and deciding on the acceptability of
that threshold is a major part of deploying production code. This
gives hard numbers in an area where many developers operate by "feel".
Add in automatic fuzz testing, validation testing, etc in this same
harness and you have a batteries-included method of testing almost
every aspect of your models *and* their associated views with no need
to write cumbersome boilerplate code.


Imagine I'm, say, a developer for an insurance company. I get an idea
that I could eliminate a lot of complexity by restructuring some
models, and it'll take two weeks to do it system-wide. The boss will
want some proof that it's worth it.

I log into the development server, restructure the models, and convert
just one view, perhaps a report. I define a batch script and feed it
to the batch creator. Django creates temporary tables and generates
sample data that uses the new structure, sidestepping the need to
massage the current data into a new form. And since I already had the
batch script for the old structure, I just copy that file and edit it,
saving even more time.

The automated testing finishes. With the new structure I get a 35%
performance gain with no loss in fidelity. I now have a set of hard
numbers to bring to my boss and all I spent was a couple of hours
throwing together the prototype.


So I guess my point is that you're right, just being able to insert
records isn't the major benefit or even *a* benefit. But by starting
there and building up and integrating, there starts to form a picture
of a simpler, faster, more robust way of defining and testing your
models right out of the box. Of testing the implications of complex
relationships. Of being able to quickly restructure your models on a
test box. Prototypes can be made to dance in front of decision-makers
with ease.

From what I can see, most Django users don't have needs that complex.
Right now Django isn't being used for things like insurance issuance
very often. Why should that stop us from making it so? I think Django
is a wonderful tool and I want to see it grow into new problem spaces
so long as that growth doesn't compromise its core goals.


Thank you for your feedback, I want as much as possible. Tell me if
you still think I'm barking up the wrong tree; I certainly trust your
judgement. If so I'll write a proposal for object history, another
interesting problem to solve, and submit that. Temporal objects and
dual-axis time tracking always seemed like fun to me and I've already
implemented "effective" tracking in the past with undo/redo
functionality.

Regardless, I'm really sold on working with Django this summer.
Whatever you want me to work on, I'll work on it. It's no exaggeration
to say that Python and Django rekindled my love for programming and
jumpstarted my programming career. To me, it's like being a young
musician who's *paid* to jam with David Bowie. ;)


Thanks for your time!

-Jason L.

Jason Ledbetter

Apr 3, 2008, 1:18:29 PM
to Django developers
> The main part of your application is something I'll think about some
> more, but I'm not really sure what development effort is involved there.

To address this particular point a little more:

My original proposal was lean on details, which is entirely my fault.
To be honest, I've never participated in something like this and I
didn't want to make a 30 page document for anyone to comb through if
the reaction to the great idea is "we don't want that". As my previous
reply indicates, I have no problem with writing at length. ;) I just
didn't know what balance to strike for this particular situation.

I erred a little too much on the side of summary and ended up
accenting the wrong aspects of my proposal. My apologies. That's why
dialogue like this is so useful.

As for development time, it's easy to underestimate the problem space
here because when a given programmer writes a script to generate
things like this, it's tedious but straightforward. A more generic
tool actually becomes rather complex to implement well.

Let's take one example, a pseudo-random name generator. One
implementation is as simple as a dictionary of first names and last
names and a call to random.randrange. I included such an example in my
proposal to demonstrate the intended ease of plugging custom
generators into the batch system. However, the generators I'd include
are on a different level.
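
To be concrete about the simple version, a minimal sketch might be the
following. The NAMES data and the simple_name function are invented
here for illustration and are not taken from the proposal itself.

```python
import random

# Hypothetical sample data; a real generator would ship far larger lists.
NAMES = {
    "first": ["Alice", "Bob", "Carmen", "Deepak"],
    "last": ["Jones", "Nguyen", "Okafor", "Smith"],
}


def simple_name(rng=random):
    """Return a 'First Last' string picked at random from NAMES."""
    first = NAMES["first"][rng.randrange(len(NAMES["first"]))]
    last = NAMES["last"][rng.randrange(len(NAMES["last"]))]
    return f"{first} {last}"
```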

With the intended name generator, for example:

- Do we want middle names?
- Names for what culture?
- In unicode? Which unicode set?
- What maximum length?
- Do we want designators (jr, sr, etc)?
- Which ones?
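
One way those questions could turn into options is sketched below.
NameOptions, NAME_DATA and generate_name are hypothetical names, the
data is a tiny placeholder, and truncating to max_length is a crude
stand-in for a real length policy.

```python
import random
from dataclasses import dataclass


@dataclass
class NameOptions:
    """Hypothetical option set mirroring the questions listed above."""
    middle_name: bool = False
    culture: str = "en"
    max_length: int = 40
    designators: tuple = ()   # e.g. ("Jr.", "Sr."); empty means none


# Tiny illustrative data set keyed by culture; real data would be much larger.
NAME_DATA = {
    "en": {"first": ["Alice", "Robert"], "middle": ["Ann", "Lee"],
           "last": ["Smith", "Jones"]},
    "es": {"first": ["Lucía", "Mateo"], "middle": ["María", "José"],
           "last": ["García", "Fernández"]},
}


def generate_name(options, rng=random):
    """Build one name from the parts that the options ask for."""
    data = NAME_DATA[options.culture]
    parts = [rng.choice(data["first"])]
    if options.middle_name:
        parts.append(rng.choice(data["middle"]))
    parts.append(rng.choice(data["last"]))
    if options.designators:
        parts.append(rng.choice(options.designators))
    # Crude length policy: cut the string and drop any trailing space.
    return " ".join(parts)[:options.max_length].rstrip()
```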

That's just the beginning of the problem space. It gets more complex
for, say, sample addresses. Addresses for which country? They're
structured wildly differently. What if your contact database is truly
international? Now we need to match the culture of the name and the
address of each contact and figure out easy ways for the script to
plug all that into different fields.

Do you have the name as one long char field? Or two or three different
fields for each part of the name? Is Jr/Sr a selector or just
appended? And with addresses, the structure could be wildly different
on the model level, ranging from (address, city, state, zip) to a
simple line-broken longtext. And whatever structure suits the needs of
the developer, we need to address.

Furthermore, what if you have a question like: is the performance cost
of having the name in three fields instead of one justified for my
application? What if we drop it to two fields? By having batch
creation, smart generators, and a performance tester already built-in
we can answer these questions with hard numbers. We can even include
such a batch set in the django source for the purpose of testing new
design decisions.

If a django patch comes in that says it makes queries faster, how do
the django developers know if the claim is true? There are already
ways to test such a thing, but nothing beats seeing actual "on the
street" performance. When it comes to the performance of a complex
system like django, mini-tests are contrived and don't usually reflect
the reality. See any language shoot-out for an example of how unreal
contrived tests tend to be. ;)

So Django could have a rule that any performance claims need to be
accompanied with hard numbers on the chosen sample data batch.

If I'm chosen, once the project is complete it'd also be easy to
provide "teaching samples" of various Django applications which can
then generate their own data on install. Right now we have
instructions on creating a polling site in the documentation (last I
checked) that's used to teach the basics. We could supplement this
with, say, a sample blog, fully-designed and commented, that will
create its own sample blog entries on install.

$ django-admin addsampleproject blog

This will give live data to the student so that they can mess around
in the admin and on the code level, getting a feel for what it's like
working with a real Django application. Now, if we make the teaching
sample also the performance data set for people to test their Django
code changes on, then we get more of that delightful synergy.

What if I want to create any sort of new feature (history tracking, etc)
for Django? The batch system could make it easy to stamp a "clean" app
on a development box with valid but expendable data for the programmer
to work with while developing his feature. He can finish his feature
and even include performance metrics to prove he's not slowing
anything down.

So yeah, my idea is a lot more ambitious and useful than it would
appear at first glance. Especially if the glance is at my woefully
understated proposal. ;)

Let me know what you think!


Thanks!

-Jason L.

Jason Ledbetter

Apr 3, 2008, 2:04:52 PM
to Django developers
Sorry, one more thought. Told you I'm not scared of writing!

Regarding hand-rolling these processes: yes, that's easy to do. But
automatic data generation is part of the greater testing process that
production code needs to undergo. And when you hand-roll testing
processes on the fly, you have to *fully* test your testing processes
too. In any situation where you're writing your own testing suite,
even if that suite is well-written, you now have *two* code bases to
debug and test.

For example: each batteries-included generator would come primed to
render itself as if it came from a browser's post/get data. Now you
have a way to keep throwing data (both valid and invalid) at your
system's views, testing for exploits, without having to use a browser
itself. Don't think that's so useful? Anyone can find and reject names
that aren't really names that malicious kids send at your form. A 500
character stream of random noise *probably* isn't a name. But will
your protection scheme reject valid names? Are you really sure?

You'll never be sure. But if you test with hundreds of random names
from each continent, that's pretty close to it.
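
A rough sketch of that kind of check, using only the standard library,
is below. All of the names are hypothetical, and naive_validator is a
deliberately simplistic rule invented to show the failure mode; a real
harness would post the encoded body to a view rather than call a
function directly.

```python
import random
import string
from urllib.parse import urlencode, parse_qs


def as_post_body(fields):
    """Encode a dict of generated values the way a browser form post would."""
    return urlencode(fields)


def noise(length=500, rng=random):
    """Random printable junk, the kind of input a form should reject."""
    return "".join(rng.choice(string.printable) for _ in range(length))


def check_validator(is_valid_name, valid_names, rng=random):
    """Return (valid names wrongly rejected, whether noise was accepted)."""
    wrongly_rejected = [n for n in valid_names if not is_valid_name(n)]
    accepted_noise = is_valid_name(noise(rng=rng))
    return wrongly_rejected, accepted_noise


def naive_validator(name):
    # Hypothetical rule under test: letters and spaces only.
    return all(ch.isalpha() or ch == " " for ch in name)


rejected, accepted_noise = check_validator(
    naive_validator, ["Ana García", "Sean O'Brien", "Mary-Jane Watson"])
```

Here the naive rule turns away the noise as intended, but it also
rejects the apostrophe and the hyphen in two perfectly ordinary names.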

The more clever you get with your data processing, the more vital this
sort of testing becomes. If you're taking a full name in one form
field and attempting to break it into first and last name, for
example, it'll work for all the cases you think to test for. Because,
of course, you're only going to test for the cases you thought to code
for! Anyone who's intentionally not handling something in his code
that he thought to handle in his tests needs medication or a vacation,
not a framework.

So one of the major advantages of a set of useful data generators is
that the data generator *isn't written by the coder*. If one single
coder writes a name handler, he'll of course handle all the aspects of
name construction he can think of. And he'll test according to that
same list of things *he* can think of. And it might blow up on the
fifteen things he *didn't* think of.

By spending some time looking into name variations and then writing my
own generator which is added to the Django code base, it's now there
for each contributor or user to look at the code and say, "He forgot
about this". Thus leveraging the power of open source. My version will
be fairly complete, but with each eye on the code, it becomes
increasingly suited to testing this very common case: properly
handling someone's name.

This strikes to the heart of why we even write frameworks: for a test
to be useful it has to be trustworthy. If you're writing each piece
yourself, you now have to establish the trust of that piece before you
can use it to establish the trust of something else. (Having Frank
vouch for Tom is no help if you don't trust Frank.) Even worse,
there's a good chance there's a problem with your test that you just
don't see. Code that lives only on one dude's server is code
unreviewed.

The wake-up call for "dude code" comes when it smashes against real
data and fails. And in many ways, Django specializes in facilitating
the writing of such lonely, unseen dude code.

Projects like this are the reason Django *exists*: providing visible,
repeatedly-read-and-evaluated, trustworthy methods for achieving
tedious but straightforward goals in web design. After all, we
usually screw up when handling the tedious stuff that has us thinking
about dinner or shopping. I submit that this sort of testing is one
such tedious but vital step, and one that's far harder to "get right"
than most people appreciate. And it's the hidden difficulties that
most need to be yanked into the sunlight, right?

It might seem overkill for what it achieves, but that's only true if
you're thinking in "make a blog" context. But if you're thinking in
"make an intranet page to track the insurance policies for seven
satellite offices" context, this sort of provability becomes terribly
useful.

And now I'll stop replying. Probably.


-Jason L.

Adam Findley

Apr 4, 2008, 12:44:18 PM
to django-d...@googlegroups.com
On Thu, Apr 3, 2008 at 12:04 PM, Jason Ledbetter
<sarcast...@gmail.com> wrote:
<snip> lots of thoughts </snip>

Ok so right off I have to admit I haven't read everything you've said,
but as I have some experience working with database generation tools,
I didn't feel the need to be converted to their goodness.

Having this sort of functionality in Django would really distinguish
it as a framework. One of the biggest problems you face when
designing a large website is having enough sample data to see how it
would look and perform under typical usage. Even small websites could
really benefit from this sort of tool. This enters a new problem space
rarely explored by web frameworks and I think it's a fascinating idea.

That said, generating all that data isn't the easiest task. While
generating lots of data is not too difficult, creating lots of meaningful
data while also being generic can become a bit more challenging.
While I'm not a decision maker in the Django community, that's where
my concern would be: is this something you could finish in the time
period?

Adam

Eric Walstad

Apr 4, 2008, 12:52:45 PM
to Django developers
Hi Jason,
I'm not a Django dev and have nothing to do with GSOC but wanted to
make some comments on your proposal because I'm interested in it from
an end-user's perspective.

Jason Ledbetter wrote:
>...But I could drop
> the scripting idea and instead work off of a "batch options" object
> and keep everything pure python.
FWIW, this is the approach I would take, mostly because Python is what
I'm most comfortable with.


> I work for an insurance underwriting company and I've used Django very
> successfully to automate finances, policy issuance, our contact books,
> etc. It works great, but I'm working with what's probably an unusually
> complex set of data (compared to most users of Django). So my idea for
> the project comes from working in that environment.
I'm in a similar boat. I write complex apps for the Energy Industry;
our apps have what I consider very complex data sets. Testing them is
*hard*.


> You end up cobbling together scripts to generate valid but random data
> in order to populate a given set of instances. I have such a script
> here at work, to make sample insurance policies for our issuance
> system. It's hundreds of lines of ugly, hard-to-read, boilerplate
> code.
The test module for our finite state machine is about 900 lines long.
Ugly, boilerplate, painful to update and/or customize for new test
cases. That's just for the FSM.


> In situations of sufficient complexity, the boilerplate code becomes
> huge and you wear out the keys even *near* your copy and paste
> shortcuts. That inevitably leads to bugs and with a sufficiently large
> generating script, you fall into a "run it until it works" trudge. You
> get to step 15 of the generation process and find out you typed
> "underwriter" as "unserwriter" and you just wasted 25 minutes creating
> an incomplete set.
I feel this pain.


> From what I can see, most Django users don't have needs that complex.
> Right now Django isn't being used for things like insurance issuance
> very often.
but some of us DO...


So I have a serious interest in your GSOC testing work. We are
planning a major refactoring of our code base this year and testing is
near the top of my priority list. I'm interested in following your
progress, trying your code and reporting issues if you are open to
it. If you start a mailing list or wiki or somesuch on your project,
please announce it on django-dev so I can follow along.

Best regards,

Eric.

Jason Ledbetter

Apr 4, 2008, 1:26:43 PM
to Django developers
> While I'm not a decision maker in the Django community, that's where
> my concern would be: is this something you could finish in the time
> period.

That's my only real concern as well. That's how I knew my original
proposal was all sorts of misleading and/or muddled: the fact that
Malcolm asked, "What coding needs done?" In my mind, a whole heck of a
lot of it. Not to mention research on things as fringe as the layout
of addresses in London or the most popular names in Spain. ;)

(Though admittedly, my mental design for generators focuses more on
making it easy for people to extend them with local knowledge than
being mini-wikipedias. Harness the power of the masses!)

There are two mitigating factors that ease my worry about completion:

1. I have no intention of collecting my GSoC check and bouncing out of
the Django scene. GSoC is a starting point, and if this tool is
popular and useful in the Django community, I have every intention of
continuing as its main developer and maintainer. In fact, once I feel
the project is "done enough" that it's mostly a matter of bug fixes
and implementing minor features, I'll almost certainly branch into
other areas of Django development.

2. By approaching the design iteratively, I can ensure that however
far I get, I'm still turning in a functionally complete project. First
I sit down and chart out the API/features of the ideal "complete"
project. Then I put together the most basic framework. Next I
implement straightforward versions of each piece of that framework,
implementing only the most core features of each. At this point I
already have a completed project. With all remaining time, I pick a
piece and implement another set of features, then repeat. This isn't
hard to manage if your modules are properly separated and you design
the 'final' API beforehand.

After SoC ends, I just keep at it until I consider the generation/
testing framework complete enough to drop from 'developer' mode into
'maintainer' mode. Then I use my vast SoC riches to buy a castle in
Europe where I will live off of cigars and brandy.

What could go wrong? ;)

-Jason L.

P.S. I'm going to rewrite my proposal this weekend to properly focus
on the real features. I'm used to writing proposals for a conference
room, not a web page; I was a little lost on attempt one. But that's
how we learn, right?

Jason Ledbetter

Apr 4, 2008, 2:06:09 PM
to Django developers
> I'm not a Django dev and have nothing to do with GSOC but wanted to
> make some comments on your proposal because I'm interested in it from
> an end-user's perspective.

At the risk of oversimplifying: when justifying a new feature in an
existing code base, an end-user perspective is just as important to
the debate as that of an internal developer.

> I'm in a similar boat.  I write complex apps for the Energy Industry;
> our apps have what I consider very complex data sets.  Testing them is
> *hard*.

And it really shouldn't be, not in a language like Python. I feel
there's a lot of inherent power that's not being leveraged in this
area. Especially when you consider that it's ok for a testing suite to
take a while to finish (it's not an interactive process), you can
utilize a lot of ridiculously helpful dynamic design patterns.

This is part of why I feel this project is manageable. Much of the
work involves setting up the test (harnessing, generating the test
data for that run, etc) and the only part that has to run at "normal"
speed is the actual call to the given view/function/etc (so the
numbers aren't distorted by overhead).

Since all the rest can be fairly slow without hurting the overall
utility of the system, I can utilize a lot of abstraction tricks that
are powerful from a design standpoint but often not practical for
performance reasons.

> > You end up cobbling together scripts to generate valid but random data
> > in order to populate a given set of instances. I have such a script
> > here at work, to make sample insurance policies for our issuance
> > system. It's hundreds of lines of ugly, hard-to-read, boilerplate
> > code.

This is exactly what I want to avoid.

> painful to update and/or customize for new test cases

But this is probably the best justification for what I'm proposing. If
writing these tests was painful and time consuming but you could do it
once and be done with it, then there'd be no need for my proposed
additions. But that's just *not* how the real world of development
works. Businesses constantly change so business needs constantly
change.

This is where a lack of proper, well-tested abstractions really starts
to hurt.

> > From what I can see, most Django users don't have needs that complex.
> > Right now Django isn't being used for things like insurance issuance
> > very often.
>
> but some of us DO...

And right now, no single open-source framework does much to
comprehensively address the needs of this demographic. If we *can*
then that becomes a major notch on Django's e-belt. And since this
stuff scales well (useful for small-time *and* huge applications), I
don't see any real disadvantage to adding it to the code base.

> near the top of my priority list.  I'm interested in following your
> progress, trying your code and reporting issues if you are open to
> it.  If you start a mailing list or wiki or somesuch on your project,
> please announce it on django-dev so I can follow along.

Assuming I'm given the SoC nod by the Django team, you won't have to
worry about open communication. As you may have noticed, I kinda like
to talk. ;)

A mailing list and continually-updated documentation, definitely. A
wiki eventually, but not until after SoC, where I'll have the time to
manage one. Unless, of course, you or some other kind soul ends up
wanting to maintain one.

Thanks for your reply!

-Jason L.