GSoC Proposal: Auto-generation of Models from Data

Collin Anderson

Mar 28, 2008, 11:51:26 AM
to Django developers
Hello,

My name is Collin Anderson and I am interested in participating in
Google Summer of Code for Django. I am a junior at the University of
Minnesota studying computer science. My idea is inspired by a
suggestion on the 2006 Summer of Code ideas list:

"add dabbledb (http://dabbledb.com/utr/) like functionality to the
django admin. e.g. create a project and super user with one command
line statement, fire up the development web server, and
import/create/refine your model using live data from directly within
the django auto admin."

The rough idea is to make it easier to auto-generate models from
existing data.

To start out, I would make an "inspectfile" command, similar to the
"inspectdb" command. It would take a CSV file or a simple JSON or XML
file, determine the field types, and output a model for the data. I
figure it could be smarter than inspectdb by also attempting to detect
email, phone number, URL, and similar field types. I would then add
the same more specific field detection to inspectdb.
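
A rough sketch of the CSV case, to make the idea concrete. Nothing here
is an existing Django API: the names inspect_csv, guess_field and
DETECTORS are hypothetical, the regular expressions are deliberately
loose, and phone number detection is left out. It assumes the first row
of the file is a header and only prints model source text.

```python
import csv
import io
import re

# Hypothetical detectors, tried in order. The first pattern that matches
# every non-empty value in a column decides the field type.
DETECTORS = [
    ("IntegerField", re.compile(r"^-?\d+$")),
    ("FloatField", re.compile(r"^-?\d+\.\d+$")),
    ("EmailField", re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")),
    ("URLField", re.compile(r"^https?://\S+$")),
]


def guess_field(values):
    """Return a Django field declaration (as text) for one column."""
    present = [v.strip() for v in values if v.strip()]
    for field, pattern in DETECTORS:
        if present and all(pattern.match(v) for v in present):
            return "models.%s()" % field
    # Fall back to plain text, sized to the longest value seen.
    longest = max([len(v) for v in present] or [1])
    return "models.CharField(max_length=%d)" % longest


def inspect_csv(text, model_name="Imported"):
    """Build model source code from CSV text whose first row is a header."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    lines = ["class %s(models.Model):" % model_name]
    for index, name in enumerate(header):
        column = [row[index] for row in data if index < len(row)]
        attr = re.sub(r"\W+", "_", name.strip().lower())
        lines.append("    %s = %s" % (attr, guess_field(column)))
    return "\n".join(lines)


sample = (
    "Name,Age,Email,Site\n"
    "Ann,31,ann@example.com,http://example.com/\n"
    "Bob,27,bob@example.org,https://example.org/\n"
)
print(inspect_csv(sample))
```

A real command would also need to sample large files, handle missing
headers, and let the user override a wrong guess.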

The next step would be to extend inspectfile to handle more
structured JSON and XML data that would require multiple models. This
would be a little harder, but it can be done.
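
One way the multi-model case might work for JSON, sketched under some
loud assumptions: collect_models is a hypothetical helper, only the
first element of a list is examined, the singular model name is a crude
guess (strip a trailing "s"), and field declarations are kept as plain
strings. Newer Django versions also require an on_delete argument on
ForeignKey, which is omitted here.

```python
import json


def collect_models(name, record, models=None):
    """Walk one JSON object and return {model name: {field: declaration}}."""
    if models is None:
        models = {}
    fields = models.setdefault(name, {})
    for key, value in record.items():
        if isinstance(value, dict):
            # Nested object: becomes its own model, referenced from this one.
            child = key.capitalize()
            collect_models(child, value, models)
            fields[key] = "models.ForeignKey(%r)" % child
        elif isinstance(value, list) and value and isinstance(value[0], dict):
            # List of objects: a child model that points back at this one.
            child = key.capitalize()
            if child.endswith("s"):
                child = child[:-1]  # crude singular form, an assumption
            collect_models(child, value[0], models)
            models[child][name.lower()] = "models.ForeignKey(%r)" % name
        elif isinstance(value, bool):
            fields[key] = "models.BooleanField()"
        elif isinstance(value, int):
            fields[key] = "models.IntegerField()"
        elif isinstance(value, float):
            fields[key] = "models.FloatField()"
        else:
            fields[key] = "models.TextField()"
    return models


doc = json.loads(
    '{"title": "Hello", "views": 12,'
    ' "author": {"name": "Ann"},'
    ' "comments": [{"body": "Nice"}]}'
)
result = collect_models("Post", doc)
for model_name, model_fields in result.items():
    print(model_name, model_fields)
```

For this sample the walk yields three models (Post, Author, Comment),
with Post pointing at Author and Comment pointing back at Post.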

The third step would be to create a web interface to aid in the
creation of the model. Three features would take advantage of user
interaction. The first would be the ability to paste in data from a
spreadsheet. The second would be the ability to change the field types
before the model is output. The third would be the ability to create
foreign keys to existing models.

Part of the inspiration for this also came from James Bennett's talk on
database-driven journalism [1], where the journalist would get data in
tables, create models for the data, and then import the data into
Django. This could speed up one step of that process. Another
application would be the ability to take an XML feed on the web and
get a rough Django model from it.

Any thoughts?

Thanks,
Collin Anderson
cmawe...@gmail.com

[1] http://www.b-list.org/weblog/2008/mar/16/slides/

Jacob Kaplan-Moss

Mar 28, 2008, 12:01:02 PM
to django-d...@googlegroups.com
Hi Collin --

Neat ideas!

My main point of feedback is that you're dealing with a *HUGE* problem
-- dabbledb represents literally years of work, and trying to
reproduce that in a single summer is seriously unrealistic.

I think you should scale back your proposal. "inspectfile" would be a
nifty addition to Django, and quite within the scale of SoC. It's not
quite that small, really -- there's a metric ass-load of data formats
out there, and dealing with 'em will take some hard thinking. Probably
some sort of plugin architecture and interactive "questionnaire" bits,
too.
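
To illustrate what a plugin architecture for formats could look like,
here is a minimal sketch. The READERS registry, the reader decorator
and load_records are all hypothetical names, and dispatching on the
file extension is just one possible design, not a proposed Django API.

```python
import csv
import io
import json
import os

READERS = {}  # hypothetical registry: file extension -> reader function


def reader(extension):
    """Decorator that registers a function as the reader for an extension."""
    def register(func):
        READERS[extension] = func
        return func
    return register


@reader(".csv")
def read_csv(text):
    # One dict per row, keyed by the header line.
    return list(csv.DictReader(io.StringIO(text)))


@reader(".json")
def read_json(text):
    # Accept either a single object or a list of objects.
    data = json.loads(text)
    return data if isinstance(data, list) else [data]


def load_records(filename, text):
    """Pick a reader by file extension; fail loudly for unknown formats."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in READERS:
        raise ValueError("no reader registered for %r" % extension)
    return READERS[extension](text)


print(load_records("people.csv", "name,age\nAnn,31\n"))
print(load_records("people.json", '{"name": "Bob", "age": 27}'))
```

Each new format would then be one more decorated function, and the
type-guessing code only ever sees a list of records.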

Essentially, it's better to do a small thing really, really well than
only half-implement something bigger.

Jacob

Sage La Torra

Mar 28, 2008, 12:09:27 PM
to django-d...@googlegroups.com

+1

My tip would be to be really clear about your goals: what will you
accomplish, when, and what counts as a success?

This is a really cool idea, I hope we get to see more of it.

Collin Anderson

Mar 28, 2008, 4:35:43 PM
to django-d...@googlegroups.com
Hi Jacob,

> My main point of feedback is that you're dealing with a *HUGE* problem
> -- dabbledb represents literally years of work, and trying to
> reproduce that in a single summer is seriously unrealistic.

Yes, I realize that the types of migrations that dabbledb does are
currently not very possible with Django. I guess I am just looking at
it for ideas of how to make things easier.

> I think you should scale back your proposal. "inspectfile" would be a
> nifty addition to Django, and quite within the scale of SoC. It's not
> quite that small, really -- there's a metric ass-load of data formats
> out there, and dealing with 'em will take some hard thinking. Probably
> some sort of plugin architecture and interactive "questionnaire" bits,
> too.
>
> Essentially, it's better to do a small thing really, really well than
> only half-implement something bigger.

So maybe I should ditch the web-interface and just make a good, robust
inspectfile.

Thanks,
Collin

Jacob Kaplan-Moss

Mar 28, 2008, 4:41:02 PM
to django-d...@googlegroups.com
On Fri, Mar 28, 2008 at 3:35 PM, Collin Anderson <swe...@gmail.com> wrote:
> So maybe I should ditch the web-interface and just make a good, robust
> inspectfile.

Glad my hint wasn't all that subtle :)

I, for one, would use ``inspectfile`` all the damn time.

Jacob
