--
I like python!
UliPad <<The Python Editor>>: http://wiki.woodpecker.org.cn/moin/UliPad
My Blog: http://www.donews.net/limodou
You should also keep in mind that the fixtures patch that is part of
ticket #2333 can be used to dump and reload database contents. This
patch should be applied shortly (it has been slated for inclusion
before v1.0)
Yours,
Russ Magee %-)
I'd also like to propose that manage.py grow a plugin interface, so
that users can contribute their own tools, invoked like:
manage tool customtool
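As a sketch of what such a plugin interface could look like (the registry, decorator, and command names here are hypothetical illustrations, not anything in Django):

```python
# Hypothetical sketch of a manage.py-style plugin registry -- not Django API.
COMMANDS = {}

def register(name):
    """Decorator that registers a function as a named subcommand."""
    def decorator(func):
        COMMANDS[name] = func
        return func
    return decorator

@register("customtool")
def customtool(*args):
    # A contributed tool would do its real work here.
    return "customtool ran with args %r" % (args,)

def run(name, *args):
    """Dispatch 'manage tool <name> ...' to the registered command."""
    return COMMANDS[name](*args)

print(run("customtool", "--verbose"))
```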
./manage.py dumpdb > /foo/bar/out.data
dumpdb (this will be changed to 'dumpdata' in the final patch) dumps
to stdout; you can put that data wherever you want.
> And if there is a foreign key relationship in one app, the order of
> the output data is very important, because the restore process uses
> model.objects.create to insert each new record; if the foreign
> object hasn't been created yet, the insert will fail.
If you create objects and foreign key relations in a single
transaction, there shouldn't be any issue with object creation order.
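A minimal illustration of why one transaction sidesteps creation order, using SQLite's deferred foreign keys (Django's loader works at the ORM level; the raw-SQL form here is just to demonstrate the principle):

```python
import sqlite3

# Autocommit mode so we can manage the transaction explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE camp (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE application ("
    " id INTEGER PRIMARY KEY,"
    " camp_id INTEGER REFERENCES camp(id) DEFERRABLE INITIALLY DEFERRED)"
)

conn.execute("BEGIN")
# Child row first, parent second -- "wrong" order on purpose.
conn.execute("INSERT INTO application VALUES (1, 1)")
conn.execute("INSERT INTO camp VALUES (1)")
conn.execute("COMMIT")  # the deferred FK check runs here, and passes
```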
> What I do instead is use cursor.execute, so there is no such problem.
> My tool can also restore data after you have changed the model, which
> I find handy.
How does the patch in #2333 prevent you from doing this? Dump your
data, change your model, modify your data to add new required fields,
load your data.
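The "modify your data" step can be a few lines of scripting over the dumped records. The JSON layout below is an illustrative serialized-dump shape, not necessarily #2333's exact format:

```python
import json

# Illustrative serialized dump -- the record layout is an assumption.
dumped = json.loads(
    '[{"model": "myapp.camp", "pk": 1, "fields": {"name": "Alpha"}}]'
)

# The model grew a required 'capacity' field; fill in a default
# for every record before reloading.
for record in dumped:
    record["fields"].setdefault("capacity", 50)

print(json.dumps(dumped))
```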
> I'd also like to propose that manage.py grow a plugin interface, so
> that users can contribute their own tools, invoked like:
>
> manage tool customtool
There is no need for this. If you need to write your own tools, they
can be .py scripts, invoking whatever Django calls are required.
Manage.py doesn't do anything magical that couldn't be easily
replicated in your own script.
Yours,
Russ Magee %-)
I don't like this, because I want to save the data into separate files
per table, not one single file.
> > And if there is a foreign key relationship in one app, the order of
> > the output data is very important, because the restore process uses
> > model.objects.create to insert each new record; if the foreign
> > object hasn't been created yet, the insert will fail.
>
> If you create objects and foreign key relations in a single
> transaction, there shouldn't be any issue with object creation order.
Because of the above, this approach doesn't really suit my requirement.
>
> > What I do instead is use cursor.execute, so there is no such
> > problem. My tool can also restore data after you have changed the
> > model, which I find handy.
>
> How does the patch in #2333 prevent you from doing this? Dump your
> data, change your model, modify your data to add new required fields,
> load your data.
I don't want to modify every line of data; I need a default-value
handling approach, and #2333 doesn't support that. I also haven't
tried #2333 with this case: if I remove some fields from a model and
then restore data from a data file, what will #2333 do?
>
> > I'd also like to propose that manage.py grow a plugin interface, so
> > that users can contribute their own tools, invoked like:
> >
> > manage tool customtool
>
> There is no need for this. If you need to write your own tools, they
> can be .py scripts, invoking whatever Django calls are required.
> Manage.py doesn't do anything magical that couldn't be easily
> replicated in your own script.
>
So your thinking is very different from mine. core.management has
many common methods, and users like me hope to reuse them. Without a
plugin system, I have to write my own standalone tool, like
db_dump.py, as a whole separate program, importing everything I need
from core.management myself.
#2333 doesn't do per-table, but it does do per-app:
./manage.py dumpdb firstapp > firstapp.data
./manage.py dumpdb secondapp thirdapp > otherapps.data
Table-level doesn't strike me as a particularly good level for
serialization granularity, specifically because of cross application
dependencies.
> > If you create objects and foreign key relations in a single
> > transaction, there shouldn't be any issue with object creation order.
>
> Because of the above, this approach doesn't really suit my requirement.
When you load data using #2333, what you provide is a label; Django
then searches each application for a data file matching that label. As
a result, you can have multiple data files with the same label, with
each application having a fixture directory to store the fixture data
files.
The transaction that is used is per-label, so all the 'initial_data'
fixtures for each app are loaded as a single transaction, eliminating
cross-app problems.
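The per-app label lookup can be sketched in a few lines of plain Python (the directory layout and function name here are illustrative of how a loaddata-style command might search, not #2333's actual code):

```python
import glob
import os
import tempfile

def find_fixtures(label, app_fixture_dirs):
    """Return every fixture file matching 'label' across the apps'
    fixture directories -- one label can match a file in each app."""
    matches = []
    for d in app_fixture_dirs:
        matches.extend(sorted(glob.glob(os.path.join(d, label + ".*"))))
    return matches

# Demo: two apps each provide an 'initial_data' fixture.
root = tempfile.mkdtemp()
app_dirs = []
for app in ("firstapp", "secondapp"):
    d = os.path.join(root, app, "fixtures")
    os.makedirs(d)
    open(os.path.join(d, "initial_data.json"), "w").close()
    app_dirs.append(d)

found = find_fixtures("initial_data", app_dirs)
print(len(found))  # one match per app
```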
> I don't want to modify every line of data; I need a default-value
> handling approach, and #2333 doesn't support that. I also haven't
> tried #2333 with this case: if I remove some fields from a model and
> then restore data from a data file, what will #2333 do?
I haven't tried - but either problem (default values for new fields,
and ignoring old fields) should be a relatively easy modification to
the existing serialization code.
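Either modification amounts to a small filtering step when each record is deserialized. A sketch, with illustrative names (nothing here is existing Django code):

```python
def adapt_record(fields, model_fields, defaults):
    """Drop fields the model no longer has, and fill defaults for
    newly added fields -- a sketch of the 'easy modification' to a
    deserializer, not actual Django code."""
    # Keep only fields the current model still defines.
    out = {k: v for k, v in fields.items() if k in model_fields}
    # Supply defaults for new required fields missing from old dumps.
    for name in model_fields:
        if name not in out and name in defaults:
            out[name] = defaults[name]
    return out

old_record = {"name": "Alpha", "obsolete": 1}
print(adapt_record(old_record, {"name", "capacity"}, {"capacity": 50}))
```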
> So your thinking is very different from mine. core.management has
> many common methods, and users like me hope to reuse them. Without a
> plugin system, I have to write my own standalone tool, like
> db_dump.py, as a whole separate program, importing everything I need
> from core.management myself.
manage.py has useful methods? Then use them.
from django.core import management
print management.get_sql_all(app)  # pass the app module whose SQL you want
No need to add an extension API to manage.py.
Yours,
Russ Magee %-)
This will be helpful if you want to switch database -- say from MySQL
to SQLite -- for some reason. Just change settings.py, build the
tables via syncdb, and reload the data. Simple!
For us it will also help with backing up/archiving our projects in a
way that's more flexible than a straight mysqldump.
Nice work. (And I haven't even looked at the fixtures part of it yet.)
-Rob
I have an app with Applications, each of which is for a single Camp.
What I want to do is retrieve all the Camps for which there exists an
Application.
Thanks,
Chris
The straightforward way is to retrieve a list of distinct values for
'camp_id' from the Application table, and then grab the corresponding
Camp objects for that list. In SQL the query would look something like
this (allowing for DB variations):
SELECT * FROM camps where id in (SELECT DISTINCT camp_id FROM application);
You might be able to do this with the 'extra' method of a QuerySet, by
putting the 'where' clause with subselect into the 'where' keyword
argument to extra(), but I haven't tested that to be certain.
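Here is the query running against a toy version of that schema, using SQLite via Python's sqlite3 module (table and column names follow the example above; the data is made up):

```python
import sqlite3

# Toy schema mirroring the Camp / Application example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE camps (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE application (id INTEGER PRIMARY KEY, camp_id INTEGER);
    INSERT INTO camps VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO application VALUES (1, 1), (2, 1), (3, 3);
""")

# Camps for which at least one application exists:
rows = conn.execute(
    "SELECT * FROM camps "
    "WHERE id IN (SELECT DISTINCT camp_id FROM application)"
).fetchall()
print(sorted(r[1] for r in rows))  # camp 2 ('Beta') has no application
```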
--
"May the forces of evil become confused on the way to your house."
-- George Carlin
Computing the DISTINCT portion here may be superfluous, and
possibly (depending on your DB) a premature mis-optimization.
Whether DISTINCT or not, membership via IN will still behave the
same, but DISTINCT will require a pass to determine
distinct-ness. Without the DISTINCT keyword, the DB should be
able to just do a hash-lookup for each ID in its Camps set to
test membership.
Or, alternatively, one could do something like
SELECT DISTINCT c.* FROM camps c INNER JOIN application a ON c.id = a.camp_id
which might also be another way to let the optimizer take a crack
at doing it well.
YMMV, so be sure to use your DB's EXPLAIN abilities to disclose
the nitty-gritty details and choose what works best.
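As a quick sanity check on toy data, all three forms return the same set of camps, and EXPLAIN QUERY PLAN shows how the engine tackles each (SQLite here; your DB's plans will differ):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE camps (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE application (id INTEGER PRIMARY KEY, camp_id INTEGER);
    INSERT INTO camps VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO application VALUES (1, 1), (2, 1), (3, 3);
""")

queries = {
    "in_distinct": "SELECT * FROM camps WHERE id IN "
                   "(SELECT DISTINCT camp_id FROM application)",
    "in_plain":    "SELECT * FROM camps WHERE id IN "
                   "(SELECT camp_id FROM application)",
    # The join form needs DISTINCT: camp 1 has two applications and
    # would otherwise appear once per matching application row.
    "join":        "SELECT DISTINCT c.* FROM camps c "
                   "INNER JOIN application a ON c.id = a.camp_id",
}
results = {k: sorted(conn.execute(q).fetchall()) for k, q in queries.items()}
assert results["in_distinct"] == results["in_plain"] == results["join"]

for name, q in queries.items():
    print(name, conn.execute("EXPLAIN QUERY PLAN " + q).fetchall())
```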
-tim
Just to clarify - JSON is the default format, but you can use any
other supported serializer format. At the moment, the only alternative
serializer that has been implemented is the XML serializer, but others
could be defined.
> Nice work. (And I haven't even looked at the fixtures part of it yet.)
Glad you like it. I'm hoping to get some time in the next week or so
to finish off the few things that are needed before I bug Adrian for
permission to commit. In the meantime, if you have any suggestions or
comments, let me know.
Yours,
Russ Magee %-)
Yeah, though AFAIK the 'extra' method on QuerySets can't specify an
inner join, so you're stuck executing the query via the cursor
instead. It's a tradeoff, really; if you need to squeeze every ounce
of optimization out of the DB, go with that. Otherwise, doing the 'id
in' with subselect is probably a little bit easier for the end user of
Django.
Maybe I could just throw in another use case and we can see if the
dump/restore facility will work in this way too...
Right now I'm working on a project for which we're developing a new
data model. The content team is already starting to enter data.
Occasionally I find a need to add a new database column to support
something new. This is your typical schema evolution need.
If I could dump the current data, run an sqlreset, and reload the data,
that would be amazingly helpful. (At least until schema evolution comes
online or even as an alternative approach after it's available.)
Would/Could it work like this?
Is the dump/restore facility tied to fixtures? I see a dumpdb option in
your patch but not a reloaddb (or the like)... just an 'installfixture'.
-Rob
If you followed the discussion when I posted the patch, you will have
seen that one of the suggestions was to rename installfixture to
loaddata, and dumpdb to dumpdata - specifically to reinforce the
symmetry between the two commands.
Using the fixtures as a 'schema evolution lite' was also suggested.
The #2333 patch doesn't address the question of filling in new fields
and removing old fields as part of an evolution step, but the
potential is certainly there.
Yours,
Russ Magee %-)