Model Objects and Updating db via XML


Sayth Renshaw

May 13, 2014, 3:40:37 PM
to django...@googlegroups.com
Can I ask for some assistance, please?

For my project I will need to update the database regularly from an external XML file. The XML is like this one: http://old.racingnsw.com.au/Site/_content/racebooks/20140515GOSF0.xml
I have created my model objects, most of which correlate directly to the XML file, but there will be extra model objects and fields relating to user input. How can I update my database from the XML?

There are two docs I have been reviewing: https://docs.djangoproject.com/en/1.6/howto/initial-data/#providing-initial-data-with-fixtures and https://docs.djangoproject.com/en/1.6/topics/serialization/.

The guides talk about creating a fixture, but I am not sure that is appropriate here, and I can't tell from the guide how I would set up a mapping so that the XML data goes to the correct model object.
How do I do this?

Currently my models are as follows.

from django.db import models

# Related models are referenced by name (as strings) because some of them
# are defined further down in this module.
# IntegerField takes no length argument; its first positional argument is
# verbose_name, so the numbers have been dropped.
# On Django 2.0+, each ForeignKey also needs an on_delete argument.

class Meeting(models.Model):
    location = models.ForeignKey('Venue')
    MeetingRace = models.ForeignKey('Race')
    meeting_id = models.IntegerField()
    meeting_date = models.DateField()
    rail = models.CharField(max_length=75)
    weather = models.CharField(max_length=10)
    trackcondition = models.CharField(max_length=15)

class Venue(models.Model):
    NameVenue = models.CharField(max_length=50)
    FullNamevenue = models.CharField(max_length=70)
    ClubCode = models.IntegerField()
    AssosciationClass = models.IntegerField()

class Race(models.Model):
    RaceHorse = models.ForeignKey('Horse')
    RaceID = models.IntegerField()
    RaceNumber = models.IntegerField()
    RaceDivision = models.IntegerField()
    RaceName = models.CharField(max_length=80)
    RaceClass = models.CharField(max_length=20)
    RaceDistance = models.IntegerField()
    RaceMinWeight = models.IntegerField()
    RaceRaisedWeight = models.IntegerField()
    RaceAge = models.IntegerField()
    RaceGrade = models.IntegerField()
    RaceWeightCondition = models.CharField(max_length=80)
    # Check timing method, fastest, sectional
    # Review adding conditionals

class Horse(models.Model):
    # The Trainer and Jockey models are not shown in this post.
    HorseTrainer = models.ForeignKey('Trainer')
    HorseJockey = models.ForeignKey('Jockey')
    HorseName = models.CharField(max_length=40)
    HorseID = models.IntegerField()
    HorseBlinkers = models.IntegerField()
    HorseBarrier = models.IntegerField()
    HorseWeight = models.IntegerField()
    HorseRating = models.IntegerField()
    HorseDescription = models.CharField(max_length=60)

Thanks

Sayth

Sayth Renshaw

May 14, 2014, 2:18:08 AM
to django...@googlegroups.com
Do I need to provide more information or clarity, or is this a harder question than I imagined and I just need to wait for a guru to come through?

Sayth

Avraham Serour

May 14, 2014, 2:23:06 AM
to django...@googlegroups.com

Fixtures are usually in JSON. Not sure if there is support for importing fixtures in XML.
In any case, you can always parse the file yourself and create the proper objects.

Don't forget that this is an open source project, you can code XML support and contribute

--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-users...@googlegroups.com.
To post to this group, send email to django...@googlegroups.com.
Visit this group at http://groups.google.com/group/django-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/f8e008bf-1f0a-4a0a-a35c-04c59f9997f6%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Sayth Renshaw

May 14, 2014, 3:23:53 AM
to django...@googlegroups.com

Thanks for the response. I can look at converting the fixtures to JSON if that would make it work.

I wouldn't contribute XML support at this stage; the quality of my contribution would likely be poor.

Sayth


Tom Evans

May 14, 2014, 11:12:50 AM
to django...@googlegroups.com

I don't think you want fixtures, even though there are XML fixtures.
This is basically a data feed which you wish to import into your own
database periodically. Fixtures are normally exports of your own
database. I think it would be very difficult to ensure that your
database always matches the contents of an XML file produced by a 3rd
party.

You could use XSLT to transform the 3rd party XML into the right
format for an XML serialisation, but that would involve using XSLT.
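
To make the transformation idea concrete, here is a rough sketch that does the same job in Python with the standard library's ElementTree instead of XSLT. The output follows the layout of Django's XML serialisation format (a django-objects root containing object and field elements). The feed's element and attribute names, the "racing.race" model label and the function name are all made up for illustration; the real feed will differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed snippet; the real file's tags and attributes may differ.
FEED_XML = """<meeting><race id="501" name="Maiden Plate" distance="1200"/></meeting>"""


def feed_to_fixture(feed_xml, model_label="racing.race"):
    """Convert <race> elements into Django's XML fixture layout.

    `model_label` is an assumed "app_label.model_name" pair.
    """
    feed = ET.fromstring(feed_xml)
    fixture = ET.Element("django-objects", version="1.0")
    for race in feed.findall("./race"):
        # One <object> per model instance, keyed by primary key.
        obj = ET.SubElement(fixture, "object", pk=race.get("id"), model=model_label)
        # One <field> per model field, with the value as element text.
        name = ET.SubElement(obj, "field", type="CharField", name="RaceName")
        name.text = race.get("name")
        distance = ET.SubElement(obj, "field", type="IntegerField", name="RaceDistance")
        distance.text = race.get("distance")
    return ET.tostring(fixture, encoding="unicode")


if __name__ == "__main__":
    print(feed_to_fixture(FEED_XML))
```

A file produced this way could in principle be fed to loaddata, but every field still has to be mapped by hand, so it does not save much over creating the objects directly.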

What I would do is write a management command that parses the XML (I
like lxml), and then iterates through it, using xpath to pull out
relevant parts of the doc, and then use standard django models to
create this data in the database, taking care not to insert duplicate
data when it exists in both DB and XML.
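
A minimal sketch of the parse-then-create flow described above, under several assumptions: it uses the standard library's ElementTree rather than lxml so it runs without extra packages, the sample XML and its element names are invented, and a plain dict stands in for the database table. Inside a real management command, the "insert if missing" step would be roughly Race.objects.get_or_create(RaceID=row["RaceID"], defaults=...) with the remaining fields passed as defaults.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real feed's element and attribute names may differ.
SAMPLE_XML = """
<meeting id="1001" date="2014-05-15">
  <race id="501" number="1" name="Maiden Plate" distance="1200"/>
  <race id="502" number="2" name="Class 1 Handicap" distance="1600"/>
</meeting>
"""


def parse_races(xml_text):
    """Yield one dict per <race> element, with values converted to Python types."""
    root = ET.fromstring(xml_text)
    for node in root.findall("./race"):
        yield {
            "RaceID": int(node.get("id")),
            "RaceNumber": int(node.get("number")),
            "RaceName": node.get("name"),
            "RaceDistance": int(node.get("distance")),
        }


def import_races(xml_text, store):
    """Insert races missing from `store` and return how many were created.

    `store` is a dict keyed by RaceID that stands in for the database table,
    so rows already present are skipped rather than duplicated.
    """
    created = 0
    for row in parse_races(xml_text):
        if row["RaceID"] not in store:
            store[row["RaceID"]] = row
            created += 1
    return created


if __name__ == "__main__":
    db = {}
    print(import_races(SAMPLE_XML, db))  # first run inserts both races
    print(import_races(SAMPLE_XML, db))  # second run finds nothing new
```

Keying on the feed's own identifier is what keeps a repeated import from inserting the same race twice.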

Cheers

Tom

Sayth Renshaw

May 14, 2014, 4:41:34 PM
to django...@googlegroups.com

Thanks Tom, I have set up a way to extract the elements I want with xmltodict and played with lxml.objectify last night and that would work as well.

But what exactly in django models am I using to bring in the data?

Sayth


Amirouche Boubekki

May 23, 2014, 9:20:21 AM
to django...@googlegroups.com
Héllo (again),


2014-05-14 22:41 GMT+02:00 Sayth Renshaw <flebbe...@gmail.com>:

> Thanks Tom, I have set up a way to extract the elements I want with xmltodict and played with lxml.objectify last night and that would work as well.
>
> But what exactly in django models am I using to bring in the data?


Django knows how to load data from JSON, XML and so on only if the data is in a format it knows about. I don't remember
whether we considered Django's native serialization/deserialization support (see my mail on the Python ML).

The thing is that, in your project, it seems like import is a "one time thing". So maybe using a framework thingy like django-swallow or a custom django's fixture thing is too much.

Anyway, here is the "getting started" document of django swallow, summed up below:

- Builder class: the entry point for an import. It is used, among other things, to trigger the import from the command line.
- Mapper class: creates a "mapping object" for the input XML. Basically it breaks an XML document down into sub-documents (if it's RSS/Atom-like XML) or a single object (if it's really only one document), pythonizes the XML data and makes it easily accessible as object properties... no heavy processing of data or mapping to data already in the database. For instance, if the XML contains an identifier that maps to a row in the database, the mapper only exposes it as an integer or string; the populator (see below) retrieves it and converts it to a Django model...
- Populator class: it helps populate the target Django model object. It implements common operations, like «model.foo = mapper.foo» and other stuff.

The idea behind django swallow is that an import is broken down into several steps:

- parse the XML and turn it into simple Python object(s) with Python types (Mapper class)
- populate one Django model instance (or several) based on the data from the first step (Populator class)

There are extra optional steps ;)
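
To illustrate the two steps without depending on django-swallow, here is a rough sketch in plain Python. It is not django-swallow's API: the class names HorseMapper, HorsePopulator and FakeHorse, the `fields` attribute and the XML layout are all hypothetical, and FakeHorse stands in for a Django model instance so the code runs on its own.

```python
import xml.etree.ElementTree as ET


class HorseMapper:
    """Step 1: wrap one XML element and expose its data as Python types."""

    def __init__(self, element):
        self._element = element

    @property
    def HorseID(self):
        return int(self._element.get("id"))

    @property
    def HorseName(self):
        return self._element.findtext("name").strip()

    @property
    def HorseBarrier(self):
        return int(self._element.findtext("barrier"))


class HorsePopulator:
    """Step 2: copy mapper properties onto the target object, field by field."""

    # Properties copied one-to-one onto attributes of the same name.
    fields = ("HorseID", "HorseName", "HorseBarrier")

    def populate(self, mapper, instance):
        for field in self.fields:
            setattr(instance, field, getattr(mapper, field))
        return instance


class FakeHorse:
    """Stand-in for a Django model instance (assumption for this sketch)."""


if __name__ == "__main__":
    element = ET.fromstring(
        "<horse id='7'><name> Example Runner </name><barrier>4</barrier></horse>"
    )
    horse = HorsePopulator().populate(HorseMapper(element), FakeHorse())
    print(horse.HorseID, horse.HorseName, horse.HorseBarrier)
```

With a real model, the populator would be followed by a call to save() on the instance; keeping parsing and saving apart means the XML handling can be tested without a database.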

I'm not sure where the repository is, but for instance you can create a base mapper, which will populate plain Python objects from XPath, and then the populator can be as simple as:

    class Populator(BasePopulator):

        _fields_one_to_one = None  # XXX: in this case None means map every property of the mapper object to the model field with the same name...
        _fields_if_instance_already_exists = None
        _fields_if_instance_modified_from_last_import = None


It may seem kind of overkill. But when you have 5+ business-critical imports running every day, processing several dozen files for different kinds of documents, it's helpful to know that the implementation follows a fixed structure, a framework, so you know where to look for the code implementing a given behaviour... whether it is for debugging or building new imports...

There is a full but simple example, importing items from an atom file: https://github.com/liberation/django-swallow/blob/master/example/config.py

HTH,


Amirouche
 
