I plan to have lots of data tables with similar structure. How do you recommend I model them?

19 views
Skip to first unread message

Tom Tanner

unread,
Jan 10, 2018, 8:20:54 PM1/10/18
to Django users
Hey everyone,

I have a bunch of text files that each have a bunch of columns in common. I plan to import these files into PostgreSQL tables. The website user will be able to send a GET request to query a table and get back data from it. Since most of the tables will have a bunch of columns in common, how should I structure them in my `models.py`?

Here's a couple examples of tab-delimited text files I'll import.

    NAME S1903_C02_001E state county tract State-County-Tract-ID
   
Census Tract 201, Autauga County, Alabama 66000 01 001 020100 01001020100
   
Census Tract 202, Autauga County, Alabama 41107 01 001 020200 01001020200
   
Census Tract 203, Autauga County, Alabama 51250 01 001 020300 01001020300



and 

    NAME S1903_C02_001F S1903_C02_001G state county tract State-County-Tract-ID
   
Census Tract 201, Autauga County, Alabama 66000 4040 01 001 020100 01001020100
   
Census Tract 202, Autauga County, Alabama 41107 192837 01 001 020200 01001020200
   
Census Tract 203, Autauga County, Alabama 51250 39482 01 001 020300 01001020300


As you can see, they have several columns in common. I wouldn't want to repeat myself in `models.py` by listing the same columns over and over.

Mike Dewhirst

unread,
Jan 10, 2018, 9:44:22 PM1/10/18
to django-users@googlegroups.com >> Django users
On 11/01/2018 12:20 PM, Tom Tanner wrote:
> Hey everyone,
>
> I have a bunch of text files that each have a bunch of columns in
> common. I plan to import these files into PostgreSQL tables. The
> website user will be able to send a GET request to query a table and
> get back data from it. Since most of the tables will have a bunch of
> columns in common, how should I structure them in my `models.py`?

I think you want a core model with most of the common fields and meta
abstract = True ...

class CoreFields(models.Model):
    ...
    class Meta:
        abstract = True

Then a bunch of other tables set up with fields which are not in
CoreFields ...

class DataSetX(CoreFields):
    ...

You can also move any methods in common into CoreFields

hth

Mike

>
> Here's a couple examples of tab-delimited text files I'll import.
>
> |
>     NAME S1903_C02_001E state county tract State-County-Tract-ID
> CensusTract201,AutaugaCounty,Alabama660000100102010001001020100
> CensusTract202,AutaugaCounty,Alabama411070100102020001001020200
> CensusTract203,AutaugaCounty,Alabama512500100102030001001020300
>
> |
>
>
> and
>
> |
>     NAME S1903_C02_001F S1903_C02_001G state county tract
> State-County-Tract-ID
> CensusTract201,AutaugaCounty,Alabama6600040400100102010001001020100
> CensusTract202,AutaugaCounty,Alabama411071928370100102020001001020200
> CensusTract203,AutaugaCounty,Alabama51250394820100102030001001020300
> |
>
>
> As you can see, they have several columns in common. I wouldn't want
> to repeat myself in `models.py` by listing the same columns over and over.
> --
> You received this message because you are subscribed to the Google
> Groups "Django users" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to django-users...@googlegroups.com
> <mailto:django-users...@googlegroups.com>.
> To post to this group, send email to django...@googlegroups.com
> <mailto:django...@googlegroups.com>.
> Visit this group at https://groups.google.com/group/django-users.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-users/e5fc9dab-c0aa-4fb8-a66c-5f507a18f4b6%40googlegroups.com
> <https://groups.google.com/d/msgid/django-users/e5fc9dab-c0aa-4fb8-a66c-5f507a18f4b6%40googlegroups.com?utm_medium=email&utm_source=footer>.
> For more options, visit https://groups.google.com/d/optout.

Thomas G Lockhart

unread,
Jan 11, 2018, 10:18:34 AM1/11/18
to django...@googlegroups.com
You may want to have several tables linked by foreign keys. A Census Tract table, a County table, a State table, etc.

That reduces the amount of duplicate data and the chances for inconsistent entries. The downside is a bit more work on the part of the computer to do lookups, but particularly if your typical use case accesses one or a few at a time (say 1% of a big table) then the performance hit for joins etc is minimal.

Design your normalized schema, then write a standalone Python program to stuff the initial data into the tables.

Good luck!

- Tom

--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-users...@googlegroups.com.
To post to this group, send email to django...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages