Dear Django users.
I am looking into the so-called Point-In-Time architecture for databases with version history. See e.g.
https://www.red-gate.com/simple-talk/sql/database-administration/database-design-a-point-in-time-architecture/One reason I like PTA is that a version history for my project is a "nice-to".
The "need-to" reason is, however, that I need to have a general system for "propose, then commit or reject" updates to my data tables.
With "propose and commit or reject" I specifically think of users uploading CSV tables through a table web interface.
The website will then inform the user of the number of rows changed, total number of rows and perhaps even more detailed information.
I am aware that I could perhaps store the information in another table until the new data is committed or rejected.
However, what I am trying to build is a sort of mini-data warehouse framework for storing and modifying data for scientific calculations.
A central feature is, that the (super)users themselves are defining the tables they want in the data warehouse by coding simple Django models.
So, what I need is a structure which allows me to add the necessary columns to whatever model the user creates.
Enter "class meta" - yay, I can add version_begin and version_end to keep PTA on all tables the user creates, if she/he uses my meta class.
Let's take the most complicated example of updating a record. Let's say I have two tables:
- Country (a primary key field, and a text field for the name)
- City (a primary key field, a foreign key to country, and a text field for the name)
Both tables additionally have columns describing "version_begin", "version_end" and "previous_primary_key"
Now, if I have misspelled the name of a country, I'd like not to have a new primary key for the correctly spelled country.
In that case, I'd have to correct the all the foreign keys of the city table.
So, why not simply insert a new country record (with a new PK) in the "propose" stage.
Then, if I commit, I simply switch the content of the data column (here the two misspelled names, using the previous_primary_key, and also fixing the previous primary key as well).
Simple! What could possibly go wrong?
Also, the database will be quite heavy with the "city" like tables, i.e. a lot of big tables with multiple foreign keys to the primary keys only tables (like country).
So, it seems more efficient - albeit a bit hacky - to make a small manipulation of the primary table, rather than correct potentially a lot of foreign keys.
Probably a lot, but my database knowledge barely stretches to third normalisation, so ... help!
QUESTION 1:
Are there any Django modules with propose-commit/reject functionality that I can use already (please correct me if I got something wrong)?
- Django-Reversion: is a duplicate history table, so a no-go
- django-simple-histroy: Also (?) a duplicate table or transaction history
- django-dirtyfields: updates other tables with foreign key links
- cleanerversion: Seems to be PTA, but can I implement the propose-commit/reject that I need?
- django-easy-audit: Logs transactions, don't allow "propose-commit/reject" (?)
... and then some others which seemed not that relevant (again, I may have missed something).
Also, a nice reference to "slowly changing dimensions", though it seems less Django specific (i.e. primary key is combination of multple colums).
https://en.wikipedia.org/wiki/Slowly_changing_dimension#Type_2QUESTION 2:
Or is the "switch values" approach just ugly, but not completely far-fetched?
My +40 years database experienced father in-law quite rightly named this as "hammering a screw",
But if it has a chance of working, I'm a quite pragmatic guy (my apologies to database afficinados already) ...
Do ask questions if I'm not entirely clear.
I could have written many pages of text more, but tried to keep it a bit short.
thanks, Mikkel