Synchronizing data, want to track what has changed. Advice please.

33 views
Skip to first unread message

Brian

unread,
Sep 27, 2011, 11:42:15 AM9/27/11
to sqlal...@googlegroups.com
hello all,

I'm grabbing data from a Student Information System (SIS) about students and then saving that data in a local database.  I then query my local database to create/modify/disable accounts in Active Directory and Google Apps.

What I've been doing so far is the following:

Set all students in the local database as inactive--
for activestudent in am.session.query(am.Student).filter_by(active=1):
    activestudent.active = 0
Get all "active" records from SIS using an engine.execute(<sql statement>)
Merge data from SIS into the Student objects from the local database.

This works well for synchronizing the data, any students leaving the school get marked inactive and any updates in name or status are reflected in the local database.

What I can't do, though, is tell exactly what changed.  I'd really like to run functions based on what the changes are:  disable students made inactive.

I tried using session.dirty, but I think because I initially set active to 0, every record object is marked, even if active reverts back to 1 through the merge.

I'd appreciate any ideas, or guidance in the right direction.

Thanks!

Michael Bayer

unread,
Sep 27, 2011, 2:39:20 PM9/27/11
to sqlal...@googlegroups.com
hm what if you set active to "0" after you've learned what you can about attribute change status ?





Thanks!

--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To view this discussion on the web visit https://groups.google.com/d/msg/sqlalchemy/-/Mq9tsqsugk0J.
To post to this group, send email to sqlal...@googlegroups.com.
To unsubscribe from this group, send email to sqlalchemy+...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.

Brian

unread,
Sep 27, 2011, 6:56:38 PM9/27/11
to sqlal...@googlegroups.com
I'm not sure how I could do that because I'd want to set active to 0 only on the students that are no longer going to school here.

Maybe merge isn't the right thing to do, or maybe I need to rethink my algorithm.  The bottom line is I need to know what records get added to the local database table (new users),  what ones were active before -- but now I don't see them in the query (disable users), and what records were changed (modify/update users).

What I'm trying right now is to use session.expunge() on each active student, THEN merge the data from the SIS.  That way (I hope) that only records that changed or added would be "dirty" with an active attribute of 1 and the dirty objects with an attribute of 0 would need to be removed.

If anyone has a bright idea I'm all ears.  I just don't know SA well enough to know all the tools available to me for this.

Michael Bayer

unread,
Sep 27, 2011, 8:02:08 PM9/27/11
to sqlal...@googlegroups.com
SQLA itself can only give you "what's changed" in memory.   so if you had an X(), and you set X.y to "bar" where it used to be "foo", before a flush you can see that happen with session.dirty and attributes.get_history(x, "y").

I don't entirely understand the actual task here but if it has to do with comparing the contents of one database to another, the general way to do that is to load the objects, populate them with the data from the second database, then use .dirty to see that.

Clearly, if you are using that technique, you should do the "compare" job on clean objects, otherwise you can't be sure what changes are from the compare and what changes might have been local.   I don't really understand the problem with the "0" and doing it later - if there's some state on the object that determines the "0" and that state is blown away by setting new attributes from the other database, then either don't overwrite those particular attributes or save the important state elsewhere on the object or in a dictionary.



--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To view this discussion on the web visit https://groups.google.com/d/msg/sqlalchemy/-/Qk7ieEmaC98J.

Brian

unread,
Sep 28, 2011, 12:57:01 AM9/28/11
to sqlal...@googlegroups.com
Thanks for your responses.  I've been doing quite a bit of probing and I think I have what I need now to get the job done.  Sorry for the confusion, I wasn't being very clear.

The problem with setting active to 0 after the fact stems from the fact that I'm only querying each database for the "active" records.  The SIS database has thousands of records of students that are no longer in the district, and I don't need data on former students.  So I query the SIS for the active students, the ones that are going to school on the day my script runs.  But I have a list of "active" students also in my local database.  When I query for active students there, I'm getting the kids that were active a few days ago, the last time the script was run.  We may have had a student leave, and I need to account for that so I can disable their user accounts.

I think with session.is_modified and with your suggestion to use sqlalchemy.orm.attributes.get_history I can get all the information I need.

So, now if I start out by setting Student.active = 0 and then set Student.active = 1 for each record I get back from the SIS, is_modified will go back to false unless one of the other attributes changes, which is exactly what I was hoping I could do in the first place.  Then before my commit, I can get a list or dictionary or whatever of the Student objects that didn't get set back to active.

In doing all my probing, I found that setting a date attribute with a string like '8/26/2011' causes is_modified to be true even if the date object equaled date(2011, 8, 26)  (ie.  Student.AUPdate = '8/26/2011'), but if I set it using datetime.strptime('8/26/2011', '%m/%d/%Y').date() then is_modified stays false.  This is important because I was seeing all sorts of UPDATEs on records I knew had not changed when I was doing the merge, so it was really throwing me and I couldn't tell what had truly changed.

Thanks again.

Reply all
Reply to author
Forward
0 new messages