Glamkit-eventtools: Prevent duplicate events?


J. Heasly

Apr 27, 2012, 5:30:49 PM
to glamkit-d...@googlegroups.com
Hi —

Are there any built-in save()-type methods that help filter out the submission of duplicate events, i.e., occurrences that overlap in start, end and location? I've been looking through the glamkit-eventtools code and am not seeing anything ...

Thanks in advance for any tips, pointers, advice on code to look at, etc. (Cross-posting on glamkit-users.)

— John

Greg Turner

Apr 30, 2012, 4:39:57 AM
to glamkit-d...@googlegroups.com
Hi John,

Generators do not create an Occurrence if one already exists with the same `start` (which we assume to be sufficient to determine a duplicate). The code isn't great, though: it would have been better to maintain a set and test for membership, rather than call .filter() every time.
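
A minimal sketch of the set-membership idea, in plain Python rather than the actual eventtools Generator code. The function name `new_starts` and the sample data are hypothetical; the point is only that existing start times are collected once and each candidate is then tested against the set, instead of querying per candidate.

```python
from datetime import datetime, timedelta


def new_starts(candidate_starts, existing_starts):
    """Return the candidate starts that are not already taken, in order.

    Hypothetical helper: in a real Generator, `existing_starts` would be
    read from the database once, not once per candidate.
    """
    seen = set(existing_starts)  # built once, then constant-time lookups
    fresh = []
    for start in candidate_starts:
        if start not in seen:
            seen.add(start)  # also guards against repeats among the candidates
            fresh.append(start)
    return fresh


first = datetime(2012, 5, 4, 20, 0)
weekly = [first + timedelta(weeks=i) for i in range(4)]
existing = [weekly[1]]  # pretend this occurrence is already stored
result = new_starts(weekly, existing)
```

Here `result` holds the first, third and fourth weekly starts; the second is skipped because it already exists.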


I'm wondering if it might be better to move this functionality into Occurrence, which would instead update an existing occurrence if `start` matches? That would mean that in no case can an Event have two Occurrences that start at the same time. I would use an is_duplicate method to test, so that the method can be overridden in the case that e.g. end time or Venue is important for duplicate detection. Any thoughts on that approach?
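
To make the proposal concrete, here is a rough sketch using an in-memory list in place of the database. None of this is existing eventtools code: `Occurrence`, `OccurrenceStore` and the update-in-place behaviour in `save()` are assumptions that only illustrate the shape of an overridable `is_duplicate` hook.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Occurrence:
    event: str
    start: datetime
    end: Optional[datetime] = None

    def is_duplicate(self, other: "Occurrence") -> bool:
        # Default rule from the proposal: same event and same start.
        # Subclasses could override this to also compare end time or venue.
        return self.event == other.event and self.start == other.start


class OccurrenceStore:
    """Hypothetical stand-in for the database table."""

    def __init__(self) -> None:
        self.rows: List[Occurrence] = []

    def save(self, occ: Occurrence) -> Occurrence:
        for i, existing in enumerate(self.rows):
            if occ.is_duplicate(existing):
                self.rows[i] = occ  # update the existing row instead of adding
                return occ
        self.rows.append(occ)
        return occ


store = OccurrenceStore()
eight_pm = datetime(2012, 5, 4, 20, 0)
store.save(Occurrence("gig", eight_pm, datetime(2012, 5, 4, 22, 0)))
store.save(Occurrence("gig", eight_pm, datetime(2012, 5, 4, 23, 0)))
store.save(Occurrence("other-gig", eight_pm))
```

After these three saves the store holds two rows: one for "gig" (with the later end time, since the second save replaced the first) and one for "other-gig".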

Greg.

J. Heasly

May 1, 2012, 11:34:50 PM
to glamkit-d...@googlegroups.com
Hello Greg —

Being able to add in the Venue on a duplicate check would be great, in fact absolutely necessary for our purposes (a countywide entertainment calendar). For example, last Friday night there were eight events at different venues all starting at 8 p.m., so start time alone won't work. The ability to check the Venue is definitely something I would need to add with an override.

Just to be clear: what I'm after is allowing the general public to create events that would go into a quarantine area and get reviewed before being turned loose. The biggest hurdle seems to be making sure one event doesn't get entered three times by three different parties. Since they're likely to have variations on the Occurrence name, the only sane way, it seemed to me, was to run a check that includes the three data points of Occurrence date, time and Venue. Is eventtools the tool for me?

Thanks for all your work on this!
John

Greg Turner

May 2, 2012, 6:10:27 AM
to glamkit-d...@googlegroups.com
Sorry, I meant to be clearer:

Occurrences would be judged to be unique based on their event and start time. So we would allow several events to start at the same time, of course, but not allow one event to have several Occurrences with the same start time.

For your specific use case, you'd need the overridable 'is_duplicate()' on Occurrence - or even Event (though that may not be a part of eventtools just yet). You would override it to test start and venue, and I'd suggest something like a Levenshtein distance threshold on Event title/slug, because it's plausible that two events with different names could start at the same time at the same venue.
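
A sketch of what such an override might look like, written as plain Python rather than against the real eventtools models. The `Submission` class, its fields and the threshold of 3 are all hypothetical; the `levenshtein` function is a standard dynamic-programming edit distance included so the example runs without a C extension.

```python
from dataclasses import dataclass
from datetime import datetime

MAX_TITLE_DISTANCE = 3  # assumed threshold; would need tuning on real titles


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


@dataclass
class Submission:
    title: str
    venue: str
    start: datetime

    def is_duplicate(self, other: "Submission") -> bool:
        # Same venue and start are required; near-identical titles confirm it.
        if self.venue != other.venue or self.start != other.start:
            return False
        distance = levenshtein(self.title.lower(), other.title.lower())
        return distance <= MAX_TITLE_DISTANCE


when = datetime(2012, 4, 27, 20, 0)
a = Submission("Jazz Night", "Town Hall", when)
b = Submission("Jazz Nights", "Town Hall", when)
c = Submission("Poetry Slam", "Town Hall", when)
```

With this rule `a.is_duplicate(b)` is true (the titles differ by one character), while `a.is_duplicate(c)` is false even though venue and start match, which is the case of two differently named events sharing a slot.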

J. Heasly

May 2, 2012, 10:37:09 PM
to glamkit-d...@googlegroups.com
An overridable 'is_duplicate()' on Occurrence seems like a fine idea. And the Levenshtein distance threshold [1] — which sent me scrambling to Google; news to me — on the Event slug seems like a solid approach; not having to futz with the Venue definitely has its appeal.

So currently there's no is_duplicate() in https://github.com/glamkit/glamkit-eventtools/blob/api-tidy/eventtools/models/occurrence.py. (I was looking in the api-tidy branch based on what I read here: https://github.com/glamkit/glamkit-eventtools/issues/18#issuecomment-4630877) Is it something I should add to OccurrenceModel()? What's your advice on how I proceed/experiment?

Or should I just do it higher up, in some sort of save() override?

Thanks for your time & guidance,
John

[1] Found this Python C extension: http://code.google.com/p/pylevenshtein/

Greg Turner

May 3, 2012, 1:04:51 AM
to glamkit-d...@googlegroups.com
Yep, Occurrence.is_duplicate() doesn't exist yet, but probably should, if unique_together is insufficient.

However, I haven't really thought through whether using is_duplicate to inhibit save(), or within save() itself, will have unacceptable consequences. e.g. if two Generators both generate the same Occurrence but with different durations, does it matter that only one will remain? And would it be the first-generated or last-generated? Perhaps some experimentation and documentation of the edge cases is in order.

J. Heasly

May 4, 2012, 2:22:01 AM
to glamkit-d...@googlegroups.com
Ah, I'd missed that unique_together; thanks for pointing it out. That would be just the ticket if I could rely on disparate users entering an event using the same character-for-character name. (Everything would be great if it weren't for users!)

The comparison I was considering would happen in save(): compare the proposed, about-to-be-saved Occurrence against the persistent Occurrences (those already in the database), and treat *any* overlap between the durations at the same Venue (the way our calendar's set up, there's only one performance per Venue) as a no-go flag. Exactly what happens next, I hadn't really fleshed out. Probably/possibly saving it to the Queue of Troublesome Events/Occurrences, with the user given notice of the conflict.
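
Roughly what I have in mind, as a plain-Python sketch with lists standing in for the database and the review queue. The names `overlaps` and `submit` are made up for illustration, and treating intervals as half-open (so back-to-back shows don't conflict) is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Occurrence:
    venue: str
    start: datetime
    end: datetime


def overlaps(a: Occurrence, b: Occurrence) -> bool:
    # Same venue, and the half-open intervals [start, end) intersect.
    return a.venue == b.venue and a.start < b.end and b.start < a.end


def submit(proposed, accepted, review_queue):
    """Accept the occurrence, or park it for review if it conflicts."""
    conflicts = [o for o in accepted if overlaps(proposed, o)]
    if conflicts:
        review_queue.append((proposed, conflicts))  # keep what it clashed with
        return False
    accepted.append(proposed)
    return True


accepted, review_queue = [], []
day = datetime(2012, 4, 27)
first = Occurrence("Town Hall", day.replace(hour=20), day.replace(hour=22))
clash = Occurrence("Town Hall", day.replace(hour=21), day.replace(hour=23))
later = Occurrence("Town Hall", day.replace(hour=22), day.replace(hour=23))
elsewhere = Occurrence("Library", day.replace(hour=21), day.replace(hour=23))
outcomes = [submit(o, accepted, review_queue)
            for o in (first, clash, later, elsewhere)]
```

Only `clash` ends up in the review queue: it overlaps `first` at the same venue, while `later` starts exactly when `first` ends and `elsewhere` is at a different venue.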

Maybe the simpler thing is just a Levenshtein distance threshold comparison on the Event title/slug if there's a start date/time overlap. Hrmm.