Generate a daily serial/batch/lot number


kmpm

Sep 6, 2010, 2:19:39 AM
to Django users
I have a project in the manufacturing industry that is built on Django.
In it I need to generate a unique serial, batch or lot number
(whatever you would like to call it) that is a running number
from 0 upwards for each and every day.
There is a high risk of concurrency, so just finding the previous max
and then doing +1 before saving is not what I want.
The important part of the model looks like this...

class ProducedEntity(models.Model):
    ...
    production_date = models.DateField(auto_now_add=True)
    lot_no = models.PositiveIntegerField(default=0)

    class Meta:
        unique_together = ('production_date', 'lot_no')

Of course I could just save it and see whether the .save() call works
without raising an IntegrityError, but that's not elegant.
Is there a way of generating that number per day so that it's
done when I save the model?

Mike Dewhirst

Sep 6, 2010, 3:02:32 AM
to django...@googlegroups.com
On 6/09/2010 4:19pm, kmpm wrote:
> I have a project in the manufacturing industry that is built on Django.
> In it I need to generate a unique serial, batch or lot number
> (whatever you would like to call it) that is a running number
> from 0 upwards for each and every day.

So when the day ticks over at midnight, the serial number starts at
zero again? Does production continue across 00:00? Why don't
you use a continuously incrementing number?

I don't know what the "right" thing to do is. I'm sure there will be a
classic solution. I think I would be writing a singleton function
nextnum() which returns the next number as required and then

lot_no = models.IntegerField(default=nextnum)

I'm not sure what would happen threadwise but a singleton would be a
natural place to deal with thread-locking and restarting the sequence
when the clock ticks over.

lot_no then gets the value prior to the save()

Mike
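A minimal sketch of that suggestion (names are illustrative, not from this thread): a lock-protected nextnum() that restarts the sequence when the date changes. Note this is process-local, so it assumes a single server process.

```python
import threading
from datetime import date

class DailyCounter:
    """Process-local daily counter; restarts at 0 when the date changes."""

    def __init__(self):
        self._lock = threading.Lock()
        self._day = None
        self._next = 0

    def nextnum(self):
        with self._lock:
            today = date.today()
            if today != self._day:  # the clock ticked over: restart the sequence
                self._day = today
                self._next = 0
            value = self._next
            self._next += 1
            return value

_counter = DailyCounter()

def nextnum():
    """Module-level callable usable as a field default, e.g.
    lot_no = models.IntegerField(default=nextnum)"""
    return _counter.nextnum()
```

The lock makes concurrent threads within one process hand out distinct numbers, but as discussed below in the thread, it does nothing for multiple frontend servers.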

kmpm

Sep 6, 2010, 3:38:18 AM
to Django users
On Sep 6, 9:02 am, Mike Dewhirst <mi...@dewhirst.com.au> wrote:
> On 6/09/2010 4:19pm, kmpm wrote:
>
> > I have a project in the manufacturing industry that is built on Django.
> > In it I need to generate a unique serial, batch or lot number
> > (whatever you would like to call it) that is a running number
> > from 0 upwards for each and every day.
>
> So when the day ticks over at midnight, the serial number starts at
> zero again? Does production continue across 00:00? Why don't
> you use a continuously incrementing number?
>
> I don't know what the "right" thing to do is. I'm sure there will be a
> classic solution. I think I would be writing a singleton function
> nextnum() which returns the next number as required and then
>
>     lot_no = models.IntegerField(default=nextnum)
>
> I'm not sure what would happen threadwise but a singleton would be a
> natural place to deal with thread-locking and restarting the sequence
> when the clock ticks over.
>
> lot_no then gets the value prior to the save()
>
> Mike


Yes, it ticks over at midnight, and no, there is no production at that
time; more or less office hours only.
A singleton would be nice but, as you said, it's not thread safe and
won't scale nicely if we were to use multiple frontend servers or
anything like that.
There is a project called django-idmapper [1] that I just found that
might solve it by using a model that seems to be stored in shared
memory.
If running on multiple frontends that would need to be something like
memcached, but I don't know about thread safety and concurrency there
either.


[1] http://github.com/dcramer/django-idmapper


Daniel Roseman

Sep 6, 2010, 3:53:57 AM
to Django users
You should probably rely on your database to generate a unique
incrementing ID. If you're on Postgres, you can use a sequence, and
for MySQL there's some code emulating sequences here:
http://dev.mysql.com/doc/refman/5.1/en/information-functions.html#function_last-insert-id
(it's towards the end of the section on last_insert_id()).
--
DR.

Mike Dewhirst

Sep 6, 2010, 5:09:13 AM
to django...@googlegroups.com

Then it would need to be a database trigger. At least in PostgreSQL you
could write it in the supplied Python procedural language. It could
create a new temporary table with an incrementing column each night as
a source for the sequence number. That would cover occasional system
stoppages during the day.

I'm not sure how to integrate that in a nice way with Django models,
except as you already indicated with the unique-per-date constraint.

Tom Evans

Sep 6, 2010, 5:24:39 AM
to django...@googlegroups.com

UUIDs are your best bet. Even if you generated 100 billion UUIDs a
second for the next 100 years, the chance of one collision would only
be 50% [1].

class ProducedEntity(models.Model):
    ...
    uuid = models.CharField(max_length=36, unique=True)

def _hook_add_uuid(instance, sender, **kwargs):
    import uuid
    if not hasattr(instance, 'uuid') or not instance.uuid:
        instance.uuid = str(uuid.uuid4())

from django.db.models.signals import pre_save
pre_save.connect(_hook_add_uuid, sender=ProducedEntity)


UUIDs can be stored more efficiently than that; it's just a big 128-bit
number, so it can be stored as 2 long integers. See your DB manual for
more info.
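For example, splitting the 128-bit value into two unsigned 64-bit halves and back can be done with only the standard library (a sketch; how the two integers map onto columns depends on your database):

```python
import uuid

def uuid_to_pair(u):
    """Split a UUID's 128-bit integer into two unsigned 64-bit integers."""
    return u.int >> 64, u.int & ((1 << 64) - 1)

def pair_to_uuid(hi, lo):
    """Reassemble the UUID from its two 64-bit halves."""
    return uuid.UUID(int=(hi << 64) | lo)
```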

Cheers

Tom

[1] http://en.wikipedia.org/wiki/Universally_unique_identifier#Random_UUID_probability_of_duplicates

kmpm

Sep 6, 2010, 5:39:41 AM
to Django users
>
> UUIDs are your best bet. Even if you generated 100 billion UUIDs a
> second for the next 100 years, the chance of one collision would only
> be 50% [1].
>
> class ProducedEntity(models.Model):
>     ....
>     uuid = models.CharField(max_length=36, unique=True)
>
> def _hook_add_uuid(instance, sender, **kwargs):
>     import uuid
>     if not hasattr(instance, 'uuid') or not instance.uuid:
>         instance.uuid = str(uuid.uuid4())
>
> from django.db.models.signals import pre_save
> pre_save.connect(_hook_add_uuid, sender=ProducedEntity)
>
> UUIDs can be stored more efficiently than that; it's just a big 128-bit
> number, so it can be stored as 2 long integers. See your DB manual for
> more info.
>
> Cheers
>
> Tom
>
> [1]http://en.wikipedia.org/wiki/Universally_unique_identifier#Random_UUI...

First of all, thanks to everyone for all the good suggestions so far.

Regarding UUIDs, one of the business requirements (which I don't
control) is that the identifier to be used on barcodes and whatnot
must be in the format "YYMMDDXXXX", where XXXX is the number
we are talking about. So a UUID would be great and unique, as would
the automatic pk/id from the model itself, as that is unique enough.
But I need a rolling number per day and nothing else.
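A sketch of composing that identifier from the model's two fields (the field names come from the model posted earlier; the helper name is made up):

```python
from datetime import date

def barcode_id(production_date, lot_no):
    """Render the required 'YYMMDDXXXX' identifier:
    two-digit year, month, day, then the lot number zero-padded to four digits."""
    return production_date.strftime("%y%m%d") + "%04d" % lot_no

# barcode_id(date(2010, 9, 6), 7) -> "1009060007"
```

Four digits caps the daily sequence at 9999, which is a constraint worth checking against expected daily volumes.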

kmpm

Sep 6, 2010, 10:20:23 AM
to Django users
For anyone listening in...
the ticket http://code.djangoproject.com/ticket/2705 might also be
worth looking at

Preston Holmes

Sep 6, 2010, 12:52:23 PM
to Django users


On Sep 6, 2:39 am, kmpm <pe...@birchroad.net> wrote:
> First of all, thanks to everyone for all the good suggestions so far.
>
> Regarding UUIDs, one of the business requirements (which I don't
> control) is that the identifier to be used on barcodes and whatnot
> must be in the format "YYMMDDXXXX", where XXXX is the number
> we are talking about. So a UUID would be great and unique, as would
> the automatic pk/id from the model itself, as that is unique enough.
> But I need a rolling number per day and nothing else.

What about a simple function that does something as low-tech as
checking a lock file? If you had multiple frontend servers, this
could be a running process on one server using a semaphore lock.

-Preston

raj

Sep 6, 2010, 2:40:05 PM
to Django users

Hi all,
Why can't we use the aggregate function 'max' from within save()? I
found it works to find the lot_max among the objects filtered by
production_date; then if lot_max is None, lot_no must be 0, else it's
just lot_max + 1. I don't know if I've misunderstood some requirement,
by the way.

Raj.
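In Django terms that would be a save() override using aggregate(Max('lot_no')) on the queryset filtered by production_date. Stripped of the ORM, the allocation rule is just the following (a sketch; note that the window between this read and the subsequent save is exactly where two concurrent writers can collide):

```python
def next_lot_no(existing_lot_numbers):
    """max+1 allocation: 0 if nothing exists yet for the day, else max + 1."""
    lot_max = max(existing_lot_numbers, default=None)
    return 0 if lot_max is None else lot_max + 1
```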

kmpm

Sep 6, 2010, 2:49:54 PM
to Django users


On Sep 6, 6:52 pm, Preston Holmes <pres...@ptone.com> wrote:
>
> What about a simple function that does something as low-tech as
> checking a lock file? If you had multiple frontend servers, this
> could be a running process on one server using a semaphore lock.
>
> -Preston

I sort of thought about that as well.
Meanwhile I found two other solutions to similar problems.
One is really high tech and made for GAE [1], dealing with
high-concurrency counters using memcached and periodically persisting
the counter to the database. That solution involves something similar
to what a lock file would do, but not quite...
The other [2] is, on the other hand, very low tech and deals with race
conditions in Django in a more trial-and-error kind of way, which was
my first plan for this: try your best by looking at a previous Max
value and take care of the IntegrityError that might occur.
I am really intrigued by the scalability of the first one, and the
pragmatist in me would go for the second.

I really don't want to hit the database for either reads or writes more
than necessary, so ideally I would get a highly probable counter value
as a default by doing something similar to the GAE [1] approach, and
just deal with the IntegrityError when that unlikely event occurs, as
option 2 describes.
What would the Django community go for? It is of course a matter of
what scalability and performance requirements you have, but what would
be the "right" way if you wanted as pure a Django solution as
possible? All these questions... and too many answers.

[1] http://appengine-cookbook.appspot.com/recipe/high-concurrency-counters-without-sharding/
[2] http://stackoverflow.com/questions/3522827/

Mike Dewhirst

Sep 6, 2010, 7:16:01 PM
to django...@googlegroups.com
I'm reminded that almost all my own disasters were caused by premature
optimisation.

I would go with the simplest solution and keep an eye on performance as
it scales. Gives you plenty of time to research plan B.

That'll be 2c please

M

kmpm

Sep 7, 2010, 2:06:04 AM
to Django users
We might have a race condition if 2 or more occurrences of the same
event happen at the same time.
If two processes start at about the same time, then both will get the
same 'max' from the database. The first one that saves will be OK; the
second will get an IntegrityError because it tries to use the same
lot_no.
That's why something a little more is needed, but you are basically
right.
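That "something a little more" can be a small retry loop: compute max+1, try to save, and on an IntegrityError from the unique_together constraint recompute and try again. A minimal sketch with the ORM calls stubbed out as callables (current_max and save_entity stand in for the real Django queries; the IntegrityError class here is a stand-in for django.db.IntegrityError):

```python
class IntegrityError(Exception):
    """Stand-in for django.db.IntegrityError."""

def allocate_lot_no(current_max, save_entity, attempts=5):
    """Optimistic max+1 allocation with retry on a unique collision.

    current_max() returns today's highest lot_no (or None if none yet);
    save_entity(lot_no) persists the row, raising IntegrityError if
    another process already took that number.
    """
    for _ in range(attempts):
        lot_max = current_max()
        lot_no = 0 if lot_max is None else lot_max + 1
        try:
            save_entity(lot_no)
            return lot_no
        except IntegrityError:
            continue  # another process won the race; recompute and retry
    raise RuntimeError("could not allocate a lot number after %d attempts" % attempts)
```

With office-hours-only production and few concurrent writers, collisions should be rare, so the loop almost always succeeds on its first pass.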

Shamail Tayyab

Sep 7, 2010, 2:48:52 AM
to django...@googlegroups.com
On Tuesday 07 September 2010 04:46 AM, Mike Dewhirst wrote:
> I'm reminded that almost all my own disasters were caused by premature
> optimisation.
Second that! :-)

>
> I would go with the simplest solution and keep an eye on performance
> as it scales. Gives you plenty of time to research plan B.
>

--
Shamail Tayyab
Blog: http://shamail.in/blog

Darryl Ross

Oct 8, 2010, 1:29:28 AM
to django...@googlegroups.com
On 06/09/10 19:09, kmpm wrote:
> Regarding UUIDs, one of the business requirements (which I don't
> control) is that the identifier to be used on barcodes and whatnot
> must be in the format "YYMMDDXXXX", where XXXX is the number
> we are talking about. So a UUID would be great and unique, as would
> the automatic pk/id from the model itself, as that is unique enough.
> But I need a rolling number per day and nothing else.

I can't see whether you mentioned which database you are using, but if
you are using Postgres, why can't you just use the default PK field and
run a cron job to execute a Postgres setval() at midnight?

#!/bin/sh
DATECODE=$( date +%Y%m%d )
echo "SELECT setval('table_id_seq', '${DATECODE}0000');" | psql ....


Regards
Darryl
