Questions about uids

adr...@gmail.com

Feb 13, 2013, 8:57:46 AM

to scal...@googlegroups.com

Hello,

No offense intended; these are really naive beginner questions from someone discovering the code base.

In uid.erl, around line 37, I see an interesting comment: "It is not completely safe, but safe enough".

About the current uid implementation:

What are the uids for? (I still haven't understood "the big picture" of Scalaris, only some pieces.)
Why is the implementation "not completely safe"? (In which case does it do what harm?)
Why is it "safe enough"? Enough for which operational case?

About safe uid implementations for Scalaris:
Would the use of one of

Flake https://github.com/boundary/flake
Twitter's original Snowflake https://github.com/twitter/snowflake
gproc counters
some other battle-tested "counter/uid"

make it "completely safe"?

Pierre M.

Florian Schintke

Feb 13, 2013, 9:34:43 AM

to scal...@googlegroups.com

Hi,

> In uid.erl, around line 37, I see an interesting comment: "It is not
> completely safe, but safe enough".

The answer is in the two lines above:
"Same pid may be reused in the same VM, so we get a VM unique offset to start"

The function you mention tries to find an id that is unique inside an
Erlang process. When no id counter is found for the process, it seeds
one by retrieving a number from the Erlang VM (the number of
reductions the VM has performed so far). This call is fast, and the
returned number increases with the work the Erlang VM has done. It is
made only once per pid; after that, the number is incremented each
time the same process asks for its next unique id.
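
A minimal sketch of the scheme just described, written from the
description above rather than copied from uid.erl. The module name
uid_sketch, the function name next_uid/0, and the process dictionary
key are hypothetical; erlang:statistics(reductions), erlang:get/1 and
erlang:put/2 are standard BIFs.

```erlang
-module(uid_sketch).
-export([next_uid/0]).

%% Hypothetical key under which the per-process counter is stored.
-define(COUNTER_KEY, uid_sketch_counter).

%% Returns an integer that is unique within the calling process.
-spec next_uid() -> non_neg_integer().
next_uid() ->
    Next = case erlang:get(?COUNTER_KEY) of
               undefined ->
                   %% First call in this process: seed the counter from
                   %% the VM-wide reduction count, which only grows as
                   %% the VM does more work.
                   {TotalReductions, _SinceLastCall} =
                       erlang:statistics(reductions),
                   TotalReductions;
               Last ->
                   %% Later calls: increment the stored counter.
                   Last + 1
           end,
    erlang:put(?COUNTER_KEY, Next),
    Next.
```

Two consecutive calls to uid_sketch:next_uid() in the same process
return some integer N and then N + 1; the reduction count is consulted
only on the first call.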

Normally you will not see the same id twice. If the Erlang VM later
started a process with the same pid (because it may reuse pids of
terminated processes at some point?), and this new process with the
same pid asked for a unique id, it would again retrieve the number of
reductions the Erlang VM had performed up to that point (probably a
much larger number than the one the first process retrieved). So,
still no collision. Only if the first process drew unique ids faster
than the Erlang VM increased its reduction counter could the ids
overlap and not be unique. I do not expect that to happen in a
practical system.
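
The overlap condition can be written down as a small check. This is
only an illustration of the argument above under its stated
assumptions (ids are a seed plus consecutive increments, and the
second seed is not smaller than the first); the module uid_overlap and
its function are hypothetical and not part of Scalaris.

```erlang
-module(uid_overlap).
-export([overlaps/3]).

%% Seed1:   reduction count when the first process drew its first id.
%% Issued1: number of ids that process drew before it terminated.
%% Seed2:   reduction count when the process reusing the pid drew its
%%          first id (assumed to be >= Seed1).
-spec overlaps(non_neg_integer(), pos_integer(), non_neg_integer()) ->
          boolean().
overlaps(Seed1, Issued1, Seed2) when Seed2 >= Seed1 ->
    %% The first process used the ids Seed1 .. Seed1 + Issued1 - 1.
    LastOfFirst = Seed1 + Issued1 - 1,
    %% The second process starts at Seed2 and counts upwards, so the
    %% ranges collide only if Seed2 falls inside the first range.
    Seed2 =< LastOfFirst.
```

For example, uid_overlap:overlaps(1000, 50, 5000) is false, while
uid_overlap:overlaps(1000, 50, 1020) is true: a collision needs the
first process to have drawn more ids than the VM performed reductions
between the two seeds.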

Would that have a negative influence on the system? Probably not. Most
unique ids are used to tag requests, so that replies can be identified
as having been sent for exactly that request. They are only of
temporary interest and are outdated and meaningless afterwards.

To disturb the system, a duplicated unique id would in most cases have
to occur within a very short time frame and be used in the same
context, which is once more very unlikely (as described above, I
assume plenty of time passes before an Erlang VM reuses process
identifiers).

So, the comment is a bit conservative when it states, 'it is safe
enough', but that is probably how scientists write...

> About this current uids implementation :
>
> 1. What are the uids for ? (I still haven't understood "the big picture"
> of scalaris, only some pieces)
> 2. Why is the implementation "not completely safe" ? (in which case does
> it do what harm?)
> 3. Why is it "safe enough"? Enough for which operational case ?

See above

> About safe uids implementations for scalaris :
> Would the use of one of
>
> 1. Flake https://github.com/boundary/flake
> 2. Original Twitter's Snowflake https://github.com/twitter/snowflake
> 3. gproc counters
> 4. some other battle proofed "counter/uid"
>
> make it "completely safe" ?

Not necessary to think about. 'Safe enough' *is* safe enough for any
practical scenario!

Florian

adr...@gmail.com

Feb 13, 2013, 10:37:54 AM

to scal...@googlegroups.com

Thank you for these explanations.
Yes, now I understand it is actually "safe enough".

Pierre M.
