ObjectId generation: processId or threadId?

João Ferreira

Jan 18, 2018, 8:45:39 AM
to ReactiveMongo - http://reactivemongo.org
According to https://docs.mongodb.com/manual/reference/method/ObjectId/#ObjectIDs-BSONObjectIDSpecification the ObjectId is made from:
  • a 4-byte value representing the seconds since the Unix epoch,
  • a 3-byte machine identifier,
  • a 2-byte process id, and
  • a 3-byte counter, starting with a random value.
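
To make the layout above concrete, here is a minimal Scala sketch that splits a 24-character hex ObjectId into those four parts. The names `ObjectIdLayout`, `Parts` and `parse` are made up for illustration and belong to neither ReactiveMongo nor the Java driver; the third field is called `processOrThread` because which value a driver stores there is exactly the question raised here.

```scala
// Illustrative only: splits the hex form of an ObjectId by the documented byte layout.
object ObjectIdLayout {
  final case class Parts(
    timestamp: Long,         // 4 bytes: seconds since the Unix epoch
    machine: String,         // 3 bytes: machine identifier (kept as hex)
    processOrThread: String, // 2 bytes: process id per the spec (kept as hex)
    counter: Int             // 3 bytes: counter
  )

  def parse(hex: String): Parts = {
    require(hex.length == 24, "an ObjectId is 12 bytes, i.e. 24 hex characters")
    Parts(
      timestamp = java.lang.Long.parseLong(hex.substring(0, 8), 16),
      machine = hex.substring(8, 14),
      processOrThread = hex.substring(14, 18),
      counter = Integer.parseInt(hex.substring(18, 24), 16)
    )
  }
}
```

For example, `ObjectIdLayout.parse("5a66194525000025002efb34")` yields the 2-byte field `"2500"` and the counter `0x2efb34`.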

However, looking at https://github.com/ReactiveMongo/ReactiveMongo/blob/master/bson/src/main/scala/types.scala#L522 we can see that the threadId is used instead.


By comparison, the official Java driver uses the processId: https://github.com/mongodb/mongo-java-driver/blob/3.4.x/bson/src/main/org/bson/types/ObjectId.java#L530


This is causing issues in our tests, where we were expecting incremental ObjectIds...


I can work on a small reproduction project and a fix.


Regards

João

João Ferreira

Jan 18, 2018, 9:06:18 AM
to ReactiveMongo - http://reactivemongo.org
After searching a bit more, I found https://groups.google.com/forum/?fromgroups#!topic/reactivemongo/oPuDBnXhr7g

However, I am more worried about the ids not being incremental than about collisions...

Cédric Chantepie

Jan 19, 2018, 2:50:05 PM
to ReactiveMongo - http://reactivemongo.org


On Thursday, 18 January 2018 15:06:18 UTC+1, João Ferreira wrote:
After searching a bit more I found out https://groups.google.com/forum/?fromgroups#!topic/reactivemongo/oPuDBnXhr7g

However I am more worried about the ids not being incremental and less about collisions...

On Thursday, January 18, 2018 at 1:45:39 PM UTC, João Ferreira wrote:
...


This is causing issues in our tests, where we were expecting incremental ObjectIds...


I cannot see which part of the ObjectID specification leads you to expect such incremental behaviour?

João Ferreira

Jan 22, 2018, 12:35:20 PM
to ReactiveMongo - http://reactivemongo.org
Within a single process, ObjectIDs should be incremental because each part is either stable or incremental:
  • a 4-byte value representing the seconds since the Unix epoch (incremental),
  • a 3-byte machine identifier (stable in the same process),
  • a 2-byte process id (stable in the same process),
  • a 3-byte counter, starting with a random value (incremental).
However, when a thread id is used instead of a process id, that part is neither incremental nor stable (it can decrease). In reactive applications, we avoid blocking, and code runs on whichever thread is available.
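
The argument above can be sketched with a simplified generator. This is not ReactiveMongo's implementation: `SketchIds`, `generate` and `pair` are hypothetical names, the timestamp and machine parts are fixed constants, and the 2-byte field is passed in explicitly so the effect is deterministic. It only illustrates that, with the string comparison used in the tests, a change in the 2-byte field is compared before the counter.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical, simplified generator; NOT how ReactiveMongo builds ids.
object SketchIds {
  private val counter = new AtomicInteger(0x2efb34) // arbitrary start value
  private val Seconds = 0x5a661949                  // fixed timestamp part (assumption)
  private val Machine = 0x250000                    // fixed machine part (assumption)

  // Layout: 4-byte seconds, 3-byte machine, 2-byte middle field, 3-byte counter.
  def generate(middle: Int): String = {
    val c = counter.getAndIncrement() & 0xffffff
    f"$Seconds%08x$Machine%06x${middle & 0xffff}%04x$c%06x"
  }

  // Generates two ids in sequence, each with its own value for the 2-byte field.
  def pair(firstMiddle: Int, secondMiddle: Int): (String, String) = {
    val first = generate(firstMiddle)
    val second = generate(secondMiddle)
    (first, second)
  }
}
```

With a stable field, as in `SketchIds.pair(0x2800, 0x2800)`, the first id sorts before the second. With `SketchIds.pair(0x2800, 0x2700)` the first id sorts after the second, even though the counter still increased.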

I wrote a small reproduction snippet that I ran on https://github.com/cchantep/RM-SBT-Playground. The idea is that ObjectIds generated in the same thread are incremental (see method testIds), but the ones generated in different threads are not (see method testIdsAsync). Calling testIdsAsync a few times eventually produces the failure:

$ ./run.sh
SBT command: sbt
[info] Loading project definition from /home/joao/git/RM-SBT-Playground/project
[info] Loading settings from build.sbt ...
[info] Set current project to RM-SBT-Playground (in build file:/home/joao/git/RM-SBT-Playground/)
[info] Starting scala interpreter...
Welcome to Scala 2.12.4 (OpenJDK 64-Bit Server VM, Java 1.8.0_144).
Type in expressions for evaluation. Or try :help.

scala> :paste
// Entering paste mode (ctrl-D to finish)

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global
import reactivemongo.bson._

def testIds() = { val a = BSONObjectID.generate(); val b = BSONObjectID.generate(); (a,b,a.stringify < b.stringify)  }

def testIdsAsync() = Future { BSONObjectID.generate() }.flatMap{ a => Future { val b = BSONObjectID.generate(); (a,b,a.stringify < b.stringify) }}

// Exiting paste mode, now interpreting.

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global
import reactivemongo.bson._
testIds: ()(reactivemongo.bson.BSONObjectID, reactivemongo.bson.BSONObjectID, Boolean)
testIdsAsync: ()scala.concurrent.Future[(reactivemongo.bson.BSONObjectID, reactivemongo.bson.BSONObjectID, Boolean)]

scala> testIds()
res0: (reactivemongo.bson.BSONObjectID, reactivemongo.bson.BSONObjectID, Boolean) = (BSONObjectID("5a66194525000025002efb34"),BSONObjectID("5a66194525000025002efb35"),true)

scala> testIds()
res1: (reactivemongo.bson.BSONObjectID, reactivemongo.bson.BSONObjectID, Boolean) = (BSONObjectID("5a66194625000025002efb36"),BSONObjectID("5a66194625000025002efb37"),true)

scala> testIds()
res2: (reactivemongo.bson.BSONObjectID, reactivemongo.bson.BSONObjectID, Boolean) = (BSONObjectID("5a66194625000025002efb38"),BSONObjectID("5a66194625000025002efb39"),true)

scala> testIds()
res3: (reactivemongo.bson.BSONObjectID, reactivemongo.bson.BSONObjectID, Boolean) = (BSONObjectID("5a66194625000025002efb3a"),BSONObjectID("5a66194625000025002efb3b"),true)

scala>

scala> Await.result(testIdsAsync(), 1.second)
res4: (reactivemongo.bson.BSONObjectID, reactivemongo.bson.BSONObjectID, Boolean) = (BSONObjectID("5a66194725000027002efb3c"),BSONObjectID("5a66194725000027002efb3d"),true)

scala> Await.result(testIdsAsync(), 1.second)
res5: (reactivemongo.bson.BSONObjectID, reactivemongo.bson.BSONObjectID, Boolean) = (BSONObjectID("5a66194725000028002efb3e"),BSONObjectID("5a66194725000028002efb3f"),true)

scala> Await.result(testIdsAsync(), 1.second)
res6: (reactivemongo.bson.BSONObjectID, reactivemongo.bson.BSONObjectID, Boolean) = (BSONObjectID("5a66194725000027002efb40"),BSONObjectID("5a66194725000027002efb41"),true)

scala> Await.result(testIdsAsync(), 1.second)
res7: (reactivemongo.bson.BSONObjectID, reactivemongo.bson.BSONObjectID, Boolean) = (BSONObjectID("5a66194925000028002efb42"),BSONObjectID("5a66194925000027002efb43"),false)

scala>

Note that res7 returns false because the threadId part was not stable: "2800" vs "2700".
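
As a rough way to check this kind of result, the following Scala sketch reports which field contains the first differing hex character of two ids, which is the character that decides a string comparison. `FirstDifference` and `decidingField` are hypothetical helper names and the field labels are my own; nothing here comes from ReactiveMongo.

```scala
// Illustrative helper: names the field that decides a string comparison of two ids.
object FirstDifference {
  // (label, start index, end index) of each field in the 24-character hex string
  private val fields = Seq(
    ("timestamp", 0, 8),
    ("machine", 8, 14),
    ("process-or-thread", 14, 18),
    ("counter", 18, 24)
  )

  def decidingField(a: String, b: String): Option[String] = {
    require(a.length == 24 && b.length == 24, "expected two 24-character hex ids")
    // Position of the first differing character, if the ids differ at all
    val firstDiff = a.indices.find(i => a(i) != b(i))
    firstDiff.flatMap { i =>
      fields.collectFirst { case (label, start, end) if i >= start && i < end => label }
    }
  }
}
```

Applied to the two ids of res7, `decidingField` returns `Some("process-or-thread")`; applied to the two ids of res4, it returns `Some("counter")`.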

Since the way the scheduler distributes work across threads is not deterministic, one might have to run this code several times.

Regards
João