Message ID best practice? (or how best to define related messages in Scala)


Shawn

Dec 14, 2013, 1:04:14 PM
to akka...@googlegroups.com

To give a realistic(?) example: I've read that it's good practice to have messages carry a unique identifier so you can track them (e.g. via logging).
I'm also assuming it's good practice to use case classes for defining messages.

Given definitions like:

case class Msg1(id: Int, a: String)
case class Msg2(id: Int, b: String)
case class Msg3(id: Int, c: String)

Is there some way I could factor out the common "id" property, and possibly its implementation?

More generally, how does one define messages with shared structure (and perhaps even light behavior) without repeating oneself?

How does one declare an Akka hierarchy of related messages without that repetition?

Shawn

Dec 14, 2013, 1:51:51 PM
to akka...@googlegroups.com
Hacking around a bit I created something like:

abstract class IdMsg(val id:Long = System.nanoTime)

case class AMsg(a:Int, b:String) extends IdMsg {
  override def toString = s"(a = $a, b= $b, id = $id)"
}

val a = AMsg(1,"foo")
val b = AMsg(2,"foo")

which results in:

> defined class IdMsg

> defined class AMsg

> a: AMsg = (a = 1, b= foo, id = 230607343776394)
> b: AMsg = (a = 2, b= foo, id = 230607421074922)


However, this requires me to provide my own implementation of toString (without it, a is printed as just AMsg(1,foo)), and it uses up the inheritance slot. Maybe that's not so bad, since I don't think the default case class toString is descriptive enough?

Is there anything else horrific about this?
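
One thing worth checking with this approach is how the generated case class methods treat id. The sketch below reuses the IdMsg and AMsg definitions (without the custom toString); the object name IdMsgPitfall is made up for the example. Because id is not in the case class parameter list, the generated equals, hashCode, and copy do not take it into account.

```scala
abstract class IdMsg(val id: Long = System.nanoTime())

case class AMsg(a: Int, b: String) extends IdMsg

object IdMsgPitfall {
  def main(args: Array[String]): Unit = {
    val first = AMsg(1, "foo")
    val second = AMsg(1, "foo")
    // Equality compares only a and b, so this is true even when the ids differ.
    println(first == second)
    // copy goes through the constructor, so the id is regenerated, not carried over.
    val changed = first.copy(b = "bar")
    println(s"${first.id} -> ${changed.id}")
  }
}
```

So two messages that differ only by id compare as equal, which matters if messages are ever put in a Set or used as Map keys.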

Derek Wyatt

Dec 15, 2013, 1:01:04 PM
to akka...@googlegroups.com

Nicholas Sterling

Dec 15, 2013, 5:52:54 PM
to akka...@googlegroups.com
Is it really safe, in this multi-core era, to assume that two IDs could not be generated in the same nanosecond?

Derek Wyatt

Dec 15, 2013, 5:55:37 PM
to akka...@googlegroups.com
On 2013-12-15, at 5:52 PM, Nicholas Sterling wrote:

Is it really safe, in this multi-core era, to assume that two IDs could not be generated in the same nanosecond?

Absolutely.  Use whatever mechanism you want. A good UUID generator, an AtomicLong, whatever... This is a problem that's been heavily solved for a long time now.
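
For reference, a minimal sketch of the two mechanisms named here, using only the JDK. The MessageIds object and its method names are hypothetical, and ids from nextSequential are unique only within a single JVM.

```scala
import java.util.UUID
import java.util.concurrent.atomic.AtomicLong

object MessageIds {
  // One counter shared by every caller in this JVM.
  private val counter = new AtomicLong(0L)

  // Thread-safe and strictly increasing, but all threads contend on the same counter.
  def nextSequential(): Long = counter.incrementAndGet()

  // Random (version 4) UUID; needs no coordination between JVMs.
  def nextUuid(): UUID = UUID.randomUUID()
}
```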


--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://akka.io/faq/
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to a topic in the Google Groups "Akka User List" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/akka-user/84Mb4pEp4wQ/unsubscribe.
To unsubscribe from this group and all its topics, send an email to akka-user+...@googlegroups.com.
To post to this group, send email to akka...@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/groups/opt_out.

Nicholas Sterling

Dec 15, 2013, 8:25:53 PM
to akka...@googlegroups.com
Judging from the response, I probably wasn't clear enough -- let me try again.

Generating unique IDs by grabbing System.nanoTime, as Shawn's code snippet above does, appears to assume that you'll never generate two IDs in the same nanosecond.  I wonder whether that's a safe assumption.  Unless there are special circumstances that guarantee that it won't happen, I wouldn't assume so.
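
A rough way to test that assumption on a given machine is to sample System.nanoTime in a tight loop and count repeated values. This is only a single-threaded sketch; the object and method names are made up, and the result depends on the OS, JVM, and hardware, so no particular output is claimed.

```scala
object NanoTimeCollisions {
  // Take n consecutive readings of the timer.
  def sample(n: Int): Array[Long] = Array.fill(n)(System.nanoTime())

  // Number of readings that repeat a value already seen.
  def duplicates(samples: Array[Long]): Int = samples.length - samples.distinct.length

  def main(args: Array[String]): Unit = {
    val n = 1000000
    println(s"duplicates among $n samples: ${duplicates(sample(n))}")
  }
}
```

Any non-zero count means two ids generated back to back on one thread would already have collided; with several threads the odds only get worse.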

Roland Kuhn

Dec 16, 2013, 2:09:00 AM
to akka-user
Hi Derek,

On 15 Dec 2013, at 23:55, Derek Wyatt <de...@derekwyatt.org> wrote:

Absolutely.  Use whatever mechanism you want. A good UUID generator, an AtomicLong, whatever... This is a problem that's been heavily solved for a long time now.

Those mechanisms you quote are not appropriate to be used at the rates proposed and shared across threads, they are simply too heavy. Doing something once may be fine, but doing it 50 million times per second can be a problem. Anecdotally, we already generate a unique ID—the identity of the message envelope—but even that is optimized away if not needed: there is a noticeable performance price you pay if you touch an object’s identityHashCode because that is normally not stored and instead the (unstable) memory location is used. So, what exactly do you refer to when saying that this problem has been solved long ago?

Regards,

Roland






Dr. Roland Kuhn
Akka Tech Lead
Typesafe – Reactive apps on the JVM.
twitter: @rolandkuhn


Derek Wyatt

Dec 16, 2013, 5:33:10 AM
to akka...@googlegroups.com
On Dec 16, 2013, at 2:09 AM, Roland Kuhn <goo...@rkuhn.info> wrote:

Those mechanisms you quote are not appropriate to be used at the rates proposed and shared across threads, they are simply too heavy. Doing something once may be fine, but doing it 50 million times per second can be a problem. Anecdotally, we already generate a unique ID—the identity of the message envelope—but even that is optimized away if not needed: there is a noticeable performance price you pay if you touch an object’s identityHashCode because that is normally not stored and instead the (unstable) memory location is used. So, what exactly do you refer to when saying that this problem has been solved long ago?

Well, I think we need to be realistic here.  If an app is going to do interesting work, you're not going to be generating 50 mil msgs per second. Theoretical discussions are fine, but in reality I don't think we're talking about that scale.  As for how the problem has been solved, I think there are a number of reasonably obvious solutions, no?

def genId(implicit uniqueifier: Long) = s"${uniqueifier}-${System.nanoTime}"

Now, give each Actor an implicit uniqueifier, and then put in a guarantee that no single Actor will call this more than once per nanosecond, e.g. create enough Actors to do the work, loop until the new Id isn't the same as the last one, etc.
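
The "loop until the new Id isn't the same as the last one" variant could be sketched as follows. TimestampIdGenerator and its clock parameter are hypothetical names; the instance is assumed to be owned by a single actor, so it keeps plain mutable state without synchronization.

```scala
final class TimestampIdGenerator(
    uniqueifier: Long,
    clock: () => Long = () => System.nanoTime()) {

  // Last timestamp handed out by this instance.
  private var last = Long.MinValue

  def next(): String = {
    var now = clock()
    // Spin until the timer has moved on, whatever its real resolution is.
    while (now == last) now = clock()
    last = now
    s"$uniqueifier-$now"
  }
}
```

The clock is injectable only so the behavior can be checked without relying on the real timer.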

Or, give each Actor a uniqueifier, and let it use a monotonically increasing postfix identifier, e.g. generate the uniqueifier as a UUID and then let the Actor increment a Long until it gets exhausted, then pick another UUID, or crash the Actor, or...

Vary these themes as you see fit.  "Solved" here means that I'm not being all that inventive; for non-generalized situations, the above should work perfectly fine.
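
A minimal sketch of the second variant, a UUID prefix plus an increasing Long. The class name PrefixedIdGenerator is made up, and it is assumed that each actor holds its own instance as private state, which is why nothing here is synchronized.

```scala
import java.util.UUID

final class PrefixedIdGenerator {
  // Prefix unique to this generator; replaced if the sequence is ever exhausted.
  private var prefix: String = UUID.randomUUID().toString
  private var sequence: Long = 0L

  def next(): String = {
    if (sequence == Long.MaxValue) {
      // Sequence exhausted: pick another UUID and start counting again.
      prefix = UUID.randomUUID().toString
      sequence = 0L
    }
    sequence += 1
    s"$prefix-$sequence"
  }
}
```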

√iktor Ҡlang

Dec 16, 2013, 5:42:43 AM
to Akka User List
On Mon, Dec 16, 2013 at 11:33 AM, Derek Wyatt <de...@derekwyatt.org> wrote:

Well, I think we need to be realistic here.  If an app is going to do interesting work, you're not going to be generating 50 mil msgs per second. Theoretical discussions are fine, but in reality I don't think we're talking about that scale.  As for how the problem has been solved, I think there are a number of reasonably obvious solutions, no?

You're assuming that nanoTime has nanosecond accuracy.

Cheers,

Viktor Klang

Director of Engineering

Twitter: @viktorklang

Derek Wyatt

Dec 16, 2013, 5:46:17 AM
to akka...@googlegroups.com
On 2013-12-16, at 5:42 AM, √iktor Ҡlang wrote:



You're assuming that nanoTime has nanosecond accuracy.

No, actually I'm not.  I'm trying to just be practical.  You're right that it isn't that tight, so when I say nanosecond, you can take that as "resolution of the timer", or whatever you like.

In the general case, this might be a problem, but with Actors all you should need to do is give them a unique prefix. If you don't like nanoTime, don't use it.  Use a Long.

√iktor Ҡlang

Dec 16, 2013, 5:50:07 AM
to Akka User List
On Mon, Dec 16, 2013 at 11:46 AM, Derek Wyatt <de...@derekwyatt.org> wrote:

No, actually I'm not.  I'm trying to just be practical.  You're right that it isn't that tight, so when I say nanosecond, you can take that as "resolution of the timer", or whatever you like.

Yes, and if you're really unlucky, it's using currentTimeMillis padded with zeroes (and currentTimeMillis can have 10s of _milliseconds_ of accuracy, which will quite soon lead to collisions)
 

In the general case, this might be a problem, but with Actors all you should need to do is give them a unique prefix.

Yes, but now you need to generate a unique prefix...
 
If you don't like nanoTime, don't use it.  Use a Long.

nanoTime is a Long, but I guess you mean sequence number?

Cheers,

Derek Wyatt

Dec 16, 2013, 7:42:57 AM
to akka...@googlegroups.com
On 2013-12-16, at 5:50 AM, √iktor Ҡlang wrote:




nanoTime is a Long, but I guess you mean sequence number?

I do.

It's quite possible that this thread has gone awry :)  I believe that the original issue was probably meaning to say something like, "Are you sure that using nanoseconds as unique identifiers isn't a terrible idea?".  Of course, to this I would certainly say "yes".  And if that was what the original question was, I probably tore this thread way off course :)

Jisoo Park

Dec 17, 2013, 8:14:28 AM
to akka...@googlegroups.com
It might be overkill, but I partially migrated Twitter's Snowflake as an Akka extension to generate unique IDs that can be represented as numbers.

Snowflake combines a millisecond timestamp, a datacenter id, a worker id, and a sequence number that starts at 0 for each timestamp and increases from there.
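
To illustrate how those four parts can fit into one 64-bit number, here is a sketch of the bit packing only. The bit widths (5 datacenter, 5 worker, 12 sequence, the rest for the timestamp) follow the commonly cited Snowflake layout and should be treated as an assumption here, as should the SnowflakeLayout name; this is not the extension's code.

```scala
object SnowflakeLayout {
  // Assumed bit widths; the timestamp takes the remaining high bits.
  val SequenceBits = 12
  val WorkerBits = 5
  val DatacenterBits = 5

  val WorkerShift = SequenceBits
  val DatacenterShift = SequenceBits + WorkerBits
  val TimestampShift = DatacenterShift + DatacenterBits

  // millisSinceEpoch is the timestamp relative to whatever custom epoch is chosen.
  def compose(millisSinceEpoch: Long, datacenterId: Long, workerId: Long, sequence: Long): Long = {
    require(millisSinceEpoch >= 0, "timestamp must be non-negative")
    require(datacenterId >= 0 && datacenterId < (1L << DatacenterBits), "datacenter id out of range")
    require(workerId >= 0 && workerId < (1L << WorkerBits), "worker id out of range")
    require(sequence >= 0 && sequence < (1L << SequenceBits), "sequence out of range")
    (millisSinceEpoch << TimestampShift) |
      (datacenterId << DatacenterShift) |
      (workerId << WorkerShift) |
      sequence
  }
}
```

Because the timestamp sits in the high bits, ids from one worker sort roughly by creation time.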

Shawn

Dec 17, 2013, 3:19:21 PM
to akka...@googlegroups.com
Sorry - please ignore my implementation for determining the unique identifier - it wasn't meant to be salient to the discussion. My point was about creating related messages; I just picked the notion of a unique message id as an example of a property I might like many of my messages to conceptually inherit.

Shawn

Dec 17, 2013, 3:30:10 PM
to akka...@googlegroups.com
Bummer, really appreciate the link Derek but most of the code looks to render as blank space in my browser (Chrome).

Shawn

Dec 17, 2013, 3:39:55 PM
to akka...@googlegroups.com
Gosh, I must have done a poor job making my point! My question is probably more about Scala but since I'm looking for ideas on modeling related data (messages) specifically for Akka, I posted here. 

So, my original question is really about how to define common data between messages. My first reply to this thread was a naive (Scala newbie) attempt at that.

Ideally I'd like to say something like

case class MyMessage(foo:String) extends MessageWithId

... and have it act as if MyMessage was a case class defined with an id property and preferably some implementation (implementation important, but not for this discussion) to generate the id. So my question is about whether we can define messages in something akin to an OO inheritance hierarchy. 

Alternatively, perhaps someone can enlighten me as to why I should always prefer to define this id property (repeatedly) on each case class message?

e.g. MyMessage(id:something, foo:String), MyOtherMsg(id:something, bar:String), ...

Patrik Nordwall

Dec 18, 2013, 2:37:11 AM
to akka...@googlegroups.com
On Tue, Dec 17, 2013 at 9:39 PM, Shawn <stalbert...@gmail.com> wrote:

Ideally I'd like to say something like

case class MyMessage(foo:String) extends MessageWithId

... and have it act as if MyMessage was a case class defined with an id property and preferably some implementation (implementation important, but not for this discussion) to generate the id. So my question is about whether we can define messages in something akin to an OO inheritance hierarchy. 

To include id in the automatic case class equals/hashCode/toString/copy, the id must be part of the first parameter list. You can sometimes benefit from having a `trait MessageWithId { def id: ID }`, but that is for being able to operate on id-carrying messages without having to know all message types.
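
As a small sketch of that combination: the id is repeated in each case class's first parameter list, and the trait only lets generic code read it. Long stands in for the ID type, and Ping, Pong, and MessageLog are made-up names.

```scala
trait MessageWithId { def id: Long }

// id is a regular case class parameter, so equals/hashCode/toString/copy include it.
case class Ping(id: Long, payload: String) extends MessageWithId
case class Pong(id: Long, code: Int) extends MessageWithId

object MessageLog {
  // Works for any id-carrying message without knowing its concrete type.
  def describe(msg: MessageWithId): String = s"message ${msg.id}: $msg"
}
```

The repetition of id stays, but two messages that differ only by id are no longer equal, and copy keeps the id unless told otherwise.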

I often find it more flexible to use an envelope:
case class Envelope(id: ID, message: Any)
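
A sketch of how a receiver might unpack such an envelope with pattern matching, written as a plain function so it runs without Akka. Long again stands in for ID, and Greet and EnvelopeDemo are hypothetical.

```scala
final case class Envelope(id: Long, message: Any)

// The payload carries no id of its own.
case class Greet(name: String)

object EnvelopeDemo {
  // In an actor this match would sit inside receive.
  def handle(envelope: Envelope): String = envelope match {
    case Envelope(id, Greet(name)) => s"[$id] hello, $name"
    case Envelope(id, other)       => s"[$id] unhandled: $other"
  }
}
```

The payload types stay free of bookkeeping fields, and the id can be logged in one place regardless of the payload type.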

Cheers,
Patrik



--

Patrik Nordwall
Typesafe Reactive apps on the JVM
Twitter: @patriknw

Derek Wyatt

Dec 18, 2013, 9:17:20 AM
to akka...@googlegroups.com
This was the subject of the original link I sent.  I actually wanted to have a lot more than just a message ID - workId, senderId, recipientId, etc... - and I didn't want to burden my "business logic" actors with having to worry about it.

The solution was, as Patrik has suggested, to use an envelope.  I just spiced it up with a bunch of implicits and such to make the pain of using the envelope go away.

I'll plug it again, because I really think it might help solve your problem: http://blog.primal.com/using-scala-implicits-to-implement-a-messaging-protocol/