How do you handle duplicate events using the eventbus publish method


Idan Fridman

Jun 24, 2017, 4:25:06 AM
to vert.x
The pub-sub architecture is awesome, as the producer doesn't need to be aware of its consumers.
When using publish on the Vert.x event bus we need to handle duplicates, because publish (in contrast to Kafka) gives us no way to configure consumer groups.

So if I have multiple instances of the same service, all of them will consume the message (while the desired behavior is that only one of them does).

I thought about these options:

1. Every service (and its instances) uses a 3rd-party persistent store and, after consuming, checks whether another instance has already processed the message. This needs locking and seems bad for performance.
2. Use an address per service and use the send method (instead of publish), which works round-robin. The producer is then aware of its consumers and needs to update the sending code whenever a new service comes in, so we lose the pub-sub flexibility.
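
Option 1 could be sketched roughly as below. This is a toy model in plain Java, not Vert.x code: `ClaimOnceStore` and `tryClaim` are hypothetical names, the in-memory map only stands in for a shared persistent store, and it assumes every message carries a unique ID. A real store would need an equivalent atomic insert-if-absent operation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy sketch of option 1: an instance must claim a message before processing it. */
class ClaimOnceStore {
    // Stand-in for a shared persistent store (assumption).
    // Key = service name + message ID, value = the instance that claimed it.
    private final Map<String, String> claims = new ConcurrentHashMap<>();

    /** Returns true only for the first instance of a service that claims the message. */
    public boolean tryClaim(String service, String messageId, String instanceId) {
        // putIfAbsent is atomic, so two racing instances cannot both win.
        return claims.putIfAbsent(service + "/" + messageId, instanceId) == null;
    }

    public static void main(String[] args) {
        ClaimOnceStore store = new ClaimOnceStore();
        // Every instance receives the published message, but only one claim succeeds.
        for (String instance : new String[] {"instance-a", "instance-b", "instance-c"}) {
            if (store.tryClaim("two-service", "msg-1", instance)) {
                System.out.println(instance + " processes msg-1");
            } else {
                System.out.println(instance + " skips msg-1 (already claimed)");
            }
        }
    }
}
```

Each instance still receives and inspects every message, which is the cost this option is worried about.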


How did you handle this? 

Thank you.

javadevmtl

Jun 25, 2017, 12:21:20 AM
to vert.x
It's actually simpler than you think. A publish will send the message to all consumers that are listening on the same address.

With a send, only one of the consumers listening on the same address will receive the message, in round-robin fashion.

So if you have 4 consumers listening on the same address, a publish means all 4 will get it, while with a send only one at a time will get it.
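
The difference between the two delivery modes can be illustrated with a small in-memory model. This is a sketch in plain Java and deliberately not the Vert.x API: `ToyEventBus` and its methods are hypothetical and only mimic the semantics described above (publish reaches every handler on the address, send reaches one handler in round-robin order).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Toy in-memory model of publish vs. send; NOT the Vert.x API. */
class ToyEventBus {
    private final Map<String, List<Consumer<String>>> handlers = new HashMap<>();
    private final Map<String, Integer> sendCount = new HashMap<>();

    /** Registers a handler on an address. */
    public void consumer(String address, Consumer<String> handler) {
        handlers.computeIfAbsent(address, a -> new ArrayList<>()).add(handler);
    }

    /** publish: every handler registered on the address receives the message. */
    public void publish(String address, String message) {
        List<Consumer<String>> list =
                handlers.getOrDefault(address, Collections.emptyList());
        for (Consumer<String> handler : list) {
            handler.accept(message);
        }
    }

    /** send: exactly one handler receives the message, chosen round-robin. */
    public void send(String address, String message) {
        List<Consumer<String>> list =
                handlers.getOrDefault(address, Collections.emptyList());
        if (list.isEmpty()) {
            return; // nobody is listening on this address
        }
        int index = sendCount.merge(address, 1, Integer::sum) - 1;
        list.get(index % list.size()).accept(message);
    }

    public static void main(String[] args) {
        ToyEventBus bus = new ToyEventBus();
        for (int i = 0; i < 4; i++) {
            final int id = i;
            bus.consumer("orders", m -> System.out.println("consumer " + id + " got " + m));
        }
        bus.publish("orders", "published"); // all 4 consumers print
        bus.send("orders", "sent-1");       // only one consumer prints
        bus.send("orders", "sent-2");       // the next consumer prints
    }
}
```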

Idan Fridman

Jun 25, 2017, 12:54:58 AM
to vert.x
Hi @javadevmtl 
Thanks for your response.
I understand the difference between send() and publish(); my question was different. I asked how I can use publish and still have only one consumer in the same instance group consume the message.

For example, let's say you have three services: one-service, two-service, three-service.
Let's say you have 3 instances of each (3 of one-service, 3 of two-service, 3 of three-service).

Now one-service wants to publish a message to two-service and three-service. So what will happen? All 6 instances will get the message.
The only way to overcome this is to use send() and send to two-service and then send to three-service. But that's not real pub-sub, as the sender is explicitly sending its message to its consumers.
What happens if 'tomorrow' I have a four-service with 3 instances? Then I need to modify one-service again and add another send() call for four-service.

Get my issue?

thanks,
Idan.

javadevmtl

Jun 25, 2017, 4:02:39 AM
to vert.x
Each service, no matter how many instances it has, has to have a unique address???

javadevmtl

Jun 25, 2017, 4:58:47 AM
to vert.x
Publish is publish. Everyone interested (i.e. listening on the same address) gets the message; that's like any pub/sub system.

Idan Fridman

Jun 25, 2017, 5:49:12 AM
to vert.x
The address isn't unique. The address will be the same for all services. The problem is that this publish cannot scale.

Idan Fridman

Jun 25, 2017, 5:51:36 AM
to vert.x
I understand that everyone interested gets the message. I was asking how you handle duplicates; that was my main question.
Again, if we have multiple services that listen on the same address, each service has 4 instances, and you want to do one operation per service type, how are you going to make sure the operation is handled once and not 4 times (per service)?

Idan Fridman

Jun 25, 2017, 5:52:28 AM
to vert.x
You know, in Kafka there is a way to enforce this by setting consumer groups. We don't have this in the Vert.x event bus, so I was wondering how you guys achieve this in a scaled environment.

Tim Fox

Jun 25, 2017, 6:36:43 AM
to vert.x


On Sunday, 25 June 2017 05:54:58 UTC+1, Idan Fridman wrote:
Hi @javadevmtl 
Thanks for your response.
I understand the difference between send() and publish(); my question was different. I asked how I can use publish and still have only one consumer in the same instance group consume the message.

For example, let's say you have three services: one-service, two-service, three-service.
Let's say you have 3 instances of each (3 of one-service, 3 of two-service, 3 of three-service).

Now one-service wants to publish a message to two-service and three-service. So what will happen? All 6 instances will get the message.
The only way to overcome this is to use send() and send to two-service and then send to three-service. But that's not real pub-sub, as the sender is explicitly sending its message to its consumers.
What happens if 'tomorrow' I have a four-service with 3 instances? Then I need to modify one-service again and add another send() call for four-service.

Get my issue?


You can do this by writing an intermediate service where consumers can register themselves along with their "consumer group" string (call this "MyFakeKafkaService"). When you publish your message, this service can then simply resend the same message to each of the consumer group addresses - only one of the consumers on each address will get the message.
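
A minimal sketch of that relay idea, as a plain-Java toy model rather than Vert.x code. `GroupRelay`, `register`, and `publish` are hypothetical names. In Vert.x the one-per-group delivery would come from using send() on each group's address; here the relay does the round-robin itself so that the example is self-contained.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Toy relay: fans a message out to every consumer group, one member per group. */
class GroupRelay {
    // Consumer group name -> registered members, in registration order.
    private final Map<String, List<Consumer<String>>> groups = new LinkedHashMap<>();
    // Consumer group name -> number of messages relayed so far (drives round-robin).
    private final Map<String, Integer> relayed = new HashMap<>();

    /** A consumer registers itself together with its consumer group string. */
    public void register(String group, Consumer<String> member) {
        groups.computeIfAbsent(group, g -> new ArrayList<>()).add(member);
    }

    /** The producer publishes once; each group gets the message exactly once. */
    public void publish(String message) {
        for (Map.Entry<String, List<Consumer<String>>> entry : groups.entrySet()) {
            List<Consumer<String>> members = entry.getValue();
            int index = relayed.merge(entry.getKey(), 1, Integer::sum) - 1;
            members.get(index % members.size()).accept(message);
        }
    }

    public static void main(String[] args) {
        GroupRelay relay = new GroupRelay();
        for (String group : new String[] {"two-service", "three-service"}) {
            for (int i = 0; i < 3; i++) {
                final String name = group + "#" + i;
                relay.register(group, m -> System.out.println(name + " handles " + m));
            }
        }
        // Two groups of three instances: each publish produces two deliveries, not six.
        relay.publish("order-created");
        relay.publish("order-paid");
    }
}
```

The producer only knows the relay, so adding a four-service later means registering a new group, not changing the producer.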

We had something similar to this in Vert.x - the work queue service.

Idan Fridman

Jun 25, 2017, 6:59:42 AM
to vert.x
Thanks Tim.
So how did this affect your performance? Each message has one additional hop (the MyFakeKafkaService hop).

You said you had something similar to this in Vert.x, the work queue service. Is that on GitHub somewhere?

Thanks again.

Idan Fridman

Jun 25, 2017, 7:08:51 AM
to vert.x
I'm afraid this service is going to be the bottleneck of the whole architecture, as it's going to handle and pass along all the messaging in the system.


Tim Fox

Jun 25, 2017, 10:40:34 AM
to vert.x
Relay across event bus is rarely going to be the bottleneck. Much more likely it's your processing or any kind of persistence.

Really hard to say more without understanding what you're trying to achieve here. Are you trying to create another Kafka? 

Idan Fridman

Jun 25, 2017, 12:34:25 PM
to vert.x
I like the flexibility of the Vert.x event bus; however, I'd like to keep the pub-sub architecture.
I have my base architecture with producers and consumers, and from time to time we add more services that listen on our existing event bus and enjoy the flexibility of publish.

On the other hand, each service that we add to the cluster almost always has more than 1 instance (for scale and performance reasons), and many of the messages that we consume trigger actions like adding records to a database. So what do we get? Duplicates (because we are using publish).
And if we switch to send() we lose the pub-sub flexibility.


Idan Fridman

Jun 26, 2017, 3:53:10 PM
to vert.x
Hi Tim,
I was wondering if you had a chance to think about my case?
Anyway, could you give me a reference to the code/project you mentioned (the work queue service)?
Thanks.

Jez P

Jun 27, 2017, 4:14:19 AM
to vert.x
https://github.com/vert-x/mod-work-queue was the version in vertx 2 which I think Tim was referring to.

With respect, please stop chasing this; your question is no more important than any other question or issue raised on the group. You've invented a scenario where you think there will be a performance problem, without any benchmarking to demonstrate it. Why don't you try Tim's suggestion before saying you think it'll be a bottleneck? It's a perfectly good suggestion and is similar to what Kafka probably does under the hood with its approach anyway. If you've got any real work being done by your consumers, then the bottleneck will not be the event bus.

Idan Fridman

Jun 27, 2017, 4:53:39 AM
to vert.x
Hi Jez,
Thanks for your reply. I wasn't chasing anything. I wasn't assuming there would be performance issues; I was asking whether they are possible.

I just said that I want to try Tim's suggestion and am therefore asking for a reference to his work queue.
We can calm down a bit :)

Thank you.

Jez P

Jun 27, 2017, 7:32:41 AM
to vert.x
Sorry, it just looked like you were pushing for further answers. I'm sure Tim will get back to you when he has the bandwidth to do so. Anyway, you have a reference to mod-work-queue now, so you can investigate it. Maybe you could even offer a PR for Vert.x 3 based on it :)