Rxjava2 hash based rail assignment logic

60 views
Skip to first unread message

Ujjwal Acharya

unread,
Apr 11, 2021, 3:10:05 PM4/11/21
to RxJava
Hi all,

I am using Flowable<T> parallel() method to create rails so that i can process the items emitted by this flowable in parallel. Currently the documentation for this method says
 
"Parallelizes the flow by creating multiple 'rails' (equal to the number of CPUs) and dispatches the upstream items to them in a round-robin fashion"

I was wondering if we can have a custom logic on deciding how the upstream items are assigned to the rails. I was looking for a hash based solution. The requirement i have is to parallelize the upstream but have two events with same hash id must belong to same rail so that they are ultimately processed sequentially. 

Any directions would be helpful. Thank you


Dávid Karnok

unread,
Apr 11, 2021, 3:18:58 PM4/11/21
to Ujjwal Acharya, RxJava
Hi.

The rail dispatching can't be customized this way. You could use groupBy():

source.groupBy(item -> item.hashCode() % rails)
.flatMap(group -> 
    group.observeOn(Schedulers.computation())
    ...
)


--
You received this message because you are subscribed to the Google Groups "RxJava" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rxjava+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/rxjava/91d9b3d2-180f-44b9-8e64-0c7b21d405a3n%40googlegroups.com.


--
Best regards,
David Karnok

Ujjwal Acharya

unread,
Apr 11, 2021, 4:51:50 PM4/11/21
to RxJava
Thank for the reply. 

I have never used groupedFlowable so wanted to know if my understanding is correct. So basically the snippet you posted will created a GroupedFlowable for each unique key the getKey function returns. And then the flatmap will cause each group to be run in its own Scheduler thread basically causing to run this source flowable in parallel. 

But i was curious when will that GroupedFlowable object be destroyed ? Or once created this GroupedFlowable will keep on receiving events and will only be taken out of memory when the source flowable emits complete event  ?

Dávid Karnok

unread,
Apr 11, 2021, 5:00:04 PM4/11/21
to Ujjwal Acharya, RxJava
If the source completes, all groups complete. You can cut a group  short (via take, takeUntil, etc). 

Ujjwal Acharya

unread,
Apr 11, 2021, 5:21:34 PM4/11/21
to RxJava

That makes sense. One final question if you dnt mind. 

So inside flatmap when we do group.observeOn(Schedulers.computation()) we are processing GroupedFlowables in parallel using doOnNext method. But will that processed event also be available to the subscriber downstream since flatmap will merge all of these resulting publishers and emit all of their events ?

Dávid Karnok

unread,
Apr 12, 2021, 1:21:23 AM4/12/21
to Ujjwal Acharya, RxJava

Ujjwal Acharya

unread,
Apr 13, 2021, 6:50:39 PM4/13/21
to RxJava
Curious if this is going to work based on the code snippet posted in this thread earlier.

source.groupBy(item -> item.hashCode() % rails).parallel(4).runOn(Schedulers.computation()).doOnNext(()->{
}) .....

I wanted to know how parallel API behaves on Flowable<GroupedFlowable<?,?>> object ?

Does the above code run each GroupedFlowable in separate thread by creating separate rail for each one of them or breaks the GroupedFlowable itself into rails eventually creating substreams out of each of the GroupedFlowable and ultimately creating 4 * 3 substreams for 3 GroupedFlowable ?

Sorry if my question is not clear.

Ujjwal Acharya

unread,
Apr 13, 2021, 9:24:52 PM4/13/21
to RxJava

I figured out that this doesnot work since each events that is passed on to the rail is going to be a GroupedFlowable object. Sorry for bothering you guys.
Reply all
Reply to author
Forward
0 new messages