Hi Abhishek,
It is illegal to build a Hyracks job in which an activity cluster
contains two activities that are connected by a blocking edge. An
activity cluster is the transitive closure of activities connected by
dataflow edges. As you have seen in your case, violating this rule
leads to a deadlock.
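The rule above can be checked mechanically. Below is a minimal Python sketch, not Hyracks code: the function names, the edge-list representation, and the four-activity model of your job are all assumptions made for illustration. It groups activities into clusters by taking the transitive closure over dataflow edges (with union-find) and reports every blocking edge whose two endpoints fall in the same cluster.

```python
def build_clusters(activities, dataflow_edges):
    """Map each activity to its cluster: the transitive closure over dataflow edges."""
    parent = {a: a for a in activities}

    def find(a):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for src, dst in dataflow_edges:
        parent[find(src)] = find(dst)

    return {a: find(a) for a in activities}


def illegal_blocking_edges(activities, dataflow_edges, blocking_edges):
    """Return the blocking edges whose endpoints share an activity cluster."""
    cluster = build_clusters(activities, dataflow_edges)
    return [(a, b) for a, b in blocking_edges if cluster[a] == cluster[b]]


# Hypothetical model of the job in this thread: replicate feeds both the
# group-by and the join's probe, the group-by feeds the join's build, and
# the probe is blocked on the build.
activities = ["replicate", "groupby", "build", "probe"]
dataflow = [("replicate", "groupby"), ("groupby", "build"), ("replicate", "probe")]
blocking = [("build", "probe")]

print(illegal_blocking_edges(activities, dataflow, blocking))
# prints [('build', 'probe')]
```

In this model the replicate operator pulls the build and the probe into one cluster through its two dataflow edges, which is exactly the illegal shape.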
Going back to your original example, let's see what it means for the
replicate operator to feed the groupby on one side and the probe of the
join on the other. Physically it means that the replicate operator will
need to produce data for the groupby so that the join can do the build.
But while the build is in progress, the probe cannot start and so the
other side of the replicate operator is not willing to accept any data.
So the replicate operator, which uses finite buffers to send data to
its two receivers, will block as soon as the buffer toward the probe
fills up, and it will make no further progress.
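The stall can be reproduced with a small deterministic simulation. This is a sketch under simplifying assumptions, not a model of the real Hyracks scheduler: data is counted in whole frames, both output buffers have the same capacity, and the function name `simulate_replicate` is hypothetical.

```python
def simulate_replicate(num_frames, buffer_capacity):
    """Return (frames_sent, finished) for a pipelined replicate with bounded buffers."""
    to_groupby = 0            # frames queued toward the group-by
    to_probe = 0              # frames queued toward the probe
    sent = 0                  # frames the replicate has written to both outputs
    consumed_by_groupby = 0
    consumed_by_probe = 0

    while True:
        progressed = False

        # The replicate writes a frame only if BOTH output buffers have room.
        if (sent < num_frames
                and to_groupby < buffer_capacity
                and to_probe < buffer_capacity):
            to_groupby += 1
            to_probe += 1
            sent += 1
            progressed = True

        # The group-by side always drains its buffer.
        if to_groupby > 0:
            to_groupby -= 1
            consumed_by_groupby += 1
            progressed = True

        # The build finishes only after the group-by has seen every frame;
        # until then the probe refuses to accept any data.
        build_done = consumed_by_groupby == num_frames
        if build_done and to_probe > 0:
            to_probe -= 1
            consumed_by_probe += 1
            progressed = True

        if consumed_by_probe == num_frames:
            return sent, True
        if not progressed:
            return sent, False   # nobody can move: deadlock


print(simulate_replicate(10, 4))  # prints (4, False)
print(simulate_replicate(3, 4))   # prints (3, True)
```

With 10 frames and room for 4, the replicate stops after 4 frames because the probe-side buffer is full and the build can never complete. The job only finishes when the entire input happens to fit in the buffers.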
The correct way out of your dilemma is to implement a new replicate
operator that is made up of three activities.
-----> activity0 - - - > activity1 ----->
           |
           |_ _ _ _ _ _> activity2 ----->
activity0 would read in all the data and materialize it to a file.
Activities 1 and 2 would then read the materialized data and provide it
to the downstream consumers.
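The three-activity design can be sketched as follows. This is plain Python for illustration only, not the Hyracks operator API: the function names, the use of `pickle` as the on-disk frame format, and the file name are assumptions.

```python
import os
import pickle
import tempfile


def materialize(frames, path):
    """activity0: consume the entire input and write it to a file."""
    with open(path, "wb") as out:
        for frame in frames:
            pickle.dump(frame, out)


def read_materialized(path):
    """activity1 / activity2: stream the materialized frames back out."""
    with open(path, "rb") as src:
        while True:
            try:
                yield pickle.load(src)
            except EOFError:
                return


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "replicate.spill")
    # Blocking step: nothing is emitted until all input is on disk.
    materialize(iter([("a", 1), ("b", 2), ("c", 3)]), path)
    # Each reader opens its own handle, so the two consumers are independent.
    first = list(read_materialized(path))
    second = list(read_materialized(path))

print(first == second)  # prints True
```

Because each reader opens the file separately, a slow or stalled consumer on one side never holds back the other.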
Another slightly optimized way to build a similar replicate operator is
with two activities:
-------> activity0 ------->
             |
             | _ _ _ _ > activity1 -------->
Here activity0 reads data from the producer, materializes it to a file,
and forwards it to the first consumer. Then activity1, which has a
blocking edge from activity0, reads the materialized data and forwards
it to the second consumer. This is slightly more efficient because the
first consumer does not have to wait for all the data to be
materialized first, as it would with the previous design of the
replicate operator.
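A minimal sketch of the two-activity variant, again in plain Python rather than the Hyracks API. The function names, the callback-style consumers, and `pickle` as the spill format are assumptions for illustration.

```python
import os
import pickle
import tempfile


def forward_and_materialize(frames, path, first_consumer):
    """activity0: pass each frame to the first consumer while spilling it to a file."""
    with open(path, "wb") as out:
        for frame in frames:
            pickle.dump(frame, out)
            first_consumer(frame)   # the first consumer sees data immediately


def replay(path, second_consumer):
    """activity1: runs only after activity0 has finished, then replays the file."""
    with open(path, "rb") as src:
        while True:
            try:
                frame = pickle.load(src)
            except EOFError:
                break
            second_consumer(frame)


first_out, second_out = [], []
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "replicate.spill")
    forward_and_materialize(iter([1, 2, 3]), path, first_out.append)
    # Blocking edge: replay starts only once the call above has returned.
    replay(path, second_out.append)

print(first_out, second_out)  # prints [1, 2, 3] [1, 2, 3]
```

The first consumer is fed in a pipelined fashion, and only the second consumer pays the cost of waiting for the full materialization.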
Please let me know if you have any other questions.
>          InnerJoin
>          ^       ^
>         /        |
>        /         |
>   Group By       |
>        ^         |
>         \        |
>          \       |
>          Replicate
>
> 2. As use of replicate operator in pipelined manner might lead
> to deadlock.
> Is the replicate designed such that it will not lead
> to dead lock. ?
>
>
> Thanks and Regards
> Abhishek Gupta