Hi Flume experts,
Here is what I am trying to do. First, I go through all the cafe conversion logs and generate a
PCollection<GWSLogEntryProto>
I can then group all the log entries by account_id, which gives
PTable<int64, Stream<GWSLogEntryProto>>
What I want to do next is go through all the log entries for a single account_id and find the top URLs for that account. However, I will not be able to use GroupByKey on the Stream object. Any suggestions on this?
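To make the per-account step concrete, here is a rough sketch in plain C++ of the computation I have in mind, with no Flume calls at all. It assumes "top" means "most frequent", and it assumes the entries for one account can be walked in a single pass. `LogEntry`, `UrlCount` and `TopUrls` are hypothetical names for illustration only: `LogEntry` stands in for GWSLogEntryProto and the vector stands in for the Stream.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for GWSLogEntryProto; only the fields used here.
struct LogEntry {
  int64_t account_id;
  std::string url;
};

using UrlCount = std::pair<std::string, int64_t>;

// Returns up to k (url, count) pairs for one account's entries,
// ordered by descending count, with ties broken by URL.
// Assumption: "top URLs" means the most frequently seen URLs.
std::vector<UrlCount> TopUrls(const std::vector<LogEntry>& entries,
                              std::size_t k) {
  // Single pass over the entries: count occurrences of each URL.
  std::unordered_map<std::string, int64_t> counts;
  for (const LogEntry& entry : entries) {
    ++counts[entry.url];
  }

  // Copy the counts out and order only the first n elements.
  std::vector<UrlCount> ranked(counts.begin(), counts.end());
  const std::size_t n = std::min(k, ranked.size());
  std::partial_sort(
      ranked.begin(),
      ranked.begin() + static_cast<std::ptrdiff_t>(n),
      ranked.end(),
      [](const UrlCount& a, const UrlCount& b) {
        if (a.second != b.second) return a.second > b.second;
        return a.first < b.first;
      });
  ranked.resize(n);
  return ranked;
}
```

This keeps one counter per distinct URL in memory, so it only works if the number of distinct URLs per account is small enough to fit. What I do not know is how to express the same thing over the Stream.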
Thanks,
Chenyun