Okay, I was comparing it with the Hadoop reducer function.
So here is what I am doing:
1. I read data from a file, applied my map function, and got (K,V) pairs as output.
2. Now I want to apply my own function as a reducer. In the reducer I want to process all (K,V) pairs with the same key together. That is what I am assuming reduceByKey does for you.
My question is: how can I write my reducer function so that it processes all the (K,V) pairs belonging to the same key together?
What would the signature of my reducer function be in that case?
How would I pass it to the reduceByKey() method?
E.g. I passed my mapper like this: dataset.map(x=>Mymapper.map(x))
Here I knew x would be a line, so I defined my own map function to take a String as its argument.
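To make the map step concrete, here is a minimal sketch of that pattern. It is for illustration only and is not my real code: the "key,value" line format, the (String, Int) pair type, and the local Seq standing in for the dataset read from the file are all assumptions.

```scala
// Hypothetical mapper, for illustration only. The "key,value" line
// format and the (String, Int) output type are assumptions.
object Mymapper {
  // Takes one input line and returns a single (K, V) pair.
  def map(line: String): (String, Int) = {
    val fields = line.split(",")
    (fields(0).trim, fields(1).trim.toInt)
  }
}

object MapperSketch {
  def main(args: Array[String]): Unit = {
    // A local Seq stands in for the dataset read from the file;
    // the same lambda shape is what I pass to map on the real dataset.
    val dataset = Seq("a,1", "b,5", "a,3")
    val pairs = dataset.map(x => Mymapper.map(x))
    pairs.foreach(println)
  }
}
```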
Now after applying this map function, I got some (K,V) pairs.
How can I write my reduce function and pass it to the reduceByKey() function, as I did in the map case?
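From what I understand so far (please correct me if I am wrong), reduceByKey does not hand the function all the values of a key at once the way a Hadoop reducer does. It takes a function of type (V, V) => V that is applied pairwise to values of the same key, so it has to be associative and commutative. To see all the values of one key together, groupByKey followed by mapValues seems to be the closer match. Below is a rough sketch of both shapes. Myreducer, its two methods, the sample pairs, and the local master setting are hypothetical names and values I made up for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
// Note: Spark versions before 1.3 also need
// `import org.apache.spark.SparkContext._` for the pair-RDD methods.

// Hypothetical reducer object, named to mirror Mymapper.
object Myreducer {
  // Shape expected by reduceByKey: combines two values of the same key.
  // It is called pairwise, so it must be associative and commutative.
  def reduce(a: Int, b: Int): Int = a + b

  // Shape usable after groupByKey: sees every value of one key at once.
  // Here it computes the spread (max - min) as an arbitrary example.
  def reduceAll(values: Iterable[Int]): Int = values.max - values.min
}

object ReduceSketch {
  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions for a standalone run.
    val conf = new SparkConf().setAppName("reduce-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Stand-in for the (K, V) pairs produced by the map step.
      val pairs = sc.parallelize(Seq(("a", 1), ("b", 5), ("a", 3), ("b", 2)))

      // Pairwise reduction, passed the same way as the mapper.
      val summed = pairs.reduceByKey((a, b) => Myreducer.reduce(a, b))

      // Hadoop-style reducer: all values of a key handed over together.
      val spreads = pairs.groupByKey().mapValues(vs => Myreducer.reduceAll(vs))

      summed.collect().foreach(println)
      spreads.collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

As far as I can tell, the trade-off is that reduceByKey can combine values on each partition before the shuffle, while groupByKey has to move every value for a key to one place first, so the pairwise form is preferable whenever the logic can be written that way.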
Regards,
Praveenesh