I want to implement lookup functionality where the left input is very large, like a fact table, and the lookup is performed against a small table, like a dimension table. I think it makes sense to use HashJoin for such scenarios, and it works well with the left outer join option (I don't want to drop fact records).
But I have run into a problem when there are duplicates in the lookup file (right input): for each fact record I just want to pick one matching row, either the first or the last. I think I will have to implement a custom joiner for this.
Any input on writing a custom joiner for this case would also help.
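
To make the behaviour I am after concrete, here is a minimal sketch in plain Java. It does not use any join framework API and is not a custom joiner; it only illustrates the semantics I want. The class and method names (LookupSketch, buildLookup, leftOuterLookup) and the two-column row layout (key, value) are made up for illustration, and it assumes the small side fits in memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LookupSketch {

    // Build key -> value from the small (right) side, keeping one row per key.
    // keepFirst = true keeps the first row seen for a key, false keeps the last.
    public static Map<String, String> buildLookup(List<String[]> dimRows, boolean keepFirst) {
        Map<String, String> lookup = new HashMap<>();
        for (String[] row : dimRows) {
            if (keepFirst) {
                lookup.putIfAbsent(row[0], row[1]); // ignore later duplicates
            } else {
                lookup.put(row[0], row[1]);         // later duplicates overwrite
            }
        }
        return lookup;
    }

    // Left outer lookup: every fact row is kept; a missing key yields null.
    public static List<String[]> leftOuterLookup(List<String[]> factRows,
                                                 Map<String, String> lookup) {
        List<String[]> out = new ArrayList<>();
        for (String[] fact : factRows) {
            out.add(new String[] { fact[0], fact[1], lookup.get(fact[0]) });
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: key k1 appears twice on the lookup side.
        List<String[]> dim = Arrays.asList(
                new String[] { "k1", "A" },
                new String[] { "k1", "B" },
                new String[] { "k2", "C" });
        List<String[]> facts = Arrays.asList(
                new String[] { "k1", "10" },
                new String[] { "k3", "30" });

        for (String[] row : leftOuterLookup(facts, buildLookup(dim, true))) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

With keepFirst set to true, the fact row for k1 is matched with A only, and the fact row for k3 is kept with a null lookup value. Passing false would match k1 with B instead. Deduplicating the right input before the join should give the same result, if that turns out to be simpler than a custom joiner.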
Thanks,
Pushpender