yeah, joins in mapreduce aren't terribly performant. basically, you collect the rows from both sides by join key and do a cross product per key in the reducer. i don't see inner joins being much more efficient ... a mapper only sees its own split of one input, so i don't think it can figure out which rows have no match on the other side and throw them out during the map stage. that means it would still be the same # of rows going into the reducer (i might be wrong on this).
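to make the reduce-side join concrete, here's a rough sketch in plain python (not real hadoop code). the names `map_phase`, `shuffle`, `reduce_phase` and `reduce_side_join` are made up for illustration, and the shuffle is simulated with an in-memory dict. note that every row from both sides gets shuffled, even the ones that end up matching nothing.

```python
from collections import defaultdict
from itertools import product


def map_phase(rows, side, key_index):
    """Tag each row with the side it came from and emit (join_key, (side, row))."""
    for row in rows:
        yield row[key_index], (side, row)


def shuffle(pairs):
    """Group mapper output by key; stands in for the framework's shuffle/sort."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(tagged_rows):
    """Split one key's rows by side and emit their cross product (inner join)."""
    left = [row for side, row in tagged_rows if side == "L"]
    right = [row for side, row in tagged_rows if side == "R"]
    for l_row, r_row in product(left, right):
        yield l_row + r_row


def reduce_side_join(left_rows, right_rows, left_key=0, right_key=0):
    """Run the map, shuffle and reduce steps in-process (illustrative only)."""
    pairs = list(map_phase(left_rows, "L", left_key))
    pairs += list(map_phase(right_rows, "R", right_key))
    joined = []
    for tagged_rows in shuffle(pairs).values():
        joined.extend(reduce_phase(tagged_rows))
    return joined


users = [(1, "ann"), (2, "bob"), (3, "cat")]
orders = [(1, "book"), (1, "pen"), (4, "cup")]
print(reduce_side_join(users, orders))
```

in this toy input only key 1 exists on both sides, but all six rows still pass through the shuffle, because no mapper knows which keys the other side has.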
however, if one side of the data is very small, you can try to implement a map-only join, where all the data from the smaller side is fed into each mapper and the join occurs there. the smaller side basically has to fit in the memory of each mapper.
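a minimal sketch of that map-only join, again in plain python rather than real hadoop code. `build_lookup` and `map_only_join` are hypothetical names, and i'm assuming each mapper somehow gets its own full copy of the small side (how you ship it depends on the framework).

```python
from collections import defaultdict


def build_lookup(small_rows, key_index=0):
    """Load the whole small side into an in-memory table keyed by join key."""
    table = defaultdict(list)
    for row in small_rows:
        table[row[key_index]].append(row)
    return table


def map_only_join(big_split, lookup, key_index=0):
    """Join one split of the big side against the in-memory table.

    No shuffle and no reducer: rows without a match are dropped right here.
    """
    for row in big_split:
        for match in lookup.get(row[key_index], ()):
            yield row + match


small_side = [(1, "ann"), (2, "bob")]
big_splits = [
    [(1, "book"), (3, "cup")],   # split handled by mapper 1
    [(2, "pen"), (1, "ink")],    # split handled by mapper 2
]

lookup = build_lookup(small_side)  # in practice each mapper builds its own copy
for split in big_splits:
    print(list(map_only_join(split, lookup)))
```

the tradeoff is memory: the lookup table is duplicated in every mapper, so this only works while the small side stays small.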