HashJoin is going to be more performant than a CoGroup in almost every case, because it's a map-side operation - so you don't need a reduce phase, and thus it can be "chained" into other map operations as part of a single job.
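To see why this is map-side only, here's a minimal sketch of what a hash join does conceptually - build a hash map from the small (RHS) side, then stream the large (LHS) side through it, with no shuffle or reduce needed. This is plain illustrative Java, not Cascading's actual implementation; the record shapes and names are made up for the example.

```java
import java.util.*;

public class HashJoinSketch {

    // Map-side hash join sketch: the small RHS fits in memory as a HashMap,
    // and the large LHS is streamed through it row by row.
    // Rows are modeled as String[] {key, value} purely for illustration.
    public static List<String> join(List<String[]> large, List<String[]> small) {
        // Build phase: load the small side into a hash map keyed on the join key.
        Map<String, String> rhs = new HashMap<>();
        for (String[] row : small) {
            rhs.put(row[0], row[1]);
        }
        // Probe phase: stream the large side; emit a joined row on each match.
        List<String> out = new ArrayList<>();
        for (String[] row : large) {
            String match = rhs.get(row[0]);
            if (match != null) {
                out.add(row[0] + "," + row[1] + "," + match);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Large side: per-term document frequencies (hypothetical data).
        List<String[]> df = Arrays.asList(
            new String[]{"cat", "3"},
            new String[]{"dog", "5"});
        // Small side: total document count, replicated per key (hypothetical data).
        List<String[]> d = Arrays.asList(
            new String[]{"cat", "100"},
            new String[]{"dog", "100"});
        System.out.println(join(df, d));
    }
}
```

The key constraint follows directly from the build phase: the RHS has to fit in memory, which is why a single scalar like a total document count is such a good fit.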
The use of HashJoin in the code you referenced is basically joining a single count (number of documents) with the document count for every unique term found.
// join to bring together all the components for calculating TF-IDF
// the D side of the join is smaller, so it goes on the RHS
Pipe idfPipe = new HashJoin( dfPipe, lhs_join, dPipe, rhs_join );
So the single count (number of documents) goes on the right side, and is a great use case for HashJoin.
The subsequent CoGroup is used to join the term frequency for each document/term combination with the doc count (& total docs) for each term.
// the IDF side of the join is smaller, so it goes on the RHS
Pipe tfidfPipe = new CoGroup( tfPipe, tf_token, idfPipe, df_token );
Here neither of the two pipes is likely to be bounded and small in size, so a CoGroup makes sense.
You could maybe argue that the number of unique terms should be in the 10K - 100K range, and thus it could be a HashJoin. But in my experience that assumption usually winds up hurting you in the end, when your workflow is used to process data with "unexpected" characteristics. With enough data, every possible edge case you didn't consider winds up having a finite probability of occurring.
-- Ken