rmr2 uses the distributed cache behind the scenes but doesn't give direct access to it. I am not sure I understand your problem, but if you can store your words in a vector, say keywords, then you can use them directly in your map or reduce functions:
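keywords = ....
# collapse the keywords into a single alternation regex
pattern = paste(keywords, collapse = "|")
# keep the input lines that match any keyword (value = TRUE returns the lines, not their indices)
mapreduce(input, map = function(k, v) keyval(NULL, grep(pattern, v, value = TRUE)))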
or some such. I am not sure grep scales well to such a large pattern.
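If grep does turn out to be slow with that many keywords, one alternative is exact word matching with %in%, which looks words up in a hash table instead of building one huge regex. What follows is only a plain R sketch that runs without Hadoop: the keywords, the lines and the name match_keywords are made up for illustration, and it assumes the values arrive as a character vector of text lines. Note that it matches whole words only, whereas the regex above also matches substrings.

```r
# Hypothetical keyword list and input lines, for illustration only.
keywords <- c("hadoop", "mapreduce", "cache")
lines <- c("the distributed cache is hidden",
           "nothing to see here",
           "Hadoop streaming job")

# Same (k, v) shape as an rmr2 map function: keep the lines in v
# that contain at least one keyword as a whole word.
match_keywords <- function(k, v) {
  # split each line into lowercase words on runs of non-alphanumerics
  words <- strsplit(tolower(v), "[^a-z0-9]+")
  # one TRUE/FALSE per line: does any word appear in the keyword vector?
  hit <- vapply(words, function(w) any(w %in% keywords), logical(1))
  v[hit]
}

matched <- match_keywords(NULL, lines)
print(matched)
```

Something like map = match_keywords could then be passed to mapreduce in place of the grep version, again assuming the input format delivers lines of text as v.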
Antonio