Another point about the underlying HTML (sorry!).
In some HTML outputs, spans are placed between words where the space characters would be.
An example of this is on page 1 of the attached PDF:
" and game theory for decades (1). In fact, "
Copying and pasting from a PDF viewer gives the correct text, which indicates that the spaces are present in the original and are in the correct place.
I understand why pdf2htmlEX puts the spans in (to preserve exact, pixel-perfect spacing between words and glyphs), but is there a way to preserve individual words instead, presumably at the expense of exact representation?
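
To make the request concrete, here is a rough sketch of the kind of post-processing I have in mind. It is only an illustration under assumptions: it assumes the spacer spans are empty and carry a class beginning with `_` (for example `<span class="_ _0"></span>`), and the helper name `spans_to_spaces` is hypothetical, not part of pdf2htmlEX.

```python
import re

# Assumption: spacer spans are empty and their class attribute starts with "_",
# e.g. <span class="_ _0"></span>. Adjust the pattern if the real output differs.
SPACER = re.compile(r'<span class="_[^"]*">\s*</span>')


def spans_to_spaces(html: str) -> str:
    """Replace each empty spacer span with one space, then collapse repeated spaces."""
    text = SPACER.sub(" ", html)
    return re.sub(r" {2,}", " ", text)


if __name__ == "__main__":
    sample = 'decades<span class="_ _3"></span>(1).<span class="_ _0"></span> In fact,'
    print(spans_to_spaces(sample))
```

The obvious drawback is that this treats every spacer span as a word break, so a span used purely for kerning inside a word would also become a space. A built-in option that makes this decision from the actual span width would be more reliable.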