For languages written without spaces between words, Unitex/GramLab recognizes elementary units in two steps. First, it tokenizes the text character by character (Manual, Section 2.5.4). Then, when you apply dictionaries, words are recognized; where the word segmentation is ambiguous, all solutions are represented in parallel.
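
To make the two steps concrete, here is a minimal Python sketch of the idea. It is not Unitex/GramLab's actual implementation: the function names (`tokenize_chars`, `build_lattice`, `segmentations`) and the four-word toy dictionary are assumptions made up for illustration. It splits the text into one token per character, then looks up every dictionary word at every position and keeps all matches as edges of a lattice, so that competing segmentations coexist.

```python
def tokenize_chars(text):
    """Step 1: one token per character (hypothetical helper)."""
    return list(text)


def build_lattice(tokens, dictionary):
    """Step 2: record every dictionary word found at any position.

    Returns edges (start, end, word); overlapping edges are all kept,
    which is how ambiguous segmentations are represented in parallel.
    """
    max_len = max((len(word) for word in dictionary), default=0)
    edges = []
    for start in range(len(tokens)):
        last_end = min(len(tokens), start + max_len)
        for end in range(start + 1, last_end + 1):
            candidate = "".join(tokens[start:end])
            if candidate in dictionary:
                edges.append((start, end, candidate))
    return edges


def segmentations(tokens, edges):
    """Enumerate every complete path through the lattice."""
    by_start = {}
    for start, end, word in edges:
        by_start.setdefault(start, []).append((end, word))

    def walk(pos):
        if pos == len(tokens):
            yield []
            return
        for end, word in by_start.get(pos, []):
            for rest in walk(end):
                yield [word] + rest

    return list(walk(0))


# Toy dictionary (an assumption, not a Unitex resource).
toy_dictionary = {"研究", "研究生", "生命", "命"}
tokens = tokenize_chars("研究生命")
edges = build_lattice(tokens, toy_dictionary)
for path in segmentations(tokens, edges):
    print(" / ".join(path))
```

With this toy dictionary, the string 研究生命 yields two readings, 研究 / 生命 and 研究生 / 命, and both stay in the lattice rather than one being chosen at this stage.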