* Michal Zalewski <lca...@gmail.com>, 2015-05-02, 14:18:
Well, but you have the same problem with the one-token-per-file scheme. For
example, tokens in the HTML dictionary have trailing newlines, which is
likely unintentional. (Although I guess it doesn't hurt much either...)
>Any ideas on how to solve that neatly?
I have two different proposals:
1) Just split the dictionary file using "\n" as the separator. No
escaping scheme, no special treatment of control characters. People who
need "\n" in their tokens will have to use the old one-token-per-file
scheme. Windows users will have to get a better OS.
2) Split the dictionary file using "\n" as the separator. Strip trailing
whitespace from each line. If the line starts and ends with double-quote
characters, treat it as a C string; otherwise treat it as a raw string.
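
To make the two proposals concrete, here is a rough Python sketch of both. It is only an illustration, not a patch: the function names are made up, skipping empty lines is my assumption, and the "C string" decoder handles only a small subset of C escapes (\n, \t, \r, \0, \\, \" and two-digit \xHH), which is narrower than what a real C compiler accepts.

```python
# Hypothetical helpers illustrating the two proposals; not actual afl code.

SIMPLE_ESCAPES = {
    ord("n"): 0x0A, ord("t"): 0x09, ord("r"): 0x0D, ord("0"): 0x00,
    ord("\\"): 0x5C, ord('"'): 0x22,
}
HEX_DIGITS = b"0123456789abcdefABCDEF"


def split_plain(data: bytes) -> list:
    """Proposal 1: split on "\\n" only; every byte is taken verbatim."""
    # Assumption: empty lines are skipped rather than treated as empty tokens.
    return [line for line in data.split(b"\n") if line]


def decode_c_string(body: bytes) -> bytes:
    """Decode a simplified subset of C escapes (assumption: \\xHH is exactly 2 digits)."""
    out = bytearray()
    i = 0
    while i < len(body):
        c = body[i]
        if c != 0x5C:  # not a backslash: copy the byte as-is
            out.append(c)
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash")
        e = body[i + 1]
        if e in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[e])
            i += 2
        elif e == ord("x"):
            digits = body[i + 2:i + 4]
            if len(digits) != 2 or any(d not in HEX_DIGITS for d in digits):
                raise ValueError("expected two hex digits after \\x")
            out.append(int(digits, 16))
            i += 4
        else:
            raise ValueError("unsupported escape")
    return bytes(out)


def split_quoted(data: bytes) -> list:
    """Proposal 2: strip trailing whitespace; quoted lines are C strings, others raw."""
    tokens = []
    for line in data.split(b"\n"):
        line = line.rstrip()  # strips ASCII whitespace, including a stray "\r"
        if not line:
            continue  # assumption: blank lines are ignored
        if len(line) >= 2 and line[:1] == b'"' and line[-1:] == b'"':
            tokens.append(decode_c_string(line[1:-1]))
        else:
            tokens.append(line)
    return tokens
```

With proposal 2, a file containing the lines `<html>` and `"a\nb"` would yield the raw token `<html>` and a three-byte token with an embedded newline. Since trailing whitespace is stripped, CRLF line endings also stop being a problem, at the cost of needing quotes for any token that ends in whitespace.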
--
Jakub Wilk