SCI_GETREPRESENTATION(const char *encodedCharacter, char *representation)
SCI_CLEARREPRESENTATION(const char *encodedCharacter)
This can be used, for example, to display an ambiguous character in a special way. The Ohm sign Ω U+2126 can be displayed like [U+2126 Ω] with
SCI_SETREPRESENTATION("\xe2\x84\xa6", "U+2126 \xe2\x84\xa6")
Another use is to temporarily show invisible characters like zero-width space.
The standard representations for control characters can be replaced if desired.
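The encodedCharacter argument in the Ohm sign example is the UTF-8 byte sequence for U+2126. As a rough sketch of how such a string could be produced for any code point, assuming the document is in UTF-8, something like the following would do. Utf8FromCodePoint is a made-up helper name, not part of Scintilla, and it does not validate surrogates or out-of-range values.

```cpp
#include <string>

// Hypothetical helper (not a Scintilla function): encode one Unicode
// code point as a UTF-8 byte string. No validation is performed.
std::string Utf8FromCodePoint(unsigned int cp) {
    std::string s;
    if (cp < 0x80) {
        // 1 byte: 0xxxxxxx
        s += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        s += static_cast<char>(0xC0 | (cp >> 6));
        s += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        s += static_cast<char>(0xE0 | (cp >> 12));
        s += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        s += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        s += static_cast<char>(0xF0 | (cp >> 18));
        s += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        s += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        s += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return s;
}
```

For zero-width space, U+200B, this gives "\xe2\x80\x8b", so a call in the same style as above might be SCI_SETREPRESENTATION("\xe2\x80\x8b", "ZWSP"), where "ZWSP" is just an example label.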
Available from the Mercurial repository:
Performance and portability notes:
This feature requires examining every character and looking up its representation when laying out or displaying lines. It originally used an unordered_map (hash table) instead of a map since finding a character in an unordered_map is faster. The unordered_map class was added to C++ in C++11 but was originally documented in C++ TR1 in 2007 so has been available with most compilers for years. It's the details of that availability that caused too much trouble.
On Windows with Microsoft Visual C++, unordered_map was included with an update to Visual C++ 2008 but this update was not made available with the free 'Express' edition. It's probably too early to sacrifice VC++ 2008 as some important software, including Python 2.7, requires it.
On OS X, using libstdc++, as Scintilla currently does, unordered_map is available as std::tr1::unordered_map and the header is tr1/unordered_map. The future is libc++ where the tr1 can be ignored, but libc++ doesn't work on OS X 10.6. Scintilla currently supports 10.5+ and, again, it's too early to drop 10.6.
Using gcc, as on Linux, the --std=c++0x option must be set. However, one of the supported platforms, PySide, does not work with --std=c++0x. The next version, PySide 1.2.0, will, but that is not widely distributed as yet. BTW, the setting to make Qt projects use the c++0x option is:
*-g++*:QMAKE_CXXFLAGS += --std=c++0x
If just one of these issues had occurred, there'd be an #ifdef or similar workaround (and one was checked in for a while) but the number of problems made me believe that using unordered_map now would be a support sink. Similar issues will recur for using C++11 regex.
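For illustration only, a workaround of that kind might look roughly like the sketch below. This is not the code that was checked in. REP_USE_TR1 and REP_MAP are made-up names, the tr1 branch covers only the libstdc++ header layout described above, and the fallback to std::map is an assumption about what a build without either option would want.

```cpp
#include <string>

// REP_USE_TR1 is a hypothetical macro that a build would define when only
// the TR1 header is available (for example libstdc++ without c++0x).
#if __cplusplus >= 201103L
#include <unordered_map>
#define REP_MAP std::unordered_map
#elif defined(REP_USE_TR1)
#include <tr1/unordered_map>
#define REP_MAP std::tr1::unordered_map
#else
// No hash table available: fall back to the ordered map.
#include <map>
#define REP_MAP std::map
#endif

// Encoded character bytes -> representation text.
typedef REP_MAP<std::string, std::string> RepresentationMap;
```

Even this small sketch needs three branches and still leaves out the Visual C++ 2008 case, which gives some idea of why the plain map was chosen.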
To avoid a full retrieval from the map for each character, there is an extra array indexed by the first byte of a character indicating whether there are any entries in the map starting with that byte value. This avoids deep checks for visible ASCII characters, unless they have representations added by the application which would be unusual. This check may be extended in the future if there are any slow-downs observed with particular types of files, such as Asian text.
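The first-byte check can be sketched as below. This is a minimal illustration under stated assumptions, not Scintilla's actual code: RepresentationTable, startByteCount and the method names are all invented here, and a count per leading byte is used instead of a plain flag so that clearing an entry can keep the array accurate.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical sketch; class and member names are not Scintilla's.
class RepresentationTable {
    typedef std::map<std::string, std::string> MapType;
    MapType reps;                        // encoded character -> representation
    unsigned short startByteCount[256];  // number of entries in reps per leading byte
public:
    RepresentationTable() {
        for (int i = 0; i < 256; i++)
            startByteCount[i] = 0;
    }
    void Set(const std::string &encoded, const std::string &rep) {
        if (encoded.empty())
            return;
        MapType::iterator it = reps.find(encoded);
        if (it == reps.end()) {
            // New entry: record that this leading byte now needs a map lookup.
            reps[encoded] = rep;
            startByteCount[static_cast<unsigned char>(encoded[0])]++;
        } else {
            it->second = rep;
        }
    }
    void Clear(const std::string &encoded) {
        MapType::iterator it = reps.find(encoded);
        if (it != reps.end()) {
            const unsigned char lead = static_cast<unsigned char>(encoded[0]);
            assert(startByteCount[lead] > 0);
            startByteCount[lead]--;
            reps.erase(it);
        }
    }
    // Returns NULL when the character has no representation.
    const std::string *Find(const char *bytes, std::size_t len) const {
        if (len == 0)
            return NULL;
        // Cheap rejection: no entry starts with this byte, so skip the map.
        if (startByteCount[static_cast<unsigned char>(bytes[0])] == 0)
            return NULL;
        MapType::const_iterator it = reps.find(std::string(bytes, len));
        return (it == reps.end()) ? NULL : &it->second;
    }
};
```

With only the Ohm sign registered, a lookup for any ASCII character returns after a single array read. A lookup for another character starting with 0xE2 still goes to the map, which is the kind of case that could matter for text with many multi-byte characters.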
Neil