Substyles are now implemented for the Lua and HTML lexers to allow styling many sets of identifiers.
For Lua, this is simple: it just allows additional styling of identifiers, with substyles based on SCE_LUA_IDENTIFIER. The Lua lexer already offered multiple keyword lists, so this mostly improves consistency between lexers.
For HTML, it is much more significant as the lexer only provides 6 keyword lists where each is for one specific type of content: tags & attributes, JavaScript, Basic, Python, PHP, and SGML. The new substyles feature allows each of these, except for SGML, to have multiple identifier lists with different visual styles. Two of these cases are further split with separate sets for tags and attributes (instead of a combined set) and separate sets for client-side and server-side JavaScript as the different execution locations may have different APIs available.
As client-side Basic and Python are not popular, client-side code in these languages reuses the server-side sets.
Both lexers allow 64 substyles. For Lua, these start at the standard value 128 but for HTML, they start at 192 to allow a larger contiguous range of base styles.
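To make the numbering concrete, here is a minimal sketch of how such a range could be modelled. SubStyleRange, LuaRange and HtmlRange are hypothetical names invented for this illustration and are not part of Scintilla. Only the start values (128 and 192) and the count of 64 come from the text above.

```cpp
#include <cassert>

// Hypothetical model of a lexer's substyle range; not Scintilla's own class.
struct SubStyleRange {
    int start;      // first style number available for substyles
    int length;     // number of substyles the lexer allows
    int allocated;  // how many have been handed out so far

    SubStyleRange(int start_, int length_)
        : start(start_), length(length_), allocated(0) {}

    // Reserve `count` consecutive substyles.
    // Returns the first style number reserved, or -1 if they do not fit.
    int Allocate(int count) {
        assert(count > 0);
        if (allocated + count > length)
            return -1;
        const int first = start + allocated;
        allocated += count;
        return first;
    }

    // Highest style number that belongs to this range.
    int Last() const { return start + length - 1; }
};

// Values taken from the text: 64 substyles for each lexer.
inline SubStyleRange LuaRange() { return SubStyleRange(128, 64); }
inline SubStyleRange HtmlRange() { return SubStyleRange(192, 64); }
```

Under these assumptions the Lua substyles occupy 128 to 191 and the HTML substyles occupy 192 to 255, which leaves styles below 192 free for HTML base styles.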
While it would make sense to support substyles for SGML, this is not done as it is more difficult to implement than the other cases.
The substyles are checked when a lexeme is about to be given one of these base styles: SCE_H_TAG, SCE_H_ATTRIBUTE, SCE_HJ_WORD, SCE_HJA_WORD, SCE_HB_WORD, SCE_HP_WORD, and SCE_HPHP_WORD. The previously defined keyword lists take priority over the substyle sets.
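The lookup order can be sketched roughly as follows. This is an illustration of the priority rule only, not the lexer's actual code: SubStyleSet, StyleForWord and the fallbackStyle parameter are hypothetical names, and plain std::set stands in for whatever word-list structure the lexer really uses.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for one substyle: a style number plus its identifiers.
struct SubStyleSet {
    int style;
    std::set<std::string> identifiers;
};

// Pick a style for `word`:
//   1. The existing keyword list wins and yields `baseStyle`.
//   2. Otherwise the first substyle set containing the word yields its style.
//   3. Otherwise `fallbackStyle` is used (an assumption of this sketch).
int StyleForWord(const std::string &word,
                 int baseStyle,
                 const std::set<std::string> &keywords,
                 const std::vector<SubStyleSet> &subStyles,
                 int fallbackStyle) {
    if (keywords.count(word) > 0)
        return baseStyle;
    for (const SubStyleSet &sub : subStyles) {
        if (sub.identifiers.count(word) > 0)
            return sub.style;
    }
    return fallbackStyle;
}
```

In this model, a word that appears both in the keyword list and in a substyle set keeps the base style, so adding an existing keyword to a substyle set has no visible effect.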
While the implementation handles most of the obvious cases, more may be wanted, so it would be best to discuss them before this is in common use and therefore hard to change. For example, if client-side Basic is more popular than I thought, then different sets for client- and server-side Basic should be defined now, as this would be difficult to retrofit in a compatible way. The downside of separate lists is that identifiers may have to be added to both.
See this documentation for the substyles API or examine SciTE's source code:
Committed mostly in these change sets:
Neil