Patch for Scintillua style handling

85 views
Skip to first unread message

Mitchell

unread,
Aug 14, 2022, 10:24:48 AM8/14/22
to scite-i...@googlegroups.com
Hi Neil,

Attached is a small patch against SciTE 5.2.4 that updates property handling for Scintillua lexers. I am in the process of rewriting the Scintilla lexer to be a "proper" lexer in that it has no knowledge about styles and styling information (previously it would define SciTE-style properties that SciTE would read via a PrivateLexerCall). Applications like SciTE use the ILexer5 interface to ask for style names in order to produce a map of names to style numbers and then set styles in their own way. I've also renamed the Scintillua lexer from "lpeg" to "scintillua" since it's a lexer that allows for writing lexers in plain Lua too, not just using LPeg.

I hope this is acceptable to you.

Cheers,
Mitchell
scite_scintillua.diff
SciTEProps.cxx

Neil Hodgson

unread,
Aug 14, 2022, 7:16:52 PM8/14/22
to scite-interest
Hi Mitchell,

> Attached is a small patch against SciTE 5.2.4 that updates property handling for Scintillua lexers.

This has grown enough that it may be better as its own method where its more obvious that the code is Scintillua-specific.

The loops are a bit complex. The subtraction of numPredefined in the max expression isn't clear to me - if the lexer returns '4', say, then no lexer styles will be processed. If I've understood correctly, there is a range of named styles and a range of predefined styles that may or may not overlap so its divided into 3 ranges (named-before, predefined, named-after). I'd think this was clearer as enumerating up to max of named and predefined then ignoring when in the (potential) hole (namedStyles..StyleDefault).

for (i=0, i < max(namedStyles, LastPredefined+1); i++) {
if (i < namedStyles || i >= StyleDefault) {
// find the name and apply it.
}
}

'constexpr' should be used for compile-time constants like LastPredefined and numPredefined as they aren't just locally const and Visual C++ shows warnings.

Neil

Mitchell

unread,
Aug 14, 2022, 8:02:50 PM8/14/22
to scite-i...@googlegroups.com
Hi Neil,

On Mon, 15 Aug 2022 09:16:46 +1000
"'Neil Hodgson' via scite-interest" <scite-i...@googlegroups.com> wrote:

> Hi Mitchell,
>
> > Attached is a small patch against SciTE 5.2.4 that updates property handling for Scintillua lexers.
>
> This has grown enough that it may be better as its own method where its more obvious that the code is Scintillua-specific.

That's a good idea. I've moved the code into a function in an anonymous namespace. I'm also using constexpr like you suggest. Please see the revised patch.

> The loops are a bit complex. The subtraction of numPredefined in the max expression isn't clear to me - if the lexer returns '4', say, then no lexer styles will be processed. If I've understood correctly, there is a range of named styles and a range of predefined styles that may or may not overlap so its divided into 3 ranges (named-before, predefined, named-after). I'd think this was clearer as enumerating up to max of named and predefined then ignoring when in the (potential) hole (namedStyles..StyleDefault).

To make bookkeeping inside the lexer easier, NamedStyles() includes predefined styles in the count, so there are no overlapping ranges. It will always return at least 8 (the number of predefined styles from StyleDefault to LastDefined). If there are 33 named styles, it is composed of 33-8 lexer-defined styles, and 8 predefined styles. The predefined styles always need to be set, so that's where there's a separate loop from StyleDefault to LastDefined. This was the easiest way I could wrap my head around it.

Cheers,
Mitchell

>
> for (i=0, i < max(namedStyles, LastPredefined+1); i++) {
> if (i < namedStyles || i >= StyleDefault) {
> // find the name and apply it.
> }
> }
>
> 'constexpr' should be used for compile-time constants like LastPredefined and numPredefined as they aren't just locally const and Visual C++ shows warnings.
>
> Neil
>
> --
> You received this message because you are subscribed to the Google Groups "scite-interest" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to scite-interes...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/scite-interest/633F3D80-6E3A-4ACB-8963-4EB0B494C88E%40me.com.


Cheers,
Mitchell
scite_scintillua.diff
SciTEProps.cxx

Neil Hodgson

unread,
Aug 15, 2022, 7:57:35 PM8/15/22
to scite-i...@googlegroups.com
Mitchell:

> That's a good idea. I've moved the code into a function in an anonymous namespace. I'm also using constexpr like you suggest. Please see the revised patch.

OK, committed.
https://sourceforge.net/p/scintilla/scite/ci/be5c8e9e2e0df5a72e2bd8aee4bdd5cb26b3bb6c/

Neil

Mitchell

unread,
Aug 15, 2022, 11:29:17 PM8/15/22
to scite-i...@googlegroups.com
Thanks Neil, you are most kind.

On Tue, 16 Aug 2022 09:57:29 +1000
"'Neil Hodgson' via scite-interest" <scite-i...@googlegroups.com> wrote:

> --
> You received this message because you are subscribed to the Google Groups "scite-interest" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to scite-interes...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/scite-interest/6A747502-CBEB-4F9C-A789-B4306CA7B79F%40me.com.


Cheers,
Mitchell

Mitchell

unread,
Aug 23, 2022, 11:59:30 PM8/23/22
to scite-i...@googlegroups.com
Hi Neil,

I have one more patch for you before 5.3.0 goes out. This is a diff and file against the change you previously committed for me.

For Scintillua style names with more than one component (e.g. 'function.builtin' or 'string.longstring'), a property like this is set:

style.scintillua.cpp.3=$(scintillua.styles.function),$(scintillua.styles.function.builtin)

This allows for more targeted styling. It's similar to your tags concept: https://www.scintilla.org/StyleMetadata.html

I'm still not an expert with C++, so feel free to point out anything that needs changing or could be done better.

Cheers,
Mitchell
scite_scintillua.diff
SciTEProps.cxx

Neil Hodgson

unread,
Aug 25, 2022, 12:45:25 AM8/25/22
to scite-i...@googlegroups.com
Mitchell:

> I have one more patch for you before 5.3.0 goes out. This is a diff and file against the change you previously committed for me.

OK.
https://sourceforge.net/p/scintilla/scite/ci/ce4cba5a24358ceb6139aadf0c887888b0686134/

Changed the initialization of end to std::string::npos (which is immediately overwritten anyway) as -1 gave a type warning as size_t is unsigned.

Neil

Neil Hodgson

unread,
Aug 29, 2022, 7:34:06 PM8/29/22
to scite-interest
Mitchell, the lifetime of the variable referred to by 'name' may be a problem. 'name' is a reference, which is essentially a pointer and does not keep the thing referred to alive. I'm unsure of the actual C++ rules here but I think the value returned by 'wEditor.NameOfStyle(style)' can be deallocated at the end of that statement so will be dead when 'name' is used. Its better to just make 'name' a 'std::string' rather than a reference.

const std::string &name = wEditor.NameOfStyle(style);
...
end = name.find('.', ++end);

Neil

Csaba Raduly

unread,
Aug 31, 2022, 5:38:28 AM8/31/22
to scite-i...@googlegroups.com
Assigning a temporary to a *const* reference extends the lifetime of
the temporary to the lifetime of the reference.
See the first exception under
https://en.cppreference.com/w/cpp/language/lifetime#Temporary_object_lifetime

Csaba
> --
> You received this message because you are subscribed to the Google Groups "scite-interest" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to scite-interes...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/scite-interest/82B5F2D8-B4F2-43B1-9586-BA13689BF9F5%40me.com.



--
You can get very substantial performance improvements
by not doing the right thing. - Scott Meyers, An Effective C++11/14 Sampler
So if you're looking for a completely portable, 100% standards-conformant way
to get the wrong information: this is what you want. - Scott Meyers (C++TDaWYK)

Mitchell

unread,
Aug 31, 2022, 8:55:37 AM8/31/22
to scite-i...@googlegroups.com
Hi Neil,

On Tue, 30 Aug 2022 09:33:58 +1000
"'Neil Hodgson' via scite-interest" <scite-i...@googlegroups.com> wrote:

Before returning the name of a style, my Scintillua lexer stores it inside its own std::string and returns the c_str(), which is guaranteed to exist until the next call to NameOfStyle. Therefore, I didn't think lifetime was an issue here, as it is for most char* returns. If you prefer to use std::string, then that's fine with me.

Cheers,
Mitchell

Neil Hodgson

unread,
Aug 31, 2022, 6:59:22 PM8/31/22
to scite-i...@googlegroups.com
Mitchell:

> Before returning the name of a style, my Scintillua lexer stores it inside its own std::string and returns the c_str(), which is guaranteed to exist until the next call to NameOfStyle. Therefore, I didn't think lifetime was an issue here, as it is for most char* returns. If you prefer to use std::string, then that's fine with me.

Its already using std::string. This is going through ScintillaCall,

std::string ScintillaCall::NameOfStyle(int style) {
return CallReturnString(Message::NameOfStyle, style);
}

This invokes CallReturnString which finds the string length then allocates a std::string and copies the value. Its calling your lexer twice. So, a std::string is being allocated, making a copy of the string from the Scintillua lexer and this new std::string is returned as a temporary to the caller. As Csaba points out, its OK to take a const reference to this temporary as the temporary's lifetime is extended.

ScintillaCall also has a 2 argument form (int, char *) for copying into an already allocated buffer but that should be avoided as its safer to treat strings as values. I should convert more of the code in property handling to use std::string and avoid the potential for bugs with fixed size buffers.

Neil

Mitchell

unread,
Aug 31, 2022, 7:03:28 PM8/31/22
to scite-i...@googlegroups.com
Hi Neil,

On Thu, 1 Sep 2022 08:59:15 +1000
"'Neil Hodgson' via scite-interest" <scite-i...@googlegroups.com> wrote:

> Mitchell:
>
> > Before returning the name of a style, my Scintillua lexer stores it inside its own std::string and returns the c_str(), which is guaranteed to exist until the next call to NameOfStyle. Therefore, I didn't think lifetime was an issue here, as it is for most char* returns. If you prefer to use std::string, then that's fine with me.
>
> Its already using std::string. This is going through ScintillaCall,
>
> std::string ScintillaCall::NameOfStyle(int style) {
> return CallReturnString(Message::NameOfStyle, style);
> }
>
> This invokes CallReturnString which finds the string length then allocates a std::string and copies the value. Its calling your lexer twice.

Oh, thanks for pointing that out. I did not know that.

Cheers,
Mitchell
Reply all
Reply to author
Forward
0 new messages