Default keyword lists in lexers


thomas_li...@hotmail.com

May 26, 2017, 11:23:44 AM
to scintilla-interest
To make it easier to use a given lexer, I suggest that lexers come with default keywords.

I am aware that keywords vary over time and may not even be fixed at any given time. However, maintaining "decent" keyword lists in a single source seems preferable to the current situation, where every "program" that uses Scintilla has to obtain and maintain such keyword lists itself.

Individual "programs" can still set a different set of keywords for their "pet" languages.

And programs like Notepad++ can still let users configure this individually (though they will only need to store what differs from the default).

For "my" language the default keywords can be handled like this:

static const char *const visualPrologWordLists[] = {
    "Major keywords (class, predicates, ...)",
    "Minor keywords (if, then, try, ...)",
    "Directive keywords without the '#' (include, requires, ...)",
    "Documentation keywords without the '@' (short, detail, ...)",
    0,
};

static const char* const defaultKeywords[] = {
    // majorKeywords
    "goal namespace interface class implement open inherits supports resolve delegate monitor constants domains"
    " predicates constructors properties clauses facts",
    // minorKeywords
    "guard language stdcall apicall c thiscall prolog digits if then elseif else foreach do try catch finally erroneous failure"
    " procedure determ multi nondeterm anyflow and or externally from div mod rem quot in orelse otherwise",
    // directiveKeywords
    "include bininclude requires orrequires if then else elseif endif error message export externally options",
    // docKeywords
    "short detail end exception withdomain"
};

struct OptionSetVisualProlog : public OptionSet<OptionsVisualProlog> {
    OptionSetVisualProlog() {
        DefineWordListSets(visualPrologWordLists);
    }
};

class LexerVisualProlog : public ILexer {
    WordList majorKeywords;
    WordList minorKeywords;
    WordList directiveKeywords;
    WordList docKeywords;
    OptionsVisualProlog options;
    OptionSetVisualProlog osVisualProlog;
public:
    LexerVisualProlog() {
        majorKeywords.Set(defaultKeywords[0]);
        minorKeywords.Set(defaultKeywords[1]);
        directiveKeywords.Set(defaultKeywords[2]);
        docKeywords.Set(defaultKeywords[3]);
    }
    ...
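
The "defaults that a program can still replace" idea can be tried out without the Scintilla headers. The following is only a sketch under stated assumptions: KeywordSet is a hypothetical stand-in for Scintilla's WordList, SketchLexer is a hypothetical two-list lexer, and the default lists are abridged. In real Scintilla the container would replace a list through SCI_SETKEYWORDS rather than by calling a method directly.

```cpp
#include <array>
#include <cstddef>
#include <set>
#include <sstream>
#include <string>

// Hypothetical stand-in for Scintilla's WordList: a set of space-separated words.
class KeywordSet {
    std::set<std::string> words;
public:
    // Replace the current contents with the words in a space-separated string.
    void Set(const std::string &text) {
        words.clear();
        std::istringstream in(text);
        for (std::string w; in >> w;)
            words.insert(w);
    }
    bool InList(const std::string &w) const { return words.count(w) != 0; }
};

// Hypothetical lexer whose two keyword lists start out with built-in defaults.
class SketchLexer {
    std::array<KeywordSet, 2> lists;
public:
    SketchLexer() {
        lists[0].Set("class predicates clauses");  // default major keywords (abridged)
        lists[1].Set("if then else");              // default minor keywords (abridged)
    }
    // The container replaces list n wholesale; returns false for an unknown list.
    bool SetKeywords(std::size_t n, const std::string &text) {
        if (n >= lists.size())
            return false;
        lists[n].Set(text);
        return true;
    }
    bool IsKeyword(std::size_t n, const std::string &w) const {
        return n < lists.size() && lists[n].InList(w);
    }
};
```

A program that never calls SetKeywords gets the defaults; a program that calls it for one list sees exactly the words it supplied for that list, while the other list keeps its defaults.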

Lex Trotman

May 26, 2017, 8:31:20 PM
to scintilla...@googlegroups.com
On 26 May 2017 at 23:57, <thomas_li...@hotmail.com> wrote:
> To make it easier to use a given lexer, I suggest that lexers come with default keywords.
>
> I am aware that keywords vary over time and may not even be fixed at any given time. However, maintaining "decent" keyword lists in a single source seems preferable to the current situation, where every "program" that uses Scintilla has to obtain and maintain such keyword lists itself.

I'm not sure users agree on the keyword lists, and some (ab)use keyword
lists for highlighting dynamic information like type names.

>
> Individual "programs" can still set a different set of keywords for their "pet" languages.
>
> And programs like Notepad++ can still let users configure this individually (though they will only need to store what differs from the default).

This would increase the workload on users of Scintilla, who would have
to check the default keyword list for every language for every release
to see how it differed from their list. It would also need an interface to
remove words, not just add them, to allow a user to target a version of
the language older than the one Scintilla targeted.

This would just swap the up-front effort of creating the keyword lists,
borne by new users of Scintilla, for more ongoing effort for Scintilla
itself and for ALL users. Not a good overall deal IMHO.
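
To make the two costs concrete, here is a rough sketch of what an application might have to do if defaults shipped with Scintilla: apply its own additions and removals on top of the defaults, and check on each upgrade which words the new defaults introduced. The helper names (SplitWords, ApplyOverrides, AddedWords) are hypothetical and not part of Scintilla.

```cpp
#include <set>
#include <sstream>
#include <string>

using Words = std::set<std::string>;

// Split a space-separated keyword string into a set of words.
Words SplitWords(const std::string &text) {
    Words out;
    std::istringstream in(text);
    for (std::string w; in >> w;)
        out.insert(w);
    return out;
}

// Start from the shipped defaults, drop the words in `remove`, then add the words in `add`.
Words ApplyOverrides(const std::string &defaults, const std::string &add, const std::string &remove) {
    Words result = SplitWords(defaults);
    for (const std::string &w : SplitWords(remove))
        result.erase(w);
    for (const std::string &w : SplitWords(add))
        result.insert(w);
    return result;
}

// Words present in `newer` but not in `older`: what an application would
// need to review when a release changes a default list.
Words AddedWords(const std::string &older, const std::string &newer) {
    const Words oldSet = SplitWords(older);
    Words added;
    for (const std::string &w : SplitWords(newer))
        if (oldSet.count(w) == 0)
            added.insert(w);
    return added;
}
```

Removal is the part a plain "add to defaults" interface cannot express, which is why targeting an older language version needs it.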

Cheers
Lex

thomas_li...@hotmail.com

May 27, 2017, 9:51:31 AM
to scintilla-interest

I am aware that some languages/IDEs have dynamic token coloring: in Visual Studio, C++ and C# classes become light blue, something that can only happen after some kind of "compiler" has analyzed the code.

But even in C++ and C# there is also a set of fixed/static keywords (dark blue), like class, public, private, static, return, for, if, etc. Such lists will of course vary as languages develop over time, but a large core of them typically continues to exist.

Unfortunately, I do not have such lists for the languages that Scintilla lexers support. I doubt that you do either (did you have the lists for Visual Prolog before I listed them above?).

The producers of Notepad++ do not have the lists either, and in fact this is one of the reasons that Notepad++ only supports a small fraction of the languages that Scintilla supports.

It is true that Scintilla itself will have an extra maintenance overhead from this, but the alternative is that either all the direct users (e.g. the Notepad++ programmers) each have an equally large overhead or, even worse, that each of the users of their programs (e.g. the Notepad++ users) has such an overhead or is completely prevented from using a certain language (as is the case for Notepad++ users).

You write:

> This would increase the workload on users of Scintilla, who would have
> to check the default keyword list for every language for every release
> to see how it differed from their list. It would also need an interface to
> remove words, not just add them, to allow a user to target a version of
> the language older than the one Scintilla targeted.

  1. Nobody forces you to use those keywords; you can simply set other lists, just as you have to do now.
  2. You will now at least have some information about the development of languages that you do not yourself have intimate knowledge of.
  3. People who use Scintilla and have knowledge of a certain language can now easily inform other Scintilla users that the language has developed (I would be very grateful to receive such information automatically with Scintilla releases).

Regards Thomas Linder Puls

Neil Hodgson

May 27, 2017, 8:21:55 PM
to scintilla...@googlegroups.com
thomas_linder_puls:

> But even in C++ and C# there is also a set of fixed/static keywords (dark blue), like class, public, private, static, return, for, if, etc. Such lists will of course vary as languages develop over time, but a large core of them typically continues to exist.

Scintilla lexers may cover more than one language, with the cpp lexer handling C, C++, Java, C#, JavaScript, and others. There would have to be a tighter selection (possibly called something like language or lexer mode) for keywords.

Language communities often have different ideas of just what a keyword is, with various literals and predefined identifiers treated as keywords.
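
One way to picture the "language or lexer mode" selection is a table of defaults keyed by both the lexer and a mode within it. This is purely illustrative: Scintilla has no such table, the mode names and the DefaultKeywords function are hypothetical, and the keyword strings are heavily abridged.

```cpp
#include <map>
#include <string>
#include <utility>

// (lexer name, language mode) -> default keyword string.
// All entries are illustrative and abridged.
using ModeKey = std::pair<std::string, std::string>;

const std::map<ModeKey, std::string> &DefaultKeywordTable() {
    static const std::map<ModeKey, std::string> table = {
        {{"cpp", "c"}, "if else for while return struct"},
        {{"cpp", "java"}, "if else for while return class interface extends"},
    };
    return table;
}

// Returns the default list for the given mode, or an empty string when
// no defaults are registered for that (lexer, mode) pair.
std::string DefaultKeywords(const std::string &lexer, const std::string &mode) {
    const auto it = DefaultKeywordTable().find({lexer, mode});
    return it == DefaultKeywordTable().end() ? std::string() : it->second;
}
```

The lexer name alone is not enough as a key: the same "cpp" lexer yields different lists for C and Java, and a mode with no registered entry simply gets no defaults.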

> Unfortunately, I do not have such lists for the languages that Scintilla lexers support. I doubt that you do either (did you have the lists for Visual Prolog before I listed them above?).

Some projects use keyword lists from SciTE’s .properties files. No one has provided a SciTE .properties file for Visual Prolog.

> It is true that Scintilla itself will have an extra maintenance overhead from this, but the alternative is that either all the direct users (e.g. the Notepad++ programmers) each have an equally large overhead or, even worse, that each of the users of their programs (e.g. the Notepad++ users) has such an overhead or is completely prevented from using a certain language (as is the case for Notepad++ users).

Effort can be moved from other projects to Scintilla by expanding the scope of Scintilla. That isn’t going to work unless there is a corresponding increase in labour provided to Scintilla to handle, discuss, integrate, and test this additional scope.

Neil

Andreas Tscharner

May 28, 2017, 8:07:52 AM
to scintilla...@googlegroups.com
On 26.05.2017 15:57, thomas_li...@hotmail.com wrote:
> To make it easier to use a given lexer, I suggest that lexers come
> with default keywords.
>
> I am aware that keywords vary over time and may not even be fixed at
> any given time. However, maintaining "decent" keyword lists in a
> single source seems preferable to the current situation, where every
> "program" that uses Scintilla has to obtain and maintain such keyword
> lists itself.
>

The reference for "my" language DMIS is sold and so I am not sure that
I'd be allowed to include the word lists.

Best regards
Andreas
--
Andreas Tscharner sterne...@gmail.com
------------------------------------------------------------------------
The decisive advantage of a chat over a normal telephone call is that
the former is slower and costs more (for the vital exchange of
information such as "hya folks", "C U l8er" and ":-)") ...
From Murphy's Computer Laws

thomas_li...@hotmail.com

Jun 6, 2017, 5:37:35 PM
to scintilla-interest, nyama...@me.com


Neil Hodgson:

>   Scintilla lexers may cover more than one language with the cpp lexer handling C, C++, Java, C#, JavaScript, and others. There would have to be a tighter selection (possibly called something like language or lexer mode) for keywords.

Yes, that is correct; there will have to be some mechanism for dealing with this. In fact, I use the Visual Prolog lexer with two different sets of keywords myself.
 
>   Some projects use keyword lists from SciTE’s .properties files. No one has provided a SciTE .properties file for Visual Prolog.

I will provide such properties, and I can of course also "translate" the other .properties files for my lexer support.

(But the point is that it seems preferable for that kind of effort to be made one-for-all, rather than all-for-all.)
 

thomas_li...@hotmail.com

Jun 13, 2017, 3:54:52 PM
to scintilla-interest, nyama...@me.com


Neil Hodgson:

> No one has provided a SciTE .properties file for Visual Prolog.

Actually, Visual Prolog properties have been provided: visualprolog.properties

Regards Thomas Linder Puls

Neil Hodgson

Jun 13, 2017, 5:41:22 PM
to scintilla-interest
thomas_linder_puls:

> Actually, Visual Prolog properties have been provided: visualprolog.properties

That does not work with current Scintilla. That feature request renumbered the SCE_VISUALPROLOG_* constants, which I refused as it was a compatibility break. The visualprolog.properties file uses the new numbers, so it sets the wrong styles.

There were also issues with settings being assigned twice, which I queried without receiving a reply. For example:

# String
style.visualprolog.13=$(colour.string),$(font.string.literal)
style.visualprolog.13=fore:#3898B2,$(font.string.literal)
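
Double assignments like this are easy to find mechanically. The sketch below uses a hypothetical helper, DuplicateKeys, that scans text for keys assigned more than once. It assumes plain "key=value" lines with '#' comment lines and does not attempt to handle the rest of the .properties syntax.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Returns the keys that are assigned more than once, each reported once,
// in the order in which the second assignment appears.
// Assumes simple "key=value" lines; lines starting with '#' are comments.
std::vector<std::string> DuplicateKeys(const std::string &text) {
    std::map<std::string, int> seen;   // key -> number of assignments so far
    std::vector<std::string> dups;
    std::istringstream in(text);
    for (std::string line; std::getline(in, line);) {
        if (line.empty() || line[0] == '#')
            continue;
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos)
            continue;                  // not an assignment
        const std::string key = line.substr(0, eq);
        if (++seen[key] == 2)
            dups.push_back(key);
    }
    return dups;
}
```

Run over the two lines above, it would report style.visualprolog.13 as assigned twice.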

Neil

thomas_li...@hotmail.com

Jun 20, 2017, 9:41:12 AM
to scintilla-interest, nyama...@me.com
I see.


Regards Thomas Linder Puls
