My symbols are overriding my keywords/built-ins

John Horigan

Aug 15, 2017, 2:32:15 AM
to highlight.js
I'm trying to create a language file for the Context Free Design Grammar (CFDG), but I am running into a conflict between symbols and keywords. If I include a symbol definition, all of my keywords and built-ins are classified as symbols. If I leave the symbol definition out, my keywords and built-ins are classified correctly.

/*
Language: Context Free Design Grammar
Author: John Horigan <redacted>
Category: graphics
*/



function(hljs) {
  var CFDG_KEYWORDS = {
    keyword: 'startshape|10 rule shape|10 background include import tile path|10 loop clone ' +
      'let finally if else switch case time timescale rotate r flip f size s skew x y z ' +
      'transform trans hue h saturation sat brightness b alpha a ',
    built_in: 'cos sin tan cot acos asin atan acot cosh sinh tanh acosh asinh atanh log log10 ' +
      'sqrt exp abs floor ceiling infinity factorial sg isNatural bitnot bitor bitand bitxor ' +
      'bitleft bitright atan2 mod divides div dot cross hsb2rgb rgb2hsb vec min max ftime ' +
      'frame rand_static rand rand::exponential rand::gamma rand::weibull rand::extremeV ' +
      'rand::normal rand::lognormal rand::chisquared rand::cauchy rand::fisherF ' +
      'rand::studentT randint randint::bernoulli randint::binomial randint::negbinomial ' +
      'randint::poisson randint::discrete randint::geometric'
  };

  return {
    case_insensitive: false,
    aliases: ['cfdg'],
    lexemes: /[a-zA-Z][a-zA-Z0-9:_]*/,
    keywords: CFDG_KEYWORDS,
    contains: [
      {
        className: 'string',
        begin: '"', end: '"',
        illegal: '\\n\\r',
      },
      {
        className: 'number',
        begin: /(\b\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?/,
        relevance: 0
      },
      {
        className: 'strong',
        begin: /\bCF::[a-zA-Z]+/,
        relevance: 10
      },
      // This symbol definition conflicts with my keywords and built-ins
      {
        className: 'symbol',
        begin: /\b[a-zA-Z_]([:a-zA-Z_0-9])*/,
        relevance: 0
      },
      hljs.C_BLOCK_COMMENT_MODE,
      hljs.C_LINE_COMMENT_MODE,
      hljs.HASH_COMMENT_MODE,
    ]
  };
}


How can I have the keywords and built-ins override the symbol definition?
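
One possible workaround that needs no change to highlight.js itself would be to make the symbol pattern refuse keyword text through a negative lookahead. The sketch below only demonstrates the regex idea in plain JavaScript on whole lexemes. `buildSymbolRegex`, `escapeRe` and the shortened keyword list are hypothetical names made up for this illustration, and how such a pattern would interact with highlight.js's own matching is untested here.

```javascript
// Hypothetical helper: escape regex metacharacters in a literal word.
function escapeRe(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build a regex that accepts an identifier only if it is
// not one of the words in `keywordString`. The string uses the same
// space-separated format as CFDG_KEYWORDS, where "|10" is a relevance suffix.
function buildSymbolRegex(keywordString) {
  var words = keywordString.split(/\s+/).filter(Boolean).map(function (w) {
    return escapeRe(w.split('|')[0]); // drop the "|10" relevance suffix
  });
  // (?!(?:kw1|kw2)$) rejects a lexeme that is exactly a keyword;
  // longer identifiers such as "ruler" still pass.
  return new RegExp('^(?!(?:' + words.join('|') + ')$)[a-zA-Z_][:a-zA-Z_0-9]*$');
}

// Shortened keyword list, for illustration only.
var symbolRe = buildSymbolRegex('startshape|10 rule shape|10 rand rand::normal');

console.log(symbolRe.test('rule'));         // false: exact keyword
console.log(symbolRe.test('ruler'));        // true: merely starts with a keyword
console.log(symbolRe.test('rand::normal')); // false: exact built-in
console.log(symbolRe.test('myShape'));      // true: ordinary identifier
```

The lookahead is anchored with `$` rather than `\b` because `:` is part of identifiers in this grammar, so `\b` would treat `rand` inside `rand::normal` as a complete word.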

-- john

John Horigan

Aug 15, 2017, 2:08:32 PM
to highlight.js
I have been looking through the highlight.js source, and I see that it does not do keyword matching unless all of the other matches fail. So it does not look possible to style all identifiers except those that match keywords/built-ins. If I modify the subMode() function like so:

    function subMode(lexeme, mode) {
      var i, length;

      for (i = 0, length = mode.contains.length; i < length; i++) {
        var match = matchRe(mode.contains[i].beginRe, lexeme);
        if (match) {
          // Skip this sub-mode when the matched text is a keyword
          if (!keywordMatch(mode, match))
            return mode.contains[i];
        }
      }
    }

where matchRe() is a variant of testRe() that returns the match object:

    function matchRe(re, lexeme) {
      var match = re && re.exec(lexeme);
      // Only accept a match that starts at the beginning of the lexeme
      if (match && match.index === 0)
        return match;
      else
        return null;
    }

Then I get the behavior that I want: identifiers in my language get the symbol style except for those that match keywords and built-ins.
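
To see the effect in isolation, here is a minimal, self-contained sketch of the modified lookup. It does not load highlight.js: `keywordMatch` below is a simplified stand-in (a plain table lookup on the matched text), and `toyMode` and `classify` are hypothetical names invented for this demonstration, so treat it as a model of the idea rather than the library's actual code.

```javascript
// Returns the match object only when the regex matches at index 0.
function matchRe(re, lexeme) {
  var match = re && re.exec(lexeme);
  return (match && match.index === 0) ? match : null;
}

// Simplified stand-in for highlight.js's keywordMatch(): a table lookup.
function keywordMatch(mode, match) {
  var text = mode.case_insensitive ? match[0].toLowerCase() : match[0];
  var known = Object.prototype.hasOwnProperty.call(mode.keywords, text);
  return known ? mode.keywords[text] : null;
}

// The modified lookup: a sub-mode is rejected when its match is a keyword.
function subMode(lexeme, mode) {
  for (var i = 0; i < mode.contains.length; i++) {
    var match = matchRe(mode.contains[i].beginRe, lexeme);
    if (match && !keywordMatch(mode, match)) {
      return mode.contains[i];
    }
  }
}

// Hypothetical, hand-built mode with only the symbol entry.
var toyMode = {
  case_insensitive: false,
  keywords: { rule: 'keyword', sqrt: 'built_in' },
  contains: [
    { className: 'symbol', beginRe: /\b[a-zA-Z_]([:a-zA-Z_0-9])*/ }
  ]
};

// Hypothetical helper that reports which class a lexeme would end up with.
function classify(lexeme) {
  var found = subMode(lexeme, toyMode);
  return found ? found.className : 'left for keyword processing';
}

console.log(classify('rule'));    // left for keyword processing
console.log(classify('sqrt'));    // left for keyword processing
console.log(classify('myShape')); // symbol
```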

This change to the subMode() function would break other languages. But what if a new property, called excludeKeywords, were added to the mode objects listed in contains? When excludeKeywords is true, keywordMatch() would be called so that keywords do not match the beginRe for that mode object:

      {
        className: 'symbol',
        begin: /\b[a-zA-Z_]([:a-zA-Z_0-9])*/,
        excludeKeywords: true,
        relevance: 0
      },
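
A rough sketch of how the lookup might honor such a flag is below. It is self-contained and does not use highlight.js: `excludeKeywords` is the proposed (not existing) property, `keywordMatch` is a simplified table-lookup stand-in, and `makeMode` is a hypothetical helper that builds a toy mode with or without the flag.

```javascript
// Returns the match object only when the regex matches at index 0.
function matchRe(re, lexeme) {
  var match = re && re.exec(lexeme);
  return (match && match.index === 0) ? match : null;
}

// Simplified stand-in for highlight.js's keywordMatch(): a table lookup.
function keywordMatch(mode, match) {
  var known = Object.prototype.hasOwnProperty.call(mode.keywords, match[0]);
  return known ? mode.keywords[match[0]] : null;
}

// Opt-in variant: keywords are excluded only for sub-modes that ask for it,
// so sub-modes without the flag behave exactly as before.
function subMode(lexeme, mode) {
  for (var i = 0; i < mode.contains.length; i++) {
    var candidate = mode.contains[i];
    var match = matchRe(candidate.beginRe, lexeme);
    if (!match) continue;
    if (candidate.excludeKeywords && keywordMatch(mode, match)) continue;
    return candidate;
  }
}

// Hypothetical helper: a toy mode whose symbol entry may carry the flag.
function makeMode(excludeKeywords) {
  return {
    keywords: { rule: 'keyword', sqrt: 'built_in' },
    contains: [{
      className: 'symbol',
      beginRe: /\b[a-zA-Z_]([:a-zA-Z_0-9])*/,
      excludeKeywords: excludeKeywords
    }]
  };
}

console.log(subMode('rule', makeMode(true)));               // undefined
console.log(subMode('rule', makeMode(false)).className);    // symbol
console.log(subMode('myShape', makeMode(true)).className);  // symbol
```

With the flag off, the keyword `rule` is still captured by the symbol entry, which is the current behavior; with the flag on, it falls through so that keyword processing can claim it.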

Does this seem like a reasonable way to do things? Should I create a pull request with these changes to highlight.js?

-- john



