Hi,
tl;dr How should I use named Unicode character classes in regexps?
I'm trying to write a regular expression that matches Go identifiers, which start with a Unicode letter or underscore followed by zero or more Unicode letters, decimal digits, and/or underscores.
identifierRegexp := regexp.MustCompile(`\A[[\p{Letter}]_][[\p{Letter}][\p{Number, decimal digit}]_]*\z`)
However, this pattern does not compile, giving the error:
regexp: Compile(`\A[[\p{Letter}]_][[\p{Letter}][\p{Number, decimal digit}]_]*\z`): error parsing regexp: invalid character class range: `\p{Letter}`
Using the short names for the character classes (L for Letter, Nd for Number, decimal digit) does work, however:
identifierRegexp := regexp.MustCompile(`\A[\pL_][\pL\p{Nd}_]*\z`)
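For reference, here is a minimal self-contained sketch of the short-name version that I'm testing with. The helper name isIdentifier and the sample inputs are made up for illustration only, and the pattern does not try to exclude Go keywords.

```go
package main

import (
	"fmt"
	"regexp"
)

// identifierRegexp uses the short Unicode category names:
// L (any letter) and Nd (decimal digit).
var identifierRegexp = regexp.MustCompile(`\A[\pL_][\pL\p{Nd}_]*\z`)

// isIdentifier is a hypothetical helper for this example. It reports
// whether s is a letter or underscore followed by zero or more
// letters, decimal digits, or underscores.
func isIdentifier(s string) bool {
	return identifierRegexp.MatchString(s)
}

func main() {
	// Sample inputs chosen for illustration only.
	for _, s := range []string{"foo", "_x1", "héllo", "1abc", "a-b", ""} {
		fmt.Printf("%q: %v\n", s, isIdentifier(s))
	}
}
```

With this pattern, "foo", "_x1" and "héllo" are accepted, while "1abc", "a-b" and the empty string are rejected.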
Is it simply an oversight that long Unicode character class names like "Letter" and "Number, decimal digit" are not available for use in regexps, or should I be using them differently?
Many thanks,
Tom