Why can't a regexp.Regexp be const


Pat Farrell

Feb 13, 2023, 6:48:42 PM
to golang-nuts
This won't compile

var ExtRegex = regexp.MustCompile("(M|m)(p|P)(3|4))|((F|f)(L|l)(A|a)(C|c))$")

with a 
./prog.go:10:18: regexp.MustCompile("((M|m)(p|P)(3|4))|((F|f)(L|l)(A|a)(C|c))$") (value of type *regexp.Regexp) is not constant

while
const pat = "((M|m)(p|P)(3|4))|((F|f)(L|l)(A|a)(C|c))$"
var ExtRegex = regexp.MustCompile(pat)

Works fine.
So, why can't the regexp be a constant?
Is there some state that is kept in the regexp.Regexp store?

And perhaps more importantly, what is the proper Go style for
having a compiled regexp?
I could put the var statement outside all blocks, so it's in effect
a package variable. But I think having a package variable is bad form.

I'm using the regexp in a loop over all the strings in all the files in a directory tree.
I really don't want to compile it on every pass through the lines.

Thanks
Pat

burak serdar

Feb 13, 2023, 7:04:21 PM
to Pat Farrell, golang-nuts
This compiles just fine, but the regexp compilation fails:


--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/39ae6e9f-1c27-45cd-93c2-39a3b75cc6a3n%40googlegroups.com.

David Finkel

Feb 13, 2023, 7:08:35 PM
to Pat Farrell, golang-nuts
On Mon, Feb 13, 2023 at 6:48 PM Pat Farrell <pat2...@gmail.com> wrote:
> This won't compile
>
> var ExtRegex = regexp.MustCompile("(M|m)(p|P)(3|4))|((F|f)(L|l)(A|a)(C|c))$")
>
> with a
> ./prog.go:10:18: regexp.MustCompile("((M|m)(p|P)(3|4))|((F|f)(L|l)(A|a)(C|c))$") (value of type *regexp.Regexp) is not constant

That error indicates that you wrote `const ExtRegex = ....`
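
As background for that error: Go constants are limited to booleans, runes, numbers, and strings, and a constant expression cannot contain an ordinary function call, so the *regexp.Regexp returned by regexp.MustCompile can only be stored in a variable. A minimal sketch of the two declarations, using the balanced pattern from the question (the names extRegex and matchesExt are illustrative, not from the original code):

```go
package main

import (
	"fmt"
	"regexp"
)

// A string literal is a valid constant.
const pat = `((M|m)(p|P)(3|4))|((F|f)(L|l)(A|a)(C|c))$`

// The next line would not compile; the compiler reports that the
// value of type *regexp.Regexp "is not constant":
// const extRegex = regexp.MustCompile(pat)

// A variable is fine: the initializer runs once, at program startup.
var extRegex = regexp.MustCompile(pat)

// matchesExt (hypothetical helper) reports whether name matches the pattern.
func matchesExt(name string) bool {
	return extRegex.MatchString(name)
}

func main() {
	fmt.Println(matchesExt("song.mp3"))
	fmt.Println(matchesExt("notes.txt"))
}
```

The pattern itself can stay a const, as in the question; only the compiled value has to be a var.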

> while
> const pat = "((M|m)(p|P)(3|4))|((F|f)(L|l)(A|a)(C|c))$"
> var ExtRegex = regexp.MustCompile(pat)
>
> Works fine.
> So, why can't the regexp be a constant?
> Is there some state that is kept in the regexp.Regexp store?

It used to, but that was removed in an earlier version (see the deprecation notice on regexp.Copy).

> And perhaps more importantly, what is the proper Go style for
> having a compiled regexp?
> I could put the var statement outside all blocks, so it's in effect
> a package variable. But I think having a package variable is bad form.

For cases like this, I tend to create an unexported package variable that's initialized with MustCompile.
Package-level variables are generally bad form, but they make sense when you have mostly read-only things that must be initialized at startup (regexps compiled from constant patterns are one of these).
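
A rough sketch of that approach, assuming the goal is to test many strings against one pattern. The names extRegex and filterMedia are made up for illustration, and the slice stands in for the lines read from the directory tree:

```go
package main

import (
	"fmt"
	"regexp"
)

// extRegex is unexported and compiled once, when the package is initialized.
// MustCompile panics at startup if the pattern is invalid.
var extRegex = regexp.MustCompile(`((M|m)(p|P)(3|4))|((F|f)(L|l)(A|a)(C|c))$`)

// filterMedia returns the entries of names that match extRegex.
// The loop reuses the compiled regexp; nothing is recompiled per item.
func filterMedia(names []string) []string {
	var out []string
	for _, n := range names {
		if extRegex.MatchString(n) {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	names := []string{"a.mp3", "b.txt", "c.FLAC", "d.Mp4"}
	fmt.Println(filterMedia(names)) // prints [a.mp3 c.FLAC d.Mp4]
}
```

Since the variable is never reassigned after startup, it behaves like a constant in practice even though the language cannot declare it as one.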


Pat Farrell

Feb 13, 2023, 7:35:42 PM
to golang-nuts
On Monday, February 13, 2023 at 7:08:35 PM UTC-5 David Finkel wrote:
> So, why can't the regexp be a constant?

Oh, this is what I missed.

> For cases like this, I tend to create an unexported package variable that's initialized with MustCompile.
> Package-level variables are generally bad form, but they make sense when you have mostly read-only things that must be initialized at startup (regexps compiled from constant patterns are one of these).

Which is why I was trying to not use them or make things const.

Long ago, when dinosaurs roamed, I spent a lot of time trying to use C++'s const
in places that made sense. It was a disaster and I gave up.

Howard C. Shaw III

Feb 14, 2023, 10:51:31 AM
to golang-nuts
> var ExtRegex = regexp.MustCompile("(M|m)(p|P)(3|4))|((F|f)(L|l)(A|a)(C|c))$")
>
> with a
> ./prog.go:10:18: regexp.MustCompile("((M|m)(p|P)(3|4))|((F|f)(L|l)(A|a)(C|c))$") (value of type *regexp.Regexp) is not constant


The actual error I get is "error parsing regexp: unexpected ): `(M|m)(p|P)(3|4))|((F|f)(L|l)(A|a)(C|c))$`"
It is missing a parenthesis. This compiles fine:

    var ExtRegex = regexp.MustCompile("((M|m)(p|P)(3|4))|((F|f)(L|l)(A|a)(C|c))$")
Note that there are two parentheses after the opening double quote.
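
One way to look at that parse error without the panic from MustCompile is to call regexp.Compile, which returns the error instead. A small sketch, where checkPattern is a hypothetical helper and not part of the regexp package:

```go
package main

import (
	"fmt"
	"regexp"
)

// checkPattern compiles pat and returns the parse error, if any,
// instead of panicking the way regexp.MustCompile does.
func checkPattern(pat string) error {
	_, err := regexp.Compile(pat)
	return err
}

func main() {
	bad := "(M|m)(p|P)(3|4))|((F|f)(L|l)(A|a)(C|c))$"   // one opening parenthesis short
	good := "((M|m)(p|P)(3|4))|((F|f)(L|l)(A|a)(C|c))$" // balanced

	fmt.Println(checkPattern(bad))  // a non-nil "error parsing regexp" error
	fmt.Println(checkPattern(good)) // <nil>
}
```

Both versions are valid Go source, which is why the mistake only shows up when the program runs.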

Rob Pike

Feb 14, 2023, 3:52:09 PM
to Howard C. Shaw III, golang-nuts
I suggest two changes.

First, always use backquotes `` around regular expressions to avoid misunderstandings should a backslash occur. 

Second, in this case if M|m is just a choice and not a subexpression you need to track, you can make the pattern easier to read by using character classes to reduce the number of parentheses. You will need to adjust the code that uses subexpression matching. I did this by hand and didn't test it, but it will give you the idea:

`([Mm][Pp][34])|([Ff][Ll][Aa][Cc])$`

In fact, if you don't need subexpressions there are no parentheses needed, as the alternation operator groups as you would expect.

`[Mm][Pp][34]|[Ff][Ll][Aa][Cc]$`

However, I suspect at least some of the elements must be parenthesized for the job at hand.

-rob
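
One consequence of how alternation groups is worth checking against real file names: `|` binds more loosely than anything else, so in the unparenthesized pattern the `$` belongs to the FLAC branch only. The sketch below compares it with a hypothetical variant (not from the thread) that uses a non-capturing group to anchor both branches; the variable names and sample file names are made up:

```go
package main

import (
	"fmt"
	"regexp"
)

// Unparenthesized: `$` applies to the FLAC alternative only.
var loose = regexp.MustCompile(`[Mm][Pp][34]|[Ff][Ll][Aa][Cc]$`)

// Hypothetical variant: the non-capturing group (?:...) makes `$`
// apply to both alternatives without adding a subexpression.
var anchored = regexp.MustCompile(`(?:[Mm][Pp][34]|[Ff][Ll][Aa][Cc])$`)

func main() {
	for _, name := range []string{"a.mp3", "a.FLAC", "a.mp3.txt", "a.flac.txt"} {
		fmt.Printf("%-12s loose=%-5v anchored=%v\n",
			name, loose.MatchString(name), anchored.MatchString(name))
	}
}
```

With these inputs only "a.mp3.txt" differs: the loose pattern matches it because "mp3" may appear anywhere, while the anchored one does not. Whether that matters depends on the job at hand.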



Philip Stein

Feb 14, 2023, 4:11:00 PM
to Rob Pike, Howard C. Shaw III, golang-nuts
Related to this, couldn't it also be made case insensitive to simplify it? https://go.dev/play/p/5qJTAcmkowg
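
The linked playground is not reproduced here, but a case-insensitive version could look roughly like the sketch below. It relies on the `(?i)` flag from Go's regexp syntax; the pattern and the helper name isMedia are assumptions for illustration. Note that the group also makes `$` apply to both alternatives, which is slightly stricter than the original pattern:

```go
package main

import (
	"fmt"
	"regexp"
)

// (?i) turns on case-insensitive matching for the rest of the pattern,
// so the per-letter alternatives such as (M|m) are no longer needed.
var extRegex = regexp.MustCompile(`(?i)(mp[34]|flac)$`)

// isMedia (hypothetical helper) reports whether name ends in mp3, mp4, or flac.
func isMedia(name string) bool {
	return extRegex.MatchString(name)
}

func main() {
	fmt.Println(isMedia("track.MP3"), isMedia("track.Flac"), isMedia("track.txt"))
	// prints: true true false
}
```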

Pat Farrell

Feb 16, 2023, 3:35:25 PM
to golang-nuts
Yes of course.
At least your version is easier for humans to read.
In the application, I expect the whole regexp thing is a trivially small part of
the execution time. But I'm trying to write proper idiomatic Go, and yours is nicer.

