Is the result of FindStringSubmatch correct?


sleepy

Mar 2, 2020, 3:46:16 AM
to golang-nuts
https://play.golang.org/p/YoNyiKo-lTc

package main

import (
	"fmt"
	"regexp"
)

func main() {
	s := `([^ ]+ ?)*`
	re := regexp.MustCompile(s)
	fmt.Printf("%q\n", re.FindStringSubmatch("aaa bbb ccc"))
}


I expect the output to be:
["aaa bbb ccc" "aaa " "bbb " "ccc"]
but the actual output is:
["aaa bbb ccc" "ccc"]
 
 

Brian Candler

Mar 2, 2020, 6:36:16 AM
to golang-nuts
Perhaps you were confused by this in the documentation:
"...and the matches, if any, of its subexpressions, as defined by the 'Submatch' descriptions in the package comment"

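That means one value per parenthesised subexpression in the original expression, numbered by the position of its opening parenthesis. E.g. given the expression ((\S+) (\S+) (\S+)), the submatches $1 to $4 will be "aaa bbb ccc", "aaa", "bbb", "ccc". In the returned slice they follow element 0, which is the whole match.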

((\S+) (\S+) (\S+)) 
^^     ^     ^
||     |     |
|$2    $3    $4
|
$1

But if one parenthesised expression matches multiple times due to a repeat, it still gives a single value (the last match).  That is: $1 is always $1, and $2 is always $2, based on their positions in the regexp.  They don't shift along if there are multiple matches.
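
A minimal sketch of both cases described above, using the input string from the original post. The helper name submatches is made up for this illustration; it only wraps regexp.MustCompile and FindStringSubmatch.

```go
package main

import (
	"fmt"
	"regexp"
)

// submatches is a hypothetical helper for this example: it compiles the
// pattern and returns the FindStringSubmatch result for the input.
func submatches(pattern, input string) []string {
	return regexp.MustCompile(pattern).FindStringSubmatch(input)
}

func main() {
	const input = "aaa bbb ccc"

	// Four groups in the pattern: element 0 is the whole match,
	// elements 1..4 are $1..$4 in order of their opening parentheses.
	fmt.Printf("%q\n", submatches(`((\S+) (\S+) (\S+))`, input))
	// ["aaa bbb ccc" "aaa bbb ccc" "aaa" "bbb" "ccc"]

	// One group under a repeat: still a single slot, which holds the
	// text from the last iteration of the repeat.
	fmt.Printf("%q\n", submatches(`([^ ]+ ?)*`, input))
	// ["aaa bbb ccc" "ccc"]
}
```

The length of the result depends only on the number of groups in the pattern, not on how many times a group matched.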

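Maybe you want FindAllSubmatch instead (FindAllStringSubmatch for string input), with the repeat removed from the pattern so that each word is a separate match.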
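
One way this suggestion might look, as a sketch rather than the only approach: the pattern is the original one without the trailing *, and FindAllStringSubmatch (the string counterpart of FindAllSubmatch) is called with -1 to return every match. The function name words is an assumption made for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// words is a hypothetical helper: it collects capture group 1 from every
// non-overlapping match of the un-repeated pattern.
func words(input string) []string {
	re := regexp.MustCompile(`([^ ]+ ?)`)
	var out []string
	// n = -1 asks for all matches; each m is [whole match, group 1].
	for _, m := range re.FindAllStringSubmatch(input, -1) {
		out = append(out, m[1])
	}
	return out
}

func main() {
	fmt.Printf("%q\n", words("aaa bbb ccc"))
	// ["aaa " "bbb " "ccc"]
}
```

This yields the three pieces the original post expected, although as separate matches rather than as extra submatches of a single match.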

Brian Candler

Mar 2, 2020, 6:39:28 AM
to golang-nuts

bomin

Mar 6, 2020, 8:28:17 PM
to golang-nuts
Got it, thanks.

On Monday, March 2, 2020 at 7:39:28 PM UTC+8, Brian Candler wrote: