regexp group multi Replace Pattern


Emilius Omeen

Sep 2, 2020, 1:03:28 PM
to golang-nuts
Hello,
Can somebody suggest a way to implement a multi-pattern replace?
93881231212 -> 13881231212
0013881231212 -> 13881231212
+13881231212  -> 13881231212

Something like this:

https://play.golang.org/p/Abeq2-FV7NW


package main

import (
	"fmt"
	"regexp"
	"strings"
)

func main() {
	fmt.Println("Hello, 世界")
	r := regexp.MustCompile(`^9(\d{10})$|^00(.*)$|^\+(.*)`)
	fmt.Println(ReplaceAllStringGroup("93881231212", `1$1|$2|$3`, r))
	fmt.Println(ReplaceAllStringGroup("0013881231212", `0$1|$2|$3`, r))
	fmt.Println(ReplaceAllStringGroup("+13881231212", `0$1|$2|$3`, r))
	fmt.Println(ReplaceAllStringGroup("-13881231212", `0$1|$2|$3`, r))
}

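// ReplaceAllStringGroup splits repl on "|" into one template per capture group
// and applies the template of the first non-empty group to s.
func ReplaceAllStringGroup(s string, repl string, r *regexp.Regexp) string {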
	splitted := strings.Split(repl, "|")
	list := r.FindStringSubmatch(s)
	if len(list) != len(splitted)+1 {
		return "not matched"
	}
	for i, m := range list[1:] {
		if m != "" {
			return r.ReplaceAllString(s, splitted[i])
		}
	}
	return "not matched"
}

Brian Candler

Sep 2, 2020, 4:23:40 PM
to golang-nuts
Can you define the problem more clearly?

It looks like you just want to replace a leading "9", "001" or "+1" with "1"?  In that case, you just match ^(9|001|[+]1) and replace it with 1
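
A minimal sketch of that suggestion, assuming the three sample inputs from the first post are the whole requirement. The pattern is the one given above; the helper name `normalize` and the variable `prefixRe` are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// prefixRe matches a leading "9", "001" or "+1".
var prefixRe = regexp.MustCompile(`^(9|001|[+]1)`)

// normalize (hypothetical name) replaces the matched prefix with "1".
// Input that does not start with one of the prefixes is returned unchanged.
func normalize(s string) string {
	return prefixRe.ReplaceAllString(s, "1")
}

func main() {
	inputs := []string{"93881231212", "0013881231212", "+13881231212", "-13881231212"}
	for _, s := range inputs {
		fmt.Println(s, "->", normalize(s))
	}
}
```

Unlike the original regexp, this version does not check that "9" is followed by exactly ten digits, so it is only equivalent if that check is not needed.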

If that's not what you want, then what are the exact rules for how the input needs to be changed?

Amnon

Sep 2, 2020, 5:45:51 PM
to golang-nuts
There are various projects that combine multiple regexps into a single DFA.
Have a look at https://github.com/proebsting/re

Emilius Omeen

Sep 3, 2020, 5:15:29 AM
to golang-nuts
Package "regexp" has no function that allows a separate replacement per group; it only has ReplaceAllString:

// ReplaceAllString returns a copy of src, replacing matches of the Regexp
// with the replacement string repl. Inside repl, $ signs are interpreted as
// in Expand, so for instance $1 represents the text of the first submatch.
func (re *Regexp) ReplaceAllString(src, repl string) string {
	n := 2
	if strings.Contains(repl, "$") {
		n = 2 * (re.numSubexp + 1)
	}
	b := re.replaceAll(nil, src, n, func(dst []byte, match []int) []byte {
		return re.expand(dst, repl, nil, src, match)
	})
	return string(b)
}


I am looking for a way to do a replace with an additional option such as `1$1|$2|$3`.



On Wednesday, September 2, 2020 at 23:23:40 UTC+3, Brian Candler wrote:

Brian Candler

Sep 3, 2020, 6:13:59 AM
to golang-nuts
On Thursday, 3 September 2020 10:15:29 UTC+1, Emilius Omeen wrote:
Package "regexp" has no function that allows a separate replacement per group; it only has ReplaceAllString


If I just take the examples you posted at the start, I can do that with a single ReplaceAllString:

I think the original problem definition doesn't explain why it needs more than this - maybe you need some more test cases.

You might find Expand / ExpandString or ReplaceAllStringFunc useful.
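
As a rough sketch of the ReplaceAllStringFunc route: the callback receives the matched text and decides what to return, so each alternative can get its own rewrite. The pattern below is the original one with the capture groups dropped; the names `numRe` and `rewriteNumber` are invented for this example, and the rules (9 -> 1, strip 00, strip +) are read off the sample inputs.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// numRe is the original pattern without capture groups.
var numRe = regexp.MustCompile(`^9\d{10}$|^00.*$|^\+.*`)

// rewriteNumber (hypothetical name) picks a rewrite based on the matched prefix.
// Input that does not match any alternative is returned unchanged.
func rewriteNumber(s string) string {
	return numRe.ReplaceAllStringFunc(s, func(m string) string {
		switch {
		case strings.HasPrefix(m, "00"):
			return m[2:] // drop the leading "00"
		case strings.HasPrefix(m, "+"):
			return m[1:] // drop the leading "+"
		default:
			return "1" + m[1:] // "9" followed by ten digits: swap 9 for 1
		}
	})
}

func main() {
	inputs := []string{"93881231212", "0013881231212", "+13881231212", "-13881231212"}
	for _, s := range inputs {
		fmt.Println(s, "->", rewriteNumber(s))
	}
}
```

The callback only sees the whole match, not the submatches, which is why the branch is detected with strings.HasPrefix here.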

Note that you can write your original regexp with capture groups to distinguish which branch matched:

r := regexp.MustCompile(`(^9(\d{10})$)|(^00(.*)$)|(^\+(.*))`)
                         ^  ^          ^   ^      ^   ^
                        $1 $2         $3  $4     $5  $6

Using ...Submatch or ...SubmatchIndex, you can identify which of the branches matched.  If your regexp matches the first alternative, $1 (and $2) will be set.  If it matches the second alternative, $3 (and $4) will be set.  And so on.
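
One way this could look, as a sketch rather than a definitive implementation: FindStringSubmatchIndex reports -1 for groups that did not take part in the match, so the outer group of each alternative tells you which branch matched, and ExpandString applies a template chosen for that branch. The names `branchRe`, `branches` and `replaceByBranch` and the per-branch templates are assumptions for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// branchRe is the six-group pattern shown above.
var branchRe = regexp.MustCompile(`(^9(\d{10})$)|(^00(.*)$)|(^\+(.*))`)

// branches pairs the outer group of each alternative with its template.
var branches = []struct {
	group int
	tmpl  string
}{
	{1, "1${2}"}, // 9 + ten digits -> 1 + digits
	{3, "${4}"},  // 00 + rest      -> rest
	{5, "${6}"},  // + and rest     -> rest
}

// replaceByBranch (hypothetical name) returns the rewritten string and
// whether any alternative matched.
func replaceByBranch(s string) (string, bool) {
	loc := branchRe.FindStringSubmatchIndex(s)
	if loc == nil {
		return s, false
	}
	for _, b := range branches {
		if loc[2*b.group] >= 0 { // this outer group took part in the match
			out := branchRe.ExpandString(nil, b.tmpl, s, loc)
			// Keep any text outside the matched span as it was.
			return s[:loc[0]] + string(out) + s[loc[1]:], true
		}
	}
	return s, false
}

func main() {
	inputs := []string{"93881231212", "0013881231212", "+13881231212", "-13881231212"}
	for _, s := range inputs {
		out, ok := replaceByBranch(s)
		fmt.Println(s, "->", out, ok)
	}
}
```

Checking the index instead of comparing the submatch with "" also handles a branch whose inner group matched an empty string.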

Emilius Omeen

Sep 3, 2020, 9:17:48 AM
to golang-nuts
Thank you.

On Thursday, September 3, 2020 at 13:13:59 UTC+3, Brian Candler wrote: