string matching


Sharan Guhan

Apr 1, 2021, 2:28:29 PM
to golang-nuts
Hi Experts,

I'm new to Go and finding it non-trivial to do the following efficiently :-) Any pointers will help.

I have a huge string like the one below. From it I want to extract the number "18" after "Core Count". I was thinking of walking through each line with strings.Split(s, "\n"), but that would be slower. I also tried strings.Index with "Core Count", but I can't see how to pull the 18 out from there.

Sharan


"Version: Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
Voltage: 1.6 V
External Clock: 100 MHz
Max Speed: 4000 MHz
Current Speed: 2300 MHz
Status: Populated, Enabled
Upgrade: Socket LGA3647-1
L1 Cache Handle: 0x004D
L2 Cache Handle: 0x004E
L3 Cache Handle: 0x004F
Serial Number: Not Specified
Asset Tag: UNKNOWN
Part Number: Not Specified
Core Count: 18"

Artur Vianna

Apr 1, 2021, 2:33:24 PM
to Sharan Guhan, golang-nuts
Try the regex: "Core Count: [0-9]+", then split on ":" and select the second part.
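
A minimal sketch of this suggestion, assuming the text is held in a Go string. The helper name coreCountSplit and the shortened sample input are made up for illustration:

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// coreCountRe matches the whole "Core Count: <digits>" field.
var coreCountRe = regexp.MustCompile(`Core Count: [0-9]+`)

// coreCountSplit (hypothetical helper) finds the first match, splits it
// on ":" and converts the second part to an int.
func coreCountSplit(s string) (int, error) {
	m := coreCountRe.FindString(s)
	if m == "" {
		return 0, fmt.Errorf("no Core Count field found")
	}
	parts := strings.SplitN(m, ":", 2)
	// parts[1] still carries the space after the colon, so trim it.
	return strconv.Atoi(strings.TrimSpace(parts[1]))
}

func main() {
	sample := "Asset Tag: UNKNOWN\nPart Number: Not Specified\nCore Count: 18"
	n, err := coreCountSplit(sample)
	fmt.Println(n, err) // prints: 18 <nil>
}
```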


Matt KØDVB

Apr 1, 2021, 3:02:29 PM
to Artur Vianna, Sharan Guhan, golang-nuts
Use a capture group:

    regex := regexp.MustCompile(`.*\bCore Count: (\d+)`)
    result := regex.FindAllStringSubmatch(raw, -1)
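
To get from the submatch result to actual numbers, something along these lines should work. This is only a sketch: allCoreCounts and the two-CPU sample are hypothetical, and the leading `.*` is dropped from the pattern because an unanchored search does not need it.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// coreRe captures the digits after "Core Count: " in group 1.
var coreRe = regexp.MustCompile(`\bCore Count: (\d+)`)

// allCoreCounts (hypothetical helper) returns every core count in raw.
func allCoreCounts(raw string) ([]int, error) {
	var counts []int
	for _, m := range coreRe.FindAllStringSubmatch(raw, -1) {
		// m[0] is the whole match, m[1] is the first capture group.
		n, err := strconv.Atoi(m[1])
		if err != nil {
			return nil, err
		}
		counts = append(counts, n)
	}
	return counts, nil
}

func main() {
	raw := "Part Number: Not Specified\nCore Count: 18\n" +
		"Part Number: Not Specified\nCore Count: 8"
	counts, err := allCoreCounts(raw)
	fmt.Println(counts, err) // prints: [18 8] <nil>
}
```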

Amnon

Apr 1, 2021, 3:10:18 PM
to golang-nuts
Or you could try something like https://play.golang.org/p/xkmchEgyVir

Michael Poole

Apr 1, 2021, 3:13:36 PM
to Sharan Guhan, golang-nuts
You were off to a good start with strings.Index(). The key is to move
past the pattern you searched for, assuming it was found.

Something like this should work, if `bigString` is what you are
searching over, and depending on whether "Core Count:" might be on the
first line or not:

    pattern := "\nCore Count: "
    if start := strings.Index(bigString, pattern); start >= 0 {
        var nCores int
        _, err := fmt.Sscanf(bigString[start+len(pattern):], "%d", &nCores)
        if err == nil {
            fmt.Println("Number of cores:", nCores)
        }
    }

As you can see, the regular expression-based solution suggested by
others leads to less code. This input string is so short that CPU
usage will be negligible for most purposes, outweighed by graceful
error handling and code maintenance concerns.
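
If the label can also appear on the first line, or more than once, the same strings.Index idea can be put in a loop. The following is a rough sketch under those assumptions; scanCoreCounts and the sample string are invented for illustration:

```go
package main

import (
	"fmt"
	"strings"
)

// scanCoreCounts (hypothetical helper) collects the number after every
// "Core Count: " label that starts a line, including the first line.
func scanCoreCounts(s string) []int {
	const label = "Core Count: "
	var counts []int
	pos := 0
	for {
		i := strings.Index(s[pos:], label)
		if i < 0 {
			return counts
		}
		at := pos + i
		pos = at + len(label) // move past the label before the next search
		// Accept the label only at the start of the input or of a line.
		if at > 0 && s[at-1] != '\n' {
			continue
		}
		var n int
		if _, err := fmt.Sscanf(s[pos:], "%d", &n); err == nil {
			counts = append(counts, n)
		}
	}
}

func main() {
	sample := "Core Count: 18\nVoltage: 1.6 V\nCore Count: 8"
	fmt.Println(scanCoreCounts(sample)) // prints: [18 8]
}
```

Checking the preceding byte instead of searching for "\nCore Count: " avoids building a copy of the input with a newline prepended.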

Best regards,
Michael

Sharan Guhan

Apr 1, 2021, 3:46:17 PM
to Michael Poole, golang-nuts
Thanks, everyone, for such good responses and code pointers!
They really helped. I have a long string with multiple such occurrences, and I need to keep filling them into an array. I consolidated all of your approaches and now have a working model, as below.


func ParseandFillCpuInfo(cmd_output string) {
    var per_cpu Cpus

    r := regexp.MustCompile(`Core Count: [0-9]+`)
    match := r.FindAllStringSubmatch(cmd_output, -1)

    // Walk every match; match[i][0] is the full matched text.
    for i := 0; i < len(match); i++ {
        output := strings.Split(match[i][0], ": ")
        fmt.Println("Parenthesis match =", output[1])

        // The pattern only matches digits, so Atoi can fail here only on overflow.
        per_cpu.num_cores, _ = strconv.Atoi(output[1])
        per_cpu.num_active_cores, _ = strconv.Atoi(output[1])

        CpuInfoDB.per_cpu_info = append(CpuInfoDB.per_cpu_info, per_cpu)
    } // for loop

    fmt.Println("CPU info below:")
    fmt.Println(CpuInfoDB)
}

wagner riffel

Apr 2, 2021, 1:13:37 PM
to Sharan Guhan, Michael Poole, golang-nuts
On Thu Apr 1, 2021 at 4:45 PM -03, Sharan Guhan wrote:
> r := regexp.MustCompile(`Core Count: [0-9]+`)
> match := r.FindAllStringSubmatch(cmd_output, -1)

As for the "efficiently" part, you should compile the regexp once
instead of on each call to ParseandFillCpuInfo; a common practice is to
use a package-level variable. Even better (for efficiency) would be to
stick with the strings.Index approach, like this:

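    count := -1
    const substr = "Core Count: "
    if i := strings.Index(input, substr); i >= 0 {
        countStart := i + len(substr)

        const digits = "0123456789"
        j := 0
        // Stop at the end of input as well as at the first non-digit.
        for countStart+j < len(input) && strings.ContainsRune(digits, rune(input[countStart+j])) {
            j++
        }
        if n, err := strconv.Atoi(input[countStart : countStart+j]); err == nil {
            count = n // plain assignment, so the outer count is not shadowed
        } // else: handle err
    }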

You can benchmark both approaches using testing.B, have fun.

https://golang.org/pkg/testing/#B
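
One possible way to compare the two, sketched as a standalone program that calls testing.Benchmark directly. The function names countRegexp and countIndex and the synthetic input are assumptions for illustration, not code from this thread:

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
	"testing"
)

// Compiled once at package level rather than on every call.
var coreCountRe = regexp.MustCompile(`Core Count: ([0-9]+)`)

// countRegexp returns the first core count found via the regexp, or -1.
func countRegexp(input string) int {
	m := coreCountRe.FindStringSubmatch(input)
	if m == nil {
		return -1
	}
	n, err := strconv.Atoi(m[1])
	if err != nil {
		return -1
	}
	return n
}

// countIndex does the same with strings.Index and a manual digit scan.
func countIndex(input string) int {
	const substr = "Core Count: "
	i := strings.Index(input, substr)
	if i < 0 {
		return -1
	}
	start := i + len(substr)
	end := start
	for end < len(input) && input[end] >= '0' && input[end] <= '9' {
		end++
	}
	n, err := strconv.Atoi(input[start:end])
	if err != nil {
		return -1
	}
	return n
}

func main() {
	// Synthetic input: filler lines followed by the field of interest.
	input := strings.Repeat("Voltage: 1.6 V\n", 1000) + "Core Count: 18"

	for _, bm := range []struct {
		name string
		fn   func(string) int
	}{
		{"regexp", countRegexp},
		{"index", countIndex},
	} {
		res := testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				bm.fn(input)
			}
		})
		fmt.Println(bm.name, res)
	}
}
```

In a real package the more usual form is a pair of BenchmarkXxx functions in a _test.go file, run with `go test -bench=.`; the numbers will depend on the input size and machine.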
