This follows up on the previous discussion, this time looking for a concurrent solution.
I'm posting my solution. Any feedback and suggestions for improvement are welcome!
I need to write a program that reads STDIN and writes every line containing the search word "test" to STDOUT.
The input is a 100MB string.
On 8 cores, the concurrent version is about 2 times faster (3.81s vs 7.47s in the profile below).
Each goroutine works on 12.5MB. The worst case is when the search string is at the end of the input.
If one goroutine finds it, I could stop the other 7. That would help when the search string is in the middle of the input.
As far as I can see, the only way to interrupt bytes.Contains itself is to copy its source code and add a way to stop the search.
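One possible alternative that leaves bytes.Contains untouched is to call it on smaller blocks and check a shared stop flag between calls. The sketch below is only an illustration under assumptions: `containsCancellable`, `searchParallel` and the `blockSize` parameter are hypothetical names that are not part of my code, `find` is assumed to be non-empty, and `atomic.Bool` needs Go 1.19 or newer.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
	"sync/atomic"
)

// containsCancellable (hypothetical helper) scans b in blocks of blockSize
// bytes. Before each block it checks the shared stop flag, so it gives up
// early once another goroutine has already found a match.
func containsCancellable(b, find []byte, blockSize int, stop *atomic.Bool) bool {
	overlap := len(find) - 1 // enough to catch a match straddling two blocks
	for start := 0; start < len(b); start += blockSize {
		if stop.Load() {
			return false
		}
		end := start + blockSize + overlap
		if end > len(b) {
			end = len(b)
		}
		if bytes.Contains(b[start:end], find) {
			stop.Store(true) // tell the other goroutines to stop
			return true
		}
	}
	return false
}

// searchParallel (hypothetical helper) splits b into one chunk per worker
// and reports whether any chunk contains find. Assumes len(find) > 0,
// workers > 0 and blockSize > 0.
func searchParallel(b, find []byte, workers, blockSize int) bool {
	var stop atomic.Bool
	var wg sync.WaitGroup
	chunk := (len(b) + workers - 1) / workers // round up so no tail is lost
	overlap := len(find) - 1
	for i := 0; i < workers; i++ {
		start := i * chunk
		if start >= len(b) {
			break
		}
		end := start + chunk + overlap
		if end > len(b) {
			end = len(b)
		}
		wg.Add(1)
		go func(part []byte) {
			defer wg.Done()
			containsCancellable(part, find, blockSize, &stop)
		}(b[start:end])
	}
	wg.Wait()
	return stop.Load() // the flag is only set when a match was found
}

func main() {
	data := bytes.Repeat([]byte("a"), 1<<20)
	copy(data[len(data)/2:], "test") // put the match in the middle
	fmt.Println(searchParallel(data, []byte("test"), 8, 64<<10)) // prints true
}
```

A smaller blockSize makes the other goroutines stop sooner but adds more calls to bytes.Contains, so the value would have to be tuned with a benchmark.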
------------------------------------------------
Machine: Mac mini M1, 16 GB:
------------------------------------------------
7.47s BigGrepBytes
3.81s BigGrepBytes1_Concurrent
------------------------------------------------
Here are the benchmarks:
---
Type: cpu
Time: Jun 13, 2022 at 10:34am (PDT)
Duration: 124.30s, Total samples = 207.43s (166.88%)
Active filters:
focus=Benchmark
hide=benchm
show=Benchmark
 flat   flat%   sum%     cum    cum%
7.47s   3.60%  21.86%   7.47s   3.60%  Benchmark_100MB_End__20charsBigGrepBytes_
3.81s   1.84%  34.38%   3.81s   1.84%  Benchmark_100MB_End__20charsBigGrepBytes1_Concurrent
---
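Non-concurrent version:
---
import (
	"bytes"
	"io"
)

// BigGrepBytes reads all of r into memory and writes the whole input to w
// if it contains find; otherwise it writes " \n".
func BigGrepBytes(r io.Reader, w io.Writer, find []byte) {
	var b bytes.Buffer
	_, _ = b.ReadFrom(r) // read error ignored for brevity
	if bytes.Contains(b.Bytes(), find) {
		w.Write(b.Bytes())
	} else {
		w.Write([]byte(" \n")) // placeholder output when nothing matches
	}
}
---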
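Concurrent version:
---
// BigGrepBytesCh1 splits the input into one chunk per core and searches the
// chunks in parallel. It writes the whole input to w if any chunk contains find.
func BigGrepBytesCh1(r io.Reader, w io.Writer, find []byte, cores int) {
	ch := make(chan bool, cores) // buffered, so no goroutine blocks after the early break
	overlap := len(find)         // lets a match that straddles a chunk boundary be found
	var b bytes.Buffer
	_, _ = b.ReadFrom(r) // read error ignored for brevity
	data := b.Bytes()
	filelen := len(data)
	chunksize := filelen / cores
	for i := 0; i < cores; i++ {
		start := i * chunksize
		end := start + chunksize + overlap
		if i == cores-1 || end > filelen {
			end = filelen // the last chunk also covers the remainder of filelen / cores
		}
		go BytesContainsCh1(data, start, end, find, ch)
	}
	found := false
	for i := 0; i < cores; i++ {
		if <-ch {
			found = true
			w.Write(data)
			break
		}
	}
	if !found {
		w.Write([]byte(" \n")) // placeholder output when nothing matches
	}
}

// BytesContainsCh1 reports on ch whether b[start:end] contains find.
func BytesContainsCh1(b []byte, start int, end int, find []byte, ch chan bool) {
	ch <- bytes.Contains(b[start:end], find)
}
---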
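To make the chunking easier to check, here is a minimal sketch of the boundary calculation on its own. `chunkBounds` is a hypothetical helper that is not part of my code; it assumes parts > 0. It extends each chunk by `overlap` bytes so a match straddling a boundary is still seen, and it runs the last chunk to the end of the input, because the integer division can otherwise leave a few trailing bytes unsearched.

```go
package main

import "fmt"

// chunkBounds (hypothetical helper) splits n bytes into parts chunks and
// returns [start, end) pairs. Every chunk is extended by overlap bytes, and
// the last chunk always ends at n so the division remainder is covered.
func chunkBounds(n, parts, overlap int) [][2]int {
	bounds := make([][2]int, 0, parts)
	size := n / parts
	for i := 0; i < parts; i++ {
		start := i * size
		end := start + size + overlap
		if i == parts-1 || end > n {
			end = n
		}
		bounds = append(bounds, [2]int{start, end})
	}
	return bounds
}

func main() {
	// 10 bytes, 3 chunks, 1 byte of overlap.
	fmt.Println(chunkBounds(10, 3, 1)) // prints [[0 4] [3 7] [6 10]]
}
```

An overlap of len(find)-1 bytes is enough to catch a match that crosses a boundary; the len(find) used in the concurrent version above is also safe, just one byte more than needed.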