Hi Go Team,
Good Morning.
I stumbled upon this performance difference when i was trying to use 'go' for a simple program that i had originally written in C. It appears that 'go' in this scenario is extremely slow in comparison with my C implementation.
$go version
go version go1.2 linux/amd64
Here is a simple re-creation procedure.
when we continuously read from the console using fmt.Scanf the performance is considerably slower in comparison to its 'C' counter part. This is visible especially for very large input set.
Here is a simple program in C that prints numbers until 'n'
$ cat pnum.c
#include <stdio.h>
#include <string.h>
int main (int argc, char **argv)
{
int i,n;
if (argc <=1)
{
n=10;
}
else
{
n=atoi(argv[1]);
}
for (i=0; i<n; i++)
printf("%d ",i);
}
so it runs like below
$pnum 20
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
A go program that will Scanf from console for 'n' times.
$cat scanf.go
package main
import (
"fmt"
"os"
"strconv"
)
func main () {
var i,n,tmp int
if len(os.Args) <= 1 {
n = 10
} else {
n,_ = strconv.Atoi(os.Args[1])
}
for i=0; i<n; i++ {
fmt.Scanf ("%d", &tmp)
}
fmt.Printf("last num =%d\n", tmp)
}
its 'C' counter part is below
$cat scanfc.c
#include <stdio.h>
#include <string.h>
int main (int argc, char **argv)
{
int tmp,i, n;
if (argc <= 1)
{
n = 10;
}
else
{
n = atoi(argv[1]);
}
for (i=0; i<n; i++)
scanf("%d", &tmp);
printf("last num = %d\n", tmp);
return 0;
}
Output of go program when we Scanf 100 times, 1000 times, etc until 1 Million times
output of Go program
$for i in 100 1000 10000 100000 1000000^Jdo^Jecho "n="${i}^Jpnum ${i} | time scanf ${i}^Jdone
n=100
last num =99
real 0m0.00s
user 0m0.00s
sys 0m0.01s
n=1000
last num =999
real 0m0.01s
user 0m0.01s
sys 0m0.00s
n=10000
last num =9999
real 0m0.06s
user 0m0.04s
sys 0m0.02s
n=100000
last num =99999
real 0m0.68s
user 0m0.39s
sys 0m0.30s
n=1000000
last num =999999
real 0m7.64s
user 0m4.22s
sys 0m3.57s
output of C program
$for i in 100 1000 10000 100000 1000000^Jdo^Jecho "n="${i}^Jpnum ${i} | time scanfc ${i}^Jdone
n=100
last num = 99
real 0m0.00s
user 0m0.00s
sys 0m0.00s
n=1000
last num = 999
real 0m0.00s
user 0m0.00s
sys 0m0.00s
n=10000
last num = 9999
real 0m0.01s
user 0m0.01s
sys 0m0.00s
n=100000
last num = 99999
real 0m0.02s
user 0m0.04s
sys 0m0.00s
n=1000000
last num = 999999
real 0m0.17s
user 0m0.33s
sys 0m0.01s
Difference is 7 secs vs 0.17 secs. Is there a way to improve this?
If i added some profiling code to scanf.go like below
$diff scanf.go scanf_prof.go
7a8
> "runtime/pprof"
14a16,20
>
> f, err := os.Create("scanf.prof")
> if err != nil {
> fmt.Printf ("Error: Profiling %v\n", err)
> }
20a27,28
> pprof.StartCPUProfile(f)
> defer pprof.StopCPUProfile ()
then the top10 shows syscall.Syscall being called maximum number of times.
$go tool pprof scanf_prof scanf.prof
Welcome to pprof! For help, type 'help'.
(pprof) top10
Total: 669 samples
319 47.7% 47.7% 352 52.6% syscall.Syscall
35 5.2% 52.9% 35 5.2% runtime.stringiter2
27 4.0% 57.0% 34 5.1% runtime.entersyscall
22 3.3% 60.2% 57 8.5% fmt.indexRune
16 2.4% 62.6% 168 25.1% fmt.(*ss).consume
15 2.2% 64.9% 64 9.6% io.ReadAtLeast
12 1.8% 66.7% 14 2.1% itab
12 1.8% 68.5% 12 1.8% runtime.atomicstore
11 1.6% 70.1% 81 12.1% fmt.(*readRune).readByte
11 1.6% 71.7% 102 15.2% fmt.(*ss).getRune
(pprof) quit
Does it mean that every single fmt.Scanf go runtime goes to 'syscall.Syscall' ? will it hold true for frequently used functions like reading from a socket a million times or reading from a file a million times?
Is this expected? is there a way to improve it?
Thanks in Advance,
Dhilip