I'm not really sure what you are trying to demonstrate. You appear to be comparing apples, oranges, and Yugos.
I was experimenting with buffer management and ran across some interesting performance numbers. I wanted to see how much overhead the append functionality adds, so I wrote several variations of the test. My analysis indicates that for complex buffers with lots of individual items, the single-item append will be 2 to 3 times slower. This could be important, but if you happen to be using fmt.Sprintf() to generate the content then it will not matter, since Sprintf() is 80 times slower still. Comments or improvements welcomed.

A) Append a string to a slice 1 million times.
B) Save a string to a slice using index assignments 1 million times. Manually grows the slice by a big chunk when you run out.
C) Save a string to a slice using index assignments 1 million times, using an array already big enough to contain the result.
D) Save the string using the manual-grow slice, but generate the strings with a simple call to Sprintf().
E) Generate the same number of similar-sized strings with Sprintf(), but do not save them.

Representative results were as shown:

A = 92ms,  B = 31ms, C = 30ms, D = 2,412ms, E = 2,446ms
A = 115ms, B = 38ms, C = 31ms, D = 2,384ms, E = 2,332ms
A = 80ms,  B = 28ms, C = 30ms, D = 2,326ms, E = 2,670ms

Sample code:

package main

import "fmt"
import . "time"
import "runtime"

func now_ms() float64 {
	nn := Now()
	return float64(nn.UnixNano()) / float64(Millisecond)
}

func elap_ms(msg string, beg_time float64, end_time float64) float64 {
	elap := end_time - beg_time
	fmt.Printf("ELAP %s = %12.3f ms\n", msg, elap)
	return elap
}

func main() {
	const numLoop = 1000000
	const numCycle = 10
	const DefaultSBSize = 500

	fmt.Printf("Number of Appends for this test run=%d\n", numLoop)

	// GC calls added so the next test is not spending
	// time cleaning up the prior test's garbage.
	for xxk := 0; xxk < 5; xxk++ {
		//
		// Demonstrate the simple case: append numLoop
		// times to an initially empty slice.
		runtime.GC()
		beg_time := now_ms()
		b := make([]string, DefaultSBSize)
		for ii := 0; ii < numLoop; ii++ {
			b = append(b, "I am a simple string 198282\n")
		}
		elap_ms("build with simple Append", beg_time, now_ms())

		beg_time = now_ms()
		b = nil
		runtime.GC()
		elap_ms("  post test GC", beg_time, now_ms())

		//
		// Demonstrate manual array pre-allocation. Each time
		// we run out, manually allocate several times the
		// existing storage size. May be too aggressive: we do
		// not want to pay to allocate millions up front, but
		// we also want to minimize the alloc and copy
		// operations.
		runtime.GC()
		beg_time = now_ms()
		max := DefaultSBSize
		b = make([]string, DefaultSBSize)
		for i := 0; i < numLoop; i++ {
			if i >= max {
				if max > 500000 {
					max = max * 2
				} else {
					max = max * 8
				}
				t := make([]string, max)
				copy(t, b)
				b = t
			}
			b[i] = "I am a simple string 198282\n"
		}
		elap_ms("\n\nbuild with simple pre-grow array", beg_time, now_ms())

		beg_time = now_ms()
		for i := 0; i < numLoop; i++ {
			b[i] = ""
		}
		elap_ms("  Clear elements to nil but leave array", beg_time, now_ms())

		beg_time = now_ms()
		runtime.GC()
		elap_ms("  post test GC", beg_time, now_ms())

		//
		// Demonstrate manual array pre-allocation again, but
		// this time re-use the array pre-allocated last time.
		runtime.GC()
		beg_time = now_ms()
		max = DefaultSBSize
		for i := 0; i < numLoop; i++ {
			if i >= max {
				if max > 500000 {
					max = max * 2
				} else {
					max = max * 8
				}
				t := make([]string, max)
				copy(t, b)
				b = t
			}
			b[i] = "I am a simple string 198282\n"
		}
		elap_ms("\n\nbuild but re-use prior array", beg_time, now_ms())

		b = nil
		beg_time = now_ms()
		runtime.GC()
		elap_ms("  post test GC", beg_time, now_ms())

		//
		// Demonstrate how slow fmt.Sprintf() runs for the
		// same loop with no append overhead.
		runtime.GC()
		beg_time = now_ms()
		for i := 0; i < numLoop; i++ {
			fmt.Sprintf("I am a simple string %d", i)
		}
		elap_ms("\n\nRun fmt.Sprintf() the same number of times ", beg_time, now_ms())

		beg_time = now_ms()
		runtime.GC()
		elap_ms("  post test GC", beg_time, now_ms())

		//
		// Demonstrate the really slow fmt.Sprintf(). We
		// re-use the fast manual grow, but this time we use
		// fmt.Sprintf() to add the index to the string being
		// stored.
		runtime.GC()
		beg_time = now_ms()
		max = DefaultSBSize
		b = make([]string, DefaultSBSize)
		for i := 0; i < numLoop; i++ {
			if i >= max {
				max = max * 5
				t := make([]string, max)
				copy(t, b)
				b = t
			}
			b[i] = fmt.Sprintf("I am a simple string %d", i)
		}
		elap_ms("\n\nbuild with pre-grow array with Sprintf()", beg_time, now_ms())

		// Time converting the array to an output string.
		//beg_time = now_ms()
		//tout = strings.Join(b[0:i], "")
		//elap_ms("  convert array to single str", beg_time, now_ms())
		//fmt.Printf("length of string=%d\n", len(tout))
		//tout = ""

		b = nil
		beg_time = now_ms()
		runtime.GC()
		elap_ms("  post test GC", beg_time, now_ms())
		fmt.Printf("\n\n***********\n")
	}
}

Sample output:

***********
ELAP build with simple Append = 80.004 ms
ELAP   post test GC = 227.013 ms
ELAP build with simple pre-grow array = 33.002 ms
ELAP   Clear elements to nil but leave array = 16.001 ms
ELAP   post test GC = 265.015 ms
ELAP build but re-use prior array = 30.002 ms
ELAP   post test GC = 223.013 ms
ELAP Run fmt.Sprintf() the same number of times = 2371.136 ms
ELAP   post test GC = 378.022 ms
ELAP build with pre-grow array with Sprintf() = 2412.138 ms
ELAP   post test GC = 339.020 ms
Good catch. I changed the slice construction to use b := make([]string, 0, DefaultSBSize). Unfortunately it had no observed impact, or the changes were within the statistical variation on this system. The original 500 was wasted effort, but since it took so little time it had no meaningful impact. The original point stands.
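For anyone following along, a minimal sketch of the difference being discussed (function names are mine, not from the test above): make([]string, n) creates a slice whose *length* is already n, so appends land after n zero-value strings, while make([]string, 0, n) creates an empty slice with *capacity* n, so appends fill the pre-allocated block.

```go
package main

import "fmt"

// buildWithLen starts with length n, so append adds entries
// *after* n zero-value strings -- the bug caught above.
func buildWithLen(n int) []string {
	b := make([]string, n)
	for i := 0; i < n; i++ {
		b = append(b, "x")
	}
	return b // length is 2n
}

// buildWithCap starts with length 0 and capacity n, so append
// fills the pre-allocated block without reallocating.
func buildWithCap(n int) []string {
	b := make([]string, 0, n)
	for i := 0; i < n; i++ {
		b = append(b, "x")
	}
	return b // length is n
}

func main() {
	fmt.Println(len(buildWithLen(500)), len(buildWithCap(500)))
	// prints: 1000 500
}
```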
By the way, I have read and re-read http://golang.org/doc/effective_go.html and it has been quite helpful. I will look at moving over to the testing framework described at http://golang.org/doc/code.html#Testing, but I believe my current numbers are sufficiently accurate for this test.
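For reference, a rough sketch of what two of the comparisons might look like under the standard benchmark harness (the function names are mine; in a real project these would be Benchmark* functions in a _test.go file run with "go test -bench=.", but testing.Benchmark lets the same sketch run standalone). The harness chooses b.N itself, which replaces the hand-rolled millisecond timing:

```go
package main

import (
	"fmt"
	"testing"
)

const s = "I am a simple string 198282\n"

// benchAppend grows the slice via append, letting the runtime
// decide when and how much to reallocate.
func benchAppend(b *testing.B) {
	var buf []string
	for i := 0; i < b.N; i++ {
		buf = append(buf, s)
	}
	_ = buf
}

// benchIndexed assigns into a slice pre-sized to the final
// length, so no reallocation happens during the loop.
func benchIndexed(b *testing.B) {
	buf := make([]string, b.N)
	for i := 0; i < b.N; i++ {
		buf[i] = s
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of
	// "go test" and reports iterations and ns/op.
	fmt.Println("append: ", testing.Benchmark(benchAppend))
	fmt.Println("indexed:", testing.Benchmark(benchIndexed))
}
```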
My tests show that Go is fast, but not all that much faster than other languages. My area of focus is heavy-duty information mining and predictive analytics. I never believe FUD and always write empirical tests representative of my applications. I was simply trying to share some findings I thought others could use.
The append() function is heavily hyped, and multiple Go experts have recently told me that it is fast. Well, fast is a relative term. I recently fixed a system for a major company where a 300% improvement in speed was enough to move them from unhappy customers to very happy customers. Net net, a 300% speed delta between append() and manually indexed buffers could be important. Many people claim they adopted Go for speed reasons. If that is true, then they should care about a 300% performance delta between similar techniques commonly used for a common task.
It is one thing to wait and hot-spot the code; it is even better to run enough micro tests to know what to avoid before you have to substantially refactor a complex system because it fails to perform. I have spent a good portion of my career fixing distributed performance problems in complex systems. My customers are generally on an emergency footing. Their root-cause problems are varied, but they are almost always caused by engineers who failed to smoke-test the code as it was written. It is always horribly expensive, because they generally call me after they have missed their launch date, or after they are losing revenue and their customers are complaining.
Accumulating a collection or buffer from many small bits of input is a common task. I find it important to know which techniques are likely to be hot spots. One of our known hot spots is exactly this kind of problem. I have seen problems in this area at enough customer sites to know that it is an area where performance problems multiply. This test shows the new append method is far slower than managing pre-allocated buffers. (The same is true in C.) The code difference is trivial, but it could be expensive to ferret out all the places to change later. I am not trying to say the new append() function should never be used, as there is a lot of code which is not performance sensitive. I am saying that it should either be fixed or avoided in areas where you know performance counts.
I find that stress tests like this are essential when designing high-performance systems. I need to know exactly how much of a speed penalty is paid when using different parts of the system. As part of this testing I also found that Sprintf() should be avoided where performance matters. (GO CORE TEAM PLEASE FIX.) I think Python and Node.js would beat you in this area right now.
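A common workaround when Sprintf() dominates a profile is plain concatenation with strconv, which skips fmt's format-string parsing and interface{} boxing. A minimal sketch (function names are mine) producing the same string as the test's Sprintf call:

```go
package main

import (
	"fmt"
	"strconv"
)

// formatSprintf is the slow path measured in the test above:
// fmt.Sprintf parses the format string and boxes i at runtime.
func formatSprintf(i int) string {
	return fmt.Sprintf("I am a simple string %d", i)
}

// formatConcat builds the identical string with strconv.Itoa
// and string concatenation, avoiding fmt entirely.
func formatConcat(i int) string {
	return "I am a simple string " + strconv.Itoa(i)
}

func main() {
	fmt.Println(formatSprintf(42) == formatConcat(42))
	// prints: true
}
```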
Some of my tests show that Go was barely twice as fast as Python for complex jobs. If you throw away 300% in performance in a critical hot area, then you may be running slower than the equivalent Python code. In one of my tests we parsed about 1.5 million rows of data from a 320 MB file. Go was initially the slowest performer of the bunch, behind F#, Python, and Lua by over 200%. It turned out that the problem was in f := bufio.NewReader(fi); head_str, _ := f.ReadString('\n'). The f.ReadString() is a performance pig. When I replaced it with a manually written method which reads the bytes in larger chunks and then breaks them into arrays of strings, Go moved from last place to second place behind F#. Knowing this kind of performance information is the difference between Go being viable or being thrown out as a toy.