We wrote the simplest possible TCP server (with minor logging) to examine its memory footprint (see server.go below).
The server simply accepts connections and does nothing else. It runs on an Ubuntu 12.04.4 LTS server (kernel 3.2.0-61-generic) with Go version go1.3 linux/amd64.
The attached benchmarking program (pulse.go) creates, in this example, 10k connections, disconnects them after 30 seconds, repeats this cycle three times, and then continuously repeats small pulses of 1k connections/disconnections. The command used to test was ./pulse -big=10000 -bs=30.
The first graph was obtained by recording runtime.ReadMemStats whenever the number of clients crossed a multiple of 500; the second graph shows the RES memory size reported by “top” for the server process.
The server starts with a negligible 1.6KB of memory. The “big” pulses of 10k connections then push memory to ~60MB (as seen by top), or to about 16MB of “System Memory” as reported by ReadMemStats. As expected, when the 10k pulses end, the in-use memory drops, and eventually the program starts releasing memory back to the OS, as shown by the grey “Released Memory” line.
The problem is that the System Memory (and correspondingly, the RES memory seen by “top”) never drops significantly (although it drops a little as seen in the second graph).
We would expect that after the 10k pulses end, memory would continue to be released until RES reaches the minimum needed to handle each 1k pulse (8MB RES as seen by “top” and 2MB in-use as reported by runtime.ReadMemStats). Instead, RES stays at about 56MB and in-use never drops from its peak of 60MB.
We want to ensure scalability for irregular traffic with occasional spikes, and to be able to run multiple servers on the same box that spike at different times. Is there a way to effectively ensure that as much memory as possible is released back to the system within a reasonable time frame?
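The only knob we are aware of is runtime/debug.FreeOSMemory, which forces a garbage collection and immediately returns as much idle heap to the OS as possible instead of waiting for the background scavenger. A minimal sketch (the one-minute interval is arbitrary, and as the replies below explain, this cannot help when the memory is held by goroutine stacks):

package main

import (
	"runtime/debug"
	"time"
)

// Sketch: a background loop that forces a GC and asks the runtime to
// return as much memory to the OS as it can. This releases idle heap
// spans sooner than the background scavenger would, but it does not
// release goroutine stacks on go1.3.
func main() {
	go func() {
		for {
			time.Sleep(time.Minute)
			debug.FreeOSMemory()
		}
	}()
	select {} // stand-in for the real server loop
}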
First graph: http://i.imgur.com/PD4A0q6.png
Second graph: http://i.imgur.com/78QKW0a.png
Code: https://gist.github.com/eugene-bulkin/e8d690b4db144f468bc5
server.go:
package main

import (
	"log"
	"net"
	"runtime"
	"sync"
)

var m sync.Mutex
var num_clients = 0
var cycle = 0

func printMem() {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	log.Printf("Cycle #%3d: %5d clients | System: %8d Inuse: %8d Released: %8d Objects: %6d\n",
		cycle, num_clients, ms.HeapSys, ms.HeapInuse, ms.HeapReleased, ms.HeapObjects)
}

func handleConnection(conn net.Conn) {
	//log.Println("Accepted connection:", conn.RemoteAddr())
	m.Lock()
	num_clients++
	if num_clients%500 == 0 {
		printMem()
	}
	m.Unlock()
	// Block reading until the client disconnects; the server never replies.
	buffer := make([]byte, 256)
	for {
		_, err := conn.Read(buffer)
		if err != nil {
			//log.Println("Lost connection:", conn.RemoteAddr())
			err := conn.Close()
			if err != nil {
				log.Println("Connection close error:", err)
			}
			m.Lock()
			num_clients--
			if num_clients%500 == 0 {
				printMem()
			}
			if num_clients == 0 {
				cycle++
			}
			m.Unlock()
			break
		}
	}
}

func main() {
	printMem()
	cycle++
	listener, err := net.Listen("tcp", ":3033")
	if err != nil {
		log.Fatal("Could not listen.")
	}
	for {
		conn, err := listener.Accept()
		if err != nil {
			log.Println("Could not accept client:", err)
			continue
		}
		go handleConnection(conn)
	}
}

pulse.go:
package main

import (
	"flag"
	"log"
	"net"
	"sync"
	"time"
)

var (
	numBig   = flag.Int("big", 4000, "Number of connections in big pulse")
	bigIters = flag.Int("i", 3, "Number of iterations of big pulse")
	bigSep   = flag.Int("bs", 5, "Number of seconds between big pulses")
	numSmall = flag.Int("small", 1000, "Number of connections in small pulse")
	smallSep = flag.Int("ss", 20, "Number of seconds between small pulses")
	linger   = flag.Int("l", 4, "How long connections should linger before being disconnected")
)

var m sync.Mutex
var active_conns = 0
var connections = make(map[net.Conn]bool)

func pulse(n int, linger int) {
	var wg sync.WaitGroup
	log.Printf("Connecting %d client(s)...\n", n)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			m.Lock()
			defer m.Unlock()
			defer wg.Done()
			active_conns++
			conn, err := net.Dial("tcp", ":3033")
			if err != nil {
				log.Panicln("Unable to connect: ", err)
			}
			connections[conn] = true
		}()
	}
	wg.Wait()
	if len(connections) != n {
		log.Fatalf("Unable to connect all %d client(s).\n", n)
	}
	log.Printf("Connected %d client(s).\n", n)
	time.Sleep(time.Duration(linger) * time.Second)
	for conn := range connections {
		active_conns--
		if err := conn.Close(); err != nil {
			log.Panicln("Unable to close connection:", err)
		}
		delete(connections, conn) // deleting during range is safe in Go
	}
	if len(connections) > 0 {
		log.Fatalf("Unable to disconnect all %d client(s) [%d remain].\n", n, len(connections))
	}
	log.Printf("Disconnected %d client(s).\n", n)
}

func main() {
	flag.Parse()
	// A few large pulses, then an endless series of small ones.
	for i := 0; i < *bigIters; i++ {
		pulse(*numBig, *linger)
		time.Sleep(time.Duration(*bigSep) * time.Second)
	}
	for {
		pulse(*numSmall, *linger)
		time.Sleep(time.Duration(*smallSep) * time.Second)
	}
}
On Jun 25, 2014 2:59 AM, "Vincent Callanan" <vin...@callanan.ie> wrote:
>
> Sorry, I meant "Will we see [a fix] in 1.3.1?" (not 1.4)
No. A change of this type will not go into a point release.
Ian
>>>>>> Second graph http://i.imgur.com/78QKW0a.png :
>>>>>
>>>>> I don't think there is a solution today.
>>>>> Most of the memory seems to be occupied by goroutine stacks, and we don't release that memory to the OS.
>>>>> It will be somewhat better in the next release.
>>>>>
We now (as of 1.4) release goroutine stack memory to the OS. So your example should behave much better. Please try 1.4.2 and tip if you can, and open a new bug if you're still seeing problems. There are two known caveats which you might run into:

1) There is a kernel bug where freed pages aren't actually freed because of huge page support; see https://github.com/golang/go/issues/8832 . There is a workaround checked in for 1.5.

2) Stacks are freed, but not G structures; see https://github.com/golang/go/issues/8832 . We might get to this in 1.6.
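A quick way to check whether you're hitting caveat 1 is to compare what the runtime believes it has returned (MemStats.HeapReleased) against the RSS the kernel reports. A Linux-only sketch: the VmRSS field is standard procfs, everything else is illustrative. If HeapReleased climbs while VmRSS stays flat, huge pages are suspect:

package main

import (
	"bufio"
	"fmt"
	"os"
	"runtime"
	"strings"
)

// Sketch: compare the runtime's view of released memory with the
// kernel's view of resident set size.
func main() {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)

	f, err := os.Open("/proc/self/status")
	if err != nil {
		fmt.Println("open /proc/self/status:", err)
		return
	}
	defer f.Close()

	s := bufio.NewScanner(f)
	for s.Scan() {
		if strings.HasPrefix(s.Text(), "VmRSS:") {
			fmt.Println(s.Text()) // e.g. "VmRSS:   56000 kB"
		}
	}
	fmt.Printf("HeapSys: %d  HeapInuse: %d  HeapReleased: %d\n",
		ms.HeapSys, ms.HeapInuse, ms.HeapReleased)
}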
On Thursday, June 25, 2015 at 3:47:15 AM UTC+1, keith....@gmail.com wrote:
> We now (as of 1.4) release goroutine stack memory to the OS. So your example should behave much better. [snip]

1) I'm seeing much better behavior using 1.4.2 (vs 1.4) with respect to process memory size shrinking after a load test. Is this likely due to goroutine stack memory being released to the OS?
2) Is there one G structure instantiated per goroutine, or is the structure shared by all goroutine instances? Does the overhead (structure size) vary per goroutine? (I'll do some digging; appreciate any pointers.)
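As far as I can tell there is one G struct per goroutine (that is what caveat 2 above describes being retained). A crude sketch to estimate the combined per-goroutine cost (stack plus runtime bookkeeping) is to park many goroutines and diff MemStats.Sys; it doesn't separate the G struct from the stack, and the number varies by Go version:

package main

import (
	"fmt"
	"runtime"
)

// Sketch: estimate per-goroutine memory by spawning n parked goroutines
// and measuring the growth in total memory obtained from the OS.
func main() {
	const n = 10000
	block := make(chan struct{})

	var before, after runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&before)

	for i := 0; i < n; i++ {
		go func() { <-block }() // parked goroutine, minimal stack use
	}

	runtime.GC()
	runtime.ReadMemStats(&after)
	fmt.Printf("%d goroutines, ~%d bytes each (Sys delta)\n",
		runtime.NumGoroutine(), (after.Sys-before.Sys)/n)
	close(block)
}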