I recently rewrote my open source database in Go and it's been a great experience. I ran into a profiling issue that I couldn't find anything on Google about. I'm importing data from the GitHub Archive into my database and while running profiling it's showing that 97.9% of the time is spent in runtime.mach_semaphore_timedwait:

blackdog2:sky benbjohnson$ go tool pprof http://localhost:8585/debug/pprof/profile
Gathering CPU profile from http://localhost:8585/debug/pprof/profile?seconds=30 for 30 seconds to
  /Users/benbjohnson/pprof/a.out.1364574658.localhost-port8585
Be patient...
Wrote profile to /Users/benbjohnson/pprof/a.out.1364574658.localhost-port8585
Welcome to pprof! For help, type 'help'.
(pprof) top10
Total: 1504 samples
    1473  97.9%  97.9%     1473  97.9% runtime.mach_semaphore_timedwait
      18   1.2%  99.1%       18   1.2% runtime.sigprocmask
      13   0.9% 100.0%       13   0.9% runtime.mach_semaphore_signal
       0   0.0% 100.0%       31   2.1% net.(*pollServer).Run
       0   0.0% 100.0%       13   0.9% net.(*pollServer).WakeFD
       0   0.0% 100.0%       18   1.2% net.(*pollster).WaitFD
       0   0.0% 100.0%     1473  97.9% runtime.MHeap_Scavenger
       0   0.0% 100.0%       13   0.9% runtime.chansend
       0   0.0% 100.0%       13   0.9% runtime.chansend1
       0   0.0% 100.0%       18   1.2% runtime.exitsyscall

The database demultiplexes the Go HTTP router through a single channel to grab a reference to the table and then it sends the actual processing of the insert of the data to a separate event loop inside a servlet.

Project code: https://github.com/skydb/sky/tree/go
HTTP handler for insert: https://github.com/skydb/sky/blob/go/skyd/server_event_handlers.go#L102

The server is running at about 110% CPU (on my dual-core MacBook Pro) and the importer is inserting one item at a time. I'm getting throughput of about 400 items/second which seems low.

What is causing the runtime.mach_semaphore_timedwait to occur? I was thinking that it could just be idle waiting for the importer in between inserts but the CPU usage seems really high for that.

Ben Johnson
Andy-

I tried pulling out the server level channel and it only slightly improved performance. I'll play around with it some more and see if I can figure it out. The locks are unfortunately necessary on insert. The query speed is really where I'm trying to optimize and that effectively uses no locks and is blazingly fast (10MM records/sec/core). I'm just hoping to bump up my insert performance. Something doesn't seem quite right with it.
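
For anyone following the thread, the insert path described in the original post (a single server-level channel to look up the table, then a hand-off to a servlet's event loop) can be sketched roughly as below. This is only an illustration and not the actual Sky code: the names Server, Servlet, Table, Event, tableRequest and insertRequest are hypothetical, and both the create-on-lookup behaviour and the hash-based choice of servlet are assumptions made to keep the example self-contained.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync/atomic"
)

// Hypothetical stand-ins for an inserted item and a table reference.
type Event struct{ ObjectID string }
type Table struct{ Name string }

// tableRequest asks the server loop for a table reference.
type tableRequest struct {
	name  string
	reply chan *Table
}

// insertRequest asks a servlet loop to store one event.
type insertRequest struct {
	table *Table
	event Event
	done  chan error
}

// Servlet processes inserts one at a time on its own goroutine.
type Servlet struct {
	count   int64 // kept first so atomic access is 64-bit aligned
	inserts chan insertRequest
}

func NewServlet() *Servlet {
	s := &Servlet{inserts: make(chan insertRequest)}
	go s.loop()
	return s
}

func (s *Servlet) loop() {
	for req := range s.inserts {
		// A real servlet would write req.event to storage here.
		atomic.AddInt64(&s.count, 1)
		req.done <- nil
	}
}

// Server funnels every table lookup through one channel.
type Server struct {
	lookups  chan tableRequest
	tables   map[string]*Table // touched only by the server goroutine
	servlets []*Servlet
}

func NewServer(servletCount int) *Server {
	s := &Server{
		lookups: make(chan tableRequest),
		tables:  make(map[string]*Table),
	}
	for i := 0; i < servletCount; i++ {
		s.servlets = append(s.servlets, NewServlet())
	}
	go s.loop()
	return s
}

func (s *Server) loop() {
	for req := range s.lookups {
		t, ok := s.tables[req.name]
		if !ok { // assumption: create the table on first use
			t = &Table{Name: req.name}
			s.tables[req.name] = t
		}
		req.reply <- t
	}
}

// Insert is what an HTTP handler would call for each item.
func (s *Server) Insert(tableName string, e Event) error {
	// Hop 1: the single server-level channel.
	reply := make(chan *Table, 1)
	s.lookups <- tableRequest{name: tableName, reply: reply}
	table := <-reply

	// Assumption: pick a servlet by hashing the object ID.
	h := fnv.New32a()
	h.Write([]byte(e.ObjectID))
	servlet := s.servlets[int(h.Sum32()%uint32(len(s.servlets)))]

	// Hop 2: the servlet's event loop.
	done := make(chan error, 1)
	servlet.inserts <- insertRequest{table: table, event: e, done: done}
	return <-done
}

// Total reports how many inserts all servlets have processed.
func (s *Server) Total() int64 {
	var total int64
	for _, sv := range s.servlets {
		total += atomic.LoadInt64(&sv.count)
	}
	return total
}

func main() {
	srv := NewServer(2)
	for i := 0; i < 5; i++ {
		e := Event{ObjectID: fmt.Sprintf("user-%d", i)}
		if err := srv.Insert("events", e); err != nil {
			fmt.Println("insert failed:", err)
		}
	}
	fmt.Println("inserted:", srv.Total())
}
```

In this sketch every insert blocks on two channel round trips, so a caller that sends one item at a time spends most of each request waiting on goroutine hand-offs rather than doing work. Whether that accounts for the numbers in the profile above is not something the sketch can show.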