14.4.0 server not loading correctly

Hugh Acland

Jan 3, 2015, 9:29:20 AM
to go...@googlegroups.com
Hi

I have successfully installed 14.4.0 on a Digital Ocean Ubuntu 14.04 box. Java is installed, and port 8153 is open and reachable.

The web interface at http://<my server>:8153/go just spins with nothing displayed.

I have tried this on two separate boxes, with the same result. Installed via dpkg using go-server-14.4.0-1536.deb. I can see the running JVM process in ps aux.

Anyone have any idea? Is this a broken build?

thanks in advance
Hugh

Hugh Acland

Jan 3, 2015, 11:49:30 AM
to go...@googlegroups.com
OK, so it finally loads. But it takes a very, very, very, very long time: around 15 minutes, on a box with 4 GB of RAM and 4 cores.

This is not going to be very user-friendly if it takes this long to load. What is going on?

Md. Ali Ejaz

Jan 5, 2015, 1:08:14 AM
to Hugh Acland, go...@googlegroups.com
Hi Hugh,

The performance of the Go server depends on many factors, such as the size of the database, the number of current pipelines, the number of agents pinging the server, the number of unique materials, etc.

Depending on your configuration, 4 GB might or might not be sufficient. The output of <go-server>/go/api/support could provide details about your configuration.

Without the output of that API, I cannot determine the likely reason for the slow server startup.
A detailed analysis would require profiling the server with YourKit and examining the snapshot for points of contention.

--
You received this message because you are subscribed to the Google Groups "go-cd" group.
To unsubscribe from this group and stop receiving emails from it, send an email to go-cd+un...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
- Ali
@mdaliejaz

Matthew Boedicker

Jan 21, 2015, 4:09:13 AM
to Md. Ali Ejaz, Hugh Acland, go...@googlegroups.com
We are also experiencing problems with slow server startup.

Using 14.4.0 the server does not answer requests on port 8153 until about 20 minutes after it starts. According to netstat it is listening during that time, but curls to 8153 hang and never return.

This may or may not be related to another issue where, after the server has been running for a couple of days, the CPU suddenly spikes to 100% and the server stops answering requests.

I will follow up with output from /go/api/support.

Hugh Acland

Jan 21, 2015, 9:14:40 AM
to go...@googlegroups.com, hu...@zuriar.com
Hi Ali

It appears that I am not alone in finding this slow startup issue.

Can you confirm, are you at ThoughtWorks? Do you work on this project? This is kinda critical going forward to get this resolved!

thanks
Hugh

Hugh Acland

Jan 21, 2015, 9:15:38 AM
to go...@googlegroups.com, mdal...@gmail.com, hu...@zuriar.com
Hi Matthew,

Please do pass on any info you find. 
thanks
Hugh

Sriram Narayanan

Jan 21, 2015, 11:08:22 AM
to Hugh Acland, go...@googlegroups.com, Md. Ali Ejaz
Hmm... curiously, I'm facing the same issue at work. I plan to debug Go tomorrow. I'll update this thread if I find anything.

-- RAm

Md. Ali Ejaz

Jan 21, 2015, 11:37:30 AM
to go...@googlegroups.com, Hugh Acland, Sriram Narayanan, mboed...@pivotal.io
@Matthew Could you tell me the Java heap size allocated to the go-server? Also, are there any agents running on this machine? I would like to know the machine/VM configuration you are using for Go.

> I will follow up with output from /go/api/support.

It should help in understanding the configuration and the hung threads, and might help explain the sudden CPU spike you mentioned.
It might not help much in analysing the slow startup; we might have to profile the server to understand that properly.

@Hugh

> Can you confirm, are you at ThoughtWorks? Do you work on this project? This is kinda critical going forward to get this resolved!

Yes, I work at ThoughtWorks and I'm one of the core committers on this project. But even if I weren't, I would still be happy to help get the issue resolved :)

@Ram That would be great.



I will try to reproduce the issue as well. If I cannot, I would like to profile one of your servers.
--
- Ali
@mdaliejaz

Hugh Acland

Jan 21, 2015, 11:39:40 AM
to go...@googlegroups.com, hu...@zuriar.com, srir...@gmail.com, mboed...@pivotal.io
Hi Ali

Thanks. If it would help, I can give you root access to the sandbox Digital Ocean box I have installed Go-Cd on. There is nothing important on the box. Let me have your email and I will send you the SSH details. Then feel free to dig around.

regards
Hugh

Md. Ali Ejaz

Jan 21, 2015, 11:44:57 AM
to Hugh Acland, go...@googlegroups.com, Sriram Narayanan, mboed...@pivotal.io
Hi Hugh,

That would be great :) Thank you.

PS: I might not be able to start my investigation until tomorrow (I'm based out of India).

--
- Ali
@mdaliejaz

Hugh Acland

Jan 21, 2015, 12:02:27 PM
to go...@googlegroups.com, hu...@zuriar.com, srir...@gmail.com, mboed...@pivotal.io
Hi Ali - 

I just emailed the login details to your Gmail account.

regards
Hugh

Md. Ali Ejaz

Jan 21, 2015, 12:07:21 PM
to Hugh Acland, go...@googlegroups.com, Sriram Narayanan, mboed...@pivotal.io
Got it. Thank you.

I'll look at it and get back on this thread :)

Md. Ali Ejaz

Jan 22, 2015, 5:37:01 AM
to Hugh Acland, go...@googlegroups.com, Sriram Narayanan, Matthew Boedicker, Aravind SV
Hi,

Aravind and I paired on this earlier today and were able to find the cause of the slowness.

tl;dr:
Use /dev/urandom instead of /dev/random for the random number generation.
Run Go server with the option: -Djava.security.egd=file:/dev/./urandom. With this the dashboard came up quite quickly.

Longer version:
The Go server uses JRuby, which was trying to generate some UUIDs using Java's SecureRandom and was blocked because the Linux entropy pool was starved. Part of the stack trace is below:

"180936571@qtp-1983968298-15" prio=10 tid=0x00007f2a6819c800 nid=0x38ae in Object.wait() [0x00007f2a274ee000]
   java.lang.Thread.State: RUNNABLE
        at sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:203)
        - locked <0x00000000fb5fbc28> (a sun.security.provider.SecureRandom)
        at java.security.SecureRandom.nextBytes(SecureRandom.java:455)
        - locked <0x00000000fb5fbf40> (a java.security.SecureRandom)
        at org.jruby.ext.securerandom.SecureRandomLibrary.nextBytes(SecureRandomLibrary.java:49)
        at org.jruby.ext.securerandom.SecureRandomLibrary.uuid(SecureRandomLibrary.java:36)
        at org.jruby.ext.securerandom.SecureRandomLibrary$INVOKER$s$0$0$uuid.call(SecureRandomLibrary$INVOKER$s$0$0$uuid.gen)
        at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:134)
        ......
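
For anyone wanting to check whether their own slow server shows the same symptom, here is a minimal sketch. It assumes a thread dump has already been saved to a file (for example with jstack, which ships with the JDK). The function name count_securerandom_frames and the file name dump.txt are made up for illustration.

```shell
#!/bin/sh
# Sketch: count the stack frames in a saved thread dump that sit inside
# SecureRandom.engineNextBytes, the frame shown in the trace above.
# A dump can be captured with: jstack <pid> > dump.txt
# count_securerandom_frames DUMPFILE   (hypothetical helper name)
count_securerandom_frames() {
    # -F matches the string literally; -c prints the number of matching lines.
    # grep exits non-zero when the count is 0, so callers may want "|| true".
    grep -F -c 'SecureRandom.engineNextBytes' "$1"
}
```

A non-zero count in several dumps taken a few seconds apart would suggest threads are stuck waiting for random bytes, though it does not prove that entropy starvation is the only cause.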

The Digital Ocean box Hugh shared with me had an available entropy count of approximately 200 on it. On the Linux box I had, the entropy count was 3210. Vast difference!
The following command can be used to check the amount of available entropy on a Linux box:
# cat /proc/sys/kernel/random/entropy_avail
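
The same check could be wrapped in a small script, roughly as sketched below. The function name entropy_status and the default threshold of 1000 are assumptions picked for illustration only; neither Go nor the kernel defines a "low" cut-off. The optional file argument exists only so the function can be tried against an ordinary file.

```shell
#!/bin/sh
# Sketch: report whether the kernel's available entropy looks low.
# entropy_status [THRESHOLD] [FILE]   (hypothetical helper name)
#   THRESHOLD defaults to 1000, an arbitrary value chosen for illustration.
#   FILE defaults to the kernel counter used in the command above.
entropy_status() {
    threshold="${1:-1000}"
    file="${2:-/proc/sys/kernel/random/entropy_avail}"
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 2
    fi
    avail="$(cat "$file")"
    if [ "$avail" -lt "$threshold" ]; then
        # Below the threshold: print the value and signal it via exit status 1.
        echo "low: $avail"
        return 1
    fi
    echo "ok: $avail"
}
```

Running entropy_status with no arguments reads the live counter; running it in a loop while the server starts would show whether the pool drains during startup.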

From a tutorial page on Digital Ocean:
Linux already gets very good quality random data from the aforementioned hardware sources, but since a headless machine usually has no keyboard or mouse, there is much less entropy generated. Disk and network I/O represent the majority of entropy generation sources for these machines, and these produce very sparse amounts of entropy. 
 
To test how much time it took to get random data using /dev/random and /dev/urandom, we ran the following commands and observed that reading even 100 bytes from /dev/random was very slow when compared to /dev/urandom.

For /dev/urandom:
root@go-cd:~# dd if=/dev/urandom count=1 bs=100 >/dev/null
1+0 records in
1+0 records out
100 bytes (100 B) copied, 0.00030182 s, 331 kB/s
root@go-cd:~# dd if=/dev/urandom count=1 bs=100 >/dev/null
1+0 records in
1+0 records out
100 bytes (100 B) copied, 9.3816e-05 s, 1.1 MB/s
root@go-cd:~# dd if=/dev/urandom count=1 bs=100 >/dev/null
1+0 records in
1+0 records out
100 bytes (100 B) copied, 0.000887821 s, 113 kB/s

For /dev/random (Note it does not even give 100 bytes):
root@go-cd:~# dd if=/dev/random count=1 bs=100 >/dev/null
0+1 records in
0+1 records out
16 bytes (16 B) copied, 0.000168808 s, 94.8 kB/s
root@go-cd:~# dd if=/dev/random count=1 bs=100 >/dev/null
0+1 records in
0+1 records out
6 bytes (6 B) copied, 117.944 s, 0.0 kB/s
root@go-cd:~# dd if=/dev/random count=1 bs=100 >/dev/null
0+1 records in
0+1 records out
6 bytes (6 B) copied, 56.5452 s, 0.0 kB/s
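
To repeat this comparison without a shell hanging for minutes, the read can be bounded with a timeout. The sketch below assumes the timeout utility from GNU coreutils is installed; the function name read_random_bytes and the default values are made up for illustration.

```shell
#!/bin/sh
# Sketch: try to read BYTES bytes from DEVICE, giving up after SECONDS,
# and print how many bytes actually arrived.
# read_random_bytes DEVICE [BYTES] [SECONDS]   (hypothetical helper name)
read_random_bytes() {
    device="$1"
    bytes="${2:-100}"
    limit="${3:-5}"
    # bs=1 makes dd read one byte at a time, so a slow device yields a
    # partial count instead of a single short record.
    # wc -c counts whatever dd managed to write before the timeout.
    timeout "$limit" dd if="$device" bs=1 count="$bytes" 2>/dev/null | wc -c | tr -d ' '
}
```

Calling read_random_bytes /dev/random 100 5 and read_random_bytes /dev/urandom 100 5 side by side gives a quick picture of the difference; a count below 100 for /dev/random would be consistent with the starvation described here.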

The amount of available entropy was probably low on the Digital Ocean box because it is a headless VM. While the server was starting, the available entropy dropped further (we saw it go down to 0), so the server had to wait for the entropy pool to be replenished, which increased the startup time for the Go server.

By default, Java (and therefore JRuby) draws cryptographically secure entropy from /dev/random. This pool can get starved, and threads waiting on it might be blocked for a long time. The default can be changed to the non-blocking pseudo-random number generator (PRNG) device /dev/urandom with the following property:
-Djava.security.egd=file:/dev/./urandom (note the extra '/./'; it is required to work around a Java defect).

I added this property to /etc/default/go-server as GO_SERVER_SYSTEM_PROPERTIES=-Djava.security.egd=file:/dev/./urandom, and observed that the server started quite quickly.
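
The edit to /etc/default/go-server could be scripted along the following lines. This is only a sketch: the function name ensure_egd_property is made up, and it assumes the file does not already define GO_SERVER_SYSTEM_PROPERTIES with other values (in that case it refuses and leaves the merge to you).

```shell
#!/bin/sh
# Sketch: add the urandom property to the Go server defaults file once.
# ensure_egd_property [FILE]   (hypothetical helper name)
#   FILE defaults to /etc/default/go-server, the path used in this post.
ensure_egd_property() {
    file="${1:-/etc/default/go-server}"
    line='GO_SERVER_SYSTEM_PROPERTIES=-Djava.security.egd=file:/dev/./urandom'
    # Nothing to do if the property is already mentioned in the file.
    if grep -q 'java.security.egd' "$file" 2>/dev/null; then
        echo "already set in $file"
        return 0
    fi
    # Refuse to add a second assignment of the same variable.
    if grep -q '^GO_SERVER_SYSTEM_PROPERTIES=' "$file" 2>/dev/null; then
        echo "GO_SERVER_SYSTEM_PROPERTIES already defined in $file; merge by hand" >&2
        return 1
    fi
    printf '%s\n' "$line" >> "$file"
    echo "added to $file"
}
```

Run it as root against the real file, then restart the Go server, since JVM system properties are only read at startup.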

Thanks to Hugh for providing access to his Digital Ocean box.

Regards,
Ali and Aravind pairing

Md. Ali Ejaz

Jan 22, 2015, 5:40:08 AM
to Hugh Acland, go...@googlegroups.com, Sriram Narayanan, Matthew Boedicker, Aravind SV
@Matthew @Ram Could you please verify whether the slow startup in your case is due to the same reason?
--
- Ali
@mdaliejaz

Hugh Acland

Jan 22, 2015, 8:48:47 AM
to go...@googlegroups.com, hu...@zuriar.com, srir...@gmail.com, mboed...@pivotal.io, arv...@thoughtworks.com
Many thanks Ali - that all makes perfect sense. Thanks for sorting this so quickly

regards
Hugh

Matthew Boedicker

Jan 22, 2015, 11:16:15 AM
to Md. Ali Ejaz, Hugh Acland, go...@googlegroups.com, Sriram Narayanan, Aravind SV
Thanks for the excellent support Ali and Aravind.

This could also be the cause of the CPU spikes we sometimes see where the server stops responding. We will try it and let you know.