Marek Grzes
Jun 14, 2014, 8:16:06 PM
to ippc-2014...@googlegroups.com
Dear All,
Our initial server configuration used OpenJDK (the default Java package
on Amazon Linux). Two teams have confirmed that their experiments
completed successfully with this configuration. The third team
is about to finish their experiment too. Two teams, however, could not
finish their trials because our server crashed after a few hours of
their experiment. There were no exceptions and no error messages in the
log files on the server, so we have no idea why this happened. The JVM
itself did not seem to crash, because I could not find any log messages
that would confirm a crash. So we either had a standard System.exit()
termination of the JVM, or something else happened (a quick web search
suggests that if the application crashes in some low-level operation,
such as TCP/IP I assume, then nothing may be logged).
For the two teams that wish to repeat their experiment and for the last
team that has not attempted their trial yet, I started their servers
using Oracle JDK (following Scott's suggestion that it may be more
stable than OpenJDK). I also added a resurrect script that restarts
the RDDL server within about 10 seconds of a crash (the script is
attached). This means that if your client can reconnect, server
crashes won't be a big problem, and the server will be available
almost continuously.
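
For teams whose client does not reconnect yet, here is a minimal sketch of a retry loop in Python. It is only an illustration of the idea, not part of the RDDL server or of any team's client: the function name, its parameters and the default values are all assumptions, and restoring the session state after reconnecting is left to your own client.

```python
import socket
import time


def connect_with_retry(host, port, retry_delay=5.0, max_attempts=60,
                       connect=socket.create_connection, sleep=time.sleep):
    """Open a TCP connection, retrying while the server is unreachable.

    Hypothetical helper: the name, parameters and defaults are assumptions
    made for illustration and do not come from the RDDL code base.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            # Returns a connected socket on success.
            return connect((host, port), timeout=10.0)
        except OSError as exc:
            # OSError covers refused, reset and timed-out connections.
            last_error = exc
            if attempt < max_attempts:
                # Give the restart script time to bring the server back.
                sleep(retry_delay)
    raise ConnectionError(
        f"could not reach {host}:{port} after {max_attempts} attempts"
    ) from last_error
```

A client would call something like connect_with_retry("server.example.org", 2323) wherever it currently opens its socket, and call it again whenever a send or receive fails; the host and port here are placeholders. A retry delay of a few seconds should be enough given the restart time mentioned above.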
Thank you all for your patience,
Marek