Client cannot send message to Job Server


Tuấn Nguyễn Ngọc

May 14, 2014, 9:43:27 AM
to gea...@googlegroups.com
Hi all

I set up a Gearman job server on an Amazon EC2 instance, created a log file, and started the server with "gearmand -d", but the log file shows "ERROR 2014-05-14 10:56:26.000000 [  main ] Timeout occured when calling bind() for 0.0.0.0:4730 -> libgearman-server/gearmand.cc:688". I tried the solution in https://groups.google.com/forum/#!topic/gearman/OSDDOOQgvMU and restarted the job server with "gearmand --listen="EC2-IP" --port=4730". The server now starts normally, but my client cannot send messages to it. If I run the client on the EC2 server itself and use gearman_client_add_server(client, "localhost", 4730);, it works. What is wrong here?

Can anyone help me?

Eric Lambert

May 14, 2014, 10:27:09 AM
to gea...@googlegroups.com
Hi Tuan:

I am having trouble following the exact details of your question, but have you checked the EC2 firewall rules for your instance and verified that they allow traffic on port 4730?
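
A quick way to test the firewall question from a machine outside EC2 is a plain TCP probe against port 4730. The following is only a minimal sketch using POSIX sockets, not libgearman; the function name probe_tcp and the timeout are illustrative choices and are not part of Gearman. As a rule of thumb, a timeout points to filtered traffic (for example a security group rule), while "connection refused" usually means the host was reached but nothing is listening on that address and port.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <netdb.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: returns 0 if a TCP connection to host:port succeeds
// within timeout_ms, an errno value (ETIMEDOUT, ECONNREFUSED, ...) if it
// fails, or -1 if the host name could not be resolved.
int probe_tcp(const char* host, const char* port, int timeout_ms) {
  addrinfo hints{};
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;
  addrinfo* res = nullptr;
  if (getaddrinfo(host, port, &hints, &res) != 0) return -1;

  int result = -1;
  for (addrinfo* ai = res; ai != nullptr; ai = ai->ai_next) {
    int fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
    if (fd < 0) {
      result = errno;
      continue;
    }
    // Non-blocking connect, so that silently dropped packets (typical of a
    // firewall rule) do not make the call hang for minutes.
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
    if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0) {
      result = 0;
    } else if (errno == EINPROGRESS) {
      pollfd pfd{};
      pfd.fd = fd;
      pfd.events = POLLOUT;
      int n = poll(&pfd, 1, timeout_ms);
      if (n == 0) {
        result = ETIMEDOUT;  // no answer at all within the timeout
      } else if (n < 0) {
        result = errno;
      } else {
        // The connect finished; SO_ERROR says whether it succeeded.
        int err = 0;
        socklen_t len = sizeof err;
        getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len);
        result = err;
      }
    } else {
      result = errno;
    }
    close(fd);
    if (result == 0) break;  // one reachable address is enough
  }
  freeaddrinfo(res);
  return result;
}
```

For example, probe_tcp("<EC2_PUBLIC_IP>", "4730", 3000) run from your laptop should return 0 if the security group and the listen address are both set up correctly; passing a non-zero result to strerror() gives a readable reason.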

Eric



Tuấn Nguyễn Ngọc

May 14, 2014, 11:50:07 PM
to gea...@googlegroups.com
Dear Eric

Yes, I have added port 4730 to the security group of the EC2 instance.
The exact problem is this: my client and my worker cannot send jobs to or receive jobs from the job server, which is set up on an Amazon EC2 instance.
On the job server, all settings are the defaults (please see my post above).
The log file shows "ERROR 2014-05-14 10:56:26.000000 [  main ] Timeout occured when calling bind() for 0.0.0.0:4730 -> libgearman-server/gearmand.cc: 688"

The client code looks like this:
 
  gearman_client_st *client= gearman_client_create(NULL);

  gearman_return_t ret= gearman_client_add_server(client, "EC2 public IP", 4730);
  if (gearman_failed(ret))
  {
    return EXIT_FAILURE;
  }

  gearman_argument_t value= gearman_argument_make(0, 0, myMessage.c_str(), (long) myMessage.size());

  gearman_task_st *task= gearman_execute(client,
                                         "jobName", strlen(jobName),
                                         NULL, 0,
                                         NULL,
                                         &value, 0);
 
Could you help me solve this issue?

Eric Lambert

May 16, 2014, 12:13:09 AM
to gea...@googlegroups.com
Tuan:

When I start an instance on EC2, I am able to connect to the server via the EC2 public IP address from my laptop. Also, if I start a worker process on the same machine that is running gearmand, the worker connects as long as the hostname/IP address I give to the worker resolves to the interface on which gearmand is listening.

That being said, I am still not entirely clear on the steps you are following and the results you are seeing. In your latest email you said that when you started gearmand you see the following in the log

 "ERROR 2014-05-14 10:56:26.000000 [  main ] Timeout occured when calling bind() for 0.0.0.0:4730 -> libgearman-server/gearmand.cc: 688"

This seems to suggest that you are not specifying which address gearmand should bind to, since 0.0.0.0 is what it uses when none is specified. And if you are seeing this message in the log, then gearmand is not up, as it should exit after this condition. So perhaps that is why the workers cannot connect? You might want to investigate why the server fails to bind in that case.
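
One common reason for bind() to fail is that another process already holds the port. The following is a rough sketch, using plain POSIX sockets rather than anything taken from gearmand, of how that could be checked on the EC2 instance itself. The helper names listen_on, local_port, and port_in_use are hypothetical, and the code handles IPv4 addresses only.

```cpp
#include <arpa/inet.h>
#include <cerrno>
#include <cstdint>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: binds and listens on ip:port (IPv4 only).
// Returns the socket, or -1 with errno set.
int listen_on(const char* ip, uint16_t port) {
  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_port = htons(port);
  if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
    errno = EINVAL;
    return -1;
  }

  int fd = socket(AF_INET, SOCK_STREAM, 0);
  if (fd < 0) return -1;
  int on = 1;
  setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on);
  if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr) != 0 ||
      listen(fd, 16) != 0) {
    int saved = errno;  // close() may overwrite errno
    close(fd);
    errno = saved;
    return -1;
  }
  return fd;
}

// Port the socket actually ended up on (useful when port 0 was requested).
uint16_t local_port(int fd) {
  sockaddr_in addr{};
  socklen_t len = sizeof addr;
  if (getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) != 0) return 0;
  return ntohs(addr.sin_port);
}

// True if ip:port cannot be bound because another socket already holds it.
bool port_in_use(const char* ip, uint16_t port) {
  int fd = listen_on(ip, port);
  if (fd >= 0) {
    close(fd);
    return false;
  }
  return errno == EADDRINUSE;
}
```

Calling port_in_use("0.0.0.0", 4730) before starting gearmand tells you whether something is already occupying the port (it also returns true if a gearmand is already running). If it is in use, a tool such as lsof can show which process holds it.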

Here are some steps you can follow that might help clear up what is going on:

1) Start gearmand and bind it to the private IP address of the EC2 instance

$ gearmand -d --verbose DEBUG --listen <EC2_PRIVATE_IP>

2) Verify that gearmand is running

3) From the EC2 instance, start a worker that attaches to the server. You can use the gearman command-line client to do this, so there is no need to write your own.

$ gearman -h  <EC2_PRIVATE_IP> -w -f wc -- wc -l

4) If the worker fails to connect, it should log some errors to stdout/stderr. Another way to verify that the worker connected is to telnet to port 4730 and issue the status command. If the worker connected, you should see it in the list of workers reported by the command (see the sample below)

$ telnet  <EC2_PRIVATE_IP> 4730
Trying 54.187.226.148...
Connected to  <EC2_PRIVATE_IP>.
Escape character is '^]'.
status
wc 0 0 1
.

5) Repeat steps 3 and 4 from a non-EC2 machine, using either the EC2_PUBLIC_IP or the EC2_PUBLIC_DNS of the host running the server.
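
The status check from step 4 can also be done without telnet. The sketch below sends the same "status" command over a plain socket and parses the reply. It assumes the reply format shown in the sample above, which as far as I understand the admin protocol is function name, queued jobs, running jobs, and available workers, terminated by a line containing only ".". The names FunctionStatus, fetch_status, parse_status, and has_worker are made up for illustration, and function names containing whitespace are not handled.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <unistd.h>

#include <sstream>
#include <string>
#include <vector>

struct FunctionStatus {
  std::string name;
  unsigned total = 0;    // jobs in the queue
  unsigned running = 0;  // jobs currently running
  unsigned workers = 0;  // workers registered for this function
};

// The reply ends with a line that contains only ".".
static bool reply_complete(const std::string& r) {
  return r == ".\n" ||
         (r.size() >= 3 && r.compare(r.size() - 3, 3, "\n.\n") == 0);
}

// Connects to host:port, sends "status" and returns the raw text reply
// (empty on failure). Blocking calls, no timeout: a sketch, not a tool.
std::string fetch_status(const char* host, const char* port) {
  addrinfo hints{};
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;
  addrinfo* res = nullptr;
  if (getaddrinfo(host, port, &hints, &res) != 0) return "";

  std::string reply;
  for (addrinfo* ai = res; ai != nullptr; ai = ai->ai_next) {
    int fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
    if (fd < 0) continue;
    if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0) {
      const char cmd[] = "status\n";
      if (send(fd, cmd, sizeof cmd - 1, 0) ==
          static_cast<ssize_t>(sizeof cmd - 1)) {
        char buf[512];
        while (!reply_complete(reply)) {
          ssize_t n = recv(fd, buf, sizeof buf, 0);
          if (n <= 0) break;
          reply.append(buf, static_cast<size_t>(n));
        }
      }
      close(fd);
      break;
    }
    close(fd);
  }
  freeaddrinfo(res);
  return reply;
}

// Turns the text reply into one entry per function.
std::vector<FunctionStatus> parse_status(const std::string& reply) {
  std::vector<FunctionStatus> out;
  std::istringstream lines(reply);
  std::string line;
  while (std::getline(lines, line)) {
    if (!line.empty() && line.back() == '\r') line.pop_back();
    if (line == ".") break;
    std::istringstream fields(line);
    FunctionStatus fs;
    if (fields >> fs.name >> fs.total >> fs.running >> fs.workers) {
      out.push_back(fs);
    }
  }
  return out;
}

// True if at least one worker is registered for the given function.
bool has_worker(const std::vector<FunctionStatus>& status,
                const std::string& function) {
  for (const FunctionStatus& fs : status) {
    if (fs.name == function && fs.workers > 0) return true;
  }
  return false;
}
```

With the worker from step 3 running, has_worker(parse_status(fetch_status("<EC2_PRIVATE_IP>", "4730")), "wc") should be true; an empty result from fetch_status means the connection itself failed.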

Hopefully this will help clarify what the problem might be. My suspicion is this is some kind of setup/config issue.

Hope this is helpful

Eric

Tuấn Nguyễn Ngọc

May 16, 2014, 3:51:17 AM
to gea...@googlegroups.com
Hi Eric

When I ran "$ gearmand -d --verbose DEBUG --listen <EC2_PRIVATE_IP>", the result shows:

"Initializing Gear on port 4730 with SSL: false"

What happened with my gearman?

Ubuntu version: 12.10
Gearman version: 1.1.12

Niel

Aug 11, 2014, 6:54:56 AM
to gea...@googlegroups.com
Hi Tuan,

Not sure if this is applicable, but I've come across a similar issue a couple of times.

I've found that if you restart a gearman job server while it is very busy, a PHP worker (on the same machine) somehow prevents the port from being reused. For some reason the lsof command shows that PHP has an ESTABLISHED connection even though gearman is not running. After killing all PHP scripts, gearman binds again.

I'm not sure whether this is a gearman or PHP (or extension) issue. Unfortunately, this makes gearman unsafe for production use.

ERROR 2014-08-11 07:28:48.000000 [  main ] Timeout occured when calling bind() for 0.0.0.0:4730 -> libgearman-server/gearmand.cc:616