Help! IOError/Socket Closed

1,340 views
Skip to first unread message

Liam Kirsher

unread,
Jun 12, 2012, 9:08:40 PM6/12/12
to celery...@googlegroups.com
Hi! Help!

I haven't looked at celery in a while since it's been working.  Suddenly -- after a crash -- it is failing, and I can't figure out why.

The reported problem is "Socket closed"
Exception Type: IOError
Exception Value:
Socket closed
Exception Location: /var/www/prod.companyname.com/eggs/amqplib-1.0.2-py2.6.egg/amqplib/client_0_8/method_framing.py in read_method, line 221
See attached for the whole traceback.

It's nice that there is a fairly precise error message, but I'm not sure where to go from here. 

Any help would be greatly appreciated.  Even just some debugging tips!

Rabbit seems to be working fine.
 sudo /usr/sbin/rabbitmqctl status
Status of node 'rabbit@ip-10-46-145-220' ...
[{running_applications,[{rabbit,"RabbitMQ","2.2.0"},
                        {os_mon,"CPO  CXC 138 46","2.1.8"},
                        {sasl,"SASL  CXC 138 11","2.1.5.4"},
                        {mnesia,"MNESIA  CXC 138 12","4.4.7"},
                        {stdlib,"ERTS  CXC 138 10","1.15.5"},
                        {kernel,"ERTS  CXC 138 10","2.12.5"}]},
 {nodes,[{disc,['rabbit@ip-10-46-145-220']}]},
 {running_nodes,['rabbit@ip-10-46-145-220']}]
...done.


Oh, I guess if you're gonna help it might be useful to know versions of things.
Rabbit 2.2.0
celery-2.5.1-py2.6.egg
django_celery-2.2.4-py2.6.egg
Django 1.2.5
Centos 5.x 64-bit

Thanks in advance,
Liam


django celeryd  -l debug
2012-06-12 20:42:28,217 DEBUG root: loaded config
2012-06-12 20:42:28,218 INFO companyname.local_settings: Logging started
[2012-06-12 17:42:29,756: WARNING/MainProcess]  

 -------------- celery@ip-10-46-145-220 v2.5.1
---- **** -----
--- * ***  * -- [Configuration]
-- * - **** ---   . broker:      amqp://companyname@localhost:5672//
- ** ----------   . loader:      djcelery.loaders.DjangoLoader
- ** ----------   . logfile:     [stderr]@DEBUG
- ** ----------   . concurrency: 2
- ** ----------   . events:      OFF
- *** --- * ---   . beat:        OFF
-- ******* ----
--- ***** ----- [Queues]
 --------------   . prod:        exchange:prod (direct) binding:default
                  

[Tasks]
  . celery.backend_cleanup
  . celery.chord
  . celery.chord_unlock
  . obsidian.api2.tasks.add
  . obsidian.api2.tasks.process_audio

[2012-06-12 17:42:29,757: DEBUG/MainProcess] [Worker] Loading modules.
[2012-06-12 17:42:29,765: DEBUG/MainProcess] [Worker] Claiming components.
[2012-06-12 17:42:29,765: DEBUG/MainProcess] [Worker] Building boot step graph.
[2012-06-12 17:42:29,766: DEBUG/MainProcess] [Worker] New boot order: ['queues', 'pool', 'mediator', 'beat', 'autoreloader', 'timers', 'state-db', 'autoscaler', 'consumer']
[2012-06-12 17:42:29,768: DEBUG/MainProcess] Starting celery.concurrency.processes.TaskPool...
[2012-06-12 17:42:29,770: DEBUG/MainProcess] created semlock with handle 46912756953088
[2012-06-12 17:42:29,770: DEBUG/MainProcess] created semlock with handle 46912756957184
[2012-06-12 17:42:29,770: DEBUG/MainProcess] created semlock with handle 46912756961280
[2012-06-12 17:42:29,771: DEBUG/MainProcess] created semlock with handle 46912756965376
[2012-06-12 17:42:29,771: DEBUG/MainProcess] created semlock with handle 46912756969472
[2012-06-12 17:42:29,771: DEBUG/MainProcess] created semlock with handle 46912756973568
[2012-06-12 17:42:29,771: DEBUG/MainProcess] created semlock with handle 46912756977664
[2012-06-12 17:42:29,772: DEBUG/MainProcess] created semlock with handle 46912756981760
[2012-06-12 17:42:29,772: DEBUG/MainProcess] created semlock with handle 46912756985856
[2012-06-12 17:42:29,779: INFO/PoolWorker-1] child process calling self.run()
[2012-06-12 17:42:29,781: DEBUG/MainProcess] created semlock with handle 46912756989952
[2012-06-12 17:42:29,782: DEBUG/MainProcess] created semlock with handle 46912756994048
[2012-06-12 17:42:29,783: DEBUG/MainProcess] created semlock with handle 46912756998144
[2012-06-12 17:42:29,783: DEBUG/MainProcess] created semlock with handle 46912757002240
[2012-06-12 17:42:29,783: DEBUG/MainProcess] created semlock with handle 46912757006336
[2012-06-12 17:42:29,788: INFO/PoolWorker-2] child process calling self.run()
[2012-06-12 17:42:29,789: DEBUG/MainProcess] worker handler starting
[2012-06-12 17:42:29,790: DEBUG/MainProcess] result handler starting
[2012-06-12 17:42:29,790: DEBUG/MainProcess] celery.concurrency.processes.TaskPool OK!
[2012-06-12 17:42:29,791: DEBUG/MainProcess] Starting celery.worker.mediator.Mediator...
[2012-06-12 17:42:29,792: DEBUG/MainProcess] celery.worker.mediator.Mediator OK!
[2012-06-12 17:42:29,792: DEBUG/MainProcess] Starting celery.worker.consumer.Consumer...
[2012-06-12 17:42:29,792: WARNING/MainProcess] celery@ip-10-46-145-220 has started.
[2012-06-12 17:42:29,792: DEBUG/MainProcess] Consumer: Re-establishing connection to the broker...
[2012-06-12 17:42:29,796: DEBUG/MainProcess] Start from server, version: 8.0, properties: {u'information': u'Licensed under the MPL.  See http://www.rabbitmq.com/', u'product': u'RabbitMQ', u'version': u'2.2.0', u'copyright': u'Copyright (C) 2007-2010 LShift Ltd., Cohesive Financial Technologies LLC., and Rabbit Technologies Ltd.', u'platform': u'Erlang/OTP'}, mechanisms: [u'PLAIN', u'AMQPLAIN'], locales: [u'en_US']
[2012-06-12 17:42:32,798: ERROR/MainProcess] Consumer: Connection Error: Socket closed. Trying again in 2 seconds...
[2012-06-12 17:42:34,809: DEBUG/MainProcess] Start from server, version: 8.0, properties: {u'information': u'Licensed under the MPL.  See http://www.rabbitmq.com/', u'product': u'RabbitMQ', u'version': u'2.2.0', u'copyright': u'Copyright (C) 2007-2010 LShift Ltd., Cohesive Financial Technologies LLC., and Rabbit Technologies Ltd.', u'platform': u'Erlang/OTP'}, mechanisms: [u'PLAIN', u'AMQPLAIN'], locales: [u'en_US']
[2012-06-12 17:42:37,801: ERROR/MainProcess] Consumer: Connection Error: Socket closed. Trying again in 4 seconds...
[2012-06-12 17:42:41,802: DEBUG/MainProcess] Start from server, version: 8.0, properties: {u'information': u'Licensed under the MPL.  See http://www.rabbitmq.com/', u'product': u'RabbitMQ', u'version': u'2.2.0', u'copyright': u'Copyright (C) 2007-2010 LShift Ltd., Cohesive Financial Technologies LLC., and Rabbit Technologies Ltd.', u'platform': u'Erlang/OTP'}, mechanisms: [u'PLAIN', u'AMQPLAIN'], locales: [u'en_US']
[2012-06-12 17:42:44,804: ERROR/MainProcess] Consumer: Connection Error: Socket closed. Trying again in 6 seconds...



-- 
Liam Kirsher
PGP: http://liam.numenet.com/pgp/
Django Piston crash report.html

Ask Solem

unread,
Jun 13, 2012, 4:01:18 AM6/13/12
to celery...@googlegroups.com
Have you looked at the RabbitMQ logs?
They're usually located in /var/log/rabbitmq/

Have you tried restarting RabbitMQ?


--
Ask Solem
twitter.com/asksol | +44 (0)7713357179

signature.asc

Liam Kirsher

unread,
Jun 13, 2012, 12:38:14 PM6/13/12
to celery...@googlegroups.com
Aha! In the rabbitmq log I found the following error:
> =INFO REPORT==== 13-Jun-2012::11:51:48 ===
> closing TCP connection <0.30262.2> from 127.0.0.1:37243
>
> =ERROR REPORT==== 13-Jun-2012::11:51:49 ===
> exception on TCP connection <0.30268.2> from 127.0.0.1:37247
> {channel0_error,starting,
> {amqp_error,access_refused,
> "login refused for user 'companyname'",
> 'connection.start_ok'}}
>
> =INFO REPORT==== 13-Jun-2012::11:51:49 ===
> closing TCP connection <0.30268.2> from 127.0.0.1:37247

> /usr/sbin/rabbitmqctl list_users
> Listing users ...
> guest true
> ...done.
So the "companyname" user is not there!

After poking around I discovered that

ls /var/lib/rabbitmq/mnesia
> rabbit@domU-12-31-39-07-3D-F1/
> rabbit@domU-12-31-39-07-3D-F1-plugins-expand/
> rabbit@ip-10-46-145-220/
> rabbit@ip-10-46-145-220-plugins-expand/
Those are internal hostnames that apparently were re-assigned after
stopping and starting the AMI. The one with "domU" seems to contain all
the previously existing queues and other information -- like users?
I tried renaming/replacing the directories, but rabbitmq wouldn't start.

I'm hoping there's some way I can restore the configuration here,
without having to recreate it.
Any ideas?

Liam

Ask Solem

unread,
Jun 13, 2012, 1:37:07 PM6/13/12
to celery...@googlegroups.com

On 13 Jun 2012, at 17:38, Liam Kirsher wrote:

>
> ls /var/lib/rabbitmq/mnesia
>> rabbit@domU-12-31-39-07-3D-F1/
>> rabbit@domU-12-31-39-07-3D-F1-plugins-expand/
>> rabbit@ip-10-46-145-220/
>> rabbit@ip-10-46-145-220-plugins-expand/
> Those are internal hostnames that apparently were re-assigned after stopping and starting the AMI. The one with "domU" seems to contain all the previously existing queues and other information -- like users?
> I tried renaming/replacing the directories, but rabbitmq wouldn't start.

Erlang/OTP/mnesia is sensitive to dynamic hostnames, you should add a static hostname to /etc/hosts
and use that. You can use the RABBITMQ_NODENAME environment variable to set the hostname
that rabbitmq-server uses.

>
> I'm hoping there's some way I can restore the configuration here, without having to recreate it.
> Any ideas?



I don't know, it's probably possible, you could send a mail to the rabbitmq-discuss mailing-list.
But if you don't care about the data (messages) then it's probably easier to restore
the configuration from scratch (having a script to do this is probably a good idea,
even if only to work as documentation).
signature.asc
Reply all
Reply to author
Forward
0 new messages