Unable to start or restart AtoM worker


omphile....@botho.ac.bw

Jul 2, 2020, 9:21:00 AM
to AtoM Users
Good day,

I am trying to start or restart my AtoM worker, but I am getting the error output below.

Please assist.

Thank you.
Capture.PNG

José Raddaoui

Jul 2, 2020, 9:46:52 AM
to AtoM Users
Hi there,

It looks like it hit the restart limit. You could try the tip from the documentation ...


You may want to check the worker log (notes just above the tip) to know what is going/went wrong.

Best regards,
Radda.

Omphile Sebonego

Jul 3, 2020, 8:46:20 AM
to ica-ato...@googlegroups.com
Good day

It is true that it has reached the restart limit. I tried all the commands in the documentation, but I am still unable to start the AtoM worker service. Is there any solution?

--
You received this message because you are subscribed to the Google Groups "AtoM Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ica-atom-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ica-atom-users/0bc470f7-7018-489c-b942-11708f05fb60n%40googlegroups.com.

Grant Forrest

Jul 5, 2020, 9:18:09 AM
to AtoM Users
Can you share the code in your atom-worker.service so we can see what your settings are?
It could be in /etc/systemd/system/atom-worker.service

Omphile Sebonego

Jul 6, 2020, 2:38:01 AM
to ica-ato...@googlegroups.com
[Unit]
Description=AtoM worker
After=network.target
# High interval and low restart limit to increase the possibility
# of hitting the rate limits in long running recurrent jobs.
StartLimitIntervalSec=24h
StartLimitBurst=3

[Install]
WantedBy=multi-user.target

[Service]
Type=simple
User=www-data
Group=www-data
WorkingDirectory=/usr/share/nginx/atom
ExecStart=/usr/bin/php7.2 -d memory_limit=-1 -d error_reporting="E_ALL" symfony jobs:worker
KillSignal=SIGTERM
Restart=on-failure
RestartSec=30






It is in /usr/lib/systemd/system/atom-worker.service. I am using Ubuntu 18.04 with an Nginx server. The issue is that I am able to edit archival descriptions, but when I try to save them I get an internal server error (500). I am trying to solve that error.

Dan Gillean

Jul 6, 2020, 1:12:18 PM
to ICA-AtoM Users
Hi Omphile, 

This configuration looks correct. You are using PHP 7.2, and not PHP 7.0, right? What was in the Nginx error logs when you received the 500 error?

First, let's make sure that there isn't a large job in the queue that keeps immediately killing AtoM by exhausting the available memory. The following command will clear ALL current and queued jobs: 
  • php symfony jobs:clear
Typically, resetting the failed atom-worker looks like this: 
  • sudo systemctl reset-failed atom-worker
  • sudo systemctl start atom-worker
And then run the status command to make sure it worked. If this is not working at all, it might be useful to see what the console is outputting for each step. 
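The reset sequence above can be sketched as a small script. This is only an illustration: the helper names `run` and `reset_atom_worker` and the `DRY_RUN` switch are hypothetical, and the unit name `atom-worker` is assumed to match the one used in this thread. The destructive `jobs:clear` step is deliberately left out so that it stays a manual decision.

```shell
#!/bin/sh
# Sketch of the reset steps described above (hypothetical helper names).
# Set DRY_RUN=1 to print the commands instead of executing them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

reset_atom_worker() {
    # Clear the "start-limit-hit" state so systemd accepts a new start request.
    run sudo systemctl reset-failed atom-worker
    # Start the worker again.
    run sudo systemctl start atom-worker
    # Show the current state without opening a pager.
    run systemctl --no-pager status atom-worker
}

# Usage: call reset_atom_worker, or preview first with:
#   DRY_RUN=1; reset_atom_worker
```

Because the unit uses Restart=on-failure with RestartSec=30, the status can briefly look healthy before the worker exits again, so it is worth checking the status a second time after a minute or so.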

Regards, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him


Omphile Sebonego

Jul 7, 2020, 11:25:53 AM
to ica-ato...@googlegroups.com
Good Day.

I tried all of the above commands, but it still doesn't start. Remember, we are trying to save an archival description and we get a 500 internal server error. I get the following output when I try to start it:




root@atom:~# sudo systemctl daemon-reload
root@atom:~# sudo systemctl reset-failed atom-worker
root@atom:~# sudo systemctl start atom-worker
root@atom:~# systemctl status atom-worker.service
● atom-worker.service - AtoM worker
   Loaded: loaded (/usr/lib/systemd/system/atom-worker.service; enabled; vendor preset: enabled)
   Active: activating (auto-restart) (Result: exit-code) since Tue 2020-07-07 17:12:02 CAT; 14s ago
  Process: 8832 ExecStart=/usr/bin/php7.2 -d memory_limit=-1 -d error_reporting=E_ALL symfony jobs:worker (code=exited, status=1/FAILURE)
 Main PID: 8832 (code=exited, status=1/FAILURE)
root@atom:~# ^C
root@atom:~#


My gearman-job-server configuration file is as follows:


# This is a configuration file for /etc/init.d/gearman-job-server; it allows
# you to perform common modifications to the behavior of the gearman-job-server
# daemon startup without editing the init script (and thus getting prompted by
# dpkg on upgrades).  We all love dpkg prompts.

# Examples ( from http://gearman.org/manual/job_server )
#
# Use drizzle as persistent queue store
# PARAMS="-q libdrizzle --libdrizzle-db=some_db --libdrizzle-table=gearman_queue"
#
# Use mysql as persistent queue store
# PARAMS="-q libdrizzle --libdrizzle-host=10.0.0.1 --libdrizzle-user=gearman \
#                       --libdrizzle-password=secret --libdrizzle-db=some_db \
#                       --libdrizzle-table=gearman_queue --libdrizzle-mysql"
#
# Missing examples for memcache persitent queue store...

# Parameters to pass to gearmand.
PARAMS="--listen=localhost \
 --daemon \
 --log-file=/var/log/gearman-job-server/gearmand.log"



And I get this output when I run the command sudo tail -f /var/log/nginx/error.log:

2020/07/07 09:19:30 [error] 1073#1073: *219214 FastCGI sent in stderr: "PHP message: Could not connect to 127.0.0.1:4730" while reading response header from upstream, client: , server: _, request: "POST /index.php/secretariat/edit HTTP/1.1", upstream: "fastcgi://unix:/run/php7.2-fpm.atom.sock:", host: "", referrer: "http:///index.php/secretariat/edit"
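Port 4730 in the error above is the default gearmand port, so a quick check is whether anything is listening on it. A minimal sketch, assuming the `ss` tool is available on the server; the function name `listens_on` is hypothetical, and it reads `ss -ltn` output on standard input:

```shell
#!/bin/sh
# listens_on PORT: succeeds if the `ss -ltn` output on stdin contains a
# listening TCP socket on PORT (hypothetical helper, for illustration only).
listens_on() {
    awk -v port="$1" '
        /LISTEN/ {
            # Look at every field; the local address ends in ":PORT".
            for (i = 1; i <= NF; i++) {
                n = split($i, parts, ":")
                if (n >= 2 && parts[n] == port) found = 1
            }
        }
        END { exit (found ? 0 : 1) }
    '
}

# Usage:
#   ss -ltn | listens_on 4730 && echo "something is listening on 4730" \
#                             || echo "nothing is listening on 4730"
```

If nothing is listening, the next place to look would be the status of the gearman-job-server service rather than the worker itself.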



Please assist.

raddao...@gmail.com

Jul 7, 2020, 12:08:37 PM
to AtoM Users
Hi Omphile,

Could you check and share the worker logs when it tries to restart:

sudo journalctl -f atom-worker

We may find something useful in there.

Bests,
Radda.

Omphile Sebonego

Jul 8, 2020, 2:52:54 AM
to ica-ato...@googlegroups.com
Good day, I tried the command sudo journalctl -f atom-worker and got this output:

root@atom:~# sudo journalctl -f atom-worker
Failed to add match 'atom-worker': Invalid argument
root@atom:~#

But when I run this one I get:

Job for atom-worker.service failed.
See "systemctl status atom-worker.service" and "journalctl -xe" for details.
root@atom:~# journalctl -xe
Jul 08 08:38:31 atom systemd[1]: atom-worker.service: Start request repeated too quickly.
Jul 08 08:38:31 atom systemd[1]: atom-worker.service: Failed with result 'start-limit-hit'.
Jul 08 08:38:31 atom systemd[1]: Failed to start AtoM worker.
-- Subject: Unit atom-worker.service has failed
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Unit atom-worker.service has failed.
--
-- The result is RESULT.
Jul 08 08:38:31 atom sudo[12127]: pam_unix(sudo:session): session closed for user root
Jul 08 08:38:33 atom kibana[811]: {"type":"log","@timestamp":"2020-07-08T06:38:33Z","tags":["warning","elasticsearch","admin"],"pid":811,"message":"Unable to revive connection: http://localhost:9200/"}
Jul 08 08:38:33 atom kibana[811]: {"type":"log","@timestamp":"2020-07-08T06:38:33Z","tags":["warning","elasticsearch","admin"],"pid":811,"message":"No living connections"}
Jul 08 08:38:35 atom kibana[811]: {"type":"log","@timestamp":"2020-07-08T06:38:35Z","tags":["warning","elasticsearch","admin"],"pid":811,"message":"Unable to revive connection: http://localhost:9200/"}
Jul 08 08:38:35 atom kibana[811]: {"type":"log","@timestamp":"2020-07-08T06:38:35Z","tags":["warning","elasticsearch","admin"],"pid":811,"message":"No living connections"}
Jul 08 08:38:38 atom kibana[811]: {"type":"log","@timestamp":"2020-07-08T06:38:38Z","tags":["warning","elasticsearch","admin"],"pid":811,"message":"Unable to revive connection: http://localhost:9200/"}
Jul 08 08:38:38 atom kibana[811]: {"type":"log","@timestamp":"2020-07-08T06:38:38Z","tags":["warning","elasticsearch","admin"],"pid":811,"message":"No living connections"}
Jul 08 08:38:40 atom kibana[811]: {"type":"log","@timestamp":"2020-07-08T06:38:40Z","tags":["warning","elasticsearch","admin"],"pid":811,"message":"Unable to revive connection: http://localhost:9200/"}
Jul 08 08:38:40 atom kibana[811]: {"type":"log","@timestamp":"2020-07-08T06:38:40Z","tags":["warning","elasticsearch","admin"],"pid":811,"message":"No living connections"}
Jul 08 08:38:43 atom kibana[811]: {"type":"log","@timestamp":"2020-07-08T06:38:43Z","tags":["warning","elasticsearch","admin"],"pid":811,"message":"Unable to revive connection: http://localhost:9200/"}
Jul 08 08:38:43 atom kibana[811]: {"type":"log","@timestamp":"2020-07-08T06:38:43Z","tags":["warning","elasticsearch","admin"],"pid":811,"message":"No living connections"}
Jul 08 08:38:45 atom kibana[811]: {"type":"log","@timestamp":"2020-07-08T06:38:45Z","tags":["warning","elasticsearch","admin"],"pid":811,"message":"Unable to revive connection: http://localhost:9200/"}
Jul 08 08:38:45 atom kibana[811]: {"type":"log","@timestamp":"2020-07-08T06:38:45Z","tags":["warning","elasticsearch","admin"],"pid":811,"message":"No living connections"}

Omphile Sebonego

Jul 8, 2020, 3:40:00 AM
to ica-ato...@googlegroups.com
I managed to run the command sudo journalctl -f -u atom-worker and got the following output:

root@atom:~# sudo journalctl -f -u atom-worker
-- Logs begin at Wed 2020-05-06 01:19:08 CAT. --
Jul 08 09:28:17 atom php7.2[13198]:
Jul 08 09:28:17 atom systemd[1]: atom-worker.service: Main process exited, code=exited, status=1/FAILURE
Jul 08 09:28:17 atom systemd[1]: atom-worker.service: Failed with result 'exit-code'.
Jul 08 09:28:27 atom systemd[1]: Stopped AtoM worker.
Jul 08 09:28:59 atom systemd[1]: Started AtoM worker.
Jul 08 09:29:01 atom php7.2[13210]:
Jul 08 09:29:01 atom php7.2[13210]:   Couldn't connect to any available servers
Jul 08 09:29:01 atom php7.2[13210]:
Jul 08 09:29:01 atom systemd[1]: atom-worker.service: Main process exited, code=exited, status=1/FAILURE
Jul 08 09:29:01 atom systemd[1]: atom-worker.service: Failed with result 'exit-code'.
Jul 08 09:29:31 atom systemd[1]: atom-worker.service: Service hold-off time over, scheduling restart.
Jul 08 09:29:31 atom systemd[1]: atom-worker.service: Scheduled restart job, restart counter is at 1.
Jul 08 09:29:31 atom systemd[1]: Stopped AtoM worker.
Jul 08 09:29:31 atom systemd[1]: Started AtoM worker.
Jul 08 09:29:33 atom php7.2[13219]:
Jul 08 09:29:33 atom php7.2[13219]:   Couldn't connect to any available servers
Jul 08 09:29:33 atom php7.2[13219]:
Jul 08 09:29:33 atom systemd[1]: atom-worker.service: Main process exited, code=exited, status=1/FAILURE
Jul 08 09:29:33 atom systemd[1]: atom-worker.service: Failed with result 'exit-code'.
Jul 08 09:30:03 atom systemd[1]: atom-worker.service: Service hold-off time over, scheduling restart.
Jul 08 09:30:03 atom systemd[1]: atom-worker.service: Scheduled restart job, restart counter is at 2.
Jul 08 09:30:03 atom systemd[1]: Stopped AtoM worker.
Jul 08 09:30:03 atom systemd[1]: atom-worker.service: Start request repeated too quickly.
Jul 08 09:30:03 atom systemd[1]: atom-worker.service: Failed with result 'exit-code'.
Jul 08 09:30:03 atom systemd[1]: Failed to start AtoM worker.
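The line "Couldn't connect to any available servers" in the log above appears to come from the worker failing to reach a Gearman job server, which matches the fix found later in this thread. A small sketch that scans journal text for that message; the function name `worker_log_hint` and its output strings are hypothetical:

```shell
#!/bin/sh
# Reads journal text on stdin and reports whether it contains the
# job-server connection error seen in this thread (illustrative helper).
worker_log_hint() {
    if grep -q "Couldn't connect to any available servers"; then
        echo "worker cannot reach the job server: check gearman-job-server"
    else
        echo "no job-server connection error found"
    fi
}

# Usage (assumes the unit name used in this thread):
#   sudo journalctl -u atom-worker --no-pager -n 50 | worker_log_hint
```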

José Raddaoui

Jul 8, 2020, 2:45:58 PM
to AtoM Users

Hi Omphile,

It looks like the issue is related to Elasticsearch (from the AtoM worker and Kibana logs). Is Elasticsearch properly configured and running?


You could check the status of Elasticsearch (considering it's running on the same server) with cURL:

curl -XGET 'localhost:9200/?pretty'

Or, as with the worker, you could use systemctl and journalctl on the "elasticsearch" service.

Best regards.

Omphile Sebonego

Jul 9, 2020, 3:43:03 AM
to ica-ato...@googlegroups.com
Good day,

Elasticsearch is working properly:

root@atom:~# curl -XGET 'localhost:9200/?pretty'
{
  "name" : "KLaRtx7",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "UtA_o2RmQq6iGrwWloKhQg",
  "version" : {
    "number" : "5.6.16",
    "build_hash" : "3a740d1",
    "build_date" : "2019-03-13T15:33:36.565Z",
    "build_snapshot" : false,
    "lucene_version" : "6.6.1"
  },
  "tagline" : "You Know, for Search"
}
root@atom:~#

Omphile Sebonego

Jul 9, 2020, 3:44:20 AM
to ica-ato...@googlegroups.com
root@atom:~# sudo systemctl status gearman-job-server
● gearman-job-server.service - gearman job control server
   Loaded: loaded (/lib/systemd/system/gearman-job-server.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Thu 2020-07-09 08:06:18 CAT; 1h 28min ago
     Docs: http://gearman.info/
  Process: 20965 ExecStart=/usr/sbin/gearmand --pid-file=/run/gearman/gearmand.pid $PARAMS (code=exited, status=0/SUCCESS)
  Process: 20964 ExecStartPre=/usr/bin/install -d -o gearman /run/gearman (code=exited, status=0/SUCCESS)
 Main PID: 20966 (code=exited, status=1/FAILURE)

Jul 09 08:06:18 atom systemd[1]: gearman-job-server.service: Service hold-off time over, scheduling restart.
Jul 09 08:06:18 atom systemd[1]: gearman-job-server.service: Scheduled restart job, restart counter is at 5.
Jul 09 08:06:18 atom systemd[1]: Stopped gearman job control server.
Jul 09 08:06:18 atom systemd[1]: gearman-job-server.service: Start request repeated too quickly.
Jul 09 08:06:18 atom systemd[1]: gearman-job-server.service: Failed with result 'exit-code'.
Jul 09 08:06:18 atom systemd[1]: Failed to start gearman job control server.
root@atom:~#

Maybe there could be a clue here.


Omphile Sebonego

Jul 9, 2020, 6:34:50 AM
to ica-ato...@googlegroups.com
OK, I managed to fix it by re-installing gearman-job-server and then starting it with the command gearmand -d. After that I was able to start my AtoM worker successfully.

Thank you.

José Raddaoui

Jul 9, 2020, 7:12:25 AM
to AtoM Users

That's great news Omphile, glad you got it working!

Ricardo Pinho

Aug 14, 2022, 1:36:58 PM
to AtoM Users
Hi atom users!
I recently came across the same issue reported by Omphile, and re-installing gearman-job-server didn't solve it: whenever the server is rebooted, for whatever reason, gearman-job-server doesn't start, and neither does the atom-worker that depends on it.

I found a solution in my case, inspired by this message:
"Typically this error happens due to a parameter in GearmanClient::addServer().
It doesn't like "localhost" as a parameter.
Try specifying 127.0.0.1 or specifying nothing."

So I've edited the default PARAMS in gearman-job-server config file, to replace localhost by 127.0.0.1:

sudo vi /etc/default/gearman-job-server

Edit line:
PARAMS="--listen=localhost \

And replace localhost like this:
PARAMS="--listen=127.0.0.1 \

Then you need to restart gearman-job-server:
sudo systemctl restart gearman-job-server

And test if it's running with:
sudo systemctl status gearman-job-server
If running ok, it should return:
● gearman-job-server.service - gearman job control server

   Loaded: loaded (/lib/systemd/system/gearman-job-server.service; enabled; vend
   Active: active (running) since ********* ago

And then restart atom-worker:
sudo systemctl reset-failed atom-worker
sudo systemctl start atom-worker

Test if it's running ok:
sudo systemctl status atom-worker
If running ok, it should return:
● atom-worker.service - AtoM worker
   Loaded: loaded (/usr/lib/systemd/system/atom-worker.service; enabled; vendor preset: enabled)
   Active: active (running) since ************** ago

It worked for me: even after rebooting the server, every service started OK!
I hope this helps other users in similar situations (Gearman installed and running on the same server).
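The manual edit described above could also be scripted. A minimal sketch, assuming the config file is /etc/default/gearman-job-server as in this post and that the script runs as root; the function name `use_loopback_ip` is hypothetical. It keeps a .bak copy of the file before changing it:

```shell
#!/bin/sh
# use_loopback_ip FILE: replace --listen=localhost with --listen=127.0.0.1
# in FILE, keeping the original as FILE.bak (hypothetical helper).
use_loopback_ip() {
    conf="$1"
    # Keep a backup so the change can be reverted by copying it back.
    cp -p "$conf" "$conf.bak" || return 1
    # Rewrite from the backup into the original file.
    sed 's/--listen=localhost/--listen=127.0.0.1/' "$conf.bak" > "$conf"
}

# Usage (as root):
#   use_loopback_ip /etc/default/gearman-job-server
#   systemctl restart gearman-job-server
#   systemctl --no-pager status gearman-job-server
```

The restart and status commands are the same ones listed in the steps above; the sketch only automates the text replacement.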

Best regards,
Ricardo Pinho

Dan Gillean

Aug 15, 2022, 9:00:24 AM
to ICA-AtoM Users
Hi Ricardo, 

Interesting - thanks so much for sharing what worked for you! We will keep this in mind if we see the issue occurring again. 

Cheers, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him
