job appears to run twice


GS

Sep 4, 2018, 8:10:51 PM
to AtoM Users
Hi AtoM users,

A recent CSV import job appears to have run twice; see the log excerpt below.
The client reported that the data had been imported but the job appeared to still be running.
Any suggestions? Is this a known issue in 2.4.0?

[info] [2018-09-04 15:17:00] Job 1511990 "arFileImportJob": Job started.
[info] [2018-09-04 15:17:00] Job 1511990 "arFileImportJob": Importing CSV file: N386 was Z125 DRAFT Sept 2018 CSV.csv.
[info] [2018-09-04 15:17:00] Job 1511990 "arFileImportJob": Indexing imported records.
[info] [2018-09-04 15:17:00] Job 1511990 "arFileImportJob": Update type: import-as-new
[info] [2018-09-05 04:51:27] Job 1511990 "arFileImportJob": Job started.
[info] [2018-09-05 04:51:27] Job 1511990 "arFileImportJob": Importing CSV file: N386 was Z125 DRAFT Sept 2018 CSV.csv.
[info] [2018-09-05 04:51:27] Job 1511990 "arFileImportJob": Indexing imported records.
[info] [2018-09-05 04:51:27] Job 1511990 "arFileImportJob": Update type: import-as-new

Thanks,

George

GS

Sep 5, 2018, 1:50:29 AM
to AtoM Users
Hi,

Could this be a retried job, or would a retry show up as a separate entry/job?

Thanks,

George

Dan Gillean

Sep 5, 2018, 1:55:48 PM
to ICA-AtoM Users
Hi George,

Was the import run more than once, or did the log appear like this on the first try?

We have found and fixed one similar (but different, ha) bug where we discovered that libxml warnings and errors output in the log were not being cleared between XML imports, leading to duplications in the job log when a file is imported a second time. See:
It's possible this is related, but affecting CSV imports.

If the job appears stalled, you can run the following command. Note that it will kill ALL active jobs, so if you have others in the queue you may need to restart them manually afterwards. It will also clear old jobs from the Jobs page, so if you want to keep that data, try exporting a CSV of the job logs first, for reference. See:
The command to clear all jobs should be run from the root AtoM installation directory. If you have followed our recommended installation instructions, this is generally /usr/share/nginx/atom. You can run the command like so:
  • php symfony jobs:clear
You can also use the following commands to stop, start, or restart the job scheduler, or check whether it is running; a combined example follows the lists below.

On Ubuntu 14.04:
  • sudo start atom-worker
  • sudo stop atom-worker
  • sudo restart atom-worker
  • sudo status atom-worker
On Ubuntu 16.04:
  • sudo systemctl enable atom-worker
  • sudo systemctl start atom-worker
  • sudo systemctl stop atom-worker
  • sudo systemctl restart atom-worker
  • sudo systemctl status atom-worker
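For example, a typical sequence for clearing a stalled import and bringing the worker back up on Ubuntu 16.04 might look like the following sketch (it assumes the default installation path noted above, and remember that the first command kills all active jobs):

  cd /usr/share/nginx/atom

  # clears ALL active jobs and removes old entries from the Jobs page
  php symfony jobs:clear

  # restart the job scheduler and confirm it is running (systemd)
  sudo systemctl restart atom-worker
  sudo systemctl status atom-worker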

If you can tell me more about the exact steps and any pre-existing variables that might have affected the import and led to this outcome, I can try to reproduce it locally and determine whether you've encountered a new bug.

Regards,

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory



GS

Sep 7, 2018, 12:42:15 AM
to AtoM Users
Hi Dan,

The job was run once by one of the Library Archivists.
The information I provided was cut and pasted from the web interface (not the text log file).
I already knew about clearing jobs and restarting the atom-worker, so I was able to stop the second run.
The Archivist has reported duplicated records: 93 items out of 202 (I assume because of the job interruption).
Other users are importing data, but as far as I know this is the first time the CSV import has apparently run twice, resulting in duplicated data.
Our installation is based on RHEL 6, so you may be comparing apples with oranges.
Importing from CSV appears to be glacially slow.

Regards,

George

GS

Nov 1, 2018, 6:44:01 PM
to ica-ato...@googlegroups.com
Hi AtoM,

The import job running twice appears to have happened again; the job itself was very slow, as you will see from the timestamps (see also the attached image).

[info] [2018-10-31 09:21:37] Job 1513057 "arFileImportJob": Job started.
[info] [2018-10-31 09:21:37] Job 1513057 "arFileImportJob": Importing CSV file: N388 Paddy Troy.csv.
[info] [2018-10-31 09:21:37] Job 1513057 "arFileImportJob": Indexing imported records.
[info] [2018-10-31 09:21:37] Job 1513057 "arFileImportJob": Update type: import-as-new
[info] [2018-11-02 04:27:30] Job 1513057 "arFileImportJob": Job started.
[info] [2018-11-02 04:27:30] Job 1513057 "arFileImportJob": Importing CSV file: N388 Paddy Troy.csv.
[info] [2018-11-02 04:27:30] Job 1513057 "arFileImportJob": Indexing imported records.
[info] [2018-11-02 04:27:30] Job 1513057 "arFileImportJob": Update type: import-as-new

Apart from cancelling the job now, is there any way to stop this from happening again?
This is a RHEL 6 installation, so it may be hard to compare with a recommended installation/setup.

Regards,

George
Attachment: AtoMjobrestart.png

Dan Gillean

Nov 5, 2018, 11:59:08 AM
to ica-ato...@googlegroups.com

Hi George, 

I'm trying to get some ideas about this from our team, but so far nothing. It's difficult for us to offer suggestions in this case because we haven't seen the issue before and have so far been unable to reproduce it on an Ubuntu installation. I'm worried it may have to do with the dependencies in your RHEL/CentOS installation. I will keep trying to gather more information and suggestions for you.

Regards, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory

On Thu, Nov 1, 2018 at 6:44 PM GS <ggr.s...@gmail.com> wrote:
Hi AtoM,

The import job running twice appears to have happened again; the job itself was very slow, as you will see from the timestamps


Brandon Uhlman

Nov 15, 2018, 9:48:41 AM
to AtoM Users

We've seen this sort of activity in the log files on our Ubuntu server when a long-running (in our case, authority) import was interrupted and restarted. The root cause for us was that php-fpm was being restarted by a logrotate cron job.

That doesn't explain the long run time for the import, but if you had something similar scheduled, it could explain why the job was getting restarted.
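If you want to check whether something similar is happening on your server, a rough starting point might be the following (paths and log locations vary by distro, so treat this as a sketch rather than exact commands):

  # find logrotate rules that touch php-fpm and see what their postrotate scripts do
  grep -rl fpm /etc/logrotate.d/
  grep -rn -A3 "postrotate" /etc/logrotate.d/

  # logrotate normally runs from cron; compare its schedule with the timestamp of the
  # second "Job started" entry in the AtoM job log
  cat /etc/cron.daily/logrotate
  grep logrotate /var/log/cron    # RHEL/CentOS; on Ubuntu, check /var/log/syslog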

 

~B

 

From: ica-ato...@googlegroups.com <ica-ato...@googlegroups.com> On Behalf Of GS
Sent: November 14, 2018 10:08 PM
To: AtoM Users <ica-ato...@googlegroups.com>
Subject: Re: [atom-users] job appears to run twice

 

Hi Dan,

Regarding the slowness of the import: is it possible that there are multiple re-indexing threads being run simultaneously?

I've just done a test import on our dev server without indexing, then started an index run on the command line.

This doesn't seem to max out mysqld, which does happen when importing and indexing via the web interface.
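For reference, a minimal command-line version of that test might look like the sketch below. It assumes AtoM 2.x's csv:import and search:populate tasks run from the AtoM root, and the CSV path shown is hypothetical:

  cd /usr/share/nginx/atom

  # run the CSV import from the CLI (no web job scheduler involved)
  php symfony csv:import /path/to/test-import.csv

  # rebuild/update the search index afterwards in a single pass
  php symfony search:populate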

 

Regards,

 

George
