Problems with import logs not being cleared between imports


quenti...@gmail.com

Aug 8, 2018, 3:47:06 AM
to AtoM Users
Hello !
I am encountering a weird behavior when importing XML (EAD 2002) files into AtoM 2.4.
I'd like to note that I am setting up AtoM for a colleague, and that I don't really know the software in depth. So the "problem" may very well be related to a configuration parameter.

We want to import some XML files into AtoM. This is the first time these files are imported, and unfortunately, some entries are incorrect. We can easily fix them by reading the job log.

However, the behavior of these jobs is not very user-friendly so far due to the following issue:
On the first try, I import my XML file, and the job returns the following log :

Aug 08 07:05:07 atomefa2 php[24699]: 2018-08-08 00:05:07 > Job started.
Aug 08 07:05:07 atomefa2 php[24699]: 2018-08-08 00:05:07 > Importing XML file: AMATH5.xml.
Aug 08 07:05:07 atomefa2 php[24699]: 2018-08-08 00:05:07 > Indexing imported records.
Aug 08 07:05:07 atomefa2 php[24699]: 2018-08-08 00:05:07 > Update type: import-as-new
Aug 08 07:05:08 atomefa2 php[24699]: 2018-08-08 00:05:08 > libxml error 504 on line 5 in input file: Element eadheader content does not follow the DTD, expecting (eadid , filedesc , profiledesc? , revisiondesc?), got (eadid itemdesc proitemdesc )
Aug 08 07:05:08 atomefa2 php[24699]: 2018-08-08 00:05:08 > libxml error 534 on line 7 in input file: No declaration for element itemdesc
Aug 08 07:05:08 atomefa2 php[24699]: 2018-08-08 00:05:08 > libxml error 534 on line 23 in input file: No declaration for element proitemdesc
Aug 08 07:05:08 atomefa2 php[24699]: 2018-08-08 00:05:08 > Creating a new record: Amathonte
Aug 08 07:05:08 atomefa2 php[24699]: 2018-08-08 00:05:08 > Creating a new record: Rapports et correspondance
Aug 08 07:05:08 atomefa2 php[24699]: 2018-08-08 00:05:08 > Creating a new record: Fouilles Amathonte 1969-1996.
Aug 08 07:05:08 atomefa2 php[24699]: 2018-08-08 00:05:08 > Import complete.
Aug 08 07:05:08 atomefa2 php[24699]: 2018-08-08 00:05:08 > Job finished.


On the 2nd try, I tried to import the exact same file for this example. Note the log:

Aug 08 07:05:51 atomefa2 php[24699]: 2018-08-08 00:05:51 > Job started.
Aug 08 07:05:51 atomefa2 php[24699]: 2018-08-08 00:05:51 > Importing XML file: AMATH5.xml.
Aug 08 07:05:51 atomefa2 php[24699]: 2018-08-08 00:05:51 > Indexing imported records.
Aug 08 07:05:51 atomefa2 php[24699]: 2018-08-08 00:05:51 > Update type: import-as-new
Aug 08 07:05:51 atomefa2 php[24699]: 2018-08-08 00:05:51 > libxml error 504 on line 5 in input file: Element eadheader content does not follow the DTD, expecting (eadid , filedesc , profiledesc? , revisiondesc?), got (eadid itemdesc proitemdesc )
Aug 08 07:05:51 atomefa2 php[24699]: 2018-08-08 00:05:51 > libxml error 534 on line 7 in input file: No declaration for element itemdesc
Aug 08 07:05:51 atomefa2 php[24699]: 2018-08-08 00:05:51 > libxml error 534 on line 23 in input file: No declaration for element proitemdesc
Aug 08 07:05:51 atomefa2 php[24699]: 2018-08-08 00:05:51 > libxml error 504 on line 5 in input file: Element eadheader content does not follow the DTD, expecting (eadid , filedesc , profiledesc? , revisiondesc?), got (eadid itemdesc proitemdesc )
Aug 08 07:05:51 atomefa2 php[24699]: 2018-08-08 00:05:51 > libxml error 534 on line 7 in input file: No declaration for element itemdesc
Aug 08 07:05:51 atomefa2 php[24699]: 2018-08-08 00:05:51 > libxml error 534 on line 23 in input file: No declaration for element proitemdesc
Aug 08 07:05:51 atomefa2 php[24699]: 2018-08-08 00:05:51 > libxml error 504 on line 5 in input file: Element eadheader content does not follow the DTD, expecting (eadid , filedesc , profiledesc? , revisiondesc?), got (eadid itemdesc proitemdesc )
Aug 08 07:05:51 atomefa2 php[24699]: 2018-08-08 00:05:51 > libxml error 534 on line 7 in input file: No declaration for element itemdesc
Aug 08 07:05:51 atomefa2 php[24699]: 2018-08-08 00:05:51 > libxml error 534 on line 23 in input file: No declaration for element proitemdesc
Aug 08 07:05:51 atomefa2 php[24699]: 2018-08-08 00:05:51 > Creating a new record: Amathonte
Aug 08 07:05:51 atomefa2 php[24699]: 2018-08-08 00:05:51 > Creating a new record: Rapports et correspondance
Aug 08 07:05:51 atomefa2 php[24699]: 2018-08-08 00:05:51 > Creating a new record: Fouilles Amathonte 1969-1996.
Aug 08 07:05:51 atomefa2 php[24699]: 2018-08-08 00:05:51 > Import complete.
Aug 08 07:05:51 atomefa2 php[24699]: 2018-08-08 00:05:51 > Job finished.


It seems to me that the log is not cleared, and that the previous run's errors are carried over into the new one. The only way I have found so far to get only the latest errors is to restart the atom-worker between each import.
In our case, the first XML file logged a few hundred errors due to a misspelled attribute that appeared almost everywhere.
We fixed them, but could not understand why the job kept outputting the same errors.
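
For anyone trying to tell stale errors from fresh ones, a small script can count how often each libxml message appears within a single job's log. This is only a sketch: the function name `repeated_libxml_errors` and the regular expression are my own assumptions, modelled on the log lines quoted above, and are not part of AtoM. It should be fed the lines of one job only (from "Job started" to "Job finished").

```python
import re
from collections import Counter
from typing import Dict, Iterable

# Assumed pattern, based on the log lines quoted in this thread:
# "... > libxml error <code> on line <n> in input file: <message>"
LIBXML_LINE = re.compile(r"> (libxml error \d+ on line \d+ in input file: .*)$")


def repeated_libxml_errors(log_lines: Iterable[str]) -> Dict[str, int]:
    """Return each libxml message that occurs more than once, with its count."""
    counts: Counter = Counter()
    for line in log_lines:
        match = LIBXML_LINE.search(line.rstrip())
        if match:
            counts[match.group(1)] += 1
    # A message reported several times for the same input line in one job
    # suggests that errors from an earlier run were carried over.
    return {message: n for message, n in counts.items() if n > 1}


if __name__ == "__main__":
    prefix = "Aug 08 07:05:51 atomefa2 php[24699]: 2018-08-08 00:05:51 > "
    sample = [
        prefix + "Job started.",
        prefix + "libxml error 534 on line 7 in input file: No declaration for element itemdesc",
        prefix + "libxml error 534 on line 7 in input file: No declaration for element itemdesc",
        prefix + "Job finished.",
    ]
    for message, n in repeated_libxml_errors(sample).items():
        print(f"{n}x {message}")
```

A message that shows up more than once for the same line of the input file is a hint that the log contains leftovers from a previous import, rather than new problems in the file.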

So, is that normal? If it is, is there a way to ask AtoM to "clear" the log of a given file before a new import?

Thanks a lot,
Quentin

Dan Gillean

Aug 9, 2018, 4:06:19 PM
to ICA-AtoM Users
Hi Quentin, 

This is not normal! I think you've actually found a bug. It looks like, when an XML import job is run from the WebUI, the $libxmlerror value is not being cleared between runs, meaning that another import of the same content will accidentally append the warnings and errors into the next console output. 

We've filed a bug ticket here: 
Ultimately, this shouldn't prevent your import from progressing, but it's not ideal. As you may know, we rely on community support to sponsor or submit new development in AtoM - you can read more about how we maintain and develop AtoM here: 

If this is a priority for your institution, and you are interested in sponsoring an immediate fix, please feel free to contact me off-list, and we can prepare an estimate for you. Otherwise, I have added it to a list of community-reported bugs for consideration in our next release. 

Regards, 


Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory

--
You received this message because you are subscribed to the Google Groups "AtoM Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ica-atom-users+unsubscribe@googlegroups.com.
To post to this group, send email to ica-atom-users@googlegroups.com.
Visit this group at https://groups.google.com/group/ica-atom-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/ica-atom-users/34c1962e-415a-4f03-8d17-4d1220ee4a66%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

quenti...@gmail.com

Aug 10, 2018, 3:44:05 AM
to AtoM Users
Hello Dan !
I managed to get some time this morning to focus on the problem.
In addition to what was said, it seems that when running the import from the shell using the symfony tasks, errors are carried over from one file to the next within a single run. Fortunately, they are cleared between executions of the symfony task.

This, together with the details provided in ticket #12370, led me to a simple fix that resolves the problem both in the WebUI and the shell.
I therefore opened a PR on the GitHub repository: https://github.com/artefactual/atom/pull/772

Unfortunately, I don't have time to run extensive tests. Also, I hope the PR fits your guidelines. I'll keep an eye on it in case there are any comments or problems!

Best Regards,
Quentin

Dan Gillean

Aug 10, 2018, 1:18:58 PM
to ICA-AtoM Users
Hi Quentin, 

Thank you so much! We've received your pull request and your signed contributor's agreement; one of our developers has done an initial review with you, and the work has now been merged into both our qa/2.5.x and stable/2.4.x branches. I will do some testing soon, and will let you know if there are any issues via the PR. Once it's verified, we'll add your name to our Community contributors list, and will also credit you on the 2.4.1 release page next to the issue. 

We're very grateful for how proactive you have been in reporting AND fixing this issue! 

Cheers, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
