export XML EAD : error on line 2 at column 1: Extra content at the end of the document


Nam Pham

Aug 11, 2021, 9:53:17 AM
to AtoM Users
Hi,
I am having trouble exporting an XML file. When I click on Export "XML EAD 2020", I get the error message shown in the attached screenshot (error_exporting.png).
If I download from the clipboard, the XML file is empty.
It only happens with records that I imported from a CSV file. The CSV seems to be correctly formed, because AtoM accepts it.

Thanks for your help,

Nam 

Dan Gillean

Aug 13, 2021, 10:06:52 AM
to ICA-AtoM Users
Hi Nam, 

If this is only happening with CSV imported files, then I suspect the issue is not a bug in the XML export code, but rather something in the CSV. 

Are you using Excel to prepare your CSV? If so, and if this is only happening with files imported via CSV, then it's possible that the CSV contains non-UTF-8 characters. In some cases, it might be possible to create a CSV that's valid enough to import, but still contains characters that can cause problems later. This can also happen if you've prepared your data by cutting and pasting it into a spreadsheet from something like a Word document that is not saved with UTF-8 encoding. 
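
As a rough illustration of what "non-UTF-8 characters" means in practice, here is a minimal Python sketch that reports which lines of a CSV contain bytes that do not decode as UTF-8. This is not part of AtoM; the function name `find_invalid_utf8` and the sample data are made up for the example, and it counts physical lines, so a quoted field with an embedded line break will span more than one line number.

```python
def find_invalid_utf8(data: bytes):
    """Return (line_number, byte_offset, raw_line) for each line that is not valid UTF-8."""
    problems = []
    # bytes.splitlines() splits on \n, \r and \r\n, so Windows line endings are handled too.
    for line_no, raw in enumerate(data.splitlines(), start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError as err:
            # err.start is the offset of the first undecodable byte within this line.
            problems.append((line_no, err.start, raw))
    return problems


# Hypothetical sample: the second line contains "é" saved as Windows-1252 (byte 0xE9).
sample = b"title,scopeAndContent\nCaf\xe9,Some notes\n"
for line_no, offset, raw in find_invalid_utf8(sample):
    print(f"line {line_no}, byte offset {offset}: {raw!r}")
```

To check a real file, you could pass the result of `pathlib.Path("your-file.csv").read_bytes()` to the function (the file name is a placeholder).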

In general, we recommend using LibreOffice Calc for CSV preparation. By default, Excel tends to use Windows-style line endings and a Microsoft-specific character encoding (such as Windows-1252), or else it adopts the computer's default encoding, which is not always UTF-8. These can lead to CSV import issues if not properly addressed. Later versions of Excel now include a CSV UTF-8 save option, so the situation is much improved, but we still hear of issues due to this. Not only is Calc open source, but it also lets you set the character encoding, separator character, and string delimiter every time you open a CSV, with a preview of how the data will look in the spreadsheet, so you can confirm it is displayed correctly. It also makes saving with the correct encoding and line endings much simpler.
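
If re-saving from Calc is not an option, the same conversion can be sketched in a few lines of Python. This is only an illustration under two assumptions: that the file really was saved as Windows-1252 (if it was not, the text will decode without error but show the wrong characters), and that Unix-style line endings are wanted. The function name `to_utf8_lf` is invented for the example.

```python
def to_utf8_lf(data: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode bytes from the assumed source encoding and return UTF-8 with \\n line endings."""
    # Raises UnicodeDecodeError if a byte is undefined in the source encoding.
    text = data.decode(source_encoding)
    # Normalize Windows (\r\n) and old Mac (\r) line endings to Unix (\n).
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.encode("utf-8")


# Hypothetical sample: Windows-1252 bytes with Windows line endings.
original = b"title\r\nCaf\xe9\r\n"
converted = to_utf8_lf(original)
print(converted.decode("utf-8"))
```

Write the result to a new file rather than overwriting the original, so you can compare the two if something looks wrong after import.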

Are you able to edit the record in AtoM's interface? If it is caused by unexpected characters, you may be able to identify and fix them in edit mode, since AtoM's display is UTF-8. Non-UTF-8 characters may be visible, allowing you to delete and re-type them.

Alternatively, are you able to export this problem record as a CSV file? It may be possible to open it with UTF-8 encoding and identify problem characters. Opening it in a text editor instead of a spreadsheet would let you see the raw input, which may make finding and fixing any problem characters easier. However, if you do this, be very careful about editing it this way! If you accidentally delete a comma separator or a string delimiter, you could end up making the CSV malformed, so be sure to Save As and keep the original.
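
One way to catch an accidentally deleted separator after hand-editing is to check that every row still has the same number of fields as the header row. The following is a minimal sketch using Python's standard `csv` module, not an AtoM feature; the function name `rows_with_wrong_field_count` is invented for the example, and the two column names in the sample are only there for illustration.

```python
import csv
import io


def rows_with_wrong_field_count(csv_text: str):
    """Return (line_number, field_count) for rows whose field count differs from the header."""
    # newline="" lets the csv module handle line breaks inside quoted fields itself.
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    header = next(reader, None)
    if header is None:
        return []
    expected = len(header)
    mismatches = []
    for row in reader:
        # Blank lines come back as empty rows; skip them rather than flag them.
        if row and len(row) != expected:
            # reader.line_num is the last physical line read for this row.
            mismatches.append((reader.line_num, len(row)))
    return mismatches


# Hypothetical sample: the third line has lost its comma separator.
edited = "title,scopeAndContent\nFonds A,Some notes\nFonds B Some notes\n"
print(rows_with_wrong_field_count(edited))
```

An empty list means every row matches the header, which does not prove the data is right but does rule out the most common editing accident.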

I will consult with our team and see if they have any other suggestions. Let us know how it goes in the meantime. 

Cheers, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him



Nam Pham

Sep 22, 2021, 6:04:02 AM
to AtoM Users
Hi Dan,
Sorry for my late reply.
Everything works fine now. I deleted the non-UTF-8 characters in my CSV file with Sublime Text.

Thanks again,
Cheers.
Nam Pham
Centre des littératures en Suisse romande - UNIL
