--
You received this message because you are subscribed to the Google Groups "AtoM Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ica-atom-user...@googlegroups.com.
To post to this group, send email to ica-ato...@googlegroups.com.
Visit this group at https://groups.google.com/group/ica-atom-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/ica-atom-users/d736ddc5-ee35-48e5-a9d6-e61e1036b5a8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
On 13 Dec 2018, at 2:21 am, Dan Gillean <d...@artefactual.com> wrote:
Hi Sarah,
My first guess on seeing this is that you've used Microsoft Excel to prepare a CSV for import. Is this correct?
As noted in our documentation (here), AtoM expects a CSV that is UTF-8 encoded with unix-style line endings. Unfortunately, Microsoft has had a tendency to ignore de facto standards and go its own way, and by default Excel sheets are encoded with a custom character encoding (WinLatin) and use Windows-style line endings. These line ending characters are not typically visible in the spreadsheet (you would need to open the CSV in a text editor and enable the display of special characters). It seems that sometimes empty rows in an Excel spreadsheet are interpreted as rows to be imported, causing this issue.
It is technically possible to use Excel if you dig deep enough into the settings and customize some aspect (they don't make it easy), and I've found that editing a well-formed CSV created outside of Excel won't mess it up. But if you are preparing new records for import, I strongly recommend using something like LibreOffice Calc, which allows you to set the character encoding every time you open the file, and uses the correct line ending glyphs. It's also open source!
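If you want to inspect a file before importing it, the checks described above can be sketched in a few lines of Python. This is only an illustration and not part of AtoM: the function name `inspect_csv_bytes` and the report keys are made up for this example, and the fallback to Windows-1252 is an assumption about what Excel most likely produced.

```python
import csv
import io


def inspect_csv_bytes(raw: bytes) -> dict:
    """Report encoding, line-ending, and blank-row findings for raw CSV bytes.

    Hypothetical helper for illustration only; it is not an AtoM task.
    """
    report = {}

    # A UTF-8 byte-order mark decodes fine but is worth flagging.
    report["has_bom"] = raw.startswith(b"\xef\xbb\xbf")

    try:
        text = raw.decode("utf-8-sig")
        report["is_utf8"] = True
    except UnicodeDecodeError:
        # Assumption: fall back to Windows-1252 only so the other checks can run.
        text = raw.decode("cp1252", errors="replace")
        report["is_utf8"] = False

    # Windows-style (CRLF) and bare carriage-return line endings.
    report["crlf_count"] = raw.count(b"\r\n")
    report["bare_cr_count"] = raw.count(b"\r") - report["crlf_count"]

    # Rows in which every cell is empty or whitespace only.
    rows = csv.reader(io.StringIO(text, newline=""))
    report["blank_rows"] = sum(
        1 for row in rows if not any(cell.strip() for cell in row)
    )
    return report


if __name__ == "__main__":
    # Sample bytes as Excel might save them: Windows-1252, CRLF, one empty row.
    sample = b"title,level\r\nCaf\xe9,fonds\r\n,\r\n"
    print(inspect_csv_bytes(sample))
```

A non-zero `blank_rows` count, or `is_utf8` being false, would be a reason to clean the file up before running an import.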
We have seen this happen to users before when importing Excel-prepared CSV files, which is unfortunate in general but fortunate for you, because we've helped users resolve this in the past!
Please see this forum thread for a script you can use that will delete all empty descriptions:
Please use it at your own discretion - while we've tested it, it comes with no guarantees. We will not be responsible if something goes wrong, and we STRONGLY recommend that you back up your database first, just in case!
Regards,
On Wed, Dec 12, 2018 at 2:15 AM <sarah.le...@anu.edu.au> wrote:
Hello
Recently a lot (thousands) of records with no metadata have appeared in our database. Here is an example: http://archivescollection.anu.edu.au/index.php/fx9p-ny7t-2az9 Has anyone else had this problem, and can anyone suggest what's causing it? We suspect it's related to uploading item descriptions using CSV. Perhaps there are too many rows in the CSV sheet? So my two questions are: can anyone suggest how we can stop it recurring, and is there a way to do a bulk delete of all the thousands of records with no metadata?
Thanks for your help.
Sarah
First, that reporting about character encoding, line endings, and separator characters used in the CSV be added to the output of the check-import task.
Second, that the csv:check-import task include a --fix option that attempts to clean up unexpected line endings and character encodings, and possibly convert different separator characters to AtoM's expected ones as well (i.e. commas). I would suggest that when run, the task outputs the converted CSV in the same location as the original, with "_fixed" appended to the filename. Users could then choose to re-run the check task against the fixed version to see an updated output and confirm that the conversion was successful, or proceed directly with the import.
Third, that checks for character encoding and line endings be incorporated into the CSV import task. If AtoM's current expectations for these are not found, then by default the import is halted and an error message outlining the issue (e.g. "CSV is not UTF-8 encoded" etc) is provided.
Fourth, that a --fix option be included in the csv:import task. When used, the user is first prompted with a warning, and asked if they have made a backup first (y/n must be entered. We could possibly allow this to be skipped if a --force option is included as well, for scripting purposes, but only from the CLI). When yes is selected, AtoM will attempt to fix any encoding/line ending/separator issues prior to proceeding with the import.
In the user interface, this could be a checkbox available to administrators that says "Fix CSV issues on import" that is selected when configuring the CSV import. When checked, it could immediately trigger a warning modal that appears, encouraging users to make sure they have a backup first (note: the idea of adding functionality to allow administrators to automatically generate a SQL database dump backup on import, store it temporarily, and possibly even load the backup automatically if the import fails has also been discussed, and would pair well with this feature).
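To make the proposed --fix behaviour concrete, here is a rough Python sketch of the conversion step: decode the file, normalise line endings, and write a "_fixed" copy next to the original. Nothing here is existing AtoM code. The function names are hypothetical, the Windows-1252 fallback is an assumption, and separator conversion is left out entirely.

```python
from pathlib import Path


def fixed_path(path: Path) -> Path:
    """Return the sibling path with "_fixed" appended to the file name stem."""
    return path.with_name(f"{path.stem}_fixed{path.suffix}")


def normalise_csv_bytes(raw: bytes, fallback_encoding: str = "cp1252") -> bytes:
    """Return the CSV content as UTF-8 bytes with unix-style line endings.

    Assumption: input that is not valid UTF-8 was saved as Windows-1252.
    """
    try:
        # "utf-8-sig" also strips a leading byte-order mark if present.
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        text = raw.decode(fallback_encoding, errors="replace")

    # Convert CRLF first, then any remaining bare CR, to LF.
    # Note: this also rewrites line breaks inside quoted cells.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.encode("utf-8")


def write_fixed_copy(path: Path) -> Path:
    """Write a normalised copy beside the original and return its path."""
    out = fixed_path(path)
    out.write_bytes(normalise_csv_bytes(path.read_bytes()))
    return out
```

Writing to a separate file rather than overwriting keeps the original available for comparison, which matches the suggestion of re-running the check task against the fixed version.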