CSV imports missing items

kjam...@ualberta.ca

Aug 26, 2016, 9:28:22 AM
to AtoM Users
We've been doing CSV imports, sometimes through the GUI and sometimes on the command line, and for whatever reason random lines of the CSV don't import. When we've identified them, we can import them from a new CSV without changing them at all, so it's not the data that's causing the issue. Any idea why this is happening, or what we can do to stop it? It's not a lot of lines, but having to check them one by one to figure out which ones are missing is time-consuming.

Dan Gillean

Aug 26, 2016, 4:24:43 PM
to ICA-AtoM Users
Hi there,

To my knowledge, I have not encountered this issue before! What version of AtoM are you using? Have you followed our recommended installation guidelines, or is there anything different or particular about your installation that we should know about?

Are you getting any warnings or errors in the console when importing via the CLI? Is it possible you have included a child description before its parent in the CSV? AtoM will progress through the rows in order, so if a child appears before the parent it references, AtoM won't know what to do with that record and it's possible it could end up being skipped (though IIRC this usually just halts the import).
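
The ordering question above can be checked before an import with a short script. This is only a minimal sketch, not part of AtoM: it assumes the CSV uses the `legacyId` and `parentId` columns from the AtoM description import template, and the function name `find_children_before_parents` is made up for illustration. It reports every row whose parent has not yet appeared earlier in the same file.

```python
import csv
import io


def find_children_before_parents(csv_text, id_col="legacyId", parent_col="parentId"):
    """Return (row_number, legacy_id, parent_id, reason) for each row whose
    parent has not appeared earlier in the file. Row 1 is the header row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    all_ids = {(r.get(id_col) or "").strip() for r in rows}
    seen = set()
    problems = []
    for number, row in enumerate(rows, start=2):
        legacy_id = (row.get(id_col) or "").strip()
        parent = (row.get(parent_col) or "").strip()
        if parent and parent not in seen:
            # Distinguish "listed further down" from "not in this file at all".
            reason = "parent appears later" if parent in all_ids else "parent not in file"
            problems.append((number, legacy_id, parent, reason))
        seen.add(legacy_id)
    return problems


sample = "legacyId,parentId,title\n1,,Fonds\n3,2,File\n2,1,Series\n4,9,Orphan\n"
for problem in find_children_before_parents(sample):
    print(problem)
```

A "parent not in file" result is not necessarily an error; it only means the parent has to be matched some other way, for example against a record that already exists in the database.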

Having not seen this before, I'm really not sure what to suggest! Do you have a test instance where you can try to reproduce this? e.g. if you purge your data and re-import the same CSV, will it drop the same rows, or different/random ones?

Any further information you can provide will help us help you try to figure out what's going on. Thanks!


Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory

kjam...@ualberta.ca

Aug 29, 2016, 4:00:00 PM
to AtoM Users
We are using version 2.2.1 and, as far as I know, it's been a straightforward installation. I know we have the whole thing running on one VM, though, and I know you recommend running Elasticsearch on a separate VM. I'm not sure if that's relevant to this issue.

There haven't been any warnings out of the ordinary (even with a perfect upload it tends to give one warning). For nesting via CSV we have actually just been using the qubitParentSlug field: we import top-level records in one CSV, then series-level records in a second CSV, and file descriptions in a third, so that we don't have to make sure child/parent relationships are accurate within the same spreadsheet.

I'll play around with the test instance and see if there is any rhyme or reason to which lines are being dropped. 
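
To avoid checking rows one by one, the source CSV can be compared against a list of what actually made it into AtoM. The following is only a sketch under assumptions: it presumes both the source file and some export of the imported records share an `identifier` column, and `find_missing_rows` is a hypothetical helper, not an AtoM tool.

```python
import csv
import io
from collections import Counter


def find_missing_rows(source_csv, imported_csv, key="identifier"):
    """Return (row_number, key_value) for source rows with no counterpart in
    the imported data. Duplicate key values are matched one-for-one."""
    imported = Counter(
        (row.get(key) or "").strip()
        for row in csv.DictReader(io.StringIO(imported_csv))
    )
    missing = []
    for number, row in enumerate(csv.DictReader(io.StringIO(source_csv)), start=2):
        value = (row.get(key) or "").strip()
        if imported[value] > 0:
            imported[value] -= 1  # consume one matching imported record
        else:
            missing.append((number, value))
    return missing


source = "identifier,title\nA-1,One\nA-2,Two\nA-3,Three\n"
imported = "identifier,title\nA-1,One\nA-3,Three\n"
print(find_missing_rows(source, imported))
```

This only detects rows that are absent entirely; a row that imported but attached to the wrong place in the hierarchy would still count as present.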

kjam...@ualberta.ca

Aug 29, 2016, 6:29:10 PM
to AtoM Users, kjam...@ualberta.ca
I've reimported a CSV and of course can't get it to replicate what it had done before. The warning message we always receive when importing CSV reads:

Warnings were encountered:

PHP Warning: Module 'apc' already loaded in Unknown on line 0
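
As a general PHP observation (not something confirmed in this thread), a "Module ... already loaded" warning commonly means the same extension is declared in more than one ini file. A minimal sketch for spotting that follows; the file names are invented, and `find_duplicate_extensions` is a hypothetical helper that takes the ini contents as text rather than reading any particular path.

```python
import re
from collections import defaultdict

# Matches active lines such as: extension=apc.so  or  extension="apc.so"
# Commented-out lines (starting with ';') do not match.
EXTENSION_LINE = re.compile(r'^\s*extension\s*=\s*"?([^\s";]+)"?', re.IGNORECASE)


def find_duplicate_extensions(ini_files):
    """ini_files maps a file name to its text. Returns a dict of
    extension name -> list of files, for extensions declared more than once."""
    loaded = defaultdict(list)
    for filename, text in ini_files.items():
        for line in text.splitlines():
            match = EXTENSION_LINE.match(line)
            if match:
                name = match.group(1).rsplit("/", 1)[-1]  # drop any directory part
                name = re.sub(r"\.(so|dll)$", "", name)   # drop the file suffix
                loaded[name].append(filename)
    return {name: files for name, files in loaded.items() if len(files) > 1}


example = {
    "php.ini": "extension=apc.so\n",
    "conf.d/apc.ini": "; APC settings\nextension=apc.so\napc.enabled=1\n",
    "conf.d/curl.ini": "extension=curl.so\n",
}
print(find_duplicate_extensions(example))
```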

I looked into the last time this happened (last week). The affected row was a file in the middle of the CSV, one of a dozen or so files with the same slug. For some reason it imported but did not nest alongside the others with the exact same slug, and was left free-floating. It's unclear how many of the rows that we deem missing have actually imported but not nested. But as I said, when we copy and paste the CSV row into another file and re-upload it without any changes it works fine, so I have no reason to suspect that it is a data issue.

Hope the extra info is useful!

Dan Gillean

Aug 30, 2016, 2:55:08 PM
to ICA-AtoM Users
Hi there,

Thanks for this extra information!

I would definitely suggest that you use the command-line for your CSV imports, to rule out the possibility that any of this is caused by timeouts. Note that when an import times out via the web browser, it can sometimes leave corrupted data in your database - which could further be adding to your problems! There are some previous posts in the forum that offer some simple tasks that can resolve the most common of these issues, and some more detailed ones if you want to deal with it at the database level - here's a post that summarizes:

I haven't seen that APC warning before either, and so far neither have the developers here whom I've spoken with - so I'm wondering if it might be related? Do you know what version of PHP you are running? With 5.5 or later, you'll also need to have APCu installed - I've previously tried to outline the difference between APC and APCu in relation to PHP versions in this thread:

You can run php -m to see a list of all the PHP modules you have installed. If you're running PHP 5.5 or higher, you should ensure that you have both apc and apcu in that list.
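
The module check described above can be scripted if you want to repeat it across machines. This is just a sketch: `missing_apc_modules` is a made-up helper that takes the text printed by `php -m` plus a version string, and it encodes only the rule stated above (both apc and apcu for PHP 5.5 or higher). Requiring only apc on older versions is an assumption.

```python
def missing_apc_modules(php_m_output, php_version):
    """Return the sorted list of expected APC-related modules that are absent
    from the module list printed by `php -m`."""
    modules = {
        line.strip().lower()
        for line in php_m_output.splitlines()
        # Skip blank lines and section headers such as "[PHP Modules]".
        if line.strip() and not line.strip().startswith("[")
    }
    major_minor = tuple(int(part) for part in php_version.split(".")[:2])
    required = {"apc"}  # assumption: apc alone is enough before PHP 5.5
    if major_minor >= (5, 5):
        required.add("apcu")  # rule from the post above
    return sorted(required - modules)


listing = "[PHP Modules]\napc\ncurl\n\n[Zend Modules]\n"
print(missing_apc_modules(listing, "5.3.3"))
print(missing_apc_modules(listing, "5.6.0"))
```

The listing can be captured with `subprocess.run(["php", "-m"], capture_output=True, text=True).stdout` on a machine where the `php` binary is on the PATH.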

You might also want to see if you can increase the memory allocation in your VM - it's possible that the memory is being exhausted during the import process and this is causing issues? Just spitballing at this point.

Regards,


Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory

kjam...@ualberta.ca

Aug 30, 2016, 3:29:46 PM
to AtoM Users
Thanks! I have passed this on to our IT people. Apparently we are running PHP 5.3.3, so they are looking into it.

Our plan is to mostly use the command line for importing. However, the case I described, where a row imported but didn't link via the slug even though everything else with the same slug nested properly, was a command-line import. Hopefully it's just some weird bug with our setup. I will follow up when I hear back from IT.