ORA csv import limits


Michael Grass

Apr 27, 2020, 12:51:00 PM
to ORA Google Group
I'm attempting to import CSV data from a Twitter scrape export.  I tested the import on a smaller dataset (~500 records) and it imported without issue.  My target dataset has about 500k records.  ORA is able to view the attribute values from my top line, but when I select 'Finish' nothing happens, even after waiting 30 min.  Is there an upper threshold for the number of records I can batch import?  Should I break it into smaller sections?
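
If breaking the file into smaller sections turns out to be useful, one way to do it is sketched below in Python. This is only an illustration, not an ORA feature: the function names, the `.partN.csv` naming, and the 10,000-row default chunk size are all assumptions. Each part repeats the header row so it can be imported on its own.

```python
import csv
import io
from itertools import islice


def split_csv_rows(reader, chunk_size):
    """Yield lists of rows; every list starts with the header row."""
    header = next(reader, None)
    if header is None:          # empty input: nothing to yield
        return
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            break
        yield [header] + chunk


def write_chunks(src_path, chunk_size=10_000):
    """Write src_path out as <src_path>.part1.csv, .part2.csv, ... (hypothetical naming)."""
    paths = []
    with open(src_path, newline="", encoding="utf-8") as src:
        parts = split_csv_rows(csv.reader(src), chunk_size)
        for i, rows in enumerate(parts, start=1):
            out_path = f"{src_path}.part{i}.csv"
            with open(out_path, "w", newline="", encoding="utf-8") as out:
                csv.writer(out).writerows(rows)
            paths.append(out_path)
    return paths


# Small in-memory demonstration with three data rows and a chunk size of two.
sample = io.StringIO("id,text\n1,a\n2,b\n3,c\n")
chunks = list(split_csv_rows(csv.reader(sample), chunk_size=2))
print(len(chunks))  # 2
```

Importing the parts one at a time also narrows down which range of records contains a problem row.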

Kathleen Carley

Apr 27, 2020, 2:22:18 PM
to ORA List
It depends on which version of ORA you have.  If it is the one from the CASOS website (ORA-LITE), there is a limit of 2000.
If it is ORA-PRO, there should not be a limit.


--
Computational Analysis of Social and Organizational Systems (CASOS)
The Institute for Software Research, Department of Computer Science
Carnegie Mellon University, Pittsburgh, PA, USA
http://casos.cs.cmu.edu/
---
You received this message because you are subscribed to the Google Groups "ORA Google Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ORA-google-gro...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ORA-google-group/7e232418-a675-4192-8768-06506e60538c%40googlegroups.com.

Jeff Reminga

Apr 27, 2020, 2:30:09 PM
to ORA-goog...@googlegroups.com
Hi Michael,

Importing CSV (assuming you have ORA-PRO) has no limits except memory; if there is sufficient memory, importing 500k records should take a minute or two.
Sometimes importing CSV fails because there is an incorrectly embedded quotation mark (" or ') in a cell that makes the parser read all the text until the next quotation mark.
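
A rough way to look for such stray quotation marks before importing is sketched below in Python. It is a heuristic under stated assumptions, not how ORA's parser works: it flags any physical line with an odd number of quote characters, so a legitimately quoted cell that spans several lines would also be flagged. The function name is illustrative.

```python
def lines_with_odd_quotes(lines, quote='"'):
    """Return 1-based numbers of lines containing an odd count of `quote`."""
    return [
        number
        for number, line in enumerate(lines, start=1)
        if line.count(quote) % 2 == 1
    ]


# The second line has three double quotes, so the parser would keep reading
# past the end of the cell looking for a closing quote.
sample = ['id,bio\n', '1,"likes 6" records"\n', '2,"fine"\n']
print(lines_with_odd_quotes(sample))  # [2]
```

For a file on disk, the same function can be given an open file object, for example `lines_with_odd_quotes(open(path, encoding="utf-8"))`, and run a second time with `quote="'"` if single quotes are suspected.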

Thanks,
Jeff


Michael Grass

Apr 27, 2020, 2:40:55 PM
to ORA Google Group
Thanks for the reply, Dr. Carley and Jeff.  

I'm using ORA Pro and wasn't expecting there to be any limitations, other than hardware.  I'm on version 3.0.9.9.100.  I've given it full access to 16 cores and 64GB of RAM.  If I attempt to import the csv it reads in the headers and I'm able to choose the formats but then stalls on import.  If I attempt to import the xls data dump it stalls immediately.  I've given it a little over an hour with no visible changes in the GUI.  

Is there a place it stashes logs so I can see better what's happening under the hood?





Michael Grass

Apr 27, 2020, 4:34:19 PM
to ORA Google Group

I eventually got an error message on the Excel import attempt, "Please select at least one attribute to import."  

I found that ORA writes logs to the user home directory on OS X, but I don't see anything useful in the logs it writes there.

Jeff Reminga

Apr 28, 2020, 8:56:29 AM
to ORA-goog...@googlegroups.com
Hi Michael,

Are you importing an Excel file or a CSV file?
Large Excel files (.xlsx) require a lot of memory when importing, whereas if you first use Save As in Excel to convert the file to TSV or CSV, you can import that file easily.
Thanks,
Jeff


Kathleen Carley

Apr 28, 2020, 9:06:14 AM
to ORA List
Why would ORA have made the comment about an attribute?

Michael Grass

Apr 28, 2020, 10:02:41 AM
to ORA Google Group
Thanks for continuing the thread this morning.  I'm having more luck with the CSV, but the XLS file is getting me nowhere.  It's possible I botched it when I wrote my data out of Python, so I'm gonna scrap trying to use it.  ORA was throwing the attribute error on the XLS import attempt.  It makes zero sense to me, but I took 10k records from my CSV file and copy/pasted them into another file, without changing any content, and that seems to have worked.  

I'm working off of a new MacBook with 16 cores and 64GB of RAM so I may try copying more data like that and try it again.  

Again, I really appreciate the help.  


Kathleen Carley

Apr 28, 2020, 10:08:15 AM
to ORA List
No worries - if CSV works, great - we generally use that for large files, not XLS.
If not, feel free to share the data and we can try to figure out what happened.


Michael Grass

Apr 28, 2020, 10:55:06 AM
to ORA Google Group

Dr. Carley, if I may, I'd love to have a second (or more) set of eyes on this file.  Kristen Kerr wasn't able to do anything with it either, so it's possible my data is bad in some way that I can't see.  It's a 56 MB file and the forum won't take it here.  Is there somewhere else that's easy for you to access?  A Slack channel, perhaps?

Kathleen Carley

Apr 28, 2020, 11:18:41 AM
to ORA List
Any chance you can put it on Dropbox?


Michael Grass

Apr 28, 2020, 11:49:26 AM
to ORA Google Group

Jeff Reminga

Apr 28, 2020, 2:06:05 PM
to ORA-goog...@googlegroups.com
Hi Michael,

Thanks for sharing your data; I have found the problem.

When I import using the latest version of ORA the following warning occurs in a dialog (this warning dialog feature is relatively new because these import issues are common):

 

LINE_NUMBER  MESSAGE                                                  TYPE
18801        Inconsistent number of columns: expected 17, actual 802  WARNING

 

And indeed that row is corrupted and has incorrect commas (embedded commas that make the cell appear to run on for many lines).
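
For anyone on an ORA version without that warning dialog, a similar check can be sketched in Python with the standard `csv` module. This is an assumption-laden illustration rather than ORA's own logic: the function name is made up, and the line numbers it reports follow Python's `csv.reader` counting, which may not match ORA's LINE_NUMBER exactly.

```python
import csv
import io


def inconsistent_rows(reader):
    """Return (line_number, expected, actual) for rows whose width differs from the header."""
    header = next(reader, None)
    if header is None:          # empty input
        return []
    expected = len(header)
    problems = []
    for row in reader:
        if len(row) != expected:
            # reader.line_num is the number of physical lines read so far, so
            # for a record spanning several lines it points at the last one.
            problems.append((reader.line_num, expected, len(row)))
    return problems


# The third line has unquoted commas inside the bio cell, so it splits into
# five fields instead of three.
sample = io.StringIO(
    "id,bio,followers\n"
    "1,runner,10\n"
    "2,dad, coder, runner,20\n"
    "3,,5\n"
)
print(inconsistent_rows(csv.reader(sample)))  # [(3, 3, 5)]
```

When reading a real file, open it with `newline=""` as the `csv` documentation recommends, so that line breaks inside quoted cells are handled by the reader. Blank lines are also reported, since they parse as rows with zero fields.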

 

In particular, the “bio” column has the offending cells.

 

A workaround which I successfully used is to remove the contents of that column, save out the file, and then it imports in 10 seconds on my machine.

 

You might need that column, however, as it appears to be the message body.
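
The workaround of emptying that column could also be scripted. The Python sketch below is one possible approach, not the procedure used above: the function name is illustrative, and it assumes that rows with the wrong number of fields should be set aside for manual review rather than guessed at, because their cells no longer line up with the header.

```python
import csv
import io


def blank_column(reader, column):
    """Split rows into (clean, rejected); clean rows have `column` emptied."""
    header = next(reader, None)
    if header is None:              # empty input
        return [], []
    idx = header.index(column)      # raises ValueError if the column is absent
    clean, rejected = [header], []
    for row in reader:
        if len(row) != len(header):
            rejected.append(row)    # misaligned row: keep aside for review
            continue
        row[idx] = ""               # keep the column, drop its contents
        clean.append(row)
    return clean, rejected


sample = io.StringIO("id,bio,followers\n1,runner,10\n2,dad, coder,20\n")
clean, rejected = blank_column(csv.reader(sample), "bio")
print(clean)     # [['id', 'bio', 'followers'], ['1', '', '10']]
print(rejected)  # [['2', 'dad', ' coder', '20']]
```

The `clean` rows can then be written back out with `csv.writer(out_file).writerows(clean)` and imported as usual.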

Thanks,
Jeff


Michael Grass

Apr 28, 2020, 2:11:10 PM
to ORA Google Group
Thanks Jeff.  I can live without the bio column for sure.  I grabbed it because it was available, but not at all important to my analysis. 

Kathleen Carley

Apr 28, 2020, 2:12:12 PM
to ORA List
Note - if you remove the commas from the bio column, it could be added back in later.
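
One way to do that comma removal is sketched below in Python, under the assumption that the records are available as dictionaries before they are written to CSV (for example, in the script that produced the export). The function names and the default column name `bio` are illustrative. Line breaks are replaced as well, since an unquoted line break can also split a row.

```python
import re


def strip_commas(text):
    """Replace commas and line breaks with spaces, then collapse repeated whitespace."""
    without_separators = re.sub(r"[,\r\n]", " ", text)
    return re.sub(r"\s+", " ", without_separators).strip()


def clean_column(records, column="bio"):
    """Return copies of the records with `column` passed through strip_commas."""
    return [
        {**record, column: strip_commas(record.get(column, ""))}
        for record in records
    ]


records = [{"id": "1", "bio": "Dad, coder,\nrunner"}]
print(clean_column(records))  # [{'id': '1', 'bio': 'Dad coder runner'}]
```

Writing the cleaned records with `csv.DictWriter` then also quotes any remaining special characters, which is a second line of defence against misaligned rows.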



Michael Grass

Apr 28, 2020, 2:18:29 PM
to ORA Google Group
Yeah, I don't think I need it, but if the sponsor changes their mind, I'll run it through R and clean it up.  Thanks again for the quick assistance, gang.

Kathleen Carley

Apr 28, 2020, 2:20:34 PM
to ORA List
best of luck
