Ingest of Stata files (dta)

Philipp at UiT

May 19, 2020, 12:02:31 PM

to Dataverse Users Community

Yesterday, one of our researchers uploaded a 600 MB Stata file (.dta). Ingest ran for about a day before Dataverse reported, just now, that the file had been ingested successfully (1153 variables, 510133 observations).

>> Has anyone experienced similarly long ingest periods?

I also uploaded the same file to the Harvard Demo Dataverse (https://demo.dataverse.org/dataset.xhtml?persistentId=doi:10.70122/FK2/HLHYQG; Phil: I gave you curator access to the dataverse). There, ingest "completed" much faster (within a few minutes), but afterwards I got the following message:

"Tabular data ingest failed. Ingest failed to produce Summary Statistics and/or UNF signatures; /tmp/tempTabfile.6815203645101011389. (No such file or directory)"

>> Any idea what went wrong?

Best, Philipp

Philip Durbin

May 19, 2020, 1:15:48 PM

to dataverse...@googlegroups.com

I was able to download the file and start ingest on my laptop, but I only let it run for half an hour, so I don't know whether it would have completed.

A 600 MB Stata file strikes me as somewhat large (half a million observations, like you said) but I'd be curious to hear what's common, if people use the :TabularIngestSizeLimit setting, etc.
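For anyone who wants to try the :TabularIngestSizeLimit setting, here is a minimal Python sketch. It assumes the admin settings API is reachable at http://localhost:8080 on the server itself, that the value is a byte count sent in a PUT body, and that -1 means "no limit" while 0 disables ingest entirely, which is how the Installation Guide appears to describe it. The helper names and the 500 MB threshold are made up for illustration and are not a recommendation.

```python
from urllib.request import Request

# Assumption: the admin API is only reachable from the server itself.
BASE_URL = "http://localhost:8080"


def exceeds_ingest_limit(file_size_bytes: int, limit_bytes: int) -> bool:
    """Illustrative check: would a file of this size be skipped by ingest?

    Assumes a negative limit means "no limit" and 0 means "never ingest".
    """
    if limit_bytes < 0:
        return False
    return file_size_bytes > limit_bytes


def build_limit_request(limit_bytes: int, base_url: str = BASE_URL) -> Request:
    """Build (but do not send) a PUT request that sets :TabularIngestSizeLimit."""
    url = f"{base_url}/api/admin/settings/:TabularIngestSizeLimit"
    return Request(url, data=str(limit_bytes).encode("ascii"), method="PUT")


if __name__ == "__main__":
    limit = 500_000_000  # hypothetical 500 MB threshold
    print(exceeds_ingest_limit(600_000_000, limit))  # roughly the file discussed here
    req = build_limit_request(limit)
    print(req.get_method(), req.full_url)
```

The request is only constructed here. To actually apply it, pass the result to `urllib.request.urlopen` on the server. Files above the limit would still be uploaded, just without tabular ingest.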

Philip Durbin
Software Developer for http://dataverse.org
http://www.iq.harvard.edu/people/philip-durbin