--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-commu...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/3372ad30-fbaa-4c6e-9550-b332533d1851%40googlegroups.com.
Thanks, Christian
I have now generated two test datasets (uniformly distributed random values from 0 to 10, without variable or value labels), saved each as dta, sav, and csv, and ingested them on two of our test servers (dv03 and dv06). The attached Stata code generates the test data.
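For anyone without Stata who wants to try something similar, here is a minimal Python sketch that writes a comparable CSV. It is not the attached Stata code: the function name `write_test_csv`, the variable names `v1`, `v2`, ..., and the choice of integer values are all assumptions on my side.

```python
import csv
import random


def write_test_csv(path, n_obs, n_vars, seed=0):
    """Write an n_obs x n_vars matrix of random integers 0-10 to a CSV file.

    Assumptions (not taken from the original Stata code): values are
    integers, and columns are named v1..vN.
    """
    rng = random.Random(seed)  # fixed seed so the file is reproducible
    header = [f"v{i}" for i in range(1, n_vars + 1)]
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        for _ in range(n_obs):
            # randint is inclusive on both ends, so values are 0..10
            writer.writerow([rng.randint(0, 10) for _ in range(n_vars)])
```

Calling `write_test_csv("small_test.csv", 110, 62)` gives a scaled-down file with the same shape ratio as the 11000 x 6200 case; swapping the two numbers gives the transposed layout.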
11000 observations, 6200 variables (ingest time per server):
File                        Size    dv03    dv06
test_data_11000x6200v.dta   270MB   14h10   13h40
test_data_11000x6200v.sav   533MB   6h11    7h55
test_data_11000x6200v.csv   139MB   6h36    7h52

6200 observations, 11000 variables (ingest time per server):
File                        Size    dv03    dv06
test_data_6200x11000v.dta   273MB   21h58   28h27
test_data_6200x11000v.sav   533MB   11h46   14h54
test_data_6200x11000v.csv   139MB   14h1    14h53
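Interestingly, dta takes roughly twice as long as sav and csv, although the dta files are only about half the size of the sav files. sav is even faster than csv on dv03 and about the same on dv06. If the data matrix is transposed (more variables than observations), the ingest time increases significantly, by a factor of roughly 1.5 to 2, although the file sizes are almost unchanged.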
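To make the comparison easier to check, here is a small Python sketch that converts the reported times to minutes and prints the slowdown of the 6200 x 11000 layout relative to the 11000 x 6200 layout. The timings are copied from the lists above; the helper names are my own, and reading "14h1" as 14 h 1 min is an assumption.

```python
import re


def to_minutes(duration):
    """Convert a duration such as '14h10' to minutes.

    Assumption: '14h1' means 14 h 1 min, not 14 h 10 min.
    """
    hours, minutes = re.fullmatch(r"(\d+)h(\d+)", duration).groups()
    return int(hours) * 60 + int(minutes)


def slowdown(tall, wide):
    """Ratio of the 6200x11000 ingest time to the 11000x6200 ingest time."""
    return to_minutes(wide) / to_minutes(tall)


# (format, server) -> (time for 11000x6200, time for 6200x11000)
timings = {
    ("dta", "dv03"): ("14h10", "21h58"),
    ("dta", "dv06"): ("13h40", "28h27"),
    ("sav", "dv03"): ("6h11", "11h46"),
    ("sav", "dv06"): ("7h55", "14h54"),
    ("csv", "dv03"): ("6h36", "14h1"),
    ("csv", "dv06"): ("7h52", "14h53"),
}

for (fmt, server), (tall, wide) in timings.items():
    print(f"{fmt} on {server}: {slowdown(tall, wide):.2f}x slower when transposed")
```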