Automatic conversion CSV to TAB


Tutasi

Jul 28, 2022, 7:43:59 AM
to Dataverse Users Community
Hi, I have some questions and comments about uploading CSV files and their ingestion to TAB:

(1) All data files should be in the same format, preferably CSV, but some are in CSV and some in TAB. Dataverse converts CSV files to TAB by default, but it has left some files unconverted because it detected errors. Sometimes I cannot find these errors; possibly the cause is a character that is not visible.

(2) I uploaded some data files and Dataverse converted the original files (which are in CSV format) into TAB files (though not all of them). But the conversion was wrong: it did not split the columns correctly (it did not recognize the semicolon as the column separator), and it did not use the correct character encoding (Windows ANSI), which corrupted the accented characters. I have seen that the download arrow offers an option to download the original CSV file in addition to the TAB format, but I would prefer to keep only the CSV. I have no answer for this; it is true that the files were not converted well.

(3) The same download arrow offers the option to download a VARIABLE METADATA file in XML. This description is also incorrect, because it likewise treats all the variables as a single variable. If you explain the meaning of the tags and their values to me, I could fix it myself; otherwise it would be better to remove this option.
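
For anyone hitting the problem described in (2), one possible workaround is to normalize the file before uploading it. The Python snippet below is only a minimal sketch: it assumes the source file is semicolon-separated and encoded as Windows-1252 ("Windows ANSI"), and it assumes that the ingest handles comma-separated UTF-8 better, which is worth checking against the guides for your Dataverse version. The function name normalize_csv and the example file names are hypothetical.

```python
import csv
import io


def normalize_csv(raw: bytes,
                  source_encoding: str = "cp1252",
                  source_delimiter: str = ";") -> bytes:
    """Re-encode a delimited file as comma-separated UTF-8.

    Hypothetical helper: the defaults (cp1252, semicolon) are assumptions
    based on the file described in the post, not something Dataverse reports.
    """
    # Decode using the encoding the file was actually written in.
    text = raw.decode(source_encoding)

    # Parse with the original delimiter; newline="" lets the csv module
    # handle line endings itself.
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter=source_delimiter)

    # Write the same rows back with commas; fields that contain a comma
    # are quoted automatically by csv.writer.
    out = io.StringIO(newline="")
    writer = csv.writer(out, delimiter=",", lineterminator="\n")
    writer.writerows(reader)

    return out.getvalue().encode("utf-8")


if __name__ == "__main__":
    # Example file names are placeholders.
    with open("original.csv", "rb") as src:
        converted = normalize_csv(src.read())
    with open("converted.csv", "wb") as dst:
        dst.write(converted)
```

If the accents still look wrong after conversion, the source encoding is probably not Windows-1252, and the source_encoding argument would need to change.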

Philip Durbin

Aug 2, 2022, 4:28:53 PM
to dataverse...@googlegroups.com
Hi! What version of Dataverse are you running, please? We tried to improve the errors for failed ingest in Dataverse 5.10. You can see some screenshots at https://github.com/IQSS/dataverse/pull/8271

Lots of people have asked about being able to skip the ingest of tabular files, especially CSV which is already a preservation format. (Creating a TSV file makes more sense when the original file is a proprietary format like Stata.) As of Dataverse 5.11 you can pass "tabIngest":"false" to the API: https://guides.dataverse.org/en/5.11.1/api/native-api.html#add-a-file-to-a-dataset
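
As a rough illustration of the "tabIngest":"false" option, here is a minimal Python sketch that assembles the pieces of an "add a file to a dataset" request as described in the linked API guide. The helper name build_add_file_request, the server URL, the DOI, and the token are placeholders, and the function only builds the request parts; it does not send anything.

```python
import json
from urllib.parse import urlencode


def build_add_file_request(server_url: str, persistent_id: str, api_token: str):
    """Return (url, headers, form_fields) for adding a file without tabular ingest.

    Hypothetical helper; all argument values used below are placeholders.
    """
    # Endpoint for adding a file to a dataset identified by its persistent ID.
    url = (server_url.rstrip("/")
           + "/api/datasets/:persistentId/add?"
           + urlencode({"persistentId": persistent_id}))

    # The API token is passed in the X-Dataverse-key header.
    headers = {"X-Dataverse-key": api_token}

    # jsonData is sent as a form field next to the file itself;
    # "tabIngest": "false" asks Dataverse (5.11 or later) to skip ingest.
    form_fields = {"jsonData": json.dumps({"tabIngest": "false"})}

    return url, headers, form_fields


if __name__ == "__main__":
    url, headers, form_fields = build_add_file_request(
        "https://demo.dataverse.org",   # placeholder server
        "doi:10.5072/FK2/EXAMPLE",      # placeholder DOI
        "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",  # placeholder token
    )
    print(url)
    print(form_fields["jsonData"])
```

To actually send it, one option is the third-party requests library, for example requests.post(url, headers=headers, data=form_fields, files={"file": open("data.csv", "rb")}), which submits the file and jsonData together as a multipart form.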

You're welcome to comment on this issue where we're tracking a number of "allow skipping tabular ingest" requests: https://github.com/IQSS/dataverse/issues/8526

For the variable metadata problem, would you be able to create a GitHub issue with a sample CSV file and screenshots of the problem?

Thanks,

Phil
