CSV with columns between quotes

fabius pocus

Jul 16, 2022, 12:29:40 PM
to OpenRefine
Hi. I have a simple dataset:

A B
1 a
2 b
3 c

I'd like to obtain a CSV file with two columns, A (numeric) and B (string, in quotes):

A, "B"
1, "a"
2, "b"
3, "c"

I use the Custom Tabular Exporter, but it either puts quotes everywhere or adds a lot of extra quotes to the elements of column B:

""a""

or

"""a"""

How can I obtain the simple layout shown above? Thanks a lot,

Fabio

Owen Stephens

Jul 19, 2022, 4:24:42 AM
to OpenRefine
Hi Fabio.

I've been trying to recreate your problem, but for me it works as I expect.

Can you give some more information about the following?

- The version of OpenRefine you are using
- The data you are exporting: for example, are the numbers in column A stored as numbers in OpenRefine or as strings?
- The export options you are selecting: if you are using the Custom Tabular Exporter, perhaps you can paste the Option Code here?

Thanks

Owen

fabius pocus

Jul 20, 2022, 9:07:00 AM
to OpenRefine
Hi. Thanks a lot for your effort. I am trying to solve this with the templating tool. Anyway, I am using OpenRefine 3.6-rc1 (the latest unstable release). This is the option code I used:

{
  "format": "csv",
  "separator": ",",
  "lineSeparator": "\n",
  "encoding": "UTF-8",
  "quoteAll": false,
  "outputColumnHeaders": true,
  "outputBlankRows": false,
  "columns": [
    {
      "name": "A",
      "reconSettings": {
        "output": "entity-name",
        "blankUnmatchedCells": false,
        "linkToEntityPages": true
      },
      "dateSettings": {
        "format": "iso-8601",
        "useLocalTimeZone": false,
        "omitTime": false
      }
    },
    {
      "name": "B",
      "reconSettings": {
        "output": "entity-name",
        "blankUnmatchedCells": false,
        "linkToEntityPages": true
      },
      "dateSettings": {
        "format": "iso-8601",
        "useLocalTimeZone": false,
        "omitTime": false
      }
    }
  ]
}

I tried treating A both as a string and as a number, but I couldn't get what I wanted. If you want to use my data, please copy this:

A,B
1,a
2,b
3,c

to the clipboard and import it into OpenRefine from there. Thanks a lot again,

Fabio

Owen Stephens

Jul 20, 2022, 9:48:10 AM
to OpenRefine
Thanks Fabio

I think I have misunderstood your problem.
If I use the code and data you have supplied then my export looks like:

A,B
1,a
2,b
3,c


Which is exactly what I'd expect.
If I change the export by checking the "Always quote text" check box on the Download tab, then I get:

"A","B"
"1","a"
"2","b"
"3","c"


Which is also as expected, because CSV files have no concept of numeric vs textual data: all the data in a CSV is essentially text. I think perhaps "Always quote values" might be a better label for the option, but it behaves as I would expect.

As I understand it, you want the values in column B quoted and the values in column A not quoted - this is definitely something you'll need to use the export template to achieve.

If you have the appropriate data types in OpenRefine already (i.e. numbers are converted to numbers, dates to dates, strings are strings) then you might be able to use something like:

Prefix:
"A","B"
followed by a new line (i.e. press enter after the column headings)

Row template:
{{jsonize(cells["A"].value)}},{{jsonize(cells["B"].value)}}

Row separator:
<new line> (just select all the content and press enter to put in a new line character)

Suffix:
leave empty

I think this will basically work, BUT note that JSON escapes any quotes in the values with a backslash, like this:
\"

whereas in a CSV it is more common to escape a quote by doubling it:
""

So if your project is like

A,B
1,some text which includes "a quote"
2,b
3,c


Then the export template I've described above would output

"A","B"
1,"some text which includes \"a quote\""
2,"b"
3,"c"


but a more traditional csv might be like:

"A","B"
1,"some text which includes ""a quote"""
2,"b"
3,"c"


If you want to get the latter you'll need to do some work to handle the quote marks appropriately in the export - which means writing some additional GREL in the export rather than using the 'jsonize' function
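
As a rough sketch of what that additional GREL might look like (untested against this project, and assuming column B always holds non-blank strings), the row template could leave A bare and wrap B in quotes itself, doubling any embedded quote marks with GREL's replace function:

```
{{cells["A"].value}},"{{cells["B"].value.replace('"', '""')}}"
```

With the example row above this should give 1,"some text which includes ""a quote""". Blank cells in B are not handled here and would need an extra check, and how a number in A is printed depends on how it is stored in the project.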

Best wishes

Owen

fabius pocus

Jul 21, 2022, 3:00:10 PM
to OpenRefine
Thanks a lot for your effort. I had noticed the templating option, but the data exporter was my first choice. Thanks again,

Fabio
