Exporting to CSV


Wayne Heming

Apr 26, 2023, 10:24:26 PM
to Subsurface Divelog
I guess some people could call this a bug, and others would call it an inconvenience.

First, I noticed that the CSV export is a TAB-separated values file, not a comma-separated values file.
When I open the file with Excel (2016) it screws up the heading 
"dive number" "date" "time" 
becomes
dive number"date" "time"
and all the content is in one column.

I am not sure why the heading gets screwed up; that is an Excel issue. The content ends up in a single column because Excel is looking for commas. And if you happen to have a comma in the notes or tags, Excel splits that field into a new column, which makes the file harder to convert.
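
For what it's worth, the mangled heading can be reproduced outside Excel. The sketch below is only an illustration using Python's csv module, not a statement about how Excel works internally: it reads a tab-separated, quoted header line (made up to match the one above) with a parser that only splits on commas.

```python
import csv
import io

# Sample header line: quoted fields separated by TABs (illustrative data).
header = '"dive number"\t"date"\t"time"\n'

# Parse it the way a comma-only reader would.
row = next(csv.reader(io.StringIO(header), delimiter=","))

# The whole line comes back as a single field.
print(len(row))
print(repr(row[0]))
```

Python strips the quotes from the first field and then appends everything else literally, tabs and quotes included, which looks a lot like what Excel displayed.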

By the way, if I open the file in a text editor the heading is fine, so the export did write the data and headings, although separated by TABs and not COMMAs.

If I rename the file from .csv to .txt, Excel opens it and asks about the delimiters; select tab and it works fine. Alternatively, using an editor to find and replace the TABs with COMMAs also makes it open correctly in Excel, even if there is a COMMA in the notes, as all fields are enclosed in " ". Good move here.

Maybe Subsurface could either export it as a .TXT file rather than .CSV, or replace the TABs with COMMAs, which is what CSV means. Or maybe add a preference option for Export COMMA/TAB.
Replacing with COMMAs would be my preferred option.
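
For anyone who wants to script the find-and-replace workaround, here is a minimal sketch in Python. It assumes the export is tab-separated with every field in double quotes, as described above; the function name `tabs_to_commas` and the sample rows are made up for illustration and are not part of Subsurface.

```python
import csv
import io


def tabs_to_commas(tsv_text: str) -> str:
    """Re-encode tab-separated, double-quoted text as comma-separated text."""
    reader = csv.reader(io.StringIO(tsv_text, newline=""), delimiter="\t", quotechar='"')
    out = io.StringIO(newline="")
    # QUOTE_ALL keeps every field wrapped in double quotes, like the export does.
    writer = csv.writer(
        out, delimiter=",", quotechar='"', quoting=csv.QUOTE_ALL, lineterminator="\n"
    )
    writer.writerows(reader)
    return out.getvalue()


# Illustrative sample: a header plus one dive with a comma in the notes.
sample = '"dive number"\t"date"\t"notes"\n"1"\t"2023-04-26"\t"calm, clear water"\n'
converted = tabs_to_commas(sample)
print(converted)
```

Going through a CSV parser rather than a plain text replace also leaves any TAB inside a quoted field untouched.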

Martin Gröger

Apr 27, 2023, 12:58:51 AM
to subsurfac...@googlegroups.com

Hm…

if I open a CSV export in OpenOffice it all looks as it should. Maybe it is more an Excel issue than a Subsurface issue. BTW, Excel 2016 is not the newest one – we got 2023 😉

keep on howling

grey

--
You received this message because you are subscribed to the Google Groups "Subsurface Divelog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to subsurface-dive...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/subsurface-divelog/b7d62449-d7c3-4b1f-9dca-53d2a8bd257an%40googlegroups.com.

 

Wayne Heming

Apr 27, 2023, 9:12:24 PM
to Subsurface Divelog
Yeah, I know, but Excel is expecting COMMAs in a CSV file; that is what CSV means. Yes, it could handle it better. BTW, it does the same on the latest Office 365.
Just putting it out there: other applications fail as well, not just Excel, especially database applications.

Michael Keller

Apr 27, 2023, 9:37:49 PM
to Subsurface Divelog
Hi Wayne.

This is interesting: 'CSV summary dive details' indeed produces TSV and not CSV, whereas the other two 'CSV' export functions 'CSV Dive computer dive profile' and 'CSV Computed Profile Panel data' produce CSV.
I could find a changeset where this was changed for 'CSV summary dive details' (https://github.com/subsurface/subsurface/commit/6c825785401ea3e450795db8887eba9444c9b7b5), but no rationale for why this was done.
But to me it looks clear that something is wrong here - either this function should be changed back to produce CSV (consistent with the other 'CSV' export functions), or it needs to be renamed to be consistent with what it does.
I have opened a pull request to start a discussion on this: https://github.com/subsurface/subsurface/pull/3872

Ngā mihi
  Michael Keller

Dirk Lehmann

Apr 28, 2023, 1:37:46 AM
to subsurfac...@googlegroups.com
CSV is a standardized file-NAME.
The separation can be set as required and therefore the importing program has to ask for the separation setting.

If it does not, then you need to know the separator and set it in the exporting program prior to the export.

Michael Keller

Apr 28, 2023, 2:06:25 AM
to Subsurface Divelog
Hi Dirk.

On Friday, April 28, 2023 at 5:37:46 PM UTC+12 dirkle...@googlemail.com wrote:
> CSV is a standardized file-NAME.
> The separation can be set as required and therefore the importing program has to ask for the separation setting.

Apologies, until now I had assumed that the 'CSV' in the Subsurface Export functionality meant 'Comma Separated Values' - a data format that is defined in this RFC: https://datatracker.ietf.org/doc/html/rfc4180
This is the data format used when importing and exporting files with the extension '.csv' by applications such as LibreOffice, Microsoft Excel, or Google Sheets.

Can you please point me to the definition for the standard for the file-NAME that is used in Subsurface?

Kia mārū
  Michael Keller
 

Dirk Lehmann - GM

Apr 28, 2023, 2:16:19 AM
to subsurfac...@googlegroups.com
It is no strict definition.
Several delimiters are used.

Here is a suitable abstract from Wikipedia:

------
Specification

RFC 4180 proposes a specification for the CSV format; however, actual practice often does not follow the RFC and the term "CSV" might refer to any file that:[1][5]

is plain text using a character encoding such as ASCII, various Unicode character encodings (e.g. UTF-8), EBCDIC, or Shift JIS,

consists of records (typically one record per line),

----------


Michael Keller

Apr 28, 2023, 2:50:07 AM
to Subsurface Divelog
Hi Dirk.


On Friday, April 28, 2023 at 6:16:19 PM UTC+12 dirkle...@googlemail.com wrote:
> It is no strict definition.

Ok, I thought you'd mentioned a 'standard' in your post.
 
> Here is a suitable abstract from Wikipedia:
>
> ------
> Specification
>
> RFC 4180 proposes a specification for the CSV format; however, actual practice often does not follow the RFC and the term "CSV" might refer to any file that:[1][5]
>
> is plain text using a character encoding such as ASCII, various Unicode character encodings (e.g. UTF-8), EBCDIC, or Shift JIS,
>
> consists of records (typically one record per line),
>
> ----------
 
What is the source you are citing from? Is it https://en.wikipedia.org/wiki/Comma-separated_values?

If so, the very first line of this wikipedia article states:

"A comma-separated values (CSV) file is a delimited text file that uses a comma to separate values."
(emphasis mine)


I think in general most people and applications expect 'CSV' / .csv files to use a comma as the field separator (which is what the original poster pointed out).
So if the desired format for this particular export function is indeed tab separated values then it will be an improvement if it is renamed to 'tab delimited summary dive details' or 'TSV summary dive details' (https://en.wikipedia.org/wiki/Tab-separated_values).

Kia mārū
  Michael Keller

Robert Helling

Apr 28, 2023, 4:01:13 AM
to subsurfac...@googlegroups.com
Hi everyone,

CSV is not a properly standardised file format but an ad hoc one. 

On 28. Apr 2023, at 08:50, Michael Keller <mike...@042.ch> wrote:

> If so, the very first line of this wikipedia article states:
>
> "A comma-separated values (CSV) file is a delimited text file that uses a comma to separate values."
> (emphasis mine)


Continue reading from there:

"The CSV file format is not fully standardized. Separating fields with commas is the foundation, but commas in the data or embedded line breaks have to be handled specially. Some implementations disallow such content while others surround the field with quotation marks, which yet again creates the need for escaping if quotation marks are present in the data.
The term "CSV" also denotes several closely-related delimiter-separated formats that use other field delimiters such as semicolons.[2] These include tab-separated values and space-separated values. A delimiter guaranteed not to be part of the data greatly simplifies parsing.
Alternative delimiter-separated files are often given a ".csv" extension despite the use of a non-comma field separator. This loose terminology can cause problems in data exchange. Many applications that accept CSV files have options to select the delimiter character and the quotation character. Semicolons are often used instead of commas in many European locales in order to use the comma as the decimal separator and, possibly, the period as a decimal grouping character."

Anyway, I think it is fair to assume by default that fields are separated by commas. But life gets more difficult if your fields contain commas and you need to quote fields and then you need to quote the quotation character.

BTW, in MS Excel, do not „Open“ the .csv file but „Import“ it. Then you will be asked what your favourite delimiter is.
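
The same choice can be automated. A rough sketch in Python, assuming the first line of the file is the header: try each candidate delimiter and keep the one that yields the most fields. `guess_delimiter` and `CANDIDATES` are hypothetical names, and the heuristic is deliberately simple, so treat it as a starting point rather than a robust detector.

```python
import csv
import io

# Delimiters to try, in order of preference when there is a tie.
CANDIDATES = [",", "\t", ";"]


def guess_delimiter(header_line: str) -> str:
    """Return the candidate that splits the header line into the most fields."""

    def field_count(delim: str) -> int:
        # Let the csv module do the splitting so quoted fields are respected.
        row = next(csv.reader(io.StringIO(header_line), delimiter=delim), [])
        return len(row)

    return max(CANDIDATES, key=field_count)


# Illustrative header lines, not taken from a real export.
print(repr(guess_delimiter('"dive number"\t"date"\t"time"')))
print(repr(guess_delimiter('"dive number","date","time"')))
```

Counting fields through the parser, rather than counting raw characters, means a comma inside a quoted field does not tip the guess towards commas.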

Best
Robert

Dirk Lehmann GM

Apr 28, 2023, 4:06:33 AM
to subsurfac...@googlegroups.com
Depends on what people you know.
People around me know that CSV has variable delimiters.

And as programs offer options to set which delimiters are used for the CSV import, it must be common practice.

Not all "standards" are always completely "standardized". That's how things normally evolve in the IT business.

Cheers.... And-out


Michael Keller

Apr 28, 2023, 4:35:15 AM
to Subsurface Divelog
Hi Robert.

On Friday, April 28, 2023 at 8:01:13 PM UTC+12 Robert C. Helling wrote:
> CSV is not a properly standardised file format but an ad hoc one.

The RFC for CSV (https://datatracker.ietf.org/doc/html/rfc4180) actually gives a pretty concise definition of what the intended format for CSV is. And since 2005 it has been the standard used by all applications that handle MIME types, like browsers or email clients.

In general, I think it is a good approach to consider publications by engineering bodies (like the IETF) to be of higher relevance than crowdsourced websites like wikipedia.
 
> Continue reading from there:
>
> "The CSV file format is not fully standardized. Separating fields with commas is the foundation, but commas in the data or embedded line breaks have to be handled specially. Some implementations disallow such content while others surround the field with quotation marks, which yet again creates the need for escaping if quotation marks are present in the data.
> The term "CSV" also denotes several closely-related delimiter-separated formats that use other field delimiters such as semicolons.[2] These include tab-separated values and space-separated values. A delimiter guaranteed not to be part of the data greatly simplifies parsing.
> Alternative delimiter-separated files are often given a ".csv" extension despite the use of a non-comma field separator. This loose terminology can cause problems in data exchange."
(emphasis mine)

To me this looks like even the authors of the wikipedia article acknowledge that using non-comma delimiters to separate fields in CSV files can be problematic, so I am not sure this is the right approach in Subsurface.

> Anyway, I think it is fair to assume by default that fields are separated by commas. But life gets more difficult if your fields contain commas and you need to quote fields and then you need to quote the quotation character.

This is true.
But it's not actually that difficult: as long as all fields potentially containing field separators or quotation characters (or even line separators) are quoted, and any quotation characters within these fields are doubled up, this is enough to guarantee that the format can be parsed correctly.
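
As a minimal sketch of that rule in Python (the helper names `quote_field` and `encode_record` and the sample values are invented for illustration; this is not the Subsurface code): quote every field, double any embedded quotation characters, and a standard CSV parser reads the record back unchanged.

```python
import csv
import io


def quote_field(value: str) -> str:
    """Wrap a field in double quotes and double any quotes inside it."""
    return '"' + value.replace('"', '""') + '"'


def encode_record(fields: list[str]) -> str:
    """Join quoted fields with commas into one CSV record (no line ending)."""
    return ",".join(quote_field(f) for f in fields)


# Illustrative values containing a comma, quotation marks and a line break.
fields = ["12", "Wreck dive, 30 m", 'Buddy said "great viz"', "line one\nline two"]
record = encode_record(fields)
print(record)

# Round trip through Python's csv parser to check nothing was lost.
parsed = next(csv.reader(io.StringIO(record, newline="")))
print(parsed == fields)
```

Quoting every field is more than strictly needed, but it keeps the writer trivial and matches the "all fields enclosed in quotes" behaviour mentioned earlier in the thread.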

Ngā mihi
  Michael Keller

Wayne Heming

May 7, 2023, 8:16:36 PM
to Subsurface Divelog
This subject has certainly created some conversation. While I totally agree with Michael Keller, I guess there is no right or wrong here, but a better, "more compatible" use of the CSV format would be the best option. If you read the RFC (https://www.ietf.org/rfc/rfc4180.txt), beginning on page 2 there is a very good description of what the CSV format was supposed to achieve. It includes 7 points that are, as it states, "the format that seems to be followed by most implementations". Why can't Subsurface be one of the "most"?

Wayne Heming

May 17, 2023, 8:33:01 PM
to Subsurface Divelog
Could we at least get the export function modified to allow specifying the file extension? Currently it is .csv, and the only way to change it is to rename the file afterwards.

Michael Keller

May 17, 2023, 8:41:31 PM
to subsurfac...@googlegroups.com
Hi Wayne.


On 18/05/23 12:33, Wayne Heming wrote:
> Could we at least get the export function modified to allow specifying
> the file extension? Currently it is .csv, and the only way to change it
> is to rename the file afterwards.


Better than that: https://github.com/subsurface/subsurface/pull/3872
will make it generate actual CSV.


Kia mārū

  Michael Keller

Wayne Henderson

May 17, 2023, 10:15:06 PM
to subsurfac...@googlegroups.com
Hi,

I’m the wrong Wayne!

Jason Bramwell

May 18, 2023, 2:02:54 AM
to subsurfac...@googlegroups.com
Wayne Henderson,
That email was not sent to you directly; it was sent to the developers mailing list, of which you are a member, so everyone on the list gets every message.

Jason


> On 18 May 2023, at 03:15, Wayne Henderson <whende...@gmail.com> wrote:
>
> Hi,