Date/time formats and text versus numeric cells in Excel

52 views
Skip to first unread message

Oliver Bock

unread,
Feb 12, 2018, 8:39:37 PM2/12/18
to CSVChat
Many years ago, Bruce said "XlsReader will always return dates in the default [short] format of the operating system".

I think this is still true, but note that:
  • Even if a user selects a non-default date/time format in Excel, the system default will be used.
  • Dates entered as text (but with no leading ') are returned using the entered text, even though you might think that Excel has recognised them as dates because it shows a date format selected in Format Cells.
My users are complaining because the same data file can read differently on a machine with a different date/time format.  Upon initial load I try to guess a parser for date/time fields, and my program remembers this, and breaks when the format changes.  Unfortunately I cannot just use the default date/time parser because dates entered as text _won't_ change when the machine changes, and I cannot tell the difference because CsvReader returns a string either way.

Is there any way around this, or a way to detect whether a cell is numeric versus text?

Regards,
  Oliver

shriop

unread,
Feb 13, 2018, 12:34:51 AM2/13/18
to CSVChat
Simple answer, no, currently at least.

I was solving the opposite problem when I designed out how this works, preventing some odd custom date format that a user applies from breaking parsing logic.

That's not to say that something can't be added, but without giving you all the gory details about the inner workings of Excel, I'm not currently picturing what could be implemented to solve the problem you're facing.

So let's talk this through. Let's say there was a way, how would you see it working and how would you see it solving your current problem?

Keep in mind that if we can come up with something, I'll still have to expand the problem set and solution to currencies, numbers, time, etc.

Bruce Dunwiddie

Oliver Bock

unread,
Feb 13, 2018, 4:21:05 PM2/13/18
to csv...@googlegroups.com

Hi Bruce,

 

The real kicker for me is dates, where months/days may swap places; likely formats for times can be unambiguously parsed with the right regular expression.  If I could detect that a returned date had a number underlying it and (thus was in the machine’s short date format) then I would record the short date format active at that time and, if the file was loaded on a computer with a different short date format, I would (for each cell):

 

  1. Parse it using the machine’s new short date format.
  2. Transform it back into a string using the old short date format.
  3. Continue processing as normal.

 

While this is complicated, I cannot think of another way to deal with a column that may have a mixture of text and numeric dates.  (To put this in context, such data is presented to our users as a column of text, to which we will apply an automatically-chosen parser, or one that the user manually chooses/writes.  This parser will continue to behave properly in this scheme.)

 

I can understand that this change would break the CSV-ish abstraction so I’ll understand if you consider it inappropriate.

 

Regards,

  Oliver

--
You received this message because you are subscribed to the Google Groups "CSVChat" group.
To unsubscribe from this group and stop receiving emails from it, send an email to csvchat+u...@googlegroups.com.
To post to this group, send email to csv...@googlegroups.com.
Visit this group at https://groups.google.com/group/csvchat.
For more options, visit https://groups.google.com/d/optout.

shriop

unread,
Feb 15, 2018, 5:08:59 PM2/15/18
to CSVChat
I don't find the request inappropriate. I'm trying to think through a more full featured abstraction that would get you to the same place without cluttering up the interface.

Technically, you could upgrade to the full source code option and integrate your own implementation for this. The code in these sections is very straightforward for what you're looking for. There are basically case statements for the different type of cell data types, so you could just keep your flags in a collection, then you just have to expose that collection up two layers or so. You would probably want to add the same for both the xls implementation and the xlsx implementation, which are very different, but equally simplistic for this logic.

Bruce Dunwiddie

To unsubscribe from this group and stop receiving emails from it, send an email to csvchat+unsubscribe@googlegroups.com.

shriop

unread,
Feb 15, 2018, 5:25:18 PM2/15/18
to CSVChat
I could possibly calculate and expose what is essentially the same as the Range.NumberFormat property, https://msdn.microsoft.com/en-us/vba/excel-vba/articles/range-numberformat-property-excel , but at the cell level. I'm assuming this would return "General" for your date value that is coming across as a string and either one of the built in date formats or a custom format for your other instances.

Bruce Dunwiddie

Oliver Bock

unread,
Feb 15, 2018, 9:52:04 PM2/15/18
to csv...@googlegroups.com

Hi Bruce,

 

Exposing NumberFormat would do the job, but it might be confusing because CsvReader is not actually using the format stored in Excel when converting the numbers to strings (but is using the default short date/time format).  I’m not arguing for it to change its formatting, since that would only complicate the issue.

 

How about adding a Values2 property (a homage to Excel’s Range.Value2) which is of type object[], containing strings, doubles or DateTimes (Kind.Unknown)?  This would have the benefit that you could load numbers and dates precisely, with no loss of precision/decimal places that might otherwise occur because they are going via strings, and library users might not parse them exactly right (e.g. not using the correct round trip format specifiers).  This is also attractive because the reader knows this data’s type, so why not preserve it?

 

Obviously CSV files would always return strings in this array.

 

We are certainly willing to upgrade to the source code licence, but we would prefer not to do the implementation work ourselves.  Could we pay extra to have this feature added more generally?

 

Regards,

  Oliver

--

You received this message because you are subscribed to the Google Groups "CSVChat" group.

To unsubscribe from this group and stop receiving emails from it, send an email to csvchat+u...@googlegroups.com.

Reply all
Reply to author
Forward
0 new messages