Bug and question in regards to working with 1904 date format

248 views
Skip to first unread message

Stefan

unread,
Jun 16, 2016, 6:23:59 AM6/16/16
to openpyxl-users
Hi,

I need use negative times in Excel and therefore need Excel's "Use 1904 date format" option. So far, I was not able to write a file from openpyxl with this option enabled.

One idea I had was to create a blank file in Excel, enable said option and save it. Then I use load_workbook to load this file, write my stuff into it and save it again. If you try to load a file with the 1904 date format enabled, you get an error:

Traceback (most recent call last):
  File "./[...].py", line 179, in <module>
    wb3 = load_workbook('[...]')
  File "[...]/src/openpyxl/openpyxl/reader/excel.py", line 152, in load_workbook
    parser.parse()
  File "[...]/src/openpyxl/openpyxl/packaging/workbook.py", line 46, in parse
    wb.excel_base_date = CALENDAR_MAC_1904
NameError: global name 'wb' is not defined

This was easily fixed by replacing
wb.excel_base_date = CALENDAR_MAC_1904
in the workbook.py on line 46 with
self.wb.excel_base_date = CALENDAR_MAC_1904

But now that I can load the file, the 1904 option still gets removed after saving again. Is it possible to save files from openpyxl with the 1904 date format option enabled?

Thanks in advance!
-Stefan

Charlie Clark

unread,
Jun 16, 2016, 8:49:04 AM6/16/16
to openpyx...@googlegroups.com
Am .06.2016, 12:23 Uhr, schrieb Stefan <stefan...@gmail.com>:

> Hi,
>
> I need use negative times in Excel and therefore need Excel's "Use 1904
> date format" option. So far, I was not able to write a file from openpyxl
> with this option enabled.

Sounds interesting. Can you please provide some more information? The
specification has the following to say about the two different epochs:

"""
Two different bases are used for converting dates to and from serial
date-times:
In the 1900 date system, the lower limit is January 1st, 0001 00:00:00,
which has a serial date-
time of -693593. The upper-limit is December 31st, 9999, 23:59:59.999,
which has a serial date-time of 2,958,465.9999884. The base date for this
system is 00:00:00 on December 30th, 1899, which has a serial date-time of
0.
In the 1904 date system, the lower limit is January 1st, 0001, 00:00:00,
which has a serial date-time
of -695055. The upper limit is December 31st, 9999, 23:59:59.999, which
has a serial date-time of 2,957,003.9999884. The base date for this system
is 00:00:00 on January 1st, 1904, which has a serial date-time of 0.
"""
In practice, Excel can't handle negative date times, but you should be
able to get them with either epoch. The 1904 is basically a wart from an
older version of Excel for MacOS that I'd like to remove, because this
version has no been supported for a while, we also can't test with it and
it would vastly simplify the handling of dates and times. Can you provide
a file showing this in practice?

> One idea I had was to create a blank file in Excel, enable said option
> and save it. Then I use load_workbook to load this file, write my stuff
> into it and save it again. If you try to load a file with the 1904 date
> format
> enabled, you get an error:
>
> Traceback (most recent call last):
> File "./[...].py", line 179, in <module>
> wb3 = load_workbook('[...]')
> File "[...]/src/openpyxl/openpyxl/reader/excel.py", line 152, in
> load_workbook
> parser.parse()
> File "[...]/src/openpyxl/openpyxl/packaging/workbook.py", line 46, in
> parse
> wb.excel_base_date = CALENDAR_MAC_1904
> NameError: global name 'wb' is not defined

If you're seeing the workbook package then I assume you're working with
2.4?

> This was easily fixed by replacing
> wb.excel_base_date = CALENDAR_MAC_1904
> in the workbook.py on line 46 with
> self.wb.excel_base_date = CALENDAR_MAC_1904

Looks good. I haven't got time at the moment to fix this including the
relevant test. Do you fancy submitting a PR for this?

> But now that I can load the file, the 1904 option still gets removed
> after saving again. Is it possible to save files from openpyxl with the
> 1904 date format option enabled?

See above: I don't think this should strictly be necessary and that
normalising files to the 1900 epoch is the way to go but I guess it should
be available if it is really required?

Charlie
--
Charlie Clark
Managing Director
Clark Consulting & Research
German Office
Kronenstr. 27a
Düsseldorf
D- 40217
Tel: +49-211-600-3657
Mobile: +49-178-782-6226

stefan...@gmail.com

unread,
Jun 16, 2016, 9:59:41 AM6/16/16
to openpyxl-users
Hi Charlie, thanks for your fast answer!
I have attached two Excel files, both with the same content. In example_1900.xlsx, you'll see that in C3 the difference between A3 and B3 can't be displayed because it's negative. Enabling "Use 1904 date format" results in the second file, example_1904.xlsx, where the negative time difference is correctly displayed. See also https://support.microsoft.com/en-us/kb/182247. The same thing happens if you write a negative datetime.timedelta into a cell.
 

> One idea I had was to create a blank file in Excel, enable said option  
> and  save it. Then I use load_workbook to load this file, write my stuff  
> into it and save it again. If you try to load a file with the 1904 date  
> format
> enabled, you get an error:
>
> Traceback (most recent call last):
>   File "./[...].py", line 179, in <module>
>     wb3 = load_workbook('[...]')
>   File "[...]/src/openpyxl/openpyxl/reader/excel.py", line 152, in
> load_workbook
>     parser.parse()
>   File "[...]/src/openpyxl/openpyxl/packaging/workbook.py", line 46, in
> parse
>     wb.excel_base_date = CALENDAR_MAC_1904
> NameError: global name 'wb' is not defined

If you're seeing the workbook package then I assume you're working with  
2.4?

Yes, the error only happened with 2.4. With 2.3.5 the error did not occur, but while trying to figure this out, I found out that 2.4 added a WorkbookProperties class which includes a property called "date1904" and I was hoping to use this property somehow. But I couldn't find a way to apply these "WorkbookProperties" to a workbook, maybe because it's only for internal use or something.
 

> This was easily fixed by replacing
> wb.excel_base_date = CALENDAR_MAC_1904
> in the workbook.py on line 46 with
> self.wb.excel_base_date = CALENDAR_MAC_1904

Looks good. I haven't got time at the moment to fix this including the  
relevant test. Do you fancy submitting a PR for this?

I've actually never submitted a PR anywhere, but I can see if I can do it this weekend. Do you mean by "relevant test" something like sample files which show the bug?
 

> But now that I can load the file, the 1904 option still gets removed  
> after saving again. Is it possible to save files from openpyxl with the  
> 1904 date format option enabled?

See above: I don't think this should strictly be necessary and that  
normalising files to the 1900 epoch is the way to go but I guess it should  
be available if it is really required?

I couldn't find any way to display negative time values using the 1900 epoch, so I'm not really sure what you suggest I should do now. If it was only about displaying these values, I could convert them to a string and use that, but the resulting Excel file that my Python script creates will be used in further analysis where the cells have to be in [hh]:mm:ss format so the calculations work properly. I was actually giving up already and decided to just manually apply the "Use 1904 date format" setting to the resulting file, but maybe we can figure out a solution.

Best,
-Stefan
example_1904.xlsx
example_1900.xlsx

Charlie Clark

unread,
Jun 16, 2016, 11:26:19 AM6/16/16
to openpyx...@googlegroups.com
Am .06.2016, 15:59 Uhr, schrieb <stefan...@gmail.com>:

> I have attached two Excel files, both with the same content.
> In example_1900.xlsx, you'll see that in C3 the difference between A3 and
> B3 can't be displayed because it's negative. Enabling "Use 1904 date
> format" results in the second file, example_1904.xlsx, where the negative
> time difference is correctly displayed. See also
> https://support.microsoft.com/en-us/kb/182247. The same thing happens if
> you write a negative datetime.timedelta into a cell.

WTF! This is a huge bug in MS Excel! The specification is quite clear
about this: dates from 0000-01-01 should be supported independently of the
epoch. openpyxl does this correctly as do OpenOffice and LibreOffice.

> I couldn't find any way to display negative time values using the 1900
> epoch, so I'm not really sure what you suggest I should do now. If it was
> only about displaying these values, I could convert them to a string and
> use that, but the resulting Excel file that my Python script creates will
> be used in further analysis where the cells have to be in [hh]:mm:ss
> format so the calculations work properly. I was actually giving up
> already and
> decided to just manually apply the "Use 1904 date format" setting to the
> resulting file, but maybe we can figure out a solution.

Converting to a string is obviously not an option but switching the epoch
has to be one of the shittiest workarounds out there. I suppose that the
solution is that we make sure that the 1904 epoch works properly and
roundtrips but this really does need deprecating. Because openpyxl is
based around the file format this should also be reported to to the ECMA
Working Group so that pressure can be put on Microsoft to fix this.

In the meantime you should probably be able to work get what you want with
openpyxl 2.3.5 because this doesn't have the packaging change.

John McNamara

unread,
Jun 16, 2016, 12:09:31 PM6/16/16
to openpyxl-users, stefan...@gmail.com


On Thursday, 16 June 2016 14:59:41 UTC+1, stefan...@gmail.com wrote:

I couldn't find any way to display negative time values using the 1900 epoch, so I'm not really sure what you suggest I should do now. If it was only about displaying these values, I could convert them to a string and use that, but the resulting Excel file that my Python script creates will be used in further analysis where the cells have to be in [hh]:mm:ss format so the calculations work properly. I was actually giving up already and decided to just manually apply the "Use 1904 date format" setting to the resulting file, but maybe we can figure out a solution.


Hi,

XlsxWriter supports the 1904 epoch and there is a test you can use if you attempt a PR to OpenPyXL.

    https://github.com/jmcnamara/XlsxWriter/blob/master/xlsxwriter/test/worksheet/test_date_time_02.py


HOWEVER, I agree with Charlie that this is a pretty crappy workaround for the MS support docs to propose.

I'd also question whether or not you actually need to do this. If your openpyxl spreadsheet is being consumed by another application does it care if the negative numbers are displayed or not. The value of the =A1-B1 calculation is still there even if it can't be displayed. For example if you wrote another formula like =A1-B1+A1 you would get a result that could be displayed.

Dates in Excel are actually just ordinary numbers with formatting. So, if you omitted the time format and used the "General" format you would get the exact same result (a negative number) that could be used in additional date calculations.

John


 

Charlie Clark

unread,
Jun 16, 2016, 12:54:35 PM6/16/16
to openpyx...@googlegroups.com
Am .06.2016, 18:09 Uhr, schrieb John McNamara <jmcna...@gmail.com>:

> XlsxWriter supports the 1904 epoch and there is a test you can use if you
> attempt a PR to OpenPyXL.

Thanks, John. I think we've actually got the conversion using the 1904
epoch properly tested. But what's not happening properly is the
preservation of the Workbook setting due to the extensive refactoring of
packaging in 2.4 to prepare for better roundtrip support. Should be pretty
easy to fix and write an integration test for but the stop energy is
pretty big on this. I'm sorely tempted to drop support for this Excel 2003
for Mac wart entirely.

Charlie Clark

unread,
Jul 11, 2016, 1:57:59 PM7/11/16
to openpyx...@googlegroups.com
Am .06.2016, 18:54 Uhr, schrieb Charlie Clark
<charli...@clark-consulting.eu>:

> Thanks, John. I think we've actually got the conversion using the 1904
> epoch properly tested. But what's not happening properly is the
> preservation of the Workbook setting due to the extensive refactoring of
> packaging in 2.4 to prepare for better roundtrip support. Should be
> pretty easy to fix and write an integration test for but the stop energy
> is pretty big on this. I'm sorely tempted to drop support for this Excel
> 2003 for Mac wart entirely.

FWIW the 2.4 branch now has support for this wart.
Reply all
Reply to author
Forward
0 new messages