32/64 bit issue?

102 views
Skip to first unread message

Adrian Klaver

unread,
Dec 26, 2014, 2:43:15 PM12/26/14
to python...@googlegroups.com
Trying to figure out an issue I think is a 32/64 bit problem.

xlrd==0.9.2, though I see it with 0.9.0 and 0.8.0 also.

xls file that has a UPC column with values like:

744803988492
744835984653
744835984608

On a 32 bit machine I can open the file with xlrd and get the correct
values.

Copy file to 64 bit machine and all I get is values of 0.0 for the UPC
column. I can change the formatting in the file to text and I still get
0.0 for values. If I copy the file back to a 32 bit machine the problem
persists. The only way I can get correct values is have the file
originate on a 32 bit machine and open it on a 32 bit machine.

Any thoughts?

--
Adrian Klaver
adrian...@aklaver.com

Adrian Klaver

unread,
Dec 26, 2014, 3:02:10 PM12/26/14
to python...@googlegroups.com
Should have added that the file starts on Ubuntu 10.04 32-bit and is
copied via scp to openSUSE 13.1 32-bit. At this point everything still
works. I then scp to a openSUSE 13.1 64-bit machine. It is there that I
have issues. If I copy it back to the openSUSE 13.1 32-bit machine I
still have issues. The only way to get the values correct again is to
scp from the Ubuntu machine to the openSUSE 32-bit machine or work on
the Ubuntu machine.

John Yeung

unread,
Dec 26, 2014, 3:49:53 PM12/26/14
to python-excel
On Fri, Dec 26, 2014 at 3:02 PM, Adrian Klaver
<adrian...@aklaver.com> wrote:
> Should have added that the file starts on Ubuntu 10.04 32-bit and is copied
> via scp to openSUSE 13.1 32-bit. At this point everything still works. I
> then scp to a openSUSE 13.1 64-bit machine. It is there that I have issues.

This may be naive of me, but maybe there are issues with scp? Are you
sure the files are bit-for-bit identical across the transfer?

Which versions (and bitnesses) of Pythons are being used on the
respective systems?

Finally, how are the originals being generated? Clearly not with Excel. xlwt?

Everything I'm asking is just a shot in the dark. The whole thing
sounds very bizarre to me.

John Y.

Adrian Klaver

unread,
Dec 27, 2014, 9:53:55 AM12/27/14
to python...@googlegroups.com
On 12/26/2014 12:49 PM, John Yeung wrote:
> On Fri, Dec 26, 2014 at 3:02 PM, Adrian Klaver
> <adrian...@aklaver.com> wrote:
>> Should have added that the file starts on Ubuntu 10.04 32-bit and is copied
>> via scp to openSUSE 13.1 32-bit. At this point everything still works. I
>> then scp to a openSUSE 13.1 64-bit machine. It is there that I have issues.
>
> This may be naive of me, but maybe there are issues with scp? Are you
> sure the files are bit-for-bit identical across the transfer?

Well the 32 to 32 bit transfer works fine. I will need to figure another
way to get the file from the 32 bit to 64 bit machine and will get back
on this.

>
> Which versions (and bitnesses) of Pythons are being used on the
> respective systems?

Python 2.6.5(32) on the Ubuntu machine, 2.7.6(32) on the one openSUSE
13.1, 2.7.6(64) on the other openSUSE machine
>
> Finally, how are the originals being generated? Clearly not with Excel. xlwt?

The originals come from a vendor and I am not sure how they are
produced, though I suspect they are created programmatically. The format
and the data have not changed in years and on 32 bit machines everything
works.

>
> Everything I'm asking is just a shot in the dark. The whole thing
> sounds very bizarre to me.

Me to. I suspect some sort of precision problem, not sure exactly how to
determine though.
>
> John Y.
>


--
Adrian Klaver
adrian...@aklaver.com

Adrian Klaver

unread,
Dec 27, 2014, 3:21:58 PM12/27/14
to python...@googlegroups.com
Well that was a trip down the rabbit hole!

I started by trying different transfer methods and I could not replicate
the issue, even using scp. This was transferring the file in multiple
directions, but only opening to read in xlrd.

Then I remembered I may have passed the file through
LibreOffice(4.1.6.2) previously, so I decided to do that again. That is
where things got weird. In summary 32/64 bit does not matter, but:

1) Just opening the file in LO without changing anything is fine.

2) Open the file and make a change anywhere and save as xls. The values
in LO stay the same, but when you open the file in xlrd they become 0.0

3) Take the changed file Save As *.ods and then Save As *.xls and the
problem goes away. You see the values in both LO and xlrd.

4) Open the original(before change) *.xls and Save As *.xlsx. Make
changes to *.xlsx file and save, no problem.

So saving an *.xls file in place in LO has an issue. On to the LO bug
tracker.


Thanks for the sounding board. Will report back on any feedback from LO.

Adrian Klaver

unread,
Dec 28, 2014, 5:24:37 PM12/28/14
to python...@googlegroups.com
On 12/27/2014 12:21 PM, Adrian Klaver wrote:

>
>
> Thanks for the sounding board. Will report back on any feedback from LO.

I opened a bug report in LibreOffice Bugzilla:

https://bugs.freedesktop.org/show_bug.cgi?id=87773

raal has started doing some testing on this using xlrd. Someone more
familiar with xlrd than I should probably take a look at the results.

John Machin

unread,
Dec 29, 2014, 4:58:11 PM12/29/14
to python...@googlegroups.com


On Monday, December 29, 2014 9:24:37 AM UTC+11, aklaver wrote:
On 12/27/2014 12:21 PM, Adrian Klaver wrote:

>
>
> Thanks for the sounding board. Will report back on any feedback from LO.

I opened a bug report in LibreOffice Bugzilla:

https://bugs.freedesktop.org/show_bug.cgi?id=87773


Hi Adrian, can you please send me copies of (a) original XLS file (b) result of open in LO, change cell A1, save as XLS?
This looks weird ... Original values appear to be strings, but mutate into numbers. I'm looking forward to seeing what is actually in the second file at the BIFF level.
Cheers,
John


John Machin

unread,
Dec 30, 2014, 4:01:25 PM12/30/14
to python...@googlegroups.com

Adrian, Thanks for sending the 2 files.

ORIGINAL: Each cell is set up as a formula (just as though you had typed e.g. ="1234567" into the UI) AND it fills in the value appropriately. This is grossly wasteful of space and CPU cycles compared with a string constant (like you had typed '1234567 into the UI). In any case, xlrd extracts the value happily.

SAVED: As above, but the value is set to be a numeric zero, hence xlrd returns numeric zero (xlrd doesn't have a formula evaluator, like LO). I would regard this a a bug in LO; the LO maintainers may have a different view.

By the way, one little piece of mystery is left: The ORIGINAL file last-saved-by is "Calc" i.e. LO or OO, same as the SAVED file ... you may need to dig a little further into the history of the ORIGINAL file.

Cheers,
John

Adrian Klaver

unread,
Dec 30, 2014, 4:26:09 PM12/30/14
to python...@googlegroups.com
On 12/30/2014 01:01 PM, John Machin wrote:
>
>
> On Tuesday, December 30, 2014 8:58:11 AM UTC+11, John Machin wrote:
>
>
>
> On Monday, December 29, 2014 9:24:37 AM UTC+11, aklaver wrote:
>
> On 12/27/2014 12:21 PM, Adrian Klaver wrote:
>
> >
> >
> > Thanks for the sounding board. Will report back on any
> feedback from LO.
>
> I opened a bug report in LibreOffice Bugzilla:
>
> https://bugs.freedesktop.org/show_bug.cgi?id=87773
> <https://bugs.freedesktop.org/show_bug.cgi?id=87773>
>
>
>
> Hi Adrian, can you please send me copies of (a) original XLS file
> (b) result of open in LO, change cell A1, save as XLS?
> This looks weird ... Original values appear to be strings, but
> mutate into numbers. I'm looking forward to seeing what is actually
> in the second file at the BIFF level.
> Cheers,
> John
>
>
> Adrian, Thanks for sending the 2 files.
>
> ORIGINAL: Each cell is set up as a formula (just as though you had typed
> e.g. ="1234567" into the UI) AND it fills in the value appropriately.
> This is grossly wasteful of space and CPU cycles compared with a string
> constant (like you had typed '1234567 into the UI). In any case, xlrd
> extracts the value happily.

Well the file comes from a third party and I half suspect it is
generated programmatically.

>
> SAVED: As above, but the value is set to be a numeric zero, hence xlrd
> returns numeric zero (xlrd doesn't have a formula evaluator, like LO). I
> would regard this a a bug in LO; the LO maintainers may have a different
> view.

I am not understanding.

Where is the value set to 0.0?

If I look in the saved version of the file I still ="744835962453"

>
> By the way, one little piece of mystery is left: The ORIGINAL file
> last-saved-by is "Calc" i.e. LO or OO, same as the SAVED file ... you
> may need to dig a little further into the history of the ORIGINAL file.

It is confirmation of an order. I have been trying for years to get the
data in a non-spreadsheet format so as I do not have deal with all the
issues in parsing something where the data stays the same but the
formatting changes.

>
> Cheers,
> John
>

--
Adrian Klaver
adrian...@aklaver.com

Adrian Klaver

unread,
Dec 30, 2014, 4:53:38 PM12/30/14
to python...@googlegroups.com
On 12/30/2014 01:01 PM, John Machin wrote:
>
>
I forgot which file you where looking at. That was a sample file I
created by cut and pasting UPC values from the file the data originated
in. So it was an LO to LO transfer. So only saving the pasted values did
not have an effect either.

John Yeung

unread,
Dec 30, 2014, 6:00:23 PM12/30/14
to python-excel
On Tue, Dec 30, 2014 at 4:26 PM, Adrian Klaver
<adrian...@aklaver.com> wrote:
> On 12/30/2014 01:01 PM, John Machin wrote:
>> ORIGINAL: Each cell is set up as a formula (just as though you had typed
>> e.g. ="1234567" into the UI) AND it fills in the value appropriately.
>> This is grossly wasteful of space and CPU cycles compared with a string
>> constant (like you had typed '1234567 into the UI). In any case, xlrd
>> extracts the value happily.
>>
>> SAVED: As above, but the value is set to be a numeric zero, hence xlrd
>> returns numeric zero (xlrd doesn't have a formula evaluator, like LO). I
>> would regard this a a bug in LO; the LO maintainers may have a different
>> view.
>
> I am not understanding.
>
> Where is the value set to 0.0?
>
> If I look in the saved version of the file I still ="744835962453"

Adrian,

You can think of cells as having a "slot" for the formula and another
slot for the naked (fully calculated) value. In all cases, xlrd will
read only the naked value slot. If you see

="744835962453"

then you are looking at a formula, not a value.

Now, when the Excel GUI saves an Excel file, any formula-containing
cells are evaluated and the value slot is populated with the result.
But many (most?) non-Excel programs do no such thing. They will
populate the value slot with 0 (zero).

See XlsxWriter's documentation on its approach to writing formula cells:

https://xlsxwriter.readthedocs.org/worksheet.html#write_formula

Relevant excerpt:

"XlsxWriter doesn’t calculate the value of a formula and instead
stores the value 0 as the formula result. It then sets a global flag
in the XLSX file to say that all formulas and functions should be
recalculated when the file is opened. This is the method recommended
in the Excel documentation and in general it works fine with
spreadsheet applications. However, applications that don’t have a
facility to calculate formulas, such as Excel Viewer, or some mobile
applications will only display the 0 results."

>> By the way, one little piece of mystery is left: The ORIGINAL file
>> last-saved-by is "Calc" i.e. LO or OO, same as the SAVED file ... you
>> may need to dig a little further into the history of the ORIGINAL file.
>
> It is confirmation of an order. I have been trying for years to get the data
> in a non-spreadsheet format so as I do not have deal with all the issues in
> parsing something where the data stays the same but the formatting changes.

By "history", I think John Machin is referring to the purely technical
matter of the modification history of the file. He is saying that the
file is self-reporting as last being saved by LO or OO. Which is to
say that maybe it was in fact last saved by LO or OO, or it could be
that whatever program actually last saved the file has "spoofed" the
signature of LO/OO.

John Y.

Adrian Klaver

unread,
Dec 30, 2014, 7:01:56 PM12/30/14
to python...@googlegroups.com
Got it now. Thanks.

>
> By "history", I think John Machin is referring to the purely technical
> matter of the modification history of the file. He is saying that the
> file is self-reporting as last being saved by LO or OO. Which is to
> say that maybe it was in fact last saved by LO or OO, or it could be
> that whatever program actually last saved the file has "spoofed" the
> signature of LO/OO.

See my later post. There is actually more history, but I don't think it
matters to this problem other then why the formulas are present in the
cells instead of strings. This I believe is because the root xls file is
generated from a script from data in a datastore by another party. I
have no control over this so I have do to the adapting.
Reply all
Reply to author
Forward
0 new messages