xlwt: formula enhancements (incl. inter-sheet references) available in svn

1,342 views
Skip to first unread message

John Machin

unread,
Feb 12, 2009, 7:56:55 PM2/12/09
to python...@googlegroups.com
Hi all,

I have just committed a bunch of formula and work-sheet-name-checking
changes to svn (revision 3750). Full details below -- really full
details including rationales and design decisions for the benefit of
Chris who's doing an xl.* tute at PyCon and Ana who's porting xlwt to Ruby.

Please consider this as a release candidate -- some testing wouldn't go
astray.

Cheers,
John

=== inter-sheet references in formulas ===

Examples:
(1) a formula in Sheet1 refers to Sheet2!A1:Z99
(2) totalling a 3D block of cells: SUM(Sheet2:Sheet9!B2:G10)

CREDITS: This work is based on the pyExcelerator patch written by Ruben
Mendes. Christophe Tronche ported Ruben's patch to xlwt, and made
a significant improvement to the formula lexer.

Notes:

(1) [Extension to Excel syntax] You may use a 0-based sheet index
instead of a sheet name e.g. SUM(0:2!A1) instead of SUM(First:Third!A1).

(2) Referenced sheets must be created with WorkBook.add_sheet() before
the referencing formula is used in Worksheet.write(). Otherwise, an
exception is raised (either unknown sheet name, or sheet index out of
range, as appropriate). This checking *must* be done at some stage; if
not, the damage is at best a cell displaying #REF!; at worst, Excel will
crash. Note that you don't need to fill in any cells in the referenced
sheet before you reference it; it suffices that the sheet exists. This
should not concern anyone who is referencing sheets by name -- if you
know the name, you can add the sheet. It may concern people who are
referencing by sheet index and know how many sheets they need but don't
know the names yet -- in this presumably unlikely scenario, the best way
way out would be to offer a sheet-renaming facility rather than rewrite
the reference back-patching (of which the checking is an adjunct) so
that it's done at Workbook.save() time.

(3) If a sheet name is not simple (roughly: it wouldn't qualify as an
identifier in Python), you need to quote it so that it can be recognised
as a sheet name when the formula is compiled. Utils.quote_sheet_name can
be used if you want to do it programatically:

def quote_sheet_name(unquoted_sheet_name):
if not valid_sheet_name(unquoted_sheet_name):
raise Exception(
'attempt to quote an invalid worksheet name %r' %
unquoted_sheet_name)
return u"'" + unquoted_sheet_name.replace(u"'", u"''") + u"'"

or you can do it manually:
O'Reilly -> 'O''Reilly'
123 -> '123'
Smith & Wesson -> 'Smith & Wesson'

In a formula, the only penalty for quoting a name that doesn't need
quoting is that your code will look slightly less elegant than otherwise.

On the other hand, don't bother trying to quote a sheet name before
passing it to Workbook.add_sheet() -- you don't need to, and you won't
be allowed to; see later.

=== IF and CHOOSE functions ===

xlwt now follows Excel in emitting special-case code for the IF and
CHOOSE functions -- they now avoid evaluating arguments that don't need
to be evaluated.

These functions can now handle passing through both value and
reference arguments. For example, SUM(IF(cond, B1:B10, C1:C10)) gives
the same result as IF(cond, SUM(B1:B10), SUM(C1:C10)).

The 2-argument form of the IF function has not been
implemented, and missing arguments are not catered for. I'd be
interested to hear from anyone who thinks that it's a good idea to write
IF(cond, , false_value) instead of IF(cond, 0, false_value
IF(cond, true_value) instead of IF(cond, true_value, FALSE)
and IF(cond, true_value, ) instead of IF(cond, true_value, 0)
especially if they claim to remember which form does what. Did you know
that Excel allows you to write IF(,,) which is interpreted as IF(0,0,0)?

=== SUM() function with one argument ===

xlwt now emits special-case code when SUM() has only one argument. Why?
Because Excel does so, and it was only 2 extra lines of code.

=== argument separators ===

xlwt now permits a comma "," as an alternative to a semicolon ";"
between function arguments: SUM(1,2,3) or SUM(1;2;3) or (if you think
consistency is boring) SUM(1,2;3). Note that the separators are not
stored in the compiled formula. When displaying a formula, Excel uses
the separator appropriate to the user's locale.

=== Bug: bad formula code when abs(integer_constant) >= 65536 ===

Excel has two formula opcodes for a constant number; one takes a float,
the other takes only an unsigned 16-bit integer. The latter was being
used for any integer constant. The result was that (e.g.) 65537 would
become 1, and -65537 [the - is a unary minus operator, not part of the
constant as far as the lexer is concerned] would become -1. Python 2.5
and 2.6 produced a deprecation warning (from struct.pack):
DeprecationWarning: 'H' format requires 0 <= number <= 65535
but 2.3 and 2.4 would let it pass silently. Now the opcode used for an
integer constant is conditional on its size.

=== Buglet: formula lexer wouldn't allow " in a text constant ===

Text constants use only " as quote characters. To get one in as content,
you need to double it. To get a cell to display the 7 characters
"hello", you need Formula('"""hello"""') where (starting from either
end) the ' is for Python, the outer " delimits a text constant in a
formula, and the inner "" is for the one " that you want to appear. The
lexer hadn't been told the doubling story, and regarded that as 3 text
constants with lengths 0, 5, and 0. Now fixed.

=== antlr.py ===

The formula compiler (ExcelFormulaParser.py) is generated by ANTLRv2
(http://www.antlr2.org/) from the grammar/action file excel-formula.g,
and antlr.py is in effect a run-time library for the compiler. All of
this has been updated to use the latest (last?) ANTLRv2 release (2.7.7)
and the licence-related text in antlr.py changed to accord with the
regulations of the Python packaging team of a Linux distribution.

=== checking for invalid or duplicate sheet names ===

This has been put in as an offshoot of sheet references in formulas.

The validity of a sheet name is checked in Workbook.add_sheet() by
calling Utils.valid_sheet_name:

def valid_sheet_name(sheet_name):
if sheet_name == u"" or sheet_name[0] == u"'" or len(sheet_name) > 31:
return False
for c in sheet_name:
if c in u"[]:\\?/*\x00":
return False
return True

These restrictions are necessary to keep Excel happy but may not be
sufficient. E.g. the \x00 restriction is not documented but was
discovered by experiment with sheets named u"A" + unichr(i) for i in
range(32). Yes, you can have a sheet named "\n". That's why the above
function is not called sensible_sheet_name. Advice to the overly
curious: don't try feeding Excel a file created using range(65536) at
home, /or/ at the office. I'll break that down into bite-sized chunks
and report later.

Note that starting a sheet name with an apostrophe is not allowed; it's
even written into the OOXML spec. This is why trying to supply a quoted
sheet name is futile.

xlwt now maintains a mapping from sheet_name.lower() to sheet_index.
This is used by Workbook.add_sheet() to check for duplicate sheet names.

If the sheet name supplied is not already unicode, it is decoded before
the invalid and duplicate checks are applied.

Note that if any invalid or duplicate sheet names exist in the file,
Excel 2003 will pop up a dialogue box mentioning "unreadable
content" and offering the user a repair job. This latter consists of
renaming dodgy sheets as "Recovered_Sheet1", "Recovered_Sheet2", etc.
It even creates a very informative log file, e.g.
"""
Microsoft Office Excel File Repair Log

Errors were detected in file 'C:\junk\100dupsheets.xls'
The following is a list of repairs:

Renamed invalid sheet name.
Renamed invalid sheet name.
[snip another 97 lines, all alike]
"""

There's a strong presumption that you don't want your users to have so
much fun, hence the restrictions :-)

=== That's all, folks ===

Chris Withers

unread,
Feb 13, 2009, 4:38:21 AM2/13/09
to python...@googlegroups.com
John Machin wrote:
> (2) Referenced sheets must be created with WorkBook.add_sheet() before
> the referencing formula is used in Worksheet.write(). Otherwise, an
> exception is raised (either unknown sheet name, or sheet index out of
> range, as appropriate).

Are you saying xlwt does this checking or that the code using xlwt must
do the checking. It should really be the former ;-)

> (3) If a sheet name is not simple (roughly: it wouldn't qualify as an
> identifier in Python), you need to quote it so that it can be recognised
> as a sheet name when the formula is compiled. Utils.quote_sheet_name can
> be used if you want to do it programatically:

Is that xlwt.Utils.quote_sheetname or something else?

> === Bug: bad formula code when abs(integer_constant) >= 65536 ===
>
> Excel has two formula opcodes for a constant number; one takes a float,
> the other takes only an unsigned 16-bit integer. The latter was being
> used for any integer constant. The result was that (e.g.) 65537 would
> become 1, and -65537 [the - is a unary minus operator, not part of the
> constant as far as the lexer is concerned] would become -1. Python 2.5
> and 2.6 produced a deprecation warning (from struct.pack):
> DeprecationWarning: 'H' format requires 0 <= number <= 65535
> but 2.3 and 2.4 would let it pass silently. Now the opcode used for an
> integer constant is conditional on its size.
>
> === Buglet: formula lexer wouldn't allow " in a text constant ===
>
> Text constants use only " as quote characters. To get one in as content,
> you need to double it. To get a cell to display the 7 characters
> "hello", you need Formula('"""hello"""') where (starting from either
> end) the ' is for Python, the outer " delimits a text constant in a
> formula, and the inner "" is for the one " that you want to appear. The
> lexer hadn't been told the doubling story, and regarded that as 3 text
> constants with lengths 0, 5, and 0. Now fixed.

So, just to double check, these two are fixed on the xlwt trunk?

> === checking for invalid or duplicate sheet names ===
>
> This has been put in as an offshoot of sheet references in formulas.
>
> The validity of a sheet name is checked in Workbook.add_sheet() by
> calling Utils.valid_sheet_name:
>
> def valid_sheet_name(sheet_name):
> if sheet_name == u"" or sheet_name[0] == u"'" or len(sheet_name) > 31:
> return False
> for c in sheet_name:
> if c in u"[]:\\?/*\x00":
> return False
> return True

Shouldn't the above be:

def valid_sheet_name(sheet_name):
if not sheet_name or or len(sheet_name) > 1:


return False
for c in sheet_name:
if c in u"'[]:\\?/*\x00":
return False
return True

> Note that starting a sheet name with an apostrophe is not allowed; it's


> even written into the OOXML spec. This is why trying to supply a quoted
> sheet name is futile.

...so why not just ban apostrophes in sheet names full stop?

> xlwt now maintains a mapping from sheet_name.lower() to sheet_index.
> This is used by Workbook.add_sheet() to check for duplicate sheet names.

yay! :-)

When all this makes it into an xlwt release (pretty please, along with
an xlrd release, they'll get much more testing if they're released and
you can always do a x.y.1 release for any bugs found. please. PLEASE.
"help us John, you're our only hope!", etc ;-) ) I'll remove the code
that does these checks from xlutils.filter...

This all looks great, thanks to all who helped!

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
- http://www.simplistix.co.uk

Ch. Tronche

unread,
Feb 13, 2009, 4:47:38 AM2/13/09
to python-excel
Congratulations !

I've no doubt an important milestone to a wider xlwt acceptance and
use has been passed.

Thank you to John and all the others for their dedicated work.

Regards.

John Machin

unread,
Feb 13, 2009, 6:04:02 AM2/13/09
to python...@googlegroups.com
On 13/02/2009 8:38 PM, Chris Withers wrote:
> John Machin wrote:
>> (2) Referenced sheets must be created with WorkBook.add_sheet() before
>> the referencing formula is used in Worksheet.write(). Otherwise, an
>> exception is raised (either unknown sheet name, or sheet index out of
>> range, as appropriate).
>
> Are you saying xlwt does this checking or that the code using xlwt must
> do the checking. It should really be the former ;-)

C:\junk>\python23\python
Python 2.3.5 (#62, Feb 8 2005, 16:23:02) [MSC v.1200 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import xlwt
>>> wb = xlwt.Workbook()
>>> ws = wb.add_sheet('foo')
>>> ws.write(0, 0, xlwt.Formula('frobozz!a1'))
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "C:\python23\Lib\site-packages\xlwt\Worksheet.py", line 1002, in
write
self.row(r).write(c, label, style)
File "C:\python23\Lib\site-packages\xlwt\Row.py", line 247, in write
self.__parent_wb.add_sheet_reference(label)
File "C:\python23\Lib\site-packages\xlwt\Workbook.py", line 341, in
add_sheet_reference
self.raise_bad_sheetname(ref0)
File "C:\python23\Lib\site-packages\xlwt\Workbook.py", line 323, in
raise_bad_sheetname
raise Exception("Formula: unknown sheet name %s" % sheetname)
Exception: Formula: unknown sheet name frobozz
>>> ws.write(0, 0, xlwt.Formula('666!a1'))
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "C:\python23\Lib\site-packages\xlwt\Worksheet.py", line 1002, in
write
self.row(r).write(c, label, style)
File "C:\python23\Lib\site-packages\xlwt\Row.py", line 247, in write
self.__parent_wb.add_sheet_reference(label)
File "C:\python23\Lib\site-packages\xlwt\Workbook.py", line 343, in
add_sheet_reference
ref0n = self.convert_sheetindex(ref0, n_sheets)
File "C:\python23\Lib\site-packages\xlwt\Workbook.py", line 330, in
convert_sheetindex
raise Exception(msg)
Exception: Formula: sheet index (666) >= number of sheets (1)

>> (3) If a sheet name is not simple (roughly: it wouldn't qualify as an
>> identifier in Python), you need to quote it so that it can be recognised
>> as a sheet name when the formula is compiled. Utils.quote_sheet_name can
>> be used if you want to do it programatically:
>
> Is that xlwt.Utils.quote_sheetname or something else?

There is no xlwt.Utils.quote_sheetname; it's something else
(xlwt.Utils.quote_sheet_name)


>> === Bug: bad formula code when abs(integer_constant) >= 65536 ===
>>
>> Excel has two formula opcodes for a constant number; one takes a float,
>> the other takes only an unsigned 16-bit integer. The latter was being
>> used for any integer constant. The result was that (e.g.) 65537 would
>> become 1, and -65537 [the - is a unary minus operator, not part of the
>> constant as far as the lexer is concerned] would become -1. Python 2.5
>> and 2.6 produced a deprecation warning (from struct.pack):
>> DeprecationWarning: 'H' format requires 0 <= number <= 65535
>> but 2.3 and 2.4 would let it pass silently. Now the opcode used for an
>> integer constant is conditional on its size.
> >
>> === Buglet: formula lexer wouldn't allow " in a text constant ===
>>
>> Text constants use only " as quote characters. To get one in as content,
>> you need to double it. To get a cell to display the 7 characters
>> "hello", you need Formula('"""hello"""') where (starting from either
>> end) the ' is for Python, the outer " delimits a text constant in a
>> formula, and the inner "" is for the one " that you want to appear. The
>> lexer hadn't been told the doubling story, and regarded that as 3 text
>> constants with lengths 0, 5, and 0. Now fixed.
>
> So, just to double check, these two are fixed on the xlwt trunk?

First: """Now the opcode used for an integer constant is conditional on
its size.""" Now implies fixed.
Second: """Now fixed."""

trunk shmunk. You're the SVN expert; I just pressed the usual buttons in
TortoiseSVN. All the changes are going be in the trunk together or
somewhere else together. Talking about trunks: have you seen the series
of ads for some American auto parts company featuring The Trunk Monkey?
[worth finding on YouTube if you haven't]. Sometimes I feel like
deputising The Trunk Monkey to answer questions here :-)

>> === checking for invalid or duplicate sheet names ===
>>
>> This has been put in as an offshoot of sheet references in formulas.
>>
>> The validity of a sheet name is checked in Workbook.add_sheet() by
>> calling Utils.valid_sheet_name:
>>
>> def valid_sheet_name(sheet_name):
>> if sheet_name == u"" or sheet_name[0] == u"'" or len(sheet_name) > 31:
>> return False
>> for c in sheet_name:
>> if c in u"[]:\\?/*\x00":
>> return False
>> return True
>
> Shouldn't the above be:
>
> def valid_sheet_name(sheet_name):
> if not sheet_name or or len(sheet_name) > 1:
> return False
> for c in sheet_name:
> if c in u"'[]:\\?/*\x00":
> return False
> return True

No.

>> Note that starting a sheet name with an apostrophe is not allowed; it's
>> even written into the OOXML spec. This is why trying to supply a quoted
>> sheet name is futile.
>
> ...so why not just ban apostrophes in sheet names full stop?

In case that was not a rhetorical question: I don't know; ask
bill....@alumni.microsoft.com

>> xlwt now maintains a mapping from sheet_name.lower() to sheet_index.
>> This is used by Workbook.add_sheet() to check for duplicate sheet names.
>
> yay! :-)
>
> When all this makes it into an xlwt release (pretty please, along with
> an xlrd release, they'll get much more testing if they're released and
> you can always do a x.y.1 release for any bugs found. please. PLEASE.
> "help us John, you're our only hope!", etc ;-)

OK, Princess Leia, seeing you asked so nicely ...

Chris Withers

unread,
Feb 14, 2009, 7:38:56 AM2/14/09
to python...@googlegroups.com
John Machin wrote:
> >>> import xlwt
> >>> wb = xlwt.Workbook()
> >>> ws = wb.add_sheet('foo')
> >>> ws.write(0, 0, xlwt.Formula('frobozz!a1'))
> Traceback (most recent call last):
> ...

> Exception: Formula: unknown sheet name frobozz
> >>> ws.write(0, 0, xlwt.Formula('666!a1'))
> Traceback (most recent call last):
> ...

> Exception: Formula: sheet index (666) >= number of sheets (1)

Cool, and with the minor tweaks I just applied, these become doctests.
Would be great in they ended up in a tests model for xlwt.
All we'd need is some way of running them, my favourite would be
zope.testing, but that prettymuch requires we have a buildout each for
xlrd and xlwt. Not a biggie, and I'm happy to set it up if you'd like.

Let me know...

>>> (3) If a sheet name is not simple (roughly: it wouldn't qualify as an
>>> identifier in Python), you need to quote it so that it can be recognised
>>> as a sheet name when the formula is compiled. Utils.quote_sheet_name can
>>> be used if you want to do it programatically:
>> Is that xlwt.Utils.quote_sheetname or something else?
>
> There is no xlwt.Utils.quote_sheetname; it's something else
> (xlwt.Utils.quote_sheet_name)

yes yes ;-)

> First: """Now the opcode used for an integer constant is conditional on
> its size.""" Now implies fixed.
> Second: """Now fixed."""
>
> trunk shmunk. You're the SVN expert; I just pressed the usual buttons in
> TortoiseSVN.

That'll be trunk then *grinz* One day I'll tempt you into reading the
svn branch/merge pattern...

http://svnbook.red-bean.com/en/1.5/svn.branchmerge.html

...although this may be simpler:

http://svnbook.red-bean.com/en/1.4/svn.branchmerge.html

...given that 1.5 introduces a whole load of new bells and whistles.

> somewhere else together. Talking about trunks: have you seen the series
> of ads for some American auto parts company featuring The Trunk Monkey?
> [worth finding on YouTube if you haven't]. Sometimes I feel like
> deputising The Trunk Monkey to answer questions here :-)

http://www.trunkmonkeyad.com/

These two feel like they could be used on some of the posters to this forum:

http://www.trunkmonkeyad.com/1qt.htm
http://www.trunkmonkeyad.com/4qt.htm

>>> def valid_sheet_name(sheet_name):
>>> if sheet_name == u"" or sheet_name[0] == u"'" or len(sheet_name) > 31:
>>> return False
>>> for c in sheet_name:
>>> if c in u"[]:\\?/*\x00":
>>> return False
>>> return True
>> Shouldn't the above be:
>>
>> def valid_sheet_name(sheet_name):
>> if not sheet_name or or len(sheet_name) > 1:
>> return False
>> for c in sheet_name:
>> if c in u"'[]:\\?/*\x00":
>> return False
>> return True
>
> No.

Why not?

>>> Note that starting a sheet name with an apostrophe is not allowed; it's
>>> even written into the OOXML spec. This is why trying to supply a quoted
>>> sheet name is futile.
>> ...so why not just ban apostrophes in sheet names full stop?
>
> In case that was not a rhetorical question: I don't know; ask
> bill....@alumni.microsoft.com

Well, just because MS allow it, doesn't mean xlwt has to...

>> When all this makes it into an xlwt release (pretty please, along with
>> an xlrd release, they'll get much more testing if they're released and
>> you can always do a x.y.1 release for any bugs found. please. PLEASE.
>> "help us John, you're our only hope!", etc ;-)
>
> OK, Princess Leia, seeing you asked so nicely ...

Glad I shaved :-P

John Machin

unread,
Feb 14, 2009, 4:58:25 PM2/14/09
to python...@googlegroups.com
On 14/02/2009 11:38 PM, Chris Withers wrote:
> John Machin wrote:

>
> Cool, and with the minor tweaks I just applied, these become doctests.
> Would be great in they ended up in a tests model for xlwt.
> All we'd need is some way of running them, my favourite would be
> zope.testing, but that prettymuch requires we have a buildout each for
> xlrd and xlwt. Not a biggie, and I'm happy to set it up if you'd like.

How about a separate thread about testing?

>
>>>> def valid_sheet_name(sheet_name):
>>>> if sheet_name == u"" or sheet_name[0] == u"'" or len(sheet_name) > 31:
>>>> return False
>>>> for c in sheet_name:
>>>> if c in u"[]:\\?/*\x00":
>>>> return False
>>>> return True
>>> Shouldn't the above be:
>>>
>>> def valid_sheet_name(sheet_name):
>>> if not sheet_name or or len(sheet_name) > 1:
>>> return False
>>> for c in sheet_name:
>>> if c in u"'[]:\\?/*\x00":
>>> return False
>>> return True
>> No.
>
> Why not?

(a) max len is 31, not 1
(b) readability: avoids double negative of "not sheet_name ... return
False"; avoids tiny pause caused by consulting mental operator
precedence table to resolve "not X or Y"

>>>> Note that starting a sheet name with an apostrophe is not allowed; it's
>>>> even written into the OOXML spec. This is why trying to supply a quoted
>>>> sheet name is futile.
>>> ...so why not just ban apostrophes in sheet names full stop?
>> In case that was not a rhetorical question: I don't know; ask
>> bill....@alumni.microsoft.com

> Well, just because MS allow it, doesn't mean xlwt has to...

We may want to consider a companion function sensible_sheet_name for
*voluntary* use. Even in that function, apostrophes should be allowed.
Otherwise we'd be canning such sensible sheet names as O'Reilly,
l'Hōpital, Nuku'alofa, ...

Cheers,
John

Chris Withers

unread,
Feb 20, 2009, 11:50:21 AM2/20/09
to python...@googlegroups.com
John Machin wrote:
> On 14/02/2009 11:38 PM, Chris Withers wrote:
>> John Machin wrote:
>
>> Cool, and with the minor tweaks I just applied, these become doctests.
>> Would be great in they ended up in a tests model for xlwt.
>> All we'd need is some way of running them, my favourite would be
>> zope.testing, but that prettymuch requires we have a buildout each for
>> xlrd and xlwt. Not a biggie, and I'm happy to set it up if you'd like.
>
> How about a separate thread about testing?

Tried that already:

http://groups.google.com/group/python-excel/browse_thread/thread/b3c2a3c3f2462bf5/c29117b9c9e3f235?lnk=gst&q=testing#c29117b9c9e3f235

...so thought I'd try again ;-)

>>>>> def valid_sheet_name(sheet_name):
>>>>> if sheet_name == u"" or sheet_name[0] == u"'" or len(sheet_name) > 31:
>>>>> return False
>>>>> for c in sheet_name:
>>>>> if c in u"[]:\\?/*\x00":
>>>>> return False
>>>>> return True
>>>> Shouldn't the above be:
>>>>
>>>> def valid_sheet_name(sheet_name):
>>>> if not sheet_name or or len(sheet_name) > 1:
>>>> return False
>>>> for c in sheet_name:
>>>> if c in u"'[]:\\?/*\x00":
>>>> return False
>>>> return True
>>> No.
>> Why not?
>
> (a) max len is 31, not 1

It's okay, a unit test would have caught that ;-)

> (b) readability: avoids double negative of "not sheet_name ... return
> False"; avoids tiny pause caused by consulting mental operator
> precedence table to resolve "not X or Y"

Less of a mental pause than parsing:

if sheet_name == u"" or sheet_name[0] == u"'" or len(sheet_name) > 31

;-)

>> Well, just because MS allow it, doesn't mean xlwt has to...
>
> We may want to consider a companion function sensible_sheet_name for
> *voluntary* use. Even in that function, apostrophes should be allowed.
> Otherwise we'd be canning such sensible sheet names as O'Reilly,
> l'Hōpital, Nuku'alofa, ...

True.

I'm -1 on sensible_sheet_name, I think people would get confused between
the two...

Chris

PS: Despite my Princess Leia impression, we still have no releases that
I know of :-(

Igal

unread,
Apr 3, 2009, 2:23:10 PM4/3/09
to python-excel
G-d bless you dude!

MS Excel does not support formulas with semicolons as a separator
e.g. =IF(B274=0;"NA";B264/B274)
Excel would give a #VALUE error and it took me some time to figure out
why.

So adding the support for commas was a smart thing to do.

Cheers!

John Machin

unread,
Apr 3, 2009, 5:56:48 PM4/3/09
to python...@googlegroups.com
On 4/04/2009 5:23 AM, Igal wrote:
> G-d bless you dude!

Hi Igal, welcome to the group, and thanks for the kind wishes!

However adding support for commas as separators in formulas has
absolutely nothing to do with your problem.

> MS Excel does not support formulas with semicolons as a separator

When a user is typing in a formula, Excel expects the user to use the
"list separator" that is defined for the user's locale. This is either a
comma or a semicolon.

Excel compiles the formula into the bytecode for its virtual formula
machine, and that bytecode is stored in the XLS file with no indication
of whether the user typed commas or semicolons.

The user can then e-mail the file to a colleague in another locale, and
Excel will decompile the formulas and display them using whatever list
separator is used in the colleague's locale.

xlwt now accepts either comma or semicolon as an argument separator in a
formula, compiles the formula, and stores it in the file, whereupon
(bugs excepted) the situation is as though a user had entered the formula.

> e.g. =IF(B274=0;"NA";B264/B274)
> Excel would give a #VALUE error

This was because (as mentioned more than once in this group) there was a
bug in the way xlwt was compiling IF() functions.

Did you have any problems with other functions e.g. SUM(1;2;3) ?

Consider this: if your theory is correct, pyExcelerator (and
consequently xlwt) accepted only semicolons for about four years and
because "MS Excel does not support formulas with semicolons as a
separator" no use of functions with more than one argument was possible,
and nobody complained and nobody did the trivial changes necessary to
fix it. Probable?

> and it took me some time to figure out
> why.

Any reason why you didn't ask here?

> So adding the support for commas was a smart thing to do.

"So" not being the correct word ;-)

Cheers,
John

Roger

unread,
Apr 24, 2009, 12:16:57 PM4/24/09
to python-excel
I upgraded to xlwt 0.7.1 but still can't seem to get formulas with
sheet references to work.

>>> import xlwt as xls
>>> f = xls.Formula('SUM(B!:A:A)')

Traceback (most recent call last):
File "<pyshell#4>", line 1, in <module>
f = xls.Formula('SUM(B!:A:A)')
File "c:\docume~1\roger\locals~1\temp\easy_install-vv811b\xlwt-0.7.1-
py2.5-win32.egg.tmp\xlwt\ExcelFormula.py", line 22, in __init__
raise ExcelFormulaParser.FormulaParseException, "can't parse
formula " + s
FormulaParseException: can't parse formula SUM(B!:A:A)

Thanks in advance for any assistance you can provide.

-Roger

John Machin

unread,
Apr 24, 2009, 8:00:23 PM4/24/09
to python...@googlegroups.com
On 25/04/2009 2:16 AM, Roger wrote:
> I upgraded to xlwt 0.7.1 but still can't seem to get formulas with
> sheet references to work.
>
>>>> import xlwt as xls
>>>> f = xls.Formula('SUM(B!:A:A)')
>
> Traceback (most recent call last):
> File "<pyshell#4>", line 1, in <module>
> f = xls.Formula('SUM(B!:A:A)')
> File "c:\docume~1\roger\locals~1\temp\easy_install-vv811b\xlwt-0.7.1-
> py2.5-win32.egg.tmp\xlwt\ExcelFormula.py", line 22, in __init__
> raise ExcelFormulaParser.FormulaParseException, "can't parse
> formula " + s
> FormulaParseException: can't parse formula SUM(B!:A:A)
>
> Thanks in advance for any assistance you can provide.
>

Hi Roger,

I guess that you typed that into one or more of Excel, OpenOffice.org
Calc, or Gnumeric and got told "The formula you typed contains an
error." or "Err:510" or "Invalid expression" and were hoping that xlwt
would be more helpful.

Yes, sorry, "can't parse formula" is rather terse and uninformative.
Some time in the future (no promises; it's just a wet-Sunday-afternoon
exercise at the moment) you might get something like:

FormulaParseError: Expected 'REF2D', found 'COLON'
'SUM(B! << : >> A:A)'

A REF2D is a cell range reference like A1 or A1:Z99 (or A or B:C or 1 or
2:3).

Did you mean to type B!A:A (or B!A which means the same)? xlwt doesn't
support bare column letters like A or B:C right now; you'd have to work
around it by using A1:A65536 or B1:C65536. Likewise bare row numbers
like 1 or 2:3 -- use A1:IV1 or A2:IV3 instead. Note: this deficiency has
nothing to do with inter-sheet references. It was inherited from
pyExcelerator and in the last 4 years or so AFAIK nobody has mentioned
it as worth fixing.

HTH,
John

John Machin

unread,
Apr 25, 2009, 1:11:05 AM4/25/09
to python...@googlegroups.com
On 25/04/2009 10:00 AM, John Machin wrote:

> A REF2D is a cell range reference like A1 or A1:Z99 (or A or B:C or 1 or
> 2:3).
>
> Did you mean to type B!A:A (or B!A which means the same)? xlwt doesn't
> support bare column letters like A or B:C right now; you'd have to work
> around it by using A1:A65536 or B1:C65536. Likewise bare row numbers
> like 1 or 2:3 -- use A1:IV1 or A2:IV3 instead.

I was wrong; A and 1 are not valid references. You need A:A to specify
"all of column A" and "1:1" to specify "all of row 1".

Rewriting the above:

A REF2D is a cell range reference like A1 or A1:Z99 (or A:A or B:C or
1:1 or 2:3).

Did you mean to type B!A:A ? xlwt doesn't support bare column letters
like A:A or B:C right now; you'd have to work around it by using
A1:A65536 or B1:C65536. Likewise bare row numbers like 1:1 or 2:3 -- use

A1:IV1 or A2:IV3 instead.

Cheers,
John

Reply all
Reply to author
Forward
0 new messages