On .01.2014, 23:00, Iuri <iuris...@gmail.com> wrote:
> I tried your diff, it breaks my code.
> Traceback (most recent call last):
>   File "a.py", line 10, in <module>
>     for n, line in enumerate(w.iter_rows()):
>   File "f:\projects\openpyxl\openpyxl\reader\iter_worksheet.py", line 220, in get_squared_range
>     cell = cell._replace(internal_value=unicode(self._string_table[int(cell.internal_value)]))  #pylint: disable-msg=W0212
> ValueError: invalid literal for int() with base 10: ''
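For reference, the exception itself is just `int('')` failing on an empty `internal_value`. A defensive version of that lookup might look like the sketch below; `resolve_shared_string` is a hypothetical helper for illustration, not part of openpyxl:

```python
def resolve_shared_string(raw_value, string_table):
    """Look up a shared string by index, tolerating empty raw values.

    Hypothetical helper for illustration; not openpyxl code.
    """
    if raw_value in (None, ""):
        return None                      # empty cell: nothing to look up
    return string_table[int(raw_value)]  # the line that raised ValueError

print(resolve_shared_string("1", ["zero", "one"]))  # one
print(resolve_shared_string("", ["zero", "one"]))   # None
```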
That looks like a line break issue caused by the e-mail. I ran some
tests locally and element.clear() has no effect. This is only to be
expected, as the element is overwritten every time the loop runs.
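For reference, the standard iterparse recipe keeps memory roughly flat by clearing the tree as you go. A minimal, self-contained sketch with stdlib ElementTree follows; the toy XML stands in for a worksheet stream and is not openpyxl's actual format:

```python
import xml.etree.ElementTree as ET
from io import BytesIO

# Toy stand-in for a large worksheet XML stream (illustrative only).
xml = b"<rows>" + b"".join(b'<row n="%d"/>' % i for i in range(1000)) + b"</rows>"

context = ET.iterparse(BytesIO(xml), events=("start", "end"))
_, root = next(context)  # the first event is ("start", root)

count = 0
for event, elem in context:
    if event == "end" and elem.tag == "row":
        count += 1       # process the row here
        root.clear()     # drop already-processed rows so the tree stays small

print(count)  # 1000
```

The key point is that elem.clear() alone only empties the element; the root still holds a reference to it, so the root (or, with lxml, the preceding siblings) has to be cleared as well for memory to actually be released.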
> I also tested element.clear() inside the if block; it improved the
> memory usage, but it is not really fixed. I ran around 10k lines and the
> python process was using ~300 MB. This memory usage works for me, but
> 1.6.2 limited the memory usage to around 20 MB.
> Before this patch, it was using ~2 GB (!!) for 10k lines. With all 30k
> lines, it was killed at ~5 GB.
That is indeed excessive. I only have benchmark/reader.py to go on here,
but when I compare it with
https://bitbucket.org/ericgazoni/openpyxl/commits/1043f5c5cde6ccacf553632069e733386a1dd6f9
there is little or no difference. I've made no substantial changes to
that part of the code before then. This isn't to say that there aren't
differences between 1.6.2 and it, but I'm not aware of any.
When I run benchmark/reader.py I have about 1.1 GB of memory in use when
not optimised and 900 MB when optimised.
> If you have more patches, I'm here to test them.
I can't think of anything easy at the moment and, based on my benchmarks,
I can only think that the memory use you're seeing is caused by other
parts of the code. Unfortunately, I have little experience with memory
optimisation in Python. Adam has written some docs on investigating
memory use, but we need to formalise them so that we can make changes
based on reproducible scenarios. How are you checking memory use? Do you
have some test files we could use?
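One reproducible way to check Python-level allocations is tracemalloc (stdlib in Python 3.4+; around the time of this thread a backport existed as pytracemalloc). It only counts Python objects, not total process RSS, and the workload below is just a placeholder for the actual read loop:

```python
import tracemalloc

tracemalloc.start()

# Placeholder workload; substitute the openpyxl read loop being measured.
data = [list(range(100)) for _ in range(1000)]

current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print("current: %.1f MB, peak: %.1f MB" % (current / 1e6, peak / 1e6))
```

Peak numbers from a script like this would give us something comparable across versions, unlike eyeballing the process in a task manager.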
As things stand I don't think it should stop a release of 1.8. I've
already done a lot of work in 1.9 to harmonise the interfaces for normal
and optimised readers so that future changes (such as data_only) are more
reliable and test isolation is higher. I'm also hoping to meet up with
Eric at FOSDEM at the end of the month and discuss things like this with
him.
Charlie