How to use ws[coordinate]?

5,716 views
Skip to first unread message

Андрей Ширшов

unread,
Oct 11, 2017, 5:42:02 AM10/11/17
to openpyxl-users
Hello!

I just tried:

ws.cell(1,1)

I got: "Using a coordinate with ws.cell is deprecated. Use ws[coordinate] instead"

Okay, now I try:

ws[1,1]

I got TypeError: expected string or bytes-like object

So, how can I use ws[coordinate]?

Best regards,
Andrey.

Андрей Ширшов

unread,
Oct 11, 2017, 5:49:43 AM10/11/17
to openpyxl-users
It seems to me I puzzled out.

ws[1] gives 1st line
ws[1][0] gives 1st cell in 1st line

To be honest it's very uncomfortable, because now I should have 1 in my memory:
ws[1][1] == ws.cells(row=1, column=2)



среда, 11 октября 2017 г., 12:42:02 UTC+3 пользователь Андрей Ширшов написал:

Charlie Clark

unread,
Oct 11, 2017, 6:58:34 AM10/11/17
to openpyx...@googlegroups.com
Am .10.2017, 11:42 Uhr, schrieb Андрей Ширшов <sh.an...@gmail.com>:

> ws.cell(1,1)
> I got: "Using a coordinate with ws.cell is deprecated. Use ws[coordinate]
> instead"

This is all explained here:

http://openpyxl.readthedocs.io/en/default/tutorial.html#playing-with-data

You used to be able to pass in Excel style coordinates as the first
parameter to ws.cell(). It is this that has been deprecated with
programmatic access via ws.cell, ws.iter_rows or ws.iter_cols and ad-hoc
access via Pythonic indexing.

Charlie
--
Charlie Clark
Managing Director
Clark Consulting & Research
German Office
Kronenstr. 27a
Düsseldorf
D- 40217
Tel: +49-211-600-3657
Mobile: +49-178-782-6226

cha...@futurescope.co.uk

unread,
Nov 27, 2017, 2:11:42 AM11/27/17
to openpyxl-users
I am new to openpyxl and python. I have a simple problem that I just can't seem to work out.

I have a large xlsx file with rows of data that I want to iterate over. I want to insert the data into a SQL database so want to pick out data from individual columns on each row.

The nearest openpyxl example seems to be:

from openpyxl import load_workbook
wb = load_workbook(filename='large_file.xlsx', read_only=True)
ws = wb['big_data']

for row in ws.rows:
for cell in row:
print(cell.value)

However that just prints out the cell values.

How do I access column data such as
MyColumn = "C"
ThisRow.column(MyColumn).value

A newbie focused example would be really appreciated.

Thanks

Charlie Clark

unread,
Nov 27, 2017, 3:27:02 AM11/27/17
to openpyx...@googlegroups.com
Am .11.2017, 08:11 Uhr, schrieb <cha...@futurescope.co.uk>:

> A newbie focused example would be really appreciated.

See
http://openpyxl.readthedocs.io/en/default/tutorial.html#accessing-many-cells

Charles Hooper

unread,
Nov 27, 2017, 5:01:41 AM11/27/17
to openpyxl-users
Sorry to be really dim about this but that page doesn't give a newbie much help.

If you take the code below, what do you put where the ????? are to access (for example) column C for that row?

from openpyxl import load_workbook
wb = load_workbook(filename='large_file.xlsx', read_only=True)
ws = wb['big_data']

for row in ws.rows:
   ????? - How to access Column C for this row??


Thanks

Charlie Clark

unread,
Nov 27, 2017, 5:51:55 AM11/27/17
to openpyx...@googlegroups.com
Am .11.2017, 11:01 Uhr, schrieb Charles Hooper <cha...@futurescope.co.uk>:

> Sorry to be really dim about this but that page doesn't give a newbie
> much help.
>
> If you take the code below, what do you put where the ????? are to access
> (for example) column C for that row?

If you want a column:

colC = ws['C']

> from openpyxl import load_workbook
> wb = load_workbook(filename='large_file.xlsx', read_only=True)
> ws = wb['big_data']
>
> for row in ws.rows:
> ????? - How to access Column C for this row??

That would be the third cell in any row.

Alper Paksoy

unread,
Nov 27, 2017, 9:48:40 AM11/27/17
to openpyxl-users
Hi Charles,

Pls check

https://automatetheboringstuff.com/chapter12/

to learn more nicely about Python and openpyxl. The chapter is slightly outdated due to version changes in openpyx since the book has been published but still very useful. In fact, if you are a fresh Python newbie like I was just a few months ago, I would heavily recommend the book which is available at that address for free. I would also recommend the video course with the same title at Udemy from the book’s author Al Sweigart.

Alper
Message has been deleted

Charles

unread,
Nov 27, 2017, 5:17:39 PM11/27/17
to openpyxl-users
OK. Got it

from openpyxl import load_workbook
wb = load_workbook(filename='DummyDataFileV1.xlsx', read_only=True)
ws = wb['Data']

for row in range(2, ws.max_row + 1):
Place = ws['A' + str(row)].value
print (Place)

Thanks for everyone's help!

Charles

Charlie Clark

unread,
Nov 28, 2017, 5:15:30 AM11/28/17
to openpyx...@googlegroups.com
Am .11.2017, 19:45 Uhr, schrieb Charles Hooper <cha...@futurescope.co.uk>:

> I'm trying to do this so I can iterate through lots of excel rows
> (16,000)
> and pick out certain columns to put the data in a database.

The iter_cols method does not exist for read-only workbooks.

Charlie Clark

unread,
Nov 28, 2017, 5:16:52 AM11/28/17
to openpyxl-users, Charles
I don't understand this code and strongly advise against creating your own
loops like this: not only is it completely unnecessary, you're likely to
make mistakes.

Charles

unread,
Nov 28, 2017, 6:04:06 AM11/28/17
to openpyxl-users
Well this was taken from the page suggested by Alper above - https://automatetheboringstuff.com/chapter12/

See the section: Step 1: Read the Spreadsheet Data

The code snippet you suggested didn't work - ws['C'] - and as you say the iter_cols method does not exist for read-only workbooks (Read only for the number of rows I'm using).

If you can provide a better example, I'll know what to use.

Charles 

Alper Paksoy

unread,
Nov 28, 2017, 4:15:38 PM11/28/17
to openpyxl-users
Hi Charles,

Can you pls mark the code using curly braces from the editor menu when you share code within a message here? 

Sorry if you already know those below or have noticed them by now. If not, there is an alternative as per below in accessing cells in a column if you can live with a certain warning message.  

#!/usr/bin/env python3

import openpyxl
wb = openpyxl.load_workbook(filename='example.xlsx') 
ws = wb['Data']

column = ws['A']
colValues = []
for cell in column[1:]:
    colValues.append(cell.value)
    print(cell.value)

print(colValues)

The code above loops through the cells in the iterable variable "column" starting with the second row. The values of cells are accumulated in the list variable colValues. They are also printed one by one as an example. The output of this program is as follows. The "example.xlsx" file is attached to this message. 

Apples
Cherries
Pears
Oranges
Apples
Bananas
Strawberries
['Apples', 'Cherries', 'Pears', 'Oranges', 'Apples', 'Bananas', 'Strawberries']

I get the following warning message when I run this code. It does not, however, seem to affect the efficiency or functionality of the code.

Warning (from warnings module):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/openpyxl/reader/worksheet.py", line 312
    warn(msg)
UserWarning: Unknown extension is not supported and will be removed

The code, however, does not work if you open the excel file in read only mode through the load_workbook function as per below (everything else in the code is the same).

wb = openpyxl.load_workbook(filename='example.xlsx', read_only=True)

In this case, I get the following error message and the program flow is terminated.

Traceback (most recent call last):
  File "/Users/Alper/Library/Mobile Documents/com~apple~CloudDocs/Documents/Python/Other code/openpyxl/charles_column.py", line 7, in <module>
    column = ws['A']
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/openpyxl/worksheet/read_only.py", line 85, in __getitem__
    return meth(key)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/openpyxl/worksheet/worksheet.py", line 363, in __getitem__
    cols = tuple(self.iter_cols(min_col, max_col))
AttributeError: 'ReadOnlyWorksheet' object has no attribute 'iter_cols'

Based on Charlie's response to an issue filed by a user in September 2017, I understand one has to choose between the warning message or the read_only mode when it comes to the ability to directly access a column of cells in openpyxl. I don't fully understand the reason behind it, but it seems to be a deliberate choice by the developers. 


Hope these help.

Alper
example.xlsx

Charles

unread,
Nov 29, 2017, 1:35:52 AM11/29/17
to openpyxl-users
Thanks Alper,

Your code makes sense for one column. What would the code look like if you wanted to list out both columns row by row (ie without creating an array)?

My usecase is : I have a spreadsheet with 16,000 rows and 20 columns - it's basically a database table in excel. I want to read all rows one by one and to put the data from some of the columns into a SQL database. 

My code for your data might look like:
from openpyxl import load_workbook
wb = load_workbook(filename='DummyDataFileV1.xlsx', read_only=True)
ws = wb['Data']

for row in range(2, ws.max_row + 1):
   Fruit= ws['A' + str(row)].value
    Quantity= ws['B' + str(row)].value
        print ("Fruit: ", Fruit, "Quantity: ", Quantity)



Charles

Charlie Clark

unread,
Nov 29, 2017, 3:27:49 AM11/29/17
to openpyx...@googlegroups.com
Am .11.2017, 07:35 Uhr, schrieb Charles <cha...@futurescope.co.uk>:

> Your code makes sense for one column. What would the code look like if
> you
> wanted to list out both columns row by row (ie without creating an
> array)?

for row in in ws.iter_rows(min_col=2, max_col=3):
values = [c.value for c in row]
print(values)

Simples.

Alper Paksoy

unread,
Nov 29, 2017, 3:32:57 AM11/29/17
to openpyxl-users
No problem.

For the sake of simplicity, let's assume you have a five-column table. 

Let's also assume you want to print out the values in the first, third and fourth columns for each row starting right after the column headings. 

The following should do this. I put the second "for" loop to show how you can print them out nicely in case that is something you need. The example file I used in the code is also attached.

#!/usr/bin/env python3

import openpyxl
wb = openpyxl.load_workbook(filename='example_2.xlsx') 
ws = wb['Data']

for row in ws.iter_rows(min_row=2):
    print('item:', row[0].value, 'price:', row[2].value,
          'origin:', row[3].value)

print()

for row in ws.iter_rows(min_row=2):
    print('item:', row[0].value.rjust(15), 'price:',
          str(row[2].value).rjust(5), 'origin:', row[3].value)

The output would be as follows.

item: Apples price: 2.87 origin: CA
item: Cherries price: 0.43 origin: VA
item: Pears price: 3.14 origin: NY
item: Oranges price: 3.78 origin: AZ
item: Apples price: 4.17 origin: CA
item: Bananas price: 4.25 origin: OR
item: Strawberries price: 4.99 origin: TX

item:           Apples price:  2.87 origin: CA
item:        Cherries price:  0.43 origin: VA
item:            Pears price:  3.14 origin: NY
item:        Oranges price:  3.78 origin: AZ
item:           Apples price:  4.17 origin: CA
item:        Bananas price:  4.25 origin: OR
item:  Strawberries price:  4.99 origin: TX

Alper
example_2.xlsx

Alper Paksoy

unread,
Nov 29, 2017, 8:30:44 PM11/29/17
to openpyxl-users
Hi Charlie,

What would your code look like if the columns to printed were not consecutive? Thx. 

Charlie Clark

unread,
Nov 30, 2017, 3:53:55 AM11/30/17
to openpyx...@googlegroups.com
Am .11.2017, 02:30 Uhr, schrieb Alper Paksoy <paksoy...@gmail.com>:

> Hi Charlie,
> What would your code look like if the columns to printed were not
> consecutive? Thx.

It would be easy to extend the logic accordingly: you are just dealing
with a list. For example you might pass in a list of column indices that
are valid and put that in a condition for the list. But I'd be tempted to
create the list of row values and have column selection logic act on it.

Charles

unread,
Nov 30, 2017, 7:36:27 AM11/30/17
to openpyxl-users
Alper,

Thanks. Your code was simply explained. I'm using it successfully now. Thanks for your help and patience.

Charles
Reply all
Reply to author
Forward
0 new messages