Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

How do I read Excel file in Python?

14 views
Skip to first unread message

kath

unread,
Oct 5, 2006, 10:11:14 AM10/5/06
to
How do I read an Excel file in Python?

I have found a package to read excel file, which can be used on any
platform.

http://www.lexicon.net/sjmachin/xlrd.htm
I installed and working on the examples, I found its printing of cell's
contents in a different manner.

>>> import xlrd
>>> book=xlrd.open_workbook("Calculation_file.xls")
>>> book=xlrd.open_workbook("testbook.xls")
>>> sh=book.sheet_by_index(0)
>>> for row in range(sh.nrows):
print sh.row(rx)
[text:u'name', text:u'address', text:u'ph']
[text:u'sudhir', text:u'bangalore', number:1234.0]
[text:u'vinay', text:u'bangalore', number:3264.0]

I am bit confused with slicing. help me....


Thank you,
Regards,
Sudhir.

John Machin

unread,
Oct 5, 2006, 11:21:58 AM10/5/06
to

kath wrote:
> How do I read an Excel file in Python?
>
> I have found a package to read excel file, which can be used on any
> platform.

Hi Sudhir,
So far, so good :-)

>
> http://www.lexicon.net/sjmachin/xlrd.htm
> I installed and working on the examples, I found its printing of cell's
> contents in a different manner.
>
> >>> import xlrd
> >>> book=xlrd.open_workbook("Calculation_file.xls")
> >>> book=xlrd.open_workbook("testbook.xls")
> >>> sh=book.sheet_by_index(0)
> >>> for row in range(sh.nrows):
> print sh.row(rx)
> [text:u'name', text:u'address', text:u'ph']
> [text:u'sudhir', text:u'bangalore', number:1234.0]
> [text:u'vinay', text:u'bangalore', number:3264.0]

It helps when asking questions if you copy/paste exactly what is on
your screen;
in this case
print sh.row(rx)
would have given an error; you must have typed
for rx in range.....

A row is returned as a sequence of Cell objects. What you are seeing is
Python automatically doing repr(cell) on each cell in the row. The
Cell.__repr__ method formats it that way for debugging. Here are some
examples from a little test file of mine:

>>> import xlrd
>>> bk = xlrd.open_workbook('sjm1.xls')
>>> sh = bk.sheet_by_index(0)
>>> row0 = sh.row(0)
>>> row0
[text:u'fubar', number:1.0, number:2.0]
>>> firstcell = row0[0]
>>> type(firstcell)
<class 'xlrd.sheet.Cell'>
>>> firstcell.ctype
1
>>> # cell type 1 is text
>>> firstcell.value
u'fubar'
>>> repr(firstcell)
"text:u'fubar'"

>
> I am bit confused with slicing. help me....
>

None of the above is anything to do with slicing; is this a 2nd
problem?

Perhaps you are having trouble with this:
>>> help(sh.row_slice)
Help on method row_slice in module xlrd.sheet:

row_slice(self, rowx, start_colx=0, end_colx=None) method of
xlrd.sheet.Sheet instance
##
# Returns a slice of the Cell objects in the given row.
>>>

sh.row_slice(rowx, lo, hi) gives the same result as sh.row(rowx)[lo:hi]
-- it is provided because the latter would be inefficient for getting a
small slice from a long row.

If you are having trouble with the general concept of slicing, perhaps
you might like to try the Python tutorial. Otherwise, please try to be
a bit more specific about what the confusion is.

HTH, and e-mail me if you prefer ...

Cheers,
John

kath

unread,
Oct 5, 2006, 1:00:50 PM10/5/06
to


Hi, thanks for the reply. I just took some time reading help file and
came to know to there is nothing do with slicing. But I do have a
problem with date field in the excel file.

the date( 8/9/2006 ) in Excel file, i am getting the value as 38938.0,
which I get when I convert date values to general format in Excel. I
want the actual date value. How do get that?


Thank you.
regards,
Sudhir.

Matimus

unread,
Oct 5, 2006, 1:25:37 PM10/5/06
to
> the date( 8/9/2006 ) in Excel file, i am getting the value as 38938.0,
> which I get when I convert date values to general format in Excel. I
> want the actual date value. How do get that?

38938 appears to be the date in days since 1/1/1900. I'm sure someone
can help you figure out how to convert that to a more useful value.

-Matt

Steve Holden

unread,
Oct 5, 2006, 2:25:59 PM10/5/06
to pytho...@python.org

I believe the win32all extension contains functionality to help with
that, thought it's a long time since I even ran PythonWin (sorry, Mark).

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden

Simon Brunning

unread,
Oct 5, 2006, 3:03:56 PM10/5/06
to pytho...@python.org

>>> excel_date = 38938.0
>>> python_date = datetime.date(1900, 1, 1) +
datetime.timedelta(days=excel_date)
>>> python_date
datetime.date(2006, 8, 11)

--
Cheers,
Simon B
si...@brunningonline.net
http://www.brunningonline.net/simon/blog/

Simon Brunning

unread,
Oct 5, 2006, 3:20:32 PM10/5/06
to pytho...@python.org
On 10/5/06, Simon Brunning <si...@brunningonline.net> wrote:
> On 5 Oct 2006 10:25:37 -0700, Matimus <mccr...@gmail.com> wrote:
> >>> excel_date = 38938.0
> >>> python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date)
> >>> python_date
> datetime.date(2006, 8, 11)

Err, that's the wrong answer, isn't it? Perhaps it shoud be
datetime.date(1900, 1, 29)?

houdin...@yahoo.com

unread,
Oct 5, 2006, 3:49:53 PM10/5/06
to
> > >>> excel_date = 38938.0
> > >>> python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date)
> > >>> python_date
> > datetime.date(2006, 8, 11)
>
> Err, that's the wrong answer, isn't it? Perhaps it shoud be
> datetime.date(1900, 1, 29)?

Actually was about to post same solution and got same results. (BTW
Simon, the OP date is Aug 9th, 2006). Scratched head and googled for
excel date calculations... found this bug where it treats 1900 as leap
year incorrectly:
http://www.ozgrid.com/Excel/ExcelDateandTimes.htm

Plus it treats 1 jan 1900 as day 1, not 0 so just subtract 2 in the
calc:
>>>python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date - 2)
>>> python_date
datetime.date(2006, 8, 9)

HTH.

John Machin

unread,
Oct 5, 2006, 7:54:23 PM10/5/06
to

... and 2006-08-09 is only the correct answer if the spreadsheet was,
when last saved, using the 1900 ("Windows") date system, not the 1904
("Macintosh") date system.

All the OP needs to do is to read the documentation that comes with the
xlrd package. It describes the problems with Excel dates, and offers
functions for conversion between the Excel date numbers and (year,
month, day, hour, minute, second) tuples which of course are
interoperable with Python's datetime module and with mx.DateTime.

| >>> import xlrd
| >>> xlrd.xldate_as_tuple(38938.0, 0)
| (2006, 8, 9, 0, 0, 0)
| >>> xlrd.xldate_as_tuple(38938.0, 1)
| (2010, 8, 10, 0, 0, 0)
| >>>

Cheers,
John

Simon Brunning

unread,
Oct 6, 2006, 5:21:37 AM10/6/06
to pytho...@python.org
On 5 Oct 2006 12:49:53 -0700, houdin...@yahoo.com

<houdin...@yahoo.com> wrote:
> Actually was about to post same solution and got same results. (BTW
> Simon, the OP date is Aug 9th, 2006). Scratched head and googled for
> excel date calculations... found this bug where it treats 1900 as leap
> year incorrectly:
> http://www.ozgrid.com/Excel/ExcelDateandTimes.htm

Ah - I was reading the OP's 8/9/2006 in the European way - DD/MM/YYYY.

One might argue over whether DD/MM/YYYY or MM/DD/YYYY are more
rational, but I find it best to avoid *both* those formats, 'cos they
are so easily confused.

Giles Brown

unread,
Oct 6, 2006, 6:22:39 AM10/6/06
to
John Machin wrote:
> houdin...@yahoo.com wrote:
> > > > >>> excel_date = 38938.0
> > > > >>> python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date)
> > > > >>> python_date
> > > > datetime.date(2006, 8, 11)
> > >
> > > Err, that's the wrong answer, isn't it? Perhaps it shoud be
> > > datetime.date(1900, 1, 29)?
> >
> > Actually was about to post same solution and got same results. (BTW
> > Simon, the OP date is Aug 9th, 2006). Scratched head and googled for
> > excel date calculations... found this bug where it treats 1900 as leap
> > year incorrectly:
> > http://www.ozgrid.com/Excel/ExcelDateandTimes.htm
> >
> > Plus it treats 1 jan 1900 as day 1, not 0 so just subtract 2 in the
> > calc:
> > >>>python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date - 2)
> > >>> python_date
> > datetime.date(2006, 8, 9)
> >
>
> ... and 2006-08-09 is only the correct answer if the spreadsheet was,
> when last saved, using the 1900 ("Windows") date system, not the 1904
> ("Macintosh") date system.

John,
Just for me own curiosity, is this Excel concept of date numbers same
as the OLE
concept (see http://msdn2.microsoft.com/en-us/library/82ab7w69.aspx or
search "MFC DATE" on MSDN).

I put in some test cases for conversion code here:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/496683

But would be interested to add any additional info on variations on
this theme.

Cheers,
Giles

Message has been deleted

John Machin

unread,
Oct 6, 2006, 10:24:23 AM10/6/06
to

Hi Giles,

Those OLE date numbers coincide with Excel 1900-system date numbers
from 1900-03-01 onwards. Excel treats day 60 as the non-existent
1900-02-29.

Cheers,
John

0 new messages