extract rows by checking the value in particular coloums

988 views
Skip to first unread message

ashish...@live.in

unread,
Dec 25, 2011, 11:13:44 PM12/25/11
to python-excel
Hello Everyone,

i am quite new to phyton scripting so i need a little help for my
script.

I have a huge excel file containing data like: name,address, number of
hours , age , error id.

now from this file i have to extract only first 10 rows in which
error id is not blank and contains only an integer.

I tried a lot but got stuck with some or the other form of error. I am
quite new to phyton n still learning it but my deadline to finish this
work is very nearby as my project work largely depend on this script.
So i request if anyone can help or guide me through this problem.

Thanking in advance.

Best Regards,
Ashu.

Gerard Blais

unread,
Dec 26, 2011, 3:34:48 PM12/26/11
to python...@googlegroups.com
you need xlrd to open the workbook

you need to assign a variable to refer the sheet you want (you can also use a number, so 0 will be the first sheet.

You need to look at cells by indexing their rows and columns (bith numbering schemes start at 0)

Here's the general idea:

import os
import xlrd

os.chdir("d://python_projects//euler")

wb = xlrd.open_workbook("p233.xls")

sheet = wb.sheet_by_index(0)

col = 17

rows = []   ###--a list of matching row indices

print "sheet has %d rows and %d columns" % (sheet.nrows, sheet.ncols) 

for row in xrange(sheet.nrows):
    value = sheet.cell_value(row, col)
    print row, col, value
   
    try:
        z = float(value)
    except ValueError:
        continue
    if z % 1 == 0:
        rows.append (row)
       
        if len(rows) == 10: break
   
        
###-- do what you like with the indices in hits

print "matching rows are", rows



--
You received this message because you are subscribed to the Google Groups "python-excel" group.
To post to this group, send an email to python...@googlegroups.com.
To unsubscribe from this group, send email to python-excel...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/python-excel?hl=en-GB.




--

ashish chopra

unread,
Dec 26, 2011, 10:02:57 PM12/26/11
to python...@googlegroups.com, gerard...@gmail.com
Hello Gerard,

Thanx a lot for your help. It is really very kind of u.

I downloaded the xlrd and was able to open the file. But as i started writing the code, i got stuck and had some errors shown up, also tried going through the forum for the desired solution but coudn't find one. 

I hereby request you to kindly look into the code and help me in debugging the errors..

import os
import xlrd
os.chdir("d://ashii")

wb = xlrd.open_workbook("d:/ashii/ashuu.xls")
sheet = wb.sheet_by_index(0)
print "sheet has %d rows and %d columns" % (sheet.nrows, sheet.ncols) ## till here everything worked fine as i was able to see the output for this command

col = 4
rows = []

for row in xrange(sheet.nrows):
    value = sheet.cell_value(row,col)
    print row,col, value

But Now i get error saying 
Traceback (most recent call last):
  File "C:\script1", line 13, in <module>
    value = sheet.cell_value(row,col)
  File "C:\Python27\lib\site-packages\xlrd\sheet.py", line 263, in cell_value
    return self._cell_values[rowx][colx]
IndexError: list index out of range


Thanking you in advance!!!

Best Regards,
Ashu




    


Date: Mon, 26 Dec 2011 15:34:48 -0500
Subject: Re: [pyxl] extract rows by checking the value in particular coloums
From: gerard...@gmail.com
To: python...@googlegroups.com

werner

unread,
Dec 27, 2011, 6:56:21 AM12/27/11
to python...@googlegroups.com
On 12/27/2011 04:02 AM, ashish chopra wrote:
Hello Gerard,

Thanx a lot for your help. It is really very kind of u.

I downloaded the xlrd and was able to open the file. But as i started writing the code, i got stuck and had some errors shown up, also tried going through the forum for the desired solution but coudn't find one. 

I hereby request you to kindly look into the code and help me in debugging the errors..

import os
import xlrd
os.chdir("d://ashii")

wb = xlrd.open_workbook("d:/ashii/ashuu.xls")
sheet = wb.sheet_by_index(0)
print "sheet has %d rows and %d columns" % (sheet.nrows, sheet.ncols) ## till here everything worked fine as i was able to see the output for this command

col = 4
don't hard code your columns, do this instead
cols = sheet.ncols

rows = []

for row in xrange(sheet.nrows):
        for col in cols:
            value = sheet.cell_value(row, col)
            print value

Untested, but hopefully works

Werner

werner

unread,
Dec 27, 2011, 12:17:15 PM12/27/11
to python...@googlegroups.com
Hi Gerard,


On 12/27/2011 04:02 AM, ashish chopra wrote:
Hello Gerard,

Thanx a lot for your help. It is really very kind of u.

I downloaded the xlrd and was able to open the file. But as i started writing the code, i got stuck and had some errors shown up, also tried going through the forum for the desired solution but coudn't find one. 

I hereby request you to kindly look into the code and help me in debugging the errors..

import os
import xlrd
os.chdir("d://ashii")

wb = xlrd.open_workbook("d:/ashii/ashuu.xls")
sheet = wb.sheet_by_index(0)
print "sheet has %d rows and %d columns" % (sheet.nrows, sheet.ncols) ## till here everything worked fine as i was able to see the output for this command

col = 4
rows = []

for row in xrange(sheet.nrows):
    value = sheet.cell_value(row,col)
    print row,col, value

But Now i get error saying 
Traceback (most recent call last):
  File "C:\script1", line 13, in <module>
    value = sheet.cell_value(row,col)
  File "C:\Python27\lib\site-packages\xlrd\sheet.py", line 263, in cell_value
    return self._cell_values[rowx][colx]
IndexError: list index out of range

Was a bit in a hurry this morning.

I am noob to xlrd too, but my reading of the above error message is that either "row" or "col" is out side the range of the sheet you work with, therefore my suggestion to get it using sheet.ncols.

Werner

ashish chopra

unread,
Dec 27, 2011, 12:27:23 PM12/27/11
to python...@googlegroups.com
Hello  Werner ,

Thanx a lot for the suggestion.

Gerard already gave me the solution of my problem. It was a typo from my side as i had only 4 columns on my sheet and i was giving the col number as 4 which should have been 3 as xlrd starts counting from 0.

As i have corrected the problem the code runs fine. :-)

Best Regards,
Ashu.




Date: Tue, 27 Dec 2011 18:17:15 +0100
From: wbr...@free.fr
To: python...@googlegroups.com

Subject: Re: [pyxl] extract rows by checking the value in particular coloums

Reply all
Reply to author
Forward
0 new messages