Extracting minimum and maximum values from a column


Jignesh Sutar

Dec 10, 2013, 2:08:32 PM
to python...@googlegroups.com
Hi,

I'd like to ask whether the code below is an appropriate way to extract the minimum and maximum values from a particular column in an Excel file. I ask because I have 100k cases and it takes some time to fetch the results. As this is part of a bigger automation process, it would be great if this could be done faster.


#using profiles.xls template example as shipped with xlrd module.

import xlrd
xls = xlrd.open_workbook(r"profiles.xls")
dataSheet = xls.sheet_by_name("PROFILEDEF")
dataCol = set([dataSheet.cell_value(row, 1) for row in range(1, dataSheet.nrows)])
minValue = min(dataCol)
maxValue = max(dataCol)
print(minValue)
print(maxValue)


Many thanks in advance.
Jignesh

Matthew Smith

Dec 10, 2013, 4:05:56 PM
to python...@googlegroups.com
Personally, I would make a test set of 50 or so cases that exercised my code and use that for my pre-production code.

As far as the code goes, it might make more sense (for memory footprint) to keep temp_max and temp_min values and compare them while iterating through the column. That way you wouldn't have to carry around that big set in memory. 
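
A minimal sketch of that idea, assuming the column holds values of one comparable type (all numbers, say). The helper names column_min_max and sheet_column_min_max are made up for illustration; only open_workbook, sheet_by_name, cell_value and nrows come from xlrd. This mainly saves memory and is unlikely to change the run time much, since most of the time probably goes into opening the workbook.

```python
def column_min_max(values):
    """Return (lowest, highest) from any iterable in a single pass."""
    iterator = iter(values)
    try:
        # Seed both running values with the first item.
        lowest = highest = next(iterator)
    except StopIteration:
        raise ValueError("column has no data rows")
    for value in iterator:
        if value < lowest:
            lowest = value
        elif value > highest:
            highest = value
    return lowest, highest


def sheet_column_min_max(path, sheet_name, col_index, first_data_row=1):
    """Hypothetical wrapper: stream one column of a sheet through column_min_max."""
    import xlrd  # imported here so column_min_max can be used without xlrd

    sheet = xlrd.open_workbook(path).sheet_by_name(sheet_name)
    # A generator yields one cell value at a time, so no list or set is built.
    cells = (sheet.cell_value(row, col_index)
             for row in range(first_data_row, sheet.nrows))
    return column_min_max(cells)


# Example call, using the file and sheet from the original post:
# minValue, maxValue = sheet_column_min_max("profiles.xls", "PROFILEDEF", 1)
```

If the column mixes text and numbers, the comparisons may fail or give odd results depending on the Python version, so filtering the values first might be needed.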

Just some thoughts. 


--
Matthew Smith

Guest Researcher at NIST


John Yeung

Dec 10, 2013, 5:03:42 PM
to python-excel
Using xlrd, you probably can't go too much faster than what you have.
You can be a little simpler and use the included col_values() sheet
method instead:

dataCol = set(dataSheet.col_values(1))

It amounts to about the same as what you're doing already, but looks nicer.
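
One difference worth noting: col_values(1) starts at row 0, while the original loop starts at row 1 and so skips the header. col_values takes an optional start_rowx argument, so a rough sketch that keeps the original behaviour could look like the following. The function name column_range and the header_rows parameter are made up for illustration, and min() and max() are applied to the list directly, since the set is not needed to find them.

```python
def column_range(sheet, col_index, header_rows=1):
    """Return (min, max) of one column, skipping the leading header rows.

    `sheet` is expected to be an xlrd sheet object, or anything else that
    offers a compatible col_values(colx, start_rowx) method.
    """
    data = sheet.col_values(col_index, start_rowx=header_rows)
    if not data:
        raise ValueError("column has no data rows")
    return min(data), max(data)


# Example call, using the file and sheet from the original post:
# import xlrd
# dataSheet = xlrd.open_workbook(r"profiles.xls").sheet_by_name("PROFILEDEF")
# minValue, maxValue = column_range(dataSheet, 1)
```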

John Y.