sum function

mike...@gmail.com

Oct 4, 2012, 4:52:50 PM
Hi All,

I am new to Python and am reading data from HBase.
I am trying to sum a column, as below:


scanner = client.scannerOpenWithStop("tab", "10", "1000", ["cf:col1"])
total = 0.0
r = client.scannerGet(scanner)
while r:
    for k in (r[0].columns):
        total += float(r[0].columns[k].value)
    r = client.scannerGet(scanner)

print total

Do you know of a better (faster) way to do the sum?

Any thoughts please?

Thanks

Ian Kelly

Oct 4, 2012, 5:04:53 PM
to Python
On Thu, Oct 4, 2012 at 2:52 PM, <mike...@gmail.com> wrote:
> scanner = client.scannerOpenWithStop("tab", "10", "1000", ["cf:col1"])
> total = 0.0
> r = client.scannerGet(scanner)
> while r:
>     for k in (r[0].columns):
>         total += float(r[0].columns[k].value)
>     r = client.scannerGet(scanner)
>
> print total
>
> Do you know of better (faster) way to do sum?

scanner = client.scannerOpenWithStop("tab", "10", "1000", ["cf:col1"])
next_r = itertools.partial(client.scannerGet, scanner)
total = sum(float(col.value) for r in iter(next_r, None)
            for col in r.itervalues())

Ian Kelly

Oct 4, 2012, 5:05:48 PM
to Python
On Thu, Oct 4, 2012 at 3:04 PM, Ian Kelly <ian.g...@gmail.com> wrote:
> scanner = client.scannerOpenWithStop("tab", "10", "1000", ["cf:col1"])
> next_r = itertools.partial(client.scannerGet, scanner)
> total = sum(float(col.value) for r in iter(next_r, None)
>             for col in r.itervalues())

That should be "functools" above, not "itertools". :-P
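
For anyone following along without an HBase server, here is a minimal sketch of the two pieces used above: functools.partial to bind the scanner argument, and the two-argument form of iter(), which keeps calling a function until it returns the sentinel. The fetch function and the batches dict are made-up stand-ins and are not part of any HBase client.

```python
import functools


def fetch(batches, scanner_id):
    """Hypothetical stand-in for client.scannerGet.

    Returns the next batch for the scanner, or None when nothing is left.
    """
    queue = batches[scanner_id]
    return queue.pop(0) if queue else None


# Made-up data: scanner 7 will hand back two batches, then None.
batches = {7: [[1.5, 2.5], [3.0]]}

# partial() freezes both arguments, leaving a callable that takes none.
next_batch = functools.partial(fetch, batches, 7)

# iter(callable, sentinel) calls next_batch() repeatedly and stops as soon
# as the return value compares equal to the sentinel (None here).
total = sum(value for batch in iter(next_batch, None) for value in batch)
print(total)  # 7.0
```

The loop only ends if the sentinel matches what the function really returns at the end of the data, so that value is worth checking against the actual API.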

Mike

Oct 4, 2012, 5:29:54 PM
to Python
I get the error below:

NameError: name 'functools' is not defined

Thanks

Mike

Oct 4, 2012, 5:31:19 PM
to Python
Thanks Ian for the quick reply.

I get the below error.

NameError: name 'itertools' is not defined

Thanks

Dave Angel

Oct 4, 2012, 5:39:50 PM
to Mike, pytho...@python.org
On 10/04/2012 05:29 PM, Mike wrote:
> I get below error
>
> NameError: name 'functools' is not defined
>

functools is a module in the standard library. You need to import it.

import functools



--

DaveA

Chris Angelico

Oct 4, 2012, 5:34:41 PM
to pytho...@python.org
functools is a module:

import functools

ChrisA

Mike

Oct 4, 2012, 8:40:49 PM
to Mike, pytho...@python.org, d...@davea.name
I imported functools. Now I get the error below.

Traceback (most recent call last):
  File "test.py", line 16, in <module>
    total = sum(float(col.value) for r in iter(next_r, None) for col in r.itervalues())
  File "test.py", line 16, in <genexpr>
    total = sum(float(col.value) for r in iter(next_r, None) for col in r.itervalues())
AttributeError: 'list' object has no attribute 'itervalues'

Thanks

Ian Kelly

Oct 4, 2012, 8:59:03 PM
to Python
On Thu, Oct 4, 2012 at 6:40 PM, Mike <mike...@gmail.com> wrote:
> Traceback (most recent call last):
> File "test.py", line 16, in <module>
> total = sum(float(col.value) for r in iter(next_r, None) for col in r.itervalues())
> File "test.py", line 16, in <genexpr>
> total = sum(float(col.value) for r in iter(next_r, None) for col in r.itervalues())
> AttributeError: 'list' object has no attribute 'itervalues'

"r.itervalues()" should have been "r[0].columns.itervalues()", I
think. It's hard to test code against an API that you don't have. :-)
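
A small sketch of the shape that the original loop implies (a list holding one row object whose columns attribute is a dict of cells) may make the AttributeError clearer. Cell and RowResult here are hypothetical namedtuple stand-ins that mirror only the attributes used in this thread. The sketch uses values(), which works on Python 2 and 3; itervalues() exists only on Python 2.

```python
from collections import namedtuple

# Hypothetical stand-ins mirroring only the attributes used in the thread.
Cell = namedtuple("Cell", ["value"])
RowResult = namedtuple("RowResult", ["row", "columns"])

# What one scannerGet() call is assumed to return: a list with one row.
r = [RowResult(row="10", columns={"cf:col1": Cell("4.25")})]

# The list itself has no dict methods, hence the AttributeError above.
has_dict_methods = hasattr(r, "values")

# The dict of cells lives one level down, on the first row's columns.
values = [float(col.value) for col in r[0].columns.values()]
print(has_dict_methods, values)
```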

Mike

Oct 4, 2012, 10:01:24 PM
to Python

I agree with you, Ian. Thanks for all the help. Now I get the below error.

  File "test.py", line 17, in <module>
    total = sum(float(col.value) for r in iter(next_r, None) for col in r[0].columns.itervalues())
  File "test.py", line 17, in <genexpr>
    total = sum(float(col.value) for r in iter(next_r, None) for col in r[0].columns.itervalues())

Thanks

Ramchandra Apte

Oct 5, 2012, 4:31:29 AM
to Python
You have left out the last line of the traceback (the error message itself).

Mike

Oct 5, 2012, 9:39:15 AM
Sorry about that. Here you go

Traceback (most recent call last):
  File "test.py", line 17, in <module>
    total = sum(float(col.value) for r in iter(next_r, None) for col in r[0].columns.itervalues())
  File "test.py", line 17, in <genexpr>
    total = sum(float(col.value) for r in iter(next_r, None) for col in r[0].columns.itervalues())
IndexError: list index out of range

Ramchandra Apte

Oct 5, 2012, 9:41:43 AM
The variable "r" is an empty list.

Mike

Oct 5, 2012, 9:47:12 AM
Here is the actual code.

scanner = client.scannerOpenWithStop("tab", "10", "1000", ["cf:col1"])
next_r = functools.partial(client.scannerGet, scanner)
total = sum(float(col.value) for r in iter(next_r, None) for col in r[0].columns.itervalues())


The scanner does have rows.

Are we missing something?

Thanks

Terry Reedy

Oct 5, 2012, 2:52:13 PM
to pytho...@python.org
On 10/5/2012 9:47 AM, Mike wrote:
> On Friday, October 5, 2012 9:41:44 AM UTC-4, Ramchandra Apte wrote:
>> On Friday, 5 October 2012 19:09:15 UTC+5:30, Mike wrote:
>>
>>> On Thursday, October 4, 2012 4:52:50 PM UTC-4, Mike wrote:
>>
>>>
>>
>>>> Hi All,
>>
>>>
>>
>>>>
>>
>>>
>>
>>>>
>>
>>>
>>
>>>>
>>
>>>
>>
>>>> I am new to python and am getting the data from hbase.

If you want as many people as possible to read your posts, stop using a
mail-agent and site that spits in the face of readers by doubling blank
lines each iteration. Alternatives have been discussed previously.

--
Terry Jan Reedy

Ian Kelly

Oct 5, 2012, 3:29:04 PM
to Python
On Fri, Oct 5, 2012 at 7:39 AM, Mike <mike...@gmail.com> wrote:
> Sorry about that. Here you go
>
> Traceback (most recent call last):
> File "test.py", line 17, in <module>
> total = sum(float(col.value) for r in iter(next_r, None) for col in r[0].columns.itervalues())
> File "test.py", line 17, in <genexpr>
> total = sum(float(col.value) for r in iter(next_r, None) for col in r[0].columns.itervalues())
> IndexError: list index out of range

Maybe the sentinel value is not None as I assumed, and it's
overrunning the end of the data? What does
client.scannerGet return when there is no more data?
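
The suspected failure can be reproduced without HBase. This is only a sketch: make_getter is a hypothetical helper that returns one-item lists and then [] forever, which is an assumption about how scannerGet behaves at the end of a scan.

```python
def make_getter(rows):
    """Hypothetical stand-in: returns [row] per call, then [] once exhausted."""
    pending = list(rows)

    def get():
        return [pending.pop(0)] if pending else []

    return get


get = make_getter(["a", "b"])
seen = []
error = None
try:
    # None never compares equal to [], so iter() does not stop at the end
    # of the data; it hands the empty list to the loop body instead.
    for r in iter(get, None):
        seen.append(r[0])  # raises IndexError as soon as r == []
except IndexError as exc:
    error = exc

print(seen, repr(error))
```

If the real client behaves like this stand-in, every row is processed and the exception only appears after the last one, which would match a scanner that "does have rows" and still fails.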

Mike

Oct 5, 2012, 4:03:49 PM
I added the print command.

It prints [] when there is no data.

Thanks

Mike

Oct 5, 2012, 4:09:11 PM
to pytho...@python.org
Terry,

I am not using the mail client. I am just posting on the site.

Something is wrong with this site. When you reply to an individual message, it double-posts, which it shouldn't. See Ramchandra Apte's reply; it is posted twice too.

Thanks



Ian Kelly

Oct 5, 2012, 4:10:36 PM
to Python
On Fri, Oct 5, 2012 at 2:03 PM, Mike <mike...@gmail.com> wrote:
> I added the print command.
>
> It prints [] when there is no data.

Change "iter(next_r, None)" to "iter(next_r, [])"
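
Put together, the corrected version can be tried against an in-memory fake. FakeClient, Cell and RowResult are hypothetical stand-ins that mimic only the two calls and the attributes used in this thread, and the sample rows are made up. The sketch uses values() so that it also runs on Python 3; on Python 2, itervalues() works as in the code above.

```python
import functools
from collections import namedtuple

# Hypothetical stand-ins for the objects the real client returns.
Cell = namedtuple("Cell", ["value"])
RowResult = namedtuple("RowResult", ["row", "columns"])


class FakeClient:
    """In-memory fake; mimics only the two calls used in this thread."""

    def __init__(self, rows):
        self._rows = rows
        self._scanners = {}

    def scannerOpenWithStop(self, table, start, stop, columns):
        # The arguments are ignored here; a real scan would filter on them.
        scanner_id = len(self._scanners)
        self._scanners[scanner_id] = iter(self._rows)
        return scanner_id

    def scannerGet(self, scanner_id):
        # Assumption taken from the thread: [] signals the end of the scan.
        row = next(self._scanners[scanner_id], None)
        return [row] if row is not None else []


client = FakeClient([
    RowResult("10", {"cf:col1": Cell("1.5")}),
    RowResult("11", {"cf:col1": Cell("2.0")}),
])

scanner = client.scannerOpenWithStop("tab", "10", "1000", ["cf:col1"])
next_r = functools.partial(client.scannerGet, scanner)

# The sentinel is [] because that is what scannerGet returns when done.
total = sum(float(col.value)
            for r in iter(next_r, [])
            for col in r[0].columns.values())
print(total)  # 3.5
```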

Dave Angel

Oct 5, 2012, 4:26:26 PM
to Mike, pytho...@python.org
On 10/05/2012 04:09 PM, Mike wrote:
> Terry,
>
> I am not using the mail client. I am just posting on the site.

And which site would that be (that you're using)? There are a few. I'm
guessing you use google-groups. And all of them get gatewayed to the
actual list, with differing numbers of bugs.

I use email to access it directly. I solve one of the duplicate-message
problems with google groups by automatically deleting any message
addressed to google-groups. There are about 100 such messages each month.

Another problem is that lots of these gateways post to both the
newsgroup and to the python-list.

>
> Something wrong with this site. When you do individual reply, it does the double posting which it shouldn't. See "Ramachandra Apte's" reply. It is posted twice too.
>
> Thanks
>
>
>


--

DaveA

Mike

Oct 5, 2012, 5:19:56 PM
That worked, Ian.

Thanks

Ramchandra Apte

Oct 6, 2012, 8:38:38 AM
to d...@davea.name
On Saturday, 6 October 2012 02:09:56 UTC+5:30, Dave Angel wrote:
> On 10/05/2012 04:09 PM, Mike wrote:
> > Terry,
> >
> > I am not using the mail client. I am just posting on the site.
>
> And which site would that be (that you're using)? There are a few. I'm
> guessing you use google-groups. And all of them get gatewayed to the
> actual list, with differing numbers of bugs.
>
> I use email to access it directly. I solve one of the duplicate-message
> problems with google groups by automatically deleting any message
> addressed to google-groups. There are about 100 such messages each month.
>
> Another problem is that lots of these gateways post to both the
> newsgroup and to the python-list.

I found out earlier that this was why my posts were being double-posted in Google Groups.