I am new to Python and am reading data from HBase. I am trying to sum a column as shown below:
scanner = client.scannerOpenWithStop("tab", "10", "1000", ["cf:col1"])
total = 0.0
r = client.scannerGet(scanner)
while r:
    for k in (r[0].columns):
        total += float(r[0].columns[k].value)
    r = client.scannerGet(scanner)
On Thu, Oct 4, 2012 at 2:52 PM, <mike20...@gmail.com> wrote:
> scanner = client.scannerOpenWithStop("tab", "10", "1000", ["cf:col1"])
> total = 0.0
> r = client.scannerGet(scanner)
> while r:
>     for k in (r[0].columns):
>         total += float(r[0].columns[k].value)
>     r = client.scannerGet(scanner)
> print total
> Do you know of better (faster) way to do sum?
scanner = client.scannerOpenWithStop("tab", "10", "1000", ["cf:col1"])
next_r = itertools.partial(client.scannerGet, scanner)
total = sum(float(col.value) for r in iter(next_r, None) for col in
r.itervalues())
On Thu, Oct 4, 2012 at 3:04 PM, Ian Kelly <ian.g.ke...@gmail.com> wrote:
> scanner = client.scannerOpenWithStop("tab", "10", "1000", ["cf:col1"])
> next_r = itertools.partial(client.scannerGet, scanner)
> total = sum(float(col.value) for r in iter(next_r, None) for col in
> r.itervalues())
That should be "functools" above, not "itertools". :-P
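For anyone following along, the idiom here is the two-argument form of iter(): iter(callable, sentinel) keeps calling the callable with no arguments until it returns a value equal to the sentinel. functools.partial is what binds the scanner argument so that the call takes no arguments. A minimal sketch, where fetch_next and the deque are hypothetical stand-ins for client.scannerGet and the scanner, not part of any HBase API:

```python
import functools
from collections import deque

def fetch_next(queue):
    """Hypothetical stand-in for client.scannerGet: pop one item, or None when empty."""
    return queue.popleft() if queue else None

queue = deque([1.5, 2.5, 4.0])

# Bind the argument so next_item() can be called with no arguments.
next_item = functools.partial(fetch_next, queue)

# iter(callable, sentinel) keeps calling next_item() until it returns None.
total = sum(iter(next_item, None))
print(total)
```

The loop stops only when the callable returns something equal to the sentinel, so the sentinel has to match whatever the real call returns at the end of the data.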
On Thursday, October 4, 2012 5:40:26 PM UTC-4, Dave Angel wrote:
> On 10/04/2012 05:29 PM, Mike wrote:
> > I get below error
> > NameError: name 'functools' is not defined
> functools is a module in the standard library. You need to import it.
> import functools
> --
> DaveA
I imported functools. Now I get the error below.
Traceback (most recent call last):
  File "test.py", line 16, in <module>
    total = sum(float(col.value) for r in iter(next_r, None) for col in r.itervalues())
  File "test.py", line 16, in <genexpr>
    total = sum(float(col.value) for r in iter(next_r, None) for col in r.itervalues())
AttributeError: 'list' object has no attribute 'itervalues'
On Thu, Oct 4, 2012 at 6:40 PM, Mike <mike20...@gmail.com> wrote:
> Traceback (most recent call last):
> File "test.py", line 16, in <module>
> total = sum(float(col.value) for r in iter(next_r, None) for col in r.itervalues())
> File "test.py", line 16, in <genexpr>
> total = sum(float(col.value) for r in iter(next_r, None) for col in r.itervalues())
> AttributeError: 'list' object has no attribute 'itervalues'
"r.itervalues()" should have been "r[0].columns.itervalues()", I
think. It's hard to test code against an API that you don't have. :-)
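To illustrate the fix without access to the real API: judging from the original loop, each scannerGet result is a list whose first element has a columns dict, and each dict value has a value attribute. The sketch below models only that shape; Cell and Row are hypothetical stand-ins, not the real Thrift types. It uses .values(), which works on both Python 2 and 3, where the thread uses the Python 2 only .itervalues().

```python
from collections import namedtuple

# Hypothetical stand-ins for the Thrift result types; only the attributes
# used in this thread (columns, value) are modelled.
Cell = namedtuple("Cell", ["value"])
Row = namedtuple("Row", ["columns"])

# One scannerGet-style result: a list holding a single row.
r = [Row(columns={"cf:col1": Cell(value="3.5")})]

# r is a list, so r.itervalues() fails; the dict lives at r[0].columns.
assert not hasattr(r, "itervalues")
row_total = sum(float(cell.value) for cell in r[0].columns.values())
print(row_total)
```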
I agree with you, Ian. Thanks for all the help. Now I get the error below.
  File "test.py", line 17, in <module>
    total = sum(float(col.value) for r in iter(next_r, None) for col in r[0].columns.itervalues())
  File "test.py", line 17, in <genexpr>
    total = sum(float(col.value) for r in iter(next_r, None) for col in r[0].columns.itervalues())
Traceback (most recent call last):
  File "test.py", line 17, in <module>
    total = sum(float(col.value) for r in iter(next_r, None) for col in r[0].columns.itervalues())
  File "test.py", line 17, in <genexpr>
    total = sum(float(col.value) for r in iter(next_r, None) for col in r[0].columns.itervalues())
IndexError: list index out of range
> > > Do you know of better (faster) way to do sum?
> > > Any thoughts please?
> > > Thanks
> > Sorry about that. Here you go
> > Traceback (most recent call last):
> > File "test.py", line 17, in <module>
> > total = sum(float(col.value) for r in iter(next_r, None) for col in r[0].columns.itervalues())
> > File "test.py", line 17, in <genexpr>
> > total = sum(float(col.value) for r in iter(next_r, None) for col in r[0].columns.itervalues())
> > IndexError: list index out of range
> the variable "r" is an empty list
Here is the actual code.
scanner = client.scannerOpenWithStop("tab", "10", "1000", ["cf:col1"])
next_r = functools.partial(client.scannerGet, scanner)
total = sum(float(col.value) for r in iter(next_r, None) for col in r[0].columns.itervalues())
> On Friday, October 5, 2012 9:41:44 AM UTC-4, Ramchandra Apte wrote:
>> On Friday, 5 October 2012 19:09:15 UTC+5:30, Mike wrote:
>>> On Thursday, October 4, 2012 4:52:50 PM UTC-4, Mike wrote:
>>>> Hi All,
>>>> I am new to python and am getting the data from hbase.
If you want as many people as possible to read your posts, stop using a mail-agent and site that spits in the face of readers by doubling blank lines each iteration. Alternatives have been discussed previously.
On Fri, Oct 5, 2012 at 7:39 AM, Mike <mike20...@gmail.com> wrote:
> Sorry about that. Here you go
> Traceback (most recent call last):
> File "test.py", line 17, in <module>
> total = sum(float(col.value) for r in iter(next_r, None) for col in r[0].columns.itervalues())
> File "test.py", line 17, in <genexpr>
> total = sum(float(col.value) for r in iter(next_r, None) for col in r[0].columns.itervalues())
> IndexError: list index out of range
Maybe the sentinel value is not None as I assumed, and it's
overrunning the end of the data? What does
client.scannerGet return when there is no more data?
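If scannerGet returns an empty list rather than None once the scan is exhausted (the IndexError and the "empty list" remark earlier in the thread point that way, but check this against your HBase Thrift version), then the sentinel should be [] instead of None. A sketch under that assumption; FakeClient, Cell and Row are hypothetical stand-ins so the example runs without HBase:

```python
import functools
from collections import namedtuple

# Hypothetical stand-ins for the Thrift result types.
Cell = namedtuple("Cell", ["value"])
Row = namedtuple("Row", ["columns"])

class FakeClient(object):
    """Hypothetical stand-in for the Thrift client, for illustration only."""
    def __init__(self, values):
        self._rows = [[Row({"cf:col1": Cell(v)})] for v in values]

    def scannerGet(self, scanner):
        # Assumption: an exhausted scanner yields [] rather than None.
        return self._rows.pop(0) if self._rows else []

client = FakeClient(["1.0", "2.5", "4.0"])
scanner = 1  # placeholder scanner id
next_r = functools.partial(client.scannerGet, scanner)

# The sentinel is compared with ==, so [] stops the loop on an empty result.
total = sum(float(col.value)
            for r in iter(next_r, [])
            for col in r[0].columns.values())
print(total)
```

With None as the sentinel, the same loop would pass the final empty list through to r[0] and raise the IndexError shown above.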
I am not using a mail client. I am just posting on the site.
Something is wrong with this site. When you reply to an individual post, it double-posts, which it shouldn't. See Ramchandra Apte's reply. It is posted twice too.