Importing data to Riak
MIME-Version: 1.0
In-Reply-To: <CAHgGLC=JGn-jOqPTdPFptsHczsFXurP3ubeJgjHE1T+97Vo...@mail.gmail.com>
References: <CAEYuqBtq4AWyG7OGC8kAeO6wa7HVcvLxhQgvDHy_R3AjKuE...@mail.gmail.com>
<EC9EB045-2336-4D00-BF37-0567E761F...@basho.com>
<CAHgGLC=JGn-jOqPTdPFptsHczsFXurP3ubeJgjHE1T+97Vo...@mail.gmail.com>
Date: Tue, 15 Nov 2011 18:08:12 +0100
Message-ID: <CAEYuqBsfbhXDCXhHW7U_pgw6qaFA3hdV6tGD3TfKLiZU+D+...@mail.gmail.com>
Subject: Re: Importing data to Riak
From: Nitish Sharma <sharmanitishd...@gmail.com>
To: Andres Jaan Tack <andres.jaan.t...@eesti.ee>
Cc: riak-users <riak-us...@lists.basho.com>,
Russell Brown <russel...@basho.com>
X-BeenThere: riak-us...@lists.basho.com
X-Mailman-Version: 2.1.14-1
Precedence: list
List-Id: discussion of the Riak data storage system
<riak-users_lists.basho.com.lists.basho.com>
List-Unsubscribe: <http://lists.basho.com/mailman/options/riak-users_lists.basho.com>,
<mailto:riak-users-requ...@lists.basho.com?subject=unsubscribe>
List-Archive: <http://lists.basho.com/pipermail/riak-users_lists.basho.com>
List-Post: <mailto:riak-us...@lists.basho.com>
List-Help: <mailto:riak-users-requ...@lists.basho.com?subject=help>
List-Subscribe: <http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com>,
<mailto:riak-users-requ...@lists.basho.com?subject=subscribe>
Content-Type: multipart/mixed; boundary="===============1866951781=="
Errors-To: riak-users-boun...@lists.basho.com
Sender: riak-users-boun...@lists.basho.com
--===============1866951781==
Content-Type: multipart/alternative; boundary=e89a8f23545d4ef4c504b1c908b3
--e89a8f23545d4ef4c504b1c908b3
Content-Type: text/plain; charset=ISO-8859-1
Hi,
I tried importing the data using the Python library (with protocol buffers).
After storing several objects, the worker threads raise exceptions with
timeout errors. The traceback follows:
  File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 505, in run
    self.__target(*self.__args, **self.__kwargs)
  File "python_load_data.py", line 23, in worker
    new_obj.store()
  File "/usr/local/lib/python2.7/dist-packages/riak-1.3.0-py2.7.egg/riak/riak_object.py", line 296, in store
    Result = t.put(self, w, dw, return_body)
  File "/usr/local/lib/python2.7/dist-packages/riak-1.3.0-py2.7.egg/riak/transports/pbc.py", line 188, in put
    msg_code, resp = self.recv_msg()
  File "/usr/local/lib/python2.7/dist-packages/riak-1.3.0-py2.7.egg/riak/transports/pbc.py", line 370, in recv_msg
    raise Exception(msg.errmsg)
Exception: timeout
The cluster consists of 3 nodes (Ubuntu 10.04).
Any suggestions?
Cheers
Nitish
On Mon, Nov 14, 2011 at 2:20 PM, Andres Jaan Tack <andres.jaan.t...@eesti.ee> wrote:
> I was able to achieve similar results. I wrote a Ruby process that would
> keep at most n (I think n = 10) things at once and reached about 2,500 req/s
> on my MacBook Pro.
>
> I loaded data to a cluster of six Riak nodes by running several of these
> processes at once and attaching each to a different Riak node, and I hit
> 18,000 req/s. I'm not sure whether loading different nodes affected the
> speed or not, now that I think of it.
>
>
> 2011/11/14 Russell Brown <russel...@basho.com>
>
>>
>> On 14 Nov 2011, at 11:47, Nitish Sharma wrote:
>>
>> > Hi,
>> > This is more of a discussion than a question. I am just trying to
>> > see the trend in how users import their data to Riak.
>> > For the data I am using, I was able to achieve almost 150
>> > records/second with the PHP library, and 400 records/second with node.js
>> > (fairly new with node; was hitting a memory wall when trying to import 1
>> > million records).
>> > What are some hacks/tricks/tweaks to import large amounts of data to
>> > Riak?
>>
>> New keys, new data, straight in for the first time, no fetch before
>> store? I've had reasonable results creating a *number* of threads and using
>> the Java Raw PB client to write.
>>
>> For example, maybe have one or a couple of threads that read data (from
>> Oracle, a file, what-have-you) and put it on a queue, and have a bunch of
>> threads that pull data off the queue, create a Riak object and store it.
>> From my laptop I've got up to 2500 writes a second like this, and it was
>> just ad hoc, throwaway code with 4 threads against a small 3-node cluster
>> (running on desktops).
>>
>> I imagine others on the list have more direct, real world examples?
>>
>> Cheers
>>
>> Russell
>>
>> >
>> > Cheers
>> > Nitish
>> > _______________________________________________
>> > riak-users mailing list
>> > riak-us...@lists.basho.com
>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>>
>> _______________________________________________
>> riak-users mailing list
>> riak-us...@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>
>
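
For reference, the reader/queue/worker layout Russell describes in the quoted
message could be sketched in Python roughly as follows. This is only an
illustration under several assumptions, not code from anyone in this thread:
it is written for Python 3 (the traceback above is from Python 2.7),
load_records and store_record are hypothetical names, and store_record stands
in for whatever creates a Riak object and calls store() on it. Because the
traceback shows the timeout surfacing as a plain Exception, the sketch retries
on any Exception with a short backoff; the queue is bounded so the reader
cannot run far ahead of the writers.

```python
import queue
import threading
import time

_DONE = object()  # sentinel telling a worker that no more records will arrive


def load_records(records, store_record, num_workers=4, max_pending=100,
                 max_retries=3, backoff_seconds=0.5):
    """Push records through a bounded queue to a pool of writer threads.

    store_record is a caller-supplied (hypothetical) callable that writes one
    record. Returns (number_stored, list_of_records_that_never_succeeded).
    """
    pending = queue.Queue(maxsize=max_pending)  # put() blocks when full
    lock = threading.Lock()
    stored = [0]
    failed = []

    def worker():
        while True:
            record = pending.get()
            if record is _DONE:
                return
            for attempt in range(max_retries):
                try:
                    store_record(record)
                except Exception:
                    # Assumption: a timeout arrives as a generic Exception,
                    # as in the traceback above, so retry on any Exception.
                    if attempt == max_retries - 1:
                        with lock:
                            failed.append(record)
                    else:
                        time.sleep(backoff_seconds * (2 ** attempt))
                else:
                    with lock:
                        stored[0] += 1
                    break

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for record in records:      # the "reader" side: one producer
        pending.put(record)
    for _ in threads:           # one sentinel per worker
        pending.put(_DONE)
    for t in threads:
        t.join()
    return stored[0], failed
```

A call would look like load_records(rows, store_one, num_workers=4), where
store_one is your own function wrapping the client's store call. Giving each
worker its own client connection, possibly to a different node as Andres
describes, is a design choice this sketch leaves to store_record.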
--e89a8f23545d4ef4c504b1c908b3--
--===============1866951781==--