Swapping a read-only store


tim robertson

Jun 20, 2009, 3:28:36 PM
to project-...@googlegroups.com
Hi all,

While swapping the store that is built from Hadoop, I can see it
copying the data onto the /tmp drive, and it says it was successful,
but the data doesn't seem to make it into the Voldemort read-only store,
which still has index and data files of size 0.
I know I must have a bad config somewhere, but I can't locate it. Any ideas?

The following shows the store swap, the empty data/index directory, and
the temp directory that holds the fetched index and data.

Many thanks for any hints

Tim


[root@ip-10-244-181-204 voldemort]# $VOLDEMORT_HOME/bin/swap-store.sh \
    --file hdfs://ip-10-244-181-204.ec2.internal:9000/user/root/voldemort/PD \
    --name=density-tiles --cluster=config/cluster.xml
09/06/20 15:24:26 INFO readonly.StoreSwapper: Invoking fetch for node
0 for hdfs://ip-10-244-181-204.ec2.internal:9000/user/root/voldemort/PD/node-0
09/06/20 15:24:26 INFO readonly.StoreSwapper: Invoking fetch for node
2 for hdfs://ip-10-244-181-204.ec2.internal:9000/user/root/voldemort/PD/node-2
09/06/20 15:24:26 INFO readonly.StoreSwapper: Invoking fetch for node
3 for hdfs://ip-10-244-181-204.ec2.internal:9000/user/root/voldemort/PD/node-3
09/06/20 15:24:26 INFO readonly.StoreSwapper: Invoking fetch for node
1 for hdfs://ip-10-244-181-204.ec2.internal:9000/user/root/voldemort/PD/node-1
09/06/20 15:24:26 INFO readonly.StoreSwapper: Invoking fetch for node
4 for hdfs://ip-10-244-181-204.ec2.internal:9000/user/root/voldemort/PD/node-4
09/06/20 15:24:27 INFO gui.ReadOnlyStoreManagementServlet: Executing
fetch of hdfs://ip-10-244-181-204.ec2.internal:9000/user/root/voldemort/PD/node-0
09/06/20 15:24:27 INFO fetcher.HdfsFetcher: Starting copy of
hdfs://ip-10-244-181-204.ec2.internal:9000/user/root/voldemort/PD/node-0/0.data
to /tmp/hdfs-fetcher/hdfs-fetcher/node-0/0.data
09/06/20 15:24:27 INFO readonly.StoreSwapper: Fetch succeeded on node 3
09/06/20 15:24:27 INFO readonly.StoreSwapper: Fetch succeeded on node 2
09/06/20 15:24:27 INFO fetcher.HdfsFetcher: Completed copy of
hdfs://ip-10-244-181-204.ec2.internal:9000/user/root/voldemort/PD/node-0/0.data
to /tmp/hdfs-fetcher/hdfs-fetcher/node-0/0.data
09/06/20 15:24:27 INFO readonly.StoreSwapper: Fetch succeeded on node 1
09/06/20 15:24:27 INFO fetcher.HdfsFetcher: Starting copy of
hdfs://ip-10-244-181-204.ec2.internal:9000/user/root/voldemort/PD/node-0/0.index
to /tmp/hdfs-fetcher/hdfs-fetcher/node-0/0.index
09/06/20 15:24:27 INFO readonly.StoreSwapper: Fetch succeeded on node 4
09/06/20 15:24:27 INFO fetcher.HdfsFetcher: Completed copy of
hdfs://ip-10-244-181-204.ec2.internal:9000/user/root/voldemort/PD/node-0/0.index
to /tmp/hdfs-fetcher/hdfs-fetcher/node-0/0.index
09/06/20 15:24:27 INFO gui.ReadOnlyStoreManagementServlet: Fetch complete.
09/06/20 15:24:27 INFO readonly.StoreSwapper: Fetch succeeded on node 0
09/06/20 15:24:27 INFO readonly.StoreSwapper: Attempting swap for node
0 dir = /tmp/hdfs-fetcher/hdfs-fetcher/node-0
09/06/20 15:24:27 INFO readonly.ReadOnlyStorageEngine: Swapping files
for store 'density-tiles' from /tmp/hdfs-fetcher/hdfs-fetcher/node-0
09/06/20 15:24:27 INFO readonly.ReadOnlyStorageEngine: Acquiring write
lock on 'density-tiles':
09/06/20 15:24:27 INFO readonly.ReadOnlyStorageEngine: Renaming data
and index files for 'density-tiles':
09/06/20 15:24:27 INFO readonly.ReadOnlyStorageEngine: Setting primary
files for store 'density-tiles' to
/tmp/hdfs-fetcher/hdfs-fetcher/node-0
09/06/20 15:24:27 INFO readonly.ReadOnlyStorageEngine: Rolling back
store 'density-tiles' to version 1.
09/06/20 15:24:27 INFO readonly.ReadOnlyStorageEngine: Rollback
operation completed on 'density-tiles', releasing lock.
09/06/20 15:24:27 INFO readonly.ReadOnlyStorageEngine: Swap operation
completed on 'density-tiles', releasing lock.
09/06/20 15:24:27 INFO readonly.StoreSwapper: Swap succeeded for node 0
09/06/20 15:24:27 INFO readonly.StoreSwapper: Attempting swap for node
1 dir = /tmp/hdfs-fetcher/hdfs-fetcher/node-1
09/06/20 15:24:27 INFO readonly.StoreSwapper: Swap succeeded for node 1
09/06/20 15:24:27 INFO readonly.StoreSwapper: Attempting swap for node
2 dir = /tmp/hdfs-fetcher/hdfs-fetcher/node-2
09/06/20 15:24:27 INFO readonly.StoreSwapper: Swap succeeded for node 2
09/06/20 15:24:27 INFO readonly.StoreSwapper: Attempting swap for node
3 dir = /tmp/hdfs-fetcher/hdfs-fetcher/node-3
09/06/20 15:24:27 INFO readonly.StoreSwapper: Swap succeeded for node 3
09/06/20 15:24:27 INFO readonly.StoreSwapper: Attempting swap for node
4 dir = /tmp/hdfs-fetcher/hdfs-fetcher/node-4
09/06/20 15:24:27 INFO readonly.StoreSwapper: Swap succeeded for node 4
09/06/20 15:24:27 INFO readonly.StoreSwapper: Swap succeeded on all
nodes in 0 seconds.
[root@ip-10-244-181-204 voldemort]#
[root@ip-10-244-181-204 voldemort]#
[root@ip-10-244-181-204 voldemort]#
[root@ip-10-244-181-204 voldemort]#
[root@ip-10-244-181-204 voldemort]# cd data/read-only/density-tiles/version-0/
[root@ip-10-244-181-204 version-0]# ls -lh
total 0
-rw-r--r-- 1 root root 0 Jun 20 15:21 0.data
-rw-r--r-- 1 root root 0 Jun 20 15:21 0.index
[root@ip-10-244-181-204 version-0]#
[root@ip-10-244-181-204 version-0]#
[root@ip-10-244-181-204 version-0]#
[root@ip-10-244-181-204 version-0]# ls -lh /tmp/hdfs-fetcher/hdfs-fetcher/node-0
total 41M
-rw-r--r-- 1 root root 30M Jun 20 15:24 0.data
-rw-r--r-- 1 root root 11M Jun 20 15:24 0.index
[root@ip-10-244-181-204 version-0]#

Elias Torres

Jun 20, 2009, 4:53:56 PM
to project-...@googlegroups.com
Did your Hadoop job only create 1 chunk per node? Only 0.*, no n.*?
I'm just wondering whether the download from HDFS failed, but probably
not.

The other thing I pointed out in my tutorial was that I set the tmp
dir to the same device as the actual data directory. I'm not sure,
though, whether you are using /mnt on the EC2 nodes.

Can you please try that? Can you check if there's another version-X
directory after the swap?

Thanks for sticking with this, I guess we're still working out some
quirks, but I know you'll get to a working setup soon.

-Elias

tim robertson

Jun 20, 2009, 5:03:31 PM
to project-...@googlegroups.com
Hey - no need to thank me; it's me that just keeps asking and asking ;o)

I plan to write up a little blog post (referencing yours, of course) for
people using the Cloudera AMI and a small cluster once I have it
running.

I am using /mnt, so I will move that.
It did only create 1 chunk per node, but this was a small dataset - not
sure if that matters (I am running the processing for a billion as well,
but that MR job is still going).
I made 4 partitions per machine, simply because I saw it in an example.

I am just downloading the latest 0.52-snapshot, since I was still working
from the one from last weekend, in case something was fixed as well.
It should be set up and running soon, but I'm going to crash soon. I will
keep at it tomorrow.

One thing that was not immediately obvious was how to stop the Voldemort
services. I killed them manually, but I presume there is a more
graceful method.

Cheers

Tim

Jay Kreps

Jun 20, 2009, 5:21:05 PM
to project-...@googlegroups.com
Hi Tim,

Moving the transferred files to the new directory is failing after the download is complete. Since it can't make the new data the live data, it rolls back to the prior version (which may be empty in your case):

09/06/20 15:24:27 INFO readonly.ReadOnlyStorageEngine: Renaming data
and index files for 'density-tiles':
09/06/20 15:24:27 INFO readonly.ReadOnlyStorageEngine: Setting primary
files for store 'density-tiles' to
/tmp/hdfs-fetcher/hdfs-fetcher/node-0
09/06/20 15:24:27 INFO readonly.ReadOnlyStorageEngine: Rolling back
store 'density-tiles' to version 1.
09/06/20 15:24:27 INFO readonly.ReadOnlyStorageEngine: Rollback
operation completed on 'density-tiles', releasing lock.

We should improve this logging to make it more verbose about the failure. This means the attempt to rename and open the new store failed. I'm not sure why that would be; are you sure you are using the latest version?

-Jay

Jay Kreps

Jun 20, 2009, 5:22:47 PM
to project-...@googlegroups.com
Also, with respect to shutting down, there is a shutdown handler so
killing the process is in fact a graceful shutdown (but kill -9 is
not).
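
For context, the reason a plain kill is graceful is that the JVM runs registered shutdown hooks on SIGTERM, while SIGKILL (kill -9) ends the process before any hook can run. A minimal, hypothetical sketch of that mechanism (illustration only, not Voldemort's actual shutdown hook):

    public class ShutdownHookDemo {
        public static void main(String[] args) throws InterruptedException {
            // The hook below runs on a normal `kill <pid>` (SIGTERM) or Ctrl-C,
            // but never on `kill -9` (SIGKILL).
            Runtime.getRuntime().addShutdownHook(new Thread() {
                @Override
                public void run() {
                    System.out.println("Shutdown hook: closing stores cleanly...");
                }
            });
            System.out.println("Running; send `kill <pid>` to see the hook fire.");
            Thread.sleep(Long.MAX_VALUE); // keep the process alive until killed
        }
    }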

-Jay

tim robertson

Jun 20, 2009, 5:36:18 PM
to project-...@googlegroups.com
Hi Both

Thanks for the replies. I now have success!

[root@ip-10-244-181-204 density]#
$VOLDEMORT_HOME/bin/voldemort-shell.sh density-tiles
tcp://localhost:6666
Established connection to density-tiles via tcp://localhost:6666
> get "13140803_10_1000_648"
version(): "64029_41513_1,64029_41514_14"

(Tomorrow this will become a PNG image rendered in real time from
Tomcat for a Google Maps density layer of species distribution - like
http://eol-map.gbif.org/EOLSpeciesMap.html?taxon_id=13839800)

I did 2 things at the same time - got the latest code (it was 7 days old
before) and changed the tmp directory to the same /mnt as the
Voldemort data store. From your suspicions I presume the latter was
the fix.

For any EC2 users stumbling on this thread, I added the following to
server.properties:
hdfs.fetcher.tmp.dir=/mnt/tmp-vol
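
As to why the tmp dir location matters: a plain java.io.File.renameTo() is typically a same-filesystem rename, so a file fetched into /tmp often cannot simply be renamed into a data directory that lives on /mnt; the call just returns false instead of throwing. A hypothetical illustration of that failure mode (the paths are placeholders, and this is not the actual HdfsFetcher/ReadOnlyStorageEngine code):

    import java.io.File;

    public class RenameCheck {
        public static void main(String[] args) {
            // e.g. args[0] = /tmp/hdfs-fetcher/.../0.data, args[1] = /mnt/voldemort/data/.../0.data
            File fetched = new File(args[0]);
            File target = new File(args[1]);
            boolean ok = fetched.renameTo(target);
            System.out.println("renameTo succeeded: " + ok);
            if (!ok) {
                // A cross-device rename commonly fails; keeping the fetcher tmp dir on the
                // same device as the store data directory avoids this case entirely.
                System.out.println("Rename failed - likely a cross-device move.");
            }
        }
    }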

Tomorrow I will try to put it under some load. I will report the
results on the list and will be happy to run any tests you wish if
that helps.

Sincere thanks to you both for your patience and responses.

Cheers,

Jay Kreps

Jun 20, 2009, 5:43:26 PM
to project-...@googlegroups.com
Excellent. I will follow up and improve the error messages for that
kind of failure.

-Jay

thieveryC

Jun 20, 2009, 6:05:34 PM
to project-voldemort
Hi again!

First of all, I want to thank you for the reply to my last post. I was
running version 0.51 because I couldn't use git on the university
network (don't know why...).
I'm trying to swap the read-only store, but it returns an error:

$ bin/swap-store.sh --cluster config/test_config3/config/cluster.xml \
    --file hdfs://lazuli.local:9000/user/vmf/wordcounts --name wordcounts
[2009-06-20 22:45:57,298] INFO Invoking fetch for node 0 for
hdfs://lazuli.local:9000/user/vmf/wordcounts/node-0
(voldemort.store.readonly.StoreSwapper)
Exception in thread "main" voldemort.VoldemortException: voldemort.VoldemortException: Swap request on node 0 (http://lazuli.local:8081/read-only/mgmt) failed: Error while performing operation: Call failed on local exception
        at voldemort.store.readonly.StoreSwapper.invokeFetch(StoreSwapper.java:100)
        at voldemort.store.readonly.StoreSwapper.swapStoreData(StoreSwapper.java:65)
        at voldemort.store.readonly.StoreSwapper.main(StoreSwapper.java:196)
Caused by: voldemort.VoldemortException: Swap request on node 0 (http://lazuli.local:8081/read-only/mgmt) failed: Error while performing operation: Call failed on local exception
        at voldemort.store.readonly.StoreSwapper$1.call(StoreSwapper.java:85)
        at voldemort.store.readonly.StoreSwapper$1.call(StoreSwapper.java:75)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
        at java.util.concurrent.FutureTask.run(FutureTask.java:123)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
        at java.lang.Thread.run(Thread.java:595)


Before this, I could successfully use the hadoop-build-readonly-store.sh
script.

$ bin/hadoop-build-readonly-store.sh --input /output --output wordcounts \
    --tmpdir /tmp/hdfs-fetcher/hdfs-fetcher --mapper HadoopStoreMapper \
    --jar lib/wordcount-mapper.jar \
    --cluster /home/vmf/voldemort/config/test_config3/config/cluster.xml \
    --storename wordcounts \
    --storedefinitions /home/vmf/voldemort/config/test_config3/config/stores.xml \
    --chunksize 1073741824 --replication 1
09/06/20 22:38:35 INFO mr.HadoopStoreBuilder: Data size = 24451,
replication factor = 1, numNodes = 1, chunk size = 1073741824,
num.chunks = 1
09/06/20 22:38:35 INFO mr.HadoopStoreBuilder: Number of reduces: 1
09/06/20 22:38:35 INFO mr.HadoopStoreBuilder: Building store...
09/06/20 22:38:37 INFO mapred.FileInputFormat: Total input paths to
process : 1
09/06/20 22:38:38 INFO mapred.JobClient: Running job:
job_200906202027_0011
09/06/20 22:38:39 INFO mapred.JobClient: map 0% reduce 0%
09/06/20 22:38:51 INFO mapred.JobClient: map 100% reduce 0%
09/06/20 22:39:03 INFO mapred.JobClient: map 100% reduce 100%
09/06/20 22:39:05 INFO mapred.JobClient: Job complete:
job_200906202027_0011
09/06/20 22:39:05 INFO mapred.JobClient: Counters: 19
09/06/20 22:39:05 INFO mapred.JobClient: Job Counters
09/06/20 22:39:05 INFO mapred.JobClient: Launched reduce tasks=1
09/06/20 22:39:05 INFO mapred.JobClient: Rack-local map tasks=1
09/06/20 22:39:05 INFO mapred.JobClient: Launched map tasks=2
09/06/20 22:39:05 INFO mapred.JobClient: Data-local map tasks=1
09/06/20 22:39:05 INFO mapred.JobClient: FileSystemCounters
09/06/20 22:39:05 INFO mapred.JobClient: FILE_BYTES_READ=251
09/06/20 22:39:05 INFO mapred.JobClient: HDFS_BYTES_READ=88
09/06/20 22:39:05 INFO mapred.JobClient: FILE_BYTES_WRITTEN=572
09/06/20 22:39:05 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=299
09/06/20 22:39:05 INFO mapred.JobClient: Map-Reduce Framework
09/06/20 22:39:05 INFO mapred.JobClient: Reduce input groups=7
09/06/20 22:39:05 INFO mapred.JobClient: Combine output records=0
09/06/20 22:39:05 INFO mapred.JobClient: Map input records=7
09/06/20 22:39:05 INFO mapred.JobClient: Reduce shuffle bytes=257
09/06/20 22:39:05 INFO mapred.JobClient: Reduce output records=0
09/06/20 22:39:05 INFO mapred.JobClient: Spilled Records=14
09/06/20 22:39:05 INFO mapred.JobClient: Map output bytes=231
09/06/20 22:39:05 INFO mapred.JobClient: Map input bytes=58
09/06/20 22:39:05 INFO mapred.JobClient: Combine input records=0
09/06/20 22:39:05 INFO mapred.JobClient: Map output records=7
09/06/20 22:39:05 INFO mapred.JobClient: Reduce input records=7

Any idea?

Finally, I just want to ask what the best way is to retrieve
information from Voldemort for processing in Hadoop.
1. getAll(Iterator<Key>)? (sketched below)
2. In special cases, can I read directly from BDB (for instance)?
3. Am I missing something?
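
For option 1, a minimal sketch of a batch lookup through the regular socket client might look like the following (assuming a store named "wordcounts" with string keys and integer values, and a bootstrap URL of tcp://localhost:6666; whether this is fast enough for a full Hadoop export is a separate question - see Elias's reply below):

    import java.util.Arrays;
    import java.util.List;
    import java.util.Map;

    import voldemort.client.ClientConfig;
    import voldemort.client.SocketStoreClientFactory;
    import voldemort.client.StoreClient;
    import voldemort.client.StoreClientFactory;
    import voldemort.versioning.Versioned;

    public class GetAllExample {
        public static void main(String[] args) {
            StoreClientFactory factory = new SocketStoreClientFactory(
                    new ClientConfig().setBootstrapUrls("tcp://localhost:6666"));
            StoreClient<String, Integer> client = factory.getStoreClient("wordcounts");

            // Fetch several keys in one getAll() call instead of one get() per key.
            List<String> keys = Arrays.asList("wordA", "wordB", "wordC");
            Map<String, Versioned<Integer>> values = client.getAll(keys);
            for (Map.Entry<String, Versioned<Integer>> e : values.entrySet()) {
                System.out.println(e.getKey() + " = " + e.getValue().getValue());
            }
            factory.close();
        }
    }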

Thank you very much.

Jay Kreps

Jun 20, 2009, 6:15:15 PM
to project-...@googlegroups.com
Is there an exception in the logs of node 0?

-Jay

thieveryC

Jun 20, 2009, 6:58:44 PM
to project-voldemort
Yes... sorry, I forgot to mention that.

09/06/20 23:54:03 INFO server.VoldemortServer: Startup completed in 783 ms.
09/06/20 23:56:04 INFO gui.ReadOnlyStoreManagementServlet: Executing fetch of hdfs://lazuli.local:9000/user/vmf/wordcounts/node-0
09/06/20 23:56:05 ERROR gui.ReadOnlyStoreManagementServlet: Error while performing operation.
java.io.IOException: Call failed on local exception
        at org.apache.hadoop.ipc.Client.call(Client.java:718)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
        at org.apache.hadoop.dfs.$Proxy6.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
        at org.apache.hadoop.dfs.DFSClient.createRPCNamenode(DFSClient.java:103)
        at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:173)
        at org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:67)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1339)
        at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:56)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1351)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:213)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175)
        at voldemort.store.readonly.fetcher.HdfsFetcher.fetch(HdfsFetcher.java:82)
        at voldemort.server.http.gui.ReadOnlyStoreManagementServlet.doFetch(ReadOnlyStoreManagementServlet.java:162)
        at voldemort.server.http.gui.ReadOnlyStoreManagementServlet.doPost(ReadOnlyStoreManagementServlet.java:125)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:389)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
        at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:879)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:747)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:520)
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:358)
        at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:499)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:441)

Elias Torres

Jun 20, 2009, 7:25:06 PM
to project-...@googlegroups.com, project-voldemort
What's the version of Hadoop you are using?

-Elias

thieveryC

Jun 20, 2009, 7:49:41 PM
to project-voldemort
Hadoop 0.20.0.
But I had to use an older .jar in order to compile HadoopStoreMapper
because of a class version problem. :(
I'll try the same version I compiled with...

Thank you.

P.S. - Can you say anything about this:

Finally, I just want to ask what the best way is to retrieve
information from Voldemort for processing in Hadoop.
1. getAll(Iterator<Key>)?
2. In special cases, can I read directly from BDB (for instance)?
3. Am I missing something?


Elias Torres

Jun 20, 2009, 8:21:55 PM
to project-...@googlegroups.com
I think you should be running 0.18.3 on the server to make sure there
are no incompatibilities on the HDFS client side of things. Voldemort
uses a 0.18.1 core jar to talk to HDFS directly. That's why you're
seeing this problem.

What you're asking is a very good question for which I don't have a
direct answer. If you search the mailing list, there have already been
questions like that and some thoughts on how to go about it. However,
what you're asking is a bit more Hadoop-related. Basically, you
would like an InputFormat/scanner that can read at a high rate from a
Hadoop cluster while mixing read-only and read-write stores. In some
regards, I'd say HBase is probably worth looking at, especially
given that their latest release touts very high online performance.

-Elias

Jay Kreps

Jun 20, 2009, 8:31:14 PM
to project-...@googlegroups.com
Yeah, this is definitely a Hadoop version conflict. The fetcher in
Voldemort is attempting to fetch data using the 0.18 jar which we
bundle, but your Hadoop cluster is version 0.20. The Hadoop HDFS client
and server unfortunately aren't compatible across major versions (I
tried), and you get a variety of weird issues such as this if you try
to use them together.

I believe the fix is to change the ant target to Java 1.6 (which is
required by Hadoop 0.19+), delete the 0.18 Hadoop jar from Voldemort,
rebuild, and redeploy Voldemort with the 0.20 jar.

I am not sure the jobs will build cleanly with Hadoop 0.20; there have
been a LOT of API changes in 0.20, though many are backwards
compatible. Let me know if this works, and which changes, if any, you
need to make to the jobs to get them to build with 0.20.

-Jay

thieveryC

Jun 26, 2009, 12:43:41 PM
to project-voldemort
Hello again,

I didn't succeed in swapping the stores.
I'll try to explain everything I did and the results of all my
"actions".

This is the result of using the wordcount example from Hadoop:

wordA 2
wordB 2
wordC 2

The HadoopStoreMapper is this:

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;

import voldemort.store.readonly.mr.AbstractHadoopStoreBuilderMapper;

public class HadoopStoreMapper extends AbstractHadoopStoreBuilderMapper<LongWritable, Text> {

    // Each input line is a wordcount output record of the form "word<TAB>count";
    // the word becomes the store key.
    @Override
    public Object makeKey(LongWritable key, Text value) {
        return value.toString().split("\t")[0];
    }

    // The count becomes the store value.
    @Override
    public Object makeValue(LongWritable key, Text value) {
        return Integer.parseInt(value.toString().split("\t")[1]);
    }
}
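
Given the wordcount output shown above, a line such as "wordA<TAB>2" should therefore produce the key "wordA" and the integer value 2. A quick standalone check of that split logic (hypothetical helper class, run outside Hadoop):

    public class SplitCheck {
        public static void main(String[] args) {
            String line = "wordA\t2";                          // one line of wordcount output
            String key = line.split("\t")[0];                  // -> "wordA"
            int value = Integer.parseInt(line.split("\t")[1]); // -> 2
            System.out.println(key + " -> " + value);          // prints: wordA -> 2
        }
    }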


The configuration of the store was copied from Elias's tutorial.

This is the command I used to build the read-only store:

bin/hadoop-build-readonly-store.sh --input /output --output wordcounts \
    --tmpdir tmpbuider --mapper HadoopStoreMapper \
    --jar /home/vmf/voldemort/lib/wordcount-mapper.jar \
    --cluster /home/vmf/voldemort/config/test_config3/config/cluster.xml \
    --storename wordcounts \
    --storedefinitions /home/vmf/voldemort/config/test_config3/config/stores.xml \
    --chunksize 1073741824 --replication 2

/output is the directory created by executing the Hadoop example (the
part-00000... file and the _logs directory).

The result is this:

09/06/26 17:22:05 INFO mr.HadoopStoreBuilder: Data size = 17708,
replication factor = 1, numNodes = 1, chunk size = 1073741824,
num.chunks = 1
09/06/26 17:22:05 INFO mr.HadoopStoreBuilder: Number of reduces: 1
09/06/26 17:22:05 INFO mr.HadoopStoreBuilder: Building store...
09/06/26 17:22:10 INFO mapred.FileInputFormat: Total input paths to
process : 1
09/06/26 17:22:10 INFO mapred.FileInputFormat: Total input paths to
process : 1
09/06/26 17:22:11 INFO mapred.JobClient: Running job:
job_200906231643_0008
09/06/26 17:22:12 INFO mapred.JobClient: map 0% reduce 0%
09/06/26 17:22:33 INFO mapred.JobClient: map 50% reduce 0%
09/06/26 17:22:40 INFO mapred.JobClient: map 50% reduce 16%
09/06/26 17:22:54 INFO mapred.JobClient: map 100% reduce 16%
09/06/26 17:22:59 INFO mapred.JobClient: Job complete:
job_200906231643_0008
09/06/26 17:22:59 INFO mapred.JobClient: Counters: 16
09/06/26 17:22:59 INFO mapred.JobClient: File Systems
09/06/26 17:22:59 INFO mapred.JobClient: HDFS bytes read=37
09/06/26 17:22:59 INFO mapred.JobClient: HDFS bytes written=183
09/06/26 17:22:59 INFO mapred.JobClient: Local bytes read=119
09/06/26 17:22:59 INFO mapred.JobClient: Local bytes written=324
09/06/26 17:22:59 INFO mapred.JobClient: Job Counters
09/06/26 17:22:59 INFO mapred.JobClient: Launched reduce tasks=1
09/06/26 17:22:59 INFO mapred.JobClient: Launched map tasks=2
09/06/26 17:22:59 INFO mapred.JobClient: Data-local map tasks=2
09/06/26 17:22:59 INFO mapred.JobClient: Map-Reduce Framework
09/06/26 17:22:59 INFO mapred.JobClient: Reduce input groups=3
09/06/26 17:22:59 INFO mapred.JobClient: Combine output records=0
09/06/26 17:22:59 INFO mapred.JobClient: Map input records=3
09/06/26 17:22:59 INFO mapred.JobClient: Reduce output records=0
09/06/26 17:22:59 INFO mapred.JobClient: Map output bytes=99
09/06/26 17:22:59 INFO mapred.JobClient: Map input bytes=24
09/06/26 17:22:59 INFO mapred.JobClient: Combine input records=0
09/06/26 17:22:59 INFO mapred.JobClient: Map output records=3
09/06/26 17:22:59 INFO mapred.JobClient: Reduce input records=3

I guess there's no problem up to this point.

bin/swap-store.sh --cluster config/test_config3/config/cluster.xml \
    --file hdfs://lazuli.local:9000/user/vmf/wordcounts --name wordcounts
[2009-06-26 17:25:04,862] INFO Invoking fetch for node 0 for
hdfs://lazuli.local:9000/user/vmf/wordcounts/node-0
(voldemort.store.readonly.StoreSwapper)
[2009-06-26 17:25:05,874] INFO Fetch succeeded on node 0
(voldemort.store.readonly.StoreSwapper)
[2009-06-26 17:25:05,874] INFO Attempting swap for node 0 dir = /tmp/
hdfs-fetcher/hdfs-fetcher/node-0
(voldemort.store.readonly.StoreSwapper)
[2009-06-26 17:25:05,912] INFO Swap succeeded for node 0
(voldemort.store.readonly.StoreSwapper)
[2009-06-26 17:25:05,913] INFO Swap succeeded on all nodes in 1
seconds. (voldemort.store.readonly.StoreSwapper)

And the log of the Voldemort server:

09/06/26 17:25:05 INFO gui.ReadOnlyStoreManagementServlet: Executing
fetch of hdfs://lazuli.local:9000/user/vmf/wordcounts/node-0
09/06/26 17:25:05 INFO fetcher.HdfsFetcher: Starting copy of
hdfs://lazuli.local:9000/user/vmf/wordcounts/node-0/0.data to /tmp/
hdfs-fetcher/hdfs-fetcher/node-0/0.data
09/06/26 17:25:05 INFO fetcher.HdfsFetcher: Completed copy of
hdfs://lazuli.local:9000/user/vmf/wordcounts/node-0/0.data to /tmp/
hdfs-fetcher/hdfs-fetcher/node-0/0.data
09/06/26 17:25:05 INFO fetcher.HdfsFetcher: Starting copy of
hdfs://lazuli.local:9000/user/vmf/wordcounts/node-0/0.index to /tmp/
hdfs-fetcher/hdfs-fetcher/node-0/0.index
09/06/26 17:25:05 INFO fetcher.HdfsFetcher: Completed copy of
hdfs://lazuli.local:9000/user/vmf/wordcounts/node-0/0.index to /tmp/
hdfs-fetcher/hdfs-fetcher/node-0/0.index
09/06/26 17:25:05 INFO gui.ReadOnlyStoreManagementServlet: Fetch
complete.
09/06/26 17:25:05 INFO readonly.ReadOnlyStorageEngine: Swapping files
for store 'wordcounts' from /tmp/hdfs-fetcher/hdfs-fetcher/node-0
09/06/26 17:25:05 INFO readonly.ReadOnlyStorageEngine: Acquiring write
lock on 'wordcounts':
09/06/26 17:25:05 INFO readonly.ReadOnlyStorageEngine: Renaming data
and index files for 'wordcounts':
09/06/26 17:25:05 INFO readonly.ReadOnlyStorageEngine: Setting primary
files for store 'wordcounts' to /tmp/hdfs-fetcher/hdfs-fetcher/node-0
09/06/26 17:25:05 INFO readonly.ReadOnlyStorageEngine: Rolling back
store 'wordcounts' to version 1.
09/06/26 17:25:05 INFO readonly.ReadOnlyStorageEngine: Rollback
operation completed on 'wordcounts', releasing lock.
09/06/26 17:25:05 INFO readonly.ReadOnlyStorageEngine: Swap operation
completed on 'wordcounts', releasing lock.

Is the rollback operation supposed to execute at this point? The
problem is that when I connect to the store, every get returns null.


Thank you very much,
Matheus Almeida