Building a read-only store

Yoav

28 Aug 2009, 18:23:43
to project-voldemort
Hello,

I have a couple of stores using bdb and I want to "convert" one of
them to the read-only store to hopefully get better performance and
balance.
Is there any suggested way to do so?
Assuming I can iterate the data (or recreate it), it seems I can use
JsonStoreBuilder which receives some file as input.
Can you provide some information about the format of this file and how
I can put multiple (and numerous) values in it?

Of course, if this is not the best way, or there is another way to
programmatically build the read-only store, I would be happy to know.

Thanks!

Jay Kreps

2 Sep 2009, 23:14:17
to project-...@googlegroups.com
Hi Yoav,

The input format for the json store builder is json as a sequence of
key-values. So for example a text file containing

1 {'name':'Jay', 'age':29}
2 {'name':'Joe', 'age':32}

Would turn into a store with our binary json serialization containing
two keys (1 and 2) and the two values given above.
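
As a rough illustration of producing such a file programmatically, the Python sketch below writes one key and one JSON value per line, in the layout of the example above. The function name `write_builder_input`, the file name, and the sample records are assumptions made for illustration. Note that Python's `json.dumps` emits double-quoted JSON rather than the single quotes shown above, so whether the builder's parser accepts that output unchanged would need to be checked.

```python
import json
import os
import tempfile


def write_builder_input(records, path):
    """Write (key, value) pairs as 'key<space>json-value' lines.

    `records` is any iterable of (key, value) pairs, so a large data set
    can be streamed from a generator instead of being held in memory.
    Returns the number of pairs written.
    """
    count = 0
    with open(path, "w", encoding="utf-8") as out:
        for key, value in records:
            # Key and value are each serialized as JSON, separated by a space.
            out.write(json.dumps(key) + " " + json.dumps(value) + "\n")
            count += 1
    return count


# Hypothetical sample data mirroring the two records in the example above.
sample = [
    (1, {"name": "Jay", "age": 29}),
    (2, {"name": "Joe", "age": 32}),
]
sample_path = os.path.join(tempfile.mkdtemp(), "input.json")
written = write_builder_input(sample, sample_path)
with open(sample_path, encoding="utf-8") as f:
    lines = f.read().splitlines()
print(written, lines[0])
```

Because the function takes any iterable, the records could come from a generator that walks an existing store, which keeps memory use flat for numerous values.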

Hope that helps.

There is no automatic conversion between storage engines. That would
be a cool feature to have, but probably would not work for the
readonly store without some work since it doesn't take updates.

-Jay

Yoav

4 Sep 2009, 02:02:07
to project-voldemort
Hi Jay,

Thanks for the reply.
I am having some issues with the store creation and I would appreciate
if you can take a look and see what it is I am doing wrong.

I created a file using JSONWriter, which produced output similar
to what you sent (but with no spaces and no line breaks).
I ran the json store builder script to create the output in a
temporary directory. Although it seems it did not go over all records,
it completed successfully.

I then tried using the swap-store script to get this working, and it
seems nothing was actually done (nothing changed in the data directory,
and the log messages also indicate so):

[2009-09-04 05:59:15,869] INFO Invoking fetch for node 0 for /mnt/output/node-0 (voldemort.store.readonly.StoreSwapper)
[2009-09-04 05:59:16,156] INFO Fetch succeeded on node 0 (voldemort.store.readonly.StoreSwapper)
[2009-09-04 05:59:16,156] INFO Attempting swap for node 0 dir = /mnt/output/node-0 (voldemort.store.readonly.StoreSwapper)
[2009-09-04 05:59:16,160] INFO Swap succeeded for node 0 (voldemort.store.readonly.StoreSwapper)
[2009-09-04 05:59:16,160] INFO Swap succeeded on all nodes in 0 seconds. (voldemort.store.readonly.StoreSwapper)

Manually copying the files under /mnt/output/node-0/ to
/usr/local/voldemort/config/single_node_cluster/data/read-only/WordProbability/version-0/
caused Voldemort to not start properly.

Does anything come to mind?

Thanks!





Yoav

15 Sep 2009, 05:01:48
to project-voldemort
Hi,

I think I managed to make some progress, or at least I am getting a
different error :)
I could not get the swap-store utility to work, so I decided to copy
the data myself.
The new error occurs at Voldemort startup and reports an out-of-memory
issue. Is there any way to set the memory size?

Here is the error:

Exception in thread "main" voldemort.VoldemortException: java.io.IOException: Map failed
    at voldemort.store.readonly.ChunkedFileSet.mapFile(ChunkedFileSet.java:143)
    at voldemort.store.readonly.ChunkedFileSet.<init>(ChunkedFileSet.java:90)
    at voldemort.store.readonly.ReadOnlyStorageEngine.open(ReadOnlyStorageEngine.java:129)
    at voldemort.store.readonly.ReadOnlyStorageEngine.<init>(ReadOnlyStorageEngine.java:113)
    at voldemort.store.readonly.ReadOnlyStorageConfiguration.getStore(ReadOnlyStorageConfiguration.java:61)
    at voldemort.server.storage.StorageService.getStorageEngine(StorageService.java:258)
    at voldemort.server.storage.StorageService.openStore(StorageService.java:158)
    at voldemort.server.storage.StorageService.startInner(StorageService.java:151)
    at voldemort.server.AbstractService.start(AbstractService.java:63)
    at voldemort.server.VoldemortServer.startInner(VoldemortServer.java:147)
    at voldemort.server.AbstractService.start(AbstractService.java:63)
    at voldemort.server.VoldemortServer.main(VoldemortServer.java:194)
Caused by: java.io.IOException: Map failed
    at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:803)
    at voldemort.store.readonly.ChunkedFileSet.mapFile(ChunkedFileSet.java:139)
    ... 11 more
Caused by: java.lang.OutOfMemoryError: Map failed
    at sun.nio.ch.FileChannelImpl.map0(Native Method)
    at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:800)
    ... 12 more

Elias Torres

15 Sep 2009, 08:01:13
to project-...@googlegroups.com
Can you describe your machine settings, memory, CPU, etc?

Voldemort startup parameters, config properties?

-Elias

Yoav Naveh

15 Sep 2009, 08:59:20
to project-...@googlegroups.com
Hi Elias,

I am testing on a small instance at EC2:
1.7 GB memory
1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit)
160 GB instance storage (150 GB plus 10 GB root partition)
32-bit platform
I/O Performance: Moderate

The configuration is:
server.properties
the only parameter is:
#ReadOnly
enable.readonly.engine=true

Store configuration:
this is the only read-only store
Here are some parameters:

<persistence>read-only</persistence>
<routing>client</routing>
<replication-factor>1</replication-factor>
<required-reads>1</required-reads>
<required-writes>1</required-writes>

The files under config/single_node_cluster/data/read-only/store_name/version-0 are:
-rw-r--r-- 1 root root 456M 0.data
-rw-r--r-- 1 root root  79M 0.index
-rw-r--r-- 1 root root 455M 1.data
-rw-r--r-- 1 root root  79M 1.index

Finally, voldemort-server.sh is set to run with a 2GB heap (-Xmx2G); I assume part of that ends up in swap.

Any additional data I can provide?


Thanks!

Elias Torres

15 Sep 2009, 09:23:09
to project-...@googlegroups.com
Java version?

Yoav Naveh

15 Sep 2009, 09:38:53
to project-...@googlegroups.com
~# java -version
java version "1.6.0_0"
OpenJDK  Runtime Environment (build 1.6.0_0-b11)
OpenJDK Client VM (build 1.6.0_0-b11, mixed mode, sharing)

Elias Torres

15 Sep 2009, 09:53:45
to project-...@googlegroups.com
Could you try with Sun's JDK?

-Elias

Yoav Naveh

15 Sep 2009, 10:19:37
to project-...@googlegroups.com
Hi,

I changed the voldemort-server.sh file to run:
/usr/lib/jvm/java-6-sun/bin/java $VOLD_OPTS -cp $CLASSPATH voldemort.server.VoldemortServer $@

Where:
~# /usr/lib/jvm/java-6-sun/bin/java -version
java version "1.6.0_16"
Java(TM) SE Runtime Environment (build 1.6.0_16-b01)
Java HotSpot(TM) Client VM (build 14.2-b01, mixed mode, sharing)


I get the same output (same exception as mentioned below).

Perhaps it is best to go back and understand the swap store issue? (Or does it simply replace the files in some hot-swap manner? If so, then I assume this is not relevant)

Yoav

Elias Torres

15 Sep 2009, 10:41:05
to project-...@googlegroups.com
That's correct. There's nothing special about the swap-store step. The key is that the files are built correctly and uploaded to the right path on the system.

Are you sure that those are the files being read?
Can you try smaller ones?
What's the JVM memory size when trying to start?
Have you Googled for OutOfMemory exceptions from mapFile?
Do all nodes in the cluster give you the same problem?
Have you tried another EC2 instance?
Have you tried downloading those files and running with them locally?

-Elias

Yoav Naveh

15 Sep 2009, 12:13:54
to project-...@googlegroups.com
Hi Elias,

This seems like a memory issue.
A (much) smaller file I tested works perfectly.
The JVM is set to 2GB, which is well over the size of the two files of a bit less than 500MB each that reside in that folder. Is there any reason for those files to occupy more in memory?
I can test this on a machine with more memory, but I am a bit puzzled as to why so much memory is required.

Yoav

bhupesh bansal

15 Sep 2009, 13:36:34
to project-...@googlegroups.com
Hey Yoav,

I saw a similar issue today. Try setting VOLD_OPTS explicitly in the shell before you start Voldemort.

export VOLD_OPTS='-Xmx256M -d64 -server'

Best
Bhupesh

Yoav Naveh

16 Sep 2009, 02:15:32
to project-...@googlegroups.com
Hi Bhupesh,

Thank you for the suggestion.
I am already setting VOLD_OPTS in my startup script, but I also tried setting it in the shell before running and got the same error.
I am pretty sure the definition of VOLD_OPTS in the script itself works, because if I change it (in the script) to 4G then Voldemort fails to start because it cannot allocate that much memory.

Yoav

Jay Kreps

16 Sep 2009, 11:35:23
to project-...@googlegroups.com
Hi Yoav,

The problem is pretty simple: the read-only store memory-maps the
index files. This is generally a good thing to do from a performance
standpoint, but it does require that you have sufficient address space
for the mapping. I suspect the issue is that you are attempting to
memory-map more than 2GB of index data on a 32-bit machine, which has
only a 2GB address space. Can you try on a 64-bit Amazon image?
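
To make the idea of a mapping taking address space concrete, here is a small Python sketch of memory-mapping a file read-only. It illustrates the general technique only; it is not Voldemort's Java code (which, per the stack trace earlier in the thread, goes through `FileChannel.map`), and the temporary file and its contents are made up for the example.

```python
import mmap
import os
import tempfile

# Create a small stand-in "index" file; the contents are arbitrary.
fd, index_path = tempfile.mkstemp(suffix=".index")
with os.fdopen(fd, "wb") as f:
    f.write(bytes(range(256)) * 16)  # 4096 bytes

with open(index_path, "rb") as f:
    # Length 0 maps the whole file. The mapping reserves a contiguous range
    # of the process's virtual address space equal to the file size; pages
    # are read from disk only when they are touched.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
        mapped_size = len(mapped)
        first_bytes = mapped[:4]   # slicing reads straight from the mapping
        byte_at_300 = mapped[300]  # random access without an explicit seek

os.remove(index_path)
print(mapped_size, first_bytes, byte_at_300)
```

In a 32-bit process every such mapping competes with the heap for the same limited address range, which is how a mapping can fail with an out-of-memory error even when physical memory is free.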

-Jay

Yoav Naveh

16 Sep 2009, 16:22:55
to project-...@googlegroups.com
Hi Jay,

Does this make sense, even though my data files are (combined) only ~1GB?
See my previous info:

> The files under config/single_node_cluster/data/read-only/store_name/version-0 are:
> -rw-r--r-- 1 root root 456M 0.data
> -rw-r--r-- 1 root root  79M 0.index
> -rw-r--r-- 1 root root 455M 1.data
> -rw-r--r-- 1 root root  79M 1.index

Thanks,
Yoav

Jay Kreps

16 Sep 2009, 17:29:49
to project-...@googlegroups.com
What heap size are you using for your JVM? On a 32-bit machine you
have only 2GB of address space in total, and that has to hold the JVM
itself plus all memory mappings. We create multiple mappings per file,
because each mapping is not thread-safe in Java, so you must make sure
that size_of_jvm + num_mappings * index_size < 2GB. The default number
of mappings is 5 (it can be overridden with the property
readonly.file.handles), so the mappings take 158MB * 5 = 790MB of
address space (though no actual memory). If you are using only the
read-only storage engine, you don't need any significant amount of
Java heap, so you should be able to reduce that.
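
The inequality above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function names are hypothetical, the 2GB limit and the default of 5 mappings are taken from the message above, the index sizes come from the file listing earlier in the thread, and the JVM's own non-heap overhead is ignored. With two 79MB index files and 5 mappings each, the mappings alone come to 790MB.

```python
MB = 1024 * 1024
ADDRESS_SPACE_LIMIT = 2048 * MB  # the 2GB figure quoted for a 32-bit process


def mapped_bytes(index_sizes, num_mappings):
    """Address space taken by mapping every index file num_mappings times."""
    return num_mappings * sum(index_sizes)


def fits_in_address_space(jvm_bytes, index_sizes, num_mappings,
                          limit=ADDRESS_SPACE_LIMIT):
    """Check size_of_jvm + num_mappings * index_size < limit."""
    return jvm_bytes + mapped_bytes(index_sizes, num_mappings) < limit


index_sizes = [79 * MB, 79 * MB]  # 0.index and 1.index from the listing

# Default of 5 mappings with the 2GB heap (-Xmx2G) reported in the thread.
default_fits = fits_in_address_space(2048 * MB, index_sizes, 5)

# Example only: a 256MB heap with one mapping per file
# (readonly.file.handles=1).
reduced_fits = fits_in_address_space(256 * MB, index_sizes, 1)

print(mapped_bytes(index_sizes, 5) // MB, default_fits, reduced_fits)
```

Either lowering the heap or lowering the number of mappings moves the total under the limit; the sketch only shows which side of the inequality a given combination falls on.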

Yoav Naveh

16 Sep 2009, 17:47:31
to project-...@googlegroups.com
Hi Jay,

Deleting my bdb store and setting readonly.file.handles=1 now allows voldemort to start.
Is there any suggestion regarding this parameter?

Yoav

Jay Kreps

16 Sep 2009, 20:04:22
to project-...@googlegroups.com
It controls the maximum parallelism of searches in a given index
chunk. These searches are pretty fast so 1 or 2 may be fine.

-Jay