Distributed pictures storage


Gabriel

Jan 1, 2010, 1:54:30 PM
to Hazelcast
Hi,

I just discovered Hazelcast and I'm planning to use it to build a
distributed picture store. The pictures are very small, less than 40 KB
each, and I'm currently storing them in a DB.
I need some guidelines/help to integrate Hazelcast on top of the
existing app. The app provides a simple API to store/retrieve/remove
a picture from the DB. My idea is to use Hazelcast to build a cluster of
10 machines that will store the pictures. I'm thinking of using a map
whose key is a GUID and whose value is an instance of a
class, let's say Picture.

1) Based on my current understanding of Hazelcast, I should provide
some sort of implementation to take advantage of the existing
Hazelcast backup mechanism. Is there an API for this? I guess I will need
some sort of listener API to be notified when a picture has to be
moved from one machine to another. My plan is to keep 3 copies of
each picture.
2) Hazelcast takes care of data partitioning. I would like to
leverage this functionality, but when the Picture instance is not
stored locally on the node where the request arrived, how should I
read the data from the other machine?

The classic pattern for such systems is an architecture with at least
2 types of nodes in the cluster (or 2 clusters): a tracker node and a
storage node. The tracker knows where the file is stored, so a client
talks to a tracker, and the tracker returns the location that the
client uses to retrieve the file. My goal is to combine both roles in
one type of node by leveraging Hazelcast functionality.

Thanks,
Gabi

Talip Ozturk

Jan 1, 2010, 3:16:10 PM
to hazelcast
Map<GUID, Picture> mapPictures = Hazelcast.getMap("pictures");

> 1) Based on my current understanding of Hazelcast, I should provide
> some sort of implementation to take advantage of the existing
> Hazelcast backup mechanism. Is there an API for this? I guess I will need
> some sort of listener API to be notified when a picture has to be
> moved from one machine to another. My plan is to keep 3 copies of
> each picture.

Backups are taken automatically by Hazelcast. You don't have to
back things up explicitly. Just set the backup-count to 3.
For more info, check out
http://code.google.com/docreader/#p=hazelcast&s=hazelcast&t=MapBackup

> 2) Hazelcast takes care of data partitioning. I would like to
> leverage this functionality, but when the Picture instance is not
> stored locally on the node where the request arrived, how should I
> read the data from the other machine?


This is done automatically too. Hazelcast knows where your picture
entries are so mapPictures.get(GUID) will return the Picture even if
it is remotely stored.

-talip
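
A minimal sketch of how the existing store/retrieve/remove API could sit on top of such a map. It is written against plain java.util.Map, so it runs with the JDK alone; the class names Picture and PictureRepository and the use of a String GUID are assumptions made for illustration. On a cluster member, the map passed in would be the one from the reply above, Hazelcast.getMap("pictures"), instead of the local ConcurrentHashMap used here.

```java
import java.io.Serializable;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical value class. Values placed in a distributed map must be serializable.
final class Picture implements Serializable {
    private static final long serialVersionUID = 1L;
    private final byte[] data;

    Picture(byte[] data) { this.data = data.clone(); }

    byte[] getData() { return data.clone(); }
}

// Hypothetical wrapper exposing the store/retrieve/remove API from the question.
final class PictureRepository {
    private final Map<String, Picture> pictures;

    PictureRepository(Map<String, Picture> pictures) { this.pictures = pictures; }

    // Stores the bytes under a freshly generated GUID and returns that GUID.
    String store(byte[] data) {
        String guid = UUID.randomUUID().toString();
        pictures.put(guid, new Picture(data));
        return guid;
    }

    // Returns null when no picture is stored under the GUID.
    Picture retrieve(String guid) { return pictures.get(guid); }

    // Returns true when a picture was actually removed.
    boolean remove(String guid) { return pictures.remove(guid) != null; }
}

final class PictureRepositoryDemo {
    public static void main(String[] args) {
        // Local stand-in for the distributed map (assumption for this sketch).
        Map<String, Picture> map = new ConcurrentHashMap<>();
        PictureRepository repo = new PictureRepository(map);

        String guid = repo.store(new byte[] {1, 2, 3});
        System.out.println(repo.retrieve(guid).getData().length); // 3
        System.out.println(repo.remove(guid));                    // true
    }
}
```

Because the repository only depends on the Map interface, the calling code does not need to know whether an entry lives on the local node or on a remote one.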

Gabriel Ciuloaica

Jan 1, 2010, 4:25:46 PM
to haze...@googlegroups.com
Thanks for the fast reply, Talip.

So, let me know if I understood correctly: I have to implement the MapStore/MapLoader interfaces, which will take care of interacting with the DB, and then everything else is handled by Hazelcast.

Thanks,
Gabi





Talip Ozturk

Jan 1, 2010, 6:01:35 PM
to hazelcast
> So, let me know if I understood correctly: I have to implement the
> MapStore/MapLoader interfaces, which will take care of interacting with
> the DB, and then everything else is handled by Hazelcast.

Yes.
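
A rough sketch of the write-through idea confirmed above, runnable with the JDK alone. The SimpleStore interface is a simplified stand-in and not the Hazelcast interface: the real MapStore/MapLoader interfaces differ (for example, they also include bulk operations), so check the Hazelcast documentation for the exact signatures. FakeDbStore and WriteThroughCache are hypothetical names that only illustrate when each hook would be called.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a map persistence hook (assumption, not the Hazelcast API).
interface SimpleStore<K, V> {
    V load(K key);
    void store(K key, V value);
    void delete(K key);
}

// Hypothetical "database": a plain map standing in for Berkeley DB or JDBC.
final class FakeDbStore implements SimpleStore<String, byte[]> {
    final Map<String, byte[]> rows = new HashMap<>();

    public byte[] load(String key) { return rows.get(key); }
    public void store(String key, byte[] value) { rows.put(key, value); }
    public void delete(String key) { rows.remove(key); }
}

// Toy write-through cache showing when each hook is invoked.
final class WriteThroughCache<K, V> {
    private final Map<K, V> memory = new HashMap<>();
    private final SimpleStore<K, V> store;

    WriteThroughCache(SimpleStore<K, V> store) { this.store = store; }

    void put(K key, V value) {
        store.store(key, value); // write-through: persist before caching
        memory.put(key, value);
    }

    V get(K key) {
        V value = memory.get(key);
        if (value == null) {
            value = store.load(key); // cache miss: fall back to the database
            if (value != null) {
                memory.put(key, value);
            }
        }
        return value;
    }

    void remove(K key) {
        store.delete(key);
        memory.remove(key);
    }
}

final class WriteThroughDemo {
    public static void main(String[] args) {
        FakeDbStore db = new FakeDbStore();
        WriteThroughCache<String, byte[]> cache = new WriteThroughCache<>(db);
        cache.put("guid-1", new byte[] {42});

        // A fresh cache over the same store simulates a restarted node with empty memory.
        WriteThroughCache<String, byte[]> restarted = new WriteThroughCache<>(db);
        System.out.println(restarted.get("guid-1")[0]); // 42
    }
}
```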

Gabriel Ciuloaica

Jan 5, 2010, 5:02:51 PM
to haze...@googlegroups.com
Hi Talip,

First of all, I will say that the API is very simple to use. I was able to get a cluster running in less than an hour (3 nodes only). Great work, Talip.

I have a functional application at this moment that persists pictures in a Berkeley DB instance. It works fine for small picture files but has issues with larger ones. As far as I have seen during testing, if the size is over 8 KB, the get operation does not work properly and returns a damaged byte buffer. If the file size exceeds a few hundred KB, a StreamCorruptedException is thrown during deserialization, in the toObject() method of the Serializer class. I will continue to dig into this to see whether the issue is in the Serializer class or elsewhere.
Is there a limit on the size of the byte buffer that can be stored as a value in a map?

One workaround is to split the file into smaller chunks and use a multi-map.
I have tried splitting it into multiple chunks and using the query API to get all the chunks of a file, but I can't use it because it queries only the items in memory and not what has been persisted, as far as I understood from the documentation.

Do you have any other suggestions?

I have seen that you are planning to provide built-in file storage. As far as I can tell from the feature name, it is something similar to what I'm doing, with some differences. I would be glad to contribute to this feature. Let me know what you are planning for it.

Thanks,
Gabi
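
One possible way to sketch the chunking workaround without the query API is to give every chunk a predictable key, so that all chunks of a file can be fetched with plain get calls. The ChunkedFiles class, the "fileId#index" key scheme, and the "#count" entry are assumptions made for this illustration; a HashMap stands in for the distributed map so the example runs with the JDK alone.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: splits a file into chunks stored under predictable keys.
final class ChunkedFiles {
    static String chunkKey(String fileId, int index) { return fileId + "#" + index; }

    static String countKey(String fileId) { return fileId + "#count"; }

    // Splits data into pieces of at most chunkSize bytes, preserving order.
    static List<byte[]> split(byte[] data, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int from = 0; from < data.length; from += chunkSize) {
            int to = Math.min(data.length, from + chunkSize);
            chunks.add(Arrays.copyOfRange(data, from, to));
        }
        return chunks;
    }

    // Stores each chunk plus one extra entry holding the chunk count.
    static void put(Map<String, byte[]> map, String fileId, byte[] data, int chunkSize) {
        List<byte[]> chunks = split(data, chunkSize);
        for (int i = 0; i < chunks.size(); i++) {
            map.put(chunkKey(fileId, i), chunks.get(i));
        }
        map.put(countKey(fileId), ByteBuffer.allocate(4).putInt(chunks.size()).array());
    }

    // Reassembles the file by key lookups only; returns null for an unknown file.
    static byte[] get(Map<String, byte[]> map, String fileId) {
        byte[] countBytes = map.get(countKey(fileId));
        if (countBytes == null) {
            return null;
        }
        int count = ByteBuffer.wrap(countBytes).getInt();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < count; i++) {
            byte[] chunk = map.get(chunkKey(fileId, i));
            if (chunk == null) {
                throw new IllegalStateException("missing chunk " + i + " of " + fileId);
            }
            out.write(chunk, 0, chunk.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        Map<String, byte[]> map = new HashMap<>(); // stand-in for the distributed map
        byte[] file = new byte[20000];
        Arrays.fill(file, (byte) 5);
        put(map, "file-1", file, 8192);
        System.out.println(Arrays.equals(file, get(map, "file-1"))); // true
    }
}
```

Since every lookup is a get by key, a map backed by a persistent store would be expected to load missing chunks from the store rather than depend on what is currently in memory.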


Talip Ozturk

Jan 5, 2010, 5:26:54 PM
to hazelcast
Gabi.

> As far as I have seen during testing, if the size is over 8 KB, the get
> operation does not work properly and returns a damaged byte buffer. If
> the file size exceeds a few hundred KB, a StreamCorruptedException is
> thrown during deserialization, in the toObject() method of the
> Serializer class. I will continue to dig into this to see whether the
> issue is in the Serializer class or elsewhere.
> Is there a limit on the size of the byte buffer that can be stored as
> a value in a map?

8 KB is no problem. We have tested serializing objects over 1 MB with no problem.

> One workaround is to split the file into smaller chunks and use a multi-map.
> I have tried splitting it into multiple chunks and using the query API to
> get all the chunks of a file, but I can't use it because it queries only
> the items in memory and not what has been persisted, as far as I
> understood from the documentation.
> Do you have any other suggestions?

I would first check whether I am able to serialize/deserialize the big
picture object to a file.

> I have seen that you are planning to provide built-in file storage. As far
> as I can tell from the feature name, it is something similar to what I'm
> doing, with some differences. I would be glad to contribute to this
> feature. Let me know what you are planning for it.

Sure, that would be great. Let's wait until we clarify the road map
and execution plan for this feature.

thanks,
-talip
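
The file round-trip check suggested above can be sketched with standard Java serialization. PicturePayload and SerializationCheck are hypothetical names, and the 1 MB size is chosen only to match the object size mentioned in the reply; the example uses a temporary file and the JDK alone.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical picture value with an id and the raw bytes.
final class PicturePayload implements Serializable {
    private static final long serialVersionUID = 1L;
    final String guid;
    final byte[] bytes;

    PicturePayload(String guid, byte[] bytes) {
        this.guid = guid;
        this.bytes = bytes;
    }
}

final class SerializationCheck {
    // Writes the object to a temporary file, reads it back, and returns the copy.
    static PicturePayload roundTripThroughFile(PicturePayload original) {
        try {
            Path file = Files.createTempFile("picture", ".ser");
            try {
                try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
                    out.writeObject(original);
                }
                try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
                    return (PicturePayload) in.readObject();
                }
            } finally {
                Files.deleteIfExists(file); // clean up the temporary file
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] big = new byte[1024 * 1024]; // 1 MB of test data
        Arrays.fill(big, (byte) 7);
        PicturePayload copy = roundTripThroughFile(new PicturePayload("guid-1", big));
        System.out.println(Arrays.equals(big, copy.bytes)); // true
    }
}
```

If this check passes for the large objects, the value class itself serializes correctly, which points the investigation at the transport or client side instead.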

Gabriel Ciuloaica

Jan 6, 2010, 7:07:04 AM
to haze...@googlegroups.com
Hi Talip,

Thanks for your prompt answers.
I'm able to serialize/deserialize the pictures to a file successfully. The persistence is configured as "write-through". Using a test case that accesses the DB directly, I can successfully extract an object that was initially stored through Hazelcast.
I ran the tests on a one-node cluster and observed that if I restart the server (Hazelcast) and invoke get on the map, the object is loaded by Hazelcast from the DB and delivered successfully.

What could the problem be?

Thanks,
Gabi

Gabriel Ciuloaica

Jan 7, 2010, 10:16:06 AM
to haze...@googlegroups.com
I have tried the 1.8.1-SNAPSHOT build published today, and I'm getting the same exception when I invoke get() on the map:
java.io.StreamCorruptedException: invalid type code: 00
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1356)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
    at com.hazelcast.client.Serializer.toObject(Serializer.java:170)
    at com.hazelcast.client.ProxyHelper.getValue(ProxyHelper.java:117)
    at com.hazelcast.client.ProxyHelper.doOp(ProxyHelper.java:98)
    at com.hazelcast.client.MapClientProxy.get(MapClientProxy.java:184)
    at com.devsprint.distributed.fs.file.FileProcessorImpl.getFileChunk(FileProcessorImpl.java:98)
    at com.devsprint.distributed.fs.file.FileProcessorImpl.getFileChunk(FileProcessorImpl.java:1)
    at com.devsprint.distributed.fs.client.StorageClient.download(StorageClient.java:152)
    at com.devsprint.distributed.fs.client.StorageClient.runCommand(StorageClient.java:96)
    at com.devsprint.distributed.fs.client.StorageClient.main(StorageClient.java:65)

The map value is a simple class with 3 String fields. The exception is not thrown for every map element, but randomly.

I have also tried the option suggested in the wiki, making the value class implement com.hazelcast.nio.DataSerializable, but the same exception is thrown.
While debugging, I saw that it is not able to re-read the chunks of bytes from the stream.

Thanks,
Gabi
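
For reference, explicit field-by-field serialization of a value class with three String fields can be sketched as follows. The FileEntry class and its field names are hypothetical, and the writeData/readData method names only mirror the general shape of an explicit serialization hook; the class deliberately does not implement com.hazelcast.nio.DataSerializable, so the exact interface signatures should be taken from the Hazelcast documentation. The example runs with the JDK alone.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical value class with three String fields (field names are assumptions).
final class FileEntry {
    String id = "";
    String name = "";
    String location = "";

    // Writes the fields in a fixed order. writeUTF rejects null values and
    // strings whose encoded form exceeds 65535 bytes.
    void writeData(DataOutput out) throws IOException {
        out.writeUTF(id);
        out.writeUTF(name);
        out.writeUTF(location);
    }

    // Reads the fields back in exactly the order they were written.
    void readData(DataInput in) throws IOException {
        id = in.readUTF();
        name = in.readUTF();
        location = in.readUTF();
    }

    static byte[] toBytes(FileEntry entry) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            entry.writeData(out);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static FileEntry fromBytes(byte[] bytes) {
        try {
            FileEntry entry = new FileEntry();
            entry.readData(new DataInputStream(new ByteArrayInputStream(bytes)));
            return entry;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        FileEntry original = new FileEntry();
        original.id = "guid-1";
        original.name = "photo.jpg";
        original.location = "node-3";
        FileEntry copy = fromBytes(toBytes(original));
        System.out.println(copy.id + " " + copy.name + " " + copy.location);
    }
}
```

A local byte-array round trip like this only verifies the value class. It says nothing about how the bytes are framed on the wire between client and server.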

Talip Ozturk

Jan 7, 2010, 12:38:26 PM
to hazelcast
Gabi,

You are using hazelcast-client.. nice!

Can you create a simple app to reproduce the problem so that we can run it here?

Thanks,
-talip

Gabriel Ciuloaica

Jan 8, 2010, 6:28:09 PM
to haze...@googlegroups.com
Hi Talip,

The app is very simple. It has one class called Entry that has 3 String fields with getters and setters. It uses a Map<String, Entry> to store items.
Using the native client API, put works just fine. Get is not working properly; I retested today with the latest changes from trunk and got the same result.

However, today I turned one node into a SuperClient, and all operations (put/get/remove) work properly.

What is your recommendation for the following approach? I'm planning to build a simple REST API over HTTP that will offer GET/POST/DELETE operations. Should this server be implemented as a SuperClient, as a regular node, or using the native client to interact with the cluster?

Thanks,
Gabi

Fuad Malikov

Jan 11, 2010, 10:15:07 AM
to haze...@googlegroups.com
Hi Gabi, 

On Sat, Jan 9, 2010 at 1:28 AM, Gabriel Ciuloaica <gciul...@gmail.com> wrote:
> Hi Talip,
>
> The app is very simple. It has one class called Entry that has 3 String
> fields with getters and setters. It uses a Map<String, Entry> to store items.
> Using the native client API, put works just fine. Get is not working
> properly; I retested today with the latest changes from trunk and got the
> same result.

This issue should be solved now. Could you please try the latest snapshot from http://www.hazelcast.com/downloads.jsp? You should replace both the server and client jars.

> However, today I turned one node into a SuperClient, and all operations
> (put/get/remove) work properly.
>
> What is your recommendation for the following approach? I'm planning to
> build a simple REST API over HTTP that will offer GET/POST/DELETE
> operations. Should this server be implemented as a SuperClient, as a
> regular node, or using the native client to interact with the cluster?

As far as I understand, you have Hazelcast cluster servers and REST API servers. From those REST servers you put/get to the cluster.
If the REST servers are the only ones that put/get to the cluster, then you should use SuperClient. But if the cluster servers are already busy (without the REST servers) doing a lot of distributed map operations, you should use the Java client on the REST side.

Regards,

-fuad
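
Whichever option is chosen, the REST layer itself can be kept independent of how the map is obtained. The following is a minimal sketch under that assumption: PictureEndpoint, RestResponse, and the status codes are hypothetical design choices for illustration, the HTTP server wiring is left out, and a ConcurrentHashMap stands in for the map that a SuperClient, a regular node, or the Java client would provide.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical response holder: HTTP status code plus response body.
final class RestResponse {
    final int status;
    final byte[] body;

    RestResponse(int status, byte[] body) {
        this.status = status;
        this.body = body;
    }
}

// Hypothetical endpoint mapping GET/POST/DELETE onto map operations.
final class PictureEndpoint {
    private static final byte[] EMPTY = new byte[0];
    private final Map<String, byte[]> pictures;

    PictureEndpoint(Map<String, byte[]> pictures) { this.pictures = pictures; }

    RestResponse handle(String method, String guid, byte[] body) {
        switch (method) {
            case "GET": {
                byte[] found = pictures.get(guid);
                return found == null
                        ? new RestResponse(404, EMPTY)
                        : new RestResponse(200, found);
            }
            case "POST":
                if (body == null) {
                    return new RestResponse(400, EMPTY); // nothing to store
                }
                pictures.put(guid, body);
                return new RestResponse(201, EMPTY);
            case "DELETE":
                return pictures.remove(guid) == null
                        ? new RestResponse(404, EMPTY)
                        : new RestResponse(204, EMPTY);
            default:
                return new RestResponse(405, EMPTY); // method not allowed
        }
    }

    public static void main(String[] args) {
        // Local stand-in for the cluster-backed map (assumption for this sketch).
        PictureEndpoint endpoint = new PictureEndpoint(new ConcurrentHashMap<>());
        System.out.println(endpoint.handle("POST", "guid-1", new byte[] {1}).status);   // 201
        System.out.println(endpoint.handle("GET", "guid-1", null).status);              // 200
        System.out.println(endpoint.handle("DELETE", "guid-1", null).status);           // 204
        System.out.println(endpoint.handle("GET", "guid-1", null).status);              // 404
    }
}
```

Keeping the endpoint coded against java.util.Map makes it straightforward to test both deployment options with the same handler.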
 
> Thanks,
> Gabi


Gabriel Ciuloaica

Jan 11, 2010, 10:30:30 AM
to haze...@googlegroups.com
Thanks for the advice.

I will test both ways. At the beginning there will not be much processing in the cluster, but later I may add other features, like search based on metadata or some batch processing, that may add more load.

Thanks,
Gabi 

Gabriel Ciuloaica

Jan 11, 2010, 3:15:34 PM
to haze...@googlegroups.com
Hi,

I can confirm that the original issue is solved. The hazelcast-client API is working properly. I have tested operations on Map only.

Great job!
Thanks,
Gabi