Reliability in of the cloud


Alan Ho

Jun 5, 2008, 8:41:47 PM
to cloud-c...@googlegroups.com
I guessed that about Google App Engine too.

Things get really interesting when you need to make leader election decisions across data centers. E.g., if you are running a big map-reduce task in one data center and it goes down, you want to finish the task in another data center.

How does one transfer the task? Is it even worth solving?

Alan Ho



From: Reuven Cohen <r...@enomaly.com>
Sent: June 05, 2008 10:03 AM
To: cloud-c...@googlegroups.com
Subject: Re: The Business of Building Clouds

From what I've seen of Google App Engine, they distribute your Python code to dozens of servers and then use some kind of round robin to spread the load. Nothing groundbreaking.

r/c

On Thu, Jun 5, 2008 at 12:59 PM, wyim wyim <wingm...@hotmail.com> wrote:


In regards to failover, does Google App Engine have some sort of a LoadBalancer API?
 
thanks
Wayne Yim



From: stuartc...@gmail.com
Subject: Re: The Business of Building Clouds
Date: Thu, 5 Jun 2008 09:07:12 -0700




On 5-Jun-08, at 8:35 AM, Alan Ho wrote:

Picking a provider that has data center failover is critical, but it does mean that you must write your application in a way that can fail over gracefully. Cloud providers need to provide the base infrastructure to do so OR constrain the user to a particular programming paradigm (like the limitations of Google App Engine).

That's a very astute observation, Alan. Constraining an architecture to induce certain properties (guarantees?) is likely the right approach.

Though I wonder if AppEngine is a bit too "nanny-ish", limiting its audience in ways that don't really impact the big-picture qualities.

For example, the choice of Python was easy because it was a standard Google language, but that doesn't seem to be inherently a more applicable language than say C#, Java or Ruby.


I expect in the future that cloud computing systems will provide the concept of "cloud events" in case of major datacenter failures. I just don't see any way round it.

I wonder if Google actually provides this sort of failover for AppEngine today.   Certainly, they could, though they provide no such guarantees at the moment.

As for "cloud events" - yup.   In the traditional data centre, it's likely SNMP or JMX traps.   On the cloud, it's not entirely clear if/where SNMP would play.   Or WS-Man.   Or something newer (?).

Cheers
Stu







--
--

Reuven Cohen
Founder & Chief Technologist, Enomaly Inc.
www.enomaly.com :: 416 848 6036 x 1
skype: ruv.net // aol: ruv6

blog > www.elasticvapor.com
-
Get Linked in> http://linkedin.com/pub/0/b72/7b4


Khazret Sapenov

Jun 5, 2008, 9:00:17 PM
to cloud-c...@googlegroups.com
Alan,
If you are talking about Hadoop, then high availability is not inherent in it yet (but maybe that has changed recently).
As far as I know, while a Secondary Name Node is provided (which resides in another data center), there is no guarantee of a real-time switch from the Job Tracker/Name Node/Task Tracker/Data Nodes of DC A to the Job Tracker/Name Node/Task Tracker/Data Nodes of DC B.
 
cheers

--
Khaz Sapenov,
Director of Research & Development
Enomaly Labs

US Phone: 212-461-4988 x5
Canada Phone: 416-848-6036 x5
E-mail: kh...@enomaly.net
Get Linked in> http://www.linkedin.com/in/sapenov

ian...@gmail.com

Jun 5, 2008, 9:00:26 PM
to cloud-c...@googlegroups.com
Race them against each other and use the first result.
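
A minimal sketch of this racing idea, assuming the job can safely run twice and that each data center can be reached through a plain function call. `run_job` and the data center names are hypothetical stand-ins, not any real cloud API.

```python
import concurrent.futures
import random
import time


def run_job(data_center: str, payload: list[int]) -> tuple[str, int]:
    """Hypothetical stand-in for submitting the same job to one data center."""
    time.sleep(random.uniform(0.01, 0.05))  # simulated, variable run time
    return data_center, sum(payload)


def race(payload: list[int], data_centers: list[str]) -> tuple[str, int]:
    """Run the same job in every data center and return the first good result."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=len(data_centers))
    futures = [pool.submit(run_job, dc, payload) for dc in data_centers]
    try:
        for future in concurrent.futures.as_completed(futures):
            if future.exception() is None:
                return future.result()
        raise RuntimeError("every copy of the job failed")
    finally:
        # Do not wait for the losers; drop any copy that has not started yet.
        pool.shutdown(wait=False, cancel_futures=True)


winner, total = race([1, 2, 3, 4], ["dc-a", "dc-b"])
print(winner, total)
```

The cost is that every job runs at least twice, so this only makes sense when spare capacity is cheaper than the delay of a restart.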
Ian

Ian Rae
Syntenic Inc.
514-944-4008

Alan Ho

Jun 5, 2008, 9:32:21 PM
to cloud-c...@googlegroups.com
I don't think real-time transfer is needed, but your framework should be intelligent enough to continue in the face of failure. Although it's possible to have two trackers that work in a high-availability mode (constantly pinging each other to detect failure), hooking into cloud events can make life easier.

The question remains: is broadcasting failures necessary?
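
To make the two options concrete, here is a rough sketch of a standby tracker that detects failure of its peer either by missed pings or by an explicit failure notification. The class, the timeout value and the `on_failure_event` hook are all assumptions for illustration; no provider offered such an event API in this discussion.

```python
class HeartbeatMonitor:
    """Flags a peer as failed when its pings stop or a failure event arrives."""

    def __init__(self, timeout: float, now: float = 0.0) -> None:
        self.timeout = timeout
        self.last_seen = now
        self.event_reported_failure = False

    def record_heartbeat(self, now: float) -> None:
        # Called each time a ping from the peer is received.
        self.last_seen = now

    def on_failure_event(self) -> None:
        # Hypothetical "cloud event" hook: the provider tells us directly.
        self.event_reported_failure = True

    def peer_failed(self, now: float) -> bool:
        silent_too_long = (now - self.last_seen) > self.timeout
        return self.event_reported_failure or silent_too_long


# Ping-based detection: failure is only noticed after the timeout expires.
monitor = HeartbeatMonitor(timeout=3.0, now=0.0)
monitor.record_heartbeat(now=1.0)
alive_at_3 = not monitor.peer_failed(now=3.0)  # 2 s of silence: still alive
failed_at_5 = monitor.peer_failed(now=5.0)     # 4 s of silence: failed

# Event-based detection: failure is known as soon as the event arrives.
notified = HeartbeatMonitor(timeout=3.0, now=0.0)
notified.on_failure_event()
failed_immediately = notified.peer_failed(now=0.1)
```

Time is passed in explicitly so the logic can be checked without real clocks; a real tracker would use a monotonic clock and a network transport.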

Regards,
Alan Ho





Sassa NF

Jun 6, 2008, 7:06:56 AM
to cloud-c...@googlegroups.com
I have a better idea.

Start task 1, and start task 2 doing the same, but later, _not_
simultaneously. Cancel task 2, when you receive result from task 1.
Cancel task 1, when you receive the result from task 2.

The benefit is that task 2 will not spend the same amount of
time/CPU/resources, as task 1, when everything goes OK.

This will work only if cancelling tasks is supported. The timing of
starting task 2 is related to the amount of time you are prepared to
wait after the expected time of finishing task 1.
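
A minimal sketch of this delayed-backup scheme, assuming tasks can be cancelled cooperatively through a shared flag. The task function, names and timings are hypothetical, and a task that fails outright (rather than running late) is not handled here.

```python
import concurrent.futures
import threading
import time


def run_task(name: str, duration: float, cancel: threading.Event) -> str:
    """Stand-in task that works in small steps and stops early if cancelled."""
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        if cancel.is_set():
            raise RuntimeError(f"{name} cancelled")
        time.sleep(0.005)
    return name


def run_with_backup(primary_duration: float, backup_duration: float,
                    backup_delay: float) -> tuple[str, bool]:
    """Start task 1; start task 2 only if task 1 is late by backup_delay.

    Returns the name of the task whose result was used and whether the
    backup was started at all.
    """
    cancel = threading.Event()
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(run_task, "task-1", primary_duration, cancel)]
        done, _ = concurrent.futures.wait(futures, timeout=backup_delay)
        backup_started = not done
        if backup_started:
            futures.append(
                pool.submit(run_task, "task-2", backup_duration, cancel))
            done, _ = concurrent.futures.wait(
                futures, return_when=concurrent.futures.FIRST_COMPLETED)
        winner = next(iter(done)).result()
        cancel.set()  # tell the slower task to stop
        return winner, backup_started


# Task 1 finishes on time, so task 2 never starts and costs nothing.
fast = run_with_backup(primary_duration=0.02, backup_duration=0.02,
                       backup_delay=0.2)
# Task 1 is late, so task 2 starts after the delay and wins.
slow = run_with_backup(primary_duration=1.0, backup_duration=0.02,
                       backup_delay=0.05)
```

`backup_delay` plays the role of "the amount of time you are prepared to wait after the expected time of finishing task 1".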


Sassa


Alexis Richardson

Jun 6, 2008, 4:09:17 AM
to cloud-c...@googlegroups.com
The problem of delivering availability and reliability, across two
clouds, or across one cloud and your local network, is one that we ran
into at CohesiveFT. We ran into it because some of our own systems
are hosted in a 'multi-sourced' or 'multi-provider' way.

To solve this problem one of the team, Dmitriy Samovskiy, created a
cross-VPN routing layer which we then open sourced as 'VCUBEV'. It's
a really simple and neat idea, and anyone else is welcome to use it.
There are one or two gotchas to watch out for, such as latency
between sites, but we've found that for a capable administrator these
are manageable.

A good place to start is:

http://highscalability.com/manage-downtime-risk-connecting-multiple-data-centers-secure-virtual-lan

More details and downloadable code are here:

http://www.cohesiveft.com/Developer/VcubeV/VcubeV_-_Usecases/

Enjoy :-)

alexis

--
Alexis Richardson
+44 20 7617 7339 (UK)
+44 77 9865 2911 (cell)
+1 650 206 2517 (US)

Gavan Corr

Jun 6, 2008, 8:57:43 AM
to cloud-c...@googlegroups.com
There are a number of commercial data caching solutions in the market: Gemstone, Coherence (now from Oracle) and Gigaspaces, and to a lesser extent Terracotta. Of those, Gemstone is the only one I have seen successfully deployed in a large-scale multi-site environment to ensure consistency of data between multiple sites and to do reliable failover if a node or a center fails. Hadoop is gaining interest but is not there yet...
Gavan




Visit our website at http://www.nyse.com
*****************************************************************************
Note: The information contained in this message and any attachment to it is privileged, confidential and protected from disclosure. If the reader of this message is not the intended recipient, or an employee or agent responsible for delivering this message to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please notify the sender immediately by replying to the message, and please delete it from your system. Thank you. NYSE Euronext, Inc.

Alan Ho

Jun 6, 2008, 10:56:53 AM
to cloud-c...@googlegroups.com
That's a really nice solution - I'm already envisioning a whole host of products that can be launched on an EC2 cluster (disclaimer: I work for Amazon).

I can see how it is useful in some use cases.


Alan Ho

Jun 6, 2008, 11:04:33 AM
to cloud-c...@googlegroups.com
There is good old Oracle with hot standby. It's not perfect, but our applications were able to fail over gracefully from one Oracle instance to another.

Regards,
Alan Ho





Stuart Charlton

Jun 6, 2008, 1:32:28 PM
to cloud-c...@googlegroups.com
Ahhh Gemstone... how I miss thee.

Nati Shalom

Jun 7, 2008, 5:59:06 PM
to cloud-c...@googlegroups.com

The topic of reliability on the cloud can be fairly broad.

Just in this thread we're discussing the issues of:

  1. Continuous high availability
    1. In a relatively stateless job distribution scenario (Map/Reduce)
    2. Data sources and databases
  2. Consistency – specifically, how do we ensure consistency between separate networks and clusters

 

Many others refer to reliability in the cloud with reference to Amazon S3 and the complexity involved in setting up databases on the cloud.

Others would refer to reliability from a different perspective, i.e. what happens if the cloud is down?

 

Each of those items deserves an entire discussion.

As Gavan noted, many of those issues have been addressed by some of the data-grid providers (I happen to represent GigaSpaces). The reason is simple: the grid applications that use those products in the financial industry have requirements that I would view as a superset of those of the cloud. This type of application demands reliability, consistency, performance and scalability at the same time. Unlike with Google and other solutions commonly used to address similar requirements, compromising on consistency and/or reliability was not an option.

 

As I noted earlier, I'm not going to be able to cover how all this is achieved; instead I'll cover some of the core principles.

 

How do we address consistency and avoid a split-brain scenario between separate clusters or data centers?

A split-brain scenario starts when we maintain more than one copy of the data across separate network segments (separate data centers are just one example of such a scenario).

The question is how we ensure consistency when an update can happen simultaneously on each of the copies. If there is a network failure, both updates can succeed, and once the network connection is re-established we can end up in an inconsistent state that we can't recover from.

 

There are various patterns for dealing with this scenario:

  1. Conflict resolution – we decide, based on a certain algorithm, which update wins. As you can imagine, one of the main drawbacks of this approach is that it tends to be very application-specific and error-prone. It also requires manual intervention to resolve such conflicts.
  2. Master site – in this approach all updates go to a central location, which becomes the master of all updates. This approach addresses the consistency aspect but leads to scaling and performance issues.
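
As an illustration of the first pattern, here is a small sketch of one possible conflict-resolution rule. The "latest timestamp wins" rule, the tie-break and the `Update` type are assumptions chosen for the example, not something any of the products mentioned is known to do; it also assumes the clocks at both sites are comparable, which is itself a common source of errors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    key: str
    value: str
    timestamp: float  # assumed comparable across sites
    site: str


def resolve(a: Update, b: Update) -> Update:
    """Pick a winner between two conflicting updates to the same key.

    Illustrative rule: the latest timestamp wins; ties go to the site whose
    name sorts first, so that both sites reach the same answer.
    """
    if a.key != b.key:
        raise ValueError("updates are for different keys")
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    return a if a.site <= b.site else b


left = Update("account-42", "limit=100", timestamp=10.0, site="dc-a")
right = Update("account-42", "limit=250", timestamp=12.5, site="dc-b")
winner = resolve(left, right)
```

Note that the losing update is silently discarded, which is exactly why this kind of rule tends to be application-specific.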

 

So what is the solution?

      Take the master site and partition it between the sites, so that each site acts as the master for a certain part of the data.

      All sites can maintain a local copy of the data for read purposes. That means that the data is always available in case of a network failure.

      In this way we can ensure consistency (there is a central owner per data item throughout the entire cluster), scalability (through replication and partitioning), and performance (replication can be made local).
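
The partitioned-master idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the site names, the hash-based ownership rule and the synchronous replication loop are hypothetical and say nothing about how GigaSpaces or any other product implements it.

```python
import zlib

SITES = ["london", "new-york", "tokyo"]  # hypothetical site names


def owner_site(key: str) -> str:
    """Every key has exactly one master site, chosen by a stable hash."""
    return SITES[zlib.crc32(key.encode("utf-8")) % len(SITES)]


class Site:
    def __init__(self, name: str) -> None:
        self.name = name
        self.local_copy: dict[str, str] = {}  # local replica used for reads

    def read(self, key: str):
        # Served locally, so reads keep working if the WAN link is down.
        return self.local_copy.get(key)


class Cluster:
    def __init__(self) -> None:
        self.sites = {name: Site(name) for name in SITES}

    def write(self, key: str, value: str) -> str:
        """Route the write to the key's master, then replicate to the rest."""
        master = owner_site(key)
        self.sites[master].local_copy[key] = value
        for name, site in self.sites.items():
            if name != master:
                # Done inline here for simplicity; could be asynchronous.
                site.local_copy[key] = value
        return master


cluster = Cluster()
master = cluster.write("order-1", "filled")
```

Because each key has a single owner, two sites can never accept conflicting writes for the same key, while writes for different keys are spread across all sites.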

 

Now let's look at the Amazon S3 issues.

One of the simplest solutions that fits nicely with the cloud is to decouple the persistency layer from the application and use memory resources as the system of record.

I wrote a lengthy post targeted more specifically at MySQL, but the same principles that I described in that post apply here as well.

http://natishalom.typepad.com/nati_shaloms_blog/2008/03/scaling-out-mys.html

 

There is also an interesting open-source project that uses this pattern: it built an in-memory data-grid (GigaSpaces in this specific case) synchronized with Amazon SimpleDB.

http://www.openspaces.org/display/EDS/External+Data+Source+by+Amazon+SimpleDB

 

You could use the same pattern to load data from your own local site, i.e. keep the data persistent at your own site and use the data-grid as the system of record for applications running in the cloud.

The data-grid will be responsible for keeping itself in sync with the local database. Through configuration, you can control the rate at which those two entities are synchronized, based on your performance, network latency and reliability requirements.
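
A toy sketch of this pattern, with an in-memory store as the system of record and a backing database that is updated in batches. The class name and the `batch_size` knob are hypothetical, and a plain dict stands in for the local-site database; real data-grids expose this through their own configuration.

```python
class WriteBehindGrid:
    """In-memory system of record that flushes changes to a database later."""

    def __init__(self, database: dict, batch_size: int) -> None:
        self.memory: dict[str, str] = {}
        self.pending: dict[str, str] = {}  # latest unflushed value per key
        self.database = database           # stand-in for the local database
        self.batch_size = batch_size       # the configurable sync rate

    def put(self, key: str, value: str) -> None:
        self.memory[key] = value           # applications see this immediately
        self.pending[key] = value
        if len(self.pending) >= self.batch_size:
            self.flush()

    def get(self, key: str):
        return self.memory.get(key)

    def flush(self) -> None:
        self.database.update(self.pending)
        self.pending.clear()


db: dict[str, str] = {}
grid = WriteBehindGrid(db, batch_size=2)
grid.put("a", "1")
db_after_one_put = dict(db)  # nothing has reached the database yet
grid.put("b", "2")           # batch is full, so both entries are flushed
```

A larger batch means fewer round trips to the database but more data at risk if the grid is lost before a flush, which is the trade-off the configuration has to balance.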

 

For obvious reasons I can speak more about GigaSpaces than about other data-grid products such as Gemstone and Coherence, but I believe that the principles are similar while the underlying implementations can still be very different. What I can safely say at this stage is that, unlike pure data-grid products, we see the data-grid as a component in a broader solution which we refer to as a scale-out application server (the equivalent of Google AppEngine, but for Java, .NET and C++).

 

Nati Shalom

CTO GigaSpaces

Stuart Charlton

Jun 8, 2008, 3:19:39 PM
to cloud-c...@googlegroups.com
Log shipping like Oracle's Data Guard tends to work well in covering for log failures. I've also seen success with storage-level replication like EMC's SRDF.

It can be problematic, however, for catastrophic failure (i.e. the whole data center), as the latency of synchronous replication over a WAN can be performance prohibitive beyond a few hundred miles or so (i.e. your DR site couldn't be across the continent).   Of course, you could go asynchronous and tolerate some data loss as acceptable in such a scenario -- just that many aren't willing to (i.e. in CAP conjecture terms,  emphasizing "C"onsistency & "A"vailability but becoming unavailable during a network "P"artition).   Or you could partition / shard the data set itself to segment disruption, though that in itself has a whole pile of tradeoffs.
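
A back-of-the-envelope calculation shows why distance hurts synchronous replication. It assumes light travels through optical fiber at roughly 200 km per millisecond (about two thirds of its speed in a vacuum) and ignores switching, routing and disk time, so real numbers will be worse. The function names and the single-session model are illustrative only.

```python
FIBER_KM_PER_MS = 200.0  # approximate speed of light in optical fiber


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on the round trip to a replica distance_km away."""
    return 2 * distance_km / FIBER_KM_PER_MS


def max_serial_commits_per_second(distance_km: float) -> float:
    """Upper bound for one session that waits for each remote acknowledgement."""
    return 1000.0 / min_round_trip_ms(distance_km)


# A DR site about 4,000 km away (roughly across a continent):
rtt = min_round_trip_ms(4000)                 # 40.0 ms per commit at best
rate = max_serial_commits_per_second(4000)    # 25.0 commits per second
```

Every synchronous commit pays at least that round trip, which is why asynchronous shipping or partitioning becomes attractive at long range.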

Solutions like Gemstone, Gigaspaces, Coherence, etc. if you're using "write behind" caching, all basically perform the same idea as log shipping and have similar tradeoffs as a traditional DB's log shipper, though they go about it in different ways. (e.g. GemStone basically *IS* a distributed database, whereas Gigaspaces & Coherence are distributed caches that delegate to an RDBMS or indexed file).


Cheers
Stu

Talip Ozturk

Jun 9, 2008, 3:21:34 AM
to cloud-c...@googlegroups.com
> GemStone basically *IS* a distributed database

Never heard this before. GemStone describes itself more as an in-memory data management and caching solution.

"The GemStone® EDF flagship product—GemFire Enterprise™—is an in-memory data management solution that sits in the middle-tier between the applications and the data sources to provide distributed caching, continuous analytics semantics and message bus service all in one and operates at memory speeds. It provides low-latency and near-zero downtime along with horizontal & global scalability. The GemFire Enterprise solution is easy to adopt with simple HashMap API. The programming paradigm is extremely simple and familiar, but what it does behind the scenes is the real value proposition.  You simply "put" your state into the local HashMap of your business service and under the covers the middleware takes care of replicating, persisting and managing this business object to multiple additional servers in a massively distributed environment."

Source: http://gemstone.com/products/gemfire/enterprise.php

No claim of its being a distributed database! Am I missing something?

Regardless I think Gemstone, Oracle Coherence and Gigaspaces are great grid application enablers and cloud is a nice ecosystem to run grid(dy) applications.

Best,

Talip Ozturk
Hazelcast: Clustering and data distribution platform for Java
blog :http://www.jroller.com/talipozturk

sanjee...@comcast.net

Jun 9, 2008, 9:16:26 AM
to cloud-c...@googlegroups.com, Talip Ozturk

Gavan Corr

Jun 9, 2008, 9:37:40 AM
to cloud-c...@googlegroups.com
Just to clarify: GemStone/S is the Smalltalk database; GemFire is a separate, newer distributed caching product used in the grid world.
And just to have all cards on the table (esp. for Nati): I used to work for Gemstone...

Gavan

Stuart Charlton

Jun 9, 2008, 5:41:19 PM
to cloud-c...@googlegroups.com
Right.   Since you said GemStone, I assumed you meant the database itself.   GemFire is more directly comparable to Coherence & Gigaspaces than GemStone/S (or GemStone Facets, the Java object database).    My experience is with the latter two.

Cheers
Stu

Alexey Roytman

Jun 10, 2008, 2:31:41 AM
to cloud-c...@googlegroups.com

Hi,
IBM has a similar product. Its name is ObjectGrid. It can be used as a shared coherent cache or as a database front end. It supports map and entity abstractions, MapReduce / DataGrid patterns, etc.
See http://www-128.ibm.com/developerworks/wikis/display/objectgrid/Getting+started

So it is another candidate for the cloud environment.


Regards
        Alexey.

Alexey Roytman
Distributed Middleware
IBM Research Laboratory in Haifa

Gavan Corr

Jun 10, 2008, 10:07:21 AM
to cloud-c...@googlegroups.com
Haven't heard much about this making it into production environments yet Alexey. Anything you can share with us? I know there has been a long effort by IBM to get this out though....

Alexey Roytman

Jun 10, 2008, 11:05:46 AM
to cloud-c...@googlegroups.com

ObjectGrid was a part of WebSphere eXtended Deployment (XD) Data Grid. Its name is now WebSphere eXtreme Scale.
http://www-306.ibm.com/software/webservers/appserv/extremescale/

Although it is part of the WebSphere family, it can be integrated into J2SE applications as well.

Regards
        Alexey.




Talip Ozturk

Jun 10, 2008, 12:11:10 PM
to cloud-c...@googlegroups.com
Billy Newport, the main committer of ObjectGrid (eXtreme Scale), explains how to make an Amazon EC2 image for ObjectGrid at the link below.

http://www.devwebsphere.com/devwebsphere/2008/01/making-an-amazo.html

CPU, RAM and storage clouds will become more useful as applications are architected and developed to run on the cloud. ObjectGrid can help a lot here.

-talip
http://www.hazelcast.com
blog: http://jroller.com/talipozturk

Cameron

Jun 13, 2008, 4:48:51 PM
to Cloud Computing
> There are a number of commercial data caching solutions in the market,
> Gemstone, Coherence (now from Oracle) and Gigaspaces, and to a lesser
> extent terracotta. of those, Gemstone is the only one I have seen
> successfully deployed in a large scale multi site environment to
> ensure consistency of data between multiple sites, and to do reliable
> failover if a node or a center fails.

These capabilities have been standard in Oracle Coherence for a long
time, including two-way replication (multi-master hot-hot with
customizable reconciliation), and have been in production use by
banks, exchanges etc. for some time now on systems of dozens up to
hundreds of servers. This has been one of the key differentiating
factors for Coherence, and by now has helped us gain almost every
major bank as a customer, so I was quite surprised to read your
comment.

Peace,

Cameron Purdy | Oracle
http://www.oracle.com/technology/products/coherence/index.html
