12 questions at CloudCamp Bengaluru


Saifi Khan

Apr 4, 2009, 7:19:02 AM
to bigc...@googlegroups.com
Hi all:

The first ever CloudCamp in India was held last week, on Sunday,
March 29, at IIM Bengaluru.

Twelve questions were raised during the un-Panel.

We captured them in two photos:
http://picasaweb.google.com/twincling/CloudcampBengaluru?feat=directlink#5320704467569033666

and

http://picasaweb.google.com/twincling/CloudcampBengaluru?feat=directlink#5320704472227279698


thanks
Saifi.

Namita Iyer

Apr 4, 2009, 3:12:48 AM
to bigc...@googlegroups.com


The questions were very interesting, and the thoughts just keep flowing when
you see them.

When I saw the questions again, I got new ideas about question 1:
What should developers do differently when developing for the cloud?

Most of the time we think of the Cloud as an infrastructure that provides
us unlimited computing power, memory and storage. However, would it not be
better to design without this assumption? For example, how about designing
for the Cloud the way you would design for an embedded system, where
resources are minimal and we need an optimized design to begin with?

I say this because, even though the resources are unlimited, we now have to
pay per CPU cycle, per MB of memory, etc. So the less we use, the less we
pay.
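
To make that concrete, here is a minimal sketch of a usage-cost estimator. The per-unit rates are made-up placeholders for illustration, not any provider's real prices.

# A minimal usage-cost estimator.
# NOTE: the rates below are hypothetical placeholders, not real provider prices.

RATE_PER_CPU_HOUR = 0.10     # currency units per CPU-hour (assumed)
RATE_PER_GB_MONTH = 0.15     # currency units per GB of storage per month (assumed)
RATE_PER_GB_TRANSFER = 0.05  # currency units per GB transferred out (assumed)

def monthly_cost(cpu_hours, storage_gb, transfer_gb):
    """Estimate a monthly bill from coarse usage numbers."""
    return round(cpu_hours * RATE_PER_CPU_HOUR
                 + storage_gb * RATE_PER_GB_MONTH
                 + transfer_gb * RATE_PER_GB_TRANSFER, 2)

# Halving CPU usage through a leaner design directly halves that line item.
print(monthly_cost(cpu_hours=720, storage_gb=50, transfer_gb=100))  # 84.5
print(monthly_cost(cpu_hours=360, storage_gb=50, transfer_gb=100))  # 48.5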

Having a cost-optimized design in place before developing an app for the Cloud
would definitely help cope with changing service plans (remember how changes in
telephone service plans move our monthly expenses up or down, because we are
used to calling mom on certain days of the week, and we can't change that habit
just because our service plan changed).

I'm eager to hear more on this from others on the list.

Namita

Saifi Khan

Apr 4, 2009, 9:56:49 AM
to bigc...@googlegroups.com
On Sat, 4 Apr 2009, Namita Iyer wrote:

>
> On Saturday 04 April 2009 04:49:02 pm Saifi Khan wrote:
> > Hi all:
> >

> > We captured them as a photo
> > http://picasaweb.google.com/twincling/CloudcampBengaluru?feat=directlink#5320704467569033666
> >
>

> I say this because, even though the resources are unlimited, we now have to
> pay per CPU cycle, per MB of memory, etc. So the less we use, the less we
> pay.
>

This is very fine-grained and good for service-provider costing.

A consumer, however, would usually work with a coarser-grained unit.

Is there a unit consumption model, like in the utility world, i.e.
cubic metres of gas, kilowatt-hours of electricity or litres of
petrol?
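
As a thought experiment, here is a tiny sketch of what such a consumer-facing unit might look like, rolling fine-grained metering up into a single billable "compute unit". The conversion weights are purely illustrative assumptions, not any existing standard.

# Sketch of a consumer-facing "unit consumption" model.
# The conversion weights below are illustrative assumptions, not an actual standard.

CPU_SECONDS_PER_UNIT = 3600.0  # assume 1 CPU-hour of compute == 1 unit
GB_HOURS_PER_UNIT = 4.0        # assume 4 GB-hours of memory == 1 unit

def compute_units(cpu_seconds, memory_gb_hours):
    """Roll fine-grained provider metering up into one utility-style figure."""
    return cpu_seconds / CPU_SECONDS_PER_UNIT + memory_gb_hours / GB_HOURS_PER_UNIT

# A job that used 2 CPU-hours and 8 GB-hours of memory consumed 4 "units".
print(compute_units(cpu_seconds=7200, memory_gb_hours=8))  # 4.0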


thanks
Saifi.

Saifi Khan

Apr 5, 2009, 4:42:26 PM
to bigc...@googlegroups.com
>
> Hi all:

The three points I mentioned in response to the questions were:

1. Semi-structured data (key-value style) can go to the cloud,
whereas you can continue to host your database instances
in your current infrastructure.

2. Want to experiment with cloud computing?
Sure, build a private cloud.

3. The OS is dead!
the bootloader is in the MBR,
the network stack is in the TOE/NIC/NPU,
services are in a VirtualMachine,
devices are in a VirtualMachine,
filesystems are in a VirtualMachine,
and the VirtualMachine environment is switched, tasked and transported.

Who needs an OS?

Open Source is the "only way" to cloud computing!
Open Source nurtures, supports and builds on Open Standards!

Without open standards, vendor lock-ins will happen once again.
Wow, we just locked ourselves into the cloud :)


thanks
Saifi.

Aditya Lal

Apr 6, 2009, 5:46:01 AM
to bigc...@googlegroups.com
Hi Namita,

In my opinion, the cloud poses a very different challenge of scaling out, as compared to performance tuning (which is more about getting the maximum juice out of a set of boxes by scaling up).

Moving to the cloud is a fundamental change, in technology and in mindset. It thrives on the "eventual consistency" model, as against (for lack of a better term) the "complete consistency" model, which is very difficult to achieve. And since we want to run it on throw-away hardware, running any kind of OS, where things can go down at any time, the problem becomes even harder, because we are trying to solve "reliability", "availability" and "consistency" together. Here you must understand that nobody who provides a cloud computing service will guarantee service availability, but they will definitely provide methods by which you can bring your service back up if it goes down.

Also, most of our repository of application/enterprise problems consists of OLTP systems that require the data to be consistent "always"; one can get sued if a customer finds the balance inconsistent. Of course, in traditional scale-up architecture this problem has been solved through an excellent relational model.

Let me illustrate the complexity of a simple problem in a large scale-out system.

You are to sort a "file of strings" on a 1 GB RAM / 100 GB HDD machine. Write an algorithm for each of the following file sizes: 100 MB, 1 GB, 50 GB and 100 GB (assuming separate memory is available for the OS and other processes). You will notice that a very well-known sorting problem suddenly becomes different from the traditional way of solving it. Mostly I have found that if I ask somebody to solve it case by case for these inputs, I get a different algorithm each time. This has nothing to do with their understanding or intelligence. It's just that we never think at a large scale. Therefore one should be glad just to get a solution in reasonable time and resources, rather than worrying too much about performance.
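
As a rough illustration of the cases where the file no longer fits in RAM, here is a minimal external merge-sort sketch in Python. The file names and chunk size are arbitrary assumptions, and it presumes newline-terminated strings.

# Minimal external merge-sort sketch: sort a file of strings that does not fit in RAM.
# "input.txt", "sorted.txt" and CHUNK_LINES are arbitrary assumptions for illustration.
import heapq, os, tempfile

CHUNK_LINES = 1_000_000  # tune so one chunk fits comfortably in memory

def external_sort(src="input.txt", dst="sorted.txt"):
    runs = []
    with open(src) as f:
        while True:
            chunk = [line for _, line in zip(range(CHUNK_LINES), f)]
            if not chunk:
                break
            chunk.sort()                            # in-memory sort of one chunk
            run = tempfile.NamedTemporaryFile("w+", delete=False)
            run.writelines(chunk)
            run.seek(0)
            runs.append(run)
    with open(dst, "w") as out:
        out.writelines(heapq.merge(*runs))          # k-way merge of the sorted runs
    for run in runs:
        run.close()
        os.remove(run.name)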

Taking the problem to the next level:
Assume you are given 'N' nodes (each 1 GB RAM / 100 GB HDD) to sort a 1 TB file distributed across a bunch of nodes. How will your solution be different? Now think of the other related issues: how will you accept the input, how will you provide the output, how long will you allow your consumer to access the data, how will you secure the data, what kind of web API will it be, and so on.
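
One common way to structure the distributed version (my own sketch, not necessarily how anyone at the camp would do it) is range partitioning: sample the keys, give each node a key range, route every line to its owning node, then run the single-node external sort locally. In condensed form, with the send/receive plumbing left out:

# Condensed sketch of a range-partitioned distributed sort (cluster plumbing omitted).

def pick_boundaries(sample_keys, n_nodes):
    """Choose n_nodes - 1 split points from a sample of keys."""
    sample = sorted(sample_keys)
    step = max(1, len(sample) // n_nodes)
    return sample[step::step][: n_nodes - 1]

def owner(key, boundaries):
    """Index of the node responsible for this key under range partitioning."""
    for i, boundary in enumerate(boundaries):
        if key < boundary:
            return i
    return len(boundaries)

# Phase 1 (every node): stream local lines and route each one to owner(key, boundaries).
# Phase 2 (every node): external-sort whatever arrived; the global order is node 0's
# output followed by node 1's output, ..., followed by node N-1's output.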

Anyway, what I am trying to point out is that developers should first be trained to understand the large size of the data set, so that they change their mindset and algorithms to suit a machine cluster (basically, follow the new rules of the game). Once the problem is broken up, solving it locally is just one part of the problem (which we all know how to do well). Also, note that there will always be a large overhead of synchronization, book-keeping and, of course, security, which changes the landscape completely. Thus in the cloud the traditional definition of performance parameters does not hold good. We ought to worry about:
- what reliability % are we looking at?
- what availability % are we looking at?
- what is the time frame by which the data should become eventually consistent?
- is the data transfer over the network minimal?
- is the number of nodes we use minimal?
- is the node itself optimized (the traditional performance problem)?
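
As a back-of-the-envelope illustration of the first two bullets (the 0.95 figure is my own made-up example): if each replica is independently up with probability a, then at least one of n replicas is up with probability 1 - (1 - a)^n.

# Availability of a service replicated across n independent nodes.
# A node availability of 0.95 is a made-up example figure.

def replicated_availability(node_availability, n):
    """P(at least one of n independent replicas is up)."""
    return 1 - (1 - node_availability) ** n

for n in (1, 2, 3):
    print(n, round(replicated_availability(0.95, n), 6))
# 1 0.95
# 2 0.9975
# 3 0.999875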

An analogy: the difference between working on a traditional system and on the cloud is like the difference between how a product company and a services company get work done. In a product company, the people hired are typically very, very capable and few in number. They run the entire show, but if they are down, so is a large portion of the system. In a services company, the people are mostly above average and found in large numbers, but they are fitted into a "superbly" managed system that continues to function even if some of them are down for a while.

Therefore, in order to work well in the cloud, one should aim to make optimal use of resources working at average performance. Here you MUST assume that bad things (failures, bugs) will happen and continue to happen. But you should still achieve consistency (eventual or otherwise), high reliability and very high availability (the adjectives may differ depending on your application). Thinking of it as an embedded system will severely limit the thought process of the developer/architect and take the focus away from the larger issue. Also, CPU cycles, memory and disk will continue to become cheaper day by day, so not too much emphasis is needed there.

Comments and flames welcome ... 

Thanks
Aditya

On 04-Apr-09, at 12:42 PM, Namita Iyer wrote:

<snip>

Most of the time we think of the Cloud as an infrastructure that provides
us unlimited computing power, memory and storage. However, would it not be
better to design without this assumption? For example, how about designing
for the Cloud the way you would design for an embedded system, where
resources are minimal and we need an optimized design to begin with?

I say this because, even though the resources are unlimited, we now have to
pay per CPU cycle, per MB of memory, etc. So the less we use, the less we
pay.

Having a cost-optimized design in place before developing an app for the Cloud
would definitely help cope with changing service plans (remember how changes in
telephone service plans move our monthly expenses up or down, because we are
used to calling mom on certain days of the week, and we can't change that habit
just because our service plan changed).

Namita Iyer

Apr 7, 2009, 6:43:34 AM
to bigc...@googlegroups.com
Hi Aditya,

Good to hear from you (... long time :-) ...).

You have an interesting point of view: "throw-away hardware running any kind of OS wherein things can go down any time", "average resources", "superbly managed system that continues to function even if that resource-set is down for some time", "what is the time frame by which the data should become eventually consistent ?"

This clearly means that not all apps can move to the cloud.
What, according to you, are the traits of an app that can move to the cloud?

Namita


On Mon, Apr 6, 2009 at 3:16 PM, Aditya Lal <aditya...@gmail.com> wrote:
Hi Namita,

In my opinion, cloud poses a very different challenge of scaling out as compared to performance (which is more about getting maximum juice out of a set of boxes using scale up). 

<< snipped >>

Aditya Lal

Apr 9, 2009, 7:57:58 AM
to bigc...@googlegroups.com
Hi Namita,

Ideally one should be able to move any application, provided it is not
specialized for a particular box. I think the following application
characteristics matter the most:
- Availability %
- Reliability %
- Eventual-consistency time period (what if I show stale data for that
  time period?)
- Latency

In general, achieving high availability with low reliability, or high
reliability with low availability, is easy. Achieving both together is
a hard problem.
- Thus it is trivial to move a stateless application (availability =
  high, reliability = low) to the cloud.
- A multi-user application with only user-local data and low per-user
  availability requirements is also simple
  - while the system is recovering, it is not available

Some companies, e.g. Amazon, have already solved the problem of
achieving high reliability and offer it as a service. Today, if your
application can be mapped to the Amazon EC2 & S3 combination, you can
build very large-scale and highly reliable/available applications.

Some clarification on the "interesting" points:

"throw-away hardware running any kind of OS where things can go down
any time" ::
- A data center typically contains very large number of machines
- The cost of a reboot and fixing the problem is typically higher than
moving the application to another machine
- Periodically a dump of bad machines are taken out and recycled
- No service provider will guarantee 100% availability
- thus there is always > 0 probability wherein machine may just go down
- and all you will get is a new machine instance (without a backup)
- As far as I know nobody provides automatic backup of your running
instance
- But you may avail to storage service viz. Amazon S3
- here there is still a probability that you may lose data (albeit
very low)
- typically the probability of failure is equivalent to you managing
your own machine
- but remember higher the reliability % => higher the service cost
Therefore if we are creating a highly reliable and available
application over cloud one has to worry about "a machine going down
irrecoverably" and include an ability to continue.

"average resources" ::
- A typical machine in a cloud may not get upgraded for 2-3 years
- Thus you may be working on a dated machine
- You can of course move to a different service provider but ...
- Also a typical scale-out application has synchronization overheads
- Thus one must not rely on a strong machine working on full capacity
- it may just go down and the replacement may not be that strong
- aim to get work done such a way that it can be easily replaced with
a similar or multiple weaker resources
- of course, your average performer resource itself may be very high ;)

"superbly managed system that ... " ::
- One must have a resource management system that "transparently" move
applications in case of failures
- but to make it "transparent" is not easy
- because it completely depends upon the application availability and
consistency requirements

"what is the time frame by which the data becomes consistent" ::
- In cloud High reliability is typically achieved through data
replication
- Synchronizing changes between replicas may take time
- Thus data is not consistent for brief period
- The queries are typically designed to look into any one of the
replicated nodes
- it may be out of sync if the query is immediate
- or if there is some recovery happening
- Therefore depending upon the application's consistency requirement
the solution may vary
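
A tiny illustration of that lag (a toy model of my own, not any provider's replication protocol): writes land on one replica and propagate asynchronously, so a read served by another replica can return stale data until propagation completes.

# Toy model of replica lag: writes land on one copy and propagate later.
# Nothing here reflects a real replication protocol; it only illustrates stale reads.
import random

replicas = [{}, {}, {}]          # three copies of the same key-value data
pending = []                     # writes not yet applied everywhere

def write(key, value):
    replicas[0][key] = value                 # the primary copy sees it immediately
    pending.append((key, value))

def propagate():
    for key, value in pending:               # apply the backlog to the other copies
        for r in replicas[1:]:
            r[key] = value
    pending.clear()

def read(key):
    return random.choice(replicas).get(key)  # any replica may serve the read

write("balance", 100)
print(read("balance"))   # may print 100 or None -> a stale read
propagate()
print(read("balance"))   # now always 100 -> eventually consistent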

Aditya

On 07-Apr-09, at 4:13 PM, Namita Iyer wrote:

> Hi Aditya,
>
> Good to hear from you (... long time :-) ...).
>
> You have an interesting point of view. "throw-away hardware running
> any kind of OS wherein things can go down any time", "average
> resources", "superbly managed system that continues to function even
> if that resource-set is down for some time", "what is the time frame
> by which the data should become eventually consistent ?"
>
> This clearly means that not all apps can move to the cloud.
> What, according to you, are the traits of an app that can move to the
> cloud?
>
> Namita
>

> <snipped>
