The case for a cloud computing price war


dberninger

unread,
Oct 29, 2010, 4:48:41 PM10/29/10
to Cloud Computing
The long list of issues raised in sessions at the various cloud
conferences, from security to picking a hypervisor, remains moot until
the cloud industry becomes more price competitive with on-premises
computing options. The ability to leave EC2 instance prices unchanged
for years at a time may seem like good news for Amazon, but high costs
keep the mass migration of computing to the cloud on hold. The
absence of price competition is not good news for anyone; it means
there is not enough deployment activity for anyone to care.

Price competition can arise via: 1) Adoption of Amazon's ECU as a
standard cloud compute capacity metric; 2) Tracking cloud metrics
(compute, memory, storage, bandwidth) as a function of price; and 3)
Standardized multi-vendor cloud configurations.

See additional background on these points at
http://cloudpricecalculator.com/blog/the-case-for-a-cloud-computing-price-war/
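For illustration, the normalization Dan calls for can be sketched in a few lines of Python (the vendor names, prices, and capacity figures below are invented for the example, not taken from any real price list):

```python
# Hypothetical hourly instance offers, each with a nominal compute
# capacity in some agreed unit (ECU-like), plus a helper that
# normalizes price to a per-unit-hour figure for direct comparison.
offers = {
    "vendor_a": {"price_per_hour": 0.34, "compute_units": 4.0},
    "vendor_b": {"price_per_hour": 0.20, "compute_units": 2.0},
}

def price_per_unit_hour(offer):
    """Price normalized by compute capacity: the comparable number."""
    return offer["price_per_hour"] / offer["compute_units"]

ranking = sorted(offers, key=lambda name: price_per_unit_hour(offers[name]))
# vendor_a at 0.085 per unit-hour undercuts vendor_b at 0.100,
# despite vendor_a's higher headline instance price.
```

Without an agreed compute unit, the `compute_units` field is exactly the number no two vendors report the same way.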

Counter arguments?

Regards,

Dan

........................................
Daniel Berninger
President, goCipher Software
e: dbern...@gocipher.com
tel SD: +1.202.250.3838
sip HD: d...@danielberninger.com
w: www.gocipher.com

cloudsigma

unread,
Oct 30, 2010, 6:41:12 AM10/30/10
to Cloud Computing
Dear Dan,

As a vendor in the IaaS space, I completely agree that current pricing
models are deliberately designed to mask the amount of computing
performance actually being delivered and the raw resource being
allocated. It isn't just an issue of pricing but also of resource
bundling. These two go hand in hand. When you bundle resources
together and create false constructs such as server instance sizes,
you hide the real pricing. It can and will change.

In terms of your conclusions, I'd agree with most but strongly
disagree that adoption of Amazon's ECU is any part of the answer. So
with regards to your conclusions:

1. Amazon's ECU as a standard unit
Adoption of Amazon's ECU is not part of the solution. What is an ECU,
and why exactly is it necessary? I'd love to know, and I work in the
sector! From my perspective, we have a well-defined way of describing
processing power: MHz/GHz, combined with the number of cores. How is
an 'ECU' superior to saying 1 GHz, for example? I struggle to find a
justification, except to move pricing away from the actual underlying
resources being delivered. Price in easy-to-understand resource/time
units. It's that simple. People understand GHz, and they understand GB
for RAM, storage, and data transfer. If it ain't broke, don't fix it!
Some in the cloud space seem to invent new terms for existing,
well-understood concepts. Customers should reject such attempts.

2. Tracking cloud metrics (compute, memory, storage, bandwidth) as a
function of price
Yes and no. This statement implicitly accepts that server instance
sizes are necessary. They aren't. They are a vendor construct and
don't exist in the underlying technology. They make cloud computing
more restrictive in some respects than dedicated hardware. It's a
vendor-imposed problem, not a cloud computing problem.

Pricing should be by each individual resource, by unit, purchased
independently. You can then buy the aggregate resources you need and
combine them in ways that fit YOUR requirements. To me it seems
self-evident that this is a superior structure from a customer's
perspective. So two things need to happen: first, transparent pricing
by resource, and second, the elimination of server instance sizes so
resources can be combined in a fluid way. That's exactly how we
approach IaaS. You can check out our pricing at
http://www.cloudsigma.com/en/pricing/price-schedules?t=1
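As a sketch of what per-resource, unbundled pricing looks like from the buyer's side (the unit prices below are made up for illustration; they are not CloudSigma's actual rates):

```python
# Hypothetical per-unit hourly prices for independently purchased resources.
UNIT_PRICES = {
    "cpu_ghz": 0.010,   # per GHz-hour
    "ram_gb":  0.015,   # per GB-hour
    "ssd_gb":  0.0002,  # per GB-hour
}

def hourly_cost(config):
    """Bill for exactly the resources chosen; no instance sizes involved."""
    return sum(UNIT_PRICES[resource] * amount
               for resource, amount in config.items())

# A fluid configuration: 2.5 GHz of CPU, 3 GB RAM, 40 GB storage.
cost = hourly_cost({"cpu_ghz": 2.5, "ram_gb": 3, "ssd_gb": 40})
```

Any ratio of CPU to RAM to storage is purchasable, which is precisely what fixed instance sizes rule out.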

3. Standardized multi-vendor cloud configurations.
I'm not sure exactly what you mean by this; I'm guessing standard
server instance sizes? I'd argue this isn't necessary: if you have
pricing by unit, you already have a transparent comparison mechanism.
With unbundled resources you can make straight, instant comparisons
across platforms.


So, if this transparent pricing were adopted by other vendors (you
know who you are!), you'd be halfway to actually understanding what
you are getting for your money. Halfway? Yes, because all is not equal
among cloud vendors. CPU can be heavily contended, and storage and
networking performance can vary enormously. I'd use the analogy of a
car engine: if I told you I had a 2-litre engine, could you tell me
how much horsepower my car has? Of course not, and it's the same with
a cloud vendor. It's not just about the capacity a customer is
supposedly getting for their money; the architecture, hardware, and
contention behind the platform determine how much computing you'll
actually get done with that capacity. So the second component is
benchmarking that reveals the performance and real computing
throughput behind the figures.
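The benchmarking point can be made concrete: what matters is measured throughput per unit of spend, not nominal capacity. A minimal sketch, with all figures hypothetical:

```python
def throughput_per_dollar(ops_per_second, price_per_hour):
    """Benchmarked operations completed per dollar spent."""
    return (ops_per_second * 3600) / price_per_hour

# The nominally faster, pricier platform loses on work done per dollar.
fast_pricey = throughput_per_dollar(ops_per_second=900, price_per_hour=0.30)
slow_cheap  = throughput_per_dollar(ops_per_second=700, price_per_hour=0.20)
```

The `ops_per_second` input must come from running your own workload on each platform; a vendor's stated capacity cannot supply it.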

I'd argue that performance transparency is just as critical as price
transparency, and the two go together. I wrote a blog post about this
recently (see http://www.cloudsigma.com/en/blog/2010/08/28/9-cloud-server-benchmarking)
and another just last week on the importance of server monitoring in
the cloud (see http://www.cloudsigma.com/en/blog/2010/10/24/12-cloud-server-monitoring).

Thanks for stimulating the debate!

Best wishes,

Robert

--
Robert Jenkins
CloudSigma
http://www.cloudsigma.com/
http://www.twitter.com/CloudSigma


Daniel Berninger

unread,
Oct 30, 2010, 12:55:49 PM10/30/10
to cloud-c...@googlegroups.com
Hello Robert,

Thanks for your comments.

I am glad you agree existing pricing models obscure the value
proposition of cloud IaaS offers.

We also agree the ECU is presently meaningless.

The call for embracing the ECU is also a call to establish a consensus
measure of the ECU (with or without Amazon).

The existing consensus measures for memory (GB), storage (TB), and
bandwidth (GB) mean we are relatively close to the transparency you
describe.

I raised the issue of instance definitions and a single comprehensive
index only as a matter of preference.

The Cloud Price Normalization index (cloudpricecalculator.com) offers
buyers a means of direct comparison, but the need for a consensus
definition of ECU is the primary issue.

Amazon dominates the confused status quo of IaaS, but even EC2 is tiny
compared to the overall consumption of on premise compute resources.

This will not change until the price performance of cloud improves.

The continuous improvement cycle that serves as the basis for all
infotech growth requires actually knowing how to measure price
performance.

Regards,

Dan


Miha Ahronovitz

unread,
Oct 30, 2010, 1:35:45 PM10/30/10
to cloud-c...@googlegroups.com, cloudsigma
Robert, good reply. Compute utility is no different from any other
purchase: there is no replacement for research when comparing
offerings. The creation of a standardized unit like the ECU is an old
wish; see the idea of the Sun Unit of Compute Power, initially
proposed in 2003:
http://bit.ly/dCdFB8

At the time we thought that by being early we would get the unit
adopted. That was before Amazon gained real strength. Today, the
aspiration to a single unit of compute utility has proven utopian.

Here is a blog post comparing CloudSigma and Amazon offerings:
http://bit.ly/cWySC0
Assuming we had such a unit, how could it be applied to a Cluster
Compute Instance, which is arbitrarily defined by Amazon and sure to
change at any time?

Cheers,

Miha


Sassa

unread,
Nov 1, 2010, 8:53:53 AM11/1/10
to Cloud Computing
Thanks, Rob!


On Oct 30, 10:41 am, cloudsigma <rob...@cloudsigma.com> wrote:
...
> 3. Standardized multi-vendor cloud configurations.
> I'm not sure exactly what you mean by this, I'm guessing server
> instance size? I'd argue this isn't necessary, if you have pricing by
> unit then you have a transparent comparison mechanism. You can make
> comparisons with pricing by resource only but with unbundled resources
> you can make straight instant comparisons across platforms.
>
> So, if you had this transparent pricing adopted by other vendors (you
> know who you are!) then you'd be half way to actually understanding
> what you are getting for your money. Halfway? Yes that's right because
> all is not equal among cloud vendors. CPU can be heavily contended,
> storage performance can vary enormously as can networking performance.

Golden words! This has been the pebble in my shoe all this time.

It's not the computing time, and not the GHz, that I am paying for. I
am paying for a service that *says* GHz on the tin.

But:

* do you count the time my threads weren't scheduled because of
contention between themselves?
* do you count the time my threads weren't scheduled because of
contention between the guests?
* is SPARC GHz comparable to x86 GHz? (is Westmere GHz comparable to
Celeron GHz?)

...* is the promise of 99.999% discernible from 95.995%? (how long
does it take to discern them? I don't want my business to be a
benchmark of that promise)


Essentially: how much of the offer is consolidation, and how much of
it is performance? If it is tilted towards consolidation, you
shouldn't count on the GHz on the tin (but the price is very
competitive). If it is tilted towards performance, how much monetary
gain do you get compared to in-house infrastructure?



> I'd use the analogy of a car engine. If I told you I had a 2 litre
> engine, could you tell me how much horse power my car has? Of course
> not and its the same with a cloud vendor. Its not just about the
> capacity a customer is supposedly getting for their money, its the
> architecture, hardware and contention that the cloud vendor uses that
> determines how much computing you'll get done for that amount of
> capacity. So, the second component is benchmarking that reveals
> performance and real computing throughput behind the figures.
>
> I'd argue that performance transparency is just as critical as price
> transparency and they go together. I myself wrote a blog post about
> this recently (seehttp://www.cloudsigma.com/en/blog/2010/08/28/9-cloud-server-benchmarking
> ) and just last week on the important of server monitoring in the
> cloud (seehttp://www.cloudsigma.com/en/blog/2010/10/24/12-cloud-server-monitoring
> ).

I'd go further: it isn't even an argument. It is a given, which
customers simply don't have an easy way to gauge.



Sassa




cloudsigma

unread,
Nov 1, 2010, 5:00:44 AM11/1/10
to Cloud Computing
Dear Dan and Miha,

Thanks for the additional information which is most interesting.

In terms of the goal of creating some standardised unit of
measurement, in my opinion this misses the point. There is no standard
computing use. Computing is heterogeneous, and to think that one
measurement is directly relevant to all computing is erroneous. There
isn't one definitive benchmark for measuring performance; a system
that performs very well for one use may be terrible for another.
That's why comparisons of a general form are correct in general and
wrong specifically, for practically everyone :-)

Just to expand on two points:

BUNDLING
Resource bundling undermines comparisons such as the one at
cloudpricecalculator.com that Dan pointed to. Amazon over-bundles
storage to quite a silly degree, in my opinion; in my discussions I've
not met anyone who comes close to using the amount of storage Amazon
bundles. Is Amazon just being generous? Of course not. They know it
makes their 'offer' look much better than it really is, especially
when compared with vendors that don't bundle. They force users to take
too much storage. Stack that up against vendors that don't bundle: the
per-GB price of those other vendors is HIGHER, but the comparison
assumes full utilisation. If you compare what you actually NEED
against the cost of whatever is provided (disregarding the extra
resources), the cost at Amazon tends to be higher, not lower. It's a
very clever distortion mechanism, and the reason Amazon ranks pretty
high on your list for some instance sizes despite the well-documented
performance issues faced by many users who put heavy computing loads
on that platform.
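The distortion described above is easy to see in numbers (all figures below are hypothetical):

```python
# A bundle with far more storage than anyone needs makes the headline
# per-GB rate look low, while the bill for the actual need is higher.
need_gb = 50                          # storage actually required
bundle_price, bundle_gb = 30.0, 850   # over-bundled offer: pay for all of it
unbundled_per_gb = 0.10               # unbundled vendor's "higher" rate

headline_rate = bundle_price / bundle_gb        # ~0.035/GB: looks cheap
bill_bundled = bundle_price                     # but you pay for all 850 GB
bill_unbundled = need_gb * unbundled_per_gb     # pay only for the 50 GB used
```

The headline per-GB comparison and the actual-bill comparison point in opposite directions, which is the whole trick.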

DIFFERENTIATING PERFORMANCE FACTORS: QUALITY LEVEL
Performance can be defined in terms of the various resources made
available to cloud users namely CPU, RAM, Storage and data transfer.

CPU and RAM performance is basically a commodity nowadays. At least
90% of the performance difference between cloud platforms relates to
contention. Platforms with high contention will have worse performance
on average and higher variance in performance. In fact, the variance
of performance is just as important as the absolute level; I'd take
slightly lower but reliable performance over wildly fluctuating
performance any day. Again, comparisons of the sort outlined don't
have the sampling frequency to capture variability of performance.
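Variance is easy to quantify alongside the mean, for example via the coefficient of variation over repeated benchmark runs (the sample scores below are invented):

```python
from statistics import mean, pstdev

def performance_profile(samples):
    """Mean and coefficient of variation of repeated benchmark scores."""
    m = mean(samples)
    return m, pstdev(samples) / m

stable_mean, stable_cv = performance_profile([95, 96, 94, 95, 96])
erratic_mean, erratic_cv = performance_profile([140, 60, 150, 55, 95])
# The erratic platform has the higher mean (100 vs 95.2) but a far
# worse coefficient of variation; a one-shot benchmark sees only the
# mean and misses exactly this.
```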

Real performance differences come from storage in particular and, to a
lesser extent, networking (data transfer). Firstly, storage in the
cloud can be persistent (like ours) or ephemeral (like Amazon's
standard instance storage). Comparisons that mix persistent and
ephemeral storage are fundamentally incorrect; they are two totally
different products. Why? Non-persistent storage is often kept without
RAID, since it doesn't need to be persistent or robust. That can boost
performance, but at the expense of robustness. So storage is in itself
a complicated, multi-faceted resource and a trade-off between various
factors. You also have additional options such as encryption (in our
case) which further complicate matters. Finally, what about data
lock-in? Cheap storage is not so great if you can't get at your data
effectively.

Moving on to networking performance, again we are looking at a
multi-faceted beast. What is the actual networking performance of the
cloud servers? Is that important to you as a customer? Is the
bandwidth (distinct from data transfer) throttled? If so, at what
level?


The biggest fundamental differences between cloud platforms, as I've
said, relate to storage and networking performance. There are big
differences in CPU and RAM too, but these relate largely to
vendor-specific contention ratios, which each vendor can set
arbitrarily.

There are also reliability issues, another factor directly implicated
by any price/performance measure. How? A vendor without much
redundancy in a cheap data centre can show high price/performance
because they can lower their costs a great deal. Those lower prices
come at the expense of reliability and availability. I'd argue that's
a variable that should definitely be factored in for most customers.

To summarise, the idea of comparing the performance of different
clouds in some generalised form is (without wishing to sound too
geeky) like trying to solve an equation with too many unknowns. It's
insoluble without making assumptions about variables or ignoring them
altogether, and the end result becomes of little use. To illustrate my
point, here are just some of the variables missing from any ECU-type
comparison attempt I've seen:

- variability of service
- quality of service (storage robustness, storage type, networking
quality, networking speed etc.)
- price of actual resources consumed not bundled combinations
- billing granularity
- currency fluctuations
- vendor lock-in
- green/environmental impact considerations
- redundancy of the platform
- jurisdiction/ legal considerations
- SLA offered

Some of the above are not strictly price/performance variables, but a
good number of them are. I've yet to see any comparison come close to
capturing even a small number of those vital aspects. Sorry to sound
so negative, but I think you are trying to create something that can't
be accurate, because it's just too complicated an optimisation and the
'measure' of performance will differ between users. Don't forget: once
you set any sort of arbitrary comparison, it's easy for a vendor to
optimise for it and 'perform' well. That doesn't translate into actual
real-world performance or value for money for customers, for the
reasons set out above.
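The last point, that any fixed comparison bakes in one user's weighting, can be demonstrated directly. With invented scores for two vendors and two workload profiles, the ranking flips depending on who is asking:

```python
def score(metrics, weights):
    """Weighted aggregate of per-resource scores (all values invented)."""
    return sum(metrics[k] * weights[k] for k in weights)

vendor_x = {"cpu": 0.9, "storage_io": 0.4, "reliability": 0.6}
vendor_y = {"cpu": 0.6, "storage_io": 0.9, "reliability": 0.8}

batch_user = {"cpu": 0.8, "storage_io": 0.1, "reliability": 0.1}  # compute-heavy
db_user    = {"cpu": 0.1, "storage_io": 0.6, "reliability": 0.3}  # I/O-heavy

# For the batch user vendor_x wins; for the database user vendor_y
# wins. No single league table serves both.
```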

Kind regards,

Robert

--
Robert Jenkins
CloudSigma AG
http://www.cloudsigma.com/blog
http://www.twitter.com/CloudSigma


Daniel Berninger

unread,
Nov 1, 2010, 10:14:51 AM11/1/10
to cloud-c...@googlegroups.com
Hello,

Apologies in advance for continuing this thread....

I think we are making the issue more complicated than necessary.

The issue is not unique to cloud computing.

The electric utility industry could not exist if utilities did not
provide customers a measure of what they are receiving.

Would people be willing to buy gasoline from a gas station that does
not give them a reliable metric to know what they are buying?

Does it make sense for different gas stations to use different metrics?

Does it seem likely customers would be happy if the cloud computing
industry replaced GB as the measure of memory, TB as the measure of
storage, or GB as the measure of bandwidth with some vague
abstraction, as Amazon did with the ECU for compute resources?

Customers can buy on-premise computer equipment without a compute
metric because the unit of compute is the processor itself. This is
not the case with cloud computing where processor resources get sliced
and diced.

The issue here is fundamental and urgent. The cloud industry will not
move beyond experimentation if there is no consensus measure of what
we are selling.

Regards,

Dan

........................................
Daniel Berninger
President, goCipher Software

e: dbern...@gocipher.com

w: www.cloudpricecalculator.com/blog



cloudsigma

unread,
Nov 1, 2010, 12:39:35 PM11/1/10
to Cloud Computing
Dear Sassa,

Just to pick up on a couple of points you raise:

* do you count the time my threads weren't scheduled because of
contention between themselves?
Actually this depends on the virtualisation used. We use KVM, which in
our opinion is the truest virtualisation hypervisor out there. Why?
Because KVM virtualises the cores themselves in a way that Xen and
others just don't. So on our platform there isn't a round-robin style
distribution of computing jobs; they run concurrently with the
computing power you've purchased. You'll only see queuing on our
system if it's completely overloaded, and we prevent that by not
contending resources in the heavy way other platforms do. We
effectively manage server load so it never exceeds the number of
cores, i.e. there isn't any waiting time for threads to start
processing.

We even expose power tools, such as the ability for a user to specify
the number of cores their computing is distributed across. This can
have a major performance-enhancing effect for some applications,
particularly if you match the core count to the thread count the
software is optimised for. Again, I haven't really seen this on other
major platforms.

* is SPARC GHz comparable to x86 GHz? (is Westmere GHz comparable to
Celeron GHz?)
I don't have sufficient knowledge to comment on SPARC, and we don't
use them at all. In terms of Intel versus AMD, we don't see big
differences in performance. The two biggest determinants of
performance are the CPU contention ratio (vendor determined) and the
number of cores used (usually vendor determined; in our case, an
option for the user to determine).

...* is the promise of 99.999% discernible from 95.995%? (how long
does it take to discern them? I don't want my business to be a
benchmark of that promise)
If you are talking about availability, then yes. An availability of
95.995% is very low: over the course of a year it amounts to a full
350.84 hours of downtime, versus 0.0876 hours at 99.999%. A rather
large difference! A vendor slipping below even 99.5% will be pretty
obvious to most customers. We encourage all users to use server
monitoring to keep tabs on their infrastructure.
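Those downtime figures follow from simple arithmetic over an 8,760-hour year:

```python
HOURS_PER_YEAR = 365 * 24  # 8760

def downtime_hours(availability_percent):
    """Annual downtime implied by a stated availability percentage."""
    return HOURS_PER_YEAR * (100 - availability_percent) / 100

low = downtime_hours(95.995)   # ~350.84 hours per year
high = downtime_hours(99.999)  # ~0.0876 hours, about 5 minutes
```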

Kind regards,

Robert

--
Robert Jenkins
CloudSigma
http://www.cloudsigma.com/blog
http://www.twitter.com/CloudSigma

Amy Wohl

unread,
Nov 1, 2010, 2:56:50 PM11/1/10
to cloud-c...@googlegroups.com
I agree. Customers are satisfied with services like Amazon because there is
some deliverable metric (although I think it's pretty complicated to figure
out what exactly it is). To them, the metric is the work they got done at a
price point.

But to process complex, multiple transactions across multiple clouds, we
are going to need some agreed-upon metrics. There is work going on to
define these metrics, along with the (related) work on interoperability,
but it hasn't gotten very far yet, partly because we're at such an early
market stage and the provider market is fragmented (and may stay that way
for some time).

Perhaps we can figure out ways to measure what customers are getting?

Amy Wohl

Amy D. Wohl
Editor, Amy Wohl's Opinions
1954 Birchwood Park Drive North
Cherry Hill, NJ 08003
856-874-4034
a...@wohl.com
www.wohl.com


Sassa

unread,
Nov 1, 2010, 6:59:06 PM11/1/10
to Cloud Computing
It's good you mentioned the gasoline analogy.

I've noticed that my car runs longer on the same 95-octane petrol
bought from BP than on the same 95 from ASDA (your Walmart), and the
difference is more than the price difference.

I don't want to be in the position of buying seemingly identical GHz
from two providers and finding out empirically that there is something
else in the mix they didn't tell me about.

Sassa

Jim Starkey

unread,
Nov 1, 2010, 10:53:31 PM11/1/10
to cloud-c...@googlegroups.com
Ladies and gentlemen, may I introduce a little bit of technical insight?

The clock rating of a processor is next to meaningless. It can be used
to compare the relative speeds of two processors in the same family, but
nothing more. Clock speed can't be used to compare two processors from
different families from the same foundry, let alone different families
from different manufacturers.

There are a dozen or more variables that come into the mix. First is
super-scalar performance: how many instructions get executed per clock
cycle. This depends on the instruction mix (fixed vs. floating point),
branch prediction, length of the pipeline, cache hit rate, and gobs of
other factors. Then there are cache contention issues. AMD used to
wipe the floor with Intel's front-side bus until Intel got a lot
smarter. And the feature list goes on and on.

Then there are the number and speed of memory channels, memory speed,
cache speed, etc. All go into the mix.

Making the issue even more complex, if possible, is that nobody throws
out last year's processors to be stylish. Processors are kept in play
until either the leasing terms or energy consumption leads to retirement.

Comparing architectures as different as Intel and SPARC is even more
hopeless. Later SPARCs were big on what Sun called hardware threads
(closer to what CDC called barrel processors than to cores), so a
production job architected to run on SPARC would have a performance
advantage that would not generalize.

It's hopeless, ladies and gentlemen. All you can do is run your
application mix and measure the results. The only alternative is bogo-mips.
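Running your own mix and measuring needs nothing fancier than a wall-clock timer. A minimal Python harness (the workload shown is a stand-in for your real application):

```python
import time

def measure(workload, repeats=5):
    """Time a callable several times and keep the best run,
    which damps scheduler and contention noise."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        best = min(best, time.perf_counter() - start)
    return best

# Stand-in workload; substitute your actual application mix.
elapsed = measure(lambda: sum(i * i for i in range(100_000)))
```

Run the same harness with the same workload on each candidate platform, and the numbers are comparable in a way GHz never is.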



Sassa

unread,
Nov 2, 2010, 5:25:07 AM11/2/10
to Cloud Computing
Hi Robert,


On Nov 1, 5:39 pm, cloudsigma <rob...@cloudsigma.com> wrote:
> Dear Sassa,
>
> Just to pick up on a couple of points you raise:
>
> * do you count the time my threads weren't scheduled because of
> contention between themselves?
> Actually this depends on the virtualisation used. We use KVM which is
> in our opinion the truest virtualisation hypervisor out there. Why?
> Because KVM virtualises the cores themselves in a way that Xen and
> others just don't. So, on our platform there isn't a round robin style
> distribution of computing jobs, they run concurrently with with the
> relevant computing power you've purchased. You'll only see queuing on
> our system if its completely overloaded and we prevent that by not
> contending resources in the heavy way of other platforms. We
> effectively manage server load so it never exceeds the number of cores
> i.e. there isn't any waiting time for threads to start processing.

Hmmmm.... This doesn't quite get through to me just yet.

1. I bought 1 core, spawned 200 threads, all runnable. How much do I
pay for? (I think, 1 core)
2. I bought 16 cores, spawned 200 threads, 2 are runnable. How much do
I pay for? How many other guests do I share the 16 cores with?



> We even expose power tools such as the ability for a user to specify
> the number of cores they want their computing distributed across. This
> can have a major performance enhancing effect for some applications
> particularly if you match the cores to the thread number the software
> is optimised for. Again, I haven't really seen this on other major
> platforms.
>
> * is SPARC GHz comparable to x86 GHz? (is Westmere GHz comparable to
> Celeron GHz?)
> I don't have sufficient knowledge to comment on SPARC and we don't use
> them at all. In terms of Intel versus AMD, we don't see big
> differences in performance. The two biggest determinants of
> performance are CPU contention ratio (vendor determined) and number of
> cores used (usually vendor determined, in our case option for user to
> determine).

OK, SPARC was just an example of a completely different architecture.


> ...* is the promise of 99.999% discernible from 95.995%? (how long
> does it take to discern them? I don't want my business to be a
> benchmark of that promise)
> If you are talking about availability then yes. An availability of
> 95.995% is very low. Over the course of a year, the lower figure means
> a full 350.84 hours of downtime versus 0.0876 hours of downtime. A
> rather large difference! A vendor slipping below even 99.5% will be
> pretty obvious to most customers. We encourage all users to use server
> monitoring to keep tabs on their infrastructure.

Well... non-availability is the extreme of non-performance. (i.e. an
observer detects non-availability by noticing the response time is
higher than a threshold)

If my app is unavailable for as much as 100 microseconds in every
second because of the presence of other guests, that already
affects 0.01% of availability. I guess in a broader sense the
question is: can someone claim an app is 99.999% available if it is
0.5% less performant? You may call this distinction quite academic,
but it stops being academic when you consider that no one draws a
clear distinction between the two; availability is readily claimed by
all vendors, but not the related performance guarantee - which goes
back to the original question of GHz vs GHz.
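For reference, the downtime arithmetic quoted earlier in the thread, and the 100-microseconds-per-second example, can be checked in a few lines (Python; purely illustrative arithmetic):

```python
# Back-of-the-envelope downtime math for the availability figures in the thread.

HOURS_PER_YEAR = 365 * 24  # 8760

def downtime_hours_per_year(availability_pct):
    """Hours of downtime per year implied by an availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)

print(downtime_hours_per_year(95.995))  # ~350.84 hours/year
print(downtime_hours_per_year(99.999))  # ~0.0876 hours/year

# The 100-microseconds-per-second example, as a fraction of availability:
micro_outage = 100e-6 / 1.0
print(micro_outage * 100)  # ~0.01 percent -- already ten times the
                           # "five nines" budget of 0.001%
```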


Sassa

> Kind regards,
>
> Robert
>
> --
> Robert Jenkins
> CloudSigma
> http://www.cloudsigma.com/blog
> http://www.twitter.com/CloudSigma

Paul Robinson

unread,
Nov 2, 2010, 10:06:47 AM11/2/10
to cloud-c...@googlegroups.com
On 2 Nov 2010, at 02:53, Jim Starkey wrote:

> The clock rating of a processor is next to meaningless.


Which is why nobody in the industry has used it for anything beyond an identifying label for at least a decade and a half now! :-)

All of your points stand about the variables in the mix. Which brings us to:


> It's hopeless, ladies and gentlemen. All you can do is run your application mix and measure the results. The only alternative is bogo-mips.


Ahhh, benchmarking.

Anybody else remember when vendors would spend all their time trying to tweak their configs specifically for benchmarking loads, so the scores would be high but the real load performance might be completely different?

I remember working at an ISP where we were being benchmarked against other ISPs and our bonus payments were linked to relative performance: we ended up doing things like authenticating you before the user/pass was sent down the line, then dropping the line after the fact if it failed, to shave an extra fraction of a second off authentication. We would spot inbound calls from testing modems by their CLIs and route them to quieter modem racks. We would traffic shape. We'd do anything! And it worked: we got the best marks in the UK.

That said, a benchmark of what is meant by a "large instance" for various workloads would be useful in helping to compare offerings from rival cloud providers.

A benchmarking system broken down into a range of common application configurations, making compute units more comparable would be quite desirable I think.

If I see an Amazon instance is benchmarked independently at 200 reqs/second for a common application configuration for a popular blogging tool, and a competing option that is 10% more expensive is rated at 300 reqs/second, I can make an assessment of "value".

If I can find x prime factors per week with one compute instance, and the same software on a competitor instance gets me 15% more but costs 12% more as well, again I can make an assessment of value.

Competitors can then distinguish themselves: "For high RAM and CPU workloads we're 8% cheaper"; "For blogging applications, you get 25% more requests per dollar than with the competition", etc. and can tailor their environments accordingly. Some players will come out on top for HPC applications. Others for data processing workloads, others again for web applications.
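The "value" assessments sketched above boil down to throughput per unit of spend; here is a minimal illustration using the post's hypothetical numbers (the $1.00/hour base price is an assumption invented for the example):

```python
# Requests-per-dollar comparison using the hypothetical numbers above.

def value(reqs_per_sec, price_per_hour):
    """Benchmarked throughput per unit of spend -- higher is better."""
    return reqs_per_sec / price_per_hour

amazon = value(200, 1.00)  # 200 req/s at an assumed $1.00/hour
rival  = value(300, 1.10)  # 300 req/s, 10% more expensive

print(rival / amazon)  # ~1.36 -> roughly 36% more requests per dollar
```

Note the 10%-more-expensive offering still wins on this metric because its throughput advantage (50%) outpaces its price premium.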

All we need to do is find a way to get it funded - would companies pay for this information do you think?

Paul Robinson

Sassa

unread,
Nov 2, 2010, 7:17:55 PM11/2/10
to Cloud Computing
Yes.

And measuring it isn't free, either. Some "30-day trial" may be
enough for a start-up, but a bigger fish would need to invest more
man-hours just to work out if they get what they need.

It'd be nice to hear how to accurately sell the latency from sharing
the system with the other guests (=the latency you can't tweak by
customer's personal engineering effort). I mean, in the private cloud
you have extra visibility outside the virtualized environment, and
leverage to adjust it to your business needs. I am not sure how the
same can be done in a "public" cloud - a cloud with a strategy to
maximize profit, which doesn't necessarily mean meeting 100% of my
individual business goal.


Sassa

cloudsigma

unread,
Nov 3, 2010, 5:50:41 AM11/3/10
to Cloud Computing
Dear Paul,

I totally agree with you. Benchmarks can be gamed quite easily and
there is no replacement for real world use. I agree that a raft of
scenarios could be tested regularly and repeated frequently to also
give information on variance etc. In terms of cost, as a vendor we'd
happily participate and provide the capacity for free and I know other
vendors that would do so also. As such the cost of such an operation
would come largely down to an investment of time.

Kind regards,

Robert

--
Robert Jenkins
CloudSigma
http://www.cloudsigma.com

Yuan

unread,
Nov 3, 2010, 5:47:24 PM11/3/10
to Cloud Computing
Sassa,

My feeling is that the "visibility wall" in public cloud offering is
just a matter of market condition. If enough customers demand that
extra visibility, cloud business will come up with a framework to
provide it. (There will always be this chicken and egg race, I know.)

On a different topic, I agree that non-availability is an extreme case
of underperformance. But the extremity is worth a distinction in many
business applications, e.g., virtual desktop for my mobile work
force.  In other applications where user experience is critical - for
example, Amazon.com may stand to lose $2M (or 3% of its daily sales)
if site performance is slowed by 25% for 24 hours - a quantitative
approach should apply.  Let's say an EC2 SLA should guarantee me X
zillion Lintel cycles per day, among other things; alternatively, a
guarantee of N zillion Lintel cycles per hour at any given time (akin
to peak throughput) - or simply use throughput as the ECU.  If I'm
receiving x * X zillion cycles that day where x < 1, a y * (1 - x)
penalty should be assessed.  In the fictitious Amazon.com example
above, it would be reasonable to make y = 15%, where 12% = 3%/25% is
my financial loss and 3% is a penalty.
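As a sketch of the proposed penalty formula (all numbers are the fictitious ones above; `penalty` is a hypothetical helper, not any real SLA machinery):

```python
# Sketch of the penalty formula proposed above: receive x*X cycles (x < 1),
# pay a y*(1 - x) penalty. Numbers are the fictitious ones from the post.

def penalty(x, y):
    """Penalty fraction assessed when only a fraction x of promised cycles
    is delivered, given a negotiated penalty rate y."""
    return y * (1 - x)

# Fictitious example: a 25% cycle reduction means x = 0.75, and
# y = 15% = 12% financial loss (3% sales / 25% reduction) + 3% punitive.
y = 0.12 + 0.03
print(penalty(0.75, y))  # ~0.0375 -> a 3.75% penalty assessed for that day
```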

My cent.

Yuan Liu

Sassa

unread,
Nov 5, 2010, 6:26:33 AM11/5/10
to Cloud Computing
On Nov 3, 9:47 pm, Yuan <yuans...@gmail.com> wrote:
> Sassa,
>
> My feeling is that the "visibility wall" in public cloud offering is
> just a matter of market condition.  If enough customers demand that
> extra visibility, cloud business will come up with a framework to
> provide it. (There will always be this chicken and egg race, I know.)

No, that's not it. They'll do it if this feature improves their
profit.


> On a different topic, I agree that non-availability is an extreme case
> of underperformance.  But the extremity is worth a distinction in many
> business applications, e.g., virtual desktop for my mobile work

Yes, but do they make this distinction?

Losing 1 hour in one accident in 5 months has the same impact on
throughput as losing 1 second every hour. You'd need a completely
different granularity of diagnostics and monitoring tools to catch the
latter, though. At which point does the service become mis-sold?
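The equivalence is easy to verify (assuming 30-day months for the sake of the arithmetic):

```python
# Sanity check of the equivalence claimed above (assuming 30-day months).

five_months_hours = 5 * 30 * 24       # 3600 hours
one_accident = 1 / five_months_hours  # 1 lost hour in 5 months
steady_drip  = 1 / 3600               # 1 lost second in every hour (3600 s)

print(one_accident == steady_drip)  # True -- identical throughput impact,
# but detecting the steady drip needs far finer monitoring granularity
```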

So you'd need to link the following:

* promise to your needs
* real throughput to the promise
* SLA to real failure recovery

If you run a benchmark, you'd still need to link the benchmark to the
real life.


> force.  In other applications where user experience is critical - for
> example, Amazon.com may stand to lose $2M if site performance is
> slowed by 25% for 24 hours (or 3% of its daily sales), a quantitative
> approach should apply.  Let's say an EC2 SLA should guarantee me X
> zillion Lintel cycles per day, among other things; alternatively, a
> guarantee of N zillion Lintel cycles per hour at any given time (akin
> to peak throughput) - or simply use throughput as ECU.  If I'm
> receiving x * X zillion cycles that day where x<1, a y * (1 - x)
> penalty should be assessed.  In the fictitious Amazon.com example
> above, it would be reasonable to make y = 15%, in which 12%=3%/25% is
> my financial loss, 3% is a penalty.

How many zillion cycles does your app need? (how do you know you need
X zillion cycles?)


Sassa

John Gilda

unread,
Nov 5, 2010, 3:58:33 PM11/5/10
to Cloud Computing
A cloud computing price war is inevitable but we are not there yet.

  • At least for IaaS, we are talking outsourcing of capacity. Price obfuscation is as old as outsourcing. It is a tactic to make it difficult to compare between competing offers and to make it difficult to compare with customers' internal cost structures. The big players want people to buy on brand not price. The challengers may want the topic to focus on price performance but will face a blizzard of competing metrics that give people headaches.
  • Given the variability of workload requirements, infrastructure or gear metrics will not tell the workload and user experience story. For firms with hybrid infrastructure models, application performance monitoring will become more important to service delivery and understanding providers' true performance.
  • How important is price/performance? Most of the first generation offerings are being built on similar architectures and will perform in a narrow band. Significant early adopters should be more concerned with viability of the company, business continuity and disaster recovery, proximity to users, and ability to exit post haste at low cost and effort, than with nominal price/performance differences.
  • If you put big chunks of capacity into the cloud, your rates are going to be negotiated and will have little relationship to retail rate/performance cards.
  • Expect there to be consolidation and pricing pressure as some misjudge the market, overbuild capacity, then panic - though it may take more than a year to get to that point given current cloud optimism. Remember the consolidation in the hosting world in 2002, where folks like Exodus got ahead of the curve and LoudCloud got dismantled. A portion of that was external economics, but there was general over-exuberance about how much hosting capacity was really needed.
  • There may be a battle of the titans in the top tier, depending on how much the folks sitting on piles of cash want to invest in buying market share when adoption hits the tipping point - but they are smart enough to wait until they see the surge in dollars flowing.
  • Despite the holy grail of elasticity, there will be investment units that affect capacity, such as data centers, containers, or other units. Folks with vacant, paid-for capacity will be more aggressive price-wise than those that still have an investment to make or are waiting for a DC to come online or the next container of compute to hit the dock. This could give rise to very interesting marketplaces that look like commodity trading. For that to happen, standardized units will be required.

j


John Gilda




Yuan

unread,
Nov 8, 2010, 2:58:06 AM11/8/10
to Cloud Computing
> How many zillion cycles does your app need? (how do you know you need
> X zillion cycles?)

No need to calculate.  It's all a matter of SLA.  If I'm buying 3.6
billion cycles during any hour, the provider should give me 3.6 billion
cycles.  I may be under-capacitated already.  I could be losing $1M a
day already.  But this doesn't matter.  What matters is that the
provider must not reduce this 3.6 billion by 10%.  Even the fictitious
3% sales loss on a 25% cycle reduction doesn't matter; it's just a
financial term I negotiate. (There needs to be some bearing on reality
for both sides to accept, of course.)

These are grid computing ideals that never materialised.

Yuan Liu

Jim Starkey

unread,
Nov 8, 2010, 9:37:00 AM11/8/10
to cloud-c...@googlegroups.com
Two questions.  How are you going to measure the number of available
cycles?  And what cloud providers offer an SLA that guarantees any
specific level of performance?


--
Jim Starkey
Founder, NimbusDB, Inc.
978 526-1376

Miha Ahronovitz

unread,
Nov 8, 2010, 1:43:53 PM11/8/10
to cloud-c...@googlegroups.com
Jim, Yuan, good comments and questions. We talked about it before. Usually what we pay should be based on something we can see and feel. An SLA is something users and cloud owners agree on.

One SLA implementation that I know well is SDM (Services Domain Manager) from Grid Engine. The documentation is here:
http://wikis.sun.com/display/gridengine62u6/Service+Domain+Manager

For easy reference, this is how the SLA implementation works:

The Service Domain Manager (SDM) module distributes resources between different services according to configurable Service Level Agreements (SLAs). The SLAs are based on Service Level Objectives (SLOs). In the context of SDM, a service is defined as a scalable and manageable software product that performs a specific function for its users. SDM functionality enables you to manage resources for all kind of scalable services. A scalable service is a service for which additional resources enable better performance.

SDM can manage services such as:

  • Grid Engine clusters. The first version of the SDM module supports multi-clustering of Grid Engine clusters.
  • Application servers
  • Reassignment of software licenses between different organizational entities of a company

A service uses Key Performance Indicators (KPIs) to determine its need for a resource. Those KPIs are reported through a Service Adapter that acts as a proxy between a service and the SDM core system. The SDM supports the concept of a spare pool which enables SDM to withdraw unneeded computational resources (hosts, which can be physical hosts or virtualized machines) from services. SDM also enables you to power off unneeded or underutilized machines to minimize the environmental impact of your data centers.

The SDM module is designed to handle different kinds of services that have in common a need for the same set of resources. The SDM module enables those services to share their resources effectively and increase resource utilization.
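A toy sketch of the SLO-driven spare-pool behaviour described above (this is NOT SDM's actual API; the class, function, and KPI model are invented purely to illustrate the idea of KPI-driven host reallocation):

```python
# Toy sketch of the SLO / spare-pool idea described above -- not the real
# SDM API, just an illustration of KPI-driven host (re)allocation.

class Service:
    def __init__(self, name, slo_response_s):
        self.name = name
        self.slo_response_s = slo_response_s  # Service Level Objective
        self.hosts = 1

    def kpi_response_s(self):
        # Placeholder KPI: response time improves as hosts are added
        # (a "scalable service" in SDM's terms).
        return 10.0 / self.hosts

def rebalance(service, spare_pool):
    """Move hosts between a service and the spare pool based on its SLO."""
    # Withdraw hosts from the spare pool while the SLO is being missed:
    while service.kpi_response_s() > service.slo_response_s and spare_pool:
        service.hosts += 1
        spare_pool.pop()
    # Return unneeded hosts (which could then be powered off):
    while service.hosts > 1 and (10.0 / (service.hosts - 1)) <= service.slo_response_s:
        service.hosts -= 1
        spare_pool.append("host")

spare = ["host"] * 8
svc = Service("batch-eda", slo_response_s=2.0)  # "response time under 2s"
rebalance(svc, spare)
print(svc.hosts, len(spare))  # 5 4 -- five hosts meet the SLO (10/5 = 2.0s)
```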


As Yuan wrote, this was a grid implementation that actually materialized. The problem is, can we use SDM in a public cloud? Amazon, RackSpace, CloudSigma (not sure, maybe Robert can clarify) work on the principle that I say how many instances I need and I work with them. If I have too much, or too little, that is the user's headache. There is no automated re-assigning of servers and instances depending on demand and on the SLAs we need to provide.

What should the user pay for under pay-per-use? Only for the services delivered according to the SLA. The SDM PDF documentation can be downloaded here:

http://wikis.sun.com/display/gridengine62u6/Service+Domain+Manager+Guide+%28Printable%29

I went sideways from the topic, but bottom line: in an ideal world, all I need to do is set an SLA and pay different prices for different SLAs. The price per cycle, per whatever geeky parameter, does not count. Then the cloud management - which users do not even know exists - will automatically adjust the resources inside the cloud so it can meet every SLA.

Utopia? No, we are very close to it. But no one has done it yet.

Miha

Greg Pfister

unread,
Nov 8, 2010, 5:02:22 PM11/8/10
to Cloud Computing
John,

A competent, knowledgeable position, very well stated. Thank you. I
agree completely.

Greg Pfister
http://perilsofparallel.blogspot.com/


cloudsigma

unread,
Nov 9, 2010, 4:53:32 AM11/9/10
to Cloud Computing
Dear all,

I just thought it was worth making a few observations at this point.
Firstly, don't confuse software with infrastructure. We don't, for
example, have any visibility inside our clients' cloud servers. For
SaaS you can/should get application performance SLAs; for IaaS you
can't, because it doesn't protrude into the software layer (if it's
proper IaaS, which is sadly quite a rare beast currently).

The second thing to point out is that you shouldn't mix the concept of
'on demand' with reserved capacity, because these two things are very
different concepts and as a result have very different prices. If you
are purchasing XX GHz of 24/7 capacity then that capacity is being
reserved for you and will be there when you use it. If you want to pay
for 'only what you use' in terms of cycles, you are effectively asking
for no-commitment on-demand pricing using CPU cycles as the metric.
That's possible, but actually it's going to be a lot more expensive,
because it's unpredictable for the vendor and comes without any commitment.
You can't do any of this for RAM in any case, because the RAM needs to
stay reserved for one server instance; otherwise you get instability,
memory errors etc. So really the only resource you can look at for
this is CPU. I agree that many vendors currently are playing a bit
fast and loose by not even giving users CPU information at all. It's
not a technological limitation, so they should be pushed by customers
to do so. Our 'on-demand' basically gives you a certain CPU capacity
for the billing cycle, which is 5 minutes. The reality is that most
customers know the majority of their computing need in advance, and so
it's cheaper for them to purchase it on subscription and then use on-demand
burst pricing for peaks or unexpected events.
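The subscription-plus-burst trade-off can be illustrated with made-up rates (both prices below are hypothetical, not CloudSigma's actual pricing):

```python
# Sketch of the subscription-plus-burst idea above, with made-up prices.

SUB_RATE   = 0.05  # $/GHz-hour on monthly subscription (hypothetical)
BURST_RATE = 0.09  # $/GHz-hour on 5-minute on-demand billing (hypothetical)

def monthly_cost(base_ghz, burst_ghz_hours):
    """Reserve the predictable baseline on subscription; buy peaks on demand."""
    return base_ghz * 24 * 30 * SUB_RATE + burst_ghz_hours * BURST_RATE

# A workload with a predictable 4 GHz baseline plus 50 GHz-hours of peaks:
all_on_demand = monthly_cost(0, 4 * 24 * 30 + 50)  # everything bursted
mixed         = monthly_cost(4, 50)                # baseline reserved

print(all_on_demand, mixed)  # ~263.7 vs ~148.5 -- mixed wins when the
                             # baseline demand is known in advance
```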

The final point is that, from our experience, it isn't problems with
pricing holding people back. We already price by resource in an
unbundled and flexible way. Changing this for CPU specifically to
cycles instead of reserved capacity (be that for 5 minutes or a month)
isn't that revolutionary. The whole premise is that somehow
customers are not getting the capacity they need. This simply isn't
the case in our experience. The way we work is that we proportionally
allocate CPU; often it's under 100% allocation of the physical server's
whole capacity, and users get proportionally more CPU than they are
actually paying for. We also allow users to control the number of
cores they are allocated. So a 2GHz server could use 1, 2 or 4
cores, for example. That's possible because we use KVM and virtualise
the cores themselves. Users can therefore optimise their server
configuration to fit the applications they are using and how those
support multi-threading.

We monitor the load on our machines; we have many heavy processing
companies doing all sorts of computing. To date I've never seen any of
our physical hosts max out on load, i.e. their load rising above 1
per core. That means things aren't being queued and people aren't
getting their computing delayed. We achieve this by careful resource
allocation on each physical host and by not being greedy and creating
heavy contention for CPU resources. As a result, we've never had a
customer not use our cloud due to performance or due to not getting the
resources they were paying for. Maybe resource contention is more of
an issue with other vendors; I don't use them so I couldn't tell
you :-) If so, then this comes down to more of a vendor issue than an
underlying cloud issue.

Best wishes,

Robert

--
Robert Jenkins
CloudSigma
http://www.cloudsigma.com/blog


Sassa

unread,
Nov 9, 2010, 7:51:10 AM11/9/10
to Cloud Computing
Well, I see that saying "it's a matter of SLA" is just a way to defer
the solution to someone else. You need to enforce the SLA; hence you
need the leverage to measure the real demand for your service and
relate it to the real service the provider delivers.

In order to know how much you lost, you need to know how much you
should be getting in the first place.


Sassa

Sassa

unread,
Nov 9, 2010, 7:50:02 AM11/9/10
to Cloud Computing
Yes, this sort of SLA will work in-house. Performance-oriented setups
will not overspend the resource, so a change in KPI is directly linked
to a change in demand.

...and here's how the size tuning happens:

* watch the KPI
* ask for more capacity/release capacity
* measure the KPI difference
* if the KPI didn't change (much), give the requested/claim the
released capacity back
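The tuning loop above might look like this in code (a sketch; `measure_kpi` stands in for whatever throughput or latency metric the service exposes):

```python
# Sketch of the size-tuning loop described above: probe with extra capacity,
# keep it only if the KPI actually moves. measure_kpi() is a stand-in for
# whatever metric the service exposes.

def tune(capacity, measure_kpi, step=1, threshold=0.05):
    """Ask for `step` more capacity; keep it only if the KPI improves by
    more than `threshold` (relative), else give the capacity back."""
    before = measure_kpi(capacity)
    after = measure_kpi(capacity + step)       # ask for more capacity
    if (after - before) / before > threshold:  # did the KPI change (much)?
        return capacity + step                 # keep the requested capacity
    return capacity                            # release it back

# Toy KPI that scales linearly up to 4 units of capacity, then saturates:
kpi = lambda c: min(c, 4) * 100.0

print(tune(2, kpi))  # 3 -- KPI improved 50%, keep the extra unit
print(tune(4, kpi))  # 4 -- KPI flat (saturated), hand the capacity back
```

The open questions in the post (how big a `step`, whether the flat KPI is saturation or interference from other guests) are exactly what this naive loop cannot distinguish.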

How much to ask for? (=How much of a difference to expect?)

Is there a promise of linear scalability, if my service scales
linearly in the house?

Is there a promise that the amount of resource utilized is exactly the
amount of resource I need? (=controlled, measurable, or no loss
through sharing of resources)

I.e. do I happen to ask for more resources when the KPI went down
because other guests are sharing my resources?


Sassa



Yuan

unread,
Nov 9, 2010, 2:31:32 PM11/9/10
to Cloud Computing
First of all, linear scalability is not part of an SLA.  Grid utility
_should_ be the sole factor.  I'm really glad to know Sun's grid
management abstraction - they are always on to something. (Thanks,
Miha.)  Too bad they don't sell enough of those.  Maybe too tied in
with Sun's own hardware?  But there must be others in the grid
industry offering high-level utility management.

How do I meet my scale needs under this utopia?  I pay non-linearly if
my application doesn't scale linearly.  This is the same with your
electricity bills, your gasoline bills, etc.  How much (penalty ratio,
I assume) to ask for?  Long before EC2 was born, Amazon was a
forerunner in assessing financial loss due to downtime, traffic
disruption, etc.  They actually published some case studies.  Numbers
I saw really only apply to the financial-transaction part of the business,
but they are an indication that such losses _can be_ assessed.  I vaguely
remember seeing some assessment of user-experience-related loss, too
(abandoned purchases, even abandoned searches, for example).  In each
vertical, there are always some analytical parameters you can derive.
Even in a brand new line of business, you can borrow other industries'
comparable elements.  The goal is not to establish an accurate
baseline, but to establish a convincing (and realistic) baseline.
Amazon's analysis convinced their internal business investment to "pay
up" for infrastructure, maintenance, etc.  Had their service
infrastructure been hosted externally, that would have been their
weapon to leverage a penalty from the hosting company.

Maybe it's worth clarifying that this "grid SLA" approach will only
apply to grid-type public clouds, to which EC2 approximates. There
are plenty of domain specific public clouds that have no grid
exposure. (I thought hard but couldn't exclude WebEx from today's
definition of cloud, however unwilling I am. Even Google Apps,
Azure, ... They don't really offer raw horse powers.) In this space,
Amazon has yet to meet serious competition. This is enough to explain
why they are not that into grid pricing and providing more
visibility. Traditional enterprise hosting services, though they are
making transformation into "cloud," are too far from Amazon's space.
The game of those providers, in my observation, is to create enough
obscurity in traditional business case presentation in order to avoid
exposure of the grid. (Physical servers can make a grid, too.) They
still have enough space to maneuver before disruptive forces - think
Loudcloud, sigh - change rules of the game.

As to how to calculate the number of cycles and other grid metrics: in a
shared environment, some of these are provided by a virtualisation
framework such as VMware. (Methinks a cycles gauge used to be a staple
in first-gen cloud - aka time-share.  No?)  Sun's suite is an example
of a 3rd-party offering.  The big question under this discussion, as I
see it, is that providers are not interested in providing them, yet.
Yes, there is an element of deferring the solution, because the problem
has not been imminent - to providers.  IMO these solutions should not be
for clients to muster from within their jail house, even though
measurements can be done to some extent (e.g., peak throughput).
Hardware partitioning is not a new concept to mainframes, nor to big
iron shops like Sun.  Clients should use financial muscle to force
providers to expose their "floor plans," so to speak.

Robert said it: This is not a technological problem.

Yuan Liu

Sassa NF

unread,
Nov 9, 2010, 2:36:23 PM11/9/10
to Miha Ahronovitz, cloud-c...@googlegroups.com
I can see how this can work in a system where both the provider and
the consumer are not interested in claiming false figures.


Sassa


2010/11/9 Miha Ahronovitz <mij...@sbcglobal.net>:


> On 11/9/2010 4:50 AM, Sassa wrote:
>>
>> Is there a promise that the amount of resource utilized is exactly the
>> amount of resource I need? (=controlled, measurable, or no loss
>> through sharing of resources)
>>
>> I.e. do I happen to ask for more resources when the KPI went down
>> because other guests are sharing my resources?
>>

> you ask the right questions. If by KPI you mean key performance
> indicators, then in the example I described for Grid Engine, each SLA
> has Service Level Objectives. In HPC/EDA this could mean "maintain
> response time under 5 minutes" (compute-intensive batch processing),
> or under 5 seconds in other instances. Then the resources are
> automatically allocated. See the links in my previous post. SLOs are
> in essence KPIs specific to an SLA.
>
>
> Miha
>
>
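The SLO-driven allocation described above can be sketched in a few
lines. This is a toy control loop, not Grid Engine's actual
interface; the function name, step size, and thresholds are all
illustrative assumptions:

```python
def allocate(slots, response_times, slo_seconds=300, step=1, max_slots=64):
    """Return a new slot count for a 'keep response time under
    slo_seconds' objective, given recently observed response times."""
    if not response_times:
        return slots
    worst = max(response_times)
    if worst > slo_seconds:
        # objective missed: scale up, bounded by the resource pool
        return min(slots + step, max_slots)
    if worst < slo_seconds * 0.5 and slots > 1:
        # comfortable headroom: release resources to other SLAs
        return slots - step
    return slots
```

For example, with the 5-minute (300 s) objective, a 400 s observation
grows a 4-slot allocation to 5, while a 100 s observation shrinks it
to 3.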

Miha Ahronovitz

unread,
Nov 9, 2010, 10:02:25 AM11/9/10
to cloud-c...@googlegroups.com, Sassa
On 11/9/2010 4:50 AM, Sassa wrote:
> Is there a promise that the amount of resource utilized is exactly the
> amount of resource I need? (=controlled, measurable, or no loss
> through sharing of resources)
>
> I.e. do I happen to ask for more resources when the KPI went down
> because other guests are sharing my resources?
>

Yuan

unread,
Nov 10, 2010, 12:39:19 AM11/10/10
to Cloud Computing
On Nov 9, 11:36 am, Sassa NF <sassa...@gmail.com> wrote:
> I can see how this can work in a system where both the provider and
> the consumer are not interested in claiming false figures.

Not sure I understand your concern here. Service level objectives
have been the cornerstone of every SLA in existence; what's so
imaginative about it? The only irony is that SLOs in many traditional
IT SLAs are so low that they don't reflect the true value in today's
Web-based businesses. Even so, easily obtainable SLOs such as
round-trip time for a particular type of service transaction are
commonplace, in addition to more mundane ones such as unplanned
downtime. I have certainly seen specialised, custom SLOs used.
Providers and clients surely have different agendas; but when they do
sit at the table to ink an SLA, they have to agree to a number for
each SLO. Sometimes providers have the upper hand, sometimes clients
do. But unless one side does not want the deal, they always agree to
a number.
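An "easily obtainable" SLO like round-trip time reduces to a simple
percentile check at settlement time. A hedged sketch - the
nearest-rank percentile convention and the names here are
illustrative, not taken from any real contract:

```python
def slo_compliance(samples_ms, objective_ms, percentile=95):
    """Does the Nth-percentile round-trip time meet the agreed objective?

    Returns (compliant, observed_ms). Uses a simple nearest-rank
    percentile; a real SLA would pin down the exact method."""
    ordered = sorted(samples_ms)
    # nearest-rank index, clamped to the last sample
    idx = min(len(ordered) - 1, int(len(ordered) * percentile / 100))
    observed = ordered[idx]
    return observed <= objective_ms, observed
```

The hard part, as argued above, is not computing the number but
getting both sides to agree on it - which is a negotiation problem,
not a technological one.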

If you think of a telco as a cloud service provider - which is true
in the sense of an application cloud - their fine-tuned SLAs can be
an excellent example of what a utility cloud SLA should look like.
Interestingly, with telcos, the penalty is usually not customised to
an individual client's business loss but has been standardised across
the telco industry. When the utility cloud matures, standardised SLAs
could be a good model, too.

Yuan Liu