|Big Numbers||DrQ||8/27/12 7:59 AM|
Don't be too easily impressed by big numbers.
Recently, I saw a tweet regarding the amount of CPU time devoted by Fermilab to crunching LHC data:
(Fermilab) Progress in HEP Computing: Recent LHC resource appetite is a mind-boggling 1.5 CPU-millennia every 3 days
Is it mind-boggling? This caused me to do some calculations.
Let's compare with SETI@home.
> cpu.yrs <- 2e6 #since 1999
> # Per day, assuming an end-point of ~2012 (13 years elapsed)...
> cpu.yrs/(13*365.25) #~421 cpu-yrs/day
But it's not clear exactly when the end-point of "since" was measured (e.g., 2009, 2012?). So, for simplicity, I'll just round it up to 500 cpu-yrs per day or 1/2 a CPU-millennium per day.
> 500*3 #over 3 days
That's 1.5 thousand cpu-yrs in 3 days or very similar to the Fermilab claim. And since those Fermilab cycles are also highly distributed, like SETI, the number is impressive but not quite so impressive as it first appeared.
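The back-of-envelope comparison can be sketched in a few lines; the 2012 end-point for the SETI@home total is an assumption, as noted above:

```python
# Sanity check: SETI@home throughput per day vs the Fermilab claim.
SETI_CPU_YEARS = 2e6               # total claimed since 1999
YEARS_ELAPSED = 2012 - 1999        # assumed end-point (the tweet doesn't say)

per_day = SETI_CPU_YEARS / (YEARS_ELAPSED * 365.25)   # ~421 cpu-yrs/day
rounded_per_day = 500              # round up, as in the text
over_3_days = rounded_per_day * 3  # 1,500 cpu-yrs = 1.5 cpu-millennia
print(per_day, over_3_days)
```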
|Re: Big Numbers||SteveJ||8/27/12 3:40 PM|
DrQ wrote on 28/08/12 12:59 AM:
> "1.5 CPU-millennia every 3 days" [500 cpu-yrs/day]
What's their daily failure rate?
- 500 cpu-yrs/day = 182,625 cpu-days/day == # of CPUs running continuously.
2 CPUs/system == 100,000 sys, rounded up [@450W ea, ~50 MWatt]
Though, do they fudge the numbers by counting cores not CPU's?
Divide by 2 or 4.
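Steve's CPU-count arithmetic, with the cores-vs-CPUs caveat, roughly (all figures are his guesses):

```python
# 500 cpu-yrs of work delivered per day requires this many CPUs running 24x7.
cpu_years_per_day = 500
cpus = cpu_years_per_day * 365.25        # 182,625 CPUs
systems = 100_000                        # ~cpus / 2 CPUs-per-system, rounded up
power_mw = systems * 450 / 1e6           # 45 MW at 450 W each ("50 MWatt" rounded)
# If the count is really cores (or SMT threads), divide systems by 2-4.
print(cpus, power_mw)
```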
99.99% availability? = 0.01% down == 1 failure-hour per 10,000 operated hours.
Need MTTR to convert that to a failure rate. Guess 1 hr MTTR:
= 100,000 sys/10,000 = 10 fails/hr operated = 240 fails/day.
More likely they get 5-10 times better reliability, but MTTR would be 1 day [ie. a daily fix sweep]
At 1-in-100,000 unavailability [5 nines, 99.999% availability], they have 1 sys fail/hr,
or ~20-100/day replaced systems.
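The availability-to-failures conversion above, as code (the availability and MTTR figures are guesses from the post):

```python
systems = 100_000
availability = 0.9999     # "four nines" guess: 0.01% of hours are down-hours
mttr_hours = 1            # guessed mean time to repair

# Down-hours per operated hour, divided by hours-per-failure (MTTR),
# gives failures per operated hour across the fleet.
fails_per_hour = systems * (1 - availability) / mttr_hours
fails_per_day = fails_per_hour * 24
print(fails_per_hour, fails_per_day)
```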
They would also have 2,000 48-port switches.
Can't imagine the switches having an MTBF worse than 5-10M hrs (guess).
At 2,000 switches * 24 hrs = 48,000 switch-hrs/day, that's a switch failure only every few months.
They wouldn't have less than 2 disks/system, more likely 3-4.
200,000 drives * 24 hours/day = ~4.8M drive-hrs/day (~5e6 hrs/day)
Manufacturers like to quote 1M-hrs MTBF, or 100 years (a 1%/yr failure rate)
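The MTBF-based estimate implied by those two lines, taking the vendor 1M-hr figure at face value:

```python
drives = 200_000
drive_hours_per_day = drives * 24        # ~4.8M drive-hours/day
mtbf_hours = 1e6                         # vendor-quoted MTBF

# Expected drive failures/day if the vendor MTBF held in the field.
expected_fails_per_day = drive_hours_per_day / mtbf_hours   # ~4.8
print(expected_fails_per_day)
```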
The 2007 Google paper [Pinheiro, Weber & Barroso, FAST '07] talks of failures as %/yr.
They found 1.7% in 1st year rising to ~8.5% in 3rd year.
Guess 4% = 8,000 drives/yr = ~20 drives/day
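The AFR-based estimate from the Google numbers (4% is Steve's guessed fleet-wide average):

```python
drives = 200_000
afr = 0.04                           # guessed average annual failure rate
fails_per_year = drives * afr        # 8,000
fails_per_day = fails_per_year / 365.25   # ~22
print(fails_per_year, fails_per_day)
```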
Disk errors? Even the google paper didn't go there :-)
Manufacturers typically quote hard errors of 1 in 10^15 bits read for SATA class drives.
If drives are run hard, say an average 50 Mbps (5e7 bps), there are 32M (3.2e7) seconds/yr,
or 1.6e15 bits/yr/drive.
So each drive would experience at least 1-2 Unrecoverable Read Error/year.
Each system, 3-4 times that [with 3-4 drives].
With a fleet of 3-400,000 drives, you're looking at 1-2,000 URE's/day.
you'd want to be running RAID of some sort...
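The URE chain above, under the same assumptions (1-in-1e15-bit vendor spec, ~1.6e15 bits read per drive-year):

```python
BITS_PER_URE = 1e15                  # vendor hard-error spec for the drive class
bits_read_per_drive_year = 1.6e15    # from the throughput estimate above

ure_per_drive_year = bits_read_per_drive_year / BITS_PER_URE   # ~1.6
fleet = 350_000                      # mid-point of "3-400,000 drives"
ures_per_day = fleet * ure_per_drive_year / 365.25             # ~1,500
print(ure_per_drive_year, ures_per_day)
```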
They have a busy bunch of beavers looking after their gear.
Google is reported to have close to 1M systems.
I can't imagine that scale.
-- Steve Jenkin, Info Tech, Systems and Design Specialist. 0412 786 915 (+61 412 786 915) PO Box 48, Kippax ACT 2615, AUSTRALIA stev...@gmail.com http://members.tip.net.au/~sjenkin
|Re: Big Numbers||James||8/28/12 7:25 AM|
I was wondering that one myself. And not just cores, but logical cores with SMT/Hyperthreading.
AMD has chips that pack in 16 cores per socket (Opteron 6200).
|Re: Big Numbers||DrQ||8/28/12 12:31 PM|
In a (sideways) related note, I just came across this claim:
Anyone wanna check that one out?
|Re: Big Numbers||SteveJ||8/28/12 5:29 PM|
After I wrote this, I realised I have a real gap in my knowledge of stats:
- for single or small numbers of machines, how do you calculate useful numbers from MTBF's?
If I buy a new PC every 3 years, only ever owning one at a time, and the disk drives are rated at "1M hours MTBF", what's the probability of having a failure over a lifetime of ownership (50 yrs)?
Is it just 1% every year and 50*1% for 50 years?
Or, what proportion of single PC owners will experience a disk drive failure in 50 years of ownership?
For small numbers, many small businesses have 4-5 servers with a few disks each, and tend to keep each server 4-5 years.
What's the likelihood of having to replace a disk in a server?
Comes to a prosaic purchase decision:
steve jenkin wrote on 28/08/12 8:40 AM:
|Re: Big Numbers||SteveJ||8/28/12 5:31 PM|
Fat fingered :-(
steve jenkin wrote on 29/08/12 10:29 AM:
- do we purchase a spare disk (or two) along with the server, to put on
the shelf as a replacement, or
- do we buy maintenance at 10-20% purchase price?
|Re: Big Numbers||Darryl Gove||8/28/12 6:42 PM|
I think at this point you are moving beyond probability into risk assessment.
You can work out the probability of a disk failure (etc.), but you need
to assign a cost to the various situations. For example, if your disk
contains critical work, then it would be better to have a RAID
system. Assuming the disk failure is not about data loss, purely
convenience, then the decision is more about whether you want to have
the downtime: the cost of the spare disk vs the cost of buying a disk
when you need it vs the cost of just buying a new machine.
A similar set of arguments applies to the maintenance. In this case
it's the cost of your time fixing the problem, plus the cost of the lost
hours of productivity/downtime.
Of course, you can then front-load it by asking whether it's better to
buy a machine with redundancy in order to avoid downtime if a disk
(etc) goes out.
|Re: Big Numbers||Darryl Gove||8/28/12 6:42 PM|
So you have 1/100 chance of a disk failure in one year. The age of the
disks doesn't matter so we can ignore the fact that you replace your
machine every three years.
The crucial step is that the chance of experiencing (at least) one
failure is 1- probability of experiencing none.
The probability of experiencing no failures over 50 years is (99/100)^50. So the probability of experiencing at least one is 1-(99/100)^50 ≈ 0.39.
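Darryl's calculation, checked numerically (the 1%/yr figure comes from the vendor MTBF discussion above):

```python
p_fail_per_year = 0.01    # ~1M-hr MTBF taken as a 1%/yr failure rate
years = 50

p_no_failure = (1 - p_fail_per_year) ** years   # ~0.605
p_at_least_one = 1 - p_no_failure               # ~0.39
print(p_at_least_one)
```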
|Re: Big Numbers||SteveJ||8/28/12 7:09 PM|
Darryl Gove wrote on 29/08/12 11:26 AM:
> The probability of experiencing no failures over 50 years is (99/100)^50. So the probability of experiencing at least one is 1-(99/100)^50 = 0.39.
Thanks very much. That was what I was missing... Pretty dumb, I know :-(
|Re: Big Numbers||SteveJ||8/28/12 9:17 PM|
Darryl,
> The probability of experiencing no failures over 50 years is (99/100)^50. So the probability of experiencing at least one is 1-(99/100)^50 = 0.39.
Thinking a little more on this.
If I have a server and run it 4-5 years and there's an averaged chance of drive failure of 6%/year,
then over the 5 year life of a single drive:
- Prob. No Failure/yr = 1 - prob(failure/yr) = 1 - .06 = .94
- prob No failure in 5 yr = 0.94 ^ 5 = 0.734
Now if I have 3 drives, is the Probability of No drives failing in a single year the product of 1 - sum(prob failure)??
ie. 0.94 * 0.94 * 0.94 = 0.831
or 1 - (0.06 + 0.06 + 0.06) = 1 - 0.18 = 0.82
I'm guessing that, having worked out the yearly rate of "No Fails this year" for a group of drives, the probability of getting through the entire 5 years with no failures is the product:
ie. p1 * p2 * p3 * p4 * p5
|Re: Big Numbers||Darryl Gove||8/28/12 9:59 PM|
On 28 August 2012 21:05, steve jenkin <stev...@gmail.com> wrote:
Not the sum - you can only add the probabilities of mutually exclusive events.
Yes, this is the probability of no drive failing during one year.
No, this is not right (imagine that the probability of a drive failing is 0.5 :)
0.831^5 = 0.396
So the total probability is (prob working for one year)^( #drives * #years )
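Darryl's rule, checked against Steve's numbers (0.94 per-drive survival per year, 3 drives, 5 years):

```python
p_work_year = 0.94     # per-drive probability of surviving one year
drives, years = 3, 5

# One year, three drives: multiply (independent events), don't add.
p_year_all_ok = p_work_year ** drives            # 0.8306, not 0.82
# Whole lifetime: (per-drive yearly survival)^(drives * years)
p_no_failures = p_work_year ** (drives * years)  # ~0.395
print(p_year_all_ok, p_no_failures)
```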
|Re: Big Numbers||SteveJ||8/28/12 11:44 PM|
Thanks very much for explaining it so patiently to me, A Bear of Little
Brain [reference to Pooh Bear]
Darryl Gove wrote on 29/08/12 2:53 PM:
|Re: Big Numbers||rml...@gmail.com||8/30/12 4:34 PM|
Nice analyses by Steve and Darryl. I question the uniform distribution of failures. Because systems tend to have burn-in failures early in life and burn-out/wear-out failures late in life with stability in between, the uniform distribution seems suspect. Because this is a bathtub-shaped distribution, wouldn't an exponentiated Weibull distribution be a better model for the failures being discussed? If we assume that product life is divided into three parts--infant mortality, random failures, and wear-out--then three functions may express the probability of failure depending on where the system is in its lifespan. Some vendors burn-in their systems prior to customer delivery to minimize customers experiencing infant mortality. I know that this moves us from performance to reliability.
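Bob's bathtub point can be sketched with the Weibull hazard function: shape k < 1 gives a decreasing hazard (infant mortality), k = 1 a constant hazard (random failures), and k > 1 an increasing hazard (wear-out). The parameter values below are illustrative, not fitted to any drive data:

```python
def weibull_hazard(t: float, k: float, lam: float) -> float:
    """Instantaneous failure rate h(t) = (k/lam) * (t/lam)**(k-1)."""
    return (k / lam) * (t / lam) ** (k - 1)

# Illustrative regimes over a drive's life (t in years, lam = 5):
infant  = weibull_hazard(0.1, k=0.5, lam=5.0)   # high early hazard
stable  = weibull_hazard(2.0, k=1.0, lam=5.0)   # flat: 1/lam
wearout = weibull_hazard(4.5, k=3.0, lam=5.0)   # rising late hazard
print(infant, stable, wearout)
```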
|Re: Big Numbers||SteveJ||8/30/12 5:14 PM|
rml...@gmail.com wrote on 31/08/12 9:34 AM:
> I question the uniform distribution of failures.
Bob,
you're dead right :-) A Simplifying Assumption.
There are two well known papers on large-scale HDD failures, published
within the last 5 years. One by Google. [others will know the refs]
HDD failure rate changes with age, use, temperature and power-cycles.
It's also not usefully near the vendor-published MTBF figures.
Nor, very surprisingly, does S.M.A.R.T. reporting give you much
predictive power. IIRC, Google says more than 50% of failures are not
predictable from those logs.
The wild card is "new technologies" - how will they perform?
We've entered the last factor-10 increase of HDD recording density
(~2020) and three new recording techniques are yet to enter the mainstream:
- HAMR [Heat-Assisted Magnetic Recording - higher-coercivity media,
heated by laser]
- BPM [Bit-Patterned Media - lithographically shaped bit areas]
- Shingled Writes [better described as multi-track overlapped writes
with no in-place update]
The replacement technologies, broadly called "Storage Class Memories",
will, like Flash memory, have completely different wear and failure modes.
Even 'Flash' seems to now be in a region of declining returns with
feature size reduction. We're currently at 100 electrons per cell,
looking to get to 10. Yes, that's One Hundred. A figure I find hard to
comprehend in consumer devices.
But, from where we are now, there aren't any technologies that will
surpass HDD in capacity/price for the next 20 years.
But as Neil has so ably pointed out recently with Fusion-IO and
PCI-SSD's, HDD's are no longer "useful" or cost-effective for serving
Random I/O loads. [Sell your shares in Enterprise Disk Array
manufacturers, but not Seagate or Western Digital].
HDD's work well for streaming-IO: think CD-ROM or DVD.
- while they can seek, they are horribly slow at it...
- It takes 100-1000 times as long to read a HDD with 'random I/O'
compared to seek-and-stream.
Point of that excursion:
HDD reliability figures are going to become important to archivists
and not to Performance Analysts.