dcra...@GMAIL.COM (David Crayford) writes:
> Emulex sells an HBA that handles over 1M IOPS on a single port. IIRC,
> x86 Xeon class servers have something called DDIO which facilitates
> writes directly to processor cache.
> It's not too dissimilar to offloading I/O to SAPs. I've got old
> colleagues that work on distributed now and they are of the opinion
> that I/O bandwidth is not an issue on x86 systems,
> but it's not exactly commodity hardware. They're all hooked up using
> 16Gbs fiber connected to a SAN using PCIe, the same as z Systems.
>
> I would question the RAS capabilities rather than I/O.
The last published mainframe I/O I've seen was a peak I/O benchmark for
z196 which got 2M IOPS using 104 FICON (running over 104 fibre-channel).
It also said that all 14 SAPs would run 100% busy getting 2.2M SSCHs/sec
but the recommendation was keeping SAPs to 75% or 1.5M SSCHs/sec.
About the same time as the z196 peak I/O benchmark, there was a
fibre-channel announced for e5-2600 blade claiming over a million IOPS,
two such fibre-channel getting more throughput than 104 FICON (running
over 104 fibre-channel) ... aka FICON is an enormously heavy-weight
protocol that drastically cuts the native throughput of fibre-channel.
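
To put the per-link difference in numbers, here is a rough
back-of-envelope sketch using only the figures quoted above. The native
fibre-channel number is the claimed lower bound ("over a million"), so
the ratio is a floor, not a measurement; the variable names are just for
illustration.

```python
import math

# Figures quoted in the post above.
Z196_PEAK_IOPS = 2_000_000           # z196 peak I/O benchmark
FICON_LINKS = 104                    # FICON used in that benchmark
NATIVE_FC_IOPS_PER_PORT = 1_000_000  # claimed lower bound for the e5-2600 fibre-channel


def iops_per_link(total_iops, links):
    """Average IOPS carried by each link."""
    return total_iops / links


ficon_per_link = iops_per_link(Z196_PEAK_IOPS, FICON_LINKS)
native_vs_ficon = NATIVE_FC_IOPS_PER_PORT / ficon_per_link
native_ports_needed = math.ceil(Z196_PEAK_IOPS / NATIVE_FC_IOPS_PER_PORT)

print(f"IOPS per FICON:        {ficon_per_link:,.0f}")
print(f"native port vs FICON:  {native_vs_ficon:.0f}x (at least)")
print(f"native ports for 2M:   {native_ports_needed}")
```

i.e. each FICON in the benchmark averaged roughly 19K IOPS, against a
claimed million-plus for one native port.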
disclaimer: in 1980 I was asked to do the support for a channel extender
for STL (now IBM Silicon Valley Lab); they were moving 300 people from
the IMS group to an offsite bldg. with access back to the STL datacenter;
they had tried remote 3270 but found the human factors intolerable. The
channel extender support put channel attached 3270 controllers out at
the offsite bldg ... and resulted in response indistinguishable from
channel attached 3270 controllers within the STL bldg. The vendor then
tried to get approval from IBM to release the support, but there was a
group in POK that was playing with some serial stuff and they got it
blocked because they were afraid it might interfere with getting their
stuff released.
In 1988, I'm asked to help standardize some serial stuff that LLNL was
playing with, which quickly becomes the fibre channel standard ... one of
the issues is that protocol latency effects increase with bandwidth ...
so that they become apparent at relatively short distances. One of the
features of the 1980 work is that it localized the enormous IBM channel
protocol latency at the offsite bldg and then used a much more efficient
protocol over the longer distance. Fibre-channel used the much more
efficient protocol for everything.
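
A toy model of why protocol latency bites harder as bandwidth goes up.
This is only a sketch, not a description of any actual channel protocol:
the transfer size, distance and number of end-to-end round trips per I/O
are made-up values, and the only physical input is the usual rule of
thumb of about 5 microseconds per km for light in fibre.

```python
US_PER_KM = 5.0  # approx. one-way propagation delay in fibre, microseconds per km


def link_efficiency(io_bytes, mbytes_per_sec, km, round_trips):
    """Fraction of elapsed time spent actually moving data for one I/O."""
    transfer_us = io_bytes / mbytes_per_sec        # 1 MB/s == 1 byte per microsecond
    waiting_us = round_trips * 2 * km * US_PER_KM  # each round trip crosses the link twice
    return transfer_us / (transfer_us + waiting_us)


IO_BYTES = 4096  # hypothetical transfer size
KM = 10          # hypothetical distance

for mbytes_per_sec in (20, 100, 1600):
    # round_trips=4 stands in for a chatty protocol, 1 for a lean one (both hypothetical)
    chatty = link_efficiency(IO_BYTES, mbytes_per_sec, KM, round_trips=4)
    lean = link_efficiency(IO_BYTES, mbytes_per_sec, KM, round_trips=1)
    print(f"{mbytes_per_sec:>5} MB/s: chatty {chatty:.1%}, lean {lean:.1%}")
```

The waiting time is fixed by distance and handshake count, so the faster
the link, the larger the share of elapsed time that is pure protocol
latency, and the more it pays to cut round trips.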
In 1990, the POK group finally gets their stuff released as ESCON when it
is already obsolete. Then some POK engineers become involved with fibre
channel standard and define a protocol that enormously cuts the native
throughput ... that is eventually released as FICON. Note that the more
recent zHPF/TCW work for FICON looks a little more like the work that I
had done back in 1980.
Besides the peak I/O benchmark FICON throughput issue (compared to
native fibre channel), there is also the overhead of CKD simulation.
There haven't been any real CKD disks built for decades; current CKD
disks are all simulated on industry standard commodity disks.
Other trivia: when I moved to San Jose Research in the 70s, they let me
wander around. At the time the disk engineering lab (bldg 14) and disk
product test lab (bldg 15) were running pre-scheduled standalone
mainframe testing around the clock, 7x24. At one point they had tried to
use MVS for concurrent testing, but found that MVS had 15min MTBF in that
environment. I offered to rewrite the I/O supervisor to make it
bulletproof and never fail ... being able to do on-demand, anytime
concurrent testing, greatly improving productivity. I happened to mention that MVS
15min MTBF in an internal-only report on the work ... which brings down
the wrath of the MVS group on my head (not that it was untrue, but that
it exposed the information to the rest of the company). When they found
that they couldn't get me fired, they then set out to make my career as
unpleasant as possible (blocking promotions and awards whenever they
could).
z900, 16 processors, 2.5BIPS (156MIPS/proc), Dec2000
z990, 32 processors, 9BIPS (281MIPS/proc), 2003
z9, 54 processors, 18BIPS (333MIPS/proc), Jul2005
z10, 64 processors, 30BIPS (469MIPS/proc), Feb2008
z196, 80 processors, 50BIPS (625MIPS/proc), Jul2010
EC12, 101 processors, 75BIPS (743MIPS/proc), Aug2012
z13 published refs are 30% more throughput than EC12 (or about 100BIPS)
with 40% more processors ... or about 710MIPS/proc
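
The per-processor figures above are just aggregate BIPS divided by
processor count; a quick sketch that reproduces them from the numbers in
the list. The z13 line uses the rounded 100BIPS and 40% more processors
than EC12, so it is an estimate and lands a little under the "about 710"
quoted.

```python
# (model, processors, aggregate BIPS) as listed in the post.
SYSTEMS = [
    ("z900", 16, 2.5),
    ("z990", 32, 9),
    ("z9", 54, 18),
    ("z10", 64, 30),
    ("z196", 80, 50),
    ("EC12", 101, 75),
]


def mips_per_proc(bips, processors):
    """Aggregate BIPS converted to MIPS, divided across processors."""
    return bips * 1000 / processors


per_proc = {model: round(mips_per_proc(bips, procs)) for model, procs, bips in SYSTEMS}
for model, mips in per_proc.items():
    print(f"{model:5} {mips:4d} MIPS/proc")

# z13 estimate: about 100 BIPS on 40% more processors than EC12.
z13_estimate = mips_per_proc(100, 101 * 1.4)
print(f"z13   ~{z13_estimate:.0f} MIPS/proc (estimate)")
```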
z196 era e5-2600v1 blade rated at 400-500+BIPS depending on model,
e5-2600v4 blades are three-four times that, around 1.5TIPS (1500BIPS).
i.e. since the start of the century, commodity processors have increased
their processing power significantly more aggressively than
mainframe. They have also come to dominate the wafer-chip manufacturing
technology ... and essentially mainframe chips have converged to use the
same technology (in much the same way mainframe has converged to use
industry standard fibre channel and disks). EC12 financials implied that
a single minimum sized chip wafer run produced more EC12 processor chips
than will ever be sold.
A typical cloud megadatacenter has several hundred thousand systems with
millions of processors ... operated by around 100 people or less (rather
than people/system, it is systems/person) ... and has more aggregate
processing capacity than all mainframes in the world today. Systems are
designed for failover and redundancy ... and for larger operations with a
dozen or more such cloud megadatacenters around the world, they are
also designed for failover and redundancy between datacenters.
Max. mainframe configuration is around $30M compared to a couple
thousand for an e5-2600 blade ... say 1/10,000th the cost for 15 times
the processing power. System costs have dropped so drastically that
power & cooling costs have increasingly come to dominate for cloud
megadatacenters.
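
Rough price/performance arithmetic behind that comparison, as a sketch.
Assumptions: the max mainframe configuration is taken as the roughly
100BIPS z13 estimate, the blade as the 1.5TIPS e5-2600v4, and "couple
thousand" is pinned at $3,000 purely so that it matches the 1/10,000th
ratio; none of these are quoted prices.

```python
MAINFRAME_COST = 30_000_000  # "around $30M"
MAINFRAME_BIPS = 100         # assumption: z13 estimate from above
BLADE_COST = 3_000           # assumption: "couple thousand", chosen to give 1/10,000th
BLADE_BIPS = 1_500           # e5-2600v4, "around 1.5TIPS"

cost_ratio = MAINFRAME_COST / BLADE_COST
perf_ratio = BLADE_BIPS / MAINFRAME_BIPS

mainframe_dollars_per_bips = MAINFRAME_COST / MAINFRAME_BIPS
blade_dollars_per_bips = BLADE_COST / BLADE_BIPS
price_perf_gap = mainframe_dollars_per_bips / blade_dollars_per_bips

print(f"cost ratio:      {cost_ratio:,.0f}x")
print(f"processing:      {perf_ratio:.0f}x in the blade's favor")
print(f"$/BIPS:          mainframe ${mainframe_dollars_per_bips:,.0f}, blade ${blade_dollars_per_bips:,.0f}")
print(f"price/perf gap:  {price_perf_gap:,.0f}x")
```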
--
virtualization experience starting Jan1968, online at home since Mar1970