
MIPS vs. VUPS


Ronald Becker Williams

Oct 29, 1992, 4:44:40 PM
We've been told by DEC Ultrix personnel that, plus or
minus a few, VUPS = MIPS, and vice versa. Is
this reasonable?

All email replies thankfully accepted.

George S. Chapman

Nov 1, 1992, 1:33:21 PM
Absolutely. That was what the original equivalence standard was.
I've been using it for years when comparing to other vendors.

Now, of course, DEC no longer rates their products in VUPs; they
are publishing TPC-A benchmarks in the SOC. Oh well....

=====================================
George S. Chapman
Director, Technology Planning
The Midwest Stock Exchange, Inc.
Chicago, Illinois; Bangkok Thailand
mithr...@isvnet.enet.dec.com
=====================================


>DATE: Thu, 29 Oct 92 22:44:40 +0100
>FROM: Ronald Becker Williams <r...@bwunltd.wciu.edu>

Jim Gettys

Nov 1, 1992, 7:43:59 PM
As I understand it, the SPEC benchmarks use a VAX 11/780 as their reference,
running frozen versions of the C and Fortran compilers. So SPECmarks are roughly
equivalent to VUPS. Note that a VAX 11/780 would now be rated at significantly
more than one SPECmark, though, just because the compilers have improved over the
years.

MIPS, however, is a much looser and less well defined term, and much less
closely equivalent. While a VAX 11/780 is often called roughly a one-MIPS machine,
in fact it executes around half a million VAX instructions per second; but each
instruction on a VAX tends to accomplish around twice as much as on most other
machines. RISC technology has shown, however, that with current technology this
tradeoff was, in the end, a poor trade
(an Alpha, manufactured in the identical process technology as an NVAX chip,
runs real programs at at least twice the speed of an NVAX).
One of the IBM 360s or 370s was the original "one MIPS" machine,
but VAXen seemed to become much more widespread.

In any case, MIPS is best thought of as "Meaningless Indicator of Performance"; stick
to useful benchmark programs to actually measure speed for your application. For commonplace
integer and floating-point applications, the integer and floating-point SPECmarks are
reasonable starting points when comparing machines.
- Jim Gettys

--
Digital Equipment Corporation
Cambridge Research Laboratory

FRED W. BACH

Nov 1, 1992, 10:16:00 PM
In article <1992Nov2.0...@crl.dec.com>, j...@crl.dec.com (Jim Gettys) writes...

[stuff deleted]

>
>MIPS, however, is a much looser and less well defined term, and much less

[more stuff deleted]

>In anycase, MIPS is best thought of as "Meaningless Indicator of Performance"; stick

> - Jim Gettys


>
>--
>Digital Equipment Corporation
>Cambridge Research Laboratory

I have an acronym program. Here is its output for "MIPS" :

MIPS Massively Inconclusive Performance Suite
MIPS Meaningless Indicator of Processor Speed
MIPS Meaningless Information Propagated by Salesmen
MIPS Million Instructions Per Second
MIPS Millions of floating-point Instructions Per Second

The last one is probably wrong. The others may be correct. ;-)


Fred W. Bach , Operations Group | Internet: mu...@erich.triumf.ca
TRIUMF (TRI-University Meson Facility) | Voice: 604-222-1047 loc 278/419
4004 WESBROOK MALL, UBC CAMPUS | FAX: 604-222-1074
University of British Columbia, Vancouver, B.C., CANADA V6T 2A3

These are my opinions, which should ONLY make you read, think, and question.
They do NOT necessarily reflect the views of my employer or fellow workers.

AJ Casamento

Nov 1, 1992, 4:47:34 PM

Folks,

The use of the VUPS measurement (VAX Unit of Performance) was an early move
by Digital to provide some standard comparison of machine performance by some
means other than MIPS.

Vendor MIPS ratings had varied so much (and some of the processors were being
generated with fun instructions that were specifically targeted at increasing
the Dhrystone rating) that it seemed some standard was necessary. We needed one
even in describing our own machines.

A year or so later, the SPEC rating was established. DEC continued to provide
VUP ratings along WITH the SPEC ratings in order to provide a transition for
the customers who had gotten accustomed to VUPS. I can even remember being
taken to task by a customer at DECUS New Orleans in 1990 because he believed
that DEC had created the SPEC rating (he didn't realize it was a multi-company
effort) as a new way to confuse people.

SPECmarks seemed a reasonable effort (an independent body to control both the
test code and the reporting format). It's based on a VAX 11/780 because "THE"
known-base-machine had to be guaranteed to be supported for a long time (read,
forever) and DEC was the only participant willing to do so. I had actually
remembered that it was the SPECmark rating that used new compilers and the VUP
testing that had a fixed compiler revision level; but Jim's memory is probably
better than mine.

The benchmarks will continue to evolve. They will have to, to keep up with
the new machines (you can do some fun things with benchmarks when you can run
the entire thing in a 1MB cache). They are provided as the roughest of
approximate yardsticks, predominantly for those people who only look at an
overall chart of machines.

The best way of judging a system's performance is still to run your own set of
applications on it, with the hardware configured the way you want it. Sometimes
that isn't possible, and that's when folks default to MIPS/SPECs/VUPS as a way
to judge.


All of the above is just my own silly opinion, of course. Still, I hope it
provides some perspective.

Thanx,
AJ

ps. Now, if I could just get everybody to agree to stop those silly "Entry
System Configuration" routines where every company quotes a configuration
you wouldn't wish on your worst enemy. Unfortunately, that's one of those
"I'll put my gun down if you put yours down. You go first..." type of
discussions. Sigh...


**********************************************************************
* AJ Casamento "The question is not whether or *
* Digital's TRI/ADD Program not the opinions are mine; but *
* 529 Bryant Ave. PAG-2 rather, which of my personalities *
* Palo Alto, CA 94301-1616 do they belong to?" *
* 415.617.3460 *
* a...@pa.dec.com *
**********************************************************************

vax...@v36.chemie.uni-konstanz.de

Nov 2, 1992, 5:09:41 AM

In article <9210...@bwunltd.wciu.edu>, r...@bwunltd.wciu.edu (Ronald
Becker Williams) writes:

Not true. 1 VUP is about 1.4-1.5 MIPS. But 1 VUP = 1 SPECmark.

Dileep Bhandarkar

Nov 2, 1992, 11:23:35 AM

In article <9210...@bwunltd.wciu.edu>, r...@bwunltd.wciu.edu (Ronald Becker Williams) writes...

There are 3 metrics that use the VAX-11/780 as the base:

1. Dhrystone MIPS. A system's Dhrystones/sec is divided by 1757 (this was the
VAX-11/780's Dhrystone number at one time).

2. VUPS. We used the central tendency (geometric mean) of 99 diverse benchmarks
to determine the relative performance compared to a VAX-11/780, both using
current compilers.

3. SPECmarks. The VAX-11/780 was used to generate reference numbers (most using
VAX/VMS, a couple using Ultrix) and then used to compute SPECratios, SPECmark
etc. Reference numbers were not updated as VAX compilers, preprocessors etc
improved/emerged. There was a very strong correlation between VUPS and
SPECmarks for VAX systems until the matrix300 benchmark was cracked.

Our competition has often used VUPS or MIPS to mean "times VAX-11/780" for any
benchmarks using any (not always the best) compiler on VAX.

At this stage, I would toss all of these out the window and look at SPECint92
and SPECfp92. Of course your mileage will vary and individual SPECratios may
be closer to your application.
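As a sketch of how a VUPS-style rating falls out of that method: the geometric mean of the per-benchmark speedups relative to a VAX-11/780 is the rating. The four ratios below are made-up illustrative numbers (the real metric used 99 benchmarks, as described above):

```shell
#!/bin/sh
# VUPS-style rating: geometric mean of per-benchmark speedups vs. a
# VAX-11/780. The four ratios are hypothetical; only the
# geometric-mean method comes from the post.
echo "2.1 1.8 2.4 2.0" | awk '{
    for (i = 1; i <= NF; i++) s += log($i)
    printf "geometric mean = %.2f VUPS\n", exp(s / NF)
}'
```

Using the geometric rather than arithmetic mean keeps a single outlier benchmark (a matrix300-style "crack") from dominating the overall rating.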

Dileep

Greg Pavlov

Nov 2, 1992, 9:47:16 PM
In article <AJC.92No...@thendara.pa.dec.com>, a...@pa.dec.com (AJ Casamento) writes:
>
>
> Folks,
>
> The use of the VUPS measurement (VAX Unit of Performance) was an early move
> by Digital to provide some standard comparison of machine performance by some
> means other than MIPS.
>
Mebbe. But it also may have simply been an outgrowth of an earlier "prac-
tice" to rate VAXen in terms of "n x a VAX 780" (and, for a while, "n x a
VAX 8200", similar power but in new clothing). If the above were true, then
DEC would have (should have ?) rated the first-generation MIPS/RISC-based
machines in VUPS. Instead, they were rated only in MIPS in press releases,
ads, catalogues, etc. while VAX-based systems continued to be rated in 780
terms. To the suspicious among us, this looked like an attempt to keep
people from directly comparing the two lines, given the large disparity in
cost between them at the time.....

greg pavlov
pav...@fstrf.org

M Darrin Chaney

Nov 3, 1992, 9:41:07 AM
pav...@niktow.canisius.edu (Greg Pavlov) writes:

>a...@pa.dec.com (AJ Casamento) writes:
>> The use of the VUPS measurement (VAX Unit of Performance) was an early move
>> by Digital to provide some standard comparison of machine performance by some
>> means other than MIPS.
>>
> Mebbe. But it also may have simply been an outgrowth of an earlier "prac-
> tice" to rate VAXen in terms of "n x a VAX 780" (and, for a while, "n x a
> VAX 8200", similar power but in new clothing). If the above were true, then
> DEC would have (should have ?) rated the first-generation MIPS/RISC-based
> machines in VUPS. Instead, they were rated only in MIPS in press releases,
> ads, catalogues, etc. while VAX-based systems continued to be rated in 780
> terms. To the suspicious among us, this looked like an attempt to keep
> people from directly comparing the two lines, given the large disparity in
> cost between them at the time.....

To the clear thinking among us, it was a marketing move aimed at the
simpletons who think that "MIPS" is the only real comparison between
processors because "VUPS only apply to VAXes." When DEC started selling
RISC machines, they were moving into another market, one which they didn't
shape, and they had to conform with standards already in place. That meant
using MIPS.

It's funny, because MIPS are meaningful only within an architecture, whereas
VUPS are an actual unit of power that can be applied across multiple
architectures. However, it was common practice in the RISC world to use MIPS
because they sounded so much better.

Even the DOS people caught on. I can remember someone telling me (back in
1988) that the 386 was running at 21 MIPS. I believe that if you ran it at
50MHz (unsupported, didn't last long), and had some realistic instruction
sequence like "add ax to ax, branch to that instruction", you could get a
rating like that. Companies like Sun, HP, IBM, and DEC rarely stoop so low,
but MIPS are meaningless anyway.

For a good comparison among machines, look for the SPECmark comparisons.
For an understanding of how machines compare on all levels, the TPC benchmarks
are excellent, and they are governed by a third party, so there should be
no bullsh*t.

Darrin
--

mdchaney@iubacs mdch...@bronze.ucs.indiana.edu mdch...@rose.ucs.indiana.edu

"I want- I need- to live, to see it all..."

Thomas Sippel - Dau

Nov 4, 1992, 2:46:01 PM
In article <1NOV1992...@erich.triumf.ca>, mu...@erich.triumf.ca (FRED W. BACH) writes:
-
- MIPS Massively Inconclusive Performance Suite
- MIPS Meaningless Indicator of Processor Speed
- MIPS Meaningless Information Propagated by Salesmen
- MIPS Million Instructions Per Second
- MIPS Millions of floating-point Instructions Per Second
-
- The last one is probably wrong. The others may be correct. ;-)

Well, the last one should be

FLOP FLoating-Point OPeration

Which reminds me of a conference (on parallel processing) where somebody
claimed that "megaflops are bunk", and somebody else later suggested

BUNK Binary or Unary Normalized Konstruct

and commented this would lead to Megabunks, which would probably be a flop.

Thomas

--
*** This is the operative statement, all previous statements are inoperative.
* email: cmaae47 @ ic.ac.uk (Thomas Sippel - Dau) (uk.ac.ic on Janet)
* voice: +44 71 589 5111 x4937 or 4934 (day), or +44 71 823 9497 (fax)
* snail: Imperial College of Science, Technology and Medicine
* The Center for Computing Services, Kensington SW7 2BX, Great Britain

Cui-Qing Yang

Nov 4, 1992, 4:25:24 PM
I need to convert some font files from the standard X format (.bdf or .snf)
into DECwindow font format (.dwf). Does anyone know if there's such a program?
If so, where can I get it? Thanks in advance for any help.

Cui-Qing Yang

Dept. of Computer Science
University of Norht Texas
cqy...@ponder.csci.unt.edu

dfi...@colorne.uucp

Nov 7, 1992, 8:58:32 AM
George S. Chapman writes :
>>Ronald Becker Williams writes :

>>We've been told by DEC Ultrix personnel that, plus or
>>minus a few, VUPS = MIPS, and vice versa. Is
>>this reasonable?

>Absolutely. That was what the original equivalence standard was.


>I've been using it for years when comparing to other vendors.

The problem with this is that, once you jump machine architectures, this
number becomes mostly (if not completely) useless. The reason that DEC
stopped quoting MIPS was that they were beating up IBM a few years back
(circa 1985 ?) about how their price per MIP beat the pants off of IBM
by something like a factor of 2 to 1. Then IBM came back with some benchmarks
and proved that you could get twice as many 'transactions' processed with
an IBM MIP than you could with a DEC MIP. To which DEC responded that IBM's
numbers were invalid because they represented batched transactions and not
online, to which IBM responded that large systems are measured in transactions
and not interactive access, to which ...

VUPS does make a lot of sense when comparing VAXen, since the VAX
architecture is still somewhat the same (although the implementation has
varied quite a bit from processor line to processor line). However, DEC
generally gives VUPs in ranges (particularly on their larger systems)
because a VUP started out as an 11/780 (which they used to call a MIP)
and there have been enough changes in the VAX implementation since
then that your rating will vary depending on specifically what you are
trying to do. A case in point is that some of the old 8xxx series
could easily beat some of the earlier 6xxx series machines which had
a much higher VUP rating, primarily because of some instructions which
were emulated (the older 6xxx series actually used the uVAX chip set,
although I am not sure if the 64xx/65xx still do).

And of course, I/O intensive applications will run much better on an XMI
attached RF-series disk than on a Unibus attached RA-series disk, regardless
of the VUP rating!

IMHO, VUP/MIP ratings between completely different machine architectures
are relatively useless. VUP/MIP ratings between machines of the same
architecture give a close ballpark for comparison, but are still not
concrete, unless you take other differences in the machines' implementations
into account (i.e., I/O bus, memory bus, emulated instructions, etc.)

Regards,

Dave.
--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
= David E. Filip UUCP : dfi...@colornet.com =
= ColorNet Information Systems CIS : 76430,3111 =
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
= Standards are wonderful 'cause there are so many to choose from ! =
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

jgre...@gmail.com

Jul 26, 2015, 6:53:41 PM
Z Z was saax

mcle...@gmail.com

Jul 26, 2015, 7:19:55 PM
None of these are worth much. Manufacturers can tweak their machines to optimize performance for a single test or set of tests. The use of multiple benchmark tests attempts to reduce this tweaking but the word is 'reduce', not 'eliminate'.

One approach is to run your own application or set of apps on the machine to see what the performance is. The downside to this is when performance is unsatisfactory, is your best option to spend money on fixing your code or on buying a more powerful system? Only you can answer that because it comes in the wider context of available money, skill of staff, desirability of having a more powerful system and so on.

Simon Clubley

Jul 26, 2015, 8:03:30 PM
On 2015-07-26, jgre...@gmail.com <jgre...@gmail.com> wrote:
> Z Z was saax

Personally I think MIPS is a more basic and primitive architecture than
(say) ARM.

Simon.

PS: Sorry, couldn't resist. I know what the OP meant but I've been
working on some M4K MIPS assembly code this weekend... :-)

PPS: BTW OP, you need a new keyboard.

PPPS: To actually answer the original question, it's what you can get
done in a unit of time that matters and that's where things like code
density come into play.

--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world

glen herrmannsfeldt

Jul 26, 2015, 9:02:14 PM
jgre...@gmail.com wrote:
> Z Z was saax

There is a MIPS article in Wikipedia, which doesn't seem so useful.

At some point, they compare instructions per second between a 4004
and a 370/158, and suggest that it has meaning.

In the days when MIPS appeared, different processors did about the
same amount of processing per instruction, and so it had some
meaning.

But even then, it worked for comparing among scientific processors
(likely with 36-bit words), or among commercial processors
processing six-bit characters, but not between the two groups.

But processors have changed so much that it really is a Meaningless
Indicator of Processor Speed.

-- glen

Arne Vajhøj

Jul 26, 2015, 10:02:00 PM
I thought it was Meaningless Information Provided by Salesmen.

:-)

Arne


Paul Sture

Jul 27, 2015, 5:04:45 AM
On 2015-07-27, Simon Clubley
<clubley@remove_me.eisner.decus.org-Earth.UFP> wrote:
> On 2015-07-26, jgre...@gmail.com <jgre...@gmail.com> wrote:
>> Z Z was saax
>
> Personally I think MIPS is a more basic and primitive architecture than
> (say) ARM.
>
> Simon.
>
> PS: Sorry, couldn't resist. I know what the OP meant but I've been
> working on some M4K MIPS assembly code this weekend... :-)
>
> PPS: BTW OP, you need a new keyboard.
>
> PPPS: To actually answer the original question, it's what you can get
> done in a unit of time that matters and that's where things like code
> density come into play.

And as I have run into recently, modern CPUs have various levels of
hardware assist which can make a distinct difference on specific
workloads. Yes, most of us know this sort of stuff is around, but do we
know what software uses it, and what impact not having hardware
assist for a given function has on your application mix?

When running openSUSE on an HP NL40 Microserver, a startup message
saying that no CRC32 hardware unit is found flashes by. CRC32 is used
by btrfs:

<https://en.wikipedia.org/wiki/Btrfs#Checksum_tree_and_scrubbing>

"CRC-32C checksums are computed for both data and metadata and stored as
checksum items in a checksum tree."

Something else I've just discovered on this box is that there appears
to be a widespread assumption that "compression is good". The
specific example here is a bit of code for migrating virtual machines
from one host to another, which employs gzip.

On the Microserver this migration seemed to be taking an excessive
amount of time, so I timed it with and without gzip on a set of files
which total 83GB:

With gzip: 2.5 hours
Without gzip: 14 minutes

Gzipping the uncompressed files takes ~50 minutes on my Mac mini (and 24
minutes to decompress), which isn't a trivial operation either, and I'm
simply not that desperate for the disk space saved in this instance
(83GB comes down to 23GB). On the bandwidth front, I can certainly ship
the uncompressed files to another system faster than going through the
compression.
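That tradeoff can be put on a back-of-envelope basis: shipping raw wins whenever the raw transfer time is less than compression time plus the compressed transfer time. Using the figures from this post (83GB raw, 23GB gzipped, ~50 minutes to compress) and an assumed 100 MB/s link (roughly gigabit Ethernet):

```shell
#!/bin/sh
# Back-of-envelope: is compress-then-ship faster than shipping raw?
# Sizes and compression time are from the post; the 100 MB/s link
# speed is an assumption.
awk 'BEGIN {
    raw_gb = 83; gz_gb = 23; compress_min = 50; link_mbs = 100
    raw_min = raw_gb * 1024 / link_mbs / 60
    gz_min  = compress_min + gz_gb * 1024 / link_mbs / 60
    printf "ship raw: %.0f min, compress then ship: %.0f min\n", raw_min, gz_min
}'
```

With these numbers the raw transfer wins by a wide margin; a much slower link or a much faster compressor would tip it the other way.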

Folks seem to get excited about compression at the file system level
too, but when the majority of files involved with a typical office
file server are already compressed (.docx, .xlsx, .jpg, .mp3/4, etc.)
trying to compress them further might just be a waste of time.

--
1972 - IBM begins development on its last tape drive (3480) ever because
of the declining cost of disk drives.

already...@yahoo.com

Jul 27, 2015, 8:32:14 AM
24 min is approximately the time it takes to read 23 GB and to write 83GB on the Mac Mini's anemic 5400 rpm HDD. Decompression itself is close to free.

I am not sure what an "HP NL40 Microserver" is. Maybe you meant N40L? Then it has a processor that is several (3?) times slower than a not-very-old Mac Mini, which should still be good enough for decompression, but could be slow for compression. So, on your specific hardware the SUSE assumption that "compression is good" happened to be wrong, but on more typical SUSE machines it could be correct.

Bob Gezelter

Jul 27, 2015, 9:08:43 AM
Simon,

Indeed.

First, an observation: This is the reactivation of a 22-year-old thread. The individual who re-awakened the thread would have done far better, IMHO, to have opened a new thread.

Compression is a tradeoff between processing power and some other resource. With compression, the content of the data has a large influence. Data containing many repeating strings (e.g., blanks, zeroes, or printable ASCII) can yield very substantial compression (e.g., I have seen very much more than two orders of magnitude). If the data is random, encrypted, or previously compressed, the gains are negligible despite significant processing.

With a high bandwidth link and low-compression yield, compression is indeed not a gain.
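The content effect is easy to demonstrate directly. This sketch (arbitrary test data, one megabyte of each kind) gzips a highly repetitive file and a random one and prints the resulting sizes:

```shell
#!/bin/sh
# Content drives the compression ratio: repetitive text shrinks
# enormously, random data barely at all. Test data is arbitrary.
yes "blank padded record" | head -c 1000000 > repeat.dat
head -c 1000000 /dev/urandom > random.dat
for f in repeat.dat random.dat; do
    gzip -c "$f" > "$f.gz"
    printf '%s: %s -> %s bytes\n' "$f" \
        "$(wc -c < "$f")" "$(wc -c < "$f.gz")"
done
```

The repetitive file typically compresses by well over two orders of magnitude; the random file actually grows slightly, since gzip adds header overhead to incompressible input.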

- Bob Gezelter, http://www.rlgsc.com

Hans Vlems

Jul 27, 2015, 10:02:55 AM
You might want to post your message again, starting a new thread. This way your post gets buried here.
Hans

Scott Dorsey

Jul 27, 2015, 10:35:37 AM
On 2015-07-26, jgre...@gmail.com <jgre...@gmail.com> wrote:
> Z Z was saax

Real MIPS would be related to the speed of instruction execution on a machine
and usually makes some assumptions like the code all fits in cache and the
pipeline is kept 75% full.

The thing is, a lot of manufacturers rated systems in "IBM MIPS", which was
basic CPU performance scaled to that of a 360/75. This rating bears no
resemblance to direct cycle time measurements.

Other people rated systems in terms of 11/780 equivalents, and called that
MIPS or sometimes VUPS, but used the terms interchangeably.

So, MIPS unfortunately means at least three different things, which means
when somebody uses the term you have to stop them and ask which of them
they are talking about. If they are salesmen they probably won't know.
--scott

--
"C'est un Nagra. C'est suisse, et tres, tres precis."

Paul Sture

Aug 5, 2015, 7:05:41 AM
On 2015-07-27, Bob Gezelter <geze...@rlgsc.com> wrote:
> On Monday, July 27, 2015 at 5:04:45 AM UTC-4, Paul Sture wrote:

>> Gzipping the uncompressed files takes ~50 minutes on my Mac mini (and 24
>> minutes to decompress), which isn't a trivial operation either, and I'm
>> simply not that desperate for the disk space saved in this instance
>> (83GB comes down to 23GB). On the bandwith front, I can certainly ship
>> the uncompressed files to another system faster than going through the
>> compression.
>>
>> Folks seem to get excited about compression at the file systems level
>> too, but when the majority of files involved with a typical office
>> file server are already compressed (.docx, .xlsx, jpg, mp3/4 etc)
>> trying to compress them further might just be a waste of time.
>>
>
> Compression is a tradeoff between processing power and some other
> resource. With compression, the content of the data has a large
> influence. Data containing many repeating strings (e.g., blanks,
> zeroes, or printable ASCII) can yield very substantial compressions
> (e.g., I have seen very much more than two orders of magnitude) over
> uncompressed data. If the data is random, encrypted, or previously
> compressed, the gains are negligible for significant processing.

Indeed, the example I quoted above was 83GB coming down to 23GB. Some
50GB of the original is a SimH instance with a full installation or two
of VMS plus the Freeware CDs. On another note, for backup purposes that
combination of data dedups extremely well :-)

> With a high bandwidth link and low-compression yield, compression is
> indeed not a gain.

It's been an interesting learning curve here. What I have is a server
with relatively low CPU power, but a disk system which is super fast
compared to the Mac. It's now running SmartOS rather than Suse and that
brings ZFS, zones, and the OS loaded into RAM.

ZFS: amongst a whole pile of other goodies, ZFS shares read requests
among mirror set members - cf. the VMS HBVS ability to
balance reads across shadow members.

zones: although what you are running looks like a virtual machine, and
has its own sandboxed environment like a virtual machine, its processes
are running on the host. Linux or Windows via KVM are also supported,
and for selected Linux distributions there's another flavour known as
LX, designed to replace KVM; it does this by intercepting Linux system
calls.

OS in RAM: Not just the OS but all installed software. Combined with
ZFS atop faster disks, getting to a Python prompt or compiling
small to medium sized* programs is faster than on my Mac.

* being deliberately vague here; I've yet to compare the compile times
of large complex programs

--
If it jams - force it. If it breaks, it needed replacing anyway.

Paul Sture

Aug 5, 2015, 7:05:41 AM
On 2015-07-27, already...@yahoo.com <already...@yahoo.com> wrote:
> On Monday, July 27, 2015 at 12:04:45 PM UTC+3, Paul Sture wrote:
>>
>> Something else I've just discovered on this box is that there appears
>> to be a widespread assumption that "compression is good". The
>> specific example here is a bit of code for migrating virtual machines
>> from one host to another, which employs gzip.
>>
>> On the Microserver this migration seemed to be taking an excessive
>> amount of time so I timed it with and without gzip on a set of files
>> which total 83GB
>>
>> With gzip: 2.5 hours
>> Without gzip: 14 minutes
>>
>> Gzipping the uncompressed files takes ~50 minutes on my Mac mini (and 24
>> minutes to decompress), which isn't a trivial operation either,
>
> 24 min is approximately a time that it takes to read 23 GB and to
> write 83GB on Mac Mini's anemic 5400 rpm HDD. Uncompression itself
> close to free.

Good point. Here are the detailed figures:

Mac mini, using a USB3 external drive with lots of free space:

$ time gzip *

real 52m8.677s
user 47m23.876s
sys 1m33.583s

and to uncompress:

$ time gunzip *.gz

real 24m7.784s
user 4m43.800s
sys 1m8.167s

Not totally free, but close enough.

A problem with gzip/gunzip is that they do their stuff in situ. With
VMS Backup or tar you have the option of doing operations across
spindles to minimise disk head thrashing.

>
> I am not sure what is "HP NL40 Microserver". May be, you meant N40L?

Correct, I meant N40L.

> Then its has a processor that is several (3?) times slower than
> non-very-old Mac Mini, which should be still good enough for
> decompression, but could be slow for compression. So, on your specific
> hardware SUSE assumption about "compression is good" happened to be
> wrong, but on more typical SUSE machines it could be correct.

Indeed. A more typical usage of a low end server such as the N40L will
be where folks are generating compressed files on (faster) desktop
systems and merely storing the results on the server.

FWIW there's another possibility here and that's to split up files to be
compressed and run n jobs where n = the number of cores you have
available. In the case of a single large file, gzip files can be
concatenated (see the man entries for gzip and gunzip), so it could be
split, the components compressed in separate jobs so that they get their
own core, then the results concatenated.
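A minimal sketch of that split-compress-concatenate approach (file names and the 100MB piece size are examples, not from the post):

```shell
#!/bin/sh
# Sketch: compress one large file in parallel by splitting it,
# gzipping the pieces as background jobs (roughly one per core), and
# concatenating the results. gunzip accepts concatenated gzip
# streams, so bigfile.gz decompresses back to the original file.
split -b 100M bigfile part.
for p in part.*; do
    gzip "$p" &         # one background compression job per piece
done
wait                    # let every job finish before concatenating
cat part.*.gz > bigfile.gz
rm part.*.gz
```

This relies on the glob sorting the pieces back into their original order, which split's lexical suffixes (part.aa, part.ab, ...) guarantee.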

terry+go...@tmk.com

Aug 5, 2015, 9:00:28 AM
On Wednesday, August 5, 2015 at 7:05:41 AM UTC-4, Paul Sture wrote:
> FWIW there's another possibility here and that's to split up files to be
> compressed and run n jobs where n = the number of cores you have
> available. In the case of a single large file, gzip files can be
> concatenated (see the man entries for gzip and gunzip), so it could be
> split, the components compressed in separate jobs so that they get their
> own core, then the results concatenated.

If you're not tied to the [PK]ZIP format, pbzip2 is a multithreaded implementation of the bzip2 compression format: http://compression.ca/pbzip2

Paul Sture

Aug 5, 2015, 11:39:10 AM
Thanks, I'll have a look.

I'm unsure for the original task but bzip2 is a supported alternative
for other stuff in this area.

glen herrmannsfeldt

Aug 5, 2015, 3:49:37 PM
Paul Sture <nos...@sture.ch> wrote:
> On 2015-07-27, already...@yahoo.com <already...@yahoo.com> wrote:

(snip on disk time vs. CPU compression time)

> Not totally free, but close enough.

> A problem with gzip/gunzip is that they do their stuff in situ. With
> VMS Backup or tar you have the option of doing operations across
> spindles to minimise disk head thrashing.

gzip < file > /somewhere/else/file.gz

depending on your shell, use a shell loop over files.

foreach file ( * )
gzip < $file > /somewhere/else/$file.gz
end

-- glen

terry+go...@tmk.com

Aug 5, 2015, 8:09:42 PM
On Wednesday, August 5, 2015 at 11:39:10 AM UTC-4, Paul Sture wrote:
> > If you're not tied to the [PK]ZIP format, pbzip2 is a multithreaded
> > implementation of the bzip2 compression format:
> > http://compression.ca/pbzip2
>
> Thanks, I'll have a look.
>
> I'm unsure for the original task but bzip2 is a supported alternative
> for other stuff in this area.

I don't think it has a VMS port, but that could probably be handled. What is more important is how many cores are available, which is presumably a lesser issue with newer Itaniums (Itanii?) and will not be an issue on x86. I'd also check to see how saturated the disk I/O subsystem gets from a single-threaded bzip2 - if you don't have excess disk performance, parallelizing the compression phase will be less of a win.

I regularly use pbzip2 for files > 250GB, but I have 24 fast x86-64 cores to throw at it and 4GB/sec disk I/O available.

Hans Vlems

Aug 6, 2015, 2:15:18 AM
I think the plural of Itanium is Itaniacs :-)

Gianluca Bonetti

Aug 6, 2015, 3:40:59 AM
The funniest thing is that somebody has resurrected a thread from 1992. :D

Paul Sture

Aug 6, 2015, 5:13:38 AM
On 2015-08-05, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
> Paul Sture <nos...@sture.ch> wrote:
>> On 2015-07-27, already...@yahoo.com <already...@yahoo.com> wrote:
>
> (snip on disk time vs. CPU compression time)
>
>> Not totally free, but close enough.
>
>> A problem with gzip/gunzip is that they do their stuff in situ. With
>> VMS Backup or tar you have the option of doing operations across
>> spindles to minimise disk head thrashing.
>
> gzip < file > /somewhere/else/file.gz

That works a treat and leaves the original in place. I was half
expecting to need the --keep switch to avoid deleting the original
(this on OS X - the behaviour might differ elsewhere).

> depending on your shell, use a shell loop over files.
>
> foreach file ( * )
> gzip < $file > /somewhere/else/$file.gz
> end

The syntax I got working in bash on OS X was this:

for file in *
do
    gzip < "$file" > /somewhere/else/"$file".gz
done
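Since the thread is about using multiple cores, the same per-file approach can also be parallelised by backgrounding each gzip and waiting for the batch. A self-contained sketch (the temp directories here are stand-ins for two separate disks, not anyone's actual paths):

```shell
set -e
src=$(mktemp -d); dst=$(mktemp -d)   # stand-ins for source and target disks
for i in 1 2 3 4; do head -c 50000 /dev/zero > "$src/file$i"; done

# One gzip per file, run in the background so several cores compress
# at once; the originals stay in place, the .gz copies land on $dst.
for f in "$src"/*; do
    gzip < "$f" > "$dst/$(basename "$f").gz" &
done
wait    # block until every background gzip has finished

ls "$dst"
```

With more files than cores you'd want to cap the number of concurrent jobs (e.g. with `xargs -P`), but for a handful of files the plain background-and-wait form is enough.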

Paul Sture

unread,
Aug 6, 2015, 5:57:56 AM8/6/15
to
On 2015-08-06, terry+go...@tmk.com <terry+go...@tmk.com> wrote:
> On Wednesday, August 5, 2015 at 11:39:10 AM UTC-4, Paul Sture wrote:
>> > If you're not tied to the [PK]ZIP format, pbzip2 is a multithreaded
>> > implementation of the bzip2 compression format:
>> > http://compression.ca/pbzip2
>>
>> Thanks, I'll have a look.
>>
>> I'm unsure for the original task but bzip2 is a supported alternative
>> for other stuff in this area.
>
> I don't think it has a VMS port, but that could probably be handled.
> What is more important is how many cores are available, which is
> presumably a lesser issue with newer Itaniums (Itanii?) and will not
> be an issue on x86.

On my first test of pbzip2 on a Mac, it reported that it was using 4
cores as a default. Activity Monitor showed it going straight to 297%
CPU usage. That settled down to around 200% once it got going, which
is probably some combination of disk speed and other stuff running
(a Mac running OS X has quite a lot going on in the background).

> I'd also check to see how saturated the disk I/O subsystem gets from a
> single-threaded bzip2 - if you don't have excess disk performance,
> parallelizing the compression phase will be less of a win.

Possibly why CPU usage dropped to 200%.

Glen's suggestion of using redirection works nicely. I haven't timed it
yet, but for the above test I redirected output to another disk and
was rewarded by a pleasant lack of disk rattling.

> I regularly use pbzip2 for files > 250GB, but I have 24 fast x86-64
> cores to throw at it and 4GB/sec disk I/O available.

Want one. ;-)

Paul Sture

unread,
Aug 6, 2015, 5:59:47 AM8/6/15
to
On 2015-08-06, Hans Vlems <hvl...@freenet.de> wrote:

> I think the plural of Itanium is Itaniacs :-)

What's the collective noun?

I'll suggest a (marketing) flop of Itanics.

George Cornelius

unread,
Aug 15, 2015, 4:07:44 AM8/15/15
to
In article <epnb9c-...@news.chingola.ch>, Paul Sture <nos...@sture.ch> writes:
> On 2015-08-06, Hans Vlems <hvl...@freenet.de> wrote:
>
>> I think the plural of Itanium is Itaniacs :-)
>
> What's the collective noun?
>
> I'll suggest a (marketing) flop of Itanics.

Megaflops.

Paul Sture

unread,
Aug 15, 2015, 4:59:28 AM8/15/15
to
On 2015-08-15, George Cornelius <corn...@eisner.decus.org> wrote:
> In article <epnb9c-...@news.chingola.ch>,
> Paul Sture <nos...@sture.ch> writes:
>> On 2015-08-06, Hans Vlems <hvl...@freenet.de> wrote:
>>
>>> I think the plural of Itanium is Itaniacs :-)
>>
>> What's the collective noun?
>>
>> I'll suggest a (marketing) flop of Itanics.
>
> Megaflops.

-:)

--
"I have spent most of the day putting in a comma and
the rest of the day taking it out." -- Oscar Wilde

George Cornelius

unread,
Aug 15, 2015, 12:03:18 PM8/15/15
to
In article <d4a3ac-...@news.chingola.ch>, Paul Sture <nos...@sture.ch> writes:
> On 2015-08-15, George Cornelius <corn...@eisner.decus.org> wrote:
>> In article <epnb9c-...@news.chingola.ch>,
>> Paul Sture <nos...@sture.ch> writes:
>>> On 2015-08-06, Hans Vlems <hvl...@freenet.de> wrote:
>>>
>>>> I think the plural of Itanium is Itaniacs :-)
>>>
>>> What's the collective noun?
>>>
>>> I'll suggest a (marketing) flop of Itanics.
>>
>> Megaflops.
>
> -:)


I suppose by collective noun you meant "collection of".

Like

a MIP of mainframes
a VUP of VAXen
and
an MFLOP of Itanics

The M works in two ways but Marketing is the one I'll
go with here so as to not disturb any feathers that
might later need unruffling.

Paul Sture

unread,
Aug 15, 2015, 5:12:31 PM8/15/15
to
On 2015-08-15, George Cornelius <corn...@eisner.decus.org> wrote:
> In article <d4a3ac-...@news.chingola.ch>, Paul Sture <nos...@sture.ch> writes:
>> On 2015-08-15, George Cornelius <corn...@eisner.decus.org> wrote:
>>> In article <epnb9c-...@news.chingola.ch>,
>>> Paul Sture <nos...@sture.ch> writes:
>>>> On 2015-08-06, Hans Vlems <hvl...@freenet.de> wrote:
>>>>
>>>>> I think the plural of Itanium is Itaniacs :-)
>>>>
>>>> What's the collective noun?
>>>>
>>>> I'll suggest a (marketing) flop of Itanics.
>>>
>>> Megaflops.
>>
>> -:)
>
>
> I suppose by collective noun you meant "collection of".
>
> Like
>
> a MIP of mainframes
> a VUP of VAXen
> and
> an MFLOP of Itanics
>
> The M works in two ways but Marketing is the one I'll
> go with here so as to not disturb any feathers that
> might later need unruffling.

I meant "collective" noun as in

<https://en.wikipedia.org/wiki/Collective_noun>

"In linguistics, a collective noun is a word which refers to a
collection of things taken as a whole."

Less mundane examples being "a murder of crows" or a "parliament of
owls", and for Datatrieve fans "a peregrine of Wombats" :-)

<https://en.wikipedia.org/wiki/List_of_animal_names>

A "surfeit of skunks" sounds pretty appropriate...

Dead Skunk ♪ ♪ ♪
<https://www.youtube.com/watch?v=UejelYnVI3U>


--
"Trust me I'm a cloud provider."

Chris Quayle

unread,
Aug 22, 2015, 11:16:17 AM8/22/15
to
Paul,

I don't really understand why you want to use a compressed file system /
compress files. It might have had some validity in the early days when
disk space was expensive, but on the early processors it was always a
performance hog, as those processors were usually very lame in terms of
compute performance. Now disk space is dirt cheap so there's really no
justification. Even laptops seem to have 500-1000 GB these days, though
you would have to be a fool to fill that without a solid data recovery
process.

The other point is about disk & I/O bandwidth. IDE disks are notoriously
slow, especially the 5400 rpm types. They often have none of the
features, such as tagged command queueing, that SAS, SCSI and probably
SATA have these days. Those little N40 mini servers from HP are really
neat, but one would imagine that they are targeted at low bandwidth
apps. You don't need a lot of CPU power for even a moderately loaded
NFS or SMB server, but you do need good I/O bandwidth to get the
performance.

The first thing I would do would be to put in a SAS or SATA RAID board
(there is a spare I/O slot, yes?) and throw away the IDE drives. May be
un-PC to say so, but even 2.5" SAS drives, which have great seek times,
are quite cheap on the usual site, even new, if you have patience.

Glad to see someone else is on SuSE. Still the best and most robust
distro, but I'm still on 11.4 to avoid systemd. Not quite ready for that yet :-)...