Truth or Dare?

Rich Thomson

Mar 27, 1991, 7:41:04 PM
Several times on comp.graphics the SGI folks have offered up the
famous "1 million polygons/second" quotation from marketing literature
when referring to peak performance on the VGX series of machines.

The polygons in question are 50 pixel triangle strips that are flat
shaded. Also, they may have rendered them without the Z buffer turned
on (the spec sheet is away from me at the moment).

What I would like to know is: does anybody have a program, no matter
how contorted, that actually achieves this number? As far as I can
tell the number is derived from theoretical peak performance of the
VGX pipeline, or some portion of it.

This highlights a sore point in graphics benchmarking in general,
which is getting anybody to agree on the same thing to be measured.
Hopefully the PLB and GPC efforts of NCGA will alleviate this problem,
or at least translate it into ``who has the most optimized GPC
program''.

When I asked the local SGI rep for more information about these
numbers, he sent me a data sheet that described the particular
polygons in question (revealing that they were as described above),
but no working code.

So: is this a "theoretical" number, or is it attainable by a program,
no matter how contorted?

-- Rich
Rich Thomson tho...@cs.utah.edu {bellcore,hplabs,uunet}!utah-cs!thomson
``Read my MIPs -- no new VAXes!!'' --George Bush after sniffing freon

Bruce R. Holloway

Mar 28, 1991, 7:56:30 PM
In article <1991Mar27.1...@hellgate.utah.edu> tho...@cs.utah.edu (Rich Thomson) writes:
>Several times on comp.graphics the SGI folks have offered up the
>famous "1 million polygons/second" quotation from marketing literature
>when referring to peak performance on the VGX series of machines.
>
>The polygons in question are 50 pixel triangle strips that are flat
>shaded. Also, they may have rendered them without the Z buffer turned
>on (the spec sheet is away from me at the moment).
>...

>So: is this a "theoretical" number, or is it attainable by a program,
>no matter how contorted?

We have a suite of performance measurement programs which allow lots of
primitives with different characteristics to be tried. I ran a certain
case and got the following output:

* Mesh, DisplayList, Subpixel, RGBmode, Flat, Area = 50.000000, PYM_FILL, MeshesPerSecond = 886681

My machine is only a 5-span, which means I have just one raster memory board.
Apparently the quoted performance numbers are for a 10-span system. With
slightly smaller triangles I obtained:

* Mesh, DisplayList, Subpixel, RGBmode, Flat, Area = 40.500000, PYM_FILL, MeshesPerSecond = 1022706

The inner loops looked like this:

makeobj(*name=genobj());

for(jdx=0; jdx<OBJSZ; jdx++) {
    fp=(float *)rectbuf;
    bgntmesh();
    v3f(fp); v3f(fp+4);
    for (fp+=8, i=LoopCount; i>0; --i,fp+=40) {
        v3f(fp); v3f(fp+4);
        v3f(fp+8); v3f(fp+12);
        v3f(fp+16); v3f(fp+20);
        v3f(fp+24); v3f(fp+28);
        v3f(fp+32); v3f(fp+36);
    }
    endtmesh();
}

closeobj();

grestartwatch();
for (i=events/(objects*OBJSZ); i>0; i--)
    callobj(Obj);
sec = gstopwatch();
return((int) events/sec);

My cursory inspection of the program revealed zbuffer(TRUE) &
zfunction(ZF_ALWAYS). I'd send something simple & complete if I had time,
but this is the best I can do right now.

Regards, bruceh

Jorge Lach

Apr 1, 1991, 10:33:33 AM
In article <1991Mar29.0...@odin.corp.sgi.com>, bru...@sgi.com (Bruce R. Holloway) writes:
|> In article <1991Mar27.1...@hellgate.utah.edu> tho...@cs.utah.edu (Rich Thomson) writes:
|> >Several times on comp.graphics the SGI folks have offered up the
|> >famous "1 million polygons/second" quotation from marketing literature
|> >when referring to peak performance on the VGX series of machines.
|> >
|> >The polygons in question are 50 pixel triangle strips that are flat
|> >shaded. Also, they may have rendered them without the Z buffer turned
|> >on (the spec sheet is away from me at the moment).
|> >...
|> >So: is this a "theoretical" number, or is it attainable by a program,
|> >no matter how contorted?
|>
|> We have a suite of performance measurement programs which allow lots of
|> primitives with different characteristics to be tried. I ran a certain
|> case and got the following output:
|>
|> * Mesh, DisplayList, Subpixel, RGBmode, Flat, Area = 50.000000, PYM_FILL, MeshesPerSecond = 886681
|> ...

|> * Mesh, DisplayList, Subpixel, RGBmode, Flat, Area = 40.500000, PYM_FILL, MeshesPerSecond = 1022706
|>


And what would the performance be in shading modes other than flat?
Also, what kind of light modelling is being applied to the polygons
in question?

------------------

jo...@dg.dg.com Technical Systems Division
Jorge Lach Data General Corp., Westboro, Massachusetts

"I speak only for myself, not for my company; in fact, my
company does not speak, and it is not really mine..."

Kurt Akeley

Apr 1, 1991, 10:49:02 AM
In article <1991Mar27.1...@hellgate.utah.edu> tho...@cs.utah.edu (Rich Thomson) writes:
|>Several times on comp.graphics the SGI folks have offered up the
|>famous "1 million polygons/second" quotation from marketing literature
|>when referring to peak performance on the VGX series of machines.
|>
|>The polygons in question are 50 pixel triangle strips that are flat
|>shaded. Also, they may have rendered them without the Z buffer turned
|>on (the spec sheet is away from me at the moment).
|>...
|>So: is this a "theoretical" number, or is it attainable by a program,
|>no matter how contorted?

We take our graphics performance claims very seriously here at Silicon
Graphics. They represent performances that are achievable with carefully
tuned programs that use ONLY commands that are available in the
Graphics Library. We distinguish between two different classes of
performance: primitive performance and fill performance. Primitive
performance is the rate that points, lines, polygons, and characters
can be drawn per second, assuming no limitation from pixel manipulation,
measured with different modes activated (such as lighting). Fill
performance is the rate that pixels are filled, assuming an infinite supply
of graphics primitives, again with different modes active (zbuffer,
blending, etc). Primitive performances are always specified conservatively:
we always claim a lower number than can actually be achieved. Fill
performances are usually specified as a "not-to-be-exceeded" maximum
that is approached asymptotically as larger and larger primitives are
drawn. Due to slight overheads of memory and display refresh,
achievable fill rates are sometimes as much as 10 percent lower
than the claimed rates.
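
One rough way to read the two classes of performance together is to
treat a run as limited by whichever is slower, primitive processing or
pixel fill. The sketch below only illustrates that reading. It is not
SGI's measurement method, and the function name and the sample figures
in the note after it are assumptions.

```c
#include <assert.h>

/* Hypothetical helper: estimate the sustained triangle rate as the
 * smaller of the primitive limit and the fill limit.
 *   prim_rate - triangles/second when pixel fill is not a limit
 *   fill_rate - pixels/second when primitives are not a limit
 *   pixels    - average number of pixels covered by one triangle
 */
double estimated_triangle_rate(double prim_rate, double fill_rate,
                               double pixels)
{
    double fill_limited;

    assert(prim_rate > 0.0 && fill_rate > 0.0 && pixels > 0.0);
    fill_limited = fill_rate / pixels;   /* triangles/second the fill can feed */
    return (fill_limited < prim_rate) ? fill_limited : prim_rate;
}
```

With made-up figures of 1,000,000 triangles/second and 40,000,000
pixels/second, 50-pixel triangles come out fill-limited at 800,000 per
second, while 25-pixel triangles reach the primitive limit.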

I have included the source code to the program that I use to verify
the performance of triangle meshes. I ran this program on my 5-span
VGX with the following results:

size=8, offset=4, zbuffer(1), events=500000, lighting=1
running on cashew, GL4DVGX-4.0, Fri Mar 29 15:22:58 1991
Triangle mesh performance (lighted):
   1 triangles per mesh:  189393 triangles per second
   2 triangles per mesh:  304878 triangles per second
   3 triangles per mesh:  299400 triangles per second
   4 triangles per mesh:  387596 triangles per second
   5 triangles per mesh:  471698 triangles per second
  10 triangles per mesh:  574712 triangles per second
  20 triangles per mesh:  641025 triangles per second
  30 triangles per mesh:  675648 triangles per second
  62 triangles per mesh:  714240 triangles per second
Display listed triangle mesh (lighted):
  62 triangles per mesh:  769181 triangles per second
Display listed triangle mesh (colored):
  62 triangles per mesh: 1020342 triangles per second
Quadrilateral strip performance (lighted):
  31 quads per mesh:  342465 quads per second
Independent triangle performance (lighted):
   1 triangles per mesh:  192307 triangles per second
Triangle mesh performance (flat shaded):
  10 triangles per mesh:  943396 triangles per second
  20 triangles per mesh:  925925 triangles per second
  30 triangles per mesh: 1063787 triangles per second
  62 triangles per mesh: 1086886 triangles per second
Quadrilateral strip performance (flat shaded):
  31 quads per mesh:  420167 quads per second
Independent triangle performance (flat shaded):
   1 triangles per mesh:  280898 triangles per second

Note that performances of well over 1 million triangles per second are
achieved for long meshes of single- and multi-colored triangles, with
the zbuffer enabled. When lighting and smooth shading are enabled, the
performance drops to roughly 3/4 of a million triangles per second.
(Here's where the marketing error crept in - the correct claim is
1 million connected triangles per second, multi-colored, flat shaded,
zbuffered, projected, subpixel positioned. It's hard to keep all these
modes straight!) Note also that I had to limit the triangles to about
30 pixels each, as my machine is a 5-span VGX. A 10-span VGX has double
the fill rate, and is able to achieve these performances with 50-pixel
triangles.
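
The triangle sizes mentioned here can be checked against the vertex
layout used by the program. This is a minimal sketch that assumes only
the x and y formulas from the program's "initialize data arrays" loop.
The helper names are hypothetical and are not part of the posted
source.

```c
#include <assert.h>

/* x and y of vertex i, following the formulas in the posted program */
static double vert_x(int i, int size, int offset)
{
    return 10.0 + (double)(size * (i >> 1)) + (double)(offset * (i & 1));
}

static double vert_y(int i, int size)
{
    return 10.0 + (double)(size * (i & 1));
}

/* Area in pixels of the strip triangle formed by vertices k, k+1, k+2,
 * computed with the shoelace (cross product) formula. */
double strip_triangle_area(int k, int size, int offset)
{
    double x0, y0, x1, y1, x2, y2, cross;

    assert(k >= 0 && size > 0);
    x0 = vert_x(k, size, offset);     y0 = vert_y(k, size);
    x1 = vert_x(k + 1, size, offset); y1 = vert_y(k + 1, size);
    x2 = vert_x(k + 2, size, offset); y2 = vert_y(k + 2, size);
    cross = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0);
    return (cross < 0.0 ? -cross : cross) / 2.0;
}
```

For the run reported above (size=8, offset=4) this gives 32 pixels per
triangle, consistent with "about 30". The program's defaults (size=10,
offset=5) give 50.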

I will be happy to supply other performance testing routines by email
on request.

-- kurt

----------------------------- cut here ----------------------------------

/**************************************************************************
Copyright 1991 by Silicon Graphics Incorporated, Mountain View, California.

All Rights Reserved

Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee is hereby granted,
provided that the above copyright notice appear in all copies and that
both that copyright notice and this permission notice appear in
supporting documentation, and that the name of Silicon Graphics not be
used in advertising or publicity pertaining to distribution of the
software without specific, written prior permission.

SILICON GRAPHICS DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO
EVENT SHALL SILICON GRAPHICS BE LIABLE FOR ANY SPECIAL, INDIRECT OR
CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF
USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR
OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
PERFORMANCE OF THIS SOFTWARE.

**************************************************************************/

/*
* Kurt Akeley
* March 1991
*
* Test performance of triangle meshes (strips)
*/

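#include <stdio.h>
#include <stdlib.h>	/* malloc, atoi, exit */
#include <time.h>	/* time, ctime */
#include <unistd.h>	/* sleep, gethostname */
#include <gl.h>
#include <device.h>
#include <sys/types.h>
#include <sys/times.h>
#include <sys/param.h>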

#define OUTFILE stdout
#define MAXVERTEX 102
#define VERTSIZE 8

#define LIGHTVERT(i) n3f(fp+(VERTSIZE*(i))); v3f(fp+(VERTSIZE*(i))+4)
#define COLORVERT(i) c3f(fp+(VERTSIZE*(i))); v3f(fp+(VERTSIZE*(i))+4)
#define FLATVERT(i) v3f(fp+(VERTSIZE*(i))+4)

float *meshbuf;
int dolighting = TRUE;

long events;

main(argc,argv) char *argv[]; {
register i,j,k;
int size, offset;
int doz;

/* allocate a quad-aligned buffer */
meshbuf = (float*)malloc(sizeof(float)*(MAXVERTEX*VERTSIZE + 3));
meshbuf = (float*)(((long)(meshbuf+3)) & 0xfffffff0);

/* evaluate command line arguments */
size = 10;
offset = 5;
doz = TRUE;
events = 100000;
dolighting = TRUE;
if (argc > 1)
size = atoi(argv[1]);
if (argc > 2)
offset = atoi(argv[2]);
if (argc > 3)
doz = atoi(argv[3]);
if (argc > 4)
events = atoi(argv[4]);
if (argc > 5)
dolighting = atoi(argv[5]);
fprintf(OUTFILE,"size=%d, offset=%d, zbuffer(%d), events=%d, lighting=%d\n",
size, offset, doz, events, dolighting);
fflush(OUTFILE);

/* initialize graphics */
prefposition(10,32*size+30,10,size+30);
foreground();
winopen("meshspeed");
RGBmode();
overlay(2);
gconfig();
zbuffer(doz);
zfunction(ZF_ALWAYS);
subpixel(TRUE);
qdevice(ESCKEY);
shademodel(GOURAUD);
timestamp();

/* initialize lighting */
initlight();

/* clear the screen */
cpack(0);
clear();
drawmode(OVERDRAW);
color(0);
clear();
drawmode(NORMALDRAW);

/* initialize data arrays */
for (i=0; i<MAXVERTEX; i+=1) {
meshbuf[VERTSIZE*i+0] = (i&1) ? 0.0 : 1.0;
meshbuf[VERTSIZE*i+1] = 0.0;
meshbuf[VERTSIZE*i+2] = (i&1) ? 1.0 : 0.0;
meshbuf[VERTSIZE*i+3] = 0;
meshbuf[VERTSIZE*i+4] = 10.0 + (float)(size*(i>>1)) +
(float)(offset*(i&1));
meshbuf[VERTSIZE*i+5] = 10.0 + (float)(size*(i&1));
meshbuf[VERTSIZE*i+6] = 0.0;
meshbuf[VERTSIZE*i+7] = 0;
}

/* run the tests */
fprintf(OUTFILE,"Triangle mesh performance (lighted):\n");
meshlight1();
meshlight2();
meshlight3();
meshlight4();
meshlight5();
meshlight10();
meshlight20();
meshlight30();
meshlight62();
fprintf(OUTFILE,"Display listed triangle mesh (lighted):\n");
dlmeshlight62();
fprintf(OUTFILE,"Display listed triangle mesh (colored):\n");
dlmeshcolor62();
fprintf(OUTFILE,"Quadrilateral strip performance (lighted):\n");
quadlight31();
fprintf(OUTFILE,"Independent triangle performance (lighted):\n");
indlight();
fprintf(OUTFILE,"Triangle mesh performance (flat shaded):\n");
meshflat10();
meshflat20();
meshflat30();
meshflat62();
fprintf(OUTFILE,"Quadrilateral strip performance (flat shaded):\n");
quadflat31();
fprintf(OUTFILE,"Independent triangle performance (flat shaded):\n");
indflat();
}


meshlight1() {
register i;
register float *fp;
czclear(0,0);
light(TRUE);
startclock();
for (i=events; i>0; i--) {
fp = meshbuf;
bgntmesh();
LIGHTVERT(0);
LIGHTVERT(1);
LIGHTVERT(2);
endtmesh();
}
stopclock(events,1,"triangles");
light(FALSE);
}

meshlight2() {
register i;
register float *fp;
czclear(0,0);
light(TRUE);
startclock();
for (i=events/2; i>0; i--) {
fp = meshbuf;
bgntmesh();
LIGHTVERT(0);
LIGHTVERT(1);
LIGHTVERT(2);
LIGHTVERT(3);
endtmesh();
}
stopclock(events/2,2,"triangles");
light(FALSE);
}

meshlight3() {
register i;
register float *fp;
czclear(0,0);
light(TRUE);
startclock();
for (i=events/3; i>0; i--) {
fp = meshbuf;
bgntmesh();
LIGHTVERT(0);
LIGHTVERT(1);
LIGHTVERT(2);
LIGHTVERT(3);
LIGHTVERT(4);
endtmesh();
}
stopclock(events/3,3,"triangles");
light(FALSE);
}

meshlight4() {
register i;
register float *fp;
czclear(0,0);
light(TRUE);
startclock();
for (i=events/4; i>0; i--) {
fp = meshbuf;
bgntmesh();
LIGHTVERT(0);
LIGHTVERT(1);
LIGHTVERT(2);
LIGHTVERT(3);
LIGHTVERT(4);
LIGHTVERT(5);
endtmesh();
}
stopclock(events/4,4,"triangles");
light(FALSE);
}

meshlight5() {
register i;
register float *fp;
czclear(0,0);
light(TRUE);
startclock();
for (i=events/5; i>0; i--) {
fp = meshbuf;
bgntmesh();
LIGHTVERT(0);
LIGHTVERT(1);
LIGHTVERT(2);
LIGHTVERT(3);
LIGHTVERT(4);
LIGHTVERT(5);
LIGHTVERT(6);
endtmesh();
}
stopclock(events/5,5,"triangles");
light(FALSE);
}

meshlight10() {
register i;
register float *fp;
czclear(0,0);
light(TRUE);
startclock();
for (i=events/10; i>0; i--) {
fp = meshbuf;
bgntmesh();
LIGHTVERT(0);
LIGHTVERT(1);
LIGHTVERT(2);
LIGHTVERT(3);
LIGHTVERT(4);
LIGHTVERT(5);
LIGHTVERT(6);
LIGHTVERT(7);
LIGHTVERT(8);
LIGHTVERT(9);
LIGHTVERT(10);
LIGHTVERT(11);
endtmesh();
}
stopclock(events/10,10,"triangles");
light(FALSE);
}

meshlight20() {
register i;
register float *fp;
czclear(0,0);
light(TRUE);
startclock();
for (i=events/20; i>0; i--) {
fp = meshbuf;
bgntmesh();
LIGHTVERT(0);
LIGHTVERT(1);
LIGHTVERT(2);
LIGHTVERT(3);
LIGHTVERT(4);
LIGHTVERT(5);
LIGHTVERT(6);
LIGHTVERT(7);
LIGHTVERT(8);
LIGHTVERT(9);
LIGHTVERT(10);
LIGHTVERT(11);
LIGHTVERT(12);
LIGHTVERT(13);
LIGHTVERT(14);
LIGHTVERT(15);
LIGHTVERT(16);
LIGHTVERT(17);
LIGHTVERT(18);
LIGHTVERT(19);
LIGHTVERT(20);
LIGHTVERT(21);
endtmesh();
}
stopclock(events/20,20,"triangles");
light(FALSE);
}

meshlight30() {
register i;
register float *fp;
czclear(0,0);
light(TRUE);
startclock();
for (i=events/30; i>0; i--) {
fp = meshbuf;
bgntmesh();
LIGHTVERT(0);
LIGHTVERT(1);
LIGHTVERT(2);
LIGHTVERT(3);
LIGHTVERT(4);
LIGHTVERT(5);
LIGHTVERT(6);
LIGHTVERT(7);
LIGHTVERT(8);
LIGHTVERT(9);
LIGHTVERT(10);
LIGHTVERT(11);
LIGHTVERT(12);
LIGHTVERT(13);
LIGHTVERT(14);
LIGHTVERT(15);
LIGHTVERT(16);
LIGHTVERT(17);
LIGHTVERT(18);
LIGHTVERT(19);
LIGHTVERT(20);
LIGHTVERT(21);
LIGHTVERT(22);
LIGHTVERT(23);
LIGHTVERT(24);
LIGHTVERT(25);
LIGHTVERT(26);
LIGHTVERT(27);
LIGHTVERT(28);
LIGHTVERT(29);
LIGHTVERT(30);
LIGHTVERT(31);
endtmesh();
}
stopclock(events/30,30,"triangles");
light(FALSE);
}

meshlight62() {
register i;
register float *fp;
czclear(0,0);
light(TRUE);
startclock();
for (i=events/62; i>0; i--) {
fp = meshbuf;
bgntmesh();
LIGHTVERT(0);
LIGHTVERT(1);
LIGHTVERT(2);
LIGHTVERT(3);
LIGHTVERT(4);
LIGHTVERT(5);
LIGHTVERT(6);
LIGHTVERT(7);
LIGHTVERT(8);
LIGHTVERT(9);
LIGHTVERT(10);
LIGHTVERT(11);
LIGHTVERT(12);
LIGHTVERT(13);
LIGHTVERT(14);
LIGHTVERT(15);
LIGHTVERT(16);
LIGHTVERT(17);
LIGHTVERT(18);
LIGHTVERT(19);
LIGHTVERT(20);
LIGHTVERT(21);
LIGHTVERT(22);
LIGHTVERT(23);
LIGHTVERT(24);
LIGHTVERT(25);
LIGHTVERT(26);
LIGHTVERT(27);
LIGHTVERT(28);
LIGHTVERT(29);
LIGHTVERT(30);
LIGHTVERT(31);
LIGHTVERT(32);
LIGHTVERT(33);
LIGHTVERT(34);
LIGHTVERT(35);
LIGHTVERT(36);
LIGHTVERT(37);
LIGHTVERT(38);
LIGHTVERT(39);
LIGHTVERT(40);
LIGHTVERT(41);
LIGHTVERT(42);
LIGHTVERT(43);
LIGHTVERT(44);
LIGHTVERT(45);
LIGHTVERT(46);
LIGHTVERT(47);
LIGHTVERT(48);
LIGHTVERT(49);
LIGHTVERT(50);
LIGHTVERT(51);
LIGHTVERT(52);
LIGHTVERT(53);
LIGHTVERT(54);
LIGHTVERT(55);
LIGHTVERT(56);
LIGHTVERT(57);
LIGHTVERT(58);
LIGHTVERT(59);
LIGHTVERT(60);
LIGHTVERT(61);
LIGHTVERT(62);
LIGHTVERT(63);
endtmesh();
}
stopclock(events/62,62,"triangles");
light(FALSE);
}

#define MESHPERDL 10

dlmeshlight62() {
register i,j;
register float *fp;
makeobj(1);
for (i=0; i<MESHPERDL; i++) {
fp = meshbuf;
bgntmesh();
for (j=0; j<64; j++) {
LIGHTVERT(j);
}
endtmesh();
}
closeobj();
czclear(0,0);
light(TRUE);
startclock();
for (i=events/(62*MESHPERDL); i>0; i--)
callobj(1);
stopclock(events/62,62,"triangles");
light(FALSE);
}

dlmeshcolor62() {
register i,j;
register float *fp;
makeobj(1);
for (i=0; i<MESHPERDL; i++) {
fp = meshbuf;
bgntmesh();
for (j=0; j<64; j++) {
COLORVERT(j);
}
endtmesh();
}
closeobj();
czclear(0,0);
shademodel(FLAT);
startclock();
for (i=events/(62*MESHPERDL); i>0; i--)
callobj(1);
stopclock(events/62,62,"triangles");
shademodel(GOURAUD);
}

quadlight31() {
register i;
register float *fp;
czclear(0,0);
light(TRUE);
startclock();
for (i=events/31; i>0; i--) {
fp = meshbuf;
bgnqstrip();
LIGHTVERT(0);
LIGHTVERT(1);
LIGHTVERT(2);
LIGHTVERT(3);
LIGHTVERT(4);
LIGHTVERT(5);
LIGHTVERT(6);
LIGHTVERT(7);
LIGHTVERT(8);
LIGHTVERT(9);
LIGHTVERT(10);
LIGHTVERT(11);
LIGHTVERT(12);
LIGHTVERT(13);
LIGHTVERT(14);
LIGHTVERT(15);
LIGHTVERT(16);
LIGHTVERT(17);
LIGHTVERT(18);
LIGHTVERT(19);
LIGHTVERT(20);
LIGHTVERT(21);
LIGHTVERT(22);
LIGHTVERT(23);
LIGHTVERT(24);
LIGHTVERT(25);
LIGHTVERT(26);
LIGHTVERT(27);
LIGHTVERT(28);
LIGHTVERT(29);
LIGHTVERT(30);
LIGHTVERT(31);
LIGHTVERT(32);
LIGHTVERT(33);
LIGHTVERT(34);
LIGHTVERT(35);
LIGHTVERT(36);
LIGHTVERT(37);
LIGHTVERT(38);
LIGHTVERT(39);
LIGHTVERT(40);
LIGHTVERT(41);
LIGHTVERT(42);
LIGHTVERT(43);
LIGHTVERT(44);
LIGHTVERT(45);
LIGHTVERT(46);
LIGHTVERT(47);
LIGHTVERT(48);
LIGHTVERT(49);
LIGHTVERT(50);
LIGHTVERT(51);
LIGHTVERT(52);
LIGHTVERT(53);
LIGHTVERT(54);
LIGHTVERT(55);
LIGHTVERT(56);
LIGHTVERT(57);
LIGHTVERT(58);
LIGHTVERT(59);
LIGHTVERT(60);
LIGHTVERT(61);
LIGHTVERT(62);
LIGHTVERT(63);
endqstrip();
}
stopclock(events/31,31,"quads");
light(FALSE);
}

indlight() {
register i;
register float *fp;
czclear(0,0);
light(TRUE);
startclock();
for (i=events/4; i>0; i--) {
fp = meshbuf;
bgnpolygon();
LIGHTVERT(0);
LIGHTVERT(1);
LIGHTVERT(2);
endpolygon();
bgnpolygon();
LIGHTVERT(0);
LIGHTVERT(1);
LIGHTVERT(2);
endpolygon();
bgnpolygon();
LIGHTVERT(0);
LIGHTVERT(1);
LIGHTVERT(2);
endpolygon();
bgnpolygon();
LIGHTVERT(0);
LIGHTVERT(1);
LIGHTVERT(2);
endpolygon();
}
stopclock(events,1,"triangles");
light(FALSE);
}

meshflat10() {
register i;
register float *fp;
czclear(0,0);
cpack(0xffffffff);
shademodel(FLAT);
startclock();
for (i=events/10; i>0; i--) {
fp = meshbuf;
bgntmesh();
FLATVERT(0);
FLATVERT(1);
FLATVERT(2);
FLATVERT(3);
FLATVERT(4);
FLATVERT(5);
FLATVERT(6);
FLATVERT(7);
FLATVERT(8);
FLATVERT(9);
FLATVERT(10);
FLATVERT(11);
endtmesh();
}
stopclock(events/10,10,"triangles");
shademodel(GOURAUD);
}

meshflat20() {
register i;
register float *fp;
czclear(0,0);
cpack(0xffffffff);
shademodel(FLAT);
startclock();
for (i=events/20; i>0; i--) {
fp = meshbuf;
bgntmesh();
FLATVERT(0);
FLATVERT(1);
FLATVERT(2);
FLATVERT(3);
FLATVERT(4);
FLATVERT(5);
FLATVERT(6);
FLATVERT(7);
FLATVERT(8);
FLATVERT(9);
FLATVERT(10);
FLATVERT(11);
FLATVERT(12);
FLATVERT(13);
FLATVERT(14);
FLATVERT(15);
FLATVERT(16);
FLATVERT(17);
FLATVERT(18);
FLATVERT(19);
FLATVERT(20);
FLATVERT(21);
endtmesh();
}
stopclock(events/20,20,"triangles");
shademodel(GOURAUD);
}

meshflat30() {
register i;
register float *fp;
czclear(0,0);
cpack(0xffffffff);
shademodel(FLAT);
startclock();
for (i=events/30; i>0; i--) {
fp = meshbuf;
bgntmesh();
FLATVERT(0);
FLATVERT(1);
FLATVERT(2);
FLATVERT(3);
FLATVERT(4);
FLATVERT(5);
FLATVERT(6);
FLATVERT(7);
FLATVERT(8);
FLATVERT(9);
FLATVERT(10);
FLATVERT(11);
FLATVERT(12);
FLATVERT(13);
FLATVERT(14);
FLATVERT(15);
FLATVERT(16);
FLATVERT(17);
FLATVERT(18);
FLATVERT(19);
FLATVERT(20);
FLATVERT(21);
FLATVERT(22);
FLATVERT(23);
FLATVERT(24);
FLATVERT(25);
FLATVERT(26);
FLATVERT(27);
FLATVERT(28);
FLATVERT(29);
FLATVERT(30);
FLATVERT(31);
endtmesh();
}
stopclock(events/30,30,"triangles");
shademodel(GOURAUD);
}

meshflat62() {
register i;
register float *fp;
czclear(0,0);
cpack(0xffffffff);
shademodel(FLAT);
startclock();
for (i=events/62; i>0; i--) {
fp = meshbuf;
bgntmesh();
FLATVERT(0);
FLATVERT(1);
FLATVERT(2);
FLATVERT(3);
FLATVERT(4);
FLATVERT(5);
FLATVERT(6);
FLATVERT(7);
FLATVERT(8);
FLATVERT(9);
FLATVERT(10);
FLATVERT(11);
FLATVERT(12);
FLATVERT(13);
FLATVERT(14);
FLATVERT(15);
FLATVERT(16);
FLATVERT(17);
FLATVERT(18);
FLATVERT(19);
FLATVERT(20);
FLATVERT(21);
FLATVERT(22);
FLATVERT(23);
FLATVERT(24);
FLATVERT(25);
FLATVERT(26);
FLATVERT(27);
FLATVERT(28);
FLATVERT(29);
FLATVERT(30);
FLATVERT(31);
FLATVERT(32);
FLATVERT(33);
FLATVERT(34);
FLATVERT(35);
FLATVERT(36);
FLATVERT(37);
FLATVERT(38);
FLATVERT(39);
FLATVERT(40);
FLATVERT(41);
FLATVERT(42);
FLATVERT(43);
FLATVERT(44);
FLATVERT(45);
FLATVERT(46);
FLATVERT(47);
FLATVERT(48);
FLATVERT(49);
FLATVERT(50);
FLATVERT(51);
FLATVERT(52);
FLATVERT(53);
FLATVERT(54);
FLATVERT(55);
FLATVERT(56);
FLATVERT(57);
FLATVERT(58);
FLATVERT(59);
FLATVERT(60);
FLATVERT(61);
FLATVERT(62);
FLATVERT(63);
endtmesh();
}
stopclock(events/62,62,"triangles");
shademodel(GOURAUD);
}

quadflat31() {
register i;
register float *fp;
czclear(0,0);
cpack(0xffffffff);
shademodel(FLAT);
startclock();
for (i=events/31; i>0; i--) {
fp = meshbuf;
bgnqstrip();
FLATVERT(0);
FLATVERT(1);
FLATVERT(2);
FLATVERT(3);
FLATVERT(4);
FLATVERT(5);
FLATVERT(6);
FLATVERT(7);
FLATVERT(8);
FLATVERT(9);
FLATVERT(10);
FLATVERT(11);
FLATVERT(12);
FLATVERT(13);
FLATVERT(14);
FLATVERT(15);
FLATVERT(16);
FLATVERT(17);
FLATVERT(18);
FLATVERT(19);
FLATVERT(20);
FLATVERT(21);
FLATVERT(22);
FLATVERT(23);
FLATVERT(24);
FLATVERT(25);
FLATVERT(26);
FLATVERT(27);
FLATVERT(28);
FLATVERT(29);
FLATVERT(30);
FLATVERT(31);
FLATVERT(32);
FLATVERT(33);
FLATVERT(34);
FLATVERT(35);
FLATVERT(36);
FLATVERT(37);
FLATVERT(38);
FLATVERT(39);
FLATVERT(40);
FLATVERT(41);
FLATVERT(42);
FLATVERT(43);
FLATVERT(44);
FLATVERT(45);
FLATVERT(46);
FLATVERT(47);
FLATVERT(48);
FLATVERT(49);
FLATVERT(50);
FLATVERT(51);
FLATVERT(52);
FLATVERT(53);
FLATVERT(54);
FLATVERT(55);
FLATVERT(56);
FLATVERT(57);
FLATVERT(58);
FLATVERT(59);
FLATVERT(60);
FLATVERT(61);
FLATVERT(62);
FLATVERT(63);
endqstrip();
}
stopclock(events/31,31,"quads");
shademodel(GOURAUD);
}

indflat() {
register i;
register float *fp;
czclear(0,0);
cpack(0xffffffff);
shademodel(FLAT);
startclock();
for (i=events/4; i>0; i--) {
fp = meshbuf;
bgnpolygon();
FLATVERT(0);
FLATVERT(1);
FLATVERT(2);
endpolygon();
bgnpolygon();
FLATVERT(0);
FLATVERT(1);
FLATVERT(2);
endpolygon();
bgnpolygon();
FLATVERT(0);
FLATVERT(1);
FLATVERT(2);
endpolygon();
bgnpolygon();
FLATVERT(0);
FLATVERT(1);
FLATVERT(2);
endpolygon();
}
stopclock(events,1,"triangles");
shademodel(GOURAUD);
}

struct tms tbuf;
long gtime;

startclock() {
sleep(1);
finish();
gtime = times(&tbuf);
}

stopclock(meshes,tripermesh,s)
long meshes,tripermesh;
char *s;
{
float period;
float timepermesh;
float rate, crate;
int callspermesh;
finish();
callspermesh = 2*tripermesh + 6;
period = (float)(times(&tbuf) - gtime) / 100.0;
timepermesh = period / (float)meshes;
rate = (float)tripermesh / timepermesh;
fprintf(OUTFILE," %2d %s per mesh: %6d %s per second\n",
tripermesh, s, (int)rate, s);
fflush(OUTFILE);
interrupt();
}

float brass[] = {
AMBIENT, 0.35, 0.25, 0.1,
DIFFUSE, 0.65, 0.5, 0.35,
SPECULAR, 0.0, 0.0, 0.0,
SHININESS, 5.0,
LMNULL
};

float whitelight[] = {
AMBIENT, 0.0, 0.0, 0.0,
LCOLOR, 1.0, 1.0, 1.0,
POSITION, 0.0, 0.0, 1.0, 0.0,
LMNULL
};

float infinite[] = {
AMBIENT, 0.3, 0.3, 0.3,
LOCALVIEWER, 0.0,
LMNULL
};

float idmat[] = {
1.0, 0.0, 0.0, 0.0,
0.0, 1.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0,
0.0, 0.0, 0.0, 1.0
};


initlight() {
/* provide a simple lighting model just for timing purposes */
long xsize,ysize;
getsize(&xsize,&ysize);
mmode(MPROJECTION);
ortho(-0.5,(float)xsize-0.5,-0.5,(float)ysize-0.5,0.0,10000.0);
mmode(MVIEWING);
loadmatrix(idmat);

lmdef(DEFMATERIAL, 1, 0, brass);
lmdef(DEFLIGHT, 1, 0, whitelight);
lmdef(DEFLMODEL, 1, 0, infinite);
lmbind(LIGHT1, 1);
lmbind(LMODEL, 1);
}

light(b) {
/* turn lighting on and off */
if (dolighting)
lmbind(MATERIAL, b ? 1 : 0);
else
cpack(0xffffffff);
}

interrupt() {
/* check queue for escape key and exit if found */
short dev,val;
while (qtest()) {
if (qread(&val) == ESCKEY) {
exit(0);
}
}
}

timestamp() {
/* print host name and time/date */
char s[100];
char gv[100];
time_t t;

gethostname(s,100);
gversion(gv);
t = time(0);
fprintf(OUTFILE,"running on %s, %s, %s",s,gv,ctime(&t));
}
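
A note on timing resolution may help when reading the results printed
by this program. stopclock() divides the tick count returned by times()
by 100, so each measurement is quantized to 0.01 second. The sketch
below is not part of the posted program. It repeats the same arithmetic
in double precision (the program uses float) to re-derive a printed
rate from a tick count. The function name is hypothetical, and the tick
counts in the note after it are inferred from the printed figures, not
taken from the post.

```c
#include <assert.h>

/* Hypothetical re-derivation of the rate printed by stopclock().
 *   ticks      - elapsed clock ticks (assumed 100 per second, as in
 *                the posted program)
 *   meshes     - first argument passed to stopclock()
 *   tripermesh - second argument passed to stopclock()
 * Returns the rate truncated to an integer, as the program prints it.
 */
long reported_rate(long ticks, long meshes, long tripermesh)
{
    double period, timepermesh;

    assert(ticks > 0 && meshes > 0 && tripermesh > 0);
    period = (double)ticks / 100.0;            /* seconds */
    timepermesh = period / (double)meshes;     /* seconds per mesh */
    return (long)((double)tripermesh / timepermesh);
}
```

With events=500000, reported_rate(49, 500000/62, 62) gives 1020342 and
reported_rate(46, 500000/62, 62) gives 1086886, matching two of the
printed lines. If those tick counts are right, each of these runs
lasted about half a second, so a single tick is roughly 2 percent of
the measurement.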

Rich Thomson

Apr 9, 1991, 5:46:16 PM
In article <1991Mar28....@hellgate.utah.edu>, I posted a VGX
benchmark program mailed to me by Brian McClendon of SGI. I have
recently found the time to take a close look at this program and the
one posted by Kurt Akeley.

I have yet to try this particular program out on a VGX machine, and I
will postpone that effort for now. If
we examine the code of the program, we find that the polygons it is
attempting to display are created with the following loop:

#define SQRT3_2 (1.7321/2.0)

/* initialize data arrays */

for (i=0; i<(1 + NUMTRI/2); i++) {
    tribuf[i*8+0] = size*i;
    tribuf[i*8+1] = 0;
    tribuf[i*8+2] = 0;
    tribuf[i*8+4] = size*i + size/2;
    tribuf[i*8+5] = size*SQRT3_2;
    tribuf[i*8+6] = 0;
}

[...]

bgntmesh();
for(i=0;i<(1 + NUMTRI/2);i++)
{
    n3f(&normbuf[(i%2)*4]);
    v3f(&tribuf[i*8]);
    n3f(&normbuf[(i%4)*4]);
    v3f(&tribuf[i*8 + 4]);
}
endtmesh();
closeobj();

Notice that this creates a big, linear triangle strip that stretches
off the right side of the screen (especially if the triangles are the
50-pixel triangles quoted in the marketing literature). This results
in most of the triangles being clipped from the view volume.
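
The extent of that strip can be estimated from the loop bounds. This is
a minimal sketch that assumes the vertex formulas quoted above and
treats size as a floating-point value. NUMTRI is not given in the
excerpt, so the triangle count and window width used in the note after
the code are hypothetical, as are the function names.

```c
#include <assert.h>

/* Rightmost x coordinate written by the quoted loop: the last
 * iteration is i = numtri/2, and its second vertex is placed at
 * size*i + size/2. */
double strip_right_edge(double size, int numtri)
{
    assert(size > 0.0 && numtri >= 0);
    return size * (double)(numtri / 2) + size / 2.0;
}

/* Rough count of triangles that start inside a window of the given
 * width. Each step of `size` pixels in x adds two vertices, hence two
 * triangles. The remainder lies to the right and is clipped. */
int triangles_inside(double size, int numtri, double window_width)
{
    int inside;

    assert(size > 0.0 && numtri >= 0 && window_width >= 0.0);
    inside = 2 * (int)(window_width / size);
    return (inside < numtri) ? inside : numtri;
}
```

For example, 1000 triangles with size 10 reach x = 5005, and in a
hypothetical window 1280 pixels wide only about 256 of them begin on
screen.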

The program that Kurt Akeley posted in article
<1991Apr1.1...@odin.corp.sgi.com> was much more reasonable: it
created a certain number of triangles per strip, with each strip being
linear, but with all the strips beginning at the same position
relative to the display window:

/* initialize data arrays */
for (i=0; i<MAXVERTEX; i+=1) {
meshbuf[VERTSIZE*i+0] = (i&1) ? 0.0 : 1.0;
meshbuf[VERTSIZE*i+1] = 0.0;
meshbuf[VERTSIZE*i+2] = (i&1) ? 1.0 : 0.0;
meshbuf[VERTSIZE*i+3] = 0;
meshbuf[VERTSIZE*i+4] = 10.0 + (float)(size*(i>>1)) +
(float)(offset*(i&1));
meshbuf[VERTSIZE*i+5] = 10.0 + (float)(size*(i&1));
meshbuf[VERTSIZE*i+6] = 0.0;
meshbuf[VERTSIZE*i+7] = 0;
}

[...]

#define LIGHTVERT(i) n3f(fp+(VERTSIZE*(i))); v3f(fp+(VERTSIZE*(i))+4)


for (i=events; i>0; i--) {
fp = meshbuf;
bgntmesh();
LIGHTVERT(0);
LIGHTVERT(1);
LIGHTVERT(2);
endtmesh();
}

Now on to some comments on Kurt's article:

> We take our graphics performance claims very seriously here at Silicon
> Graphics.

I'm sure you take them as seriously as MIPS, HP, and IBM take their
spec mark ratings. Sadly the graphics community does not yet have the
equivalent of the specmark rating on which to intelligently compare
different platforms. Just look at the claims made when comparing X
implementations. The customer gets left in the lurch unless they
undertake analyzing the voluminous output of x11perf to find out the
real story.

I began to be skeptical when I saw the figure posted several times on
comp.graphics and queries to the poster were answered with "it's from our
marketing literature, I'll ask a ``tech type'' to send you a program"
(I never heard back from him). Also, at a recent VGX demonstration at
the U, the rep couldn't tell me details about the figure, nor could he
show me a program with a high polygon rate. He also didn't have any
models with several hundred thousand polygons (say, 40% of the peak
figure, or 300K - 400K polygons), although he's a sharp enough man
that I imagine he WILL have them next time in case I'm there. ;-}

Hopefully, when the Graphics Performance Committee releases its
Picture Level Benchmark program (& numbers come forth from vendors)
this situation will be alleviated. For now, we are stuck with
comparing performance numbers from each different vendor and
attempting to infer useful comparisons from widely differing measures.

For instance, you say:
> [quoted performance comes from] tuned programs that use ONLY
> commands that are available in the Graphics Library.

So these numbers are highly tuned for the architecture of the VGX and
are reproducible only with a vendor-specific library. This is very
understandable, given the position SGI holds in the 3D market, but it
is very difficult to compare different platforms with these kinds of
numbers in your hand. [Perhaps that is the intention of the marketing
dept? ;-]

> I ran this program on my 5-span VGX with the following results:
> size=8, offset=4, zbuffer(1), events=500000, lighting=1
> running on cashew, GL4DVGX-4.0, Fri Mar 29 15:22:58 1991
> Triangle mesh performance (lighted):
> 1 triangles per mesh: 189393 triangles per second

[stuff deleted]


> 30 triangles per mesh: 675648 triangles per second
> 62 triangles per mesh: 714240 triangles per second
> Display listed triangle mesh (lighted):
> 62 triangles per mesh: 769181 triangles per second
> Display listed triangle mesh (colored):
> 62 triangles per mesh: 1020342 triangles per second

I find this interesting. Apparently, the way to max out the VGX is to
use display lists. I thought SGI considered display lists "naughty".
Several times on comp.graphics, SGI folks have bashed display-list
oriented techniques and the company's position paper on "PEX & PHIGS"
states over and over the advantages of immediate mode over display-list
techniques. I find it particularly ironic then that the 1 M p/s
number comes from display-list techniques.

Another poster asked about how things change when lights are turned
on, etc. I think Kurt's table (along with examining the source)
answers this question. Naturally, the more lights are turned on, the
slower things get (can't compute everything instantaneously). Also, I
notice that these polygons aren't depth cued, which would also reduce
the numbers somewhat (naturally, as stated they are PEAK numbers).

> Note that performances of well over 1 million triangles per second are
> achieved for long meshes of single- and multi-colored triangles, with
> the zbuffer enabled. When lighting and smooth shading are enabled, the
> performance drops to roughly 3/4 of a million triangles per second.

I notice that the zbuffer was enabled, but that the Z test was set to
ZF_ALWAYS. I can imagine a good microcoder optimizing that case so as
to not perform the read-modify-write cycle to the Z buffer (since the
test will always win anyway). Is a r-m-w cycle taking place, or is it
just being written through?
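
The distinction being asked about can be pictured with a software Z
buffer. This is only an illustrative sketch, not a description of the
VGX hardware or its microcode, and the type and function names are
hypothetical. When the comparison always passes, the stored depth never
needs to be read.

```c
#include <assert.h>
#include <stddef.h>

enum zmode { Z_LESS, Z_ALWAYS };       /* hypothetical comparison modes */

struct zstats { long reads; long writes; };

/* Apply n incoming depth values to zbuf, counting memory traffic. */
void zbuffer_span(float *zbuf, const float *zin, size_t n,
                  enum zmode mode, struct zstats *st)
{
    size_t i;

    assert(zbuf != NULL && zin != NULL && st != NULL);
    for (i = 0; i < n; i++) {
        if (mode == Z_ALWAYS) {
            zbuf[i] = zin[i];          /* write-through, no read needed */
            st->writes++;
        } else {
            st->reads++;               /* read-modify-write path */
            if (zin[i] < zbuf[i]) {
                zbuf[i] = zin[i];
                st->writes++;
            }
        }
    }
}
```

In this model the "always" mode saves one read per pixel. Whether the
real hardware takes a similar shortcut is exactly the open question.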

Thanks again Kurt for clarifying these mysteries!

Jeff Hanson

Apr 10, 1991, 8:37:50 AM
Rich Thomson writes (and makes some good points, too.)

[ ... stuff deleted ... ]

> Sadly the graphics community does not yet have the
> equivalent of the specmark rating on which to intelligently compare
> different platforms. Just look at the claims made when comparing X
> implementations. The customer gets left in the lurch unless they
> undertake analyzing the voluminous output of x11perf to find out the
> real story.

Anyone interested in x11perf benchmarking and/or information on the PLB
benchmark should get the following publication: HP Apollo 9000 Series 700 -
Performance Brief (5091-1137E 3/91). In it you will find x11perf results organized
into 4 groups as proposed by Digital Review. (I wrote to DR urging them
to make their programs available that organize the data and draw the Kiviat
graph, no reply so far. Perhaps HP could make this available.) You will
also find the preliminary PLB numbers that were published in the January
issue of the Anderson Report. These numbers were also published in Unix
Today. I urge anyone involved in graphics and benchmarking to get more
information about PLB because you will be able to create PLB benchmarks and
run them in the very near future (say 6 months, max). A brief synopsis is
below.

The Picture-Level Benchmark - The Industry's Solution for Measuring Graphics
Display Performance.

What is the PLB - The PLB is a software package that provides a standard
method of measuring graphics display performance for different hardware
platforms. It consists of three elements:

The Benchmark Interface Format (BIF), a standardized file structure that
allows users to port application geometry and actions the geometry will
perform to the PLB program.

The Benchmark Timing Methodology (BTM), which provides a consistent method
of measuring the time it takes for hardware to display and perform actions
on a user's application geometry.

The Benchmark Reporting Format (BRF), which provides a standardized report
that allows "apple-to-apple" comparisons of graphics display performance
for different hardware platforms.

How do you use the PLB? - The first step is to translate your data sets from
a typical application into the standard BIF. Once your data set has been
translated, you are ready to run the performance test. At the vendor's site or
your own, you can view your data set as it runs on the vendor's system.
The viewing is important, since the PLB does not measure image quality --
it is up to you to make these visual comparisons among the different
systems you test.
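
The general idea of timing a fixed data set can be sketched as below. This
is NOT the PLB's Benchmark Timing Methodology, whose details are not given
here; it is only a generic timing loop. The names measure_rate and
draw_frame, and the injectable clock parameter, are hypothetical.

```python
import time


def measure_rate(draw_frame, primitives_per_frame, frames=100,
                 clock=time.perf_counter):
    """Return primitives drawn per second over a fixed number of frames.

    draw_frame is a caller-supplied function that displays the data set
    once; it stands in for whatever the system under test actually does.
    """
    start = clock()
    for _ in range(frames):
        draw_frame()
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time too small to measure")
    return primitives_per_frame * frames / elapsed


if __name__ == "__main__":
    # Dummy workload standing in for a real display call.
    rate = measure_rate(lambda: sum(range(1000)), primitives_per_frame=62)
    print("%.0f primitives per second (dummy workload)" % rate)
```

Passing the clock in as a parameter is just a convenience so the loop can
be checked with a fake clock. A number from a loop like this says nothing
about image quality, which is why the viewing step above still matters.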

For more information contact:

NCGA Technical Services and Standards
2722 Merrilee Drive, Suite 200
Fairfax, VA 22031
Phone: 703-698-9600, ext. 318
Fax: 703-560-2752

[ ... stuff deleted ... ]

> Also, at a recent VGX demonstration at
> the U, the rep couldn't tell me details about the figure, nor could he
> show me a program with a high polygon rate. He also didn't have any
> models with several hundred thousand (say, 40% of the peak figure,
> or 300K - 400K polygons) polygons, although he's a sharp enough man
> that I imagine he WILL have them next time in case I'm there. ;-}

The powerflip program accepts several models, so you can load up a few thousand
polygons. It also reports the polygons/second rate.

> Hopefully, when the Graphics Performance Committee releases its
> Picture Level Benchmark program (& numbers come forth from vendors)
> this situation will be alleviated. For now, we are stuck with
> comparing performance numbers from each different vendor and
> attempting to infer useful comparisons from widely differing measures.

Beat on your vendor of choice for PLB numbers. User demands shall be heard!

[ ... stuff deleted ... ]
--
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
\ / \ / \ / \ / \ / \ / Jeff Hanson \ / \ / \ / \ / \ / \ /
* ViSC: Better * toha...@gonzo.lerc.nasa.gov * * * * * *
/ \ / \ Science / \ / \ NASA Lewis Research Center / \ / \ Through / \ / \
* * * * * * * Cleveland, Ohio 44135 * * * Pictures * *
\ / \ / \ / \ Telephone - (216) 433-2284 Fax - (216) 433-2182 \ / \ / \ /
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*

Kurt Akeley

Apr 10, 1991, 7:53:06 PM
In article <1991Apr9.1...@hellgate.utah.edu>, tho...@cs.utah.edu (Rich Thomson) writes:

[stuff deleted]

|>
|> > Display listed triangle mesh (colored):
|> > 62 triangles per mesh: 1020342 triangles per second
|>
|> I find this interesting. Apparently, the way to max out the VGX is to
|> use display lists. I thought SGI considered display lists "naughty".

While we may have implied this, it is not our technical position. The
Graphics Library has included graphical objects from its creation, and will
continue to do so. Graphical objects are the right choice for network
graphics, for example, and may also yield the best performance in simplistic
example codes (such as my benchmark). What *is* naughty is to force
programmers to use graphical objects, or to force them to use immediate mode.
We do neither.

[stuff deleted]

|> I notice that the zbuffer was enabled, but that the Z test was set to
|> ZF_ALWAYS. I can imagine a good microcoder optimizing that case so as
|> to not perform the read-modify-write cycle to the Z buffer (since the
|> test will always win anyway). Is a r-m-w cycle taking place, or is it
|> just being written through?

The r-m-w cycle is taking place. Because ZF_ALWAYS does not eliminate the
need for the write cycle, it simply isn't worth it to us to optimize this
case.
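
The trade-off can be illustrated with a toy software Z buffer that counts
memory accesses. This is a sketch under stated assumptions, not the VGX
microcode: the class CountingZBuffer, the skip_read_if_always flag, and the
string stand-ins for the Z function constants are all made up for the
example.

```python
# Stand-ins for illustration; these are not the GL's numeric constants.
ZF_ALWAYS = "always"
ZF_LESS = "less"


class CountingZBuffer:
    """Toy depth buffer that counts reads and writes per pixel access."""

    def __init__(self, size, far=float("inf")):
        self.depth = [far] * size
        self.reads = 0
        self.writes = 0

    def plot(self, index, z, zfunc, skip_read_if_always=False):
        if zfunc == ZF_ALWAYS and skip_read_if_always:
            # Hypothetical write-through path: the test cannot fail,
            # so the old depth value is never fetched.
            passed = True
        else:
            # Read-modify-write path: fetch the old depth, then compare.
            self.reads += 1
            old = self.depth[index]
            passed = zfunc == ZF_ALWAYS or z < old
        if passed:
            self.writes += 1
            self.depth[index] = z
        return passed


def fill(pixels, skip_read_if_always):
    """Draw one pixel row with ZF_ALWAYS and return (reads, writes)."""
    zb = CountingZBuffer(pixels)
    for i in range(pixels):
        zb.plot(i, 0.5, ZF_ALWAYS, skip_read_if_always)
    return zb.reads, zb.writes


print("read-modify-write:", fill(50, False))
print("write-through:    ", fill(50, True))
```

In this toy model the write-through path drops the reads but issues exactly
as many writes, which matches the point that ZF_ALWAYS does not remove the
need for the write cycle. How much the skipped reads would be worth on real
hardware depends on the memory system and is not something this sketch can
show.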

-- kurt
