Measuring PDP-11 relative instruction times with GRUB

paramucho

Jan 2, 2013, 9:08:10 AM

Following up on an earlier discussion, I wrote a little holiday utility,
GRUB, to measure relative instruction times. This early set of results
already provides some interesting information.

Testing:

o Each operation is repeated for the duration of a clock tick.
o The twelve tests are run sequentially as a "test cycle".
o Ten test cycles are executed and displayed.
o Tests were performed on SimH, E11 and V11, mapped and unmapped.
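
The counting scheme above can be sketched in Python. This is only an
illustration of the idea, not GRUB's code: GRUB times PDP-11 instructions
against the system clock, while the sketch uses a host timer. The names
count_in_tick and run_cycles, the 1/60 second tick and the stand-in
operations are all assumptions made for the example.

```python
import time

def count_in_tick(op, tick=1 / 60, clock=time.perf_counter):
    """Count how many times op() completes before one tick has elapsed."""
    start = clock()
    count = 0
    while clock() - start < tick:
        op()
        count += 1
    return count

def run_cycles(ops, cycles=10, tick=1 / 60, clock=time.perf_counter):
    """Run every operation once per test cycle; return one row per cycle."""
    return [[count_in_tick(op, tick, clock) for op in ops]
            for _ in range(cycles)]

if __name__ == "__main__":
    # Two stand-in operations; GRUB itself times twelve PDP-11 instructions.
    ops = [lambda: None, lambda: sum(range(10))]
    for row in run_cycles(ops, cycles=3):
        print(*row)
```

Passing the clock in as a parameter keeps the loop testable with a fake
clock, which also makes it easy to see what happens when ticks are lost.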

Operation Legend:

<nop > <nop>
<clr > <clr r0>
<tst > <tst r0>
<inc > <inc r0>
<r,r > <mov r0,r0>
<r,( > <mov r0,(r1)>
<(,r > <mov (r1),r0>
<+,- > <mov (r1)+,-(r1)>
<r,a > <mov r0,e$mloc>
<a,r > <mov e$mloc,r0>
<a,a > <mov e$mloc,e$mloc>
<x,x > <mov @e$msrc(r0),@e$mdst(r0)>

The results below are from a quiet machine (my Windows 8 laptop).

The higher a result number is, the more operations were performed.
Actual instruction counts are divided to get reasonable display
values.

==============
SimH
==============

SimH running under RT-11/FB and RT-11/XM. The results are essentially
the same, regardless of the operation performed or whether memory
management is enabled (RT-11/XM).

Those interesting bands of 76, 112 etc. come from SimH's calibration
feature, which attempts to maintain a constant virtual environment,
basically robbing Peter to pay Paul for time lost to the scheduling of
other processes etc. The reason that all the timings are the same is
that SimH's calibration unit is the "instruction". If it used the
"memory reference" as the calibration unit then the results would look
more like they do under E11 and V11.

Results: An easy detection signature for SimH is: no difference.
Questions: What's SimH like without calibration?
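
A toy model may help show why the calibration unit matters. It is not
SimH's calibration code. It assumes the emulator spends a fixed budget of
calibration units between clock ticks and ignores the loop overhead of the
test itself. The memory reference counts and the budget of 840 are
illustrative assumptions, as are the names MEMORY_REFS and ops_per_tick.

```python
# Illustrative memory references per instruction (assumed for this sketch):
# one for the instruction fetch plus one per extra word or operand access.
MEMORY_REFS = {"nop": 1, "r,(": 2, "+,-": 3, "x,x": 7}

def ops_per_tick(budget, unit):
    """Operations completed per tick when the emulator spends `budget`
    calibration units between clock ticks."""
    if unit == "instruction":
        cost = {op: 1 for op in MEMORY_REFS}  # every instruction costs the same
    elif unit == "memory reference":
        cost = MEMORY_REFS                    # cost grows with memory traffic
    else:
        raise ValueError(f"unknown calibration unit: {unit}")
    return {op: budget // cost[op] for op in MEMORY_REFS}

if __name__ == "__main__":
    for unit in ("instruction", "memory reference"):
        print(unit, ops_per_tick(840, unit))
```

With the instruction as the unit every operation gets the same count per
tick (a flat line). With the memory reference as the unit the counts fall
from "nop" to "x,x".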

SimH: RT-11/FB
--------------
nop clr tst inc r,r r,( (,r +,- r,a a,r a,a x,x
 71  71  71  71  71  76  76  76  76  76  76  76
 76  76  76  76  76  76  76  76  76  76  76  76
 76  76  76  76  76  76  76  76  76  76  76  76
 76  76  76  76  76  76  76  76  76  76  76  76
 76  76  76  76  76 112 112 112 112 112 112 112
112 112 112 112 112 112 112 112 112 112 112 112
112 112 112 112 112 112 112 112 112 112 112 112
112 112 112 112 112 112 112 112 112 112 112 112
112 112 112 112 112  95  95  95  95  95  95  95
 95  95  95  95  95  95  95  95  95  95  95  95

SimH: RT-11/XM
--------------
nop clr tst inc r,r r,( (,r +,- r,a a,r a,a x,x
 76  76  76  76  76  76  76  76  76  76  76  76
 76  76  76  76  76  76  76  76  76  76  76  76
 76  76  76  76  76  76  76  76  76  76  76  76
 76  76  76  76  87  87  87  87  87  87  87  87
 87  87  87  87  87  87  87  87  87  87  87  87
 87  87  87  87  87  87  87  87  87  87  87  87
 87  87  87  87  87  87  87  87  87  87  87  87
 87  87  87  87  75  75  75  75  75  75  75  75
 75  75  75  75  75  75  75  75  75  75  75  75
 75  75  75  75  75  75  75  75  75  75  75  75

==============
E11
==============

The same tests under Ersatz-11. E11 is clearly faster and clearly
optimised so that the simple operations run faster than those which
require more memory accesses. Unmapped and mapped results are similar.

Speed doesn't really help as a signature since on a heavily loaded
system E11 will slow down to SimH and V11 speeds.

Results: The difference between "nop" and "x,x" is a signature.
Questions: The three starred (*) results are interesting because the
behaviour repeats every third cycle. It might be some
granularity in the test, or perhaps in E11.

E11: RT-11/SJ
-------------
nop  clr  tst  inc  r,r  r,(  (,r  +,-  r,a  a,r  a,a  x,x
622  945  588  289  529  383  381  321  379  333  287  176
627  943  601  291  463  447  381  360  329  334  286  173
630 1082* 512  291  464  445  382  319  369  333  327  151
632  944  600  291  530  371  382  318  383  334  286  176
630  931  586  289  460  447  381  364  329  334  287  176
630 1081* 525  281  460  448  382  317  381  334  328  148
632  943  607  287  531  384  381  318  372  327  286  176
632  945  597  291  465  431  381  363  328  328  287  177
630 1071* 498  290  464  446  382  318  384  334  328  151
630  948  591  292  531  382  383  316  382  332  285  172

E11: RT-11/XM
-------------
nop clr tst inc r,r r,( (,r +,- r,a a,r a,a x,x
738 811 599 386 398 383 446 273 386 390 246 173
739 793 598 341 396 448 511 273 321 390 245 177
842 813 511 340 398 436 445 273 382 444 246 152
736 813 582 391 398 383 447 271 385 390 246 177
729 791 586 332 385 438 497 266 323 380 242 166
833 783 488 328 386 434 433 262 373 433 241 149
721 813 595 389 398 372 445 271 385 390 246 177
726 799 569 336 399 444 511 271 329 391 246 177
844 814 510 340 398 448 447 273 385 391 246 198
726 794 569 339 398 509 381 273 374 390 246 177
                            XXX YYY

=============
V11
=============

RUST is slower than expected. I may have to try swapping compilers
because I know I use fewer cycles per instruction than SimH. On the
other hand, I don't understand what SimH calibration is doing to the
results. In fact, V11 may be even slower because it was emulating a 50
hertz clock.

Unmapped tests run faster than mapped tests under V11.

As stated, speed doesn't help us discriminate, and neither does a
difference between mapped and unmapped.

V11, like E11, has a slope from "nop" to "x,x". However, the ratio of
the first and last tests might be useful for discrimination:

E11: 738/173 => 4.3
V11: 132/34 => 3.9
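
The ratio can be recomputed from any result row with a few lines of
Python. This is a sketch: parse_row and nop_to_xx_ratio are names invented
here, and the two sample rows are the first E11 RT-11/XM row and the first
V11 RUST/SJ row from the tables in this post.

```python
def parse_row(line):
    """Split one GRUB result line into integers, ignoring '*' annotations."""
    return [int(field.rstrip("*")) for field in line.split()]

def nop_to_xx_ratio(line):
    """Ratio of the first column (nop) to the last column (x,x)."""
    row = parse_row(line)
    if row[-1] == 0:
        raise ValueError("x,x count is zero (lost ticks?); no ratio available")
    return row[0] / row[-1]

if __name__ == "__main__":
    e11_xm = "738 811 599 386 398 383 446 273 386 390 246 173"
    v11_sj = "132 97 82 64 68 68 69 68 52 52 43 34"
    print(f"E11: {nop_to_xx_ratio(e11_xm):.1f}")
    print(f"V11: {nop_to_xx_ratio(v11_sj):.1f}")
```

A single row is used here to match the calculation above. On a noisy
system it would probably be safer to compute the ratio over several cycles.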

There's another signature. The columns marked with "XXX" and "YYY"
above (E11) and below (V11) show different slopes.

      +,-     r,a
E11:  273  <  386
V11:   68  >   52
      XXX     YYY

Results: The detailed differences from the E11 signature are themselves a signature.
Questions: Why slower than expected?

V11: RUST/SJ
------------
nop clr tst inc r,r r,( (,r +,- r,a a,r a,a x,x
132  97  82  64  68  68  69  68  52  52  43  34
130  95  91  66  65  66  68  68  52  52  43  34
123  95  90  66  66  68  67  67  52  52  43  34
132  89  89  65  68  68  68  66  52  52  42  34
133  96  85  64  68  67  68  66  52  51  43  34
133  94  91  65  68  68  69  68  53  52  43  34
132  97  91  65  68  67  68  66  53  52  43  34
130  97  91  64  67  68  67  68  52  52  43  34
123  96  91  66  68  59  64  67  53  51  43  34
119  95  91  66  67  68  68  67  53  52  43  34
                            XXX YYY

V11: RUST/XM
------------
nop clr tst inc r,r r,( (,r +,- r,a a,r a,a x,x
 85  65  59  49  50  48  47  44  38  36  29  23
 81  64  62  52  50  46  45  44  37  37  30  23
 75  63  62  53  51  47  48  44  37  35  29  23
 81  56  60  52  51  48  47  44  38  37  29  24
 85  64  55  52  51  48  47  44  35  37  30  24
 85  66  63  52  50  47  48  42  37  38  30  24
 85  65  63  53  50  48  49  45  38  37  30  24
 85  66  63  52  51  47  48  45  38  38  30  24
 80  65  63  53  52  46  48  45  38  38  30  24
 73  57  63  53  52  48  48  44  37  37  30  23

=======
Summary:

Preliminary results:

1. If the timings are a flat line, then it's SimH.

2. If there's a slope then use the difference between the first and
last to choose E11 from V11, and/or the XXX/YYY difference.

Note, the number of tests can be reduced in an application to:

<nop > <nop>
<+,- > <mov (r1)+,-(r1)>
<r,a > <mov r0,e$mloc>
<x,x > <mov @e$msrc(r0),@e$mdst(r0)>

Which gives more opportunity for repeated tests.
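
As a sketch of how an application might apply the preliminary rules to
the four reduced tests, the Python below takes several cycles of
(nop, +,-, r,a, x,x) counts and guesses the emulator. The classify name,
the 1.5 cut-off for a "flat line" and the use of column medians are
assumptions made for this example. Only the XXX/YYY comparison is used to
separate E11 from V11, and real hardware is not covered at all.

```python
from statistics import median

FLAT_RATIO = 1.5  # assumed cut-off: below this nop/x,x ratio the line is "flat"

def classify(rows):
    """Guess the emulator from rows of (nop, +,-, r,a, x,x) counts.

    Column medians are used so that one odd cycle does not decide the result.
    """
    nop, auto, absolute, indexed = (median(col) for col in zip(*rows))
    if indexed == 0:
        return "unknown"          # too many lost ticks to say anything
    if nop / indexed < FLAT_RATIO:
        return "SimH"             # rule 1: the timings are a flat line
    # rule 2: E11 shows +,- below r,a, V11 shows the opposite
    return "E11" if auto < absolute else "V11"

if __name__ == "__main__":
    e11 = [(738, 273, 386, 173), (739, 273, 321, 177), (842, 273, 382, 152)]
    v11 = [(132, 68, 52, 34), (130, 68, 52, 34), (123, 67, 52, 34)]
    print(classify(e11), classify(v11))
```

The sample rows are the first three cycles of the E11 RT-11/XM and V11
RUST/SJ tables, cut down to the four reduced columns.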

I haven't tested real hardware yet, and I won't be able to until the
end of the month. I also need to put together a more orderly test
environment--it's all a bit hand-held at the moment which leaves room
for errors.

These are results from an initial day of programming. I'm sure there'll
be further insights (and along the way I found a couple of bugs in V11
with regard to timed waits under Windows). The results will be refined when
more development takes place, but since I never know when I'll be able
to get back to a project I've put them out in their early form.

I've put a copy of the utility on-line for anyone interested at:

http://code.google.com/p/rust/downloads/list

====================================================================
NOISY SYSTEMS
====================================================================

The results above are from a "quiet machine". A moderately noisy
machine looks more like this (V11). The signal is still visible in the
data, but there's noise to be weeded out as well.

nop clr tst inc r,r r,( (,r +,- r,a a,r a,a x,x
 72  17  10  33  30  39  32  37  26  26  28  19
 78  60  14  12  31  30  46  38  11  29  20  19
 70  39  59  33  16  37  23  32  31  25  15  15
 58  52  41  36  43  30  35  30  26  34  24  18
 51  30  58  35  35  44  14  31  36  25  24  21
 54  63  34   8  26  22  31  27  25  31  25  17
 87  43  54  39  32  35  39  33  36  29  23  23
 53  58  57  33  46  39  31  48  25  28  28  17
 81  30  44  47  33  39  45  32  31  31  22  22
 67  39  37  31  41  42  33  42  25  27  29  15

On really noisy systems many clock ticks may be lost because of
scheduling. You see that happening in the columns with results of
under 5 (below).
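
One possible way to weed out the noise, sketched in Python: drop results
under 5 as lost ticks and take the median of what is left in each column.
This is an assumption about how the filtering could be done, not something
GRUB does. The names column_medians and LOST_TICK_LIMIT are invented for
the example.

```python
from statistics import median

LOST_TICK_LIMIT = 5  # results under 5 are treated as lost ticks

def column_medians(rows, limit=LOST_TICK_LIMIT):
    """Median of each column after dropping results below `limit`.

    Returns None for a column in which every result was dropped.
    """
    summary = []
    for column in zip(*rows):
        kept = [value for value in column if value >= limit]
        summary.append(median(kept) if kept else None)
    return summary

if __name__ == "__main__":
    # nop, clr and tst results from three cycles of the noisy E11 run below.
    rows = [
        [612, 806, 377],
        [686, 31, 0],
        [478, 623, 466],
    ]
    print(column_medians(rows))
```

The median is used rather than the mean so that a few badly disturbed
cycles do not drag the summary down.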

The actual execution time of the GRUB test becomes interesting on
really noisy systems. E11 and V11 complete the tests in 4 or 5 seconds
where SimH takes up to two minutes--this may be some kind of problem
with the calibration algorithm or implementation. It would be
interesting to see the behaviour with calibration switched off, but I
can't see any kind of command for that in the documentation.

E11: 4 seconds
nop clr tst inc r,r r,( (,r +,- r,a a,r a,a x,x
612 806 377 239 354 378 392 270 345 262 191 176
533 610 137 116 217 127 242 241  61  88 156  37
595 808 528 240 425 360   5 178 164  92  23  51
476 228 324 293 329 330 324 221 317 276 234  82
500 204 206 262 305 366 350 254 296 308 282 121
478 825 119 247 300 355 407 272 344 108 126  95
 68 785  91 101 115 319 124 243 154 244 250 126
686  31   0  29   0   2  40 148 222 181 140 116
531 472 358 249 429 250 303 280 285 276 150  91
478 623 466 239 292 324 394 270 320 298 258 138

V11: 5 seconds
nop clr tst inc r,r r,( (,r +,- r,a a,r a,a x,x
 21  46   5  11   1  49  33  53  46  40  28  30
115  62  26  13  21  40  27   1   2  13   2   1
112  66  51  54  47  59  59  55  39  29  18   3
111  76  76  44  58  55  60  47  42  41  37  25
116  49  36  36   7  53  55  56  43  35  13  14
123  63  80  55  53  38  20  36  30   9  19  15
 97  87  82  62  48  11  32  47   9  30  36  24
107  93  62  52  28  23  35  25  20   4   3   3
 53  40  88  18  58  36  55  65  42  31  35  15
111  83  62  65  42  59  41  28  47  18  18   0

SimH: 115 seconds
nop clr tst inc r,r r,( (,r +,- r,a a,r a,a x,x
 30  30  30  30  30  30  30  30  30  30  30  30
 30  30  30  30  30  30  30  30  30  30  30  30
 30  30  30  30  30  30  30  30  30  30  30  30
 30  30  30  30  30  30  30  30  30  30  30  30
 30  30  30  30  30  30  30  30  30  30  30  30
 30  30  30  30  30  30  30  30  30  30  30  30
 30  30  30  30  30  30  30  30  30  30  30  30
 30  30  30  30  30  30  30  30  30  30  30  30
 30  30  30  30  30  30  30  30  30  30  30  30
 30  30  30  30  30  30  30  30  30  30  30  30

Ian
