POPF still broken in 286?

Don Deal

Nov 16, 1986, 9:07:10 PM

IBM put a warning in the AT Technical Reference, and I can remember
several people complaining about a problem with the POPF instruction that
allowed interrupts to be processed even when interrupts had been disabled.
Was this problem fixed in subsequent masks for the 286, or does it still
exist?

That a problem like this could occur (and has with vendors other than
Intel) makes me wonder what kind of testing goes on before products are
released to market. Given that increasingly complicated architectures
are showing up in most microprocessor families, it would seem that additional
testing is in order. Is anyone familiar with the testing cycles that go
on for microprocessors?

--
D.L. Deal, Office of Computing Services, Georgia Tech, Atlanta GA, 30332-0275
Phone: (404) 894-4660 ARPA: d...@pyr.ocs.gatech.edu BITNET: cc100dd@gitvm1
uucp: ...!{akgua,allegra,amd,hplabs,ihnp4,masscomp,ut-ngp}!gatech!gitpyr!don

Daniel M. Frank

Nov 17, 1986, 9:53:57 PM

In article <26...@gitpyr.gatech.EDU> d...@gitpyr.gatech.EDU (Don Deal) writes:
> IBM put a warning in the AT Technical Reference, and I can remember
>several people complaining about a problem with the POPF instruction ...
>
> That a problem like this could occur (and has with vendors other than
>Intel) makes me wonder what kind of testing goes on before products are
>released to market.

Recently, a friend of mine had some problems running a protected mode
operating system on an IBM AT. He traced it to an early and buggy lot
of 286 chips, a few of which he was unlucky enough to receive via IBM.
On contacting IBM, he was told, "It's an Intel problem". When he
called Intel, he was told that IBM had been aware of the problem with
the early runs, and Intel had been unwilling to ship without a letter
from IBM acknowledging the problem and absolving Intel of liability.
IBM duly provided the letter, and Intel shipped the chips.

I should note that this is hearsay. Perhaps one of the folks from
Intel could be kind enough to confirm or deny it. In any case, it
takes a long time to prepare a Tech Ref manual, and even longer to
write the BIOS, which includes workaround code for many of these bugs.
It is almost inconceivable that IBM didn't know about the problems
long before the introduction of the AT.

The AT is no miracle of engineering anyway. The BIOS is filled
with funny delay loops and useless instructions all designed to pass
the time until the contents of device registers become valid. Too
cheap to build the hardware right, I guess. It's no surprise to me
that IBM also accepted bad chips and worked around the problems.
And protected mode? No protected mode operating systems around
anyway, for years maybe. I can't wait until p.m. DOS comes out :-).

--
Dan Frank
uucp: ... uwvax!prairie!dan
arpa: dan%cas...@spool.wisc.edu

Tom Kohrs

Nov 18, 1986, 2:26:10 PM

> IBM put a warning in the AT Technical Reference, and I can remember
> several people complaining about a problem with the POPF instruction that
> allowed interrupts to be processed even when interrupts had been disabled.
> Was this problem fixed in subsequent masks for the 286, or does it still
> exist?
>

The problem with the POPF instruction (it would always enable interrupts)
was only in the B-step parts (identifiable by markings of (c) Intel'82 or
(c) Intel '83). The C-step and E-step have this problem fixed. Almost all
of the B-step parts that were shipped went to IBM.
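
To make the consequence concrete, here is a toy model of the behaviour as it is described in this thread. It is not a model of the actual silicon. The only hard fact used is that the interrupt-enable flag IF is bit 9 (0x0200) of the FLAGS word. The names `popf`, `b_step`, and `pending_irq` are made up for illustration, and the faulty branch simply assumes that a pending interrupt is serviced during POPF regardless of the IF value being popped.

```python
IF_MASK = 0x0200  # interrupt-enable flag, bit 9 of the x86 FLAGS word


def popf(popped_flags, pending_irq, b_step):
    """Toy model of POPF: return (new_flags, interrupt_taken).

    Assumption (taken from the descriptions in this thread, not from
    an Intel errata sheet): on a faulty B-step part a pending interrupt
    is serviced during POPF even if the popped value has IF clear.
    """
    interrupts_enabled = bool(popped_flags & IF_MASK)
    if b_step:
        # Hypothetical faulty behaviour: IF is ignored for this check.
        taken = pending_irq
    else:
        # Expected behaviour: an interrupt is taken only if IF is set.
        taken = pending_irq and interrupts_enabled
    return popped_flags, taken


# An inner critical section restores flags that still have IF clear,
# so no interrupt should get in while the outer section is running.
saved_flags = 0x0002  # IF clear; only the reserved bit 1 is set
_, taken_ok = popf(saved_flags, pending_irq=True, b_step=False)
_, taken_bad = popf(saved_flags, pending_irq=True, b_step=True)
print(taken_ok, taken_bad)
```

In this model the faulty part lets the interrupt through in the middle of a section the software believes is protected, which is the kind of thing that breaks nested save/restore sequences in an operating system kernel.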

> That a problem like this could occur (and has with vendors other than
> Intel) makes me wonder what kind of testing goes on before products are
> released to market. Given that increasingly complicated architectures
                                              ^^^^^^^^^^^
> are showing up in most microprocessor families, it would seem that additional
> testing is in order. Is anyone familiar with the testing cycles that go
> on for microprocessors?

Complicated is the key word. As architectures become more and more complicated
it takes longer to generate all of the test vectors necessary to prove the
design. Initial part testing takes two forms: running software from a previous
part (the 8086 in the case of the 286) and running specific vectors designed to
stress the part. How much gets found and fixed before the parts ship in volume
has more to do with marketing considerations than the technical correctness of
the chip. Long-term testing is done by trying to put the chip through every
conceivable sequence of events (both hardware and software) and by following
up on problem reports from the field. You would be amazed at what some people
will try to do to a chip.
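
A bit of arithmetic shows why exhaustive testing is not on the table. The sketch below assumes a purely combinational block with a given number of input bits and a tester that applies a fixed number of vectors per second. The 1 MHz rate and the function name are illustrative assumptions, not figures from any real test floor. Real parts also carry internal state, which only makes matters worse.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def exhaustive_test_seconds(input_bits, vectors_per_second=1_000_000):
    """Seconds needed to apply every input pattern exactly once.

    Assumes a stateless block, so 2**input_bits patterns cover it.
    vectors_per_second is an assumed tester rate, not a measured one.
    """
    return 2 ** input_bits / vectors_per_second


for bits in (16, 32, 64):
    seconds = exhaustive_test_seconds(bits)
    print(bits, "bits:", seconds, "s,", seconds / SECONDS_PER_YEAR, "years")
```

Under these assumptions 32 input bits already take over an hour, and 64 bits take several hundred thousand years, so vectors have to be chosen rather than enumerated.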
--
------
"Ever notice how your mental image of someone you've
known only by phone turns out to be wrong?
And on a computer net you don't even have a voice..."

to...@intsc.UUCP Tom Kohrs
Regional Architecture Specialist
Intel - Santa Clara

Tom Kohrs

Nov 19, 1986, 10:24:23 PM

> Recently, a friend of mine had some problems running a protected mode
> operating system on an IBM AT. He traced it to an early and buggy lot
> of 286 chips, a few of which he was unlucky enough to receive via IBM.

IBM did ship a lot of PC ATs with B-step parts. They were made aware of
the problems that these parts had and they decided that they were acceptable
for the product that they wanted to ship. The real drawback to this decision
is that some of the bugs make it real difficult to write a multitasking OS.
With the installed base out there having unknown parts it is going to be
hard on the software developers. One option would be to ship an E-step
286 with each copy of the software (small plug from someone with no personal
interest (:-) ); this would have the added advantage of slowing down
pirated copies.

Anyone needing a copy of old (B-step) errata lists should get in touch with
the local Intel sales office before investing a lot of time into a piece of
software that may not run on all machines.

james

Nov 21, 1986, 9:33:04 AM

In article <4...@intsc.UUCP>, to...@intsc.UUCP (Tom Kohrs) wrote:
> The problem with the POPF instruction (it would always enable interrupts)
> was only in the B-step parts (identifiable by markings of (c) Intel'82 or
> (c) Intel '83). The C-step and E-step have this problem fixed. Almost all
> of the B-step parts that were shipped went to IBM.

Well, what's the most current rev. level for the 80286, i.e., how recent
should mine be to avoid all known bugs? Chip bugs are normally top secret,
but I assume the current chip rev. level isn't sensitive.

> How much gets found and fixed before the parts ship in volume has
> more to do with marketing considerations than the technical correctness of
> the chip. Long term testing is done by trying to put the chip through every
> conceivable sequence of events (both hardware and software) and by following
> up on problem reports from the field. You would be amazed at what some people
> will try to do to a chip.

Well, gee, I can't think of any bugs in the MC68020 (not XC68020) offhand,
although the argument might be that they haven't been found yet, or that
Motorola has had better luck hiding them than Intel has had. Of course,
the 68000 did have the bug with the status register in which you could read
the privilege level directly from user mode (although this was later
documented as a feature :-).
--
James R. Van Artsdalen ...!ut-ngp!utastro!osi3b2!james "Live Free or Die"

Mark Campbell

Nov 29, 1986, 8:33:22 AM

In article <8...@reality1.uucp> ja...@reality1.UUCP (james) writes:
>In article <4...@intsc.UUCP>, to...@intsc.UUCP (Tom Kohrs) wrote:
>> How much gets found and fixed before the parts ship in volume has
>> more to do with marketing considerations than the technical correctness of
>> the chip. Long term testing is done by trying to put the chip through every
>> conceivable sequence of events (both hardware and software) and by following
>> up on problem reports from the field. You would be amazed at what some people
>> will try to do to a chip.
>
>Well, gee, I can't think of any bugs in the MC68020 (not XC68020) offhand,
>although the argument might be that they haven't been found yet, or that
>Motorola has had better luck hiding them than Intel has had. [...]

Would it have made any difference if Intel had called the I80286 the
XI80286 before a certain release of the chip? All of the so-called
X parts I have are labelled "MC68020". The argument you're putting forth
is syntactic; the semantics are identical.

What really irritates me is releasing different revisions of these
parts with no way for the S/W to be able to detect the different
revisions. It's terrible that Motorola went to the trouble of dumping
a revision number of the MC68020 in the microstate during certain
exceptions but has never updated that revision number. This means
that the bug fixes for the X parts must be retained in current releases
of software because there is no way in S/W to tell which machines
in the field have XC68020's in them.
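
The practical effect can be sketched as follows. This is a hypothetical illustration: `FIRST_FIXED_REVISION`, `needs_workaround`, and the convention of passing in a revision number (or `None` when the part offers nothing to read) are all assumptions made for the example, not Motorola's interface.

```python
FIRST_FIXED_REVISION = 2  # hypothetical: first revision with the bug fixed


def needs_workaround(reported_revision):
    """Decide whether software must keep a bug workaround enabled.

    reported_revision is whatever number the part reports, or None if
    the part gives software no way to ask (an assumed convention).
    """
    if reported_revision is None:
        # Nothing to go on: the only safe choice is to keep the workaround.
        return True
    return reported_revision < FIRST_FIXED_REVISION


# If the vendor bumps the number, fixed parts can drop the workaround.
print(needs_workaround(1), needs_workaround(2))
# If the number is never bumped, a fixed part still reports the old
# value and cannot be told apart from a faulty one.
print(needs_workaround(1))
```

A revision field that never changes puts software in the same position as having no field at all: the workaround has to stay in every release.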

Around here we have a joke that the first update of the revision
number for the MC68020 will be in the MC68030.

I'm not as familiar with the Intel parts so I don't know if they have
the same problems -- I would assume so. In any case, why
don't you microprocessor developers out there take pity on the
rest of us and update your revision numbers once in a while.
--

Mark Campbell Phone: (803)-791-6697 E-Mail: !ncsu!ncrcae!sauron!campbell

Chuck McManis

Dec 1, 1986, 3:39:42 PM

In article <8...@reality1.uucp>, ja...@reality1.uucp (james) writes:
> Well, what's the most current rev. level for the 80286, i.e., how recent
> should mine be to avoid all known bugs? Chip bugs are normally top secret,
> but I assume the current chip rev. level isn't sensitive.

James, I don't think chip bugs are "top secret"; ask your Intel sales rep
for an errata sheet. They can also generally tell you the current rev
level, or 'stepping' as the semiconductor trade likes to refer to it.

>
> Well, gee, I can't think of any bugs in the MC68020 (not XC68020) offhand,
> although the argument might be that they haven't been found yet, or that
> Motorola has had better luck hiding them than Intel has had. Of course,
> the 68000 did have the bug with the status register in which you could read
> the privilege level directly from user mode (although this was later
> documented as a feature :-).

Well gee, I bet you couldn't think of any '286 bugs offhand if you hadn't
actually been bit by one. CPUs in general are getting so complicated
that the time to test every transistor in a 32-bit CPU can often be
measured in hours; one of the biggest challenges facing a Production
Engineer is to get that testing time down as low as possible. And yes,
there were bugs in the 68020. Consider that the '286 is both a CPU and
an MMU in one package, and compare it against the '020 and a '451 or '851.
It is all quite silly; 99% of the 'bugs' are so obscure that they
are *never* seen by 99% of the users.

For a sobering look at the problems of testing VLSI look at some of the
recent Digital Design, EDN, and other technical mags.

--
--Chuck McManis
uucp: {anywhere}!sun!cmcmanis BIX: cmcmanis ARPAnet: cmcm...@sun.com
These opinions are my own and no one else's, but you knew that, didn't you?

Chris Lent

Dec 3, 1986, 6:33:44 AM

In article <97...@sun.uucp>, cmcm...@sun.uucp (Chuck McManis) writes:
> In article <8...@reality1.uucp>, ja...@reality1.uucp (james) writes:
> > Well, what's the most current rev. level for the 80286, i.e., how recent
> > should mine be to avoid all known bugs? Chip bugs are normally top secret,
> > but I assume the current chip rev. level isn't sensitive.
> James, I don't think chip bugs are "top secret"; ask your Intel sales rep
> for an errata sheet. They can also generally tell you the current rev
> level, or 'stepping' as the semiconductor trade likes to refer to it.

I was just wondering what the EASY (or relatively easy) ways are to get
mask-revision levels on various processors. From what I remember, one of the
long exception types on the MC68XXXX tacked a mask revision level onto the
machine state.

The idea here is that when an instruction is restarted, if the mask-revision
level doesn't match, something happens (I believe an exception). The
reason for this is that in a multi-processor environment an interrupted
instruction might be restarted on another processor. Does the 80XXX family
have a similar capability?

Also, from what I remember these long exception machine state dumps
vary in length for the 68000, 010 and 020, and with a little tricky code
it was possible to determine the processor type.
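
A sketch of the frame-length trick, written in Python purely to show the lookup. The byte counts are believed to be the bus-error stack frame sizes (7 words on the 68000, 29 words on the 68010, 16 or 46 words on the 68020), but they are assumptions here and should be checked against the Motorola user's manual for each part. The table and function names are made up for illustration. On real hardware the measurement itself would have to be done in assembly, by provoking a bus or address error and comparing the stack pointer before and after.

```python
# Bus-error stack frame sizes in bytes (assumed values; verify against
# the Motorola user's manual for each part before relying on them).
BUS_ERROR_FRAME_BYTES = {
    14: "68000",  # 7 words, no format word
    58: "68010",  # 29 words
    32: "68020",  # 16 words, short bus-cycle fault frame
    92: "68020",  # 46 words, long bus-cycle fault frame
}


def classify_cpu(frame_bytes):
    """Map a measured bus-error frame size to a processor name."""
    return BUS_ERROR_FRAME_BYTES.get(frame_bytes, "unknown")


print(classify_cpu(14), classify_cpu(58), classify_cpu(92), classify_cpu(20))
```

Note that this identifies the processor type only. It says nothing about the mask revision within a type, which is the harder problem raised earlier in the thread.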

Further, it can be VERY handy to be able to determine processor type from
software in boot-up code. Late PDP-11 and all Vaxen (so far :-) have
a simple instruction to grab processor type (and ECO level on the Vaxen).
This I) allows code to be easily processor specific and II) allows marketing
to enforce the "you MUST pay more for the same code on a larger machine in
the XXXYYYZZZ series of COMPLETELY compatible machines" dictum.

That's all
Chris Lent
----
--
Chris Lent ihnp4!allegra!phri!cooper!chris