Re: A20 line detection

Rod Pemberton

Jun 1, 2008, 7:49:18 AM

"Mike Gonta" <mike...@gmail.com> wrote in message
news:g1u0hm$bgv$1...@aioe.org...
> "Rod Pemberton" wrote in message
>
> > Mike Gonta's aeBIOS had a read/write check for memory wrap. (It didn't
> > work on the machine that is working with Kristof's image.)
>
> I believe the cause of the failure was the incorrect location
> of the wbinvd instruction.

Huh? Your code worked properly with WBINVD without the BIOS anti-virus
setting enabled... (once I found it). It was only with the BIOS anti-virus
setting enabled that it didn't work.

> In the revised version it is located between the move and the
> compare (there is also a redundant wbinvd at the beginning).

The check originally compared "fixed value" BIOS memory locations with
written values to see if they were fixed. My PC acted like locations in the
PC BIOS were writable RAM instead of unwritable ROM. This was due to the
BIOS anti-virus setting. I believe the setting enabled caching of the PC
BIOS in RAM just like the optional BIOS settings for other upper memory
regions, VESA BIOS, etc. IIRC, you rewrote the check so it didn't access
the BIOS regions.

> > The "Virus" setting in my BIOS copies the PC BIOS into RAM,
> > "enables" the RAM, unmaps the BIOS - to prevent writing BIOS
> > updates... He expected the BIOS values to be non-changeable
> > near 1Mb and had to rework his code.
>
> Why would the anti-virus allow the "non-changeable" BIOS to be
> changed?

I think the reason is above, but I can check without using your code to see
if writing or reading to my PC BIOS with the BIOS anti-virus setting on acts
like RAM or ROM...

Rod Pemberton

H. Peter Anvin

Jun 2, 2008, 12:46:00 PM

to Mike Gonta

Mike Gonta wrote:
>
> I believe the cause of the failure was the incorrect location
> of the wbinvd instruction.

> In the revised version it is located between the move and the
> compare (there is also a redundant wbinvd at the beginning).
>

There shouldn't be any need for WBINVD; A20 masking happens inside the
cache hierarchy.

-hpa

H. Peter Anvin

Jun 2, 2008, 9:09:43 PM

to Mike Gonta

Mike Gonta wrote:

> Actually the "wbinvd" instruction is used in a memory testing routine
> to determine if the A20 line is enabled.
>
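> ;; revised code
> detect_a20:
>     push  edi
>     mov   edi, 200000h-4
>     mov   eax, [edi]            ; save the original dword
>     mov   DWORD [edi], edi      ; write a test pattern (the address itself)
>     wbinvd
>     ;; correct location in code to flush the cache
>     ;; so that the read (compare) is not cached
>     cmp   DWORD [edi], edi      ; did the write stick?
>     mov   [edi], eax            ; restore the original dword (flags unchanged)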
>

Which is completely pointless, the only thing the cache flush does for
you here is provide a sizable delay.

-hpa

Robert Redelmeier

Jun 2, 2008, 9:33:32 PM

In alt.lang.asm H. Peter Anvin <h...@zytor.com> wrote in part:

> Which is completely pointless, the only thing the cache
> flush does for you here is provide a sizable delay.

If you want a big non-CPU dependent delay, try a HLT or five.
Don't worry, the machine will eat one HLT per interrupt --
at least the timer tick.

-- Robert

Maxim S. Shatskih

Jun 3, 2008, 9:31:19 AM

I'm just amazed.

Why reinvent the wheel and not clone Linux's A20 logic? It is
well-tested on lots of BIOSes, I think.

--
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
ma...@storagecraft.com
http://www.storagecraft.com

James Harris

Jun 3, 2008, 11:53:34 AM

On 3 Jun, 14:31, "Maxim S. Shatskih" <ma...@storagecraft.com> wrote:
> I'm just amazed.
>
> Why reinvent the wheel and not clone Linux's A20 logic? It is
> well-tested on lots of BIOSes, I think.

Wouldn't using Linux's A20 logic (or even using it as a basis for
one's code) fall under the terms of a derivative work and thus require
GNU licensing for one's own code? Not everyone wants to release under
that licence.

Maxim S. Shatskih

Jun 3, 2008, 12:22:01 PM

> Wouldn't using Linux's A20 logic (or even using it as a basis for
> one's code) fall under the terms of a derivative work and thus require
> GNU licensing for one's own code?

No. Just do not copy-paste the actual source.

Phil Carmody

Jun 3, 2008, 4:50:51 PM

"Maxim S. Shatskih" <ma...@storagecraft.com> writes:
>> Wouldn't using Linux's A20 logic (or even using it as a basis for
>> one's code) fall under the terms of a derivative work and thus require
>> GNU licensing for one's own code?
>
> No. Just do not copy-paste the actual source.

It would still be "based on" the GPL code, and therefore it
would be a derivative work.

Phil
--
Dear aunt, let's set so double the killer delete select all.
-- Microsoft voice recognition live demonstration

H. Peter Anvin

Jun 3, 2008, 7:02:13 PM

to Mike Gonta

Mike Gonta wrote:
>> Which is completely pointless,
>
> Hi Peter,
>
> Are you saying that a memory write to read only memory
> address 200000h-4 ( = 1 Meg - 4 if A20 line not enabled) will not be
> cached and read back on a memory read?
>

That is correct, the MTRRs (P6+) or KEN# logic (older CPUs) will prevent
the caching and cause the write to go to the bus (even if the RAM is
shadowed for the readonly direction.)

It's still a rather inefficient way to do it; poking a location in RAM
until it disambiguates is usually quite a bit quicker, especially if the
A20 line is already enabled and all you need is a test.

-hpa
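
A minimal sketch of the aliasing at issue, written as an off-target C model rather than real boot code. It is not the Linux or wraplinux implementation: `phys_addr`, `model_read`, `model_write`, `a20_test`, the 2 MiB toy memory, and the probe addresses 0x000200 and 0x100200 are all made up for illustration. It shows the "poke a location until it disambiguates" idea: keep changing one word in low RAM and compare it with the word 1 MiB above it.

```c
#include <stdbool.h>
#include <stdint.h>

/* With A20 masked, address bit 20 is forced to 0, so 1 MiB + x aliases x. */
static uint32_t phys_addr(uint32_t addr, bool a20_enabled)
{
    return a20_enabled ? addr : (addr & ~(UINT32_C(1) << 20));
}

/* Toy memory: 2 MiB of 32-bit words, enough to hold both probe locations. */
#define MODEL_WORDS (0x200000u / 4u)
static uint32_t model_mem[MODEL_WORDS];
static bool model_a20;  /* state of the modeled A20 gate */

static uint32_t model_read(uint32_t addr)
{
    return model_mem[phys_addr(addr, model_a20) / 4u];
}

static void model_write(uint32_t addr, uint32_t value)
{
    model_mem[phys_addr(addr, model_a20) / 4u] = value;
}

/*
 * Poke one location in low RAM with a changing value and compare it with
 * its alias 1 MiB higher. If the two ever differ, A20 is enabled. If they
 * match on every try, the locations are aliases and A20 is masked.
 */
static bool a20_test(int tries)
{
    const uint32_t low = 0x000200u, high = 0x100200u;
    uint32_t saved = model_read(low);
    uint32_t ctr = saved;
    bool enabled = false;

    while (tries-- > 0) {
        model_write(low, ++ctr);
        if (model_read(high) != ctr) {
            enabled = true;
            break;
        }
    }
    model_write(low, saved);  /* restore the probed location */
    return enabled;
}
```

With `model_a20 = false`, `a20_test(32)` returns false because every write shows up at both addresses; with `model_a20 = true` it returns true on the first try. Note that `phys_addr(0x200000 - 4, false)` is 0x100000 - 4, the "1 Meg - 4" alias mentioned above. The model ignores caches, store buffers, and timing, so it says nothing about whether a delay or flush is needed on real hardware.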

H. Peter Anvin

Jun 3, 2008, 7:04:15 PM

to Phil Carmody

Phil Carmody wrote:
> "Maxim S. Shatskih" <ma...@storagecraft.com> writes:
>>> Wouldn't using Linux's A20 logic (or even using it as a basis for
>>> one's code) fall under the terms of a derivative work and thus require
>>> GNU licensing for one's own code?
>> No. Just do not copy-paste the actual source.
>
> It would still be "based on" the GPL code, and therefore it
> would be a derivative work.

*As the author* of that code, I can say that I do not consider the
algorithm proprietary. Furthermore, the equivalent algorithm is also
implemented under the BSD license in the package "wraplinux",
specifically the file reloc/a20.S.

-hpa

Maxim S. Shatskih

Jun 4, 2008, 1:14:21 AM

> > It would still be "based on" the GPL code, and therefore it
> > would be a derivative work.
>
> *As the author* of that code, I can say that I do not consider the
> algorithm proprietary.

There are no proprietary _algorithms_ or _ideas_ in the GPL world.

Such things require a _patent_, which is not used in the GPL world.

Code - yes, algorithms - no.

H. Peter Anvin

Jun 4, 2008, 2:07:28 AM

to Maxim S. Shatskih

Maxim S. Shatskih wrote:
>>> It would still be "based on" the GPL code, and therefore it
>>> would be a derivative work.
>> *As the author* of that code, I can say that I do not consider the
>> algorithm proprietary.
>
> There are no proprietary _algorithms_ or _ideas_ in the GPL world.
>
> Such things require a _patent_, which is not used in the GPL world.
>
> Code - yes, algorithms - no.
>

Sort of - kind of. Law is a lot fuzzier than what we programmers like
to think, partly because a dispute, even if eventually resolved in
favor of the statute as written (which isn't certain), can be
extremely costly.

Either way, the original statement is something called an "estoppel".
What it is, in very simplified terms (a lawyer would roast me for this)
is a statement by a potential party in a legal dispute (in this case me
as author of the said code) that I do not intend to make a certain kind
of claim (remember that anyone can make a *claim* about the law --
doesn't mean that they are actually correct.)

So, anyway, go ahead and re-implement it, or just use the code that I
pointed out that is under the BSD license. With my blessings.

-hpa

H. Peter Anvin

Jun 8, 2008, 7:01:55 PM

to Mike Gonta

Mike Gonta wrote:
> This is one of the benefits of using a 32 bit PM BIOS extender to test
> a snippet of code that would normally be done in real mode.

You are aware that P6+ do not, by specification, support A20M# in
protected mode, right? In other words, you try to use A20M# in
protected mode in these chips, and you get G-d knows what.

In this case you're probably seeing a case of picking up the store from
the store buffer.

-hpa

H. Peter Anvin

Jun 8, 2008, 7:10:50 PM

to Mike Gonta

Thinking about it some more, you probably do need to execute a
serializing instruction of some sort (I use an I/O port reference since
it also provides a modicum of timing independence, at least as far as
the overall system is concerned) in order for the test to be valid.

It somewhat sucks to have to do, but it's a *lot* cheaper than a WBINVD,
by something like three orders of magnitude.

-hpa

Rod Pemberton

Jun 8, 2008, 8:26:41 PM

"H. Peter Anvin" <h...@zytor.com> wrote in message
news:484C64E...@zytor.com...

> Mike Gonta wrote:
> > This is one of the benefits of using a 32 bit PM BIOS extender to test
> > a snippet of code that would normally be done in real mode.
>
> You are aware that P6+ do not, by specification, support A20M# in
> protected mode, right?

So, this page isn't correct or up to date:
http://www.sandpile.org/ia32/legacy.htm

It does say A20M# is ignored for SMM.

> In other words, you try to use A20M# in
> protected mode in these chips, and you get G-d knows what.

What is the protected mode default A20M# state for P6+? enabled?

So, if one had a need to change the A20 state, e.g. prior to reboot, you
must switch the cpu back to real mode first, then set A20, yes?

--
My AMD K6-2 machine had fits with Ben Lunt's FYSOS code - believed to be an
A20 problem. I'm not sure what his fix was, but he said that he had been
enabling A20 in 'unreal' mode. Do you think that would cause A20 problems?

--
Since you're covering much about the A20, why do you write 0xdf or 0xdd to
enable/disable? e.g., instead of 0xff or 0xfd? I may have run across that
somewhere, but I don't recall at the moment...

--
What about other "suspicious" things I've read about A20 (e.g., old a.o.d.
posts...):
- should reset keyboard controller after A20 enable
- should write to port 0x64, 0xFF - to update I/O...
- should write to port 0x4F, al - any value in al, to flush I/O caching...
and provide a delay instead of jmp $+2
- do not use fast A20 (PS/2), before checking if slow A20 (kbd) enabled
- should pulse A20 after set

Rod Pemberton
PS Sorry, I dropped alt.comp.lang.assembler - can only reply to three...
Tack it back on.

H. Peter Anvin

Jun 8, 2008, 11:31:11 PM

to Mike Gonta

H. Peter Anvin wrote:
>
> Thinking about it some more, you probably do need to execute a
> serializing instruction of some sort (I use an I/O port reference since
> it also provides a modicum of timing independence, at least as far as
> the overall system is concerned) in order for the test to be valid.
>
> It somewhat sucks to have to do, but it's a *lot* cheaper than a WBINVD,
> by something like three orders of magnitude.
>

Thinking even more about it, it's pretty obvious: the x86 does not
guarantee that page table changes are visible until there is a
serializing instruction, and A20, being part of the TLB block, behaves
like it.

-hpa

H. Peter Anvin

Jun 8, 2008, 11:37:41 PM

to Rod Pemberton

Rod Pemberton wrote:
>> You are aware that P6+ do not, by specification, support A20M# in
>> protected mode, right?
>
> So, this page isn't correct or up to date:
> http://www.sandpile.org/ia32/legacy.htm
>
> It does say A20M# is ignored for SMM.
>
>> In other words, you try to use A20M# in
>> protected mode in these chips, and you get G-d knows what.
>
> What is the protected mode default A20M# state for P6+? enabled?

It's unsupported... you're not supposed to ever let A20 be disabled.
There are a number of errata related to this kind of handling,
especially with paging enabled (I don't know off the top of my head any
CPUs which have errata with A20 in protected mode with paging disabled.)

> So, if one had a need to change the A20 state, e.g. prior to reboot, you
> must switch the cpu back to real mode first, then set A20, yes?

Yes. Note that the standard state after system reset is A20 enabled in
the KBC, disabled in port 92h.

> with Ben Lunt's FYSOS code - believed to be an
> A20 problem. I'm not sure what his fix was, but he said that he had been
> enabling A20 in 'unreal' mode. Do you think that would cause A20 problems?
>
> --
> Since you're covering much about the A20, why do you write 0xdf or 0xdd to
> enable/disable? e.g., instead of 0xff or 0xfd? I may have run across that
> somewhere, but I don't recall at the moment...
>
> --
> What about other "suspicious" things I've read about A20 (e.g., old a.o.d.
> posts...):
> - should reset keyboard controller after A20 enable
> - should write to port 0x64, 0xFF - to update I/O...
> - should write to port 0x4F, al - any value in al, to flush I/O caching...
> and provide a delay instead of jmp $+2
> - do not use fast A20 (PS/2), before checking if slow A20 (kbd) enabled
> - should pulse A20 after set

1. You need to flush (not reset) the keyboard controller after write.
2. I have not seen anything like that.
3. JMP $+2 became obsolete with the Pentium. Using an I/O port (pretty
much any unused I/O port will do) is both more consistent in terms of
delay, and serializes I/O as well.
4. Correct, at least if you want suspend/resume to work. Some BIOSes
incorrectly don't save and restore the state of port 92h.
5. I have not seen anything like that.

Note, also, that if your BIOS supports it, INT 15h, AX=2401h is the best
way to enable A20.

-hpa
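
The ordering advice above (prefer the BIOS call, and only fall back to fast A20 after checking whether the keyboard-controller method worked) can be sketched as a small driver loop. This is an illustration under assumptions, not anyone's actual boot code: `a20_enable_in_order`, the callback types, and the `try_*` fakes are hypothetical names, and the fake machine exists only so the control flow can be exercised off-target.

```c
#include <stdbool.h>
#include <stddef.h>

/* One way of enabling A20 (BIOS call, keyboard controller, port 92h...). */
typedef void (*a20_method_fn)(void *ctx);
/* Returns true when a wraparound test shows A20 is enabled. */
typedef bool (*a20_probe_fn)(void *ctx);

enum { A20_ALREADY_ON = -1, A20_FAILED = -2 };

/*
 * Try the methods in the given order and re-test after each one, so a
 * later method (such as fast A20) only runs if the earlier ones failed.
 * Returns the index of the method that worked, A20_ALREADY_ON if nothing
 * had to be done, or A20_FAILED if no method worked.
 */
static int a20_enable_in_order(a20_probe_fn probe,
                               const a20_method_fn *methods,
                               size_t count, void *ctx)
{
    if (probe(ctx))
        return A20_ALREADY_ON;
    for (size_t i = 0; i < count; i++) {
        methods[i](ctx);
        if (probe(ctx))
            return (int)i;
    }
    return A20_FAILED;
}

/* Off-target fake machine: only the keyboard-controller path works. */
struct fake_machine {
    bool a20;
    int calls[3];   /* how often each method was invoked */
};

static bool fake_probe(void *ctx)
{
    return ((struct fake_machine *)ctx)->a20;
}

static void try_bios(void *ctx)   /* stands in for INT 15h, AX=2401h */
{
    ((struct fake_machine *)ctx)->calls[0]++;
}

static void try_kbc(void *ctx)    /* stands in for the 8042 sequence */
{
    struct fake_machine *m = ctx;
    m->calls[1]++;
    m->a20 = true;
}

static void try_fast(void *ctx)   /* stands in for port 92h */
{
    ((struct fake_machine *)ctx)->calls[2]++;
}

static const a20_method_fn fake_order[] = { try_bios, try_kbc, try_fast };
```

On the fake machine, `a20_enable_in_order(fake_probe, fake_order, 3, &m)` returns 1 and never calls `try_fast`, and a second call returns `A20_ALREADY_ON` without touching any method. A real probe would usually retry with delays before declaring a method failed.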

James Harris

Jun 9, 2008, 1:03:02 PM

On 8 Jun, 23:01, "H. Peter Anvin" <h...@zytor.com> wrote:
> Mike Gonta wrote:
> > This is one of the benefits of using a 32 bit PM BIOS extender to test
> > a snippet of code that would normally be done in real mode.
>
> You are aware that P6+ do not, by specification, support A20M# in
> protected mode, right? In other words, you try to use A20M# in
> protected mode in these chips, and you get G-d knows what.

This prompted me to look for some per-CPU data. I found this useful
page:

http://cpu.linuxmania.net/liste/cpuinfo/technical-data_CPU.htm

AMD mentions it differently but the Intel 486 data sheets clearly
confirm the undefined behaviour of A20M# under protected mode
and the real-mode-only nature of the A20M# line.

Rod Pemberton

Jun 9, 2008, 2:14:45 PM

"H. Peter Anvin" <h...@zytor.com> wrote in message

news:484CA585...@zytor.com...

> Rod Pemberton wrote:
> > What about other "suspicious" things I've read about A20 (e.g., old
> > a.o.d. posts...):
> > - should reset keyboard controller after A20 enable
> > - should write to port 0x64, 0xFF - to update I/O...
> > - should write to port 0x4F, al - any value in al, to flush I/O
> >   caching... and provide a delay instead of jmp $+2
> > - do not use fast A20 (PS/2), before checking if slow A20 (kbd) enabled
> > - should pulse A20 after set
>
> 1. You need to flush (not reset) the keyboard controller after write.
> 2. I have not seen anything like that.

That's odd...
http://groups.google.com/group/fa.linux.kernel/msg/ada7fac2ecb760a6

Various old usenet posts indicate Minix had/have it.

> 3. JMP $+2 became obsolete with the Pentium. Using an I/O port (pretty
> much any unused I/O port will do) is both more consistent in terms of
> delay, and serializes I/O as well.
> 4. Correct, at least if you want suspend/resume to work. Some BIOSes
> incorrectly don't save and restore the state of port 92h.
> 5. I have not seen anything like that.
>
> Note, also, that if your BIOS supports it, INT 15h, AX=2401h is the best
> way to enable A20.
>

Thanks,

Rod Pemberton

Jun 9, 2008, 3:25:29 PM

"Rod Pemberton" <do_no...@nohavenot.cmm> wrote in message
news:g2js2l$3b5$1...@aioe.org...

> "H. Peter Anvin" <h...@zytor.com> wrote in message
> news:484CA585...@zytor.com...
> > Rod Pemberton wrote:
> > > > What about other "suspicious" things I've read about A20 (e.g., old
> > > > a.o.d. posts...):
> > > > - should reset keyboard controller after A20 enable
> > > > - should write to port 0x64, 0xFF - to update I/O...
> > > > - should write to port 0x4F, al - any value in al, to flush I/O
> > > >   caching... and provide a delay instead of jmp $+2
> > > > - do not use fast A20 (PS/2), before checking if slow A20 (kbd) enabled
> > > > - should pulse A20 after set
> >
> > 1. You need to flush (not reset) the keyboard controller after write.
> > 2. I have not seen anything like that.
>
> That's odd...
> http://groups.google.com/group/fa.linux.kernel/msg/ada7fac2ecb760a6
>
> Various old usenet posts indicate Minix had/have it.
>

I also found it as part of a .pdf Google managed to cache, but for a
different reason (USB):

"4.2 USB A20 Gate Pass through Porting Notes

We can change A20 on/off by writing command (D1) into 8042-keyboard
controller by following sequence:

Cycle   Address   Data
Write   64h       D1h (1 or more) (Starts the Sequence)
Write   60h       xxh
Read    64h       N/A (0 or more)
Write   64h       0FFh

During the memory change mode (A20 on/off), if USB_RXC0[5] A20Gate
Pass Through Enabled (A20PTEN), SMI# will not be generated during
the sequence until it is finished even if the various enable bits
are set. Some DOS games (Doom2,..) may disable A20GATE for high
memory access and enable it as existed but without writing 64h port
with 0FFh. As a result, the USB host cannot generate IO trap SMI
(the sequence did not end yet) and lead the legacy USB keyboard/mouse
to fail. For the above reason as using 60/64 port IO trap, we must
enable USB_RX41[1]=1 to ignore the 4th step in the sequence.

The description of USB_RX41[1] is as following:

bit 1: A20gate Pass Through Option
0-Pass through A20GATE command sequence defined in UHCI
1- Don't pass through Write I/O port 64(0FFh) (Ignore the steps)
"

> > 3. JMP $+2 became obsolete with the Pentium. Using an I/O port (pretty
> > much any unused I/O port will do) is both more consistent in terms of
> > delay, and serializes I/O as well.
> > 4. Correct, at least if you want suspend/resume to work. Some BIOSes
> > incorrectly don't save and restore the state of port 92h.
> > 5. I have not seen anything like that.
> >
> > Note, also, that if your BIOS supports it, INT 15h, AX=2401h is the best
> > way to enable A20.

Rod Pemberton

Maxim S. Shatskih

Jun 9, 2008, 3:33:22 PM

> are set. Some DOS games (Doom2,..) may disable A20GATE for high
> memory access

PharLap DOS extender.

I wonder what is the need to _disable_ A20, except on exit of the DOS extender
back to DOS :-)

If you have no DOS - then there is no need to ever disable A20. Enable it once
and forever.

Rod Pemberton

Jun 9, 2008, 4:04:33 PM

"Rod Pemberton" <do_no...@nohavenot.cmm> wrote in message

news:g2k079$ibb$1...@aioe.org...

> "Rod Pemberton" <do_no...@nohavenot.cmm> wrote in message
> news:g2js2l$3b5$1...@aioe.org...
> > "H. Peter Anvin" <h...@zytor.com> wrote in message
> > news:484CA585...@zytor.com...
> > > Rod Pemberton wrote:
> > > > What about other "suspicious" things I've read about A20 (e.g., old
> > a.o.d.
> > > > posts...):

...

> > > > - should write to port 0x64, 0xFF - to update I/O...

...

It seems the sequence also has special meaning on AMD Geode's - where I'd
guess it can be used as an end-of-A20 indicator...:

"5.10.8 Theory - Force A20 Low Sequence
The FA20 sequence occurs frequently in DOS applications.
Mostly, the sequence is to set FA20 high; that is, do
not force address bit 20 to a 0. High is the default state of
this signal. To reduce the number of ASMIs caused by the
A20 sequence, KEL generates an ASMI only if the
GateA20 sequence would change the state of A20.
The A20 sequence is initiated with a write of D1h to I/O
Address 064h. On detecting this write, the KEL sets the
A20Sequence bit in HCE_Control (KEL Memory Offset
100h[5]). It captures the data byte in HCE_Input (KEL
Memory Offset 104h[7:0]), but does not set the InputFull bit
in HCE_Status (KEL Memory Offset 10Ch[1]). When
A20Sequence is set, a write of a value to I/O Address 060h
that has bit 1 set to a value different than A20State in
HCE_Control (KEL Memory Offset 100h[8]) causes Input-
Full to be set and causes an ASMI. An ASMI with both
InputFull and A20Sequence set indicates that the application
is trying to change the setting of FA20 on the keyboard
controller. However, when A20Sequence is set, and a write
of a value to I/O Address 060h that has bit 1 set to the
same value as A20State in HCE_Control is detected, then
no ASMI will occur.
As mentioned above, a write to I/O Address 064h of any
value other than D1h causes A20Sequence to be cleared.
If A20Sequence is active and a value of FFh is written to
I/O Address 064h, A20Sequence is cleared but InputFull is
not set. A write of any value other than D1h or FFh causes
InputFull to be set, which then causes an ASMI. A write of
FFh to I/O Address 064h when A20Sequence is not set
causes InputFull to be set. The current value of the
A20_Mask is maintained in two unconnected places. The
A20State bit in HCE_Control and bit 1 in Port A. The value
of A20State is only changed via a software write to
HCE_Control. It is set to 0 at reset. The value of bit 1 in
Port A changes on any write to Port A. From reset PortA[1]
is 1.
"

Apparently, a number of circuit designers believe that sequence is
standardized enough to use it for other purposes...

Rod Pemberton
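
To make the quoted Geode text easier to follow, here is a rough C model of the A20Sequence logic exactly as that excerpt describes it. It is a reading aid under assumptions, not AMD's implementation: `kel_model`, `kel_write_cmd`, `kel_write_data`, and `kel_handler_ack` are hypothetical names, "returns true" stands for "InputFull gets set" (which the excerpt ties to an ASMI), and the treatment of port 60h writes outside a sequence is a guess because the excerpt does not cover them.

```c
#include <stdbool.h>
#include <stdint.h>

struct kel_model {
    bool a20_sequence; /* A20Sequence bit in HCE_Control */
    bool a20_state;    /* A20State bit in HCE_Control, 0 at reset */
    bool input_full;   /* InputFull bit in HCE_Status */
};

/* Model a write to I/O port 64h. Returns true if it sets InputFull. */
static bool kel_write_cmd(struct kel_model *k, uint8_t value)
{
    if (value == 0xD1) {
        k->a20_sequence = true;   /* sequence starts, InputFull untouched */
        return false;
    }
    bool was_in_sequence = k->a20_sequence;
    k->a20_sequence = false;      /* any value other than D1h ends it */
    if (value == 0xFF && was_in_sequence)
        return false;             /* FFh closing a sequence: no InputFull */
    k->input_full = true;         /* other commands, or FFh on its own */
    return true;
}

/* Model a write to I/O port 60h. Returns true if it sets InputFull. */
static bool kel_write_data(struct kel_model *k, uint8_t value)
{
    bool wants_a20 = (value & 0x02) != 0;   /* bit 1 of the data byte */
    if (k->a20_sequence && wants_a20 == k->a20_state)
        return false;             /* A20 would not change: no ASMI */
    /* Either a real A20 change or, by assumption, an ordinary data write. */
    k->input_full = true;
    return true;
}

/* What an ASMI handler might do after servicing an A20 change:
   A20State only changes through a software write to HCE_Control. */
static void kel_handler_ack(struct kel_model *k, bool new_a20_state)
{
    k->a20_state = new_a20_state;
    k->input_full = false;
}
```

Starting from reset, the sequence D1h, DFh sets InputFull because bit 1 of DFh differs from A20State. Once the handler has recorded the new state, repeating D1h, DFh, FFh sets nothing, while an FFh written outside a sequence does set InputFull.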

H. Peter Anvin

Jun 9, 2008, 5:24:07 PM

to Rod Pemberton

Rod Pemberton wrote:
>>
>> "4.2 USB A20 Gate Pass through Porting Notes
>>
>> We can change A20 on/off by writing command (D1) into 8042-keyboard
>> controller by following sequence:
>>
>> Cycle Address Data
>> Write 64h D1h (1 or more) (Starts the Sequence)
>> Write 60h xxh
>> Read 64h N/A (0 or more)
>> Write 64h 0FFh
>> During the memory change mode (A20 on/off), if USB_RXC0[5] A20Gate
>> Pass Through Enabled (A20PTEN), SMI# will not be generated during
>> the sequence until it is finished even if the various enable bits
>> are set. Some DOS games (Doom2,..) may disable A20GATE for high
>> memory access and enable it as existed but without writing 64h port
>> with 0FFh. As a result, the USB host cannot generate IO trap SMI
>> (the sequence did not end yet) and lead the legacy USB keyboard/mouse
>> to fail. For the above reason as using 60/64 port IO trap, we must
>> enable USB_RX41[1]=1 to ignore the 4th step in the sequence.

This is very interesting; I have seen platforms on which legacy USB
stops working in boot loaders. Linux, obviously, doesn't care, but if
this turns out to be a suitable workaround it would be highly useful.

-hpa

H. Peter Anvin

Jun 9, 2008, 5:29:49 PM

to Rod Pemberton

Rod Pemberton wrote:
>
> That's odd...
> http://groups.google.com/group/fa.linux.kernel/msg/ada7fac2ecb760a6
>

Seems to have slipped my mind. Thanks for refreshing it.

-hpa

H. Peter Anvin

Jun 9, 2008, 6:00:01 PM

to Rod Pemberton

Rod Pemberton wrote:
>
> I also found it as part of a .pdf Google managed to cache, but for a
> different reason (USB):
>
> "4.2 USB A20 Gate Pass through Porting Notes
>
> We can change A20 on/off by writing command (D1) into 8042-keyboard
> controller by following sequence:
>
> Cycle Address Data
> Write 64h D1h (1 or more) (Starts the Sequence)
> Write 60h xxh
> Read 64h N/A (0 or more)
> Write 64h 0FFh

Okay, looked into it...

This is pretty much an argument if the command should be D1 DF or D1 FF.
It would be interesting to see if there are USB controllers which have
problems with the D1 DF form.

The UHCI spec does appear to demand the D1 FF form in order to use its
SMI mitigation mechanism, and the PDF document you seem to be referring
to (a VIA BIOS porting guide) is telling its BIOS developers to disable
exactly this SMI mitigation option in the UHCI controller, since it
doesn't work properly. However, this presumably means there are some
early BIOSes which have this mitigation mechanism enabled, and therefore
screw up if you need legacy USB support with A20 enabled.

This is not an issue for Linux, obviously, but for other users it might
be an issue.

-hpa

H. Peter Anvin

Jun 9, 2008, 6:08:33 PM

to Mike Gonta

Mike Gonta wrote:
>
> However I believe this to be a case of the data floating on the bus.
> An old technique of pre-charging the bus with a number of "CLD"
> instructions in order to be certain of the read from static memory
> also does the job. I've tested it with a series of five "CLD" or five
> "STD" instructions, your mileage may vary.
>

Don't think it has anything to do with that. You're probably either
flushing the pipeline or causing a trap to microcode.

-hpa

H. Peter Anvin

Jun 9, 2008, 6:12:40 PM

to Rod Pemberton

H. Peter Anvin wrote:
> Rod Pemberton wrote:
>>
>> I also found it as part of a .pdf Google managed to cache, but for a
>> different reason (USB):
>>
>> "4.2 USB A20 Gate Pass through Porting Notes
>>
>> We can change A20 on/off by writing command (D1) into 8042-keyboard
>> controller by following sequence:
>>
>> Cycle Address Data
>> Write 64h D1h (1 or more) (Starts the Sequence)
>> Write 60h xxh
>> Read 64h N/A (0 or more)
>> Write 64h 0FFh
>
> Okay, looked into it...
>
> This is pretty much an argument if the command should be D1 DF or D1 FF.
> It would be interesting to see if there are USB controllers which have
> problems with the D1 DF form.
>

Scratch that again. It's actually an issue of a null command (FF) sent
separately from the D1 DF command.

What's worse is that the above sequence is flatly wrong. It doesn't
include any synchronization between the command and data write at all.
I hate USB. It's broken in so many ways.

Anyway, the null command looks like it shouldn't hurt, I'm going to test
it out on a machine which I have which does have broken legacy USB.

-hpa
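
For illustration, this is roughly where the missing synchronization would go if the D1 / DF / FF sequence were written out with waits. It is a sketch under assumptions, not the Linux code: port I/O is routed through a hypothetical `port_io` callback struct so the sequence can be run off-target, `kbc_wait_empty`, `kbc_enable_a20`, and the `fake_kbc` test double are made-up names, and the spin limit is arbitrary. The status bits used (bit 0 output buffer full, bit 1 input buffer full on port 64h) are the standard 8042 ones.

```c
#include <stdbool.h>
#include <stdint.h>

/* Port I/O is abstracted so the sequence can be exercised off-target.
   On real hardware these would wrap the IN and OUT instructions. */
struct port_io {
    uint8_t (*inb)(void *ctx, uint16_t port);
    void (*outb)(void *ctx, uint16_t port, uint8_t value);
    void *ctx;
};

enum {
    KBC_DATA       = 0x60,   /* data port */
    KBC_CMD_STATUS = 0x64,   /* command (write) / status (read) port */
    KBC_STAT_OBF   = 0x01,   /* status bit 0: output buffer full */
    KBC_STAT_IBF   = 0x02,   /* status bit 1: input buffer full */
    KBC_SPIN_LIMIT = 100000  /* arbitrary timeout, an assumption */
};

/* Wait until the controller has consumed the last byte written to it,
   discarding any pending output byte along the way. */
static bool kbc_wait_empty(const struct port_io *io)
{
    for (int i = 0; i < KBC_SPIN_LIMIT; i++) {
        uint8_t status = io->inb(io->ctx, KBC_CMD_STATUS);
        if (status & KBC_STAT_OBF)
            (void)io->inb(io->ctx, KBC_DATA);   /* drop stale data */
        else if (!(status & KBC_STAT_IBF))
            return true;
    }
    return false;  /* controller never became ready */
}

/* D1 (write output port), DF (A20 on), then the FF null command,
   with a wait before each write and after the last one. */
static bool kbc_enable_a20(const struct port_io *io)
{
    if (!kbc_wait_empty(io)) return false;
    io->outb(io->ctx, KBC_CMD_STATUS, 0xD1);
    if (!kbc_wait_empty(io)) return false;
    io->outb(io->ctx, KBC_DATA, 0xDF);
    if (!kbc_wait_empty(io)) return false;
    io->outb(io->ctx, KBC_CMD_STATUS, 0xFF);
    return kbc_wait_empty(io);
}

/* Off-target fake controller, only for exercising the sequence. */
struct fake_kbc {
    uint16_t port[8];
    uint8_t value[8];
    int writes;
    int busy;      /* status reads left that still report IBF */
};

static uint8_t fake_inb(void *ctx, uint16_t port)
{
    struct fake_kbc *k = ctx;
    if (port == KBC_CMD_STATUS && k->busy > 0) {
        k->busy--;
        return KBC_STAT_IBF;
    }
    return 0;
}

static void fake_outb(void *ctx, uint16_t port, uint8_t value)
{
    struct fake_kbc *k = ctx;
    if (k->writes < 8) {
        k->port[k->writes] = port;
        k->value[k->writes] = value;
    }
    k->writes++;
    k->busy = 2;   /* pretend the byte takes two polls to be consumed */
}
```

Wiring `fake_inb` and `fake_outb` into a `port_io` and calling `kbc_enable_a20` records three writes in order: D1h to 64h, DFh to 60h, FFh to 64h. Whether the trailing FFh actually helps legacy USB on any given machine is the open question in this thread; the sketch only shows the sequencing.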

H. Peter Anvin

Jun 9, 2008, 7:09:14 PM

to Rod Pemberton

H. Peter Anvin wrote:
>
> Scratch that again. It's actually an issue of a null command (FF) sent
> separately from the D1 DF command.
>
> What's worse is that the above sequence is flatly wrong. It doesn't
> include any synchronization between the command and data write at all.
> I hate USB. It's broken in so many ways.
>
> Anyway, the null command looks like it shouldn't hurt, I'm going to test
> it out on a machine which I have which does have broken legacy USB.
>

Hm. I can't seem to replicate the issue on any of the machines I
*thought* had issues with this stuff. The only machine I know for sure
I have in my stash with this particular bug is missing a (nonstandard)
power supply. Bloody hell.

-hpa

H. Peter Anvin

Jun 9, 2008, 7:56:21 PM

to Rod Pemberton

H. Peter Anvin wrote:
>
> Scratch that again. It's actually an issue of a null command (FF) sent
> separately from the D1 DF command.
>
> What's worse is that the above sequence is flatly wrong. It doesn't
> include any synchronization between the command and data write at all.
> I hate USB. It's broken in so many ways.
>
> Anyway, the null command looks like it shouldn't hurt, I'm going to test
> it out on a machine which I have which does have broken legacy USB.
>

Okay, I'm starting to guess here what's going on.

FF is a "pulse output" command that doesn't actually pulse anything, but
it probably takes about as long as any other pulse command; 6 µs
according to aeb's website, but that doesn't include the I/O delays
imposed by the KBC itself, which might very well help with
synchronization on machines with hideously slow KBCs like some old
Toshiba laptops.

Then some "clever" person working on UHCI decided that this was "the
standard A20 sequence." God, I hate USB. Everywhere you look there is
braindamage.

-hpa

Rod Pemberton

Jun 10, 2008, 12:58:12 PM

"H. Peter Anvin" <h...@zytor.com> wrote in message

news:484DC325...@zytor.com...

> H. Peter Anvin wrote:
> >
> > Scratch that again. It's actually an issue of a null command (FF) sent
> > separately from the D1 DF command.
> >
> > What's worse is that the above sequence is flatly wrong. It doesn't
> > include any synchronization between the command and data write at all.
> > I hate USB. It's broken in so many ways.
> >
> > Anyway, the null command looks like it shouldn't hurt, I'm going to test
> > it out on a machine which I have which does have broken legacy USB.
> >
>
> Okay, I'm starting to guess here what's going on.
>
> FF is a "pulse output" command that doesn't actually pulse anything, but
> it probably takes about as long as any other pulse command; 6 µs
> according to aeb's website, but that doesn't include the I/O delays
> imposed by the KBC itself,

Ah, yes, 6us - most of what I got says 6ms... PS/2 Tech ref says:
"... pulses... for approximately six microseconds."

Anyway, it's not always 6us for 0xFF. Take this RadiSys 82600 High
Integration Dual PCI System Controller:

"5.37.2 GATEA20 and RESET
Since the D1 command takes place in the same bus cycle it is issued, a
change in GATEA20 will happen immediately. An Fx command will take from 6.4
to 128 us to generate a reset depending on the clock speed. An FF command
will take place in the same bus cycle it is issued."

> which might very well help with
> synchronization on machines with hideously slow KBCs like some old
> Toshiba laptops.
>
> Then some "clever" person working on UHCI decided that this was "the
> standard A20 sequence." God, I hate USB. Everywhere you look there is
> braindamage.

What? Wait... Which USB doc standardizes it? UHCI?

Rod Pemberton

H. Peter Anvin

Jun 10, 2008, 1:52:40 PM

to Rod Pemberton

Rod Pemberton wrote:
>>
>> Then some "clever" person working on UHCI decided that this was "the
>> standard A20 sequence." God, I hate USB. Everywhere you look there is
>> braindamage.
>
> What? Wait... Which USB doc standardizes it? UHCI?
>

UHCI.

-hpa

Rod Pemberton

Jun 11, 2008, 6:24:35 PM

"H. Peter Anvin" <h...@zytor.com> wrote in message

news:484DAAD8...@zytor.com...

> What's worse is that the above sequence is flatly wrong. It doesn't
> include any synchronization between the command and data write at all.

FYI, one PC of mine fails to enable A20 with the UHCI sequence (pre-USB),
while another succeeds (UHCI)...

I'd guess that should be the first A20 method tried on a UHCI machine to
ensure the USB "A20 Gate Pass Through Sequence" has the best chance of
succeeding to prevent the cpu from entering SMM for A20 enable/disable.
You'd have to fallback to other A20 enables/disables, if it fails.

Regretting being "*the author* of that code", yet? ;-)

Rod Pemberton

H. Peter Anvin

Jun 11, 2008, 7:11:27 PM

to Rod Pemberton

Rod Pemberton wrote:
>
> FYI, one PC of mine fails to enable A20 with the UHCI sequence (pre-USB),
> while another succeeds (UHCI)...
>
> I'd guess that should be the first A20 method tried on a UHCI machine to
> ensure the USB "A20 Gate Pass Through Sequence" has the best chance of
> succeeding to prevent the cpu from entering SMM for A20 enable/disable.
> You'd have to fallback to other A20 enables/disables, if it fails.

Preventing the CPU from entering SMM is generally not a goal (it speeds
things up slightly, but at the risk of causing other problems) and IMO
it was a major mistake on the part of the UHCI designers to try to make
that happen, especially since they did so using incorrect criteria.

The UHCI sequence is almost certainly wrong - it doesn't specify
synchronization between the port 64 and port 60 write. However, it
might still mean that if you care about legacy USB after A20 enabling,
issuing the null command might still help.

Note, again, that Linux doesn't care: it never uses legacy USB after
entering protected mode.

> Regretting being "*the author* of that code", yet? ;-)

Not at all. I want to explore this issue and have put out a few feelers
for people with machines that exhibit USB keyboard lockups. In
particular, there is a set of Dell machines with a particular BIOS
revision that is known to be problematic that this might help with... or
it might not, but it's the best lead I've seen so far.

Debugging by Internet rumour...

-hpa

Rod Pemberton

Jun 11, 2008, 9:26:09 PM

to

"H. Peter Anvin" <h...@zytor.com> wrote in message

news:48505B9F...@zytor.com...

>
> Preventing the CPU from entering SMM is generally not a goal (it speeds
> things up slightly, but at the risk of causing other problems)
>

Well, I for one am not familiar with what, if anything, is done in SMM. If
anything has been standardized, I missed it. So, perhaps my perspective was
a bit simplistic compared to yours:

1) One doesn't know if SMM will do something to an A20 enable/disable
sequence, what SMM will do if it does something, or if SMM will do it
correctly some or all of the time.
2) A20 enable/disable by the keyboard controller is very reliable and must
work properly.
3) "A20GATE Pass Through Sequence," if enabled, allows one to choose a
trusted method (keyboard controller) over some unknown method (SMM) - even
if the sequence only works for certain machines, e.g. USB UHCI.

I.e., I'd try to prefer the keyboard controller over other options
because I trust it.

> it was a major mistake on the part of the UHCI designers to try to make
> that happen, especially since they did so using incorrect criteria.

I don't know what their criteria were. I'm assuming they realized that SMM
implementing an A20 enable/disable:
1) could destroy code compatibility
2) was unnecessary for most code, OSes, and BIOSes, so they allowed a
pass-through

> that Linux doesn't care: never uses legacy USB
> after entering protected mode.

Was/is the SMM trapping of I/O ports only for real modes?...

Was/is the "A20GATE Pass Through Sequence" only for real modes?...

Rod Pemberton

H. Peter Anvin

Jun 11, 2008, 11:48:18 PM

to Rod Pemberton

Rod Pemberton wrote:
> 3) "A20GATE Pass Through Sequence," if enabled, allows one to choose a
> trusted method (keyboard controller) over some unknown method (SMM) - even
> if the sequence only works for certain machines, e.g. USB UHCI.
>
> I.e., I'd try to prefer the keyboard controller over other options
> because I trust it.

This would have been true had the "A20GATE Pass Through Sequence"
actually been correct for all KBCs, which it isn't.

>> it was a major mistake on the part of the UHCI designers to try to make
>> that happen, especially since they did so using incorrect criteria.
>
> I don't know what their criteria were. I'm assuming they realized that SMM
> implementing an A20 enable/disable:
> 1) could destroy code compatibility
> 2) was unnecessary for most code, OSes, and BIOSes, so they allowed a
> pass-through

More likely they were worried about performance for DOS programs. It
would have been better if they had told the BIOS vendors to implement
INT 15h, AX=2401h. It would have been even better if they had gotten
their CPU vendors to add an A20 enable override fully internal to the CPU.

>> that Linux doesn't care: never uses legacy USB
>> after entering protected mode.
>
> Was/is the SMM trapping of I/O ports only for real modes?...

No.

> Was/is the "A20GATE Pass Through Sequence" only for real modes?...

You're not supposed to leave A20 disabled in protected mode;
*definitely* not with paging on.

-hpa

x

Jun 12, 2008, 12:30:32 AM

to H. Peter Anvin, Mike Gonta

One small, tangential note:

H. Peter Anvin wrote:
> Thinking about it some more, you probably do need to execute a
> serializing instruction of some sort (I use an I/O port reference since
> it also provides a modicum of timing independence, at least as far as
> the overall system is concerned) in order for the test to be valid.

I/Os are *not* architecturally defined as serializing instructions.
Volume 3 has the short list of what is really a serializing instruction.
In non-privileged mode, you are pretty much left with CPUID.
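
As a rough illustration of that last point, a user-mode serializing point built on CPUID might look like the sketch below. It assumes GCC or Clang inline-assembly syntax on x86; the function name `serialize_with_cpuid` is made up for this example, and on other targets it simply reports that nothing was executed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Executes CPUID (leaf 0) as a serializing instruction.
 * Returns true if CPUID was executed, false on non-x86 builds. */
bool serialize_with_cpuid(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    uint32_t eax = 0, ebx, ecx = 0, edx;
    /* "memory" clobber also keeps the compiler from reordering
     * memory accesses across this point. */
    __asm__ __volatile__("cpuid"
                         : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
                         :
                         : "memory");
    (void)ebx;
    (void)edx;
    return true;
#else
    return false;
#endif
}
```

The results of CPUID are thrown away on purpose; the instruction is used here only for its ordering side effect.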

I/Os are almost always *effectively* serializing, just because they take
so long (they really are slow, or the chipset combination forces them to
be slow to have legacy behavior). I/Os do also have defined ordering
effects on some blocks (store buffer, etc), but are not truly
"serializing instructions".

H. Peter Anvin

Jun 12, 2008, 1:12:51 AM

to x, Mike Gonta

This is true, of course, but I/O instructions, as well as uncached
memory references, are fully serializing from a memory-I/O-bus
consistency point of view. I was being sloppy with what exact meaning
of "serializing" I was using; they are, indeed, not serializing in the
x86 architectural sense.

-hpa

James Harris

Jun 12, 2008, 6:30:21 AM

to

Do you mean they are serializing purely because they are slow and thus
allow any write buffers to drain to memory? If so I'm a little
uncomfortable with this. What's to stop a PC being manufactured which
has faster I/O? Some of the PCI ports can be memory speed, can they
not?

Or am I mixing up two concepts...?

Rod Pemberton

Jun 12, 2008, 12:05:09 PM

to

"x" <x...@x.com> wrote in message news:4850A668...@x.com...

> One small, tangential note:
>
> H. Peter Anvin wrote:
> > Thinking about it some more, you probably do need to execute a
> > serializing instruction of some sort (I use an I/O port reference since
> > it also provides a modicum of timing independence, at least as far as
> > the overall system is concerned) in order for the test to be valid.
>
> I/Os are *not* architecturally defined as serializing instructions.
> Volume 3 has the short list of what is really a serializing instruction.
> In non-privileged mode, you are pretty much left with CPUID.
>

Sandpile.org has a nice list of serializing instructions and some info on
their interactions/complications/usage:
http://www.sandpile.org/ia32/coherent.htm

> I/Os are almost always *effectively* serializing, just because they take
> so long (they really are slow, or the chipset combination forces them to
> be slow to have legacy behavior). I/Os do also have defined ordering
> effects on some blocks (store buffer, etc), but are not truly
> "serializing instructions".

Sandpile.org calls these "store buffer draining" on the same page.

Rod Pemberton
PS I dropped alt.comp.lang.assembler. Reply to three only... Tack it back
on if you reply.

H. Peter Anvin

Jun 12, 2008, 5:23:02 PM

to James Harris

James Harris wrote:
>
> Do you mean they are serializing purely because they are slow and thus
> allow any write buffers to drain to memory? If so I'm a little
> uncomfortable with this. What's to stop a PC being manufactured which
> has faster I/O? Some of the PCI ports can be memory speed, can they
> not?
>
> Or am I mixing up two concepts...?

An uncached access (which includes port I/O) will drain all buffers and
write combiners ahead of it, in order to produce an in-order sequence of
events visible to external devices. Such accesses are serializing against
main memory as seen by other CPUs and the DMA transactor, but do not flush
caches.

-hpa

H. Peter Anvin

Jun 27, 2008, 4:17:28 PM

to

H. Peter Anvin wrote:
>
> The UHCI sequence is almost certainly wrong - it doesn't specify
> synchronization between the port 64 and port 60 writes. However, if you
> care about legacy USB after A20 enabling, issuing the null command
> might still help.
>
> Note, again, that Linux doesn't care: it never uses legacy USB after
> entering protected mode.
>

> Debugging by Internet rumour...
>

After quite a few false starts, I have successfully tracked down one
system (an HP DL360 G5) which does, indeed, require writing FF to port 64
in order for legacy USB to not lock up. Again, it's not an issue for Linux
per se, since it will not be using legacy USB, but for other users of
this A20-toggling algorithm, it should be added.

http://git.etherboot.org/?p=wraplinux.git;a=commitdiff;h=db2a9ea510c3561a937574338740edffc118bc0d

-hpa