Huh? Your code worked properly with WBINVD without the BIOS anti-virus
setting enabled... (once I found it). It was only with the BIOS anti-virus
setting enabled that it didn't work.
> In the revised version it is located between the move and the
> compare (there is also a redundant wbinvd at the beginning).
The check originally compared "fixed value" BIOS memory locations with
written values to see if they were fixed. My PC acted like locations in the
PC BIOS were writable RAM instead of unwritable ROM. This was due to the
BIOS anti-virus setting. I believe the setting enabled caching of the PC
BIOS in RAM just like the optional BIOS settings for other upper memory
regions, VESA BIOS, etc. IIRC, you rewrote the check so it didn't access
the BIOS regions.
> > The "Virus" setting in my BIOS copies the PC BIOS into RAM,
> > "enables" the RAM, unmaps the BIOS - to prevent writing BIOS
> > updates... He expected the BIOS values to be non-changeable
> > near 1Mb and had to rework his code.
>
> Why would the anti-virus allow the "non-changeable" BIOS to be
> changed?
I think the reason is above, but I can check without using your code to see
if writing or reading to my PC BIOS with the BIOS anti-virus setting on acts
like RAM or ROM...
Rod Pemberton
There shouldn't be any need for WBINVD; A20 masking happens inside the
cache hierarchy.
-hpa
Which is completely pointless, the only thing the cache flush does for
you here is provide a sizable delay.
-hpa
If you want a big non-CPU dependent delay, try a HLT or five.
Don't worry, the machine will eat one HLT per interrupt --
at least the timer tick.
-- Robert
Why reinvent the wheel and not clone Linux's A20 logic? It is
well-tested on lots of BIOSes, I think.
--
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
ma...@storagecraft.com
http://www.storagecraft.com
Wouldn't using Linux's A20 logic (or even using it as a basis for
one's code) fall under the terms of a derivative work and thus require
GNU licensing for one's own code? Not everyone wants to release under
that licence.
No. Just do not copy-paste the actual source.
It would still be "based on" the GPL code, and therefore it
would be a derivative work.
Phil
--
Dear aunt, let's set so double the killer delete select all.
-- Microsoft voice recognition live demonstration
That is correct, the MTRRs (P6+) or KEN# logic (older CPUs) will prevent
the caching and cause the write to go to the bus (even if the RAM is
shadowed for the readonly direction.)
It's still a rather inefficient way to do it; poking a location in RAM
until it disambiguates is usually quite a bit quicker, especially if the
A20 line is already enabled and all you need is a test.
-hpa
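A minimal sketch of the "poke a location until it disambiguates" idea, run against a simulated machine so it can be tried in user space. Everything here is an assumption for illustration: the `sim_machine` model, the test addresses 0x000500 and 0x100500, and the loop count are hypothetical and are not taken from anyone's code in this thread. With A20 disabled the two addresses alias, so the high address always tracks the counter; with A20 enabled they differ after at most two pokes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define A20_BIT  (1u << 20)
#define MEM_SIZE (A20_BIT + 0x10000u)   /* 1 MiB plus 64 KiB, simulated */

/* Hypothetical model: flat memory plus an A20 gate. */
struct sim_machine {
    uint8_t mem[MEM_SIZE];
    bool    a20_enabled;
};

/* With A20 disabled, address bit 20 is forced to 0 (1 MiB wraparound). */
static uint32_t sim_phys(const struct sim_machine *m, uint32_t addr)
{
    if (!m->a20_enabled)
        addr &= ~A20_BIT;
    assert(addr + 4 <= MEM_SIZE);
    return addr;
}

static uint32_t sim_read32(const struct sim_machine *m, uint32_t addr)
{
    uint32_t v;
    memcpy(&v, &m->mem[sim_phys(m, addr)], sizeof v);
    return v;
}

static void sim_write32(struct sim_machine *m, uint32_t addr, uint32_t v)
{
    memcpy(&m->mem[sim_phys(m, addr)], &v, sizeof v);
}

/* Returns true once the low and high addresses are seen to differ. */
static bool a20_test(struct sim_machine *m, int loops)
{
    const uint32_t low = 0x000500u, high = 0x100500u;
    uint32_t saved = sim_read32(m, low);
    uint32_t ctr = saved;
    bool enabled = false;

    while (loops-- > 0) {
        sim_write32(m, low, ++ctr);
        /* On real hardware an I/O delay would go here before the compare. */
        if (sim_read32(m, high) != ctr) {
            enabled = true;
            break;
        }
    }
    sim_write32(m, low, saved);   /* restore the poked location */
    return enabled;
}
```

A single poke can be fooled when the high address happens to hold the same value already, which is why the loop changes the value and looks again rather than comparing once.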
*As the author* of that code, I can say that I do not consider the
algorithm proprietary. Furthermore, the equivalent algorithm is also
implemented under the BSD license in the package "wraplinux",
specifically the file reloc/a20.S.
-hpa
There are no proprietary _algorithms_ or _ideas_ in the GPL world.
Such things require a _patent_, and patents are not used in the GPL world.
Code - yes, algorithms - no.
Sort of - kind of. Law is a lot fuzzier than what we programmers like
to think, partly because a dispute, even if eventually
resolved in favor of the statute as written (which isn't certain), can be
extremely costly.
Either way, the original statement is something called an "estoppel".
What it is, in very simplified terms (a lawyer would roast me for this)
is a statement by a potential party in a legal dispute (in this case me
as author of the said code) that I do not intend to make a certain kind
of claim (remember that anyone can make a *claim* about the law --
doesn't mean that they are actually correct.)
So, anyway, go ahead and re-implement it, or just use the code that I
pointed out that is under the BSD license. With my blessings.
-hpa
You are aware that P6+ do not, by specification, support A20M# in
protected mode, right? In other words, you try to use A20M# in
protected mode in these chips, and you get G-d knows what.
In this case you're probably seeing a case of picking up the store from
the store buffer.
-hpa
Thinking about it some more, you probably do need to execute a
serializing instruction of some sort (I use an I/O port reference since
it also provides a modicum of timing independence, at least as far as
the overall system is concerned) in order for the test to be valid.
It somewhat sucks to have to do, but it's a *lot* cheaper than a WBINVD,
by something like three orders of magnitude.
-hpa
So, this page isn't correct or up to date:
http://www.sandpile.org/ia32/legacy.htm
It does say A20M# is ignored for SMM.
> In other words, you try to use A20M# in
> protected mode in these chips, and you get G-d knows what.
What is the protected mode default A20M# state for P6+? enabled?
So, if one had a need to change the A20 state, e.g. prior to reboot, you
must switch the cpu back to real mode first, then set A20, yes?
--
My AMD K6-2 machine had fits with Ben Lunt's FYSOS code - believed to be an
A20 problem. I'm not sure what his fix was, but he said that he had been
enabling A20 in 'unreal' mode. Do you think that would cause A20 problems?
--
Since you're covering much about the A20, why do you write 0xdf or 0xdd to
enable/disable? e.g., instead of 0xff or 0xfd? I may have run across that
somewhere, but I don't recall at the moment...
--
What about other "suspicious" things I've read about A20 (e.g., old a.o.d.
posts...):
- should reset keyboard controller after A20 enable
- should write to port 0x64, 0xFF - to update I/O...
- should write to port 0x4F, al - any value in al, to flush I/O caching...
and provide a delay instead of jmp $+2
- do not use fast A20 (PS/2), before checking if slow A20 (kbd) enabled
- should pulse A20 after set
Rod Pemberton
PS Sorry, I dropped alt.comp.lang.assembler - can only reply to three...
Tack it back on.
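On the 0xdf/0xdd versus 0xff/0xfd question above, a small sketch that just decodes the values. It assumes the commonly documented 8042 output-port layout (bit 0 is the system reset line, active low; bit 1 is the A20 gate); the helper names are hypothetical. Bit 5 is deliberately left uninterpreted, since the thread does not settle what it means.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define KBC_OUT_RESET_N 0x01u  /* bit 0: reset line, 0 = reset asserted (assumed layout) */
#define KBC_OUT_A20     0x02u  /* bit 1: A20 gate, 1 = A20 enabled (assumed layout) */

/* True if this output-port value leaves A20 enabled. */
static bool kbc_out_a20_enabled(uint8_t v)
{
    return (v & KBC_OUT_A20) != 0;
}

/* True if this output-port value would hold the system in reset. */
static bool kbc_out_asserts_reset(uint8_t v)
{
    return (v & KBC_OUT_RESET_N) == 0;
}

/* Bits in which two candidate values differ. */
static uint8_t kbc_out_diff(uint8_t a, uint8_t b)
{
    return (uint8_t)(a ^ b);
}
```

Under that layout 0xdf and 0xff both enable A20 and 0xdd and 0xfd both disable it; the two conventions differ from each other only in bit 5.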
Thinking even more about it, it's pretty obvious: the x86 does not
guarantee that page table changes are visible until there is a
serializing instruction, and A20, being part of the TLB block, behaves
like it.
-hpa
It's unsupported... you're not supposed to ever let A20 be disabled.
There are a number of errata related to this kind of handling,
especially with paging enabled (I don't know off the top of my head any
CPUs which have errata with A20 in protected mode with paging disabled.)
> So, if one had a need to change the A20 state, e.g. prior to reboot, you
> must switch the cpu back to real mode first, then set A20, yes?
Yes. Note that the standard state after system reset is A20 enabled in
the KBC, disabled in port 92h.
> with Ben Lunt's FYSOS code - believed to be an
> A20 problem. I'm not sure what his fix was, but he said that he had been
> enabling A20 in 'unreal' mode. Do you think that would cause A20 problems?
>
> --
> Since you're covering much about the A20, why do you write 0xdf or 0xdd to
> enable/disable? e.g., instead of 0xff or 0xfd? I may have run across that
> somewhere, but I don't recall at the moment...
>
> --
> What about other "suspicious" things I've read about A20 (e.g., old a.o.d.
> posts...):
> - should reset keyboard controller after A20 enable
> - should write to port 0x64, 0xFF - to update I/O...
> - should write to port 0x4F, al - any value in al, to flush I/O caching...
> and provide a delay instead of jmp $+2
> - do not use fast A20 (PS/2), before checking if slow A20 (kbd) enabled
> - should pulse A20 after set
1. You need to flush (not reset) the keyboard controller after write.
2. I have not seen anything like that.
3. JMP $+2 became obsolete with the Pentium. Using an I/O port (pretty
much any unused I/O port will do) is both more consistent in terms of
delay, and serializes I/O as well.
4. Correct, at least if you want suspend/resume to work. Some BIOSes
incorrectly don't save and restore the state of port 92h.
5. I have not seen anything like that.
Note, also, that if your BIOS supports it, INT 15h, AX=2401h is the best
way to enable A20.
-hpa
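For the port 92h ("fast A20") fallback mentioned in point 4, a hedged sketch of how the new register value might be computed. It assumes the usual System Control Port A layout, where bit 1 is the A20 gate and bit 0 triggers a fast reset when set; the function names are hypothetical, and the actual read and write of port 92h are left out so the logic can be checked in user space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PORT92_FAST_RESET 0x01u  /* bit 0: writing 1 resets the machine (assumed layout) */
#define PORT92_A20        0x02u  /* bit 1: A20 gate (assumed layout) */

/* True if the value read from port 92h already has A20 enabled. */
static bool port92_a20_enabled(uint8_t current)
{
    return (current & PORT92_A20) != 0;
}

/* Value to write back: set A20, never set the reset bit, keep the rest. */
static uint8_t port92_enable_a20(uint8_t current)
{
    return (uint8_t)((current | PORT92_A20) & (uint8_t)~PORT92_FAST_RESET);
}
```

Masking bit 0 on the way out matters because a stray 1 there reboots the machine instead of enabling A20.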
This prompted me to look for some per-CPU data. I found this useful
page:
http://cpu.linuxmania.net/liste/cpuinfo/technical-data_CPU.htm
AMD mentions it differently but the Intel 486 data sheets clearly
confirm the undefined behaviour of A20M# under protected mode
and the real-mode-only nature of the A20M# line.
That's odd...
http://groups.google.com/group/fa.linux.kernel/msg/ada7fac2ecb760a6
Various old usenet posts indicate Minix had/have it.
> 3. JMP $+2 became obsolete with the Pentium. Using an I/O port (pretty
> much any unused I/O port will do) is both more consistent in terms of
> delay, and serializes I/O as well.
> 4. Correct, at least if you want suspend/resume to work. Some BIOSes
> incorrectly don't save and restore the state of port 92h.
> 5. I have not seen anything like that.
>
> Note, also, that if your BIOS supports it, INT 15h, AX=2401h is the best
> way to enable A20.
>
Thanks,
Rod Pemberton
I also found it as part of a .pdf Google managed to cache, but for a
different reason (USB):
"4.2 USB A20 Gate Pass through Porting Notes
We can change A20 on/off by writing command (D1) into 8042-keyboard
controller by following sequence:
Cycle Address Data
Write 64h D1h (1 or more) (Starts the Sequence)
Write 60h xxh
Read 64h N/A (0 or more)
Write 64h 0FFh
During the memory change mode (A20 on/off), if USB_RXC0[5] A20Gate
Pass Through Enabled (A20PTEN), SMI# will not be generated during
the sequence until it is finished even if the various enable bits
are set. Some DOS games (Doom2,..) may disable A20GATE for high
memory access and enable it as existed but without writing 64h port
with 0FFh. As a result, the USB host cannot generate IO trap SMI
(the sequence did not end yet) and lead the legacy USB keyboard/mouse
to fail. For the above reason as using 60/64 port IO trap, we must
enable USB_RX41[1]=1 to ignore the 4th step in the sequence.
The description of USB_RX41[1] is as following:
bit 1: A20gate Pass Through Option
0-Pass through A20GATE command sequence defined in UHCI
1- Don't pass through Write I/O port 64(0FFh) (Ignore the steps)
"
Rod Pemberton
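The four-step sequence in the excerpt above can be modelled as a small recognizer, which makes the Doom2 failure easy to see: without the final FFh write the sequence never closes. This is only a sketch of the excerpt's table; the names are hypothetical, and treating accesses the table does not list as "ignored" is an assumption, not something the document states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum io_kind { IO_READ, IO_WRITE };
enum a20pt_state { PT_IDLE, PT_GOT_D1, PT_GOT_DATA };

struct a20pt { enum a20pt_state state; };

/* Feed one I/O access to the recognizer. */
static void a20pt_step(struct a20pt *pt, enum io_kind kind,
                       uint16_t port, uint8_t data)
{
    bool wr64 = (kind == IO_WRITE && port == 0x64);
    bool wr60 = (kind == IO_WRITE && port == 0x60);

    switch (pt->state) {
    case PT_IDLE:
        if (wr64 && data == 0xD1)
            pt->state = PT_GOT_D1;      /* step 1: starts the sequence */
        break;
    case PT_GOT_D1:
        if (wr60)
            pt->state = PT_GOT_DATA;    /* step 2: data byte */
        /* repeated D1 writes stay here; other accesses ignored (assumption) */
        break;
    case PT_GOT_DATA:
        if (wr64 && data == 0xFF)
            pt->state = PT_IDLE;        /* step 4: ends the sequence */
        /* step 3 status reads and anything else: stay open (assumption) */
        break;
    }
}

/* While this is true, the excerpt says no I/O trap SMI is generated. */
static bool a20pt_open(const struct a20pt *pt)
{
    return pt->state != PT_IDLE;
}
```

Feeding it D1h to 64h and DFh to 60h leaves `a20pt_open()` true until FFh is written to 64h, which matches the excerpt's description of legacy USB input stalling.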
PharLap DOS extender.
I wonder what is the need to _disable_ A20, except on exit of the DOS extender
back to DOS :-)
If you have no DOS - then there is no need to ever disable A20. Enable it once
and forever.
It seems the sequence also has special meaning on AMD Geodes - where I'd
guess it can be used as an end-of-A20 indicator...:
"5.10.8 Theory - Force A20 Low Sequence
The FA20 sequence occurs frequently in DOS applications.
Mostly, the sequence is to set FA20 high; that is, do
not force address bit 20 to a 0. High is the default state of
this signal. To reduce the number of ASMIs caused by the
A20 sequence, KEL generates an ASMI only if the
GateA20 sequence would change the state of A20.
The A20 sequence is initiated with a write of D1h to I/O
Address 064h. On detecting this write, the KEL sets the
A20Sequence bit in HCE_Control (KEL Memory Offset
100h[5]). It captures the data byte in HCE_Input (KEL
Memory Offset 104h[7:0]), but does not set the InputFull bit
in HCE_Status (KEL Memory Offset 10Ch[1]). When
A20Sequence is set, a write of a value to I/O Address 060h
that has bit 1 set to a value different than A20State in
HCE_Control (KEL Memory Offset 100h[8]) causes Input-
Full to be set and causes an ASMI. An ASMI with both
InputFull and A20Sequence set indicates that the application
is trying to change the setting of FA20 on the keyboard
controller. However, when A20Sequence is set, and a write
of a value to I/O Address 060h that has bit 1 set to the
same value as A20State in HCE_Control is detected, then
no ASMI will occur.
As mentioned above, a write to I/O Address 064h of any
value other than D1h causes A20Sequence to be cleared.
If A20Sequence is active and a value of FFh is written to
I/O Address 064h, A20Sequence is cleared but InputFull is
not set. A write of any value other than D1h or FFh causes
InputFull to be set, which then causes an ASMI. A write of
FFh to I/O Address 064h when A20Sequence is not set
causes InputFull to be set. The current value of the
A20_Mask is maintained in two unconnected places. The
A20State bit in HCE_Control and bit 1 in Port A. The value
of A20State is only changed via a software write to
HCE_Control. It is set to 0 at reset. The value of bit 1 in
Port A changes on any write to Port A. From reset PortA[1]
is 1.
"
Apparently, a number of circuit designers believe that sequence is
standardized enough to use it for other purposes...
Rod Pemberton
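The Geode excerpt reads more easily as a state machine. Below is a sketch of the rules as quoted, with hypothetical names (`struct kel`, `kel_write_64`, `kel_write_60`). One point is an assumption rather than something the excerpt says: a write to 60h outside an A20 sequence is treated as an ordinary emulated write that sets InputFull and raises an ASMI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct kel {
    bool     a20_sequence;  /* HCE_Control A20Sequence */
    bool     a20_state;     /* HCE_Control A20State, 0 at reset, software-written only */
    bool     input_full;    /* HCE_Status InputFull */
    unsigned asmi_count;    /* number of ASMIs raised so far */
};

static void kel_reset(struct kel *k)
{
    memset(k, 0, sizeof *k);
}

static void kel_raise_asmi(struct kel *k)
{
    k->input_full = true;
    k->asmi_count++;
}

/* Write to I/O address 064h. */
static void kel_write_64(struct kel *k, uint8_t v)
{
    bool was_active = k->a20_sequence;

    if (v == 0xD1) {            /* starts the sequence, InputFull not set */
        k->a20_sequence = true;
        return;
    }
    k->a20_sequence = false;    /* any value other than D1h clears it */
    if (v == 0xFF && was_active)
        return;                 /* end of sequence: no InputFull, no ASMI */
    kel_raise_asmi(k);
}

/* Write to I/O address 060h. */
static void kel_write_60(struct kel *k, uint8_t v)
{
    bool wants_a20 = (v & 0x02) != 0;

    if (k->a20_sequence && wants_a20 == k->a20_state)
        return;                 /* A20 would not change: no ASMI */
    kel_raise_asmi(k);          /* A20 change, or ordinary write (assumption) */
}
```

With `a20_state` recorded as 1, the sequence D1h, DFh, FFh runs with no ASMI at all, which is the saving the excerpt is after; from reset (`a20_state` 0) the same DFh write raises one.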
This is very interesting; I have seen platforms on which legacy USB
stops working in boot loaders. Linux, obviously, doesn't care, but if
this turns out to be a suitable workaround it would be highly useful.
-hpa
Seems to have slipped my mind. Thanks for refreshing it.
-hpa
Okay, looked into it...
This is pretty much an argument if the command should be D1 DF or D1 FF.
It would be interesting to see if there are USB controllers which have
problems with the D1 DF form.
The UHCI spec does appear to demand the D1 FF form in order to use its
SMI mitigation mechanism, and the PDF document you seem to be referring
to (a VIA BIOS porting guide) is telling its BIOS developers to disable
exactly this SMI mitigation option in the UHCI controller, since it
doesn't work properly. However, this presumably means there are some
early BIOSes which have this mitigation mechanism enabled, and therefore
screw up if you need legacy USB support with A20 enabled.
This is not an issue for Linux, obviously, but for other users it might
be an issue.
-hpa
Don't think it has anything to do with that. You're probably either
flushing the pipeline or causing a trap to microcode.
-hpa
Scratch that again. It's actually an issue of a null command (FF) sent
separately from the D1 DF command.
What's worse is that the above sequence is flatly wrong. It doesn't
include any synchronization between the command and data write at all.
I hate USB. It's broken in so many ways.
Anyway, the null command looks like it shouldn't hurt, I'm going to test
it out on a machine which I have which does have broken legacy USB.
-hpa
Hm. I can't seem to replicate the issue on any of the machines I
*thought* had issues with this stuff. The only machine I know for sure
I have in my stash with this particular bug is missing a (nonstandard)
power supply. Bloody hell.
-hpa
Okay, I'm starting to guess here what's going on.
FF is a "pulse output" command that doesn't actually pulse anything, but
it probably takes about as long as any other pulse command; 6 µs
according to aeb's website, but that doesn't include the I/O delays
imposed by the KBC itself, which might very well help with
synchronization on machines with hideously slow KBCs like some old
Toshiba laptops.
Then some "clever" person working on UHCI decided that this was "the
standard A20 sequence." God, I hate USB. Everywhere you look there is
braindamage.
-hpa
Ah, yes, 6us - most of what I got says 6ms... PS/2 Tech ref says:
"... pulses... for approximately six microseconds."
Anyway, it's not always 6us for 0xFF. Take this RadiSys 82600 High
Integration Dual PCI System Controller:
"5.37.2 GATEA20 and RESET
Since the D1 command takes place in the same bus cycle it is issued, a
change in GATEA20 will happen immediately. An Fx command will take from 6.4
to 128 us to generate a reset depending on the clock speed. An FF command
will take place in the same bus cycle it is issued."
> which might very well help with
> synchronization on machines with hideously slow KBCs like some old
> Toshiba laptops.
>
> Then some "clever" person working on UHCI decided that this was "the
> standard A20 sequence." God, I hate USB. Everywhere you look there is
> braindamage.
What? Wait... Which USB doc standardizes it? UHCI?
Rod Pemberton
UHCI.
-hpa
FYI, one PC of mine fails to enable A20 with the UHCI sequence (pre-USB),
while another succeeds (UHCI)...
I'd guess that should be the first A20 method tried on a UHCI machine to
ensure the USB "A20 Gate Pass Through Sequence" has the best chance of
succeeding to prevent the cpu from entering SMM for A20 enable/disable.
You'd have to fallback to other A20 enables/disables, if it fails.
Regretting being "*the author* of that code", yet? ;-)
Rod Pemberton
Preventing the CPU from entering SMM is generally not a goal (it speeds
things up slightly, but at the risk of causing other problems) and IMO
it was a major mistake on the part of the UHCI designers to try to make
that happen, especially since they did so using incorrect criteria.
The UHCI sequence is almost certainly wrong - it doesn't specify
synchronization between the port 64 and port 60 write. However, it
might still mean that if you care about legacy USB after A20 enabling,
issuing the null command might still help.
Note, again, that Linux doesn't care: it never uses legacy USB after
entering protected mode.
> Regretting being "*the author* of that code", yet? ;-)
Not at all. I want to explore this issue and have put out a few feelers
for people with machines that exhibit USB keyboard lockups. In
particular, there is a set of Dell machines with a particular BIOS
revision that is known problematic that this might help with... or it
might not, but it's the best lead I've seen so far.
Debugging by Internet rumour...
-hpa
Well, I for one am not familiar with what, if anything, is done in SMM. If
anything has been standardized, I missed it. So, perhaps my perspective was
a bit simplistic compared to yours:
1) One doesn't know if SMM will do something to an A20 enable/disable
sequence, what SMM will do if it does something, or if SMM will do it
correctly some or all of the time.
2) A20 enable/disable by the keyboard controller is very reliable and must
work properly.
3) "A20GATE Pass Through Sequence," if enabled, allows one to choose a
trusted method (keyboard controller) over some unknown method (SMM) - even
if the sequence only works for certain machines, e.g. USB UHCI.
I.e., I'd prefer the keyboard controller over other options
because I trust it.
> it was a major mistake on the part of the UHCI designers to try to make
> that happen, especially since they did so using incorrect criteria.
I don't know what their criteria were. I'm assuming they realized that SMM
implementing an A20 enable/disable:
1) could destroy code compatibility
2) was unnecessary for most code, OSes, and BIOSes, so they allowed a
pass-through
> that Linux doesn't care: never uses legacy USB
> after entering protected mode.
Was/is the SMM trapping of I/O ports only for real modes?...
Was/is the "A20GATE Pass Through Sequence" only for real modes?...
Rod Pemberton
This would have been true had the "A20GATE Pass Through Sequence"
actually been correct for all KBCs, which it isn't.
>> it was a major mistake on the part of the UHCI designers to try to make
>> that happen, especially since they did so using incorrect criteria.
>
> I don't know what their criteria were. I'm assuming they realized that SMM
> implementing an A20 enable/disable:
> 1) could destroy code compatibility
> 2) was unnecessary for most code, OSes, and BIOSes, so they allowed a
> pass-through
More likely they were worried about performance for DOS programs. It
would have been better if they had told the BIOS vendors to implement
INT 15h, AX=2401h. It would have been even better if they had gotten
their CPU vendors to add an A20 enable override fully internal to the CPU.
>> that Linux doesn't care: never uses legacy USB
>> after entering protected mode.
>
> Was/is the SMM trapping of I/O ports only for real modes?...
No.
> Was/is the "A20GATE Pass Through Sequence" only for real modes?...
You're not supposed to leave A20 disabled in protected mode;
*definitely* not with paging on.
-hpa
H. Peter Anvin wrote:
> Thinking about it some more, you probably do need to execute a
> serializing instruction of some sort (I use an I/O port reference since
> it also provides a modicum of timing independence, at least as far as
> the overall system is concerned) in order for the test to be valid.
I/Os are *not* architecturally defined as serializing instructions.
Volume 3 has the short list of what is really a serializing instruction.
In non-privileged mode, you are pretty much left with CPUID.
I/Os are almost always *effectively* serializing, just because they take
so long (they really are slow, or the chipset combination forces them to
be slow to have legacy behavior). I/Os do also have defined ordering
effects on some blocks (store buffer, etc), but are not truly
"serializing instructions".
This is true, of course, but I/O instructions, as well as uncached
memory references, are fully serializing from a memory-I/O-bus
consistency point of view. I was being sloppy with what exact meaning
of "serializing" I was using; they are, indeed, not serializing in the
x86 architectural sense.
-hpa
Do you mean they are serializing purely because they are slow and thus
allow any write buffers to drain to memory? If so I'm a little
uncomfortable with this. What's to stop a PC being manufactured which
has faster I/O? Some of the PCI ports can be memory speed, can they
not?
Or am I mixing up two concepts...?
Sandpile.org has a nice list of serializing instructions and some info on
their interactions/complications/usage:
http://www.sandpile.org/ia32/coherent.htm
> I/Os are almost always *effectively* serializing, just because they take
> so long (they really are slow, or the chipset combination forces them to
> be slow to have legacy behavior). I/Os do also have defined ordering
> effects on some blocks (store buffer, etc), but are not truly
> "serializing instructions".
Sandpile.org calls these "store buffer draining" on the same page.
Rod Pemberton
PS I dropped alt.comp.lang.assembler. Reply to three only... Tack it back
on if you reply.
An uncached access (which includes port I/O) will drain all buffers and
write combiners ahead of it, in order to produce an in-order sequence of
events visible to external devices. They are serializing against main
memory as seen by other CPUs and the DMA transactor, but do not flush
caches.
-hpa
After quite a few false starts, I have successfully tracked down one
system (a HP DL360 G5) which does, indeed, require the FF to port 64 in
order for legacy USB to not lock up. Again, it's not an issue for Linux
per se, since it will not be using legacy USB, but for other users of
this A20-toggling algorithm, it should be added.
http://git.etherboot.org/?p=wraplinux.git;a=commitdiff;h=db2a9ea510c3561a937574338740edffc118bc0d
-hpa
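Pulling the thread together, a sketch of a keyboard-controller A20 enable with synchronization between every write and the trailing null command (FFh to port 64h). This is not the code from the commit linked above; the `port_io` indirection, the retry bound, and the `mock_kbc` stand-in are all hypothetical, added so the ordering can be exercised without touching real ports. The status bits assumed are the usual 8042 ones: bit 0 output buffer full, bit 1 input buffer full.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define KBC_DATA     0x60u
#define KBC_CMD      0x64u   /* command on write, status on read */
#define KBC_STAT_OBF 0x01u   /* output buffer full: data waiting for us */
#define KBC_STAT_IBF 0x02u   /* input buffer full: KBC still busy */

/* Hypothetical port-access indirection so the logic can run against a mock. */
struct port_io {
    uint8_t (*inb)(void *ctx, uint16_t port);
    void    (*outb)(void *ctx, uint16_t port, uint8_t val);
    void     *ctx;
};

/* Flush pending output and wait until the KBC can accept another byte. */
static bool kbc_wait_empty(const struct port_io *io)
{
    for (int tries = 0; tries < 100000; tries++) {
        uint8_t st = io->inb(io->ctx, KBC_CMD);
        if (st & KBC_STAT_OBF) {
            (void)io->inb(io->ctx, KBC_DATA);  /* discard stale data */
            continue;
        }
        if (!(st & KBC_STAT_IBF))
            return true;
    }
    return false;
}

/* D1h (write output port), DFh (A20 on), then the FFh null command. */
static bool kbc_enable_a20(const struct port_io *io)
{
    if (!kbc_wait_empty(io)) return false;
    io->outb(io->ctx, KBC_CMD, 0xD1);
    if (!kbc_wait_empty(io)) return false;
    io->outb(io->ctx, KBC_DATA, 0xDF);
    if (!kbc_wait_empty(io)) return false;
    io->outb(io->ctx, KBC_CMD, 0xFF);
    return kbc_wait_empty(io);
}

/* Mock KBC: records writes and reports "busy" for two polls after each. */
struct mock_kbc {
    uint16_t port[8];
    uint8_t  val[8];
    int      n;
    int      busy;
};

static uint8_t mock_inb(void *ctx, uint16_t port)
{
    struct mock_kbc *m = ctx;
    if (port == KBC_CMD && m->busy > 0) {
        m->busy--;
        return KBC_STAT_IBF;
    }
    return 0;
}

static void mock_outb(void *ctx, uint16_t port, uint8_t val)
{
    struct mock_kbc *m = ctx;
    if (m->n < 8) {
        m->port[m->n] = port;
        m->val[m->n] = val;
        m->n++;
    }
    m->busy = 2;
}
```

Whether A20 actually came on still has to be confirmed with a memory test afterwards; the sequence only tells you the controller accepted the bytes.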