
[PATCH v5] x86: Enable fast strings on Intel if BIOS hasn't already


Andy Lutomirski

May 1, 2013, 1:50:01 AM
From: Andrew Lutomirski <lu...@mit.edu>

Intel SDM volume 3A, 8.4.2 says:

Software can disable fast-string operation by clearing the
fast-string-enable bit (bit 0) of IA32_MISC_ENABLE MSR.
However, Intel recommends that system software always enable
fast-string operation.

The Intel DQ67SW board (with latest BIOS) disables fast string
operations if TXT is enabled. A Lenovo X220 disables it regardless
of TXT setting. I doubt I'm the only person with a dumb BIOS like
this.

Signed-off-by: Andy Lutomirski <lu...@amacapital.net>
---

v4 was almost two years ago, but I just noticed that this is still a problem.
This is tested on v3.9.

https://patchwork.kernel.org/patch/1073972/

This is identical to v4 of this patch except that it uses wrmsrl_safe instead
of wrmsr_safe.

arch/x86/kernel/cpu/intel.c | 27 ++++++++++++++++++++++-----
1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 1905ce9..a4a3ef2 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -29,6 +29,7 @@
 static void __cpuinit early_init_intel(struct cpuinfo_x86 *c)
 {
 	u64 misc_enable;
+	bool allow_fast_string = true;

 	/* Unmask CPUID levels if masked: */
 	if (c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xd)) {
@@ -119,10 +120,11 @@ static void __cpuinit early_init_intel(struct cpuinfo_x86 *c)
 	 * (model 2) with the same problem.
 	 */
 	if (c->x86 == 15) {
-		rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable);
+		allow_fast_string = false;

+		rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable);
 		if (misc_enable & MSR_IA32_MISC_ENABLE_FAST_STRING) {
-			printk(KERN_INFO "kmemcheck: Disabling fast string operations\n");
+			printk_once(KERN_INFO "kmemcheck: Disabling fast string operations\n");

 			misc_enable &= ~MSR_IA32_MISC_ENABLE_FAST_STRING;
 			wrmsrl(MSR_IA32_MISC_ENABLE, misc_enable);
@@ -131,13 +133,28 @@ static void __cpuinit early_init_intel(struct cpuinfo_x86 *c)
 #endif

 	/*
-	 * If fast string is not enabled in IA32_MISC_ENABLE for any reason,
-	 * clear the fast string and enhanced fast string CPU capabilities.
+	 * If BIOS didn't enable fast string operation, try to enable
+	 * it ourselves. If that fails, then clear the fast string
+	 * and enhanced fast string CPU capabilities.
 	 */
 	if (c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xd)) {
 		rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable);
+
+		if (allow_fast_string &&
+		    !(misc_enable & MSR_IA32_MISC_ENABLE_FAST_STRING)) {
+			misc_enable |= MSR_IA32_MISC_ENABLE_FAST_STRING;
+			wrmsrl_safe(MSR_IA32_MISC_ENABLE, misc_enable);
+
+			/* Re-read to make sure it stuck. */
+			rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable);
+
+			if (misc_enable & MSR_IA32_MISC_ENABLE_FAST_STRING)
+				printk_once(KERN_INFO FW_WARN "IA32_MISC_ENABLE.FAST_STRING_ENABLE was not set\n");
+		}
+
 		if (!(misc_enable & MSR_IA32_MISC_ENABLE_FAST_STRING)) {
-			printk(KERN_INFO "Disabled fast string operations\n");
+			if (allow_fast_string)
+				printk_once(KERN_INFO "Failed to enable fast string operations\n");
 			setup_clear_cpu_cap(X86_FEATURE_REP_GOOD);
 			setup_clear_cpu_cap(X86_FEATURE_ERMS);
 		}
--
1.8.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
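
The control flow of the patch above (read the MSR, try to set the bit, re-read to see whether the write stuck) can be modelled in user space roughly as follows. This is only a sketch under stated assumptions: `struct fake_msr`, `fake_wrmsr()`, `try_enable_fast_strings()` and the `enum outcome` values are hypothetical names invented for illustration, and the `writable` flag stands in for firmware or hardware that ignores the write. It is not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/* Bit 0 of IA32_MISC_ENABLE is the fast-string enable bit (per the SDM quote). */
#define FAST_STRING_ENABLE (UINT64_C(1) << 0)

/* Hypothetical outcomes, mirroring the three cases the patch distinguishes. */
enum outcome {
	ALREADY_ENABLED,   /* BIOS left the bit set: stay silent */
	ENABLED_BY_KERNEL, /* bit was clear, our write stuck: log it */
	LEFT_DISABLED      /* bit still clear: caller clears REP_GOOD/ERMS */
};

/* Hypothetical stand-in for the MSR; writes are dropped if !writable. */
struct fake_msr {
	uint64_t value;
	bool writable;
};

static void fake_wrmsr(struct fake_msr *msr, uint64_t value)
{
	if (msr->writable)
		msr->value = value;
}

static enum outcome try_enable_fast_strings(struct fake_msr *msr,
					    bool allow_fast_string)
{
	assert(msr != NULL);

	if (msr->value & FAST_STRING_ENABLE)
		return ALREADY_ENABLED;

	if (allow_fast_string) {
		fake_wrmsr(msr, msr->value | FAST_STRING_ENABLE);

		/* Re-read to make sure it stuck. */
		if (msr->value & FAST_STRING_ENABLE)
			return ENABLED_BY_KERNEL;
	}

	return LEFT_DISABLED;
}
```

Passing `allow_fast_string = false` corresponds to the family 15 path in the patch, where fast strings are deliberately turned off and must not be turned back on.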

Ingo Molnar

May 1, 2013, 7:20:02 AM

* Andy Lutomirski <lu...@amacapital.net> wrote:

> From: Andrew Lutomirski <lu...@mit.edu>
>
> Intel SDM volume 3A, 8.4.2 says:
>
> Software can disable fast-string operation by clearing the
> fast-string-enable bit (bit 0) of IA32_MISC_ENABLE MSR.
> However, Intel recommends that system software always enable
> fast-string operation.
>
> The Intel DQ67SW board (with latest BIOS) disables fast string
> operations if TXT is enabled. A Lenovo X220 disables it regardless
> of TXT setting. I doubt I'm the only person with a dumb BIOS like
> this.

Hm, I think we could try this.

Do we know whether Windows enables it? Most laptop vendors will
test/certify on Windows, so that's the 'expected' environment.
I think we should also printk if we enabled it against the BIOS setting -
so that if the user sees any problems it can possibly be tracked back to
this change ...

I.e. stay silent if the BIOS has it enabled already - but otherwise
document our action.

Thanks,

Ingo

Borislav Petkov

May 1, 2013, 7:40:02 AM
On Tue, Apr 30, 2013 at 10:46:00PM -0700, Andy Lutomirski wrote:
> From: Andrew Lutomirski <lu...@mit.edu>
>
> Intel SDM volume 3A, 8.4.2 says:
>
> Software can disable fast-string operation by clearing the
> fast-string-enable bit (bit 0) of IA32_MISC_ENABLE MSR.
> However, Intel recommends that system software always enable
> fast-string operation.
>
> The Intel DQ67SW board (with latest BIOS) disables fast string
> operations if TXT is enabled. A Lenovo X220 disables it regardless
> of TXT setting. I doubt I'm the only person with a dumb BIOS like
> this.

Hmm, interesting. So I have an x230 and it is enabled here:

# rdmsr -x 0x000001a0
850089
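
A value like this can be checked mechanically: MSR 0x1a0 is IA32_MISC_ENABLE, and the fast-string enable bit is bit 0, so an odd value means fast strings are on. A minimal user-space sketch, assuming the bare hex output format shown above; `parse_rdmsr_hex()` and `fast_strings_enabled()` are illustrative helper names, not part of msr-tools or the kernel:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Bit 0 of IA32_MISC_ENABLE (MSR 0x1a0) is the fast-string enable bit. */
#define FAST_STRING_ENABLE (UINT64_C(1) << 0)

/* Hypothetical helper: parse the bare hex string printed by `rdmsr -x`. */
static uint64_t parse_rdmsr_hex(const char *text)
{
	assert(text != NULL);
	return strtoull(text, NULL, 16);
}

/* True when the fast-string enable bit is set in the given MSR value. */
static bool fast_strings_enabled(uint64_t misc_enable)
{
	return (misc_enable & FAST_STRING_ENABLE) != 0;
}
```

For the reading above, `fast_strings_enabled(parse_rdmsr_hex("850089"))` is true because 0x850089 has bit 0 set.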

It could be some fast strings erratum like AAJ6 or BD3 (they have
different names for what apparently is the same erratum in different
docs). Simply search for "intel fast strings erratum" and sample the
first couple of pdfs to get an idea.

If this erratum is actually the case here, it has no fix according to
the docs (same core in different packages :)) and it looks like OEM
vendors want to be on the safe side by disabling fast strings. So, in
this case, if you force-enable it, you could risk forcing the erratum if
the conditions apply (crossing page boundary with inconsistent memory
types).

You could check whether the CPU revisions you have are affected by the
erratum.
Nit:

Why this printk here? You say already below that we've failed enabling
fast strings.

> + }
> +
> if (!(misc_enable & MSR_IA32_MISC_ENABLE_FAST_STRING)) {
> - printk(KERN_INFO "Disabled fast string operations\n");
> + if (allow_fast_string)
> + printk_once(KERN_INFO "Failed to enable fast string operations\n");
> setup_clear_cpu_cap(X86_FEATURE_REP_GOOD);
> setup_clear_cpu_cap(X86_FEATURE_ERMS);
> }
> --

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

Theodore Ts'o

May 1, 2013, 12:40:01 PM
On Wed, May 01, 2013 at 01:33:52PM +0200, Borislav Petkov wrote:
> It could be some fast strings erratum like AAJ6 or BD3 (they have
> different names for what apparently is the same erratum in different
> docs). Simply search for "intel fast strings erratum" and sample the
> first couple of pdfs to get an idea.

This erratum does seem pretty scary:

Problem: Under certain conditions as described in the Software
Developers Manual section "Out-of-Order Stores For String
Operations in Pentium 4, Intel Xeon, and P6 Family
Processors" the processor performs REP MOVS or REP STOS as
fast strings. Due to this erratum fast string REP MOVS/REP
STOS instructions that cross page boundaries from WB/WC
memory types to UC/WP/WT memory types, may start using an
incorrect data size or may observe memory ordering
violations.

Implication: Upon crossing the page boundary the following may occur,
dependent on the new page memory type:

* UC the data size of each write will now always be 8
bytes, as opposed to the original data size.
* WP the data size of each write will now always be 8
bytes, as opposed to the original data size and
there may be a memory ordering violation.
* WT there may be a memory ordering violation.

In fact, there is the question of whether we should be checking to see
if the CPU stepping is one of the ones with the bug, and if so, to
have Linux disable fast strings even if the BIOS didn't, instead of
blindly enabling fast strings....

- Ted
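
The erratum quoted above only applies when a single REP MOVS/REP STOS run crosses a page boundary (and the two pages have different memory types). The first half of that condition is easy to test for a given buffer. A minimal sketch, assuming 4 KiB pages and that `addr + len` does not wrap around the address space; `PAGE_BYTES` and `crosses_page_boundary()` are illustrative names, and the memory types of the two pages are not examined here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define PAGE_BYTES ((uintptr_t)4096)

/*
 * True when the byte range [addr, addr + len) touches more than one page.
 * Assumes len > 0 and that addr + len does not overflow.
 */
static bool crosses_page_boundary(uintptr_t addr, size_t len)
{
	assert(len > 0);

	uintptr_t first_page = addr / PAGE_BYTES;
	uintptr_t last_page = (addr + len - 1) / PAGE_BYTES;

	return first_page != last_page;
}
```

A copy that stays within one page returns false and cannot hit the erratum, whatever the page's memory type.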

Andy Lutomirski

May 1, 2013, 1:00:02 PM
On Wed, May 1, 2013 at 9:42 AM, H. Peter Anvin <h...@zytor.com> wrote:
> On 05/01/2013 09:34 AM, Theodore Ts'o wrote:
>>
>> In fact, there is the question of whether we should be checking to see
>> if the CPU stepping is one of the ones with the bug, and if so, to
>> have Linux disable fast strings even if the BIOS didn't, instead of
>> blindly enabling fast strings....
>>
>
> The erratum reads seriously, but it only affects crossings between pages
> of different page types, which is rare in itself. WT and WP are not
> even used in Linux; the UC case we end up doing 8-byte stores instead of
> the proper size, which is wrong, but for the case where the user is
> malicious the user could just do that directly, and it seems extremely
> hard to envision a scenario where someone would do that intentionally.

(Just my luck. I'm currently trying to implement WT via PAT by
stealing a slot from either UC or UC-.)

There's already a warning in the Intel system programming manual:

11.5.2.3 Writing Values Across Pages with Different Memory Types
If two adjoining pages in memory have different memory types, and a
word or longer operand is written to a memory location that crosses the
page boundary between those two pages, the operand might be written to
memory twice. This action does not present a problem for writes to
actual memory; however, if a device is mapped to the memory space
assigned to the pages, the device might malfunction.

Is there any code that memcpys across memory types and expects any
particularly sensible behavior out of it?


I'll try to see what Windows is doing. From my cursory reading of the
errata documents, this affects basically all CPUs -- it doesn't seem
to have been fixed in any revision of anything. So this erratum
doesn't seem to explain why different BIOSes would do different
things.


--Andy

P.S. The printk is in the right place in the patch, but the text is
misleading. I'll fix it if there's a v6.

>
> -hpa
>



--
Andy Lutomirski
AMA Capital Management, LLC

Andi Kleen

May 1, 2013, 1:00:02 PM
Theodore Ts'o <ty...@mit.edu> writes:
>
> In fact, there is the question of whether we should be checking to see
> if the CPU stepping is one of the ones with the bug, and if so, to
> have Linux disable fast strings even if the BIOS didn't, instead of
> blindly enabling fast strings....

Crossing pages with different memory attributes shouldn't happen
in normal operation. I wouldn't be too concerned about that one.

I would suggest to leave the bit alone. There may be valid
reasons on the system to either set or not set it, which the kernel
doesn't necessarily know.

-Andi

--
a...@linux.intel.com -- Speaking for myself only

H. Peter Anvin

May 1, 2013, 1:30:02 PM
On 05/01/2013 09:34 AM, Theodore Ts'o wrote:
>
> In fact, there is the question of whether we should be checking to see
> if the CPU stepping is one of the ones with the bug, and if so, to
> have Linux disable fast strings even if the BIOS didn't, instead of
> blindly enabling fast strings....
>

The erratum reads seriously, but it only affects crossings between pages
of different page types, which is rare in itself. WT and WP are not
even used in Linux; the UC case we end up doing 8-byte stores instead of
the proper size, which is wrong, but for the case where the user is
malicious the user could just do that directly, and it seems extremely
hard to envision a scenario where someone would do that intentionally.

-hpa

Theodore Ts'o

May 1, 2013, 1:30:01 PM
On Wed, May 01, 2013 at 09:42:30AM -0700, H. Peter Anvin wrote:
> The erratum reads seriously, but it only affects crossings between pages
> of different page types, which is rare in itself. WT and WP are not
> even used in Linux; the UC case we end up doing 8-byte stores instead of
> the proper size, which is wrong, but for the case where the user is
> malicious the user could just do that directly, and it seems extremely
> hard to envision a scenario where someone would do that intentionally.

Yeah, I wasn't so much worried about a malicious user as much as a
situation where you're trying to debug a mysterious and
hard-to-reproduce failure, start tearing your hair out, and wondering
whether you're going insane or the compiler hates you and is out to
get you and you start staring at assembly code to try to figure out
how some piece of memory got mysteriously corrupted....

- Ted

H. Peter Anvin

May 1, 2013, 1:40:01 PM
On 05/01/2013 10:20 AM, Theodore Ts'o wrote:
> On Wed, May 01, 2013 at 09:42:30AM -0700, H. Peter Anvin wrote:
>> The erratum reads seriously, but it only affects crossings between pages
>> of different page types, which is rare in itself. WT and WP are not
>> even used in Linux; the UC case we end up doing 8-byte stores instead of
>> the proper size, which is wrong, but for the case where the user is
>> malicious the user could just do that directly, and it seems extremely
>> hard to envision a scenario where someone would do that intentionally.
>
> Yeah, I wasn't so much worried about a malicious user as much as a
> situation where you're trying to debug a mysterious and
> hard-to-reproduce failure, start tearing your hair out, and wondering
> whether you're going insane or the compiler hates you and is out to
> get you and you start staring at assembly code to try to figure out
> how some piece of memory got mysteriously corrupted....
>

If you are crossing pages with different memory types, the fact that the
sizes being written are wrong is probably the least of your problems.

-hpa

H. Peter Anvin

May 1, 2013, 1:40:02 PM
On 05/01/2013 09:54 AM, Andy Lutomirski wrote:
>
> (Just my luck. I'm currently trying to implement WT via PAT by
> stealing a slot from either UC or UC-.)
>

NAK on that. Use a slot in the upper half, perhaps (we already
blacklist the CPUs for which the upper half aren't usable.)

What do you want WT for, anyway?

> Is there any code that memcpys across memory types and expects any
> particularly sensible behavior out of it?

Unlikely (see my post.)

-hpa

H. Peter Anvin

May 1, 2013, 2:00:02 PM
On 05/01/2013 10:50 AM, Andy Lutomirski wrote:
>
> Isn't the upper half incompatible with large pages?
>

No, just with attributes *on the page tables themselves*.

Andy Lutomirski

May 1, 2013, 2:00:02 PM
On Wed, May 1, 2013 at 10:35 AM, H. Peter Anvin <h...@zytor.com> wrote:
> On 05/01/2013 09:54 AM, Andy Lutomirski wrote:
>>
>> (Just my luck. I'm currently trying to implement WT via PAT by
>> stealing a slot from either UC or UC-.)
>>
>
> NAK on that. Use a slot in the upper half, perhaps (we already
> blacklist the CPUs for which the upper half aren't usable.)

Isn't the upper half incompatible with large pages?

Why the NAK? Unless I've misread the spec, UC and UC- are only
different if there's a WC MTRR set, and I haven't found anything in
the kernel yet that adds a WC MTRR that actually needs that MTRR if
PAT is enabled. I've made my way about half-way through the mtrr_add
calls so far. (The drivers that use MTRRs are graphics devices, ivtv,
fusion MPT, myri10ge, and infiniband.)
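
The UC versus UC- distinction described above can be written down as a small lookup. This is a sketch of that one rule only (a page-level UC always stays UC, while UC- yields WC when the covering MTRR is WC and UC otherwise); the enum and `effective_uncached_type()` are illustrative names, and the other PAT/MTRR combinations are deliberately out of scope:

```c
#include <assert.h>

/* Illustrative memory-type labels; the numeric values carry no meaning. */
enum mem_type { MT_UC, MT_UC_MINUS, MT_WC, MT_WT, MT_WP, MT_WB };

/*
 * Effective memory type for a page whose PAT entry selects UC or UC-,
 * given the type of the MTRR covering it. UC- is a PAT-only type, so
 * it is not a valid MTRR input.
 */
static enum mem_type effective_uncached_type(enum mem_type pat,
					     enum mem_type mtrr)
{
	assert(pat == MT_UC || pat == MT_UC_MINUS);
	assert(mtrr != MT_UC_MINUS);

	/* The only case where UC- differs from UC: a WC MTRR wins. */
	if (pat == MT_UC_MINUS && mtrr == MT_WC)
		return MT_WC;

	return MT_UC;
}
```

If no remaining caller relies on the WC MTRR case, the two PAT slots behave identically, which is the premise for reusing one of them.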

>
> What do you want WT for, anyway?

Generically, memory regions in which writes have side effects but
reads are just reads and should be cached.

In particular, persistent (i.e. nonvolatile) memory. There's an NDA
involved, but I can safely say (at least): there seem to be nifty
devices that aren't quite RAM that are nonetheless presented to the
system as RAM. Writes are durable, but only if they make it out of
cache before power fails or the CPU resets in such a way that caches
are invalidated but not written back. UC and WC are a bit
heavy-handed because read caching is fine. (PowerPC has nice
instructions for things like "write this back now", but x86 seems to
be missing any way other than WT to force data out to RAM without
invalidating the cache line.)

Making this work with a WT MTRR is probably doable, but it's IMO
rather ugly. Even if I go that route, I'd still want to convince
graphics drivers to stop wasting MTRRs, since they don't need them and
they tend to be in short supply.

Here's an example:

http://www.tomshardware.com/news/Viking-ArxCis-NV-NVDIMM-RAM,21892.html


--Andy

Andy Lutomirski

May 1, 2013, 2:20:02 PM
On Wed, May 1, 2013 at 10:54 AM, H. Peter Anvin <h...@zytor.com> wrote:
> On 05/01/2013 10:50 AM, Andy Lutomirski wrote:
>>
>> Isn't the upper half incompatible with large pages?
>>
>
> No, just with attributes *on the page tables themselves*.

Thanks :) Now I found the somewhat alarming algorithm in section 4.9.2.

This will be a bit unpleasant, though, since the _PAGE_CACHE_xxx
macros will become rather confused. I suppose there's no fundamental
reason that pgprot_t has to correspond to pmd bit positions. Sigh.
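
The source of the confusion is that the PAT index is built from three bits (index = 4*PAT + 2*PCD + PWT), and the PAT bit sits at bit 7 in a 4 KiB PTE but at bit 12 in a large-page entry, where bit 7 is the page-size flag instead. A minimal sketch of that selection; the macro and function names below are illustrative and are not the kernel's `_PAGE_*` definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Paging-entry bit positions (illustrative macro names). */
#define ENTRY_PWT       (UINT64_C(1) << 3)
#define ENTRY_PCD       (UINT64_C(1) << 4)
#define ENTRY_PAT_4K    (UINT64_C(1) << 7)  /* PAT bit in a 4 KiB PTE */
#define ENTRY_PAT_LARGE (UINT64_C(1) << 12) /* PAT bit in a large-page entry */

/*
 * Return the PAT slot (0..7) selected by a paging entry that maps a page.
 * Slots 4..7, the "upper half", are reachable only when the PAT bit is set.
 */
static unsigned int pat_index(uint64_t entry, bool large_page)
{
	uint64_t pat_bit = large_page ? ENTRY_PAT_LARGE : ENTRY_PAT_4K;
	unsigned int index = 0;

	if (entry & ENTRY_PWT)
		index |= 1;
	if (entry & ENTRY_PCD)
		index |= 2;
	if (entry & pat_bit)
		index |= 4;

	return index;
}
```

The same flag value therefore selects different slots depending on the entry level, which is why a single set of cache-attribute bit masks cannot be applied unchanged to both ptes and pmds.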

--Andy

>
> -hpa
>



--
Andy Lutomirski
AMA Capital Management, LLC

Ingo Molnar

May 10, 2013, 4:30:02 AM

* Andy Lutomirski <lu...@amacapital.net> wrote:

> > What do you want WT for, anyway?
>
> Generically, memory regions in which writes have side effects but reads
> are just reads and should be cached.
>
> In particular, persistent (i.e. nonvolatile) memory. There's an NDA
> involved, but I can safely say (at least): there seem to be nifty
> devices that aren't quite RAM that are nonetheless presented to the
> system as RAM. Writes are durable, but only if they make it out of cache
> before power fails or the CPU resets in such a way that caches are
> invalidated but not written back. UC and WC are a bit heavy-handed
> because read caching is fine. (PowerPC has nice instructions for things
> like "write this back now", but x86 seems to be missing any way other
> than WT to force data out to RAM without invalidating the cache line.)
>
> Making this work with a WT MTRR is probably doable, but it's IMO rather
> ugly. Even if I go that route, I'd still want to convince graphics
> drivers to stop wasting MTRRs, since they don't need them and they tend
> to be in short supply.
>
> Here's an example:
>
> http://www.tomshardware.com/news/Viking-ArxCis-NV-NVDIMM-RAM,21892.html

This looks potentially useful.

I'd consider your cache-attributes review and cleanups to drivers and
infrastructure to be the main upstream benefit we win from your effort.

So as long as your patches go in that general direction, and the PAT code
and its usage gets cleaner and more organized, and there's no showstopper
issue discovered, the fact that you gain ioremap_wt() for your driver is
mostly just a happy coincidence that we don't mind that much.

Maybe in the end we'd have to hide it behind some sort of
CONFIG_COMPAT_PAT trigger and turn it off on old/buggy systems - but in
the first approximation it would be nice to try and make this just a
single variant with no Kconfig complexity?

Thanks,

Ingo