Hyperthreading -- Scott?

Jack Troughton

Nov 17, 2002, 8:52:56 PM
Hi Scott!

I was just reading some stuff over on Intel's site about the new
hyperthreading technology in their new P4 processors. I was wondering if
you had any idea if/when we'd see this stuff in our kernels or not?

Regards,

Jack

eric w

Nov 17, 2002, 9:32:10 PM

would not the SMP-aware kernel automagically do it???

eric...

Jack Troughton

Nov 17, 2002, 10:13:01 PM
eric w wrote:

> On Mon, 18 Nov 2002 01:52:56 UTC, Jack Troughton

> would not the SMP-aware kernel automagically do it???

That has occurred to me... but I figured I'd go to the mountain and find
out for sure:)

Klaus Staedtler

Nov 18, 2002, 7:17:16 AM
eric w wrote:
> would not the SMP-aware kernel automagically do it???

No, these aren't physically available processors; they are 'only' logical.
Currently only WinXP and Linux with kernel > 2.4.18 can use Hyperthreading.

Btw. the abbreviation is SMT (simultaneous multithreading) and not SMP.

Klaus Staedtler

Jack Troughton

Nov 18, 2002, 9:58:25 AM
Klaus Staedtler wrote:

Well, I read some of the docs at developer.intel.com, and it looks to me
like all it does is make the system LOOK like an SMP system. There are
certain optimisations that need to be made to get the most out of
hyperthreading, but even without them it's possible to get benefits.

OTOH, I also know that I know nothing (little Socrates, there;)... I'm
sure that there are some other people here that could shed a lot more
light on the subject.

Regards,

Jack

Mike Ruskai

Nov 18, 2002, 11:51:27 AM

There's only one FPU (which includes MMX and the like).

A normal SMP OS trying to save and restore processor contexts would
probably make a big mess of things.

The OS has to be smart enough to not schedule two threads using floating
point instructions at the same time.

In general, it probably also has to be dumb enough to make sure two
threads from the same process are running at the same time, to avoid
thrashing the cache to pieces.

It'll be interesting to see real-world benchmarks of how it actually
performs. I predict a very limited scope of better performance.


--
- Mike

Remove 'spambegone.net' and reverse to send e-mail.


Chris Stumpf

Nov 18, 2002, 11:57:09 AM
On Sun, 17 Nov 2002 20:52:56 -0500, Jack Troughton wrote:

:>Hi Scott!

:>
Jack, you must have missed Kim's announcement that the SMP support in eCS
works just dandy with HyperThreading. This according to someone at Intel
that tested it.

Chris Stumpf
C.S.E. Computer Services
Computer Consultant (OS/2, Lan, Wan, CTI)
Serenity Systems Channel Partner
IBM Certified Systems Expert - OS/2 Warp 4

email: cst...@monmouth.com
phone: (732)496-4699

Stephen Eickhoff

Nov 18, 2002, 12:16:54 PM

"Chris Stumpf" <cst...@monmouth.com> wrote in message
news:pfghzcszbazbhgupb...@news.consultron.ca...

> Jack, you must have missed Kim's announcement that the SMP support in eCS
> works just dandy with HyperThreading. This according to someone at Intel
> that tested it.

As I expected. I had heard of issues with Windows 2000, but the blame lies
solely with Microsoft, as their processor tax, er, licensing, would consider
a single hyperthreaded P4 to be two processors. Therefore, in a dual CPU
workstation with two P4 processors, Windows 2000 Pro would think it had four
installed and not enjoy any benefits.


dinkmeister

Nov 18, 2002, 12:39:41 PM
lol os/2 has the same processor tax.. unless you hack the smp kernel/os2apic.*
out of a publicly available fixpak and get updates from testcase
(which is completely doable :)


regards,
- dink

On Mon, 18 Nov 2002 17:16:54 GMT, Stephen Eickhoff wrote:

:
:"Chris Stumpf" <cst...@monmouth.com> wrote in message

:
:

Scott E. Garfinkle

Nov 18, 2002, 1:08:19 PM
Hyperthreading:
1. If it really currently works on OS/2 (including eCS), I'd be deeply, deeply shocked.
2. It's not a kernel change that's required, it's an OS2APIC change.
3. It's on my list of things to do in my spare time, but don't hold your breath.
4. Personally, I find it hard to believe that using an SMP kernel (including Windoze) and recognizing two
logical CPUs is any faster than a UNI kernel with one physical one. If there's testing to the contrary
on the Intel site, I've missed it.
5. The discussion about FPUs is bogus. Ignore it. This stuff is pretty well understood and handled already
in SMP systems. Whether or not the other FPU is virtual is immaterial.


eric w

Nov 18, 2002, 3:07:49 PM

well in the mainframe world, when you run VM, it provides virtual machines
to any other OS (the OS has NO way of knowing if it is running on a
physical or logical processor.)

if intel did it right the same would be TRUE!

eric...

Jack Troughton

Nov 18, 2002, 3:06:19 PM
On Mon, 18 Nov 2002 18:08:19 UTC, "Scott E. Garfinkle" <s...@us.death.to.spam.ibm.com> wrote:

>Hyperthreading:
>1. If it really currently works on OS/2 (including eCS), I'd be
>deeply, deeply shocked.

I've read one claim that it does... but OTOH, it would be in Intel's
interest to make it as transparent as possible to the system.

What I've heard is that it works with the SMP kernel.

As always, testing it is the only way to know for sure... and since
I'm not planning on springing for one of those things anytime in the
near future, my personal test is going to have to wait:)

>2. It's not a kernel change that's required, it's an OS2APIC
>change.

That would make sense, considering what I've read about it on the
Intel developer's site.

>3. It's on my list of things to do in my spare time, but don't hold
>your breath.

:-8 <- cheeks distended, face turning blue:)

>4. Personally, I find it hard to believe that using an SMP kernel
>(including Windoze) and recognizing two logical CPUs is any faster
>than a UNI kernel with one physical one. If there's testing to the
>contrary on the Intel site, I've missed it.

IME, the threading in Windows is not as efficient as the threading
in OS/2. It might be a hardware hack to get around those
limitations...

>5. The discussion about FPUs is bogus. Ignore it. This stuff is
>pretty well understood and handled already in SMP systems. Whether
>or not the other FPU is virtual is immaterial.

Well, that's good... and might help explain why it works with SMP
OS/2. That's assuming that it does actually improve performance, and
not just function without cracking up, of course...

Interesting times:)

--
-------------------------------------------------------------------
* Jack Troughton jake at consultron.ca *
* http://consultron.ca irc.ecomstation.ca *
* Laval Québec Canada news://news.consultron.ca *
-------------------------------------------------------------------

Kim Cheung

Nov 18, 2002, 3:11:26 PM

Scott,

Here is part of the original message I received from a friend at Intel (he
was testing the SMP peer-fix for us on his own time - and without his
permission, I would rather not disclose the entire email - just the portion
that's relevant):

============================
I set up a test platform for eCS GA it has
4 - P4 running @ 2.6GHz.
1 - system drive (HPFS) hung off a Adaptec 39160
1 - data drive 160M (JFS) on another Adaptec 39160
    this drive is then shared out using peer
2GB RAM
1- Intel Pro 1000 (Gbit over fiber)

Install of eCS smp GA and the smp8603.zip fix.

<snip portion regarding details of the test>

This machine is also capable of HT -- HyperThreading
Are you aware of the HyperThreading Technology ?
This machine is able to present 8 processors to OS/2
when I enable HT. Of course this is a test vehicle only, not GA.

http://www.intel.com/technology/hyperthread/

Has any analysis been done on OS/2 with HT enabled ?
============================

If it's relevant, I can get more details from our friend.


Scott E. Garfinkle

Nov 18, 2002, 4:58:00 PM
On Mon, 18 Nov 2002 12:11:26 -0800 (PST), Kim Cheung wrote:

>This machine is able to present 8 processors to OS/2 when I enable HT.

Capable, certainly. But the real question is, does OS/2 *use* the capability? I don't think so,
though I don't say this for sure. If your friend said "I turned on mpcpumon and it showed 8 CPUs",
then I would believe that.

As for the rest (performance comments), it will be interesting to see the benchmarks when (if) I do the
work on the APIC driver.


eric w

Nov 18, 2002, 5:53:02 PM
On Mon, 18 Nov 2002 21:58:00 UTC, "Scott E. Garfinkle"
<s...@us.death.to.spam.ibm.com> wrote:

> As for the rest (performance comments), it will be interesting to see the benchmarks when (if) I do the
> work on the APIC driver.
>


What is APIC???

lsu...@mb.sympatico.ca

Nov 18, 2002, 6:19:04 PM

APIC is the SMP hardware that (guessing) does I/O interrupts and other
things. The PSD driver implements a set of APIs the kernel uses to
interface with the SMP hardware. It provides CPU spin locks and things
like that ...

The PSD=OS2APIC.PSD statement in the config.sys file loads it.

--
Lorne Sunley

USBGuy

Nov 18, 2002, 6:29:10 PM

>
>>As for the rest (performance comments), it will be interesting to see the benchmarks when (if) I do the
>>work on the APIC driver.
>>
>
>
>
> What is APIC???

Advanced Programmable Interrupt Controller

The logical CPU in an HT P4 has its own, so the system'll find 2 APICs
and hence should find 2 CPUs.
As for kernel/PSD stuff, it's mostly about use of the PAUSE instruction
in spinlock waits, which is a NOP with a prefix and hence ignored on all
other x86 CPUs.
Other things, like thread stacks thrashing the cache on Windows because they are
1MB aligned, are application specific and easily fixable by an alloca(64*n)
in thread n.
Btw what is the stack alignment on OS/2 for different threads in a process?
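
To make the stack-alignment point concrete, here is a minimal sketch in Python (it is not OS/2 or Windows code). It assumes a small set-associative L1 data cache with 64-byte lines and 32 sets, and thread stacks placed exactly 1 MB apart. The cache geometry, the base address, and the helper names are illustrative assumptions, not measurements. It only shows the address arithmetic behind the alloca(64*n) suggestion.

```python
# Toy model: which cache set does the top of each thread's stack map to?
# All constants are illustrative assumptions, not figures from the thread.
LINE_SIZE = 64                # bytes per cache line (assumed)
NUM_SETS = 32                 # assumed geometry: 8 KB, 4-way, 64-byte lines
STACK_STRIDE = 1 << 20        # stacks 1 MB apart, as the post describes
FIRST_STACK_TOP = 0x10000000  # hypothetical address, 1 MB aligned


def cache_set(addr):
    """Set index for an address in a simple set-associative cache."""
    return (addr // LINE_SIZE) % NUM_SETS


def stack_top(n, stagger):
    """Top-of-stack address for thread n; stagger mimics alloca(64 * n)."""
    top = FIRST_STACK_TOP + n * STACK_STRIDE
    return top - 64 * n if stagger else top


aligned = [cache_set(stack_top(n, False)) for n in range(4)]
staggered = [cache_set(stack_top(n, True)) for n in range(4)]
print("aligned:  ", aligned)
print("staggered:", staggered)
```

Under these assumptions the aligned stacks all land in the same set, while the per-thread offset moves each thread's hot stack lines to a different set.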

USBGuy

Nov 18, 2002, 6:33:52 PM

> As I expected. I had heard of issues with Windows 2000, but the blame lies
> solely with Microsoft, as their processor tax, er, licensing, would consider
> a single hyperthreaded P4 to be two processors. Therefore, in a dual CPU
> workstation with two P4 processors, Windows 2000 Pro would think it had four
> installed and not enjoy any benefits.
>
Well yes, but the nice thing is that XP Home edition does support HT.
XP counts the real physical CPUs, and I guess the same is true for W2K
by now, so with W2K Professional you get your desktop 4-way system.

Alan Beagley

Nov 18, 2002, 7:17:52 PM
But I have read that some apps run slower under XP with a hyperthreading CPU than with a regular one.

-=-
Alan


USBGuy wrote:

> . . . the nice thing is that XP Home edition does support HT

Nick Saxon

Nov 18, 2002, 7:25:35 PM

Wrong. VM provides a way of querying if an OS runs in a virtual machine or
not.
That's why you can run VM in a VM, but not VM in a VM in a VM.

>
>if intel did it right the same would be TRUE!
>
>eric...

Nick


eric w

Nov 18, 2002, 8:18:47 PM
On Tue, 19 Nov 2002 00:25:35 UTC, "Nick Saxon" <bogus.s...@attbi.com>
wrote:

> On Mon, 18 Nov 2002 20:07:49 GMT, eric w wrote:
>
> >well in the mainframe world, when you run VM, it provides virtual machines
> >to any other OS (the OS has NO way of knowing if it is running on a
> >physical or logical processor.)
>
> Wrong. VM provides a way of querying if an OS runs in a virtual machine or
> not.
> That's why you can run VM in a VM, but not VM in a VM in a VM.
>
> >

i don't believe i am wrong.

yes there is a way to find out if you are running under VM, BUT this is to
take advantage of various microcode assists, etc.

in fact under VSE one had the option to run VMaware or natively. granted
you pay a price in performance but it should be possible.

eric..

Nick Saxon

Nov 18, 2002, 8:37:42 PM
On Tue, 19 Nov 2002 01:18:47 GMT, eric w wrote:

>> >to any other OS (the OS has NO way of knowing if it is running on a

-----------------------------=====================================
>> >physical or logical processor.)
------=====================


>>
>> Wrong. VM provides a way of querying if an OS runs in a virtual machine or
>> not.
>> That's why you can run VM in a VM, but not VM in a VM in a VM.
>>
>> >
>
>i don't believe i am wrong.
>
>yes there is a way to find out if you are running under VM, BUT this is to

-===========================================


>take advantage of various microcode assists, etc.
>
>in fact under VSE one had the option to run VMaware or natively. granted
>you pay a price in performance but it should be possible.
>
>eric..

No need to argue,
Nick


Graham

Nov 18, 2002, 10:04:54 PM
Scott E. Garfinkle wrote:
> 4. Personally, I find it hard to believe that using an SMP kernel (including Windoze) and recognizing two
> logical CPUs is any faster than a UNI kernel with one physical one. If there's testing to the contrary
> on the Intel site, I've missed it.

I think the theory is that great lumps of a processor sit idle while
other pieces are working. For example, the FPU will sit idle while the
ALU is working, and vice versa. If you can get more pieces working at
the same time, overall you should see an improvement in throughput, but
it sounds as though you'd need a fairly mixed workload to get the most
out of it.

Graham.

--
*-* Please remove spam free prefix before replying *-*

Chris Stumpf

Nov 18, 2002, 3:01:03 PM
On Mon, 18 Nov 2002 12:39:41 -0500 (EST), dinkmeister wrote:

:>lol os/2 has the same processor tax.. unless you hack the smp kernel/os2apic.*


:>out of a publicly available fixpak and get updates from testcase
:>(which is completely doable :)
:>
:>
:>regards,
:>- dink

:>

Huh? Please explain. As far as I know, the SMP support in eCS or WSeB will
work with up to 64 processors, no shenanigans involved.

Chris Stumpf

Nov 18, 2002, 3:03:46 PM
On Mon, 18 Nov 2002 12:08:19 -0600 (CST), Scott E. Garfinkle wrote:

:>Hyperthreading:


:>1. If it really currently works on OS/2 (including eCS), I'd be deeply, deeply shocked.

Well, Kim posted that someone at Intel tested eCS with the SMP kernel and it
worked just fine according to them.

:>2. It's not a kernel change that's required, it's an OS2APIC change.


:>3. It's on my list of things to do in my spare time, but don't hold your breath.
:>4. Personally, I find it hard to believe that using an SMP kernel (including Windoze) and recognizing two
:> logical CPUs is any faster than a UNI kernel with one physical one. If there's testing to the contrary
:> on the Intel site, I've missed it.
:>5. The discussion about FPUs is bogus. Ignore it. This stuff is pretty well understood and handled already
:> in SMP systems. Whether or not the other FPU is virtual is immaterial.

:>
:>

William L. Hartzell

Nov 18, 2002, 11:12:12 PM
Sir:

USBGuy wrote:
>>> As for the rest (performance comments), it will be interesting to see
>>> the benchmarks when (if) I do the work on the APIC driver.
>> What is APIC???
> Advanced Programmable Interrupt Controller
> The logical CPU in an HT P4 has its own, so the system'll find 2 APICs
> and hence should find 2 CPUs.
> As for kernel/PSD stuff, it's mostly about use of the PAUSE instruction
> in spinlock waits, which is a NOP with a prefix and hence ignored on all
> other x86 CPUs.
> Other things, like thread stacks thrashing the cache on Windows because they are
> 1MB aligned, are application specific and easily fixable by an alloca(64*n)
> in thread n.
> Btw what is the stack alignment on OS/2 for different threads in a process?

My BIOS asks which version of this APIC the system recognizes. I always select disable. However, another post leads me to believe that if OS/2 had this function enabled, we would have IRQs 16-31 available to use. For the non-SMP kernels, this would fix one of the major problems that the system currently has, at little cost. My understanding from this other post is that the system could then assign the many PCI slots and functions to any of these IRQs, and we would not be doing the slot dance with drivers unable to share an IRQ. So, any chance of getting this APIC driver for the non-SMP kernels?

Bill
<American Thanksgiving Day is November 28>

Jack Troughton

Nov 18, 2002, 11:41:17 PM
Chris Stumpf wrote:

> On Mon, 18 Nov 2002 12:39:41 -0500 (EST), dinkmeister wrote:
>
> :>lol os/2 has the same processor tax.. unless you hack the smp
> kernel/os2apic.*
> :>out of a publicly available fixpak and get updates from testcase
> :>(which is completely doable :)
> :>
> :>
> :>regards,
> :>- dink
> :>
>
> Huh? Please explain. As far as I know, the SMP support in eCS or
> WSeB will
> work with up to 64 processors, no shenanigans involved.

Well, it was designed to handle up to that many processors, but
AFAIK it's only been tested up to 16.

It's not like machines like that are common, though.

It would be cool to see what it would act like on a machine like
that, though. Imagine, 32 instances of dnetc or seti. You could
pound a lot of data on a machine like that.

A well threaded webserver could probably push a lot of data out too.

Hehehe.:)

Regards,

Jack

Klaus Staedtler

Nov 19, 2002, 3:15:54 AM
Chris Stumpf wrote:
> On Mon, 18 Nov 2002 12:08:19 -0600 (CST), Scott E. Garfinkle wrote:
>
> :>Hyperthreading:
> :>1. If it really currently works on OS/2 (including eCS), I'd be deeply, deeply shocked.
>
> Well, Kim posted that someone at Intel tested eCS with the SMP kernel and it
> worked just fine according to them.

Please read Kim's posting more carefully. Someone at Intel has installed
eCS SMP on a HT capable machine, but he didn't say that he turned HT on.

Klaus Staedtler

Chris Stumpf

Nov 19, 2002, 3:56:50 AM
On Mon, 18 Nov 2002 23:41:17 -0500, Jack Troughton wrote:

:>
:>Well, it was designed to handle up to that many processors, but

:>AFAIK it's only been tested up to 16.
:>
:>It's not like machines like that are common, though.
:>
:>It would be cool to see what it would act like on a machine like
:>that, though. Imagine, 32 instances of dnetc or seti. You could
:>pound a lot of data on a machine like that.
:>
:>A well threaded webserver could probably push a lot of data out too.
:>
:>Hehehe.:)

Well, someone reported to Kim that they tested eCS SMP on a 32 way Xeon box
and it worked just fine. BTW, I was sitting next to Kim when he got the
email. We both thought that was very cool. As for a webserver, well the
limitation is not cpu, but I/O. The PCI bus will get flooded way before the
cpu gets saturated on a webserver.

Chris Stumpf

Nov 19, 2002, 3:57:43 AM
On Tue, 19 Nov 2002 08:15:54 GMT, Klaus Staedtler wrote:

:>
Actually he did. Read it again.

Scott E. Garfinkle

Nov 19, 2002, 10:19:27 AM
On Mon, 18 Nov 2002 12:03:46 -0800 (PST), Chris Stumpf wrote:

>:>Hyperthreading:
>:>1. If it really currently works on OS/2 (including eCS), I'd be deeply, deeply shocked.
>Well, Kim posted that someone at Intel tested eCS with the SMP kernel and it

>worked just fine according to [him].
Actually, if you read the tester's comments very carefully, you will realize that he did not, in fact,
verify that OS/2 used the logical processors, only that OS/2 runs on a machine that is HT-capable.
This is a VERY different statement.

Further, the comments by "USBGuy" about what the PSD and SMP support are all about are
completely wrong. I don't have time to go into great detail, but there are a number of hardware
abstraction services provided by the PSD, including counting, starting, stopping CPUs, sending
IPIs, and hardware-specific timer services, among other things. Further, there are numerous
SMP-specific changes that have nothing to do with spinlocks. Lastly, spinlocks are themselves
utterly different from what his comments describe.

This is the last reply I will make in this thread, which now goes into my kill file.
Bottom line:
1. OS/2 does NOT currently make any distinction between a system where the P4 is HT-capable
or not. I will *probably* eventually change os2apic as necessary to use this.
2. I am NOT convinced, Intel marketing aside, that OS/2 will run faster with an SMP kernel using
2 (or 4) logical CPUs as opposed to a UNI kernel with 1 physical CPU. I do not say it won't,
only that it is not obvious.
3. This stuff about apps needing to change to take advantage of HT is crap. Apps need to be
written to take advantage of true threading on an OS with an efficient threads mechanism.
Then, whether the scheduling is done on two physical CPUs or whatever is another story.
I tend to think that HT is a marketing tool, although I find plausible the speculation by the poster who
theorized that HT is a response to poor task-switch performance on Windoze.


Jack Troughton

Nov 19, 2002, 10:37:59 AM
On Tue, 19 Nov 2002 08:56:50 UTC, "Chris Stumpf" <cst...@monmouth.com> wrote:

>On Mon, 18 Nov 2002 23:41:17 -0500, Jack Troughton wrote:
>
>:>
>:>Well, it was designed to handle up to that many processors, but
>:>AFAIK it's only been tested up to 16.
>:>
>:>It's not like machines like that are common, though.
>:>
>:>It would be cool to see what it would act like on a machine like
>:>that, though. Imagine, 32 instances of dnetc or seti. You could
>:>pound a lot of data on a machine like that.
>:>
>:>A well threaded webserver could probably push a lot of data out too.
>:>
>:>Hehehe.:)
>
>Well, someone reported to Kim that they tested eCS SMP on a 32 way Xeon box
>and it worked just fine. BTW, I was sitting next to Kim when he got the
>email. We both thought that was very cool. As for a webserver, well the
>limitation is not cpu, but I/O. The PCI bus will get flooded way before the
>cpu gets saturated on a webserver.

Well, that's cool. 32 way... for the compensatory CIO;)

The PCI bus would go long before the CPUs... but there are other
buses. With a machine like that, you're not going to be talking
about cheap components anywhere.

At any rate, for me, this is all idle speculation. I'm not going to
be using anything more than a two way for the foreseeable future
anyway:)

Chris Stumpf

Nov 19, 2002, 11:53:40 AM
On Tue, 19 Nov 2002 15:37:59 GMT, Jack Troughton wrote:
:>At any rate, for me, this is all idle speculation. I'm not going to

:>be using anything more than a two way for the foreseeable future
:>anyway:)
:>
Well, when the AMD Hammer hits the market, there will be 4 way SMP boards
for it. And from what I hear the manufacturers plan on doing 4 way boards
for the Athlon too.

Kim Cheung

Nov 19, 2002, 12:50:29 PM
On Tue, 19 Nov 2002 08:15:54 GMT, Klaus Staedtler wrote:

I tried to get a confirmation about this. And this is what he said:

==========================================
HT works with all operating systems... I've seen so far. The problem is HT
does not benefit all applications and operating systems. There are some
applications broken by HT.

For applications and OS to gain benefits from HT they generally must be
recompiled and optimized for HT.

Apps that are multi-threaded tend to be the best candidates for benefits of
HT.
==========================================

End of quoted message.

Intel does not test OS/2. He did it on his own time. He did say some
applications got broken - like SDD.


Daniela Engert

Nov 19, 2002, 1:06:57 PM
Hi Scott!

Scott E. Garfinkle wrote:

>>This machine is able to present 8 processors to OS/2 when I enable HT.
>
> Capable, certainly. But the real question is, does OS/2 *use* the capability. I don't think so,
> though I don't say this for sure. If your friend said "I turned on mpcpumon and it showed 8 CPUs",
> then I would believe that.
>
> As for the rest (performance comments), it will be interesting to see the benchmarks when (if) I do the
> work on the APIC driver.

As far as I understand this issue, taking advantage of HT requires
evaluation of the _MAT (multiple APIC table entry) ACPI object resulting
in a filled MADT structure, which in turn contains the required info to
find the additional LAPICs (very basic description, details left out).

An OS/2 ACPI/APIC PSD is in its embryonic stage (trying to figure out
how minimum ACPI support can be implemented efficiently (in terms of
programmer resources) without requiring changes to other parts of OS/2
or drivers).

Ciao,
Dani
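
As a rough illustration of the MADT walk described above, here is a minimal sketch in Python. It assumes the standard ACPI layout (a 36-byte table header with signature "APIC", then a 4-byte local APIC address and 4-byte flags, then variable-length entries where type 0 is a Processor Local APIC entry). The function names and the synthetic demo table are made up for illustration, checksum validation is skipped, and nothing here reflects how an OS/2 PSD would actually be written.

```python
import struct

MADT_ENTRIES_OFFSET = 44   # 36-byte header + local APIC address + flags
TYPE_LOCAL_APIC = 0        # "Processor Local APIC" entry, 8 bytes long
LAPIC_ENABLED = 0x1        # bit 0 of the entry's flags field


def enabled_local_apic_ids(madt):
    """Return the APIC IDs of all enabled Processor Local APIC entries."""
    signature, length = struct.unpack_from("<4sI", madt, 0)
    if signature != b"APIC":
        raise ValueError("not a MADT")
    limit = min(length, len(madt))
    ids = []
    offset = MADT_ENTRIES_OFFSET
    while offset + 2 <= limit:
        entry_type, entry_len = struct.unpack_from("<BB", madt, offset)
        if entry_len < 2 or offset + entry_len > limit:
            raise ValueError("corrupt MADT entry")
        if entry_type == TYPE_LOCAL_APIC and entry_len == 8:
            _acpi_id, apic_id, flags = struct.unpack_from("<BBI", madt, offset + 2)
            if flags & LAPIC_ENABLED:
                ids.append(apic_id)
        offset += entry_len   # skip entry types this sketch does not care about
    return ids


def build_demo_madt(apic_ids):
    """Build a synthetic MADT for the demo (checksum left as zero)."""
    entries = b"".join(
        struct.pack("<BBBBI", TYPE_LOCAL_APIC, 8, index, apic_id, LAPIC_ENABLED)
        for index, apic_id in enumerate(apic_ids)
    )
    # One I/O APIC entry (type 1, 12 bytes) to show that other types are skipped.
    entries += struct.pack("<BBBBII", 1, 12, 2, 0, 0xFEC00000, 0)
    body = struct.pack("<II", 0xFEE00000, 1) + entries
    header = struct.pack("<4sIBB6s8sI4sI", b"APIC", 36 + len(body), 1, 0,
                         b"DEMO", b"DEMOTBL", 1, b"DEMO", 1)
    return header + body


# Hypothetical dual-CPU box with HT on: four local APICs listed.
demo_madt = build_demo_madt([0, 1, 6, 7])
print(enabled_local_apic_ids(demo_madt))
```

The APIC IDs in the demo table are invented; real firmware decides which IDs appear and in what order.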


Kim Cheung

Nov 19, 2002, 1:15:42 PM
On Tue, 19 Nov 2002 00:57:43 -0800 (PST), Chris Stumpf wrote:

>:>Please read Kim's posting more carefully. Someone at Intel has installed
>:>eCS SMP on a HT capable machine, but he didn't say that he turned HT on.
>:>
>:>Klaus Staedtler
>:>
>Actually he did. Read it again.

Okay, okay, guys. I am trying to get another confirmation about this.

A question came up that he might have run it and then "assumed" that OS/2 is
using multiple CPUs because he was running the SMP kernel. I am trying to
get him to look more carefully and reply more specifically whether OS/2 is
actually seeing and using multiple CPUs.

Please stay tuned.


William L. Hartzell

Nov 19, 2002, 1:57:09 PM
Sir:

Daniela Engert wrote:
> An OS/2 ACPI/APIC PSD is in its embryonic stage (trying to figure out
> how minimum ACPI support can be implemented efficiently (in terms of
> programmer resources) without requiring changes to other parts of OS/2
> or drivers).

Thanks!

Bill
<American Thanksgiving Day is November 28>

Kim Cheung

Nov 19, 2002, 1:46:21 PM
On Mon, 18 Nov 2002 12:03:46 -0800 (PST), Chris Stumpf wrote:

>:>Hyperthreading:
>:>1. If it really currently works on OS/2 (including eCS), I'd be deeply, deeply shocked.
>
>Well, Kim posted that someone at Intel tested eCS with the SMP kernel and it
>worked just fine according to them.

I am working to get a more precise clarification as to what my friend meant
by "it worked".


Kim Cheung

Nov 19, 2002, 6:41:14 PM

Okay, so far, this is where we stand.

From our friend at Intel:
==========================
If you have HT enabled processors and HT enabled BIOS.
If you look at the CPU monitor you will see all physical and logical
processors.
OS/2 cannot tell the difference between a physical or a logical processor.

Example:

2- physical cpu
HT enabled --- you will see 4 cpu in the CPU monitor

3- physical cpu
HT enabled -- you will see 6 cpu in the CPU monitor
==========================

So, I believe our friend definitely *did* run eCS/Pro in HT mode and CPU
monitor is showing that OS/2 *is* seeing multiple CPUs. No indication on
performance or anything - and some application might not run properly (he
only mentioned SDD).

There is still a question of why some saw only one CPU when they tried it, and
it appears to be related to the BIOS. If the BIOS is not HT enabled, OS/2
would see only one CPU - that's how I understand it.
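
The counting in the example above (2 physical CPUs showing as 4, 3 showing as 6) can be sketched in a few lines of Python. The grouping part goes one step further and is an assumption rather than something reported in this thread: it supposes that, with two logical processors per package, the lowest bit of the APIC ID selects the sibling, so shifting it away yields a package number. The function names and the sample APIC IDs are hypothetical.

```python
def logical_cpu_count(physical_cpus, logical_per_package=2):
    """CPUs an SMP kernel would count when every logical CPU is reported."""
    return physical_cpus * logical_per_package


def package_of(apic_id, logical_per_package=2):
    """Package number, assuming the low APIC ID bits select the sibling."""
    sibling_bits = (logical_per_package - 1).bit_length()
    return apic_id >> sibling_bits


def group_by_package(apic_ids, logical_per_package=2):
    """Map package number -> list of APIC IDs that share that package."""
    groups = {}
    for apic_id in apic_ids:
        package = package_of(apic_id, logical_per_package)
        groups.setdefault(package, []).append(apic_id)
    return groups


print(logical_cpu_count(2), logical_cpu_count(3))
print(group_by_package([0, 1, 6, 7]))
```

An OS that does no such grouping simply sees a flat list of CPUs, which matches the statement that OS/2 cannot tell a physical processor from a logical one.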


Jack Troughton

Nov 19, 2002, 7:43:07 PM
Chris Stumpf wrote:

> On Tue, 19 Nov 2002 15:37:59 GMT, Jack Troughton wrote:
> :>At any rate, for me, this is all idle speculation. I'm not going to
> :>be using anything more than a two way for the foreseeable future
> :>anyway:)
> :>
> Well, when the AMD Hammer hits the market, there will be 4 way
> SMP boards
> for it. And from what I hear the manufacturers plan on doing 4
> way boards
> for the Athlon too.

A four way Athlon would be very cool. I'd better start saving those
pennies:)

Even with my limited exposure to SMP, I gotta say it's good.

I bet that if Warp had become the dominant OS instead of Windows,
SMP would be a lot more popular today. The platform gets such a HUGE
benefit out of running on an SMP system.

Regards,

Jack

Jack Troughton

Nov 19, 2002, 7:47:53 PM
Scott E. Garfinkle wrote:

> I tend to think that HT is a marketing tool, although I find
> plausible the speculation by the poster who theorized that HT is
> a response to poor task-switch performance on Windoze.

Why, thank you:)

Regards,

Jack

USBGuy

Nov 20, 2002, 3:14:37 AM
>
> My BIOS asks which version of this APIC the system recognizes. I
No, your system very likely asks for ACPI, that is, unless you have an SMP
box. Same letters, different FLA, and something completely different.

Daniela Engert

Nov 20, 2002, 1:20:22 PM
Kim Cheung wrote:

> From our friend at Intel:
> ==========================
> If you have HT enabled processors and HT enabled BIOS.
> If you look at the CPU monitor you will see all physical and logical
> processors.
> OS/2 cannot tell the difference between a physical or a logical processor.
>
> Example:
>
> 2- physical cpu
> HT enabled --- you will see 4 cpu in the CPU monitor
>
> 3- physical cpu
> HT enabled -- you will see 6 cpu in the CPU monitor
> ==========================
>
> So, I believe our friend definitely *did* run eCS/Pro in HT mode and CPU
> monitor is showing that OS/2 *is* seeing multiple CPUs. No indication on
> performance or anything - and some application might not run properly (he
> only mentioned SDD).
>
> There is still a question of why some saw only one CPU when they tried it, and
> it appears to be related to the BIOS. If the BIOS is not HT enabled, OS/2
> would see only one CPU - that's how I understand it.

It *is* a BIOS issue!

OS2APIC.PSD looks only at the MPS 1.1/1.4 SMP table (considered obsolete by
MS) in the BIOS, not at the ACPI MADT table, to find the number
of CPUs and Local APICs. Some time ago Intel recommended that
manufacturers include only the physical CPUs in the MPT, no matter whether
HT is enabled or not, to avoid confusing "legacy" OSes which supposedly
cannot handle the HT feature. In such cases only the MADT contains the
full info. This seems to be changing now: some vendors seem to include
the logical CPUs instead of the physical ones in the MPT SMP tables
when HT is enabled. This would make them visible to OS/2.

Ciao,
Dani
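
To show what "looking at the MP table" amounts to, here is a minimal Python sketch that counts enabled processor entries in an MP configuration table. It assumes the layout from the Intel MultiProcessor Specification (a 44-byte header with signature "PCMP", 20-byte processor entries of type 0, and 8-byte entries for types 1 to 4). The function names and the synthetic table are hypothetical, checksums are ignored, and this is not taken from OS2APIC.PSD.

```python
import struct

MP_HEADER = "<4sHBB8s12sIHHIHBB"              # "PCMP" configuration table header
MP_HEADER_SIZE = struct.calcsize(MP_HEADER)   # 44 bytes
# Entry sizes by type: processor, bus, I/O APIC, I/O interrupt, local interrupt.
ENTRY_SIZES = {0: 20, 1: 8, 2: 8, 3: 8, 4: 8}
CPU_ENABLED = 0x1                             # bit 0 of a processor entry's CPU flags


def mp_table_cpu_apic_ids(table):
    """Return the local APIC IDs of enabled processor entries."""
    fields = struct.unpack_from(MP_HEADER, table, 0)
    signature, entry_count = fields[0], fields[8]
    if signature != b"PCMP":
        raise ValueError("not an MP configuration table")
    ids = []
    offset = MP_HEADER_SIZE
    for _ in range(entry_count):
        entry_type = table[offset]
        if entry_type not in ENTRY_SIZES:
            raise ValueError("unknown MP table entry type")
        if entry_type == 0:
            _type, apic_id, _version, cpu_flags = struct.unpack_from("<BBBB", table, offset)
            if cpu_flags & CPU_ENABLED:
                ids.append(apic_id)
        offset += ENTRY_SIZES[entry_type]
    return ids


def build_demo_mp_table(cpu_apic_ids):
    """Build a synthetic MP table for the demo (checksum left as zero)."""
    entries = b""
    for index, apic_id in enumerate(cpu_apic_ids):
        flags = CPU_ENABLED | (0x2 if index == 0 else 0)   # bit 1 marks the boot CPU
        entries += struct.pack("<BBBBII8s", 0, apic_id, 0x14, flags, 0, 0, b"")
    entries += struct.pack("<BB6s", 1, 0, b"PCI   ")       # one bus entry, skipped above
    header = struct.pack(MP_HEADER, b"PCMP", MP_HEADER_SIZE + len(entries), 4, 0,
                         b"DEMO", b"DEMOBOARD", 0, 0, len(cpu_apic_ids) + 1,
                         0xFEE00000, 0, 0, 0)
    return header + entries


# Hypothetical BIOS that lists only the two physical CPUs, even with HT on.
demo_mp_table = build_demo_mp_table([0, 6])
print(mp_table_cpu_apic_ids(demo_mp_table))
```

With a table like this, a driver that reads only the MP table would find two CPUs regardless of how many logical processors the ACPI MADT lists.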


William L. Hartzell

Nov 20, 2002, 6:42:09 PM
Sir:

USBGuy wrote:
>> My BIOS asks which version of this APIC the system recognizes. I
> No, your system very likely asks for ACPI, that is, unless you have an SMP
> box. Same letters, different FLA, and something completely different.

I wrote down what it said.
APIC Mode: choice ENABLED or DISABLED
With enabled selected it asks:
MPS version: choice 1.1 or 1.4
This last is bogus, as this is a single-processor board. So what does this mean? I don't think it is as you suspect. ACPI is the power management function and is different from what I speak of.

Bill
<American Thanksgiving Day is November 28>

William L. Hartzell

Nov 20, 2002, 6:42:09 PM
Sir:

USBGuy wrote:
>> My BIOS asks which version of this APIC the system recognizes. I
> No, your system very likely asks for ACPI, that is, unless you have an SMP
> box. Same letters, different FLA, and something completely different.

My BIOS has a switch for APIC and MPS, though it is a single-processor board. The manual can be found at abit.com for KX7-333/KX7-333R as a pdf. But I did not save it, so no exact name. See Dani's last post to Kim in this thread. ACPI is for power management and is different from what I speak of.

Bill
<American Thanksgiving Day is November 28>

ste...@scitechsoft.com

Nov 21, 2002, 1:50:12 AM
"Scott E. Garfinkle" wrote:

> 1. OS/2 does NOT currently make any distinction between a system where the P4 is
> HT-capable or not.

As someone else suggested, I think how the BIOS represents the "CPU(s)"
plays a factor in this. I've got access to one of these, and the UNI
kernel works great. I'll have to try the SMP kernel on it and see what
it says...

> 2. I am NOT convinced, Intel marketing aside, that OS/2 will run faster with an
> SMP kernel using 2 (or 4) logical CPUs as opposed to a UNI kernel with 1 physical

I'll try the SysBench tests while I'm at it...

> 3. This stuff about apps needing to change to take advantage of HT is crap. Apps
> need to be written to take advantage of true threading on an OS with an efficient
> threads mechanism.
> Then, whether the scheduling is done on two physical CPUs or whatever is another
> story. I tend to think that HT is a marketing tool, although I find plausible the
> speculation by the poster who theorized that HT is a response to poor task-switch
> performance on Windoze.

I've received some evidence(?) of this:

----------------------

> Does it really just look like two CPU's to applications etc

Yes, in fact even the OS thinks there's two CPU's there unless it's been
told different. e.g. W2K thinks I have two CPU's in the box. (And in
fact since the W2K scheduler is so shitty, it kills the performance
since it's thumping the <snip> thread between "CPU's" unnecessarily.
On a 2ghz P4, I get 50PPS (pixels per second) on our benchmark, on a
3ghz HT P4 I only get 60PPS. Yet when I installed Windows XP Pro on that
same 3ghz P4, I got 80PPS on the same render ... as <snip> explained to
me, XP Pro fixes some problems with the scheduler.

> Or do you have to code stuff specially for it?

Well, there's two things here. Technically, no, as long as your code
already makes use of two or more threads, it in theory can benefit. In
practice, however, if the threads are doing more or less the same thing
and accessing the same memory (e.g. <snip>), then you tend to lose some
of the benefit, since the advantage of hyperthreading is
that if one thread is stalled waiting for RAM, then another thread may
be able to do some work, if the resources it needs are in the cache. If
that's not the case then it doesn't really help.

USBGuy

unread,
Nov 21, 2002, 3:31:03 AM11/21/02
to
>
> The ACPI is for power management and is different from which I speak.

No, ACPI is for configuration of the HW, which does include the
power states. See Dani's comment about using the old MPS versus the
new ACPI info areas.

Well, the APIC setting will let the board use the 24 IRQ lines of
the APIC instead of the 16 of the normal PIC. So you get 8 more IRQs
with that on an OS that supports those.

USBGuy

unread,
Nov 21, 2002, 3:51:30 AM11/21/02
to
>
>>Does it really just look like two CPU's to applications etc
>
>
> Yes, in fact even the OS thinks there's two CPU's there unless it's been
> told different. e.g. W2K thinks I have two CPU's in the box. (And in
> fact since the W2K scheduler is so shitty, it kills the performance
> since it's thumping the <snip> thread between "CPU's" unnecessarily.
> On a 2ghz P4, I get 50PPS (pixels per second) on our benchmark, on a
> 3ghz HT P4 I only get 60PPS. Yet when I installed Windows XP Pro on that
> same 3ghz P4, I got 80PPS on the same render ... as <snip> explained to
> me, XP Pro fixes some problems with the scheduler.

According to a detailed article about SMT/HT in the German c't mag, the
scheduler was updated to prefer physical over logical CPUs, which you only
see if you have a dual P4 with SMP => 4 vCPUs.
And the main problem is that NT has had a spin count option on the
critical section since NT4 SP3, so that a blocked thread can check the
critsec it is blocking on more often and doesn't have to wait to be
scheduled with the next timeslice, which can be 120ms on an NT Server.
The code used for that was basically eating up vCPU cycles/resources on
an SMT system.
And the PAUSE instruction was added by Intel so that the wait between the
checks takes longer and the CPU units can be used by the other vCPU in
the CPU.
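
For anyone who wants to see the spin-then-block idea in a few lines, here is a rough sketch in Python. It is only an illustration of the technique, not the NT code: the function name acquire_with_spin and its parameters are made up for this example, and Python cannot issue PAUSE, so the spot where a native spin loop would use it is only marked with a comment.

```python
import threading
import time


def acquire_with_spin(lock, spin_count):
    """Poll the lock up to spin_count times, then fall back to blocking.

    Returns the number of failed polls before the lock was obtained
    (0 means it was free on the first try, spin_count means we had to block).
    """
    for failed_polls in range(spin_count):
        if lock.acquire(blocking=False):
            return failed_polls
        # A native spin loop would execute the x86 PAUSE instruction here,
        # so the sibling logical CPU can use the shared execution units.
        # Python has no equivalent, so we merely yield (an assumption made
        # for this sketch).
        time.sleep(0)
    # Spinning did not help: block until the holder releases the lock.
    lock.acquire()
    return spin_count
```

A small spin count wins when the lock is held only briefly; if the holder keeps it for long, every poll is wasted work, which on an SMT system is taken away from the other logical CPU.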

>
>>Or do you have to code stuff specially for it?
>
>
> Well, there's two things here. Technically, no, as long as your code
> already makes use of two or more threads, it in theory can benefit. In
> practice, however, if the threads are doing more or less the same thing
> and accessing the same memory (e.g. <snip>), then you tend to lose some
> of the benefit, since the advantage of hyperthreading is
> that if one thread is stalled waiting for RAM, then another thread may
> be able to do some work, if the resources it needs are in the cache. If
> that's not the case then it doesn't really help.

Well, you also have to be aware of cache thrashing on an SMT system,
which can hurt performance and doesn't show up on an SMP system, as there
the 2 CPUs don't share a cache.
With 1MB-aligned stacks in Windows and a 1MB/64kB cache window
(P4/Xeons) you get that easily. In the mag they wrote a benchmark, and
VTune suggested adding alloca(n*64) to one thread so the stacks are n
cache lines apart. That resulted in a speedup from 63.5% to 71.5% per
thread (100% is the speed of the same thread/code on a single-CPU
system), which is a speedup from 27% to 43% for the combined workload of
both threads.
So if you code with SMT in mind you can benefit further from HT.
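
To make the stack offset trick and the percentages a bit more concrete, here is a small Python sketch. It is a simplified model and not a description of the real P4 cache: it assumes two addresses collide whenever they fall into the same 64-byte line slot within a 64kB window, and the names cache_slot, stacks_collide and combined_gain are made up for this example.

```python
LINE_SIZE = 64              # bytes per cache line, as in the n*64 offset
ALIAS_WINDOW = 64 * 1024    # assumed aliasing window (the 64kB case)
STACK_ALIGN = 1024 * 1024   # thread stacks aligned to 1MB


def cache_slot(address, window=ALIAS_WINDOW, line=LINE_SIZE):
    """Line slot an address falls into inside the aliasing window."""
    return (address % window) // line


def stacks_collide(offset_a, offset_b):
    """True if these offsets in two 1MB-aligned stacks share a slot."""
    stack_a = 1 * STACK_ALIGN + offset_a
    stack_b = 2 * STACK_ALIGN + offset_b
    return cache_slot(stack_a) == cache_slot(stack_b)


def combined_gain(per_thread_percent, threads=2):
    """Gain of all threads together over one thread running at 100%."""
    return threads * per_thread_percent - 100.0


print(stacks_collide(0, 0))        # same depth in both stacks
print(stacks_collide(0, 3 * 64))   # one stack shifted by 3 cache lines
print(combined_gain(63.5), combined_gain(71.5))
```

In this model the same depth in both stacks lands in the same slot, and shifting one stack by a few cache lines separates them. The last line just redoes the arithmetic: two threads at 63.5% each add up to 127% of a single CPU, and two at 71.5% add up to 143%.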

Jim Goffena

unread,
Nov 22, 2002, 4:32:50 PM11/22/02
to
i've installed acp2 multiple times on 4 way machines using Intel Xeon MP
procs. these of course all have HT. HT is enabled in the bios, but i don't
put the secondary logical procs into the mps so of course os2 no see em. at
some point i'll probably add the secondary cores to the mps table just for
grins and to see os2 w/ 8 procs. but i wouldn't expect it to run great w/
all the logical procs...could get into way too many timing/resource type
issues.

but else it works fine. only problem i have is a trap in doscall1 when
running the uni kernel. but disabling proc cache until the smp kernel loads
gets around that fine. scitech is great as long as you disable write
combining....but this was a while back, and i haven't checked newer scitech
releases on smp machines.


Teruel de Campo

unread,
Nov 22, 2002, 11:36:36 AM11/22/02
to
In message <frthfvozpbz.h...@ausnews.austin.ibm.com> - "Scott E.
Garfinkle" <s...@us.death.to.spam.ibm.com>Tue, 19 Nov 2002 09:19:27 -0600 (CST)
writes:
:>This is the last reply I will make in this thread, which now goes into my kill file.

:>Bottom line:
:> 1. OS/2 does NOT currently make any distinction between a system where the P4 is HT-capable
:> or not. I will *probably* eventually change os2apic as necessary to use this.
:> 2. I am NOT convinced, Intel marketing aside, that OS/2 will run faster with an SMP kernel using
:> 2 (or 4) logical CPUs as opposed to a UNI kernel with 1 physical CPU. I do not say it won't,
:> only that it is not obvious.
:> 3. This stuff about apps needing to change to take advantage of HT is crap. Apps need to be
:> written to take advantage of true threading on an OS with an efficient threads mechanism.
:> Then, whether the scheduling is done on two physical CPUs or whatever is another story.
:>I tend to think that HT is a marketing tool, although I find plausible the speculation by the poster who
:>theorized that HT is a response to poor task-switch performance on Windoze.

Scott,

Thank you for sharing your knowledge with us.

Regards,

-=terry (Denver)=-
chu...@attglobal.net
ICQ: 6387625
AIM: terryXela
