My requirement is that the ISR be called within 0.2ms of the hardware interrupt.
However, I am seeing jitter of 1ms or more (the ISR gets called up to 1ms
after the hardware interrupt). I understand that Vista (latest updates) is not an
RTOS and that other ISRs can take up to 0.25ms (according to tracelog) and
that under load conditions, several of these other ISR calls may be handled
before my driver's ISR. But I did the following in order to make sure that
my ISR runs on a processor core which is never used by other ISRs:
1. I used the interrupt affinity policy tool to change the processor
affinity for all device drivers (except mine) to 0x3 (this is on a Q6600
4-core system).
2. I used the affinity policy tool to set my driver's affinity to 0x4.
3. Using tracelog, I verified that under system load, only cores 0 & 1 are
used by other drivers' interrupts and that only core 2 is used by my ISR.
In spite of this, I am getting large jitter in ISR latency for my driver (up
to more than 1ms) under system load (using HD, Direct3D etc) - even though my
device is the only device generating interrupts on core 2!
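For reference, here is how the affinity masks above map to core numbers when cores are counted from zero. This is only a small Python sketch; cores_in_mask is a made-up helper for illustration, not part of the affinity policy tool.

```python
def cores_in_mask(mask: int) -> list[int]:
    """Return the zero-based core indices whose bits are set in an affinity mask."""
    cores = []
    index = 0
    while mask:
        if mask & 1:          # lowest bit set means this core is allowed
            cores.append(index)
        mask >>= 1            # move on to the next core's bit
        index += 1
    return cores


print(cores_in_mask(0x3))  # [0, 1]
print(cores_in_mask(0x4))  # [2]
```

So 0x3 allows cores 0 and 1, and 0x4 allows only the core with index 2.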
I have the following questions:
1. Does anyone have any idea why this might be the case?
2. On another note, even though I set affinity to 0x3 for all drivers except
mine, they only generate interrupts on core 0 (instead of cores 0 & 1). Why
is that?
3. How can one change Irql and SynchronizeIrql for the interrupt? The
documentation only states that these values are supposed to be taken from
CM_PARTIAL_RESOURCE_DESCRIPTOR. What if I want my IRQL to be very high so
that my driver's ISR will preempt other ISRs even if they are running on the
same CPU core? If this is possible then I would not have to prevent all
devices except for mine from generating interrupts on core 2.
Thanks
Philip
-Eliyas
Xperf is pretty nice but it is missing lots of the DPC/ISR details that
tracelog provides. Actually, it would be VERY useful if the CPU or ISR usage
in Xperf would include context switches due to interrupts and DPCs (so that
one can see nested interrupts etc).
I'm looking forward to getting more information from you once your sources
get back to you.
Regards
Philip
This is the official forum for discussing windows performance tools:
http://social.msdn.microsoft.com/Forums/en-US/wptk_v4/threads
-Eliyas
I moved my xperf post to that forum...
Still looking forward to hearing from you and your sources :-)
-----------
We are not able to clearly explain the reason for this jitter. If your
device is the only interrupting device on processor 3, it should have very
low interrupt latency. Setting affinity to 0x3 allows the hardware to
select between core 0 and core 1 at the time of each interrupt. The
hardware is supposed to choose the “lowest priority” processor but the
algorithms it uses differ from machine to machine. In many cases, core 0 is
the tie-breaker and lots of interrupts end up on core 0.
The best answer we can come up with for why this is happening is that there
is some other kernel activity going on – kernel mode workers that are
disabling interrupts, significant IPI traffic, etc. Maybe from some HD
software? Xperf traces are a good place for you to start. If you are
running on x86, the latency impact of this kind of activity is probably
higher than if you are running 64-bit.
----------
So my suggestion would be that you use Xperf to figure out what's going on.
If you can't do that yourself, contact MS DDK support and give the log file
of Xperf and they will be able to work with the kernel engineer to analyze
the trace and identify the problem.
Good Luck.
-Eliyas
The thing is that all interrupts appear to go to CPU0 and most hardware
device DPCs appear to go to CPU0 as well, so there should not be a whole lot
of IPIs, right?
The only DPCs that go to all cores are those from tcpip.sys and usbport.sys,
both of which are very short and infrequent and should not cause this type of
latency even if they cause IPIs.
There must be something else going on... any ideas would be appreciated...
Thanks
Philip Gruebele
I just want to confirm some assumptions in relation to this interrupts
latency issue I am having:
1. each processor/core handles interrupts independently of the others.
2. if my device is the only device generating interrupts on core2 (other
devices are on core 0), then no matter what my device's IRQL is set to, its
interrupt should always be serviced immediately since no other interrupts of
higher IRQL should ever be running on core2.
3. if a device driver ISR or DPC temporarily disables interrupts or performs
other similar actions, it will only disable these for the core that it is
running on (core0), and should have no effect on my ISR running on core2.
4. after my device generates an interrupt but before the kernel actually
calls my ISR, does the kernel try to acquire any spin locks which could
explain this latency?
Thanks in advance
Philip Gruebele
Running Xperf gives me the exact sequence of all system interrupts. With it
and Excel, I am able to look at the time delta between each of my interrupts
(nirlpk driver, the only one interrupting on CPU core 2). I ran a lot of
disk-intensive file searches and managed to make my interrupt get skipped
altogether. The table below shows this. nirlpk is my driver and it is the
ONLY one generating interrupts on core 2. Most of the nirlpk interrupts (not
shown in the table) are almost exactly 1.25ms apart as they should be (this is
the period at which the hardware generates interrupts). However, the 2
nirlpk ISR calls in this part of the xperf table are about 2.87ms apart. Note
that there is no ISR activity between 28409.61188 and 28411.48736, so there is
absolutely no reason why my ISR should have been called so late.
DRIVER         core   ISREnterTime (ms)   ISRExitTime (ms)
_________________________________________________________
nirlpk.sys     2      28408.6198          28408.63592
USBPORT.SYS    0      28408.66316         28408.66808
HDAudBus.sys   0      28408.68044         28408.69992
USBPORT.SYS    0      28409.1872          28409.1902
HDAudBus.sys   0      28409.19404         28409.2018
ubohci.sys     0      28409.21444         28409.21864
ubohci.sys     0      28409.22256         28409.22848
ubohci.sys     0      28409.2426          28409.24764
USBPORT.SYS    0      28409.32928         28409.33412
HDAudBus.sys   0      28409.33704         28409.34732
ubohci.sys     0      28409.59304         28409.59644
ubohci.sys     0      28409.5994          28409.6024
ubohci.sys     0      28409.605           28409.61188
nirlpk.sys     2      28411.48736         28411.50836
ubohci.sys     0      28411.48976         28411.50056
How is this explained? These skipped or delayed ISR calls happen much more
frequently with system activity. Yet the table above shows that there was no
ISR activity just before my delayed ISR call at 28411.48736. So, even if my ISR
were being called on core 0 like the other ISRs, there would be no reasonable
explanation of why my ISR is called so late...
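The same delta calculation can be done outside Excel. Below is a rough Python sketch using the rows from the table above; the timestamps are assumed to be in milliseconds (consistent with the 1.25ms period), and gaps_for and quiet_before are hypothetical helper names, not xperf features.

```python
PERIOD_MS = 1.25  # hardware interrupt period stated in the post

# (driver, core, isr_enter_ms, isr_exit_ms), copied from the table above
rows = [
    ("nirlpk.sys", 2, 28408.6198, 28408.63592),
    ("USBPORT.SYS", 0, 28408.66316, 28408.66808),
    ("HDAudBus.sys", 0, 28408.68044, 28408.69992),
    ("USBPORT.SYS", 0, 28409.1872, 28409.1902),
    ("HDAudBus.sys", 0, 28409.19404, 28409.2018),
    ("ubohci.sys", 0, 28409.21444, 28409.21864),
    ("ubohci.sys", 0, 28409.22256, 28409.22848),
    ("ubohci.sys", 0, 28409.2426, 28409.24764),
    ("USBPORT.SYS", 0, 28409.32928, 28409.33412),
    ("HDAudBus.sys", 0, 28409.33704, 28409.34732),
    ("ubohci.sys", 0, 28409.59304, 28409.59644),
    ("ubohci.sys", 0, 28409.5994, 28409.6024),
    ("ubohci.sys", 0, 28409.605, 28409.61188),
    ("nirlpk.sys", 2, 28411.48736, 28411.50836),
    ("ubohci.sys", 0, 28411.48976, 28411.50056),
]


def gaps_for(driver, rows):
    """Time between consecutive ISR entries of one driver, in ms."""
    enters = [enter for name, _core, enter, _exit in rows if name == driver]
    return [later - earlier for earlier, later in zip(enters, enters[1:])]


def quiet_before(index, rows):
    """Time between the latest earlier ISR exit (any core) and this ISR entry."""
    last_exit = max(exit_ms for _n, _c, _e, exit_ms in rows[:index])
    return rows[index][2] - last_exit


gap = gaps_for("nirlpk.sys", rows)[0]
print(f"gap between nirlpk ISRs: {gap:.2f} ms ({gap / PERIOD_MS:.2f} periods)")
print(f"quiet time before the late ISR: {quiet_before(13, rows):.2f} ms")
```

For these rows the gap is about 2.87ms (roughly 2.3 periods), and about 1.88ms of that passes with no recorded ISR running on any core.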
Why is the kernel taking so long to call my ISR? Once again, the ISR does
not do any serious processing. All it does is acknowledge the interrupt and
check for interrupt overrun (which it got in the example above). I can
therefore say with some certainty that the problem must lie with either the
kernel or the interrupt hardware (modern Q6600 680Sli ACPI machine).
Regards
Philip Gruebele
Xperf doesn't see SMIs, right?
--PA
Or with some other piece of software. Don't forget that there is the CLI
instruction that will disable maskable interrupts on the processor. This is
used in various parts of the kernel and in some exported APIs (the
ExInterlockedXxx package comes to mind). I'd find it slightly unusual for a
driver to disable interrupts on the processor with any kind of regular
frequency, though it wouldn't surprise me.
Also, the xperf output shown doesn't seem to take into account system
management interrupts (e.g. clock). These would delay your ISR from running
also.
This doesn't really help get you a solution, of course. But I guess the
moral is that Windows is not real time and even though you've affinitized
all your interrupts to a particular processor you still don't own that
processor. Other things can (and will) thwart your attempts to get
consistent results.
-scott
--
Scott Noone
Software Engineer
OSR Open Systems Resources, Inc.
http://www.osronline.com
The thing is that:
1. doing disk activity such as searching for files greatly increases this
interrupt jitter but this would not cause increases in SMIs, right? So SMIs
don't appear to be the culprit. Other things like tcp traffic seem to have a
similar effect as disk activity.
2. CLI/IF is processor/core specific, right? If all ISRs except for mine run
on core 0, then they can only disable interrupts for that core... So other
drivers can't really be causing this (I have verified 100% that only my
driver raises interrupts on core 2).
3. the jitter can reach more than 1.6ms, which seems like an awfully long time.
Since it happens mainly with system load (not just CPU load since interrupts
have priority over all threads...), this means that device driver ISR/DPC
activity must somehow be causing this indirectly. The question is why?
My application is actually soft-realtime so it can cope with these timing
errors OK. The problem is that when the system gets loaded, this interrupt
jitter becomes so large and frequent that it causes me to have to re-measure
too much data. I don't expect hard realtime performance. I just want to
understand why things are behaving as badly as they are given the
configuration I created...
Thanks
Philip
As long as threads can still be scheduled on the processor then there can be
activity on that processor. Are you also changing the affinity of all
threads so that nothing is scheduled on that proc?
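To make the question concrete: keeping threads off the reserved core would mean giving them an affinity mask with that core's bit cleared. A minimal Python sketch, assuming a 4-core machine with core 2 (mask 0x4) reserved for the ISR; mask_excluding is a hypothetical helper, and actually applying the mask (for example through the Win32 SetProcessAffinityMask or SetThreadAffinityMask calls) is not shown here.

```python
def mask_excluding(core: int, core_count: int) -> int:
    """Affinity mask that allows every core except the given zero-based one."""
    all_cores = (1 << core_count) - 1   # e.g. 0xF for 4 cores
    return all_cores & ~(1 << core)     # clear the reserved core's bit


print(hex(mask_excluding(2, 4)))  # 0xb, i.e. cores 0, 1 and 3
```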
Out of curiosity, which O/S is this?
-scott
It should not be necessary to change thread/process affinity since
interrupts pre-empt all threads, including realtime priority threads. It is
my understanding that no matter what threads are running on the system,
interrupts are serviced at a level that is higher than any scheduler managed
threads...
Philip
Anyway, I'm pretty sure that xperf and my ISR timer data are reporting
correct data to me... I just can't find a reasonable explanation for why
this is happening...
I also thought that about 2-3 months ago this was all working perfectly with
jitter <100us. I don't know why now this is happening. I can't be 100% sure
that this was working, but I have a funny feeling that perhaps some Vista
"reliability" fixes that MS released in the last few months might have
changed something which causes this jitter... This is just a hunch and may
well be wrong. Food for thought...
Philip
Then, maybe this is effect of IPIs or something like that.
xperf won't see these.
--PA
I would say it could delay it N units. The system is non-deterministic,
there are way too many factors here.
>All the interrupt does is store the processor context on the stack and call
>the interrupt vector - no matter how many threads are running.
I guess I'm not making the point clear on this one. If there is a thread
actively executing on the processor at an IRQL < your device IRQL, then
sure, the thread is interrupted and your ISR runs. However, during that
thread's time slice it could do things to prevent your ISR from being
delivered, notably CLI or raising the IRQL.
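A toy model of that point, as a Python sketch: an interrupt raised while the core has interrupts disabled (or is running at an IRQL at or above the device IRQL) stays pending until that window ends. The function name, the window list and the numbers are invented for illustration; this is not how one would measure anything on a real system.

```python
def delivery_time(raised_at, masked_windows):
    """Return when an interrupt raised at `raised_at` can be delivered.

    masked_windows is a list of (start, end) times during which the core
    cannot take the interrupt (CLI, or IRQL >= the device IRQL).
    """
    t = raised_at
    for start, end in sorted(masked_windows):
        if start <= t < end:
            t = end  # the interrupt is held pending until the window closes
    return t


# Times in microseconds: interrupts every 1250us, one 400us masked window.
raised = [0, 1250, 2500]
windows = [(1200, 1600)]
latencies = [delivery_time(t, windows) - t for t in raised]
print(latencies)  # [0, 350, 0]
```

Only the interrupt that lands inside the masked window is late, and its latency depends on where in the window it arrives, which looks like jitter from the ISR's point of view.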
You also have the clock, SMIs (which are used for all kinds of weird
purposes, including fixing platform issues), IPIs (cross processor
scheduling, TLB shootdown), and whatever overhead xperf has (meant to be
minimal, but it's there).
If it were my job to track down this latency, I'd try this across more
systems with different processors and see if there was a difference. Also, I
think Eliyas recommended trying this on x64, which would also be
interesting.
If this is business critical I'd recommend ditching the software based
efforts and getting an Arium: http://www.arium.com/. That should help get a
much better picture as to how the processor is spending its time during the
delay.
HTH and good luck!
-scott
Are there any software tools that will measure SMI, IPI, CLOCK, and TLB
shootdown occurrences? I can't find any Windows-related info on TLB
shootdown...
Anyway, I am doing a clean vista install with SP1 but no further updates to
see if this problem really did start with some of the newer updates or
perhaps some driver update that I am not aware of.
I will report back on my findings...
-Philip
If I use a CPU with hyper threading such as the new i7 4 core processor,
each core has 2 hyperthreading logical processors. Each of those
hyperthreading logical processors has its own interrupt registers and related
hardware. If I hijack one of those hyperthreading logical processors and use
it to busy poll my timer so that I remove most of the jitter problem, I
wonder if this busy polling will consume the entire core or whether it will
be able to coexist peacefully with another thread running on the same
processor core... I guess this is an Intel architecture question. The busy
polling thread would loop around a few simple integer instructions, so it
seems like this might work...
Just a thought. With cores and hyperthreading increasing the way they are, it
seems like a better solution than adding an RTOS kernel to Vista or trying to
figure out how to resolve the interrupt jitter problem, right?
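Roughly what I have in mind for the polling loop, as a Python sketch only (the real thing would be native code pinned to one logical processor, which this does not do). busy_poll and its clock parameter are made up for illustration; it spins on a clock until each period boundary and records how late each wake-up was.

```python
import time


def busy_poll(period_ns, ticks, clock=time.perf_counter_ns):
    """Spin until each period boundary; return how late each wake-up was (ns)."""
    start = clock()
    lateness = []
    for n in range(1, ticks + 1):
        deadline = start + n * period_ns
        now = clock()
        while now < deadline:   # busy wait: a few integer ops per iteration
            now = clock()
        lateness.append(now - deadline)
    return lateness


# Eight ticks at the 1.25ms period used by the hardware in this thread.
late = busy_poll(1_250_000, 8)
print("worst lateness (ns):", max(late))
```

The clock is passed in as a parameter so the loop can be checked against a fake clock. Whether a spinning hyperthread starves its sibling on the same core is exactly the open question, and this sketch does not answer it.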
-philip
When a thread is running in user mode it can invoke a system service. That
will transition the mode of thread from the least privileged (user) to the
most privileged (kernel) and begin executing inside the O/S. Various device
drivers may be called while processing the system service, and they may do
either of the above. Also, the O/S may do either of the above during this
process also.
Page faults can come into play here too, the O/S may invoke the storage
stack to bring pages in and that will involve everything from the file
system all the way down to the storage controller (and any filters above or
in between).
> Are there any software tools that will measure SMI, IPI, CLOCK, and TLB
> shootdown occurrences?
Not that I'm aware of, though I've never had to, so I've never gone looking.
-scott