Real time performance on non-root linux


Dan Zach

unread,
Feb 23, 2017, 9:00:15 AM2/23/17
to Jailhouse
Dear forum,

In a Jetson TK1 inmate I run Linux 4.8.2 with the PREEMPT_RT patch.
I measure a 1 kHz high-priority task, driven by hrtimer events, that measures its own jitter.

In the worst case I get 400 µs of jitter on the 1 ms tick. Not so bad, but the interesting thing is that the jitter stays as low as 50 µs until I start activity on the root cell, especially the GUI.

So the question is: if the non-root cell is completely isolated from the root (separate physical memory, PPIs from its local timer), where can this cross-influence come from?

Thanks
Dan

Henning Schild

unread,
Feb 23, 2017, 11:17:07 AM2/23/17
to jailho...@googlegroups.com
Well, the cells are not completely isolated; some shared resources
remain. The jitter you are seeing is caused by those, i.e. caches and
buses. And GPU workloads would likely stress exactly those.

Henning


Ralf Ramsauer

unread,
Feb 23, 2017, 12:24:03 PM2/23/17
to Henning Schild, jailho...@googlegroups.com
On 02/23/2017 08:17 AM, Henning Schild wrote:
> On Thu, 23 Feb 2017 06:00:15 -0800
> Dan Zach <d...@cobomind.com> wrote:
>
>> Dear forum,
>>
>> On the Jetson TK1 inmate I use linux 4.8.2 with PREEMPT-RT patch.
>> I measure a 1KHz high priority task based on hrtimer events that
>> measures it's own jitter.
Run "jailhouse cell stats" for your RT cell and watch the MMIO traps. I
guess it will trap heavily under RT: RT Linux frequently accesses the
GICD, which has to be emulated in software.
>>
>> The worst case, I get 400uS jitter on 1mS tick - not so bad, but the
Uh, okay? With the root cell idling? CONFIG_HZ_1000?

Uhm -- What's your clocksource? Could you please post /proc/timer_list?
>> interesting thing is that the jitter stays as low as 50uS, untill I
>> start activity on the root cell, especially GUI.
>>
>> So the questions is: if the non-root cell is completely isolated from
>> the root: separate physical memory, PPI from its local timer, where
>> this cross influence can come from?
>
> Well the cells are not completely isolated, some shared resources
> remain. The jitter you are seeing is caused by those, i.e. caches and
> busses. And GPU workloads would likely stress exactly those.
Yep, mainly the shared system bus and caches. Other factors, like MMIO
dispatching and IRQ reinjection, also cause (at least measurable)
latencies. I already did some measurements to get worst-case latencies,
but interestingly I never hit anything like that.

Do those latencies also occur if you use a PREEMPT_RT-patched kernel
without Jailhouse when the GPU gets stressed?

Ralf

Dan Zach

unread,
Feb 23, 2017, 12:49:30 PM2/23/17
to Ralf Ramsauer, Jailhouse, Henning Schild
Good idea; I will run the same kernel and the same test without Jailhouse to compare.

Thanks all

--
You received this message because you are subscribed to a topic in the Google Groups "Jailhouse" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/jailhouse-dev/RfucfkcbNQU/unsubscribe.
To unsubscribe from this group and all its topics, send an email to jailhouse-dev+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Dan Zach

unread,
Feb 23, 2017, 1:54:52 PM2/23/17
to Ralf Ramsauer, Henning Schild, Jailhouse
So /proc/timer_list is below.
In the DTS I configured both the ARM architected timer and the SoC Tegra timer (I don't think the latter actually works, though):

    timer@0,60005000 {
        compatible = "nvidia,tegra124-timer", "nvidia,tegra20-timer";
        reg = <0x60005000 0x400>;
        interrupts = <GIC_SPI 0 IRQ_TYPE_LEVEL_HIGH>,
                     <GIC_SPI 1 IRQ_TYPE_LEVEL_HIGH>,
                     <GIC_SPI 41 IRQ_TYPE_LEVEL_HIGH>,
                     <GIC_SPI 42 IRQ_TYPE_LEVEL_HIGH>,
                     <GIC_SPI 121 IRQ_TYPE_LEVEL_HIGH>,
                     <GIC_SPI 122 IRQ_TYPE_LEVEL_HIGH>;
        clocks = <&tegra_car 5>;
    };


    timer {
        compatible = "arm,armv7-timer";
        interrupts = <GIC_PPI 13
                (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>,
                 <GIC_PPI 14
                (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>,
                 <GIC_PPI 11
                (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>,
                 <GIC_PPI 10
                (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>;
    };


Can you tell if the output below looks OK?

Timer List Version: v0.8
HRTIMER_MAX_CLOCK_BASES: 4
now at 145981795048 nsecs

cpu: 0
 clock 0:
  .base:       ceec8980
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <ceec8c50>, tick_sched_timer, S:01, tick_nohz_restart, swapper/0/0
 # expires at 145982000000-145982000000 nsecs [in 204952 to 204952 nsecs]
 #1: <cdc25e90>, hrtimer_wakeup, S:01, schedule_hrtimeout_range_clock, bus_10ms/40
 # expires at 145982000000-145982000001 nsecs [in 204952 to 204953 nsecs]
 #2: def_rt_bandwidth, sched_rt_period_timer, S:01, enqueue_task_rt, ktimersoftd/0/4
 # expires at 146026000000-146026000000 nsecs [in 44204952 to 44204952 nsecs]
 #3: <ceec8d70>, watchdog_timer_fn, S:01, watchdog_enable, watchdog/0/14
 # expires at 148032000000-148032000000 nsecs [in 2050204952 to 2050204952 nsecs]
 #4: sched_clock_timer, sched_clock_poll, S:01, sched_clock_postinit, swapper/0/0
 # expires at 4398046511096-4398046511096 nsecs [in 4252064716048 to 4252064716048 nsecs]
 clock 1:
  .base:       ceec89c0
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     0 nsecs
active timers:
 clock 2:
  .base:       ceec8a00
  .index:      2
  .resolution: 1 nsecs
  .get_time:   ktime_get_boottime
  .offset:     0 nsecs
active timers:
 clock 3:
  .base:       ceec8a40
  .index:      3
  .resolution: 1 nsecs
  .get_time:   ktime_get_clocktai
  .offset:     0 nsecs
active timers:
  .expires_next   : 145985000000 nsecs
  .hres_active    : 1  .idle_jiffies   : 4294667614
  .idle_calls     : 4
  .idle_sleeps    : 2
  .idle_entrytime : 319965499 nsecs
  .idle_waketime  : 319965499 nsecs
  .idle_exittime  : 319985583 nsecs
  .idle_sleeptime : 1838332 nsecs
  .iowait_sleeptime: 0 nsecs
  .last_jiffies   : 4294667615
  .next_timer     : 329000000
  .idle_expires   : 329000000 nsecs
jiffies: 4294813280

Tick Device: mode:     1
Broadcast device
Clock Event Device: timer0
 max_delta_ns:   536870948001
 min_delta_ns:   1001
 mult:           4294967
 shift:          32
 mode:           1
 next_event:     9223372036854775807 nsecs
 set_next_event: tegra_timer_set_next_event
 shutdown: tegra_timer_shutdown
 periodic: tegra_timer_set_periodic
 oneshot:  tegra_timer_shutdown
 resume:   tegra_timer_shutdown
 event_handler:  tick_handle_oneshot_broadcast
 retries:        0

tick_broadcast_mask: 0
tick_broadcast_oneshot_mask: 0

Tick Device: mode:     1
Per CPU device: 0
Clock Event Device: arch_sys_timer
 max_delta_ns:   178956969028
 min_delta_ns:   1250
 mult:           51539608
 shift:          32
 mode:           3
 next_event:     145985000000 nsecs
 set_next_event: arch_timer_set_next_event_virt
 shutdown: arch_timer_shutdown_virt
 oneshot stopped: arch_timer_shutdown_virt
 event_handler:  hrtimer_interrupt
 retries:        4


  .nr_events      : 145926
  .nr_retries     : 1
  .nr_hangs       : 0
  .max_hang_time  : 0
  .nohz_mode      : 2
  .last_tick      : 319000000 nsecs
  .tick_stopped   : 0



Ralf Ramsauer

unread,
Feb 23, 2017, 2:56:56 PM2/23/17
to Dan Zach, Henning Schild, Jailhouse
Hi Dan,

On 02/23/2017 10:54 AM, Dan Zach wrote:
> So /proc/timer_list is below.
> In the dts configured both: the ARM internal timer and the SoC Tegra
> timer(dont think it actually works though):
>
> timer@0,60005000 {
> compatible = "nvidia,tegra124-timer",
> "nvidia,tegra20-timer";
> reg = <0x60005000 0x400>;
> interrupts = <GIC_SPI 0 IRQ_TYPE_LEVEL_HIGH>,
> <GIC_SPI 1 IRQ_TYPE_LEVEL_HIGH>,
> <GIC_SPI 41 IRQ_TYPE_LEVEL_HIGH>,
> <GIC_SPI 42 IRQ_TYPE_LEVEL_HIGH>,
> <GIC_SPI 121 IRQ_TYPE_LEVEL_HIGH>,
> <GIC_SPI 122 IRQ_TYPE_LEVEL_HIGH>;
> clocks = <&tegra_car 5>;
> };
Don't use this one. It requires a clock to be gated when being enabled,
so you would also need to root-share the whole Tegra CAR (which doesn't
work). Partitioning of clock controllers is currently not supported by
Jailhouse.

I've developed a paravirtual clock-and-reset (CAR) controller for
Jailhouse, and we already had some off-list discussions, as we will
definitely have to address these issues in the future, but for the
moment clock gating is not supported.
>
>
> timer {
> compatible = "arm,armv7-timer";
> interrupts = <GIC_PPI 13
> (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>,
> <GIC_PPI 14
> (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>,
> <GIC_PPI 11
> (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>,
> <GIC_PPI 10
> (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>;
> };
Yep, use this timer. Should be absolutely sufficient.
>
>
> Can you tell if below looks ok?
Ok, so this is the timer list of the non-root RT Linux, right?

Does your non-root RT Linux contain this [1] patch?

Ralf

[1]
http://git.kiszka.org/?p=linux.git;a=commitdiff;h=70671bb3cd47fe70b9a7076625e09d0411d58de9

Dan Zach

unread,
Feb 24, 2017, 6:09:23 AM2/24/17
to Ralf Ramsauer, Henning Schild, Jailhouse
Hello Ralf.

1. Using just 4.8.2 with the RT patch (without Jailhouse) on the Jetson TK1, I get a worst case of 50 µs with an average of just 5 µs.
2. On the non-root Linux (1000 Hz, with the patch applied to timer.c, thanks for that), the average delay in the same test is still 40 µs and the worst case goes up to 500 µs.

So I see a steady-state 10x factor on the inmate, even without GPU load or any stress at all.

Not sure what the cp15 exits mean here, but the number is huge (3.5K/sec):

vmexits_total                    3089789      3447
vmexits_cp15                     2277030      2446
vmexits_virt_irq                  812541      1000
vmexits_mmio                         214         0
vmexits_psci                           3         0
vmexits_management                     2         0
vmexits_hypercall                      0         0
vmexits_maintenance                    0         0
vmexits_virt_sgi                       0         0




Dan Zach

unread,
Feb 24, 2017, 7:00:26 AM2/24/17
to Ralf Ramsauer, Henning Schild, Jailhouse
The CP15 exits are due to reprogramming the ARM architected timer via cp15 on every tick.




Dan Zach

unread,
Feb 26, 2017, 3:47:15 PM2/26/17
to Ralf Ramsauer, Henning Schild, Jailhouse
I made the root cell 4.8.2-rt as well. With Jailhouse enabled, it runs the same test with typical jitter of 2-3 µs and a worst case of 35 µs.


COUNTER                              SUM   PER SEC                                                                                                                  
vmexits_total                     445327      1136
vmexits_virt_irq                  430071      1107
vmexits_mmio                       10073        15
vmexits_virt_sgi                    4552         8
vmexits_hypercall                    623         7
vmexits_management                    14         0
vmexits_cp15                           0         0
vmexits_maintenance                    0         0
vmexits_psci                           0         0



- Is the root cell handled differently in IRQ handling than the non-root?
- Why aren't there any CP15 exits, while the non-root cell, with the same kernel, shows 3.5K/sec from timer rearming?


And now the interesting part:

If the root cell itself runs a real-time 1 ms task, the non-root cell's average jitter drops to 12 µs (from 50) and the worst case to 70 µs.
How about that for a puzzle?

Jan Kiszka

unread,
Feb 27, 2017, 8:32:46 AM2/27/17
to Dan Zach, Ralf Ramsauer, Henning Schild, Jailhouse
On 2017-02-26 21:47, Dan Zach wrote:
> Made the root cell to be 4.8.2-rt as well. With Jailhouse on, it runs
> the same test with typical jitter of 2-3uS and worse case of 35uS.
>
>
> COUNTER                              SUM   PER SEC
>
> vmexits_total                     445327      1136
> vmexits_virt_irq                  430071      1107
> vmexits_mmio                       10073        15
> vmexits_virt_sgi                    4552         8
> vmexits_hypercall                    623         7
> vmexits_management                    14         0
> vmexits_cp15                           0         0
> vmexits_maintenance                    0         0
> vmexits_psci                           0         0
>
>
>
> - Is the root cell handled differently on IRQ handling then non-root?

Nope.

> - Why aren't there any CP15 exits, while the non-root cell with the same
> kernel, runs 3.5K/sec on timer rearming?

The root cell booted with a hardware-provided broadcast timer and, thus,
can switch to hires timer mode (it actually did that prior to loading
Jailhouse). The non-root cell needs the patch Ralf already pointed you to.

>
>
> And now the interesting part:
>
> If the root cell itself runs a real-time 1 ms task, the non-root cell
> average jitter drops to 12 µs (from 50) and worst case to 70 µs.
> How about that for a puzzle?

That can be caching effects (keeping a core busy with cache-bound tasks
means it can request less from RAM and, thus, cause less pressure on
that shared link), or it is related to power management (the idle root
cell may enable power savings that have side effects on other cores).
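One way to probe the power-management side of this is to keep the root cell's cores out of idle states altogether and re-run the measurement. A standard mainline kernel parameter for that experiment (not a Jailhouse-specific setting, and not meant for production):

```
cpuidle.off=1
```

Individual idle states can also be disabled at runtime via /sys/devices/system/cpu/cpuN/cpuidle/stateM/disable. If the inmate's jitter improves with idle states off, the side effect is power management rather than cache pressure.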

Jan

--
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux

Dan Zach

unread,
Feb 27, 2017, 8:48:33 AM2/27/17
to Jan Kiszka, Ralf Ramsauer, Henning Schild, Jailhouse
Thanks Jan,

How can I verify that it takes effect?
Actually I see clock_getres() still returning low resolution; it is supposed to report 1 ns when hires timers are active.

Dan

Jan Kiszka

unread,
Feb 27, 2017, 10:07:53 AM2/27/17
to Dan Zach, Ralf Ramsauer, Henning Schild, Jailhouse
On 2017-02-27 14:48, Dan Zach wrote:
> Thanks Jan,
>
> I have
> http://git.kiszka.org/?p=linux.git;a=commitdiff;h=70671bb3cd47fe70b9a7076625e09d0411d58de9
> applied at the non-root cell,
> How can i verify it takes effect?
> Actually I see clock_getres() still returns low resolution, it is
> supposed to report 1ns, in case of hires timers.

Check what /proc/timer_list reports regarding "event_handler"; it should
point to hrtimer_interrupt if the patch is working.

Dan Zach

unread,
Feb 28, 2017, 11:49:28 AM2/28/17
to Jan Kiszka, Ralf Ramsauer, Henning Schild, Jailhouse
Jan,
Running gic-demo at 1 kHz in an inmate, the jitter still reaches 200 µs worst case.


I wonder why, in this bare-metal case, the stats show no CP15 exits either:

vmexits_total                     144686       987
vmexits_virt_irq                  144683       987
vmexits_management                     2         0
vmexits_mmio                           1         0
vmexits_cp15                           0         0
vmexits_hypercall                      0         0

vmexits_maintenance                    0         0
vmexits_psci                           0         0
vmexits_virt_sgi                       0         0

Thanks


Jan Kiszka

unread,
Feb 28, 2017, 11:55:36 AM2/28/17
to Dan Zach, Ralf Ramsauer, Henning Schild, Jailhouse
On 2017-02-28 17:49, Dan Zach wrote:
> Jan,
> Running gic-demo at 1 kHz in an inmate - the jitter still reaches 200 µs
> worst case.
>
>
> I wonder why in this bare-metal case, the stats show no CP15 exits as well:
>
> vmexits_total                     144686       987
> vmexits_virt_irq                  144683       987
> vmexits_management                     2         0
> vmexits_mmio                           1         0
> vmexits_cp15                           0         0
> vmexits_hypercall                      0         0
> vmexits_maintenance                    0         0
> vmexits_psci                           0         0
>

What does /proc/timer_list look like for both cells, in comparison?
Please post it here as well.

Dan Zach

unread,
Feb 28, 2017, 3:05:51 PM2/28/17
to Jailhouse, d...@cobomind.com, ralf.r...@oth-regensburg.de, henning...@siemens.com

Jan,
Please see the attached file,
timer_list:
1. Root cell (very good jitter, ~3 µs average)
2. Non-root cell (terrible jitter, ~40 µs average)

For the bare-metal gic-demo, modified to 1 kHz, the jitter is somewhere in between.

Thank you!
Dan

timer_list

Dan Zach

unread,
Mar 3, 2017, 10:32:12 AM3/3/17
to Jailhouse
Any observations on the timer list?

Jan Kiszka

unread,
Mar 3, 2017, 10:38:27 AM3/3/17
to Dan Zach, Jailhouse
On 2017-03-03 16:32, Dan Zach wrote:
> Any observations on the timer list?
>

Nothing obvious. The only difference is virt vs. phys timer registers,
but those shouldn't make a difference. I will have to reproduce the
setup locally. My TK1 is currently out of reach, but the behaviour
should be ARMv7-generic.

Could you share your non-root RT kernel config?

Dan Zach

unread,
Mar 3, 2017, 11:19:17 AM3/3/17
to Jailhouse, d...@cobomind.com

Please find attached

config_RT_482

e.gui...@evidence.eu.com

unread,
Mar 3, 2017, 1:22:08 PM3/3/17
to Jailhouse, d...@cobomind.com
My measurements of jitter in a bare-metal cell on a Tegra X1 span between 5 µs and 20 µs, with a mean value around 11-12 µs.

Moreover, the minimum inter-arrival time between two handled interrupts is around 50 µs.

That seems like a really high value to me...


Could it be tied to too low a clock frequency?

Actually, if I execute:

sudo cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_cur_freq

before enabling Jailhouse, I get the following value:

1734000

I don't know what that means (1.7 GHz expressed in kHz? Weird...),

but it is actually the highest value you get from executing:

sudo cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_available_frequencies

Regards,
Errico Guidieri

Jan Kiszka

unread,
Mar 8, 2017, 4:48:54 AM3/8/17
to Dan Zach, Jailhouse
I was able to reproduce it on a Banana Pi, but I haven't debugged the
root cause yet. Workaround: try enabling CONFIG_ARM_LPAE in the
non-root cell kernel config.

Dan Zach

unread,
Mar 8, 2017, 11:41:18 AM3/8/17
to Jailhouse, d...@cobomind.com

Tried with CONFIG_ARM_LPAE enabled in the non-root Linux. The minimum is still around 40 µs; when the root cell is stressed, the worst case is 700 µs.

Jan Kiszka

unread,
Mar 8, 2017, 11:47:58 AM3/8/17
to Dan Zach, Jailhouse
...but the cp15 exits are gone? Then we are seeing a different issue
regarding the latencies. Tried ftrace already?

Dan Zach

unread,
Mar 8, 2017, 11:56:43 AM3/8/17
to Jailhouse, d...@cobomind.com

The CP15 exits are down to 20/sec:

vmexits_total                     200708      1019
vmexits_virt_irq                   97381      1000
vmexits_cp15                      103115        19
vmexits_mmio                         207         0
vmexits_psci                           3         0
vmexits_management                     2         0
vmexits_hypercall                      0         0
vmexits_maintenance                    0         0
vmexits_virt_sgi                       0         0

They could still be to blame for the worst case, but not for the average.
Could you please explain how LPAE reduced the CP15 exits so dramatically?

Dan Zach

unread,
Mar 8, 2017, 12:19:36 PM3/8/17
to Jailhouse, d...@cobomind.com

With the bare-metal gic-demo inmate, the jitter is min 3 / avg 10 / worst 150 µs.
So Linux is to blame. Any suggestions?

Dan Zach

unread,
Mar 14, 2017, 3:41:15 PM3/14/17
to Jailhouse, d...@cobomind.com

When the graphics are active, there is an interrupt storm from the Tegra display driver.
From /proc/interrupts:
106: 43280 0 0 GIC tegradc.1

at a rate of ~2 kHz.

When I artificially block this interrupt, after a while I see the real-time jitter behaving almost perfectly.


Can you see how an SPI interrupt storm, though directed at the root cell, might damage RT jitter in another cell?

Jan Kiszka

unread,
Mar 15, 2017, 1:38:33 AM3/15/17
to Dan Zach, Jailhouse
The interrupt storm may be just one symptom of load on shared resources.
Maybe the GPU or its driver are also issuing a high load of memory
accesses or are otherwise stressing the shared interconnects.

Looking at the hypervisor, there should be no interference between two
cells if one receives lots of interrupts. Specifically, there is only
one shared lock between both in irqchip.c (dist_lock), but it's taken
only for a very short read-modify-write access.

Gustavo Lima Chaves

unread,
Jun 23, 2017, 1:33:10 PM6/23/17
to Jailhouse, d...@cobomind.com
[...]

> >
> > When I artificially block this interrupt after a while, I see the real time jitter behaving almost perfectly.
> >
> >
> > Can you see how interrupt SPI storm ,though to the root cell, might damage RT jitter on another cell?
> >
>
> The interrupt storm may be just one symptom of load on shared resources.
> Maybe the GPU or its driver are also issuing a high load of memory
> accesses or are otherwise stressing the shared interconnects.
>
> Looking at the hypervisor, there should be no interference between two
> cells if one receives lots of interrupts. Specifically, there is only
> one shared lock between both in irqchip.c (dist_lock), but it's taken
> only for a very short read-modify-write access.
>
> Jan

What about Linux inmate cells on Intel? Has anybody else seen horrid timer resolutions (like ".resolution: 1000000 nsecs") as well? The aforementioned ARM patch makes no sense for x86's version of time.c, I guess. Should we be providing the HPET address to the inmate the same way it's done for the PM timer address? Any other ideas?

Regards,

Jan Kiszka

unread,
Jun 23, 2017, 4:36:38 PM6/23/17
to Gustavo Lima Chaves, Jailhouse, d...@cobomind.com
What's your setup exactly?

There is no need to use the slow HPET on x86 anymore, with or without
Jailhouse. We now have the local APIC as a reliable high-resolution and
high-performance (because it's core-local) timer.

And as reference clock, the PM timer is easily and cleanly exported to
inmates - that's why we do that.

Jan

Gustavo Lima Chaves

unread,
Jun 23, 2017, 5:57:14 PM6/23/17
to Jan Kiszka, Jailhouse, d...@cobomind.com
* Jan Kiszka <jan.k...@siemens.com> [2017-06-23 22:36:35 +0200]:

> On 2017-06-23 19:33, Gustavo Lima Chaves wrote:
> > [...]
> >
> >>>
> >>> When I artificially block this interrupt after a while, I see the real time jitter behaving almost perfectly.
> >>>
> >>>
> >>> Can you see how interrupt SPI storm ,though to the root cell, might damage RT jitter on another cell?
> >>>
> >>
> >> The interrupt storm may be just one symptom of load on shared resources.
> >> Maybe the GPU or its driver are also issuing a high load of memory
> >> accesses or are otherwise stressing the shared interconnects.
> >>
> >> Looking at the hypervisor, there should be no interference between two
> >> cells if one receives lots of interrupts. Specifically, there is only
> >> one shared lock between both in irqchip.c (dist_lock), but it's taken
> >> only for a very short read-modify-write access.
> >>
> >> Jan
> >
> > What about Linux inmate cells on Intel? Has anybody else seem horrid timer resolutions (like ".resolution: 1000000 nsecs") as well? The aforementioned ARM patch makes no sense for the x86's version of time.c, I guess. Should we be providing HPET address to the inmate the same way it's done for PM timer address? Any other ideas?
>
> What's your setup exactly?

Maybe it's something in my inmate Linux config? I've tried with two
codebases there: RT Linux and upstream (both with Jailhouse patches,
naturally). I'll attach one of the configs.

>
> There is no need to use the slow HPET on x86 anymore, with or without
> Jailhouse. We now have the local APIC as reliable high-resolution and
> high-performance (because it's core-local) timer.

OK, fair. Still wondering what's giving that final bad resolution. The
rest of the setup is a Fedora 25 system on QEMU, kernel 4.8, with
pristine qemu-vm and linux-x86-demo cells running.

>
> And as reference clock, the PM timer is easily and cleanly exported to
> inmates - that's why we do that.

OK, I'll keep searching the source of that strange behavior, thanks.


--
Gustavo Lima Chaves
Intel - Open Source Technology Center
yoctoproject-4.9+jh.config

Gustavo Lima Chaves

unread,
Jun 23, 2017, 8:11:30 PM6/23/17
to Jan Kiszka, Jailhouse, d...@cobomind.com
[...]

> > > What about Linux inmate cells on Intel? Has anybody else seem horrid timer resolutions (like ".resolution: 1000000 nsecs") as well? The aforementioned ARM patch makes no sense for the x86's version of time.c, I guess. Should we be providing HPET address to the inmate the same way it's done for PM timer address? Any other ideas?
> >
> > What's your setup exactly?
>
> Maybe it's something in my inmate Linux config? I've tried with two
> codebases there: RT Linux and upstream (both with Jailhouse patches,
> naturally). I'll attach one of the configs.

OK, apparently this fixes that (just as for ARM):

--- a/arch/x86/kernel/time.c
+++ b/arch/x86/kernel/time.c
@@ -85,6 +85,7 @@ static __init void x86_late_time_init(void)
 {
 	x86_init.timers.timer_init();
 	tsc_init();
+	tick_setup_hrtimer_broadcast();
 }

so that I now see

cpu: 0
 clock 0:
  .base:       ffff880003813300
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs


Henning Schild

unread,
Jun 26, 2017, 3:54:29 AM6/26/17
to Gustavo Lima Chaves, Jan Kiszka, Jailhouse, d...@cobomind.com
Am Fri, 23 Jun 2017 14:56:58 -0700
schrieb Gustavo Lima Chaves <gustavo.l...@intel.com>:
So you are running Jailhouse and the PREEMPT_RT non-root cell on QEMU?
Getting solid RT performance there might be tricky; that should not be
your starting point when looking at real time!

Henning

Gustavo Lima Chaves

unread,
Jun 26, 2017, 12:26:07 PM6/26/17
to Henning Schild, Jan Kiszka, Jailhouse, d...@cobomind.com
* Henning Schild <henning...@siemens.com> [2017-06-26 09:55:59 +0200]:
Yeah, I know QEMU is the place to get RT performance, but I have my
kernel already in place for bare-metal testing. Going forward with
those timer resolutions would be a no-go, though, which is why I wanted
to sort that out in the first place. Thanks for the reminder anyway!


Jan Kiszka

unread,
Jun 26, 2017, 12:42:15 PM6/26/17
to Gustavo Lima Chaves, Henning Schild, Jailhouse, d...@cobomind.com
If you tested inside QEMU, maybe you didn't provide the guest all CPU
features it desired. -cpu kvm64,...+arat?
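For reference, the flag goes into QEMU's -cpu option. A minimal sketch (the remaining feature flags, elided here, depend on your setup):

```
qemu-system-x86_64 -enable-kvm -cpu kvm64,+arat ...
```

ARAT ("always running APIC timer") tells the guest the local APIC timer keeps ticking in deep C-states, which lets Linux use it without a broadcast fallback.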

Gustavo Lima Chaves

unread,
Jun 26, 2017, 12:55:28 PM6/26/17
to Jan Kiszka, Henning Schild, Jailhouse, d...@cobomind.com
* Jan Kiszka <jan.k...@siemens.com> [2017-06-26 18:42:13 +0200]:
*is not* (damn)

> > kernel already in place for bare metal testing. Going forward with
> > those timer resolutions would be a no-go, though, thus why I wanted to
> > sort that out in the first place. Thanks for the reminder anyway!
>
> If you tested inside QEMU, maybe you didn't provide the guest all CPU
> features it desired. -cpu kvm64,...+arat?

Never seen this arat before, thanks! Will test now (with and without
the patch).


Gustavo Lima Chaves

unread,
Jun 26, 2017, 4:01:57 PM6/26/17
to Jan Kiszka, Henning Schild, Jailhouse, d...@cobomind.com
* Gustavo Lima Chaves <gustavo.l...@intel.com> [2017-06-26 09:55:13 -0700]:

[...]

> > > Yeah, I know QEMU is the place to get RT performance, but I have my
>
> *is not* (damn)
>
> > > kernel already in place for bare metal testing. Going forward with
> > > those timer resolutions would be a no-go, though, thus why I wanted to
> > > sort that out in the first place. Thanks for the reminder anyway!
> >
> > If you tested inside QEMU, maybe you didn't provide the guest all CPU
> > features it desired. -cpu kvm64,...+arat?
>
> Never seem this arat before, thanks! Will test now (with and without
> the patch).

Confirmed that it works (without touching the kernel). Thanks! BTW,
should we put it on README.md's QEMU lines? I have +x2apic on mine as
well, not sure if it's default or not. Maybe a virtio-9p-pci entry and
"-serial mon:stdio", since it really helps to have QEMU's serial
output multiplexed with its command terminal.

Cheers.