[RFC Patch 1/2][Bugfix][x86][hw-breakpoint] Clear reserved bits of DR6 in do_debug()

K.Prasad

Dec 26, 2009, 1:30:02 PM

fix_dr6_reserved_01

K.Prasad

Dec 26, 2009, 1:30:01 PM

fix_notify_code_02

Frederic Weisbecker

Dec 30, 2009, 6:50:02 PM

On Sat, Dec 26, 2009 at 11:57:25PM +0530, K.Prasad wrote:
> Clear the reserved bits from the stored copy of debug status register (DR6).
> This will help easy bitwise operations.
>
> Signed-off-by: K.Prasad <pra...@linux.vnet.ibm.com>
> ---
> arch/x86/include/asm/debugreg.h | 3 +++
> arch/x86/kernel/traps.c | 3 +++
> 2 files changed, 6 insertions(+)
>
> Index: linux-2.6-tip/arch/x86/include/asm/debugreg.h
> ===================================================================
> --- linux-2.6-tip.orig/arch/x86/include/asm/debugreg.h
> +++ linux-2.6-tip/arch/x86/include/asm/debugreg.h
> @@ -14,6 +14,9 @@
> which debugging register was responsible for the trap. The other bits
> are either reserved or not of interest to us. */
>
> +/* Define reserved bits in DR6 which are always set to 1 */
> +#define DR6_RESERVED (0xFFFF0FF0)
> +

The 12th bit seems to be also reserved.
Shouldn't it be 0xffff1ff0 ?

What kind of bitwise operations do you think it could help?

All of the operations I can find on dr6 are simple masks
test/set/clear.

> #define DR_TRAP0 (0x1) /* db0 */
> #define DR_TRAP1 (0x2) /* db1 */
> #define DR_TRAP2 (0x4) /* db2 */
> Index: linux-2.6-tip/arch/x86/kernel/traps.c
> ===================================================================
> --- linux-2.6-tip.orig/arch/x86/kernel/traps.c
> +++ linux-2.6-tip/arch/x86/kernel/traps.c
> @@ -534,6 +534,9 @@ dotraplinkage void __kprobes do_debug(st
>
> get_debugreg(dr6, 6);
>
> + /* Filter out all the reserved bits which are preset to 1 */
> + dr6 &= ~DR6_RESERVED;
> +
> /* Catch kmemcheck conditions first of all! */
> if ((dr6 & DR_STEP) && kmemcheck_trap(regs))
> return;
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Frederic Weisbecker

Dec 30, 2009, 7:40:01 PM

On Sat, Dec 26, 2009 at 11:58:33PM +0530, K.Prasad wrote:
> The hw-breakpoint handler will return NOTIFY_DONE for user-space breakpoints
> to generate SIGTRAP signal (and not for kernel-space addresses).
>
> Signed-off-by: K.Prasad <pra...@linux.vnet.ibm.com>
> ---
> arch/x86/kernel/hw_breakpoint.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> Index: linux-2.6-tip/arch/x86/kernel/hw_breakpoint.c
> ===================================================================
> --- linux-2.6-tip.orig/arch/x86/kernel/hw_breakpoint.c
> +++ linux-2.6-tip/arch/x86/kernel/hw_breakpoint.c
> @@ -502,8 +502,6 @@ static int __kprobes hw_breakpoint_handl
> rcu_read_lock();
>
> bp = per_cpu(bp_per_reg[i], cpu);
> - if (bp)
> - rc = NOTIFY_DONE;
> /*
> * Reset the 'i'th TRAP bit in dr6 to denote completion of
> * exception handling
> @@ -517,6 +515,13 @@ static int __kprobes hw_breakpoint_handl
> rcu_read_unlock();
> break;
> }
> + /*
> + * Further processing in do_debug() is needed for a) user-space
> + * breakpoints (to generate signals) and b) when the system has
> + * taken exception due to multiple causes
> + */
> + if (bp->attr.bp_addr < TASK_SIZE)
> + rc = NOTIFY_DONE;
>
> perf_bp_event(bp, args->regs);
>
>

Oh and now that I see this patch, the previous one indeed makes sense
with this check:

	if (dr6 & (~DR_TRAP_BITS))
		rc = NOTIFY_DONE;

That said, it means thread.debugreg6 won't get the reserved bits anymore.
I see some use of them from kvm (it restores the reserved bits on guest<->host
switch). Not sure if this inconsistency could affect kvm...
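
To make the point about that check concrete, here is a minimal user-space sketch (not kernel code). DR6_RESERVED is taken from patch 1/2; the DR_TRAP_BITS and DR_STEP values are assumed to match debugreg.h, and has_non_trap_bits() and check_non_trap_test() are names invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DR6_RESERVED  0xFFFF0FF0u   /* from patch 1/2 */
#define DR_TRAP_BITS  0x0000000Fu   /* assumed: DR_TRAP0..DR_TRAP3 */
#define DR_STEP       0x00004000u   /* assumed: single-step, bit 14 */

/* The check quoted above: is anything besides a breakpoint hit pending? */
static int has_non_trap_bits(uint32_t dr6)
{
    return (dr6 & ~DR_TRAP_BITS) != 0;
}

/* Self-check of the claim; aborts if any expectation is wrong. */
static void check_non_trap_test(void)
{
    /* Only breakpoint 0 fired; the reserved bits read back as 1. */
    uint32_t raw = 0xFFFF0FF1u;
    /* Breakpoint 0 fired while single-stepping. */
    uint32_t raw_step = raw | DR_STEP;

    /* On the raw value the reserved bits make the test fire anyway. */
    assert(has_non_trap_bits(raw));

    /* After stripping the reserved bits the answer is meaningful. */
    assert(!has_non_trap_bits(raw & ~DR6_RESERVED));
    assert(has_non_trap_bits(raw_step & ~DR6_RESERVED));
}
```

On a raw DR6 value the test fires even when only a breakpoint hit is pending, because the reserved bits read as 1. After masking, it fires only when some other status bit such as DR_STEP is set.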

Frederic Weisbecker

Dec 30, 2009, 7:40:02 PM

Looks good. This stops any further checks though. This
is fine in case we have the DR_STEP bit set, as we will
switch to NOTIFY_DONE. But what about the vm86 case? Is it
possible we might still have something to handle on this
side? (Although I don't quite understand how we can have
a breakpoint triggered in vm86 mode.)

Frederic Weisbecker

Dec 30, 2009, 7:50:01 PM

On Sat, Dec 26, 2009 at 11:58:33PM +0530, K.Prasad wrote:
> + /*
> + * Further processing in do_debug() is needed for a) user-space
> + * breakpoints (to generate signals) and b) when the system has
> + * taken exception due to multiple causes
> + */
> + if (bp->attr.bp_addr < TASK_SIZE)
> + rc = NOTIFY_DONE;

BTW, I'm not sure this is the right way to check if we want to send
a signal or not. Although it's not yet supported, we'll probably
bring the support for userspace breakpoints by perf.

Maybe we should put a "ptrace" flag in struct hw_perf_event instead.

K.Prasad

Dec 31, 2009, 2:00:01 PM

On Thu, Dec 31, 2009 at 12:45:00AM +0100, Frederic Weisbecker wrote:
> On Sat, Dec 26, 2009 at 11:57:25PM +0530, K.Prasad wrote:
> > Clear the reserved bits from the stored copy of debug status register (DR6).
> > This will help easy bitwise operations.
> >
> > Signed-off-by: K.Prasad <pra...@linux.vnet.ibm.com>
> > ---
> > arch/x86/include/asm/debugreg.h | 3 +++
> > arch/x86/kernel/traps.c | 3 +++
> > 2 files changed, 6 insertions(+)
> >
> > Index: linux-2.6-tip/arch/x86/include/asm/debugreg.h
> > ===================================================================
> > --- linux-2.6-tip.orig/arch/x86/include/asm/debugreg.h
> > +++ linux-2.6-tip/arch/x86/include/asm/debugreg.h
> > @@ -14,6 +14,9 @@
> > which debugging register was responsible for the trap. The other bits
> > are either reserved or not of interest to us. */
> >
> > +/* Define reserved bits in DR6 which are always set to 1 */
> > +#define DR6_RESERVED (0xFFFF0FF0)
> > +
>
>
> The 12th bit seems to be also reserved.
> Shouldn't it be 0xffff1ff0 ?
>

The 12th bit is reserved to be 0 always.
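
A small sketch of the mask arithmetic behind this answer, as plain user-space C. DR6_RESERVED is the value from the patch; the DR_TRAP_BITS and DR_STEP values are assumed to match debugreg.h, and DR6_BIT12, dr6_strip_reserved() and dr6_mask_covers() are hypothetical names used only here.

```c
#include <assert.h>
#include <stdint.h>

/* Mask proposed in the patch: bits 4-11 and 16-31, which read as 1. */
#define DR6_RESERVED  0xFFFF0FF0u
/* Bit 12 reads as 0, so it is deliberately absent from the mask. */
#define DR6_BIT12     (1u << 12)

/* Status bits (values assumed to match debugreg.h). */
#define DR_TRAP_BITS  0x0000000Fu  /* B0-B3 */
#define DR_STEP       0x00004000u  /* BS, bit 14 */

/* Hypothetical helper: what "dr6 &= ~DR6_RESERVED" leaves behind. */
static uint32_t dr6_strip_reserved(uint32_t dr6)
{
    return dr6 & ~DR6_RESERVED;
}

/* Hypothetical helper: 1 if the mask covers the given bit, else 0. */
static int dr6_mask_covers(unsigned int bit)
{
    assert(bit < 32);
    return (int)((DR6_RESERVED >> bit) & 1u);
}
```

Since bit 12 already reads as 0, adding it to the mask (which would give 0xffff1ff0) would not change the result of dr6 &= ~DR6_RESERVED.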

> What kind of bitwise operations do you think it could help?
>
> All of the operations I can find on dr6 are simple masks
> test/set/clear.
>

As you found out later, this bitmask helps us in
hw_breakpoint_handler().

Thanks,
K.Prasad

K.Prasad

Dec 31, 2009, 2:10:02 PM

Can you point me to the relevant code?
Anyway will copy this to Jan Kiszka <jan.k...@web.de> to hear what
this change means to KVM...on a similar note, will be happy to be
re-assured by Roland/Oleg about the patch's harmlessness to the
user-space (ptrace/utrace).

Hi Jan,
Patch 2009122618...@in.ibm.com introduces the following change:

Patch 1/2: Clears the arch-reserved bits from debug status register.
This simplifies bitwise operations such as the check for non-trap bits in
hw_breakpoint_handler. Without it, a check using
"if (dr6 & (~DR_TRAP_BITS))" gives incorrect results because the
reserved bits are preset to 1.

Let us know if you foresee any harm from the said change to the
behaviour seen under KVM.

Thanks,
K.Prasad

Frederic Weisbecker

Jan 9, 2010, 10:20:02 PM

I see various uses of DR6_VOLATILE and DR6_FIXED_1 in arch/x86/kvm/,
DR6_FIXED_1 being the fixed unused bits in dr6. Not sure how
this patch would affect what's set there.

I'll wait for Jan's answer.

Thanks.

Frederic Weisbecker

Jan 9, 2010, 10:30:01 PM

On Fri, Jan 01, 2010 at 12:19:49AM +0530, K.Prasad wrote:
> On Thu, Dec 31, 2009 at 12:45:00AM +0100, Frederic Weisbecker wrote:
> > On Sat, Dec 26, 2009 at 11:57:25PM +0530, K.Prasad wrote:
> > > Clear the reserved bits from the stored copy of debug status register (DR6).
> > > This will help easy bitwise operations.
> > >
> > > Signed-off-by: K.Prasad <pra...@linux.vnet.ibm.com>
> > > ---
> > > arch/x86/include/asm/debugreg.h | 3 +++
> > > arch/x86/kernel/traps.c | 3 +++
> > > 2 files changed, 6 insertions(+)
> > >
> > > Index: linux-2.6-tip/arch/x86/include/asm/debugreg.h
> > > ===================================================================
> > > --- linux-2.6-tip.orig/arch/x86/include/asm/debugreg.h
> > > +++ linux-2.6-tip/arch/x86/include/asm/debugreg.h
> > > @@ -14,6 +14,9 @@
> > > which debugging register was responsible for the trap. The other bits
> > > are either reserved or not of interest to us. */
> > >
> > > +/* Define reserved bits in DR6 which are always set to 1 */
> > > +#define DR6_RESERVED (0xFFFF0FF0)
> > > +
> >
> >
> > The 12th bit seems to be also reserved.
> > Shouldn't it be 0xffff1ff0 ?
> >
>
> The 12th bit is reserved to be 0 always.

Ah, ok.

> > What kind of bitwise operations do you think it could help?
> >
> > All of the operations I can find on dr6 are simple masks
> > test/set/clear.
> >
>
> As you found out later, this bitmask helps us in
> hw_breakpoint_handler().

Yeah, ok. Just waiting for Jan's answer to be sure it has
no side effects :)

Thanks.

Jan Kiszka

Jan 11, 2010, 2:20:03 PM

You may need to synchronize me: What does the patch change, the shadow
register KVM will restore into DR6 on return to the host? Or the
register content KVM finds on guest entry?

The rules are simple: On entry, KVM assumes nothing about the register
state, just overwrites it (on demand) with the guest state. On exit, it
calls into hw_breakpoint_restore to ensure the host sees a proper state
(if required). But there is at no time an architecturally invalid state
loaded into the real register (that's basically what DR6_VOLATILE and
DR6_FIXED_1 are used for while in guest mode).
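
As a rough illustration of that last point, the sketch below forces an arbitrary value into an architecturally valid 32-bit DR6 image. The two constants are assumptions modeled on the KVM definitions named above (their exact values should be checked against the KVM headers), and dr6_make_valid() is a hypothetical helper, not a KVM function.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, modeled on the KVM definitions mentioned above. */
#define DR6_FIXED_1   0xFFFF0FF0u  /* bits that always read as 1 */
#define DR6_VOLATILE  0x0000E00Fu  /* B0-B3, BD, BS, BT */

/* Hypothetical helper: turn an arbitrary value into a valid DR6 image. */
static uint32_t dr6_make_valid(uint32_t val)
{
    /* Keep only the writable status bits, then force the fixed ones. */
    uint32_t dr6 = (val & DR6_VOLATILE) | DR6_FIXED_1;

    /* Bit 12 must stay clear and all fixed bits must be set. */
    assert((dr6 & (1u << 12)) == 0);
    assert((dr6 & DR6_FIXED_1) == DR6_FIXED_1);
    return dr6;
}
```

Under these assumptions the real register only ever sees values with the fixed bits set, regardless of whether a software copy elsewhere keeps them stripped.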

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

K.Prasad

Jan 16, 2010, 2:50:02 PM

Sorry, this mail got buried deeply in my mailbox (hence the delay).

Basically, this patch strips the reserved bits from the stored copy of DR6
to simplify checks for certain status bits (such as DR_STEP). For instance, in
order to verify if DR_STEP (Bit 14) is set we must now do
if ((DR6 & ~DR6_RESERVED) & DR_STEP) {}
or
if (DR6 & (DR_STEP | DR6_RESERVED)) {}
which is redundant.

Instead this patch would expunge all reserved bits in DR6 before checks
for various status bits (to detect the cause of exception) are made in
do_debug().

At the outset, I don't think changes in the way the value of DR6 is used
for comparison in do_debug() would affect exception handling for either
KVM's guest or host OS (given that there are no hooks for the same in
do_debug()).

> The rules are simple: On entry, KVM assumes nothing about the register
> state, just overwrites it (on demand) with the guest state. On exit, it
> calls into hw_breakpoint_restore to ensure the host sees a proper state
> (if required). But there is at no time an architecturally invalid state
> loaded into the real register (that's basically what DR6_VOLATILE and
> DR6_FIXED_1 are used for while in guest mode).
>

Such a behaviour shouldn't be affected by the above change...your
confirmation would help!

Thanks,
K.Prasad

K.Prasad

Jan 20, 2010, 1:10:01 AM

Hi Jan,
I presume that the above explanation makes the role of this
patch/bugfix clear.

Kindly let me know if you have any further queries.

Jan Kiszka

Jan 22, 2010, 4:20:01 AM

Nope. There should be really no conflicts of your optimization with kvm.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

K.Prasad

Jan 22, 2010, 4:30:03 AM

On Fri, Jan 22, 2010 at 10:14:54AM +0100, Jan Kiszka wrote:
> K.Prasad wrote:
> > On Sun, Jan 17, 2010 at 01:10:58AM +0530, K.Prasad wrote:
> >> On Mon, Jan 11, 2010 at 08:15:29PM +0100, Jan Kiszka wrote:
> >>> Frederic Weisbecker wrote:
> >>>> On Fri, Jan 01, 2010 at 12:32:17AM +0530, K.Prasad wrote:
> >>>>> On Thu, Dec 31, 2009 at 01:38:09AM +0100, Frederic Weisbecker wrote:
> >>>>>> On Sat, Dec 26, 2009 at 11:58:33PM +0530, K.Prasad wrote:
> >>>>>>> The hw-breakpoint handler will return NOTIFY_DONE for user-space breakpoints
> >>>>>>> to generate SIGTRAP signal (and not for kernel-space addresses).
> >>>>>>>

> >> Such a behaviour shouldn't be affected by the above change...your
> >> confirmation would help!
> >>
> >
> > Hi Jan,
> > I presume that the above explanation makes the role of this
> > patch/bugfix clear.
> >
> > Kindly let me know if you have any further queries.
> >
>
> Nope. There should be really no conflicts of your optimization with kvm.
>
> Jan
>
> --

Hi Jan,
Thanks for the confirmation.

Hi Frederic,
Can you pull these fixes in? (LKML references:
2009122618...@in.ibm.com and 2009122618...@in.ibm.com).

Thanks,
K.Prasad

Frederic Weisbecker

Jan 25, 2010, 5:20:02 PM

On Sat, Dec 26, 2009 at 11:58:33PM +0530, K.Prasad wrote:
> The hw-breakpoint handler will return NOTIFY_DONE for user-space breakpoints
> to generate SIGTRAP signal (and not for kernel-space addresses).

Please tell a bit more in your changelogs. It took me some time
to guess whether this is a fix or not.

And this is not a fix but an optimization because SIGTRAP
is only sent if needed.

Here is what happens in do_debug() after handling the
breakpoint:

	if (tsk->thread.debugreg6 & (DR_STEP | DR_TRAP_BITS))
		send_sigtrap(tsk, regs, error_code, si_code);

This can only happen if we took the ptrace handler path.

Also:

Is that < TASK_SIZE an accurate check? We want support for
userspace breakpoints on perf tools later, and those don't want
signals.

We do this cleanup in the beginning of the breakpoint handler:

current->thread.debugreg6 &= ~DR_TRAP_BITS;

And from ptrace.c:ptrace_triggered():

thread->debugreg6 |= (DR_TRAP0 << i);

This is called on perf_bp_event().
Instead of checking if this is a userspace thread, we should actually
check if this is a ptrace breakpoint by looking at this
in the end of hw_breakpoint_handler().

current->thread.debugreg6 & DR_TRAP_BITS

Only ptrace breakpoints require signals.
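
The flow described here can be modeled in user space roughly as follows. This is a toy sketch under stated assumptions, not the kernel handler: thread_debugreg6 stands in for current->thread.debugreg6, the SKETCH_* codes stand in for NOTIFY_DONE and NOTIFY_STOP (not their real numeric values), and enum bp_owner, sketch_ptrace_triggered() and sketch_bp_handler() are invented names.

```c
#include <assert.h>
#include <stdint.h>

#define DR_TRAP0      0x1u
#define DR_TRAP_BITS  0xFu
#define HBP_NUM       4

/* Stand-ins for the notifier return codes; not the kernel's values. */
enum { SKETCH_NOTIFY_DONE, SKETCH_NOTIFY_STOP };

/* Who owns a breakpoint slot in this toy model. */
enum bp_owner { BP_NONE, BP_PTRACE, BP_PERF };

/* Stand-in for thread.debugreg6, the DR6 view that ptrace reports. */
static uint32_t thread_debugreg6;

/* Like ptrace_triggered(): only ptrace breakpoints mark their slot. */
static void sketch_ptrace_triggered(int slot)
{
    assert(slot >= 0 && slot < HBP_NUM);
    thread_debugreg6 |= (DR_TRAP0 << slot);
}

/*
 * Toy handler. dr6 is the hardware status with the reserved bits
 * already stripped, as patch 1/2 arranges.
 */
static int sketch_bp_handler(uint32_t dr6, const enum bp_owner owner[HBP_NUM])
{
    int i;

    /* Start clean so a stale ptrace hit is not reported twice. */
    thread_debugreg6 &= ~DR_TRAP_BITS;

    for (i = 0; i < HBP_NUM; i++) {
        if (!(dr6 & (DR_TRAP0 << i)))
            continue;
        if (owner[i] == BP_PTRACE)
            sketch_ptrace_triggered(i);
        /* A perf-owned breakpoint would only record an event here. */
    }

    /* Other causes (e.g. single-step) still need further processing. */
    if (dr6 & ~DR_TRAP_BITS)
        return SKETCH_NOTIFY_DONE;
    /* Only ptrace hits need the caller to go on and send SIGTRAP. */
    if (thread_debugreg6 & DR_TRAP_BITS)
        return SKETCH_NOTIFY_DONE;
    return SKETCH_NOTIFY_STOP;
}
```

With slot 0 owned by ptrace and slot 1 by perf, a hit on slot 0 yields the "done" code (so the signal path runs), while a hit on slot 1 alone yields the "stop" code and no signal.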

Thanks.

Frederic Weisbecker

Jan 25, 2010, 5:30:02 PM

On Fri, Jan 22, 2010 at 02:51:27PM +0530, K.Prasad wrote:
> > >> Such a behaviour shouldn't be affected by the above change...your
> > >> confirmation would help!
> > >>
> > >
> > > Hi Jan,
> > > I presume that the above explanation makes the role of this
> > > patch/bugfix clear.
> > >
> > > Kindly let me know if you have any further queries.
> > >
> >
> > Nope. There should be really no conflicts of your optimization with kvm.
> >
> > Jan
> >
> > --
>
> Hi Jan,
> Thanks for the confirmation.
>
> Hi Frederic,
> Can you pull these fixes in? (LKML references:
> 2009122618...@in.ibm.com and 2009122618...@in.ibm.com).

I got a deeper look at these patches and answered with some comments.

If you address these, I can apply for .34 (because it's about optimizations
and not fixes).

Thanks.

K.Prasad

Jan 27, 2010, 5:30:02 AM

On Mon, Jan 25, 2010 at 11:21:41PM +0100, Frederic Weisbecker wrote:
> On Fri, Jan 22, 2010 at 02:51:27PM +0530, K.Prasad wrote:
> > > >> Such a behaviour shouldn't be affected by the above change...your
> > > >> confirmation would help!
> > > >>
> > > >
> > > > Hi Jan,
> > > > I presume that the above explanation makes the role of this
> > > > patch/bugfix clear.
> > > >
> > > > Kindly let me know if you have any further queries.
> > > >
> > >
> > > Nope. There should be really no conflicts of your optimization with kvm.
> > >
> > > Jan
> > >
> > > --
> >
> > Hi Jan,
> > Thanks for the confirmation.
> >
> > Hi Frederic,
> > Can you pull these fixes in? (LKML references:
> > 2009122618...@in.ibm.com and 2009122618...@in.ibm.com).
>
>
> I got a deeper look at these patches and answered with some comments.
>
> If you address these, I can apply for .34 (because it's about optimizations
> and not fixes).
>

Sure, please apply them against a version that you deem appropriate.

Thanks,
K.Prasad

K.Prasad

Jan 27, 2010, 5:30:01 AM

On Mon, Jan 25, 2010 at 11:11:04PM +0100, Frederic Weisbecker wrote:
> On Sat, Dec 26, 2009 at 11:58:33PM +0530, K.Prasad wrote:
> > The hw-breakpoint handler will return NOTIFY_DONE for user-space breakpoints
> > to generate SIGTRAP signal (and not for kernel-space addresses).
>
>
> Please tell a bit more in your changelogs. It took me some time
> to guess whether this is a fix or not.
>

Sorry about that...will add a descriptive changelog.

> And this is not a fix but an optimization because SIGTRAP
> is only sent if needed.
>
> Here is what happens in do_debug() after handling the
> breakpoint:
>
> if (tsk->thread.debugreg6 & (DR_STEP | DR_TRAP_BITS))
> send_sigtrap(tsk, regs, error_code, si_code);
>
> This can only happen if we took the ptrace handler path.
>

Agreed...signals are prevented as above...except that the notifier
semantics aren't properly used (NOTIFY_DONE vs NOTIFY_STOP).

Well, signal generation for user-space breakpoints happened
unconditionally for 'historical' reasons (guess that Alan Stern's
original patch had it that way).

We could change that into a 'ptrace-only' signal generation now.

> We do this cleanup in the beginning of the breakpoint handler:
>
> current->thread.debugreg6 &= ~DR_TRAP_BITS;
>
> And from ptrace.c:ptrace_triggered():
>
> thread->debugreg6 |= (DR_TRAP0 << i);
>
> This is called on perf_bp_event().
> Instead of checking if this is a userspace thread, we should actually
> check if this is a ptrace breakpoint by looking at this
> in the end of hw_breakpoint_handler().
>
> current->thread.debugreg6 & DR_TRAP_BITS
>
> Only ptrace breakpoints require signals.
>

Yes, this does look like a clean way to limit signals to those requests
that are interested (I was looking at round-about ways like doing a
lookup based on callback functions).

I will send the next version of the patch with the above changes.

Thanks,
K.Prasad

Frederic Weisbecker

Jan 27, 2010, 11:20:02 AM

On Wed, Jan 27, 2010 at 03:58:26PM +0530, K.Prasad wrote:
> On Mon, Jan 25, 2010 at 11:11:04PM +0100, Frederic Weisbecker wrote:
> > Is that < TASK_SIZE an accurate check? We want support for
> > userspace breakpoints on perf tools later, and those don't want
> > signals.
> >
>
> Well, signal generation for user-space breakpoints happened
> unconditionally for 'historical' reasons (guess that Alan Stern's
> original patch had it that way).
>
> We could change that into a 'ptrace-only' signal generation now.

Yeah, now that we can have multiple-purpose concurrent breakpoints,
this is necessary.

> > We do this cleanup in the beginning of the breakpoint handler:
> >
> > current->thread.debugreg6 &= ~DR_TRAP_BITS;
> >
> > And from ptrace.c:ptrace_triggered():
> >
> > thread->debugreg6 |= (DR_TRAP0 << i);
> >
> > This is called on perf_bp_event().
> > Instead of checking if this is a userspace thread, we should actually
> > check if this is a ptrace breakpoint by looking at this
> > in the end of hw_breakpoint_handler().
> >
> > current->thread.debugreg6 & DR_TRAP_BITS
> >
> > Only ptrace breakpoints require signals.
> >
>
> Yes, this does look like a clean way to limit signals to those requests
> that are interested (I was looking at round-about ways like doing a
> lookup based on callback functions).
>
> I will send the next version of the patch with the above changes.

Thanks.
