Changes from v1:
a. Split the patch into multiple patches.
b. Fix some bugs and coding style changes based on Roland's feedback.
c. Add generic PTRACE_GETREGSET/PTRACE_SETREGSET commands through which we can
export the architecture specific regsets using the corresponding
NT_* codes that userland is already aware of. The "xstate" regset is exported
only using this interface.
thanks,
suresh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Revert "x86: ptrace and core-dump extensions for xstate"
Revert commit c741196df8a42602965f781e0f09462de81742e2. Updated patches based
on Roland's feedback follow.
Signed-off-by: Suresh Siddha <suresh....@intel.com>
LKML-Reference: <201002092025...@sbs-t61.sc.intel.com>
Signed-off-by: H. Peter Anvin <h...@zytor.com>
---
arch/x86/include/asm/i387.h | 8 +--
arch/x86/include/asm/ptrace-abi.h | 38 ------------------
arch/x86/include/asm/xsave.h | 2 -
arch/x86/kernel/i387.c | 79 -------------------------------------
arch/x86/kernel/ptrace.c | 53 +------------------------
arch/x86/kernel/xsave.c | 1 -
include/linux/elf.h | 1 -
7 files changed, 5 insertions(+), 177 deletions(-)
diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h
index 1cd5d43..ebfb8a9 100644
--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/i387.h
@@ -32,11 +32,9 @@ extern void __math_state_restore(void);
extern void init_thread_xstate(void);
extern int dump_fpu(struct pt_regs *, struct user_i387_struct *);
-extern user_regset_active_fn fpregs_active, xfpregs_active, xstateregs_active;
-extern user_regset_get_fn fpregs_get, xfpregs_get, fpregs_soft_get,
- xstateregs_get;
-extern user_regset_set_fn fpregs_set, xfpregs_set, fpregs_soft_set,
- xstateregs_set;
+extern user_regset_active_fn fpregs_active, xfpregs_active;
+extern user_regset_get_fn fpregs_get, xfpregs_get, fpregs_soft_get;
+extern user_regset_set_fn fpregs_set, xfpregs_set, fpregs_soft_set;
extern struct _fpx_sw_bytes fx_sw_reserved;
#ifdef CONFIG_IA32_EMULATION
diff --git a/arch/x86/include/asm/ptrace-abi.h b/arch/x86/include/asm/ptrace-abi.h
index 447cfb6..8672303 100644
--- a/arch/x86/include/asm/ptrace-abi.h
+++ b/arch/x86/include/asm/ptrace-abi.h
@@ -80,44 +80,6 @@
#define PTRACE_SINGLEBLOCK 33 /* resume execution until next branch */
-/*
- * Structure layout used in PTRACE_GETXSTATEREGS/PTRACE_SETXSTATEREGS is same
- * as the memory layout of xsave used by the processor (except for the bytes
- * 464..511 which can be used by the software). Size of the structure that users
- * need to use for these two interfaces can be obtained by doing:
- * cpuid_count(0xd, 0, &eax, &ptrace_xstateregs_struct_size, &ecx, &edx);
- * i.e., cpuid.(eax=0xd,ecx=0).ebx will be the size that user (debuggers etc)
- * need to use.
- *
- * And format of this structure will be like:
- * struct {
- * fxsave_bytes[0..463]
- * sw_usable_bytes[464..511]
- * xsave_hdr_bytes[512..575]
- * avx_bytes[576..831]
- * future_state etc
- * }
- *
- * Same memory layout will be used for the coredump NT_X86_XSTATE representing
- * the xstate registers.
- *
- * For now, only first 8 bytes of the sw_usable_bytes[464..467] will be used and
- * will be set to OS enabled xstate mask(which is same as the 64bit mask
- * returned by the xgetbv's xCR0). Users (analyzing core dump remotely etc)
- * can use this mask aswell as the mask saved in the xstate_hdr bytes and
- * interpret what states the processor/OS supports and what states are in
- * modified/initialized conditions for the particular process/thread.
- *
- * Also when the user modifies certain state FP/SSE/etc through this
- * PTRACE_SETXSTATEREGS, they must ensure that the xsave_hdr.xstate_bv
- * bytes[512..519] of the above memory layout are updated correspondingly.
- * i.e., for example when FP state is modified to a non-init state,
- * xsave_hdr.xstate_bv's bit 0 must be set to '1', when SSE is modified to
- * non-init state, xsave_hdr.xstate_bv's bit 1 must to be set to '1' etc..
- */
-#define PTRACE_GETXSTATEREGS 34
-#define PTRACE_SETXSTATEREGS 35
-
#ifndef __ASSEMBLY__
#include <linux/types.h>
diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index 9e26171..727acc1 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -27,11 +27,9 @@
extern unsigned int xstate_size;
extern u64 pcntxt_mask;
extern struct xsave_struct *init_xstate_buf;
-extern u64 xstate_fx_sw_bytes[6];
extern void xsave_cntxt_init(void);
extern void xsave_init(void);
-extern void update_regset_xstate_info(unsigned int size, u64 xstate_mask);
extern int init_fpu(struct task_struct *child);
extern int check_for_xstate(struct i387_fxsave_struct __user *buf,
void __user *fpstate,
diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index dffdf91..f2f8540 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -224,85 +224,6 @@ int xfpregs_set(struct task_struct *target, const struct user_regset *regset,
return ret;
}
-int xstateregs_active(struct task_struct *target,
- const struct user_regset *regset)
-{
- return (cpu_has_xsave && tsk_used_math(target)) ? xstate_size : 0;
-}
-
-int xstateregs_get(struct task_struct *target, const struct user_regset *regset,
- unsigned int pos, unsigned int count,
- void *kbuf, void __user *ubuf)
-{
- int ret;
-
- if (!cpu_has_xsave)
- return -ENODEV;
-
- ret = init_fpu(target);
- if (ret)
- return ret;
-
- /*
- * First copy the fxsave bytes 0..463
- */
- ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf,
- &target->thread.xstate->xsave, 0,
- (sizeof(struct i387_fxsave_struct) -
- sizeof(xstate_fx_sw_bytes)));
- /*
- * Copy the 48bytes defined by software
- */
- ret |= user_regset_copyout(&pos, &count, &kbuf, &ubuf,
- xstate_fx_sw_bytes,
- (sizeof(struct i387_fxsave_struct) -
- sizeof(xstate_fx_sw_bytes)),
- sizeof(struct i387_fxsave_struct));
- /*
- * Copy the rest of xstate memory layout
- */
- ret |= user_regset_copyout(&pos, &count, &kbuf, &ubuf,
- &target->thread.xstate->xsave.xsave_hdr,
- sizeof(struct i387_fxsave_struct), -1);
- return ret;
-}
-
-int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
- unsigned int pos, unsigned int count,
- const void *kbuf, const void __user *ubuf)
-{
- int ret;
- struct xsave_hdr_struct *xsave_hdr =
- ¤t->thread.xstate->xsave.xsave_hdr;
-
-
- if (!cpu_has_xsave)
- return -ENODEV;
-
- ret = init_fpu(target);
- if (ret)
- return ret;
-
- set_stopped_child_used_math(target);
-
- ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
- &target->thread.xstate->xsave, 0, -1);
-
- /*
- * mxcsr reserved bits must be masked to zero for security reasons.
- */
- target->thread.xstate->fxsave.mxcsr &= mxcsr_feature_mask;
-
- xsave_hdr->xstate_bv &= pcntxt_mask;
- /*
- * These bits must be zero.
- */
- xsave_hdr->reserved1[0] = xsave_hdr->reserved1[1] = 0;
-
-
- return ret;
-}
-
#if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION
/*
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index df9add9..017d937 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -48,7 +48,6 @@ enum x86_regset {
REGSET_FP,
REGSET_XFP,
REGSET_IOPERM64 = REGSET_XFP,
- REGSET_XSTATE,
REGSET_TLS,
REGSET_IOPERM32,
};
@@ -1233,20 +1232,6 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data)
0, sizeof(struct user_i387_struct),
datap);
- case PTRACE_GETXSTATEREGS: /* Get the child extended state. */
- return copy_regset_to_user(child,
- task_user_regset_view(current),
- REGSET_XSTATE,
- 0, xstate_size,
- datap);
-
- case PTRACE_SETXSTATEREGS: /* Set the child extended state. */
- return copy_regset_from_user(child,
- task_user_regset_view(current),
- REGSET_XSTATE,
- 0, xstate_size,
- datap);
-
#ifdef CONFIG_X86_32
case PTRACE_GETFPXREGS: /* Get the child extended FPU state. */
return copy_regset_to_user(child, &user_x86_32_view,
@@ -1576,16 +1561,6 @@ long compat_arch_ptrace(struct task_struct *child, compat_long_t request,
sizeof(struct user32_fxsr_struct),
datap);
- case PTRACE_GETXSTATEREGS: /* Get the child extended state. */
- return copy_regset_to_user(child, &user_x86_32_view,
- REGSET_XSTATE, 0, xstate_size,
- datap);
-
- case PTRACE_SETXSTATEREGS: /* Set the child extended state. */
- return copy_regset_from_user(child, &user_x86_32_view,
- REGSET_XSTATE, 0, xstate_size,
- datap);
-
case PTRACE_GET_THREAD_AREA:
case PTRACE_SET_THREAD_AREA:
#ifdef CONFIG_X86_PTRACE_BTS
@@ -1609,7 +1584,7 @@ long compat_arch_ptrace(struct task_struct *child, compat_long_t request,
#ifdef CONFIG_X86_64
-static struct user_regset x86_64_regsets[] __read_mostly = {
+static const struct user_regset x86_64_regsets[] = {
[REGSET_GENERAL] = {
.core_note_type = NT_PRSTATUS,
.n = sizeof(struct user_regs_struct) / sizeof(long),
@@ -1622,12 +1597,6 @@ static struct user_regset x86_64_regsets[] __read_mostly = {
.size = sizeof(long), .align = sizeof(long),
.active = xfpregs_active, .get = xfpregs_get, .set = xfpregs_set
},
- [REGSET_XSTATE] = {
- .core_note_type = NT_X86_XSTATE,
- .size = sizeof(long), .align = sizeof(long),
- .active = xstateregs_active, .get = xstateregs_get,
- .set = xstateregs_set
- },
[REGSET_IOPERM64] = {
.core_note_type = NT_386_IOPERM,
.n = IO_BITMAP_LONGS,
@@ -1653,7 +1622,7 @@ static const struct user_regset_view user_x86_64_view = {
#endif /* CONFIG_X86_64 */
#if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION
-static struct user_regset x86_32_regsets[] __read_mostly = {
+static const struct user_regset x86_32_regsets[] = {
[REGSET_GENERAL] = {
.core_note_type = NT_PRSTATUS,
.n = sizeof(struct user_regs_struct32) / sizeof(u32),
@@ -1672,12 +1641,6 @@ static struct user_regset x86_32_regsets[] __read_mostly = {
.size = sizeof(u32), .align = sizeof(u32),
.active = xfpregs_active, .get = xfpregs_get, .set = xfpregs_set
},
- [REGSET_XSTATE] = {
- .core_note_type = NT_X86_XSTATE,
- .size = sizeof(u32), .align = sizeof(u32),
- .active = xstateregs_active, .get = xstateregs_get,
- .set = xstateregs_set
- },
[REGSET_TLS] = {
.core_note_type = NT_386_TLS,
.n = GDT_ENTRY_TLS_ENTRIES, .bias = GDT_ENTRY_TLS_MIN,
@@ -1700,18 +1663,6 @@ static const struct user_regset_view user_x86_32_view = {
};
#endif
-u64 xstate_fx_sw_bytes[6];
-void update_regset_xstate_info(unsigned int size, u64 xstate_mask)
-{
-#ifdef CONFIG_X86_64
- x86_64_regsets[REGSET_XSTATE].n = size / sizeof(long);
-#endif
-#if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION
- x86_32_regsets[REGSET_XSTATE].n = size / sizeof(u32);
-#endif
- xstate_fx_sw_bytes[0] = xstate_mask;
-}
-
const struct user_regset_view *task_user_regset_view(struct task_struct *task)
{
#ifdef CONFIG_IA32_EMULATION
diff --git a/arch/x86/kernel/xsave.c b/arch/x86/kernel/xsave.c
index 782c3a3..c5ee17e 100644
--- a/arch/x86/kernel/xsave.c
+++ b/arch/x86/kernel/xsave.c
@@ -337,7 +337,6 @@ void __ref xsave_cntxt_init(void)
cpuid_count(0xd, 0, &eax, &ebx, &ecx, &edx);
xstate_size = ebx;
- update_regset_xstate_info(xstate_size, pcntxt_mask);
prepare_fx_sw_frame();
setup_xstate_init();
diff --git a/include/linux/elf.h b/include/linux/elf.h
index a8c4af0..0cc4d55 100644
--- a/include/linux/elf.h
+++ b/include/linux/elf.h
@@ -361,7 +361,6 @@ typedef struct elf64_shdr {
#define NT_PPC_VSX 0x102 /* PowerPC VSX registers */
#define NT_386_TLS 0x200 /* i386 TLS slots (struct user_desc) */
#define NT_386_IOPERM 0x201 /* x86 io permission bitmap (1=deny) */
-#define NT_X86_XSTATE 0x202 /* x86 extended state using xsave */
#define NT_S390_HIGH_GPRS 0x300 /* s390 upper register halves */
x86, ptrace: Regset extensions to support xstate
Add the xstate regset support which helps extend the kernel ptrace and the
core-dump interfaces to support AVX state etc.
This regset interface is designed to support all the future state that gets
supported using xsave/xrstor infrastructure.
Looking at the memory layout saved by "xsave", one can't say which state
is represented in the memory layout. This is because if a particular state is
in init state, in the xsave hdr it can be represented by bit '0'. And hence
we can't really say by the xsave header wether a state is in init state or
the state is not saved in the memory layout.
And hence the xsave memory layout available through this regset
interface uses SW usable bytes [464..511] to convey what state is represented
in the memory layout.
First 8 bytes of the sw_usable_bytes[464..467] will be set to OS enabled xstate
mask(which is same as the 64bit mask returned by the xgetbv's xCR0).
The note NT_X86_XSTATE represents the extended state information in the
core file, using the above mentioned memory layout.
Signed-off-by: Suresh Siddha <suresh....@intel.com>
LKML-Reference: <201002092025...@sbs-t61.sc.intel.com>
Signed-off-by: Hongjiu Lu <hjl....@gmail.com>
Signed-off-by: H. Peter Anvin <h...@zytor.com>
---
arch/x86/include/asm/i387.h | 12 +++++-
arch/x86/include/asm/xsave.h | 2 +
arch/x86/kernel/i387.c | 88 +++++++++++++++++++++++++++++++++++++++++-
arch/x86/kernel/ptrace.c | 29 +++++++++++++-
arch/x86/kernel/xsave.c | 1 +
include/linux/elf.h | 1 +
6 files changed, 127 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h
index ebfb8a9..da29309 100644
--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/i387.h
@@ -33,8 +33,16 @@ extern void init_thread_xstate(void);
extern int dump_fpu(struct pt_regs *, struct user_i387_struct *);
extern user_regset_active_fn fpregs_active, xfpregs_active;
-extern user_regset_get_fn fpregs_get, xfpregs_get, fpregs_soft_get;
-extern user_regset_set_fn fpregs_set, xfpregs_set, fpregs_soft_set;
+extern user_regset_get_fn fpregs_get, xfpregs_get, fpregs_soft_get,
+ xstateregs_get;
+extern user_regset_set_fn fpregs_set, xfpregs_set, fpregs_soft_set,
+ xstateregs_set;
+
+/*
+ * xstateregs_active == fpregs_active. Please refer to the comment
+ * at the definition of fpregs_active.
+ */
+#define xstateregs_active fpregs_active
extern struct _fpx_sw_bytes fx_sw_reserved;
#ifdef CONFIG_IA32_EMULATION
diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index 727acc1..9e26171 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -27,9 +27,11 @@
extern unsigned int xstate_size;
extern u64 pcntxt_mask;
extern struct xsave_struct *init_xstate_buf;
+extern u64 xstate_fx_sw_bytes[6];
extern void xsave_cntxt_init(void);
extern void xsave_init(void);
+extern void update_regset_xstate_info(unsigned int size, u64 xstate_mask);
extern int init_fpu(struct task_struct *child);
extern int check_for_xstate(struct i387_fxsave_struct __user *buf,
void __user *fpstate,
diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index f2f8540..4719bf9 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -164,6 +164,11 @@ int init_fpu(struct task_struct *tsk)
return 0;
}
+/*
+ * The xfpregs_active() routine is same as the fpregs_active() routine,
+ * as the "regset->n" for the xstate regset will be updated based on the feature
+ * capabilites supported by the xsave.
+ */
int fpregs_active(struct task_struct *target, const struct user_regset *regset)
{
return tsk_used_math(target) ? regset->n : 0;
@@ -204,8 +209,6 @@ int xfpregs_set(struct task_struct *target, const struct user_regset *regset,
if (ret)
return ret;
- set_stopped_child_used_math(target);
-
ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
&target->thread.xstate->fxsave, 0, -1);
@@ -224,6 +227,87 @@ int xfpregs_set(struct task_struct *target, const struct user_regset *regset,
return ret;
}
+int xstateregs_get(struct task_struct *target, const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ void *kbuf, void __user *ubuf)
+{
+ int ret;
+ int size = regset->n * regset->size;
+ struct xsave_hdr_struct *xsave_hdr =
+ &target->thread.xstate->xsave.xsave_hdr;
+
+ if (!cpu_has_xsave)
+ return -ENODEV;
+
+ ret = init_fpu(target);
+ if (ret)
+ return ret;
+
+ /*
+ * First copy the fxsave bytes 0..463
+ */
+ ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf,
+ &target->thread.xstate->xsave, 0,
+ offsetof(struct i387_fxsave_struct,
+ sw_reserved));
+ if (!ret)
+ /*
+ * Copy the 48bytes defined by software
+ */
+ ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf,
+ xstate_fx_sw_bytes,
+ offsetof(struct i387_fxsave_struct,
+ sw_reserved),
+ offsetof(struct xsave_struct,
+ xsave_hdr));
+ if (!ret)
+ /*
+ * Copy the rest of xstate memory layout
+ */
+ ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf,
+ xsave_hdr,
+ offsetof(struct xsave_struct,
+ xsave_hdr), size);
+ return ret;
+}
+
+int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ const void *kbuf, const void __user *ubuf)
+{
+ int ret;
+ int size = regset->n * regset->size;
+ struct xsave_hdr_struct *xsave_hdr =
+ &target->thread.xstate->xsave.xsave_hdr;
+
+
+ if (!cpu_has_xsave)
+ return -ENODEV;
+
+ ret = init_fpu(target);
+ if (ret)
+ return ret;
+
+ set_stopped_child_used_math(target);
+
+ ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
+ &target->thread.xstate->xsave, 0, size);
+
+ /*
+ * mxcsr reserved bits must be masked to zero for security reasons.
+ */
+ target->thread.xstate->fxsave.mxcsr &= mxcsr_feature_mask;
+
+ xsave_hdr->xstate_bv &= pcntxt_mask;
+ /*
+ * These bits must be zero.
+ */
+ xsave_hdr->reserved1[0] = xsave_hdr->reserved1[1] = 0;
+
+
+ return ret;
+}
+
#if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION
/*
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 017d937..5942da4 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -48,6 +48,7 @@ enum x86_regset {
REGSET_FP,
REGSET_XFP,
REGSET_IOPERM64 = REGSET_XFP,
+ REGSET_XSTATE,
REGSET_TLS,
REGSET_IOPERM32,
};
@@ -1584,7 +1585,7 @@ long compat_arch_ptrace(struct task_struct *child, compat_long_t request,
#ifdef CONFIG_X86_64
-static const struct user_regset x86_64_regsets[] = {
+static struct user_regset x86_64_regsets[] __read_mostly = {
[REGSET_GENERAL] = {
.core_note_type = NT_PRSTATUS,
.n = sizeof(struct user_regs_struct) / sizeof(long),
@@ -1597,6 +1598,12 @@ static const struct user_regset x86_64_regsets[] = {
.size = sizeof(long), .align = sizeof(long),
.active = xfpregs_active, .get = xfpregs_get, .set = xfpregs_set
},
+ [REGSET_XSTATE] = {
+ .core_note_type = NT_X86_XSTATE,
+ .size = sizeof(long), .align = sizeof(long),
+ .active = xstateregs_active, .get = xstateregs_get,
+ .set = xstateregs_set
+ },
[REGSET_IOPERM64] = {
.core_note_type = NT_386_IOPERM,
.n = IO_BITMAP_LONGS,
@@ -1622,7 +1629,7 @@ static const struct user_regset_view user_x86_64_view = {
#endif /* CONFIG_X86_64 */
#if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION
-static const struct user_regset x86_32_regsets[] = {
+static struct user_regset x86_32_regsets[] __read_mostly = {
[REGSET_GENERAL] = {
.core_note_type = NT_PRSTATUS,
.n = sizeof(struct user_regs_struct32) / sizeof(u32),
@@ -1641,6 +1648,12 @@ static const struct user_regset x86_32_regsets[] = {
.size = sizeof(u32), .align = sizeof(u32),
.active = xfpregs_active, .get = xfpregs_get, .set = xfpregs_set
},
+ [REGSET_XSTATE] = {
+ .core_note_type = NT_X86_XSTATE,
+ .size = sizeof(u32), .align = sizeof(u32),
+ .active = xstateregs_active, .get = xstateregs_get,
+ .set = xstateregs_set
+ },
[REGSET_TLS] = {
.core_note_type = NT_386_TLS,
.n = GDT_ENTRY_TLS_ENTRIES, .bias = GDT_ENTRY_TLS_MIN,
@@ -1663,6 +1676,18 @@ static const struct user_regset_view user_x86_32_view = {
};
#endif
+u64 xstate_fx_sw_bytes[6];
+void update_regset_xstate_info(unsigned int size, u64 xstate_mask)
+{
+#ifdef CONFIG_X86_64
+ x86_64_regsets[REGSET_XSTATE].n = size / sizeof(long);
+#endif
+#if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION
+ x86_32_regsets[REGSET_XSTATE].n = size / sizeof(u32);
+#endif
+ xstate_fx_sw_bytes[0] = xstate_mask;
+}
+
const struct user_regset_view *task_user_regset_view(struct task_struct *task)
{
#ifdef CONFIG_IA32_EMULATION
diff --git a/arch/x86/kernel/xsave.c b/arch/x86/kernel/xsave.c
index c5ee17e..782c3a3 100644
--- a/arch/x86/kernel/xsave.c
+++ b/arch/x86/kernel/xsave.c
@@ -337,6 +337,7 @@ void __ref xsave_cntxt_init(void)
cpuid_count(0xd, 0, &eax, &ebx, &ecx, &edx);
xstate_size = ebx;
+ update_regset_xstate_info(xstate_size, pcntxt_mask);
prepare_fx_sw_frame();
setup_xstate_init();
diff --git a/include/linux/elf.h b/include/linux/elf.h
index 0cc4d55..a8c4af0 100644
--- a/include/linux/elf.h
+++ b/include/linux/elf.h
@@ -361,6 +361,7 @@ typedef struct elf64_shdr {
#define NT_PPC_VSX 0x102 /* PowerPC VSX registers */
#define NT_386_TLS 0x200 /* i386 TLS slots (struct user_desc) */
#define NT_386_IOPERM 0x201 /* x86 io permission bitmap (1=deny) */
+#define NT_X86_XSTATE 0x202 /* x86 extended state using xsave */
x86, ptrace: Prepare regset get/set routines for user specified lengths
Prepare the x86 regset routines for the upcoming generic
PTRACE_GETREGSET/PTRACE_SETREGSET commands. These commands allow the user
to specify how much to read and hence the kernel needs to ensure that
the get/set routines of the regset don't allow the user to access more kernel
buffers than needed.
Signed-off-by: Suresh Siddha <suresh....@intel.com>
LKML-Reference: <201002092025...@sbs-t61.sc.intel.com>
Signed-off-by: H. Peter Anvin <h...@zytor.com>
---
arch/x86/kernel/i387.c | 15 ++++++++++-----
1 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index 4719bf9..23efdef 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -184,6 +184,7 @@ int xfpregs_get(struct task_struct *target, const struct user_regset *regset,
void *kbuf, void __user *ubuf)
{
int ret;
+ int size = regset->n * regset->size;
if (!cpu_has_fxsr)
return -ENODEV;
@@ -193,7 +194,7 @@ int xfpregs_get(struct task_struct *target, const struct user_regset *regset,
return ret;
return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
- &target->thread.xstate->fxsave, 0, -1);
+ &target->thread.xstate->fxsave, 0, size);
}
int xfpregs_set(struct task_struct *target, const struct user_regset *regset,
@@ -201,6 +202,7 @@ int xfpregs_set(struct task_struct *target, const struct user_regset *regset,
const void *kbuf, const void __user *ubuf)
{
int ret;
+ int size = regset->n * regset->size;
if (!cpu_has_fxsr)
return -ENODEV;
@@ -210,7 +212,7 @@ int xfpregs_set(struct task_struct *target, const struct user_regset *regset,
return ret;
ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
- &target->thread.xstate->fxsave, 0, -1);
+ &target->thread.xstate->fxsave, 0, size);
/*
* mxcsr reserved bits must be masked to zero for security reasons.
@@ -452,6 +454,7 @@ int fpregs_get(struct task_struct *target, const struct user_regset *regset,
void *kbuf, void __user *ubuf)
{
struct user_i387_ia32_struct env;
+ int size = regset->n * regset->size;
int ret;
ret = init_fpu(target);
@@ -464,7 +467,7 @@ int fpregs_get(struct task_struct *target, const struct user_regset *regset,
if (!cpu_has_fxsr) {
return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
&target->thread.xstate->fsave, 0,
- -1);
+ size);
}
if (kbuf && pos == 0 && count == sizeof(env)) {
@@ -482,6 +485,7 @@ int fpregs_set(struct task_struct *target, const struct user_regset *regset,
const void *kbuf, const void __user *ubuf)
{
struct user_i387_ia32_struct env;
+ int size = regset->n * regset->size;
int ret;
ret = init_fpu(target);
@@ -495,13 +499,14 @@ int fpregs_set(struct task_struct *target, const struct user_regset *regset,
if (!cpu_has_fxsr) {
return user_regset_copyin(&pos, &count, &kbuf, &ubuf,
- &target->thread.xstate->fsave, 0, -1);
+ &target->thread.xstate->fsave, 0,
+ size);
}
if (pos > 0 || count < sizeof(env))
convert_from_fxsr(&env, target);
- ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &env, 0, -1);
+ ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &env, 0, size);
if (!ret)
convert_to_fxsr(target, &env);
ptrace: Add support for generic PTRACE_GETREGSET/PTRACE_SETREGSET
Generic support for PTRACE_GETREGSET/PTRACE_SETREGSET commands which
export the regsets supported by each architecture using the correponding
NT_* types.
'addr' parameter for the ptrace system call encode the REGSET type and
the buffer length. 'data' parameter points to the user buffer.
ptrace(PTRACE_GETREGSET/PTRACE_SETREGSET, pid,
(NT_TYPE << 20) | buf_length, buf);
For more information on how to use this API by users like debuggers and core
dump for accessing xstate registers, etc., please refer to comments in
arch/x86/include/asm/ptrace-abi.h
Signed-off-by: Suresh Siddha <suresh....@intel.com>
LKML-Reference: <201002092025...@sbs-t61.sc.intel.com>
Acked-by: Hongjiu Lu <hjl....@gmail.com>
Signed-off-by: H. Peter Anvin <h...@zytor.com>
---
arch/x86/include/asm/ptrace-abi.h | 50 +++++++++++++++++++++++++++++++++++++
include/linux/elf.h | 25 +++++++++++++++++-
include/linux/ptrace.h | 14 ++++++++++
kernel/ptrace.c | 34 +++++++++++++++++++++++++
4 files changed, 121 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/ptrace-abi.h b/arch/x86/include/asm/ptrace-abi.h
index 8672303..965a6ce 100644
--- a/arch/x86/include/asm/ptrace-abi.h
+++ b/arch/x86/include/asm/ptrace-abi.h
@@ -139,4 +139,54 @@ struct ptrace_bts_config {
Returns number of BTS records drained.
*/
+/*
+ * The extended register state using xsave/xrstor infrastructure can be
+ * be read and set by the user using the following ptrace command.
+ *
+ * ptrace(PTRACE_GETREGSET/PTRACE_SETREGSET, pid,
+ * (NT_X86_XSTATE << 20) | xstate_buf_length, xstate_buf);
+ *
+ * The structure layout used for the xstate_buf is same as
+ * the memory layout of xsave used by the processor (except for the bytes
+ * 464..511, which can be used by the software) and hence the size
+ * of this structure varies depending on the features supported by the processor
+ * and OS. The size of the structure that users need to use can be obtained by
+ * doing:
+ * cpuid_count(0xd, 0, &eax, &ptrace_xstateregs_struct_size, &ecx, &edx);
+ * i.e., cpuid.(eax=0xd,ecx=0).ebx will be the size that user (debuggers, etc.)
+ * need to use.
+ *
+ * The format of this structure will be like:
+ * struct {
+ * fxsave_bytes[0..463]
+ * sw_usable_bytes[464..511]
+ * xsave_hdr_bytes[512..575]
+ * avx_bytes[576..831]
+ * future_state etc
+ * }
+ *
+ * The same memory layout will be used for the core-dump NT_X86_XSTATE
+ * note representing the xstate registers.
+ *
+ * For now, only the first 8 bytes of the sw_usable_bytes[464..471] will
+ * be used and will be set to OS enabled xstate mask (which is same as the
+ * 64bit mask returned by the xgetbv's xCR0). Users (analyzing core dump
+ * remotely, etc.) can use this mask as well as the mask saved in the
+ * xstate_hdr bytes and interpret what states the processor/OS supports
+ * and what states are in modified/initialized conditions for the
+ * particular process/thread.
+ *
+ * Also when the user modifies certain state FP/SSE/etc through this
+ * PTRACE_SETXSTATEREGS, they must ensure that the xsave_hdr.xstate_bv
+ * bytes[512..519] of the above memory layout are updated correspondingly.
+ * i.e., for example when FP state is modified to a non-init state,
+ * xsave_hdr.xstate_bv's bit 0 must be set to '1', when SSE is modified to
+ * non-init state, xsave_hdr.xstate_bv's bit 1 must to be set to '1', etc.
+ */
+
+/*
+ * Enable PTRACE_GETREGSET/PTRACE_SETREGSET interface
+ */
+#define PTRACE_EXPORT_REGSET
+
#endif /* _ASM_X86_PTRACE_ABI_H */
diff --git a/include/linux/elf.h b/include/linux/elf.h
index a8c4af0..5fd7996 100644
--- a/include/linux/elf.h
+++ b/include/linux/elf.h
@@ -349,7 +349,29 @@ typedef struct elf64_shdr {
#define ELF_OSABI ELFOSABI_NONE
#endif
-/* Notes used in ET_CORE */
+/*
+ * Notes used in ET_CORE. Architectures export some of the arch register sets
+ * using the corresponding note types via ptrace
+ * PTRACE_GETREGSET/PTRACE_SETREGSET requests.
+ *
+ * For example,
+ * long addr = (NT_X86_XSTATE << 20 | buf_length);
+ * ptrace(PTRACE_GETREGSET, pid, addr, buf);
+ *
+ * will obtain the x86 extended register state of the pid.
+ *
+ * On 32bit architectures, addr argument is 32bits wide and encodes the length
+ * of the buffer in the first 20 bits, followed by NT_* type encoded in the
+ * remaining 12 bits, representing an unique regset.
+ *
+ * Hence on 32bit architectures, NT_PRXFPREG (0x46e62b7f) will be seen as 0xb7f
+ * and we can't allocate 0xb7f to any other note.
+ *
+ * All the other existing NT_* types fall with in the 12 bits and we have
+ * plenty of room for more NT_* types. For now, on all architectures
+ * we use first 20 bits of the addr for the buffer size and the next 12 bits for
+ * the NT_* type representing the corresponding regset.
+ */
#define NT_PRSTATUS 1
#define NT_PRFPREG 2
#define NT_PRPSINFO 3
@@ -364,7 +386,6 @@ typedef struct elf64_shdr {
#define NT_X86_XSTATE 0x202 /* x86 extended state using xsave */
#define NT_S390_HIGH_GPRS 0x300 /* s390 upper register halves */
-
/* Note header in a PT_NOTE section */
typedef struct elf32_note {
Elf32_Word n_namesz; /* Name size */
diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 56f2d63..4522e5e 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -26,6 +26,18 @@
#define PTRACE_GETEVENTMSG 0x4201
#define PTRACE_GETSIGINFO 0x4202
#define PTRACE_SETSIGINFO 0x4203
+#define PTRACE_GETREGSET 0x4204
+#define PTRACE_SETREGSET 0x4205
+
+/*
+ * First 20 bits of the addr in the PTRACE_GETREGSET/PTRACE_SETREGSET requests
+ * are reserved for communicating the user buffer size and the remaining 12 bits
+ * are reserved for the NT_* types (defined in include/linux/elf.h) which
+ * indicate the regset that the ptrace user is interested in.
+ */
+#define PTRACE_REGSET_BUF_SIZE(addr) (addr & 0xfffff)
+#define PTRACE_REGSET_TYPE(addr) ((addr >> 20) & 0xfff)
+#define NOTE_TO_REGSET_TYPE(note) (note & 0xfff)
/* options set using PTRACE_SETOPTIONS */
#define PTRACE_O_TRACESYSGOOD 0x00000001
@@ -114,6 +126,8 @@ static inline void ptrace_unlink(struct task_struct *child)
int generic_ptrace_peekdata(struct task_struct *tsk, long addr, long data);
int generic_ptrace_pokedata(struct task_struct *tsk, long addr, long data);
+int generic_ptrace_regset(struct task_struct *child, long request, long addr,
+ long data);
/**
* task_ptrace - return %PT_* flags that apply to a task
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 23bd09c..7040208 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -22,6 +22,7 @@
#include <linux/pid_namespace.h>
#include <linux/syscalls.h>
#include <linux/uaccess.h>
+#include <linux/regset.h>
/*
@@ -573,6 +574,11 @@ int ptrace_request(struct task_struct *child, long request,
return 0;
return ptrace_resume(child, request, SIGKILL);
+#ifdef PTRACE_EXPORT_REGSET
+ case PTRACE_GETREGSET:
+ case PTRACE_SETREGSET:
+ return generic_ptrace_regset(child, request, addr, data);
+#endif
default:
break;
}
@@ -664,6 +670,34 @@ int generic_ptrace_pokedata(struct task_struct *tsk, long addr, long data)
return (copied == sizeof(data)) ? 0 : -EIO;
}
+#ifdef PTRACE_EXPORT_REGSET
+int generic_ptrace_regset(struct task_struct *child, long request, long addr,
+ long data)
+{
+ const struct user_regset_view *view = task_user_regset_view(child);
+ unsigned int buf_size = PTRACE_REGSET_BUF_SIZE(addr);
+ int setno;
+
+ for (setno = 0; setno < view->n; setno++) {
+ const struct user_regset *regset = &view->regsets[setno];
+ if (NOTE_TO_REGSET_TYPE(regset->core_note_type) !=
+ PTRACE_REGSET_TYPE(addr))
+ continue;
+
+ if (request == PTRACE_GETREGSET)
+ return copy_regset_to_user(child, view, setno, 0,
+ buf_size,
+ (void __user *) data);
+ else if (request == PTRACE_SETREGSET)
+ return copy_regset_from_user(child, view, setno, 0,
+ buf_size,
+ (void __user *) data);
+ }
+
+ return -EIO;
+}
+#endif
+
#if defined CONFIG_COMPAT
#include <linux/compat.h>
AFAIK tip/x86/ptrace is just a temporary branch containing nothing but your
patch. So I imagine you don't need a reversion, Ingo et al will just put
the new patches on a fresh replacement branch.
I'll reply to each patch individually.
Thanks,
Roland
That is my understanding too but was not sure. Wanted to mention that
they should either replace the branch or apply the reverted patch
(incase if they have issues with replacing the branch). Peter, you seem
to have applied the revert. Can you instead start with a fresh branch so
that we can just ignore the first version (and hence skip first revert
patch in this series)?
> I'll reply to each patch individually.
Thanks Roland.
Looks good modulo cosmetic nits below.
> And hence the xsave memory layout available through this regset
> interface uses SW usable bytes [464..511] to convey what state is represented
> in the memory layout.
>
> First 8 bytes of the sw_usable_bytes[464..467] will be set to OS enabled xstate
> mask(which is same as the 64bit mask returned by the xgetbv's xCR0).
I'd like to see some macros or struct or something in some user-visible
header (ptrace-abi.h I guess?) that instruct userland where to find this
software-defined word. The rest of the layout and meaning is specified by
the chip manuals and it's only a question of convenience if we want to help
userland out with those bits. But the use of the reserved-for-software
area to store a bitmask (or whatever else Linux might put there in the
future) is entirely a Linux invention that needs to be a clear and explicit
part of this userland ABI format.
> +/*
> + * The xfpregs_active() routine is same as the fpregs_active() routine,
s/xfpregs_active/xstateregs_active/
s/is same/is the same/
> @@ -204,8 +209,6 @@ int xfpregs_set(struct task_struct *targ
> if (ret)
> return ret;
>
> - set_stopped_child_used_math(target);
> -
You didn't mention this change in a comment or log entry.
It looks like it's superfluous after init_fpu. So this change
is right, but AFAICT it belongs in an entirely separate patch
and has nothing to do with xsave support.
> +int xstateregs_get(struct task_struct *target, const struct user_regset *regset,
> + unsigned int pos, unsigned int count,
> + void *kbuf, void __user *ubuf)
> +{
> + int ret;
> + int size = regset->n * regset->size;
[...]
> + /*
> + * Copy the rest of xstate memory layout
> + */
> + ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf,
> + xsave_hdr,
> + offsetof(struct xsave_struct,
> + xsave_hdr), size);
This calculation is unnecessary (though it doesn't hurt); you can just use
-1 here. The arch user_regset.get and user_regset.set functions are not
required to worry about bogus arguments. Their callers are required to
pass values that don't violate the .n, .size, and .align constraints in the
user_regset.
> +int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
> + unsigned int pos, unsigned int count,
> + const void *kbuf, const void __user *ubuf)
> +{
> + int ret;
> + int size = regset->n * regset->size;
[...]
> + ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
> + &target->thread.xstate->xsave, 0, size);
Same here.
> + [REGSET_XSTATE] = {
> + .core_note_type = NT_X86_XSTATE,
> + .size = sizeof(long), .align = sizeof(long),
[...]
> + [REGSET_XSTATE] = {
> + .core_note_type = NT_X86_XSTATE,
> + .size = sizeof(u32), .align = sizeof(u32),
Isn't the xsave format is really the same for 32-bit and 64-bit?
All its components are naturally 64 bits or larger, right?
If that's correct, I think it would be better to make the 32 and 64 version
of the regset uniform with .size and .align being sizeof(u64).
> +u64 xstate_fx_sw_bytes[6];
> +void update_regset_xstate_info(unsigned int size, u64 xstate_mask)
> +{
> +#ifdef CONFIG_X86_64
> + x86_64_regsets[REGSET_XSTATE].n = size / sizeof(long);
> +#endif
> +#if defined CONFIG_X86_32 || defined CONFIG_IA32_EMULATION
> + x86_32_regsets[REGSET_XSTATE].n = size / sizeof(u32);
> +#endif
Just change these to / sizeof(u64) accordingly.
Thanks,
Roland
Thanks,
Roland
IMHO this bit swizzling is a non-starter. The NT_* codes can use a full 32
bits. NT_PRXFPREG uses 31 bits. The comments about ignoring the high bits
for this as a special case just seem insane to me. Pass a full 32 bit word
for NT_* and a full word for the buffer size. What's so terrible about
just having an obvious and comprehensible API?
IMHO if you insist on an insane bit swizzling approach, you should mix the
buffer size bits into the request code, leaving the "addr" argument free
for the unmolested NT_* code. There is just no earthly reason that we
should suddenly impose a new and arcane constraint on what NT_* values can
be, even if there is no particular reason for future values not to be small.
> +int generic_ptrace_regset(struct task_struct *child, long request, long addr,
> + long data);
There is no need for a global function for this. It should all be static
in kernel/ptrace.c, called only by ptrace_request()/compat_ptrace_request().
> +#ifdef PTRACE_EXPORT_REGSET
> + case PTRACE_GETREGSET:
> + case PTRACE_SETREGSET:
> + return generic_ptrace_regset(child, request, addr, data);
> +#endif
Just use CONFIG_HAVE_ARCH_TRACEHOOK. That means the arch supplies
task_user_regset_view(). Any additional per-arch conditionalization
defeats the essential purpose of having fully generic ptrace requests.
> + if (NOTE_TO_REGSET_TYPE(regset->core_note_type) !=
> + PTRACE_REGSET_TYPE(addr))
> + continue;
Here is where you should validate the buf_size value. You must not pass a
size that is > regset.size * regset.n or has % regset.size != 0. The arch
user_regset.get and .set functions are not required to check for bogus
offsets or sizes.
You could either just use:
buf_size = MIN(buf_size, regset->n * regset->size);
or you could diagnose it with:
if (buf_size > regset->n * regset->size)
return -EINVAL;
Possibly we might want to allow a PTRACE_GETREGSET with a buffer that is
larger than necessary. Then it would probably make sense to zero-fill the
rest of the buffer. Or else, use an API that can pass back to userland how
much we're actually filling in.
> --- tip.orig/include/linux/elf.h
> +++ tip/include/linux/elf.h
> @@ -349,7 +349,29 @@ typedef struct elf64_shdr {
> #define ELF_OSABI ELFOSABI_NONE
> #endif
>
> -/* Notes used in ET_CORE */
> +/*
> + * Notes used in ET_CORE. Architectures export some of the arch register sets
[...]
These comments don't really belong in linux/elf.h IMHO. They don't add
anything, except documenting the insanity about truncating NT_* values. I
don't think that nutty idea should survive. But regardless, everything
about it belongs alongside the macros used to construct and deconstruct the
argument word. No example should recommend using the bit-twiddling rather
than using some such macros that are made public in linux/ptrace.h.
Thanks,
Roland
Won't it be called by ptrace emulation in utrace?
--
H.J.
I guess I don't really understand the question. I'm not aware of any
patches where any such things are done anywhere except the existing
ptrace_request and compat_ptrace_request functions in kernel/ptrace.c.
Thanks,
Roland
I guess it isn't needed to be global then.
--
H.J.
> > Patches are on top of the -tip/master tree. First patch in the series
> > reverts the first version of the patch that went into the -tip tree.
>
> AFAIK tip/x86/ptrace is just a temporary branch containing nothing but your
> patch. So I imagine you don't need a reversion, Ingo et al will just put
> the new patches on a fresh replacement branch.
Yeah - and the patches your review strikes we'll remove/fix as well. And you
can pick up the patches yourself as well into your tree, if you wish to. (or
pull it from us if we are done) We can push it too with your acks - your
call.
> I'll reply to each patch individually.
Thanks!
Ingo
I guess Suresh was going to change xstateregs_set() added by this patch,
it also sets USED_MATH after init_fpu().
Oleg.
Hmm. Suresh, could you confirm these offsetof's are correct?
We are copying xstate_fx_sw_bytes array which is u64[6], but
start_pos == sizeof(i387_fxsave_struct) - padding ?
While we are here. Perhaps it would be more symmetrical to do
ret = user_regset_copyout();
if (ret)
return ret;
instead of "if (!ret)" which needs a tab, but this is up to you.
> + if (!ret)
> + /*
> + * Copy the rest of xstate memory layout
> + */
> + ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf,
> + xsave_hdr,
Another very minor nit, this is the only user of xsave_hdr, but this
var was defined at the top. Perhaps it would be a bit more clean to
have
struct xsave_struct *xsave = target->thread.xstate->xsave;
and use it in 1st and 3rd copyout?
Speaking about the first user_regset_copyout(), perhaps xsave->i387
would be more clear than xsave?
Oleg.
Well. Personally, I like Roland's suggestion more.
How about something like the patch below?
I am not sure how should we check the size, see the comment in
ptrace_regset().
What do you think?
And, I don't understand NOTE_TO_REGSET_TYPE() logic. I mean, I don't know
why we should use "->core_note_type & 0xfff".
Oleg.
--- linux-2.6.32.2/include/linux/ptrace.h~REGSET 2009-12-18 23:27:07.000000000 +0100
+++ linux-2.6.32.2/include/linux/ptrace.h 2010-02-10 14:05:31.000000000 +0100
@@ -27,6 +27,9 @@
#define PTRACE_GETSIGINFO 0x4202
#define PTRACE_SETSIGINFO 0x4203
+#define PTRACE_GETREGSET 0x4204
+#define PTRACE_SETREGSET 0x4205
+
/* options set using PTRACE_SETOPTIONS */
#define PTRACE_O_TRACESYSGOOD 0x00000001
#define PTRACE_O_TRACEFORK 0x00000002
--- linux-2.6.32.2/kernel/ptrace.c~REGSET 2009-12-18 23:27:07.000000000 +0100
+++ linux-2.6.32.2/kernel/ptrace.c 2010-02-10 14:08:12.000000000 +0100
@@ -22,7 +22,7 @@
#include <linux/pid_namespace.h>
#include <linux/syscalls.h>
#include <linux/uaccess.h>
-
+#include <linux/regset.h>
/*
* ptrace a task: make the debugger its new parent and
@@ -397,6 +397,50 @@ int ptrace_writedata(struct task_struct
return copied;
}
+static const struct user_regset *
+find_regset(const struct user_regset_view *view, unsigned int type)
+{
+ const struct user_regset *regset;
+ int n;
+
+ for (n = 0; n < view->n; ++n) {
+ regset = view->regsets + n;
+ if (regset->core_note_type == type)
+ return regset;
+ }
+
+ return NULL;
+}
+
+static int ptrace_regset(struct task_struct *task, int req, unsigned int type,
+ struct iovec *uiov)
+{
+ const struct user_regset_view *view = task_user_regset_view(task);
+ const struct user_regset *regset = find_regset(view, type);
+ struct iovec kiov;
+
+ if (!regset)
+ return -EIO;
+
+ if (copy_from_user(&kiov, uiov, sizeof kiov))
+ return -EFAULT;
+
+ // I am not sure. Afaics it is OK to pass the
+ // size which is less than n * size. If iov_len
+ // is bigger, we can silently truncate it, or
+ // even write the correct value back.
+
+ if (kiov.iov_len != regset->n * regset->size)
+ return -EINVAL;
+
+ if (req == PTRACE_GETREGSET)
+ return copy_regset_to_user(task, view, type, 0,
+ kiov.iov_len, kiov.iov_base);
+ else
+ return copy_regset_from_user(task, view, type, 0,
+ kiov.iov_len, kiov.iov_base);
+}
+
static int ptrace_setoptions(struct task_struct *child, long data)
{
child->ptrace &= ~PT_TRACE_MASK;
@@ -525,6 +569,10 @@ int ptrace_request(struct task_struct *c
case PTRACE_POKEDATA:
return generic_ptrace_pokedata(child, addr, data);
+ case PTRACE_GETREGSET:
+ case PTRACE_SETREGSET:
+ return ptrace_regset(child, request, addr, (void*)data);
+
#ifdef PTRACE_OLDSETOPTIONS
case PTRACE_OLDSETOPTIONS:
#endif
data += *pos + start_pos;
?
Currently this doesn't matter, start_pos == 0 always. In fact I don't
understand why do we need this arg. In unlikely case the caller wants
to copy the part of *data, it can adjust data/end_pos itself.
Oleg.
Ah, I seem to understand. start_pos is not the offset inside *data as
I thought. It is needed to compensate the "*pos += copy" addition which
was done by the previous user_regset_copyout().
This means that xstateregs_get() is right, it copies xstate_fx_sw_bytes
but uses sizeof(i387_fxsave_struct) as start_pos.
So tricky... I must admit, I don't understand the point.
Sorry for noise. Now I see this should be correct, see another email I sent.
In fact, unless I missed something again, the 2nd and 3rd _copyout's could
use pos as start_pos instead of offsetof(what_we_already_copied).
No, you should never do that. start_pos and end_pos should be constants.
Thanks,
Roland
AFAIK there is no tree of mine that anyone habitually pulls from as part of
a regular merge procedure. Since the x86 patches have to go through your
approval, and these ones are coming from someone else's postings (Suresh),
I'm not sure how I would or should fit into the chain. If you would like
me to use a git branch for x86 things (ptrace things? x86 ptrace things?)
that have my ACK and I think should be merged, I'm glad to do that.
Thanks,
Roland
Since it's just two words, most places handling struct iovec seem to just
use two get_user() calls, which is probably more efficient.
Also, here is where this function would need to be split in half for
compat_ptrace_request() calls where it has to use struct compat_iovec.
> + // I am not sure. Afaics it is OK to pass the
> + // size which is less than n * size. If iov_len
> + // is bigger, we can silently truncate it, or
> + // even write the correct value back.
Modifying iov_len to report how much we accessed seems good to me. If we
do that, we should certainly allow a larger size for get, so userland can
just use a generic large buffer before even knowing the real size. I'm not
sure whether we should allow a smaller size, especially for set. I could
go either way.
Thanks,
Roland
We're happy to carry the patches as long as it's okay with you. We were
mostly wondering if you preferred any other kind of workflow.
-hpa
Allowing a larger size for get seems very sane. Allowing a smaller size
would be ok iff we make sure we handle corner cases right (i.e. a
partially overwritten subregister.)
-hpa
It should enforce the static constraints of the user_regset, which include
a starting position that's 0 % .align (if we had non-zero starting position
in the request) and a size that's 0 % .size. The arch code sets those and
then is not obliged to handle requests outside those constraints. It's
already documented that the arch code is obliged to handle any partial
regset access that meets them.
The usual thing is to set .size and .align to the size of a register.
This is exactly so that the arch code does not need to add gratuitous
corner case complications such as "partially overwritten subregister".
Thanks,
Roland
When I write patches myself, I always use git, so it is always easy and
most convenient for me to have my own branches pulled directly just to
save on 'git rebase' churn later.
As to a regular workflow of using a well-known branch of mine, I think it
makes sense for me to maintain a branch if there is an area where things
are presumptively merged just on my approval and I'm the one who has to be
lobbied to revert them. I have not heretofore acquired the sensation that
this was the case with any given area of code.
Thanks,
Roland
LOL, okay :)
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.