On Wed, May 30, 2012 at 05:34:43AM +0100, Al Viro wrote:
> On Wed, May 23, 2012 at 05:18:19PM -0400, Mimi Zohar wrote:
> > > 2) get_unmapped_area() probably ought to grow such a caller and
> > > I really suspect that it would've killed quite a few of them.
> > ?
> This belong at the same level as arch_mmap_check(). We insist on not
> having VMAs with address range that has certain properties. E.g. extends
> beyond the maximal user process address (TASK_SIZE, all architectures),
> overlaps a range forbidden by MMU (like on sparc boxen) *or* page at
> fixed address that must not be unmapped (e.g. page 0 at old arm systems,
> with their "IRQ vectors live at fixed small virtual address" lossage).
> Or hugepage VMA in range where huge pages are not allowed. Or page 0
> on systems where it's forbidden by security policy. Same kind of
> restriction, as far as the rest of the system is concerned.
> The obvious place for enforcing such restrictions is get_unmapped_area().
> That's what produces such VMA address ranges and validates ones that
> are explicitly given by userland (when called with MAP_FIXED).
BTW, here's what we have for callers of get_unmapped_area():
[hexagon,mips,powerpc,s390,sh,x86] arch_setup_additional_pages() - passes to
install_special_mapping(), where we'll feed it to that check.
[x86] setup_additional_pages() - ditto.
[uprobes] xol_add_vma() - ditto.
sys_mremap() - we explicitly call security_file_mmap() just downstream from
that call.
mremap_to() - we have called security_file_mmap() a bit earlier; might as
well fold it into get_unmapped_area(); note that we are calling it with
MAP_FIXED here, so get_unmapped_area() can't change the address, just accept
or reject it.
do_brk() - ditto.
do_mmap_pgoff() - we have security_file_mmap() (with non-NULL file, this
time) done shortly after get_unmapped_area()
vma_expandable() - on this codepath we don't do check, since starting
address can't change here. Might as well have done it, though.
[ia64 perfmon] a pointless wrapper around get_unmapped_area(); incidentally,
it gets calling conventions wrong - get_unmapped_area() return -E... on
failure, not 0. No check made, and AFAICS it's a bug. I've fixed the
calling conventions there (see vfs.git#for-next tip).
So we have one caller of get_unmapped_area() that legitimately doesn't have
the return value fed through security_file_mmap(). And that's expanding
mremap() trying to decide if it's allowed to grow an old vma in place.
Not the hottest path in the kernel, that...
As far as I'm concerned, the sane plan here would be
* split the damn hook into file-related and address-related parts.
* take the former outside - up from do_mmap_pgoff(), through
do_mmap() into vm_mmap()/do_shmat() and into sys_mmap_pgoff(). Caller of
do_mmap() in fs/aio.c doesn't need that - we have file == NULL there.
That's 5 callers, 3 on MMU side of things and 2 on !MMU.
* take the latter into get_unmapped_area() (on success exits).
As for the existing callers:
one in __bprm_mm_init() can go
what's left of the callers in do_mmap_pgoff() and !MMU
validate_mmap_request() can go (we'd taken all file-related checks upstream
and we'd done get_unmapped_area() already)
one in expand_downwards() should stay
ones in mremap_to(), sys_mremap() and do_brk() can go -
get_unmapped_area() nearby takes care of that
one in install_special_mapping() can go; most of the
callers of that one have address coming out of get_unmapped_area() and
the rest (tile and uml/x86 arch_setup_additional_pages(), uml_setup_stubs(),
unicore32 vectors_user_mapping(), compat && !compat_uses_vdso case in
x86 arch_setup_additional_pages()) are passing a fixed address in there and
any security policy wishing to deny that can just as well refuse to boot -
same effect. Incidentally, unicore32 probably should do an analog of
commit f9d4861 (ARM: 7294/1: vectors: use gate_vma for vectors user mapping)
and stop playing with install_special_mapping() completely.
* probably take arch_mmap_check() into expand_downwards(); we don't
want a full-blow get_unmapped_area() there (it's a fairly hot path and most
of get_unmapped_area() would be redundant here, even with MAP_FIXED), but
arch_mmap_check() would probably be more in place here than in the callers
of expand_stack().
The only question is what do we want passed to resulting two hooks. LSM
folks?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
On Wed, 2012-05-30 at 17:36 +0100, Al Viro wrote:
> The only question is what do we want passed to resulting two hooks. LSM
> folks?
Current hook:
int security_file_mmap(struct file *file, unsigned long reqprot,
unsigned long prot, unsigned long flags,
unsigned long addr, unsigned long addr_only)
Obvious easy split:
int security_file_mmap(struct file *file, unsigned long reqprot,
unsigned long prot, unsigned long flags)
int security_addr_mmap(unsigned long addr)
security_addr_mmap would be used as you described. Which means
security_file_mmap() would pretty much only be used in do_mmap_pgoff()
(or validate_mmap_request)
file:
capabilities: does not use
apparmor/smack/selinux: used to get security blobs from
file/dentry/inode
reqprot: the PROT_* requested by userspace.
prot: the actual PROT_* which will be applied (read-implies-exec is the
difference)
capabilities: does not use *prot
SMACK: does not use *prot
apparmor: only uses prot (not reqprot)
SELinux: uses prot or reqprot based on a kernel build/selinuxfs/cmdline
value. Fedora/RHEL uses reqprot, not prot. This seems dumb, but it's
what we are doing.
These are basically used to check permission to read/write/execute the
file based on PROT_READ/PROT_WRITE/PROT_EXECUTE etc. If you move this
up we won't have reqprot and prot, we'll only have reqprot. So we would
need a helper in the mm code which allow us to easily calculate the
read-implies-exec behavior. for apparmor (and less common selinux)
***flags
capabilities: does not use
SMACK: does not use
apparmor: if (!(flags & MAP_PRIVATE))
SELinux: if ((flags & MAP_TYPE) == MAP_SHARED)
So both apparmor and SELinux only use flags to know if PROT_WRITE will
actually change the backing file. PROT_WRITE is ignored if MAP_PRIVATE.
So this could be a bool called "shared" or the LSMs can just parse the
flags. Doesn't matter to me.
On Wed, May 30, 2012 at 03:42:47PM -0400, Eric Paris wrote:
> security_file_mmap() would pretty much only be used in do_mmap_pgoff()
> (or validate_mmap_request)
Callers, actually - the whole point is to lift it out of under ->mmap_sem.
The tricky part is reqprot vs. prot mess.
In all cases, we need current to be READ_IMPLIES_EXEC and prot to contain
PROT_READ for any changes to happen. Assuming that both conditions are
met, we have the following:
file == NULL:
add PROT_EXEC in !MMU case, don't do that for MMU.
file != NULL, lives on fs mounted noexec: don't modify prot
file != NULL, lives on fs mounted without noexec:
MMU - add PROT_EXEC; !MMU - add it only if we have BDI_CAP_EXEC_MAP on
that file.
AFAICS, the least painful way of dealing with that would be to have
something like
mangle_prot(struct file *file, unsigned long *prot, unsigned long flags)
that would handle that logics and call security_file_mmap(). With separate
instances in MMU and !MMU cases ;-/ However, there's an extra fun in there -
consider fs/aio.c caller of do_mmap(). For MMU case we don't really care.
But what about !MMU? Should that guy get PROT_EXEC slapped on it?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
On Wed, May 30, 2012 at 1:24 PM, Al Viro <v...@zeniv.linux.org.uk> wrote:
> On Wed, May 30, 2012 at 03:42:47PM -0400, Eric Paris wrote:
>> security_file_mmap() would pretty much only be used in do_mmap_pgoff()
>> (or validate_mmap_request)
> Callers, actually - the whole point is to lift it out of under ->mmap_sem.
> The tricky part is reqprot vs. prot mess.
See how I solved reqprot vs prot in my suggested original patch. You
have it somewhere in your inbox, I know, because you called me out on
the fact that my original email forgot to attach it ;)
It actually cleaned things up, and made the calling conventions
simpler. Just always pass in "reqprot", and have the security layer do
the trivial "calculate final prot".
Anyway, my original patch worked, but it (incorrectly) optimized away
the call to the security layer if file was NULL. But that's just a
matter of removing that check.
On Wed, 2012-05-30 at 21:24 +0100, Al Viro wrote:
> On Wed, May 30, 2012 at 03:42:47PM -0400, Eric Paris wrote:
> > security_file_mmap() would pretty much only be used in do_mmap_pgoff()
> > (or validate_mmap_request)
> Callers, actually - the whole point is to lift it out of under ->mmap_sem.
> The tricky part is reqprot vs. prot mess.
Linus already addressed it in the original patch, which I split into two
as you suggested:
733559f - "security: move security_file_mmap() and rename to security_mmap_file()" b3649e9 security: define and use the new security_mmap_addr() hook
On Wed, May 30, 2012 at 04:33:36PM -0400, Mimi Zohar wrote:
> On Wed, 2012-05-30 at 21:24 +0100, Al Viro wrote:
> > On Wed, May 30, 2012 at 03:42:47PM -0400, Eric Paris wrote:
> > > security_file_mmap() would pretty much only be used in do_mmap_pgoff()
> > > (or validate_mmap_request)
> > Callers, actually - the whole point is to lift it out of under ->mmap_sem.
> > The tricky part is reqprot vs. prot mess.
> Linus already addressed it in the original patch, which I split into two
> as you suggested:
> 733559f - "security: move security_file_mmap() and rename to security_mmap_file()" > b3649e9 security: define and use the new security_mmap_addr() hook
Ehh...
* elf_map() should just be using vm_map/vm_unmap (see vfs.git#for-next)
* that goto out; in sys_mmap_pgooff() is an instant leak - you miss
fput() that way
* again, prot handling for !MMU case differs from MMU one. Sure, we
can duplicate both, but it's getting seriously ugly ;-/
* while we are at it, checking address in validate_mmap_request() is
really odd, seeing that we have the only call to validate_mmap_request()
followed by
/* we ignore the address hint */
addr = 0;
len = PAGE_ALIGN(len);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
On Wed, May 30, 2012 at 01:28:09PM -0700, Linus Torvalds wrote:
> On Wed, May 30, 2012 at 1:24 PM, Al Viro <v...@zeniv.linux.org.uk> wrote:
> > On Wed, May 30, 2012 at 03:42:47PM -0400, Eric Paris wrote:
> >> security_file_mmap() would pretty much only be used in do_mmap_pgoff()
> >> (or validate_mmap_request)
> > Callers, actually - the whole point is to lift it out of under ->mmap_sem.
> > The tricky part is reqprot vs. prot mess.
> See how I solved reqprot vs prot in my suggested original patch. You
> have it somewhere in your inbox, I know, because you called me out on
> the fact that my original email forgot to attach it ;)
> It actually cleaned things up, and made the calling conventions
> simpler. Just always pass in "reqprot", and have the security layer do
> the trivial "calculate final prot".
If only it would be trivial ;-/ Take a look at !MMU case (or at the
description in the posting upthread if you want to avoid seeing your
breakfast one more time - the code in validate_mmap_request() is
really ugly).
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
On Wed, May 30, 2012 at 1:56 PM, Al Viro <v...@zeniv.linux.org.uk> wrote:
>> It actually cleaned things up, and made the calling conventions
>> simpler. Just always pass in "reqprot", and have the security layer do
>> the trivial "calculate final prot".
> If only it would be trivial ;-/ Take a look at !MMU case (or at the
> description in the posting upthread if you want to avoid seeing your
> breakfast one more time - the code in validate_mmap_request() is
> really ugly).
Don't bother with validate_mmap_request() for nommu. It's ugly, but it
does the same thing, and if it does something else, it's buggy anyway.
Generating 'prot' from 'reqprot' really *should* be as simple as what
I did in my patch. The fact that some places f*ck it up is their
problem - see for example mprotect (I think) that didn't take
MNT_NOEXEC into account.
Don't try to emulate those broken semantics. Just fix them.
On Wed, May 30, 2012 at 02:04:23PM -0700, Linus Torvalds wrote:
> Generating 'prot' from 'reqprot' really *should* be as simple as what
> I did in my patch. The fact that some places f*ck it up is their
> problem - see for example mprotect (I think) that didn't take
> MNT_NOEXEC into account.
> Don't try to emulate those broken semantics. Just fix them.
Actually, it's better than I thought, but not as simple as you say.
I've misread what's going on in !file case; mea culpa, they are
actually acting the same way there.
The only difference is that for file-backed ones !MMU wants
VM_MAYEXEC in that file's bdi flags (BDI_CAP_EXEC_MAP). And
that actually sounds reasonable in !MMU case.
Anyway, I've dumped the variant I've got into vfs.git@security_file_mmap;
it should be at commit f12a0fd062b1d259a0b6bc6442019e6d4c45e9f5.
On Wed, May 30, 2012 at 2:36 PM, Al Viro <v...@zeniv.linux.org.uk> wrote:
> The only difference is that for file-backed ones !MMU wants
> VM_MAYEXEC in that file's bdi flags (BDI_CAP_EXEC_MAP). And
> that actually sounds reasonable in !MMU case.
Ok, I don't think it should be strictly necessary, but I guess I don't
mind either.
> Anyway, I've dumped the variant I've got into vfs.git@security_file_mmap;
> it should be at commit f12a0fd062b1d259a0b6bc6442019e6d4c45e9f5.
> Comments?
Two small ones:
- I really don't think you should use "goto out" in
security_mmap_file(). That implies that you're exiting the function,
but in fact you're jumping to the very *meat* of the function.
So I think you should rename "out" as "no_added_exec" or something.
now seems to exist in four places. And in fact, that pretty much seems
to *be* what vm_mmap() is, at this point. Why isn't there just one
single vm_mmap() implementation, and then the callers of that?
On Wed, May 30, 2012 at 03:51:27PM -0700, Linus Torvalds wrote:
> On Wed, May 30, 2012 at 2:36 PM, Al Viro <v...@zeniv.linux.org.uk> wrote:
> > The only difference is that for file-backed ones !MMU wants
> > VM_MAYEXEC in that file's bdi flags (BDI_CAP_EXEC_MAP). ?And
> > that actually sounds reasonable in !MMU case.
> Ok, I don't think it should be strictly necessary, but I guess I don't
> mind either.
> > Anyway, I've dumped the variant I've got into vfs.git@security_file_mmap;
> > it should be at commit f12a0fd062b1d259a0b6bc6442019e6d4c45e9f5.
> > Comments?
> Two small ones:
> - I really don't think you should use "goto out" in
> security_mmap_file(). That implies that you're exiting the function,
> but in fact you're jumping to the very *meat* of the function.
> So I think you should rename "out" as "no_added_exec" or something.
FWIW, I think it's cleaner to take the whole thing into an inlined helper.
> now seems to exist in four places. And in fact, that pretty much seems
> to *be* what vm_mmap() is, at this point. Why isn't there just one
> single vm_mmap() implementation, and then the callers of that?
Umm... Not quite. The difference is that vm_mmap() takes its argument as
offset in bytes, while sys_mmap_pgoff() - in pages.
It can be reorganized a bit, though. vm_mmap() aside, there are only two
callers of do_mmap(), both passing it 0 as the last argument. So let's
lift these checks on offset into vm_mmap() and kill do_mmap() completely -
all that remains of it would be a call of do_mmap_pgoff(). And there's no
reason to put those sanity checks (now in vm_mmap()) under ->mmap_sem,
of course. At that point we *do* get 4 identical pieces of code. Let's
call that vm_mmap_pgoff() and put it (and vm_mmap()) to mm/util.c. Voila...
I've pushed that to the same place (vfs.git#security_file_mmap). Should
propagate to git.kernel.org in a few... Guys, does anybody have objections
about the way it looks?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
On Wed, May 30, 2012 at 5:28 PM, Al Viro <v...@zeniv.linux.org.uk> wrote:
> FWIW, I think it's cleaner to take the whole thing into an inlined helper.
Even better.
I notice that your inlined helper doesn't do what I did: if PROT_EXEC
is already set, stop all the stupid games. IOW, the first test in that
function could as well be
if (prot & (PROT_READ | PROT_EXEC) != PROT_READ)
return prot;
because if PROT_EXEC is already set, or if PROT_READ wasn't set, none
of the rest of the checks make any sense at all.
But that's just me being anal. It doesn't really *matter* if we end up
setting PROT_EXEC again.
> It can be reorganized a bit, though. vm_mmap() aside, there are only two
> callers of do_mmap(), both passing it 0 as the last argument. So let's
> lift these checks on offset into vm_mmap() and kill do_mmap() completely -
> all that remains of it would be a call of do_mmap_pgoff(). And there's no
> reason to put those sanity checks (now in vm_mmap()) under ->mmap_sem,
> of course. At that point we *do* get 4 identical pieces of code. Let's
> call that vm_mmap_pgoff() and put it (and vm_mmap()) to mm/util.c. Voila...
On Wed, May 30, 2012 at 05:40:54PM -0700, Linus Torvalds wrote:
> On Wed, May 30, 2012 at 5:28 PM, Al Viro <v...@zeniv.linux.org.uk> wrote:
> > FWIW, I think it's cleaner to take the whole thing into an inlined helper.
> Even better.
> I notice that your inlined helper doesn't do what I did: if PROT_EXEC
> is already set, stop all the stupid games. IOW, the first test in that
> function could as well be
> because if PROT_EXEC is already set, or if PROT_READ wasn't set, none
> of the rest of the checks make any sense at all.
Point... OK, done, pushed and the whole thing thrown into #for-next.
Probably too late for today's linux-next, but tomorrow one should pick
that. BTW, tomorrow there'll be a signal.git pull request as well,
with task_work_add() series in it.
On Thu, 2012-05-31 at 01:56 +0100, Al Viro wrote:
> On Wed, May 30, 2012 at 05:40:54PM -0700, Linus Torvalds wrote:
> > On Wed, May 30, 2012 at 5:28 PM, Al Viro <v...@zeniv.linux.org.uk> wrote:
> > > FWIW, I think it's cleaner to take the whole thing into an inlined helper.
> > Even better.
> > I notice that your inlined helper doesn't do what I did: if PROT_EXEC
> > is already set, stop all the stupid games. IOW, the first test in that
> > function could as well be
> > because if PROT_EXEC is already set, or if PROT_READ wasn't set, none
> > of the rest of the checks make any sense at all.
> Point... OK, done, pushed and the whole thing thrown into #for-next.
Patch "split cap_mmap_addr() out of cap_file_mmap()" contains the typo
'cat_mmap_addr', which is then removed in
"split ->file_mmap() into ->mmap_addr()/->mmap_file()".
> Probably too late for today's linux-next, but tomorrow one should pick
> that. BTW, tomorrow there'll be a signal.git pull request as well,
> with task_work_add() series in it.
On Thu, 31 May 2012, Al Viro wrote:
> I've pushed that to the same place (vfs.git#security_file_mmap). Should
> propagate to git.kernel.org in a few... Guys, does anybody have objections
> about the way it looks?
Nope.
Acked-by: James Morris <james.l.mor...@oracle.com>