[PATCH] kernel: make /proc/kallsyms mode 400 to reduce ease of attacking

Marcus Meissner

Nov 4, 2010, 6:10:02 AM

Hi,

Making /proc/kallsyms readable only for root makes it harder
for attackers to write generic kernel exploits by removing
one source of knowledge where things are in the kernel.

Signed-off-by: Marcus Meissner <meis...@suse.de>
---
kernel/kallsyms.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 6f6d091..a8db257 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -546,7 +546,7 @@ static const struct file_operations kallsyms_operations = {
 
 static int __init kallsyms_init(void)
 {
-	proc_create("kallsyms", 0444, NULL, &kallsyms_operations);
+	proc_create("kallsyms", 0400, NULL, &kallsyms_operations);
 	return 0;
 }
 device_initcall(kallsyms_init);
--
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
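
For readers less familiar with procfs modes: the second argument to proc_create() is an ordinary Unix permission mode, so 0444 grants read access to owner, group and others, while 0400 leaves only the owner (root for this entry). The user-space sketch below is not part of the patch; the helper names mode_world_readable() and path_world_readable() are made up for illustration, and the code only inspects the mode bits that stat(2) reports.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* True if the "other" read bit is set, i.e. any unprivileged user may read. */
static bool mode_world_readable(mode_t mode)
{
	return (mode & S_IROTH) != 0;
}

/*
 * Hypothetical helper: reports whether the file at `path` is world-readable.
 * Returns 1 if it is, 0 if it is not, and -1 if stat() fails.
 */
static int path_world_readable(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	return mode_world_readable(st.st_mode) ? 1 : 0;
}
```

Assuming the patch behaves as described, path_world_readable("/proc/kallsyms") would return 1 on an unpatched kernel and 0 on a patched one.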

Tejun Heo

Nov 4, 2010, 6:20:01 AM

On 11/04/2010 11:09 AM, Marcus Meissner wrote:
> Making /proc/kallsyms readable only for root makes it harder
> for attackers to write generic kernel exploits by removing
> one source of knowledge where things are in the kernel.
>
> Signed-off-by: Marcus Meissner <meis...@suse.de>

I can't recall needing /proc/kallsyms when I wasn't root, so unless
there's a compelling use case.

Acked-by: Tejun Heo <t...@kernel.org>

Thanks.

--
tejun

Ingo Molnar

Nov 4, 2010, 7:50:01 AM

* Marcus Meissner <meis...@suse.de> wrote:

> Hi,
>
> Making /proc/kallsyms readable only for root makes it harder for attackers to
> write generic kernel exploits by removing one source of knowledge where things are
> in the kernel.

Cc:-ed Linus - i think he argued in favor of such a patch in the past.

I generally agree with such patches (i have written some myself), but there's a few
questions with this one, which make this limited change ineffective and which make
it harder to implement a fuller patch that makes it truly harder to figure out the
precise kernel build:

- The real security obstruction effect is very small from this measure alone: the
overwhelming majority of our users are running distro kernels, so the System.map
file (and hence 99% of /proc/kallsyms content) is well-known - unless we also
restrict 'uname -r' from nonprivileged user-space. Hiding that might make sense -
but the two should be in one patch really.

- ( It will break a few tools that can be run as a plain user out of the box - perf
for example. "chmod a+r /proc/kallsyms" during bootup will work around that so
it's not the end of the world. )

- For self-built kernels it might make sense - but there's "chmod a-r
/proc/kallsyms" during bootup one can do already.

- There's the side-question of module symbols - those are dynamically allocated
hence arguably per system. But module symbols make up only 1% on a typical
booted up full distro box.

So what does a distribution like Suse expect from this change alone? Those have
public packages in rpms which can be downloaded by anyone, so it makes little sense
to hide it - unless _all_ version information is hidden.

So i'd like to see a _full_ version info sandboxing patch that thinks through all
the angles and restricts uname -r kernel version info as well, and makes dmesg
inaccessible to users - and closes a few other information holes as well that give
away the exact kernel version - _that_ together will make it hard to blindly attack
a very specific kernel version.

But without actually declaring and achieving that sandboxing goal this security
measure is just a feel-good thing really - and makes it harder to make more
difficult steps down the road, like closing 'uname -r' ...

I fully expect Linus to overrule me on this, but hey, i had to try it and lay out my
arguments :-)

Thanks,

Ingo

Marcus Meissner

Nov 4, 2010, 8:30:01 AM

On Thu, Nov 04, 2010 at 12:46:48PM +0100, Ingo Molnar wrote:
>
> * Marcus Meissner <meis...@suse.de> wrote:
>
> > Hi,
> >
> > Making /proc/kallsyms readable only for root makes it harder for attackers to
> > write generic kernel exploits by removing one source of knowledge where things are
> > in the kernel.
>
> Cc:-ed Linus - i think he argued in favor of such a patch in the past.
>
> I generally agree with such patches (i have written some myself), but there's a few
> questions with this one, which make this limited change ineffective and which make
> it harder to implement a fuller patch that makes it truly harder to figure out the
> precise kernel build:
>
> - The real security obstruction effect is very small from this measure alone: the
> overwhelming majority of our users are running distro kernels, so the System.map
> file (and hence 99% of /proc/kallsyms content) is well-known - unless we also
> restrict 'uname -r' from nonprivileged user-space. Hiding that might make sense -
> but the two should be in one patch really.

Of course. System.map and others also need to turn to mode 400.

> - ( It will break a few tools that can be run as a plain user out of the box - perf
> for example. "chmod a+r /proc/kallsyms" during bootup will work around that so
> it's not the end of the world. )

I was wondering how many such tools there are... I was also thinking of oprofile.

> - For self-built kernels it might make sense - but there's "chmod a-r
> /proc/kallsyms" during bootup one can do already.
>
> - There's the side-question of module symbols - those are dynamically allocated
> hence arguably per system. But module symbols make up only 1% on a typical
> booted up full distro box.
>
> So what does a distribution like Suse expect from this change alone? Those have
> public packages in rpms which can be downloaded by anyone, so it makes little sense
> to hide it - unless _all_ version information is hidden.

It is the first patch, mostly an acceptance test balloon.

There are several other files handing information out, but kallsyms has
it all very nice and ready.

(timer_list, /proc/*/stat*, sl?binfo )

> So i'd like to see a _full_ version info sandboxing patch that thinks through all
> the angles and restricts uname -r kernel version info as well, and makes dmesg
> inaccessible to users - and closes a few other information holes as well that give
> away the exact kernel version - _that_ together will make it hard to blindly attack
> a very specific kernel version.

I am personally thinking of a "small steps" philosophy, one step after the other.

> But without actually declaring and achieving that sandboxing goal this security
> measure is just a feel-good thing really - and makes it harder to make more
> difficult steps down the road, like closing 'uname -r' ...
>
> I fully expect Linus to overrule me on this, but hey, i had to try it and lay out my
> arguments :-)

The goal we (SUSE Security and the oss-security list) had in mind is:

- Do not leak kernel addresses from kernel space to user space to make
writing kernel exploits harder.

Even if attackers end up carrying lists of addresses in their exploits, it will have
made the world a bit better.

Ciao, Marcus

Ingo Molnar

Nov 4, 2010, 10:00:01 AM

* Marcus Meissner <meis...@suse.de> wrote:

> On Thu, Nov 04, 2010 at 12:46:48PM +0100, Ingo Molnar wrote:
> >
> > * Marcus Meissner <meis...@suse.de> wrote:
> >
> > > Hi,
> > >
> > > Making /proc/kallsyms readable only for root makes it harder for attackers to
> > > write generic kernel exploits by removing one source of knowledge where things are
> > > in the kernel.
> >
> > Cc:-ed Linus - i think he argued in favor of such a patch in the past.
> >
> > I generally agree with such patches (i have written some myself), but there's a few
> > questions with this one, which make this limited change ineffective and which make
> > it harder to implement a fuller patch that makes it truly harder to figure out the
> > precise kernel build:
> >
> > - The real security obstruction effect is very small from this measure alone: the
> > overwhelming majority of our users are running distro kernels, so the System.map
> > file (and hence 99% of /proc/kallsyms content) is well-known - unless we also
> > restrict 'uname -r' from nonprivileged user-space. Hiding that might make sense -
> > but the two should be in one patch really.
>
> Of course. System.map and others also need to turn to mode 400.

That is not what I meant, at all.

It's not the System.map _on the system_.

It's the SuSE or Fedora kernel rpm package with a System.map in it, which
package the attacker can download from a hundred mirrors on the internet,
based on 'uname -r' output.

You cannot obfuscate the System.map of a distro kernel without obfuscating all
identification info. (Note that even the pure size of the System.map might tell a
kernel rpm version from another ...)

Ingo

Ingo Molnar

Nov 4, 2010, 10:20:02 AM

* Ingo Molnar <mi...@elte.hu> wrote:

> * Marcus Meissner <meis...@suse.de> wrote:
>
> > On Thu, Nov 04, 2010 at 12:46:48PM +0100, Ingo Molnar wrote:
> > >
> > > * Marcus Meissner <meis...@suse.de> wrote:
> > >
> > > > Hi,
> > > >
> > > > Making /proc/kallsyms readable only for root makes it harder for attackers to
> > > > write generic kernel exploits by removing one source of knowledge where things are
> > > > in the kernel.
> > >
> > > Cc:-ed Linus - i think he argued in favor of such a patch in the past.
> > >
> > > I generally agree with such patches (i have written some myself), but there's a few
> > > questions with this one, which make this limited change ineffective and which make
> > > it harder to implement a fuller patch that makes it truly harder to figure out the
> > > precise kernel build:
> > >
> > > - The real security obstruction effect is very small from this measure alone: the
> > > overwhelming majority of our users are running distro kernels, so the System.map
> > > file (and hence 99% of /proc/kallsyms content) is well-known - unless we also
> > > restrict 'uname -r' from nonprivileged user-space. Hiding that might make sense -
> > > but the two should be in one patch really.
> >
> > Of course. System.map and others also need to turn to mode 400.
>
> That is not what I meant, at all.
>
> It's not the System.map _on the system_.
>
> It's the SuSE or Fedora kernel rpm package with a System.map in it, which package
> the attacker can download from a hundred mirrors on the internet, based on 'uname
> -r' output.

For example, on a Fedora testbox i have this version info:

$ uname -r
2.6.35.6-48.fc14.x86_64

Any attacker can download that rpm from:

http://download.fedora.redhat.com/pub/fedora/linux/updates/14/x86_64/kernel-2.6.35.6-48.fc14.x86_64.rpm

And can extract the System.map from it, using rpm2cpio and cpio -i -d. That will
include all the symbol addresses - without the attacker having any access to the
System.map or /proc/kallsyms on this particular box.

I.e. on distro kernel installations (which comprise the _vast_ majority of our
userbase) your patch brings little security benefits.

What i suggested in later parts of my mail might provide more security: to sandbox
kernel version information from unprivileged user-space - if we decide that we want
to sandbox kernel version information ...

That is a big if, because it takes a considerable amount of work. Would be worth
trying it - but feel-good non-solutions that do not bring much improvement to the
majority of users IMHO hinder such efforts.

Thanks,

Tejun Heo

Nov 4, 2010, 10:40:02 AM

Hello,

On 11/04/2010 03:33 PM, Marcus Meissner wrote:
> I mean the kernel could hide it from uname, but lsb_release,
> /etc/redhat-release, /etc/SuSE-release etc still exist and then you
> can still use the fixed address list table inside your exploit. But an
> exploit needs to have such a list, making it harder to write.

I do believe that making things more difficult to exploit helps. Many
people seem to think it only gives false sense of security tho.

> I also briefly thought about kernel ASLR, but my knowledge of the kernel
> loading is too limited to judge whether this is even possible or at all useful.

We already have relocatable kernel for kdump and IIRC it doesn't add
runtime overhead, so putting the kernel at random address shouldn't be
too difficult. Not sure how useful that would be tho.

Thanks.

--
tejun

Marcus Meissner

Nov 4, 2010, 10:40:02 AM

Hiding the OS version is really quite hard I think.

I mean the kernel could hide it from uname, but lsb_release,
/etc/redhat-release, /etc/SuSE-release etc still exist and then you
can still use the fixed address list table inside your exploit. But an
exploit needs to have such a list, making it harder to write.

If we avoid exploits being able to just do open("/boot/System.map") it would
make it a useful step harder for exploit writers.

(This will end up as an arms race between us and the exploit toolkit writers of course,
but hopefully not a longer one than fixing all actual problems ;)

I also briefly thought about kernel ASLR, but my knowledge of the kernel
loading is too limited to judge whether this is even possible or at all useful.

Ciao, Marcus

H. Peter Anvin

Nov 4, 2010, 10:50:02 AM

On 11/04/2010 10:38 AM, Tejun Heo wrote:
> Hello,
>
> On 11/04/2010 03:33 PM, Marcus Meissner wrote:
>> I mean the kernel could hide it from uname, but lsb_release,
>> /etc/redhat-release, /etc/SuSE-release etc still exist and then you
>> can still use the fixed address list table inside your exploit. But an
>> exploit needs to have such a list, making it harder to write.
>
> I do believe that making things more difficult to exploit helps. Many
> people seem to think it only gives false sense of security tho.
>
>> I also briefly thought about kernel ASLR, but my knowledge of the kernel
>> loading is too limited to judge whether this is even possible or at all useful.
>
> We already have relocatable kernel for kdump and IIRC it doesn't add
> runtime overhead, so putting the kernel at random address shouldn't be
> too difficult. Not sure how useful that would be tho.
>

It's very coarse-grained relocation, which is why it works.

-hpa

P.S. It's not just for kdump anymore.

Tejun Heo

Nov 4, 2010, 10:50:03 AM

Hello,

On 11/04/2010 03:43 PM, H. Peter Anvin wrote:
>> We already have relocatable kernel for kdump and IIRC it doesn't add
>> runtime overhead, so putting the kernel at random address shouldn't be
>> too difficult. Not sure how useful that would be tho.
>
> It's very coarse-grained relocation, which is why it works.

Yeah, I recall reading the fairly simple relocator somewhere in the
x86 tree. Would it be impossible/difficult to improve it?

> P.S. It's not just for kdump anymore.

Ah, didn't know that either.

Thanks.

--
tejun

Ingo Molnar

Nov 4, 2010, 3:10:02 PM

* Marcus Meissner <meis...@suse.de> wrote:

> > For example, on a Fedora testbox i have this version info:
> >
> > $ uname -r
> > 2.6.35.6-48.fc14.x86_64
> >
> > Any attacker can download that rpm from:
> >
> > http://download.fedora.redhat.com/pub/fedora/linux/updates/14/x86_64/kernel-2.6.35.6-48.fc14.x86_64.rpm
> >
> > And can extract the System.map from it, using rpm2cpio and cpio -i -d. That will
> > include all the symbol addresses - without the attacker having any access to the
> > System.map or /proc/kallsyms on this particular box.
> >
> > I.e. on distro kernel installations (which comprise the _vast_ majority of our
> > userbase) your patch brings little security benefits.
> >
> > What i suggested in later parts of my mail might provide more security: to
> > sandbox kernel version information from unprivileged user-space - if we decide
> > that we want to sandbox kernel version information ...
> >
> > That is a big if, because it takes a considerable amount of work. Would be worth
> > trying it - but feel-good non-solutions that do not bring much improvement to
> > the majority of users IMHO hinder such efforts.
>
> Hiding the OS version is really quite hard I think.

Yes. Hard but it would be useful - especially if we start adding things like known
exploit honeypots. Forcing attackers to probe the kernel by actually running a
kernel exploit, and risking an alarm would be a very powerful security feature.

Removing version info will upset some tools/libraries that rely on kernel version
information for quirks though.

> I mean the kernel could hide it from uname, but lsb_release, /etc/redhat-release,
> /etc/SuSE-release etc still exist and then you can still use the fixed address
> list table inside your exploit. But an exploit needs to have such a list, making
> it harder to write.
>
> If we avoid exploits being able to just do open("/boot/System.map") it would make
> it a useful step harder for exploit writers.

Dunno. It's a very low 'barrier'.

> (This will end up as an arms race between us and the exploit toolkit writers of
> course, but hopefully not a longer one than fixing all actual problems ;)

That's not really an arms race. It's more like a 'throwing a feather in the path of
a tornado' kind of defense. Sure, it has some effect.

> I also briefly thought about kernel ASLR, but my knowledge of the kernel loading
> is too limited to judge whether this is even possible or at all useful.

Now ASLR for kernel addresses would be _very_ useful. We could still 'expose' useful
debug and instrumentation info like /proc/kallsyms, but the kernel internal offset
would be a per bootup secret.

_That_ is a real statistical defensive security measure which would help everyone
and everywhere. Not hiding public info on that system and still leaving the link to
the public info (the version) available.

(Isn't such a feature available in one of the security patches? Porting that to
distros and moving it upstream would add some real defense.)
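
A minimal sketch of the arithmetic behind this idea, assuming the kernel image is shifted by one secret, per-boot offset. The slot size and slot count are invented for illustration and the function names are hypothetical; this is not a description of how any particular implementation works.

```c
#include <stdint.h>

/* Assumed values for illustration only; a real implementation may differ. */
#define SLOT_ALIGN (2ULL << 20)	/* hypothetical 2 MiB placement granularity */
#define SLOT_COUNT 512ULL	/* hypothetical number of possible slots */

/* Offset for a given slot index; the index would be chosen randomly at boot. */
static uint64_t slot_offset(uint64_t slot)
{
	return (slot % SLOT_COUNT) * SLOT_ALIGN;
}

/* Runtime address = link-time address (as in System.map) + per-boot offset. */
static uint64_t runtime_address(uint64_t link_addr, uint64_t boot_offset)
{
	return link_addr + boot_offset;
}

/* What a symbol listing could show to unprivileged users: offset removed. */
static uint64_t exposed_address(uint64_t runtime_addr, uint64_t boot_offset)
{
	return runtime_addr - boot_offset;
}
```

Under these assumptions a blind guess of the offset is right with probability 1/SLOT_COUNT, and a listing built from exposed_address() values could remain available for debugging without revealing where the kernel actually sits.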

Thanks,

Ingo

Willy Tarreau

Nov 4, 2010, 5:40:02 PM

On Thu, Nov 04, 2010 at 08:08:04PM +0100, Ingo Molnar wrote:
>
> * Marcus Meissner <meis...@suse.de> wrote:
>
> > > For example, on a Fedora testbox i have this version info:
> > >
> > > $ uname -r
> > > 2.6.35.6-48.fc14.x86_64
> > >
> > > Any attacker can download that rpm from:
> > >
> > > http://download.fedora.redhat.com/pub/fedora/linux/updates/14/x86_64/kernel-2.6.35.6-48.fc14.x86_64.rpm
> > >
> > > And can extract the System.map from it, using rpm2cpio and cpio -i -d. That will
> > > include all the symbol addresses - without the attacker having any access to the
> > > System.map or /proc/kallsyms on this particular box.
> > >
> > > I.e. on distro kernel installations (which comprise the _vast_ majority of our
> > > userbase) your patch brings little security benefits.
> > >
> > > What i suggested in later parts of my mail might provide more security: to
> > > sandbox kernel version information from unprivileged user-space - if we decide
> > > that we want to sandbox kernel version information ...
> > >
> > > That is a big if, because it takes a considerable amount of work. Would be worth
> > > trying it - but feel-good non-solutions that do not bring much improvement to
> > > the majority of users IMHO hinder such efforts.
> >
> > Hiding the OS version is really quite hard I think.
>
> Yes. Hard but it would be useful - especially if we start adding things like known
> exploit honeypots. Forcing attackers to probe the kernel by actually running a
> kernel exploit, and risking an alarm would be a very powerful security feature.
>
> Removing version info will upset some tools/libraries that rely on kernel version
> information for quirks though.

Quite honestly, it's the worst idea I've ever read to protect the kernel.
Kernel version is needed at many places, when building some code which relies
on presence of syscall X or Y depending on a version, etc... If our kernel is
so buggy that we can only rely on its version to be kept secret, then we have
already failed.

The kernel version should not be a secret, and anyway there are many ways to
guess it. And judging by past exploits, some of them work on a wide variety
of kernels so that's often pointless. It's just like when admins used to hide
their product names from HTTP response headers, this did not stop exploits at
all because there were always ways to guess the information.

Also, keep in mind that the more info you hide from unprivileged users,
the more you'll need root access for anything, which is a lot worse. On
systems that are secured that way, there are sudoers for everyone to do
anything (even ping). This becomes unmanageable and that opens even more
flaws in the whole system. We'll be proud of saying that those are not
kernel issues anymore but management issues but it's a bit easy to point
the finger at the poor guy who tries to keep his system usable despite
our efforts not to do so. And BTW, yes I *do* have access to such a system
where sudo is required for many things and some flaws already give me root
access.

When you secure an environment too much, users build a sub-environment inside
it with lower controls. It's common to see one user provide a complete tool
suite to other users because nothing was installed for fear of opening a hole.
But when you provide all the tools to everyone with your own account, it's just
as if you were root. So that's just pushing the problem somewhere else.

Focusing on ways to make the kernel more reliable when some information is
known is more efficient than trying to hide that information and relying on
this fact.

Willy

Ingo Molnar

Nov 4, 2010, 6:00:02 PM

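* Willy Tarreau <w...@1wt.eu> wrote:

> Quite honestly, it's the worst idea I've ever read to protect the kernel. Kernel
> version is needed at many places, when building some code which relies on presence
> of syscall X or Y depending on a version, etc... [...]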

Actually that's not true, since we have a kernel ABI, and since there's many
backports of newer kernel features into older kernels that it's generally not
needed nor meaningful to know the kernel version for syscalls.

Returning -ENOSYS is the general standard we use to communicate syscall
capabilities.

In fact using kernel version to switch around library functionality is a bug i'd
argue.
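
A small sketch of the -ENOSYS convention described above, as seen from user space: a program attempts the call and treats only ENOSYS as "not implemented". The names classify_probe() and probe_syscall0() are made up for this example; syscall(2) and SYS_getpid are the standard glibc interfaces.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

enum feature_state { FEATURE_PRESENT, FEATURE_MISSING };

/*
 * Interpret the result of a probing call: only ENOSYS means "this kernel
 * does not implement the syscall". Any other error (EINVAL, EBADF, ...)
 * still shows that the syscall exists.
 */
static enum feature_state classify_probe(long ret, int err)
{
	if (ret == -1 && err == ENOSYS)
		return FEATURE_MISSING;
	return FEATURE_PRESENT;
}

/* Probe a syscall that takes no arguments, such as SYS_getpid. */
static enum feature_state probe_syscall0(long nr)
{
	long ret;

	errno = 0;
	ret = syscall(nr);
	return classify_probe(ret, errno);
}
```

A real probe would have to pick arguments that are harmless if the call does exist. Note that this only detects whether a syscall is present, not which variant of its behaviour a given kernel implements.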

> [...] If our kernel is so buggy that we can only rely on its version to be kept
> secret, then we have already failed.

That mischaracterises my suggestion rather heavily - which makes me suspect that you
misunderstood it. Here's the relevant section of what i suggested here:

> > Hard but it would be useful - especially if we start adding things like known
> > exploit honeypots. Forcing attackers to probe the kernel by actually running a
> > kernel exploit, and risking an alarm would be a very powerful security feature.

An 'exploit honeypot' would be some small amount of 'detection' code for the
exploitable pattern of parameters (most attacks come via ioctls so we can add
detection for known holes without any performance hit), and the kernel would warn
the sysadmin that an exploit attempt has occurred.
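
To make the honeypot idea a little more concrete, here is a user-space sketch of such a detection check. Everything in it is hypothetical: the command number, the length limit, the rule table and the function name are invented, and a kernel implementation would log through its own facilities rather than stderr.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical description of one already-fixed bug. */
struct honeypot_rule {
	unsigned int cmd;	/* made-up ioctl-style command number */
	size_t max_len;		/* largest length a legitimate caller would pass */
};

static const struct honeypot_rule rules[] = {
	{ 0x1234, 64 },		/* illustrative values, not a real CVE */
};

/*
 * Returns 1 if the request matches a known exploit pattern (and logs it),
 * 0 otherwise. The bug itself is assumed to be fixed, so the caller would
 * reject the request either way; the check only adds the alarm.
 */
static int honeypot_check(unsigned int cmd, size_t len, unsigned int uid)
{
	size_t i;

	for (i = 0; i < sizeof(rules) / sizeof(rules[0]); i++) {
		if (rules[i].cmd == cmd && len > rules[i].max_len) {
			fprintf(stderr,
				"possible exploit attempt: cmd=%#x len=%zu uid=%u\n",
				cmd, len, uid);
			return 1;
		}
	}
	return 0;
}
```

In this sketch the check sits on an error path that legitimate callers never reach, which is one way the "no performance hit" property claimed above might be approached.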

The point is to make it riskier to run exploits - not to 'hide version because we
are so buggy'. Unprivileged attackers won't be able to know whether a kernel is
unpatched and won't know whether trying an actual exploit triggers a silent alarm or
not.

I.e. i think the only true break-through in kernel security will be to add credible
and substantial 'strike back' functionality - to increase the risks of detection
(which necessitates the removal of the information whether a kernel is patched or
not).

As i said it's hard - but it would be a rather break-through security feature for
Linux. Not an 'arms race' thing where we just put obstruction in the road of
attackers - but some real, unavoidable risk not detectable by attackers - running on
most stock distro kernels. (so there would be a real economy of scale)

The kerneloops client could also collect exploit attempt stats.

Thanks,

Ingo

Willy Tarreau

Nov 4, 2010, 6:40:01 PM

On Thu, Nov 04, 2010 at 10:51:57PM +0100, Ingo Molnar wrote:
> > Quite honestly, it's the worst idea I've ever read to protect the kernel. Kernel
> > version is needed at many places, when building some code which relies on presence
> > of syscall X or Y depending on a version, etc... [...]
>
> Actually that's not true, since we have a kernel ABI, and since there's many
> backports of newer kernel features into older kernels that it's generally not
> needed nor meaningful to know the kernel version for syscalls.
>
> Returning -ENOSYS is the general standard we use to communicate syscall
> capabilities.
>
> In fact using kernel version to switch around library functionality is a bug i'd
> argue.

I'm sorry Ingo, but I still don't agree. We've had several versions of epoll,
several (some even buggy) versions of splice() which cannot even be detected
without checking the kernel release. And those are just two that immediately
come to my mind. If we've been providing a version for the last 19 years, it
surely had some valid uses.

> > [...] If our kernel is so buggy that we can only rely on its version to be kept
> > secret, then we have already failed.
>
> That mischaracterises my suggestion rather heavily - which makes me suspect that you
> misunderstood it. Here's the relevant section of what i suggested here:
>
> > > Hard but it would be useful - especially if we start adding things like known
> > > exploit honeypots. Forcing attackers to probe the kernel by actually running a
> > > kernel exploit, and risking an alarm would be a very powerful security feature.

I have read it, but this does not require hiding the kernel version. You can
still keep your honey pots if you want (provided that they don't slow down
normal syscall path) and log suspect attempts. But if you're hiding the version,
those tricks will be used by valid programs too.

> An 'exploit honeypot' would be some small amount of 'detection' code for the
> exploitable pattern of parameters (most attacks come via ioctls so we can add
> detection for known holes without any performance hit), and the kernel would warn
> the sysadmin that an exploit attempt has occurred.

If we pollute the ioctl code with all the CVEs we have accumulated over the
years, I bet we'd get a performance hit and will probably introduce new bugs
due to the harder to maintain code.

> The point is to make it riskier to run exploits - not to 'hide version because we
> are so buggy'. Unprivileged attackers won't be able to know whether a kernel is
> unpatched and won't know whether trying an actual exploit triggers a silent alarm or
> not.

In my opinion, hiding the distro-specific part of the version should not cause
too much harm, but still I find this useless.

You see, I've used the vmsplice exploit at one place. Do you know how I did it?
$ cat /etc/redhat-release

Then I opened the box and installed the DVD showing the same version on a
spare PC to experiment with it. Once I got the exploit to reliably work without
crashing the kernel nor leaving traces, I dared launching it on the target
machine and it worked. Uname -r was not involved there. I simply relied on
the fact that updating a distro is a pain at many places and that it's very
rare to find an updated one because of that, so they remain with the shipped
kernel for months if not years, and sometimes even because some product
vendors say "my product supports Red Hat kernel 2.6.18-128xxx" so they don't
want to risk losing the support because they don't understand anything about
versioning.

So if we make fixes easier to install, we'd probably have fewer issues with
unfixed code than if we try to pretend they're not vulnerable by hiding the
version.

> I.e. i think the only true break-through in kernel security will be to add credible
> and substantial 'strike back' functionality - to increase the risks of detection
> (which necessitates the removal of the information whether a kernel is patched or
> not).
>
> As i said it's hard - but it would be a rather break-through security feature for
> Linux.

It requires hiding so many things for providing so little protection that I
really don't believe in it at all. Simply checking the system uptime or the
last modification date of /boot generally tells you precise info about the last
update.

> Not an 'arms race' thing where we just put obstruction in the road of
> attackers - but some real, unavoidable risk not detectable by attackers - running on
> most stock distro kernels. (so there would be a real economy of scale)
>
> The kerneloops client could also collect exploit attempt stats.

Well, in my opinion, either the attacker is remote and you can already get
a lot of info, or he's local and has time to precisely qualify the environment
in order not to leave the slightest trace. The rule is simple : if you don't
trust your local users, remain up to date. One day lag once and you lose.

Regards,
Willy

Willy Tarreau

Nov 4, 2010, 7:50:02 PM

On Thu, Nov 04, 2010 at 11:35:26PM +0100, Willy Tarreau wrote:
> > The point is to make it riskier to run exploits - not to 'hide version because we
> > are so buggy'. Unprivileged attackers won't be able to know whether a kernel is
> > unpatched and won't know whether trying an actual exploit triggers a silent alarm or
> > not.
>
> In my opinion, hiding the distro-specific part of the version should not cause
> too much harm, but still I find this useless.

BTW, if you want to hide the kernel version for the 99% distro kernels,
there's a very simple way to do that : just don't bump EXTRAVERSION nor
the build date in official builds. Keep it the same for all the product's
life, and provide the real name in a /proc entry that is only readable by
root by default. This will solve your issue with the exact kernel version
revealing pointers/bugs without hurting compatibility with user space
tools.

That will not hide the hints I was talking about though (uptime, dir mod
time, ...) but it will provide you with a version unrelated to the bug
level.
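
For illustration, the split between the upstream version and the distro-specific remainder of a release string, such as the Fedora one quoted earlier in the thread, can be sketched as follows. split_release() is a hypothetical helper, and splitting at the first '-' is an assumption about how distro release strings are commonly formed, not a rule.

```c
#include <stddef.h>
#include <string.h>

/*
 * Split a release string at the first '-' into the upstream version and the
 * distro-specific remainder. If there is no '-', the remainder is empty.
 * Returns 0 on success, -1 if either output buffer is too small.
 */
static int split_release(const char *release,
			 char *upstream, size_t upstream_len,
			 char *extra, size_t extra_len)
{
	const char *dash = strchr(release, '-');
	size_t head = dash ? (size_t)(dash - release) : strlen(release);
	const char *tail = dash ? dash + 1 : "";

	if (head >= upstream_len || strlen(tail) >= extra_len)
		return -1;
	memcpy(upstream, release, head);
	upstream[head] = '\0';
	strcpy(extra, tail);
	return 0;
}
```

With the proposal above, only the second part would be frozen for the life of a product, so tools that compare against the upstream version would keep working. The input would typically come from the release field that uname(2) fills in.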

Eugene Teo

Nov 4, 2010, 8:20:02 PM

On Thu, Nov 4, 2010 at 6:11 PM, Tejun Heo <t...@kernel.org> wrote:
> On 11/04/2010 11:09 AM, Marcus Meissner wrote:
>> Making /proc/kallsyms readable only for root makes it harder
>> for attackers to write generic kernel exploits by removing
>> one source of knowledge where things are in the kernel.
>>
>> Signed-off-by: Marcus Meissner <meis...@suse.de>
>
> I can't recall needing /proc/kallsyms when I wasn't root, so unless
> there's a compelling use case.
>
> Acked-by: Tejun Heo <t...@kernel.org>

Looks good to me too.

Acked-by: Eugene Teo <euge...@kernel.org>

Eugene

Jesper Juhl

Nov 4, 2010, 8:40:02 PM

On Thu, 4 Nov 2010, Marcus Meissner wrote:

>
> Hi,
>
> Making /proc/kallsyms readable only for root makes it harder
> for attackers to write generic kernel exploits by removing
> one source of knowledge where things are in the kernel.
>
> Signed-off-by: Marcus Meissner <meis...@suse.de>
> ---
> kernel/kallsyms.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
> index 6f6d091..a8db257 100644
> --- a/kernel/kallsyms.c
> +++ b/kernel/kallsyms.c
> @@ -546,7 +546,7 @@ static const struct file_operations kallsyms_operations = {
>
>  static int __init kallsyms_init(void)
>  {
> -	proc_create("kallsyms", 0444, NULL, &kallsyms_operations);
> +	proc_create("kallsyms", 0400, NULL, &kallsyms_operations);
>  	return 0;
>  }
>  device_initcall(kallsyms_init);
>

This doesn't harden things much, but a little is better than nothing.
This makes sense to me and looks OK.

Reviewed-by: Jesper Juhl <j...@chaosbits.net>

--
Jesper Juhl <j...@chaosbits.net> http://www.chaosbits.net/
Plain text mails only, please http://www.expita.com/nomime.html
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html

Frank Rowand

Nov 4, 2010, 10:40:02 PM

On 11/04/10 05:29, Marcus Meissner wrote:
> On Thu, Nov 04, 2010 at 12:46:48PM +0100, Ingo Molnar wrote:
>>
>> * Marcus Meissner <meis...@suse.de> wrote:
>>
>>> Hi,
>>>
>>> Making /proc/kallsyms readable only for root makes it harder for attackers to
>>> write generic kernel exploits by removing one source of knowledge where things are
>>> in the kernel.

< snip >

>> So what does a distribution like Suse expect from this change alone? Those have
>> public packages in rpms which can be downloaded by anyone, so it makes little sense
>> to hide it - unless _all_ version information is hidden.
>
> It is the first patch, mostly an acceptance test balloon.
>
> There are several other files handing information out, but kallsyms has
> it all very nice and ready.
>
> (timer_list, /proc/*/stat*, sl?binfo )
>
>> So i'd like to see a _full_ version info sandboxing patch that thinks through all
>> the angles and restricts uname -r kernel version info as well, and makes dmesg
>> unaccessible to users - and closes a few other information holes as well that give
>> away the exact kernel version - _that_ together will make it hard to blindly attack
>> a very specific kernel version.
>
> I am personally thinking of a "small steps" philosophy, one step after the other.

< snip >

The idea of trying to hide the kernel version is absurd. The number of different
places that can provide a precise fingerprint of a kernel version, or a small range of
possible kernel versions is immense. Closing all of those places makes use and
administration of a system more difficult, and encourages frequent use of su.

Dumb examples of version clues (beyond the obvious simple ones):

$ gcc -v
Target: x86_64-redhat-linux
gcc version 4.4.4 20100630 (Red Hat 4.4.4-10) (GCC)

$ rpm -qi gcc
Release : 10.fc13 Build Date: Wed Jun 30 02:54:10 2010

$ rpm -qi kernel
Version : 2.6.33.3 Vendor: Fedora Project
Release : 85.fc13 Build Date: Thu May 6 11:35:36 2010

$ ls -l /lib64
$ ls -l /boot
$ lsmod

-Frank

Ingo Molnar

Nov 7, 2010, 4:00:03 AM

* Willy Tarreau <w...@1wt.eu> wrote:

> > Not an 'arms race' thing where we just put obstruction in the road of attackers
> > - but some real, unavoidable risk not detectable by attackers - running on most
> > stock distro kernels. (so there would be a real economy of scale)
> >
> > The kerneloops client could also collect exploit attempt stats.
>
> Well, in my opinion, either the attacker is remote and you can already get many
> info, or he's local and has time to precisely qualify the environment in order not
> to leave the slightest trace. [...]

Your view of how attackers operate is rather simplistic. Knowing the precise
environment (via remote or local measures) is a big tactical advantage to them.

See the very patch we are discussing. People are submitting patches to hide certain
pieces of information exactly because that information is an advantage to attackers.

And my point is that "if you want to hide information do it effectively - or if it's
too hard dont do it at all".

Thanks,

Ingo

Ingo Molnar

Nov 7, 2010, 4:00:02 AM

* Willy Tarreau <w...@1wt.eu> wrote:

> On Thu, Nov 04, 2010 at 10:51:57PM +0100, Ingo Molnar wrote:
> > > Quite honestly, it's the worst idea I've ever read to protect the kernel. Kernel
> > > version is needed at many places, when building some code which relies on presence
> > > of syscall X or Y depending on a version, etc... [...]
> >
> > Actually that's not true, since we have a kernel ABI, and since there's many
> > backports of newer kernel features into older kernels that it's generally not
> > needed nor meaningful to know the kernel version for syscalls.
> >
> > Returning -ENOSYS is the general standard we use to communicate syscall
> > capabilities.
> >
> > In fact using kernel version to switch around library functionality is a bug i'd
> > argue.
>
> I'm sorry Ingo, but I still don't agree. We've had several versions of epoll,
> several (some even buggy) versions of splice() which cannot even be detected
> without checking the kernel release. And those are just two that immediately come
> to my mind. If we've been providing a version for the last 19 years, it surely had
> some valid uses.

I'm sorry Willy, but you are mostly wrong - and there's no need to speculate here
really. Just try the patch below :-)

If your claim that 'kernel version is needed at many places' is true then why am i
seeing this on a pretty general distro box bootup:

[root@aldebaran ~]# uname -a
Linux aldebaran 2.6.99-tip-01574-g6ba54c9-dirty #1 SMP Sun Nov 7 10:24:38 CET 2010 x86_64 x86_64 x86_64 GNU/Linux

?

Yes, some user-space might be unhappy if we set the version _back_ to say 2.4.0, but
we could (as the patch below) fuzz up the version information from unprivileged
attackers easily.

_Future_ ABI breakages that necessitate a version check are clearly frowned upon, so
this patch could even be considered a debugging feature: it makes it harder to
create ABI incompatibilities (at least for unprivileged user-space).

So you can think of version fuzzing also as the ultimate ABI check.

( This is a real defensive measure - there's a reason why attackers try stealth
remote fingerprinting of a target system first: they really want to avoid
detection and knowing the exact OS and version of a target tells them which
attacks can be tried with a higher chance of success. Same goes for local attacks
as well.

And once we have _that_, version fuzzing, removing kallsyms is one of the many
measures we need to use to hide the true version of the kernel from unprivileged
user-space. )

Thanks,

Ingo

Index: linux/Makefile
===================================================================
--- linux.orig/Makefile
+++ linux/Makefile
@@ -1,7 +1,7 @@
VERSION = 2
PATCHLEVEL = 6
-SUBLEVEL = 37
-EXTRAVERSION = -rc1
+SUBLEVEL = 99
+EXTRAVERSION =
NAME = Flesh-Eating Bats with Fangs

# *DOCUMENTATION*
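
The argument above that -ENOSYS is the standard way to communicate syscall capabilities can be sketched in user space as follows. This is an illustration only: the names classify_probe() and probe_splice() are hypothetical, and the probe deliberately passes invalid file descriptors so that it has no side effects.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

enum probe_result { PROBE_MISSING = 0, PROBE_PRESENT = 1 };

/* Interpret the outcome of a probe call: only -1 with errno == ENOSYS means
 * the kernel does not implement the syscall. Any other result, including a
 * different error, means the syscall entry point exists. */
enum probe_result classify_probe(long ret, int err)
{
    if (ret == -1 && err == ENOSYS)
        return PROBE_MISSING;
    return PROBE_PRESENT;
}

/* Probe splice() with invalid descriptors. A kernel that implements it is
 * expected to reject the call with an error other than ENOSYS. */
enum probe_result probe_splice(void)
{
    long ret = syscall(SYS_splice, -1L, NULL, -1L, NULL, 1L, 0L);

    return classify_probe(ret, errno);
}
```

Note what such a probe can and cannot tell: it shows whether the syscall is present, not whether a particular bug in it has been fixed, which is the case the version-check side of this discussion is concerned with.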

Ingo Molnar

Nov 7, 2010, 4:10:01 AM

* Willy Tarreau <w...@1wt.eu> wrote:

> > An 'exploit honeypot' would be some small amount of 'detection' code for the
> > exploitable pattern of parameters (most attacks come via ioctls so we can add
> > detection for known holes without any performance hit), and the kernel would
> > warn the sysadmin that an exploit attempt has occurred.
>
> If we pollute the ioctl code with all the CVEs we have accumulated over the years,
> I bet we'd get a performance hit and will probably introduce new bugs due to the
> harder to maintain code.

That's just wrong, because it's usually not the same ioctl hit with dozens of CVEs,
but lots of CVEs are spread out amongst lots of ioctls. You need to come up with
something more concrete than "I bet" to support that claim ;-)

> > The point is to make it riskier to run exploits - not to 'hide version because
> > we are so buggy'. Unprivileged attackers wont be able to know whether a kernel
> > is unpatched and wont know whether trying an actual exploit triggers a silent
> > alarm or not.
>
> In my opinion, hiding the distro-specific part of the version should not cause too
> much harm, but still I find this useless.
>
>
> You see, I've used the vmsplice exploit at one place. Do you know how I did ?
>
> $ cat /etc/redhat-release
>
> Then I opened the box and installed the DVD showing the same version on a spare PC
> to experiment with it. Once I got the exploit to reliably work without crashing
> the kernel nor leaving traces, I dared launching it on the target machine and it
> worked. Uname -r was not involved there. [...]

Sigh, you _still_ have not understood my point and you clearly dont seem to know how
honeypots work.

An 'exploit honeypot' kernel feature, on a patched kernel, would at that point warn
the admin that local user XXX tried to run an exploit.

The point is that the attacker cannot know whether it's safe to run the exploit on
the box (will result in a compromise), or is not safe to run the exploit (the
honeypot code will warn the admin).

Uname -r fuzzing is not needed because the attacker 'needs to run it' to compromise
a vulnerable system (as you seem to believe). It's done because if the attacker runs
it on a _not vulnerable machine_, it keeps him from running the exploit.

In short, it removes the 'is it safe to try this exploit' information from the
system - and if there's also a honeypot there, it introduces a real (and if done
well enough, undetectable) risk of detection for the attacker.

Thanks,

Ingo

Ingo Molnar

Nov 7, 2010, 4:10:01 AM

* Ingo Molnar <mi...@elte.hu> wrote:

> If your claim that 'kernel version is needed at many places' is true then why am i
> seeing this on a pretty general distro box bootup:
>
> [root@aldebaran ~]# uname -a
> Linux aldebaran 2.6.99-tip-01574-g6ba54c9-dirty #1 SMP Sun Nov 7 10:24:38 CET 2010 x86_64 x86_64 x86_64 GNU/Linux
>
> ?
>
> Yes, some user-space might be unhappy if we set the version _back_ to say 2.4.0,
> but we could (as the patch below) fuzz up the version information from
> unprivileged attackers easily.

Btw., with an 'exploit honeypot' and 'version fuzzing' the uname output would look
like this to an unprivileged user:

$ uname -a
Linux aldebaran 2.6.99 x86_64 x86_64 x86_64 GNU/Linux

[ we wouldnt want to include the date or the SHA1 of the kernel, obviously. ]

And it would look like this to root:

# uname -a
Linux aldebaran 2.6.37-tip-01574-g6ba54c9-dirty #1 SMP Sun Nov 7 10:24:38 CET 2010 x86_64 x86_64 x86_64 GNU/Linux

Ingo
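
The two uname outputs above differ only in how much of the release string is shown. The actual fuzzing patch is not included in this mail, so the following is merely a user-space sketch of the string transformation, under the assumption that the policy is "keep VERSION.PATCHLEVEL, report SUBLEVEL as 99, drop everything else". The function name fuzz_release() is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Turn a real release string such as "2.6.37-tip-01574-g6ba54c9-dirty" into
 * the fuzzed form "2.6.99". Returns 0 on success, -1 if the input does not
 * start with "<number>.<number>" or if the output buffer is too small. */
int fuzz_release(const char *real, char *out, size_t outlen)
{
    int version, patchlevel;
    int n;

    if (sscanf(real, "%d.%d", &version, &patchlevel) != 2)
        return -1;

    /* Assumed policy: fixed sublevel 99, no extraversion, no date, no SHA1. */
    n = snprintf(out, outlen, "%d.%d.99", version, patchlevel);
    if (n < 0 || (size_t)n >= outlen)
        return -1;
    return 0;
}
```

In the scheme described above, an unprivileged caller would see the fuzzed string while root would still see the real one. How the kernel would tell the two apart is not covered by this sketch.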

Willy Tarreau

Nov 7, 2010, 5:00:01 AM

Hi Ingo,

On Sun, Nov 07, 2010 at 09:50:16AM +0100, Ingo Molnar wrote:
>
> * Willy Tarreau <w...@1wt.eu> wrote:
>
> > On Thu, Nov 04, 2010 at 10:51:57PM +0100, Ingo Molnar wrote:
> > > > Quite honestly, it's the worst idea I've ever read to protect the kernel. Kernel
> > > > version is needed at many places, when building some code which relies on presence
> > > > of syscall X or Y depending on a version, etc... [...]
> > >
> > > Actually that's not true, since we have a kernel ABI, and since there's many
> > > backports of newer kernel features into older kernels that it's generally not
> > > needed nor meaningful to know the kernel version for syscalls.
> > >
> > > Returning -ENOSYS is the general standard we use to communicate syscall
> > > capabilities.
> > >
> > > In fact using kernel version to switch around library functionality is a bug i'd
> > > argue.
> >
> > I'm sorry Ingo, but I still don't agree. We've had several versions of epoll,
> > several (some even buggy) versions of splice() which cannot even be detected
> > without checking the kernel release. And those are just two that immediately come
> > to my mind. If we've been providing a version for the last 19 years, it surely had
> > some valid uses.
>
> I'm sorry Willy, but you are mostly wrong - and there's no need to speculate here
> really. Just try the patch below :-)
>
> If your claim that 'kernel version is needed at many places' is true then why am i
> seeing this on a pretty general distro box bootup:
>
> [root@aldebaran ~]# uname -a
> Linux aldebaran 2.6.99-tip-01574-g6ba54c9-dirty #1 SMP Sun Nov 7 10:24:38 CET 2010 x86_64 x86_64 x86_64 GNU/Linux

I don't understand the point you're trying to make with this patch. Obviously
we can pretend to be any version, and by doing that, you also pretend not to
have some bugs that would have been fixed later after the *real* version.

What I'm saying is that history has shown that we have known bugs that are
not detectable by any other way than the kernel version. Take the splice()
data corruption bug for instance. I believe it was fixed in 2.6.26 or 2.6.27
and backported late in the 2.6.25.X stable branch. Due to this, without
knowing the kernel version, the user can't know whether it's safe to use
splice() or not. I'm particularly aware of this one because I got quite a
bunch of questions from users on this subject. But certainly there are a
bunch of other ones.

> Yes, some user-space might be unhappy if we set the version _back_ to say 2.4.0, but
> we could (as the patch below) fuzz up the version information from unprivileged
> attackers easily.

I think you understood my concern as being about breaking compatibility with
userspace by announcing a wrong version. That's not what I'm saying, but rather
that user-space couldn't rely on the version anymore to avoid known bugs.

> _Future_ ABI breakages that necessitate a version check are clearly frowned upon, so
> this patch could even be considered a debugging feature: it makes it harder to
> create ABI incompatibilities (at least for unprivileged user-space).

Stating this will not change the behaviour WRT bugs unfortunately.

> So you can think of version fuzzing also as the ultimate ABI check.
>
>
> ( This is a real defensive measure - there's a reason why attackers try stealth
> remote fingerprinting of a target system first: they really want to avoid
> detection and knowing the exact OS and version of a target tells them which
> attacks can be tried with a higher chance of success. Same goes for local attacks
> as well.
>
> And once we have _that_, version fuzzing, removing kallsyms is one of the many
> measures we need to use to hide the true version of the kernel from unprivileged
> user-space. )

I think you didn't understand me. I was explaining that doing this will not
prevent them from guessing the precise kernel version, because if you're on
a mainstream distro, just check the uptime. If the last reboot falls on the day
after a kernel release, most likely it's this version. Same for /boot
modification date. And conversely, when you find an uptime of 800 days, you
know for sure that your freshly discovered bug is still present, no need of
uname for that. And I gave you examples of that which have worked.

That's why I'm claiming that version fuzzing brings nothing *really* useful.
It just makes admins think they're secure but that's false.

Also as I said, if you want your distro to hide the bug fix level, simply
rebuild the kernel with a fixed EXTRAVERSION string, or ask the kernel
maintainers there not to update the EXTRAVERSION anymore and you'll have
your version fuzzing for free without changing any kernel code. But I'm
still certain it will not bring any value.

Willy
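
The uptime argument above relies on nothing more than a subtraction. A minimal sketch, assuming the first field of /proc/uptime is the number of seconds since boot; the function names estimate_boot_time() and read_uptime() are made up for illustration.

```c
#include <stdio.h>
#include <time.h>

/* Boot time is simply "now" minus the uptime, truncated to whole seconds. */
time_t estimate_boot_time(time_t now, double uptime_seconds)
{
    return now - (time_t)uptime_seconds;
}

/* Read the first field (seconds since boot) from a /proc/uptime style file.
 * Returns 0 on success, -1 if the file cannot be opened or parsed. */
int read_uptime(const char *path, double *uptime_seconds)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (!f)
        return -1;
    ok = (fscanf(f, "%lf", uptime_seconds) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

Comparing the result of estimate_boot_time(time(NULL), uptime) with a distribution's kernel release dates is the kind of guess described above. Hiding the version string alone does not remove this channel.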

Ingo Molnar

Nov 7, 2010, 6:30:01 AM

> I don't understand the point you're trying to make with this patch. [...]

It was a simple experiment to support my rather simple argument which you disputed.

> [...] Obviously we can pretend to be any version, [...]

Ok, it's a pretty cavalier style of arguing that you now essentially turn around
your earlier claim that the 'kernel version is needed at many places' and say what
i've been saying, prefixed with 'obviously' ;-)

Yes, it's obvious that the kernel version is not needed for many functional purposes
on a modern distro - and that was my exact point.

I cannot think of a single valid case where the proper user-space solution to some
ABI compatibility detail is a kernel version check. I'd even argue that we want to
keep unprivileged user-space from being able to implement such crappy version checks
...

Thanks,

Ingo

Ingo Molnar

Nov 7, 2010, 6:50:01 AM

* Willy Tarreau <w...@1wt.eu> wrote:

> [...] I was explaining that doing this will not prevent them from guessing the
> precise kernel version, [...]

Well, which is exactly what i have said to Marcus early on in this discussion:

|
| What i suggested in later parts of my mail might provide more security: to sandbox
| kernel version information from unprivileged user-space - if we decide that we
| want to sandbox kernel version information ...
|
| That is a big if, because it takes a considerable amount of work. Would be worth
| trying it - but feel-good non-solutions that do not bring much improvement to the
| majority of users IMHO hinder such efforts.
|

The 'considerable amount of work' refers not to the utsname version fuzzing patch
(it's a 10-liner patch, literally), but to controlling the channels of version
information you mentioned (uptime, the /boot timestamp), and some other channels you
did not mention: dmesg, various /sys and /proc entries that leak version
information, etc.

All must be closed down for unprivileged user-space, for this to be effective,
obviously.

( Note that there will also be some channels of information that cannot
realistically be closed down (such as the presence of sys_perf_event_open()
indicates a v2.6.32+ kernel - or a backported, patched kernel) - but what matters
mostly is to fuzz the _precise_ version information, to inject uncertainty into
the equation of attackers. Combined with honeypot silent alarm functionality it
turns the equation around and creates an outright risk of detection. )

Thanks,

Ingo

Willy Tarreau

Nov 7, 2010, 6:50:01 AM

On Sun, Nov 07, 2010 at 12:27:09PM +0100, Ingo Molnar wrote:
> > I don't understand the point you're trying to make with this patch. [...]
>
> It was a simple experiment to support my rather simple argument which you disputed.

OK

> > [...] Obviously we can pretend to be any version, [...]
>
> Ok, it's a pretty cavalier style of arguing that you now essentially turn around
> your earlier claim that the 'kernel version is needed at many places' and say what
> i've been saying, prefixed with 'obviously' ;-)

Huh ?

> Yes, it's obvious that the kernel version is not needed for many functional purposes
> on a modern distro - and that was my exact point.
>
> I cannot think of a single valid case where the proper user-space solution to some
> ABI compatibility detail is a kernel version check.

Ingo, I believe you did not read a single line of my previous mail, because I
precisely gave you counter-examples of that. The first use is simply the user
running "uname -a" to see if *he* can safely enable feature X or Y which is
known to be badly broken in some old versions.

> I'd even argue that we want to
> keep unprivileged user-space from being able to implement such crappy version checks
> ...

I'd say that *YOU* want that despite the fact that on mainstream distros, it
buys nothing since it's easy to guess the real version anyway as I showed you.
Don't forget that you proposed this in order to hide symbols from a small set
of well-known distro kernels. And the most important in my opinion is that it
does not bring anything to those who are currently victim of exploits : those
who don't upgrade, because their uptime alone is enough to *know* that the
vuln you want to exploit is still there.

At some places, your proposal would probably end up with uname being
chmoded +s so that users stop asking the admin for trivial things. That
really makes no sense.

Willy

Ingo Molnar

Nov 7, 2010, 6:50:01 AM

* Willy Tarreau <w...@1wt.eu> wrote:

> On Sun, Nov 07, 2010 at 12:27:09PM +0100, Ingo Molnar wrote:
> > > I don't understand the point you're trying to make with this patch. [...]
> >
> > It was a simple experiment to support my rather simple argument which you disputed.
>
> OK
>
> > > [...] Obviously we can pretend to be any version, [...]
> >
> > Ok, it's a pretty cavalier style of arguing that you now essentially turn around
> > your earlier claim that the 'kernel version is needed at many places' and say what
> > i've been saying, prefixed with 'obviously' ;-)
>
> Huh ?
>
> > Yes, it's obvious that the kernel version is not needed for many functional purposes
> > on a modern distro - and that was my exact point.
> >
> > I cannot think of a single valid case where the proper user-space solution to some
> > ABI compatibility detail is a kernel version check.
>
> Ingo, I believe you did not read a single line of my previous mail, because I
> precisely gave you counter-examples of that. [...]

I did read it and saw no valid counter-examples. You mentioned this one:

> Take the splice() data corruption bug for instance. I believe it was fixed in
> 2.6.26 or 2.6.27 and backported late in the 2.6.25.X stable branch. Due to this,
> without knowing the kernel version, the user can't know whether it's safe to use
> splice() or not. I'm particularly aware of this one because I got quite a bunch
> of questions from users on this subject. But certainly there are a bunch of other
> ones.

That example is entirely bogus. The correct answer to a buggy, data-corrupting
kernel is a fixed kernel. No ifs and when. No version checks in user-space. If
user-space ever works around a bug in that fashion it's entirely broken and
_deserves_ to be further broken via version fuzzing.

Do you know of a single such actual vmsplice() version check example in user-space,
or have you just made it up?

Thanks,

Ingo

Willy Tarreau

Nov 7, 2010, 7:00:02 AM

On Sun, Nov 07, 2010 at 12:42:37PM +0100, Ingo Molnar wrote:
>
> * Willy Tarreau <w...@1wt.eu> wrote:
>
> > [...] I was explaining that doing this will not prevent them from guessing the
> > precise kernel version, [...]
>
> Well, which is exactly what i have said to Marcus early on in this discussion:
>
> |
> | What i suggested in later parts of my mail might provide more security: to sandbox
> | kernel version information from unprivileged user-space - if we decide that we
> | want to sandbox kernel version information ...
> |
> | That is a big if, because it takes a considerable amount of work. Would be worth
> | trying it - but feel-good non-solutions that do not bring much improvement to the
> | majority of users IMHO hinder such efforts.
> |
>
> The 'considerable amount of work' refers not to the utsname version fuzzing patch
> (it's a 10-liner patch, literally), but to controlling the channels of version
> information you mentioned (uptime, the /boot timestamp), and some other channels you
> did not mention: dmesg, various /sys and /proc entries that leak version
> information, etc.

I did not mention dmesg because it's already sometimes hidden from users (eg,
when iptables logs there).

> All must be closed down for unprivileged user-space, for this to be effective,
> obviously.

This would only be effective against finding a precise version. There's
no need for that, what you want is to hide kernel pointers, and your issue
is that in distro kernels, same kernels have the same pointers. It would be
much more efficient to work on a method to randomize all pointers than to
try to hide a kernel version hoping a user is not able to guess what it is.
Even if you'd hide the uptime, there are many ways to find it. In my opinion,
it's a race in the wrong direction, and which has several negative side
effects on the normal user.

Better attack the problem than its symptoms.

Willy

Willy Tarreau

Nov 7, 2010, 7:00:02 AM

On Sun, Nov 07, 2010 at 12:47:56PM +0100, Ingo Molnar wrote:
> I did read it and saw no valid counter-examples. You mentioned this one:
>
> > Take the splice() data corruption bug for instance. I believe it was fixed in
> > 2.6.26 or 2.6.27 and backported late in the 2.6.25.X stable branch. Due to this,
> > without knowing the kernel version, the user can't know whether it's safe to use
> > splice() or not. I'm particularly aware of this one because I got quite a bunch
> > of questions from users on this subject. But certainly there are a bunch of other
> > ones.
>
> That example is entirely bogus. The correct answer to a buggy, data-corrupting
> kernel is a fixed kernel. No ifs and when. No version checks in user-space. If
> user-space ever works around a bug in that fashion it's entirely broken and
> _deserves_ to be further broken via version fuzzing.

It's not working around a bug, it's that using splice() instead of
recv()+send() brings an important speed up in some environments, and that
it is suggested to make use of it when possible, except on buggy kernels.
Some user-space code simply has a tunable to enable it or not.

> Do you know of a single such actual vmsplice() version check example in user-space,
> or have you just made it up?

I was not speaking about vmsplice() but about splice(). And yes it's a real
world example. Haproxy makes use of it when the option is specified. And it
will never enable it automatically due to that nasty data corruption bug
that cannot be detected. Only the user can run "uname -a" and compare with
his distro's fixes (or mainline kernel fixes) and know what to do. Once again
it's just *one* example. A version is above all an indication of feature and
bug status.

It's precisely because you're making a special case of the security bug that
you want to hide bugs from user-space by cheating on version.

Willy

Ingo Molnar

Nov 7, 2010, 7:20:01 AM

* Willy Tarreau <w...@1wt.eu> wrote:

> [...]
>
> It's precisely because you're making a special case of the security bug that you
> want to hide bugs from user-space by cheating on version.

You claimed this for the second time and i'm denying it for the second time.

The goal of fuzzing the version information is _not_ to 'hide bugs from user-space by
cheating on version'. The goal is to introduce uncertainty to attackers, so that a
honeypot silent alarm can warn the admin.

Why are you putting words in my mouth?

Thanks,

Ingo

Willy Tarreau

Nov 7, 2010, 7:30:02 AM

On Sun, Nov 07, 2010 at 01:12:35PM +0100, Ingo Molnar wrote:
>
> * Willy Tarreau <w...@1wt.eu> wrote:
>
> > [...]
> >
> > It's precisely because you're making a special case of the security bug that you
> > want to hide bugs from user-space by cheating on version.
>
> You claimed this for the second time and i'm denying it for the second time.
>
> The goal of fuzzing the version information is _not_ to 'hide bugs from user-space by
> cheating on version'. The goal is to introduce uncertainty to attackers, so that a
> honeypot silent alarm can warn the admin.

My interpretation of this mechanism is what I explained above. "Introducing
uncertainty" means hiding a version so that the attacker doesn't precisely
know which one it is and has to send a few probes to guess it. That's not
much different than trying to fire the exploit itself. When you run a
null-deref kernel exploit, better be sure of what you're doing, otherwise
the admin will shortly be aware of it too.

You could as well consider that launching some commands is suspicious
(eg: uname). You'll obviously get a lot of false-positive alarms from
all autoconf scripts run locally, but this gives an idea. Anyway, when
local users have their time (eg: students), it's still easy to guess the
version.

> Why are you putting words in my mouth?

I'm not putting anything in your mouth Ingo :-)

Willy

Ingo Molnar

Nov 7, 2010, 7:30:02 AM

* Willy Tarreau <w...@1wt.eu> wrote:

> > Why are you putting words in my mouth?
>
> I'm not putting anything in your mouth Ingo :-)

To quote you:

" you're making a special case of the security bug that you want to hide bugs from
user-space by cheating on version. "

No, i did not say that i want to hide bugs from user-space by cheating on the
version. Why are you claiming that i said that? Why are you putting words in my
mouth?

Thanks,

Ingo

Ingo Molnar

Nov 7, 2010, 7:40:02 AM

* Willy Tarreau <w...@1wt.eu> wrote:

> > All must be closed down for unprivileged user-space, for this to be effective,
> > obviously.
>
> This would only be effective against finding a precise version. [...]

I'm glad that you agree with my point.

> [...] There's no need for that, what you want is to hide kernel pointers, [...]

That's a new claim from you - and when put like that it's wrong too: if the goal is
to introduce risk of detection to attackers (which i suggested to be an efficient
security measure), then hiding/fuzzing version information is an essential/needed
piece of such a measure, not something for which there is 'no need'.

Hiding the address of kernel data/code structures is another piece of such a larger
goal. Btw., as i argued to Marcus already, hiding /proc/kallsyms will not hide
these addresses on the vast majority of Linux systems, and the patch would only
cure the symptom, not the cause:

|
| But without actually declaring and achieving that sandboxing goal this security
| measure is just a feel-good thing really [...]
|

Anyway, i wasnt particularly successful in conveying my past arguments to you so i'd
rather leave the discussion at this point. You made your points and i made my points
as well.

Thanks,

Ingo

Ingo Molnar

Nov 7, 2010, 7:40:01 AM

* Willy Tarreau <w...@1wt.eu> wrote:

> On Sun, Nov 07, 2010 at 01:12:35PM +0100, Ingo Molnar wrote:
> >
> > * Willy Tarreau <w...@1wt.eu> wrote:
> >
> > > [...]
> > >
> > > It's precisely because you're making a special case of the security bug that you
> > > want to hide bugs from user-space by cheating on version.
> >
> > You claimed this for the second time and i'm denying it for the second time.
> >
> > > The goal of fuzzing the version information is _not_ to 'hide bugs from user-space by
> > cheating on version'. The goal is to introduce uncertainty to attackers, so that a
> > honeypot silent alarm can warn the admin.
>
> My interpretation of this mechanism is what I explained above. [...]

( Well, if it's "your interpretation" only then stop claiming that i said it. )

> [...] "Introducing uncertainty" means hiding a version so that the attacker
> doesn't precisely know which one it is and has to send a few probes to guess it.

No. The 'exploit honeypot' mechanism i outlined is really simple, and it means what
i explained already:

- attacker breaks into unprivileged user-space

- attacker runs exploit

- exploit attempt gets detected by the 'exploit honeypot' kernel code and a
(silent) warning goes to the admin (via a syslog message for example)

- attacker only sees that the attack did not succeed

This makes it _unsafe_ (for many types of attackers) to run an exploit locally.

> That's not much different than trying to fire the exploit itself. [...]

Erm, the difference is possible _detection_ via a silent alarm.

There's a huge difference between 'attempting an exploit and being caught' and 'not
even trying the exploit because based on the kernel version the attacker knows it
wont work'.

Thanks,

Ingo
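
The four steps listed above can be pictured with a small user-space simulation. Everything here is hypothetical: the DEMO_MAX_LEN bound, the function names and the idea of an oversized length as the "exploit pattern" are assumptions made for the sake of the example, and writing to stderr stands in for the syslog message to the admin.

```c
#include <stddef.h>
#include <stdio.h>

#define DEMO_MAX_LEN 64 /* hypothetical bound enforced by the patched handler */

static unsigned long honeypot_alerts; /* number of silent alarms raised so far */

/* Handle a request of the given length on behalf of uid. A length above the
 * bound is what an exploit for a hypothetical, already fixed overflow would
 * send. The caller only ever sees a plain failure (-1); the alert goes to the
 * admin's log. Returns 0 for a normal request. */
int demo_handle_request(unsigned int uid, size_t len)
{
    if (len > DEMO_MAX_LEN) {
        honeypot_alerts++;
        fprintf(stderr, "honeypot: uid %u sent suspicious length %zu\n",
                uid, len);
        return -1;
    }
    return 0;
}

/* Number of exploit attempts noticed so far. */
unsigned long demo_alert_count(void)
{
    return honeypot_alerts;
}
```

From the attacker's side a rejected request looks the same whether or not the alarm exists, which is the uncertainty the scheme relies on.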

Willy Tarreau

Nov 7, 2010, 7:50:02 AM

On Sun, Nov 07, 2010 at 01:25:33PM +0100, Ingo Molnar wrote:
>
> * Willy Tarreau <w...@1wt.eu> wrote:
>
> > > Why are you putting words in my mouth?
> >
> > I'm not putting anything in your mouth Ingo :-)
>
> To quote you:
>
> " you're making a special case of the security bug that you want to hide bugs from
> user-space by cheating on version. "
>
> No, i did not say that i want to hide bugs from user-space by cheating on the
> version. Why are you claiming that i said that? Why are you putting words in my
> mouth?

I'm not claiming that "you said that", it's my interpretation of what
you're trying to achieve with what you're defending. I'm free to interpret
it as I want. Probably it's a very synthetic analysis, but it's my analysis.

Willy

Willy Tarreau

Nov 7, 2010, 8:00:02 AM

On Sun, Nov 07, 2010 at 01:32:32PM +0100, Ingo Molnar wrote:
> No. The 'exploit honeypot' mechanism i outlined is really simple, and it means what
> i explained already:
>
> - attacker breaks into unprivileged user-space
>
> - attacker runs exploit
>
> - exploit attempt gets detected by the 'exploit honeypot' kernel code and a
> (silent) warning goes to the admin (via a syslog message for example)
>
> - attacker only sees that the attack did not succeed
>
> This makes it _unsafe_ (for many types of attackers) to run an exploit locally.

It's already unsafe and has always been. When running local kernel exploits,
it's common to find lots of segfault traces in dmesg. It's common to hang the
machine (the vmsplice exploit had a 50% failure rate from my tests).

> > That's not much different than trying to fire the exploit itself. [...]
>
> Erm, the difference is possible _detection_ via a silent alarm.
>
> There's a huge difference between 'attempting an exploit and being caught' and 'not
> even trying the exploit because based on the kernel version the attacker knows it
> wont work'.

And there's an even bigger difference between leaving traces of a failed
exploit attempt and successfully getting the exploit to work because the
system is not updated in time. That's been my point since the beginning,
most kernel exploits are run very early when released to the public. So
that's when a simple "uptime" will tell you it's safe to run your exploit.
And if you want to hide the uptime, let's simply check the creation date
of /dev/shm, or that a file you left in /tmp has not been removed by the
admin's scripts which clean that up at boot, etc...

In my opinion this is not efficient at all. Also, I've already been involved
in post-mortem diags on compromised machines. If the intruder is not a known
local user, he does not care at all about being caught. Leaving rootkits
everywhere is generally not a problem for them; some don't even bother to
clear the logs, because they bounced from already compromised systems.

Willy

Willy Tarreau

Nov 7, 2010, 8:00:03 AM

On Sun, Nov 07, 2010 at 01:37:46PM +0100, Ingo Molnar wrote:
> > [...] There's no need for that, what you want is to hide kernel pointers, [...]
>
> That's a new claim from you - and when put like that it's wrong too:

It's where the discussion started and it's still in the subject of the thread !
You noted that with distro kernels, hiding kallsyms is useless since uname -r
reveals what kernel to download to get them anyway. Which is true !

Which is why it would be more efficient to find a way to randomize those
pointers at runtime.

(...)

> Anyway, i wasnt particularly successful in conveying my past arguments to you so i'd
> rather leave the discussion at this point. You made your points and i made my points
> as well.

That's also what I was about to say. Let's agree we disagree and have a
nice sunday afternoon. We can bring the discussion back around a beer if
you happen to pass by Paris :-)

Cheers,
Willy

Alan Cox

Nov 7, 2010, 10:40:01 AM

> This makes it _unsafe_ (for many types of attackers) to run an exploit locally.

They don't care.

Firstly it's trivial to identify the true kernel version from all sorts
of other methods, and secondly almost all exploiting is done by robots
running from box to box, which are completely disposable.

They simply *don't* care, and if they do, rpm -q, TCP fingerprints and
a few other tricks such as clock timing a couple of syscalls will answer
the question reliably anyway.

Andi Kleen

Nov 7, 2010, 1:10:02 PM

Marcus Meissner <meis...@suse.de> writes:
>
> I also briefly thought about kernel ASLR, but my knowledge of the kernel
> loading is too limited whether this is even possible or at all useful.

Kernel ASLR sounds like a good idea, although there are some traps.

On 32bit the available range is not too great, only a few hundred MB
max. Probably less on larger systems, where there will be conflicts with
a large mem_map. On 64bit x86 it's nearly 2GB and somewhat easier
(although a large mem_map may still be a problem).

You still want to not stray too much from a 2MB alignment
to make sure most of the main kernel is handled by a single 2MB TLB
entry.

It would not be too hard to do today using kexec and loading the kernel
twice. Right now the kexec command doesn't allow specifying
the address, but the kernel interface supports it, so it could
be just implemented in the user tool.

Doing it with a single boot sequence would be a bit more work.
Right now the relocation entries are not put into the bzImage
and that would be needed.

That would not cover modules, but it shouldn't be too difficult
to do it for those either.

-Andi

--
a...@linux.intel.com -- Speaking for myself only.
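
To make the alignment constraint above concrete, here is a minimal userspace
sketch (not kernel code) of picking a load offset that stays 2 MiB aligned.
The name pick_load_offset, the caller-supplied random value and the window
size are assumptions for illustration; the thread only states the 2MB
alignment and rough sizes for the available range.

```c
#include <assert.h>
#include <stdint.h>

/* 2 MiB alignment, as suggested in the thread, so that the main kernel
 * image can still be covered by large TLB entries. */
#define KERNEL_ALIGN (2ULL << 20)

/*
 * Map an arbitrary random value onto one of the aligned offsets inside a
 * window of window_bytes. Hypothetical helper: the source of randomness
 * and the real size of the window are left to the caller.
 */
static uint64_t pick_load_offset(uint64_t random_value, uint64_t window_bytes)
{
    uint64_t slots = window_bytes / KERNEL_ALIGN;

    assert(slots > 0);  /* the window must hold at least one aligned slot */
    return (random_value % slots) * KERNEL_ALIGN;
}
```

With an assumed 1 GiB window, pick_load_offset(r, 1ULL << 30) can return 512
different offsets (0, 2 MiB, 4 MiB, ...), all of them multiples of 2 MiB.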

H. Peter Anvin

Nov 7, 2010, 1:40:02 PM

We already do virtual relocation on 32 bits, and replicating that on 64 bits
wouldn't be hard. However, the linkage script strongly assumes congruency mod 2/4
MiB, and that is probably nontrivial to change. However, that still gives about 9
bits of entropy to play with. The question is if that is enough, or if we'd have
to do more clever hacks.

"Andi Kleen" <an...@firstfloor.org> wrote:

--
Sent from my mobile phone. Please pardon any lack of formatting.

Ingo Molnar

Nov 8, 2010, 1:40:01 AM

* Alan Cox <al...@lxorguk.ukuu.org.uk> wrote:

> > This makes it _unsafe_ (for many types of attackers) to run an exploit locally.
>
> They don't care.

Sure, script kiddies and botnet builders wont care - i.e. attacks where the
individual target is low value, or where either the attacker or the attacked is
stupid.

But it's different when a skilled attacker meets a skilled defense: all the
exploits/attacks against high-value targets i've seen showed a great deal of care
taken to avoid detection.

Future trends are also clear: eventually, as more and more of our lives are lived on
the network, home boxes are becoming more and more valuable. So i think
concentrating on the psychology of the skilled attacker would not be unwise. YMMV.

Thanks,

Ingo

Ingo Molnar

Nov 10, 2010, 4:00:01 AM

* H. Peter Anvin <h...@zytor.com> wrote:

> We already do virtual relocation on 32 bits, and replicating that on 64 bits
> wouldn't be hard. However, the linkage script strongly assumes congruency mod 2/4
> MiB, and that is probably nontrivial to change. However, that still gives about 9
> bits of entropy to play with. The question is if that is enough, or if we'd have
> to do more clever hacks.

Even 1 bit of entropy would bring a visible improvement: a failed exploit attempt to
the wrong address can crash the kernel with a 50% chance. 9 bits would be very nice.

If an exploit can be brute-forced without crashing the kernel then only some
significantly large bitness would help. So while 9 bits would be rather low for a
user-space ASLR scheme [many user-space bugs can be brute-forced without crashing
the system and raising alarms], it's very attractive for kernel ASLR.

Ingo
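
The arithmetic behind the "1 bit" and "9 bits" figures can be sketched as
follows. This is only an illustration under stated assumptions: every slot is
taken to be equally likely, a wrong guess is taken to always miss, and the
function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Whole bits of entropy for a number of equally likely load slots,
 * i.e. floor(log2(slots)). */
static unsigned entropy_bits(uint64_t slots)
{
    unsigned bits = 0;

    assert(slots > 0);
    while (slots > 1) {
        slots >>= 1;
        bits++;
    }
    return bits;
}

/* Probability that a single blind guess at the load slot is wrong,
 * given that many bits of entropy. */
static double wrong_guess_probability(unsigned bits)
{
    assert(bits < 64);
    return 1.0 - 1.0 / (double)(1ULL << bits);
}
```

For example, 512 possible slots give entropy_bits(512) == 9. With 1 bit a
blind guess is wrong half of the time, and with 9 bits it is wrong 511 times
out of 512.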

Jesper Juhl

Nov 10, 2010, 4:20:03 PM

I agree. Hiding the kernel version is silly. But that's not what the
original patch was about. The original patch was about "Making
/proc/kallsyms readable only for root ..." and that (IMVHO) makes sense
for a number of reasons.

1. For those people running (popular) distro kernels, hiding the
information on /proc/kallsyms doesn't achieve much, true, an attacker can
get the information easily online. But it still makes it slightly more
involved for exploits to gain access to information about the addresses of
kernel functions - at the very least they now have to hard-code lists of
addresses for the kernels they target - not much pain, but the more pain
we can inflict upon these people without hurting legitimate users, the
better.

2. For people running niche-distros that attackers cannot be bothered to
target explicitly, but where they previously relied on obtaining these
addresses from /proc/kallsyms we have a real gain - the attackers can no
longer get the info they need.

3. For people running custom compiled kernels (and I personally know of a
few large businesses that do so and several individuals, and I'll bet real
money that there are more than you suspect "out there"), attacks relying
on /proc/kallsyms for info are completely defeated.

4. Once we get (and I'm sure that's only a matter of time) randomization
of the addresses that kernel functions are loaded at, even popular distros
where the kernel version and config are known to attackers will gain a
valuable defence by this patch. Attackers will then no longer be able to
just download the info from the distro repositories and hard-code
addresses since they will be randomized, but if they have access to
/proc/kallsyms they won't need to since they can then just look up the
addresses there - this patch closes that info path to them which is good.

--
Jesper Juhl <j...@chaosbits.net> http://www.chaosbits.net/
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please.

H. Peter Anvin

Nov 10, 2010, 10:00:01 PM

On 11/10/2010 12:53 AM, Ingo Molnar wrote:
>
> * H. Peter Anvin <h...@zytor.com> wrote:
>
>> We already do virtual relocation on 32 bits, and replicating that on 64 bits
>> wouldn't be hard. However, the linkage script strongly assumes congruency mod 2/4
>> MiB, and that is probably nontrivial to change. However, that still gives about 9
>> bits of entropy to play with. The question is if that is enough, or if we'd have
>> to do more clever hacks.
>
> Even 1 bit of entropy would bring a visible improvement: a failed exploit attempt to
> the wrong address can crash the kernel with a 50% chance. 9 bits would be very nice.
>
> If an exploit can be brute-forced without crashing the kernel then only some
> significantly large bitness would help. So while 9 bits would be rather low for a
> user-space ASLR scheme [many user-space bugs can be brute-forced without crashing
> the system and raising alarms], it's very attractive for kernel ASLR.
>

Now, *relative* symbol addresses will typically not have any randomness
at all, which may limit the usefulness, of course.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

Ingo Molnar

Nov 11, 2010, 2:10:01 AM

* H. Peter Anvin <h...@zytor.com> wrote:

> Now, *relative* symbol addresses will typically not have any randomness at all,
> which may limit the usefulness, of course.

Yeah - but it happens quite often that the scope of the vulnerability only allows
absolute addresses. In fact it's a pretty common case: basically most derefs into
attacker-controlled data pointers are like that.

Thanks,

Ingo

Gilles Espinasse

Nov 13, 2010, 8:20:01 AM

----- Original Message -----
From: "Ingo Molnar" <mi...@elte.hu>
To: "Willy Tarreau" <w...@1wt.eu>
Cc: "Marcus Meissner" <meis...@suse.de>; <secu...@kernel.org>;
<mo...@sgi.com>; "Peter Zijlstra" <a.p.zi...@chello.nl>;
<fwei...@gmail.com>; "H. Peter Anvin" <h...@zytor.com>;
<linux-...@vger.kernel.org>; <jason....@windriver.com>;
<t...@kernel.org>; <And...@zimbra8-e1.priv.proxad.net>; <"Morton
<"@zimbra8-e1.priv.proxad.net>
Sent: Sunday, November 07, 2010 10:08 AM
Subject: Re: [Security] [PATCH] kernel: make /proc/kallsyms mode 400 to
reduce ease of attacking

>
> * Ingo Molnar <mi...@elte.hu> wrote:
>
> > If your claim that 'kernel version is needed at many places' is true then why am i
> > seeing this on a pretty general distro box bootup:
> >
> > [root@aldebaran ~]# uname -a
> > Linux aldebaran 2.6.99-tip-01574-g6ba54c9-dirty #1 SMP Sun Nov 7 10:24:38 CET 2010 x86_64 x86_64 x86_64 GNU/Linux
> >
> > ?
> >
> > Yes, some user-space might be unhappy if we set the version _back_ to say 2.4.0,
> > but we could (as the patch below) fuzz up the version information from
> > unprivileged attackers easily.
>
> Btw., with an 'exploit honeypot' and 'version fuzzing' the uname output would look
> like this to an unprivileged user:
>
> $ uname -a
> Linux aldebaran 2.6.99 x86_64 x86_64 x86_64 GNU/Linux
>
> [ we wouldnt want to include the date or the SHA1 of the kernel, obviously. ]
>
> And it would look like this to root:
>
> # uname -a
> Linux aldebaran 2.6.37-tip-01574-g6ba54c9-dirty #1 SMP Sun Nov 7 10:24:38 CET 2010 x86_64 x86_64 x86_64 GNU/Linux
>
> Ingo

A somewhat late comment:

gesp@a7n8x-e:~$ strings /lib/modules/*/kernel/drivers/scsi/in2000.ko | grep 2010
Sep 16 2010
gesp@a7n8x-e:~$ strings /lib/modules/*/kernel/drivers/char/nozomi.ko | grep 2010
Nozomi driver 2.1d (build date: Sep 16 2010 19:01:27)
gesp@a7n8x-e:~$ uname -a
Linux a7n8x-e 2.6.26-2-686 #1 SMP Thu Sep 16 19:35:51 UTC 2010 i686 GNU/Linux

Shouldn't removing __DATE__ and __TIME__ from module code be considered
first? That would also have the nice effect that everyone who compiles the
same code with the same compiler gets exactly the same file.

Gilles
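
As a small illustration of the point above, the sketch below shows how
__DATE__ and __TIME__ end up as a literal string inside a compiled object,
where strings(1) can find it. It is plain userspace C, not the actual driver
code, and build_stamp is a hypothetical name.

```c
#include <assert.h>

/*
 * Return a banner that embeds the compile date and time. The compiler
 * expands __DATE__ to "Mmm dd yyyy" (11 characters) and __TIME__ to
 * "hh:mm:ss" (8 characters), so the text differs from build to build.
 */
static const char *build_stamp(void)
{
    static const char stamp[] = "build date: " __DATE__ " " __TIME__;

    /* 12 (prefix) + 11 (date) + 1 (space) + 8 (time) characters */
    assert(sizeof(stamp) - 1 == 12 + 11 + 1 + 8);
    return stamp;
}
```

Because the string changes on every compile, two builds of identical source
with an identical compiler still produce different files, and the embedded
date is visible to anyone who can read the module.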

Marcus Meissner

Nov 16, 2010, 5:50:01 AM

Hi,

Making /proc/kallsyms readable only for root makes it harder
for attackers to write generic kernel exploits by removing
one source of knowledge where things are in the kernel.

This is the second submission. The discussion of the first submission
was mostly concerned that this is just one hole in the sieve ... but
one of the bigger ones.

Changing the permissions of at least System.map and vmlinux is
also required to close the same leak, but that is a packaging issue.

The goal of this starter patch and its follow-ups is to remove any kind of
kernel space address information leak from the kernel.

Ciao, Marcus

Signed-off-by: Marcus Meissner <meis...@suse.de>
Acked-by: Tejun Heo <t...@kernel.org>
Acked-by: Eugene Teo <euge...@kernel.org>
Reviewed-by: Jesper Juhl <j...@chaosbits.net>
---
kernel/kallsyms.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 6f6d091..a8db257 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -546,7 +546,7 @@ static const struct file_operations kallsyms_operations = {

 static int __init kallsyms_init(void)
 {
-	proc_create("kallsyms", 0444, NULL, &kallsyms_operations);
+	proc_create("kallsyms", 0400, NULL, &kallsyms_operations);
 	return 0;
 }
 device_initcall(kallsyms_init);
--
1.7.1

Kyle McMartin

Nov 17, 2010, 12:10:02 AM

On Tue, Nov 16, 2010 at 11:46:03AM +0100, Marcus Meissner wrote:
> Target of this starter patch and follow ups is removing any kind of
> kernel space address information leak from the kernel.
>

Er. Should probably hit /proc/modules while you're at it.

--Kyle

Kyle Moffett

Nov 17, 2010, 12:50:01 AM

Whoops... I apparently can't count to 3... (at least not correctly anyways) :-D.

On Wed, Nov 17, 2010 at 00:40, Kyle Moffett <ky...@moffetthome.net> wrote:
> (1) For 99%+ of all the computers out there you can get a 90%+
[...]
> (2) Most of the arguments about introducing "uncertainty" into the
[...]
> (2) By just flat out changing the permissions on this file you are
[...]
> (3) If you are really interested in locking down a system to this

Cheers,
Kyle Moffett

Linus Torvalds

Nov 17, 2010, 1:00:02 AM

On Tue, Nov 16, 2010 at 9:40 PM, Kyle Moffett <ky...@moffetthome.net> wrote:
>
> (1) For 99%+ of all the computers out there you can

I think that misses the point.

Security is never about absolutes. Anybody who believes in absolute
security is a moron.

True security is about "piling up the inconveniences on the attack".
Several layers. Sure, it's easy to attack a system that is a
monoculture. But immediately when you start saying "you can always
figure out the particular version" and you're talking about tens (or
hundreds) of versions, suddenly you really _are_ more secure. Because
suddenly it's one more pain.

And no, that "one more pain" is not going to be the thing that stops
attacks. But add a number of "one more pains" together, and it gets
increasingly unlikely that you will have a widespread and successful
attack.

So I do think that it's worth closing these "small" holes. Anything
that makes it more work to attack really _is_ improving things.

And being able to just immediately see the addresses is just very
convenient if you have an attack that needs kernel addresses. Much
better that we not make these things visible by default.

And yes, people can look at the vmlinux files. That's outside our
control. And maybe distros will want to close that hole, and maybe
they won't, but at least they don't have the excuse that "well, it's
not even worth it, because the kernel exports that information in
/proc/kallsyms already".

Linus

Willy Tarreau

Nov 17, 2010, 1:30:02 AM

On Tue, Nov 16, 2010 at 09:58:44PM -0800, Linus Torvalds wrote:
> So I do think that it's worth closing these "small" holes. Anything
> that makes it more work to attack really _is_ improving things.

We must keep in mind that anything which requires more work as root
for common administration opens new holes. I don't think it's the
case for kallsyms, but I mean we should not try to lock things down too
hard, otherwise everyone will have a sudoers entry to do his work, and
that's even worse than the current situation.

Willy

Ingo Molnar

Nov 18, 2010, 2:40:01 AM

Putting aside the kallsyms patch (which is a tiny part of a fuller solution), i'd
like to reply to this particular point:

* Kyle Moffett <ky...@moffetthome.net> wrote:

> (2) Most of the arguments about introducing "uncertainty" into the

> hacking process are specious as well. [...]

It is only specious if you ignore the arguments i made in the previous
discussion. One argument i made was:

Future trends are also clear: eventually, as more and more of our lives
are lived on the network, home boxes are becoming more and more valuable.
So i think concentrating on the psychology of the skilled attacker would
not be unwise. YMMV.

> [...] If a kernel bug is truly a
> "workable" vulnerability then 99%+ of the attempts to exploit it would
> be completely automated virii and computer worms that don't really
> care what happens if they fail to compromise the system. Take a look
> at the vast collection of sample code we have in the form of Windows
> virii/trojans/worms/malware/etc; care to guess what portion of those
> programs authors would shed a tear if their exploit horribly crashed
> or generated vast amounts of audit spam for 70% of the computers it
> executed on?

( You'd be a fool to think that even Windows malware authors do not care
whether they crash the target box. You do not get a botnet of 10 million PCs if
you crash 99% of them. There is an analogous concept for this in biology: if a
biological virus is _too_ deadly, it will never become a pandemic, because it has
no time/chance to spread: its outbreaks are 'detected' and 'defended against'.
Viruses like Ebola never spread widely, because they kill all their hosts. )

More importantly, look forward and take a look at the really intelligent attacks,
which are used against high-value targets with good defenses. Those real examples
give us a glimpse into future techniques, even if you do not accept my arguments
that come to a similar conclusion. Those attacks are all about avoiding detection.

Thanks,

Ingo

Ingo Molnar

Nov 18, 2010, 2:50:02 AM

* Kyle McMartin <ky...@mcmartin.ca> wrote:

> On Tue, Nov 16, 2010 at 11:46:03AM +0100, Marcus Meissner wrote:
> > Target of this starter patch and follow ups is removing any kind of
> > kernel space address information leak from the kernel.
> >
>
> Er. Should probably hit /proc/modules while you're at it.

Agreed. A few other kernel address things that should be hidden are:

1) /proc/<PID>/stack

Gives out kernel addresses and is a partial /proc/kallsyms table in essence. This
got introduced recently. Useful to attackers.

Then there's a handful of physical address leaks - those are less useful but useful
in some situations:

2) /proc/mtrr

Gives some idea about the physical layout of the machine and can give information
about the location of certain physical devices as well. Limited but nonzero utility
to attackers.

3) /proc/asound/cards

Can give out the physical address of a device. Limited but nonzero utility to
attackers.

4) /sys/devices/*/*/resources

Shows physical addresses. Limited but nonzero utility to attackers.

Plus there's some really limited fractional pieces of information - again, of
nonzero utility to attackers:

5) /proc/net/ptype

Shows the sizes of a few kernel functions in networking code. Very limited but
nonzero utility to attackers.

6) /sys/kernel/slab/*/ctor

Shows the sizes of a few kernel functions. Very limited but nonzero utility to
attackers.

7) /sys/module/*/sections/*

For example:

/sys/module/sunrpc/sections/__bug_table
/sys/module/sunrpc/sections/__ex_table
/sys/module/sunrpc/sections/__ksymtab
/sys/module/sunrpc/sections/__ksymtab_gpl
/sys/module/sunrpc/sections/__ksymtab_strings
/sys/module/sunrpc/sections/__mcount_loc
/sys/module/sunrpc/sections/__param

Potentially useful to attackers.

There's probably a few more i missed.

Ingo
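
A quick way to audit a list like the one above is to check which files are
still readable by everyone. The following is a minimal userspace sketch; the
function names are made up for this example. It only inspects the classic
mode bits, and the glob-style entries in the list (such as /proc/<PID>/stack
or /sys/module/*/sections/*) would first have to be expanded into concrete
paths, for instance with glob(3).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>

/* 1 if "others" may read the file, 0 if not, -1 if it cannot be stat()ed. */
static int is_world_readable(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return -1;
    return (st.st_mode & S_IROTH) ? 1 : 0;
}

/* Print every world-readable path in the list and return how many there are. */
static size_t count_world_readable(const char *const *paths, size_t n)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++) {
        if (is_world_readable(paths[i]) == 1) {
            printf("world-readable: %s\n", paths[i]);
            count++;
        }
    }
    return count;
}
```

It could be called with an array holding fixed paths from the list, for
example "/proc/mtrr", "/proc/asound/cards" and "/proc/net/ptype"; files that
do not exist on a given machine are simply not counted.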

Sarah Sharp

Nov 19, 2010, 2:20:02 PM

On Tue, Nov 16, 2010 at 11:46:03AM +0100, Marcus Meissner wrote:

> Hi,
>
> Making /proc/kallsyms readable only for root makes it harder
> for attackers to write generic kernel exploits by removing
> one source of knowledge where things are in the kernel.
>
> This is the second submit, discussion happened on this on first submit
> and mostly concerned that this is just one hole of the sieve ... but
> one of the bigger ones.
>
> Changing the permissions of at least System.map and vmlinux is
> also required to fix the same set, but a packaging issue.
>
> Target of this starter patch and follow ups is removing any kind of
> kernel space address information leak from the kernel.
>
> Ciao, Marcus
>
> Signed-off-by: Marcus Meissner <meis...@suse.de>
> Acked-by: Tejun Heo <t...@kernel.org>
> Acked-by: Eugene Teo <euge...@kernel.org>
> Reviewed-by: Jesper Juhl <j...@chaosbits.net>

On Wednesday, I updated my branch to commit 460781b from linus' tree,
and my box would not boot. klogd segfaulted, which stalled the whole
system.

At first I thought it actually hung the box, but it continued booting
after 5 minutes, and I was able to log in. It dropped back to the text
console instead of the graphical bootup display for that period of time.
dmesg surprisingly still works. I've bisected the problem down to this
commit (commit 59365d136d205cc20fe666ca7f89b1c5001b0d5a in
linus/master).

.config and dmesg are attached. The box is running klogd 1.5.5ubuntu3
(from Jaunty). Yes, I know that's old. I read the bit in the commit
about changing the permissions of kallsyms after boot, but if I can't
boot that doesn't help. Perhaps this can be made a configuration
option?

Sarah Sharp

.config-broadway

klogd-segfault-2010-11-17-17-04.log

Linus Torvalds

Nov 19, 2010, 3:00:01 PM

On Fri, Nov 19, 2010 at 11:19 AM, Sarah Sharp
<sarah....@linux.intel.com> wrote:
>
> .config and dmesg are attached. The box is running klogd 1.5.5ubuntu3
> (from Jaunty). Yes, I know that's old. I read the bit in the commit
> about changing the permissions of kallsyms after boot, but if I can't
> boot that doesn't help. Perhaps this can be made a configuration
> option?

It's not worth a config option.

If it actually breaks user-space, I think we should just revert it.
It's kind of sad to default to the world-visible thing, but as I
mentioned in the commit, this is something where a sysadmin or distro
can trivially just fix it at boot-time too, with just a

chmod og-r /proc/kallsyms

in your bootup scripts.

And if somebody has taken control of the machine _before_ the bootup
scripts get to run, you have bigger problems than a /proc/kallsyms
file.

So I guess I'll revert it.

Thanks for testing and bisecting.

Linus

da...@lang.hm

Nov 19, 2010, 3:00:01 PM

On Fri, 19 Nov 2010, Linus Torvalds wrote:

> On Fri, Nov 19, 2010 at 11:19 AM, Sarah Sharp
> <sarah....@linux.intel.com> wrote:
>>
>> .config and dmesg are attached. The box is running klogd 1.5.5ubuntu3
>> (from Jaunty). Yes, I know that's old. I read the bit in the commit
>> about changing the permissions of kallsyms after boot, but if I can't
>> boot that doesn't help. Perhaps this can be made a configuration
>> option?
>
> It's not worth a config option.
>
> If it actually breaks user-space, I think we should just revert it.

How far back do we need to maintain compatibility with userspace?

Is this something that we can revisit in a few years and lock it down
then?

David Lang

Linus Torvalds

Nov 19, 2010, 3:10:02 PM

On Fri, Nov 19, 2010 at 11:58 AM, <da...@lang.hm> wrote:
>
> how far back do we need to maintain compatibility with userspace?
>
> Is this something that we can revisit in a few years and lock it down then?

The rule is basically "we never break user space".

But the "out" to that rule is that "if nobody notices, it's not
broken". In a few years? Who knows?

So breaking user space is a bit like trees falling in the forest. If
there's nobody around to see it, did it really break?

Willy Tarreau

Nov 19, 2010, 3:20:02 PM

On Fri, Nov 19, 2010 at 12:04:47PM -0800, Linus Torvalds wrote:
> On Fri, Nov 19, 2010 at 11:58 AM, <da...@lang.hm> wrote:
> >
> > how far back do we need to maintain compatibility with userspace?
> >
> > Is this something that we can revisit in a few years and lock it down then?
>
> The rule is basically "we never break user space".
>
> But the "out" to that rule is that "if nobody notices, it's not
> broken". In a few years? Who knows?
>
> So breaking user space is a bit like trees falling in the forest. If
> there's nobody around to see it, did it really break?

FWIW, I appreciate a lot that non-breaking rule. I have some testing
machines which boot from PXE or USB on a file-system with some old
tools and libc, that are both 2.4 and 2.6 compatible. Everything works
like a charm, the only point of care was to have both module-init-tools
and modutils (obviously) but even that integrates smoothly.

I know quite a lot of people who never replace user-space but only
kernels on their systems, so this non-breaking rule is much welcome !

Willy

da...@lang.hm

Nov 19, 2010, 4:00:03 PM

On Fri, 19 Nov 2010, Willy Tarreau wrote:

> On Fri, Nov 19, 2010 at 12:04:47PM -0800, Linus Torvalds wrote:
>> On Fri, Nov 19, 2010 at 11:58 AM, <da...@lang.hm> wrote:
>>>
>>> how far back do we need to maintain compatibility with userspace?
>>>
>>> Is this something that we can revisit in a few years and lock it down then?
>>
>> The rule is basically "we never break user space".
>>
>> But the "out" to that rule is that "if nobody notices, it's not
>> broken". In a few years? Who knows?
>>
>> So breaking user space is a bit like trees falling in the forest. If
>> there's nobody around to see it, did it really break?
>
> FWIW, I appreciate a lot that non-breaking rule. I have some testing
> machines which boot from PXE or USB on a file-system with some old
> tools and libc, that are both 2.4 and 2.6 compatible. Everything works
> like a charm, the only point of care was to have both module-init-tools
> and modutils (obviously) but even that integrates smoothly.
>
> I know quite a lot of people who never replace user-space but only
> kernels on their systems, so this non-breaking rule is much welcome !

Please don't get me wrong, as a general rule I like it a lot (I almost
never run the stock kernel from a distro and I upgrade kernels _far_ more
frequently than anything else).

However, like every other general rule, there are reasons to make
exceptions.

In this case we are changing the default to make it more secure, I think
that's worth something.

Yes, distros can all add the chmod command to their startup to get similar
behavior. But by the same token, if we change the default, someone running
an old distro can add a chmod command into their bootup to allow their old
software to still work. In the case that has been identified, the problem
is that syslog is unable to get the kernel messages. This can be
important, but in my opinion it's a long way from being a fatal flaw. I've
already seen this sort of problem happen in the wild without this change:
I was running a development version of rsyslog on an Ubuntu system a
year or so ago (before they switched to rsyslog), and I had a situation where
firing up rsyslog would generate a lot of messages about being unable to
read the kernel logs (I don't remember the exact message; it wasn't this
kallsyms file, it was something else).

My full-time job is in security for banks, so I'm a bit more sensitive to
the security issues than most people (but tend to agree with Linus about
the security industry and security circus), but I see this as something
that is useful enough to put in (with a compile-time flag if the
compatibility is that critical for this function). I expect that there are
going to be a few more security patches coming down the road that would be
good to put under the same or similar flag (either because they may break
some old software like eliminating /dev/kmem, or because they add a
slight amount of overhead like the nx/read-only patches). As a result I
think something similar to the 'embedded' option would be appropriate,
have these new features on by default, but have some way that people who
need to disable them can do so.

David Lang

Andy Walls

Nov 19, 2010, 4:20:02 PM

> On Fri, Nov 19, 2010 at 11:19 AM, Sarah Sharp
> <sarah....@linux.intel.com> wrote:
> >
> > .config and dmesg are attached. The box is running klogd 1.5.5ubuntu3
> > (from Jaunty). Yes, I know that's old. I read the bit in the commit
> > about changing the permissions of kallsyms after boot, but if I can't
> > boot that doesn't help. Perhaps this can be made a configuration
> > option?
>
> It's not worth a config option.
>
> If it actually breaks user-space, I think we should just revert it.

User space klogd is what's broken in this case:

    ksyms = fopen(KSYMS, "r");

    if ( ksyms == NULL )
    {
        if ( errno == ENOENT )
            Syslog(LOG_INFO, "No module symbols loaded - "
                   "kernel modules not enabled.\n");
        else
            Syslog(LOG_ERR, "Error loading kernel symbols " \
                   "- %s\n", strerror(errno));
        fclose(ksyms);
        return(0);
    }

The fclose(NULL) is a bug, as I don't think the standards require
that to be handled gracefully.

> It's kind of sad to default to the world-visible thing,

klogd also gets symbols from System.map, so /proc/kallsyms access
is not a strict requirement.

I haven't checked to see if klogd can work without a symbol source
at all, but I'll wager it can.

Regards,
Andy
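
For comparison, a corrected version of that error path might look like the
sketch below. This is not the actual klogd fix: the function name
load_symbols is made up, fprintf(stderr, ...) stands in for klogd's Syslog(),
and the symbol parsing is left out. The only point is that the failure branch
returns without calling fclose() on a NULL stream.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the symbol file could be opened, 0 otherwise. */
static int load_symbols(const char *path)
{
    FILE *ksyms;
    int saved_errno;

    assert(path != NULL);
    ksyms = fopen(path, "r");
    if (ksyms == NULL) {
        saved_errno = errno;  /* keep the reason before calling anything else */
        if (saved_errno == ENOENT)
            fprintf(stderr, "No module symbols loaded - "
                    "kernel modules not enabled.\n");
        else
            fprintf(stderr, "Error loading kernel symbols - %s\n",
                    strerror(saved_errno));
        return 0;  /* no fclose() here: there is no open stream to close */
    }

    /* ... symbol parsing would go here ... */

    fclose(ksyms);
    return 1;
}
```

With /proc/kallsyms at mode 0400, an unprivileged caller would be expected to
get a permission error from fopen() and land in the else branch, which here
just logs the error and returns.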

Linus Torvalds

Nov 19, 2010, 6:30:02 PM

On Fri, Nov 19, 2010 at 1:12 PM, Andy Walls <an...@silverblocksystems.net> wrote:
>>
>> If it actually breaks user-space, I think we should just revert it.
>
> User space klogd is what's broken in this case:

Sure. I'm not surprised. I didn't really expect the /proc/kallsyms
mode change to trigger anything like what Sarah reported, and
user-space just being buggy because the error case had never even been
tested is quite understandable.

But the thing is, it doesn't even matter.

The rule is not "we don't break non-buggy user space" or "we don't
break reasonable user-space". The rule is simply "we don't break
user-space".

Even if the breakage is totally incidental, that doesn't help the
_user_. It's still breakage.

We still have magic scheduler debug options to run children before
parents after fork, simply because that used to _hide_ a race
condition in some older "bash" versions (or maybe it was the other way
around, whatever).

The thing is, bugs happen. And if they never had test coverage, we
can't blame people for them. Saying "tough luck, we changed it, and
you did something wrong" may be manly, but it's also unacceptable. The
developer may fix his bug, but there's still users out there.

Now, there _are_ exceptions. There are always exceptions. Intelligent
people don't run things off a script, and it's obviously always to
some degree a judgment call. The breakage has to be balanced against
the upsides. If the kernel behavior change is due to some fundamental
security issue or a major redesign that we _had_ to do to make
progress, and the user-level breakage is reasonably well-contained,
we'll just say "sorry, we had to do it".

In this case, the upside just wasn't big enough to accept _any_
breakage, especially since people and distributions can just do the
"chmod" themselves if they want to. There was a lot of discussion
whether the patch should even go in in the first place. So this time,
the "let's just revert it" was a very easy decision for me.

Linus
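
The "chmod" that Linus says people and distributions can do themselves might look like the following from a boot-time helper. This is only a minimal sketch: the function name restrict_to_owner_read is hypothetical, it has to run as root to change /proc/kallsyms, and whether the change has to be repeated for other /proc mounts is the concern raised in the next reply.

```c
#include <errno.h>
#include <sys/stat.h>

/* Hypothetical helper: make `path` readable by its owner only (mode 0400).
 * Returns 0 on success or a negative errno value on failure. */
int restrict_to_owner_read(const char *path)
{
	if (chmod(path, S_IRUSR) != 0)
		return -errno;
	return 0;
}
```

A caller would use it as restrict_to_owner_read("/proc/kallsyms") and log the returned error code if it is non-zero.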

Kees Cook

Nov 19, 2010, 9:50:01 PM

On Fri, Nov 19, 2010 at 03:22:00PM -0800, Linus Torvalds wrote:
> In this case, the upside just wasn't big enough to accept _any_
> breakage, especially since people and distributions can just do the
> "chmod" themselves if they want to. There was a lot of discussion
> whether the patch should even go in in the first place. So this time,
> the "let's just revert it" was a very easy decision for me.

The downside is that /proc can be remounted multiple times for different
containers, etc. Having to patch everything that mounts /proc to do the
chmod seems much more painful than fixing a simple userspace bug in an old
klog daemon.

(For example, rsyslogd handles this fine since it is root when it opens the
file, and even if the open fails, it doesn't do the broken fclose().)

-Kees

--
Kees Cook
Ubuntu Security Team
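
The userspace fix Kees refers to can be sketched as follows, assuming the klogd bug is the common pattern of calling fclose() on a stream that fopen() failed to open. The helper name count_symbol_lines is hypothetical and is not taken from klogd or rsyslogd; it only illustrates checking the fopen() result before touching the stream.

```c
#include <stdio.h>

/* Hypothetical helper: count the lines in a symbol file such as
 * /proc/kallsyms. Returns -1 if the file cannot be opened. */
long count_symbol_lines(const char *path)
{
	FILE *fp = fopen(path, "r");
	long lines = 0;
	int c;

	if (!fp)
		return -1;	/* nothing was opened, so nothing to fclose() */

	while ((c = fgetc(fp)) != EOF)
		if (c == '\n')
			lines++;

	fclose(fp);
	return lines;
}
```

With this shape, an unreadable /proc/kallsyms only means the daemon runs without symbol lookup instead of crashing.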

Kees Cook

Nov 19, 2010, 10:30:02 PM

On Thu, Nov 18, 2010 at 08:48:04AM +0100, Ingo Molnar wrote:
> Agreed. A few other kernel address things that should be hidden are:

> [snip]

For reference, here's what GRKERNSEC_HIDESYM looks like in grsecurity.
It's quite a sledgehammer, but it does help to point out at least the
minimum number of things that need fixing.

And, more directly related to this thread, kallsyms hiding is implemented
in s_show instead of via DAC:

@@ -464,6 +467,11 @@ static int s_show(struct seq_file *m, vo
{
struct kallsym_iter *iter = m->private;

+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ if (current_uid())
+ return 0;
+#endif
+
/* Some debugging symbols have no name. Ignore them. */
if (!iter->name[0])
return 0;
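
From userspace, the difference between the two approaches would presumably show up as follows: with a DAC mode of 0400 the open itself fails for non-root, while with the s_show check the open succeeds and the read simply returns nothing. A minimal sketch of a probe that tells these cases apart, where the enum and function names are hypothetical:

```c
#include <stdio.h>

enum sym_visibility {
	SYM_OPEN_DENIED,	/* open failed, e.g. a DAC mode of 0400 */
	SYM_OPEN_EMPTY,		/* opened, but no bytes were returned */
	SYM_OPEN_READABLE	/* opened and at least one byte was returned */
};

/* Hypothetical probe: classify how `path` behaves for the calling user. */
enum sym_visibility probe_symbol_file(const char *path)
{
	FILE *fp = fopen(path, "r");
	int c;

	if (!fp)
		return SYM_OPEN_DENIED;

	c = fgetc(fp);
	fclose(fp);
	return c == EOF ? SYM_OPEN_EMPTY : SYM_OPEN_READABLE;
}
```

A tool that only checks whether the open succeeded would treat the s_show variant as a normal, empty symbol table rather than as a permission error.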

Here's the rest, manually extracted, untested, etc...

diff -urNp linux-2.6.36/drivers/message/fusion/mptbase.c linux-2.6.36/drivers/message/fusion/mptbase.c
--- linux-2.6.36/drivers/message/fusion/mptbase.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/drivers/message/fusion/mptbase.c 2010-11-06 19:06:37.000000000 -0400
@@ -6681,8 +6681,13 @@ static int mpt_iocinfo_proc_show(struct
seq_printf(m, " MaxChainDepth = 0x%02x frames\n", ioc->facts.MaxChainDepth);
seq_printf(m, " MinBlockSize = 0x%02x bytes\n", 4*ioc->facts.BlockSize);

+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ seq_printf(m, " RequestFrames @ 0x%p (Dma @ 0x%p)\n", NULL, NULL);
+#else
seq_printf(m, " RequestFrames @ 0x%p (Dma @ 0x%p)\n",
(void *)ioc->req_frames, (void *)(ulong)ioc->req_frames_dma);
+#endif
+
/*
* Rounding UP to nearest 4-kB boundary here...
*/
diff -urNp linux-2.6.36/fs/proc/array.c linux-2.6.36/fs/proc/array.c
--- linux-2.6.36/fs/proc/array.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/fs/proc/array.c 2010-11-06 18:58:50.000000000 -0400
@@ -452,6 +452,12 @@ static int do_task_stat(struct seq_file
gtime = task->gtime;
}

+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ wchan = 0;
+ eip =0;
+ esp =0;
+#endif
+
/* scale priority and nice values from timeslices to -20..20 */
/* to make it look like a "normal" Unix priority/nice value */
priority = task_prio(task);
diff -urNp linux-2.6.36/fs/proc/base.c linux-2.6.36/fs/proc/base.c
--- linux-2.6.36/fs/proc/base.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/fs/proc/base.c 2010-11-06 18:58:50.000000000 -0400
@@ -296,7 +296,7 @@ static int proc_pid_auxv(struct task_str
}

-#ifdef CONFIG_KALLSYMS
+#if defined(CONFIG_KALLSYMS) && !defined(CONFIG_GRKERNSEC_HIDESYM)
/*
* Provides a wchan file via kallsyms in a proper one-value-per-file format.
* Returns the resolved symbol. If that fails, simply return the address.
@@ -318,7 +318,7 @@ static int proc_pid_wchan(struct task_st
}
#endif /* CONFIG_KALLSYMS */

-#ifdef CONFIG_STACKTRACE
+#if defined(CONFIG_STACKTRACE) && !defined(CONFIG_GRKERNSEC_HIDESYM)

#define MAX_STACK_TRACE_DEPTH 64

@@ -2705,10 +2705,10 @@ static const struct pid_entry tgid_base_
#ifdef CONFIG_SECURITY
DIR("attr", S_IRUGO|S_IXUGO, proc_attr_dir_inode_operations, proc_attr_dir_operations),
#endif
-#ifdef CONFIG_KALLSYMS
+#if defined(CONFIG_KALLSYMS) && !defined(CONFIG_GRKERNSEC_HIDESYM)
INF("wchan", S_IRUGO, proc_pid_wchan),
#endif
-#ifdef CONFIG_STACKTRACE
+#if defined(CONFIG_STACKTRACE) && !defined(CONFIG_GRKERNSEC_HIDESYM)
ONE("stack", S_IRUSR, proc_pid_stack),
#endif
#ifdef CONFIG_SCHEDSTATS
@@ -3040,10 +3040,10 @@ static const struct pid_entry tid_base_s
#ifdef CONFIG_SECURITY
DIR("attr", S_IRUGO|S_IXUGO, proc_attr_dir_inode_operations, proc_attr_dir_operations),
#endif
-#ifdef CONFIG_KALLSYMS
+#if defined(CONFIG_KALLSYMS) && !defined(CONFIG_GRKERNSEC_HIDESYM)
INF("wchan", S_IRUGO, proc_pid_wchan),
#endif
-#ifdef CONFIG_STACKTRACE
+#if defined(CONFIG_STACKTRACE) && !defined(CONFIG_GRKERNSEC_HIDESYM)
ONE("stack", S_IRUSR, proc_pid_stack),
#endif
#ifdef CONFIG_SCHEDSTATS
diff -urNp linux-2.6.36/fs/proc/kcore.c linux-2.6.36/fs/proc/kcore.c
--- linux-2.6.36/fs/proc/kcore.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/fs/proc/kcore.c 2010-11-06 18:58:50.000000000 -0400
@@ -542,6 +542,9 @@ read_kcore(struct file *file, char __use

static int open_kcore(struct inode *inode, struct file *filp)
{
+#if defined(CONFIG_GRKERNSEC_HIDESYM)
+ return -EPERM;
+#endif
if (!capable(CAP_SYS_RAWIO))
return -EPERM;
if (kcore_need_update)
diff -urNp linux-2.6.36/include/linux/kallsyms.h linux-2.6.36/include/linux/kallsyms.h
--- linux-2.6.36/include/linux/kallsyms.h 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/include/linux/kallsyms.h 2010-11-15 17:10:35.000000000 -0500
@@ -15,7 +15,8 @@

struct module;

-#ifdef CONFIG_KALLSYMS
+#if !defined(__INCLUDED_BY_HIDESYM) || !defined(CONFIG_KALLSYMS)
+#if defined(CONFIG_KALLSYMS) && !defined(CONFIG_GRKERNSEC_HIDESYM)
/* Lookup the address for a symbol. Returns 0 if not found. */
unsigned long kallsyms_lookup_name(const char *name);

@@ -92,6 +93,14 @@ static inline int lookup_symbol_attrs(un
/* Stupid that this does nothing, but I didn't create this mess. */
#define __print_symbol(fmt, addr)
#endif /*CONFIG_KALLSYMS*/
+#else /* when included by kallsyms.c or vsnprintf.c, with HIDESYM enabled */
+extern void __print_symbol(const char *fmt, unsigned long address);
+extern int sprint_symbol(char *buffer, unsigned long address);
+const char *kallsyms_lookup(unsigned long addr,
+ unsigned long *symbolsize,
+ unsigned long *offset,
+ char **modname, char *namebuf);
+#endif

/* This macro allows us to keep printk typechecking */
static void __check_printsym_format(const char *fmt, ...)
diff -urNp linux-2.6.36/kernel/configs.c linux-2.6.36/kernel/configs.c
--- linux-2.6.36/kernel/configs.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/kernel/configs.c 2010-11-06 18:58:50.000000000 -0400
@@ -73,8 +73,14 @@ static int __init ikconfig_init(void)
struct proc_dir_entry *entry;

/* create the current config file */
+#if defined(CONFIG_GRKERNSEC_HIDESYM)
+ entry = proc_create("config.gz", S_IFREG | S_IRUSR, NULL,
+ &ikconfig_file_ops);
+#else
entry = proc_create("config.gz", S_IFREG | S_IRUGO, NULL,
&ikconfig_file_ops);
+#endif
+
if (!entry)
return -ENOMEM;

diff -urNp linux-2.6.36/kernel/kallsyms.c linux-2.6.36/kernel/kallsyms.c
--- linux-2.6.36/kernel/kallsyms.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/kernel/kallsyms.c 2010-11-06 18:58:50.000000000 -0400
@@ -11,6 +11,9 @@
* Changed the compression method from stem compression to "table lookup"
* compression (see scripts/kallsyms.c for a more complete description)
*/
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+#define __INCLUDED_BY_HIDESYM 1
+#endif
#include <linux/kallsyms.h>
#include <linux/module.h>
#include <linux/init.h>
@@ -464,6 +467,11 @@ static int s_show(struct seq_file *m, vo
{
struct kallsym_iter *iter = m->private;

+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ if (current_uid())
+ return 0;
+#endif
+
/* Some debugging symbols have no name. Ignore them. */
if (!iter->name[0])
return 0;
@@ -504,7 +512,7 @@ static int kallsyms_open(struct inode *i
struct kallsym_iter *iter;
int ret;

- iter = kmalloc(sizeof(*iter), GFP_KERNEL);
+ iter = kzalloc(sizeof(*iter), GFP_KERNEL);
if (!iter)
return -ENOMEM;
reset_iter(iter, 0);
diff -urNp linux-2.6.36/kernel/module.c linux-2.6.36/kernel/module.c
--- linux-2.6.36/kernel/module.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/kernel/module.c 2010-11-06 18:58:50.000000000 -0400
@@ -3075,6 +3075,11 @@ static const struct file_operations proc

static int __init proc_modules_init(void)
{
+#ifndef CONFIG_GRKERNSEC_HIDESYM
+ proc_create("modules", S_IRUSR, NULL, &proc_modules_operations);
+#else
+ proc_create("modules", S_IRUSR, NULL, &proc_modules_operations);
+#endif
return 0;
}
module_init(proc_modules_init);
diff -urNp linux-2.6.36/kernel/time/timer_list.c linux-2.6.36/kernel/time/timer_list.c
--- linux-2.6.36/kernel/time/timer_list.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/kernel/time/timer_list.c 2010-11-06 18:58:50.000000000 -0400
@@ -38,12 +38,16 @@ DECLARE_PER_CPU(struct hrtimer_cpu_base,

static void print_name_offset(struct seq_file *m, void *sym)
{
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ SEQ_printf(m, "<%p>", NULL);
+#else
char symname[KSYM_NAME_LEN];

if (lookup_symbol_name((unsigned long)sym, symname) < 0)
SEQ_printf(m, "<%p>", sym);
else
SEQ_printf(m, "%s", symname);
+#endif
}

static void
@@ -112,7 +116,11 @@ next_one:
static void
print_base(struct seq_file *m, struct hrtimer_clock_base *base, u64 now)
{
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ SEQ_printf(m, " .base: %p\n", NULL);
+#else
SEQ_printf(m, " .base: %p\n", base);
+#endif
SEQ_printf(m, " .index: %d\n",
base->index);
SEQ_printf(m, " .resolution: %Lu nsecs\n",
diff -urNp linux-2.6.36/kernel/time/timer_stats.c linux-2.6.36/kernel/time/timer_stats.c
--- linux-2.6.36/kernel/time/timer_stats.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/kernel/time/timer_stats.c 2010-11-06 18:58:50.000000000 -0400
@@ -269,12 +269,16 @@ void timer_stats_update_stats(void *time

static void print_name_offset(struct seq_file *m, unsigned long addr)
{
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ seq_printf(m, "<%p>", NULL);
+#else
char symname[KSYM_NAME_LEN];

if (lookup_symbol_name(addr, symname) < 0)
seq_printf(m, "<%p>", (void *)addr);
else
seq_printf(m, "%s", symname);
+#endif
}

static int tstats_show(struct seq_file *m, void *v)
diff -urNp linux-2.6.36/lib/Kconfig.debug linux-2.6.36/lib/Kconfig.debug
--- linux-2.6.36/lib/Kconfig.debug 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/lib/Kconfig.debug 2010-11-06 19:03:24.000000000 -0400
@@ -998,6 +998,7 @@ config LATENCYTOP
depends on DEBUG_KERNEL
depends on STACKTRACE_SUPPORT
depends on PROC_FS
+ depends on !GRKERNSEC_HIDESYM
select FRAME_POINTER if !MIPS && !PPC && !S390 && !MICROBLAZE
select KALLSYMS
select KALLSYMS_ALL
diff -urNp linux-2.6.36/lib/vsprintf.c linux-2.6.36/lib/vsprintf.c
--- linux-2.6.36/lib/vsprintf.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/lib/vsprintf.c 2010-11-13 16:31:35.000000000 -0500
@@ -16,6 +16,9 @@
* - scnprintf and vscnprintf
*/

+#ifdef CONFIG_GRKERNSEC_HIDESYM
+#define __INCLUDED_BY_HIDESYM 1
+#endif
#include <stdarg.h>
#include <linux/module.h>
#include <linux/types.h>
@@ -574,7 +577,7 @@ char *symbol_string(char *buf, char *end
unsigned long value = (unsigned long) ptr;
#ifdef CONFIG_KALLSYMS
char sym[KSYM_SYMBOL_LEN];
- if (ext != 'f' && ext != 's')
+ if (ext != 'f' && ext != 's' && ext != 'a')
sprint_symbol(sym, value);
else
kallsyms_lookup(value, NULL, NULL, NULL, sym);
@@ -947,6 +950,8 @@ char *uuid_string(char *buf, char *end,
* - 'f' For simple symbolic function names without offset
* - 'S' For symbolic direct pointers with offset
* - 's' For symbolic direct pointers without offset
+ * - 'A' For symbolic direct pointers with offset approved for use with GRKERNSEC_HIDESYM
+ * - 'a' For symbolic direct pointers without offset approved for use with GRKERNSEC_HIDESYM
* - 'R' For decoded struct resource, e.g., [mem 0x0-0x1f 64bit pref]
* - 'r' For raw struct resource, e.g., [mem 0x0-0x1f flags 0x201]
* - 'M' For a 6-byte MAC address, it prints the address in the
@@ -989,7 +994,7 @@ char *pointer(const char *fmt, char *buf
struct printf_spec spec)
{
if (!ptr)
- return string(buf, end, "(null)", spec);
+ return string(buf, end, "(nil)", spec);

switch (*fmt) {
case 'F':
@@ -998,6 +1003,13 @@ char *pointer(const char *fmt, char *buf
/* Fallthrough */
case 'S':
case 's':
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ break;
+#else
+ return symbol_string(buf, end, ptr, spec, *fmt);
+#endif
+ case 'A':
+ case 'a':
return symbol_string(buf, end, ptr, spec, *fmt);
case 'R':
case 'r':
diff -urNp linux-2.6.36/net/atm/proc.c linux-2.6.36/net/atm/proc.c
--- linux-2.6.36/net/atm/proc.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/net/atm/proc.c 2010-11-06 18:58:50.000000000 -0400
@@ -190,7 +190,12 @@ static void vcc_info(struct seq_file *se
{
struct sock *sk = sk_atm(vcc);

+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ seq_printf(seq, "%p ", NULL);
+#else
seq_printf(seq, "%p ", vcc);
+#endif
+
if (!vcc->dev)
seq_printf(seq, "Unassigned ");
else
diff -urNp linux-2.6.36/net/ipv4/inet_diag.c linux-2.6.36/net/ipv4/inet_diag.c
--- linux-2.6.36/net/ipv4/inet_diag.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/net/ipv4/inet_diag.c 2010-11-13 16:33:13.000000000 -0500
@@ -114,8 +114,14 @@ static int inet_csk_diag_fill(struct soc
r->idiag_retrans = 0;

r->id.idiag_if = sk->sk_bound_dev_if;
+
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ r->id.idiag_cookie[0] = 0;
+ r->id.idiag_cookie[1] = 0;
+#else
r->id.idiag_cookie[0] = (u32)(unsigned long)sk;
r->id.idiag_cookie[1] = (u32)(((unsigned long)sk >> 31) >> 1);
+#endif

r->id.idiag_sport = inet->inet_sport;
r->id.idiag_dport = inet->inet_dport;
@@ -201,8 +207,15 @@ static int inet_twsk_diag_fill(struct in
r->idiag_family = tw->tw_family;
r->idiag_retrans = 0;
r->id.idiag_if = tw->tw_bound_dev_if;
+
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ r->id.idiag_cookie[0] = 0;
+ r->id.idiag_cookie[1] = 0;
+#else
r->id.idiag_cookie[0] = (u32)(unsigned long)tw;
r->id.idiag_cookie[1] = (u32)(((unsigned long)tw >> 31) >> 1);
+#endif
+
r->id.idiag_sport = tw->tw_sport;
r->id.idiag_dport = tw->tw_dport;
r->id.idiag_src[0] = tw->tw_rcv_saddr;
@@ -285,12 +298,14 @@ static int inet_diag_get_exact(struct sk
if (sk == NULL)
goto unlock;

+#ifndef CONFIG_GRKERNSEC_HIDESYM
err = -ESTALE;
if ((req->id.idiag_cookie[0] != INET_DIAG_NOCOOKIE ||
req->id.idiag_cookie[1] != INET_DIAG_NOCOOKIE) &&
((u32)(unsigned long)sk != req->id.idiag_cookie[0] ||
(u32)((((unsigned long)sk) >> 31) >> 1) != req->id.idiag_cookie[1]))
goto out;
+#endif

err = -ENOMEM;
rep = alloc_skb(NLMSG_SPACE((sizeof(struct inet_diag_msg) +
@@ -578,8 +593,14 @@ static int inet_diag_fill_req(struct sk_
r->idiag_retrans = req->retrans;

r->id.idiag_if = sk->sk_bound_dev_if;
+
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ r->id.idiag_cookie[0] = 0;
+ r->id.idiag_cookie[1] = 0;
+#else
r->id.idiag_cookie[0] = (u32)(unsigned long)req;
r->id.idiag_cookie[1] = (u32)(((unsigned long)req >> 31) >> 1);
+#endif

tmo = req->expires - jiffies;
if (tmo < 0)
diff -urNp linux-2.6.36/net/ipv4/tcp_ipv4.c linux-2.6.36/net/ipv4/tcp_ipv4.c
--- linux-2.6.36/net/ipv4/tcp_ipv4.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/net/ipv4/tcp_ipv4.c 2010-11-06 19:08:40.000000000 -0400
@@ -2400,7 +2400,11 @@ static void get_openreq4(struct sock *sk
0, /* non standard timer */
0, /* open_requests have no inode */
atomic_read(&sk->sk_refcnt),
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ NULL,
+#else
req,
+#endif
len);
}

@@ -2450,7 +2454,12 @@ static void get_tcp4_sock(struct sock *s
sock_i_uid(sk),
icsk->icsk_probes_out,
sock_i_ino(sk),
- atomic_read(&sk->sk_refcnt), sk,
+ atomic_read(&sk->sk_refcnt),
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ NULL,
+#else
+ sk,
+#endif
jiffies_to_clock_t(icsk->icsk_rto),
jiffies_to_clock_t(icsk->icsk_ack.ato),
(icsk->icsk_ack.quick << 1) | icsk->icsk_ack.pingpong,
@@ -2478,7 +2487,13 @@ static void get_timewait4_sock(struct in
" %02X %08X:%08X %02X:%08lX %08X %5d %8d %d %d %p%n",
i, src, srcp, dest, destp, tw->tw_substate, 0, 0,
3, jiffies_to_clock_t(ttd), 0, 0, 0, 0,
- atomic_read(&tw->tw_refcnt), tw, len);
+ atomic_read(&tw->tw_refcnt),
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ NULL,
+#else
+ tw,
+#endif
+ len);
}

#define TMPSZ 150
diff -urNp linux-2.6.36/net/ipv4/udp.c linux-2.6.36/net/ipv4/udp.c
--- linux-2.6.36/net/ipv4/udp.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/net/ipv4/udp.c 2010-11-06 18:58:50.000000000 -0400
@@ -2051,7 +2051,12 @@ static void udp4_format_sock(struct sock
sk_wmem_alloc_get(sp),
sk_rmem_alloc_get(sp),
0, 0L, 0, sock_i_uid(sp), 0, sock_i_ino(sp),
- atomic_read(&sp->sk_refcnt), sp,
+ atomic_read(&sp->sk_refcnt),
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ NULL,
+#else
+ sp,
+#endif
atomic_read(&sp->sk_drops), len);
}

diff -urNp linux-2.6.36/net/ipv6/raw.c linux-2.6.36/net/ipv6/raw.c
--- linux-2.6.36/net/ipv6/raw.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/net/ipv6/raw.c 2010-11-06 18:58:50.000000000 -0400
@@ -1243,7 +1243,13 @@ static void raw6_sock_seq_show(struct se
0, 0L, 0,
sock_i_uid(sp), 0,
sock_i_ino(sp),
- atomic_read(&sp->sk_refcnt), sp, atomic_read(&sp->sk_drops));
+ atomic_read(&sp->sk_refcnt),
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ NULL,
+#else
+ sp,
+#endif
+ atomic_read(&sp->sk_drops));
}

static int raw6_seq_show(struct seq_file *seq, void *v)
diff -urNp linux-2.6.36/net/ipv6/tcp_ipv6.c linux-2.6.36/net/ipv6/tcp_ipv6.c
--- linux-2.6.36/net/ipv6/tcp_ipv6.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/net/ipv6/tcp_ipv6.c 2010-11-06 18:58:50.000000000 -0400
@@ -1987,7 +1987,13 @@ static void get_openreq6(struct seq_file
uid,
0, /* non standard timer */
0, /* open_requests have no inode */
- 0, req);
+ 0,
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ NULL
+#else
+ req
+#endif
+ );
}

static void get_tcp6_sock(struct seq_file *seq, struct sock *sp, int i)
@@ -2037,7 +2043,12 @@ static void get_tcp6_sock(struct seq_fil
sock_i_uid(sp),
icsk->icsk_probes_out,
sock_i_ino(sp),
- atomic_read(&sp->sk_refcnt), sp,
+ atomic_read(&sp->sk_refcnt),
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ NULL,
+#else
+ sp,
+#endif
jiffies_to_clock_t(icsk->icsk_rto),
jiffies_to_clock_t(icsk->icsk_ack.ato),
(icsk->icsk_ack.quick << 1 ) | icsk->icsk_ack.pingpong,
@@ -2072,7 +2083,13 @@ static void get_timewait6_sock(struct se
dest->s6_addr32[2], dest->s6_addr32[3], destp,
tw->tw_substate, 0, 0,
3, jiffies_to_clock_t(ttd), 0, 0, 0, 0,
- atomic_read(&tw->tw_refcnt), tw);
+ atomic_read(&tw->tw_refcnt),
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ NULL
+#else
+ tw
+#endif
+ );
}

static int tcp6_seq_show(struct seq_file *seq, void *v)
diff -urNp linux-2.6.36/net/ipv6/udp.c linux-2.6.36/net/ipv6/udp.c
--- linux-2.6.36/net/ipv6/udp.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/net/ipv6/udp.c 2010-11-06 18:58:50.000000000 -0400
@@ -1399,7 +1399,12 @@ static void udp6_sock_seq_show(struct se
0, 0L, 0,
sock_i_uid(sp), 0,
sock_i_ino(sp),
- atomic_read(&sp->sk_refcnt), sp,
+ atomic_read(&sp->sk_refcnt),
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ NULL,
+#else
+ sp,
+#endif
atomic_read(&sp->sk_drops));
}

diff -urNp linux-2.6.36/net/key/af_key.c linux-2.6.36/net/key/af_key.c
--- linux-2.6.36/net/key/af_key.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/net/key/af_key.c 2010-11-06 18:58:50.000000000 -0400
@@ -3644,7 +3644,11 @@ static int pfkey_seq_show(struct seq_fil
seq_printf(f ,"sk RefCnt Rmem Wmem User Inode\n");
else
seq_printf(f ,"%p %-6d %-6u %-6u %-6u %-6lu\n",
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ NULL,
+#else
s,
+#endif
atomic_read(&s->sk_refcnt),
sk_rmem_alloc_get(s),
sk_wmem_alloc_get(s),
diff -urNp linux-2.6.36/net/netlink/af_netlink.c linux-2.6.36/net/netlink/af_netlink.c
--- linux-2.6.36/net/netlink/af_netlink.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/net/netlink/af_netlink.c 2010-11-06 18:58:50.000000000 -0400
@@ -2007,13 +2007,21 @@ static int netlink_seq_show(struct seq_f
struct netlink_sock *nlk = nlk_sk(s);

seq_printf(seq, "%p %-3d %-6d %08x %-8d %-8d %p %-8d %-8d %-8lu\n",
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ NULL,
+#else
s,
+#endif
s->sk_protocol,
nlk->pid,
nlk->groups ? (u32)nlk->groups[0] : 0,
sk_rmem_alloc_get(s),
sk_wmem_alloc_get(s),
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ NULL,
+#else
nlk->cb,
+#endif
atomic_read(&s->sk_refcnt),
atomic_read(&s->sk_drops),
sock_i_ino(s)
diff -urNp linux-2.6.36/net/packet/af_packet.c linux-2.6.36/net/packet/af_packet.c
--- linux-2.6.36/net/packet/af_packet.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/net/packet/af_packet.c 2010-11-06 18:58:50.000000000 -0400
@@ -2637,7 +2637,11 @@ static int packet_seq_show(struct seq_fi

seq_printf(seq,
"%p %-6d %-4d %04x %-5d %1d %-6u %-6u %-6lu\n",
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ NULL,
+#else
s,
+#endif
atomic_read(&s->sk_refcnt),
s->sk_type,
ntohs(po->num),
diff -urNp linux-2.6.36/net/phonet/socket.c linux-2.6.36/net/phonet/socket.c
--- linux-2.6.36/net/phonet/socket.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/net/phonet/socket.c 2010-11-13 16:29:01.000000000 -0500
@@ -535,7 +535,12 @@ static int pn_sock_seq_show(struct seq_f
sk->sk_state,
sk_wmem_alloc_get(sk), sk_rmem_alloc_get(sk),
sock_i_uid(sk), sock_i_ino(sk),
- atomic_read(&sk->sk_refcnt), sk,
+ atomic_read(&sk->sk_refcnt),
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ NULL,
+#else
+ sk,
+#endif
atomic_read(&sk->sk_drops), &len);
}
seq_printf(seq, "%*s\n", 127 - len, "");
diff -urNp linux-2.6.36/net/sctp/proc.c linux-2.6.36/net/sctp/proc.c
--- linux-2.6.36/net/sctp/proc.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/net/sctp/proc.c 2010-11-13 16:29:01.000000000 -0500
@@ -212,7 +212,12 @@ static int sctp_eps_seq_show(struct seq_
sctp_for_each_hentry(epb, node, &head->chain) {
ep = sctp_ep(epb);
sk = epb->sk;
- seq_printf(seq, "%8p %8p %-3d %-3d %-4d %-5d %5d %5lu ", ep, sk,
+ seq_printf(seq, "%8p %8p %-3d %-3d %-4d %-5d %5d %5lu ",
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ NULL, NULL,
+#else
+ ep, sk,
+#endif
sctp_sk(sk)->type, sk->sk_state, hash,
epb->bind_addr.port,
sock_i_uid(sk), sock_i_ino(sk));
@@ -318,7 +323,12 @@ static int sctp_assocs_seq_show(struct s
seq_printf(seq,
"%8p %8p %-3d %-3d %-2d %-4d "
"%4d %8d %8d %7d %5lu %-5d %5d ",
- assoc, sk, sctp_sk(sk)->type, sk->sk_state,
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ NULL, NULL,
+#else
+ assoc, sk,
+#endif
+ sctp_sk(sk)->type, sk->sk_state,
assoc->state, hash,
assoc->assoc_id,
assoc->sndbuf_used,
diff -urNp linux-2.6.36/net/unix/af_unix.c linux-2.6.36/net/unix/af_unix.c
--- linux-2.6.36/net/unix/af_unix.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/net/unix/af_unix.c 2010-11-06 20:08:14.000000000 -0400
@@ -2195,7 +2195,11 @@ static int unix_seq_show(struct seq_file
unix_state_lock(s);

seq_printf(seq, "%p: %08X %08X %08X %04X %02X %5lu",
+#ifdef CONFIG_GRKERNSEC_HIDESYM
+ NULL,
+#else
s,
+#endif
atomic_read(&s->sk_refcnt),
0,
s->sk_state == TCP_LISTEN ? __SO_ACCEPTCON : 0,
diff -urNp linux-2.6.36/arch/powerpc/kernel/process.c linux-2.6.36/arch/powerpc/kernel/process.c
--- linux-2.6.36/arch/powerpc/kernel/process.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/arch/powerpc/kernel/process.c 2010-11-13 16:29:01.000000000 -0500
@@ -654,8 +654,8 @@ void show_regs(struct pt_regs * regs)
* Lookup NIP late so we have the best change of getting the
* above info out without failing
*/
- printk("NIP ["REG"] %pS\n", regs->nip, (void *)regs->nip);
- printk("LR ["REG"] %pS\n", regs->link, (void *)regs->link);
+ printk("NIP ["REG"] %pA\n", regs->nip, (void *)regs->nip);
+ printk("LR ["REG"] %pA\n", regs->link, (void *)regs->link);
#endif
show_stack(current, (unsigned long *) regs->gpr[1]);
if (!user_mode(regs))
@@ -1145,10 +1145,10 @@ void show_stack(struct task_struct *tsk,
newsp = stack[0];
ip = stack[STACK_FRAME_LR_SAVE];
if (!firstframe || ip != lr) {
- printk("["REG"] ["REG"] %pS", sp, ip, (void *)ip);
+ printk("["REG"] ["REG"] %pA", sp, ip, (void *)ip);
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
if ((ip == rth || ip == mrth) && curr_frame >= 0) {
- printk(" (%pS)",
+ printk(" (%pA)",
(void *)current->ret_stack[curr_frame].ret);
curr_frame--;
}
@@ -1168,7 +1168,7 @@ void show_stack(struct task_struct *tsk,
struct pt_regs *regs = (struct pt_regs *)
(sp + STACK_FRAME_OVERHEAD);
lr = regs->link;
- printk("--- Exception: %lx at %pS\n LR = %pS\n",
+ printk("--- Exception: %lx at %pA\n LR = %pA\n",
regs->trap, (void *)regs->nip, (void *)lr);
firstframe = 1;
}
diff -urNp linux-2.6.36/arch/sparc/kernel/process_32.c linux-2.6.36/arch/sparc/kernel/process_32.c
--- linux-2.6.36/arch/sparc/kernel/process_32.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/arch/sparc/kernel/process_32.c 2010-11-13 16:29:01.000000000 -0500
@@ -196,7 +196,7 @@ void __show_backtrace(unsigned long fp)
rw->ins[4], rw->ins[5],
rw->ins[6],
rw->ins[7]);
- printk("%pS\n", (void *) rw->ins[7]);
+ printk("%pA\n", (void *) rw->ins[7]);
rw = (struct reg_window32 *) rw->ins[6];
}
spin_unlock_irqrestore(&sparc_backtrace_lock, flags);
@@ -263,14 +263,14 @@ void show_regs(struct pt_regs *r)

printk("PSR: %08lx PC: %08lx NPC: %08lx Y: %08lx %s\n",
r->psr, r->pc, r->npc, r->y, print_tainted());
- printk("PC: <%pS>\n", (void *) r->pc);
+ printk("PC: <%pA>\n", (void *) r->pc);
printk("%%G: %08lx %08lx %08lx %08lx %08lx %08lx %08lx %08lx\n",
r->u_regs[0], r->u_regs[1], r->u_regs[2], r->u_regs[3],
r->u_regs[4], r->u_regs[5], r->u_regs[6], r->u_regs[7]);
printk("%%O: %08lx %08lx %08lx %08lx %08lx %08lx %08lx %08lx\n",
r->u_regs[8], r->u_regs[9], r->u_regs[10], r->u_regs[11],
r->u_regs[12], r->u_regs[13], r->u_regs[14], r->u_regs[15]);
- printk("RPC: <%pS>\n", (void *) r->u_regs[15]);
+ printk("RPC: <%pA>\n", (void *) r->u_regs[15]);

printk("%%L: %08lx %08lx %08lx %08lx %08lx %08lx %08lx %08lx\n",
rw->locals[0], rw->locals[1], rw->locals[2], rw->locals[3],
@@ -305,7 +305,7 @@ void show_stack(struct task_struct *tsk,
rw = (struct reg_window32 *) fp;
pc = rw->ins[7];
printk("[%08lx : ", pc);
- printk("%pS ] ", (void *) pc);
+ printk("%pA ] ", (void *) pc);
fp = rw->ins[6];
} while (++count < 16);
printk("\n");
diff -urNp linux-2.6.36/arch/sparc/kernel/process_64.c linux-2.6.36/arch/sparc/kernel/process_64.c
--- linux-2.6.36/arch/sparc/kernel/process_64.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/arch/sparc/kernel/process_64.c 2010-11-13 16:34:22.000000000 -0500
@@ -180,14 +180,14 @@ static void show_regwindow(struct pt_reg
printk("i4: %016lx i5: %016lx i6: %016lx i7: %016lx\n",
rwk->ins[4], rwk->ins[5], rwk->ins[6], rwk->ins[7]);
if (regs->tstate & TSTATE_PRIV)
- printk("I7: <%pS>\n", (void *) rwk->ins[7]);
+ printk("I7: <%pA>\n", (void *) rwk->ins[7]);
}

void show_regs(struct pt_regs *regs)
{
printk("TSTATE: %016lx TPC: %016lx TNPC: %016lx Y: %08x %s\n", regs->tstate,
regs->tpc, regs->tnpc, regs->y, print_tainted());
- printk("TPC: <%pS>\n", (void *) regs->tpc);
+ printk("TPC: <%pA>\n", (void *) regs->tpc);
printk("g0: %016lx g1: %016lx g2: %016lx g3: %016lx\n",
regs->u_regs[0], regs->u_regs[1], regs->u_regs[2],
regs->u_regs[3]);
@@ -200,7 +200,7 @@ void show_regs(struct pt_regs *regs)
printk("o4: %016lx o5: %016lx sp: %016lx ret_pc: %016lx\n",
regs->u_regs[12], regs->u_regs[13], regs->u_regs[14],
regs->u_regs[15]);
- printk("RPC: <%pS>\n", (void *) regs->u_regs[15]);
+ printk("RPC: <%pA>\n", (void *) regs->u_regs[15]);
show_regwindow(regs);
show_stack(current, (unsigned long *) regs->u_regs[UREG_FP]);
}
@@ -285,7 +285,7 @@ void arch_trigger_all_cpu_backtrace(void
((tp && tp->task) ? tp->task->pid : -1));

if (gp->tstate & TSTATE_PRIV) {
- printk(" TPC[%pS] O7[%pS] I7[%pS] RPC[%pS]\n",
+ printk(" TPC[%pA] O7[%pA] I7[%pA] RPC[%pA]\n",
(void *) gp->tpc,
(void *) gp->o7,
(void *) gp->i7,
diff -urNp linux-2.6.36/arch/sparc/kernel/traps_32.c linux-2.6.36/arch/sparc/kernel/traps_32.c
--- linux-2.6.36/arch/sparc/kernel/traps_32.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/arch/sparc/kernel/traps_32.c 2010-11-13 16:29:01.000000000 -0500
@@ -76,7 +76,7 @@ void die_if_kernel(char *str, struct pt_
count++ < 30 &&
(((unsigned long) rw) >= PAGE_OFFSET) &&
!(((unsigned long) rw) & 0x7)) {
- printk("Caller[%08lx]: %pS\n", rw->ins[7],
+ printk("Caller[%08lx]: %pA\n", rw->ins[7],
(void *) rw->ins[7]);
rw = (struct reg_window32 *)rw->ins[6];
}
diff -urNp linux-2.6.36/arch/sparc/kernel/traps_64.c linux-2.6.36/arch/sparc/kernel/traps_64.c
--- linux-2.6.36/arch/sparc/kernel/traps_64.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/arch/sparc/kernel/traps_64.c 2010-11-13 16:34:06.000000000 -0500
@@ -75,7 +75,7 @@ static void dump_tl1_traplog(struct tl1_
i + 1,
p->trapstack[i].tstate, p->trapstack[i].tpc,
p->trapstack[i].tnpc, p->trapstack[i].tt);
- printk("TRAPLOG: TPC<%pS>\n", (void *) p->trapstack[i].tpc);
+ printk("TRAPLOG: TPC<%pA>\n", (void *) p->trapstack[i].tpc);
}
}

@@ -1141,7 +1141,7 @@ static void cheetah_log_errors(struct pt
regs->tpc, regs->tnpc, regs->u_regs[UREG_I7], regs->tstate);
printk("%s" "ERROR(%d): ",
(recoverable ? KERN_WARNING : KERN_CRIT), smp_processor_id());
- printk("TPC<%pS>\n", (void *) regs->tpc);
+ printk("TPC<%pA>\n", (void *) regs->tpc);
printk("%s" "ERROR(%d): M_SYND(%lx), E_SYND(%lx)%s%s\n",
(recoverable ? KERN_WARNING : KERN_CRIT), smp_processor_id(),
(afsr & CHAFSR_M_SYNDROME) >> CHAFSR_M_SYNDROME_SHIFT,
@@ -1748,7 +1748,7 @@ void cheetah_plus_parity_error(int type,
smp_processor_id(),
(type & 0x1) ? 'I' : 'D',
regs->tpc);
- printk(KERN_EMERG "TPC<%pS>\n", (void *) regs->tpc);
+ printk(KERN_EMERG "TPC<%pA>\n", (void *) regs->tpc);
panic("Irrecoverable Cheetah+ parity error.");
}

@@ -1756,7 +1756,7 @@ void cheetah_plus_parity_error(int type,
smp_processor_id(),
(type & 0x1) ? 'I' : 'D',
regs->tpc);
- printk(KERN_WARNING "TPC<%pS>\n", (void *) regs->tpc);
+ printk(KERN_WARNING "TPC<%pA>\n", (void *) regs->tpc);
}

struct sun4v_error_entry {
@@ -1963,9 +1963,9 @@ void sun4v_itlb_error_report(struct pt_r

printk(KERN_EMERG "SUN4V-ITLB: Error at TPC[%lx], tl %d\n",
regs->tpc, tl);
- printk(KERN_EMERG "SUN4V-ITLB: TPC<%pS>\n", (void *) regs->tpc);
+ printk(KERN_EMERG "SUN4V-ITLB: TPC<%pA>\n", (void *) regs->tpc);
printk(KERN_EMERG "SUN4V-ITLB: O7[%lx]\n", regs->u_regs[UREG_I7]);
- printk(KERN_EMERG "SUN4V-ITLB: O7<%pS>\n",
+ printk(KERN_EMERG "SUN4V-ITLB: O7<%pA>\n",
(void *) regs->u_regs[UREG_I7]);
printk(KERN_EMERG "SUN4V-ITLB: vaddr[%lx] ctx[%lx] "
"pte[%lx] error[%lx]\n",
@@ -1987,9 +1987,9 @@ void sun4v_dtlb_error_report(struct pt_r

printk(KERN_EMERG "SUN4V-DTLB: Error at TPC[%lx], tl %d\n",
regs->tpc, tl);
- printk(KERN_EMERG "SUN4V-DTLB: TPC<%pS>\n", (void *) regs->tpc);
+ printk(KERN_EMERG "SUN4V-DTLB: TPC<%pA>\n", (void *) regs->tpc);
printk(KERN_EMERG "SUN4V-DTLB: O7[%lx]\n", regs->u_regs[UREG_I7]);
- printk(KERN_EMERG "SUN4V-DTLB: O7<%pS>\n",
+ printk(KERN_EMERG "SUN4V-DTLB: O7<%pA>\n",
(void *) regs->u_regs[UREG_I7]);
printk(KERN_EMERG "SUN4V-DTLB: vaddr[%lx] ctx[%lx] "
"pte[%lx] error[%lx]\n",
@@ -2196,13 +2196,13 @@ void show_stack(struct task_struct *tsk,
fp = (unsigned long)sf->fp + STACK_BIAS;
}

- printk(" [%016lx] %pS\n", pc, (void *) pc);
+ printk(" [%016lx] %pA\n", pc, (void *) pc);
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
if ((pc + 8UL) == (unsigned long) &return_to_handler) {
int index = tsk->curr_ret_stack;
if (tsk->ret_stack && index >= graph) {
pc = tsk->ret_stack[index - graph].ret;
- printk(" [%016lx] %pS\n", pc, (void *) pc);
+ printk(" [%016lx] %pA\n", pc, (void *) pc);
graph++;
}
}
@@ -2255,7 +2255,7 @@ void die_if_kernel(char *str, struct pt_
while (rw &&
count++ < 30 &&
kstack_valid(tp, (unsigned long) rw)) {
- printk("Caller[%016lx]: %pS\n", rw->ins[7],
+ printk("Caller[%016lx]: %pA\n", rw->ins[7],
(void *) rw->ins[7]);

rw = kernel_stack_up(rw);
diff -urNp linux-2.6.36/arch/sparc/kernel/unaligned_64.c linux-2.6.36/arch/sparc/kernel/unaligned_64.c
--- linux-2.6.36/arch/sparc/kernel/unaligned_64.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/arch/sparc/kernel/unaligned_64.c 2010-11-13 16:33:46.000000000 -0500
@@ -278,7 +278,7 @@ static void log_unaligned(struct pt_regs
static DEFINE_RATELIMIT_STATE(ratelimit, 5 * HZ, 5);

if (__ratelimit(&ratelimit)) {
- printk("Kernel unaligned access at TPC[%lx] %pS\n",
+ printk("Kernel unaligned access at TPC[%lx] %pA\n",
regs->tpc, (void *) regs->tpc);
}
}
diff -urNp linux-2.6.36/arch/sparc/mm/fault_64.c linux-2.6.36/arch/sparc/mm/fault_64.c
--- linux-2.6.36/arch/sparc/mm/fault_64.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/arch/sparc/mm/fault_64.c 2010-11-13 16:29:01.000000000 -0500
@@ -74,7 +74,7 @@ static void __kprobes bad_kernel_pc(stru
printk(KERN_CRIT "OOPS: Bogus kernel PC [%016lx] in fault handler\n",
regs->tpc);
printk(KERN_CRIT "OOPS: RPC [%016lx]\n", regs->u_regs[15]);
- printk("OOPS: RPC <%pS>\n", (void *) regs->u_regs[15]);
+ printk("OOPS: RPC <%pA>\n", (void *) regs->u_regs[15]);
printk(KERN_CRIT "OOPS: Fault was to vaddr[%lx]\n", vaddr);
dump_stack();
unhandled_fault(regs->tpc, current, regs);
diff -urNp linux-2.6.36/arch/x86/kernel/dumpstack.c linux-2.6.36/arch/x86/kernel/dumpstack.c
--- linux-2.6.36/arch/x86/kernel/dumpstack.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/arch/x86/kernel/dumpstack.c 2010-11-13 16:29:01.000000000 -0500
@@ -27,7 +27,7 @@ static int die_counter;

void printk_address(unsigned long address, int reliable)
{
- printk(" [<%p>] %s%pS\n", (void *) address,
+ printk(" [<%p>] %s%pA\n", (void *) address,
reliable ? "" : "? ", (void *) address);
}

diff -urNp linux-2.6.36/kernel/panic.c linux-2.6.36/kernel/panic.c
--- linux-2.6.36/kernel/panic.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/kernel/panic.c 2010-11-13 16:29:01.000000000 -0500
@@ -368,7 +368,7 @@ static void warn_slowpath_common(const c
const char *board;

printk(KERN_WARNING "------------[ cut here ]------------\n");
- printk(KERN_WARNING "WARNING: at %s:%d %pS()\n", file, line, caller);
+ printk(KERN_WARNING "WARNING: at %s:%d %pA()\n", file, line, caller);
board = dmi_get_system_info(DMI_PRODUCT_NAME);
if (board)
printk(KERN_WARNING "Hardware name: %s\n", board);
@@ -423,7 +423,8 @@ EXPORT_SYMBOL(warn_slowpath_null);
*/
void __stack_chk_fail(void)
{
- panic("stack-protector: Kernel stack is corrupted in: %p\n",
+ dump_stack();
+ panic("stack-protector: Kernel stack is corrupted in: %pA\n",
__builtin_return_address(0));
}
EXPORT_SYMBOL(__stack_chk_fail);
diff -urNp linux-2.6.36/mm/kmemleak.c linux-2.6.36/mm/kmemleak.c
--- linux-2.6.36/mm/kmemleak.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/mm/kmemleak.c 2010-11-13 16:29:01.000000000 -0500
@@ -355,7 +355,7 @@ static void print_unreferenced(struct se

for (i = 0; i < object->trace_len; i++) {
void *ptr = (void *)object->trace[i];
- seq_printf(seq, " [<%p>] %pS\n", ptr, ptr);
+ seq_printf(seq, " [<%p>] %pA\n", ptr, ptr);
}
}

diff -urNp linux-2.6.36/mm/slub.c linux-2.6.36/mm/slub.c
--- linux-2.6.36/mm/slub.c 2010-10-20 16:30:22.000000000 -0400
+++ linux-2.6.36/mm/slub.c 2010-11-13 16:29:01.000000000 -0500
@@ -392,7 +392,7 @@ static void print_track(const char *s, s
if (!t->addr)
return;

- printk(KERN_ERR "INFO: %s in %pS age=%lu cpu=%u pid=%d\n",
+ printk(KERN_ERR "INFO: %s in %pA age=%lu cpu=%u pid=%d\n",
s, (void *)t->addr, jiffies - t->when, t->cpu, t->pid);
}

--
Kees Cook
Ubuntu Security Team

Richard W.M. Jones

Nov 20, 2010, 6:10:01 AM

Sorry for being late to join this thread.

I thought I'd also mention that if you can insert a small amount of
shell code into the kernel, it's trivial to search kernel memory for
the symbol table and derive anything else you want from that.

I wrote some proof of concept code to do this a few years ago[1]. I'm
pretty sure you could compress this down to a few bytes of assembler.

(Plus I don't think that removing pointers is a good idea anyway -- it
just breaks userspace tools, and any real world system is going to be
running a well-known kernel that can be downloaded from some mirror
somewhere)

Rich.

[1] It's a poor example, but in here is code that searched for ksyms
and kallsyms in 32 bit i386 kernels (files virt_mem_ksyms.ml and
virt_mem_kallsyms.ml).
http://git.annexia.org/?p=virt-mem.git;a=tree;f=lib;hb=HEAD

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine. Supports Linux and Windows.
http://et.redhat.com/~rjones/virt-df/

Avi Kivity

Nov 20, 2010, 6:40:01 AM

On 11/17/2010 07:40 AM, Kyle Moffett wrote:
> (1) For 99%+ of all the computers out there you can get a 90%+
> accurate guess for what kernel is running by looking at the version of
> libc installed on the system. All you have to do for those computers
> is download a bunch of distro kernels and look at the libc packages
> and build a table of "libc6-SOMEVERSION => 0xADDRESS", etc. Because
> of how all the vendors backport and track versions, "SOMEVERSION"
> usually includes something wonderfully helpful like "el5" or "squeeze"
> or whatever. This does *nothing* for those users, and it's not clear
> that it ever *could*.

Isn't the kernel relocatable these days? We can randomize the kernel
load address at boot time and make this information useless.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

Henrique de Moraes Holschuh

Nov 20, 2010, 2:50:01 PM

On Fri, 19 Nov 2010, Kees Cook wrote:
> On Fri, Nov 19, 2010 at 03:22:00PM -0800, Linus Torvalds wrote:
> > In this case, the upside just wasn't big enough to accept _any_
> > breakage, especially since people and distributions can just do the
> > "chmod" themselves if they want to. There was a lot of discussion
> > whether the patch should even go in in the first place. So this time,
> > the "let's just revert it" was a very easy decision for me.
>
> The downside is that /proc can be remounted multiple times for different
> containers, etc. Having to patch everything that mounts /proc to do the
> chmod seems much more painful than fixing a simple userspace bug in an old
> klog daemon.
>
> (For example, rsyslogd handles this fine since it's root to open it, and
> even if it fails, it doesn't do the broken fclose().)

If it is a pain only for buggy old/legacy userspace like klogd or a few
tools, it would still be very useful as a Kconfig option defaulting to
disabled.

As a user and sysadmin, I'd rather not have to find out every place that
mounts /proc in a chroot to chmod all relevant files :( That's fighting a
losing battle, unlike fixing broken tools (which at least will stay fixed).

Distros could get any fixing done they require, and then enable it for all
their users. Ubuntu and Debian are likely to do it, and I'd guess so is
Fedora.

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh

Pavel Machek

Nov 24, 2010, 9:50:01 AM

Hi!

> > (2) Most of the arguments about introducing "uncertainty" into the
> > hacking process are specious as well. [...]
>
> It is only specious if you ignore the arguments i made in the previous
> discussion. One argument i made was:

Well, but it has downsides, too.

If I know the school server is vulnerable, I can get the admin to fix it... if
I can see dmesg without being root, I can help with problems. I have
done both before...

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Ingo Molnar

Nov 26, 2010, 2:40:02 AM

* Pavel Machek <pa...@ucw.cz> wrote:

> Hi!
>
> > > (2) Most of the arguments about introducing "uncertainty" into the
> > > hacking process are specious as well. [...]
> >
> > It is only specious if you ignore the arguments i made in the previous
> > discussion. One argument i made was:
>
> Well, but it has downsides, too.
>
> If I know the school server is vulnerable, I can get the admin to fix it... if
> I can see dmesg without being root, I can help with problems. I have
> done both before...

Yeah, restricting information is always a double edged sword - and by locking down
we are implicitly assuming that the number of people trying to do harm is larger
than the number of people trying to help. It is probably true though - and the
damage they can inflict is becoming more and more serious (financially, legally and
socially - and, in some cases, physically) with every year of humanity moving their
lives to the 'net.

So yes, the time has probably come to lock up "potentially harmful" information from
the default unprivileged user on Linux - at least from a default kernel policies
POV.

Thanks,

Ingo

Ingo Molnar

Nov 26, 2010, 2:50:02 AM

* Linus Torvalds <torv...@linux-foundation.org> wrote:

> On Fri, Nov 19, 2010 at 11:19 AM, Sarah Sharp
> <sarah....@linux.intel.com> wrote:
> >
> > .config and dmesg are attached. The box is running klogd 1.5.5ubuntu3
> > (from Jaunty). Yes, I know that's old. I read the bit in the commit
> > about changing the permissions of kallsyms after boot, but if I can't
> > boot that doesn't help. Perhaps this can be made a configuration
> > option?
>
> It's not worth a config option.
>
> If it actually breaks user-space, I think we should just revert it.

Sarah,

Does your system boot fine if we make /proc/kallsyms simply an empty file to
unprivileged users? Something like the (untested ...) patch below.

Ingo

diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 6f6d091..d54c993 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -465,7 +465,7 @@ static int s_show(struct seq_file *m, void *p)
 	struct kallsym_iter *iter = m->private;

 	/* Some debugging symbols have no name. Ignore them. */
-	if (!iter->name[0])
+	if (!iter->name[0] || !capable(CAP_SYS_ADMIN))
 		return 0;

 	if (iter->module_name[0]) {

Ingo Molnar

Nov 26, 2010, 3:00:01 AM

* Kees Cook <kees...@canonical.com> wrote:

> On Thu, Nov 18, 2010 at 08:48:04AM +0100, Ingo Molnar wrote:
> > Agreed. A few other kernel address things that should be hidden are:
> > [snip]
>
> For reference, here's what GRKERNSEC_HIDESYM looks like in grsecurity.
> It's quite a sledgehammer, but it does help to point out at least the
> minimum number of things that need fixing.

Yeah, it's a somewhat disgusting patch - but it also looks useful.

It would be more palatable for upstream if:

- it was split up

- all those GRKERNSEC_HIDESYM #ifdefs were removed, either by making the
grsecurity defaults the default behavior, or by intelligently hiding them behind
wrappers.

I'd suggest a single CONFIG_LEGACY_SYMBOLS=y config option for this, but only used
to show those symbols that are absolutely needed for compatibility - like
/proc/kallsyms. (Newer distros could disable this option and the kernel could
eventually default to it being disabled as well.)

Also, while changing hex output to symbolic output is fine, changing the oops
output is borderline - that is an absolutely useful piece of information that helps
us in decoding crashes. So i'd suggest splitting that out into a super-paranoid option or
so.

Anyway, after a split-up we'll see how good the individual bits are - it's a bit of
a mixed bag right now.

Thanks,

Ingo

Sarah Sharp

Nov 29, 2010, 11:40:02 AM

On Fri, Nov 26, 2010 at 08:48:09AM +0100, Ingo Molnar wrote:
>
> * Linus Torvalds <torv...@linux-foundation.org> wrote:
>
> > On Fri, Nov 19, 2010 at 11:19 AM, Sarah Sharp
> > <sarah....@linux.intel.com> wrote:
> > >
> > > .config and dmesg are attached. The box is running klogd 1.5.5ubuntu3
> > > (from Jaunty). Yes, I know that's old. I read the bit in the commit
> > > about changing the permissions of kallsyms after boot, but if I can't
> > > boot that doesn't help. Perhaps this can be made a configuration
> > > option?
> >
> > It's not worth a config option.
> >
> > If it actually breaks user-space, I think we should just revert it.
>
> Sarah,
>
> Does your system boot fine if we make /proc/kallsyms simply an empty file to
> unprivileged users? Something like the (untested ...) patch below.

Yes, that works. The system boots as normal. `cat /proc/kallsyms`
returns an empty file, and `sudo cat /proc/kallsyms` does not.

Sarah Sharp
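
The result confirmed above has a userspace counterpart: a reader of /proc/kallsyms has to cope both with a file it cannot open (the mode 0400 variant) and with a file that opens but is empty (the variant tested here). What follows is a minimal sketch in C and is not taken from klogd or any other real daemon; count_symbol_lines() is a hypothetical helper name. It checks the fopen() result before doing anything else, so a failed open never reaches fclose(), which is the failure attributed to the old klog daemon earlier in the thread.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Count the lines in a kallsyms-style file (hypothetical helper).
 *
 * Returns the number of lines, 0 for an empty file, or -1 if the file
 * cannot be opened (for example mode 0400 and the caller is not root).
 */
static long count_symbol_lines(const char *path)
{
	FILE *f;
	long lines = 0;
	int c, last = '\n';

	assert(path != NULL);

	f = fopen(path, "r");
	if (f == NULL)
		return -1;	/* bail out here; never hand NULL to fclose() */

	while ((c = fgetc(f)) != EOF) {
		if (c == '\n')
			lines++;
		last = c;
	}
	if (last != '\n')
		lines++;	/* final line without a trailing newline */

	fclose(f);
	return lines;
}
```

A caller would treat both -1 and 0 as "no symbols available" and carry on without symbol lookup, rather than treating either case as fatal.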

Ingo Molnar

Nov 29, 2010, 1:10:03 PM

* Sarah Sharp <sarah....@linux.intel.com> wrote:

> On Fri, Nov 26, 2010 at 08:48:09AM +0100, Ingo Molnar wrote:
> >
> > * Linus Torvalds <torv...@linux-foundation.org> wrote:
> >
> > > On Fri, Nov 19, 2010 at 11:19 AM, Sarah Sharp
> > > <sarah....@linux.intel.com> wrote:
> > > >
> > > > .config and dmesg are attached. The box is running klogd 1.5.5ubuntu3
> > > > (from Jaunty). Yes, I know that's old. I read the bit in the commit
> > > > about changing the permissions of kallsyms after boot, but if I can't
> > > > boot that doesn't help. Perhaps this can be made a configuration
> > > > option?
> > >
> > > It's not worth a config option.
> > >
> > > If it actually breaks user-space, I think we should just revert it.
> >
> > Sarah,
> >
> > Does your system boot fine if we make /proc/kallsyms simply an empty file to
> > unprivileged users? Something like the (untested ...) patch below.
>
> Yes, that works. The system boots as normal. `cat /proc/kallsyms`
> returns an empty file, and `sudo cat /proc/kallsyms` does not.

Great! Marcus, mind respinning your patch with that approach?

Thanks,

Ingo

H. Peter Anvin

Nov 29, 2010, 2:10:02 PM

On 11/25/2010 11:38 PM, Ingo Molnar wrote:
>
> Yeah, restricting information is always a double edged sword - and by locking down
> we are implicitly assuming that the number of people trying to do harm is larger
> than the number of people trying to help. It is probably true though - and the
> damage they can inflict is becoming more and more serious (financially, legally and
> socially - and, in some cases, physically) with every year of humanity moving their
> lives to the 'net.
>
> So yes, the time has probably come to lock up "potentially harmful" information from
> the default unprivileged user on Linux - at least from a default kernel policies
> POV.
>

The setting of these policies needs to be figured out sensibly.

One of my great complaints about several Linux distributions is that
they keep forcing log files to be readable only by root, even though
they do put the adm group in their default group file -- the adm group
is traditionally the group allowed to read log files.

It is a *good* thing for a *restricted set* of users to have *readonly*
access to this kind of information -- i.e., a group. It is *not* a good
thing for system security or reliability to force the administrator to
assert root privileges to merely monitor information.

-hpa

H. Peter Anvin

Nov 29, 2010, 2:10:03 PM

On 11/29/2010 10:04 AM, Ingo Molnar wrote:
>
> * Sarah Sharp <sarah....@linux.intel.com> wrote:
>
>> On Fri, Nov 26, 2010 at 08:48:09AM +0100, Ingo Molnar wrote:
>>>
>>> * Linus Torvalds <torv...@linux-foundation.org> wrote:
>>>
>>>> On Fri, Nov 19, 2010 at 11:19 AM, Sarah Sharp
>>>> <sarah....@linux.intel.com> wrote:
>>>>>
>>>>> .config and dmesg are attached. The box is running klogd 1.5.5ubuntu3
>>>>> (from Jaunty). Yes, I know that's old. I read the bit in the commit
>>>>> about changing the permissions of kallsyms after boot, but if I can't
>>>>> boot that doesn't help. Perhaps this can be made a configuration
>>>>> option?
>>>>
>>>> It's not worth a config option.
>>>>
>>>> If it actually breaks user-space, I think we should just revert it.
>>>
>>> Sarah,
>>>
>>> Does your system boot fine if we make /proc/kallsyms simply an empty file to
>>> unprivileged users? Something like the (untested ...) patch below.
>>
>> Yes, that works. The system boots as normal. `cat /proc/kallsyms`
>> returns an empty file, and `sudo cat /proc/kallsyms` does not.
>
> Great! Marcus, mind respinning your patch with that approach?
>

Can we please not use CAP_SYS_ADMIN for this? Relying on CAP_SYS_ADMIN
is worse than anything else -- it is a fixed policy hardcoded in the
kernel, with no ability for the system owner to delegate the policy
outward, e.g. by adding group read permission and/or chgrp the file.

Delegating CAP_SYS_ADMIN, of course, otherwise known as "everything", is
worse than anything...

-hpa
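
The delegation described above (group read permission plus chgrp on the file node) can be sketched from the administrator's side as follows. This is an illustration only: delegate_read_to_group() is a hypothetical helper, the choice of mode 0440 is an assumption, and whether a given kernel's procfs retains ownership and mode changes on /proc/kallsyms is also assumed rather than verified here.

```c
#include <assert.h>
#include <grp.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Give one group read-only access to a file node (hypothetical helper).
 *
 * Looks up the group by name, changes only the group owner of the file,
 * then sets mode 0440 (owner and group may read, nobody else).
 * Returns 0 on success and -1 on failure; errno is set by chown()/chmod(),
 * but not necessarily when the group name is simply unknown.
 */
static int delegate_read_to_group(const char *path, const char *group)
{
	struct group *gr;

	assert(path != NULL && group != NULL);

	gr = getgrnam(group);
	if (gr == NULL)
		return -1;

	/* (uid_t)-1 leaves the owning user unchanged */
	if (chown(path, (uid_t)-1, gr->gr_gid) != 0)
		return -1;

	return chmod(path, 0440);
}
```

An init script equivalent would call delegate_read_to_group("/proc/kallsyms", "adm") once as root; members of that group could then read the file without any capability being handed out.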

Eric Paris

Nov 29, 2010, 2:30:02 PM

On Mon, Nov 29, 2010 at 2:05 PM, H. Peter Anvin <h...@zytor.com> wrote:
> On 11/29/2010 10:04 AM, Ingo Molnar wrote:
>>
>> * Sarah Sharp <sarah....@linux.intel.com> wrote:
>>
>>> On Fri, Nov 26, 2010 at 08:48:09AM +0100, Ingo Molnar wrote:

>>>> Sarah,
>>>>
>>>> Does your system boot fine if we make /proc/kallsyms simply an empty file to
>>>> unprivileged users? Something like the (untested ...) patch below.
>>>
>>> Yes, that works. The system boots as normal. `cat /proc/kallsyms`
>>> returns an empty file, and `sudo cat /proc/kallsyms` does not.
>>
>> Great! Marcus, mind respinning your patch with that approach?
>>
>
> Can we please not use CAP_SYS_ADMIN for this? Relying on CAP_SYS_ADMIN
> is worse than anything else -- it is a fixed policy hardcoded in the
> kernel, with no ability for the system owner to delegate the policy
> outward, e.g. by adding group read permission and/or chgrp the file.
>
> Delegating CAP_SYS_ADMIN, of course, otherwise known as "everything", is
> worse than anything...

Serge just proposed a new CAP_SYSLOG

http://lwn.net/Articles/378472/

Which could probably still be renamed and used to cover this access as well....

H. Peter Anvin

Nov 29, 2010, 2:40:01 PM

On 11/29/2010 11:21 AM, Eric Paris wrote:
>>
>> Delegating CAP_SYS_ADMIN, of course, otherwise known as "everything", is
>> worse than anything...
>
> Serge just proposed a new CAP_SYSLOG
>
> http://lwn.net/Articles/378472/
>
> Which could probably still be renamed and used to cover this access as well....
>

Quite frankly, the Linux capability system is largely a mess, with big
bundled capacities that don't make much sense and are hideously
inconvenient with the capability system used in user space (groups).
For things like this that genuinely have a file node, *let's use it* and
allow permissions to be controlled by the file node!

-hpa

Willy Tarreau

Nov 29, 2010, 5:00:02 PM

On Mon, Nov 29, 2010 at 11:05:58AM -0800, H. Peter Anvin wrote:
> Can we please not use CAP_SYS_ADMIN for this? Relying on CAP_SYS_ADMIN
> is worse than anything else -- it is a fixed policy hardcoded in the
> kernel, with no ability for the system owner to delegate the policy
> outward, e.g. by adding group read permission and/or chgrp the file.
>
> Delegating CAP_SYS_ADMIN, of course, otherwise known as "everything", is
> worse than anything...

Agreed, that's why I still think that hiding lots of valuable information from
non-root users will get more users added to unmanaged sudoers files, which
will result in many more holes in the systems than we currently have.

Willy

Kevin Easton

Nov 29, 2010, 6:10:02 PM

On Sat, Nov 20, 2010 at 05:47:23PM -0200, Henrique de Moraes Holschuh wrote:
> On Fri, 19 Nov 2010, Kees Cook wrote:
> > On Fri, Nov 19, 2010 at 03:22:00PM -0800, Linus Torvalds wrote:
> > > In this case, the upside just wasn't big enough to accept _any_
> > > breakage, especially since people and distributions can just do the
> > > "chmod" themselves if they want to. There was a lot of discussion
> > > whether the patch should even go in in the first place. So this time,
> > > the "let's just revert it" was a very easy decision for me.
> >
> > The downside is that /proc can be remounted multiple times for different
> > containers, etc. Having to patch everything that mounts /proc to do the
> > chmod seems much more painful than fixing a simple userspace bug in an old
> > klog daemon.
> >
>

> As a user and sysadmin, I'd rather not have to find out every place that
> mounts /proc in a chroot to chmod all relevant files :( That's fighting a
> losing battle, unlike fixing broken tools (which at least will stay fixed).

There's only one set of "kallsyms" permissions. If you chmod it in one
mount of proc, the permissions apply in *all* mounts of proc, current
or future.

So you don't have to find every place that mounts /proc - you can just
chmod it once at startup and be done.

- Kevin

Alan Cox

Nov 29, 2010, 6:40:02 PM

> > /* Some debugging symbols have no name. Ignore them. */
> > - if (!iter->name[0])
> > + if (!iter->name[0] || !capable(CAP_SYS_ADMIN))
> > return 0;

This is hardcoding file permission policy into the kernel in a way the
user cannot change - it's bogus in the extreme. Use file permissions; that
way saner people can chmod them as they like. Indeed quite a few people
*already* chmod chunks of /proc.

It also means that things like SELinux and Tomoyo can be used to manage
security on it in clever ways - something that using a capability
completely buggers up.

Alan

Ingo Molnar

Nov 30, 2010, 7:10:01 AM

* Alan Cox <al...@lxorguk.ukuu.org.uk> wrote:

> > > /* Some debugging symbols have no name. Ignore them. */
> > > - if (!iter->name[0])
> > > + if (!iter->name[0] || !capable(CAP_SYS_ADMIN))
> > > return 0;
>
> This is hardcoding file permission policy into the kernel in a way the
> user cannot change - it's bogus in the extreme. Use file permissions; that
> way saner people can chmod them as they like. Indeed quite a few people
> *already* chmod chunks of /proc.

Peter already pointed that out and i agree.

The main goal here was to establish that a regression-free patch can be implemented
by giving user-space an *empty /proc/kallsyms file* - so that older systems do not
crash on bootup.

> It also means that things like SELinux and Tomoyo can be used to manage security
> on it in clever ways - something that using a capability completely buggers up.

Frankly, our security interfaces are a mess - i did not even try to figure out the
'right' way to do it. Modularization of the security subsystem made it all distinctly
worse.

Why don't we have coherent, easy to use (and hard to mess up) security interfaces to
begin with? The moment a kernel developer has to think of:

	retval = -EPERM;
	if (capable(CAP_SETUID)) {
		new->suid = new->uid = uid;
		if (uid != old->uid) {
			retval = set_user(new);
			if (retval < 0)
				goto error;
		}
	} else if (uid != old->uid && uid != new->suid) {
		goto error;
	}

	new->fsuid = new->euid = uid;

	retval = security_task_fix_setuid(new, old, LSM_SETID_ID);
	if (retval < 0)
		goto error;

as the 'secure' implementation of a piece of kernel logic, we have lost the
'security' battle ...

The current security callbacks are absolutely nonsensical random crap slapped all
around the kernel. It increases our security complexity and has thus the opposite
effect - it makes us _less_ secure.

Did no-one think of merging the capabilities checks and the security subsystem
callbacks in some easy-to-use manner, which makes the default security policy
apparent at first sight?

This code should be written in a simpler form, something like:

	retval = -EPERM;
	if (!security_allow_task_fix_setuid(new, old)) {
		new->suid = new->uid = uid;
		if (uid != old->uid) {
			retval = set_user(new);
			if (retval < 0)
				goto error;
		}
	} else if (uid != old->uid && uid != new->suid) {
		goto error;
	}

	new->fsuid = new->euid = uid;

Where the default security_allow_task_fix_setuid() is basically a CAP_SETUID check -
and we know this from the 'security_allow_task_fix_setuid' name already.

This way all those stupid, passive security callbacks become _active participants of
the code_, and the code becomes more compact and easier to understand - and it
becomes harder to mess up both compatibility details and permission details.

[ And yes, i realize that this isn't a 100% replacement of the existing callback,
because some of the default logic cannot be turned off - but heck, that's a
feature not a bug! We don't want to allow security modules to make things _less_
secure, or break legacies, right? So they should be shaped as _additional_
restrictions on the coarse default semantics.

And don't get me started about the idiocy of LSM_SETID_ID. Why isn't that detail put
into the callback name? What's wrong with security_task_fix_setuid_id(new, old)? ]

Whoever allowed security modules to be added in their current form needs some
talking to.

Thanks,

Ingo
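
The combined helper proposed in the message above, with the capability check as the default policy and a security module allowed only to add restrictions, could look roughly like the userspace mock below. Nothing here is kernel code: struct cred_sketch, module_hook and deny_uid_zero() are hypothetical stand-ins, the cap_setuid flag replaces capable(CAP_SETUID), and the return convention (0 means allowed, a negative errno means denied) is an assumption chosen to match the negated test in the proposed snippet.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's credential structure. */
struct cred_sketch {
	unsigned int uid;
	bool cap_setuid;	/* stand-in for capable(CAP_SETUID) */
};

/* Optional module hook: returns 0 to accept, a negative errno to veto. */
typedef int (*setuid_hook_fn)(const struct cred_sketch *new,
			      const struct cred_sketch *old);

static setuid_hook_fn module_hook;	/* NULL when no module is registered */

/* Example module policy (hypothetical): never allow a switch to uid 0. */
static int deny_uid_zero(const struct cred_sketch *new,
			 const struct cred_sketch *old)
{
	(void)old;
	return new->uid == 0 ? -EACCES : 0;
}

/*
 * Returns 0 when the privileged setuid path may be taken.
 * The coarse default (the capability check) always runs first, so a
 * module can veto a request but can never grant one the default denies.
 */
static int security_allow_task_fix_setuid(const struct cred_sketch *new,
					  const struct cred_sketch *old)
{
	assert(new != NULL && old != NULL);

	if (!new->cap_setuid)
		return -EPERM;

	if (module_hook != NULL)
		return module_hook(new, old);

	return 0;
}
```

With module_hook left NULL the helper reduces to the plain capability check; assigning deny_uid_zero to it narrows the policy without touching the caller.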