sysctl(2) and/or /kern for system variable manipulation

Erik Fair

Mar 22, 2000, 3:00:00 AM

to e...@netbsd.org

Let's entertain the /kern notion for just one more minute; assuming
that each object has its own permissions (which would show up as file
or directory permissions), then there's no problem mounting /kern
itself anywhere you like (indeed, it can be an unprivileged mount -
anyone can do it).

Now, in the case of a chroot(2)'d environment, I hear you say, "Ah
hah! Suppose a clever attacker gains root inside the box, and then
mounts /kern? He can modify various global system operational
parameters!"

Well, yeah. Does sysctl(2) prevent that?

What sysctl variables does one typically need inside the chroot(2) box, anyway?

8th Edition and Plan 9 have some very clever mechanisms for providing
for a standard, but individual execution environment by arranging the
filesystem name space in interesting ways with mount(2). Again, the
idea was simple: make almost everything into a file, and then
manipulate as necessary with existing tools. I think we'd do well to
adopt some of them, and thereby get rid of a raft of specialized
system calls...

Erik <fa...@clock.org>

Andy Sporner

Mar 22, 2000, 3:00:00 AM

to Erik Fair, e...@netbsd.org

Hi,

My $.02 worth...

The arguments presented here would not be the ones that
would compel me to not use /kern. I regard /kern as a
neat thing that shell programs can use to get at some
kernel stuff.

Presuming that there is a comparison between sysctl(2)
and /kern, I would presume the application to be some
sort of 'C' program.

I have lived in the Linux world for about 4 years and
at first I thought /proc was the greatest thing--until
I had to start writing parsers to make sense of the data
in a way useful to the programs that I was writing at
the time. Then the format of some of the things changed
(IIRC -- /proc/net/route) and I had to redo most of it.

Granted BSD is more sedentary in this respect, but I guess
I have a general problem with doing file I/O to get at
something that I can just as easily get with a direct
system call.

That being said, I still think /kern is useful for shell tools
that make use of 'awk' and other pattern scanning tools.

Like I said, My $0.02 worth....

Andy

Todd Whitesel

Mar 24, 2000, 3:00:00 AM

to Andy Sporner

> I had to start writing parsers to make sense of the data
> in a way useful to the programs that I was writing at
> the time. Then the format of some of the things changed
> (IIRC -- /proc/net/route) and I had to redo most of it.

I believe this is because the Linux /proc goodies are mostly
cute hacks to turn netstat, etc., into kernel modules.

Having dealt with programs that parse the output of GDB, and
Hewlett-Packard emulators, I declare it categorically insane
to base a long-term interface on something that prints human
readable messages which are then parsed by programs.

I am in favor of beefing up /kern, but it must be done in a
machine-readable way; the current output of sysctl -a, or
mixerctl/audioctl, is not too bad, actually. (See also environ(7).)
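
As a rough sketch of why output in that style is easy for a program to consume, here is a small C routine that splits one line into a name and a value. It assumes lines of the form `name = value`, similar to what sysctl -a prints; the function name parse_name_value and the exact " = " separator are assumptions made for illustration, not a documented format guarantee.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Split one "name = value" line into its two parts.
 * The " = " separator is an assumed convention, not a specified one.
 * Returns 0 on success, -1 if the line does not match or if either
 * part does not fit in the caller's buffer.
 */
int
parse_name_value(const char *line, char *name, size_t namesz,
    char *value, size_t valuesz)
{
    const char *sep, *end;
    size_t nlen, vlen;

    assert(line != NULL && name != NULL && value != NULL);

    sep = strstr(line, " = ");
    if (sep == NULL || sep == line)
        return -1;
    nlen = (size_t)(sep - line);

    sep += 3;                       /* skip over " = " */
    end = sep + strlen(sep);
    while (end > sep && isspace((unsigned char)end[-1]))
        end--;                      /* trim trailing newline and blanks */
    vlen = (size_t)(end - sep);

    if (nlen >= namesz || vlen >= valuesz)
        return -1;
    memcpy(name, line, nlen);
    name[nlen] = '\0';
    memcpy(value, sep, vlen);
    value[vlen] = '\0';
    return 0;
}
```

A caller would hand it each line read from the tool's output and check the return value before trusting either buffer.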

Todd Whitesel
toddpw @ best.com

Greg A. Woods

Mar 25, 2000, 3:00:00 AM

to tech...@netbsd.org

[ On Friday, March 24, 2000 at 05:22:49 (-0800), Todd Whitesel wrote: ]
> Subject: Re: sysctl(2) and/or /kern for system variable manipulation
>
> Having dealt with programs that parse the output of GDB, and
> Hewlett-Packard emulators, I declare it categorically insane
> to base a long-term interface on something that prints human
> readable messages which are then parsed by programs.

That's because you were dealing with *unspecified* data formats.

> I am in favor of beefing up /kern, but it must be done in a
> machine-readable way; the current output of sysctl -a, or
> mixerctl/audioctl, is not too bad, actually. (See also environ(7).)

The issue is not between "human" and "machine" readability -- it is
simply one of ensuring the data formats are carefully and firmly
specified. Why give up human readability when you don't have to!?!?!?
Providing for human readability of data has the advantage of almost
guaranteeing machine-independent data will stay that way too.

It should go without saying of course that ideally each structure should
be run-time extensible and/or versioned too so that issues with upgrades
are dealt with implicitly.
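
To make the "versioned and extensible" idea concrete, here is a minimal sketch in C. It assumes a purely hypothetical text record whose first line is `version N` and whose remaining lines are `key value` pairs; the reader refuses versions it does not know and silently skips keys it does not recognize. The names record_lookup and FORMAT_VERSION_MAX, and the record layout itself, are invented for this example and are not part of any existing kernfs or sysctl format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define FORMAT_VERSION_MAX 1    /* hypothetical: newest layout understood */

/*
 * Look up `key` in a text record of the assumed form
 *
 *     version 1
 *     key value
 *     ...
 *
 * Returns 0 and copies the value on success, 1 if the key is absent,
 * and -1 if the version line is missing or unsupported, or if the
 * value does not fit in `out`.
 */
int
record_lookup(const char *text, const char *key, char *out, size_t outsz)
{
    const char *p, *eol;
    size_t klen, vlen;
    int version;

    assert(text != NULL && key != NULL && out != NULL);

    if (sscanf(text, "version %d", &version) != 1 ||
        version < 1 || version > FORMAT_VERSION_MAX)
        return -1;

    klen = strlen(key);
    p = strchr(text, '\n');         /* end of the version line */
    while (p != NULL && *++p != '\0') {
        eol = strchr(p, '\n');
        if (eol == NULL)
            eol = p + strlen(p);
        /* Match "key " at the start of the line; skip anything else. */
        if ((size_t)(eol - p) > klen &&
            strncmp(p, key, klen) == 0 && p[klen] == ' ') {
            vlen = (size_t)(eol - p) - klen - 1;
            if (vlen >= outsz)
                return -1;
            memcpy(out, p + klen + 1, vlen);
            out[vlen] = '\0';
            return 0;
        }
        p = (*eol == '\n') ? eol : NULL;
    }
    return 1;
}
```

Because unknown keys are ignored rather than treated as errors, a newer kernel could add fields without breaking an older reader, while a layout change that is not backward compatible would bump the version and be rejected cleanly.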

Indeed the concept of having so many namespaces in BSD is highly
questionable. libkvm, sysctl, et al should all be moved to the
filesystem namespace.

Whether or not this means duplicating the effort of maintaining
post-crash core dump analysis tools depends on how flexible those who
use such tools are willing to be. If it is necessary to always
guarantee that 'ps -M /var/crash/*.0.core -N /var/crash/*.0' will
continue to work instead of having some new tool akin to the sysV
"crash" program (whether implemented as a separate tool or as a set of
GDB macros), well then yes, I guess it does mean duplicating some of the
maintenance headaches and bloating some programs.

--
Greg A. Woods

+1 416 218-0098 VE3TCP <gwo...@acm.org> <robohack!woods>
Planix, Inc. <wo...@planix.com>; Secrets of the Weird <wo...@weird.com>

der Mouse

Mar 25, 2000, 3:00:00 AM

to tech...@netbsd.org

> Indeed the concept of having so many namespaces in BSD is highly
> questionable. libkvm, sysctl, et al should all be moved to the
> filesystem namespace.

"Should"? Why? What's wrong with having many namespaces? Separate
namespaces have many good properties - for example, a chrooted program
can still manipulate AF_INET sockets normally. (This can be looked
upon as good or bad, depending on your inclination.)

> Whether or not this means duplicating the effort of
> maintaining post-crash core dump analysis tools depends on how
> flexible those who use such tools are willing to be. If it is
> necessary to always guarantee that
> 'ps -M /var/crash/*.0.core -N /var/crash/*.0' will continue to work
> instead of having some new tool akin to the sysV "crash" program
> (whether implemented as a separate tool or as a set of GDB macros),
> well then yes, I guess it does mean duplicating some of the
> maintenance headaches and bloating some programs.

Personally, the main reason I want ps -M ... -N ... instead of a
separate crash-type program is that I can be sure it's the same ps,
supporting all the same options in exactly the same way, with all the
same output formats, etc, etc.

It occurs to me that it might well be sufficient to have a tool that
presents a kernfs-style mountable interface, but instead of being
driven off a live kernel it's driven off netbsd.0.core and netbsd.0
files. It seems this might end up being more work than the other way,
but it seems to me it should be at least considered.

der Mouse

mo...@rodents.montreal.qc.ca
7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B

Greywolf

Mar 25, 2000, 3:00:00 AM

to NetBSD Kernel Technical Discussion List

On Sat, 25 Mar 2000, Greg A. Woods wrote:

# > I am in favor of beefing up /kern, but it must be done in a
# > machine-readable way; the current output of sysctl -a, or
# > mixerctl/audioctl, is not too bad, actually. (See also environ(7).)
#
# The issue is not between "human" and "machine" readability -- it is
# simply one of ensuring the data formats are carefully and firmly
# specified. Why give up human readability when you don't have to!?!?!?
# Providing for human readability of data has the advantage of almost
# guaranteeing machine-independent data will stay that way too.
#
# It should go without saying of course that ideally each structure should
# be run-time extensible and/or versioned too so that issues with upgrades
# are dealt with implicitly.
#
# Indeed the concept of having so many namespaces in BSD is highly
# questionable. libkvm, sysctl, et al should all be moved to the
# filesystem namespace.

Greg. Go run Plan 9 if that's what you're after. :-)

I disagree with everything living in filesystem namespace. It's an added
level of interpretation, context and complexity that doesn't need to be
there.

--*greywolf;
--
BSD: Feed The Computer.

Greg A. Woods

Mar 26, 2000, 3:00:00 AM

to tech...@netbsd.org

[ On Saturday, March 25, 2000 at 13:55:57 (-0500), der Mouse wrote: ]

> Subject: Re: sysctl(2) and/or /kern for system variable manipulation
>
> > Indeed the concept of having so many namespaces in BSD is highly
> > questionable. libkvm, sysctl, et al should all be moved to the
> > filesystem namespace.
>
> "Should"? Why? What's wrong with having many namespaces?

Many namespaces mean many different ways of managing, accessing,
controlling, etc. the objects being accessed. This means added
complexity with the potential for confusion and unsafe assumptions being
made by users; and sometimes additional bloat too.

The developers of Research UNIX and Plan 9 have made solid arguments
that I won't further repeat here in favour of using the filesystem
namespace consistently for all system objects. Indeed they even went so
far as to show the advantages of using filesystem semantics for
accessing things like network services. Their arguments are elegant and
simple and the result is I think simpler and certainly less complex.

> Separate
> namespaces have many good properties - for example, a chrooted program
> can still manipulate AF_INET sockets normally. (This can be looked
> upon as good or bad, depending on your inclination.)

Virtual filesystems have that property too. You should be able to mount
the virtual filesystem multiple places where it makes sense to allow
this, such as in "mount -r -t sysctl /chroothome/sysctl". Note the
proposed use of '-r' in this pseudo-example to ensure this instance of
the sysctl tree is read-only. Of course some objects will still only be
readable by root no matter who owned the underlying mount-point
directory.

Note also that given support for changing the permissions on virtual
filesystem objects the same tools used for protecting files can also be
used to protect objects visible through a virtual filesystem. I can
even see how a slightly modified unionfs could be used to provide safe
and persistent storage for permissions and ownership information for at
least some types of virtual filesystems.

I personally can't think of any good properties of separate namespaces
for system objects that are not also properties of unified filesystem
namespaces if they've been implemented with these goals in mind.

> Personally, the main reason I want ps -M ... -N ... instead of a
> separate crash-type program is that I can be sure it's the same ps,
> supporting all the same options in exactly the same way, with all the
> same output formats, etc, etc.

Yes there are certainly advantages to having such a common foundation
toolset that works on running kernels as well as crash dumps.

> It occurs to me that it might well be sufficient to have a tool that
> presents a kernfs-style mountable interface, but instead of being
> driven off a live kernel it's driven off netbsd.0.core and netbsd.0
> files. It seems this might end up being more work than the other way,
> but it seems to me it should be at least considered.

I was thinking of that too. While I don't see any hard obstacles in
building a tool that could offer something like /proc and /kern from a
crash dump, I do agree that it would probably be more work than simply
developing a unified crash analysis toolset.

Greg A. Woods

Mar 26, 2000, 3:00:00 AM

to NetBSD Kernel Technical Discussion List

[ On Saturday, March 25, 2000 at 13:35:39 (-0800), Greywolf wrote: ]

> Subject: Re: sysctl(2) and/or /kern for system variable manipulation
>
> Greg. Go run Plan 9 if that's what you're after. :-)

Plan 9 and Research UNIX are not the only systems that have demonstrated
with real-world experience the benefits of unifying the system (and
network) namespaces into the filesystem.

> I disagree with everything living in filesystem namespace. It's an added
> level of interpretation, context and complexity that doesn't need to be
> there.

In fact the overall level of complexity, from pretty well all points of
view so far as I can tell, is actually lower.

It is indeed an extra level of interpretation (i.e. as opposed to
looking in /dev/kmem at the actual raw data structures). However it is
this level of interpretation that allows the interface defined for
external programs to be a little more resilient to minor changes in the
internal details and indeed even to some extent to be machine
independent.

Indeed even the 4.4BSD developers were well on their way to
demonstrating the benefits of a unified filesystem namespace for all
system objects in their borrowing of things like procfs from Research
UNIX and the development of portals, etc. I don't doubt that they would
have gone further in this direction had their group continued to work
on BSD. Note that everything that McKusick et al say about the
advantages of sysctl applies equally well to an interface designed as a
virtual filesystem (and they go almost so far as to say this too).

der Mouse

Mar 26, 2000, 3:00:00 AM

to tech...@netbsd.org

>> "Should"? Why? What's wrong with having many namespaces?
> Many namespaces mean many different ways of managing, accessing,
> controlling, etc. the objects being accessed.

Yes - one way per namespace. I'm not convinced this is necessarily a
bad thing.

> This means added complexity with the potential for confusion and
> unsafe assumptions being made by users; and sometimes additional
> bloat too.

Sometimes, especially when there exist objects that appear in multiple
namespaces.

But them excepted, I don't see why each object type shouldn't have a
namespace suited to its nature. The file paradigm is remarkably
robust, but that's largely because of the unstructured ioctl() escape
hatch, secondarily because people accept pseudo-filesystems that
severely break many ordinary filesystem operations. The filesystem
simply is not a good fit to many abstractions. Look at all the things
you can't do with procfs, for example: both things you can do with
processes but not procfs "files", and things you can do with normal
files that you can't do with procfs "files". (Simple examples: there's
no procfs analog to fork() - the closest would be cp, and that's not a
good fit - nor is there a process analog to mkdir().)

> The developers of Research UNIX and Plan 9 have made solid arguments
> that I won't further repeat here in favour of using the filesystem
> namespace consistently for all system objects.

It's an interesting paradigm, but as I indicated above, it has the
problem that it ends up twisting everything into the filesystem mold.

Greg A. Woods

Mar 27, 2000, 3:00:00 AM

to tech...@netbsd.org

[ On Sunday, March 26, 2000 at 22:40:18 (-0500), der Mouse wrote: ]

> Subject: Re: sysctl(2) and/or /kern for system variable manipulation
>
> > The point of using the filesystem paradigm is to provide a simple and
> > common system-call interface through open(), close(), read(), and
> > write().
>
> Why is this good?

Your question isn't really meaningful to me in this context. As you no
doubt already know, one of the common things people will say about
Unix-like systems is that their most distinguishing feature is the
concept of "everything is a file" (as a quick search for that phrase on
google.com will reveal). Unix doesn't (yet) take this as far as it can
go of course.

The basic advantage is that common tools can be used to manipulate new
services and devices without having to be re-designed with a new
interface tacked on the side of each. Any program can open a tape drive
device and write to it or read from it without knowing that its
interface is via SCSI, or IDE, or QIC36, or something unique. Obviously
this does mean that to be truly successful anyone employing this
paradigm really must get rid of hacks like ioctl() (either that or make
them generic enough and extensible enough such that they could be used
from the shell command line without having to re-compile the shell every
time a new device driver is added, though how this would fundamentally
differ from simply providing a read/write control channel beats me).

From the Plan-9 FAQ we also read:

    What are the advantages to this approach?

    Plan 9's approach improves generality and modularity of application design
    by encouraging servers that make any kind of information appear to users
    and to applications just like collections of ordinary files.

Of course Plan 9 takes "everything is a file" to a new level and really
does eliminate many of the system calls that are unnecessary when you've
got names in the filesystem that can provide the same information and
controls (eg. time(2) is replaced by /dev/time). Small is beautiful and
there's elegance in treating everything uniformly.
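
To illustrate the uniformity argument, here is a minimal C sketch of a single helper that fetches the contents of any such name using nothing but open(), read(), and close(). The helper name read_object is made up for this example, and the paths mentioned afterwards are only illustrations: which names exist, and what format their contents take, depends entirely on the system.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to bufsz-1 bytes from `path` into `buf` and NUL-terminate it.
 * Returns the number of bytes read, or -1 if the open or a read fails.
 * (A sketch: it does not retry on EINTR or report truncation.)
 */
ssize_t
read_object(const char *path, char *buf, size_t bufsz)
{
    ssize_t n = 0, total = 0;
    int fd;

    assert(path != NULL && buf != NULL && bufsz > 0);

    fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    /* Keep reading until the buffer is full, EOF, or an error. */
    while ((size_t)total < bufsz - 1 &&
        (n = read(fd, buf + total, bufsz - 1 - (size_t)total)) > 0)
        total += n;

    close(fd);
    if (n < 0)
        return -1;
    buf[total] = '\0';
    return total;
}
```

The same call would serve for an ordinary file, for /dev/time on Plan 9, or for an entry under /kern where kernfs is mounted, which is the point: the caller needs no object-specific system call, only a name and an agreed format for the bytes that come back.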

Some further explanation is given in "The Design and Implementation of
the 4.4BSD Operating System" book, in particular for /proc, but I would
recommend reading the Plan 9 "early papers" related to this subject as
well as the papers published about some of the earlier implementations
in Research UNIX. Note that the early implementations of /proc were
very primitive in this sense but they've come a long way, especially if
you look at what SysVr4.2 did with it.

I think these two papers are perhaps the best to answer your question
(I've only read the first, not the second):

T. Killian, ``Processes as Files'', USENIX Summer Conf. Proc., Salt
Lake City, 1984

R. Needham, ``Names'', in Distributed systems, S. Mullender, ed.,
Addison Wesley, 1989

Here are some more random references on related topics:

Stevens & Pendry, ``Portals in 4.4BSD'', USENIX Winter Conf. Proc.,
January 1995

Plan 9 From Bell Labs
Rob Pike, Dave Presotto, Sean Dorward, Bob Flandrena, Ken
Thompson, Howard Trickey, and Phil Winterbottom
An overview of the system; read at least this paper before you
install.
http://plan9.bell-labs.com/plan9/doc/9.html

David L. Presotto and Dennis M. Ritchie, ``Interprocess Communication
in the Ninth Edition Unix System'', Softw. - Prac. and Exp., June
1990, Vol 20 #S1, pp. S1/3-S1/17.

The Use of Name Spaces in Plan 9
Rob Pike, Dave Presotto, Ken Thompson, Howard Trickey, and Phil
Winterbottom
What's in a name?
http://plan9.bell-labs.com/plan9/doc/names.html

The Organization of Networks in Plan 9
Dave Presotto and Phil Winterbottom
Connecting the pieces.
http://plan9.bell-labs.com/plan9/doc/net.html

S. Childs, ``Filing system interfaces to support distributed
multimedia applications.'', Eighth ACM SIGOPS European Workshop
Support for Composing Distributed Applications, Sintra, Portugal,
1998. ACM.

After some quick research with google.com I found an interesting paper
that proposes "everything is a box" as an alternative to files:

http://gsyc.escet.urjc.es/off/export/2kblocks/2kblocks.html

Seems NTFS goes further too:

    In NTFS, everything is a file. Even all metadata are stored in files
    called system files. It is quite unusual for a filesystem, but it
    allows the filesystem driver to manipulate these data in a generic
    way (for example, to perform access control on them), and since these
    data can be moved and located anywhere on the storage unit, it
    reduces the risk of damage. The Master File Table (MFT) is the most
    important system file. It contains information about all the files of
    the volume. There is exactly one MFT per volume.

(from http://www.via.ecp.fr/~regis/ntfs/new/MFT.html)

(not that I believe all their claims.... :-)