
How to get the process associated with a tcp/udp port on linux?


blt_6hm...@a6wa8v3cn8wj5.ac.uk

Feb 22, 2019, 5:57:02 AM
Hello

I need to be able to associate a program with an open port using C in the same
way as netstat -p does it. I've straced netstat and I can see it scanning
through every process in /proc and their open file descriptors (seems a bit
ugly to me, must be a more efficient way?), but I can't see how it is checking
what socket and port number - if any - those file descriptors are linked to and
google isn't being much help. I can't see getsockname() working since the
file descriptor value means nothing outside the program it's being used in.

Anyone know?

Thanks for any help


Kenny McCormack

Feb 22, 2019, 6:33:36 AM
In article <q4okhq$i83$1...@gioia.aioe.org>,
Have you tried looking at (and working with) the source?

That's got to be more productive than tracing the executable.

Out of curiosity, why do you want/need to do this? I'm having a hard
time imagining a situation where it wouldn't be easier to just run
netstat and parse the result. Please help me to understand.

--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/DanaC

Rainer Weikusat

Feb 22, 2019, 10:44:29 AM
/proc/<something>/fd will give you the socket inode number and the
/proc/net/tcp and /proc/net/udp files will enable you to get the
protocol address from the inode number.

James K. Lowden

Feb 22, 2019, 11:12:20 AM
On Fri, 22 Feb 2019 11:33:33 -0000 (UTC)
gaz...@shell.xmission.com (Kenny McCormack) wrote:

> Out of curiosity, why do you want/need to do this? I'm having a hard
> time imagining a situation where it wouldn't be easier to just run
> netstat and parse the result. Please help me to understand.

I don't know the answer, but I had a similar problem I was able only to
find a partial solution to: how to know if a given file is on a local
or remote filesystem.

As a human being, you can look at things like fstab and mount(8) output
to know which filesystems are locally attached and which are accessed
over a network. As a program, not so much.

My solution was to stat the file and scan /proc/partitions for the
device. That will yield false negatives: some local filesystems aren't
represented there. tempfs and procfs come to mind as examples.

The root problem, in case you're pondering a solution, is to know
whether or not, for a given file, the kernel file buffer always
represents the current state of the on-disk image, plus pending
updates. On an NFS file, that's not true: if another user updates
the file, your kernel file buffer for that file silently
becomes obsolete. Updates to the file based on that voided cache will
result in the file being corrupted; it's a read-write-write error. A
system (such as a database) that uses shared files and relies on kernel
file buffer integrity is well advised to protect users from mistakenly
using network filesystems.

It seems like there might be a definitive way on Linux. statvfs(3)
will return you the mysterious fsid. sysfs(2) will convert the fsid to
a string. That string should appear in /proc/filesystems. If "nodev"
is not associated with it, it's a local device. If "nodev" is
associated with it, perhaps there is something
in /proc/fs/<name>/<devname>/options, such as "block_validity" that
answers the question. I didn't investigate that deeply because I can
live with the false negatives for now.
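
For comparison, here is a different technique from the one outlined above: Linux's statfs(2) reports a filesystem type magic number in f_type, which can be compared against the NFS value. This is only a sketch under stated assumptions: the helper name on_nfs is made up, the magic value is the one listed for NFS in the statfs(2) manual page, and other network filesystems (which have their own magic numbers) are not covered.

```c
#include <sys/vfs.h>

/* Filesystem magic for NFS as listed in statfs(2); other network
 * filesystems have their own values and are not covered here. */
#define NFS_MAGIC 0x6969UL

/* Hypothetical helper: 1 if path is on NFS, 0 if not, -1 on error. */
static int on_nfs(const char *path)
{
    struct statfs sfs;

    if (statfs(path, &sfs) != 0)
        return -1;
    return (unsigned long)sfs.f_type == NFS_MAGIC;
}
```

A check like this answers "is this NFS?" rather than the broader "is the kernel's cache authoritative for this file?", so it narrows the problem rather than solving it.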



--jkl

blt_...@sco7lrqqtlvt4lmzl0e7j6y6v.org

Feb 22, 2019, 11:17:15 AM
On Fri, 22 Feb 2019 11:33:33 -0000 (UTC)
gaz...@shell.xmission.com (Kenny McCormack) wrote:
>In article <q4okhq$i83$1...@gioia.aioe.org>,
> <blt_6hm...@a6wa8v3cn8wj5.ac.uk> wrote:
>>Hello
>>
>>I need to be able to associate a program with an open port using C in
>>the same way as netstat -p does it. I've straced netstat and I can
>>see it scanning through every process in /proc and their open file
>>descriptors (seems a bit ugly to me, must be a more efficient way?),
>>but I can't see how it is checking what socket and port number - if
>>any - those file descriptors are linked to and google isn't being much
>>help. I can't see getsockname() working since the file descriptor value
>>means nothing outside the program its being used in.
>
>Have you tried looking at (and working with) the source?

Not yet, I was hoping for a faster solution than trawling through thousands
of lines of code.

>Out of curiosity, why do you want/need to do this? I'm having a hard
>time imagining a situation where it wouldn't be easier to just run

Unbelievable, every single time I ask a question on this group I get this as a
response.

"I can't think of a reason to do it so why do you need it?"

Because I'm writing a packet dump program and I'd like to print the process
responsible for sending out or receiving the packet as well as other details.
And if tcpdump did exactly what I needed I'd use it.

>netstat and parse the result. Please help me to understand.

Probably not very efficient to spawn off an instance of netstat for every
packet going in or out. I suspect the machine might slow down somewhat.

blt_j_D...@i_311kjl75cv7qyadf3.net

Feb 22, 2019, 11:17:45 AM
Care to elaborate? How does the socket inode number relate to the port?

Kenny McCormack

Feb 22, 2019, 11:28:20 AM
In article <q4p7a6$1g2v$1...@gioia.aioe.org>,
<blt_...@sco7lrqqtlvt4lmzl0e7j6y6v.org> wrote:
...
>>Out of curiosity, why do you want/need to do this? I'm having a hard
>>time imagining a situation where it wouldn't be easier to just run
>
>Unbelievable , every single time I ask a question on this group I get
>this as a response.
>
>"I can't think of a reason to do it so why do you need it?"

Yes, that's Usenet (and just about every other online forum).

But, seriously, I get what you're saying, and I sympathize. I've been
on the wrong end of it too many times to count. The point is that it
usually carries a sub-current of "You don't need to do this. It's
stupid. You're stupid".

But I tried to phrase my question in such a way as to avoid projecting
this all-too-common sub-current. I was genuinely interested and you've
answered sensibly. My thanks.

Another poster has posted a method that looks promising (I'm not
interested enough to test it), but it looks like it will still have a
bit too much overhead (i.e., polling/searching) to be a sensible thing
to do in a tight loop.

--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/Mandela

Rainer Weikusat

Feb 22, 2019, 11:45:08 AM
I just wrote that: The socket has an inode number (can be determined via
the /proc/.../fd directory of a process) and a protocol address
(recorded in /proc/net/udp and /proc/net/tcp, which also contain the
inode number).

How about having a look at the files?

blt_...@p35krpd_qua_zs.gov

Feb 22, 2019, 12:01:33 PM
On Fri, 22 Feb 2019 16:45:04 +0000
Rainer Weikusat <rwei...@talktalk.net> wrote:
>blt_j_D6319juu@i_311kjl75cv7qyadf3.net writes:
>> Care to elaborate? How does the socket inode number relate to the port?
>
>I just wrote that: The socket has an inode number (can be determined via
>the /proc/.../fd directory of a process) and a protocol address
>(recorded in /proc/net/udp and /proc/net/tcp which also contain the
>inode number.
>
>How about having a look at the files?

Ok, figured it out, thank you.

Trawling the file system seems a hideous way to do it, is there an ioctl() that
may do it quicker?

Rainer Weikusat

Feb 22, 2019, 12:01:39 PM
The port is part of the protocol address (but that's really
"internetworking 25.25[*]").

[*] 101/4
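
To make the "port is part of the protocol address" point concrete: each address column in /proc/net/tcp and /proc/net/udp has the form AAAAAAAA:PPPP, with the IPv4 address as eight hex digits and the port as four. The sketch below decodes one such field. It assumes the kernel prints the raw 32-bit address word and the port already converted to host order; the helper name decode_endpoint is made up for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>

/* Hypothetical helper: decode one "AAAAAAAA:PPPP" field from
 * /proc/net/tcp or /proc/net/udp into a dotted-quad string and a port.
 * Returns 0 on success, -1 if the field does not parse. */
static int decode_endpoint(const char *field, char *ip, socklen_t iplen,
                           unsigned *port)
{
    unsigned addr;
    struct in_addr in;

    if (sscanf(field, "%8x:%4x", &addr, port) != 2)
        return -1;

    /* Assumption: the hex digits are the raw address word as the kernel
     * holds it, so storing it back unchanged restores the original bytes. */
    in.s_addr = addr;
    if (!inet_ntop(AF_INET, &in, ip, iplen))
        return -1;
    return 0;
}
```

Given the field "00000000:1770", this yields the wildcard address 0.0.0.0 and port 6000 (0x1770).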

Rainer Weikusat

Feb 22, 2019, 12:09:38 PM
The traditional way to determine this kind of information was to examine
the kernel data structures directly via /dev/kmem. procfs is a virtual
filesystem which exists so that ps etc. don't have to be privileged (so
that they can read kernel memory directly) and can use a somewhat more
abstract interface instead, IOW, "none I'm aware of".

Using this interface is cumbersome in C but very easy in higher-level
languages, especially, from shell scripts (or interactive shells).

blt_...@doe8jstkh6camf5g.edu

Feb 22, 2019, 12:19:35 PM
On Fri, 22 Feb 2019 17:09:35 +0000
Rainer Weikusat <rwei...@talktalk.net> wrote:
>blt_ukmz@p35krpd_qua_zs.gov writes:
>> On Fri, 22 Feb 2019 16:45:04 +0000
>> Rainer Weikusat <rwei...@talktalk.net> wrote:
>>>blt_j_D6319juu@i_311kjl75cv7qyadf3.net writes:
>>>> Care to elaborate? How does the socket inode number relate to the port?
>>>
>>>I just wrote that: The socket has an inode number (can be determined via
>>>the /proc/.../fd directory of a process) and a protocol address
>>>(recorded in /proc/net/udp and /proc/net/tcp which also contain the
>>>inode number.
>>>
>>>How about having a look at the files?
>>
>> Ok, figured it out, thank you.
>>
>> Trawling the file system seems a hideous way to do it, is there an ioctl()
>that
>> may do it quicker?
>
>The traditional way to determine this kind of information was to examine
>the kernel data structures directly via /proc/kmem. procfs is a virtual

Ok.

>filesystem which exists so that ps etc don't have to be privileged (so
>that they can read kernel memory directly) and can use a somewhat more
>abstract interface instead, IOW, "none I'm aware of".
>
>Using this interface is cumbersome in C but very easy in higher-level
>languages, especially, from shell scripts (or interactive shells).

Maybe so, but parsing /proc directories for every process on the system just
to look up a single piece of data is going to hit the CPU badly.

Rainer Weikusat

Feb 22, 2019, 12:32:40 PM
blt_...@doe8jstkh6camf5g.edu writes:
> On Fri, 22 Feb 2019 17:09:35 +0000 Rainer Weikusat <rwei...@talktalk.net> wrote:

[Linux procfs]

>>Using this interface is cumbersome in C but very easy in higher-level
>>languages, especially, from shell scripts (or interactive shells).
>
> Maybe so, but parsing /proc directories for every process on the system just
> to look up a single piece of data is going to hit the CPU badly.

For perspective: The first version of Linux I ever used (1.2.13) already
supported procfs and I ran that on a 486SX-25 (25 MHz). I didn't notice
any performance problems caused by it and - contrary to popular
superstitions in certain quarters - computers haven't become much slower
since then :-).

Kaz Kylheku

Feb 22, 2019, 12:39:04 PM
On 2019-02-22, blt_ukmz@p35krpd_qua_zs.gov <blt_ukmz@p35krpd_qua_zs.gov> wrote:
> Trawling the file system seems a hideous way to do it, is there an ioctl() that
> may do it quicker?

There is, but it's secret. I mean, if I told you, you might leak it to
the netstat maintainers.

blt...@4gr2p0qluph2vl5iwj.gov.uk

Feb 22, 2019, 1:27:57 PM
Hilarious, you should do stand up. FYI there is often another way to find out
process information other than to parse /proc text files - eg getpriority()
getrusage() - so why do you think this is any different?

Lew Pitcher

Feb 22, 2019, 1:50:44 PM
On the other hand, why do /you/ think that the Linux kernel developers chose
to implement a single general access to kernel information, rather than a
multitude of special purpose system calls?

It may be OK to second-guess the guys that designed and developed the
interfaces, but it certainly isn't OK to demand that uninvolved third-
parties justify those interfaces. All we can do, absent of actually being
kernel developers/designers ourselves, is tell you what we know. If you
/want/ an explanation of /why/, ask a kernel developer, not us.


--
Lew Pitcher
"In Skills, We Trust"

Kenny McCormack

Feb 22, 2019, 2:07:06 PM
In article <q4pga1$bge$1...@dont-email.me>,
Lew Pitcher <lew.p...@digitalfreehold.ca> wrote:
...
>It may be OK to second-guess the guys that designed and developed the
>interfaces, but it certainly isn't OK to demand that uninvolved third-
>parties justify those interfaces. All we can do, absent of actually being
>kernel developers/designers ourselves, is tell you what we know. If you
>/want/ an explanation of /why/, ask a kernel developer, not us.

How do you know there are no kernel developers here?

--
Marshall: 10/22/51
Jessica: 4/4/79

Rainer Weikusat

Feb 22, 2019, 2:11:59 PM
No. That's the Linux-interface for this. This is designed to be easily
usable in environments where text processing is easy. Even stdio would
count as such an environment.

There's more nastiness here than the parsing: The stat/ fstat system
calls don't generally work on procfs files: The returned size is always
0. Hence, the only way to process such a file as a whole is "read it
into a dynamically growing buffer until a read returns 0" (aka EOF).
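
The "read it into a dynamically growing buffer" approach can be sketched as follows. This is a minimal sketch, not a hardened implementation: the helper name slurp is made up, the initial capacity is arbitrary, and an interrupted read (EINTR) is simply treated as an error.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: read a whole file whose size cannot be learned
 * from stat(), growing the buffer until read() returns 0 (EOF).
 * Returns a NUL-terminated malloc()ed buffer, or NULL on error. */
static char *slurp(const char *path, size_t *len_out)
{
    size_t cap = 4096, len = 0;
    char *buf, *tmp;
    ssize_t n;
    int fd;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    buf = malloc(cap);
    if (!buf) {
        close(fd);
        return NULL;
    }

    /* Always keep one byte free for the terminating NUL. */
    while ((n = read(fd, buf + len, cap - len - 1)) > 0) {
        len += (size_t)n;
        if (cap - len - 1 == 0) {        /* buffer full: double it */
            tmp = realloc(buf, cap * 2);
            if (!tmp) {
                n = -1;
                break;
            }
            buf = tmp;
            cap *= 2;
        }
    }
    close(fd);

    if (n < 0) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    if (len_out)
        *len_out = len;
    return buf;
}
```

Growing the buffer before it is completely full matters here: a read() with a zero-byte count would return 0 and be mistaken for EOF.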


Nicolas George

Feb 23, 2019, 6:19:19 AM
Kenny McCormack, dans le message <q4ph8o$5hn$1...@news.xmission.com>, a
écrit :
> How do you know there are no kernel developers here?

It is pretty obvious, really.

Kaz Kylheku

Feb 23, 2019, 10:31:40 AM
I've been a kernel developer on and off, though not as a career thing.

At one company I was the distro/kernel guy: put together an embedded
distro from scratch and did a good chunk of all required kernel work.

Most recently, on a previous job, I substantially rewrote the internals
of a complex and buggy out-of-tree Linux driver for a USB host controller
chip/block.

Most recently before that, I hacked the USB-serial subsystem in Linux to
allow a TTY session to persist when the USB-serial adapter is unplugged,
and then plugged in again. You could be in the middle of Vim, editing a
file, on a USB-Serial console. Unplug the device. Plug it in again. Or
even a different one: unplug a Prolific PL2303 based unit, plug in a
FTDI. You get your session back: just Ctrl-L to refresh your screen and
you're editing again! Making USB serial robust this way makes it more
suitable for use as the system's serial console. I had it so that an
embedded device could boot up with the device missing, yet still use it
as its console. Then, just plug it in at any time, and there it is.
Kind of like the obvious thing you'd do with an RS-232 port, but with
USB.

--
TXR Programming Language: http://nongnu.org/txr
Music DIY Mailing List: http://www.kylheku.com/diy
ADA MP-1 Mailing List: http://www.kylheku.com/mp1

Scott Lurndal

Feb 23, 2019, 10:59:05 AM
Really? I've been developing operating systems since 1981.

Nicolas George

Feb 23, 2019, 11:15:12 AM
Scott Lurndal, dans le message <a5ecE.239863$Ri5....@fx45.iad>, a
écrit :
> Really? I've been developing operating systems since 1981.

And you refrained from wasting your time on this thread.

Kenny McCormack

Feb 23, 2019, 11:26:15 AM
In article <5c71718e$0$5597$426a...@news.free.fr>,
Nicolas George seems to exist in his own little world.

I think it best to just leave it (and him and his world) alone.

--
A 70 year old man who watches 6 hours of TV a day, plays a lot of golf
and seems to always be in Florida is a retiree, not a president.

Jorgen Grahn

Feb 23, 2019, 4:04:44 PM
On Fri, 2019-02-22, blt_...@doe8jstkh6camf5g.edu wrote:
...
> Maybe so, but parsing /proc directories for every process on the system just
> to look up a single piece of data is going to hit the CPU badly.

My impression is that the feature (being able to do a socket ->
process lookup) is a pure troubleshooting thing on a "best effort"
basis. I don't think there's any incentive for the kernel developers
to improve it, especially not if it would mean a kernel interface
change and new lookup structures in the kernel (to avoid the linear
search across processes).

It's IMHO also not useful for anything but troubleshooting.
I wouldn't e.g. try to implement sending SIGKILL to the process
listening on *:http; I suspect that would go wrong sooner or later.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Jorgen Grahn

Feb 23, 2019, 4:18:36 PM
On Fri, 2019-02-22, blt_...@sco7lrqqtlvt4lmzl0e7j6y6v.org wrote:
> On Fri, 22 Feb 2019 11:33:33 -0000 (UTC)
> gaz...@shell.xmission.com (Kenny McCormack) wrote:
>>In article <q4okhq$i83$1...@gioia.aioe.org>,
>> <blt_6hm...@a6wa8v3cn8wj5.ac.uk> wrote:
>>>Hello
>>>
>>>I need to be able to associate a program with an open port using C in
>>>the same way as netstat -p does it. I've straced netstat and I can
>>>see it scanning through every process in /proc and their open file
>>>descriptors (seems a bit ugly to me, must be a more efficient way?),
>>>but I can't see how it is checking what socket and port number - if
>>>any - those file descriptors are linked to and google isn't being much
>>>help. I can't see getsockname() working since the file descriptor value
>>>means nothing outside the program its being used in.
>>
>>Have you tried looking at (and working with) the source?
>
> Not yet, I was hoping for a faster solution that trawling through thousands
> of lines of code.
>
>>Out of curiosity, why do you want/need to do this? I'm having a hard
>>time imagining a situation where it wouldn't be easier to just run
>
> Unbelievable , every single time I ask a question on this group I
> get this as a response.

I think it was explained to you last time why people ask why.

> "I can't think of a reason to do it so why do you need it?"
>
> Because I'm writing a packet dump program and I'd like to print the
> process responsible for sending out or receiving the packet as well
> as other details. And if tcpdump did exactly what I needed I'd use
> it.

I agree that would be useful for some things, especially if "other
details" included the socket buffer states. Often when I troubleshoot
e.g. TCP communication, I don't want to know only that a kilobyte of
data arrived, but also whether the process actually consumed it.

>>netstat and parse the result. Please help me to understand.
>
> Probably not very efficient to spawn off an instance of netstat for every
> packet going in or out. I suspect the machine might slow down somewhat.

Either way, with the kernel interface we seem to have, you probably
have to cache the data. And figure out when to flush a cached
host:port -> pid association.
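
One way to picture such a cache is sketched below. It is illustrative only: entries are keyed by local port alone rather than host:port, the table size and lifetime are arbitrary, and a fixed expiry is just one possible flushing policy. All names are made up.

```c
#include <sys/types.h>
#include <time.h>

#define CACHE_SLOTS 256   /* arbitrary table size */
#define CACHE_TTL   5     /* seconds an entry stays valid (arbitrary) */

struct port_entry {
    unsigned short port;
    pid_t pid;
    time_t stamp;         /* when the association was recorded */
    int used;
};

static struct port_entry cache[CACHE_SLOTS];

/* Record a port -> pid association found by a /proc scan. */
static void cache_put(unsigned short port, pid_t pid, time_t now)
{
    struct port_entry *e = &cache[port % CACHE_SLOTS];

    e->port = port;
    e->pid = pid;
    e->stamp = now;
    e->used = 1;
}

/* Return the cached pid, or -1 if there is no fresh entry and the
 * caller has to rescan /proc. */
static pid_t cache_get(unsigned short port, time_t now)
{
    struct port_entry *e = &cache[port % CACHE_SLOTS];

    if (!e->used || e->port != port || now - e->stamp > CACHE_TTL)
        return -1;
    return e->pid;
}
```

The caller passes time(NULL) as now for each packet; only a miss triggers the expensive scan. A stale hit is still possible within the lifetime, which is the flushing problem mentioned above.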

blt_...@i_itiqa2p4hehxp113p.gov

Feb 24, 2019, 5:05:12 AM
On Fri, 22 Feb 2019 19:11:55 +0000
Rainer Weikusat <rwei...@talktalk.net> wrote:
>blt...@4gr2p0qluph2vl5iwj.gov.uk writes:
>> Kaz Kylheku <157-07...@kylheku.com> wrote:
>>>On 2019-02-22, blt_ukmz@p35krpd_qua_zs.gov <blt_ukmz@p35krpd_qua_zs.gov>
>wrote:
>>>
>>>> Trawling the file system seems a hideous way to do it, is there an ioctl()
>>>that
>>>> may do it quicker?
>>>
>>>There is, but it's secret. I mean, if I told you, you might leak it to
>>>the netstat maintainers.
>>
>> Hilarious, you should do stand up. FYI there is often another way to find out
>
>> process information other than to parse /proc text files - eg getpriority()
>> getrusage() - so why do you think this is any different?
>
>No. That's the Linux-interface for this. This is designed to be easily
>usable in environments where text processing is easy. Even stdio would
>count as such an environment.

When you get this low level I'm not sure usable is more important than
efficient. Parsing text files is horribly slow compared to just calling
an ioctl() and getting a structured data buffer back in return.

>There's more nastiness here than the parsing: The stat/ fstat system
>calls don't generally work on procfs files: The returned size is always
>0. Hence, the only way to process such a file as a whole is "read it
>into a dynamically growing buffer until a read returns 0" (aka EOF).

Didn't know that, but it reinforces my point about /proc being a poor relation
to a proper process API.


blt_nV...@nlnwgj.ac.uk

Feb 24, 2019, 5:07:00 AM
On 23 Feb 2019 21:04:41 GMT
Jorgen Grahn <grahn...@snipabacken.se> wrote:
>On Fri, 2019-02-22, blt_...@doe8jstkh6camf5g.edu wrote:
>....
>> Maybe so, but parsing /proc directories for every process on the system just
>> to look up a single piece of data is going to hit the CPU badly.
>
>My impression is that the feature (being able to do a socket ->
>process lookup) is a pure troubleshooting thing on a "best effort"
>basis. I don't think there's any incentive for the kernel developers
>to improve it, especially not if it would mean a kernel interface
>change and new lookup structures in the kernel (to avoid the linear
>search across processes).

I don't see why it would require a kernel interface change. All Unix systems
have their own system-specific ioctl() calls, especially when related to
networking. Adding one more to get the pid of a process bound to a particular
TCP or UDP interface and port isn't asking for much IMO.


Rainer Weikusat

Feb 24, 2019, 1:11:30 PM
blt__2etx@i_itiqa2p4hehxp113p.gov writes:
> Rainer Weikusat <rwei...@talktalk.net> wrote:
>>blt...@4gr2p0qluph2vl5iwj.gov.uk writes:
>>> Kaz Kylheku <157-07...@kylheku.com> wrote:
>>>>On 2019-02-22, blt_ukmz@p35krpd_qua_zs.gov <blt_ukmz@p35krpd_qua_zs.gov>
>>wrote:
>>>>
>>>>> Trawling the file system seems a hideous way to do it, is there an ioctl()
>>>>that
>>>>> may do it quicker?
>>>>
>>>>There is, but it's secret. I mean, if I told you, you might leak it to
>>>>the netstat maintainers.
>>>
>>> Hilarious, you should do stand up. FYI there is often another way to find out
>>
>>> process information other than to parse /proc text files - eg getpriority()
>>> getrusage() - so why do you think this is any different?
>>
>>No. That's the Linux-interface for this. This is designed to be easily
>>usable in environments where text processing is easy. Even stdio would
>>count as such an environment.
>
> When you get this low level I'm not sure usable is more important than
> efficient. Parsing text files is horribly slow compared to just calling
> an ioctl() and getting a structured data buffer back in return.

This is not as easy as you seem to think.

ioctl ('I/O control operation') needs a file descriptor as argument but
there is none. One would either need to create another special-purpose
file descriptor creation call or a character device driver here.

Any number of processes can have access to a particular socket and this
set of processes can change at any time, including that the last process
using the socket terminates before the application asking for the
information can make any use of it. With some luck, the returned pid
may end up referring to a process no longer owning the socket which is
meanwhile owned by another process whose pid wasn't returned. Or the
process whose pid was returned closes the socket while another whose pid
wasn't returned creates a new one bound to the same port etc.

And then, computers are really fast today: A current smartphone has more
processing power than users of 1980s multi-user minicomputers would ever
have imagined.

Barry Margolin

Feb 24, 2019, 10:16:46 PM
In article <q4tqc1$1o5j$1...@gioia.aioe.org>, blt_nV...@nlnwgj.ac.uk
wrote:

> Adding one more to get the pid of a process bound to a particular
> TCP or UDP interface and port isn't asking for much IMO.

Don't forget that a stream can be associated with multiple processes,
because open files are inherited during fork() and can be passed between
processes using Unix domain sockets. So this call would have to return
ALL of them.
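
The inheritance point is easy to demonstrate. The sketch below (helper name made up) creates a socket, forks, and has the child compare the inode it sees on the inherited descriptor with the one the parent saw. Both processes would then show the same socket inode under their /proc/<pid>/fd directories.

```c
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if a forked child sees the same socket
 * inode as the parent, 0 if not, -1 on error. */
static int child_shares_socket(void)
{
    struct stat parent_st, child_st;
    int status, s;
    pid_t pid;

    s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s < 0)
        return -1;
    if (fstat(s, &parent_st) != 0) {
        close(s);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(s);
        return -1;
    }
    if (pid == 0) {
        /* Child: the inherited descriptor refers to the same socket. */
        if (fstat(s, &child_st) != 0)
            _exit(2);
        _exit(child_st.st_ino == parent_st.st_ino ? 0 : 1);
    }

    close(s);
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

Any socket-to-process lookup therefore has to cope with a set of pids, not a single one.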

And it will also have to implement access control -- unprivileged users
aren't normally allowed to find out about file descriptors in processes
owned by other users.

And this ioctl obviously shouldn't be specific to network sockets, it
should work for any kind of stream (file, pipe, Unix-domain sockets,
etc.). The parameter specifying what you're looking for would have to be
somewhat complex to handle this generality.

Meanwhile, all this work has already been done in the implementation of
procfs, and it's good enough for tools that need the information.

--
Barry Margolin, bar...@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***

blt_g...@rblq.ac.uk

Feb 25, 2019, 5:59:00 AM
On Sun, 24 Feb 2019 22:16:37 -0500
Barry Margolin <bar...@alum.mit.edu> wrote:
>In article <q4tqc1$1o5j$1...@gioia.aioe.org>, blt_nV...@nlnwgj.ac.uk
>wrote:
>
>> Adding one more to get the pid of a process bound to a particular
>> TCP or UDP interface and port isn't asking for much IMO.
>
>Don't forget that a stream can be associated with multiple processes,
>because open files are inherited during fork() and can be passed between
>processes using Unix domain sockets. So this call would have to return
>ALL of them.
>
>And it will also have to implement access control -- unprivileged users
>aren't normally allowed to find out about file descriptors in processes
>owned by other users.
>
>And this ioctl obviously shouldn't be specific to network sockets, it
>should work for any kind of stream (file, pipe, Unix-domain sockets,
>etc.). The parameter specifying what you're looking for would have to be
>somewhat complex to handle this generality.

So how do you think versions of unix that don't have procfs manage to get
this sort of information to processes? Eg OSX.

blt_c1...@0vrx.gov.uk

Feb 25, 2019, 5:59:54 AM
On Sun, 24 Feb 2019 18:11:26 +0000
Rainer Weikusat <rwei...@talktalk.net> wrote:
>blt__2etx@i_itiqa2p4hehxp113p.gov writes:
>> When you get this low level I'm not sure usable is more important than
>> efficient. Parsing text files is horribly slow compared to just calling
>> an ioctl() and getting a structured data buffer back in return.
>
>This is not as easy as you seem to think.
>
>ioctl ('I/O control operation') needs a filedescriptor as argument but
>there is none. One would either need to create another special-purpose
>file descriptor creation call or a character device driver here.
>
>Any number of processes can have access to a particular socket and this
>set of processes can change at any time, including that the last process
>using the socket terminates before the application asking for the
>information can make any use of it. With some luck, the the returned pid
>may end up referring to a process no longer owning the socket which is
>meanwhile owned by another process whose pid wasn't returned. Or the
>process whose pid was returned closes the socket while another whose pid
>wasn't returned creates a new one bound to the same port etc.

There are always going to be these sorts of race conditions in a multi-process
environment. That's not a reason not to do it.

Kenny McCormack

Feb 25, 2019, 7:24:52 AM
In article <q50hpg$1r6k$1...@gioia.aioe.org>, <blt_g...@rblq.ac.uk> wrote:
...
>>And this ioctl obviously shouldn't be specific to network sockets, it
>>should work for any kind of stream (file, pipe, Unix-domain sockets,
>>etc.). The parameter specifying what you're looking for would have to be
>>somewhat complex to handle this generality.
>
>So how do you think versions of unix that don't have procfs manage to get
>this sort of information to processes? Eg OSX.
>

They use icky, hard-to-use, callable-only-from-C-or-assembler interfaces.

And wish they had /proc, like modern Unix (aka, Linux) does.

--
Which of these is the crazier bit of right wing lunacy?
1) We've just had another mass shooting; now is not the time to be talking about gun control.

2) We've just had a massive hurricane; now is not the time to be talking about climate change.

Rainer Weikusat

Feb 25, 2019, 7:41:40 AM
Returning an unbounded set of process IDs in response to an ioctl on a
special-purpose file descriptor of a kind which doesn't yet exist is
neither trivial to implement nor easy to use (correctly). Further,
considering that the information is useless outside of corner cases
(like troubleshooting or gathering historical statistics), there's no
reason why it would need to be provided very quickly, especially not
because of hypothetical performance problems caused by the existing
filesystem-based interface.

Jorgen Grahn

Feb 25, 2019, 9:29:21 AM
On Sun, 2019-02-24, blt_nV...@nlnwgj.ac.uk wrote:
> On 23 Feb 2019 21:04:41 GMT
> Jorgen Grahn <grahn...@snipabacken.se> wrote:
>>On Fri, 2019-02-22, blt_...@doe8jstkh6camf5g.edu wrote:
>>....
>>> Maybe so, but parsing /proc directories for every process on the system just
>>> to look up a single piece of data is going to hit the CPU badly.
>>
>>My impression is that the feature (being able to do a socket ->
>>process lookup) is a pure troubleshooting thing on a "best effort"
>>basis. I don't think there's any incentive for the kernel developers
>>to improve it, especially not if it would mean a kernel interface
>>change and new lookup structures in the kernel (to avoid the linear
>>search across processes).
>
> I don't see why it would require a kernel interface change. All unix system
> have their own system specific ioctl() calls especially when related to
> networking.

That's what I meant by a kernel interface change.

> Adding one more to get the pid of a process bound to a particular
> TCP or UDP interface and port isn't asking for much IMO.

Then there's the actual work in the kernel -- see above, and Ben's
response.

Scott Lurndal

Feb 25, 2019, 9:38:27 AM
The reason that ioctl(2) isn't used for this purpose is simple; you can't
do ioctl's over NFS.

The original SVR4 /proc used ioctl(2) for this purpose. In SVR4.2,
they switched to supporting read and write for this purpose
instead; primarily so /proc could be exported over NFS.

Returning binary data rather than printable (and thus parsable) data
turned out to be problematic when new data needed to be added to the
return; text makes this process easier because the data is free-form
(often returned as key-value pairs).

Parsing isn't that large an overhead on modern systems.

Scott Lurndal

Feb 25, 2019, 9:39:39 AM
Generally by reading /dev/mem, /dev/kmem or using operating system specific
private interfaces.

Nicolas George

Feb 25, 2019, 9:53:18 AM
Scott Lurndal, dans le message <A5TcE.72108$JT2....@fx10.iad>, a
écrit :
> The reason that ioctl(2) isn't used for this purpose is simple; you can't
> do ioctl's over NFS.

Can you explain what "over NFS" means for a socket?

Scott Lurndal

Feb 25, 2019, 10:56:13 AM
I'm sorry, I don't understand your question in the context of /proc.

Nicolas George

Feb 25, 2019, 10:59:38 AM
Scott Lurndal, dans le message <ueUcE.106728$qz5....@fx37.iad>, a
écrit :
>>> The reason that ioctl(2) isn't used for this purpose is simple; you can't
>>> do ioctl's over NFS.
>>Can you explain what "over NFS" means for a socket?
> I'm sorry, I don't understand your question in the context of /proc.

I am sorry you do not understand what you wrote yourself.

Rainer Weikusat

Feb 25, 2019, 11:02:11 AM
sc...@slp53.sl.home (Scott Lurndal) writes:
> blt_nV...@nlnwgj.ac.uk writes:

[mapping a port to a set of pids]

> Parsing isn't that large an overhead on modern systems.

It's not that there's much parsing to do here. The program below takes a
port number as argument and prints the pids of the processes that are in
possession of a socket listening on that port:

------------
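#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * /proc/net/tcp lines are fixed width, so these sample strings only
 * serve to compute offsets. sizeof counts the terminating NUL, which
 * accounts for the ':' before the port and the space before the inode.
 */
#define PREFIX "   0: 00000000"
#define INFIX "1770 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0"

/* Return the inode of the first entry with this local port, or -1. */
static long port_2_inode(int port)
{
    char hexport[5], line[1024], *p;
    FILE *fp;
    long inode;

    sprintf(hexport, "%04X", port);

    fp = fopen("/proc/net/tcp", "r");
    if (!fp) return -1;
    fgets(line, sizeof(line), fp);      /* skip the header line */

    inode = -1;
    while (fgets(line, sizeof(line), fp)) {
        /* ignore lines too short to hold an inode field */
        if (strlen(line) < sizeof(PREFIX) + sizeof(INFIX)) continue;
        p = line + sizeof(PREFIX);

        if (memcmp(hexport, p, 4) == 0) {
            p += sizeof(INFIX);
            sscanf(p, "%ld", &inode);
            break;
        }
    }

    fclose(fp);
    return inode;
}

/* Does the fd directory we are currently in contain this socket? */
static int scan_fds_for(long inode)
{
    struct stat st;
    DIR *dir;
    struct dirent *d_ent;
    int rc, found;

    dir = opendir(".");
    if (!dir) return 0;

    found = 0;
    while ((d_ent = readdir(dir))) {
        rc = stat(d_ent->d_name, &st);
        if (rc == -1 || !S_ISSOCK(st.st_mode)) continue;

        if (st.st_ino == (ino_t)inode) {
            found = 1;
            break;
        }
    }

    closedir(dir);
    return found;
}

/* Visit /proc/<pid>/fd for every process and print the matching pids. */
static void print_pids_for(long inode)
{
    DIR *dir;
    struct dirent *d_ent;
    int proc;
    int rc;

    if (chdir("/proc") == -1) return;
    dir = opendir(".");
    if (!dir) return;

    proc = dirfd(dir);
    while ((d_ent = readdir(dir))) {
        rc = chdir(d_ent->d_name);
        if (rc == -1) continue;

        rc = chdir("fd");
        if (rc == 0 && scan_fds_for(inode))
            puts(d_ent->d_name);

        fchdir(proc);   /* back to /proc for the next entry */
    }

    closedir(dir);
}

int main(int argc, char **argv)
{
    int port;
    long inode;

    if (argc < 2) {
        fprintf(stderr, "usage: %s port\n", argv[0]);
        return 1;
    }

    port = atoi(argv[1]) & 0xffff;
    inode = port_2_inode(port);
    if (inode != -1) print_pids_for(inode);

    return 0;
}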

blt_b...@8pqtba6hz99c1hfnos621mc7.edu

Feb 25, 2019, 12:20:12 PM
On Mon, 25 Feb 2019 12:24:50 -0000 (UTC)
gaz...@shell.xmission.com (Kenny McCormack) wrote:
>In article <q50hpg$1r6k$1...@gioia.aioe.org>, <blt_g...@rblq.ac.uk> wrote:
>....
>>>And this ioctl obviously shouldn't be specific to network sockets, it
>>>should work for any kind of stream (file, pipe, Unix-domain sockets,
>>>etc.). The parameter specifying what you're looking for would have to be
>>>somewhat complex to handle this generality.
>>
>>So how do you think versions of unix that don't have procfs manage to get
>>this sort of information to processes? Eg OSX.
>>
>
>They use icky, hard-to-use, callable-only-from-C-or-assembler interfaces.

You mean like most posix system calls?

blt...@6_uvfq2pzpayutxoxig98fxsf.ac.uk

Feb 25, 2019, 12:22:24 PM
On 25 Feb 2019 14:29:18 GMT
Jorgen Grahn <grahn...@snipabacken.se> wrote:
>On Sun, 2019-02-24, blt_nV...@nlnwgj.ac.uk wrote:
>> On 23 Feb 2019 21:04:41 GMT
>> Jorgen Grahn <grahn...@snipabacken.se> wrote:
>>>On Fri, 2019-02-22, blt_...@doe8jstkh6camf5g.edu wrote:
>>>....
>>>> Maybe so, but parsing /proc directories for every process on the system
>just
>>>> to look up a single piece of data is going to hit the CPU badly.
>>>
>>>My impression is that the feature (being able to do a socket ->
>>>process lookup) is a pure troubleshooting thing on a "best effort"
>>>basis. I don't think there's any incentive for the kernel developers
>>>to improve it, especially not if it would mean a kernel interface
>>>change and new lookup structures in the kernel (to avoid the linear
>>>search across processes).
>>
>> I don't see why it would require a kernel interface change. All unix system
>> have their own system specific ioctl() calls especially when related to
>> networking.
>
>That's what I meant by a kernel interface change.
>
>> Adding one more to get the pid of a process bound to a particular
>> TCP or UDP interface and port isn't asking for much IMO.
>
>Then there's the actual work in the kernel -- see above, and Ben's
>response.

I'm not a kernel programmer, but there must be a data structure in the kernel
that gives a *direct* link between an interface + TCP/UDP port and a process
for networking to function at any sane speed. This data simply needs to
be exposed via an ioctl().

blt_aCS...@gba1ashx2sask4hya.ac.uk

Feb 25, 2019, 12:24:41 PM
On Mon, 25 Feb 2019 14:38:24 GMT
sc...@slp53.sl.home (Scott Lurndal) wrote:
>blt_nV...@nlnwgj.ac.uk writes:
>>On 23 Feb 2019 21:04:41 GMT
>>Jorgen Grahn <grahn...@snipabacken.se> wrote:
>>>On Fri, 2019-02-22, blt_...@doe8jstkh6camf5g.edu wrote:
>>>....
>>>> Maybe so, but parsing /proc directories for every process on the system
>just
>>>> to look up a single piece of data is going to hit the CPU badly.
>>>
>>>My impression is that the feature (being able to do a socket ->
>>>process lookup) is a pure troubleshooting thing on a "best effort"
>>>basis. I don't think there's any incentive for the kernel developers
>>>to improve it, especially not if it would mean a kernel interface
>>>change and new lookup structures in the kernel (to avoid the linear
>>>search across processes).
>>
>>I don't see why it would require a kernel interface change. All unix system
>>have their own system specific ioctl() calls especially when related to
>>networking. Adding one more to get the pid of a process bound to a particular
>>TCP or UDP interface and port isn't asking for much IMO.
>>
>
>The reason that ioctl(2) isn't used for this purpose is simple; you can't
>do ioctl's over NFS.

Completely irrelevant for IP sockets, and you're not going to be creating Unix
sockets on an NFS filesystem.

>Returning binary data rather than printable (and thus parsable) data
>turned out to be problematic when new data needed to be added to the
>return; text makes this process easier because the data is free-form
>(often returned as key-value pairs).

And what happens when the format of this free form text changes? It potentially
breaks every program that uses it and Linux has form here.

blt_...@8c49s5.com

Feb 25, 2019, 12:30:20 PM
On Mon, 25 Feb 2019 16:02:08 +0000
Rainer Weikusat <rwei...@talktalk.net> wrote:
>sc...@slp53.sl.home (Scott Lurndal) writes:
>> blt_nV...@nlnwgj.ac.uk writes:
>
>[mapping a port to a set of pids]
>
>> Parsing isn't that large an overhead on modern systems.
>
>It's not that there's much parsing to do here. The program below takes a
>port number as argument and prints the pids of the processes who are in
>posession of a socket listening on that port:

If you think there isn't much going on beneath the hood you've clearly
never seen the code for the *scanf() functions, never mind the overhead of
opening and closing all those file descriptors. I'm afraid I've not seen
anything that changes my opinion that /proc is a very suboptimal solution
for providing process information.

Casper H.S. Dik

Feb 25, 2019, 12:43:58 PM
gaz...@shell.xmission.com (Kenny McCormack) writes:

>In article <q50hpg$1r6k$1...@gioia.aioe.org>, <blt_g...@rblq.ac.uk> wrote:
>...
>>>And this ioctl obviously shouldn't be specific to network sockets, it
>>>should work for any kind of stream (file, pipe, Unix-domain sockets,
>>>etc.). The parameter specifying what you're looking for would have to be
>>>somewhat complex to handle this generality.
>>
>>So how do you think versions of unix that don't have procfs manage to get
>>this sort of information to processes? Eg OSX.
>>

>They use icky, hard-to-use, callable-only-from-C-or-assembler interfaces.

>And wish they had /proc, like modern Unix (aka, Linux) does.

Though Solaris has /proc, it does not add anything other than
process-specific data, though in Solaris 11.4 that does include
information about file descriptors, such as most or all of the things
you want to know about a socket.

While netstat in Solaris also tries to give you the pid of the
"owner", that is something cached when the socket is created
(but that includes creation of sockets returned by "accept").

Generally it finds most if not all owners (processes).

And if you really want to know, you can use pfiles (which uses
/proc/<pid>/fdinfo/*).

It should be no surprise that some sockets reported by
netstat are NOT found: they might be owned by the kernel, or
they might be file descriptors being transmitted from one
process to the next, which would not be seen under /proc.

Casper

Rainer Weikusat

Feb 25, 2019, 12:54:17 PM
blt_...@8c49s5.com writes:
> On Mon, 25 Feb 2019 16:02:08 +0000
> Rainer Weikusat <rwei...@talktalk.net> wrote:
>>sc...@slp53.sl.home (Scott Lurndal) writes:
>>> blt_nV...@nlnwgj.ac.uk writes:
>>
>>[mapping a port to a set of pids]
>>
>>> Parsing isn't that large an overhead on modern systems.
>>
>>It's not that there's much parsing to do here. The program below takes a
>>port number as argument and prints the pids of the processes who are in
>>posession of a socket listening on that port:
>
> If you think there isn't much going on beneath the hood you've clearly
> never seen the code for the *scanf() functions

If you think you know what I think based on you not understanding
something I wrote, better think again because I think you're very likely
going to end up being very wrong.

The only (sort-of) parsing of note here is analysis of the /proc/net/tcp
lines. These are fixed width, hence, it boils down to doing a 4 char
comparison in a certain position per read line, possibly followed by
doing a single string-to-int conversion. As that was rather beside the
point, I've used sscanf for that because it was the easiest to use.
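
As a rough sketch of that fixed-position approach without sscanf: the
offsets below assume the usual IPv4 /proc/net/tcp layout (local port at
column 15, inode at column 91), and the helper name is made up for
illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed column offsets in an IPv4 /proc/net/tcp line. */
enum { PORT_OFF = 15, INODE_OFF = 91 };

/*
 * Hypothetical helper: returns the inode if the line's local port
 * equals hexport (4 upper-case hex digits), 0 otherwise.
 */
static unsigned long inode_if_port(const char *line, const char *hexport)
{
    if (strlen(line) <= INODE_OFF) return 0;
    if (memcmp(line + PORT_OFF, hexport, 4) != 0) return 0;
    return strtoul(line + INODE_OFF, NULL, 10);
}
```

Note that 0 doubles as "no match" here, which is only acceptable because
the caller is looking for a live listening socket.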

In general, converting a n digit decimal string to a number requires
loading the digit character and subtracting 0 from it for every digit
plus n - 1 multiplications with 10 and n - 1 additions.

> never mind the overhead of opening and closing all those file
> descriptors.

This is nothing to do with "parsing". It's reading of directory
contents.

> I'm afraid I've not seen anything that changes my opinion that /proc
> is a very sub optimal solution for providing process information.

The thing is "it doesn't matter".

Rainer Weikusat

Feb 25, 2019, 12:55:28 PM
Which gets us back to "but there ain't no filedescriptor for this ioctl,
son!"

Rainer Weikusat

Feb 25, 2019, 12:56:27 PM
There's still no file descriptor this ioctl could work on.

Rainer Weikusat

Feb 25, 2019, 1:19:43 PM
Rainer Weikusat <rwei...@talktalk.net> writes:

[...]

> In general, converting a n digit decimal string to a number requires
> loading the digit character and subtracting 0 from it for every digit
> plus n - 1 multiplications with 10 and n - 1 additions.

Considering that there's a space behind the number, that's something
like this

static unsigned convert_int(char *p)
{
    unsigned a, dg;

    a = *p - '0';
    while (dg = *++p, (dg -= '0') < 10) a = a * 10 + dg;
    return a;
}

[SCNR]

Kenny McCormack

Feb 25, 2019, 1:50:47 PM
In article <5c74295b$0$22351$e4fe...@news.xs4all.nl>,
Casper H.S. Dik <Caspe...@OrSPaMcle.COM> wrote:
...
>>And wish they had /proc, like modern Unix (aka, Linux) does.
>
>Though Solaris has /proc, it does not adds anything other then
>process specific data, though in Solaris 11.4 that does include
>information about file descriptors such as most or all things
>you want to know of a socket.

Just FYI - and I have no desire to debate this further - I am familiar with
Solaris /proc and have written programs that use it. I think we can agree
that Linux /proc is better, but I suppose one could debate "At what cost?".

--
A pervert, a racist, and a con man walk into a bar...

Bartender says, "What will you have, Donald!"

Scott Lurndal

Feb 25, 2019, 1:57:40 PM
gaz...@shell.xmission.com (Kenny McCormack) writes:
>In article <5c74295b$0$22351$e4fe...@news.xs4all.nl>,
>Casper H.S. Dik <Caspe...@OrSPaMcle.COM> wrote:
>...
>>>And wish they had /proc, like modern Unix (aka, Linux) does.
>>
>>Though Solaris has /proc, it does not adds anything other then
>>process specific data, though in Solaris 11.4 that does include
>>information about file descriptors such as most or all things
>>you want to know of a socket.
>
>Just FYI - and I have no desire to debate this furher - I am familiar with
>Solaris /proc and have written programs that use it. I think we can agree
>that Linux /proc is better, but I suppose one could debate "At what cost?".

Solaris (and SVR4) /proc were designed originally to replace ptrace(2). They
were never intended to be the catchall they became with Linux.

Linux also has /sys, of course.

Kenny McCormack

Feb 25, 2019, 2:32:32 PM
In article <87pnrfh...@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rwei...@talktalk.net> wrote:
...
>There's still no file descriptor this ioctl could work on.

I think the OP may be using the term "ioctl" in a more generic way.

That is, as meaning just "some sort of callable API". It doesn't have to be
"ioctl" literally.

--
Nov 4, 2008 - the day when everything went
from being Clinton's fault to being Obama's fault.

Kaz Kylheku

Feb 25, 2019, 3:47:27 PM
On 2019-02-25, Kenny McCormack <gaz...@shell.xmission.com> wrote:
> In article <q50hpg$1r6k$1...@gioia.aioe.org>, <blt_g...@rblq.ac.uk> wrote:
> ...
>>>And this ioctl obviously shouldn't be specific to network sockets, it
>>>should work for any kind of stream (file, pipe, Unix-domain sockets,
>>>etc.). The parameter specifying what you're looking for would have to be
>>>somewhat complex to handle this generality.
>>
>>So how do you think versions of unix that don't have procfs manage to get
>>this sort of information to processes? Eg OSX.
>>
>
> They use icky, hard-to-use, callable-only-from-C-or-assembler interfaces.

All modern high-level languages worth a damn have nice ways to deal with C
interfaces and structs.

These interfaces are robust: there is no parsing involved. A data
structure is filled in which has everything at a known offset,
of a known type/size.

There is no bullshit like a leading zero misinterpreted as octal,
or some -1 value spewing out FFFFFFFF for what was supposed to be a byte
in the range 0 to 255.
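
To illustrate the octal pitfall in C terms, here is a minimal sketch;
the two function names are hypothetical. strtol with base 0 guesses the
base from the prefix, so a zero-padded decimal field is read as octal
unless base 10 is requested explicitly.

```c
#include <stdlib.h>

/* Base 0 lets strtol guess the base: "010" is taken as octal. */
static long parse_guess_base(const char *s)
{
    return strtol(s, NULL, 0);
}

/* Base 10 reads the same text as decimal, leading zeros and all. */
static long parse_decimal(const char *s)
{
    return strtol(s, NULL, 10);
}
```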

I recently needed to call "statfs" on an embedded ARM board. I whipped out TXR
Lisp, which is just my spare time project:

(typedef fsword-t uint64)
(typedef statfs (struct statfs
                  (f-type fsword-t)
                  (f-bsize fsword-t)
                  (f-blocks fsblkcnt-t)
                  (f-bfree fsblkcnt-t)
                  (f-bavail fsblkcnt-t)
                  (f-files fsfilcnt-t)
                  (f-free fsfilcnt-t)
                  (nil (array 1024 uchar))))

(with-dyn-lib nil
  (deffi statfs "statfs" int (str (ptr-out statfs))))

Then, interactively:

1> (znew statfs)
#S(statfs f-type 0 f-bsize 0 f-blocks 0 f-bfree 0 f-bavail 0 f-files 0
f-free 0)

Fill it in:

2> (statfs "." *1)
0

Look at filled struct:

3> *1
#S(statfs f-type 61267 f-bsize 4096 f-blocks 116255154 f-bfree 16452244
f-bavail 10529377 f-files 29597696 f-free 19558956)

Kaz Kylheku

Feb 25, 2019, 3:52:39 PM
I agree. Anyone who thinks otherwise should be condemned to using a
function:

int stringly_fstat(int fd, char *buf, size_t).

which provides information about an inode similarly to fstat, but
via a formatted character string.

:)

/proc would be easier to defend if it actually stuck to a common,
consistent syntax everywhere, for which we could write one parser that
would handle all /proc entries.

William Ahern

Feb 25, 2019, 4:15:09 PM
For BSDs principally the sysctl syscall. Linux also implements BSD sysctl
and (AFAIU) it's unified with procfs, except some years ago Red Hat
deprecated and then removed the sysctl syscall from their kernel and most
other distributions followed suit. Which is a shame because procfs has been
a reliable fount of security issues--most recently the Docker runc breakout
from two weeks ago. sysctl has had its share of exploits, but fewer as it's
both a simpler interface and also easier to sandbox.

Rainer Weikusat

Feb 25, 2019, 4:26:43 PM
gaz...@shell.xmission.com (Kenny McCormack) writes:
> In article <87pnrfh...@doppelsaurus.mobileactivedefense.com>,
> Rainer Weikusat <rwei...@talktalk.net> wrote:
> ...
>>There's still no file descriptor this ioctl could work on.
>
> I think the OP may be using the term "ioctl" in a more generic way.

ioctl is the name of a system call performing a control operation on
some file descriptor.

Rainer Weikusat

Feb 25, 2019, 4:42:47 PM
Kaz Kylheku <157-07...@kylheku.com> writes:
> On 2019-02-25, blt_...@8c49s5.com <blt_...@8c49s5.com> wrote:
>> On Mon, 25 Feb 2019 16:02:08 +0000
>> Rainer Weikusat <rwei...@talktalk.net> wrote:
>>>sc...@slp53.sl.home (Scott Lurndal) writes:
>>>> blt_nV...@nlnwgj.ac.uk writes:
>>>
>>>[mapping a port to a set of pids]
>>>
>>>> Parsing isn't that large an overhead on modern systems.
>>>
>>>It's not that there's much parsing to do here. The program below takes a
>>>port number as argument and prints the pids of the processes who are in
>>>posession of a socket listening on that port:
>>
>> If you think there isn't much going on beneath the hood you've clearly
>> never seen the code for the *scanf() functions never mind the overhead of
>> opening and closing all those file descriptors. I'm afraid I've not seen
>> anything that changes my opinion that /proc is a very sub optimal solution
>> for providing process information.
>
> I agree. Anyone who thinks otherwise should be condemned to using a
> function:
>
> int stringly_fstat(int fd, char *buf, size_t).
>
> which provides information about an inode similarly to fstat, but
> via a formatted character string.

Beliefs (or information) about the implementation of sscanf are
tangential to the statement which was made ("not much parsing to do
here").

William Ahern

Feb 25, 2019, 4:57:34 PM
Kaz Kylheku <157-07...@kylheku.com> wrote:
> On 2019-02-25, Kenny McCormack <gaz...@shell.xmission.com> wrote:
>> In article <q50hpg$1r6k$1...@gioia.aioe.org>, <blt_g...@rblq.ac.uk> wrote:
>> ...
>>>>And this ioctl obviously shouldn't be specific to network sockets, it
>>>>should work for any kind of stream (file, pipe, Unix-domain sockets,
>>>>etc.). The parameter specifying what you're looking for would have to be
>>>>somewhat complex to handle this generality.
>>>
>>>So how do you think versions of unix that don't have procfs manage to get
>>>this sort of information to processes? Eg OSX.
>>>
>>
>> They use icky, hard-to-use, callable-only-from-C-or-assembler interfaces.
>
> All modern high languages worth a damn have nice ways to deal with C
> interfaces and structs.
>
> These interfaces are robust: there is no parsing involved. A data
> structure is filled in which has everything at a known offset,
> of a known type/size.
>
> There is no bullshit like a leading zero misinterpreted as octal,
> or some -1 value spewing out FFFFFFFF for what was supposed to be a byte
> in the range 0 to 255.

You mean bullshit like this?

https://www.reddit.com/r/linux/comments/6fjz6z/sudo_vulnerability_gets_fixed_across_multiple/

There were two classes of bullshit involved in that exploit: 1) integer
conversions and 2) field separation. A binary struct would have completely
solved #2 and *mostly* solved #1. Admittedly, there are better ways to
expose such information using procfs--such as a single file per field--but
they're often less convenient for userspace and more complex for the kernel.

Kenny McCormack

Feb 25, 2019, 5:29:15 PM
In article <87h8crh...@doppelsaurus.mobileactivedefense.com>,
Rainer Weikusat <rwei...@talktalk.net> wrote:
>gaz...@shell.xmission.com (Kenny McCormack) writes:
>> In article <87pnrfh...@doppelsaurus.mobileactivedefense.com>,
>> Rainer Weikusat <rwei...@talktalk.net> wrote:
>> ...
>>>There's still no file descriptor this ioctl could work on.
>>
>> I think the OP may be using the term "ioctl" in a more generic way.
>
>ioctl is the name of a system call performing a control operation on
>some file descriptor.

Your last comment has nothing to do with anything.

--
"We should always be disposed to believe that which appears to us to be
white is really black, if the hierarchy of the church so decides."

- Saint Ignatius Loyola (1491-1556) Founder of the Jesuit Order -

Rainer Weikusat

Feb 25, 2019, 6:04:34 PM
gaz...@shell.xmission.com (Kenny McCormack) writes:
> Rainer Weikusat <rwei...@talktalk.net> wrote:
>>gaz...@shell.xmission.com (Kenny McCormack) writes:
>>> In article <87pnrfh...@doppelsaurus.mobileactivedefense.com>,
>>> Rainer Weikusat <rwei...@talktalk.net> wrote:
>>> ...
>>>>There's still no file descriptor this ioctl could work on.
>>>
>>> I think the OP may be using the term "ioctl" in a more generic way.
>>
>>ioctl is the name of a system call performing a control operation on
>>some file descriptor.
>
> Your last comment has nothing to do with anything.

At least, that's what you claim to have understood of it.

ioctl has a defined technical meaning on UNIX (see above), hence,
someone who proposes "to use ioctl for something" can be reasonably
expected to answer questions related to ... er ... using ioctl. Such as
the need for a file descriptor.

blt_fw...@tf2ebf5kat17xw16bss.co.uk

Feb 26, 2019, 4:40:39 AM
Why not? When I'm setting flags on an interface using
ioctl(sock,SIOCSIFFLAGS,&ifr) that socket is just any old datagram socket
that presumably provides a gateway to the networking subsystem, it is not
linked to the interface in any way. Why should the above be any different?
But if you're worried about ioctl() then provide some system call that doesn't
require a socket at all - just give it the interface id and port number and
it returns the pid.

blt_7...@u66zodekuzem.com

Feb 26, 2019, 4:58:31 AM
On Mon, 25 Feb 2019 17:54:14 +0000
Rainer Weikusat <rwei...@talktalk.net> wrote:
>blt_...@8c49s5.com writes:
>> On Mon, 25 Feb 2019 16:02:08 +0000
>> Rainer Weikusat <rwei...@talktalk.net> wrote:
>>>sc...@slp53.sl.home (Scott Lurndal) writes:
>>>> blt_nV...@nlnwgj.ac.uk writes:
>>>
>>>[mapping a port to a set of pids]
>>>
>>>> Parsing isn't that large an overhead on modern systems.
>>>
>>>It's not that there's much parsing to do here. The program below takes a
>>>port number as argument and prints the pids of the processes who are in
>>>posession of a socket listening on that port:
>>
>> If you think there isn't much going on beneath the hood you've clearly
>> never seen the code for the *scanf() functions
>
>If you think you know what I think based on you not understanding
>something I wrote, better think again because I think you're very likely
>going to end up being very wrong.
>
>The only (sort-of) parsing of note here is analysis of the /proc/net/tcp
>lines. These are fixed width, hence, it boils down to doing a 4 char
>comparison in a certain position per read line, possibly followed by
>doing a single string-to-int conversion. As that was rather besides the
>point, I've used sscanf for that because it was the easiest to use.

It's also very slow.

>In general, converting a n digit decimal string to a number requires
>loading the digit character and subtracting 0 from it for every digit
>plus n - 1 multiplications with 10 and n - 1 additions.

Wow, who knew? Except you're forgetting that every time scanf is called it
has to parse its format string first.

Given the increasing amount of energy used by the world's data centres it is
IMO incumbent upon us developers to develop code that is as efficient - but
still readable - as possible, not just fling together code lego brick style.

>> never mind the overhead of opening and closing all those file
>> descriptors.
>
>This is nothing to do with "parsing". It's reading of directory
>contents.

Parsing, scanning, whatever, it's just semantics. Parsing a directory tree is
neither quick nor efficient.

>> I'm afraid I've not seen anything that changes my opinion that /proc
>> is a very sub optimal solution for providing process information.
>
>The thing is "it doesn't matter".

It does. See above.

Jorgen Grahn

Feb 26, 2019, 6:44:57 AM
On Mon, 2019-02-25, blt_aCS...@gba1ashx2sask4hya.ac.uk wrote:
> On Mon, 25 Feb 2019 14:38:24 GMT
> sc...@slp53.sl.home (Scott Lurndal) wrote:
...
>>Returning binary data rather than printable (and thus parsable) data
>>turned out to be problematic when new data needed to be added to the
>>return; text makes this process easier because the data is free-form
>>(often returned as key-value pairs).
>
> And what happens when the format of this free form text changes? It
> potentially breaks every program that uses it and Linux has form
> here.

SL argues above that such a change is likely to be backwards-
compatible. And indeed both Unix and the Internet protocols tend to
be text-based.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Rainer Weikusat

Feb 26, 2019, 8:43:31 AM
blt_7...@u66zodekuzem.com writes:
> Rainer Weikusat <rwei...@talktalk.net> wrote:
>>blt_...@8c49s5.com writes:
>>> On Mon, 25 Feb 2019 16:02:08 +0000
>>> Rainer Weikusat <rwei...@talktalk.net> wrote:
>>>>sc...@slp53.sl.home (Scott Lurndal) writes:
>>>>> blt_nV...@nlnwgj.ac.uk writes:
>>>>
>>>>[mapping a port to a set of pids]
>>>>
>>>>> Parsing isn't that large an overhead on modern systems.
>>>>
>>>>It's not that there's much parsing to do here. The program below takes a
>>>>port number as argument and prints the pids of the processes who are in
>>>>posession of a socket listening on that port:
>>>
>>> If you think there isn't much going on beneath the hood you've clearly
>>> never seen the code for the *scanf() functions
>>
>>If you think you know what I think based on you not understanding
>>something I wrote, better think again because I think you're very likely
>>going to end up being very wrong.
>>
>>The only (sort-of) parsing of note here is analysis of the /proc/net/tcp
>>lines. These are fixed width, hence, it boils down to doing a 4 char
>>comparison in a certain position per read line, possibly followed by
>>doing a single string-to-int conversion. As that was rather besides the
>>point, I've used sscanf for that because it was the easiest to use.
>
> Its also very slow.

... and still completely irrelevant, see other postings.

BTW, I've - so far - never used stdio in production code and I don't
expect I ever will.

Rainer Weikusat

Feb 26, 2019, 8:47:32 AM
Considering the ever decreasing speed of computers (and - more
importantly - But that's not how Microsoft does it !!1) this is about to
change because it urgently must!

Nobody ever changed his opinion just because it didn't make any sense
:-)

Casper H.S. Dik

Feb 26, 2019, 9:32:35 AM
gaz...@shell.xmission.com (Kenny McCormack) writes:

>In article <5c74295b$0$22351$e4fe...@news.xs4all.nl>,
>Casper H.S. Dik <Caspe...@OrSPaMcle.COM> wrote:
>...
>>>And wish they had /proc, like modern Unix (aka, Linux) does.
>>
>>Though Solaris has /proc, it does not adds anything other then
>>process specific data, though in Solaris 11.4 that does include
>>information about file descriptors such as most or all things
>>you want to know of a socket.

>Just FYI - and I have no desire to debate this furher - I am familiar with
>Solaris /proc and have written programs that use it. I think we can agree
>that Linux /proc is better, but I suppose one could debate "At what cost?".

Hm, I'm not sure that is true. (Solaris procfs allows complete manipulation
and not just information)

I'm not quite sure that having the kernel convert everything to
ASCII and processes like top convert it back is an improvement.

I think it is more a question of philosophy if anything.

Casper

James K. Lowden

Feb 26, 2019, 11:30:38 AM
On Tue, 26 Feb 2019 09:58:28 +0000 (UTC)
blt_7...@u66zodekuzem.com wrote:

> Given the increasing amount of energy used by the worlds data centres
> it is IMO beholden upon us developers to develop code that is as
> efficient - but still readable - as possible, not just fling together
> code lego brick style.

You are kidding, right? With all the web schlock out there, C
programmers are supposed to protect the environment by avoiding
scanf?

--jkl

Kaz Kylheku

Feb 26, 2019, 11:43:14 AM
On 2019-02-26, Jorgen Grahn <grahn...@snipabacken.se> wrote:
> On Mon, 2019-02-25, blt_aCS...@gba1ashx2sask4hya.ac.uk wrote:
>> On Mon, 25 Feb 2019 14:38:24 GMT
>> sc...@slp53.sl.home (Scott Lurndal) wrote:
> ...
>>>Returning binary data rather than printable (and thus parsable) data
>>>turned out to be problematic when new data needed to be added to the
>>>return; text makes this process easier because the data is free-form
>>>(often returned as key-value pairs).
>>
>> And what happens when the format of this free form text changes? It
>> potentially breaks every program that uses it and Linux has form
>> here.
>
> SL argues above that such a change is likely to be backwards-
> compatible.

Making a struct backwards compatible is fairly easy; just add new fields
at the end. Leave padding from the outset so the size doesn't change, or
else have a simple protocol like a size field at the front, or a size
argument passed with the struct.

Historically, the "struct sockaddr" family of binary types has easily
been extended to new protocols.
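
A minimal sketch of the size-field protocol, assuming a made-up
struct sock_info and a made-up provider get_sock_info (neither is a
real kernel interface). The caller states how large its struct is, and
the provider copies no more than that, so code compiled against the
older layout keeps working after a field is appended.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical info struct; v2 appended `flags` after `port`. */
struct sock_info {
    size_t size;        /* filled in by the caller: size of its struct */
    int pid;
    int port;
    int flags;          /* added in v2 */
};

/* Size of the v1 layout, which ended with `port`. */
#define SOCK_INFO_V1_SIZE offsetof(struct sock_info, flags)

/*
 * Hypothetical provider: copies only as many bytes as the caller
 * declared, so old callers never see the newer fields.
 * Returns 0 on success, -1 if the caller's struct is too small.
 */
static int get_sock_info(struct sock_info *out)
{
    struct sock_info cur = { sizeof(cur), 1234, 80, 1 };
    size_t n = out->size;

    if (n < SOCK_INFO_V1_SIZE) return -1;
    if (n > sizeof(cur)) n = sizeof(cur);

    memcpy(out, &cur, n);
    out->size = n;      /* tell the caller how much was filled in */
    return 0;
}
```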

Kaz Kylheku

Feb 26, 2019, 11:55:29 AM
Well, some of that web schlock can actually handle the conversion of an
out-of-range integer without undefined behavior. :)
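
For reference, the scanf family leaves the behavior undefined when the
converted value does not fit the target object, whereas strtol reports
overflow through errno. A minimal sketch of a range-checked conversion,
with a made-up helper name:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: convert a decimal string to int.
 * Returns 0 and stores the value on success; -1 if the text does
 * not start with a number or the value does not fit in an int.
 */
static int parse_int(const char *s, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s) return -1;    /* no digits at all */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX) return -1;

    *out = (int)v;
    return 0;
}
```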

blt_6r...@wxgiu9p8cxixvav4f.net

Feb 26, 2019, 12:03:42 PM
On 26 Feb 2019 11:44:54 GMT
Jorgen Grahn <grahn...@snipabacken.se> wrote:
>On Mon, 2019-02-25, blt_aCS...@gba1ashx2sask4hya.ac.uk wrote:
>> On Mon, 25 Feb 2019 14:38:24 GMT
>> sc...@slp53.sl.home (Scott Lurndal) wrote:
>....
>>>Returning binary data rather than printable (and thus parsable) data
>>>turned out to be problematic when new data needed to be added to the
>>>return; text makes this process easier because the data is free-form
>>>(often returned as key-value pairs).
>>
>> And what happens when the format of this free form text changes? It
>> potentially breaks every program that uses it and Linux has form
>> here.
>
>SL argues above that such a change is likely to be backwards-
>compatible. And indeed both Unix and the Internet protocols tend to
>be text-based.

TCP/IP is text based is it? When did this happen?

blt...@buln3jqkuqmn.co.uk

Feb 26, 2019, 12:04:44 PM
On Tue, 26 Feb 2019 11:30:35 -0500
"James K. Lowden" <jklo...@speakeasy.net> wrote:
It was a general point about using the most efficient tools for the job,
not necessarily the easiest just so you can leave work a bit earlier one day.


Rainer Weikusat

Feb 26, 2019, 12:22:28 PM
Casper H.S. Dik <Caspe...@OrSPaMcle.COM> writes:
> gaz...@shell.xmission.com (Kenny McCormack) writes:
>>Casper H.S. Dik <Caspe...@OrSPaMcle.COM> wrote:
>>...
>>>>And wish they had /proc, like modern Unix (aka, Linux) does.
>>>
>>>Though Solaris has /proc, it does not adds anything other then
>>>process specific data, though in Solaris 11.4 that does include
>>>information about file descriptors such as most or all things
>>>you want to know of a socket.
>
>>Just FYI - and I have no desire to debate this furher - I am familiar with
>>Solaris /proc and have written programs that use it. I think we can agree
>>that Linux /proc is better, but I suppose one could debate "At what cost?".
>
> Hm, I'm not sure that is true. (Solaris procfs allows complete manipulation
> and not just information)
>
> I'm not quite sure that having the kernel convetring everything to
> ascii and processes like top converting it back.
>
> I think it more a question of philosophy if anything.

The idea is obviously that it's supposed to be possible to use the
information easily in environments where text processing is easy. This
would be interactive shells (or shell scripts) and higher-level
languages than C (including shell and Perl scripts). Ie, this was
designed based on the assumption that the information will mainly be
usuable to humans who'll mostly do fairly simple ad hoc programming if
at all or at least that this is also an important use case.

E.g., the quick way to find the processes owning a certain listening
socket is something like

[root@doppelsaurus]~ #printf "%x" 80
50[root@doppelsaurus]~ #grep :0050 /proc/net/tcp
1: 00000000:0050 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 4820 1 ffff880230740b40 100 0 0 10 0
[root@doppelsaurus]~ #find /proc/*/fd -ls | grep 4820
12504 0 lrwx------ 1 root root 64 Feb 26 15:09 /proc/3142/fd/3 -> socket:[4820]
12575 0 lrwx------ 1 root root 64 Feb 26 15:09 /proc/3219/fd/3 -> socket:[4820]
12612 0 lrwx------ 1 root root 64 Feb 26 15:09 /proc/3221/fd/3 -> socket:[4820]
12649 0 lrwx------ 1 root root 64 Feb 26 15:09 /proc/3223/fd/3 -> socket:[4820]
12686 0 lrwx------ 1 root root 64 Feb 26 15:09 /proc/3225/fd/3 -> socket:[4820]
12723 0 lrwx------ 1 root root 64 Feb 26 15:09 /proc/3227/fd/3 -> socket:[4820]

I had to solve this problem based on BSD netstat and fstat in the past
and I would have greatly preferred the above, not least because
this was a customer-visible issue (listening socket leaked to random
process due to missing FD_CLOEXEC).
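
For anyone who wants the same two steps from C rather than from the shell,
here is a rough sketch. It is an illustration only and makes several
assumptions: IPv4 lines from /proc/net/tcp or /proc/net/udp in the layout
shown in the transcript, and the function names (parse_tcp_line,
parse_socket_link, print_socket_owners) are made up for this example. It
also needs enough privilege to read other users' /proc/<pid>/fd.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <unistd.h>

/* Parse one IPv4 data line of /proc/net/tcp (or /proc/net/udp).
 * Stores the local port and the socket inode and returns 1 on success,
 * 0 for the header line or anything else that does not match. */
int parse_tcp_line(const char *line, unsigned *port, unsigned long *inode)
{
    assert(line && port && inode);
    /* sl local:port remote:port st tx:rx tr:when retrnsmt uid timeout inode */
    return sscanf(line,
                  "%*d: %*x:%x %*x:%*x %*x %*x:%*x %*x:%*x %*x %*u %*u %lu",
                  port, inode) == 2;
}

/* Return 1 and store the inode if a /proc/<pid>/fd link target has the
 * form "socket:[4820]", else return 0. */
int parse_socket_link(const char *target, unsigned long *inode)
{
    assert(target && inode);
    return sscanf(target, "socket:[%lu]", inode) == 1;
}

/* Print every pid/fd pair that refers to the given socket inode.
 * Returns the number of matches, or -1 if /proc cannot be opened. */
int print_socket_owners(unsigned long wanted)
{
    DIR *proc = opendir("/proc");
    struct dirent *p;
    int found = 0;

    if (proc == NULL)
        return -1;
    while ((p = readdir(proc)) != NULL) {
        char path[300];
        DIR *fds;
        struct dirent *f;

        if (!isdigit((unsigned char)p->d_name[0]))
            continue;                       /* not a process directory */
        snprintf(path, sizeof path, "/proc/%s/fd", p->d_name);
        if ((fds = opendir(path)) == NULL)
            continue;                       /* process gone or no permission */
        while ((f = readdir(fds)) != NULL) {
            char link[600], target[64];
            unsigned long inode;
            ssize_t n;

            snprintf(link, sizeof link, "/proc/%s/fd/%s",
                     p->d_name, f->d_name);
            n = readlink(link, target, sizeof target - 1);
            if (n < 0)
                continue;                   /* ".", ".." or a vanished fd */
            target[n] = '\0';
            if (parse_socket_link(target, &inode) && inode == wanted) {
                printf("pid %s fd %s\n", p->d_name, f->d_name);
                found++;
            }
        }
        closedir(fds);
    }
    closedir(proc);
    return found;
}
```

A caller would read /proc/net/tcp line by line with fgets(), pass each line
to parse_tcp_line(), and hand the inode belonging to the wanted port to
print_socket_owners(). For port 80 in the transcript above that inode is 4820.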


Rainer Weikusat

Feb 26, 2019, 12:35:48 PM
That's not going to help wrt code reading (somehow) older versions of
the structure from some source. OTOH, code processing
(newline-delimited) name-value pairs will usually just continue to
work if new data is added.
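
A small sketch of what that looks like in C, assuming text in the
"Name:<whitespace>value" style of /proc/<pid>/status. The function name
lookup_long and the sample keys are made up for illustration. The reader
asks for the key it knows and skips every line it does not recognise, so
lines added later do not disturb it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Search newline-delimited "key: value" text for `key` and parse its value
 * as a decimal long. Returns 1 on success, 0 if the key is missing or has
 * no numeric value. Lines with other keys are skipped. */
int lookup_long(const char *text, const char *key, long *value)
{
    size_t klen;

    assert(text && key && value);
    klen = strlen(key);
    while (*text != '\0') {
        const char *eol = strchr(text, '\n');

        if (strncmp(text, key, klen) == 0 && text[klen] == ':') {
            const char *p = text + klen + 1;
            char *end;

            while (*p == ' ' || *p == '\t')
                p++;
            if (*p == '\n' || *p == '\0')
                return 0;               /* key present, value empty */
            *value = strtol(p, &end, 10);
            return end != p;
        }
        if (eol == NULL)
            break;                      /* last line had no newline */
        text = eol + 1;
    }
    return 0;
}
```

The same call works on an older and a newer version of the text, as long as
the keys the caller relies on keep their names and meaning. Removing or
renaming a key is a different matter, of course.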

Scott Lurndal

Feb 26, 2019, 1:04:29 PM
That is, of course, misstating what was written. The majority
of frequently used internet _services_ running on TCP do happen to be text based
(SMTP, NNTP, HTTP, etc) absent any underlying cryptography (such as
IPSEC).

blt_o...@r12.gov.uk

Feb 26, 2019, 2:50:28 PM
On Tue, 26 Feb 2019 18:04:25 GMT
sc...@slp53.sl.home (Scott Lurndal) wrote:
>blt_6r...@wxgiu9p8cxixvav4f.net writes:
>>On 26 Feb 2019 11:44:54 GMT
>>Jorgen Grahn <grahn...@snipabacken.se> wrote:
>>>On Mon, 2019-02-25, blt_aCS...@gba1ashx2sask4hya.ac.uk wrote:
>>>> On Mon, 25 Feb 2019 14:38:24 GMT
>>>> sc...@slp53.sl.home (Scott Lurndal) wrote:
>>>....
>>>>>Returning binary data rather than printable (and thus parsable) data
>>>>>turned out to be problematic when new data needed to be added to the
>>>>>return; text makes this process easier because the data is free-form
>>>>>(often returned as key-value pairs).
>>>>
>>>> And what happens when the format of this free form text changes? It
>>>> potentially breaks every program that uses it and Linux has form
>>>> here.
>>>
>>>SL argues above that such a change is likely to be backwards-
>>>compatible. And indeed both Unix and the Internet protocols tend to
>>>be text-based.
>>
>>TCP/IP is text based is it? When did this happen?
>>
>
>That is, of course, misstating what was written. The majority
>of frequently used internet _services_ running on TCP do happen to be text
>based
>(SMTP, NNTP, HTTP, etc) absent any underlying cryptography (such as
>IPSEC).

HTTP is a good example of how not to design a data transfer protocol. Verbose,
and too reliant on newline delimiters with a large processing penalty for
decoding the bloated header.

Scott Lurndal

Feb 26, 2019, 3:04:54 PM
Feel free to design a replacement. Good luck.

Kaz Kylheku

Feb 26, 2019, 3:08:04 PM
If new data is added, you're not reading an "old source", so that is not
analogous to using an older structure.

A character string from an old source will be missing expected fields,
and it's not hard to imagine a non-robust processor doing something
stupid, like failing to extract values into some variables, yet using
those variables anyway.

If I have an API like this:

int fill_info(int handle, struct info *pi, size_t size);

The API can tell that the client is newer:

int fill_info(int handle, struct info *pi, size_t size)
{
    if (size > sizeof *pi) /* client is newer than we are! */
    {
    }
}


In this case, the API can do something reasonable, like perhaps fill the
excess area indicated by the client with zero bits, so the client
doesn't access garbage values:

int fill_info(int handle, struct info *pi, size_t size)
{
    if (size > sizeof *pi) /* client is newer than we are! */
    {
        memset(pi + 1, 0, size - sizeof *pi);
    }
}

We can also easily return a value to the caller which will let them
know that they are running on the older API with the smaller structure.

Rather than inferring this fact from parsing, the API client knows it
directly by comparing two simple numbers.
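
Filled out into something that compiles, the idea might look like the sketch
below. The field names, the struct info_v2 "newer client" layout and the
stand-in values are all invented for illustration. Unlike the fragment
above, this version zeroes the whole caller buffer and then copies in the
part it knows, which also covers a client that is older (smaller) than the
library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct info {            /* layout this (older) library was built with */
    int pid;
    int port;
};

struct info_v2 {         /* hypothetical newer client header: one field appended */
    int pid;
    int port;
    long inode;
};

/* Fill the caller's buffer of `size` bytes. Returns the number of bytes
 * that carry real data, so the caller can compare it with its own sizeof. */
int fill_info(int handle, struct info *pi, size_t size)
{
    struct info cur;
    size_t known;

    assert(pi != NULL);
    cur.pid = handle;    /* stand-in values; a real API would look these up */
    cur.port = 80;

    known = size < sizeof cur ? size : sizeof cur;
    memset(pi, 0, size);         /* fields we don't know about read as zero */
    memcpy(pi, &cur, known);     /* an older, smaller client gets a prefix */
    return (int)known;
}
```

A client built against the newer header would call
fill_info(h, (struct info *)&v2, sizeof v2) and, on seeing a return value
smaller than sizeof v2, know that the trailing fields were not supplied.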

Rainer Weikusat

Feb 26, 2019, 3:17:02 PM
As I know from personal experience, it's possible to run an
HTTP-processing engine which fully parses this "bloated header" on a
133 MHz CPU without introducing measurable request latency (JFTR: I wrote
this program).

Further, the HTTP protocol was originally designed for 25 MHz 68030 NeXT
computers.

Could you perhaps explain for once how you arrived at the conclusion that it
MUST pose a processing problem for current 3.4 GHz CPUs? Minus repeating
the party line, I mean.

Rainer Weikusat

Feb 26, 2019, 3:18:04 PM
Has already happened a long time ago, IOW, "Google's very own version of
SCP" has existed for a while.

James K. Lowden

Feb 26, 2019, 4:45:57 PM
On Tue, 26 Feb 2019 17:04:39 +0000 (UTC)
blt...@buln3jqkuqmn.co.uk wrote:

> On Tue, 26 Feb 2019 11:30:35 -0500
> "James K. Lowden" <jklo...@speakeasy.net> wrote:
> >On Tue, 26 Feb 2019 09:58:28 +0000 (UTC)
> >blt_7...@u66zodekuzem.com wrote:
> >
> >With all the web schlock out there, C programmers are supposed to
> >protect the environment by avoiding scanf?
>
> It was a general point about using the most efficient tools for the
> job, not necessarily the easiest just so you can leave work a bit
> earlier one day.

I daresay that leaving a bit earlier one day is the goal, and should
be. It's called "productivity". Otherwise we'd all program in machine
code, and awk wouldn't exist.

(I'm not defending any terrible choice you can imagine or cite. I am
saying every choice to do something now to save something in the future
-- including choosing a lower-level function over a higher-level one --
involves an implicit and usually unmeasured assessment of return on
investment. Very often, YAGNI.)

I have often surprised people by demonstrating database bulk loads that
are primarily constrained not by the network, not by the speed of
the server, and not by parsing, but by floating point conversion. Yet
and still, it was never worth the company's time to have me write a
more efficient loading utility, supposing I could.

I don't doubt there must be situations in which converting strings to
floating point must be optimized, because for every operation X there
is a process for which the performance of X is critical. But I submit
that situation is rare for sscanf. Other considerations, including
ease of use and idiomatic use are usually more important.

IMO that line of thinking applies to the tempest in a teapot that is
this thread, too. When is the efficiency of discovering information
from procfs important?

--jkl

blt_n...@rqc8him4ifj3azal.edu

Feb 27, 2019, 4:08:12 AM
On Tue, 26 Feb 2019 20:04:50 GMT
Apparently you've never heard of ftp.

blt_...@q8cjb6ykysvqcfi34ybd.gov.uk

Feb 27, 2019, 4:12:15 AM
On Tue, 26 Feb 2019 20:16:58 +0000
Rainer Weikusat <rwei...@talktalk.net> wrote:
>blt_o...@r12.gov.uk writes:
>> HTTP is a good example of how not to design a data transfer protocol.
>Verbose,
>> and too reliant on newline delimiters with a large processing penalty for
>> decoding the bloated header.
>
>As I know from personal experience, it's possible to run a
>HTTP-processing engine which fully parses this "bloated header" on a
>133Mhz CPU without introducing measurable request latency (JFTR: I wrote
>this program).
>
>Further, the HTTP-protocol was originally designed for 68030 25 Mhz NeXT
>computers.
>
>Could you perhaps once explain how you arrived at the conclusion that it
>MUST pose a processing problem for current 3.4 Ghz CPUs? Minus repeating
>the party line, I mean.

You're conflating problem with efficiency. HTTP is not efficient and given
the billions of HTTP headers that must be parsed per second it adds up.
Also the HTTP that was used on NeXT back in the early 90s is a very different
beast to the bloated nightmare that exists today.

Scott Lurndal

Feb 27, 2019, 9:13:04 AM
I wasn't aware that FTP was designed as a replacement for HTTP. Nor
does it suffice to replace HTTP. Nor is it a simple or straightforward
protocol.

https://tools.ietf.org/html/rfc959

blt__r...@w0f4hn78ih.com

Feb 27, 2019, 9:57:14 AM
On Wed, 27 Feb 2019 14:13:00 GMT
sc...@slp53.sl.home (Scott Lurndal) wrote:
>blt_n...@rqc8him4ifj3azal.edu writes:
>>On Tue, 26 Feb 2019 20:04:50 GMT
>>sc...@slp53.sl.home (Scott Lurndal) wrote:
>
>>>>>That is, of course, misstating what was written. The majority
>>>>>of frequently used internet _services_ running on TCP do happen to be text
>>>>>based
>>>>>(SMTP, NNTP, HTTP, etc) absent any underlying cryptography (such as
>>>>>IPSEC).
>>>>
>>>>HTTP is a good example of how not to design a data transfer protocol.
>Verbose,
>>>
>>>>and too reliant on newline delimiters with a large processing penalty for
>>>>decoding the bloated header.
>>>>
>>>
>>>Feel free to design a replacement. Good luck.
>>
>>Apparently you've never heard of ftp.
>>
>
>I wasn't aware that FTP was designed as a replacement for HTTP. Nor
>does it suffice to replace HTTP. Nor is it a simple or straightforward
>protocol.

It's simpler than HTTP and for data transfer it works fine. For small data
packets tftp is even better.


Rainer Weikusat

Feb 27, 2019, 10:39:43 AM
I'm not conflating anything, these persistent attempts to replace
arguments with misinterpretations and preaching start to become a bit
annoying. Time for some numbers, please. For some example use-case,
what's the specific difference and why does it matter?

Barry Margolin

Feb 27, 2019, 12:24:03 PM
The value of text-based protocols like HTTP, SMTP, FTP, etc. is that
they're easy to debug during protocol development and test interactively
in production. I've done "telnet www.somewhere.com 80" and "telnet
smtp.example.com 25" many times when trying to troubleshoot problems.

Binary protocols are more efficient, and it's good that protocols like
DNS and NTP are binary because they need to be fast. But for protocols
that transfer bulk data, the transaction overhead is a relatively minor
cost, and doesn't need to be prioritized for optimization.
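
To make the "type it into telnet" point concrete, here is a small C sketch
of the text a client sends and of reading back the status line. The
function names (build_head_request, parse_status) are invented for this
example, and a real client would of course need far more error handling.

```c
#include <assert.h>
#include <stdio.h>

/* Format the request one might type by hand into "telnet host 80".
 * Returns the snprintf() result: the length needed, excluding the NUL. */
int build_head_request(char *buf, size_t cap, const char *host)
{
    assert(buf && host);
    return snprintf(buf, cap, "HEAD / HTTP/1.0\r\nHost: %s\r\n\r\n", host);
}

/* Extract the numeric status from a line such as "HTTP/1.1 404 Not Found".
 * Returns the code, or -1 if the line does not look like a status line. */
int parse_status(const char *line)
{
    int major, minor, code;

    assert(line != NULL);
    if (sscanf(line, "HTTP/%d.%d %d", &major, &minor, &code) != 3)
        return -1;
    return code;
}
```

Everything on the wire here is printable, which is exactly why a human with
nothing but telnet can play either side of the conversation.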

--
Barry Margolin, bar...@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***

Kaz Kylheku

Feb 27, 2019, 12:26:22 PM
We could design a binary replacement for HTTP by keeping HTTP's logic
as-is, and only changing its wire representation. We "compile" what
would be the textual syntax into binary forms. Instead of cr-lf
delimiting, we have length + data. Instead of the text "404" we have a
16 bit unsigned integer in network endian. And so on. Anything above
this representation layer is unchanged, or mostly so.

So, for instance, a web server now has to use different code for
converting the request headers to a dictionary data structure. That
structure itself stays the same, so the web applications on top of the
server don't have to change.
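
A minimal sketch of that representation layer in C, purely as an
illustration: the helper names (put_u16, get_u16, put_field) and the choice
of a 16-bit length prefix are assumptions made for this example, not part of
any real protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 16-bit unsigned integer in network (big-endian) byte order.
 * Returns the number of bytes written, which is always 2. */
size_t put_u16(unsigned char *buf, uint16_t v)
{
    assert(buf != NULL);
    buf[0] = (unsigned char)(v >> 8);
    buf[1] = (unsigned char)(v & 0xff);
    return 2;
}

/* Read back a 16-bit big-endian integer. */
uint16_t get_u16(const unsigned char *buf)
{
    assert(buf != NULL);
    return (uint16_t)((buf[0] << 8) | buf[1]);
}

/* Write a string as length + data instead of CR-LF terminated text.
 * Returns the bytes written, or 0 if the field does not fit. */
size_t put_field(unsigned char *buf, size_t cap, const char *s)
{
    size_t len;

    assert(buf && s);
    len = strlen(s);
    if (len > UINT16_MAX || len + 2 > cap)
        return 0;
    put_u16(buf, (uint16_t)len);
    memcpy(buf + 2, s, len);
    return len + 2;
}
```

With these, the status "404 Not Found" becomes the two bytes 0x01 0x94
followed by a length-prefixed reason string. The receiver no longer scans
for delimiters, but nobody can read the result in a terminal either.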

--
TXR Programming Language: http://nongnu.org/txr
Music DIY Mailing List: http://www.kylheku.com/diy
ADA MP-1 Mailing List: http://www.kylheku.com/mp1

Kaz Kylheku

Feb 27, 2019, 2:39:06 PM
Inefficient is inefficient. Inefficient on 68030 is inefficient on Intel
Core i7.

If all we want from the modern machine is the same amount of work output
as we wanted from the 68030, then the TCP/IP stack could be written in
Python!

Sure, we have emulators for old machines written in Javascript that run as
well as the old machines; perhaps better.

But if we want a modern work output from the modern machine, to make it
do what it was built for, then the efficiency matters.

Think of a high volume server that has to handle a huge volume of HTTP
requests (and generates its own to other servers).

We still compile C with -O2 and worry about cycles wasted acquiring a
mutex and such, same as 25 years ago.

James K. Lowden

Feb 27, 2019, 3:19:00 PM
On Wed, 27 Feb 2019 12:23:56 -0500
Barry Margolin <bar...@alum.mit.edu> wrote:

> > HTTP is a good example of how not to design a data transfer
> > protocol. Verbose, and too reliant on newline delimiters with a
> > large processing penalty for decoding the bloated header.
>
> The value of text-based protocols like HTTP, SMTP, FTP, etc. is that
> they're easy to debug during protocol development and test
> interactively in production. I've done "telnet www.somewhere.com 80"
> and "telnet smtp.example.com 25" many times when trying to
> troubleshoot problems.

That, in no small part, accounts for the success of the World Wide
Web. When TBL was inventing HTTP and HTML, we already had X running
over TCP. Displaying data from a remote host was no challenge. But
HTTP made it easy to get started.

The most important kind of efficiency is the kind that moves something
from not-done to done. Machine efficiency has taken a back seat to
that since before Fortran was invented.

--jkl


Rainer Weikusat

Feb 27, 2019, 3:44:31 PM
And the other way round: Efficient enough for 68030 will certainly be
efficient enough for Intel Core i7. There's no absolute criterion here
and nothing has to be as efficient as possible, only as efficient as
necessary.

[...]

> Think of a high volume server that has to handle a huge volume of HTTP
> requests (and generates its own to other servers).

An infinite amount of HTTP-traffic will contain an infinite amount of
avoidable overhead. But that's not a sensible frame of reference. The
real question is "does it scale well enough to be usable in situation
X", X being some specific use-case.

Related question: What's the processing cost in absolute terms?


Nicolas George

Feb 28, 2019, 4:42:52 AM
blt_n...@rqc8him4ifj3azal.edu, dans le message
<q55k1o$134n$1...@gioia.aioe.org>, a écrit :
>>>HTTP is a good example of how not to design a data transfer protocol. Verbose,
>>
>>>and too reliant on newline delimiters with a large processing penalty for
>>>decoding the bloated header.
>>Feel free to design a replacement. Good luck.
> Apparently you've never heard of ftp.

Thanks for reminding us once in a while that you have no credibility
whatsoever.

Kenny McCormack

Feb 28, 2019, 6:02:41 AM
In article <5c77ad19$0$31407$426a...@news.free.fr>,
You must really have been neglected as a child.

--

Prayer has no place in the public schools, just like facts
have no place in organized religion.
-- Superintendent Chalmers

Nicolas George

Feb 28, 2019, 6:06:00 AM
Kenny McCormack, dans le message <q58f4e$h2n$1...@news.xmission.com>, a
écrit :
> You must really have been neglected as a child.

Oh, so you are incompetent in psychology too...

blt_4...@iiuq3_ghp7dxggdjy8u6.gov.uk

Feb 28, 2019, 2:48:11 PM
On 28 Feb 2019 09:42:49 GMT
Well you would certainly know about having no credibility.

blt_...@4_usnn0mhdhgnze.com

Feb 28, 2019, 2:50:51 PM
And attitudes like that are why environments like Java became popular, taking
2 GB of memory just to start up and bringing a 4 core system to its knees
during heavy processing. It's the same attitude the Americans have to car
engines - why get more power from something smaller and more efficient when
you can just stick a 6.0 V8 in front. Who cares about fuel consumption, fuel is
cheap right?

Rainer Weikusat

Feb 28, 2019, 3:48:38 PM
> And attitudes like that are [more random nonsense]

I've so far assumed that you keep insulting me based on random nonsense
you're making up on the fly (all your guesses have been laughably wrong)
because of some innocent misunderstanding.

I'm now going to assume that this is an intentional "discussion tactic"
as you don't have any arguments to support your position and prefer to
attack people not sharing it to cover it.

"Have a pleasant, remaining life".

Scott Lurndal

Feb 28, 2019, 4:55:27 PM
Rainer Weikusat <rwei...@talktalk.net> writes:
>blt__Gqn@4_usnn0mhdhgnze.com writes:

>I'm now going to assume that this is an intentional "discussion tactic"
>as you don't have any arguments to support your position and prefer to
>attack people not sharing it to cover it.
>
>"Have a pleasant, remaining life".

Took you long enough. Anyone who posts with a different name each
time is in the "troll" category to me.

Barry Margolin

Mar 1, 2019, 12:07:39 PM
In article <201902270...@kylheku.com>,
Kaz Kylheku <157-07...@kylheku.com> wrote:

> On 2019-02-27, blt_n...@rqc8him4ifj3azal.edu
> <blt_n...@rqc8him4ifj3azal.edu> wrote:
> > On Tue, 26 Feb 2019 20:04:50 GMT
> > sc...@slp53.sl.home (Scott Lurndal) wrote:
> >>Feel free to design a replacement. Good luck.
> >
> > Apparently you've never heard of ftp.
>
> We could design a binary replacement for HTTP by keeping HTTP's logic
> as-is, and only changing its wire representation. We "compile" what
> would be the textual syntax into binary forms. Instead of cr-lf
> delimiting, we have length + data. Instead of the text "404" we have a
> 16 bit unsigned integer in network endian. And so on. Anything above
> this representation layer is unchanged, or mostly so.

Do you remember all the ISO OSI application protocols that were designed
in the 70's and 80's as competitors for the TCP/IP protocols that came
from the earlier Arpanet protocols? OSI was all binary, and the
protocols were also overly complex. Hardly any of them succeeded,
probably because the text-based protocols were easier to rapidly prototype
and get into production.

Barry Margolin

Mar 1, 2019, 12:13:54 PM
In article <201902270...@kylheku.com>,
Kaz Kylheku <157-07...@kylheku.com> wrote:

> Inefficient is inefficient. Inefficient on 68030 is inefficient on Intel
> Core i7.

Software doesn't have to be as fast as possible, it only has to be "fast
enough". That can be achieved by using more efficient algorithms or by
making the hardware faster.

It's true that there's nothing wrong with doing both, but at some point
you get diminishing returns. If you're watching a video and it doesn't
stop periodically and say "buffering", the protocol is efficient enough;
you're not going to get much benefit from improving the protocol
efficiency.

Nicolas George

Mar 1, 2019, 12:26:38 PM
Barry Margolin , dans le message
<barmar-E933E8....@reader.eternal-september.org>, a écrit :
> It's true that there's nothing wrong with doing both, but at some point
> you get diminishing returns. If you're watching a video and it doesn't
> stop periodically and say "buffering", the protocol is efficient enough,
> you're not going to get much benefit from improving the protocol
> efficiency.

You could get video at higher resolution with the same hardware.