One problem with POSIX


duanev

Nov 13, 2009, 12:49:35 PM
to golang-nuts
Since we are redefining the world (and it looks great so far!) I want
to bring attention to a long-standing, serious POSIX inefficiency that
Go appears to be propagating.

POSIX read() forces the buffer to be allocated in application space
instead of allowing the operating system to "hand" the application a
buffer. This forces ALL standard I/O to perform an extra and often
unneeded copy. A better definition of read() would be:

buf, len, err := Stdin.Read()

Thus on POSIX based operating systems the library can allocate the
buffer space, and in the future a less-POSIX OS can attach the buffer
to the application's address space.
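
A minimal sketch of the proposed read shape, assuming a hypothetical
BufferSource interface (not part of Go's standard library). On a POSIX
system the library can only emulate it by allocating a buffer per call,
which is what allocatingSource does here. A Go slice carries its own
length, so the separate len result is not needed.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// BufferSource is a hypothetical interface for the proposed read shape:
// the callee, not the caller, supplies the buffer.
type BufferSource interface {
	ReadBuffer() (buf []byte, err error)
}

// allocatingSource emulates the proposal on top of an ordinary io.Reader.
// It allocates a fresh buffer on every call, so it saves no copy; a
// non-POSIX OS could instead hand over a buffer it already owns.
type allocatingSource struct {
	r    io.Reader
	size int // must be > 0
}

func (s allocatingSource) ReadBuffer() ([]byte, error) {
	buf := make([]byte, s.size)
	n, err := s.r.Read(buf)
	return buf[:n], err
}

// drain collects everything a BufferSource yields, stopping at io.EOF.
func drain(src BufferSource) (string, error) {
	var out []byte
	for {
		buf, err := src.ReadBuffer()
		out = append(out, buf...)
		if err == io.EOF {
			return string(out), nil
		}
		if err != nil {
			return string(out), err
		}
	}
}

// readAllFromString is a small helper for the demo: it drains a string
// through allocatingSource using buffers of the given size.
func readAllFromString(s string, size int) (string, error) {
	return drain(allocatingSource{r: strings.NewReader(s), size: size})
}

func main() {
	got, err := readAllFromString("hello, world", 4)
	fmt.Println(got, err)
}
```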

Duane

John Cowan

Nov 13, 2009, 1:07:34 PM
to duanev, golang-nuts
duanev scripsit:

> POSIX read() forces the buffer to be allocated in application space
> instead of allowing the operating system to "hand" the application a
> buffer. This forces ALL standard I/O to perform an extra and often
> unneeded copy. A better definition of read() would be:

Ewww. Pre-Unix operating systems had all sorts of shared data structures
including buffers between userland and the kernel. The documentation had
Awful Warnings about what happened if you violated the intricate sharing
protocols -- I remember as particularly shuddersome the intricacies of
double-buffered disk I/O in TOPS-10, where the driver was filling one
shared buffer as you were emptying the other.

Of course, programmers consistently ignored the Awful Warnings, and their
"highly efficient" programs consistently crashed. Unix had the courage
to do away with all that and make sure that user space and kernel space
were firmly separated, and so things have remained.

--
They tried to pierce your heart John Cowan
with a Morgul-knife that remains in the http://www.ccil.org/~cowan
wound. If they had succeeded, you would
become a wraith under the domination of the Dark Lord. --Gandalf

duanev

Nov 13, 2009, 1:47:10 PM
to golang-nuts
On Nov 13, 12:07 pm, John Cowan <co...@ccil.org> wrote:
> Ewww. Pre-Unix operating systems had all sorts of shared
> data structures including buffers between userland and the
> kernel. The documentation had Awful Warnings about what
> happened if you violated the intricate sharing protocols

Our cell phones have paged virtual memory hardware built in; we should
be taking advantage of that. The bulk of I/O today surrounds network
applications where there is a single consuming app at the receiving
end point (e.g. streaming video). Why not optimize the future for
these?

My definition does not demand that data be shared in user space. If
multiple apps register to receive the same data, the OS can elect to
hand them each a different buffer - or better yet, if we define the
buffer returned by read as read-only, then the data is safely sharable.
Surely by now we know how to define usage semantics in ways that avoid
data coherency issues. :)

Duane

Ian Lance Taylor

Nov 14, 2009, 11:08:21 AM
to duanev, golang-nuts
That's an interesting idea, but for practical use it would have to be
an alternative to the existing Read implementation. Having the kernel
return a buffer for a single byte would be an inefficient use of
memory space. Having the Read() call allocate a buffer each time
would be a lot of unnecessary allocation compared to having the
program allocate a buffer once and pass it to Read() multiple times.
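
A short sketch of the allocate-once pattern described above, using only
the standard io.Reader interface. The helper names countBytes and
countString are made up for illustration.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// countBytes reads r to the end through one caller-owned buffer and reports
// the total byte count and the number of Read calls made. The buffer is
// allocated once, before the loop, and reused on every call.
func countBytes(r io.Reader, bufSize int) (total, calls int, err error) {
	buf := make([]byte, bufSize) // the only allocation in this function
	for {
		n, rerr := r.Read(buf)
		calls++
		total += n // buf[:n] holds the data for this iteration
		if rerr == io.EOF {
			return total, calls, nil
		}
		if rerr != nil {
			return total, calls, rerr
		}
	}
}

// countString runs countBytes over an in-memory string for the demo.
func countString(s string, bufSize int) (int, int, error) {
	return countBytes(strings.NewReader(s), bufSize)
}

func main() {
	total, calls, err := countString(strings.Repeat("x", 10000), 4096)
	fmt.Println(total, calls, err)
}
```

A Read that returned a new buffer each time would turn every iteration
of that loop into an allocation.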

In the POSIX world this would more typically be an Mmap call, and that
interface may be the best way to implement this for future operating
systems as well.

Ian

John

Nov 14, 2009, 5:53:07 PM
to golang-nuts


On Nov 14, 11:08 am, Ian Lance Taylor <i...@google.com> wrote:
> In the POSIX world this would more typically be an Mmap call, and that
> interface may be the best way to implement this for future operating
> systems as well.


Speaking of mmap, is there any way to memory map a file in go right
now?

Russ Cox

Nov 14, 2009, 5:54:37 PM
to John, golang-nuts
On Sat, Nov 14, 2009 at 14:53, John <jwfmc...@gmail.com> wrote:
> Speaking of mmap, is there any way to memory map a file in go right
> now?

No.

Russ

duanev

Nov 15, 2009, 5:32:20 PM
to golang-nuts
On Nov 14, 10:08 am, Ian Lance Taylor <i...@google.com> wrote:
> That's an interesting idea, but for practical use it would have to be
> an alternative to the existing Read implementation.  Having the kernel
> return a buffer for a single byte would be an inefficient use of
> memory space.  Having the Read() call allocate a buffer each time
> would be a lot of unnecessary allocation compared to having the
> program allocate a buffer once and pass it to Read() multiple times.

Your points are well taken; reading single bytes would indeed be very
inefficient. I guess I'm thinking more of a pure event-driven
system where buffers containing packets or sectors are handed to
applications. An app would register a callback at open time and could
process what appears at any granularity it wanted - in which case
read() is not used at all... but this is more of an OS than a Go topic.

Duane

Russ Cox

Nov 16, 2009, 12:11:51 PM
to duanev, golang-nuts
> A better definition of read() would be:
>
>    buf, len, err := Stdin.Read()

Maybe, maybe not. If you change the definition of os.File.Read
you'd also have to change the definition of io.Reader and thus
the definition of everything that implements Read. Most of these
things do not talk to the operating system, so they would not
benefit from mmap tricks, and now they'd all have to allocate.
The cost of making the os.File Read maybe a little faster is that
all the other Reads get slower due to the extra allocations.

It might make sense in non-POSIX contexts to expose the other
read as a different method, but changing Read itself has
pretty big implications in Go because of interfaces.
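
One way to picture "a different method" is an optional interface that
callers probe with a type assertion, leaving io.Reader untouched. The
sketch below assumes a hypothetical BufferReader interface and a toy
memSource type; neither exists in the standard library. The io package
uses the same pattern for io.WriterTo and io.ReaderFrom in io.Copy.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// BufferReader is a hypothetical optional interface: a reader that can hand
// out a buffer it already owns instead of filling one supplied by the caller.
type BufferReader interface {
	ReadBuffer() ([]byte, error)
}

// next returns the next chunk from r. It uses the buffer-returning method
// when r offers one and falls back to plain Read into scratch otherwise.
func next(r io.Reader, scratch []byte) ([]byte, error) {
	if br, ok := r.(BufferReader); ok {
		return br.ReadBuffer()
	}
	n, err := r.Read(scratch)
	return scratch[:n], err
}

// memSource is a toy reader that already holds its data in memory, so it can
// implement both the ordinary Read and the hypothetical ReadBuffer.
type memSource struct {
	data []byte
}

func (m *memSource) Read(p []byte) (int, error) {
	if len(m.data) == 0 {
		return 0, io.EOF
	}
	n := copy(p, m.data)
	m.data = m.data[n:]
	return n, nil
}

func (m *memSource) ReadBuffer() ([]byte, error) {
	if len(m.data) == 0 {
		return nil, io.EOF
	}
	buf := m.data // hand over the existing buffer, no copy
	m.data = nil
	return buf, nil
}

// firstChunks fetches one chunk from a plain reader and one from a
// memSource, both through next with a 4-byte scratch buffer.
func firstChunks() (plain, fast string) {
	scratch := make([]byte, 4)
	p, _ := next(strings.NewReader("abcdefgh"), scratch)
	plain = string(p)
	f, _ := next(&memSource{data: []byte("abcdefgh")}, scratch)
	fast = string(f)
	return plain, fast
}

func main() {
	plain, fast := firstChunks()
	fmt.Println(plain, fast)
}
```

Readers that never talk to the operating system keep implementing only
Read and pay nothing for the extra method.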

Russ

duanev

Nov 17, 2009, 12:27:02 AM
to golang-nuts
On Nov 16, 11:11 am, Russ Cox <r...@golang.org> wrote:
> It might make sense in non-POSIX contexts to expose the other
> read as a different method, but changing Read itself has
> pretty big implications in Go because of interfaces.

Agreed. And as some of the other new Go concepts begin to sink in,
allow me to try again:

for bufp := <-Stdin.ch; len(*bufp) > 0; bufp = <-Stdin.ch {
	Stdout.ch <- bufp
}

where

Stdin.ch and Stdout.ch are of type chan *[]byte

But chan currently implies other goroutines are present and so far
I've only seen goroutines use chans locally within a program. Are
folks thinking of channels that terminate in other processes or are
attached to devices or sockets?

Duane

Ian Lance Taylor

Nov 17, 2009, 1:04:35 AM
to duanev, golang-nuts
duanev <dua...@gmail.com> writes:

> But chan currently implies other goroutines are present and so far
> I've only seen goroutines use chans locally within a program. Are
> folks thinking of channels that terminate in other processes or are
> attached to devices or sockets?

Not really. I would use a goroutine which read the device or socket
and fed the bytes into a channel. For cases where a more efficient
read was possible, I would put that at the os or syscall layer, and
have the goroutine use the efficient method.
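
A minimal sketch of that arrangement, with a strings.Reader standing in
for the device or socket. The names pump and collect are made up for
illustration.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// pump starts a goroutine that reads r in chunks of up to chunkSize bytes
// and sends each chunk on the returned channel. The channel is closed when
// Read reports io.EOF or any other error; this sketch does not tell the
// two apart.
func pump(r io.Reader, chunkSize int) <-chan []byte {
	ch := make(chan []byte)
	go func() {
		defer close(ch)
		for {
			// A fresh buffer per chunk: the receiver owns it after the send.
			buf := make([]byte, chunkSize)
			n, err := r.Read(buf)
			if n > 0 {
				ch <- buf[:n]
			}
			if err != nil {
				return
			}
		}
	}()
	return ch
}

// collect drains the channel and reports the reassembled text and the
// number of chunks received.
func collect(s string, chunkSize int) (string, int) {
	var out []byte
	chunks := 0
	for chunk := range pump(strings.NewReader(s), chunkSize) {
		out = append(out, chunk...)
		chunks++
	}
	return string(out), chunks
}

func main() {
	text, chunks := collect("packets or sectors", 5)
	fmt.Println(text, chunks)
}
```

A more efficient read at the os or syscall layer would only change the
body of the goroutine; the consumer of the channel stays the same.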

Ian

Mike Beller

Nov 24, 2009, 1:02:09 PM
to Russ Cox, golang-nuts
Russ

Is there any particular reason that mmap is not supported?

mmap is a very powerful capability, especially in Linux, for example,
where it is used for super-low-overhead file I/O, for sharing memory
between applications, and ultimately, it's the underlying memory
allocator in Linux (brk is built on top of anonymous mmap). I would
say a language which calls itself a system programming language will
need some support for mmap.

If there is no language-specific reason not to do it, perhaps you
could hint at how I could go about building it, or whether anyone else
already is.

Mike


On Nov 14, 5:54 pm, Russ Cox <r...@golang.org> wrote:
> On Sat, Nov 14, 2009 at 14:53, John <jwfmccl...@gmail.com> wrote:
> > Speaking of mmap, is there any way to memory map a file in go right
> > now?
>
> No.
>
> Russ

Helmar

Nov 24, 2009, 1:26:20 PM
to golang-nuts
Hi,

On Nov 14, 5:54 pm, Russ Cox <r...@golang.org> wrote:
> On Sat, Nov 14, 2009 at 14:53, John <jwfmccl...@gmail.com> wrote:
> > Speaking of mmap, is there any way to memory map a file in go right
> > now?
>
> No.

Well, I had been thinking so (and experienced it by simply loading a
big file). But isn't that a little bit weird? Something like
io.ReadFile(), for example, could do it transparently. It would be the
way to get the contents of some file immediately mapped into memory.
Will you do this better in the future, or is this a WontFix issue?

Regards,
-Helmar


>
> Russ

Uriel

Nov 24, 2009, 1:34:55 PM
to golang-nuts
We really got to have mmap; any system should implement two, and only
two, interfaces: mmap and ioctl.

We should do away with all that annoying and tedious open/read/write
silliness, those crazy Unix people had no clue what they were doing.

uriel

P.S.: And sockets! We need sockets! Everyone knows it is impossible to
do networking without sockets!

Devon H. O'Dell

Nov 24, 2009, 1:43:28 PM
to golang-nuts
2009/11/24 Helmar <hel...@gmail.com>:
In response to this and the post before this:

I don't have a real answer for this, but I suspect that this is at
least in part due to the garbage collector. An interface to mmap would
need to add the memory-mapped pages to the gc, and the gc would need
to know how to free those pages, if it needs to do so differently than
just freeing the memory. It could be done transparently, but I don't
think it's as simple as just letting you call e.g. syscall.Mmap.

--dho

> Regards,
> -Helmar
>
>
>>
>> Russ
>

Russ Cox

Nov 24, 2009, 4:45:16 PM
to Mike Beller, golang-nuts
On Tue, Nov 24, 2009 at 10:02, Mike Beller <belle...@gmail.com> wrote:
> Is there any particular reason that mmap is not supported?
>
> mmap is a very powerful capability, especially in Linux, for example,
> where it is used for super-low-overhead file I/O, for sharing memory
> between applications, and ultimately, it's the underlying memory
> allocator in Linux (brk is built on top of anonymous mmap).  I would
> say a language which calls itself a system programming language will
> need some support for mmap.
>
> If there is no language-specific reason not to do it, perhaps you
> could hint at how I could go about building it, or whether anyone else
> already is.

Go already calls mmap to get memory. The first reason there
is no interface for mmapping files is that we haven't needed it yet,
but it's also not clear how to make it fit well with the rest of the system.
The basic mmap call would have to return a []byte, but most
programs that mmap a file want to interpret it as native data structures.
Preparing that file implies precise knowledge of the memory layout
of specific structures, which Go tries to hide for portability reasons,
and the conversion itself would require importing "unsafe",
which Go tries to avoid as much as possible. (That said, I certainly
appreciate that this is an important use case.) Even once the
conversion is done, it's not clear how mmap interacts with the garbage
collector. You wouldn't want to depend on storing Go pointers
in the mmap'ed region and hope the garbage collector doesn't
collect those objects. A thornier problem is deciding when it is
safe to unmap the file. The easy answer is never; more precise
answers will require even more chumminess with the garbage
collector.
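
For comparison, a []byte can be decoded into a struct without importing
"unsafe" by using encoding/binary, at the price of a copy and an
explicit byte order. The sketch below assumes a hypothetical 16-byte
little-endian header layout; it is not a zero-copy view of the mapped
data.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// header is a hypothetical fixed-size on-disk record. All fields have a
// fixed width, so binary.Read can fill the struct directly.
type header struct {
	Magic   uint32
	Version uint16
	Flags   uint16
	Length  uint64
}

// decodeHeader copies the first 16 bytes of b into a header, interpreting
// them as little-endian regardless of the host's native byte order.
func decodeHeader(b []byte) (header, error) {
	var h header
	err := binary.Read(bytes.NewReader(b), binary.LittleEndian, &h)
	return h, err
}

// sample builds 16 bytes in the assumed layout, standing in for the start
// of a mapped file.
func sample() []byte {
	b := make([]byte, 16)
	binary.LittleEndian.PutUint32(b[0:], 0x474f4d4d)
	binary.LittleEndian.PutUint16(b[4:], 1)
	binary.LittleEndian.PutUint16(b[6:], 0)
	binary.LittleEndian.PutUint64(b[8:], 4096)
	return b
}

func main() {
	h, err := decodeHeader(sample())
	fmt.Printf("%+v %v\n", h, err)
}
```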

Until these questions are answered, the simplest thing to do is
just call it directly using the syscall package and take responsibility
for the safety or lack thereof. The toy Native Client code has
a use of mmap in that context, which you could look at for inspiration:
http://golang.org/src/pkg/exp/nacl/av/av.go?h=SYS_MMAP#L261
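
As a rough illustration of calling it directly, the sketch below maps a
file read-only with syscall.Mmap and unmaps it by hand. It assumes a
Unix-like system (syscall.Mmap is not available everywhere) and Go 1.16
or later for os.CreateTemp; mapFile and roundTrip are made-up helper
names. The caller decides when unmapping is safe, which is exactly the
responsibility described above.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// mapFile maps the whole of the named file read-only and returns the
// mapping as a []byte. The caller must pass the slice to syscall.Munmap
// when done and must not touch it afterwards.
func mapFile(path string) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close() // the mapping stays valid after the descriptor is closed

	fi, err := f.Stat()
	if err != nil {
		return nil, err
	}
	if fi.Size() == 0 {
		return nil, fmt.Errorf("mapFile: %s is empty", path)
	}
	return syscall.Mmap(int(f.Fd()), 0, int(fi.Size()),
		syscall.PROT_READ, syscall.MAP_SHARED)
}

// roundTrip writes content to a temporary file, maps it, copies the bytes
// out, and unmaps the region before returning.
func roundTrip(content string) (string, error) {
	f, err := os.CreateTemp("", "mmap-demo-*")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())
	if _, err := f.WriteString(content); err != nil {
		f.Close()
		return "", err
	}
	if err := f.Close(); err != nil {
		return "", err
	}

	data, err := mapFile(f.Name())
	if err != nil {
		return "", err
	}
	got := string(data) // copy out before unmapping
	if err := syscall.Munmap(data); err != nil {
		return "", err
	}
	return got, nil
}

func main() {
	got, err := roundTrip("mapped bytes")
	fmt.Println(got, err)
}
```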


On Tue, Nov 24, 2009 at 10:34, Uriel <ur...@berlinblue.org> wrote:
> We really got to have mmap; any system should implement two, and only
> two, interfaces: mmap and ioctl.
>
> We should do away with all that annoying and tedious open/read/write
> silliness, those crazy Unix people had no clue what they were doing.

Writing these rants is probably not a productive use of your time,
and reading them is certainly a waste of our time. People in
different situations may make different design decisions, and
that's entirely reasonable. It's just not interesting to watch you
jump all over people because they have different problems than
you do. (You'll probably be disappointed to learn that mmap is
a fundamental part of the implementation of http://9fans.net/archive/.)
Please take your non-constructive comments elsewhere. Thanks.

Russ