Strange segfault using put-bytevector on socket port

David Banks

Mar 28, 2011, 8:46:04 AM

to Mosh Developer Disscus

Hey, I came across a strange behaviour using put-bytevector.
Intermittently put-bytevector will segfault when used on a client
socket. The expression that causes the segfault looks like this:

(put-bytevector (socket-port client-conn) (string->utf8 response))

The content of 'response' does not seem to affect the problem: I have
replaced it with a constant and the problem still happens. Sadly I
can't reduce the problem to a test case, as it won't reproduce in
smaller code. Something about the context is triggering the bug in a
manner that looks like a race condition, but the program is
single-threaded.

In the largish program it occurs in, removing lots of code between the
connection and the response dramatically reduces how often the
segfault occurs. (It can take anywhere between a second and 10
minutes to happen.) However, I don't think the problem is simply the
delay between the request and the response, as replacing the
procedures that cause the delay with busy-wait loops does _not_
segfault. The backtrace is below. I can provide the entire program on
request, but it's quite large.

Note that replacing put-bytevector with socket-send fixes the
problem. However as far as I know the put-bytevector call should
work.

Program received signal SIGSEGV, Segmentation fault.
0x000000000046a7dc in scheme::TranscodedTextualOutputPort::close (this=0x7fffffffcad0)
    at src/TranscodedTextualOutputPort.cpp:84
84 return port_->close();
(gdb) bt
#0 0x000000000046a7dc in scheme::TranscodedTextualOutputPort::close (this=0x7fffffffcad0)
    at src/TranscodedTextualOutputPort.cpp:84
#1 0x000000000046a653 in scheme::TranscodedTextualOutputPort::~TranscodedTextualOutputPort (this=0x7fffffffcad0, __in_chrg=<value optimized out>, __vtt_parm=<value optimized out>)
    at src/TranscodedTextualOutputPort.cpp:72
#2 0x000000000048c364 in scheme::stringTobytevectorEx (theVM=0x9d7d20, argc=0, argv=0x1d0e4e0)
    at src/PortProcedures.cpp:1458
#3 0x00000000004aa786 in scheme::stringToutf8Ex (theVM=0x9d7d20, argc=1, argv=0x283a260)
    at src/ByteVectorProcedures.cpp:102
#4 0x00000000004290a1 in scheme::CProcedure::call (this=0x9f3220, theVM=0x9d7d20, argc=2, argv=0x283a238)
    at src/CProcedure.h:47
#5 0x0000000000421ca1 in scheme::VM::runLoop (this=0x9d7d20, code=0x1011000, returnPoint=0x0, returnTable=false)
    at src/call.inc.cpp:68
#6 0x0000000000416aac in scheme::VM::evaluateUnsafe (this=0x9d7d20, code=0xe27000, codeSize=86640, isCompiler=false)
    at src/VM.cpp:409
#7 0x0000000000416bd2 in scheme::VM::evaluateSafe (this=0x9d7d20, code=0xe27000, codeSize=86640, isCompiler=false)
    at src/VM.cpp:423
#8 0x000000000041890c in scheme::VM::activateR6RSMode (this=0x9d7d20, image=0x58ac00 "\004pR\001", image_size=351814, isDebugExpand=false)
    at src/VM.cpp:868
#9 0x00000000004e520e in activateR6RSMode (vm=0x9d7d20, isDebugExpand=false)
    at src/main.cpp:120
#10 0x00000000004e55ab in main (argc=2, argv=0x7fffffffe858)
    at src/main.cpp:332

gdb backtraces are not my strong point, but it seems like the port is
being destroyed halfway through trying to write to it? Any help
appreciated.

Cheers,
David

higepon

Mar 28, 2011, 10:21:47 AM

to mosh-develo...@googlegroups.com, David Banks

Hi.

As you mentioned, it seems GC is happening at the wrong time.
Could you send me the program?
It would help us.

Cheers.


David Banks

Mar 28, 2011, 11:36:36 AM

to mosh-develo...@googlegroups.com, higepon

Hi higepon,

On 28 March 2011 15:21, higepon <hig...@gmail.com> wrote:
> As you mentioned, it seems GC is happening at the wrong time.
> Could you send me the program?
> It would help us.

I have uploaded the program to http://www.solasistim.net/webserver-segv.tar.gz
To reproduce you can probably do this procedure:

$ wget http://www.solasistim.net/webserver-segv.tar.gz
$ tar -xzf webserver-segv.tar.gz
$ cd webserver-segv
$ mosh test.scm

The server should start. Now in another terminal try this:

$ while curl http://localhost:8080/; do true; done

After a few requests the server should segfault.

The code is not too pretty and is way overcomplicated for a test
case; however, every time I tried to simplify the code path the
problem disappeared or took longer to manifest.

--
David Banks <amo...@gmail.com>

higepon

Mar 29, 2011, 4:31:00 AM

to David Banks, mosh-develo...@googlegroups.com

Thanks.

With your web server, I found the problem.

The problem is that multiple socket ports are created from one client-conn.

(1) You get client-conn.
(2) You create a textual port with (transcoded-port (socket-port client-conn) ...) -- (A)
(3) (A) is no longer needed, so it can be destroyed at any time.
(4) Destroying the transcoded port closes client-conn.
(5) You create a socket port from the closed client-conn -- (B)
(6) (put-bytevector B ...) causes a SEGV (this should be a user-friendly
error, not a SEGV).

For now, this is an implementation restriction.
Can you avoid creating multiple socket ports?
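
As a rough illustration of steps (1) to (6), the pattern looks something like the sketch below. This is a hypothetical reconstruction, not code taken from the uploaded web server: the procedure names read-request-line and send-response are made up for the example, and only socket-port, transcoded-port, put-bytevector and string->utf8 come from the discussion itself.

```scheme
(import (rnrs)
        (mosh socket))

;; Hypothetical helper: reads one line of the request.
;; The textual port (A) wraps a fresh socket port and becomes
;; unreachable as soon as this procedure returns (steps 2 and 3).
(define (read-request-line client-conn)
  (let ((text-port (transcoded-port (socket-port client-conn)
                                    (native-transcoder))))
    (get-line text-port)))

;; Hypothetical helper: writes the response.
;; A second socket port (B) is created from the same client-conn
;; (step 5). If (A) has already been destroyed, the connection
;; underneath (B) may be closed by the time we write (step 6).
(define (send-response client-conn response)
  (put-bytevector (socket-port client-conn)
                  (string->utf8 response)))
```

Calling socket-port once per connection and passing that single port around would avoid ever having two ports on the same client-conn.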

Cheers.

David Banks

Mar 29, 2011, 6:09:19 AM

to higepon, mosh-develo...@googlegroups.com

On 29 March 2011 09:31, higepon <hig...@gmail.com> wrote:
> With your web server, I found the problem.
>
> The problem is that multiple socket ports are created from one client-conn.
>
> (1) You get client-conn.
> (2) You create a textual port with (transcoded-port (socket-port client-conn) ...) -- (A)
> (3) (A) is no longer needed, so it can be destroyed at any time.
> (4) Destroying the transcoded port closes client-conn.
> (5) You create a socket port from the closed client-conn -- (B)
> (6) (put-bytevector B ...) causes a SEGV (this should be a user-friendly
> error, not a SEGV).
>
> For now, this is an implementation restriction.
> Can you avoid creating multiple socket ports?

Thanks for this higepon, I have been tearing my hair out over this.
From this information I worked around the segfault by keeping the two
ports in scope. If both are kept in scope then the port is reliably
closed before it is possible to write to it. As in this simpler
example:

(import (rnrs)
        (mosh socket))

(let ((srv (make-server-socket "8080")))
  (let loop ((i 1))
    (let ((conn (socket-accept srv)))
      (display "accepted connection ")
      (display i)
      (newline)
      (let* ((sock-port (socket-port conn))
             (input-port (transcoded-port sock-port (native-transcoder)))
             (output-port sock-port))
        (put-bytevector output-port (string->utf8 "Hello, world!\n"))
        (socket-close conn)))
    (loop (+ i 1)))
  (socket-close srv))

If you make a request to it, you will see this exception:

Condition components:
1. &i/o-read
2. &who who: <binary-input/output-port <socket server localhost:41188>>
3. &message message: "put-bytevector"
4. &i/o-port port: "port is closed"
5. &irritants irritants: ("port is closed")

This is interesting as the input port is still in scope so it
shouldn't be GCed AFAICS. Attempting to read from input-port after
the call to put-bytevector doesn't change the behaviour.

Anyway you probably already realized this but it's useful to get it
clear for me. I can work around the problem by not using the
transcoded port and just using binary I/O. I will add it to the bug
tracker if you want.
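
For reference, a minimal sketch of that binary-only workaround might look like the following. It is an untested assumption rather than the code from the actual server: it reuses make-server-socket, socket-accept, socket-port and socket-close exactly as in the example above, and otherwise relies only on standard R6RS port procedures (get-bytevector-some, put-bytevector, flush-output-port).

```scheme
(import (rnrs)
        (mosh socket))

;; Sketch only: answer every connection with a fixed response,
;; using a single binary port per connection and no transcoded-port.
(let ((srv (make-server-socket "8080")))
  (let loop ()
    (let* ((conn (socket-accept srv))
           (port (socket-port conn)))   ; the only port made from conn
      ;; Read whatever bytes of the request have arrived so far and
      ;; decode them by hand instead of going through a textual port.
      (let ((request-bytes (get-bytevector-some port)))
        (unless (eof-object? request-bytes)
          (display (utf8->string request-bytes))))
      ;; Encode the response by hand and write it to the same port.
      (put-bytevector port (string->utf8 "Hello, world!\n"))
      (flush-output-port port)
      (socket-close conn))
    (loop)))
```

Since the binary port is never handed to transcoded-port, nothing closes it behind the program's back, and the same port serves for both reading and writing.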

Cheers,
--
David Banks <amo...@gmail.com>

higepon

Mar 29, 2011, 10:27:28 PM

to David Banks, mosh-develo...@googlegroups.com

> I have been tearing my hair out over this.

If you find strange behaviour in Mosh,
it must be a bug; please tell me before you tear your hair out :D

> As in this simpler example:

...

> This is interesting as the input port is still in scope so it
> shouldn't be GCed AFAICS. Attempting to read from input-port after
> the call to put-bytevector doesn't change the behaviour.

I think this behaviour is R6RS compliant.
The relation between textual ports and binary ports is a little bit confusing.

See below.

> As a side effect, however, transcoded-port closes binary-port in a special way
> that allows the new textual port to continue to use the byte source or sink represented by binary-port,
> even though binary-port itself is closed and cannot be used by the input and output operations described in this chapter.

http://www.r6rs.org/final/html/r6rs-lib/r6rs-lib-Z-H-9.html#node_idx_650
http://www.r6rs.org/final/html/r6rs-rationale/r6rs-rationale-Z-H-22.html#node_sec_20.5

Cheers.

David Banks

Mar 30, 2011, 3:41:40 PM

to higepon, mosh-develo...@googlegroups.com

Oops, I meant to post the vaguely OT stuff below to the list:

On 30 March 2011 14:44, David Banks <amo...@gmail.com> wrote:
> Wow, I completely missed that. Thanks higepon, I learned something today.
>
> PS: The server is actually a port of vijaymathew's "Fermion" server
> from spark scheme. One issue is that R6RS doesn't have "load" which
> Fermion used to load Scheme scripts. It still works using 'eval' but
> the problem is, how can a script request a particular library
> environment? R6RS "eval" can't evaluate the library syntax, so
> (import (cool-functions)) won't work from inside a servlet. There are
> two ways I can think to do that:
>
> 1. (easiest) Use an external data file mapping scripts to
> environments ("myscript.scm" (import-spec))
>
> 2. Load scripts from the file with READ and attempt to parse out the
> R6RS library syntax from the sexp structure, ignoring export spec,
> passing import spec to 'environment', and the library body to eval.
> (Could have obscure issues)
>
> The downside to both ways: they need a complete interpreter to reload
> dependent libraries (those from the environment). Anyway, sorry for
> that: just a brain dump. :)

As for this, I've now realized that it is possible to structure
scripts as normal R6 implementation-parsed libraries if you use them
as computed environments to eval. So to run a script, just ((eval
'start '(my-script-name)) the-request) and everything is nice. :)
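
Spelled out in full, R6RS eval takes an environment object as its second argument, built with environment from (rnrs eval). A small sketch of the idea, in which the helper name library-ref is made up and the standard (rnrs) library stands in for a real servlet library so that the example is self-contained:

```scheme
(import (rnrs)
        (rnrs eval))

;; Hypothetical helper: look up the value bound to `name` in the
;; library named by `library-name`, e.g. '(my-script-name).
(define (library-ref library-name name)
  (eval name (environment library-name)))

;; Stand-in demonstration: fetch string-length from (rnrs) and call it.
(display ((library-ref '(rnrs) 'string-length) "hello"))
(newline)

;; With a servlet library (my-script-name) that exports `start`,
;; the call from the text would then be:
;; ((library-ref '(my-script-name) 'start) the-request)
```

The demonstration line should print 5. The servlet call is left as a comment because the (my-script-name) library is only assumed to exist.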

How to reload definitions without killing the entire server is another
issue. However I might just punt that and kill the server every time,
not sure if the gain is worth the pain.

higepon

Mar 30, 2011, 10:13:45 PM

to David Banks, mosh-develo...@googlegroups.com

> How to reload definitions without killing the entire server is another
> issue. However I might just punt that and kill the server every time,
> not sure if the gain is worth the pain.

I think killing it is the best solution. :D

When your web server works on Mosh, please let me know.

Cheers.
