
"bad address in system call argument" - on a socket


de...@scratters.com

Aug 25, 2014, 6:13:47 AM
What would the error "bad address in system call argument" mean in the context of a socket? My socket is in non-blocking mode and has a readable fileevent on it. It's connected to a process on localhost. After several hours of running I suddenly get this error.

My log shows that the first instance of the error is actually on the flush which follows a write to that socket:

"error flushing "sock668": bad address in system call argument"

That one I happen to catch.

My application then wants to open another socket (-async, this one) to a remote machine, and that gives the error:

Connection failed: "sockets are not available on this system"

It's Windows 7 and sockets have been running fine for some time before this happens.

A few seconds later I do a [gets] without a catch, and that now gives me:

error reading "sock668": bad address in system call argument

Because there's no catch the application croaks, but clearly the sockets system has gone completely wrong by this point.
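
For illustration, a guarded read could look something like the sketch below. It assumes Tcl 8.5 or later (for the options dictionary that [catch] can return); the proc name tryGets is made up for this example and is not from the application. The message "bad address in system call argument" is the text Tcl uses for the POSIX error EFAULT, so logging the -errorcode may help confirm whether the failure really comes from the OS layer.

```tcl
# Hypothetical helper: read one line, but log and close instead of crashing.
# Returns 1 on success (line and count are stored in the caller's variables),
# or 0 if [gets] raised an error.
proc tryGets {chan lineVar countVar} {
    upvar 1 $lineVar line $countVar count
    if {[catch {gets $chan line} result opts]} {
        # For OS-level failures the error code has the form "POSIX <name> <message>".
        puts stderr "read failed on $chan: $result (errorcode: [dict get $opts -errorcode])"
        # The channel is probably unusable now; closing may fail too, so guard it.
        catch {close $chan}
        return 0
    }
    set count $result
    return 1
}
```

Note that on success the count can still be -1 (end of file, or no complete line yet on a non-blocking channel), so the caller would still need the usual [eof] and [fblocked] checks.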

A bit of searching shows very few references to this error from Tcl. Something from the days of Tcl 8.4 is all I could find, and that was related to a file, not a socket. I'm running 8.6.1.

Does anyone know what causes this? Or what I can do to prevent it?

Alexandre Ferrieux

Aug 25, 2014, 11:16:25 AM
On Monday, August 25, 2014 12:13:47 PM UTC+2, de...@scratters.com wrote:
>
> Does anyone know what causes this? Or what I can do to prevent it?

No, but I would be glad to guide you through some further investigation.
That is, assuming you don't just disappear after asking, like you did last time.

-Alex

Björn Lundin

Aug 25, 2014, 5:13:21 PM
On 2014-08-25 12:13, de...@scratters.com wrote:
> My log shows that the first instance of the error is actually on the flush which follows a write to that socket:
> "error flushing "sock668": bad address in system call argument"
> That one I happen to catch.
> My application then wants to open another socket (-async, this one) to a remote machine, and that gives the error:
> Connection failed: "sockets are not available on this system"
> It's Windows 7 and sockets have been running fine for some time before this happens.

If this is within a loop, you might be

  opening sockets
  get error
  opening sockets
  get error
  opening sockets
  get error

until Windows has no more sockets?
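
One way to test this hypothesis would be to log how many socket channels are open over time. The following is only a sketch: the proc names countChannels and logSocketCount are invented for this example, and it relies on Tcl naming socket channels with a "sock" prefix (as in "sock668" above). [chan names] only reports channels registered in the current interpreter.

```tcl
# Hypothetical helper: count open channels whose names match a glob pattern.
proc countChannels {{pattern sock*}} {
    return [llength [chan names $pattern]]
}

# Hypothetical helper: log the socket count periodically via the event loop.
# A number that keeps growing over hours would point to a leak.
proc logSocketCount {{intervalMs 60000}} {
    puts stderr "[clock format [clock seconds]] open sockets: [countChannels]"
    after $intervalMs [list logSocketCount $intervalMs]
}
```

Calling logSocketCount once at startup would be enough in an application that already runs the event loop.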

--
Björn

de...@scratters.com

Aug 26, 2014, 4:07:57 AM
No, no loop. My application is network infrastructure based, and does continually try to open, and then close, lots of sockets both to the local machine and any number of remote machines. But it doesn't do this particularly quickly and it's fairly straightforward. It closes everything it opens and normally runs for days on end with no problems. The socket that's hit the error is the "control" socket which stays open permanently.

There is something interesting (in the "unusual" sense) in there though. This control socket reads an XML document. The fileevent handler wants to parse it and then update the application, including the GUI, before dealing with the next one. I found that when a stream of these XML documents comes in, the GUI blocked up because there was a lot of heavy processing going on without a break. So to fix that I do the reading from the socket, then when I've got the whole document I put the processing of it into an "after 0" call in order to let the event loop get a look in.

Further, during the XML processing and GUI updating I used to have a bunch of [update] calls for various reasons (and not [update idletasks], either). This was before I learned what a terrible idea that is, and yes, the effect was as you'd expect - my XML processing routine allowed the socket handler back in and the XML processing effectively got called re-entrantly. Not knowing any better at the time, my fix for this was to switch off the fileevent handler, do the XML processing, then switch it back on again.

So the flow of the socket's fileevent handler was basically:

  gets the incoming line
  append to buffer - if the XML isn't yet complete, return
  once the XML is complete:
    save the fileevent readable script (which I'm in) to a variable
    after 0 [list processTheXML $xml]
    put the fileevent readable script back

I fully appreciate what a total dog's breakfast of logic this is and it was only yesterday, when I was looking at this issue, that I realised that that mess was still in there. All my [update]s have long gone, and as of yesterday all that fileevent changing has now been removed, as has the "after 0". The XML document is processed inline now and it all seems to work.
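
As a rough sketch of what the simplified inline version might look like: the names onControlReadable, xmlIsComplete and xmlBuffer are hypothetical, processTheXML is a stub here, and the completeness test (looking for a closing </message> tag) is purely an assumption for the example, since the real check isn't shown in this post.

```tcl
# Assumption for this sketch: a document is complete once its closing
# root tag has arrived. The real application's test may differ.
proc xmlIsComplete {buf} {
    return [string match "*</message>*" $buf]
}

# Stub standing in for the real processing routine.
proc processTheXML {xml} {
    lappend ::processed $xml
}

# Readable handler: no [update], no "after 0", no fileevent juggling.
proc onControlReadable {sock} {
    global xmlBuffer
    if {[catch {gets $sock line} n]} {
        # Read error: stop listening and close the channel.
        fileevent $sock readable {}
        catch {close $sock}
        return
    }
    if {$n < 0} {
        if {[eof $sock]} {
            fileevent $sock readable {}
            catch {close $sock}
        }
        # Otherwise only a partial line was available; wait for the next event.
        return
    }
    append xmlBuffer($sock) $line \n
    if {![xmlIsComplete $xmlBuffer($sock)]} {
        return
    }
    set xml $xmlBuffer($sock)
    set xmlBuffer($sock) ""
    # Processed inline; the next readable event fires only after this returns.
    processTheXML $xml
}
```

It would be installed with something like: fileevent $sock readable [list onControlReadable $sock]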

I've explained this because that's clearly a use case of the Tcl language which isn't going to be explored too often. :) Hopefully it's that messing about with the fileevent handler which has caused the problem.

Ralf Fassel

Aug 26, 2014, 4:39:34 AM
* de...@scratters.com
| [...] the effect was as you'd expect - my XML processing routine
| allowed the socket handler back in and the XML processing effectively
| got called re-entrantly. Not knowing any better at the time, my fix
| for this was to switch off the fileevent handler, do the XML
| processing, then switch it back on again.
|
| So the flow of the socket's fileevent handler was basically:
|
|   gets the incoming line
|   append to buffer - if the XML isn't yet complete, return
|   once the XML is complete:
|     save the fileevent readable script (which I'm in) to a variable
|     after 0 [list processTheXML $xml]
|     put the fileevent readable script back

If I read this correctly as the following sequence:
- save the fileevent readable script (which I'm in) to a variable
- after 0 [list processTheXML $xml]
- put the fileevent readable script back
then you have not changed anything with respect to the fileevent handler
being active when the XML processing runs. The handler is only inactive
during the 'after' call, which would not call it anyway. When the XML
processing starts after 0ms, the handler is already back and active.

In order to achieve your goal, you would have to reinstall the handler
only after 'processTheXML' has finished, e.g. by giving the variable as
an additional argument to processTheXML:

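    # Inside the fileevent handler: remember the handler script, then disable it.
    set handler [fileevent $socket readable]
    fileevent $socket readable {}
    after 0 [list processTheXML $xml $socket $handler]

    proc processTheXML {xml socket handler} {
        # ... parse XML ...
        # Re-enable the handler only once processing is done.
        fileevent $socket readable $handler
    }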
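
A possible refinement, sketched here under the assumption of Tcl 8.6 (which provides try/finally): restoring the handler in a finally clause means the socket does not stay deaf if the parsing throws. parseXML is a hypothetical placeholder for the real parsing code.

```tcl
# Placeholder for the real parser; hypothetical name and behaviour.
proc parseXML {xml} {
    if {$xml eq ""} {
        error "empty document"
    }
    return [string length $xml]
}

proc processTheXML {xml socket handler} {
    try {
        parseXML $xml
    } finally {
        # Runs on success and on error alike, so the handler always comes back.
        fileevent $socket readable $handler
    }
}
```

Any error from the parser still propagates to the caller after the handler has been reinstalled.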

HTH
R'

de...@scratters.com

Aug 26, 2014, 4:57:02 AM
Yes, I know. :o} It was the result of my thrashing about trying to understand what was happening, and trying to stop the re-entrant calling messing up my data model. Now that I've taken out the [update] calls, none of that fileevent nonsense is necessary.

I described it because I suspect it's instrumental in what appears to basically be an "internal error" in Tcl's socket handling.

de...@scratters.com

Sep 1, 2014, 5:20:28 AM
On Tuesday, August 26, 2014 9:57:02 AM UTC+1, de...@scratters.com wrote:
> I described it because I suspect it's instrumental in what appears to basically be an "internal error" in Tcl's socket handling.

For the record, all that fileevent stuff isn't the cause. Having removed it all, I set another run going and the same issue appeared after about 3 days of 10-20 socket opens/closes per minute.