
Tcl UDP Server vs Python UDP server, not good....


jerry...@gmail.com

Apr 16, 2007, 6:44:56 PM
Hi,
Today I wrote a little Python UDP "time server" and compared it against
a Tcl UDP time server I wrote a couple of days ago.

The results were a little surprising: with 12 Tcl/Tk clients running
against the Python server, the user CPU time was about two percent.
Running against the Tcl server, the user CPU time was about twenty
percent!

Here is the Python server:
*******************************
#!/usr/bin/env python
import SocketServer
from time import *

class handler(SocketServer.DatagramRequestHandler):
    def handle(self):
        self.wfile.write(strftime("%X", localtime()))

s = SocketServer.UDPServer(('192.168.1.70',10000), handler)
s.serve_forever()
**************************************************************************
And here is the Tcl server:
#!/bin/sh
# -*- tcl -*-
# The next line is executed by /bin/sh, but not tcl \
exec tclsh "$0" ${1+"$@"}

# Create a time/date server via udp

package require udp

proc initBroadCaster { port } {
    set bfd [udp_open $port]
    fconfigure $bfd -buffering none -blocking 0
    fileevent $bfd readable [list sendTime $bfd]
}

proc sendTime { sock } {
    set data [read $sock]
    set peer [fconfigure $sock -peer]
    fconfigure $sock -remote $peer
    puts -nonewline $sock "[clock format [clock seconds] -format %X]"
}

initBroadCaster 10000
vwait forever
************************************************************
Just for completeness here is the client:

#!/bin/sh
# -*- tcl -*-
# The next line is executed by /bin/sh, but not tcl \
exec tclsh "$0" ${1+"$@"}
package require Tk
package require udp

proc buildUI {} {
    global seconds_to_suspend
    label .aLable -text "Seconds to Sleep: "
    entry .seconds -textvariable seconds_to_suspend -width 14
    pack .aLable -side left
    pack .seconds -side left
}

proc openPort { host port } {
    global fd
    set fd [udp_open]
    fconfigure $fd -buffering none -remote [list $host $port]
    fileevent $fd readable [list readSocket $fd]
}

proc readSocket { fd } {
    global seconds_to_suspend
    set secs [read $fd]
    if { ![string length $secs] } {
        #puts "zero [clock seconds]"
        return
    }
    set seconds_to_suspend $secs
}

proc looper {} {
    global fd
    puts -nonewline $fd "hello"
    after 1000 looper
}

buildUI
set seconds_to_suspend Waiting
#openPort 192.168.1.70 8888
openPort 192.168.1.70 10000
looper
*********************************************
Is the Tcl server doing some kind of busy waiting?
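One way to check, rather than guessing, is to instrument the handler itself: count how often the readable callback fires versus how many non-empty datagrams actually arrive. A rough, untested sketch along those lines (the proc and counter names are only illustrative); register sendTimeCounted with fileevent in place of sendTime:

# Instrumentation sketch: a busy or double-firing readable handler shows up
# as the call count growing much faster than the datagram count.
array set stats {calls 0 datagrams 0}

proc sendTimeCounted { sock } {
    global stats
    incr stats(calls)
    set data [read $sock]
    if { [string length $data] == 0 } { return }
    incr stats(datagrams)
    fconfigure $sock -remote [fconfigure $sock -peer]
    puts -nonewline $sock "[clock format [clock seconds] -format %X]"
}

proc report {} {
    global stats
    puts "handler calls: $stats(calls)  non-empty datagrams: $stats(datagrams)"
    after 5000 report
}
report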


Jerry

Pat Thoyts

Apr 17, 2007, 8:09:34 AM
"jerry...@gmail.com" <jerry...@gmail.com> writes:

>The results were a little surprising: with 12 Tcl/Tk clients running
>against the Python server, the user CPU time was about two percent.
>Running against the Tcl server, the user CPU time was about twenty
>percent!

You don't say what platform and what tcl version you are using. Both
of which are essential to properly answer your question. Doesn't look
like Windows so I'll assume linux. On unix systems we create the
socket and wrap a Tcl_Channel around it. The Tcl unix notifier will
call our channel functions when something happens. It uses select as
far as I know. We do something a bit different on Windows and run a
thread for it and post the messages back to the main thread for
handling.

Your programs don't really appear to be equivalent -- although I don't
know enough about Python to be sure -- is Python reading the data?
I suspect that [clock format [clock seconds]] might be taking a bit
of time, particularly if this is Tcl 8.5.

I'd suggest trying a fixed response to examine just the network
response in isolation. The fconfigure -peer bit also does dns lookups
which can be an issue (although in this case not likely as the
response gets cached by the resolver and you only have the one
client).
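A minimal sketch of that fixed-response test, assuming the same tcludp setup as the original server (the constant string stands in for the [clock format] reply, so any remaining CPU cost is in the channel and notifier path):

package require udp

proc sendFixed { sock } {
    set data [read $sock]
    if { [string length $data] == 0 } { return }   ;# ignore empty wake-ups
    fconfigure $sock -remote [fconfigure $sock -peer]
    puts -nonewline $sock "00:00:00"               ;# fixed reply, no clock work
}

set bfd [udp_open 10000]
fconfigure $bfd -buffering none -blocking 0
fileevent $bfd readable [list sendFixed $bfd]
vwait forever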

--
Pat Thoyts http://www.patthoyts.tk/
To reply, rot13 the return address or read the X-Address header.
PGP fingerprint 2C 6E 98 07 2C 59 C8 97 10 CE 11 E6 04 E0 B9 DD

Ivan Young

Apr 17, 2007, 11:30:15 AM
Pat Thoyts wrote:
> "jerry...@gmail.com" <jerry...@gmail.com> writes:
>
>> The results were a little surprising: with 12 Tcl/Tk clients running
>> against the Python server, the user CPU time was about two percent.
>> Running against the Tcl server, the user CPU time was about twenty
>> percent!
>
> You don't say what platform and what tcl version you are using. Both
> of which are essential to properly answer your question. Doesn't look
> like Windows so I'll assume linux. On unix systems we create the
> socket and wrap a Tcl_Channel around it. The Tcl unix notifier will
> call our channel functions when something happens. It uses select as
> far as I know. We do something a bit different on Windows and run a
> thread for it and post the messages back to the main thread for
> handling.
>
> Your programs don't really appear to be equivalent -- although I don't
> know enough about Python to be sure -- is Python reading the data?
> I suspect that [clock format [clock seconds]] might be taking a bit
> of time, particularly if this is Tcl 8.5.
>
> I'd suggest trying a fixed response to examine just the network
> response in isolation. The fconfigure -peer bit also does dns lookups
> which can be an issue (although in this case not likely as the
> response gets cached by the resolver and you only have the one
> client).
>
I've tried the code given under Solaris, Tcl 8.4.14, tcludp 1.0.7 &
1.0.8. The server's memory seems to grow with each message.
Also, changing the server reply to a puts to stdout suggests that
the callback sendTime is being invoked twice for each message received
from the client. At least on this platform; I don't see that with other
test code, so I'll have a closer look at the client side.

Ivan..

jerry...@gmail.com

Apr 17, 2007, 2:11:23 PM
On Apr 17, 8:09 am, Pat Thoyts <cnggub...@hfref.fbheprsbetr.arg>
wrote:

>
> You don't say what platform and what tcl version you are using. Both
> of which are essential to properly answer your question. Doesn't look
> like Windows so I'll assume linux. On unix systems we create the
> socket and wrap a Tcl_Channel around it. The Tcl unix notifier will
> call our channel functions when something happens. It uses select as
> far as I know. We do something a bit different on Windows and run a
> thread for it and post the messages back to the main thread for
> handling.
>

I am running ActiveState Tcl 8.4.14 on a Fedora 6 laptop, with tcludp
1.0.8.

> Your programs don't really appear to be equivalent -- although I don't
> know enough about Python to be sure -- is Python reading the data?
> I suspect that [clock format [clock seconds]] might be taking a bit
> of time, particularly if this is Tcl 8.5.
>

The handler does read the socket, and the data is available as
self.request[0].

> I'd suggest trying a fixed response to examine just the network
> response in isolation. The fconfigure -peer bit also does dns lookups
> which can be an issue (although in this case not likely as the
> response gets cached by the resolver and you only have the one
> client).

I am running the clients and server on a "quiescent" system.

Jerry

Ivan Young

Apr 18, 2007, 6:06:53 AM
Ivan Young wrote:

> I've tried the code given under Solaris, Tcl 8.4.14, tcludp 1.0.7 &
> 1.0.8. The server's memory seems to grow with each message.
> Also, changing the server reply to a puts to stdout suggests that
> the callback sendTime is being invoked twice for each message received
> from the client. At least on this platform; I don't see that with other
> test code, so I'll have a closer look at the client side.
>

More:

I added a puts to stdout to show the data, its length and the time.


proc sendTime { sock } {
    set data [read $sock]
    set peer [fconfigure $sock -peer]
    fconfigure $sock -remote $peer
    puts -nonewline $sock "[clock format [clock seconds] -format %X]"

    puts "[string length $data] $data [clock format [clock seconds] -format %X]"
}

The result was :
5 hello 10:36:29
0 10:36:30
5 hello 10:36:30
0 10:36:31
5 hello 10:36:31
0 10:36:32
5 hello 10:36:32
0 10:36:33
5 hello 10:36:33
0 10:36:34
5 hello 10:36:34
0 10:36:35
5 hello 10:36:35

Likewise, a puts in the client shows two replies each second.
This is caused by the -blocking 0 in initBroadCaster.
The solution is either to return to blocking I/O (-blocking 1)
or to check for a zero-length message in sendTime.

proc sendTime { sock } {
    set data [read $sock]

    if { [string length $data] > 0 } {
        set peer [fconfigure $sock -peer]
        fconfigure $sock -remote $peer
        puts -nonewline $sock "[clock format [clock seconds] -format %X]"

        puts "[string length $data] $data [clock format [clock seconds] -format %X]"
    }
}

This halves the number of calls to clock.

More worrying to me is the speed at which the server's memory size is growing; it looks like a memory
leak. I'll try to investigate further when I get time.

Ivan..

Ivan Young

Apr 18, 2007, 11:27:02 AM

I've tried Jerry's code on Solaris, Tcl 8.4.14 with memory debug and tcludp 1.0.8.
The server grows by 134 bytes with each message.
The increase in memory is connected to the

fconfigure $sock -remote $peer

command in sendTime.
Removing that line of course stops the replies, but the memory size remains static.
I tried opening a new socket for the reply (code below) and closing it; the example works as
before, but again the memory increases by 134 bytes per message, so it's not the reuse of the
socket/fd.

proc sendTime { sock } {
    # memory trace on

    set data [read $sock]
    if { [string length $data] > 0 } {
        set peer [fconfigure $sock -peer]

        puts "peer $peer"
        set rfd [udp_open 10000]
        puts "-remote [fconfigure $rfd -remote $peer]"
        puts -nonewline $rfd "[clock format [clock seconds] -format %X]"
        # puts "[string length $data] $data [clock format [clock seconds] -format %X]"

        puts "[string length $data] $data "

        close $rfd
    }
    puts [memory info]
    return
}

example memory info output:
peer 145.239.160.81 44774
-remote
5 hello
total mallocs 30288
total frees 26373
current packets allocated 3915
current bytes allocated 229148
maximum packets allocated 4405
maximum bytes allocated 243824

peer 145.239.160.81 44774
-remote
5 hello
total mallocs 30339
total frees 26416
current packets allocated 3923
current bytes allocated 229282
maximum packets allocated 4405
maximum bytes allocated 243824

peer 145.239.160.81 44774
-remote
5 hello
total mallocs 30390
total frees 26459
current packets allocated 3931
current bytes allocated 229416
maximum packets allocated 4405
maximum bytes allocated 243824

Any ideas?

Ivan...


jerry...@gmail.com

Apr 18, 2007, 2:33:41 PM
On Apr 16, 6:44 pm, "jerry.le...@gmail.com" <jerry.le...@gmail.com>
wrote:

> Hi,
> Today I wrote a little Python UDP "time server" and compared it against
> a Tcl UDP time server I wrote a couple of days ago.
>
> The results were a little surprising: with 12 Tcl/Tk clients running
> against the Python server, the user CPU time was about two percent.
> Running against the Tcl server, the user CPU time was about twenty
> percent!
>
> Here is the Python server:
> *******************************
> #!/usr/bin/env python
> import SocketServer
> from time import *
>
> class handler(SocketServer.DatagramRequestHandler):
>     def handle(self):
>         self.wfile.write(strftime("%X", localtime()))
>
> s = SocketServer.UDPServer(('192.168.1.70',10000), handler)
> s.serve_forever()
> **************************************************************************
> *********************************************
> Is the Tcl server doing some kind of busy waiting?
>
> Jerry

I took a look at the serve_forever code and found that ultimately it
simply blocks on a read of the socket, waiting for a packet to arrive.

This means that there is no "event loop" overhead...

Still t'aint clear to me why the cpu keeps creeping up as more clients
are created
( compared to the simple python solution....)

Jerry


Alexandre Ferrieux

Apr 18, 2007, 3:52:03 PM
On 18 avr, 20:33, "jerry.le...@gmail.com" <jerry.le...@gmail.com>
wrote:

> Still t'aint clear to me why the cpu keeps creeping up as more clients
> are created ( compared to the simple python solution....)

Can you use strace on the server process? I know it only shows
syscalls, but sometimes it gives unexpected hints about the guts...

-Alex

jerry...@gmail.com

Apr 18, 2007, 8:12:23 PM
On Apr 18, 3:52 pm, Alexandre Ferrieux <alexandre.ferri...@gmail.com>
wrote:

Well, something seems to be leaking. I rewrote the client code to use
Python and ran about 24 clients against the Tcl server.

Top showed about 40 percent CPU usage (2 to 4 percent when run against
the Python server).

Top also showed an ever-increasing amount of memory being consumed by
the Tcl server...

Jerry

Uwe Klein

Apr 19, 2007, 2:57:02 AM
Hey all,

This is of interest to me as I have stuff using tcludp running 24/7.

I have tested this client/server setup with ~20 clients.
CPU goes up ~30% (Athlon XP3200), and memory is growing.

strace shows that for every port action,
/etc/services is opened, read completely, and closed again.

This is an issue I have come across before (name resolution
done for every item even when it is already numeric,
and it is uncached and expensive).

uwe

what strace shows:

select(4, [3], [], [], NULL) = 1 (in [3])

open("/etc/services", O_RDONLY) = 4
fcntl64(4, F_GETFD) = 0
fcntl64(4, F_SETFD, FD_CLOEXEC) = 0
fstat64(4, {st_mode=S_IFREG|0644, st_size=596411, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40261000
read(4, "#\n# Network services, Internet s"..., 4096) = 4096
read(........................ lots of reads snipped!
read(4, "troller 48000/udp # Nimbus C"..., 4096) = 2491
read(4, "", 4096) = 0
close(4) = 0
munmap(0x40261000, 4096) = 0
time(NULL) = 1176965280
sendto(3, "08:48:00", 8, 0, {sa_family=AF_INET, sin_port=htons(33256),
sin_addr=inet_addr("172.30.28.250")}, 16) = 8
select(4, [3], [], [], NULL) = 1 (in [3])

recvfrom(3, "hello", 4096, 0, {sa_family=AF_INET, sin_port=htons(33226),
sin_addr=inet_addr("172.30.28.250")}, [16]) = 5
open("/etc/services", O_RDONLY) = 4
fcntl64(4, F_GETFD) = 0
fcntl64(4, F_SETFD, FD_CLOEXEC) = 0
fstat64(4, {st_mode=S_IFREG|0644, st_size=596411, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40261000
read(4, "#\n# Network services, Internet s"..., 4096) = 4096

rinse, repeat
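If you want to see that cost from the Tcl level rather than strace, a rough micro-benchmark like this should show it (an untested sketch, assuming the lookup really happens inside the -remote/-peer option handling as the trace suggests):

package require udp

set s [udp_open]
puts "fconfigure -remote: [time {fconfigure $s -remote {127.0.0.1 10000}} 1000]"
puts "clock format:       [time {clock format [clock seconds] -format %X} 1000]"
close $s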

Pat Thoyts

Apr 19, 2007, 5:52:15 AM
Uwe Klein <uwe_klein_...@t-online.de> writes:

Aha. That's the one. This is I believe fixed in the CVS head already:

2006-07-11 Pat Thoyts <patt...@users.sourceforge.net>

* generic/udp_tcl.c: Added some timing to tests and changed the
* tests/udp-srv.tcl: service name lookup (was very slow on linux)

If you try your service using the tcludp HEAD and it is ok then I
better release this version as 1.0.9

Uwe Klein

Apr 19, 2007, 6:54:53 AM
Pat Thoyts wrote:
> Uwe Klein <uwe_klein_...@t-online.de> writes:

>>rinse, repeat
>
>
> Aha. That's the one. This is I believe fixed in the CVS head already:
>
> 2006-07-11 Pat Thoyts <patt...@users.sourceforge.net>
>
> * generic/udp_tcl.c: Added some timing to tests and changed the
> * tests/udp-srv.tcl: service name lookup (was very slow on linux)
>
> If you try your service using the tcludp HEAD and it is ok then I
> better release this version as 1.0.9
>

hi Pat,

looks like we are halfway there:
with the src from CVS, it reports as 1.0.9.

after start:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
14410 uwe 15 0 3284 2104 2296 S 0.0 0.4 0:00.02 tclsh
after ~30min:
14410 uwe 15 0 15196 13m 2296 S 0.0 2.7 0:00.11 tclsh


takes a lot less CPU to consume the same amount of memory ;-)

want any other tests done?

uwe


Alexandre Ferrieux

Apr 19, 2007, 4:33:06 PM
On Apr 19, 12:54 pm, Uwe Klein <uwe_klein_habertw...@t-online.de>
wrote:

> takes a lot less CPU to consume the same amount of memory ;-)
> want any other tests done?

Maybe you can attach to the process when it is getting fat, look at
/proc/maps to find the bulk of the heap, and dump some data from right
in the middle. Maybe you'll get an idea of what the leak is made of...

-Alex

Alexandre Ferrieux

Apr 19, 2007, 4:36:58 PM
On Apr 19, 2:12 am, "jerry.le...@gmail.com" <jerry.le...@gmail.com>
wrote:

>
> > Can you use strace on the server process ? I know it only shows
> > syscalls, but sometimes is gives unexpected hints about the guts...

> Well something seems to be leaking,

The leak is one thing, the CPU another (here). Look at the other
posts; you'll see strace did help discover the reason for the high CPU
(/etc/services reparsed every time).

-Alex


Uwe Klein

Apr 20, 2007, 5:42:19 AM
heap is growing ( only the server process) and filled with stars^H^H^H^Hthis
(repeated with very slight variations endlessly):
172.30.20.250 is the IP for server and client
ports used: two per client, 40 clients; 3224 is the server process:
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
tclsh 3224 uwe 3u IPv4 206704 UDP *:ndmp
tclsh 3268 uwe 4u IPv4 206956 UDP *:34705
tclsh 3268 uwe 5u IPv4 206957 UDP *:34706
tclsh 3270 uwe 4u IPv4 206960 UDP *:34707
tclsh 3270 uwe 5u IPv4 206961 UDP *:34708
tclsh 3272 uwe 4u IPv4 206962 UDP *:34709
tclsh 3272 uwe 5u IPv4 206963 UDP *:34710
tclsh 3274 uwe 4u IPv4 206972 UDP *:34711
tclsh 3274 uwe 5u IPv4 206973 UDP *:34712
tclsh 3276 uwe 4u IPv4 206984 UDP *:34713
tclsh 3276 uwe 5u IPv4 206985 UDP *:34714
tclsh 3278 uwe 4u IPv4 206986 UDP *:34715
tclsh 3278 uwe 5u IPv4 206987 UDP *:34716
tclsh 3280 uwe 4u IPv4 206988 UDP *:34717
tclsh 3280 uwe 5u IPv4 206989 UDP *:34718
tclsh 3282 uwe 4u IPv4 206992 UDP *:34719
tclsh 3282 uwe 5u IPv4 206993 UDP *:34720
..........
tclsh 3326 uwe 4u IPv4 207093 UDP *:34768
tclsh 3326 uwe 5u IPv4 207094 UDP *:34769
tclsh 3328 uwe 4u IPv4 207062 UDP *:34738
tclsh 3328 uwe 5u IPv4 207063 UDP *:34739
tclsh 3330 uwe 4u IPv4 207085 UDP *:34760
tclsh 3330 uwe 5u IPv4 207086 UDP *:34761
tclsh 3332 uwe 4u IPv4 207058 UDP *:34733
tclsh 3332 uwe 5u IPv4 207059 UDP *:34734
tclsh 3334 uwe 4u IPv4 207060 UDP *:34735
tclsh 3334 uwe 5u IPv4 207061 UDP *:34736
tclsh 3336 uwe 4u IPv4 207109 UDP *:34784
tclsh 3336 uwe 5u IPv4 207110 UDP *:34785
tclsh 3338 uwe 4u IPv4 207083 UDP *:34758
tclsh 3338 uwe 5u IPv4 207084 UDP *:34759
tclsh 3340 uwe 4u IPv4 207064 UDP *:34740
tclsh 3340 uwe 5u IPv4 207065 UDP *:34741
tclsh 3342 uwe 4u IPv4 207105 UDP *:34780
tclsh 3342 uwe 5u IPv4 207106 UDP *:34781
tclsh 3344 uwe 4u IPv4 207101 UDP *:34776
tclsh 3344 uwe 5u IPv4 207102 UDP *:34777
tclsh 3346 uwe 4u IPv4 207103 UDP *:34778
tclsh 3346 uwe 5u IPv4 207104 UDP *:34779

dump:

01746ca0 50 48 4f 4e 11 00 00 00 33 34 37 32 30 00 21 40 |PHON....34720.!@|
01746cb0 52 45 4d 45 11 00 00 00 02 00 00 00 02 00 00 00 |REME............|
01746cc0 28 0d 79 09 11 00 00 00 f8 05 79 09 08 05 79 09 |(.y.......y...y.|
01746cd0 00 2d 38 00 19 00 00 00 31 37 32 2e 33 30 2e 32 |.-8.....172.30.2|
01746ce0 38 2e 32 35 30 20 33 34 37 30 38 00 19 00 00 00 |8.250 34708.....|
01746cf0 31 37 32 2e 33 30 2e 32 38 2e 32 35 30 00 49 4d |172.30.28.250.IM|
01746d00 45 3d 64 65 11 00 00 00 33 34 37 30 38 00 4c 43 |E=de....34708.LC|
01746d10 5f 43 4f 4c 11 00 00 00 02 00 00 00 02 00 00 00 |_COL............|
01746d20 48 0b 79 09 11 00 00 00 88 06 79 09 98 05 79 09 |H.y.......y...y.|
01746d30 00 2d 38 00 19 00 00 00 31 37 32 2e 33 30 2e 32 |.-8.....172.30.2|
01746d40 38 2e 32 35 30 20 33 34 37 32 32 00 19 00 00 00 |8.250 34722.....|
01746d50 31 37 32 2e 33 30 2e 32 38 2e 32 35 30 00 49 4d |172.30.28.250.IM|
01746d60 45 3d 64 65 11 00 00 00 33 34 37 32 32 00 21 40 |E=de....34722.!@|
01746d70 5f 43 4f 4c 11 00 00 00 02 00 00 00 02 00 00 00 |_COL............|
01746d80 c8 0c 79 09 11 00 00 00 68 05 79 09 78 04 79 09 |..y.....h.y.x.y.|
01746d90 00 2d 38 00 19 00 00 00 31 37 32 2e 33 30 2e 32 |.-8.....172.30.2|
01746da0 38 2e 32 35 30 20 33 34 37 32 32 00 19 00 00 00 |8.250 34722.....|
01746db0 31 37 32 2e 33 30 2e 32 38 2e 32 35 30 00 4c 45 |172.30.28.250.LE|
01746dc0 50 48 4f 4e 11 00 00 00 33 34 37 32 32 00 21 40 |PHON....34722.!@|
01746dd0 52 45 4d 45 11 00 00 00 02 00 00 00 02 00 00 00 |REME............|


uwe

Alexandre Ferrieux

Apr 20, 2007, 2:16:36 PM
On Apr 20, 11:42 am, Uwe Klein <uwe_klein_habertw...@t-online.de>
wrote:

> heap is growing ( only the server process) and filled with stars^H^H^H^Hthis
> (repeated with very slight variations endlessly):
>
> 01746ca0 50 48 4f 4e 11 00 00 00 33 34 37 32 30 00 21 40 |PHON....34720.!@|
> 01746cb0 52 45 4d 45 11 00 00 00 02 00 00 00 02 00 00 00 |REME............|
>...

> 01746cf0 31 37 32 2e 33 30 2e 32 38 2e 32 35 30 00 49 4d |172.30.28.250.IM|
> 01746d00 45 3d 64 65 11 00 00 00 33 34 37 30 38 00 4c 43 |E=de....34708.LC|
> 01746d10 5f 43 4f 4c 11 00 00 00 02 00 00 00 02 00 00 00 |_COL............|

Uwe, Pat, does one of you know where the strings PHON, REME, IME=, and
_COL come from ?
Is it in the udp payload of your specific stress-test, Uwe, or in your
extension's code, Pat ?

-Alex

Uwe Klein

Apr 20, 2007, 4:47:47 PM
Alexandre Ferrieux wrote:
> On Apr 20, 11:42 am, Uwe Klein <uwe_klein_habertw...@t-online.de>
> wrote:
>
>>heap is growing ( only the server process) and filled with stars^H^H^H^Hthis
>>(repeated with very slight variations endlessly):
>>
>>01746ca0 50 48 4f 4e 11 00 00 00 33 34 37 32 30 00 21 40 |PHON....34720.!@|
>>01746cb0 52 45 4d 45 11 00 00 00 02 00 00 00 02 00 00 00 |REME............|

>>...
>>01746cf0 31 37 32 2e 33 30 2e 32 38 2e 32 35 30 00 49 4d |172.30.28.250.IM|
>>01746d00 45 3d 64 65 11 00 00 00 33 34 37 30 38 00 4c 43 |E=de....34708.LC|
>>01746d10 5f 43 4f 4c 11 00 00 00 02 00 00 00 02 00 00 00 |_COL............|
>
>
> Uwe, Pat, does one of you know where the strings PHON, REME, IME=, and
> _COL come from ?
> Is it in the udp payload of your specific stress-test, Uwe, or in your
> extension's code, Pat ?
Well, the two things that caught my attention are
the IP as a null-terminated string
172.30.28.250
and the source port as a null-terminated string
34708
The 50 48 4f 4e, 5f 43 4f 4c and 52 45 4d 45 look like pointers to somewhere;
11 00 00 00 seems to be some bitmask.

A Tcl_Obj?

I've never done much with Tcl at the C level.


>
> -Alex
>
uwe

Uwe Klein

Apr 20, 2007, 5:16:09 PM
Alexandre Ferrieux wrote:

> Is it in the udp payload of your specific stress-test, Uwe, or in your
> extension's code, Pat ?

The code running is the original posted stuff by the OP jerry levan
as pasted from the posting.
the udplib is the current cvs from the sourceforge site reporting as 1.0.9.
tcl is 8.4.6 ( distributed with SuSE 9.1 ) libc is glibc-2.3.3

mapping is:
08048000-08049000 r-xp 00000000 22:05 460148 /usr/bin/tclsh8.4
08049000-0804a000 rw-p 00000000 22:05 460148 /usr/bin/tclsh8.4
0804a000-08941000 rwxp 0804a000 00:00 0
40000000-40016000 r-xp 00000000 22:05 18448 /lib/ld-2.3.3.so
40016000-40017000 rw-p 00015000 22:05 18448 /lib/ld-2.3.3.so
40017000-40018000 rw-p 40017000 00:00 0
40038000-400d5000 r-xp 00000000 22:05 22538 /usr/lib/libtcl8.4.so
400d5000-400df000 rw-p 0009c000 22:05 22538 /usr/lib/libtcl8.4.so
400df000-400e0000 rw-p 400df000 00:00 0
400e0000-400e2000 r-xp 00000000 22:05 18457 /lib/libdl.so.2
400e2000-400e3000 rw-p 00002000 22:05 18457 /lib/libdl.so.2
400e3000-40104000 r-xp 00000000 22:05 18475 /lib/tls/libm.so.6
40104000-40105000 rw-p 00020000 22:05 18475 /lib/tls/libm.so.6
40105000-40106000 rw-p 40105000 00:00 0
40106000-40210000 r-xp 00000000 22:05 18474 /lib/tls/libc.so.6
40210000-40218000 rw-p 0010a000 22:05 18474 /lib/tls/libc.so.6
40218000-4021b000 rw-p 40218000 00:00 0
4021b000-40256000 r--p 00000000 22:05 21324 /usr/lib/locale/de_DE.utf8/LC_CTYPE
40256000-4025c000 r--s 00000000 22:05 18855 /usr/lib/gconv/gconv-modules.cache
4025c000-40260000 r-xp 00000000 22:05 503493 /usr/lib/udp1.0.9/libudp1.0.9.so
40260000-40261000 rw-p 00003000 22:05 503493 /usr/lib/udp1.0.9/libudp1.0.9.so
40261000-40262000 r--p 00000000 22:05 19306 /usr/lib/locale/de_DE.utf8/LC_TIME
bfff5000-c0000000 rw-p bfff5000 00:00 0
ffffe000-fffff000 ---p 00000000 00:00 0

uwe

Alexandre Ferrieux

Apr 20, 2007, 5:34:39 PM
On Apr 20, 11:16 pm, Uwe Klein <uwe_klein_habertw...@t-online.de>
wrote:

> The code running is the original posted stuff by the OP jerry levan
> as pasted from the posting.
> the udplib is the current cvs from the sourceforge site reporting as 1.0.9.
> tcl is 8.4.6 ( distributed with SuSE 9.1 ) libc is glibc-2.3.3

Tough. No PHON/REME there :-(

>
> mapping is:
> 08048000-08049000 r-xp 00000000 22:05 460148 /usr/bin/tclsh8.4

> ffffe000-fffff000 ---p 00000000 00:00 0

Funny, but the addresses of your previous dump were in the 0174xxxx,
outside these mappings. Assume you modified the program in between...

Anyway, I don't buy these 'PHON' and 'REME' as pointers. All 8
capitals is too much for sheer chance.
Instead, maybe these are magic numbers, e.g. for block integrity
checks in the malloc() implementation ?
Can you try a debugmalloc-like tool ?
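On the Tcl side, the interpreter's own allocation debugging may be simpler to wire in than an external debug malloc; a sketch of the stock [memory] subcommands (they only exist in a tclsh built with TCL_MEM_DEBUG):

memory init on          ;# fill fresh allocations with a recognizable pattern
memory trace on         ;# log every ckalloc/ckfree to stderr
memory tag sendTime     ;# label allocations made from here on
# ... run the server under load for a while ...
memory active leak.txt  ;# dump all outstanding allocations (with tags) to a file
puts [memory info]      ;# totals: mallocs, frees, packets and bytes in use
memory trace off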

-Alex
>
> uwe


Uwe Klein

Apr 20, 2007, 5:47:58 PM
Alexandre Ferrieux wrote:
> On Apr 20, 11:16 pm, Uwe Klein <uwe_klein_habertw...@t-online.de>
> wrote:
>
>>The code running is the original posted stuff by the OP jerry levan
>>as pasted from the posting.
>>the udplib is the current cvs from the sourceforge site reporting as 1.0.9.
>>tcl is 8.4.6 ( distributed with SuSE 9.1 ) libc is glibc-2.3.3
>
>
> Tough. No PHON/RENE there :-(
the easy problems are for the well paid staff members ;-/

>
>
>>mapping is:
>>08048000-08049000 r-xp 00000000 22:05 460148 /usr/bin/tclsh8.4
>>ffffe000-fffff000 ---p 00000000 00:00 0
>
>
> Funny, but the addresses of your previous dump were in the 0174xxxx,
that's the offset from the start of the segment;
I dumped the stuff with gdb:
dump memory <file> 0x0804a000 0x0933c000
hexdump <file>
i.e. the address you see is 0x0804a000 + <address in dump>

> outside these mappings. Assume you modified the program in between...
NO

>
> Anyway, I don't buy these 'PHON' and 'REME' as pointers. All 8
> capitals is too much for sheer chance.
Might be right there

> Instead, maybe these are magic numbers, e.g. for block integrity
> checks in the malloc() implementation ?
> Can you try a debugmalloc-like tool ?
tomorrow
>
> -Alex
>
uwe

jerry...@gmail.com

Apr 23, 2007, 8:31:56 PM
Hi,

In an effort to keep the issue moving forward, I posted
a bug report about the memory leak on the tcludp
SourceForge web site.


Jerry

Ivan Young

Apr 24, 2007, 6:51:57 AM
The memory leak seems isolated to the channel configuration requests from the generic layer.
Using the old udp_conf command works without leaking.

# Here is my current hacked version of sendTime, I've #'d my tracing lines.


proc sendTime { sock } {
    # memory trace on
    set data [read $sock]
    if { [string length $data] > 0 } {

        set peer [fconfigure $sock -peer]

        # puts "peer $peer"
        # memory tag interest
        # Try udp_conf instead of fconfigure
        # fconfigure $sock -remote $peer
        udp_conf $sock [lindex $peer 0] [lindex $peer 1]
        #memory active active-mem.txt
        #memory tag {}

        puts -nonewline $sock "[clock format [clock seconds] -format %X]"

        # puts "[string length $data] $data [clock format [clock seconds] -format %X]"
        #puts "[string length $data] $data "
    }
    #puts [memory info]
    return
}

This would seem to point to the udpSetOption code.
I tested the above on Solaris 7, Tcl 8.4.14, tcludp 1.0.8 patched with HEAD CVS code; can anyone
confirm the above works on other platforms?
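
For anyone who wants to try the same thing without reassembling the pieces, here is a cleaned-up sketch of the whole server with both workarounds in place (the zero-length guard and udp_conf instead of fconfigure -remote), assembled from the code in this thread rather than tested as a separate program:

package require udp

proc sendTime { sock } {
    set data [read $sock]
    if { [string length $data] == 0 } { return }   ;# skip spurious empty reads
    set peer [fconfigure $sock -peer]
    udp_conf $sock [lindex $peer 0] [lindex $peer 1]
    puts -nonewline $sock "[clock format [clock seconds] -format %X]"
}

proc initBroadCaster { port } {
    set bfd [udp_open $port]
    fconfigure $bfd -buffering none -blocking 0
    fileevent $bfd readable [list sendTime $bfd]
}

initBroadCaster 10000
vwait forever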

Ivan...

Uwe Klein

Apr 24, 2007, 7:53:45 AM
start:
5655 uwe 15 0 2628 1516 2296 S 0.0 0.3 0:00.02 tclsh
30min , 40 clients
5655 uwe 15 0 2628 1516 2296 S 0.0 0.3 0:00.04 tclsh


uwe

jerry...@gmail.com

Apr 24, 2007, 9:01:10 AM
On Apr 24, 6:51 am, Ivan Young <i...@jet.uk> wrote:

Ivan,

I am running about 20 clients against the server (one call per second
for each client) and your "fix"
appears to be the cure for the memory leak!

I am running Fedora 6 with the latest ActiveState Tcl on an HP laptop.

I guess I have to figure out how to use cvs to grab the patch that
fixes the excessive cpu usage...

Thanks for running down the problem :)

Jerry

jerry...@gmail.com

Apr 24, 2007, 10:11:40 AM
On Apr 24, 6:51 am, Ivan Young <i...@jet.uk> wrote:

(Google seems to have eaten my first response...)

The above fix works fine on my Fedora 6/ActiveState Tcl 8.4.14 system.

Thanks for the fix...

Jerry

Pat Thoyts

Apr 24, 2007, 11:00:50 AM
"jerry...@gmail.com" <jerry...@gmail.com> writes:

[snip]


>
>I guess I have to figure out how to use cvs to grab the patch that
>fixes the excessive cpu usage...

cvs -d:pserver:anon...@tcludp.cvs.sourceforge.net:/cvsroot/tcludp co tcludp

Alexandre Ferrieux

Apr 25, 2007, 6:00:59 PM
Hi Pat,

Any idea why the [fconfigure] leaks ?

-Alex

Pat Thoyts

Apr 30, 2007, 8:40:33 AM
Alexandre Ferrieux <alexandre...@gmail.com> writes:

No. When I have time and can find the answer I'll mark the bug fixed.
