Orbited Memory Leak

Niklas B

Jun 18, 2010, 8:27:47 AM
to Orbited Discussion
Hi,

I'm seeing some issues with Orbited leaking memory. Has anyone else
experienced this? I'm running 0.8 beta 1. The problem is that I have
modified proxy.py, config.py and monitor.py, so I am not convinced that
it's not me who has screwed up.

I'm running 4 instances (3 live and 1 development) with about 5000
unique users per day (probably around 7000 connections), and it's
eating 500MB per instance per week. I'm not sure whether it grows with
time, with data (i.e. messages) or with connection pools.

I've done some changes to tidy up (mostly my code) and I will report
how it goes here.

Here is my Munin graph:

http://www.snyggastchatten.se/orbited_mem.png

Regards,
Niklas

B-ZaR

Jun 18, 2010, 11:59:38 AM
to Orbited Discussion
Hi,

I'm experiencing the same problem. I use one instance of Orbited
version 0.7.10 (unmodified). There are under 500 messages sent per day
and only 10-20 connections, but the memory consumption rises steadily
by about 327MiB per week until the server runs out of memory. Some
connections stay open for several weeks.

I have tried locating the problem by stress testing the orbited daemon
without success. I have tried:
- Spamming messages (hundreds of MiB)
- Spamming connections (quick connect/disconnect cycles, thousands)
- Spamming listeners and messages (128 listeners, 10MiB to each in
1KiB packages)

I use the HTTP interface through an apache proxy in my real-world
scenario, but the above were tested by using a straight STOMP
connection.

Here's my config file content:
---
[global]
reactor=select
# reactor=kqueue
# reactor=epoll
session.ping_interval = 1000
session.ping_timeout = 10000
proxy-enabled = 1
# once the sockets are open, orbited will drop its privileges to this user.
user=orbited

[listen]
http://:8000
stomp://:61613
#monitor://:8001

[static]

[access]
* -> localhost:8000
#* -> localhost:8001
* -> localhost:61613

[logging]
debug=STDERR,debug.log
info=STDERR,info.log
access=STDERR,info.log
warn=STDERR,error.log
error=STDERR,error.log

enabled.default=info,access,warn,error #,debug

# Turn debug on for the "Proxy" logger
[loggers]
#Proxy=debug,info,access,warn,error
---

- Teemu

leon

Jun 18, 2010, 6:37:13 PM
to Orbited Discussion
Are your users accessing Orbited directly via a JS interface, or is it
being proxied through some other HTTP server (say, Apache)?

Teemu Erkkola

Jun 19, 2010, 4:04:07 AM
to orbite...@googlegroups.com
Proxied through Apache, though I'm looking into the possibility of using a
lighter proxy before Apache and Orbited to decrease the number of forked
Apache processes. You think that might affect the memory footprint of the
Orbited daemon?

leon

Jun 19, 2010, 5:53:40 AM
to Orbited Discussion
So it's not your users that connect to Orbited, but Apache instead. When
you run ps -aux | grep httpd (or apache), how many processes do you see,
and what is their age? I have had problems with Apache (not connected to
Orbited). I'm not sure this affects Orbited's memory usage, but if
Apache keeps forking processes and their connections do not get
released, memory will stay allocated for them and Orbited will not 'see'
that they have been disconnected (which, in fact, they won't have been).

So, what happens to Orbited's memory usage when you kill some of the
oldest Apache processes on your server? If it drops, we've found our
culprit. There's also another thread where this very same issue is
discussed, but it hasn't been very active lately (this week).

One other test is to connect your users directly to Orbited instead of
proxying through Apache. I have it this way, as the most important piece
is the underlying ActiveMQ server. Can you do either of these and let us
know the outcome?


Regards,
Leon

Teemu Erkkola

Jun 19, 2010, 7:12:02 AM
to orbite...@googlegroups.com
There are 13 processes, and none of them are very old because I recently
restarted apache. My server runs a very closed circle beta so there are only a
handful of users. Killing some or all of those processes had no effect on
orbited's memory consumption, which has been growing steadily since I last
restarted it.

Connecting directly to orbited without proxying prevents use from behind some
firewalls, as both cannot use the same universally available port 80. I could
try this for the weekend though.

Niklas B

Jun 19, 2010, 7:59:04 AM
to Orbited Discussion
My users are connecting to Orbited without any proxy. Basically, Apache
just serves up the HTML file, the JS is sent from Nginx and
Orbited.js and the rest goes via Orbited.

My issues seem to be mainly correlated with users connecting (not sure
if it's users or messages).

If I reroute traffic away from one of the instances (it now has 0
connections), the memory usage doesn't really go up (at least not that
much), but it doesn't go down either.

I've also had issues with Orbited not releasing connections to the
STOMP server, but that is another issue (possibly caused by
having several connections per client and closing some of them).

My theory is that either Orbited or something I have done keeps Twisted
from cleaning up the connections properly. So if I can get some GD
trace.

Niklas B

Jun 19, 2010, 7:59:55 AM
to Orbited Discussion
Of course that should be "GC" for garbage collection..

Jérémy Lal

Jun 19, 2010, 11:54:39 AM
to orbite...@googlegroups.com
On 18/06/2010 17:59, B-ZaR wrote:
> Hi,
>
> I'm experiencing the same problem. I use one instance of Orbited
> version 0.7.10 (unmodified). There are under 500 messages sent per day
> and only 10-20 connections, but the memory consumption raises steadily
> about 327MiB per week until the server runs out of memory. Some
> connections stay on for several weeks.
> ...

I wonder if what i'm seeing is similar :
* On the orbited server :
telnet localhost 8005
gives 470 connections
netstat -n | wc -l
gives around 3000 connections.

* On the rabbitmq + stomp server :
netstat -n | wc -l
gives around 500 connections.

Maybe i'm wrong, but i guess the connections on the orbited
server should be twice the connections on the stomp server (roughly) :
one for incoming client, one for the proxy to stomp ?
Is my count totally wrong ?

Jérémy.

A Monkey

Jun 19, 2010, 3:01:56 PM
to orbite...@googlegroups.com
Hi Folks,

Thanks for chiming in with all of these questions and thanks to those
that have filed tickets.

I'm going to spend time looking into this tomorrow afternoon (I'm
GMT-4). Any additional information you can gather before then would
help move things along. orbited version is essential, config files are
great and a tarball containing a working example is the best!

Thanks,
Matthew

Niklas B

Jun 20, 2010, 4:20:07 PM
to Orbited Discussion
Sorry for the late reply,

Let me know if you need someone to test something. I'm running
instances side by side so I can measure the difference (however, only
from 1 source code currently)

I'm using Python 2.5.2 and Orbited 0.8 beta 1

Teemu Erkkola

Jun 21, 2010, 10:45:07 AM
to orbite...@googlegroups.com
Tried having the clients connect directly to the orbited daemon with no
proxying for a couple of days, but the memory footprint kept growing at
a steady pace.

Niklas B

Jun 22, 2010, 4:51:18 AM
to Orbited Discussion
Mine seems to be climbing with users/activity. If I stop letting
Orbited accept connections it doesn't climb further up.

My hunch (which probably is wrong) is that it has to do with not
clearing Protocol Instances.

leon

Jun 22, 2010, 9:21:38 AM
to Orbited Discussion
But it does not go down either, right? Do you have the monitor plugged
in?

Niklas B

Jun 22, 2010, 10:38:29 AM
to Orbited Discussion
I have monitor plugged in. I've tried to remove it and it doesn't make
any difference.

Does anyone not have memory leaks on python 2.5.2? Thinking it might
be a python version issue.

Here is my finding so far:

I've tried using various GC tools etc. to nail down the issue, and it
doesn't appear to be in proxy.py. I've removed everything:

from twisted.internet import protocol, reactor
from orbited.config import map as config

class Incoming(protocol.Protocol):
    def connectionMade(self):
        print "connection made"
        self.transport.loseConnection()

    # Twisted passes the disconnect reason to connectionLost
    def connectionLost(self, reason):
        print "lost"

class ProxyFactory(protocol.Factory):
    protocol = Incoming

And it still takes up a reasonable amount of memory per user. If I
hammer the monitor instance, which is using pretty much the exact same
code, it doesn't take up any noticeable amount of memory.

So what is the part that initiates the ProxyFactory doing differently?

Even if I switch to:

from echo import EchoFactory
reactor.listenWith(CometPort, factory=EchoFactory(),
                   resource=root, childName='csp')

It still leaks memory. Could it be in the CometPort part? I'm out of
my depth in CSP though and this is driving me crazy :)

Jared Wilson

Jun 22, 2010, 10:45:24 AM
to orbite...@googlegroups.com
Do you think it might be the size of the packets you're sending?

Also, are there instructions on how to hook in the monitor? I am not familiar with it.
What are the monitor's capabilities?



--
You received this message because you are subscribed to the
Orbited discussion group.
To post, send email to
   <orbite...@googlegroups.com>
To unsubscribe, send email to
   <orbited-user...@googlegroups.com>
For more options, visit
   <http://groups.google.com/group/orbited-users>

desmaj

Jun 22, 2010, 11:45:51 AM
to Orbited Discussion
This goes out to all you memory leak hunters out there!

Marius Gedminas has a nice post about visualizing the python object
graph and hunting for leaks. There's even some code!
http://mg.pov.lt/blog/hunting-python-memleaks

Michael Carter

Jun 22, 2010, 12:28:10 PM
to orbite...@googlegroups.com
Here is my recommendation for tracking down the memory leaks:

1) add twisted manhole to orbited so you can ssh/telnet into the live process [1] 
2) ssh/telnet into your running orbited, and in the python shell: >>> import objgraph
* objgraph info, see [2]
3) >>> objgraph.show_most_common_types() to get a list of the most common object instances
4) let's say the most common instance is a list, then look at a random list instance and see what's in it: random.choice(objgraph.by_type('list'))
5) repeat step 4 a few times, then email results/transcript to the list.
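
If objgraph cannot be installed on the box, steps 3 and 4 can be roughly approximated with the standard `gc` module. This is only a sketch: `most_common_types` and `sample_of_type` are hypothetical helper names, not objgraph's API, the counts may differ from what objgraph reports, and the code is written for a current Python 3 rather than the 2.5 interpreters discussed in this thread.

```python
import gc
import random
from collections import Counter


def most_common_types(limit=10):
    """Return (type name, count) pairs for the most numerous gc-tracked objects."""
    counts = Counter(type(obj).__name__ for obj in gc.get_objects())
    return counts.most_common(limit)


def sample_of_type(type_name):
    """Pick one random live object whose type name matches, or None."""
    matches = [obj for obj in gc.get_objects()
               if type(obj).__name__ == type_name]
    return random.choice(matches) if matches else None


if __name__ == "__main__":
    for name, count in most_common_types():
        print("%-28s %d" % (name, count))
    # Truncate the repr so a huge list does not flood the terminal.
    print(repr(sample_of_type("list"))[:200])
```

Run from inside the manhole shell, the same two calls give a quick census of the live process without any third-party module.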



-Michael Carter


Michael Carter

Jun 22, 2010, 12:32:26 PM
to orbite...@googlegroups.com
After re-reading the thread I realize that there are reports from both 0.7.x and 0.8.x users. The comet layers of these releases are *completely* different, but they do share some other modules. The most likely place for a memory leak is usually the comet layer, so I would guess that we have 2 separate issues. Less likely, but also possible, is that one of the shared modules is leaking.

For all future emails on this thread, please state the exact version number of orbited you're using before you write anything else.

leon

Jun 22, 2010, 8:32:55 PM
to Orbited Discussion
I have JUST started coding in Python, but I do remember the official
Python website where they state 'Python does not leak memory'. It did
seem like a fairy tale to me, but would that be possible? Assuming that is
true (ahem), we'd probably be facing some sort of issue where objects
are not getting released instead of a true memory leak. The only
objects that get created are the connection handlers, right? Do any of
these run within threads? It could also be something that is not
getting shut down properly on the twisted module. My wild guesses.
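
To illustrate the distinction drawn above: in a garbage-collected runtime, a "leak" usually means that something long-lived still holds a reference to objects that should be gone. The following is a minimal sketch and not Orbited's code. `Session` and the `sessions` dict are hypothetical stand-ins for a per-connection object and a registry that forgets to drop its entry on disconnect.

```python
import gc
import weakref


class Session(object):
    """Hypothetical stand-in for a per-connection object."""

    def __init__(self, key):
        self.key = key


sessions = {}  # long-lived registry, e.g. keyed by session id


def connect(key):
    session = Session(key)
    sessions[key] = session
    # A weak reference lets us observe the object without keeping it alive.
    return weakref.ref(session)


def disconnect_forgetful(key):
    """The bug being illustrated: the registry entry is never removed."""
    pass


def disconnect_clean(key):
    """Drop the registry entry so the session can be reclaimed."""
    sessions.pop(key, None)


if __name__ == "__main__":
    ref = connect("abc")
    disconnect_forgetful("abc")
    gc.collect()
    print("alive after forgetful disconnect:", ref() is not None)
    disconnect_clean("abc")
    gc.collect()
    print("alive after clean disconnect:", ref() is not None)
```

Calling `gc.collect()` does not help in the forgetful case, because the object is still reachable through the registry.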

The js.io (I'm on 0.7.10) seems to be pretty stable and well aware of
connection losses, both by the client and the server. The question is:
how aware is the server code of connection losses? Can Twisted really
track a lost connection properly?

Memory usage on the server sits at 1.5G, and it has not increased in
a long, long time (over 6 months of running Orbited). I'm using
ActiveMQ though. Are you guys using Orbited's MorbidQ?

Here's the pmap output for my orbited daemon:
>>pmap 20024
20024: /usr/bin/python /usr/bin/orbited -c auction_orbited.cfg
08048000 992K r-x-- /usr/bin/python2.5
08140000 148K rw--- /usr/bin/python2.5
08165000 14328K rw--- [ anon ]
b6700000 132K rw--- [ anon ]
b6721000 892K ----- [ anon ]
b68c6000 68K r-x-- /usr/lib/python2.5/lib-dynload/cPickle.so
b68d7000 4K rw--- /usr/lib/python2.5/lib-dynload/cPickle.so
b68d8000 4K ----- [ anon ]
b68d9000 8192K rw--- [ anon ]
b70d9000 4K ----- [ anon ]
b70da000 8580K rw--- [ anon ]
b793b000 1192K r-x-- /usr/lib/i686/cmov/libcrypto.so.0.9.8
b7a65000 84K rw--- /usr/lib/i686/cmov/libcrypto.so.0.9.8
b7a7a000 792K rw--- [ anon ]
b7b4b000 36K r-x-- /lib/tls/i686/cmov/libnss_files-2.7.so
b7b54000 8K rw--- /lib/tls/i686/cmov/libnss_files-2.7.so
b7b56000 32K r-x-- /lib/tls/i686/cmov/libnss_nis-2.7.so
b7b5e000 8K rw--- /lib/tls/i686/cmov/libnss_nis-2.7.so
b7b60000 80K r-x-- /lib/tls/i686/cmov/libnsl-2.7.so
b7b74000 8K rw--- /lib/tls/i686/cmov/libnsl-2.7.so
b7b76000 8K rw--- [ anon ]
b7b78000 28K r-x-- /lib/tls/i686/cmov/libnss_compat-2.7.so
b7b7f000 8K rw--- /lib/tls/i686/cmov/libnss_compat-2.7.so
b7b81000 260K rw--- [ anon ]
b7bc8000 12K r-x-- /lib/libuuid.so.1.2
b7bcb000 4K rw--- /lib/libuuid.so.1.2
b7bd0000 80K r-x-- /usr/lib/python2.5/lib-dynload/_ctypes.so
b7be4000 8K rw--- /usr/lib/python2.5/lib-dynload/_ctypes.so
b7be6000 8K r-x-- /usr/lib/python2.5/lib-dynload/_hashlib.so
b7be8000 4K rw--- /usr/lib/python2.5/lib-dynload/_hashlib.so
b7be9000 4K r-x-- /usr/lib/python2.5/site-packages/twisted/protocols/_c_urlarg.so
b7bea000 4K rw--- /usr/lib/python2.5/site-packages/twisted/protocols/_c_urlarg.so
b7beb000 12K r-x-- /usr/lib/python2.5/lib-dynload/termios.so
b7bee000 8K rw--- /usr/lib/python2.5/lib-dynload/termios.so
b7bf0000 8K r-x-- /usr/lib/python2.5/lib-dynload/_heapq.so
b7bf2000 8K rw--- /usr/lib/python2.5/lib-dynload/_heapq.so
b7bf4000 4K r-x-- /usr/lib/python2.5/lib-dynload/_bisect.so
b7bf5000 4K rw--- /usr/lib/python2.5/lib-dynload/_bisect.so
b7bf6000 20K r-x-- /usr/lib/python2.5/lib-dynload/itertools.so
b7bfb000 8K rw--- /usr/lib/python2.5/lib-dynload/itertools.so
b7bfd000 16K r-x-- /usr/lib/python2.5/lib-dynload/collections.so
b7c01000 4K rw--- /usr/lib/python2.5/lib-dynload/collections.so
b7c02000 8K r-x-- /usr/lib/python2.5/lib-dynload/grp.so
b7c04000 4K rw--- /usr/lib/python2.5/lib-dynload/grp.so
b7c05000 12K r-x-- /usr/lib/python2.5/lib-dynload/select.so
b7c08000 4K rw--- /usr/lib/python2.5/lib-dynload/select.so
b7c09000 12K r-x-- /usr/lib/python2.5/lib-dynload/fcntl.so
b7c0c000 4K rw--- /usr/lib/python2.5/lib-dynload/fcntl.so
b7c0d000 8K r-x-- /usr/lib/python2.5/lib-dynload/_random.so
b7c0f000 4K rw--- /usr/lib/python2.5/lib-dynload/_random.so
b7c10000 16K r-x-- /usr/lib/python2.5/lib-dynload/binascii.so
b7c14000 4K rw--- /usr/lib/python2.5/lib-dynload/binascii.so
b7c15000 12K r-x-- /usr/lib/python2.5/lib-dynload/math.so
b7c18000 4K rw--- /usr/lib/python2.5/lib-dynload/math.so
b7c19000 12K r-x-- /usr/lib/python2.5/lib-dynload/cStringIO.so
b7c1c000 4K rw--- /usr/lib/python2.5/lib-dynload/cStringIO.so
b7c1d000 12K rw--- [ anon ]
b7c20000 20K r-x-- /usr/lib/python2.5/lib-dynload/operator.so
b7c25000 4K rw--- /usr/lib/python2.5/lib-dynload/operator.so
b7c26000 20K r-x-- /usr/lib/python2.5/lib-dynload/_struct.so
b7c2b000 4K rw--- /usr/lib/python2.5/lib-dynload/_struct.so
b7c2c000 248K r-x-- /usr/lib/i686/cmov/libssl.so.0.9.8
b7c6a000 16K rw--- /usr/lib/i686/cmov/libssl.so.0.9.8
b7c6e000 12K r-x-- /usr/lib/python2.5/lib-dynload/_locale.so
b7c71000 4K rw--- /usr/lib/python2.5/lib-dynload/_locale.so
b7c72000 12K r-x-- /usr/lib/python2.5/lib-dynload/_ssl.so
b7c75000 4K rw--- /usr/lib/python2.5/lib-dynload/_ssl.so
b7c76000 44K r-x-- /usr/lib/python2.5/lib-dynload/_socket.so
b7c81000 12K rw--- /usr/lib/python2.5/lib-dynload/_socket.so
b7c84000 16K r-x-- /usr/lib/python2.5/site-packages/zope/interface/_zope_interface_coptimizations.so
b7c88000 4K rw--- /usr/lib/python2.5/site-packages/zope/interface/_zope_interface_coptimizations.so
b7c89000 60K r-x-- /usr/lib/python2.5/lib-dynload/datetime.so
b7c98000 12K rw--- /usr/lib/python2.5/lib-dynload/datetime.so
b7c9b000 16K r-x-- /usr/lib/python2.5/lib-dynload/strop.so
b7c9f000 8K rw--- /usr/lib/python2.5/lib-dynload/strop.so
b7ca1000 12K r-x-- /usr/lib/python2.5/lib-dynload/time.so
b7ca4000 8K rw--- /usr/lib/python2.5/lib-dynload/time.so
b7ca6000 80K r-x-- /usr/lib/libz.so.1.2.3.3
b7cba000 4K rw--- /usr/lib/libz.so.1.2.3.3
b7cbb000 16K r-x-- /usr/lib/python2.5/lib-dynload/zlib.so
b7cbf000 4K rw--- /usr/lib/python2.5/lib-dynload/zlib.so
b7cc0000 28K r--s- /usr/lib/gconv/gconv-modules.cache
b7cc7000 252K r---- /usr/lib/locale/en_US.utf8/LC_CTYPE
b7d06000 524K rw--- [ anon ]
b7d89000 1316K r-x-- /lib/tls/i686/cmov/libc-2.7.so
b7ed2000 4K r---- /lib/tls/i686/cmov/libc-2.7.so
b7ed3000 8K rw--- /lib/tls/i686/cmov/libc-2.7.so
b7ed5000 12K rw--- [ anon ]
b7ed8000 140K r-x-- /lib/tls/i686/cmov/libm-2.7.so
b7efb000 8K rw--- /lib/tls/i686/cmov/libm-2.7.so
b7efd000 8K r-x-- /lib/tls/i686/cmov/libutil-2.7.so
b7eff000 8K rw--- /lib/tls/i686/cmov/libutil-2.7.so
b7f01000 4K rw--- [ anon ]
b7f02000 8K r-x-- /lib/tls/i686/cmov/libdl-2.7.so
b7f04000 8K rw--- /lib/tls/i686/cmov/libdl-2.7.so
b7f06000 80K r-x-- /lib/tls/i686/cmov/libpthread-2.7.so
b7f1a000 8K rw--- /lib/tls/i686/cmov/libpthread-2.7.so
b7f1c000 16K rw--- [ anon ]
b7f20000 4K r-x-- /usr/lib/python2.5/lib-dynload/_weakref.so
b7f21000 4K rw--- /usr/lib/python2.5/lib-dynload/_weakref.so
b7f22000 8K rw--- [ anon ]
b7f24000 4K r-x-- [ anon ]
b7f25000 104K r-x-- /lib/ld-2.7.so
b7f3f000 8K rw--- /lib/ld-2.7.so
bfc25000 176K rw--- [ stack ]
total 39604K


On Jun 22, 1:32 pm, Michael Carter <cartermich...@gmail.com> wrote:
> ...
> > [1] http://webcache.googleusercontent.com/search?q=cache:P3bHJILhBMgJ:blo...
> > [2] http://mg.pov.lt/objgraph/

Niklas B

Jun 23, 2010, 3:41:21 AM
to Orbited Discussion
Desmaj: I'll make sure to read the post

Mario: Great suggestion with Manhole, feels like the exact thing I
need.

I'll see what results I get. I might take a stab at csp as well to
see if I can figure out more by checking garbage collection, or if I
find some other issues.

Niklas B

Jun 23, 2010, 8:09:38 AM
to Orbited Discussion
Python: 2.5.1
Orbited: 0.8.0beta1dev-py2.5(.egg)

Using echo factory:

from twisted.internet import reactor, protocol
from csp.twisted.port import CometPort

class Echo(protocol.Protocol):
    #def dataReceived(self, data):
    #    self.transport.write(data)

    def connectionMade(self):
        print "made"
        self.transport.loseConnection()

    # Twisted passes the disconnect reason to connectionLost
    def connectionLost(self, reason):
        print "lost"


class EchoFactory(protocol.Factory):
    protocol = Echo

if __name__ == "__main__":
    print "echo listening on CSP@8000"
    reactor.listenWith(CometPort, port=8000, factory=EchoFactory())
    reactor.run()

My results are not obvious. The numbers are taken at four points: with
zero connections, after 1 connect and automatic disconnect, after 2
connects and automatic disconnects, and finally after a manual
gc.collect() (which returned 88 but didn't reduce RAM usage).

http://static.unstable.snyggastchatten.se/static/orbited-objgraph.html

I'll keep digging =)
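
One way to make before/after comparisons like the ones above easier to read is to diff per-type object counts between two snapshots. This is a sketch using only the standard library, written for a current Python 3. `snapshot` and `growth` are hypothetical helpers, and the `Leaky` class merely simulates objects that survive a connect/disconnect cycle.

```python
import gc
from collections import Counter


def snapshot():
    """Count live gc-tracked objects by type name."""
    gc.collect()  # drop collectable garbage first so only survivors are counted
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth(before, after):
    """Return {type name: increase} for every type whose count went up."""
    return dict((name, after[name] - before[name])
                for name in after if after[name] > before[name])


class Leaky(object):
    """Simulated object that is kept alive after its connection is gone."""


survivors = []

if __name__ == "__main__":
    before = snapshot()
    # Stand-in for 20 connect/disconnect cycles that each leave one object behind.
    survivors.extend(Leaky() for _ in range(20))
    after = snapshot()
    for name, delta in sorted(growth(before, after).items(),
                              key=lambda item: -item[1]):
        print("%-20s +%d" % (name, delta))
```

Types that keep growing across several rounds of the same test are the ones worth inspecting more closely.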

Niklas B

Jun 23, 2010, 8:53:06 AM
to Orbited Discussion

Niklas B

Jun 29, 2010, 5:48:25 AM
to Orbited Discussion
Can someone who is not having these issues post:

1. Their setup (type of Linux, reactor, etc.)
2. Orbited version
3. Python version

If you are only running one Orbited instance, this should work to see
memory usage and percent of memory usage:

ps -AH v |grep orbited|grep -v grep|awk '{print $8/1024} {print $9}'

For me it's pretty easy to verify the issue: run the above command and
note the MB usage. Connect and disconnect 10 clients. Then wait a
while for the timeouts to do their magic. The number should have gone
up. For me:

Let's say I start with 15.4961 MB and I then connect/disconnect five
clients I get:

15.8203

An increase of 0.3242 MB for 5 clients, or 0.06484 MB per client. This
might seem like a small number, but if you have 5,000 users per day,
that's an increase of 324.2 MB per day that keeps accumulating.
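
The extrapolation above can be written out as a short calculation. This sketch keeps the post's own assumptions: growth is linear per connection, and 5,000 users per day means 5,000 connect/disconnect cycles. The function names are made up for illustration.

```python
def per_client_growth_mb(before_mb, after_mb, clients):
    """Average memory growth per connect/disconnect cycle, in MB."""
    return (after_mb - before_mb) / float(clients)


def projected_daily_growth_mb(per_client_mb, clients_per_day):
    """Linear projection of daily growth, assuming nothing is ever released."""
    return per_client_mb * clients_per_day


if __name__ == "__main__":
    per_client = per_client_growth_mb(15.4961, 15.8203, 5)
    daily = projected_daily_growth_mb(per_client, 5000)
    print("per client: %.5f MB" % per_client)
    print("per day at 5000 clients: %.1f MB" % daily)
```

A five-client sample is small, so repeating the measurement with more clients would show whether the per-client figure holds.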

I'm getting nowhere. I've tried different reactors, looking into the
process, and using heapy with setref() to find the issue, but
nothing. EchoFactory() has the same issue, but no other service (i.e.
monitor, static) does. Hence my suspicion about CSP. But I'm not
getting anywhere there either.

Regards,
Niklas

On Jun 23, 2:53 pm, Niklas B <biv...@gmail.com> wrote:
> Python: 2.5.1
> Orbited: 0.8.0beta1dev-py2.5(.egg)
>
> Above, I think instance and type is the same.
>
> Here is objgraph over the different stuff:
>
> http://static.unstable.snyggastchatten.se/static/objgraph/weakref.png
> http://static.unstable.snyggastchatten.se/static/objgraph/tuple.png
> http://static.unstable.snyggastchatten.se/static/objgraph/instances.png
> http://static.unstable.snyggastchatten.se/static/objgraph/function.png
>
> Here is the full output of the instance list:
>
> http://static.unstable.snyggastchatten.se/static/objgraph/instance.txt
> (http://static.unstable.snyggastchatten.se/static/objgraph/instance_sort.txt)
>
> And here is the same list again after ~20 extra clients had connected
> and disconnected.
>
> http://static.unstable.snyggastchatten.se/static/objgraph/instance20.txt
> (http://static.unstable.snyggastchatten.se/static/objgraph/instance_sort20.txt)
>
> Here is the same after gc.collect()
>
> http://static.unstable.snyggastchatten.se/static/objgraph/instance20g...

BobDobbs

Jun 29, 2010, 1:16:14 PM
to Orbited Discussion
I started up orbited fresh (1 instance) and then ran "ps -AH v | grep
orbited". The RSS was 792 and the DRS was 3240. I am running on Ubuntu
9.04, Orbited 0.7.10, Python 2.6.2 and the epoll reactor in
orbited.cfg

My test was to sign in a user... sign him out... sign in a user...
sign him out... and did this about 100 times. After all of that... I
ran ps again and the RSS was still 792 and the DRS was still 3240

However... I ran the ps command many times during this... and
sometimes I would see the numbers fluctuate by about 4... but then
bounce back to the old value.
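
For repeated measurements like these, the resident set size can also be read straight from `/proc` instead of going through the ps/grep pipeline. This is a Linux-only sketch; `parse_vmrss_kb` and `read_rss_kb` are hypothetical helper names, and the PID of the Orbited process would have to be looked up separately.

```python
import os
import re


def parse_vmrss_kb(status_text):
    """Extract the resident set size in kB from /proc/<pid>/status text."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid):
    """Linux-only: read the current RSS of the process with the given PID."""
    with open("/proc/%d/status" % pid) as handle:
        return parse_vmrss_kb(handle.read())


if __name__ == "__main__":
    # Demonstrate on the current process; skip quietly where /proc is missing.
    if os.path.exists("/proc/self/status"):
        print("RSS of this process: %s kB" % read_rss_kb(os.getpid()))
```

Logging this value before and after each batch of connects makes small fluctuations easier to tell apart from steady growth.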

Niklas B

Jul 1, 2010, 7:08:36 AM
to Orbited Discussion
Interesting,

Did you connect the user to Orbited and then through to another
service (such as STOMP)? Could you try the same thing with Orbited
0.8?

Maybe I should try upgrading to Python 2.6.2

Regards,
Niklas

Niklas B

Jul 1, 2010, 7:28:54 AM
to Orbited Discussion
If I set up Orbited 0.7.11 from BitBucket on Python 2.5 (epoll reactor)
with HTTP@8000 enabled on Debian, visit

http://IP:8000/static/demos/stomp/

and click connect, refresh and click connect again and again, my memory
usage increases from 14.9688 to 15.4805 quite fast, even if the
connection to STOMP isn't successful. Note that the connection that
seems to drain memory appears to be user -> orbited, not orbited ->
service. So connecting and refreshing a couple of times should do the
trick (while simply connecting/disconnecting to STOMP several times
without a page reload doesn't appear to change anything).

So I wonder:
Orbited 0.7.10 vs Orbited 0.7.11
Python 2.5 vs. Python 2.6


Niklas B

Jul 1, 2010, 9:37:02 AM
to Orbited Discussion
I've also tried Orbited 0.8. It clears my connections fine but still
leaks memory (with Python 2.6)


BobDobbs

Jul 1, 2010, 10:42:36 AM
to Orbited Discussion
I tried the test scenario you presented. I opened up orbited fresh and
opened up Chrome. I went to the stomp demo page and would click the
"Connect" button. The following text would show:

→ Transport openned
→ Connected as user guest

After that... I would click the reload button (not holding shift) and
click the button again.. and did this maybe 40 times. When I would
click the reload button.. I would see the following text flash by
before clearing out after the reload:

→ Transport closed (code: 201)

When opening Orbited fresh... the RSS was 792 and the DRS was 3240.
After and during the test... the number didn't change much except by
maybe 4... but would still settle back to the original numbers.
Orbited showed "new connection..." on each "Connect" click. And after
a period of time... would spit out a "connection closed from..."
message for each opened connection.

I hope I am not doing something wrong that is causing confusion but
nothing jumps out at me like a problem. And yes... in our production
app we are using stomp.


Jérémy Lal

Jul 1, 2010, 11:00:24 AM
to orbite...@googlegroups.com
I suspect the bug to not be triggered by chrome...
Maybe it happens with IE or what.
I have yet to confirm this.

Jérémy.


BobDobbs

Jul 1, 2010, 11:38:13 AM
to Orbited Discussion
Tried again with IE8 native, IE6 on a virtual machine and FF 3.6.6
native. None seemed to increase the memory numbers. What is
interesting though is both IEs and FF would immediately show
connection closed when I clicked reload... however chrome would not
and the "connection closed.." text would only show after a timeout...
perhaps 30 seconds.

Niklas B

Jul 2, 2010, 4:10:53 AM
to Orbited Discussion
I don't think it has to do with Browsers. Orbited should close these
connections anyway (and I'm using Chrome for my testing).

If I run netstat to check the number of connections I have, it goes
back to normal after they've timed out (and Orbited fires
connectionLost), but the memory usage still doesn't go down.

BobDobbs: What Twisted version are you using? You can check by
running

python -c "import twisted; print twisted.version"

BobDobbs: If you have the option, could you try with Orbited 0.7.11?

Regards,
Niklas

Michael Carter

Jul 2, 2010, 4:23:44 AM
to orbite...@googlegroups.com
Has anyone had a chance to test out memory leaks with my aforementioned method? This is a simple and sure-fire way to track down the problem in about ten minutes.

-Michael Carter


Niklas B

Jul 2, 2010, 4:43:53 AM
to Orbited Discussion
Mario,

I have tried it, but I'm not good enough to find the leak. I would be
more than happy to give you either a shell on a virtual server with
these issues or telnet access to an Orbited instance with the issue at
hand.

Regards,
Niklas

Niklas B

Jul 2, 2010, 4:44:41 AM
to Orbited Discussion
* Michael (sorry for the typo)

BobDobbs

Jul 2, 2010, 10:05:38 AM
to Orbited Discussion
@Niklas: It says I have twisted version 8.2.0 .. as for 0.7.11... I
probably won't have time until next week as I have a bucketload of
stuff to do before the holiday weekend.

@Michael: I am not a python guy... but I was able to stick code
similar to the first link you posted into orbited (I was guessing
start.py in the orbited subdir). I was able to get manhole working and
was able to telnet in. However... objgraph doesn't seem to be available
on my system (tried importing at the top). However... does your test
cause a memory leak? Or just allow you to see what is happening once
you have caused a memory leak?

For others who want to try putting manhole in... this was what I did
(my added lines have '##' next to them)...

The top of my start.py in the orbited subdir looked like this:

import os
import sys
import urlparse
##import objgraph -->> THIS FAILED FOR ME
from orbited import __version__ as version
from orbited import config
from orbited import logging

# NB: this is set after we load the configuration at "main".
logger = None

##from twisted.manhole import telnet

##def createShellServ():
##    from twisted.internet import reactor
##    print 'create shell serve'
##    factory = telnet.ShellFactory()
##    port = reactor.listenTCP(2000, factory)
##    factory.namespace['x'] = 'hello world ser'
##    factory.username = 'robw'
##    factory.password = 'robw'
##    print 'Listening 2000'
##    return port

Then... search for reactor.run() in this file... my file looked like
this afterward..

if options.profile:
    import hotshot
    prof = hotshot.Profile("orbited.profile")
    logger.info("running Orbited in profile mode")
    logger.info("for information on profiler, see http://orbited.org/wiki/Profiler")
    ##reactor.callWhenRunning(createShellServ)
    prof.runcall(reactor.run)
    prof.close()
else:
    ##reactor.callWhenRunning(createShellServ)
    reactor.run()

As mentioned above... I was able to telnet in... but objgraph does not
seem to be available on my system.
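
If objgraph cannot be imported, a rough equivalent of its most-common-types report can be built from the standard library alone. This is a minimal sketch, assuming only `gc` and `collections`. The helper name `most_common_types` is made up for illustration and is not part of Orbited or objgraph, and it only counts objects the garbage collector tracks.

```python
import gc
from collections import Counter


def most_common_types(limit=10):
    """Return (type name, count) pairs for gc-tracked objects, largest first."""
    counts = Counter(type(obj).__name__ for obj in gc.get_objects())
    return counts.most_common(limit)


if __name__ == "__main__":
    # Print a table similar in spirit to objgraph's report.
    for name, count in most_common_types():
        print("%-28s %d" % (name, count))
```

From the manhole shell the function could be pasted in and called directly. Running it twice a few hours apart and comparing the counts should show which type is growing.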

Michael Carter

Jul 2, 2010, 5:50:04 PM7/2/10
to orbite...@googlegroups.com, Mario Balibrera
Niklas was nice enough to let me study the memory leak on his production 0.8.x server, after installing manhole.

If you'd like to repeat the steps, here they are.
1) install manhole in orbited (as outlined by BobDobbs)
2) download the objgraph module and put it in the same directory as orbited's source. (I linked to it previously in this thread)
3) telnet in. Here's my session:

>>> import objgraph, gc, random
>>> objgraph.show_most_common_types()
list                       124566
tuple                      18300
dict                       13754
function                   4583
builtin_function_or_method 4545
instance                   3274
instancemethod             2837
IPv4Address                2100
weakref                    1011
CSPSession                 898
>>> random.choice(objgraph.by_type('list'))
[2887, 1, 'ODI6MSwyOlpvWm8hc255Z2dhc3RAN0ZGMUUzNEEuQjkzMEM0MzYuMTU3NThGN0YuSVAgUFJJVk1TRyAjc255Z2dhc3QgOnRhY2sganVsaWEgOikNCg==']
>>> random.choice(objgraph.by_type('list'))
[1794, 1, 'NzI6MSwyOkFuZ2VsRXllIXNueWdnYXN0QDQ4OUIwRTZBLkZERTlCQUU3LjFDNTQ2NUVFLklQIFFVSVQgOlBpbmcgdGltZW91dA0K']
>>> random.choice(objgraph.by_type('list'))
[324, 1, 'OTA6MSwyOlJvX2xsYV9tYW5uZW5fIXNueWdnYXN0QDVBMjgyMjFBLjU0QTlDN0FFLkFGQzk2QzNBLklQIFBSSVZNU0cgI3NueWdnYXN0IDpoYXIgdHLla2lndA0K']
>>> random.choice(objgraph.by_type('list'))
[1375, 1, 'OTI6MSwyOnJNeF4hcm14ZnJpYmVyZ0BjaGF0dGVuLTM5QjA1QkQ2LnRiY24udGVsaWEuY29tIFBSSVZNU0cgI3NueWdnYXN0IDpKb3JkZW4g5HIgcGxhdHQhIDoqDQo=']
>>> random.choice(objgraph.by_type('list'))
[722, 1, 'NzQ6MSwyOk5pbmExMiFzbnlnZ2FzdEBERDI4RDU0Mi43MDU0NkI1LkZCMDk5RkFELklQIFBSSVZNU0cgI3NueWdnYXN0IDpoZWogDQo=']
>>> 

My conclusion: These are CSP packets, so the memory leak is in the csp.twisted module. Probably has something to do with failing to remove references to a CSP session after close has occurred, or perhaps some condition that causes a timeout never to fire.
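
To check what such a list holds, the third element can be base64-decoded. The sketch below uses one of the lists from the session above. Reading the three elements as packet id, encoding flag and payload is an assumption based on how the lists look, not something confirmed from the csp.twisted source.

```python
import base64

# One of the leaked lists from the session above.
# Assumed layout: [packet id, encoding flag, base64 payload].
packet = [
    722,
    1,
    "NzQ6MSwyOk5pbmExMiFzbnlnZ2FzdEBERDI4RDU0Mi43MDU0NkI1LkZCMDk5RkFELklQ"
    "IFBSSVZNU0cgI3NueWdnYXN0IDpoZWogDQo=",
]

packet_id, flag, payload = packet
decoded = base64.b64decode(payload)

# Show only the framing prefix and the length, not the chat text itself.
print(packet_id, flag, decoded[:7], len(decoded))
```

The decoded bytes begin with a short numeric prefix and contain an IRC `PRIVMSG` line, which fits the reading that these are buffered messages on their way to a browser.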

NOTE: This has *absolutely nothing* to do with 0.7.x. This is only a 0.8beta bug. Someone who is running 0.7.x should try to reproduce my steps.

NOTE: The more memory it leaks the more likely this method is to reveal anything. Orbited will probably have thousands of lists before it starts leaking memory, so you need enough leaked lists (or tuples, or whatever datatype) before they start showing up from your random.choice calls.
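
Rather than eyeballing a handful of random.choice results, the sampling can be repeated many times and tallied. This is a sketch using only the standard library. `looks_like_packet` and `packet_fraction` are hypothetical helpers, and the [int, int, str] shape test is an assumption taken from the lists shown in the session above.

```python
import gc
import random


def looks_like_packet(obj):
    """True for lists shaped like the ones in the session: [int, int, str]."""
    return (
        isinstance(obj, list)
        and len(obj) == 3
        and isinstance(obj[0], int)
        and isinstance(obj[1], int)
        and isinstance(obj[2], str)
    )


def packet_fraction(samples=1000):
    """Estimate the share of gc-tracked lists that look like packets."""
    lists = [o for o in gc.get_objects() if isinstance(o, list)]
    if not lists:
        return 0.0
    picked = [random.choice(lists) for _ in range(samples)]
    hits = sum(1 for o in picked if looks_like_packet(o))
    return hits / float(len(picked))
```

A fraction close to zero on a freshly started server and a steadily rising one on a long-running server would support the packet-buffer explanation.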

Further Study: You can use the gc module, or perhaps objgraph itself to track the backrefs to the object in question. You can easily find the ref chain and understand why that object is still hanging around. Unfortunately, I managed to cause an infinite loop on Niklas's live server, so I'm done for the time being. But there is enough here to investigate.
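
For the backref step, here is a sketch that uses only the standard `gc` module and looks one level up at a time, so it cannot run away on a live server. The helper name `holders` is hypothetical. Note that `gc.get_referrers` scans every tracked object, so each call is expensive on a large process.

```python
import gc
import types


def holders(obj):
    """Return the gc-tracked containers that directly reference obj."""
    found = []
    for ref in gc.get_referrers(obj):
        if isinstance(ref, types.FrameType):
            continue  # skip stack frames, including this function's own
        found.append(ref)
    return found
```

Calling `holders` on a suspicious list, and then again on whichever result looks interesting, walks the reference chain one step per call until it reaches something recognisable such as a session object.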

Mario, you wrote this code originally, do you want to investigate?

desmaj, you up for it?

-Michael Carter


A Monkey

Jul 2, 2010, 6:01:07 PM7/2/10
to orbite...@googlegroups.com
Hi Folks,

On Fri, Jul 2, 2010 at 5:50 PM, Michael Carter <carter...@gmail.com> wrote:
> Mario, you wrote this code originally, can you want to investigate?
>
> desmaj, you up for it?

Oh sure. I should be settled down some by tomorrow so I'll look into
it then. I'll post when I've got something interesting to say.

By the way, great thread everyone. I've been a little preoccupied
lately and so I haven't really been able to focus on this as much as I
would have liked to. Thanks to Niklas for helping to coordinate
everything, thanks to Michael for stepping in to drop science and
thanks to everyone that's reported their problems and helped others
out. This is another example of why I'm proud to be a part of the
community around orbited.

Best regards,
Matthew

Niklas B

Jul 3, 2010, 8:10:29 AM7/3/10
to Orbited Discussion
Not sure if this helps, but here's a graph of a list that was not freed.

import objgraph, gc, random
objgraph.show_most_common_types()
obj = objgraph.by_type('list')[random.randint(0,48000)]
objgraph.show_backrefs([obj], max_depth=10)

http://static.unstable.snyggastchatten.se/static/objgraph/list_dependencies.png

And for "instance":

obj = objgraph.by_type('instance')[random.randint(0,30000)]
objgraph.show_backrefs([obj], max_depth=10)

http://static.unstable.snyggastchatten.se/static/objgraph/instance_dependencies.png


Niklas B

Jul 8, 2010, 8:40:54 AM7/8/10
to Orbited Discussion
The issue here is that the CSP Session is never closed (currently no idea
why). This means that its buffer will continue to build up for quite
some time, and then never be cleared.
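
The mechanism can be illustrated with a toy model. Everything below is hypothetical: `ToySession` and `reap_idle` are stand-ins written for this post, not the csp.twisted classes, and the idle-timeout sweep is only one possible way to make sure a dead session eventually goes away.

```python
class ToySession(object):
    """Hypothetical stand-in for a CSP session; not the csp.twisted class."""

    def __init__(self, key, now):
        self.key = key
        self.buffer = []      # packets written but never acknowledged
        self.last_seen = now  # last time the client was heard from

    def write(self, payload):
        # Each write adds an [id, flag, payload] list, like the leaked ones.
        self.buffer.append([len(self.buffer) + 1, 1, payload])


def reap_idle(sessions, now, timeout):
    """Delete sessions idle for longer than timeout; return the removed keys."""
    dead = [k for k, s in sessions.items() if now - s.last_seen > timeout]
    for key in dead:
        del sessions[key]
    return dead


if __name__ == "__main__":
    sessions = {"a": ToySession("a", now=0.0)}
    for i in range(1000):
        sessions["a"].write("msg %d" % i)  # the client is gone, nobody acks
    print(len(sessions["a"].buffer))
    print(reap_idle(sessions, now=60.0, timeout=30.0))
```

As long as the session stays in the registry, every buffered packet stays reachable. Once the sweep removes it, the whole buffer becomes collectable in one go.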


Niklas B

Jul 8, 2010, 11:09:21 AM7/8/10
to Orbited Discussion
Since I'm not familiar with CSP and can't work out how it detects that
a connection is dead or lost, I'm probably not going to investigate
this any deeper.

Mario/desmaj: Let me know if you would like results, or if there are
places I should try or investigate.

I'm testing a fix right now that closes the CSP session if the main
outgoing connection closes (this is not recommended, but in my case it
suits me just fine). If that stops the leak it's pretty much also
confirmed that the problem is with CSP sessions or Incoming not
closing (but I assume that's CSP).
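
The idea behind this workaround, in sketch form. The class and method names below are invented for illustration and do not match proxy.py or csp.twisted. The only point is that the backend-side object closes its paired session and drops its reference when its own connection goes away.

```python
class ToySession(object):
    """Hypothetical browser-side session; not the real CSPSession."""

    def __init__(self):
        self.buffer = []
        self.closed = False

    def close(self):
        # Release the buffered packets so they can be garbage collected.
        self.buffer = []
        self.closed = True


class ToyOutgoing(object):
    """Hypothetical backend-side connection paired with one session."""

    def __init__(self, session):
        self.session = session

    def on_connection_lost(self):
        # Workaround: the backend is gone, so close the browser side too.
        if self.session is not None and not self.session.closed:
            self.session.close()
        self.session = None  # drop the reference so nothing keeps it alive
```

This trades correctness for safety: a browser that could have reconnected to a new backend connection is cut off instead, which is why it suits a single-backend chat setup better than a general proxy.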

Regards,
Niklas

Niklas B

Aug 9, 2010, 2:27:04 AM8/9/10
to Orbited Discussion
Just a follow up for anyone reading this post later on (Google etc.)

The memory leak was traced to CSP sessions not closing because of improper
timeouts. As a result, neither the CSP Session nor Proxy.Incoming was
cleaned up.

I have an ugly fix for this in place, and mcarter has written a
proposed fix. It is bundled with a UTF-8 fix, so it is waiting on
http://groups.google.com/group/orbited-users/browse_thread/thread/7d5bafd613aea7da
before it can be properly tested.

It should also be possible to use the CSP Session fix with
Orbited 0.7.x (and possibly the UTF-8 fix as well).

Regards,
Niklas