Removing it unfortunately introduces some issues with the database pool
and postgres (though that may have more to do with how we're using the
database). Based on ticket 6614, it also looks like removing it will
introduce a number of other issues. I haven't fully digested 6614 yet;
it's pretty lengthy.
I'm just wondering if anyone has any comments or thoughts about this.
Regards,
Shane
It's been discussed not so long ago on Trac-Dev, see
http://groups.google.com/group/trac-dev/msg/9dcdaffccc74471c
As you can see there, I did some tests recently, but unfortunately not
on Linux, where the benefits of the explicit gc.collect() were the most
significant, IIRC (not sure I ever digested #6614 either ;-) ).
It would be nice to test again on Linux. Maybe other approaches are
possible, in order to lower the frequency of the collections. In any
case, the lesson learnt was not to trust Python to necessarily do the
right thing in long-running programs if it's not told to do gc
explicitly. Maybe this has changed with 2.6, though.
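One way to lower the frequency of the collections, as a minimal sketch
only: keep the explicit gc.collect(), but run it once every N requests
rather than after each one. The class name, the after_request() method
and the default interval below are all hypothetical, not anything Trac
provides.

```python
import gc
import threading


class ThrottledCollector:
    """Run gc.collect() once every `interval` requests (hypothetical helper)."""

    def __init__(self, interval=50):
        if interval < 1:
            raise ValueError("interval must be >= 1")
        self.interval = interval
        self.count = 0          # requests seen since the last collection
        self.collections = 0    # explicit collections run so far
        self._lock = threading.Lock()

    def after_request(self):
        """Call at the end of each request; True if a collection ran."""
        with self._lock:
            self.count += 1
            if self.count < self.interval:
                return False
            self.count = 0
            self.collections += 1
        # Collect outside the lock so other threads are not held up by it.
        gc.collect()
        return True
```

Whether this keeps memory in check as well as a collection per request
does would still need to be measured, on Linux in particular.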
-- Christian
Having dealt with gc issues in pyxpcom for several years now, in
long-lived multithreaded apps, I've found that Python needs to be
handled carefully if you want it to auto-gc correctly, but it can be
done and is often preferable. Our problems were even more exaggerated by
the possibility that a JavaScript or C++ component could be holding onto
Python objects, creating circular references.
I'll dig back through the other thread and the bug. I'm also going to
profile heavier templates to see what impact gc has on those.
Shane
That thread happened while I was away. The connection pooling is
definitely a problem for me and easy to reproduce, and would have to be
the first thing I fix if I remove the garbage collection.
I would suggest using ab (ApacheBench) for simple stress testing.
Shane
FYI, by removing gc and the repository sync for each request, I cut
request time roughly in half depending on the template being requested.
I'm going to be doing more testing on this, and looking at alternative
methods to handle these two items.
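For a rough feel of what an explicit collection per request costs,
something like the following could be used. It is only a sketch: the
"request" here is a hypothetical stand-in that creates a few reference
cycles, not a real Trac request, so the numbers say nothing about actual
templates.

```python
import gc
import time


def make_cycles(n):
    """Create n small reference cycles, imitating per-request garbage."""
    for _ in range(n):
        a = {}
        b = {"peer": a}
        a["peer"] = b  # a and b now reference each other


def time_requests(requests, explicit_gc, cycles_per_request=200):
    """Return the total seconds taken by `requests` simulated requests."""
    start = time.perf_counter()
    for _ in range(requests):
        make_cycles(cycles_per_request)
        if explicit_gc:
            gc.collect()  # what happens today at the end of each request
    return time.perf_counter() - start


if __name__ == "__main__":
    with_gc = time_requests(200, explicit_gc=True)
    without_gc = time_requests(200, explicit_gc=False)
    print("explicit gc.collect(): %.4fs" % with_gc)
    print("automatic gc only:     %.4fs" % without_gc)
```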
Shane
I found that with the git plugin the sync on every request was *really*
impacting performance.
I hacked up a simple option to disable this automatic sync and set it in
the trac.ini file. This really helped at runtime.
Then in a second change I hacked up a mod to the trac-admin interface to
allow the incremental sync to be called manually. This call was then
hooked up to the git receive hook (or whatever it's called) so that trac
is updated when the repository is updated, but not at any other time.
I've dug out the patches (they were a bit rough and ready but very
simple) and perhaps someone can apply them for the upcoming release (I
think they are safe enough to add, as they don't touch the default
behaviour).
Col
Index: trac/versioncontrol/api.py
===================================================================
--- trac/versioncontrol/api.py (revision 8289)
+++ trac/versioncontrol/api.py (working copy)
@@ -81,7 +81,8 @@
def pre_process_request(self, req, handler):
from trac.web.chrome import Chrome, add_warning
- if handler is not Chrome(self.env):
+ nosync = self.env.config.getbool('trac', 'disable_repository_autosync')
+ if not nosync and handler is not Chrome(self.env):
try:
self.get_repository(req.authname).sync()
except TracError, e:
Index: trac/admin/console.py
===================================================================
--- trac/admin/console.py (revision 8289)
+++ trac/admin/console.py (working copy)
@@ -645,7 +645,7 @@
config_path=os.path.join(self.envname, 'conf', 'trac.ini')))
_help_resync = [('resync', 'Re-synchronize trac with the repository'),
- ('resync <rev>', 'Re-synchronize only the given <rev>')]
+ ('resync <rev>', 'Re-synchronize only the given <rev> (if rev is "--latest" it will just sync the repository)')]
def _resync_feedback(self, rev):
sys.stdout.write(' [%s]\r' % rev)
@@ -658,6 +658,10 @@
if argv:
rev = argv[0]
if rev:
+ if "--latest" == rev:
+ repos = env.get_repository().sync(self._resync_feedback)
+ printout(_("Done."))
+ return
env.get_repository().sync_changeset(rev)
printout(_("%(rev)s resynced.", rev=rev))
return
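With the patch above applied, the hook side could look roughly like
this. It is a sketch under assumptions: the environment path is a
placeholder, run_hook() and resync_command() are hypothetical names, and
"resync --latest" only exists once the trac-admin change above is in
place.

```python
import subprocess

# Hypothetical path; point this at the real Trac environment.
TRAC_ENV = "/path/to/trac/env"


def resync_command(env_path, trac_admin="trac-admin"):
    """Build the trac-admin call that runs the incremental sync."""
    return [trac_admin, env_path, "resync", "--latest"]


def run_hook(env_path, runner=subprocess.call):
    """Run the sync and return trac-admin's exit status.

    `runner` is injectable so the hook can be tried without trac-admin.
    """
    return runner(resync_command(env_path))
```

The hook script itself would then end with
sys.exit(run_hook(TRAC_ENV)), and trac.ini would need
disable_repository_autosync = true in the [trac] section so that the
per-request sync is actually skipped.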
--
Colin Guthrie
The multirepos branch already solves this by requiring post-commit hooks
in all repositories to call "trac-admin changeset added", which triggers
the sync. The default repository is still synced at every request (for
backward compatibility), but you can choose not to have a default
repository at all. Maybe we should even allow disabling that per-request
sync with an option in trac.ini.
-- Remy
--Noah