I've just upgraded t.e.o to Trac 0.11 beta1. So far everything appears 
to be working but please let me know if you notice anything broken by 
this change.
Cheers,
Jonas
I found a small issue with Opera; can anyone else with a 0.11 beta verify
this?
Go to newticket and click the component drop-down menu.
In Opera 9.24 I see the selected option jump to the top choice. Even if
I select another option, click somewhere else in the chrome to unfocus the
drop-down menu and then click the drop-down menu again, it will jump to the
top option, 'attachment'.
In Opera 9.50, watch the 'general' selection get cleared. Even if I select
another option, click somewhere else in the chrome to unfocus the drop-down
menu and then click the drop-down menu again, it will clear the selection
again.
Under Firefox 2 this does not occur and 'general' stays selected.
-- 
Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
イェルーン ラウフロック ヴァン デル ウェルヴェン
http://www.in-nomine.org/ | http://www.rangaku.org/
In every stone sleeps a crystal...
I can confirm this for 9.24, but I only see this behaviour on t.e.o, not
on my local test site. Even if I add enough components so that a scrollbar
is displayed, it works correctly. The only difference is that on my
test site there is enough room underneath the select box for the
options to be displayed below it, whereas on t.e.o the options are
displayed above the select box. To me this indicates an Opera bug, not a
Trac bug?
I noticed another detail, however. When I add components in the admin page
and then click "New Ticket", the new components are not displayed in the
dropdown until I do a refresh of the whole page. Cache problem or
something?
- Endre
I've not yet identified which pages require this much memory but I would 
not be surprised if it's some of the more expensive versioncontrol 
pages.
As far as I know, memory allocated by the Python interpreter is almost
never returned to the operating system, so a single memory-hungry page
can cause this type of problem when using long-running processes such as
fastcgi and tracd.
As a comparison, with Trac 0.10 each Trac process on t.e.o never seems to
allocate more than 100MB. And this server is 64-bit, so memory usage
should be even lower on 32-bit systems.
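For what it's worth, a cheap way to track this per request rather than by
watching top is to log the resident set size of the current process from the
FastCGI dispatcher after each request. A minimal sketch, assuming a
Linux-style /proc filesystem (the helper name is made up):

import os

def log_rss(log, label=''):
    # Log the resident set size of the current process (Linux /proc only).
    try:
        f = open('/proc/%d/status' % os.getpid())
        try:
            for line in f:
                if line.startswith('VmRSS:'):
                    log.info('VmRSS %s: %s', label,
                             line.split(':', 1)[1].strip())
                    break
        finally:
            f.close()
    except IOError:
        pass  # no /proc available (e.g. Windows or FreeBSD)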
Some relevant reading:
http://wingolog.org/archives/2007/11/27/reducing-the-footprint-of-python-applications
Cheers,
Jonas
   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
22732 www-data  15   0 2625m 2.5g 3548 S    0 32.2 317:02.53 trac.fcgi
15142 www-data  15   0  511m 407m 3536 S    0  5.1  46:46.00 trac.fcgi
23190 www-data  15   0  437m 393m 3524 S    0  4.9  42:09.89 trac.fcgi
19822 www-data  15   0  216m 142m 3516 S    0  1.8  13:06.86 trac.fcgi
--Noah Kantrowitz
It's in trunk now, r6385.
Perhaps Noah and Jonas can also give this latest trunk a spin and see how it
handles itself.
All my testing could never get it above 30 MB; I guess my test set is too
small to trigger anything.
Christopher pointed me at Funkload and I started writing tests for this to
test Trac installs with it, so hopefully we can use some automated testing
soon as well to determine weak points.
I just wish there was a clearer way to deal with this programmatically instead
of jumping through many, many arcane hoops. I can debug C and related memory
issues easily, so I'll gladly receive hints/clues/whatsoever.
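One relatively cheap trick that needs no C-level tooling is to count live
objects by type between requests and watch which counts keep growing. A
generic sketch (nothing Trac-specific, the helper name is made up):

import gc

def object_counts(top=20):
    # Return the most common object types currently tracked by the garbage
    # collector; call this periodically and diff the output between requests.
    counts = {}
    for obj in gc.get_objects():
        name = type(obj).__name__
        counts[name] = counts.get(name, 0) + 1
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:top]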
-- 
Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
イェルーン ラウフロック ヴァン デル ウェルヴェン
http://www.in-nomine.org/ | http://www.rangaku.org/
Wisdom is the difference between knowledge and experience...
> -On [20080109 16:56], David Abrahams (da...@boost-consulting.com) wrote:
>>I think we already found and submitted a patch for this problem:
>>http://trac.edgewall.org/ticket/6581
>
> It's in trunk now, r6385.
>
> Perhaps Noah and Jonas can also give this latest trunk a spin and see how it
> handles itself.
>
> All my testing could never get it above 30 MB since I guess my
> testset is too small to trigger anything.
>
> Christopher pointed me at Funkload and I started writing tests for
> this to test Trac installs with it, so hopefully we can use some
> automated testing soon as well to determine weak points.
>
> I just wish there was a clearer way to deal with this
> programmatically instead of jumping through many, many arcane
> hoops. I can debug C and related memory issues easily, so I'll
> gladly receive hints/clues/whatsoever.
I think this is just one of the upshots of using a GC language where
"you don't have to worry about memory management."  You just end up
with different kinds of worries.
-- 
Dave Abrahams
Boost Consulting
Just for the record: I tested latest trunk again but didn't notice any
improvement. This time there was more traffic on the server than last
week, so it took no longer than about 30 minutes for the server to
start running out of memory. When I aborted this experiment each fcgi
process was using about 3 times more resident memory (325MB) compared to
0.10.x (around 100MB after running for a week).
Cheers,
Jonas
Erik
On Feb 1, 2008 9:17 PM, rupert....@gmail.com
You're right on both points. Maybe you could reopen #G166, for 
completeness, so that this format leak can be fixed as well. But in my 
tests, I've been careful to disable the _speedups extension, so it's not 
related to THE leak ;-)
By the way, I'm not sure I found THE leak, but at least an important
one. I'm still trying to work my way through the generator chain in
Genshi to understand more precisely what the problem is, but things go
really bad memory-wise once an exception is raised from an expression
evaluation (and it doesn't seem to be related to what we do with that
exception later at the Trac level). I'll update #6614 when I have new
findings.
-- Christian
I think I read somewhere that using yield and getting exceptions might be
problematic when it comes to leaving around stray references. Perhaps this
might help you?
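For illustration, here's a toy example (not Genshi code) of the kind of stray
reference I mean: a saved traceback keeps every frame in it alive, including
the frames' local variables.

import gc
import sys

def swallow_errors(callables):
    # A generator that catches exceptions raised by the values it produces.
    for func in callables:
        try:
            yield func()
        except Exception:
            # Keeping the traceback around (directly, or via a stored
            # exc_info() tuple) keeps all of its frames, and therefore
            # all of their local variables, alive.
            sys.last_traceback = sys.exc_info()[2]

def make_item():
    big_local = [0] * (10 ** 6)   # a large local held only by this frame
    raise ValueError('boom')

list(swallow_errors([make_item]))
gc.collect()
# The big list is still reachable through the saved traceback's frames.
print(any(isinstance(o, list) and len(o) >= 10 ** 6 for o in gc.get_objects()))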
-- 
Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
イェルーン ラウフロック ヴァン デル ウェルヴェン
http://www.in-nomine.org/ | http://www.rangaku.org/
We have met the enemy and they are ours...
Great work closing in on the problem, Christian! I tried to reproduce
this with a somewhat smaller example, but failed so far (also, I
unfortunately have very little time right now).
So what can this be aside from stack frame data hanging around in a  
bad way? The various generators don't have much local data, and global  
data shouldn't be used at all. I'm lost.
If you can reliably reproduce this problem with a local data set (do  
you have some instructions maybe?), could you maybe try whether it  
also happens with Genshi 0.4.4? I'd like to know whether this problem  
is new in 0.5dev.
Thanks,
Chris
--
Christopher Lenz
   cmlenz at gmx.de
   http://www.cmlenz.net/
Unfortunately the same goes for me (plus the fact that at times I needed to
take a break from those daunting debugging sessions and work on features
I needed, like the wiki rename and multirepos support).
Also, I still haven't tried to reproduce the issue on a smaller example, 
but of course the problem might relate to the Trac/Genshi interaction 
rather than to Genshi alone.
> So what can this be aside from stack frame data hanging around in a  
> bad way? The various generators don't have much local data, and global  
> data shouldn't be used at all. I'm lost.
>   
From my very latest findings, it appears that there are two issues,
probably rooted in the same cause.
1. At the very least, the `data` dictionary of every request gets
accumulated over time.
This can be hard to spot if the data corresponds to "cheap" pages, but
it can amount to hundreds of MB per request if that data keeps costly
stuff alive. Such expensive queries are, for example, timeline
requests. An example is to look 3 years back in the timeline of an
environment set up to browse a copy of the t.e.o repository, using the
direct-svnfs access type (i.e. no cache involved). There's no need to list
_lots_ of changesets; merely having to go through all the changesets to
get there accumulates data in pools that are kept alive because the
changesets themselves are kept alive (which in itself can be considered
another bug at the svn_fs level).
But even if we fix the SVN issue, or don't get there in the first place,
there's a kind of leak going on for every request. That use case just
makes it *very* apparent.
2. If there's an exception raised from a code block (Expression or
Suite), it's even worse. The *whole traceback* gets accumulated, i.e.
all frames, so all objects referenced by local variables in any of the
frames up to the evaluating one will be kept alive for the rest of the
lifespan of the process. In this case, of course, the memory usage
increases even more dramatically than in case 1.
To be clear, when I say "accumulated" in the above, it's really that 
those objects are considered live by the GC, not unreachable. I do a 
gc.collect() with the SAVE_ALL flag set, after every request, and 
collect() says we have no unreachable objects. This means that 
apparently we're doing the right thing for our __del__ stuff.
<some-more-gory-details>
Now the weird thing is that when I debug case (1) using a call to
gc.get_referrers() on the 'data' object, there are 3 referrers to it:
the local frame (fine), the list of live objects as obtained with
gc.get_objects() (also fine), plus a very curious one.
That last referrer is most of the time a dict object which looks like the
session, but is not the session - I've modified the Session so that it's
no longer a dict itself but an object delegating its dict-like methods
to a self._dict dictionary. Even that new _dict field shows up as a key
in this dict-like session, so I'm sure the new code is picked up (btw, I
did this change because of http://bugs.python.org/issue1469629). But it
happened a few times that this third object was an unrelated <Request
"..../trac.css"> object, and sometimes another list with a huge pile of
crap inside (not the live objects list). Weird. Not to mention that this
dict doesn't appear to even contain an entry for 'data'...
Ok, those were the very latest findings of yesterday evening (quite 
late!) and I don't know how much I can trust get_referrers(), so it 
might well be (just another) wrong track...
</some-more-gory-details>
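For reference, that kind of probe boils down to a few lines of gc usage (a
hypothetical helper, just to show the mechanics):

import gc

def who_refers_to(obj, limit=10):
    # Print the type and a truncated repr of each object referring to `obj`.
    # Note: the frame of this function itself also refers to `obj` and will
    # show up in the results; it can be ignored.
    gc.collect()
    referrers = gc.get_referrers(obj)
    for ref in referrers[:limit]:
        print('%s %s' % (type(ref).__name__, repr(ref)[:120]))
    print('%d referrers in total' % len(referrers))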
> If you can reliably reproduce this problem with a local data set (do  
> you have some instructions maybe?), could you maybe try whether it  
> also happens with Genshi 0.4.4? I'd like to know whether this problem  
> is new in 0.5dev.
>   
Great advice, I'll try with 0.4.4.
Also, I'll post some of the patches I used for debugging case (1) and
triggering case (2) on #6614 later today.
-- Christian
Crap, I meant DEBUG_UNCOLLECTABLE and s/unreachable/uncollectable/ in
the above, as that's how I tested it on Linux (as reported in #6614);
yesterday on Windows I used the above wrong flag, which means...
well, the whole case 1. was complete crap (except the remark about
get_changes(), which hints at a possible optimization in svn_fs.py).
This is the correct explicit_gc.diff patch which I should have used:
Index: trac/web/main.py
===================================================================
--- trac/web/main.py    (revision 6489)
+++ trac/web/main.py    (working copy)
@@ -390,6 +390,16 @@
     finally:
         if env and not run_once:
             env.shutdown(threading._get_ident())
+            # Now it's a good time to do some clean-ups
+            import gc
+            gc.disable()
+            gc.set_debug(gc.DEBUG_UNCOLLECTABLE)
+            unreachable = gc.collect()
+            env.log.info("%d unreachable objects found.", unreachable)
+            uncollectable = len(gc.garbage)
+            if uncollectable:
+                del gc.garbage[:]
+                env.log.warn("%d uncollectable objects found.", uncollectable)
 def _dispatch_request(req, env, env_error):
     resp = []
With that patch, there's no leak and constant memory usage for the 
scenario 1.
Sorry for the confusion.
Case 2. still stands.
-- Christian
How bad is the leakage at this point?
I mean, might it make sense to release 0.11.0 now to get the release out the
door and continue to look for these more elusive cases for a 0.11.1 release?
I am just trying to get a sense of the choice between finally releasing 0.11
and stability for the end-users.
-- 
Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
イェルーン ラウフロック ヴァン デル ウェルヴェン
http://www.in-nomine.org/ | http://www.rangaku.org/
All for one, one for all...
I'm a bit short on time these days, so I can't really say /when/ I'll be 
able to sit down again and try to fix the issue with Genshi. On the Trac 
side, it's quite clear what has to be done (looking at 
http://trac.edgewall.org/ticket/6614#comment:18):
 1. do an explicit garbage collection after each request 
(http://trac.edgewall.org/attachment/ticket/6614/explicit_gc.diff)
 2. tune the pool usage in the svn backend:
     a. for the _history call 
(http://trac.edgewall.org/attachment/ticket/6614/fix-_history.diff)
     b. for listing long directories (#5782)
I can commit 1. and 2.a. shortly, and with that (plus using Python 2.5 
on the server), things will start to look quite acceptable.
2.b. needs more work, but it is not a blocker, I think, so it could well
wait for 0.11.1 (or 0.12 if I end up reworking the Pool code entirely).
Now, if for any reason the server hits the problem with Genshi 
(http://trac.edgewall.org/ticket/6614#comment:40), the memory usage will 
increase dramatically again. To fix that one, I think the approach 
outlined in the last paragraph of 
http://trac.edgewall.org/ticket/6614#comment:38 will work, I only need 
some more time (or someone else to do it, of course).
> I mean, might it make sense to release 0.11.0 now to get the release out the
> door and continue to look for these more elusive cases for a 0.11.1 release?
>   
Well, it's more a matter of fixing the Genshi issue for 0.5, then making
Trac 0.11 depend on that version. As I said in comment:38 (linked
above), 0.4.x has the #G158 bug which makes Trac unreliable in a
multithreaded setup, so I wouldn't recommend using it.
> I am just trying to get a sense of the choice between finally releasing 0.11
> and stability for the end-users.
>   
Perhaps we can make another beta after the two patches on #6614 (listed 
in 1. and 2.a above) are committed, and see how things work in practice?
-- Christian
By all means. :)
>2.b. needs more work, but is not a blocker, I think, so it could well 
>wait 0.11.1 (or 0.12 if I end up reworking the Pool code entirely).
>
>Now, if for any reason the server hits the problem with Genshi 
>(http://trac.edgewall.org/ticket/6614#comment:40), the memory usage will 
>increase dramatically again. To fix that one, I think the approach 
>outlined in the last paragraph of 
>http://trac.edgewall.org/ticket/6614#comment:38 will work, I only need 
>some more time (or someone else to do it, of course).
Well, I started yesterday looking at the remaining tickets for 0.11 in order
to close them, but it seems you went through half of them after Odd Simon and
I closed one. :)
I'll check the tickets' full history when I get home.
>> I mean, might it make sense to release 0.11.0 now to get the release out the
>> door and continue to look for these more elusive cases for a 0.11.1 release?
>
>Well, it's a more a matter of fixing the Genshi issue for 0.5, then make 
>Trac 0.11 depend on that version. As I said in the comment:38 (linked 
>above), 0.4.x has the #G158 bug which makes Trac unreliable in 
>multithreaded setup, so I wouldn't recommend using it.
Ok, but then releasing Genshi 0.5 becomes a blocking priority before
releasing Trac 0.11, since we need to get the eggs, especially for Windows
with speedups, out there.
>> I am just trying to get a sense of the choice between finally releasing 0.11
>> and stability for the end-users.
>
>Perhaps we can make another beta after the two patches on #6614 (listed 
>in 1. and 2.a above) are committed, and see how things work in practice?
Sure, I think our encountered issues warrant another beta regardless.
-- 
Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
イェルーン ラウフロック ヴァン デル ウェルヴェン
http://www.in-nomine.org/ | http://www.rangaku.org/
There is time in life for everything...
Jeroen Ruigrok van der Werven wrote:
>>> I mean, might it make sense to release 0.11.0 now to get the release out the
>>> door and continue to look for these more elusive cases for a 0.11.1 release?
>>>       
>> Well, it's a more a matter of fixing the Genshi issue for 0.5, then make 
>> Trac 0.11 depend on that version. As I said in the comment:38 (linked 
>> above), 0.4.x has the #G158 bug which makes Trac unreliable in 
>> multithreaded setup, so I wouldn't recommend using it.
>>     
>
> Ok, but then releasing Genshi 0.5 becomes a blocking priority before
> releasing Trac 0.11, since we need to get the eggs, especially for Windows
> with speedups, out there.
>   
Thanks to Alec, who managed to write a Genshi-only test program
exhibiting the problem, we could narrow the specific issue of memory
constantly growing when an exception is raised from a template down to
a Genshi bug. Well, I have the feeling it must really be a Python bug,
but at least it's triggered at (and can be worked around at)
the Genshi level. See #G190.
But that issue is not the only problem we have w.r.t. high memory
usage. I think it's still relatively easy to make the memory usage grow
to a high level where it will stay forever, even with Python 2.5. That
has to be improved further.
>>> I am just trying to get a sense of the choice between finally releasing 0.11
>>> and stability for the end-users.
>>>       
>> Perhaps we can make another beta after the two patches on #6614 (listed 
>> in 1. and 2.a above) are committed, and see how things work in practice?
>>     
>
> Sure, I think our encountered issues warrant another beta regardless.
>
>   
The current state of affairs is, however, quite a bit better than it used to
be, so we can probably prepare a beta2 even if #6614 is not completely fixed
at this point. Thoughts?
-- Christian
>
> Some updates on the elusive #6614 bug
>
> Jeroen Ruigrok van der Werven wrote:
>>>> I mean, might it make sense to release 0.11.0 now to get the  
>>>> release out the
>>>> door and continue to look for these more elusive cases for a  
>>>> 0.11.1 release?
>>>>
>>> Well, it's a more a matter of fixing the Genshi issue for 0.5,  
>>> then make
>>> Trac 0.11 depend on that version. As I said in the comment:38  
>>> (linked
>>> above), 0.4.x has the #G158 bug which makes Trac unreliable in
>>> multithreaded setup, so I wouldn't recommend using it.
>>>
>>
>> Ok, but then releasing Genshi 0.5 becomes a blocking priority before
>> releasing Trac 0.11, since we need to get the eggs, especially for  
>> Windows
>> with speedups, out there.
>>
>
> Thanks to Alec who managed to write a Genshi-only test program
> exhibiting the problem, we could narrow down the specific issue about
> the memory constantly growing when an exception was raised from a
> template to a Genshi bug. Well, I have the feeling it must rather be a
> Python bug,
I'm not sure if this is related at all, since you're talking about
0.11 and I'm running 0.10-stable, but I have had a two-year issue with
memory and CPU usage getting out of control at dsource.org.  With some
help, gdb, and Python compiled with symbols, we tracked all the activity
to the _sre.c file (Python 2.4.4 source, but the issue is in 2.5 as well).
This is the regexp portion of Python, and it seems to go into an
endless loop.  CPU spikes and the system starts swapping.  I actually
have a cronjob that restarts Apache if utilization is too high... :(
We didn't figure out what part of Trac triggered this within Python,
but I thought I'd mention it for this thread.  I had always assumed it
was my use of a CentralDatabaseManager customization that I've made to
the Trac code base.  You're talking about high memory usage, and not
so much CPU, so I don't know if it's related or not...
> We didn't figure out what part of Trac triggered this within Python,  
> but I thought I'd mention it for this thread.  I had always assumed it  
> was my use of a CentralDatabaseManager customization that I've made to  
> the Trac code base.  You're talking about high memory usage, and not  
> so much CPU, so I don't know if it's related or not...
We've also had periodic runaway Trac processes.  I just had to kill this
one:
    PID USERNAME   THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
  84924 www          4  20    0   116M  2848K kserel 0  40.0H 98.73% python
This is with 0.11-dev.
-- 
Dave Abrahams
Boost Consulting
If this happens again, please attach to the process with gdb before 
killing it.
The backtrace could give a hint about what's going on.
-- Christian
Will do.
This is what happened:
  $ gdb 
  GNU gdb 6.1.1 [FreeBSD]
  Copyright 2004 Free Software Foundation, Inc.
  GDB is free software, covered by the GNU General Public License, and you are
  welcome to change it and/or distribute copies of it under certain conditions.
  Type "show copying" to see the conditions.
  There is absolutely no warranty for GDB.  Type "show warranty" for details.
  This GDB was configured as "amd64-marcel-freebsd".
  (gdb) attach 3302
  Attaching to process 3302
  /usr/src/gnu/usr.bin/gdb/libgdb/../../../../contrib/gdb/gdb/solib-svr4.c:1443: internal-error: legacy_fetch_link_map_offsets called without legacy link_map support enabled.
  A problem internal to GDB has been detected,
  further debugging may prove unreliable.
  Quit this debugging session? (y or n) y
So now I need to try to figure out how to enable legacy link_map
support.
> If you answer "n" to quit and then "n" to creating a core file, it
> will continue 
It did for me.
> (and seems to give reasonable information).
It didn't for me: a neverending backtrace with no symbols.
> On my FreeBSD box I get the same behavior as you by attaching without
> specifying the executable itself.  Try:
>
> gdb -p $pid `which python`
Hmm, next time, then. Thanks.
> If you answer "n" to quit and then "n" to creating a core file, it
> will continue (and seems to give reasonable information).
>
> On my FreeBSD box I get the same behavior as you by attaching without
> specifying the executable itself.  Try:
>
> gdb -p $pid `which python`
This Trac process was occupying 400MB but 0% CPU when I stopped
it:
$ gdb -p 94603 `which python`
  GNU gdb 6.1.1 [FreeBSD]
  Copyright 2004 Free Software Foundation, Inc.
  GDB is free software, covered by the GNU General Public License, and you are
  welcome to change it and/or distribute copies of it under certain conditions.
  Type "show copying" to see the conditions.
  There is absolutely no warranty for GDB.  Type "show warranty" for details.
  This GDB was configured as "amd64-marcel-freebsd"...(no debugging symbols found)...
  Attaching to program: /usr/local/bin/python, process 94603
  ptrace: Operation not permitted.
  /usr/home/dave/94603: No such file or directory.
  (gdb) bt
  No stack.
  (gdb) 
One more note: there were two others weighing in at 225MB or so.  These
processes linger for a good long while (maybe 30 sec for the longest)
after I stop lighttpd, but they will exit on their own eventually
without being explicitly killed.
It's also possible that you're hitting some Python "bugs" [1] triggered by
Genshi [2]. The Genshi ticket actually discusses two issues: one genuine
memory leak in Python's builtin groupby() and another covered by the
linked-to Python issue, where memory allocated to unicode objects is
never returned to the system.
The latter is "fixed" in Genshi r816 [3] and Trac r6757 [4], and the former
is fixed in Trac r6637 [5].
If you're not already running at least Trac r6637 and Genshi r816, you
should upgrade and see if your problem persists.
[1] http://bugs.python.org/issue2321
[2] http://genshi.edgewall.org/ticket/190
[3] http://genshi.edgewall.org/changeset/816
[4] http://trac.edgewall.org/changeset/6756
[5] http://trac.edgewall.org/changeset/6637
-- 
Evolution: Taking care of those too stupid to take care of themselves.
Last time I mentioned a possible threading/Python issue on FreeBSD, I was
told on the Python list that FreeBSD's threading has caused issues for
Python before. Now, whether this is a problem within FreeBSD or
simply something else (read: Python itself), I cannot say.
-- 
Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
イェルーン ラウフロック ヴァン デル ウェルヴェン
http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B
Nothing yet from nothing ever came...
> -On [20080617 22:20], David Abrahams (da...@boostpro.com) wrote:
>>#0  0x000000080098b85c in pthread_testcancel () from /lib/libpthread.so.2
>>#1  0x0000000800982de7 in pthread_mutexattr_init () from /lib/libpthread.so.2
>>#2  0x00000008009849cb in pthread_mutexattr_init () from /lib/libpthread.so.2
>>#3  0x000000080098799f in pthread_setconcurrency () from /lib/libpthread.so.2
>
> Last time I mentioned a possible threading/python issue on FreeBSD I got
> told on the Python list that FreeBSD's threading has caused issues for
> Python before. Now, whether or not this is a problem within FreeBSD or
> simply something else (read: Python itself), I cannot say.
Here's another stack backtrace of a Trac process taking 100% CPU.  It's
a good thing the server has two cores ;-)
I realize this seems to implicate FreeBSD threading, but it doesn't
happen to my Django processes, for example, so I'm still a little
inclined to suspect Trac.  I have a hunch that these runaway processes
start when I ask Trac for something that takes a long time, like an
old changeset, but then get impatient and cancel the request.  Is that
possible?
Thanks in advance for any help you can offer.
In fact, I can verify that viewing a changeset that touched approximately
2700 files sends CPU usage to 100%, and it seemingly never stops
until the webserver is restarted (which restarts the Trac
processes).  So that's a bug, yes?
-- 
Dave Abrahams
BoostPro Computing
http://www.boostpro.com
I would wager good money the bug is in the svn bindings. Can you repro  
it outside of Trac at all?
--Noah
Not with the regular svn client.  I guess you would suggest trying to do
it from Python with the svn bindings, eh?  Can you give me any pointers?
I've never crawled through that part of the Trac code.
Well, you may just have hit some resource limits. What's the size of the
corresponding textual diff, to get an idea? If it's in the 10MB range,
that might well be too much for Trac and Genshi.
Have a look at http://trac.edgewall.org/wiki/TracIni#changeset-section
and reduce the maximum number of files (e.g. to 500). Does the problem still
happen?
Now, on the hypothesis of an svn bindings bug, which version of the
bindings are you using? Versions >= 1.5.0 are supposed to fix a
long-standing memory leak issue (see
http://trac.edgewall.org/ticket/6588#comment:3).
-- Christian
I'm afraid I don't remember what commit I was looking at.  I could try
to reconstruct...  OK, if I got the right diff set, "svn diff | wc -c"
says it's precisely 11709506 bytes.
> If it's in the 10Mb range,
Yup.
> that might well be too much for Trac and Genshi.
I guess so!
> Have a look at http://trac.edgewall.org/wiki/TracIni#changeset-section 
> and reduce the number of max. files (e.g. 500). Does the problem still 
> happen?
Hmm, good question.
svn diff -r 2791:2796 | grep -cE '^======================'
says there are around 1929 files. OK, dropping the number to 500...
...works!
> Now, on the hypothesis of an svn bindings bug, which version of the 
> bindings are you using? Versions >= 1.5.0 are supposed to fix a long 
> standing memory leak issue (see 
> http://trac.edgewall.org/ticket/6588#comment:3).
Hum.  We're on svn 1.5, but the 'svn' module doesn't have a __version__
attribute.  The one in ports (FreeBSD) is py-subversion-1.5.4_2.  I
can't see why we'd have gone out of our way to install anything earlier
than that.
>> Have a look at http://trac.edgewall.org/wiki/TracIni#changeset-section 
>> and reduce the number of max. files (e.g. 500). Does the problem still 
>> happen?
>
> Hmm, good question.  
>
>   svn diff -r 2791:2796 | grep -cE '^======================'
>
> says there are around 1929 files.  OK, dropping the number to 500...
>
> ...works!
Couldn't Trac better protect itself from this sort of thing by using a
timeout?  It could simply bail on inlining anything if it takes more
than X seconds.  That way, one wouldn't have to try to guess at the
number of files or bytes that will cause an apparent hang on particular
hardware and source bases.
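Something like the following, for instance; all names are hypothetical, it
just illustrates the "time budget" shape such a rendering loop could take:

import time

def render_with_budget(items, render_one, budget=10.0):
    # Render items until the time budget runs out, then emit placeholders;
    # the caller would show a "diff omitted, too expensive" note for None.
    deadline = time.time() + budget
    for item in items:
        if time.time() > deadline:
            yield item, None
        else:
            yield item, render_one(item)

# Toy demonstration: a slow renderer gets cut off once the budget expires.
if __name__ == '__main__':
    slow = lambda n: (time.sleep(0.5), 'rendered %d' % n)[1]
    for name, result in render_with_budget(range(10), slow, budget=2.0):
        print('%s %s' % (name, result))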
That seems more like a job for the webserver or middleware, though a plugin
would be trivial to implement (IRequestFilter).
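The skeleton such a plugin would hang off looks roughly like this in 0.11.
Only a sketch: it merely measures and logs slow requests; the component name,
threshold and environ key are made up, and any real limiting logic would go
into these two hooks:

import time

from trac.core import Component, implements
from trac.web.api import IRequestFilter

class SlowRequestGuard(Component):
    # Sketch of a request-filter plugin that flags expensive requests.

    implements(IRequestFilter)

    THRESHOLD = 10.0  # seconds, arbitrary

    # IRequestFilter methods

    def pre_process_request(self, req, handler):
        # Remember when handling started; a real guard could also look at
        # req.path_info here and refuse obviously expensive requests.
        req.environ['guard.start'] = time.time()
        return handler

    def post_process_request(self, req, template, data, content_type):
        elapsed = time.time() - req.environ.get('guard.start', time.time())
        if elapsed > self.THRESHOLD:
            self.log.warning('Slow request %s took %.1fs',
                             req.path_info, elapsed)
        return template, data, content_type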
Jeff Hammel
IRC: jhammel, k0s
> On Fri, Jan 09, 2009 at 08:42:08PM -0500, David Abrahams wrote:
>
>> Couldn't Trac better protect itself from this sort of thing by using a
>> timeout?  It could simply bail on inlining anything if it takes more
>> than X seconds.  That way, one wouldn't have to try to guess at the
>> number of files or bytes that will cause an apparent hang on particular
>> hardware and source bases.
>
> That seems more the job for the webserver 
How is the webserver going to cause a fallback to a non-inlined display
without the user experiencing an error?
> or middleware,
I've never really understood that term, so can't comment.
> though a plugin would be trivial to implement (IRequestFilter).
Is this a case of "it can be implemented in a plugin, so it should be?"
I was under the assumption that the important part was to avoid the runaway processes. But yeah, the webserver or middleware can't magically know about Trac's internals (or really shouldn't, in any case).
 
> > or middleware, 
> 
> I've never really understood that term, so can't comment.
> 
> > though a plugin would be trivial to implement (IRequestFilter).
> 
> Is this a case of "it can be implemented in a plugin, so it should be?"
Probably? Well, yes. I make little differentiation philosophically between components that come with core and components packaged in plugins. But yeah, a plugin seems the right place to put it.
> I was under the assumption that the important part was to avoid the
> runaway processes.  But yeah, the webserver or middleware can't
> magically know about trac's internals (or really shouldn't, in any
> case).
>  
>> > or middleware, 
>> 
>> I've never really understood that term, so can't comment.
>> 
>> > though a plugin would be trivial to implement (IRequestFilter).
>> 
>> Is this a case of "it can be implemented in a plugin, so it should be?"
>
> Probably?  Well, yes.  I make little differentiation philosophically
> between components that come with core and components packaged in
> plugins.  
To me this seems like an(other) example of something that everybody will
eventually want, and should be packaged with the core.  You already have
these weak "max_bytes" and "max_files" limits in the core, so *somebody*
clearly thought it was core's business to avoid hanging the server.  The
problem is that the existing approach is ineffectual.  Eventually
someone like me who assumes the core settings are reasonable will get
bitten, and then there will be a bug report.  Again.
> But yeah, a plugin seems the right place to put it.
Why, please?  IMO someone really needs to lay out a set of criteria that
distinguish things that should be in core and things that shouldn't.
Even if the criteria are slightly fungible, it would be an improvement.
I can't argue with that, but...
 
> > But yeah, a plugin seems the right place to put it.
> 
> Why, please?  IMO someone really needs to lay out a set of criteria that
> distinguish things that should be in core and things that shouldn't.
> Even if the criteria are slightly fungible, it would be an improvement.
Two points:
* releases for trac are triaged well ahead of time. For instance, the 0.12 release focuses on i18n (IIRC). So even if this additional feature were trivial (which it almost is), it might not be slated until several releases in the future. Read: several years (maybe). I'm all for faster development of trac. That being said, I haven't thrown my hat in the ring to this end, so I can't complain. But plugins can be developed and versioned completely independently and can provide a solution for everyone (that follows trac-hacks, anyway) now.
...relatedly...
* developing components as plugins allows for independent efforts of implementation, testing, and release from core. You can work out a solution without having to tramp all over a shared code base. Once a plugin has been proven, it is much easier to propose an enhancement to core and really start to answer the question: "do people want this?" You can actually see what the plugin does and meaningfully judge it.
That's my opinion anyway, as a plugin and non-core developer. One of my patches is now in core (from the AutoQueryPlugin) and another is triaged for 0.13 IIRC (from the AutocompleteUsersPlugin).
On the other hand, if you wanted to provide a patch, I'm sure that would be considered too.
> Two points:
>
>  * releases for trac are triaged well ahead of time.  For instance, the 0.12 release
> focuses on i18n (IIRC).  So even if this additional feature was trivial (which it
> almost is), it might not be slated until several releases in the future.  Read:
> several years (maybe).  I'm all for faster development of trac.  That being said, I
> haven't thrown my hat in the ring to this end, so I can't complain.  But plugins can
> be developed and version completely independently and can provide a solution for
> everyone (that follows trac-hacks, anyway) now.
>
> ...relatedly...
>
>  * developing components as plugins allows for independent efforts of implementation,
> testing, and release from core.  You can work out a solution without having to tramp
> all over a shared code base.  Once a plugin has been proven, it is much easier to
> propose an enhancement to core and really start to answer the question: "do people
> want this?"  You can actually see what the plugin does and meaningfully judge it.
These are both wonderful features of a modular system.  However, they
don't really address why you think this particular improvement "belongs"
in a plug-in.
> That's my opinion anyway, as a plugin and non-core developer.  One of
> my patches is now in core (from the AutoQueryPlugin) and another is
> triaged for 0.13 IIRC (from the AutocompleteUsersPlugin).
>
> On the other hand, if you wanted to provide a patch, I'm sure that
> would be considered too.
Yeah, perhaps I will.
Probably a timeout-based approach can work, yes.
Besides, we can still optimize _content_changes further, especially
the way it handles big files:
  - instead of reading the entire content of the old and new files before
checking whether either is binary, it could make that decision by
looking at the extension or at the first 1000 bytes, as #3332 did IIRC
(a rough sketch follows below)
  - don't display in-line changes for big files (> 100k?) at all, even
if they're not binary, except of course when viewing a changeset
restricted to that file
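A rough sketch of that first point (this is not the existing _content_changes
code, just the heuristic), looking only at an initial chunk of the content:

def looks_binary(sample, sniff_len=1000):
    # Guess whether file content is binary from its first bytes only: a NUL
    # byte, or a high ratio of non-text bytes, is treated as "binary".
    sample = sample[:sniff_len]
    if not sample:
        return False
    if b'\0' in sample:
        return True
    text_bytes = set(range(0x20, 0x7f)) | set([0x09, 0x0a, 0x0d, 0x0c, 0x08])
    nontext = sum(1 for byte in bytearray(sample) if byte not in text_bytes)
    return nontext > len(sample) * 0.30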
-- Christian
Some repo types (e.g. Perforce) also have a file type attribute which can
be used to determine whether a file is binary without having to read the file
at all.  It would be nice to take advantage of this when it's available.
--
Justin