Erratic and slow hgweb performance compared to a Windows network share


Angel Ezquerra

Jul 6, 2012, 4:16:58 AM
to Mercurial
Hi,

we have configured an Apache + WSGI based Mercurial hgweb server, and we
have some performance problems.

The Mercurial version is 1.9 and the Python version is 2.6.5, 32-bit.
The server is running on Windows Server 2003 Standard x64 edition. The
machine running the web server is quite powerful, if slightly old,
with a 2.33 GHz E5410 Xeon CPU coupled with 16 GB of RAM. The Windows
Task Manager shows that the server has plenty of free memory left (3.9
GB are used) and the CPUs are idle most of the time, with occasional
peaks up to 50%. We are serving on the order of 500 repositories,
including subrepos. There are probably fewer than 50 potential users of
this server. Due to the distributed nature of Mercurial I doubt that
many would access the server simultaneously.

We are having some trouble with what I would describe as "erratic
performance". In addition, the server seems to be much slower at
serving repos than a network share. The server works well enough
for our daily use, but some operations take longer than I would expect.
There are two problems in particular:

1. Using a Mercurial client to clone, pull from or push to the server
is noticeably slower than accessing the same repositories via a shared
Windows drive (which is what we did before setting up the web server).
People noticed this when we made the move from
using a shared drive to using this web server. As an example, I just
cloned a repo whose size is around 500 MB, both using the web
interface and the network share. The results are as follows:

E:\hgtest>hg --time clone \\hgserver\hgrepos\example_repo example_repo_smb1
updating to branch default
65 files updated, 0 files merged, 0 files removed, 0 files unresolved
Time: real 55.511 secs (user 0.608+0.000 sys 1.513+0.000)

E:\hgtest>hg --time clone http://mercurial/platform/example_repo example_repo_http1
requesting all changes
adding changesets
adding manifests
adding file changes
added 267 changesets with 1482 changes to 104 files (+7 heads)
updating to branch default
65 files updated, 0 files merged, 0 files removed, 0 files unresolved
Time: real 485.979 secs (user 146.329+0.000 sys 6.786+0.000)

E:\hgtest>hg --time clone \\hgserver\hgrepos\example_repo example_repo_smb2
updating to branch default
65 files updated, 0 files merged, 0 files removed, 0 files unresolved
Time: real 54.623 secs (user 0.749+0.000 sys 1.232+0.000)

E:\hgtest>hg --time clone -U http://mercurial/platform/example_repo example_repo_http2
requesting all changes
adding changesets
adding manifests
adding file changes
added 267 changesets with 1482 changes to 104 files (+7 heads)
Time: real 485.929 secs (user 145.393+0.000 sys 7.363+0.000)

E:\hgtest>hg --time clone -U \\hgserver\hgrepos\example_repo example_repo_smb3
Time: real 60.905 secs (user 0.234+0.000 sys 1.435+0.000)

E:\hgtest>hg --time clone -U http://mercurial/platform/example_repo example_repo_http3
requesting all changes
adding changesets
adding manifests
adding file changes
added 267 changesets with 1482 changes to 104 files (+7 heads)
Time: real 501.726 secs (user 146.267+0.000 sys 6.380+0.000)

So using the -U option does not have much impact. Using the Windows
network share it takes around 55 to 60 seconds to clone the repo,
while using the web interface it takes around 500 seconds!

2. The actual web access, where you use your web browser to view the
list of repositories and to access the individual repositories,
viewing their graph history, etc., is "erratic". Usually it is great, but
sometimes the server does not seem to respond, or it only shows part of
the web page and you must refresh to get the full page to display.
This is most noticeable when accessing the main server page (the one
that lists the repos being served). Note that this happens even if I
access the Mercurial web server from the server machine itself, so I don't
think this is a network problem.

I believe that this "erratic" web access happens when you try to
access the server while someone is doing a large pull, push or clone
operation. In particular, accessing the web server while I was doing
the HTTP-based clones above was slow. As soon as the clone operation
finished (and at some points during the clone) the web request
finished immediately. It is as if at some point during the clone the
server gets so busy that it does not serve any more clients. Looking
at the Windows Task Manager I see that during the clone one of the
processors is busy about 50% of the time with peaks close to 90%. The
other 3 processors are all below 50% the whole time. The memory usage
remains constant (around 3.9 GB).

The web server hgweb.config file is as follows:

[web]
push_ssl = false
allow_push = *
style = monoblue
allow_archive = zip
logourl=http://hgserver
motd =<br><hr><br>If you have any problems with this server please
contact Angel Ezquerra<br>

[paths]
/ = c:/hgrepos/**

Note that I have edited this a little. In particular we abuse the MOTD
a bit to add a menu bar to the top of the web server pages that lets
people easily access our JIRA web server, which has recently been
set up on this same server. Note that our performance issues predate
the installation of JIRA. We set allow_push for anybody who has access to
this server, which is only accessible through our intranet.

I'd like to know if you guys have some advice to improve the
performance of our server. We plan to upgrade to the latest Mercurial
version soon, but in the meantime do you have any other suggestions?
Also:

1. Is serving so many repos a problem?
2. Is the performance difference between using the web server and a
windows network share expected?
3. Is there some way to improve the performance, perhaps by changing
the apache or WSGI configuration? I know that the WSGIDaemonProcess
apache directive does not work on windows. Perhaps there are other
things that can be tweaked?
4. Is it normal that serving web pages gets "stuck" while a large
clone is being performed?

Any help would be greatly appreciated!

Angel
_______________________________________________
Mercurial mailing list
Merc...@selenic.com
http://selenic.com/mailman/listinfo/mercurial

Adrian Buehlmann

Jul 6, 2012, 5:17:31 AM
to Angel Ezquerra, Mercurial
On 2012-07-06 10:16, Angel Ezquerra wrote:
> 1. Using a mercurial client to clone, pull from or push to the server
> is noticeable slower than accessing the same repositories via a shared
> windows drive (which is what we did before setting up the web server).

I don't think I'm surprised by that with respect to clone.

For pull and push, I have some doubts.

As for the clone, I think you should try "hg clone --uncompressed". This
will transfer the repo in much the same way a file-wise copy would.

See also the "uncompressed" config setting in section "server" [1].

[1] http://hg.intevation.org/mercurial/crew/help/config
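
For concreteness, a hedged sketch of what that would look like (the command and config lines below are reconstructed from the help page above, not copied from a working setup). On the client:

hg clone --uncompressed http://mercurial/platform/example_repo example_repo_stream

And on the server side, whether such stream clones are allowed is controlled in hgweb.config (it should already default to on):

[server]
# allow clients to clone using the uncompressed streaming protocol
uncompressed = True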

Angel Ezquerra

Jul 6, 2012, 6:21:17 AM
to Adrian Buehlmann, Mercurial
On Fri, Jul 6, 2012 at 11:17 AM, Adrian Buehlmann <adr...@cadifra.com> wrote:
> On 2012-07-06 10:16, Angel Ezquerra wrote:
>> 1. Using a mercurial client to clone, pull from or push to the server
>> is noticeable slower than accessing the same repositories via a shared
>> windows drive (which is what we did before setting up the web server).
>
> I don't think I'm surprised by that with respect to clone.
>
> For pull and push, I have some doubts.

For pull the difference is smaller but it is still quite big:


E:\hgtest> cd example_repo_smb2

E:\hgtest\example_repo_smb2>hg strip --no-backup -r 1

E:\hgtest\example_repo_smb2>hg --time pull
pulling from \\hgserver\hgrepos\example_repo
searching for changes
adding changesets
adding manifests
adding file changes
added 266 changesets with 1460 changes to 82 files (+7 heads)
(run 'hg heads .' to see heads, 'hg merge' to merge)
Time: real 261.667 secs (user 154.051+0.000 sys 7.862+0.000)

E:\hgtest> cd ..\example_repo_http2

E:\hgtest\example_repo_http2>hg strip --no-backup -r 1

E:\hgtest\example_repo_http2>hg --time pull
pulling from http://mercurial/platform/example_repo
searching for changes
adding changesets
adding manifests
adding file changes
added 266 changesets with 1460 changes to 82 files (+7 heads)
(run 'hg heads .' to see heads, 'hg merge' to merge)
Time: real 724.935 secs (user 146.064+0.000 sys 6.599+0.000)

So it takes 261 seconds when pulling from the windows share, and 725
seconds when pulling from the web server.

> As for the clone, I think you should try "hg clone --uncompressed". This
> will transfer the repo similarly to a file-wise copy would do.
>
> See also the "uncompressed" config setting in section "server" [1].
>
> [1] http://hg.intevation.org/mercurial/crew/help/config
>

I tried "hg clone --uncompressed" and you are right. The clone time
dropped to 70 seconds, which is very close to the time that it takes
when using the windows share, which is nice!

BTW, the hg help hgweb page says that only the following sections are
recognized: web, paths and collections.

However, I think that it also recognizes the "server" section, because
I added that section to my hgweb.config file, where I set
preferuncompressed to true, and it worked. I'm not 100% sure of this
because I first added:

import os
os.environ['HGRCPATH'] = config

to my wsgi_handler.wsgi file, but then I removed it, restarted the server
and it still worked. So maybe that is a bug in the documentation?

Anyway, I still find that the web server performs much worse than the
Windows share. Any ideas on how to improve this? Is it normal?

Cheers,

Angel

Benoît Allard

Jul 6, 2012, 8:35:08 AM
to Angel Ezquerra, Adrian Buehlmann, Mercurial


> -----Original Message-----
> > As for the clone, I think you should try "hg clone --uncompressed". This
> > will transfer the repo similarly to a file-wise copy would do.
> >
> > See also the "uncompressed" config setting in section "server" [1].
> >
> > [1] http://hg.intevation.org/mercurial/crew/help/config
> >
>
> I tried "hg clone --uncompressed" and you are right. The clone time
> dropped to 70 seconds, which is very close to the time that it takes
> when using the windows share, which is nice!

This confirms that the difference in time is due to compression/decompression of
the data. 'server.preferuncompressed' is not (yet) understood for pull/push.
Implementing this should reduce the remaining trouble you have.

Best Regards

Angel Ezquerra

Jul 6, 2012, 10:57:00 AM
to Benoît Allard, Mercurial
Yes, it seems so. Looking briefly at the code, it seems that the
'stream' mode that is used when uncompressed is selected on clone can
only be used when no heads have been specified. Do you think that it
could be possible to use this mode on pull or push (at least when
pulling all changes)? The performance difference on a LAN is really
big!

Still, I am surprised that the web server itself seems to get "stuck"
and not serve any pages while a big pull operation is being performed.
Is that normal? Could it be because Python only uses one of the server's
processors? Are there any solutions to this problem?

Also, I suspect that this probably also explains why sometimes pulls
seem to take even longer than usual. It probably happens when a user
starts a pull while another user is already performing a pull, push or
clone operation. Does that make sense?

Bryan O'Sullivan

Jul 6, 2012, 2:29:34 PM
to Angel Ezquerra, Mercurial
On Fri, Jul 6, 2012 at 7:57 AM, Angel Ezquerra <angel.e...@gmail.com> wrote:

Yes, it seems so. Looking briefly at the code it seems that the
'stream' mode that is used when uncompressed is selected on clone can
only be used when no heads have been specified. Do you think that it
could be possible to use this mode on pull or push (at least when
pulling all changes?).

No, since the protocol is completely different.

Still, I am surprised that the web server itself seems to get "stuck"
and not serve any pages while a big pull operation is being performed.
Is that normal?

I don't think so. Is your server running via Apache, or as a standalone "hg serve"?
 
Could it be because python only uses one of the server
processors?

We'd need more details of your server setup before speculating, really.

Also I suspect that this probably also explains why sometimes pulls
seem to take even longer than usual. It probably happens when a user
starts a pull while another user is already performing pull, push or
clone operation. Does that make sense?

It does suggest that there's some bottleneck on the server side. 

Angel Ezquerra

Jul 6, 2012, 3:19:41 PM
to Bryan O'Sullivan, mercurial


On Jul 6, 2012 8:29 PM, "Bryan O'Sullivan" <b...@serpentine.com> wrote:
>
> On Fri, Jul 6, 2012 at 7:57 AM, Angel Ezquerra <angel.e...@gmail.com> wrote:
>>
>>
>> Yes, it seems so. Looking briefly at the code it seems that the
>> 'stream' mode that is used when uncompressed is selected on clone can
>> only be used when no heads have been specified. Do you think that it
>> could be possible to use this mode on pull or push (at least when
>> pulling all changes?).
>
> No, since the protocol is completely different.

I was afraid that would be the case. What a pity, since the performance difference is huge!

>> Still, I am surprised that the web server itself seems to bet "stuck"
>> and not serve any pages while a big pull operation is being performed.
>> Is that normal?
>
>
> I don't think so. Is your server running via Apache, or as a standalone "hg serve"?
>  
>>
>> Could it be because python only uses one of the server
>> processors?
>
>
> We'd need more details of your server setup before speculating, really.

I put some info in my first email. Since I'm on my phone I hope you don't mind if I just paste what I wrote and add some additional details; if you need something else just let me know:

The server is an Apache + WSGI based Mercurial hgweb server. I believe the Apache version is 2.2, 32-bit. The Mercurial version is 1.9 and the Python version is 2.6.5, 32-bit. The server is running on Windows Server 2003 Standard x64 edition. The machine running the web server is quite powerful, if slightly old, with a 2.33 GHz E5410 Xeon CPU coupled with 16 GB of RAM. The Windows Task Manager shows that the server has plenty of free memory left (3.9 GB are used) and the CPUs are idle most of the time, with occasional peaks up to 50%. We are serving on the order of 500 repositories, including subrepos. There are probably fewer than 50 potential users of this server.

>> Also I suspect that this probably also explains why sometimes pulls
>> seem to take even longer than usual. It probably happens when a user
>> starts a pull while another user is already performing pull, push or
>> clone operation. Does that make sense?
>
>
> It does suggest that there's some bottleneck on the server side. 

Makes sense, although I don't know what it could be. The task manager does not give any indication either, since memory seems fine and CPU usage is high but does not seem to reach 100% on any of the processors.

Thanks for looking into this!

Angel

Angel Ezquerra

Jul 6, 2012, 3:40:43 PM
to Bryan O'Sullivan, mercurial
Correction, one of the CPUs seems to get up to 90% or so during a
clone or pull through the web server.

Bryan O'Sullivan

Jul 6, 2012, 4:53:24 PM
to Angel Ezquerra, mercurial
On Fri, Jul 6, 2012 at 12:19 PM, Angel Ezquerra <angel.e...@gmail.com> wrote:

The server is an Apache + WSGI based Mercurial hgweb server.

The first thing I'd look at is the Apache and mod_wsgi configuration. There's a lot of tweaking involved for both.

If you're not able to get Apache to serve anything else while a long-running client is connected to it, that seems likely to be the source of the problem. A normally configured Apache shouldn't have any problems talking to a few clients at a time. 

Angel Ezquerra

Jul 6, 2012, 5:32:58 PM
to Bryan O'Sullivan, mercurial
The Apache configuration is pretty vanilla. I am really not an expert
on Apache configuration. I think that I basically followed the
PublishingRepositories wiki entry.

I just took the default apache httpd.conf file and changed the following lines:

1. At line 47 I added:

Listen 8000

Note that the extra indentation is not there in the actual file; I just
indented it here so that it stands out from the rest of my comments.

The idea was to make it possible to access the server both through the
usual HTTP port 80 and also through the usual Mercurial hgweb port
8000.

2. At line 130 I added:

LoadModule wsgi_module modules/mod_wsgi.so

This enables WSGI. I don't remember where I got the mod_wsgi.so file
but I think it was somewhere "official".

3. At line 351 I added:

WSGIScriptAlias /wsgi "C:/hg_server/wsgi_handler.wsgi"
WSGIScriptAlias /hg "C:/hg_server/hgweb.wsgi"
WSGIScriptAlias / "C:/hg_server/hgweb.wsgi"

<Directory "C:/hg_server">
AllowOverride None
Options None
Allow from all

AddHandler wsgi-script .wsgi
</Directory>

Alias /scripts "C:/hg_server/scripts"
<Directory "C:/hg_server/scripts">
AllowOverride Fileinfo
Options Indexes +ExecCGI
#AddHandler wsgi-script none
AddHandler wsgi-script .wsgi

# Security - Only allow accessing the
# mercurial scripts from the local intranet
Allow from 192.168
Allow from 127.0.0.1
Deny from all
</Directory>

Everything else is unchanged from the default apache http.conf file.

The idea is that people can access the server by typing
"http://mercurial" in their web browser and they immediately get the
list of all repos that are being served. This can only be done from
the local intranet. They can also access it through
"http://mercurial/hg" for backwards compatibility with some old
repos. "http://mercurial/wsgi" is used for testing and is never
accessed by anybody except me (I used it to test the Apache WSGI
configuration).

The C:/hg_server/hgweb.wsgi file is really simple:

config = "c:/hg_server/hgweb.config"

# enable demandloading to reduce startup time
from mercurial import demandimport; demandimport.enable()

from mercurial.hgweb import hgweb
application = hgweb(config)

Additionally, this is what the apache error.log shows after I restart
the server:

The Apache2.2 service is restarting.
The Apache2.2 service has restarted.
Parent: Received restart signal -- Restarting the server.
[Fri Jul 06 12:13:55 2012] [notice] Child 1820: Exit event
signaled. Child process is ending.
[Fri Jul 06 12:13:55 2012] [warn] mod_wsgi: Compiled for Python/2.6.2.
[Fri Jul 06 12:13:55 2012] [warn] mod_wsgi: Runtime using Python/2.6.5.
[Fri Jul 06 12:13:55 2012] [notice] Apache/2.2.19 (Win32)
mod_wsgi/3.3 Python/2.6.5 configured -- resuming normal operations
[Fri Jul 06 12:13:55 2012] [notice] Server built: May 20 2011 17:39:35
[Fri Jul 06 12:13:55 2012] [notice] Parent: Created child process 3512
[Fri Jul 06 12:13:56 2012] [warn] mod_wsgi: Compiled for Python/2.6.2.
[Fri Jul 06 12:13:56 2012] [warn] mod_wsgi: Runtime using Python/2.6.5.
[Fri Jul 06 12:13:56 2012] [notice] Child 3512: Child process is running
[Fri Jul 06 12:13:56 2012] [notice] Child 1820: Released the start mutex
[Fri Jul 06 12:13:56 2012] [notice] Child 3512: Acquired the start mutex.
[Fri Jul 06 12:13:56 2012] [notice] Child 3512: Starting 64 worker threads.
[Fri Jul 06 12:13:56 2012] [notice] Child 3512: Starting thread to
listen on port 8000.
[Fri Jul 06 12:13:56 2012] [notice] Child 3512: Starting thread to
listen on port 80.
[Fri Jul 06 12:13:57 2012] [notice] Child 1820: All worker threads
have exited.
[Fri Jul 06 12:13:57 2012] [notice] Child 1820: Child process is exiting

So mod_wsgi.so was compiled for Python 2.6.2, but we are using 2.6.5.
Other than that I don't see anything weird. However, I am no expert on
any of this, so I could be missing something trivial, of course.

Bryan O'Sullivan

Jul 9, 2012, 2:17:12 PM
to Angel Ezquerra, mercurial
On Fri, Jul 6, 2012 at 2:32 PM, Angel Ezquerra <angel.e...@gmail.com> wrote:

So mod_wsgi.so was compiled for python 2.6.2. but we are using 2.6.5.
Other than that I don't see anything weird. However I am no expert on
any of this so I could be missing something trivial of course.


Take a look at the mod_wsgi docs:
http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading

I'm pretty sure you need to set a number of threads or processes to use, as otherwise they're likely to default to 1.

Angel Ezquerra

Jul 9, 2012, 4:35:53 PM
to Bryan O'Sullivan, mercurial
Thank you Bryan. I had read that document before and I thought that
the default configuration already uses multiple threads.

If I understand that document correctly the only available 'MPM' on
windows is 'winnt'. According to the document "To emulate the same
process/thread model as the 'winnt' MPM, that is, a single process
with multiple threads, the following configuration would be used:

WSGIDaemonProcess example threads=25"

So I thought that on winnt we would already be using 25 threads by
default. I'll see if setting it explicitly improves things though.
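
For what it's worth, on the winnt MPM the number of worker threads appears to be governed by Apache's own ThreadsPerChild directive in httpd.conf rather than by a mod_wsgi directive (the error.log I pasted earlier reports "Starting 64 worker threads", which matches the Apache 2.2 default). A hedged example of setting it explicitly:

# httpd.conf, winnt MPM: a single process, this sets its thread count
ThreadsPerChild 64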

Am I the only one running an Apache-based hgweb server on Windows? I'm
a bit surprised that other people have not experienced this issue.

Thanks,

Adrian Buehlmann

Jul 10, 2012, 3:21:22 AM
to Angel Ezquerra, mercurial
On 2012-07-09 22:35, Angel Ezquerra wrote:
> On Mon, Jul 9, 2012 at 8:17 PM, Bryan O'Sullivan <b...@serpentine.com> wrote:
>> On Fri, Jul 6, 2012 at 2:32 PM, Angel Ezquerra <angel.e...@gmail.com>
>> wrote:
>>>
>>>
>>> So mod_wsgi.so was compiled for python 2.6.2. but we are using 2.6.5.
>>> Other than that I don't see anything weird. However I am no expert on
>>> any of this so I could be missing something trivial of course.
>>
>>
>> Take a look at the mod_wsgi docs:
>> http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading
>>
>> I'm pretty sure you need to set a number of threads or processes to use, as
>> otherwise they're likely to default to 1.
>
> Thank you Brian. I had read that document before and I thought that
> the default configuration already uses multiple threads.
>
> If I understand that document correctly the only available 'MPM' on
> windows is 'winnt'. According to the document "To emulate the same
> process/thread model as the 'winnt' MPM, that is, a single process
> with multiple threads, the following configuration would be used:
>
> WSGIDaemonProcess example threads=25"
>
> So I thought that on winnt we would already be using 25 threads by
> default. I'll see if setting it explicitly improves things though.
>
> Am I the only one running an apache based hgweb server on windows? I'm
> a bit surprised that other people have not experienced this issue.

I wouldn't be that surprised. I guess some people are using IIS, or if
they are interested in Apache they run it on Linux anyway. The barrier to
switch to using Linux gets very low pretty quickly, given all the
combined quirks you most likely run into with trying to run Apache on
Windows. There's nothing wrong with having a Linux server and people
using Windows desktops. In fact, it even spares you a Windows server
license. IMHO, Linux on servers is way more common than on desktops.

I don't run Apache here either, so you most likely have more experience
than I have.

What strikes me as odd is that you have such a powerful 64-bit server
(16 GB RAM, IIRC) but you still use 32-bit Apache and thus 32-bit Python
and 32-bit Mercurial.

I see that the ASF (through the official Apache website) provides only
32-bit Windows binaries for Apache. It seems others like
http://www.apachelounge.com/download/win64/ are jumping in to help, but
then you still have a problem getting a 64-bit mod_wsgi. But I'm not
really sure whether mod_wsgi is really such a good idea to use. I
wouldn't be surprised if WSGI requires more tuning than plain simple
stupid CGI. apachelounge seems to be providing fcgi 64-bit modules for
Apache, so you might have more luck using plain simple CGI. It might
even be interesting to see if there is a difference between using CGI and
WSGI with your existing 32-bit Apache.

It seems the projects haven't yet caught up with the reality of 64-bit
now being ubiquitous.

I'm not saying your problems will go away by using 64-bit Apache instead
of 32-bit. But a 32-bit process has a very hard memory limit due to the
limit imposed by pointer sizes (4 GB). If you have only one process with
lots of threads, these threads are still limited into the same process
space regarding memory.

Lester Caine

Jul 10, 2012, 5:37:45 AM
to mercurial
Adrian Buehlmann wrote:
>> Am I the only one running an apache based hgweb server on windows? I'm
>> > a bit surprised that other people have not experienced this issue.
> I wouldn't be that surprised. I guess some people are using IIS or if
> they are interested in Apache run it on Linux anyway. The barrier to
> switch to using Linux gets very low pretty quickly, given all the
> combined quirks you most likely run into with trying to run Apache on
> Windows. There's nothing wrong with having a Linux server and people
> using Windows desktops. In fact, it even spares you a Windows server
> license. IMHO, Linux on servers is way more common than on desktops.

While I have run Apache/Windows servers in the past, and currently I have a
couple of test machines running, all the production machines are Apache/Linux.
The main reason for that is the VERY poor performance of Apache/PHP on the
Windows boxes: several times slower on Windows 7 than the SAME hardware running
Linux (I'd published the figures somewhere :( ). Windows XP is still the best
platform for a Windows Apache stack. So I would not be surprised to see the same
problems running Python.

32-bit vs 64-bit Windows does not make much difference, with some people reporting 32-bit
actually being faster, but when I bring the 64-bit Firebird database into the stack,
I see a 15 to 20% improvement with a 64-bit stack, though nothing approaching
the performance improvements provided by running a 64-bit Linux setup.

--
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk//
Firebird - http://www.firebirdsql.org/index.php

Angel Ezquerra Moreu

Jul 10, 2012, 6:10:44 AM
to Adrian Buehlmann, mercurial
That is exactly the reason why we are using a 32-bit Apache on a 64-bit
Windows box.

> But I'm not
> really sure whether mod_wsgi is really such a good idea to use.I
> wouldn't be surprised if wsgi requires more tuning than plain simple
> stupid cgi. apachelounge seems to be providing fcgi 64-bit modules for
> Apache, so you might have more luck using plain simple cgi. It might
> even be interesting to see if there is a difference by using cgi vs wsgi
> with your existing 32-bit Apache.

I never considered using plain old CGI. Everything I read when I
looked into this indicated that WSGI was the way to go.

When using CGI, is an hg process created for every server request?
Could it improve the "concurrent access" behavior?

> It seems the projects haven't yet caught up with the reality of 64-bit
> now being ubiquitous.

Very true. There are plenty of very popular Python modules that do not
have 64-bit versions (or at least 64-bit Windows installers) yet.

> I'm not saying your problems will go away by using 64-bit Apache instead
> of 32-bit. But a 32-bit process has a very hard memory limit due to the
> limit imposed by pointer sizes (4 GB). If you have only one process with
> lots of threads, these threads are still limited into the same process
> space regarding memory.

The thing is, it feels as if the problem is more a certain lack of
"multi-threading", for lack of a better way to put it. Isn't it weird that
a big pull operation blocks (or at least considerably slows down) any
access to the web server itself?

Cheers,

Angel

Angel Ezquerra

Jul 10, 2012, 6:42:12 AM
to Bryan O'Sullivan, mercurial
Bryan,

I looked into this in more detail and according to:

http://code.google.com/p/modwsgi/wiki/ConfigurationDirectives#WSGIDaemonProcess

"the WSGIDaemonProcess directive and corresponding features are not
available on Windows".

Actually, it seems that there is not much that can be tweaked on
Windows after all :-(

I just had a few users come to me asking whether the server is down.
It seems that another user is doing a big push of a brand new subrepo
which contains a few big files and this has blocked every other user
trying to access the server. Looking at the Windows Task Manager I see
that httpd.exe's CPU usage is around 25%, while its memory usage is
160 MB. No other process is taking any significant amount of CPU. The
Apache tomcat6.exe process is the one that is taking the highest
amount of memory, but that is only 595 MB.

The Apache error.log does not show any issues. However, the access.log
file is very big (so much so that it cannot be opened with notepad++). I
cannot even delete it. Could that cause any issues (i.e. perhaps
Apache is trying to write to that file and failing every time or
something of that sort)?

Other than that I don't see anything that may justify this poor
performance. It feels as if the server can only handle a single
request at a time.

Benoît Allard

Jul 10, 2012, 6:54:22 AM
to Angel Ezquerra, Bryan O'Sullivan, mercurial


> -----Original Message-----
> I just had a few users come to me asking whether the server is down.
> It seems that another user is doing a big push of a brand new subrepo
> which contains a few big files and this has blocked every other user
> trying to access the server. Looking at the windows task manager I see
> that httpd.exe's CPU usage is around 25%, while its memory usage is
> 160 MB. No other process is taking any significant amount of CPU. The
> Apache tomcat6.exe process is the one that is taking the highest
> amount of memory, but that is only 595 MB.
>

Mercurial itself has a few locks to prevent, among other things, multiple write operations at the same time. Last time I looked at it, there were two kinds of locks: read locks (you can have multiple of them, but they are not compatible with a write lock), and write locks (only one allowed, not compatible with read locks).

If a user of yours is performing a huge push, I would expect the write lock to come into play, preventing anyone from reading that repository until the push is done.

This is in any case nothing web server related, and if this is your problem you probably had it with the file share as well.

Regards,
Benoît.

Angel Ezquerra

Jul 10, 2012, 6:56:56 AM
to Benoît Allard, mercurial
Yes,

the problems happen when users are accessing different repositories,
so I do not think that they are due to these repository-level locks...
(I may be wrong though).

In particular, accessing the repository list should be a read-only
operation, and that slows down and even halts while a pull or push
operation is being performed.

Simon King

Jul 10, 2012, 7:31:09 AM
to Angel Ezquerra, mercurial

As a test, could you try running plain "hg serve" to see if that has the same problems? And also try running multiple instances of "hg serve" on different ports? (This will at least confirm whether or not the problem is threading-related)

I can't remember if "hg serve" can serve multiple repositories. I seem to remember TortoiseHg allowing it, but I'm not certain.

Simon

Adrian Buehlmann

Jul 10, 2012, 7:34:12 AM
to Angel Ezquerra Moreu, mercurial
I think so, yes. But it might be interesting to see how bad that really
is, compared to the current problems you have.

But it would certainly be sort of sidestepping the problem you have.

> Could it improve the "concurrent access" behavior?

I suspect that, yes. Why don't you give it a try?

If you want to see how much "better" WSGI really is compared to CGI,
then giving CGI a shot shouldn't be that much of a problem.

I think using CGI would also allow you to combine your 32-bit Apache with a
64-bit Mercurial. After all, it runs in a separate process anyway.

>> It seems the projects haven't yet caught up with the reality of 64-bit
>> now being ubiquitous.
>
> Very true. There are plenty of very popular python modules that do not
> have 64 versions (or at least 64 bit windows installers) yet.
>
>> I'm not saying your problems will go away by using 64-bit Apache instead
>> of 32-bit. But a 32-bit process has a very hard memory limit due to the
>> limit imposed by pointer sizes (4 GB). If you have only one process with
>> lots of threads, these threads are still limited into the same process
>> space regarding memory.
>
> The thing is it feels as if the problem is more a certain lack of
> "multi-threading" for lack of a better way to put. Isn't it weird that
> a big pull operation blocks (or at least considerably slows down) any
> access to the web server itself?

It probably is, but you are most likely on your own to find out why.

Martin Geisler

Jul 10, 2012, 9:00:21 AM
to Benoît Allard, mercurial
Benoît Allard <ben...@aeteurope.nl> writes:

> Mercurial in itself has a few locks to prevent among others multiple
> write operation at the same time. Last time I looked at it, there was
> two kinds of locks: read locks (can have multiple of them, but not
> compatible with write lock), and write locks (only one allowed, not
> compatible with read locks.)
>
> If a user of yours is performing a huge push, I would expect the write
> lock to come in place, preventing anyone to read that repository until
> the push is done.

No, Mercurial is a single-writer multiple-reader system. So you only
need a lock for write operations and there can be any number of
concurrent readers at the same time.

http://mercurial.selenic.com/wiki/LockingDesign

--
Martin Geisler

aragost Trifork
Commercial Mercurial support
http://aragost.com/mercurial/

Angel Ezquerra

Jul 10, 2012, 10:34:13 AM
to Martin Geisler, mercurial
On Tue, Jul 10, 2012 at 3:00 PM, Martin Geisler <m...@aragost.com> wrote:
> Benoît Allard <ben...@aeteurope.nl> writes:
>
>> Mercurial in itself has a few locks to prevent among others multiple
>> write operation at the same time. Last time I looked at it, there was
>> two kinds of locks: read locks (can have multiple of them, but not
>> compatible with write lock), and write locks (only one allowed, not
>> compatible with read locks.)
>>
>> If a user of yours is performing a huge push, I would expect the write
>> lock to come in place, preventing anyone to read that repository until
>> the push is done.
>
> No, Mercurial is a single-writer multiple-reader system. So you only
> need a lock for write operations and there can be any number of
> concurrent readers at the same time.
>
> http://mercurial.selenic.com/wiki/LockingDesign
>
> --
> Martin Geisler

Martin,

but it is single-writer _per repo_, right? That is, it is not expected
that the WSGI process should block writing to or even reading from a repo
when someone is writing to (or even reading from) another repo, right?

Angel

Kevin Bullock

Jul 10, 2012, 10:38:39 AM
to Angel Ezquerra, Martin Geisler, mercurial
On Jul 10, 2012, at 9:34 AM, Angel Ezquerra wrote:

On Tue, Jul 10, 2012 at 3:00 PM, Martin Geisler <m...@aragost.com> wrote:

No, Mercurial is a single-writer multiple-reader system. So you only
need a lock for write operations and there can be any number of
concurrent readers at the same time.

 http://mercurial.selenic.com/wiki/LockingDesign

--
Martin Geisler

Martin,

but it is single-writer _per repo_ right? That is, it is not expected
that the wsgi process should block writing or even reading from a repo
when someone is writing (or even reading from) another repo, right?

Right.

pacem in terris / мир / शान्ति / ‎‫سَلاَم‬ / 平和
Kevin R. Bullock

Angel Ezquerra

Jul 10, 2012, 10:41:20 AM
to Adrian Buehlmann, mercurial
That is a good point. I've never configured CGI before, so I'll have to
look into that. First I'll have to upgrade Mercurial to a more recent
version though, since we are still on 1.9. I've been postponing it in
the hope that I might get some pointers to try something else with our
current setup. Also, our IT department must do a backup of the server
before the upgrade, just in case something goes wrong.

>>> It seems the projects haven't yet caught up with the reality of 64-bit
>>> now being ubiquitous.
>>
>> Very true. There are plenty of very popular python modules that do not
>> have 64 versions (or at least 64 bit windows installers) yet.
>>
>>> I'm not saying your problems will go away by using 64-bit Apache instead
>>> of 32-bit. But a 32-bit process has a very hard memory limit due to the
>>> limit imposed by pointer sizes (4 GB). If you have only one process with
>>> lots of threads, these threads are still limited into the same process
>>> space regarding memory.
>>
>> The thing is it feels as if the problem is more a certain lack of
>> "multi-threading" for lack of a better way to put. Isn't it weird that
>> a big pull operation blocks (or at least considerably slows down) any
>> access to the web server itself?
>
> It probably is, but you are most likely on your own to find out why.

:-(

I think that if it turns out that this limited performance is to be
expected on Windows when using Apache, this should probably be said
loud and clear on the wiki, perhaps on the PublishingRepositories
page. Configuring Apache + mod_wsgi + Mercurial is not hard, but it is not
that simple either if you don't know what you are doing. If the
expected performance is not very good, it might be best to directly
tell people to use IIS or Linux instead.

Cheers,

Angel

Simon King

Jul 10, 2012, 2:15:52 PM
to Angel Ezquerra, Martin Geisler, mercurial
On 10 Jul 2012, at 15:34, Angel Ezquerra <angel.e...@gmail.com> wrote:

> On Tue, Jul 10, 2012 at 3:00 PM, Martin Geisler <m...@aragost.com> wrote:
>> Benoît Allard <ben...@aeteurope.nl> writes:
>>
>>> Mercurial in itself has a few locks to prevent among others multiple
>>> write operation at the same time. Last time I looked at it, there was
>>> two kinds of locks: read locks (can have multiple of them, but not
>>> compatible with write lock), and write locks (only one allowed, not
>>> compatible with read locks.)
>>>
>>> If a user of yours is performing a huge push, I would expect the write
>>> lock to come in place, preventing anyone to read that repository until
>>> the push is done.
>>
>> No, Mercurial is a single-writer multiple-reader system. So you only
>> need a lock for write operations and there can be any number of
>> concurrent readers at the same time.
>>
>> http://mercurial.selenic.com/wiki/LockingDesign
>>
>> --
>> Martin Geisler
>
> Martin,
>
> but it is single-writer _per repo_ right? That is, it is not expected
> that the wsgi process should block writing or even reading from a repo
> when someone is writing (or even reading from) another repo, right?

Remember that Python's threading behaviour is limited by the Global Interpreter Lock - only a single thread can be running Python code at any time. I don't know exactly how this works with mod_wsgi (perhaps there are multiple interpreters?), but I could easily imagine a big push on one repository slowing down access to another inside the same process.

I've also found that Mercurial is generally slower on Windows due to things like virus checkers, indexing services, etc.

Simon

Bryan O'Sullivan

Jul 11, 2012, 10:09:31 PM
to Angel Ezquerra, mercurial
On Tue, Jul 10, 2012 at 7:41 AM, Angel Ezquerra <angel.e...@gmail.com> wrote:
I think that if it turns out that this limited performance is to be
expected on windows when using apache this should probably be said
loud and clear on the wiki, perhaps on the PublishingRepositories
page.

It should be clear from the thread so far that you're the only person we know of who has shown up and talked about trying this, so until you came along we had no reason to have any expectations of any kind (or to ever even think about the issue).

(Not only that, but the people who have been offering you advice (myself included!) are clearly shooting in the dark, and at least some of what's being said (e.g. about the GIL) is somewhere between irrelevant and wrong.)

The most we've established so far is that it seems fairly clear that you have a concurrent access problem with the combination of Apache and mod_wsgi; I have no way of knowing if this is generic to Windows or specific to your local configuration. To figure that out, it seems worth trying CGI, as Adrian suggested.

Simon King

Jul 12, 2012, 3:05:37 AM
to Bryan O'Sullivan, mercurial
On 12 Jul 2012, at 03:09, "Bryan O'Sullivan" <b...@serpentine.com> wrote:

On Tue, Jul 10, 2012 at 7:41 AM, Angel Ezquerra <angel.e...@gmail.com> wrote:
I think that if it turns out that this limited performance is to be
expected on windows when using apache this should probably be said
loud and clear on the wiki, perhaps on the PublishingRepositories
page.

It should be clear from the thread so far that you're the only person we know of who has shown up and talked about trying this, so until you came along we had no reason to have any expectations of any kind (or to ever even think about the issue).

(Not only that, but the people who have been offering you advice (myself included!) are clearly shooting in the dark, and at least some of what's being said (e.g. about the GIL) is somewhere between irrelevant and wrong.)

Hi Bryan,

Would you mind elaborating on why the GIL is definitely not the issue here? Is it because of the way mod_wsgi runs separate threads, or because you suspect that most of the time is spent in I/O or other places where the GIL has been released?

(Just trying to improve my understanding of python and hg)

Thanks,

Simon

Angel Ezquerra

Jul 12, 2012, 3:23:23 AM
to Bryan O'Sullivan, mercurial


On Jul 12, 2012 4:09 AM, "Bryan O'Sullivan" <b...@serpentine.com> wrote:
>
> On Tue, Jul 10, 2012 at 7:41 AM, Angel Ezquerra <angel.e...@gmail.com> wrote:
>>
>> I think that if it turns out that this limited performance is to be
>> expected on windows when using apache this should probably be said
>> loud and clear on the wiki, perhaps on the PublishingRepositories
>> page.
>
>
> It should be clear from the thread so far that you're the only person we know of who has shown up and talked about trying this, so until you can along we have had no reason to have any expectations of any kind (or to ever even think about the issue).

I'm a bit surprised that I'm the first one with this problem. Either it is specific to my setup or nobody serves hg repos through Apache + WSGI on Windows...

> (Not only that, but the people who have been offering you advice (myself included!) are clearly shooting in the dark, and at least some of what's being said (e.g. about the GIL) is somewhere between irrelevant and wrong.)
>
> The most we've established so far is that it seems fairly clear that you have a concurrent access problem with the combination of Apache and mod_wsgi; I have no way of knowing if this is generic to Windows or specific to your local configuration. To figure that out, it seems worth trying CGI, as Adrian suggested.

I think you are right. Changing from WSGI to CGI is the next thing I'll try next week when I'm back from a short vacation :-)

I'll let you guys know the results of my tests as soon as I've done them.

Thanks a lot to all of you for your help and suggestions. I'll be back with more info next week.

Cheers,

Angel

Bryan O'Sullivan

Jul 12, 2012, 7:55:16 PM
to Simon King, mercurial
On Thu, Jul 12, 2012 at 12:05 AM, Simon King <si...@simonking.org.uk> wrote:
Would you mind elaborating on why the GIL is definitely not the issue here?

The GIL is released and acquired every time Python does I/O, which in the context of Mercurial happens rather a lot.

Isaac Jurado

Jul 13, 2012, 3:05:24 AM
to Bryan O'Sullivan, mercurial
For example:

http://hg.python.org/cpython/file/93df82c18781/Objects/fileobject.c#l1051

Lines 1084 to 1089

FILE_BEGIN_ALLOW_THREADS(f)
errno = 0;
chunksize = Py_UniversalNewlineFread(BUF(v) + bytesread,
buffersize - bytesread, f->f_fp, (PyObject *)f);
interrupted = ferror(f->f_fp) && errno == EINTR;
FILE_END_ALLOW_THREADS(f)

All I/O operations implemented in CPython use the same pattern: release
the GIL, perform the I/O, try to acquire the GIL again. So another Python
thread can run in the meantime.
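
As a quick illustration (a minimal, self-contained sketch, not taken from Mercurial; the file names are made up), two threads reading large files can make progress at the same time precisely because each read() drops the GIL:

import threading
import time

def read_file(path):
    # Each f.read() releases the GIL while the underlying C-level read runs,
    # so the other thread can execute Python code (or do its own I/O) meanwhile.
    with open(path, 'rb') as f:
        while f.read(1024 * 1024):
            pass

start = time.time()
threads = [threading.Thread(target=read_file, args=(p,))
           for p in ('big_file_a.bin', 'big_file_b.bin')]  # hypothetical files
for t in threads:
    t.start()
for t in threads:
    t.join()
print('elapsed: %.2f seconds' % (time.time() - start))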

There is one thing I never fully understood, though. The GIL state is
stored in a global variable that is shared even across different
Python interpreter instances in the same process. So, in practice, even for
software like mod_wsgi that instantiates a new interpreter on each
thread, the GIL is still shared between those different interpreters,
thus limiting concurrency for CPU-intensive bursts (e.g. ORM
result mapping).

So, the question remains: is the GIL used for mutual exclusion between
both threads and interpreter instances? I know this is a question for
the Python list, but since the topic arose here...

Cheers.

--
Isaac Jurado

"The noblest pleasure is the joy of understanding"
Leonardo da Vinci

Simon King

Jul 13, 2012, 4:25:31 AM
to Bryan O'Sullivan, mercurial
Thanks - I was aware of the GIL being released during I/O, but I wasn't sure if there was perhaps some CPU-intensive operation happening during the big pushes that Angel described, such as decompression (although I see that zlib releases the GIL)

Cheers,

Simon

Angel Ezquerra

Jul 16, 2012, 3:17:07 PM
to Bryan O'Sullivan, mercurial
Hi again,

today I updated our server to the latest Mercurial release (2.2.3). In
doing so I also moved from Python 2.6 to Python 2.7 (32-bit). This did
not noticeably improve the performance of the server (as I feared it
wouldn't).

Then I changed the server configuration to use CGI rather than WSGI.
The results are not very good either. Simply accessing any server page
through a web browser client (e.g. the repository list or one of the
repository summary or graph pages) is very slow. It generally takes
more than 15 seconds (perhaps even 20) to show any page. I think the
performance (when accessing the web server through a web browser) is
similar to, or perhaps a little worse than, with WSGI; I cannot really
tell.

I tried checking whether accessing the web server while a big push or
pull operation is ongoing stalls as with WSGI but it is hard to tell
since all web server accesses are so slow anyway and I did not have
much time to test this today.

One thing that I did notice, and which I find really weird, is that when
you access the repository list you can see how the Mercurial logo
(hglogo.png) is loaded. First you see the page, with a placeholder for
the logo, then you can see how the logo is being downloaded (line by
line), as if I were accessing the server through an old dialup line!
This is very weird since the Mercurial logo is a static file stored
right on the server. Once Mercurial generates the HTML page that
points to that file, the file should download to the web browser client
immediately.

I think this may be related to the overall performance problem. I
think this should be very fast, yet somehow it feels as if for every
access to the server from the client, the server were trying to access
some file that does not exist and then, after a timeout, it went ahead
and accessed the right file. I have nothing to back that claim though,
it is just a feeling. I thought about installing Wireshark but I had
some trouble running it on the server this morning.

Accessing a static page is really fast. It feels instantaneous. I also
created a quick CGI file that calls Python to print "hello world"
followed by the current time, to see if perhaps calling Python from
Apache is slow. It does not seem so, though. The message is printed
almost immediately (it takes between half a second and two seconds or
so).
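
The test script was roughly along these lines (a reconstruction, so the details are not exact, and the interpreter path in the first line is an assumption):

#!C:/Python27/python.exe
# Trivial CGI check: print a plain-text "hello world" plus the current time.
import time

print("Content-Type: text/plain")
print("")
print("hello world %s" % time.ctime())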

Is there some way I could measure what is taking the longest time? Is
the time spent inside Mercurial's code or somewhere else?

I've been going through the Apache httpd.conf file looking for "weird"
things and I saw a couple of them:

1. We listen both on port 80 and port 8000. This is to support legacy
repos that used to be hosted on port 8000 (now we all use port 80).

2. We want people to be able to go to:

http://mercurial/projectrepo

in order to access the "projectrepo" repository, instead of having to go to:

http://mercurial/hg/projectrepo

Which I think is the default. In order to do so I defined two "script
aliases" in the Apache httpd.conf file to get things working (in both
the CGI and the WSGI modes), as follows:

ScriptAlias /hg "C:/hg_server/hgweb.cgi"
ScriptAlias / "C:/hg_server/hgweb.cgi"

(I used WSGIScriptAlias for WSGI of course). I don't think this can
cause any trouble though, since removing the second script alias does
not improve things either.
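
For completeness, the hgweb.cgi that those aliases point to is essentially the stock script shipped with Mercurial; a minimal sketch (reconstructed from the stock example rather than copied from our server, and the interpreter path is an assumption):

#!C:/Python27/python.exe
# CGI front-end for hgweb: build the WSGI application from hgweb.config
# and hand it to the CGI adapter shipped with Mercurial.

config = "c:/hg_server/hgweb.config"

# enable demandloading to reduce startup time
from mercurial import demandimport; demandimport.enable()

from mercurial.hgweb import hgweb, wsgicgi
application = hgweb(config)
wsgicgi.launch(application)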

Other than that I don't see anything weird.

Any ideas?

Matt Mackall

Jul 16, 2012, 4:10:09 PM
to Angel Ezquerra, mercurial
_15 seconds_? 15 seconds is not "slow". 15 seconds is an order of
magnitude or more and cannot be explained by Apache or Windows vs Linux
issues. If you had said 15 seconds at the outset, this conversation
would have been very different.

This smells like a recursion configuration issue. Hgweb is probably
busily searching for repositories through a large directory tree.. on
every single request. You can check this with whatever strace-equivalent
is handy. Search for 'recurse' in 'hg help hgweb'.

Alternatively, it might be a permissions problem in updating the branch
cache. When Mercurial needs branch info but can't write the cache, it
will reconstruct it from scratch by reading the entire log and silently
fail to write the cache to disk.. on every single request.

--
Mathematics is the supreme nostalgia of our time.

Angel Ezquerra

Jul 16, 2012, 6:28:27 PM
to Matt Mackall, mercurial
I should have mentioned the exact time. I suspected it was really bad,
but I did not know how bad since I have no other server to compare it
to.

> This smells like a recursion configuration issue. Hgweb is probably
> busily searching for repositories through a large directory tree.. on
> every single request. You can check this with whatever strace-equivalent
> is handy. Search for 'recurse' in 'hg help hgweb'.

You are right. We have configured a recursive path as follows:

[paths]
/ = c:/hgrepos/**

Changing that into:

[paths]
/ = c:/hgrepos/*

makes the access to the web server through a web browser way faster
(it takes about 5 seconds to show the list of repos now).

The first question I asked in my first email was related to this. I
asked whether serving that many repos ("in the order of 500
repositories, including subrepos") could be a problem. However, I did
not think that serving them recursively could be a problem in itself,
since the problem happens even when looking at one of the repository
summary pages.

The reason why (I believe) we need to use a recursive path is that
otherwise there is no (easy) way to see or access any subrepos (of
which we use quite a few).

Thus I have a few questions regarding the recursive mode:

1. Why does Mercurial need to "search for repositories through a large
directory tree" when serving the summary page of a particular
repository? It cannot possibly show the repos it finds in the web
interface, since there is no place for it... so what does Mercurial
need them for?

2. Is there a way to tell Mercurial not to recurse yet still find
subrepos? The only way I can think of to do so right now is to
manually specify every repo and subrepo in the paths section of our
hgweb.config file. Perhaps that could be done by a script that is run
periodically, but that seems a bit of a hack... Is there a better
solution?

3. Could having a "paths" section with hundreds of entries pose a
performance problem of its own? If so, what are the alternatives?

Since the actual web access is faster now, I'll try to compare the CGI
vs WSGI performance tomorrow and see if there is still a problem with
concurrent access to the server.

Thanks a lot!

Angel

Matt Mackall

Jul 17, 2012, 1:13:50 PM
to Angel Ezquerra, mercurial
It doesn't. But no one's felt the need to optimize it.

> 2. Is there a way to tell mercurial to not recurse yet look for
> subrepos? The only way I can think of how to do so right now is to
> manually specify every repo and subrepo on the paths section of our
> hgweb.config file. Perhaps that could be done by a script that is run
> periodically, but that seems a bit of a hack... Is there a better
> solution?
>
> 3. Could having a "paths" section with hundreds of entries pose a
> performance problem of its own? If so, what are the alternatives?

No, it should be fine.

WSGI will cache the results of its directory crawl for a short interval,
which might make most of the time disappear here.

--
Mathematics is the supreme nostalgia of our time.


Matt Mackall

Jul 17, 2012, 1:18:50 PM
to Angel Ezquerra, mercurial
On Tue, 2012-07-17 at 00:28 +0200, Angel Ezquerra wrote:

> The first question I asked on my first email, was related to this. I
> asked whether serving that many repos ("in the order of 500
> repositories, including subrepos") could be a problem.

Doing a recursive directory search through 500 otherwise bare repos
should be pretty fast, even on Windows. You should only run into trouble
if you have checked out repos, in which case you're likely to have lots
more to search through.

--
Mathematics is the supreme nostalgia of our time.


Angel Ezquerra

Jul 20, 2012, 5:49:20 AM
to Matt Mackall, mercurial
On Tue, Jul 17, 2012 at 7:18 PM, Matt Mackall <m...@selenic.com> wrote:
> On Tue, 2012-07-17 at 00:28 +0200, Angel Ezquerra wrote:
>
>> The first question I asked on my first email, was related to this. I
>> asked whether serving that many repos ("in the order of 500
>> repositories, including subrepos") could be a problem.
>
> Doing a recursive directory search through 500 otherwise bare repos
> should be pretty fast, even on Windows. You should only run into trouble
> if you have checked out repos, in which case you're likely to have lots
> more to search through.

There are a bunch of "server repos" which are not at the -1 revision.
Thus you are correct that these repos are not empty, which makes the
search take longer than it should. I have no control over that though,
so I cannot force all those repos to be at the -1 revision.

I created a script that runs whenever a new repo is created on the
server, and also every 15 minutes, which generates an explicit list of
Mercurial repositories and adds it to the paths section of the
hgweb.config file (see the rough sketch below). This works great and
the performance of the server is much better now, so thanks a lot for
your help. Also, it turns out that CGI performance is quite good, and
since we are not experiencing any issues with concurrent access now,
that is what we'll use from now on. I have not had a chance to check
the WSGI config again, but if I do I'll let you guys know whether
concurrent access is still a problem.
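
In case it is useful to anybody else, the script is roughly along these lines (a simplified sketch; the names and paths are assumptions rather than a copy of the real script). It walks c:/hgrepos, collects every directory containing a .hg folder (including nested subrepos), and regenerates a [paths] section with one explicit entry per repository:

import os

REPO_ROOT = 'c:/hgrepos'
OUTPUT = 'c:/hg_server/generated_paths.config'  # merged into hgweb.config by the real script

def find_repos(root):
    # Yield every directory that contains a .hg folder. We keep walking into
    # the working copies so nested subrepos are found as well, but we never
    # descend into the .hg metadata directories themselves.
    for dirpath, dirnames, filenames in os.walk(root):
        if '.hg' in dirnames:
            yield dirpath
            dirnames.remove('.hg')

lines = ['[paths]\n']
for repo in sorted(find_repos(REPO_ROOT)):
    name = os.path.relpath(repo, REPO_ROOT).replace(os.sep, '/')
    lines.append('%s = %s\n' % (name, repo.replace(os.sep, '/')))

with open(OUTPUT, 'w') as f:
    f.writelines(lines)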

Thanks,

Angel