"Connection refused" on Mercurial server

Tony Mechelynck

Apr 25, 2025, 5:18:54 PM
to cbl...@256bit.org, vim...@googlegroups.com
I'm giving below my "last good" and "first bad" attempts at
connection. This is still happening at 2025-04-25 23:15 Belgian summer
time (UTC+0200).

linux-tuxedo:~/.build/vim/vim-hg # hg in || echo 'exit status' $? ; date
comparing with https://www.vim.org/hgweb/vim/
searching for changes
no changes found
(sent 2 HTTP requests and 641 bytes; received 619 bytes in responses)
exit status 1
Fri 25 Apr 13:37:10 CEST 2025
linux-tuxedo:~/.build/vim/vim-hg # hg in || echo 'exit status' $? ; date
abort: error: Connection refused
exit status 255
Fri 25 Apr 17:12:16 CEST 2025
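
For reference, the two exit statuses in the transcript mean different things: `hg incoming` returns 0 when there are incoming changes and 1 when there are none, while 255 is what Mercurial returns on an abort such as the refused connection. A minimal sketch of telling them apart in a script; the function name `describe_hg_in_status` is made up for illustration and is not part of Mercurial:

```shell
# Map an `hg incoming` exit status to a short description.
# describe_hg_in_status is a hypothetical helper, not part of Mercurial.
describe_hg_in_status() {
    case "$1" in
        0)   echo "incoming changes available" ;;
        1)   echo "no changes found" ;;
        255) echo "aborted (e.g. connection refused)" ;;
        *)   echo "unexpected exit status $1" ;;
    esac
}

# Usage (run inside a clone):
#   hg in -q; describe_hg_in_status $?
```

So the "exit status 1" of the last good attempt is the normal "nothing new" result, and only the 255 indicates a problem.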

Best regards,
Tony

Christian Brabandt

Apr 26, 2025, 2:03:10 PM
to vim...@googlegroups.com, Marc Schoechlin
The whole website was down, killed by the kernel because of OOM.
I restarted it and it should be working again now. I guess this was also the
reason why the CI started throwing errors for the glvs plugin test.
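
A rough sketch of how one might confirm which processes the OOM killer took out, by filtering the kernel log. The helper name `oom_victims` is hypothetical, and it assumes the kernel writes lines containing "Killed process <pid> (<name>)"; the exact wording can differ between kernel versions:

```shell
# Print the names of processes killed by the OOM killer.
# Reads kernel log lines on stdin. oom_victims is a hypothetical helper and
# assumes lines of the form "... Killed process 1234 (apache2) ...".
oom_victims() {
    sed -n 's/.*Killed process [0-9][0-9]* (\([^)]*\)).*/\1/p'
}

# Usage:
#   journalctl -k | oom_victims | sort | uniq -c
#   dmesg | oom_victims
```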

Thanks,
Christian
--
One would like to stroke and caress human beings, but one dares not do so,
because they bite.
-- Vladimir Il'ich Lenin

Tony Mechelynck

Apr 26, 2025, 2:45:48 PM
to vim...@googlegroups.com, Marc Schoechlin
On Sat, Apr 26, 2025 at 8:03 PM Christian Brabandt <cbl...@256bit.org> wrote:
>
>
> On Fri, 25 Apr 2025, Tony Mechelynck wrote:
>
> > I'm giving below my "last good" and "first bad" attempts at
> > connection. This is still happening at 2025-04-25 23:15 Belgian summer
> > time (UTC+0200).
> >
> > linux-tuxedo:~/.build/vim/vim-hg # hg in || echo 'exit status' $? ; date
> > comparing with https://www.vim.org/hgweb/vim/
> > searching for changes
> > no changes found
> > (sent 2 HTTP requests and 641 bytes; received 619 bytes in responses)
> > exit status 1
> > Fri 25 Apr 13:37:10 CEST 2025
> > linux-tuxedo:~/.build/vim/vim-hg # hg in || echo 'exit status' $? ; date
> > abort: error: Connection refused
> > exit status 255
> > Fri 25 Apr 17:12:16 CEST 2025
>
> The whole website was down, killed by the kernel because of OOM.
> I restarted it and it should be working again now. I guess this was also the
> reason why the CI started throwing errors for the glvs plugin test.
>
> Thanks,
> Christian

Problem has now disappeared.

Best regards,
Tony.

ms-goo...@256bit.org

May 1, 2025, 12:18:45 AM
to vim...@googlegroups.com, Christian Brabandt

Hello everyone,

That's right. My private monitoring system raised an alarm, but unfortunately, for work reasons, I didn't have time to give the problem enough attention.

According to the monitoring, the website was down between 4 PM and 8 PM (CEST), caused by Apache processes using almost 100% of the memory.
As Christian described, that led to an out-of-memory situation, and the OOM killer killed the Apache and MySQL processes *sigh*.

I think this is caused by the Mercurial WSGI web application, because we also saw this massive memory consumption on our previous server.
One measure that helped a lot was to instruct search engines via robots.txt not to index hgweb.
Nevertheless, the problem seems to be back :-)

Since the Mercurial web interface is only used sporadically, it seems feasible to me to reduce the number of threads from 15 to 5 (threads=5), to terminate the WSGI process quickly when it is idle (inactivity-timeout=15 seconds), to resolve application deadlocks after 90 seconds instead of 600 seconds (deadlock-timeout=90), to limit the entire WSGI process to 500 MiB (memory-limit), and to restart the WSGI process after 10000 requests at the latest (maximum-requests).

Let's hope this gets rid of the problem :-)

root@web01(2025-04-27 17:06:03) /etc [main]   
# git diff
diff --git a/apache2/mods-available/wsgi.conf b/apache2/mods-available/wsgi.conf
index 71b1283..e7239f0 100644
--- a/apache2/mods-available/wsgi.conf
+++ b/apache2/mods-available/wsgi.conf
@@ -1,6 +1,9 @@
 <IfModule mod_wsgi.c>
 
 
+    # https://modwsgi.readthedocs.io/en/latest/configuration-directives/WSGIDaemonProcess.html
+    WSGIDaemonProcess hgweb user=www-data group=www-data processes=1 threads=5 inactivity-timeout=15 deadlock-timeout=90 memory-limit=524288000 maximum-requests=10000
+
     #This config file is provided to give an overview of the directives,
     #which are only allowed in the 'server config' context.
     #For a detailed description of all avaiable directives please read
diff --git a/apache2/sites-available/vim_org.conf b/apache2/sites-available/vim_org.conf
index 07a9528..77387d4 100644
--- a/apache2/sites-available/vim_org.conf
+++ b/apache2/sites-available/vim_org.conf
@@ -61,6 +61,7 @@
    </LimitExcept>
  </Directory>
 
+ WSGIProcessGroup hgweb
  WSGIScriptAlias /hgweb "/srv/www/www.vim.org/hgweb/hgweb.wsgi"
 
  <Location /hgweb>
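
As far as I understand the mod_wsgi documentation, memory-limit is given in bytes, so the 524288000 in the wsgi.conf change corresponds to 500 × 1024 × 1024. A small sketch of that arithmetic; `mib_to_bytes` is a hypothetical helper for illustration, and whether the limit is actually enforced on a given platform is worth verifying separately:

```shell
# Convert a size in MiB to bytes, e.g. for mod_wsgi's memory-limit option.
# mib_to_bytes is a hypothetical helper, not part of mod_wsgi or Apache.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Usage:
#   mib_to_bytes 500    # the value used for memory-limit in wsgi.conf
```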

Furthermore, I have now made sure that MySQL is practically ignored by the OOM killer and that Apache is automatically restarted if it terminates unexpectedly, e.g. when it is killed by the OOM killer.

# git diff --staged 
diff --git a/systemd/system/apache2.service.d/override.conf b/systemd/system/apache2.service.d/override.conf
new file mode 100644
index 0000000..34cca89
--- /dev/null
+++ b/systemd/system/apache2.service.d/override.conf
@@ -0,0 +1,3 @@
+[Service]
+Restart=always
+RestartSec=5s
\ No newline at end of file
diff --git a/systemd/system/mysql.service.d/override.conf b/systemd/system/mysql.service.d/override.conf
new file mode 100644
index 0000000..8e88eb1
--- /dev/null
+++ b/systemd/system/mysql.service.d/override.conf
@@ -0,0 +1,2 @@
+[Service]
+OOMScoreAdjust=-1000
\ No newline at end of file
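
A sketch of how the two overrides could be checked once systemd has reloaded its configuration, assuming a systemd version that supports `systemctl show --value`. The helper name `check_unit_prop` and the `SYSTEMCTL` override variable are made up for illustration:

```shell
# Compare one systemd unit property against an expected value.
# check_unit_prop is a hypothetical helper; SYSTEMCTL may be set to a
# different command (e.g. a stub) and defaults to the real systemctl.
check_unit_prop() {
    unit=$1 prop=$2 expected=$3
    actual=$(${SYSTEMCTL:-systemctl} show -p "$prop" --value "$unit")
    if [ "$actual" = "$expected" ]; then
        echo "ok: $unit $prop=$actual"
    else
        echo "MISMATCH: $unit $prop=$actual (expected $expected)"
        return 1
    fi
}

# Usage, after `systemctl daemon-reload` and a restart of both services:
#   check_unit_prop apache2.service Restart always
#   check_unit_prop mysql.service OOMScoreAdjust -1000
```

The value the kernel actually applies to a running process can also be read from /proc/<pid>/oom_score_adj.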

Regards
Marc
