DSpace 7.6 slow server response


Andrew K

Nov 13, 2023, 9:34:01 AM
to DSpace Technical Support
Hello!

Yesterday I put a new DSpace 7.6 server to production.
And unfortunately there are some weird problems.
1. Often the server response time is very high, like 4-6s, even though the server load is only about 1.5; one process (node /home/dspa) sits at 100-150% CPU. iotop shows no significant disk load (less than 100-300K/s, mostly 0).
2. Sometimes red error boxes pop up, like "0 Http failure response for /server/api/core/bundles/c6beb8dd-b676-4099-8b3e-e382addabe9a/bitstreams?page=0&size=5: 0 Unknown Error". Error page 502 is also frequent.
3. The Apache error log has entries like:
[Mon Nov 13 16:17:32.347870 2023] [proxy:error] [pid 1724:tid 140175577442048] [client 87.250.224.227:64062] AH00898: Error reading from remote server returned by /bitstream/123456789/3767/1/Prioritetni sotsial\xca\xb9ni spozhyvy.pdf
[Mon Nov 13 16:17:35.984410 2023] [proxy_http:error] [pid 1864:tid 140176072349440] (104)Connection reset by peer: [client 47.128.18.32:52936] AH01102: error reading status line from remote server localhost:4000

[Mon Nov 13 00:23:50.983739 2023] [proxy_http:error] [pid 10447:tid 139668360255232] (104)Connection reset by peer: [client 185.191.171.19:54166] AH01102: error reading status line from remote server localhost:4000
[Mon Nov 13 00:23:50.983789 2023] [proxy:error] [pid 10447:tid 139668360255232] [client 185.191.171.19:54166] AH00898: Error reading from remote server returned by /handle/123456789/1345/simple-search
[Mon Nov 13 00:26:13.310238 2023] [proxy_http:error] [pid 10447:tid 139667739490048] (104)Connection reset by peer: [client 47.128.52.192:41852] AH01102: error reading status line from remote server localhost:4000, referer: /st…
[Mon Nov 13 00:26:13.310299 2023] [proxy:error] [pid 10447:tid 139667739490048] [client 47.128.52.192:41852] AH00898: Error reading from remote server returned by /assets/images/logo.png, referer: /statistics/items/b991c0e…
[Mon Nov 13 00:28:08.645601 2023] [proxy_http:error] [pid 10447:tid 139668493174528] (70007)The timeout specified has expired: [client 52.167.144.216:43793] AH01102: error reading status line from remote server localhost:4000
4. The DSpace log is unavailable! The last logged entry is from yesterday evening; since then there is nothing. I have not changed any log settings.

I followed https://wiki.lyrasis.org/display/DSDOC7x/Performance+Tuning+DSpace pretty thoroughly. The DSpace server is on a VDS and works behind an Apache proxy (for HTTPS).
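For reference, the proxy part is nothing exotic; stripped down, it is roughly the following (the ServerName and certificate paths here are placeholders, not my real values; the UI listens on localhost:4000 as in the logs above):

```apache
# HTTPS virtual host proxying everything to the Node SSR frontend
<VirtualHost *:443>
    ServerName repository.example.org                     # placeholder

    SSLEngine on
    SSLCertificateFile    /etc/ssl/certs/repository.pem   # placeholder
    SSLCertificateKeyFile /etc/ssl/private/repository.key # placeholder

    ProxyPreserveHost On
    # dspace-angular (run via pm2) listens on port 4000
    ProxyPass        / http://localhost:4000/
    ProxyPassReverse / http://localhost:4000/
</VirtualHost>
</IfModule>
```

(The block above is a sketch of the setup, not a verbatim copy of my vhost.)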

This is very disappointing. The old server worked like a charm, always with instant responses.
Please help me trace the source of errors.

WBR,
Andrew

PS: That said, once a page has loaded, clicking links like statistics always gets an instant response (cache?). But opening links in another tab takes forever, or fails.

Andrew K

Nov 13, 2023, 9:42:51 AM
to DSpace Technical Support
I forgot to mention the server capacity: 4 cores, 6GB RAM (12GB doesn't change a lot), 200GB.

Hrafn Malmquist

Nov 13, 2023, 9:48:51 AM
to Andrew K, DSpace Technical Support
Hello Andrew

Out of curiosity, what version was the old server that worked like a charm?

Hrafn

On Mon, Nov 13, 2023 at 2:42 PM Andrew K <pkm...@gmail.com> wrote:
I forgot to mention the server capacity: 4 cores, 6GB RAM (12GB doesn't change a lot), 200GB.

--
All messages to this mailing list should adhere to the Code of Conduct: https://www.lyrasis.org/about/Pages/Code-of-Conduct.aspx
---
You received this message because you are subscribed to the Google Groups "DSpace Technical Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dspace-tech...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dspace-tech/ac840440-ea37-4759-b464-5c79fea4c27an%40googlegroups.com.

Andrew K

Nov 13, 2023, 9:55:02 AM
to DSpace Technical Support
Hello Hrafn,

It was 5.4 (it actually still is, I only switched DNS to the new server).
But the old server had no https, unfortunately.

On Monday, November 13, 2023 at 16:48:51 UTC+2, Hrafn Malmquist wrote:

Andrew K

Nov 13, 2023, 1:38:51 PM
to DSpace Technical Support
Another very interesting observation.
I load the main page. It takes a while. Then I click links and everything works perfectly: list the items by date, list the communities... everything.
So, I find some item and open it (works pretty fast again, all in the same tab). Then I copy the link, paste it into a new tab, and press Enter.
And boom! "Error fetching item". But how?? Why??? The same happens when I simply refresh the page...

Something is fundamentally wrong. But what? While the DSpace log is not working, I can only guess...


Andrew K

Nov 13, 2023, 2:25:12 PM
to DSpace Technical Support
OK, that looks like the cache. When I set
anonymousCache:
  max: 0
I stopped getting "Error fetching item".
At least the page loads now. It takes 30s to load (!), but at least there is no error.
I think Google is going to ban our server with such response times, though...
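For completeness, that fragment sits deeper in the UI's config.yml; spelled out fully, what I set is roughly this (the exact key path is my understanding of the 7.6 config, so treat it as an assumption and correct me if it differs):

```yaml
# Fragment of the dspace-angular config.yml
cache:
  serverSide:
    anonymousCache:
      max: 0   # number of pages kept in the SSR cache; 0 disables it
```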
On Monday, November 13, 2023 at 20:38:51 UTC+2, Andrew K wrote:

Andrew K

Nov 13, 2023, 3:43:55 PM
to DSpace Technical Support
OK, I suddenly realized that the DSpace log only works when I run scripts like index-discovery, filter-media etc. Nothing is logged for web activity.

So, the latest description of the problem:
Any page takes a very long time to load, like 30s. But once loaded, navigation within the browser tab works pretty fast: search, collections, statistics.
Reloading, or opening a link in a new tab, causes the very slow loading again.
Please help me figure it out.

dataquest

Nov 14, 2023, 2:50:36 AM
to Andrew K, DSpace Technical Support
Dear Andrew,

I faced similar problems when we deployed our first instance of DSpace, and I
was also quite confused about what was wrong.

Perhaps the most important question is whether you are using pm2 with several nodes.
When I was deploying, the performance tuning guide did mention it, but we weren't using it at first.
You certainly need it (pm2 starts several Node processes, so the frontend is served by
multiple processes, not just one).
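A minimal pm2 config file for cluster mode might look roughly like this (the name, cwd and script paths are placeholders for wherever your dspace-angular build lives; "instances" is the important part):

```json
{
  "apps": [
    {
      "name": "dspace-ui",
      "cwd": "/home/dspace/dspace-angular",
      "script": "dist/server/main.js",
      "exec_mode": "cluster",
      "instances": 3,
      "env": { "NODE_ENV": "production" }
    }
  ]
}
```

You would then start it with something like pm2 start dspace-ui.json.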

Also, having anonymousCache set to 0 is probably an issue in itself. I tuned it for some time;
certainly use at least 10, so that some requests are much faster and your server
does not get clogged up, slowing every load.

If you needed to set it to 0 in order for everything to work, there must be some other,
worse problem. Perhaps insufficient memory?

Also, your server specs seem too low. SSR requires a much "bigger" server.
We are using 15 cores and 32GB RAM and it is about fine (10 cores were OK, but not when
long imports of many items were running, etc.). Yes, it requires much more resources
than previous versions, but that simply seems to be the case...

I am no expert; so far I have only deployed one instance, but I learned a lot from it, at least
compared to what I knew before. (If I wrote anything inaccurate I am sorry; anybody
who knows better, please correct me.) But I know I was quite desperate, and I hope
this can help you a bit.

To reiterate, the most important things (in my opinion) are:
1. Make sure you are using pm2; without it, it will hardly ever work well (or fast enough).
2. I suppose your server specs are insufficient.


Best regards,
Majo


Andrew K

Nov 14, 2023, 12:03:10 PM
to DSpace Technical Support
Dear Majo!

Thanks a lot for your reply!

That clears up a lot: DSpace 7 needs a lot of CPU power.

Yes, I was really desperate, because I spent a lot of time (not without interest and pleasure, though) setting up the new server. And our current hosting provider has pretty slow VDSes: CPUs around 2GHz, slow disks, etc. But it's affordable and we already use it; switching to a different one is a huge PITA. On the other hand, we need a VDS because of the power outages expected in winter: it is going to be hard to keep our local servers powered 24/7.

So, the first thing I did was increase the VDS's number of CPU cores from 4 to 8, and it helped significantly. The server is more stable and response times have shortened; the server load is usually <1 instead of ~1.5. Now the server (more or less) works.
But it still throws error pages sometimes. Sometimes the Apache proxy says it received no data. And loading a page in a new browser tab is always delayed, while within one tab all links work fast.
About the anonymousCache: I tried setting it to 100, but there were errors too often, so I turned it off. Fewer errors. Now I have set it to 20. It works so far...

Also, PageSpeed Insights shows pretty awful results, a Speed Index of about 5s for desktop. But at least it shows something; before, the server barely responded.
I just tested a few other DSpace 7 repos with PageSpeed Insights, and they all show low efficiency and a high First Contentful Paint time.

So, what I can do, and plan to do, is increase the number of cores to 12. That should speed it up a little more.
Then I suppose we all should wait for DSpace to get more optimized )

Thanks again!

Best regards,
Andrew





On Tuesday, November 14, 2023 at 09:50:36 UTC+2, dataquest wrote:

dataquest

Nov 15, 2023, 8:21:07 AM
to Andrew K, DSpace Technical Support
Hi Andrew!

I am glad it helped somewhat.

Did you look into pm2?
If so, did you set the number of instances?
I recommend using fewer instances than the number of CPUs you have, so that some
CPUs remain available for the backend, Solr, the database, and other necessary processes.
I used 7 instances when I had 10 CPUs and it worked quite well.
If you do not set this parameter, pm2 will take as many CPUs as are available, which is
certainly not ideal.

Based on my understanding, once the page loads, it runs in your browser. Only when you refresh,
open it in a new window (or otherwise download the whole page from the server) is it slow.
Once it is downloaded and running in the browser, it only makes some calls to the backend, which is fast.
This would agree with your observations.

Best regards,
Majo

Andrew K

Nov 15, 2023, 10:48:16 AM
to DSpace Technical Support
Hi Majo!

Sure, I looked into pm2 first thing.
This is what I have:
[screenshot: 2023-11-15_171420.png]
I guess it's OK, cluster mode.
I added "exec_mode": "cluster" in dspace-ui.json.
I also use "pm2 start /home/dspace/dspace-angular-dspace-7.6/dspace-ui.json -i max".
Per your advice I am going to change it to a number smaller than the number of cores, as soon as I know how many cores I can get)

Based on my understanding, once the page loads, it runs in your browser. Only when you refresh,
open it on new window (or for some other reason download the whole page from server) is it slow.
When it is downloaded and runs in browser, it only executes some calls to backend, which is fast.
This would agree with your observations.

Ah. Thanks! That explains a lot.
But then I cannot understand why the main page takes so long to load when the anonymous cache is enabled.
The main page should be cached 100%, but it always loads as if there's no cache...
It actually looks like it loads as long as it would for a logged-in user.
I have
anonymousCache:
  max: 50
  timeToLive: 7200000 # 2 hours
  allowStale: true

That should work, right?

Best regards,
Andrew

Hrafn Malmquist

Nov 16, 2023, 8:32:46 AM
to DSpace Technical Support
Hi guys

I would consider upgrading to 7.6.1. There are a few performance fixes in this bugfix release:
https://wiki.lyrasis.org/display/DSDOC7x/Release+Notes#ReleaseNotes-7.6.1ReleaseNotes

Specifically https://github.com/DSpace/dspace-angular/issues/2482, a bug where a page load would make 1 or 2 unnecessary GET calls to the server depending on the path. Upgrading to 7.6.1 should therefore reduce server load.

Best regards, Hrafn


dataquest

Nov 16, 2023, 8:47:21 AM
to Hrafn Malmquist, DSpace Technical Support
Hi, it is me again! 

Hrafn, there are many, many REST calls every time any page loads. I do not think the issue will be mitigated
by the update you mention.

Andrew, your pm2 configuration does not seem to be correct at all.
I have 7 nodes and it looks like this:
[screenshot: image.png]
From what you posted, it seems you only have one node running, which would explain a lot.
I am not sure if you are using a dspace-ui.json (or similar) config file for pm2; if so, check whether you have a line such as this:
"instances": "7"
I know the -i parameter on the command line should set this, and I have no idea why you apparently have
only one node, but it might be a good idea to try this. I also found that for certain settings I have to do
pm2 delete dspace-ui.json
and then
pm2 start dspace-ui.json

Merely using pm2 restart dspace-ui.json was not sufficient; perhaps that is what happened.

The anonymous cache only starts having an effect if your server is not clogged up by other requests. If it is,
requests that could be served from the cache will probably spend some time waiting anyway,
and therefore still be slow.

A timeToLive of 2 hours (7200000 ms) is way too long; I have 10 seconds and it is fine.
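In config.yml terms, what I run with is roughly this (again, the key path is how I read the 7.x config and the values are illustrative, so treat it as an assumption):

```yaml
cache:
  serverSide:
    anonymousCache:
      max: 10
      timeToLive: 10000  # 10 seconds
      allowStale: true
```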

In my humble opinion, your biggest problem is having only one node on pm2.
If you do not resolve that, I wouldn't expect other fixes to have much effect on anything.

Make sure you see several nodes listed, just like in my screenshot. Then you can start looking
into other optimizations.

Best regards,
Majo



Andrew K

Nov 16, 2023, 11:29:54 AM
to DSpace Technical Support
Hey Majo!

You're a lifesaver! Of course I had one pm2 instance! Now I have 11 of them and the server is doing a lot better!
[screenshot: 2023-11-16_182827.png]
Still, the response is not instant. I think the response time depends on the performance of a single pm2 instance and a single CPU core (which is not the best in my case).
But at least the server is not choking on multiple requests. That is a very big deal! Amazing. Thank you!
The Speed Index is 1-2s now instead of 5s! That is acceptable.

Best regards,
Andrew
On Thursday, November 16, 2023 at 15:47:21 UTC+2, dataquest wrote:

Andrew K

Nov 16, 2023, 11:42:15 AM
to DSpace Technical Support
Hi Hrafn!

Thanks! I am also somewhat hopeful about the new release. In my situation, the first thing I looked for was performance fixes )

Best regards,
Andrew

On Thursday, November 16, 2023 at 15:32:46 UTC+2, Hrafn Malmquist wrote:

Plate, Michael

Nov 16, 2023, 12:07:18 PM
to DSpace Technical Support
Hi Andrew,

be careful with the number of CPUs (cores) you assign to the VM: this may backfire if the host can't get enough free cores at once and your VM has to wait until the configured number of cores is free.
At least in former times this was a limitation; I'm not sure if it still works that way.

CU

Michael

________________________________________
From: dspac...@googlegroups.com <dspac...@googlegroups.com> on behalf of Andrew K <pkm...@gmail.com>
Sent: Thursday, November 16, 2023 17:29
To: DSpace Technical Support
Subject: [Extern] Re: [dspace-tech] Re: DSpace 7.6 slow server response

[…]

dataquest

Nov 20, 2023, 1:59:24 AM
to DSpace Technical Support
Hi Andrew.

Glad I could help.
Certainly heed Michael's advice and don't overdo it with the instances.
Your issue was that there was only a single one; that will never be enough.
But 5 might be, or 7; impossible to say. 11 appears to be too much.
Now that you know how to set it, perhaps try experimenting a bit,
so that you can find the correct number for you. Too many instances
will drain your memory, cause swapping, and definitely slow things down.

Best regards,
Majo


Andrew K

Nov 20, 2023, 2:33:42 AM
to DSpace Technical Support
Hi Majo!

Thanks again for your help.
Actually, right now the server works with 3 pm2 instances on 4 CPU cores and 6GB RAM. Not too fast, but not choking either.
I also blocked Amazonbot (10-20K connections a day) and a couple of less active bots. It helped a lot.
Looks like 5-7 instances is pretty much enough for a modest repository like ours (13K items).
Thus 8-10 CPU cores should do the job. As for the amount of RAM, it's never too much, but 10-12GB should be OK (on our VDS hosting RAM is a lot more expensive: 1GB of RAM costs as much as 10 CPU cores).
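In case it helps anyone, the bot blocking on my side is just an Apache user-agent match, something like this (mod_rewrite assumed; Amazonbot is the one I named above, the rest of the pattern is whatever you choose to block):

```apache
# Refuse requests from aggressive crawlers by User-Agent (Apache 2.4, mod_rewrite)
RewriteEngine On
# Extend the pattern as needed, e.g. (Amazonbot|OtherBot)
RewriteCond %{HTTP_USER_AGENT} Amazonbot [NC]
RewriteRule .* - [F,L]
```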

Best regards,
Andrew

On Monday, November 20, 2023 at 08:59:24 UTC+2, dataquest wrote:

Andrew K

Nov 20, 2023, 2:39:37 AM
to DSpace Technical Support
Hi Michael!

Thanks for the warning! I'm not sure what kind of virtualization we're on, but it probably has this problem too, as they say the usual amount is 4 cores per server.
I guess performance suffers more under full load (which is rare in this case).
Still, even if it doesn't scale linearly, 8 cores should outperform 4 cores, right?

Best regards,
Andrew

On Thursday, November 16, 2023 at 19:07:18 UTC+2, Plate, Michael wrote:

Michael Plate

Nov 20, 2023, 6:14:13 AM
to dspac...@googlegroups.com
Hi Andrew,

On Nov 20, 2023 at 08:39, Andrew K wrote:
[…]

> Still, even if it doesn't scale linearly, 8 cores should outperform 4
> cores, right?
[…]
depends on the number of cores of the host.

I'm running a private DSpace 7.6 on bare metal (i7-3770, 8 cores) with 4
cores for pm2, 16GB RAM, HW RAID5, and it is the most responsive DSpace
I've seen so far (does not mean much) compared to those VMs I've seen
here at work :) .