Any improvement now? I cleared out another round of long-running queries some time ago, and at least on this end, things seem to have been fairly responsive since.
Cheers,
Rick
--
Rick Scott // Library Technologies Specialist // Wishart Library @ AlgomaU
Good judgment comes from experience.
Experience comes from bad judgment. :Nasrudin
________________________________________
From: conifer...@googlegroups.com [conifer...@googlegroups.com] On Behalf Of Noella Cliche [ncl...@laurentian.ca]
Sent: February-07-14 11:55
To: Conifer testing and discussion
Subject: Re: [conifer-discuss] Conifer Client issue
We are back to a crawl here at Laurentian once again... unable to search the catalogue as well.
Noëlla Cliche
Library Assistant, Access Services
Bibliothèque J.N. Desmarais Library
Laurentian University
935 Ramsey Lake Road, Sudbury, Ontario, Canada P3E 2C6
Tel: 705-675-1151 extension 4377 / 3242
Fax: 705-671-3803
ncl...@laurentian.ca
>>> Dan Scott <den...@gmail.com> 2/7/14 10:03 AM >>>
Hi Alain and everyone else:
The issues may be related to the approach we've been using to address the main symptom of what looks like (but might not be) a denial of service attack. This morning we identified the source of the queries (a link checker system, which could still be a front for a deliberate attack... but might also be entirely innocent) and have taken steps to prevent that source from issuing further such queries--but we'll see if that's effective.
In a nutshell: almost everything in Evergreen needs to go through the database. The database is capable of handling multiple concurrent queries, but generally (due in part to the way our database server is set up, and in part how our database has grown over time), think of it as handling every request in sequence.
In a best case scenario, read-only queries like searches can be resolved entirely in RAM. Over time the operating system learns which parts of the database are frequently accessed and tries to cache those in RAM, which is by far the fastest data access method. These can run in a second or less.
When a really pathological search query comes in, however (and by pathological I mean one that ends up having to touch almost every row in the bibliographic database), problems can occur. For one thing, over the past four years our database has grown enough that the server doesn't have enough RAM to hold the entire database in memory anymore. Once the database has to go to disk, access to the data is up to a thousand times slower than access to RAM. In addition, the cached data in RAM starts getting pushed out... so subsequent queries are more likely to have to go back to disk. When a query takes minutes to handle rather than a few seconds, things start to fall apart, because other requests that need to update the data that the search query is looking at end up having to wait until the rows of data that they need to touch are free. These requests can then start to build up a queue... and Evergreen has a built-in 60-second timeout for requests, which terminates long database queries and automatically rolls back the requested changes.
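To put rough numbers on the RAM-versus-disk point above, here is a minimal Python sketch of the usual weighted-average estimate. It assumes the "up to a thousand times slower" figure from this message as the disk cost; the cache hit rates are made up for illustration and are not measurements from the Conifer server.

```python
def effective_access_cost(hit_rate, ram_cost=1.0, disk_cost=1000.0):
    """Average cost of one data access, in units of one RAM access.

    hit_rate is the fraction of accesses served from the RAM cache.
    disk_cost=1000.0 is the worst-case ratio quoted in the message,
    not a measured value.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * ram_cost + (1.0 - hit_rate) * disk_cost


if __name__ == "__main__":
    # Hypothetical hit rates, chosen only to show the trend.
    for rate in (1.0, 0.99, 0.9, 0.5):
        cost = effective_access_cost(rate)
        print(f"cache hit rate {rate:.0%}: roughly {cost:.1f}x the cost of pure RAM")
```

Under these assumptions, even a small share of accesses going to disk dominates the average, which is consistent with a single cache-evicting query slowing down everything that follows it.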
What we've been seeing for the first time since Conifer started running is a repeated set of identically pathological queries coming in at the same time. This escalates the problem significantly, because now there is a whole lot of contention going on in the system.
Short term, what we've been doing to handle the pathological queries is killing them directly at the database. Manually. Which is really pretty crazy, when you think about it, but necessity / motherhood / trying to keep the system alive / next thing you know you're a slave to the machine. I am beginning to suspect (but am not sure yet) that killing these queries may also have a side-effect of disturbing the Evergreen processes that were connected to the database and potentially causing problems for subsequent requests from the same Evergreen process.
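For anyone curious what finding and killing such queries can look like, here is a hedged Python sketch. The thread does not say which commands were actually used, so treat this as one possible approach only. It assumes PostgreSQL 9.1, where `pg_stat_activity` exposes `procpid`, `query_start`, and `current_query` (renamed `pid` and `query` in 9.2 and later). The five-minute threshold, the function names, and the sample rows are hypothetical; the code only builds SQL text and does not connect to a database.

```python
from datetime import datetime, timedelta

# Statement one could run on PostgreSQL 9.1 to list non-idle backends.
# Idle connections report '<IDLE>' or '<IDLE> in transaction' there.
LIST_ACTIVE_SQL = """
SELECT procpid, query_start, current_query
  FROM pg_stat_activity
 WHERE current_query NOT LIKE '<IDLE>%'
"""


def find_long_running(rows, now, max_age=timedelta(minutes=5)):
    """Return the rows whose query started more than max_age before now.

    rows is a list of dicts shaped like the result of LIST_ACTIVE_SQL.
    max_age is an arbitrary illustrative threshold.
    """
    return [
        row for row in rows
        if row["query_start"] is not None and now - row["query_start"] > max_age
    ]


def cancel_statements(rows):
    """Build one pg_cancel_backend() call per offending backend."""
    return [f"SELECT pg_cancel_backend({int(row['procpid'])});" for row in rows]


if __name__ == "__main__":
    now = datetime(2014, 2, 7, 10, 0, 0)
    # Made-up sample rows; not real data from the Conifer database.
    sample = [
        {"procpid": 101, "query_start": now - timedelta(seconds=2),
         "current_query": "SELECT 1"},
        {"procpid": 202, "query_start": now - timedelta(minutes=12),
         "current_query": "SELECT ... (a very broad search)"},
    ]
    for statement in cancel_statements(find_long_running(sample, now)):
        print(statement)
```

`pg_cancel_backend()` cancels only the running statement and leaves the connection open, whereas `pg_terminate_backend()` closes the whole connection. Whether that difference matters for the Evergreen side effect suspected above is not something this sketch can confirm.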
This morning I hoped to upgrade PostgreSQL to the latest minor version (we're at 9.1.9 and 9.1.11 is available with many important bug fixes and performance improvements) but unfortunately 9.1.11 is not yet available through the channel we used to use. So that will wait until the weekend; that way we'll be able to use our test system to ensure there are no surprises.
Longer term, when we move the system from Guelph to our new hosts in the June timeframe we're going to get a whole lot of advantages:
a) From a hardware perspective, instead of the old-school spinning hard drives that we rely on, they're using SSD -- which is much closer to RAM in terms of performance
b) They also plan to spec out the database servers with far more RAM than the actual size of the database. So read-only queries like searches should always be able to run in RAM.
c) They'll be running the latest major version of PostgreSQL (9.3) which brings many other performance improvements.
In theory, we'll also be able to rely on our hosts for Evergreen support in situations like this where we don't really have full-time people dedicated to Conifer support.
Shortly after we're on the new hardware platform and have ensured that we're stable, we'll be able to upgrade to the latest version of OpenSRF and Evergreen (we're on Evergreen 2.4, while 2.5 has been out for a while now and 2.6 is just around the corner), both of which include performance and robustness improvements for cases like these long-running processes.
Sorry for the relative quiet on my side; I've been mostly working behind the scenes to try and support Robin, Rick, and Art. In the interests of transparency during what must be a very frustrating time for you all, though, I wanted to let you know what we're seeing on the systems side and what we've been trying to do.
Dan
On Fri, Feb 7, 2014 at 9:28 AM, Alain Lamothe <alam...@laurentian.ca> wrote:
Thanks Richard!
Out of curiosity, are these issues related to the DoS attack of last week?
Alain
Alain Lamothe, M.Sc., M.L.I.S.
Chair, Department of Library and Archives
Head, Collections and Technical Services
J.N. Desmarais Library
Laurentian University
Sudbury, Ontario
Canada
P3E 2C6
(705) 675-1151 ext. 3304
alam...@laurentian.ca
>>> Richard Scott <Richar...@algomau.ca> 2/7/2014 9:05 AM >>>
Good morning,
Some of the Conifer services that the staff client needs to sign in appear to be experiencing issues. We're just restarting them now; hopefully that should bring things back into line.
Cheers,
Rick
--
Rick Scott // Library Technologies Specialist // Wishart Library @ AlgomaU
Good judgment comes from experience.
Experience comes from bad judgment. :Nasrudin
________________________________________
From: conifer...@googlegroups.com [conifer...@googlegroups.com] On Behalf Of Alain Lamothe [alam...@laurentian.ca]
Sent: February-07-14 08:54
To: conifer...@googlegroups.com
Cc: Aline Krause; Lorraine Racine; Marlene Bonin; Noella Cliche; Rachelle Larcher
Subject: [conifer-discuss] Conifer Client issue
Hi everyone,
Laurentian is continuing to experience issues with the Client. Circulation can't sign any books out/in and Cataloguing can't import or save records. They keep getting network error messages or extremely slow response times.
On the other hand, the OPAC is functioning a-ok.
Alain
Alain Lamothe, M.Sc., M.L.I.S.
Chair, Department of Library and Archives
Head, Collections and Technical Services
J.N. Desmarais Library
Laurentian University
Sudbury, Ontario
Canada
P3E 2C6
(705) 675-1151 ext. 3304
alam...@laurentian.ca
--
You received this message because you are subscribed to the Google Groups "Conifer testing and discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to conifer-discu...@googlegroups.com.
To post to this group, send email to conifer...@googlegroups.com.