Simple change to improve iRODS-ICAT performance


schr...@diceresearch.org

27 Oct 2009, 13:53:31
to irod...@googlegroups.com
We recently analyzed a user-site iRODS-ICAT-PostgreSQL performance
problem, determined that there was an issue as the number of collections
increased, and found a solution. If your iRODS system has slowed down,
particularly as the number of collections has increased, it is likely
that the following index will return the performance to near its
previous level:

psql ICAT
create unique index idx_coll_main3 on R_COLL_MAIN (coll_name);

Although our testing was with Postgres, the same problem will probably
occur with MySQL and Oracle, and creating the index with the above SQL
should solve the issue for them too.

In local ICAT-intensive tests, some operations that had slowed by more
than an order of magnitude under these conditions returned to nearly the
original speed. For example, on modest hardware, an iput -r of 500
small files took:

  - about 9 to 10 seconds with just a few defined collections and no index,
  - 350 seconds (and sometimes much longer) with 250,000 collections and
    no index,
  - about 10 seconds with 250,000 collections and the index.

In some cases, without the index, a significant slowdown becomes
noticeable with just a few tens of thousands of collections.

Since many ICAT queries and updates are part of most operations done by
iRODS, this performance improvement should speed most operations to some
extent. There are other limiting factors, though (network bandwidth, for
example), so the improvement will often be slight or moderate.

This new index will be part of the next release. In the meantime, we
recommend you create this index manually via the above SQL. It can be
done on a running system, and usually takes just a few seconds to
complete. You can easily create and then drop the index ('drop index
idx_coll_main3') to run comparison tests. We'd be interested in what you
find.

A simple way to see how many collections are defined on your system is:

iquest "select count(COLL_ID)"

Without this index, the iRODS system was not scaling properly; with the
index, it does. We have two other indexes defined on the main collection
table (r_coll_main), but this additional one is needed for many of the
SQL forms that the iRODS ICAT software uses in its normal operation.

- Wayne -

