Hi,
I'm debugging a confusing performance problem. I'm running HS 0.07 on a single EC2 r3.xlarge instance. The core has about 25M documents and takes up about 42GB on disk. Our text queries are rather simple, but we do some filtering and a lot of faceting. What I observe is the following:
- A simple query with lots of facets and fq=some_field:some_id takes up to 20-30sec if some_id is used for the first time.
- I run the query three times: the first call takes 20-30sec, the next ones 100-300ms.
- I do this for multiple ids, iterating over a list of valid ids. New ids always show the above behavior.
- I stop the script and restart it. "Old" ids are fast. As soon as I hit the first new one, the first call is slow.
- I do this in parallel for different fields. The fields do not seem to interfere with each other, and "fast" ids are always fast.
- According to the web interface, Solr is using less than 10GB of the available 30GB.
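
The procedure above can be sketched roughly as follows. This is a minimal illustration, not the actual script: the base URL, the core name, the facet fields and the ids are hypothetical placeholders, and q=*:* with rows=0 is just an assumption to keep the example small.

```python
import time
import urllib.parse
import urllib.request

# Hypothetical placeholders, not taken from the real setup.
SOLR_URL = "http://localhost:8983/solr/mycore/select"
FACET_FIELDS = ["facet_a", "facet_b"]


def build_url(field, value, base_url=SOLR_URL, facet_fields=FACET_FIELDS):
    """Build a select URL with one filter query and several facet fields."""
    params = [
        ("q", "*:*"),
        ("fq", f"{field}:{value}"),
        ("rows", "0"),
        ("wt", "json"),
        ("facet", "true"),
    ]
    params += [("facet.field", f) for f in facet_fields]
    return base_url + "?" + urllib.parse.urlencode(params)


def http_get(url):
    """Fetch the URL and return the raw response body."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def time_query(url, repeats=3, fetch=http_get):
    """Run the same query `repeats` times; return wall-clock seconds per call."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fetch(url)
        timings.append(time.perf_counter() - start)
    return timings


def run(field, ids, fetch=http_get):
    """Time every id three times; the first timing per id is the 'cold' call."""
    return {value: time_query(build_url(field, value), fetch=fetch) for value in ids}
```

Calling run("some_field", list_of_ids) returns three timings per id, so the slow first call and the fast repeats can be compared directly. The fetch parameter is only there so the loop can be exercised without a running server.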
So far, this looks like a warming issue. But when I check the disk activity on the server using iostat, I don't see any read activity on the machine!?
Can anybody explain what might be going on here? From where to where is the data moved during the first call, if not from disk into memory?
Any hint would be much appreciated.
Kind regards,
Achim