One of the issues I've been struggling with lately is expensive Sphinx MySQL queries caused by the number of MVAs attached to certain model classes. For example, if you are sphinx-indexing a list of cars and pulling in attributes like:
- internal_accessories
- external_accessories
If internal_accessories and external_accessories live in the same table, you can end up with a quasi-cartesian join whose rows then have to be grouped back together. I have one query that takes a long time to run because 14,000 model records produce an ungrouped result set of over 1 million rows.
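To make the shape of the problem concrete, here's a sketch of the kind of source query I mean (table and column names are made up for illustration; this isn't my actual schema). Joining the same accessories table twice means a car with m internal and n external accessories yields m × n rows before the GROUP BY collapses them:

```
source cars_source
{
    # Illustrative only: two joins against the same table multiply rows
    # per car, which then have to be grouped back down to one row each.
    sql_query = \
        SELECT cars.id, cars.model, \
               GROUP_CONCAT(DISTINCT ia.accessory_id) AS internal_accessories, \
               GROUP_CONCAT(DISTINCT ea.accessory_id) AS external_accessories \
        FROM cars \
        LEFT JOIN accessories ia ON ia.car_id = cars.id AND ia.kind = 'internal' \
        LEFT JOIN accessories ea ON ea.car_id = cars.id AND ea.kind = 'external' \
        GROUP BY cars.id
}
```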
I've tried to avoid this by using separate indices, putting some attributes in index A and others in index B. That mostly works, but it can produce inaccurate results when certain combinations of attributes are used in filters, and it's awkward to maintain.
My application has a mix of updates and inserts that need to be re-indexed. The optimization that occurs to me is to generate the MVA lists as comma-delimited strings at the time a model is saved. The Sphinx query would then be very fast, since the grouping and joining would already have been done on a record-by-record basis.
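Roughly what I have in mind, as a standalone sketch (class and attribute names are hypothetical; in Rails this would hang off a before_save callback and write to real cache columns that the Sphinx source query reads directly):

```ruby
# Hypothetical sketch: precompute comma-delimited MVA strings when a
# record is saved, so the Sphinx source query can select them as-is
# instead of joining and grouping at index time.
class Car
  attr_accessor :internal_accessory_ids, :external_accessory_ids
  attr_reader :internal_accessories_cache, :external_accessories_cache

  def initialize(internal_accessory_ids, external_accessory_ids)
    @internal_accessory_ids = internal_accessory_ids
    @external_accessory_ids = external_accessory_ids
  end

  # In Rails this would run in a before_save callback; here it is a
  # plain method so the sketch stands alone.
  def cache_mva_strings
    @internal_accessories_cache = @internal_accessory_ids.uniq.sort.join(',')
    @external_accessories_cache = @external_accessory_ids.uniq.sort.join(',')
  end
end

car = Car.new([3, 1, 3], [7, 2])
car.cache_mva_strings
puts car.internal_accessories_cache  # => "1,3"
puts car.external_accessories_cache  # => "2,7"
```

The trade-off is denormalization: the cache columns have to be kept in sync whenever the underlying accessory rows change, not just when the car itself is saved.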
I'm wondering if anybody else has considered this approach or has an alternative solution. I know I've read other posts about creating temp tables to address this issue.
Thanks,
Eric