Hey Alexsander,
I believe I now have a pretty clear understanding of what's going on here. I've edited my prior post to remove the speculation that didn't correspond to the likely cause.
It seems that, for your data specifically, the condition you moved to the top of the chained "AND" statement (url ~ E'^https?:\\/\\/www\\.zeit\\.de\\/[-/a-z0-9]+$') is more likely to be false than the other conditionals, so the chained "AND" statement short-circuits sooner and more often than with another order. If you'd like to test this theory, you could do the following:
1. Determine how many rows evaluate as true for each of the regex conditionals alone
2. Run (and record the timing for) queries which put the regex conditionals in various orders
You could then note whether the runtime is shortest for the queries where the conditions that evaluate as false for most rows are placed closer to the first position, and whether the runtime is longest for the queries where the conditions that evaluate as true for most rows are placed closer to the first position.
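To make the reasoning concrete, here is a minimal Python sketch of the two steps above. It is not your actual query: the sample URLs are made up, `any_http` is a hypothetical stand-in for "the other conditionals", and the Python loop only imitates a short-circuiting "AND" so that the number of regex evaluations can be counted. Real timings would still need to be taken against the database.

```python
import re

# Hypothetical sample URLs; substitute a sample from your own table.
urls = [
    "https://www.zeit.de/politik/ausland/artikel-1",
    "http://www.zeit.de/kultur/film",
    "https://www.zeit.de/index?page=2",
    "https://example.com/news/story",
    "https://example.org/",
    "ftp://www.zeit.de/archiv",
]

# The first pattern mirrors the one in the query. The second is an assumed,
# deliberately permissive condition used only for illustration.
zeit = re.compile(r"^https?://www\.zeit\.de/[-/a-z0-9]+$")
any_http = re.compile(r"^https?://")


def count_true(pattern, rows):
    """Step 1: how many rows satisfy a single condition on its own."""
    return sum(1 for row in rows if pattern.search(row))


def count_evaluations(patterns, rows):
    """Step 2 proxy: regex evaluations done by a short-circuiting AND."""
    evaluations = 0
    for row in rows:
        for pattern in patterns:
            evaluations += 1
            if not pattern.search(row):
                break  # the AND is already false, so later conditions are skipped
    return evaluations


selective_first = count_evaluations([zeit, any_http], urls)
permissive_first = count_evaluations([any_http, zeit], urls)

print("rows true for zeit:", count_true(zeit, urls))
print("rows true for any_http:", count_true(any_http, urls))
print("evaluations, selective condition first:", selective_first)
print("evaluations, permissive condition first:", permissive_first)
```

On this toy sample, putting the condition that is usually false first results in fewer regex evaluations overall, which is the effect the timing comparison is meant to reveal.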
If this intuition turns out to be correct, you should re-run this kind of analysis on your data every so often, since the best order depends on how the data is distributed and that can change over time.
You could also pre-compute the conditional value for various useful and recurring regexes and store these as columns, making it much faster to query based on these conditionals.
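As a rough illustration of the pre-computation idea, the sketch below uses Python's built-in sqlite3 module as a stand-in for your database, purely so the example can run on its own. The table name `pages`, the column name `is_zeit_article`, and the sample rows are all hypothetical. In PostgreSQL the same idea would be a boolean column populated from the `url ~ ...` expression.

```python
import re
import sqlite3

ZEIT_PATTERN = r"^https?://www\.zeit\.de/[-/a-z0-9]+$"


def regexp(pattern, value):
    """Backs SQLite's REGEXP operator, which has no built-in implementation."""
    if value is None:
        return 0
    return int(re.search(pattern, value) is not None)


conn = sqlite3.connect(":memory:")
conn.create_function("regexp", 2, regexp)

# Hypothetical table with a column reserved for the pre-computed result.
conn.execute("CREATE TABLE pages (url TEXT, is_zeit_article INTEGER)")
conn.executemany(
    "INSERT INTO pages (url) VALUES (?)",
    [
        ("https://www.zeit.de/politik/ausland/artikel-1",),
        ("http://www.zeit.de/kultur/film",),
        ("https://www.zeit.de/index?page=2",),
        ("https://example.com/news/story",),
    ],
)

# Pay the regex cost once per row and store the outcome.
conn.execute("UPDATE pages SET is_zeit_article = url REGEXP ?", (ZEIT_PATTERN,))
conn.execute("CREATE INDEX pages_is_zeit_article ON pages (is_zeit_article)")

# Later queries filter on the stored flag and never touch the regex.
matching = [
    row[0]
    for row in conn.execute(
        "SELECT url FROM pages WHERE is_zeit_article = 1 ORDER BY url"
    )
]
print(matching)
```

The trade-off is that the stored flag has to be refreshed whenever a url changes or new rows arrive, so this pays off mainly for regexes that are queried repeatedly.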