--
You received this message because you are subscribed to the Google Groups "mybatis-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mybatis-user...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
--
1) High skew values that have Frequency histograms on the column
2) Range-based parameters like date BETWEEN and LIKE clauses.
To address the first issue, we sometimes drop the histogram on a column so that the optimizer is not fooled by the absence of a value in the histogram and therefore treats all distinct values equally: For example, in a table of 100 rows and 10 distinct values, the optimizer will assume that you will get 10 rows back every time. This will give us a single plan with still using bind variables. This situation is good when we truly want & expect the same plan regardless of the parameter. A good example is a newly onboarded customer who very quickly generates many tens of thousands of rows, but is statistically insignificant compared to a long-standing customer.
For the second problem, we use the literal parameters to get different execution plans by forcing the hard parse if the consequence of a poor plan choice is high risk/impact to the system. For example:
A table has two columns: Category and Value. Category always has cardinality of 100.“WHERE value LIKE ‘A%’ “ has a cardinality of 1000 but “WHERE value LIKE ‘ABCDE%’ “has a lower cardinality of 10. When running the trailing wildcard query we want different plans based on the query parameter: Sometimes we want to drive by the index on“category” column other times by index on “value” column. With the bind variable, the plan is pinned the first time the query is parsed. In this case we tend to use literal parameters in the query.
A similar example would apply with “WHERE date_column BETWEEN ? and ? “when the user-entered date ranges might be a day, a week, a month or a year. Narrow date range? Use the index on the date column? Wide date range? Use a more selective scan.
We know parsing a query is an expensive operation, but we weigh the parsing cost against what the query is trying to do. For high occurrence and repetitive queries, we use bind variables always.
We do use hints to diagnose a problem but do not consider them to be good for production code. Call it a policy of ‘non-interference’ with the optimizer when the pattern of growth in our large data model is not predictable.
Zaher
--
--
You received this message because you are subscribed to the Google Groups "mybatis-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mybatis-user...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
Thanks,Zaher--