Multi-Column Correlation and Extended Stats in Oracle 11g
We all have encountered this situation many times before: the cost-based optimizer assumes no correlation between two columns (until 11g), and this has the effect of erroneously reducing the cardinality of a row source. Incorrect cardinality estimates are one of many root causes for SQL performance issues. Consider the following example.
This code creates a table and populates data with four columns:
create table t_vc as select mod(n, 100) n1, mod(n, 100) n2 , mod(n, 50) n3 , mod(n, 20) n4 from (select level n from dual connect by level <= 10001);
The first two columns, n1 and n2, have a strong correlation: n1 is always equals to n2.
Let’s collect statistics with histograms on all columns.
begin dbms_stats.gather_Table_stats( user, 'T_VC', estimate_percent => null, method_opt => 'for all columns size 254'); end; /
Let’s explain the plan for the query, specifying one predicate, exactly 100 rows with a value of 10.
