Debugging CBQT OR expansion with 10053 trace
I had been working with a client to add a little bit of functionality to an analytical view that we had been running queries against for a while. The functionality was easy to implement but during testing we found that the execution time for some queries against the view went from about 5s to 25s. Some things were ruled out pretty quickly:
- Re-running the query was very fast, indicating a parsing issue
- ASH also showed that the query was "in parse" for over 20s
- The execution plan was long but looking good. No unexpected notes
- A 10046 trace confirmed that time was spent parsing and also that it was not dynamic sampling. Some dynamic sampling was happening but that only took about 1s
- I also found that I could leave a certain piece of my query to produce the same execution plan, but hard parse in less than 5s instead of 25s
Optimizer trace to the rescue!
At this point I was very certain that this must have something to do with parsing but I needed to know more. I am not aware of how to dig deeper within the database itself or any other nifty tool. That's why I deemed it was time for a good old 10053 trace. I ran this for the "good" and the "bad" query. The "bad" trace was over 600MB, there was no way I was going to read all of that, let alone understand it. But I poked around a bit and grepped for this and that, including for timing information.oracle@exa1:orcll1> grep TIMER orcl1_ora_96077_BROST.trc
TIMER: costing general plans cpu: 7.050 sec elapsed: 7.064 sec
TIMER: costing additional plans cpu: 1.663 sec elapsed: 1.665 sec
TIMER: costing general plans cpu: 7.089 sec elapsed: 7.098 sec
TIMER: costing additional plans cpu: 2.195 sec elapsed: 2.202 sec
TIMER: costing general plans cpu: 6.932 sec elapsed: 6.943 sec
TIMER: costing additional plans cpu: 2.155 sec elapsed: 2.158 sec
TIMER: costing general plans cpu: 6.827 sec elapsed: 6.839 sec
TIMER: costing additional plans cpu: 2.133 sec elapsed: 2.136 sec
TIMER: costing general plans cpu: 6.619 sec elapsed: 6.624 sec
TIMER: costing additional plans cpu: 2.075 sec elapsed: 2.076 sec
TIMER: CBQT OR expansion SEL$BC0C3E63 cpu: 44.864 sec elapsed: 44.934 sec
TIMER: costing general plans cpu: 3.802 sec elapsed: 3.805 sec
TIMER: costing general plans cpu: 3.638 sec elapsed: 3.641 sec
TIMER: costing additional plans cpu: 1.321 sec elapsed: 1.324 sec
TIMER: costing general plans cpu: 3.419 sec elapsed: 3.422 sec
TIMER: costing additional plans cpu: 1.164 sec elapsed: 1.165 sec
TIMER: costing general plans cpu: 3.426 sec elapsed: 3.428 sec
TIMER: costing additional plans cpu: 1.253 sec elapsed: 1.254 sec
TIMER: costing general plans cpu: 3.360 sec elapsed: 3.362 sec
TIMER: costing additional plans cpu: 1.138 sec elapsed: 1.138 sec
TIMER: costing general plans cpu: 7.367 sec elapsed: 7.374 sec
TIMER: costing additional plans cpu: 1.832 sec elapsed: 1.834 sec
TIMER: costing general plans cpu: 7.037 sec elapsed: 7.104 sec
TIMER: costing additional plans cpu: 2.253 sec elapsed: 2.254 sec
TIMER: costing general plans cpu: 6.839 sec elapsed: 6.849 sec
TIMER: costing additional plans cpu: 2.115 sec elapsed: 2.116 sec
TIMER: costing general plans cpu: 6.185 sec elapsed: 6.188 sec
TIMER: costing additional plans cpu: 1.851 sec elapsed: 1.853 sec
TIMER: costing general plans cpu: 6.324 sec elapsed: 6.329 sec
TIMER: costing additional plans cpu: 1.851 sec elapsed: 1.852 sec
TIMER: CBQT OR expansion SEL$BC0C3E63 cpu: 69.668 sec elapsed: 70.090 sec
TIMER: Cost-Based Transformations (Overall) SEL$BC0C3E63 cpu: 114.791 sec elapsed: 115.282 sec
TIMER: costing general plans cpu: 3.447 sec elapsed: 3.449 sec
TIMER: costing general plans cpu: 6.372 sec elapsed: 6.376 sec
TIMER: costing additional plans cpu: 1.527 sec elapsed: 1.528 sec
TIMER: Access Path Analysis (Final) SEL$BC0C3E63 cpu: 12.573 sec elapsed: 12.582 sec
TIMER: SQL Optimization (Overall) SEL$BC0C3E63 cpu: 127.900 sec elapsed: 128.401 sec
Note that all timings are inflated due to tracing being enabled. But the important part is that the cost based transformations for "CBQT OR expansion" took almost 90% of the time (115 of 128 seconds). I had to look up what these transformations are and
learned that or-expansions have been around for a while but that in 12.2 (the version used in this case) some new optimizations have been introduced. Or expansions are intended to offer more transformation options to the CBO in certain circumstances. The downside to that is that more options also mean more permutations to evaluate, resulting in more work for the CBO. My query joins dozens of tables and some of them have thousands of partitions. Calculating the cost for each permutation takes CPU time and this adds up as the number of potential permutations grow. No wonder this is taking a while.
Workaround
I also double-checked that the resulting plan was not using the result of these or-expansions by looking for a VIEW-step with a name starting with VW_ORE_. The workaround then became adding a hint to my view that would disable this feature. I started with turning off the feature via OPT_PARAM('_optimizer_cbqt_or_expansion','off'). That worked but seems a bit ugly. Further trials showed that the old NO_EXPAND hint has the same effect on my query but I finally ended up using what I think is a new hint for NO_OR_EXPAND.CREATE VIEW [...] AS SELECT /*+ NO_OR_EXPAND ... */ ...
Just for comparison, this is the same grep on a 10053 trace after I disabled the CBQT or expansion feature:
oracle@exa1:orcl1> grep TIMER orcl1_ora_96077_BROST23.trc
TIMER: costing general plans cpu: 3.457 sec elapsed: 3.460 sec
TIMER: costing general plans cpu: 6.903 sec elapsed: 6.907 sec
TIMER: costing additional plans cpu: 1.605 sec elapsed: 1.605 sec
TIMER: Access Path Analysis (Final) SEL$BC0C3E63 cpu: 14.153 sec elapsed: 14.164 sec
TIMER: SQL Optimization (Overall) SEL$BC0C3E63 cpu: 14.163 sec elapsed: 14.174 sec
Conclusion
Troubleshooting performance can be weird at times. In this case the root cause was a combination of a recent upgrade to 12.2 and small changes to a view I had just made. Luckily Oracle has many capabilities to drill down into where time is spent, even if I had to get into 10053 tracing in this case. The time-based analysis of that trace file gave me the exact clues I needed to solve this issue.Share this
Previous story
← Spark Scala UDF primitive type bug
Next story
Scala TreeSet: mutable vs immutable →
You May Also Like
These Related Stories
SYSQL: SysAdmin Query Language
SYSQL: SysAdmin Query Language
Jul 20, 2021
20
min read
Optimizing CPU cores and threads for Oracle on AWS
Optimizing CPU cores and threads for Oracle on AWS
Jul 18, 2018
4
min read
Slow query to V$ views after DBRU Patching
Slow query to V$ views after DBRU Patching
Apr 3, 2023
3
min read
No Comments Yet
Let us know what you think