Performance problems with Dynamic Statistics in Oracle 12c
I’ve been making some tests recently with the new Oracle 12.1.0.2 In-Memory option and have been faced with an unexpected performance problem. Here is a test case:
create table tst_1 as with q as (select 1 from dual connect by level <= 100000) select rownum id, 12345 val, mod(rownum,1000) ref_id from q,q where rownum <= 200000000; Table created. create table tst_2 as select rownum ref_id, lpad(rownum,10, 'a') name, rownum || 'a' name2</pre> from dual connect by level <= 1000; Table created. begin dbms_stats.gather_table_stats( ownname => user, tabname =>'TST_1', method_opt => 'for all columns size 1', degree => 8 ); dbms_stats.gather_table_stats( ownname => user, tabname =>'TST_2', method_opt => 'for all columns size 1' ); end; / PL/SQL procedure successfully completed. alter table tst_1 inmemory; Table altered. select count(*) from tst_1; COUNT(*) ---------- 200000000
Waiting for in-memory segment population:
select segment_name, bytes, inmemory_size from v$im_segments; SEGMENT_NAME BYTES INMEMORY_SIZE --------------- ---------- ------------- TST_1 4629463040 3533963264
Now let’s make a simple two table join:
select name, sum(val) from tst_1 a, tst_2 b where a.ref_id = b.ref_id and name2='50a' group by name; Elapsed: 00:00:00.17
Query runs pretty fast. Execution plan has the brand new vector transformation
Execution Plan ---------------------------------------------------------- Plan hash value: 213128033 -------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 54 | 7756 (21)| 00:00:01 | | 1 | TEMP TABLE TRANSFORMATION | | | | | | | 2 | LOAD AS SELECT | SYS_TEMP_0FD9D66FA_57B2B | | | | | | 3 | VECTOR GROUP BY | | 1 | 24 | 5 (20)| 00:00:01 | | 4 | KEY VECTOR CREATE BUFFERED | :KV0000 | 1 | 24 | 5 (20)| 00:00:01 | |* 5 | TABLE ACCESS FULL | TST_2 | 1 | 20 | 4 (0)| 00:00:01 | | 6 | HASH GROUP BY | | 1 | 54 | 7751 (21)| 00:00:01 | |* 7 | HASH JOIN | | 1 | 54 | 7750 (21)| 00:00:01 | | 8 | VIEW | VW_VT_377C5901 | 1 | 30 | 7748 (21)| 00:00:01 | | 9 | VECTOR GROUP BY | | 1 | 13 | 7748 (21)| 00:00:01 | | 10 | HASH GROUP BY | | 1 | 13 | 7748 (21)| 00:00:01 | | 11 | KEY VECTOR USE | :KV0000 | 200K| 2539K| 7748 (21)| 00:00:01 | |* 12 | TABLE ACCESS INMEMORY FULL| TST_1 | 200M| 1716M| 7697 (21)| 00:00:01 | | 13 | TABLE ACCESS FULL | SYS_TEMP_0FD9D66FA_57B2B | 1 | 24 | 2 (0)| 00:00:01 | -------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 5 - filter("NAME2"='50a') 7 - access("ITEM_5"=INTERNAL_FUNCTION("C0") AND "ITEM_6"="C2") 12 - inmemory(SYS_OP_KEY_VECTOR_FILTER("A"."REF_ID",:KV0000)) filter(SYS_OP_KEY_VECTOR_FILTER("A"."REF_ID",:KV0000)) Note ----- - vector transformation used for this statement
After having such impressive performance I’ve decided to run the query in parallel:
select /*+ parallel(8) */ name, sum(val) from tst_1 a, tst_2 b where a.ref_id = b.ref_id and name2='50a' group by name; Elapsed: 00:01:02.55
Query elapsed time suddenly dropped from 0.17 seconds to the almost 1 minute and 3 seconds. But the second execution runs in 0.6 seconds.
The new plan is:
Execution Plan ---------------------------------------------------------- Plan hash value: 3623951262 ----------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib | ----------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 29 | 1143 (26)| 00:00:01 | | | | | 1 | PX COORDINATOR | | | | | | | | | | 2 | PX SEND QC (RANDOM) | :TQ10001 | 1 | 29 | 1143 (26)| 00:00:01 | Q1,01 | P->S | QC (RAND) | | 3 | HASH GROUP BY | | 1 | 29 | 1143 (26)| 00:00:01 | Q1,01 | PCWP | | | 4 | PX RECEIVE | | 1 | 29 | 1143 (26)| 00:00:01 | Q1,01 | PCWP | | | 5 | PX SEND HASH | :TQ10000 | 1 | 29 | 1143 (26)| 00:00:01 | Q1,00 | P->P | HASH | | 6 | HASH GROUP BY | | 1 | 29 | 1143 (26)| 00:00:01 | Q1,00 | PCWP | | |* 7 | HASH JOIN | | 200K| 5664K| 1142 (26)| 00:00:01 | Q1,00 | PCWP | | | 8 | JOIN FILTER CREATE | :BF0000 | 1 | 20 | 2 (0)| 00:00:01 | Q1,00 | PCWP | | |* 9 | TABLE ACCESS FULL | TST_2 | 1 | 20 | 2 (0)| 00:00:01 | Q1,00 | PCWP | | | 10 | JOIN FILTER USE | :BF0000 | 200M| 1716M| 1069 (21)| 00:00:01 | Q1,00 | PCWP | | | 11 | PX BLOCK ITERATOR | | 200M| 1716M| 1069 (21)| 00:00:01 | Q1,00 | PCWC | | |* 12 | TABLE ACCESS INMEMORY FULL| TST_1 | 200M| 1716M| 1069 (21)| 00:00:01 | Q1,00 | PCWP | | ----------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 7 - access("A"."REF_ID"="B"."REF_ID") 9 - filter("NAME2"='50a') 12 - inmemory(SYS_OP_BLOOM_FILTER(:BF0000,"A"."REF_ID")) filter(SYS_OP_BLOOM_FILTER(:BF0000,"A"."REF_ID")) Note ----- - dynamic statistics used: dynamic sampling (level=AUTO) - Degree of Parallelism is 8 because of hint
We can see a Bloom filter instead of key vector, but this is not the issue. Problem is coming from the “dynamic statistics used: dynamic sampling (level=AUTO)” note.
In 10046 trace file I’ve found nine dynamic sampling queries and one of them was this one:
SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring optimizer_features_enable(default) no_parallel result_cache(snapshot=3600) */ SUM(C1) FROM (SELECT /*+ qb_name("innerQuery") */ 1 AS C1 FROM (SELECT /*+ NO_VECTOR_TRANSFORM ORDERED */ "A"."VAL" "ITEM_1","A"."REF_ID" "ITEM_2" FROM "TST_1" "A") "VW_VTN_377C5901#0", (SELECT /*+ NO_VECTOR_TRANSFORM ORDERED */ "B"."NAME" "ITEM_3","B"."REF_ID" "ITEM_4" FROM "TST_2" "B" WHERE "B"."NAME2"='50a') "VW_VTN_EE607F02#1" WHERE ("VW_VTN_377C5901#0"."ITEM_2"= "VW_VTN_EE607F02#1"."ITEM_4")) innerQuery call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 1 43.92 76.33 0 5 0 0 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 3 43.92 76.33 0 5 0 0 Misses in library cache during parse: 1 Optimizer mode: ALL_ROWS Parsing user id: 64 (recursive depth: 1) Number of plan statistics captured: 1 Rows (1st) Rows (avg) Rows (max) Row Source Operation ---------- ---------- ---------- --------------------------------------------------- 0 0 0 RESULT CACHE 56bn7fg7qvrrw1w8cmanyn3mxr (cr=0 pr=0 pw=0 time=0 us) 0 0 0 SORT AGGREGATE (cr=0 pr=0 pw=0 time=8 us) 0 0 0 HASH JOIN (cr=0 pr=0 pw=0 time=4 us cost=159242 size=2600000 card=200000) 200000000 200000000 200000000 TABLE ACCESS INMEMORY FULL TST_1 (cr=3 pr=0 pw=0 time=53944537 us cost=7132 size=800000000 card=200000000) 0 0 0 TABLE ACCESS FULL TST_2 (cr=0 pr=0 pw=0 time=3 us cost=4 size=9 card=1) Elapsed times include waiting on following events: Event waited on Times Max. Wait Total Waited ---------------------------------------- Waited ---------- ------------ asynch descriptor resize 1 0.00 0.00 Disk file operations I/O 1 0.00 0.00 CSS initialization 1 0.00 0.00 CSS operation: action 1 0.00 0.00 direct path write temp 6267 0.02 30.37 ********************************************************************************
Vector transformation is disabled, inefficient table order is fixed by the ORDERING hint and we are waiting for hash table creation based on huge TST_1 table.
Dynamic statistics feature has been greatly improved in Oracle 12c with the support for joins and group by predicates. This is why we have such join during the parse time. Next document has the”Dynamic Statistics (previously known as dynamic sampling)” section inside: Understanding Optimizer Statistics with Oracle Database 12c where the new functionality is described.
Let’s make a simpler test:
select /*+ parallel(2) */ ref_id, sum(val) from tst_1 a group by ref_id; Execution Plan ---------------------------------------------------------- Plan hash value: 2527371111 --------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib | --------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1000 | 9000 | 7949 (58)| 00:00:01 | | | | | 1 | PX COORDINATOR | | | | | | | | | | 2 | PX SEND QC (RANDOM) | :TQ10001 | 1000 | 9000 | 7949 (58)| 00:00:01 | Q1,01 | P->S | QC (RAND) | | 3 | HASH GROUP BY | | 1000 | 9000 | 7949 (58)| 00:00:01 | Q1,01 | PCWP | | | 4 | PX RECEIVE | | 1000 | 9000 | 7949 (58)| 00:00:01 | Q1,01 | PCWP | | | 5 | PX SEND HASH | :TQ10000 | 1000 | 9000 | 7949 (58)| 00:00:01 | Q1,00 | P->P | HASH | | 6 | HASH GROUP BY | | 1000 | 9000 | 7949 (58)| 00:00:01 | Q1,00 | PCWP | | | 7 | PX BLOCK ITERATOR | | 200M| 1716M| 4276 (21)| 00:00:01 | Q1,00 | PCWC | | | 8 | TABLE ACCESS INMEMORY FULL| TST_1 | 200M| 1716M| 4276 (21)| 00:00:01 | Q1,00 | PCWP | | --------------------------------------------------------------------------------------------------------------------------- Note ----- - dynamic statistics used: dynamic sampling (level=AUTO) - Degree of Parallelism is 2 because of hint
We can see a “dynamic statistics used” note again. It’s a simple query without predicates with the single table with pretty accurate statistics. From my point of view, here is no reason for dynamic sampling at all.
Automatic dynamic sampling was introduced in 11G Release 2. Description of this feature can be found in this document: Dynamic sampling and its impact on the Optimizer.
“From Oracle Database 11g Release 2 onwards the optimizer will automatically decide if dynamic sampling will be useful and what dynamic sampling level will be used for SQL statements executed in parallel. This decision is based on size of the tables in the statement and the complexity of the predicates”.
Looks like algorithm has been changed in 12c and dynamic sampling is triggered in a broader set of use cases.
This behavior can be disabled at statement, session or system level using the fix control for the bug 7452863. For example,
ALTER SESSION SET “_fix_control”=’7452863:0′;
Summary
Dynamic statistics has been enhanced in Oracle 12c, but this can lead to a longer parse time.
Automatic dynamic statistics is used more often in 12c which can lead to a parse time increase in the more cases than before.
Share this
You May Also Like
These Related Stories
No Comments Yet
Let us know what you think