Oracle Technical Track

Oracle parallel query hints - part 1: PQ_DISTRIBUTE

3 min read

Apr 10, 2018 12:00:00 AM

One of the most powerful features of the Oracle database is its ability to leverage multiple CPU cores across many RAC servers to execute a single query. While Oracle's optimizer attempts to execute queries in the most efficient manner, the optimal distribution of data isn't always obvious from statistics alone.

In these cases, we need hints to guide the execution. The most important, and arguably most complex, hint for this purpose is PQ_DISTRIBUTE. Oracle documentation offers only the bare minimum on its usage, and detailed support articles are scarce.

Understanding the PQ_DISTRIBUTE Hint

There are two distinct use cases covered by the same PQ_DISTRIBUTE hint. While both involve reshuffling data between parallel query processes, they function differently:

Distribution Case: Controls the reshuffling of data during parallel DML and direct path loads.
Join Case (outer_distribution, inner_distribution): Controls the reshuffling of data during table joins.

This blog focuses on Case #2, re-shuffling data for joins, as it is easier to demonstrate and test.

Outer vs. Inner Distributions

Oracle's documentation is somewhat ambiguous regarding "outer" and "inner" distributions. Simply put, the "outer" table is the one mentioned explicitly in the PQ_DISTRIBUTE hint, while the "inner" table represents the "current result set" at that stage of the execution plan.

Case Study: Optimizing a Two-Table Join

When joining two tables of similar size, neither is an obvious candidate for broadcasting. Consequently, Oracle defaults to a Hash-Hash distribution. This typically results in a BUFFERED join, meaning the result set is staged before being returned to the application, creating extra overhead.

Default Behavior: Hash Join Buffered

[sql]
select /*+PARALLEL(8) / from tlarge t1
join tlarge_two t2 on t2.id=t1.id;
[/sql]

In the execution plan below, notice the HASH JOIN BUFFERED operation and the HASH distribution method:

Id	Operation	Name	TQ	IN-OUT	PQ Distrib
3	HASH JOIN BUFFERED		Q1,02	PCWP
5	PX SEND HASH	:TQ10000	Q1,00	P->P	HASH
9	PX SEND HASH	:TQ10001	Q1,01	P->P	HASH

Even adding a filter condition (like mod5_id=1) that reduces rows by 5x often isn't enough for the optimizer to switch away from the expensive Hash-Hash distribution on its own.

Improving Efficiency with Broadcast Distribution

By forcing a BROADCAST distribution, we can eliminate the buffered join. This plan is more efficient but requires more memory, as each parallel process receives its own copy of the filtered TLARGE table.

[sql]select /+PARALLEL(8) NO_PX_JOIN_FILTER(T2) NO_PQ_REPLICATE(T2) PQ_DISTRIBUTE(T2 BROADCAST NONE)/ * from tlarge t1join tlarge_two t2 on t2.id=t1.idwhere t1.mod5_id=1;[/sql]

Note: NO_PX_JOIN_FILTER and NO_PQ_REPLICATE are used here to simplify the example and isolate the effects of PQ_DISTRIBUTE.

In this plan, the BUFFERED operator is gone, replaced by PX SEND BROADCAST. The hint PQ_DISTRIBUTE(T2 BROADCAST NONE) tells Oracle: "When joining T2, take the current result set and broadcast it to all processes reading T2."

Scaling to Multi-Table Parallel Joins

The complexity increases significantly when joining three or more tables. With the default settings, Oracle continues to use Hash-Hash for each join.

The Pitfalls of "Result Set" Broadcasting

If we use the BROADCAST NONE method on a 3-table join, the resulting join between the first two tables (TSMALL and TLARGE) is broadcasted to every process to join with the third table.

[sql]select /*+PARALLEL(8) ... PQ_DISTRIBUTE(T1 BROADCAST NONE) PQ_DISTRIBUTE(T2 BROADCAST NONE) */ *from tlarge t1 join tlarge_two t2 on t2.id=t1.id join tsmall ts on ts.id = t1.idwhere t1.mod5_id=1;[/sql]

If that initial join results in many records, the amount of data being broadcasted across the network explodes. With every additional table, the data volume increases exponentially.

The Optimal Strategy: NONE BROADCAST and SWAP BROADCAST and SWAP_JOIN_INPUTS

A more robust approach for large result sets is to do the opposite: keep the result set stationary and broadcast the "new" table instead. This is achieved using the NONE BROADCAST distribution combined with the SWAP_JOIN_INPUTS hint.

[sql]select /*+PARALLEL(8)LEADING(TS, T1, T2)PQ_DISTRIBUTE(T1 NONE BROADCAST) SWAP_JOIN_INPUTS(T1)PQ_DISTRIBUTE(T2 NONE BROADCAST) SWAP_JOIN_INPUTS(T2) */ * from tlarge t1join tlarge_two t2 on t2.id=t1.idjoin tsmall ts on ts.id = t1.idwhere t1.mod5_id=1;[/sql]

How this works:

Join Order: We use LEADING(TS, T1, T2) to define the sequence.
NONE BROADCAST: This tells Oracle: "Do nothing to the current result set, but broadcast the table being joined (T1 or T2) to every process."
SWAP_JOIN_INPUTS: This instructs the optimizer to reverse the hash join order—hashing the broadcasted table into memory and probing it with the (potentially much larger) result set.

This strategy ensures that even as your result set grows to millions of rows, you are only broadcasting the static table data, keeping your parallel execution fast and memory-efficient.

Oracle Consulting Services

Ready to optimize your Oracle Database for the future?

Speak with our Oracle consultants ->

On this page

Ready to unlock value from your data?

With Pythian, you can accomplish your data transformation goals and more.

Speak with Pythian consultants now →

Services