Oracle Database

Bind Peeking, Ad Hoc Queries, Stable Performance

3 min read

Mar 12, 2008 12:00:00 AM

I got to troubleshoot an amazing situation a few weeks ago. I think, we can all agree that assuming we are on a supported configuration of a production release of Oracle, it is essentially inconceivable that allowing a single query to run on your system can flip another query’s plans and cause major performance issues (and in this case even downtime!)

There have been plenty of posts about bind peeking. Alex Gorbachev wrote about it last year, and so did Jonathan Lewis. It’s a well known issue. However what hasn’t been written about is when it is expected to strike and cause you headaches.

The Ghost in the Machine: When Plans Flip

A single query can change the plan of a number of other queries, but just sometimes.

One day you come to work (or get woken up) and you have a badly performing system. You look into it and it comes to one query. Nothing has changed, nothing has been modified. I found the query no problems. I did an explain plan, comes up with an index range scan. Given the execution time however, it can’t be a range scan, it’s taking way too much time. Looking at v$sql_plan reveals the full table scan. At this point I was almost sure what the problem was.

How did I fix it? I did an alter system flush shared_pool; the query got reparsed, a new child cursor was created and all new executions of the query were using the expected and much faster index range scan. Easy fix, but what happened? Nothing has been changed for months!

The Three Triggers: Coincidence, Load, and the Ad-hoc "Time Bomb"

Sometimes it’s coincidence: The query just happens to run at the right time, or some job is delayed and queries run in a different order and the wrong plan is selected. Those are generally easy to catch as they happen early in the life of a system.
Sometimes it’s load: You decide to run an additional application, which causes more SQL statements to be aged out and re-parsed, creating a higher chance of parsing with “bad” values. Although the additional load will be blamed for the “slow” performance, the root cause will eventually be discovered.
And sometimes it’s a single ad hoc query: A new predicate starts the slowly-ticking time bomb.

Deep Dive: How sys.col_usage$ Triggers Histograms

Here is how it happens . . .

When you first create a new table, it has no statistics. The default 10g analyse job picks it up in the night and analyses the table. The next day, all queries start using the CBO (Cost Based Optimizer), and column usage statistics start to gather.

What are column usage statistics? They are statistics collected in the system table sys.col_usage$. According to various sources, information is flushed every 15-or-so minutes. col_usage$ has various columns with counts:

Name	Type	Nullable
`OBJ#`	NUMBER	Y
`INTCOL#`	NUMBER	Y
`EQUALITY_PREDS`	NUMBER	Y
`EQUIJOIN_PREDS`	NUMBER	Y
`NONEQUIJOIN_PREDS`	NUMBER	Y
`RANGE_PREDS`	NUMBER	Y
`LIKE_PREDS`	NUMBER	Y
`NULL_PREDS`	NUMBER	Y
`TIMESTAMP`	DATE	Y

Each of these columns will have a count for the number of times this type of predicate has been used. Once you have some column usage statistics and your table has been modified sufficiently to be a candidate for new statistics to be gathered, it will be analyzed during the night.

Based on internal rules, histograms will be gathered for certain columns. During my testing, I’ve observed that only columns with cardinality greater than 1.5 will have histograms. They also need to have at least equality predicates to trigger collections. Then the next time your query is parsed, it will use the new statistics data and possibly select a better execution plan. For its decision, the optimizer will use the bind values of only the first execution of this query.

The Chronology of a Plan Failure

Here’s the sequence of events:

New table $\rightarrow$ nightly job analyses, no histograms.
New queries populate col_usage$.
Table changes sufficiently to be a candidate for analysis.
Table is re-analyzed with histograms.
Queries age out from shared pool; they are re-parsed with histograms and bind variable peeking.

Where’s the devil here?

A single ad hoc query can drastically change the statistics of a table. This may trigger the gathering of histograms, and this, in turn, might enable a time bomb that, based on which bind values were used at parsing time, determines the execution plan of the query.

It’s a long sequence of events, and you may argue it’s very unlikely that this will ever happen. Consider this, though — every ad hoc query will only add to the column usage statistics. There might be weeks between each step.

The Verdict: Turning Off Bind Peeking

What is the solution? Disable bind peeking.

It is a feature I’ve never liked. To me, bind peeking is a workaround to enable histogram usage with bind variables. In my experience, this feature has caused me more headaches than saved-the-day moments.

This issue is somewhat addressed in 11g with the addition of Adaptive Cursor Sharing (multiple execution plans based on different bind variables). I haven’t tested yet, so I can’t say how well that works. For now, I just turn off bind peeking.

Oracle Database Consulting Services

Ready to optimize your Oracle Database for the future?

Speak with our Oracle Database consultants ->

On this page

Ready to unlock value from your data?

With Pythian, you can accomplish your data transformation goals and more.

Speak with Pythian consultants now →

Bind Peeking, Ad Hoc Queries, Stable Performance

The Ghost in the Machine: When Plans Flip

The Three Triggers: Coincidence, Load, and the Ad-hoc "Time Bomb"

Deep Dive: How sys.col_usage$ Triggers Histograms

The Chronology of a Plan Failure

The Verdict: Turning Off Bind Peeking

Oracle Database Consulting Services

Share this

Share this

More resources

Debugging IN vs OR Performance in MySQL

Oracle parallel query hints reference - part 5: PQ_DISTRIBUTE_WINDOW

Oracle parallel query hints. A reference series part 2: PQ_REPLICATE

Ready to unlock value from your data?