THE WORLD DISCUSSES #PYTHIAN ON TWITTER. HAVE A QUESTION? USE OUR HASHTAG AND ASK AWAY.

Oracle: Is OUTER JOIN Better Than NOT EXISTS?

I’ve been told that using NOT EXISTS in (Oracle) SQL is a bad idea, and that a way to overcome this problem is to collect the non-matching rows with an OUTER JOIN. So I decided to check if it is true.

In order to start, here is my test case:

create table t1(id number,
   constraint t1_pk primary key(id));

  create table t2(id number);

begin
  for i in 1..100 loop
    insert into t1 values(i);
  end loop;
  commit;
end;

begin
  for i in 1..100000 loop
    insert into t2
      values(mod(i,97));
  end loop;
  commit;
end;
/

create index t2_idx on t2(id);

exec dbms_stats.gather_table_stats(USER,'T1');

exec dbms_stats.gather_table_stats(USER,'T2');

First, I checked what I’d been told, i.e. that the OUTER JOIN is more efficient than the NOT EXISTS. In order to do that, I wrote a simple SELECT and displayed the plan for both syntaxes (my database is 11.1.0.6 on Linux 32-bits). As I assumed, it’s not the case. In fact, both orders took the same plan.

Here is the plan with NOT EXISTS:

explain plan for
  select id from t1 a
   where not exists
    (select 1 from t2 b where b.id=a.id);

select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------
Plan hash value: 1906534000

-----------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)|
-----------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |    96 |   480 |    46   (5)|
|*  1 |  HASH JOIN ANTI    |       |    96 |   480 |    46   (5)|
|   2 |   INDEX FULL SCAN  | T1_PK |   100 |   300 |     1   (0)|
|   3 |   TABLE ACCESS FULL| T2    |   100K|   195K|    44   (3)|
-----------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("B"."ID"="A"."ID")

Here is the plan with the OUTER JOIN:

explain plan for
  select a.id from t1 a, t2 b
   where a.id=b.id(+)
     and b.id is null;

select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------
Plan hash value: 1906534000

-----------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)|
-----------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |     3 |    15 |    46   (5)|
|*  1 |  HASH JOIN ANTI    |       |     3 |    15 |    46   (5)|
|   2 |   INDEX FULL SCAN  | T1_PK |   100 |   300 |     1   (0)|
|   3 |   TABLE ACCESS FULL| T2    |   100K|   195K|    44   (3)|
-----------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("A"."ID"="B"."ID")

I know — the real way to check that both queries are equivalent is to trace the plan generation with a 10053 event. (I cannot explain this, so I’ll leave that to you.) However, the original query I’ve been told to rewrite was not a SELECT, but the DELETE below:

delete from t1 a
   where not exists
    (select 1 from t2 b where b.id=a.id);

I haven’t yet found how to rewrite it in a way that makes it more efficient with an OUTER JOIN. I’ll be happy if someone can help me, at least to find the syntax, if not to enhance response time.

4 Responses

  1. Preetha says:

    delete from t1 a
    where exists
    (select 1 from t1 one left outer join t2 two on two.id = one.id where two.id is null);

  2. Ash says:

    just stumbled upon this, so maybe useless by now…

    Anyway, the delete should be similar to select of the missing records:

    select a.id from t1 a left join t2 b on a.id=b.id where b.id is null;

    just apply the delete syntax of your DBMS:

    delete from t1 a left join t2 b on a.id=b.id where b.id is null;

    or in MSAccess:

    delete a.* from t1 as a left join t2 as b on a.id=b.id where b.id is null;

    hope it helps.

  3. Jack McGarry says:

    You can do this and get the same result…

    delete
    from t1 a
    where 0 = (select count(b.id)
    from t2 b
    where b.id = a.id);

Leave a Reply

Start NowWith Pythian - database design, management and emergency handling capabilities...

Live Updates

pythian: RT @sheeri: #confoo talk "Bending Queries to your Will with EXPLAIN" slides http://bit.ly/explainslides & handout
more



Testimonials

  • Serge Racine

    DBA, Brookfield Energy

    We are very satisfied by the service given to us by Andre and Shakir in support of our recent data quality and reorganization initiative.... more