1.617.682.4508

Pythian Blog

The world discusses #Pythian on Twitter. Have a question? Use our hashtag and ask away.

Emergency

24x7 Support

Not a Pythian client but need help now? No problem. Click here.

Are you aware of an existing DBA opening or consulting requirement in your organization? Enter your email for a chance to win one year's access to Safari Books.

  

Oracle: Is OUTER JOIN Better Than NOT EXISTS?

By: Grégory Guillou

I’ve been told that using NOT EXISTS in (Oracle) SQL is a bad idea, and that a way to overcome this problem is to collect the non-matching rows with an OUTER JOIN. So I decided to check if it is true.

In order to start, here is my test case:

create table t1(id number,
   constraint t1_pk primary key(id));

  create table t2(id number);

begin
  for i in 1..100 loop
    insert into t1 values(i);
  end loop;
  commit;
end;

begin
  for i in 1..100000 loop
    insert into t2
      values(mod(i,97));
  end loop;
  commit;
end;
/

create index t2_idx on t2(id);

exec dbms_stats.gather_table_stats(USER,'T1');

exec dbms_stats.gather_table_stats(USER,'T2');

First, I checked what I’d been told, i.e. that the OUTER JOIN is more efficient than the NOT EXISTS. In order to do that, I wrote a simple SELECT and displayed the plan for both syntaxes (my database is 11.1.0.6 on Linux 32-bits). As I assumed, it’s not the case. In fact, both orders took the same plan.

Here is the plan with NOT EXISTS:

explain plan for
  select id from t1 a
   where not exists
    (select 1 from t2 b where b.id=a.id);

select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------
Plan hash value: 1906534000

-----------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)|
-----------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |    96 |   480 |    46   (5)|
|*  1 |  HASH JOIN ANTI    |       |    96 |   480 |    46   (5)|
|   2 |   INDEX FULL SCAN  | T1_PK |   100 |   300 |     1   (0)|
|   3 |   TABLE ACCESS FULL| T2    |   100K|   195K|    44   (3)|
-----------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("B"."ID"="A"."ID")

Here is the plan with the OUTER JOIN:

explain plan for
  select a.id from t1 a, t2 b
   where a.id=b.id(+)
     and b.id is null;

select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------
Plan hash value: 1906534000

-----------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)|
-----------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |     3 |    15 |    46   (5)|
|*  1 |  HASH JOIN ANTI    |       |     3 |    15 |    46   (5)|
|   2 |   INDEX FULL SCAN  | T1_PK |   100 |   300 |     1   (0)|
|   3 |   TABLE ACCESS FULL| T2    |   100K|   195K|    44   (3)|
-----------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("A"."ID"="B"."ID")

I know — the real way to check that both queries are equivalent is to trace the plan generation with a 10053 event. (I cannot explain this, so I’ll leave that to you.) However, the original query I’ve been told to rewrite was not a SELECT, but the DELETE below:

delete from t1 a
   where not exists
    (select 1 from t2 b where b.id=a.id);

I haven’t yet found how to rewrite it in a way that makes it more efficient with an OUTER JOIN. I’ll be happy if someone can help me, at least to find the syntax, if not to enhance response time.

4 Responses

  1. Preetha says:

    delete from t1 a
    where exists
    (select 1 from t1 one left outer join t2 two on two.id = one.id where two.id is null);

  2. Ash says:

    just stumbled upon this, so maybe useless by now…

    Anyway, the delete should be similar to select of the missing records:

    select a.id from t1 a left join t2 b on a.id=b.id where b.id is null;

    just apply the delete syntax of your DBMS:

    delete from t1 a left join t2 b on a.id=b.id where b.id is null;

    or in MSAccess:

    delete a.* from t1 as a left join t2 as b on a.id=b.id where b.id is null;

    hope it helps.

  3. Jack McGarry says:

    You can do this and get the same result…

    delete
    from t1 a
    where 0 = (select count(b.id)
    from t2 b
    where b.id = a.id);

Leave a Reply

Start NowWith Pythian - database design, management and emergency handling capabilities...

Pythian Blog

Connecting to Oracle with SQL Server 2005 x64
The quirks of connecting to Oracle from SQL 2005 64
more



Live Updates

pythian: Pythian is now official members of the Microsoft Partner Program. Thanks Peter
more



RSSTestimonials

  • Casey Dyke

    Database Team Manager Service Delivery and Applications , Telstra

    Pythian were recently engaged to take a lead role in a high end infrastructure build project at Telstra. Our requirements were a combination of... more