Old Dog with New Tricks: Indexing Null Columns in Oracle
I have been an Oracle DBA for quite some time now. Like most humans, when I find something that works well, I tend to use it and use it and use it.
Take, for instance, Oracle Function-Based Indexes. When you need to index a column based on data that is not there, you can create a Function-Based Index. Oracle will create a pseudocolumn for the results of the Function-Based Indexes.
As a quick recap on pseudocolumns, there are two types of Oracle psuedocolumns: default pseudocolumns (such as rownum) and those created by Oracle when you create Function-Based Indexes. These “manually created” pseudocolumns will hold a copy of the converted data.
In order to be able to use Function-Based Indexes, both QUERY_REWRITE_ENABLED and QUERY_REWRITE_INTEGRITY must be set. QUERY_REWRITE_ENABLED will allow Oracle to rewrite the query. and then find the most efficient path using the rewritten query.
QUERY_REWRITE_INTEGRITY determines the degree to which Oracle must enforce query rewriting.
Two common scenarios of when you would want to use a Function-Based Index:
a) Users can enter their name in any format, uppercase, lowercase, or mixed case:- Smith
- SMITH
- smith
When you search, you want to find every Smith. So you would use a where clause, such as:
- upper(lastname) = ‘SMITH’
b) Columns are nullable, and you are looking for any records where there is a null in a column. To search this, you would need to use:
- where mycolumn is null
In both scenarios A and B, Oracle would have to do a full table scan to find the relevant records. To avoid doing full table scans to identify a few records, I would create indexes such as:
- create index users_lastname on users_table ( upper (lastname) );
- create index find_null on users_table ( nvl ( mycolumn, ‘ISNULL’ );
Oracle would then create a psuedocolumn and create an index based on the pseudocolumn, making queries more efficient.
So, what is the new trick for this old dog? If I want to easily identify all records with a NULL value in a column, I can use this syntax:
- create index find_null on users_table ( mycolumn, 1) ;
This functionality has actually been available since Oracle 11g. It has been a long time since I have had to create an index to identify the records with null values, and only recently had the need to revisit this scenario. Normally, Oracle will not index columns with null values. However, by adding an extra character to the index (‘1), Oracle will index the nullable values.
The following example demonstrates the above:
a) Create a procedure to create and load test table ( logic modified from “How to Quickly Create Large Tables for Testing (Doc ID 810826.1”)
create or replace PROCEDURE TESTPROC
( val1 IN NUMBER DEFAULT 1
, val2 IN NUMBER DEFAULT 1) as
col1_val varchar2(20);
col2_val varchar2(20);
table_exists number(1,0);
BEGIN
-- Check whether the table exists or not
-- If yes, then the table is truncated
select count(*) into table_exists from user_objects where object_name = 'BABETTE_TEST' and object_type='TABLE';
if table_exists=1 then
execute immediate 'truncate table BABETTE_TEST';
else
execute immediate 'CREATE TABLE BABETTE_TEST (COL1 VARCHAR2(20), COL2 VARCHAR2(20) )';
end if;
for i in 1..val1 loop
col1_val := 'A'||i;
for j in 1..val2 loop
col2_val := 'B'||j;
execute immediate 'insert into BABETTE_TEST values(:col1_val,:col2_val )'
using col1_val,col2_val;
end loop;
commit;
end loop;
commit;
END TESTPROC;
/
SQL> exec testproc (100,2000);
b) Set a few rows to null value and query for records with null values. Notice that a full table scan is done :
SQL> update babette_test set col2 = null where rownum < 100;c) Add a function-based index using the new syntax. Verify that a pseudocolumn was added to the table.
SQL> exec DBMS_STATS.GATHER_TABLE_STATS('SYS','BABETTE_TEST');
SQL> set autotrace on
ORC1 (SYS) SQL> update babette_test set col2 = 'GOOD' where col2 is null;
99 rows updated.
Execution Plan
----------------------------------------------------------
Plan hash value: 3281971041
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | 1 | 6 | 117 (7)| 00:00:01 |
| 1 | UPDATE | BABETTE_TEST | | | | |
|* 2 | TABLE ACCESS FULL| BABETTE_TEST | 1 | 6 | 117 (7)| 00:00:01 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("COL2" IS NULL)
SQL> create index babette_test_ix on babette_test (col2, 1) ;
SQL> select * from dba_ind_expressions where table_name = 'BABETTE_TEST';
INDEX_OWNER INDEX_NAME TABLE_OWNER TABLE_NAME COLUMN_EXPRESSION POS
------------ ---------------- ------------ -------------- ------------------ -----
SYS BABETTE_TEST_IX SYS BABETTE_TEST 1
d) Query for records with null values and notice that an index scan is done.
SQL> update babette_test set col2 = null where rownum < 100;
SQL> exec DBMS_STATS.GATHER_TABLE_STATS('SYS','BABETTE_TEST');
SQL> update babette_test set col2 = 'GOOD' where col2 is null;
Execution Plan
----------------------------------------------------------
Plan hash value: 4024672828
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | 99 | 594 | 3 (0)| 00:00:01 |
| 1 | UPDATE | BABETTE_TEST | | | | |
|* 2 | INDEX RANGE SCAN| BABETTE_TEST_IX | 99 | 594 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("COL2" IS NULL)