Designing for Exadata: Maximizing Storage Index Use
Oracle Exadata V2 is a very well balanced database machine combined with smart and innovative software. One of these innovative features is the Storage Index (see also Marc’s post here) – a game changing feature in my opinion.
What is the Storage Index?
The Storage Index is more similar to partitioning than to an index; it could be called an anti-index. It allows Oracle to skip reading 1 MB granules when they are known not to contain the requested data. Similar to bloom filters (see Christian Antognini's paper), storage indexes reduce the amount of work required, much like partition elimination, but they are far more granular and fully automatic on ALL columns (certain datatypes excluded).
Here’s an extreme example of a Storage Index at play. CKTEST is an 80 GB table with no partitioning and no indexes of any kind. The ID column is the equivalent of a primary key: an ever-incrementing value.[sql]select * from cktest where id = 283942;
cell physical IO bytes eligible for predicate offload 81 920 000 000
cell physical IO bytes saved by storage index 81 917 984 768
physical read bytes 81 920 212 992
physical read flash cache hits 0
physical reads 10 000 026
physical reads cache 26
physical reads cache prefetch 0
physical reads direct 10 000 000[/sql]
This query scanned an 80 GB table with no indexes and returned in 0.3 seconds.
Notice that Oracle still reports physical reads even though those reads did not happen. Oracle “fills in” the statistics with the granules it would have read had storage indexes not been in place.
Note that this feature is no replacement for primary key lookups. A cached primary key lookup is somewhere in the 0.000003 seconds range.
How do you design for and leverage Storage Indexes?
The most important component is data locality. Similar data (for example, rows for the same region) needs to be physically co-located in order to fully benefit from storage indexes. Some data naturally sits together, such as most ever-incrementing numbers (most IDs) as well as dates.
During ETL processes, for example, batches of data should be ordered when loaded into the table: by product, by region, or by any combination that will be effective. As a result, you no longer need to index your “batch_id” equivalent column.
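As a minimal sketch of that ETL pattern (the table and column names here are hypothetical), loading with an explicit ORDER BY keeps similar values within the same 1 MB regions:

[sql]
-- Hypothetical ETL load: ordering by region_id and batch_id clusters
-- similar values together, so storage indexes can later skip whole
-- 1 MB regions that cannot contain the requested values.
insert /*+ append */ into sales_fact
select *
from   staging_sales
order by region_id, batch_id;
commit;
[/sql]

The direct-path APPEND hint keeps the load fast; the ORDER BY is what creates the data locality that storage indexes exploit.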
This is not that different from classical non-Exadata data warehouse design with bitmap indexes. Data locality helps keep bitmap indexes extremely small, and queries using them can benefit from multi-block reads during index range scans: Oracle scans one index row and then reads up to 128 table blocks in a single request if those blocks are sequential.
It sounds simple, and it is. Just order the data, and leverage an amazing optimization.
Ordering similar data together also allows for better compression. Especially with the new HCC (Hybrid Columnar Compression), much higher ratios can be achieved. And all the decompression is offloaded to the storage cells, so in theory, there’s no overhead.
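As an illustration (the table names are hypothetical; the COMPRESS FOR QUERY HIGH syntax is the standard HCC clause on Exadata), a pre-sorted HCC copy of a table can be built like this:

[sql]
-- Hybrid Columnar Compression is available on Exadata storage;
-- 'query high' favors compression ratio for warehouse-style scans.
create table sales_fact_hcc
compress for query high
as
select * from sales_fact
order by region_id;  -- ordering improves both compression and storage indexes
[/sql]

The same ORDER BY that helps storage indexes also groups similar column values into each compression unit, which is where HCC gets its high ratios.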
As with every great feature, there are limitations on when they can be used.
Updated: 13 Sept 2010
Storage indexes are used during smart scans, so all the limitations of smart scans apply to storage indexes. They do not work with joins. Bind variables are supported, however their use is slightly more restrictive than with regular indexes/queries. For example, the query:[sql] select * from cktest where id = :x;[/sql]
will only use storage indexes if :x matches the datatype of the column. This is contrary to regular indexes, where if “id” is a number and “:x” is a string, the implicit conversion to_number(:x) will still allow a b-tree index to be used.
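The datatype-matching rule for binds can be sketched as follows (SQL*Plus syntax; the cktest table and its NUMBER-typed id column are from the example above):

[sql]
-- id is a NUMBER column: a matching NUMBER bind keeps storage indexes usable.
variable x number
exec :x := 283942
select * from cktest where id = :x;   -- datatypes match: storage index eligible

-- A VARCHAR2 bind against the same NUMBER column forces an implicit
-- conversion, and the storage index is not used.
variable y varchar2(20)
exec :y := '283942'
select * from cktest where id = :y;   -- datatype mismatch: storage index skipped
[/sql]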
When I originally posted this blog, I reported that storage indexes do not work with bind variables, as I had only experimented with number columns and varchar binds.
Note, however, that with literals, a datatype conversion of “number_column = ‘4’ ” (the string ‘4’) does not disable storage indexes.
But “varchar_column = 4” (the number 4) does disable them. So the disabling of storage index use due to implicit conversion works in exactly the opposite way (with a twist for binds) from b-tree index use, where number = string uses the index but string = number does not: Oracle converts the latter to “to_number(column) = number”, which disables the index.
I am sure this one will bite a number of people in the future, especially those with past implicit conversion experience.
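The two literal cases above can be written out side by side (the table t and its columns are hypothetical, named only to mirror the text):

[sql]
-- Case 1: number column compared to a string literal. Oracle converts
-- the literal, and the storage index still works.
select * from t where number_column = '4';

-- Case 2: varchar column compared to a number literal. Oracle rewrites
-- this as to_number(varchar_column) = 4, which disables the storage
-- index -- the opposite of b-tree index behavior.
select * from t where varchar_column = 4;
[/sql]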
Credit for this update goes to Kerry Osborne’s blog post; he did the additional research that led to it.
End Update from 13 Sept 2010
This raises the question of whether a star schema should be used at all with Exadata.
What’s the purpose of dimensions in large fact tables? Compression! They are used so that highly repeated data is “extracted” into a common area, and a pointer to it (a foreign key) is used instead. There are other uses, of course, particularly when determining what kind of search criteria we have.
But combined with HCC, the benefits of dimensions become questionable. Storage indexes and pure flat tables, very similar to text files, might be better.
Of course, any combination of the two is quite a valid approach. This is why Oracle is so great – all the features come in the same package. Which one you use and how you combine them is what defines how well a database will perform.
Since storage indexes are anti-indexes, i.e. they tell you where your data is not, they cannot be used in all the ways real indexes and partitions can. For example, min() and max() cannot currently benefit from storage indexes, although in theory they could.
Also, since storage indexes are part of smart scan, they cannot be used in any combination (for the same table) with other indexes. You either perform a full table scan and use storage indexes, or use classical multiple bitmap indexes.
They do, however, work well with partitions and sub-partitions. And partition elimination works with bind variables, with subqueries, and in even more ways than you may be used to, via the JOIN-FILTER execution path (a variation of bloom filtering).
At this point you have a design choice: which queries will use a bitmap index/dimension combination, and which will use straight literals with storage indexes.
The optimizer has no clue about storage indexes. This means a query’s execution path will be calculated as if storage indexes did not exist. So anyone mixing classical index design with flat storage index design in the same table may have to use hinting to ensure that queries intended to use storage indexes run with full table scans. Of course, this only applies to queries whose predicates on that table could also use bitmaps.
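A hint of that kind can be sketched as follows (the sales_fact table, its alias, and the batch_id column are hypothetical):

[sql]
-- Force a full table scan so the smart scan path (and with it the
-- storage index) is taken instead of an existing bitmap index,
-- which the optimizer might otherwise prefer.
select /*+ full(f) */ *
from   sales_fact f
where  batch_id = 283942;
[/sql]

The FULL hint only makes sense when the data is well clustered on the predicate column; otherwise the bitmap index path may genuinely be cheaper.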
And a final caveat: DML (UPDATE, DELETE, and probably MERGE as well) cannot use storage index elimination in serial execution. And parallel DML locks the table (or partition), preventing any other modifications of the table (reads are still fine) for the life of the transaction.
As with any feature, there are intended use cases, limitations, and caveats. Use it right, and amazing performance gains can be achieved. Use it wrong, and nothing will happen. That is what is great about storage indexes in particular: they are there, without any overhead. It’s only a question of how to leverage them, alone or in combination with every other feature that Oracle Exadata has to offer.