Tips for Better Capacity Planning on Exadata

Exadata X8-2 comes with up to 3 petabytes (PB) of raw disk capacity. This post briefly explains the basics of capacity planning on Exadata.
Why should you care about capacity planning when you’re already loaded with Petabytes of space?
First, you don’t really have petabytes (PB) of space. A full rack configuration of 8 compute nodes and 14 storage cells provides about 2.3 PB of raw space, which, after accounting for high redundancy disk groups, effectively provides about 780 terabytes (TB) of storage for the databases. On a quarter rack, that comes out to about 168 TB.
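To see what redundancy actually leaves you on your own rack, you can query the ASM disk groups directly; the USABLE_FILE_MB column already accounts for the redundancy overhead. A minimal sketch, assuming you have access to the ASM views:

SQL> select name,
            type,
            round(total_mb/1024/1024, 2)       raw_tb,
            round(usable_file_mb/1024/1024, 2) usable_tb
     from   v$asm_diskgroup
     order by name;

The TYPE column shows the redundancy level (EXTERN, NORMAL, or HIGH), which makes it easy to see where the gap between raw and usable space comes from.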
Even 168 TB on a quarter rack configuration is a huge amount of storage space. Different applications accumulate data at different rates, though, so without an effective capacity planning strategy you might find yourself running out of space very quickly. This is particularly worrisome given that you can’t add storage to an Exadata rack without some sort of expansion, and such an undertaking tends to be expensive.
So, how do you go about capacity planning on an Oracle database? To project our future storage needs, we need to look at past trends. Here are a few SQL queries that will help you gather the past space utilization of your databases.
The SQL below shows the growth of an Oracle database over the last 30 days. The data is pulled from Automatic Workload Repository (AWR) snapshots; in our case, the AWR retention period was 30 days.
SQL> with TS_GROUP as (
       select /*+ materialize parallel(t,2) ordered */
              to_char(c.begin_interval_time,'yyyy-mm-dd') report_date,
              b.name ts_name,
              (round(max((a.tablespace_size*8192))/1024/1024/1024,2)) size_gb,
              (round(max((tablespace_usedsize*8192))/1024/1024/1024,2)) used_gb
       from   dba_hist_tbspc_space_usage a,
              v$tablespace b,
              dba_hist_snapshot c
       where  a.tablespace_id = b.ts#
       and    a.snap_id = c.snap_id
       group by to_char(c.begin_interval_time,'yyyy-mm-dd'), b.name)
     select report_date,
            sum(size_gb) total_size_GB,
            sum(used_gb) used_size_GB
     from   TS_GROUP
     group by report_date
     order by report_date;

REPORT_DAT TOTAL_SIZE_GB USED_SIZE_GB
---------- ------------- ------------
2021-08-17        277.98       131.82
2021-08-18        278.01       133.31
2021-08-19        278.02       132.14
2021-08-20        278.12       132.26
2021-08-21        278.18       132.32
2021-08-22        278.2        132.08
2021-08-23        278.2        132.55
..
..
To get the growth report for a specific tablespace, use the below query:
SQL> select /*+ materialize parallel(t,2) ordered */
            to_char(c.begin_interval_time,'yyyy-mm-dd') get_date,
            b.name ts_name,
            (round(max((a.tablespace_size*8192))/1024/1024/1024,2)) size_gb,
            (round(max((tablespace_usedsize*8192))/1024/1024/1024,2)) used_gb
     from   dba_hist_tbspc_space_usage a,
            v$tablespace b,
            dba_hist_snapshot c
     where  a.tablespace_id = b.ts#
     and    a.snap_id = c.snap_id
     and    b.name = '&tablespace_name'
     group by to_char(c.begin_interval_time,'yyyy-mm-dd'), b.name
     order by 1;

Enter value for tablespace_name: USERS
old  12: and b.name='&tablespace_name'
new  12: and b.name='USERS'

GET_DATE   TS_NA    SIZE_GB    USED_GB
---------- ----- ---------- ----------
2021-08-17 USERS      11.01      10.44
2021-08-18 USERS      11.01      10.44
2021-08-19 USERS      11.01      10.44
2021-08-20 USERS      11.01      10.44
2021-08-21 USERS      11.01      10.44
..
..
What if your AWR retention is only one week and you need data from previous months? You can mine this information from the Oracle Enterprise Manager (OEM) repository database using the following queries.
To get the growth of a database use the below query:
SQL> select to_char(rollup_timestamp,'YYYY-MM-DD') DAY,
            target_name as DATABASE_NAME,
            sum(average/1024) SIZE_USED_GB
     from   sysman.mgmt$metric_daily
     where  target_name = '&Database_name'
     and    column_label = 'Tablespace Used Space (MB)'
     group by rollup_timestamp, target_name
     order by 1;

DAY        DATABASE_NAME                  SIZE_USED_GB
---------- ------------------------------ ------------
2021-07-05 orcl                             111.918549
2021-07-06 orcl                             111.955902
2021-07-07 orcl                             111.927277
2021-07-08 orcl                             111.971161
2021-07-09 orcl                             111.923553
..
..
To get the growth of a specific tablespace in a database use the below query:
SQL> select to_char(rollup_timestamp,'YYYY-MM-DD') DAY,
            key_value as TABLESPACE_NAME,
            sum(average/1024) SIZE_USED_GB
     from   sysman.mgmt$metric_daily
     where  target_name = '&Database_Name'
     and    key_value = '&Tablespace_name'
     and    column_label = 'Tablespace Used Space (MB)'
     group by rollup_timestamp, key_value
     order by 1;

DAY        TABLESPACE_NAME                SIZE_USED_GB
---------- ------------------------------ ------------
2021-07-05 SYSAUX                           1.36871338
2021-07-06 SYSAUX                           1.36993408
2021-07-07 SYSAUX                           1.36572266
2021-07-08 SYSAUX                           1.36590576
..
..
Once we understand the past growth of the database, we can make a reasonable projection of the storage we’ll require in the future.
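You can even let the database fit the trend line for you. The sketch below builds on the AWR data used earlier (so it assumes the same retention window) and uses Oracle’s REGR_SLOPE function to estimate the average daily growth in GB; treat it as an illustration rather than a finished forecasting tool:

SQL> with daily_usage as (
       select trunc(c.begin_interval_time) snap_day,
              sum(a.tablespace_usedsize*8192)/1024/1024/1024 used_gb
       from   dba_hist_tbspc_space_usage a,
              dba_hist_snapshot c
       where  a.snap_id = c.snap_id
       group by trunc(c.begin_interval_time))
     select round(regr_slope(used_gb, snap_day - date '2000-01-01'), 3) growth_gb_per_day
     from   daily_usage;

Dividing your remaining free space by this slope gives a rough number of days until you run out, which is usually good enough to decide when archiving or a storage expansion needs to be on the calendar.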
In our experience, the best way to manage storage in an Oracle database is to always look for data that can be archived. Several customers use extract, transform, and load (ETL) jobs on data warehouses to process data and delete the rows that are no longer needed.
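If the data is range-partitioned by date, archiving the old rows can be as simple as dropping the oldest partition. A hedged sketch, with hypothetical table and partition names:

SQL> alter table sales drop partition sales_q1_2019 update global indexes;

The UPDATE GLOBAL INDEXES clause keeps any global indexes usable during the drop. If you need to retain a copy of the data first, export the partition or exchange it with a standalone table before dropping it.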
What if you need to hang onto all of the data, say for regulatory purposes? Compression is the way to go. Oracle provides up to 10x compression when using Hybrid Columnar Compression (HCC) on Exadata. HCC is an effective way to archive older data that sees little data manipulation language (DML) activity. This blog post by Oracle answers common questions about HCC:
https://blogs.oracle.com/dbstorage/post/hybrid-columnar-compression-common-questions
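As a sketch of what HCC looks like in practice (the table and partition names are hypothetical), compression is enabled per table or partition with the COMPRESS FOR clause; ARCHIVE HIGH gives the highest ratio and suits rarely touched data:

SQL> alter table sales move partition sales_q1_2019 compress for archive high;

SQL> select table_name, partition_name, compress_for
     from   user_tab_partitions
     where  table_name = 'SALES';

Note that MOVE rewrites the partition, so plan the operation for a maintenance window and rebuild any indexes it invalidates.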
Capacity planning is something that can easily be overlooked when planning a migration onto Exadata or even after moving to Exadata. However, having an effective strategy to manage space growth on your Oracle database will keep you in good stead for the lifecycle of the database.
Have you encountered capacity planning issues with Exadata? Let me know in the comments how I can help.
Don’t forget to sign up for more updates here.