Oracle file extent map, the old-fashioned way

Have you ever wanted to know where exactly in a datafile segments are placed? Have you ever wondered just how fragmented a tablespace is? How about how much space you could reclaim if the last segment of a datafile was moved out of the tablespace? These questions, and many more, can be easily answered with a detailed extent map of the datafiles.
Yes, I know, this subject is a rather old one, and a few different solutions have been provided by several professionals, including the famous Tom Kyte. But none of the answers I found did exactly what I wanted, so I wrote my own. OEM does provide this, but at a price (the Tablespace Map is part of the Oracle Tuning Pack), and I like the free stuff and the extra flexibility I get from plain queries.
As all of you know, Oracle has been providing this extent map via the dba_extents view forever (or at least since v6, which is the first version I worked with). The problem is that today's super-large databases tend to have segments with thousands of extents. These extents are either of fixed size (uniform-sized LMTs, DMTs with pct_increase=0) or of variable sizes (plain DMTs, system-managed LMTs). In both cases, the dba_extents view provides one entry per extent, but several of those extents may in fact be contiguous. To answer the questions I began with, a view of this extent list aggregated by segment is much more useful.
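Before aggregating anything, you can see the raw picture directly. Here is a quick analytic sketch of mine (not part of the main script below; MY_TS is a placeholder tablespace name) that flags which rows of dba_extents are physically contiguous with their predecessor in the same file:

-- flag extents that continue the previous extent in the same file
select owner, segment_name, file_id, block_id, blocks,
       case
         when lag(block_id + blocks)
                over (partition by file_id order by block_id) = block_id
         then 'contiguous with previous'
       end adjacency
  from dba_extents
 where tablespace_name = 'MY_TS'   -- placeholder
 order by file_id, block_id;

Runs of rows flagged as contiguous and belonging to the same segment are exactly what the aggregated map below collapses into single entries.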
To have a fully aggregated map, one needs to union the contents of dba_extents and dba_free_space. Piece of cake, right? No! The basic hierarchical query is relatively simple to understand, but getting Oracle to run it efficiently is something else. No matter how I wrote the query, with or without hints, I simply could not get it to produce the aggregated map in an acceptable amount of time, or without putting an insane load on the server.
The main problem is the underlying complexity of the dba_extents and dba_free_space views. The solution is simple: pull the data you want into a regular heap table (I used global temporary tables), and the case is solved. The queries to produce the map can now run in just a few minutes, if not seconds.
Here is what I wrote to produce the aggregated extent map. From this map, you can run all kinds of interesting and useful queries. I’ve also provided some of those to get you started.
The base script:
--
-- Create the global temporary tables.
--
create global temporary table seg_list
  (file_id       number,
   block_id      number,
   owner         varchar2(30),
   segment_name  varchar2(30),
   segment_type  varchar2(30),
   blocks        number,
   constraint seg_list_pk primary key
     (file_id, block_id, owner, segment_name, segment_type))
;

create global temporary table aggregated_extent_map
  (file_id       number,
   root_block_id number,
   owner         varchar2(30),
   segment_name  varchar2(30),
   segment_type  varchar2(30),
   total_blocks  number)
;

--
-- Load the base extent data
-- from dba_extents and dba_free_space
--
insert into seg_list
select file_id, block_id, owner,
       segment_name, segment_type, blocks
  from dba_extents
-- this is optional, you can load all your tablespaces
-- where tablespace_name = 'MY_TS'
union all
select file_id, block_id, 'free space',
       'free space', 'free space', blocks
  from dba_free_space
-- this is optional, you can load all your tablespaces
-- where tablespace_name = 'MY_TS'
;

--
-- Generate the aggregate extent map using a hierarchical query.
-- Be patient, this will take a short while depending on the number
-- of extents to process and your system's speed. It took 5:02.69
-- minutes on a dev server to process 18033 extents and
-- generated 11848 aggregated extents.
--
insert into aggregated_extent_map
select file_id, root, owner, segment_name, segment_type, sum(blocks)
  from (
        select owner, segment_name, segment_type, file_id, blocks, block_id,
               -- extract the first block_id of the chain (the root)
               -- from the sys_connect_by_path string
               substr(sys_connect_by_path(block_id,'/'), 2,
                      decode(instr(sys_connect_by_path(block_id,'/'),'/',2) - 2,
                             -2, length(sys_connect_by_path(block_id,'/')),
                             instr(sys_connect_by_path(block_id,'/'),'/',2) - 2)) root
          from seg_list a
         -- roots are the first extent of each file, plus every extent
         -- whose immediate predecessor belongs to a different segment
         start with (file_id, block_id) in
               (select file_id, block_id
                  from seg_list
                 where (file_id, block_id) in
                       (select file_id, min(block_id)
                          from seg_list
                         group by file_id)
                union all
                select b.file_id, b.block_id
                  from seg_list a, seg_list b
                 where b.block_id = a.block_id + a.blocks
                   and a.file_id  = b.file_id
                   and (a.owner <> b.owner or a.segment_name <> b.segment_name))
        connect by owner        = prior owner
               and segment_name = prior segment_name
               and file_id      = prior file_id
               and block_id     = prior a.block_id + prior a.blocks
       ) c
 group by owner, segment_name, segment_type, file_id, root
;

-- >>> run all your queries here...
-- Don't forget to re-populate the temporary tables if you
-- sign out or rollback.
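One caveat about the temporary tables: created as above, they default to ON COMMIT DELETE ROWS, so a commit (not just a rollback or signing out) will also empty them. If you prefer the data to survive for the whole session, create them with ON COMMIT PRESERVE ROWS instead, something like this for seg_list:

create global temporary table seg_list
  (file_id       number,
   block_id      number,
   owner         varchar2(30),
   segment_name  varchar2(30),
   segment_type  varchar2(30),
   blocks        number,
   constraint seg_list_pk primary key
     (file_id, block_id, owner, segment_name, segment_type))
on commit preserve rows;  -- rows persist until the session ends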
Here are some sample queries.
Query 1
This query lists the last five aggregated extents for each datafile.
break on file_id skip 1
set linesize 140 pagesize 10000
col file_id for 9999
col top_n noprint
col segment_type for a12
col size_mb for 999999.99

select *
  from (select a.file_id,
               rank() over (partition by a.file_id
                            order by root_block_id desc) top_n,
               segment_name, segment_type, root_block_id,
               total_blocks*(b.bytes/b.blocks)/1048576 size_mb
          from aggregated_extent_map a, dba_data_files b
         where a.file_id = b.file_id
         -- use this if you loaded more than one TS in the seg_list
         -- and tablespace_name = 'MY_TS'
         -- use this to list a single datafile
         -- and file_id = 7
       )
 where top_n <= 5
;
Sample output with segment names masked:
FILE_ID SEGMENT_NAME    SEGMENT_TYPE ROOT_BLOCK_ID    SIZE_MB
------- --------------- ------------ ------------- ----------
      1 free space      free space           16965     246.94
        IDL_UB2$        TABLE                16901       1.00
        SOURCE$         TABLE                16837       1.00
        free space      free space           16809        .44
        PLAN_TABLE      TABLE                16805        .06

      2 free space      free space          260261      29.44
        _SYSSMU64$      TYPE2 UNDO          260257        .06
        free space      free space          260253        .06
        _SYSSMU44$      TYPE2 UNDO          260249        .06
        free space      free space          242577     276.13

      3 JOB_LOGS_PK     INDEX                13381       1.00
        JOB_MASTERS     TABLE                13317       1.00
        JOB_LOGS        TABLE                13253       1.00
        JOB_MASTERS_PK  INDEX                13189       1.00
        JOB_MASTERS     TABLE                13125       1.00

      4 T1              TABLE               511997        .06
        T2              TABLE               511993        .06
        IDX1            INDEX               511989        .06
        IDX2            INDEX               511981        .13
        T3              TABLE               511973        .13

      5 free space      free space               5      64.00

      6 free space      free space           19465     103.75
        IDX3            INDEX                19401        .50
        T4              TABLE                19241       1.25
        free space      free space           18921       2.50
        T4              TABLE                18889        .25

      7 free space      free space          319493     124.00
        IDX4            INDEX               304133     240.00
        T5              TABLE               301573      40.00
        IDX4            INDEX               298757      44.00
        free space      free space          258821     624.00

      8 free space      free space           40965     320.00
        T6              TABLE                32773     128.00
        T7              TABLE                12293     320.00
        T8              TABLE                    5     192.00
...
The most interesting observation I can make from the first listing is that datafile 7 has a 624MB free-space extent in the last position of the listing. Should I need to reclaim some or all of that space, I could run a move of table T5 and a rebuild of IDX4 in place, or better yet, into another tablespace. Assuming I move them to another tablespace, I will be able to reclaim over 1GB of disk space by shrinking datafile 7.
Running that query for that single datafile and extracting more aggregated extents might show that even more space could be freed (that was not the case for me, as the sixth-from-last extent was used by another table).
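For the curious, the reorganization I am describing would look roughly like the following. The object names come from the sample listing; the datafile path and target size are placeholders you would derive from your own map:

-- move the trailing segments to another tablespace
alter table t5 move tablespace other_ts;
-- moving a table invalidates its indexes, so rebuild (and relocate) them
alter index idx4 rebuild tablespace other_ts;

-- then shrink the datafile; the new size must stay above the
-- highest block still in use (path and size are placeholders)
alter database datafile '/u01/oradata/db/my_ts_07.dbf' resize 4000m;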
Query 2
This query shows the “real” fragmentation of each datafile.
break on file_id skip 1
compute sum of cnt on file_id
compute sum of blocks on file_id
compute sum of size_mb on file_id
set linesize 140 pagesize 10000
col file_id for 9999
col segment_type for a32
col cnt for 999999
col blocks for 999999999
col size_mb for 999999.99

select a.file_id, segment_type, count(*) cnt,
       sum(total_blocks) blocks,
       sum(total_blocks*(b.bytes/b.blocks)/1048576) size_mb
  from aggregated_extent_map a, dba_data_files b
 where a.file_id = b.file_id
 group by a.file_id, segment_type
 order by file_id, cnt desc
;
FILE_ID SEGMENT_TYPE                         CNT     BLOCKS    SIZE_MB
------- -------------------------------- ------- ---------- ----------
      1 INDEX                                679       3084      48.19
        TABLE                                621      11540     180.31
        LOBSEGMENT                            65        396       6.19
        CLUSTER                               63       1468      22.94
        LOBINDEX                              51        204       3.19
        TABLE PARTITION                       27        108       1.69
        INDEX PARTITION                       24         96       1.50
        ROLLBACK                               2         28        .44
        free space                             2      15832     247.38
        CACHE                                  1          4        .06
        NESTED TABLE                           1          4        .06
*******                                  ------- ---------- ----------
sum                                         1536      32764     511.94

      2 TYPE2 UNDO                           521      51023     797.23
        free space                           120     210076    3282.44
*******                                  ------- ---------- ----------
sum                                          641     261099    4079.67

      3 TABLE                                130       8640     135.00
        INDEX                                 73       4800      75.00
*******                                  ------- ---------- ----------
sum                                          203      13440     210.00

      4 INDEX                               2113     147512    2304.88
        TABLE                               1614     187228    2925.44
        free space                           348     176808    2762.63
        LOBINDEX                              28        224       3.50
        LOBSEGMENT                            28        224       3.50
*******                                  ------- ---------- ----------
sum                                         4131     511996    7999.94

      5 free space                             1       4096      64.00
*******                                  ------- ---------- ----------
sum                                            1       4096      64.00

      6 TABLE                                124      15456     120.75
        free space                            43      16000     125.00
        LOBSEGMENT                            11       1184       9.25
        INDEX                                  1         64        .50
        LOBINDEX                               1         32        .25
*******                                  ------- ---------- ----------
sum                                          180      32736     255.75

      7 TABLE                                 71      44544     696.00
        free space                            44     264704    4136.00
        INDEX                                  2      18176     284.00
*******                                  ------- ---------- ----------
sum                                          117     327424    5116.00

      8 TABLE                                  3      40960     640.00
        free space                             1      20480     320.00
*******                                  ------- ---------- ----------
sum                                            4      61440     960.00
...
The second listing shows that file 4 is much more fragmented than the other ones, and that the fragmentation is well distributed over tables and indexes. Further analysis is required to know if this is abnormal, as it is possible that file 4 stores lots of very small objects (see listing 3).
We can also see that datafile 5 is totally empty and is very small. Perhaps it can be dropped.
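If it really is empty and you are on 10gR2 or later, the datafile can be dropped directly (tablespace name and path are placeholders):

alter tablespace my_ts drop datafile '/u01/oradata/db/my_ts_05.dbf';

On older releases you would have to drop or rebuild the whole tablespace.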
Datafile 7 is very interesting. It has 4.1GB free out of 5.1GB. But, if you refer to listing 1, this datafile as it stands can only shrink by 124MB. Some objects would need to be moved and rebuilt to get all that free space sitting at the end of the datafile, making a significant shrink possible.
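The map makes that shrink limit easy to compute. Here is a quick sketch of mine that shows, for each datafile, the block just past the last used aggregated extent, i.e. the smallest size the file could currently be resized to (it assumes the temporary tables are still loaded):

col min_size_mb for 999999.99
select a.file_id,
       max(a.root_block_id + a.total_blocks) high_water_block,
       max((a.root_block_id + a.total_blocks) * b.bytes/b.blocks)
           /1048576 min_size_mb
  from aggregated_extent_map a, dba_data_files b
 where a.file_id = b.file_id
   and a.segment_name <> 'free space'
 group by a.file_id;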
Query 3
This query shows the number of distinct aggregated segments for each datafile.
clear break
col file_id for 9999
col cnt for 999999

select file_id, count(distinct segment_name) cnt
  from aggregated_extent_map
 group by file_id
;
FILE_ID     CNT
------- -------
      1    1039
      2     361
      3      18
      4     552
      5       1
      6      28
      7      32
      8       4
...
The third listing shows why there are so many extents in datafile 4: it hosts 552 distinct objects. Listing 2 shows that this datafile has a total of 4131 contiguous extents. That's an average of 7.5 contiguous extents per object. Not that bad. A quick look at dba_data_files reveals that this file belongs to the USERS tablespace, and the dba_tablespaces view shows that the allocation_type for USERS is SYSTEM, i.e. system-managed (autoallocate) extents. System-managed extents always start small and grow to a maximum of roughly 128MB. It is therefore not surprising at all that the datafile has so many extents.
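You can check this for your own tablespaces with a simple lookup (USERS is the tablespace from the example):

col tablespace_name for a20
select tablespace_name, extent_management,
       allocation_type, segment_space_management
  from dba_tablespaces
 where tablespace_name = 'USERS';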
The problem here is that this datafile has 2.76GB of free space spread over 348 chunks, interspersed with 3783 (4131-348) segment extents. In the event that you’d need to release it back to the system, that space is not currently reclaimable without a serious reorganization.
Query 4
This query shows the largest segment, and the first and last aggregated extents per datafile.
break on file_id skip 1
set linesize 140 pagesize 10000
col file_id for 9999
col top_n noprint
col tb head ''
col segment_name head ''
col fe head ''
col le head ''
col ls head ''
col nl newline head ''
col filler1 for a42 head ''
col filler2 for a10 head ''
col size_mb for 999999.99

select aggr.file_id,
       'total blocks:' tb, '' filler1, file_blocks, null nl,
       ' First extent:' fe, minext.segment_name, aggr.min_rbi,
       minext.total_blocks, null nl,
       ' Last extent:' le, maxext.segment_name, aggr.max_rbi,
       maxext.total_blocks, null nl,
       ' largest seg :' ls, lg.segment_name, '' filler2, lg.total_blocks
  from (select file_id,
               min(root_block_id) min_rbi,
               max(root_block_id) max_rbi,
               sum(total_blocks)  file_blocks
          from aggregated_extent_map
         group by file_id) aggr,
       aggregated_extent_map minext,
       aggregated_extent_map maxext,
       (select z.*,
               rank() over (partition by file_id
                            order by total_blocks desc) top_n
          from (select file_id, segment_name,
                       sum(total_blocks) total_blocks
                  from aggregated_extent_map
                 where segment_name <> 'free space'
                 group by file_id, segment_name
                 order by total_blocks) z
       ) lg
 where minext.file_id       = aggr.file_id
   and minext.root_block_id = aggr.min_rbi
   and maxext.file_id       = aggr.file_id
   and maxext.root_block_id = aggr.max_rbi
   and lg.file_id           = aggr.file_id
   and lg.top_n             = 1
;
Sample output:
FILE_ID                                     FILE_BLOCKS
------- ------------- ------------------------- -----------
                              MIN_RBI TOTAL_BLOCKS
- ------------------- ------------- ---------- ------------
                              MAX_RBI TOTAL_BLOCKS
- ------------------- ------------- ---------- ------------
                                      TOTAL_BLOCKS
- ------------------- ------------- ---------- ------------
      1 total blocks:                               32764
         First extent: SYSTEM                 5          8
         Last extent:  free space         16965      15804
         largest seg : FGA_LOG$                       4608

      2 total blocks:                              261227
         First extent: _SYSSMU1$              6          3
         Last extent:  free space        260261       1884
         largest seg : _SYSSMU266$                   28096

      3 total blocks:                               13440
         First extent: free space             5        192
         Last extent:  JOB_LOGS_PK        13381         64
         largest seg : JOB_MASTERS                    4032

      4 total blocks:                              511996
         First extent: free space             5          8
         Last extent:  BP                511997          4
         largest seg : HOLD_LOGMINER                 34144

      6 total blocks:                               32736
         First extent: CHAT_ROOM              9        288
         Last extent:  free space         19401      13344
         largest seg : SMSMESSAGE                    10752

      7 total blocks:                               31744
         First extent: CHAT_EVENT             5        256
         Last extent:  free space         31493        256
         largest seg : PODCONTACTS                    5632

      8 total blocks:                               61440
         First extent: PODFILEACL             5      12288
         Last extent:  free space         40965      20480
         largest seg : PODFILEDATA                   20480
...
OK, you get the idea. You can run queries against this aggregated_extent_map table to get information about datafiles, segments, and tablespaces. Heck, if you're up to it, you can probably extract the list of aggregated extents for a tablespace, load it into an Excel sheet, and chart it with nice colors and all, à la OEM.
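If you do take it to Excel, a plain SQL*Plus spool is enough to get a CSV out of the map. A minimal sketch (file_id 7 and the spool file name are just examples):

set colsep ',' pagesize 0 trimspool on feedback off
spool extent_map.csv
select file_id, root_block_id, segment_name, segment_type, total_blocks
  from aggregated_extent_map
 where file_id = 7
 order by root_block_id;
spool off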
I think this mapping is very useful to anyone who wants to understand where the space is being used within their database, and to see whether maintenance would be beneficial. Remember, space is not cheap: megabytes are, but DBAs' and SAs' time is not, nor is backup storage capacity, nor the time to back up (and restore) the datafiles.
For maximum performance, keep your databases lean and clean!
Enjoy!
Marc.