<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Pythian Blog &#187; Marc Billette</title>
	<atom:link href="http://www.pythian.com/news/author/billette/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.pythian.com/news</link>
	<description>News and views from Pythian DBAs</description>
	<lastBuildDate>Fri, 19 Mar 2010 02:09:24 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Easy Pivot Query Result in pre-11g Oracle</title>
		<link>http://www.pythian.com/news/7851/easy-pivot-query-result-in-pre-11g-oracle/</link>
		<comments>http://www.pythian.com/news/7851/easy-pivot-query-result-in-pre-11g-oracle/#comments</comments>
		<pubDate>Wed, 03 Feb 2010 19:48:46 +0000</pubDate>
		<dc:creator>Marc Billette</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Technical Blog]]></category>
		<category><![CDATA[CSV]]></category>
		<category><![CDATA[pivot]]></category>
		<category><![CDATA[pivot tables]]></category>

		<guid isPermaLink="false">http://www.pythian.com/news/?p=7851</guid>
		<description><![CDATA[I was asked, the other day, to automate the creation of a client&#8217;s weekly report, which is a pivot table of some aggregate data generated by a query. 
As we know, prior to 11g, Oracle did not have a simple table pivot feature. 11g has changed that, and the pivot clause is certainly useful. It [...]]]></description>
			<content:encoded><![CDATA[<p>I was asked, the other day, to automate the creation of a client&#8217;s weekly report, which is a pivot table of some aggregate data generated by a query. </p>
<p>As we know, prior to 11g, Oracle did not have a simple table pivot feature. 11g has changed that, and the pivot clause is certainly useful. It requires, however, an aggregation calculation in the intersection (at least that&#8217;s what I got from the documentation). But what if you already have the data to populate in the intersection area? Or, you may no longer have the raw data to aggregate it again. In that case, you are forced to trick it to get an aggregation in.</p>
<p>I&#8217;ve written a fairly simple set of PL/SQL code that handles this task in all versions of Oracle that support nested associative arrays indexed by strings (I have no clue when that started getting supported. It sure works great with 10g).</p>
<p>Here is how my code works: <span id="more-7851"></span></p>
<ol>
<li>A function is used to pass in your query that retrieves your raw data.
<p>The function needs a query that returns three columns: the first one for the row labels, the second one for the column labels, and the third one for the intersection data.</p></li>
<li>The function converts the result set of the supplied query into a pivot as CSV strings (don&#8217;t worry, I&#8217;ll explain later how to make those separate columns).</li>
<li>Each line (row) is summarized in an extra trailing column.</li>
<li>An extra trailing line is also generated with the column summaries and a Grand Total at the end.</li>
<li>The function pipes the result out, as it gets readied, to the calling query, which receives it as a single column set of rows.</li>
</ol>
<p>This is the core of the work. The resulting CSV strings can be spooled to a file and loaded into any decent spreadsheet editor.</p>
<p>Some of you may say, <em>yeah okay, but I&#8217;m really looking at getting back a cursor with this data in separate columns</em>. That&#8217;s not so difficult to do, assuming you are willing to live with some minor details, which I will explain later.</p>
<p>Alright, here is the function:</p>
<pre class="brush: sql; collapse: true; light: false; toolbar: true;">
-- we need this TYPE created to return a cursor of varchar2.
create type varchar2_type as table of varchar2(2000);
/
create or replace function pivot_three_cols_table_func
(p_query_string varchar2, sum_cols_ind INTEGER default 0, sum_lines_ind INTEGER default 0)
return varchar2_type pipelined
as
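   -- inner array: one value per column label (col_2, up to 500 chars)
   type tb1  is table of number index by varchar2(500);
   -- outer array: one inner array per line label (col_1, up to 1000 chars)
   type ntb1 is table of tb1 index by varchar2(1000);
   nvar ntb1;

   -- distinct column labels; sized to match col_2
   type coltb is table of varchar2(500) index by varchar2(500);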
   cols coltb;

   col_1 varchar2(1000);
   col_2 varchar2(500);
   col_3 number;

   line varchar2(1000);
   col  varchar2(500);

   sum_lines number :=0;
   sum_cols  number :=0;
   sum_pivot number :=0;

   type item_cur_type is ref cursor;
   item item_cur_type;

   str varchar2(32767);
begin
   -- nvar starts out empty; an associative array needs no
   -- explicit initialization

   -- load the array
   open item for p_query_string;
   loop
       fetch item into col_1, col_2, col_3;
       -- stop before touching the array once the cursor is exhausted
       exit when item%notfound;
       -- replace commas with space from the col_1 and col_2 as
       -- these are used as column delimiters in the csv
       col_1 := replace(col_1,',',' ');
       col_2 := replace(col_2,',',' ');
       nvar(col_1)(col_2) := col_3;
   end loop;
   close item;

   -- print the crosstab/pivot header
   --  -- first load the list of distinct column names in an associative array
   line := nvar.first;
   for t in 1..nvar.count loop
       col := nvar(line).first;
       for d in 1..nvar(line).count loop
           cols(col) := col;
           col := nvar(line).next(col);
       end loop;
       line := nvar.next(line);
   end loop;
   -- -- print the list of distinct column names
   str := 'TITLE,';
   col := cols.first;
   for d in 1..cols.count loop
       str := str||cols(col)||',';
       col := cols.next(col);
   end loop;
   if sum_cols_ind = 1 then
      str := str||'TOTAL';
   else
      str:= substr(str,1,length(str)-1);
   end if;
   pipe row (str);

   -- sum and print the column data
   line := nvar.first;
   for t in 1..nvar.count loop
       str := line||',';
       sum_cols :=0;
       col := cols.first;
       for d in 1..cols.count loop
           begin
             str := str||to_char(nvar(line)(col))||',';
             if sum_cols_ind = 1 then
                sum_cols := sum_cols + nvar(line)(col);
             end if;
           exception when NO_DATA_FOUND then
             str := str||',';
           end;
           col := cols.next(col);
       end loop;
       if sum_cols_ind = 1 then
          str := str||to_char(sum_cols);
       else
          str:= substr(str,1,length(str)-1);
       end if;
       pipe row (str);
       line := nvar.next(line);
   end loop;

   if sum_lines_ind = 1 then
      -- sum and print the column totals
      str := 'TOTAL RESULT,';
      line := nvar.first;
      col  := cols.first;
      for d in 1..cols.count loop
        sum_lines :=0;
        line := nvar.first;
        for t in 1..nvar.count loop
          begin
            sum_lines := sum_lines + nvar(line)(col);
          exception when NO_DATA_FOUND then
            null;
          end;
          line := nvar.next(line);
        end loop;
        str := str||to_char(sum_lines)||',';
        sum_pivot := sum_pivot + sum_lines;
        col := cols.next(col);
      end loop;

      -- print the Grand Total
      str := str||to_char(sum_pivot);
      pipe row (str);
   end if;
   return;
end;
/
</pre>
<p>Lots to say about it:</p>
<ol>
<li>It does the job. At least it did&#8211;and very well&#8211;during my testing.</li>
<li>It can summarize the intersection data. So that third column in your query must be a number. It would be pretty easy to change that for someone who wanted to show text data in the intersection.</li>
<li>The first two columns must be characters. If you want numbers in the labels, simply get them converted in your driving query as in <code>to_char(num_col)</code>.</li>
<li>I have included parameters to turn the SUM calculations on and off: <code>sum_cols_ind = 1</code> adds the trailing TOTAL column to each line, and <code>sum_lines_ind = 1</code> adds the final TOTAL RESULT line. The default is to neither sum nor print these totals.</li>
<li>The function produces CSV strings so that it will support any number of columns. No need to know how many columns it will have, nor what the names are. This meant that I had to make it replace all commas in the input strings with a space. Otherwise, it would have caused the CSV parsing to fail, which would in turn have created some pretty ugly spreadsheets. So, your labels will lose any commas they may have (not a big deal in my opinion).</li>
<li>Line labels can be as large as 1000 characters. That is huge and very inconvenient, I would argue, as it would make the pivot table difficult to read. It&#8217;s up to you to control the width of the data your query provides in the first column.</li>
<li>Column labels can be as large as 500 characters. Same issue as for line labels. If you have wide column labels, your pivot table may be difficult to read, so set them appropriately in the second column.</li>
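<li>Each CSV string/line is assembled in a 32767-character PL/SQL variable, but it is piped out through <code>varchar2_type</code>, which is declared as <code>varchar2(2000)</code>. In practice, then, a line is limited to 2000 characters unless you widen that type (up to the 4000-byte limit of a SQL <code>VARCHAR2</code> in these versions). That would certainly create a very ugly pivot table if you have too many columns or very wide data, and I haven&#8217;t actually tested the limit.</li>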
</ol>
<p>Let&#8217;s see a sample query and output. It creates a pivot of Oracle archived logs showing the total size of the logs generated, by weekday and hour of the day, over the last year. DBAs will be familiar with that.</p>
<p>First here&#8217;s the source query and a data subset:</p>
<pre class="brush: sql;">
SELECT To_Char(completion_time,'Day') WD,
       To_Char(completion_time,'HH24') HR,
       round((Sum(blocks)*block_size)/1048576) mb
  FROM v$archived_log
 WHERE completion_time &gt;= SYSDATE - 365
 GROUP BY to_Char(completion_time,'Day'), To_Char(completion_time,'HH24'), block_size;

WD        HR   MB
Friday    02   80
Friday    04   131
Thursday  17   128
Thursday  19   75
Thursday  18   130
Friday    03   75
Thursday  20   80
Thursday  21   542
Thursday  15   18
Thursday  22   207
Friday    00   85
Friday    06   56
Thursday  16   128
Thursday  23   88
Friday    01   92
Friday    05   72
...
</pre>
<p>This data gets pivoted using this query:</p>
<pre class="brush: sql;">
select column_value
  from table(pivot_three_cols_table_func(
       'select d, h, round((Sum(blocks)*block_size)/1048576) MB
          from (SELECT To_Char(completion_time,''Day'') D,
                       To_Char(completion_time,''HH24'') H,
                       blocks,
                       block_size
                  FROM v$archived_log
                 WHERE completion_time &gt;= SYSDATE - 365
               )
         GROUP BY d, h, block_size
       ',1,1))
;
</pre>
<p>Note that single quotes needed to be doubled in the subquery string. Also, I&#8217;ve generated the line and column sums.</p>
<p>The output is as follows:</p>
<pre class="brush: plain; wrap-lines: false;">
COLUMN_VALUE
------------
TITLE,00,01,02,03,04,05,06,07,08,09,10,11,12,13,14,15,16,17,18,19,20,21,22,23,TOTAL
Friday   ,299,261,233,216,404,242,265,328,420,482,443,465,465,396,403,412,309,263,221,168,142,1340,386,144,8707
Monday   ,186,172,160,157,277,168,184,206,292,320,306,295,288,232,261,278,278,266,204,184,171,1730,436,192,7243
Saturday ,152,155,166,146,275,146,251,162,213,185,150,158,155,145,161,144,142,148,141,140,149,1854,278,153,5769
Sunday   ,146,141,156,144,259,142,156,163,190,174,162,165,158,141,141,141,170,202,190,172,149,1815,240,161,5678
Thursday ,209,167,198,151,274,158,186,203,291,281,338,264,296,286,259,259,272,285,257,186,172,1641,620,251,7504
Tuesday  ,160,161,164,162,280,146,174,206,269,314,313,328,276,302,284,279,287,266,261,182,199,989,362,179,6543
Wednesday,196,178,161,172,288,181,175,220,313,347,316,277,264,281,243,312,292,272,270,207,159,1026,360,165,6675
TOTAL RESULT,1348,1235,1238,1148,2057,1183,1391,1488,1988,2103,2028,1952,1902,1783,1752,1825,1750,1702,1544,1239,1141,10395,2682,1245,48119
</pre>
<p>Voila! The pivot is done and can be saved in a file and imported into a spreadsheet, or whatever other tool you use that reads CSV data.</p>
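<p>To see the shape of the result on a tiny data set, here is a minimal sketch that feeds the function three hand-made rows selected from <code>dual</code>. The labels <code>r1</code>, <code>r2</code>, <code>a</code> and <code>b</code> and the aliases <code>l</code>, <code>c</code> and <code>v</code> are made up for illustration, and it assumes the function was compiled as shown above:</p>
<pre class="brush: sql;">
-- illustrative data only: two line labels, two column labels,
-- and no value for the (r2, b) intersection
select column_value
  from table(pivot_three_cols_table_func(
       'select ''r1'' l, ''a'' c, 1 v from dual union all
        select ''r1'', ''b'', 2 from dual union all
        select ''r2'', ''a'', 3 from dual
       ',1,1))
;
</pre>
<p>Reading through the function&#8217;s logic, the header should come back as <code>TITLE,a,b,TOTAL</code>, and the missing intersection should show up as an empty field, so the <code>r2</code> line should read <code>r2,3,,3</code>.</p>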
<p>Now, for those who want distinct columns for this data, I&#8217;ve created a simple CSV parser function that can be used to extract them. Here&#8217;s the code for this CSV parser function:</p>
<pre class="brush: sql;">
create or replace function csv_element(string varchar2, element_number number)
return varchar2
as
i number := element_number;
r varchar2(2000);
begin
  case
    when (i=1 and instr(string,',',1)=0) then r:= string;
    when (i=1) then r := substr(string,1,instr(string,',',1,1) -1 );
    when (i&gt;1 and (instr(string,',',1,i-1)&gt;0 and instr(string,',',1,i)=0)) then r:= substr(string,instr(string,',',-1)+1);
    when (i&gt;1) then r := substr(string,instr(string,',',1,i-1)+1,instr(string,',',1,i)-instr(string,',',1,i-1)-1 );
    else r:= null;
  end case;
  return r;
end;
/
</pre>
<p>The <code>csv_element</code> function handles strings with no commas, and returns that value as the first and only element. It also takes care of the trailing string, and it will also return a null value for columns that are beyond the number of elements in the strings. That is as sophisticated as I needed it to be.</p>
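<p>A quick way to convince yourself of those cases is to call the function on literal strings. This is only a sketch with made-up input values; the comments show what I would expect from reading the <code>case</code> branches:</p>
<pre class="brush: sql;">
select csv_element('a,b,,d',1) e1,     -- first element: a
       csv_element('a,b,,d',3) e3,     -- empty field: null
       csv_element('a,b,,d',4) e4,     -- trailing element: d
       csv_element('a,b,,d',5) e5,     -- beyond the last element: null
       csv_element('abc',1)    e_only  -- no commas at all: abc
  from dual;
</pre>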
<p>Here&#8217;s the previous sample query using the function to return the data as separate columns:</p>
<pre class="brush: sql;">
select csv_element(column_value,1) c1, csv_element(column_value,2) c2,
       csv_element(column_value,3) c3, csv_element(column_value,4) c4,
       csv_element(column_value,5) c5, csv_element(column_value,6) c6,
       csv_element(column_value,7) c7, csv_element(column_value,8) c8,
       csv_element(column_value,9) c9, csv_element(column_value,10) c10,
       csv_element(column_value,11) c11, csv_element(column_value,12) c12,
       csv_element(column_value,13) c13, csv_element(column_value,14) c14,
       csv_element(column_value,15) c15, csv_element(column_value,16) c16,
       csv_element(column_value,17) c17, csv_element(column_value,18) c18,
       csv_element(column_value,19) c19, csv_element(column_value,20) c20,
       csv_element(column_value,21) c21, csv_element(column_value,22) c22,
       csv_element(column_value,23) c23, csv_element(column_value,24) c24,
       csv_element(column_value,25) c25, csv_element(column_value,26) c26,
       csv_element(column_value,27) c27, csv_element(column_value,28) c28
  from table(pivot_three_cols_table_func(
       'select d, h, round((Sum(blocks)*block_size)/1048576) MB
          from (SELECT To_Char(completion_time,''Day'') D,
                       To_Char(completion_time,''HH24'') H,
                       blocks,
                       block_size
                  FROM v$archived_log
                 WHERE completion_time &gt;= SYSDATE - 365
               )
         GROUP BY d, h, block_size
       ',1,1))
;
</pre>
<p>I&#8217;ve deliberately added two extra columns to the query (27 and 28) so that you can see that the function handles them gracefully. Here&#8217;s the output of this query (with some columns removed for clearer formatting):</p>
<pre class="brush: plain;">
C1           C2   C3   C4   ... C23   C24  C25  C26        C27  C28
TITLE        00   01   02   ... 21    22   23   TOTAL
Friday       299  261  233  ... 1340  386  144  8721
Monday       186  172  160  ... 1730  436  192  7243
Saturday     152  155  166  ... 1854  278  153  5769
Sunday       146  141  156  ... 1815  240  161  5678
Thursday     209  167  198  ... 1593  620  251  7456
Tuesday      160  161  164  ... 989   362  179  6543
Wednesday    196  178  161  ... 1026  360  165  6675
TOTAL RESULT 1348 1235 1238 ... 10347 2682 1245 48085
</pre>
<p>As you can see, columns 27 and 28 are there, but null. This way, you can have a generic wrapper query with a hundred columns if you&#8217;d like, without worrying about how many columns will be produced from your base query. Just make sure your app can handle that.</p>
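<p>If you want to know how many of the wrapper columns will actually be populated, one option is to count the commas in the header line. The sketch below assumes that the first row returned by the table function is the header (rows are piped in that order, but the query itself does not enforce it), and the alias <code>n_cols</code> is just an illustrative name:</p>
<pre class="brush: sql;">
-- number of CSV elements = number of commas + 1
select length(column_value) - length(replace(column_value,',')) + 1 n_cols
  from table(pivot_three_cols_table_func(
       'select d, h, round((Sum(blocks)*block_size)/1048576) MB
          from (SELECT To_Char(completion_time,''Day'') D,
                       To_Char(completion_time,''HH24'') H,
                       blocks,
                       block_size
                  FROM v$archived_log
                 WHERE completion_time &gt;= SYSDATE - 365
               )
         GROUP BY d, h, block_size
       ',1,1))
 where rownum = 1
;
</pre>
<p>For the sample header shown earlier (TITLE, 24 hours, and TOTAL), that works out to 26.</p>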
<p>Enjoy!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.pythian.com/news/7851/easy-pivot-query-result-in-pre-11g-oracle/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Standby Log Apply Elapsed Time</title>
		<link>http://www.pythian.com/news/1309/standby-log-apply-elapsed-time/</link>
		<comments>http://www.pythian.com/news/1309/standby-log-apply-elapsed-time/#comments</comments>
		<pubDate>Fri, 24 Oct 2008 18:19:49 +0000</pubDate>
		<dc:creator>Marc Billette</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[gv$log_history]]></category>
		<category><![CDATA[hidden column]]></category>
		<category><![CDATA[v$log_history]]></category>
		<category><![CDATA[x$kcclh]]></category>
		<category><![CDATA[x$kcclh.lhtsm]]></category>

		<guid isPermaLink="false">http://www.pythian.com/blogs/1309/standby-log-apply-elapsed-time</guid>
		<description><![CDATA[This has been discussed before by my colleague Paul Moen in his article on Oracle standby recovery rate monitoring, but I recently made a discovery that makes it easier to generate statistics on log apply performance, and more useful ones too.
First, let me say that this discovery is based on my observations and [...]]]></description>
			<content:encoded><![CDATA[<p>This has been discussed before by my colleague Paul Moen in his article on <a href="http://www.pythian.com/blogs/641/oracle-standby-recovery-rate-monitoring">Oracle standby recovery rate monitoring</a>, but I recently made a discovery that makes it easier to generate statistics on log apply performance, and more useful ones too.</p>
<p>First, let me say that this discovery is based on my observations and has not been verified with Oracle Support nor by any insider. If you know one who can confirm this, please ask him or her.</p>
<p>The discovery is a hidden column in the <code>x$kcclh</code> table, which is the underlying table for <code>gv$log_history</code> (and therefore <code>v$log_history</code>). It&#8217;s the only column not exposed in the fixed views. Why? I don&#8217;t know, but it sure would be nice to have it exposed and clearly (officially) defined. The column is <code>x$kcclh.lhtsm</code>. It stores the timestamp of when the log began to be applied: the same timestamp as the one printed in the alert log for the &#8220;Media Recovery Log&#8230;&#8221; message, and the same one recorded in the <code>v$dataguard_status.message</code> text column.</p>
<p>The advantage of accessing this information from the <code>x$kcclh</code> table is that you get access to more historical data (the number of days will vary based on your redo log switch rate and the size of the control files and possibly on other factors).</p>
<p>The time difference between two consecutive logs, say 1 and 2, gives us the time spent applying log 1. At least that&#8217;s the theory. Some factors that we can&#8217;t really take into account are the log delay set in the <code>log_archive_dest_n</code> parameter on the primary db (it&#8217;s an optional setting), and any unexpected events, such as an MRP0 process getting hung or a FAL server having trouble sending a log, etc.</p>
<p>Nevertheless, I feel it is very useful information. Any anomaly can be easily identified, as it would show up with an unusual elapsed time.</p>
<p>Here&#8217;s the query I use to get the log apply elapsed time:</p>
<p><span id="more-1309"></span></p>
<pre>
set linesize 80
select b.lhseq sequence#, b.lhtsm fromtime, a.lhtsm totime,
 ((to_date(a.lhtsm,'MM/DD/RR HH24:MI:SS') -
   to_date(b.lhtsm,'MM/DD/RR HH24:MI:SS')) * 1440) minutes_to_apply
  from sys.x$kcclh a, sys.x$kcclh b
 where a.lhrid = b.lhrid + 1
 order by a.lhrid
;
</pre>
<p>Sample output:</p>
<pre>
SEQUENCE# FROMTIME            TOTIME              MINUTES_TO_APPLY
--------- ------------------- ------------------- ----------------
    97077 09/18/2008 03:28:32 09/18/2008 03:43:31       14.9833333
    97078 09/18/2008 03:43:31 09/18/2008 03:58:31               15
    97079 09/18/2008 03:58:31 09/18/2008 04:13:31               15
    97080 09/18/2008 04:13:31 09/18/2008 04:28:31               15
    97081 09/18/2008 04:28:31 09/18/2008 04:43:33       15.0333333
    97082 09/18/2008 04:43:33 09/18/2008 04:58:33               15
    97083 09/18/2008 04:58:33 09/18/2008 05:13:36            15.05
    97084 09/18/2008 05:13:36 09/18/2008 05:28:33            14.95
    97085 09/18/2008 05:28:33 09/18/2008 05:43:31       14.9666667
    97086 09/18/2008 05:43:31 09/18/2008 05:58:31               15
</pre>
<p>Here&#8217;s a query to show the min, max and average elapsed time:</p>
<pre>
set linesize 140
col min_log_date newline
col max_log_date newline
col max_log_apply_duration for a30 newline
col min_log_apply_duration for a30 newline
col avg_log_apply_duration for a30 newline
select to_char(min(lhlot),'MM/DD/YYYY HH24:MI:SS') min_log_date,
       to_char(max(lhlot),'MM/DD/YYYY HH24:MI:SS') max_log_date,
       numtodsinterval(max(to_number(a_lhtsm - b_lhtsm)),'DAY') max_log_apply_duration,
       numtodsinterval(min(to_number(a_lhtsm - b_lhtsm)),'DAY') min_log_apply_duration,
       numtodsinterval(avg(to_number(a_lhtsm - b_lhtsm)),'DAY') avg_log_apply_duration
  from (
        -- convert the string timestamps to dates so that min/max
        -- compare chronologically rather than alphabetically
        select to_date(a.lhlot,'MM/DD/RR HH24:MI:SS') lhlot,
               to_date(a.lhtsm,'MM/DD/RR HH24:MI:SS') a_lhtsm,
               to_date(b.lhtsm,'MM/DD/RR HH24:MI:SS') b_lhtsm
          from sys.x$kcclh a, sys.x$kcclh b
         where a.lhrid = b.lhrid + 1
       );
</pre>
<p>Sample output:</p>
<pre>
MIN_LOG_DATE
--------------------
MAX_LOG_DATE
--------------------
MAX_LOG_APPLY_DURATION
------------------------------
MIN_LOG_APPLY_DURATION
------------------------------
AVG_LOG_APPLY_DURATION
------------------------------
09/18/2008 04:28:30
10/21/2008 15:06:56
+000000000 04:16:55.999999999
+000000000 00:00:05.999999999
+000000000 00:11:47.666911225
</pre>
<p>The following query shows logs that took over 30 minutes to apply:</p>
<pre>
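-- b is the earlier record, so b.lhseq is the log whose apply time
-- is being measured (same convention as the first query)
select b.lhseq, b.lhtsm fromtime, a.lhtsm totime,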
       ((to_date(a.lhtsm,'MM/DD/RR HH24:MI:SS') -
         to_date(b.lhtsm,'MM/DD/RR HH24:MI:SS')) * 1440) minutes_to_apply
  from sys.x$kcclh a, sys.x$kcclh b
 where to_date(a.lhlot,'MM/DD/RR HH24:MI:SS') &gt; sysdate - 60
   and a.lhrid = b.lhrid + 1
   and  ((to_date(a.lhtsm,'MM/DD/RR HH24:MI:SS') -
          to_date(b.lhtsm,'MM/DD/RR HH24:MI:SS')) * 1440) &gt; 30
order by a.lhrid
;
</pre>
<p>Sample output:</p>
<pre>
 LHSEQ FROMTIME             TOTIME               MINUTES_TO_APPLY
------ -------------------- -------------------- ----------------
 98276 09/28/2008 08:04:41  09/28/2008 11:04:41               180
 98278 09/28/2008 11:25:24  09/28/2008 12:07:06              41.7
 98280 09/28/2008 12:15:35  09/28/2008 14:23:01        127.433333
 98294 09/28/2008 14:33:29  09/28/2008 15:13:33        40.0666667
 98297 09/28/2008 15:51:18  09/28/2008 16:31:42              40.4
 98423 09/28/2008 19:36:32  09/28/2008 20:20:56              44.4
 98533 09/28/2008 22:16:22  09/28/2008 22:54:11        37.8166667
 98662 09/29/2008 02:29:11  09/29/2008 03:37:19        68.1333333
 98663 09/29/2008 03:37:19  09/29/2008 04:13:37              36.3
 98664 09/29/2008 04:13:37  09/29/2008 04:44:14        30.6166667
 98669 09/29/2008 05:54:33  09/29/2008 07:19:25        84.8666667
100220 10/13/2008 07:50:41  10/13/2008 08:50:27        59.7666667
100422 10/15/2008 06:10:58  10/15/2008 10:27:54        256.933333
100537 10/16/2008 08:40:24  10/16/2008 10:05:32        85.1333333
</pre>
<p>I hope you find these queries useful for monitoring your standby database log apply rate and efficiency.</p>
<p>Enjoy!<br />
Marc.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.pythian.com/news/1309/standby-log-apply-elapsed-time/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Oracle File Extent Map, the Old-Fashioned Way</title>
		<link>http://www.pythian.com/news/563/oracle-file-extent-map-the-old-fashioned-way/</link>
		<comments>http://www.pythian.com/news/563/oracle-file-extent-map-the-old-fashioned-way/#comments</comments>
		<pubDate>Mon, 30 Jul 2007 20:07:29 +0000</pubDate>
		<dc:creator>Marc Billette</dc:creator>
				<category><![CDATA[Group Blog Posts]]></category>
		<category><![CDATA[Oracle]]></category>

		<guid isPermaLink="false">http://www.pythian.com/blogs/563/oracle-file-extent-map-the-old-fashioned-way</guid>
		<description><![CDATA[Have you ever wanted to know where exactly in a datafile segments are placed? Have you ever wondered just how fragmented a tablespace is? How about how much space you could reclaim if the last segment of a datafile was moved out of the tablespace? These questions, and many more, can be easily answered with [...]]]></description>
			<content:encoded><![CDATA[<p>Have you ever wanted to know where exactly in a datafile segments are placed? Have you ever wondered just how fragmented a tablespace is? How about how much space you could reclaim if the last segment of a datafile was moved out of the tablespace? These questions, and many more, can be easily answered with a detailed extent map of the datafiles.</p>
<p>Yes, I know, this subject is a rather old one, and a few different solutions have been provided by several professionals, including the famous <a href="http://tkyte.blogspot.com/">Tom Kyte</a>. But none of the answers I found did exactly what I wanted, and therefore, I chose to write my own solution.  OEM does provide this, but for a price &#8212; the Tablespace Map is part of the Oracle Tuning Pack &#8212; and I like the free stuff and the extra flexibility I have using queries.</p>
<p>As all of you know, Oracle has been providing this extent map via the <code>dba_extents</code> view forever (or at least since v6, which is the version I first worked with). The problem is that today&#8217;s super-large databases tend to have segments with thousands of extents. These extents are either of fixed size (uniform-size LMTs, DMTs with <code>pct_increase=0</code>) or of variable size (DMTs, system-managed extent LMTs). In both cases, the <code>dba_extents</code> view provides one entry per extent, but several of these extents may in fact be contiguous. To answer the questions I began with, a view of this extent list aggregated by segment is much more useful.<span id="more-563"></span></p>
<p>To have a fully aggregated map, one needs to union the content of <code>dba_extents</code> and <code>dba_free_space</code>. Piece of cake, right?  No! The basic hierarchical query is relatively simple to understand, but getting Oracle to run it efficiently is something else. No matter how I tried to write the query, with or without hints, I was simply not able to get it to provide me the aggregated map within an acceptable amount of time, and without adding an insane load on the server. </p>
<p>The main problem is the underlying complexity of the <code>dba_extents</code> and <code>dba_free_space</code> views. The solution to that is simple: get the data you want into a regular heap table (I used global temporary tables), and the case is solved. The queries to produce the map can now run in just a few minutes, if not seconds.</p>
<p>Here is what I wrote to produce the aggregated extent map. From this map, you can run all kinds of interesting and useful queries. I&#8217;ve also provided some of those to get you started.</p>
<p>The base script:</p>
<pre>
--
-- Create the global temporary tables.
--
create global temporary table seg_list
(file_id number,
 block_id number,
 owner varchar2(30),
 segment_name varchar2(30),
 segment_type varchar2(30),
 blocks number,
 constraint seg_list_pk primary key
   (file_id, block_id, owner, segment_name, segment_type))
;
create global temporary table aggregated_extent_map
(file_id number,
 root_block_id number,
 owner varchar2(30),
 segment_name varchar2(30),
 segment_type varchar2(30),
 total_blocks number)
;
--
-- Load the base extent data
-- from dba_extents and dba_free_space
--
insert into seg_list
select file_id,block_id,owner,
       segment_name,segment_type,blocks
  from dba_extents
-- this is optional, you can load all your tablespaces
-- where tablespace_name = 'MY_TS'
union all
select file_id,block_id,
       'free space',
       'free space',
       'free space',
       blocks
  from dba_free_space
-- this is optional, you can load all your tablespaces
-- where tablespace_name = 'MY_TS'
;
--
-- Generate the aggregate extent map using a hierarchical query.
-- Be patient, this will take a short while depending on the number
-- of extents to process and your system's speed. It took 5:02.69
-- minutes on a dev server to process 18033 extents and
-- generate 11848 aggregated extents.
--
insert into aggregated_extent_map
select file_id, root, owner, segment_name, segment_type,
       sum(blocks)
  from
(
select owner, segment_name, segment_type, file_id,
       blocks, block_id,
       substr(sys_connect_by_path(block_id,'/'),2,
       decode(instr(sys_connect_by_path(block_id,'/'),'/',2)
       -2,-2,length(sys_connect_by_path(block_id,'/')),
       instr(sys_connect_by_path(block_id,'/'),'/',2)-2))
       root
  from seg_list a
 start with (file_id, block_id) in
            (select file_id, block_id
              from seg_list
             where (file_id,block_id) in
                   (select file_id, min(block_id)
                      from seg_list group by file_id)
            union all
            select b.file_id, b.block_id
              from seg_list a, seg_list b
             where b.block_id = a.block_id + a.blocks
               and a.file_id = b.file_id
               and (a.owner &lt;&gt; b.owner or
                    a.segment_name &lt;&gt; b.segment_name)
           )
connect by owner = prior owner
       and segment_name = prior segment_name
       and file_id = prior file_id
       and block_id = prior a.block_id + prior a.blocks
) c
group by owner, segment_name, segment_type, file_id, root
;

-- &gt;&gt;&gt; run all your queries here...
-- Don't forget to re-populate the temporary tables if you
-- sign out, commit, or roll back (they are created with the
-- default ON COMMIT DELETE ROWS behavior).
</pre>
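<p>The nested <code>substr</code>/<code>decode</code>/<code>instr</code> expression in that script simply extracts the first element of the <code>sys_connect_by_path</code> string, which is the block id of the extent that starts the chain. Here is a small sketch of the same expression applied to two made-up path strings (the <code>conn_path</code> name and the values are illustrative only):</p>
<pre>
with p as (select '/16805' conn_path from dual
           union all
           select '/16805/16809/16837' from dual)
select conn_path,
       -- no second '/': take the whole string after the leading '/'
       -- otherwise: take what lies between the first two '/'
       substr(conn_path,2,
       decode(instr(conn_path,'/',2)-2,-2,length(conn_path),
       instr(conn_path,'/',2)-2)) root
  from p;
</pre>
<p>Both rows should return 16805. On 10g and later, <code>connect_by_root block_id</code> should give the same value more directly, though I have kept the old-fashioned expression in the script.</p>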
<p>Here are some sample queries.</p>
<h3>Query 1</h3>
<p>This query lists the last five aggregated extents for each datafile.</p>
<pre>
break on file_id skip 1
set linesize 140 pagesize 10000
col file_id for 9999
col top_n noprint
col segment_type for a12
col size_mb for 999999.99
select * from (
select a.file_id,
       rank() over (partition by a.file_id
                    order by root_block_id desc) top_n,
       segment_name,
       segment_type,
       root_block_id,
       total_blocks*(b.bytes/b.blocks)/1048576 size_mb
  from aggregated_extent_map a, dba_data_files b
 where a.file_id = b.file_id
-- use this if you loaded more than one TS in the seg_list
-- and b.tablespace_name = 'MY_TS'
-- use this to list a single datafile
-- and a.file_id = 7
) where top_n &lt;= 5
 order by file_id, top_n
;
</pre>
<p>Sample output with segment names masked:<a name="listing1">&nbsp;</a></p>
<pre>
FILE_ID SEGMENT_NAME    SEGMENT_TYPE ROOT_BLOCK_ID  SIZE_MB
------- --------------- ------------ -------------  -------
      1 free space      free space           16965   246.94
        IDL_UB2$        TABLE                16901     1.00
        SOURCE$         TABLE                16837     1.00
        free space      free space           16809      .44
        PLAN_TABLE      TABLE                16805      .06

      2 free space      free space          260261    29.44
        _SYSSMU64$      TYPE2 UNDO          260257      .06
        free space      free space          260253      .06
        _SYSSMU44$      TYPE2 UNDO          260249      .06
        free space      free space          242577   276.13

      3 JOB_LOGS_PK     INDEX                13381     1.00
        JOB_MASTERS     TABLE                13317     1.00
        JOB_LOGS        TABLE                13253     1.00
        JOB_MASTERS_PK  INDEX                13189     1.00
        JOB_MASTERS     TABLE                13125     1.00

      4 T1              TABLE               511997      .06
        T2              TABLE               511993      .06
        IDX1            INDEX               511989      .06
        IDX2            INDEX               511981      .13
        T3              TABLE               511973      .13

      5 free space      free space               5    64.00

      6 free space      free space           19465   103.75
        IDX3            INDEX                19401      .50
        T4              TABLE                19241     1.25
        free space      free space           18921     2.50
        T4              TABLE                18889      .25

      7 free space      free space          319493   124.00
        IDX4            INDEX               304133   240.00
        T5              TABLE               301573    40.00
        IDX4            INDEX               298757    44.00
        free space      free space          258821   624.00

      8 free space      free space           40965   320.00
        T6              TABLE                32773   128.00
        T7              TABLE                12293   320.00
        T8              TABLE                    5   192.00
...
</pre>
<p>The most interesting observation I can make of the first listing is that datafile 7 has a 624MB free space extent in the last position. Should I need to reclaim some or all of that space, I could move table T5 and rebuild index IDX4 in place or, better yet, into another tablespace. Assuming I move them to another tablespace, I will be able to reclaim over 1GB of disk space by shrinking datafile 7.</p>
<p>Running that query for the single datafile and extracting more aggregate extents might show that even more space would get freed (that was not the case for me as the sixth-last extent was used by another table).</p>
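<p>As a rough sketch of that reorganization (not the exact commands I ran), the statements below move the table, rebuild the index, and then shrink the datafile. The tablespace name <code>REORG_TS</code>, the datafile path, and the target size are placeholders made up for illustration; substitute your own. Keep in mind that moving a table leaves its indexes unusable until they are rebuilt.</p>
<pre>
-- Hypothetical names: REORG_TS, the datafile path and the 4100M target
-- are assumptions, not values taken from the listing above.
alter table t5 move tablespace reorg_ts;
alter index idx4 rebuild tablespace reorg_ts;

-- Once the tail of datafile 7 is free, give the space back to the OS.
-- The resize fails if any used extent still lies beyond the new size.
alter database datafile '/u01/oradata/MYDB/my_ts07.dbf' resize 4100M;
</pre>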
<h3>Query 2</h3>
<p>This query shows the &#8220;real&#8221; fragmentation of each datafile.</p>
<pre>
break on file_id skip 1
compute sum of cnt on file_id
compute sum of blocks on file_id
compute sum of size_mb on file_id
set linesize 140 pagesize 10000
col file_id for 9999
col segment_type for a32
col cnt for 999999
col blocks for 999999999
col size_mb for 999999.99
select a.file_id,
       segment_type,
       count(*) cnt,
       sum(total_blocks) blocks,
       sum(total_blocks*(b.bytes/b.blocks)/1048576) size_mb
  from aggregated_extent_map a, dba_data_files b
 where a.file_id = b.file_id
 group by a.file_id, segment_type
order by file_id, cnt desc
;
</pre>
<p>Sample output:<a name="listing2">&nbsp;</a></p>
<pre>
FILE_ID SEGMENT_TYPE         CNT     BLOCKS    SIZE_MB
------- ---------------- ------- ---------- ----------
      1 INDEX                679       3084      48.19
        TABLE                621      11540     180.31
        LOBSEGMENT            65        396       6.19
        CLUSTER               63       1468      22.94
        LOBINDEX              51        204       3.19
        TABLE PARTITION       27        108       1.69
        INDEX PARTITION       24         96       1.50
        ROLLBACK               2         28        .44
        free space             2      15832     247.38
        CACHE                  1          4        .06
        NESTED TABLE           1          4        .06
*******                  ------- ---------- ----------
sum                         1536      32764     511.94

      2 TYPE2 UNDO           521      51023     797.23
        free space           120     210076    3282.44
*******                  ------- ---------- ----------
sum                          641     261099    4079.67

      3 TABLE                130       8640     135.00
        INDEX                 73       4800      75.00
*******                  ------- ---------- ----------
sum                          203      13440     210.00

      4 INDEX               2113     147512    2304.88
        TABLE               1614     187228    2925.44
        free space           348     176808    2762.63
        LOBINDEX              28        224       3.50
        LOBSEGMENT            28        224       3.50
*******                  ------- ---------- ----------
sum                         4131     511996    7999.94

      5 free space             1       4096      64.00
*******                  ------- ---------- ----------
sum                            1       4096      64.00

      6 TABLE                124      15456     120.75
        free space            43      16000     125.00
        LOBSEGMENT            11       1184       9.25
        INDEX                  1         64        .50
        LOBINDEX               1         32        .25
*******                  ------- ---------- ----------
sum                          180      32736     255.75

      7 TABLE                 71      44544     696.00
        free space            44     264704    4136.00
        INDEX                  2      18176     284.00
*******                  ------- ---------- ----------
sum                          117     327424    5116.00

      8 TABLE                  3      40960     640.00
        free space             1      20480     320.00
*******                  ------- ---------- ----------
sum                            4      61440     960.00

...
</pre>
<p>The second listing shows that file 4 is much more fragmented than the other ones, and that the fragmentation is well distributed over tables and indexes. Further analysis is required to know if this is abnormal, as it is possible that file 4 stores lots of very small objects (see <a href="#listing3">listing 3</a>). </p>
<p>We can also see that datafile 5 is totally empty and is very small. Perhaps it can be dropped.</p>
<p>Datafile 7 is very interesting. It has 4.1GB free out of 5.1GB. But, if you refer to <a href="#listing1">listing 1</a>, this datafile as it stands can only shrink by 124MB. Some objects would need to be moved and rebuilt to get all that free space sitting at the end of the datafile, and to be able to shrink the datafile significantly.</p>
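<p>To see that shrinkable tail for every datafile at once, something along these lines should work. It is a minimal sketch that assumes <code>aggregated_extent_map</code> is populated as described earlier and that free space is labelled <code>'free space'</code>, as in the listings. The <code>shrinkable_mb</code> alias is my own.</p>
<pre>
-- Size of the trailing free-space extent of each datafile, i.e. roughly
-- what a resize could release today without moving any segment.
-- Datafiles whose last aggregated extent is in use are not listed.
col shrinkable_mb for 999999.99
select a.file_id,
       a.root_block_id,
       a.total_blocks*(b.bytes/b.blocks)/1048576 shrinkable_mb
  from aggregated_extent_map a, dba_data_files b
 where a.file_id = b.file_id
   and a.segment_name = 'free space'
   and a.root_block_id = (select max(c.root_block_id)
                            from aggregated_extent_map c
                           where c.file_id = a.file_id)
 order by a.file_id
;
</pre>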
<h3>Query 3</h3>
<p>This query shows the number of distinct aggregated segments for each datafile.</p>
<pre>
clear break
col file_id for 9999
col cnt for 999999
select file_id,
       count(distinct segment_name) cnt
from aggregated_extent_map
group by file_id
;
</pre>
<p>Sample output:<a name="listing3">&nbsp;</a></p>
<pre>
FILE_ID     CNT
------- -------
      1    1039
      2     361
      3      18
      4     552
      5       1
      6      28
      7      32
      8       4
...
</pre>
<p>The third listing shows why there are so many extents in datafile 4. It hosts 552 distinct objects. <a href="#listing2">Listing 2</a> shows that this datafile has a total of 4131 contiguous extents. That&#8217;s an average of 7.5 contiguous extents per object. Not that bad. A quick look at <code>dba_data_files</code> reveals that this file belongs to the <code>USERS</code> tablespace. The <code>dba_tablespaces</code> view shows that the <code>segment_space_management</code> for <code>USERS</code> is <code>AUTO</code> (i.e. system-managed). System-managed segment extents always start with small extents and grow to about a maximum of 128MB. It is therefore not surprising at all that the datafile has so many extents. </p>
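<p>The quick look at the dictionary views can be done in one query, sketched below. Besides <code>segment_space_management</code>, it also pulls <code>extent_management</code> and <code>allocation_type</code>, since <code>allocation_type</code> is the column that reports whether extent sizes are chosen by the system or are uniform. The file number 4 is simply the one from my listings.</p>
<pre>
-- Which tablespace does file 4 belong to, and how are its extents
-- and segment space managed?
col tablespace_name for a20
select f.file_id,
       t.tablespace_name,
       t.extent_management,
       t.allocation_type,
       t.segment_space_management
  from dba_data_files f, dba_tablespaces t
 where f.tablespace_name = t.tablespace_name
   and f.file_id = 4
;
</pre>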
<p>The problem here is that this datafile has 2.76GB of free space spread over 348 chunks, interspersed with 3783 (4131-348) segment extents. In the event that you&#8217;d need to release it back to the system, that space is <em>not</em> currently reclaimable without a serious reorganization.</p>
<h3>Query 4</h3>
<p>This query shows the largest segment, and the first and last aggregated extents per datafile.</p>
<pre>
break on file_id skip 1
set linesize 140 pagesize 10000
col file_id for 9999
col top_n noprint
col tb head ''
col segment_name head ''
col fe head ''
col le head ''
col ls head ''
col nl newline head ''
col filler1 for a42 head ''
col filler2 for a10 head ''
col size_mb for 999999.99
select aggr.file_id,
       'total blocks:' tb,
       '' filler1,
       file_blocks,
       null nl,
       '      First extent:' fe,
       minext.segment_name,
       aggr.min_rbi,
       minext.total_blocks,
       null nl,
       '      Last  extent:' le,
       maxext.segment_name,
       aggr.max_rbi,
       maxext.total_blocks,
       null nl,
       '      largest seg :' ls,
       lg.segment_name,
       '' filler2,
       lg.total_blocks
from (select file_id, min(root_block_id) min_rbi,
             max(root_block_id) max_rbi,
             sum(total_blocks) file_blocks
        from aggregated_extent_map group by file_id) aggr,
     aggregated_extent_map minext,
     aggregated_extent_map maxext,
     (select z.*,
             rank() over (partition by file_id
                          order by total_blocks desc) top_n
        from (select file_id, segment_name,
                     sum(total_blocks) total_blocks
                from aggregated_extent_map
               where segment_name &lt;&gt; 'free space'
               group by file_id, segment_name
               order by total_blocks) z
     ) lg
where minext.file_id = aggr.file_id
  and minext.root_block_id = aggr.min_rbi
  and maxext.file_id = aggr.file_id
  and maxext.root_block_id = aggr.max_rbi
  and lg.file_id = aggr.file_id
  and lg.top_n = 1
;
</pre>
<p>Sample output:</p>
<pre>
FILE_ID                                         FILE_BLOCKS
------- ------------- ------------------------- -----------
                                       MIN_RBI TOTAL_BLOCKS
- ------------------- ------------- ---------- ------------
                                       MAX_RBI TOTAL_BLOCKS
- ------------------- ------------- ---------- ------------
                                               TOTAL_BLOCKS
- ------------------- ------------- ---------- ------------
      1 total blocks:                                 32764
        First extent: SYSTEM                 5            8
        Last  extent: free space         16965        15804
        largest seg : FGA_LOG$                         4608

      2 total blocks:                                261227
        First extent: _SYSSMU1$              6            3
        Last  extent: free space        260261         1884
        largest seg : _SYSSMU266$                     28096

      3 total blocks:                                 13440
        First extent: free space             5          192
        Last  extent: JOB_LOGS_PK        13381           64
        largest seg : JOB_MASTERS                      4032

      4 total blocks:                                511996
        First extent: free space             5            8
        Last  extent: BP                511997            4
        largest seg : HOLD_LOGMINER                   34144

      6 total blocks:                                 32736
        First extent: CHAT_ROOM              9          288
        Last  extent: free space         19401        13344
        largest seg : SMSMESSAGE                      10752

      7 total blocks:                                 31744
        First extent: CHAT_EVENT             5          256
        Last  extent: free space         31493          256
        largest seg : PODCONTACTS                      5632

      8 total blocks:                                 61440
        First extent: PODFILEACL             5        12288
        Last  extent: free space         40965        20480
        largest seg : PODFILEDATA                     20480

...
</pre>
<p>OK, you get the idea. You can run queries against this <code>aggregated_extent_map</code> table to get information about datafiles, segments, and tablespaces. Heck, if you&#8217;re up to it, you can probably extract the list of aggregated extents for a tablespace, load it into an Excel sheet, and chart it with nice colors and all, à la OEM.</p>
<p>I think this mapping is very useful to anyone who wants to understand where the space is being used within their database and see if maintenance would be beneficial. Remember, space is <em>not</em> cheap: MBs are, but DBAs&#8217; and SAs&#8217; time is not, and neither is backup storage capacity, nor the time to back up (and restore) the datafiles.</p>
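<p>For the spreadsheet idea, a plain SQL*Plus spool is probably enough. This is only a sketch: the output file name <code>extent_map.csv</code> is an arbitrary choice, and <code>MY_TS</code> stands for whichever tablespace you loaded.</p>
<pre>
-- Dump the aggregated extents of one tablespace as comma-separated
-- values: file_id, root_block_id, total_blocks, segment_type, segment_name
set heading off feedback off pagesize 0 linesize 200 trimspool on
spool extent_map.csv
select a.file_id || ',' || a.root_block_id || ',' || a.total_blocks || ',' ||
       a.segment_type || ',' || a.segment_name
  from aggregated_extent_map a, dba_data_files b
 where a.file_id = b.file_id
   and b.tablespace_name = 'MY_TS'
 order by a.file_id, a.root_block_id
;
spool off
</pre>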
<p>For maximum performance, keep your databases lean and clean!</p>
<p>Enjoy!<br />
Marc.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.pythian.com/news/563/oracle-file-extent-map-the-old-fashioned-way/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Reporting Space-Wasting Objects in Oracle</title>
		<link>http://www.pythian.com/news/464/reporting-space-wasting-objects-in-oracle/</link>
		<comments>http://www.pythian.com/news/464/reporting-space-wasting-objects-in-oracle/#comments</comments>
		<pubDate>Tue, 01 May 2007 20:07:48 +0000</pubDate>
		<dc:creator>Marc Billette</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Technical Blog]]></category>

		<guid isPermaLink="false">http://www.pythian.com/blogs/464/reporting-space-wasting-objects-in-oracle</guid>
		<description><![CDATA[Woohoo, I finally got a bit of spare time to blog &#8212; my first blog post ever!
I chose to talk about a technique I used at a client&#8217;s site to report the topmost space-wasting objects in an Oracle database. I was looking for a way to detect these objects without having to run some expensive [...]]]></description>
			<content:encoded><![CDATA[<p>Woohoo, I finally got a bit of spare time to blog &#8212; my first blog post ever!</p>
<p>I chose to talk about a technique I used at a client&#8217;s site to report the topmost space-wasting objects in an Oracle database. I was looking for a way to detect these objects without having to run some expensive analyze statements or <code>dbms_stats</code> jobs. I found out that I can use the <code>dbms_space</code> package to do this. It worked very well for me and I&#8217;m sure lots of DBAs could use this technique too. It is not perfect in all situations as it has some prerequisites that I will list later, but still it does the trick for me.</p>
<p>The <code>dbms_space</code> package has procedures that are very useful for determining how much space is used within database blocks. As far as I know, this information is determined by looking at the tablespace bitmap pages. I haven&#8217;t confirmed this yet, but that would explain why it is so much faster than running an analyze or <code>dbms_stats.gather_%_stats</code>.</p>
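<p>To get a feel for what the package returns, here is a minimal sketch of a single call for one segment. The owner and table name (<code>SCOTT.EMP</code>) are just placeholders, and the call assumes the segment lives in a tablespace with automatic segment space management.</p>
<pre class="brush: sql;">
set serveroutput on
declare
  -- SCOTT.EMP is a hypothetical segment; use one of your own.
  v_unformatted_blocks number;  v_unformatted_bytes number;
  v_fs1_blocks number;  v_fs1_bytes number;
  v_fs2_blocks number;  v_fs2_bytes number;
  v_fs3_blocks number;  v_fs3_bytes number;
  v_fs4_blocks number;  v_fs4_bytes number;
  v_full_blocks number; v_full_bytes number;
begin
  dbms_space.space_usage('SCOTT', 'EMP', 'TABLE',
                         v_unformatted_blocks, v_unformatted_bytes,
                         v_fs1_blocks, v_fs1_bytes, v_fs2_blocks, v_fs2_bytes,
                         v_fs3_blocks, v_fs3_bytes, v_fs4_blocks, v_fs4_bytes,
                         v_full_blocks, v_full_bytes);
  dbms_output.put_line('full blocks:         ' || v_full_blocks);
  dbms_output.put_line('75-100% free blocks: ' || v_fs4_blocks);
  dbms_output.put_line('unformatted blocks:  ' || v_unformatted_blocks);
end;
/
</pre>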
<p>So, the idea is to generate the space-usage information using <code>dbms_space</code>, store that information into a table, and then run queries on it to report the topmost space-wasters.</p>
<p><span id="more-464"></span></p>
<p>Here are simplified versions of the scripts I use:</p>
<ol>
<li>Create a table to store the statistics generated by <code>dbms_space</code>.
<pre class="brush: plain;">
connect &amp;your_account/&amp;your_account_password

create table segment_space_stats
       (owner                     varchar2(32),
        segment_name              varchar2(32),
        segment_type              varchar2(32),
        tablespace_name           varchar2(32),
        segment_space_management  varchar2(32),
        unformatted_blocks        number,
        unformatted_bytes         number,
        fs1_blocks                number,
        fs1_bytes                 number,
        fs2_blocks                number,
        fs2_bytes                 number,
        fs3_blocks                number,
        fs3_bytes                 number,
        fs4_blocks                number,
        fs4_bytes                 number,
        full_blocks               number,
        full_bytes                number,
        total_blocks              number,
        total_bytes               number,
        unused_blocks             number,
        unused_bytes              number,
        last_used_extent_file_id  number,
        last_used_extent_block_id number,
        last_used_block           number,
        timestamp                 date)
;
</pre>
</li>
<li>Create a procedure and job to run the <code>dbms_space</code> procedures (<code>space_usage</code> and <code>unused_space</code>) on all applicable objects.
<pre class="brush: sql; collapse: true; light: false; toolbar: true; wrap-lines: false;">
-- You may need to grant these to your_account.
--
-- connect sys as sysdba
-- grant select any table, select any dictionary, analyze any to &amp;your_account;
-- grant execute on dbms_space to &amp;your_account;
-- connect &amp;your_account/&amp;your_account_password

CREATE OR REPLACE procedure GEN_SEGMENT_SPACE_STATS
      as
        index_does_not_exist EXCEPTION;
        table_does_not_exist EXCEPTION;
        PRAGMA EXCEPTION_INIT(index_does_not_exist, -1418);
        PRAGMA EXCEPTION_INIT(table_does_not_exist, -942);

        v_unformatted_blocks number;
        v_unformatted_bytes  number;
        v_fs1_blocks         number;
        v_fs1_bytes          number;
        v_fs2_blocks         number;
        v_fs2_bytes          number;
        v_fs3_blocks         number;
        v_fs3_bytes          number;
        v_fs4_blocks         number;
        v_fs4_bytes          number;
        v_full_blocks        number;
        v_full_bytes         number;

        v_total_blocks number;
        v_total_bytes number;
        v_unused_blocks number;
        v_unused_bytes number;
        v_last_used_extent_file_id number;
        v_last_used_extent_block_id number;
        v_last_used_block number;

      begin
        for s in (select owner, segment_name, segment_type, seg.tablespace_name , segment_space_management
                    from dba_tablespaces ts, dba_segments seg
                   where ts.tablespace_name not in ('SYSTEM', 'SYSAUX')
                     and ts.tablespace_name = seg.tablespace_name
                     and segment_type in ('TABLE', 'INDEX','CLUSTER','LOB')
                     and (owner, segment_name, segment_type, seg.tablespace_name ) in (
                          select owner, segment_name, segment_type, tablespace_name from segment_space_stats))
        loop
          begin

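            if s.segment_space_management = 'AUTO' then
               dbms_space.space_usage(s.owner, s.segment_name, s.segment_type, v_unformatted_blocks, v_unformatted_bytes,
                                      v_fs1_blocks, v_fs1_bytes, v_fs2_blocks, v_fs2_bytes , v_fs3_blocks, v_fs3_bytes ,
                                      v_fs4_blocks, v_fs4_bytes , v_full_blocks, v_full_bytes , NULL);
            else
               -- space_usage is skipped for non-ASSM segments, so clear the
               -- values left over from the previous loop iteration.
               v_unformatted_blocks := null; v_unformatted_bytes := null;
               v_fs1_blocks := null; v_fs1_bytes := null;
               v_fs2_blocks := null; v_fs2_bytes := null;
               v_fs3_blocks := null; v_fs3_bytes := null;
               v_fs4_blocks := null; v_fs4_bytes := null;
               v_full_blocks := null; v_full_bytes := null;
            end if;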
            dbms_space.unused_space(s.owner, s.segment_name, s.segment_type, v_total_blocks, v_total_bytes,
                                    v_unused_blocks, v_unused_bytes, v_last_used_extent_file_id,
                                    v_last_used_extent_block_id, v_last_used_block, NULL);
            update SEGMENT_SPACE_STATS
               set tablespace_name=s.tablespace_name,
                   unformatted_blocks=v_unformatted_blocks,
                   unformatted_bytes=v_unformatted_bytes,
                   fs1_blocks =v_fs1_blocks,
                   fs1_bytes  =v_fs1_bytes,
                   fs2_blocks =v_fs2_blocks,
                   fs2_bytes  =v_fs2_bytes,
                   fs3_blocks =v_fs3_blocks,
                   fs3_bytes  =v_fs3_bytes,
                   fs4_blocks =v_fs4_blocks,
                   fs4_bytes  =v_fs4_bytes,
                   full_blocks=v_full_blocks,
                   full_bytes =v_full_bytes,
                   timestamp  =sysdate,
                   total_blocks=v_total_blocks,
                   total_bytes =v_total_bytes,
                   unused_blocks=v_unused_blocks,
                   unused_bytes =v_unused_bytes,
                   last_used_extent_file_id =v_last_used_extent_file_id,
                   last_used_extent_block_id=v_last_used_extent_block_id,
                   last_used_block          =v_last_used_block
             where owner=s.owner
               and segment_name=s.segment_name
               and segment_type=s.segment_type;
          exception
            when index_does_not_exist then
              null; -- ignore these errors
            when table_does_not_exist then
              null; -- ignore these errors
            when others then
              raise;
          end;
        end loop;
        commit;

        for s in (select owner, segment_name, segment_type, seg.tablespace_name, segment_space_management
                    from dba_tablespaces ts, dba_segments seg
                   where ts.tablespace_name not in ('SYSTEM', 'SYSAUX')
                     and ts.tablespace_name = seg.tablespace_name
                     and segment_type in ('TABLE', 'INDEX','CLUSTER','LOB')
                   minus
                  select owner, segment_name, segment_type, tablespace_name, segment_space_management
                    from segment_space_stats)
        loop
          begin
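            if s.segment_space_management = 'AUTO' then
               dbms_space.space_usage(s.owner, s.segment_name, s.segment_type, v_unformatted_blocks, v_unformatted_bytes,
                                      v_fs1_blocks, v_fs1_bytes, v_fs2_blocks, v_fs2_bytes , v_fs3_blocks, v_fs3_bytes ,
                                      v_fs4_blocks, v_fs4_bytes , v_full_blocks, v_full_bytes , NULL);
            else
               -- space_usage is skipped for non-ASSM segments, so clear the
               -- values left over from the previous loop iteration.
               v_unformatted_blocks := null; v_unformatted_bytes := null;
               v_fs1_blocks := null; v_fs1_bytes := null;
               v_fs2_blocks := null; v_fs2_bytes := null;
               v_fs3_blocks := null; v_fs3_bytes := null;
               v_fs4_blocks := null; v_fs4_bytes := null;
               v_full_blocks := null; v_full_bytes := null;
            end if;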
            dbms_space.unused_space(s.owner, s.segment_name, s.segment_type, v_total_blocks, v_total_bytes, v_unused_blocks,
                                    v_unused_bytes, v_last_used_extent_file_id, v_last_used_extent_block_id, v_last_used_block, NULL);
            insert into SEGMENT_SPACE_STATS
            values(s.owner, s.segment_name, s.segment_type, s.tablespace_name, s.segment_space_management,v_unformatted_blocks,
                   v_unformatted_bytes, v_fs1_blocks, v_fs1_bytes, v_fs2_blocks, v_fs2_bytes, v_fs3_blocks, v_fs3_bytes,
                   v_fs4_blocks, v_fs4_bytes, v_full_blocks, v_full_bytes, v_total_blocks, v_total_bytes,
                   v_unused_blocks, v_unused_bytes, v_last_used_extent_file_id, v_last_used_extent_block_id,
                   v_last_used_block,sysdate);
          exception
            when index_does_not_exist then
               null; -- ignore these errors
            when table_does_not_exist then
               null; -- ignore these errors
            when others then
               raise;
          end;
        end loop;

        commit;

        delete SEGMENT_SPACE_STATS where (owner, segment_name, segment_type, tablespace_name) not in
        (select owner, segment_name, segment_type, tablespace_name from dba_segments
          where tablespace_name not in ('SYSTEM', 'SYSAUX')
            and segment_type in ('TABLE', 'INDEX','CLUSTER','LOB'));
        commit;
      end;
/

var v_job_id number;
begin
  -- schedule the job to run daily at 4AM...
  dbms_job.submit(:v_job_id,'GEN_SEGMENT_SPACE_STATS;',trunc(sysdate+1)+4/24,'trunc(sysdate+1)+4/24');
  commit;
end;
/

exec dbms_job.run(:v_job_id)
</pre>
</li>
<li>Run queries on the table&#8217;s data.
<pre class="brush: sql; collapse: true; light: false; toolbar: true; wrap-lines: false;">
TTITLE center underline &quot;Top Ten Largest Empty Segments&quot;
set linesize 140 pagesize 100
col owner for a20
col total_MB for 999999
clear breaks
clear computes
break on report dup
compute sum of total_MB on report
select a.*
  from (select owner, segment_name, segment_type, tablespace_name, total_bytes/1048576 total_MB
          from segment_space_stats
         where full_bytes = 0
        order by total_bytes desc) a
where rownum&lt;11
;

                                                       Top Ten Largest Empty Segments
                                                       ______________________________

OWNER                SEGMENT_NAME                     SEGMENT_TYPE                     TABLESPACE_NAME                  TOTAL_MB
-------------------- -------------------------------- -------------------------------- -------------------------------- --------
PERFSTAT             STATS$SQL_PLAN                   TABLE                            PERFSTAT_DATA                           5
PERFSTAT             STATS$SQL_PLAN_USAGE             TABLE                            PERFSTAT_DATA                           5
PERFSTAT             STATS$SEG_STAT                   TABLE                            PERFSTAT_DATA                           3
BP_PROD_JOBS         JB_SEARCH_RESULT_ALERT           TABLE                            BP_PROD_JOBS_DATA                       2
PERFSTAT             STATS$BUFFERED_QUEUES            TABLE                            PERFSTAT_DATA                           1
PERFSTAT             STATS$BUFFERED_QUEUES_PK         INDEX                            PERFSTAT_DATA                           1
PERFSTAT             STATS$BUFFERED_SUBSCRIBERS       TABLE                            PERFSTAT_DATA                           1
PERFSTAT             STATS$CR_BLOCK_SERVER_PK         INDEX                            PERFSTAT_DATA                           1
PERFSTAT             STATS$CR_BLOCK_SERVER            TABLE                            PERFSTAT_DATA                           1
PERFSTAT             STATS$BUFFERED_SUBSCRIBERS_PK    INDEX                            PERFSTAT_DATA                           1
                                                                                                                        --------
sum                                                                                                                           21

10 rows selected.

TTITLE center underline &quot;Top Ten Largest Space Wasters&quot;
set linesize 140 pagesize 20
col owner for a20
col segment_type for a12
col total_MB for 999999
col unformatted_MB for 999999
col unused_MB for 999999
col pct_unused for 999 heading &quot;Pct Unused&quot;
col estmtd_pssbl_svngs for 99999 heading &quot;Estimated|Possible Savings|w Index Rebuild&quot;
clear breaks
clear computes
break on report dup
compute sum of total_MB on report
compute sum of unused_MB on report

select a.*
  from (
        select owner, segment_name, segment_type,
               total_bytes/1048576 total_MB,
               unformatted_bytes/1048576 unformatted_MB,
               unused_bytes/1048576 unused_MB,
               trunc(unused_bytes / total_bytes * 100) pct_unused,
               case when segment_type = 'INDEX' and total_blocks &gt;= 256 and fs2_blocks &gt;= 64
                         then unused_blocks + fs2_blocks
                    else 0
               end * b.block_size/1048576 estmtd_pssbl_svngs
          from segment_space_stats a, dba_tablespaces b
         where unused_bytes &lt;&gt; 0
           and a.tablespace_name = b.tablespace_name
         order by unused_bytes desc) a
where rownum&lt;11
;

                                                        Top Ten Largest Space Wasters
                                                        _____________________________

                                                                                                                       Estimated
                                                                                                                Possible Savings
OWNER                SEGMENT_NAME                     SEGMENT_TYPE TOTAL_MB UNFORMATTED_MB UNUSED_MB Pct Unused  w Index Rebuild
-------------------- -------------------------------- ------------ -------- -------------- --------- ---------- ----------------
BP_PROD_JOBS         JB_SEARCH_RESULT_IDX             INDEX            2229              4        56          2             2179
BP_PROD_MVIEWS       USER_INFO_BASIC                  TABLE            4412              0        48          1                0
BP_PROD_JOBS         JB_SEARCH_RESULT                 TABLE            1216              6        40          3                0
BP_PROD_JOBS         JB_SEARCH_RESULT_USJ_IDX         INDEX            2109              4        32          1             2041
BP_PROD_MVIEWS       USER_INFO                        TABLE            6039              0        31          0                0
BP_PROD_GW_SUPPORT   GWS_ACCESS_LOG_SUM_PK            INDEX            1761              1        16          0              323
BP_PROD_JOBS         JB_MONSTER_XMIT_LOG              TABLE            3108              0         8          0                0
BP_PROD_JOBS         DR$JM_SEARCH_DNRM_XML_IDX_2$I    TABLE             144              0         8          5                0
BP_PROD_JOBS         DR$JM_SEARCH_DNRM_XML_IDX_1$I    TABLE             144              0         8          5                0
BP_PROD_GW_SUPPORT   GWS_DLY_LOG_20070423_UK          INDEX              88              0         6          6                7
                                                                   --------                ---------
sum                                                                   21250                      252

10 rows selected.

TTITLE center underline &quot;Top Ten Indexes with non-full blocks&quot;
clear breaks
clear computes
set linesize 145 pagesize 100
col owner for a20
col fs1_blocks for 9999999 heading &quot;&lt; 25%|blocks free&quot;
col fs2_blocks for 9999999 heading &quot;25-50%|blocks free&quot;
col fs3_blocks for 9999999 heading &quot;50-75%|blocks free&quot;
col fs4_blocks for 9999999 heading &quot;75-100%|blocks free&quot;
col full_blocks for 9999999 heading &quot;blocks|full&quot;
col total_blocks for 9999999 heading &quot;total|blocks&quot;
col estmtd_pssbl_svngs for 99999 heading &quot;Estimated|Possible Savings|w Index Rebuild&quot;
col block_size noprint
break on report dup
compute sum of estmtd_pssbl_svngs on report
select a.*
  from (
        select owner, segment_name,
               fs1_blocks ,
               fs2_blocks ,
               fs3_blocks ,
               fs4_blocks ,
               full_blocks, total_blocks,
               case when segment_type = 'INDEX' and total_blocks &gt;= 256 and fs2_blocks &gt;= 64
                         then unused_blocks + fs2_blocks
                    else 0
               end * b.block_size/1048576 estmtd_pssbl_svngs,
               b.block_size
          from segment_space_stats a, dba_tablespaces b
         where segment_type = 'INDEX'
           and a.tablespace_name = b.tablespace_name
         order by fs4_blocks desc, fs3_blocks desc, fs2_blocks desc, fs1_blocks desc) a
where rownum&lt;11
;

                                                       Top Ten Indexes with non-full blocks
                                                       ____________________________________

                                                                                                                               Estimated
                                                            &lt; 25%      25-50%      50-75%     75-100%   blocks    total Possible Savings
OWNER                SEGMENT_NAME                     blocks free blocks free blocks free blocks free     full   blocks  w Index Rebuild
-------------------- -------------------------------- ----------- ----------- ----------- ----------- -------- -------- ----------------
BP_PROD_JOBS         JB_SEARCH_RESULT_IDX                       0      271793           0           0     5035   285312             2179
BP_PROD_JOBS         JB_SEARCH_RESULT_USJ_IDX                   0      257204           0           0     7402   269952             2041
BP_PROD_JOBS         JB_SEARCH_RESULT_SCORE_IDX                 0      122435           0           0     4443   127488              957
BP_PROD_GW_SUPPORT   GWS_ACCESS_LOG_SUM_PK                      0       39342           0           0   183215   225408              323
PERFSTAT             STATS$SQL_SUMMARY_PK                       0        2015           0           0     9354    12288               22
AVAIL                STAT_TABLE                                 0        1556           0           0    20269    22528               16
BP_PROD_INVITE       REL_RELATIONSHIP_QUERY_IDX                 0         489           0           0   155335   156544                4
PERFSTAT             STATS$EVENT_HISTOGRAM_PK                   0         245           0           0     6299     6656                2
BP_PROD_JOBS         JB_ACTIVITY_LOG_IDX                        0         245           0           0    49501    50816                8
BP_PROD_MVIEWS       USER_INFO_BASIC_LWR_USRNAM_IDX             0         229           0           0    91468    92160                2
                                                                                                                        ----------------
sum                                                                                                                                 5554

10 rows selected.
</pre>
</li>
</ol>
<p>Voila! Isn&#8217;t that pretty? I know, I need to explain these reports a bit&hellip;</p>
<p>The first report is straightforward. It lists the ten largest empty segments. As you can see, there aren&#8217;t any large empty objects in this database. This is mostly due to the work of Oracle&#8217;s Automatic Segment Space Management (ASSM).</p>
<p>The second report lists the objects with the most unused space within their currently allocated extents.</p>
<p>The column that probably gets you going &#8220;huh?&#8221; is &#8220;Estimated Possible Savings w Index Rebuild&#8221;. This column is derived from my own observations of the data generated by <code>dbms_space.space_usage</code>.</p>
<p>I noticed that for indexes, there are never any blocks that are totally empty. That got me curious. It seems that all blocks allocated to an index automatically get a usage estimate of at least 25%. I also noticed, based on my analysis of the indexes, that the large indexes with lots of blocks in the 25-50% range are the ones that get updated very frequently and have many of their rows deleted (for example, rows older than 30 days are purged).</p>
<p>I had been under the impression that lots of old blocks would be reported in the 0-25% utilization range. So I figured, what the heck, let&#8217;s rebuild some of these and see if I&#8217;m right in suspecting that most of the blocks in the 25-50% range are empty. And, bingo! After the rebuild, the space used by each index went down to about 10% more than its &#8220;blocks full&#8221; figure. Nice! I might be able to use this to estimate how much space I can save by rebuilding an index. Yes, I know: the problem is an application-design issue, and indexes shouldn&#8217;t need to be rebuilt over time. Hey, if we lived in a perfect world, there wouldn&#8217;t be any administrators&hellip; but that&#8217;s another story.</p>
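<p>For anyone who wants to repeat that experiment, a minimal sketch might look like the following. The index name is borrowed from the report above, and comparing <code>dba_segments</code> before and after the rebuild is only one assumed way of measuring the result, not necessarily how the test described here was run. Keep in mind that a plain rebuild blocks DML on the underlying table while it runs.</p>
<pre>
-- Allocated size before the rebuild
select blocks, bytes/1048576 mb
  from dba_segments
 where owner = 'BP_PROD_JOBS'
   and segment_name = 'JB_SEARCH_RESULT_IDX';

-- Rebuild the index (substitute your own owner and index name)
alter index bp_prod_jobs.jb_search_result_idx rebuild;

-- Allocated size after the rebuild; compare with the "blocks full" figure
select blocks, bytes/1048576 mb
  from dba_segments
 where owner = 'BP_PROD_JOBS'
   and segment_name = 'JB_SEARCH_RESULT_IDX';
</pre>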
<p>I also noticed that this was not necessarily true for small indexes, hence the case expression in the query. Only indexes of at least 256 blocks, with at least 64 blocks in the 25-50% range, are considered for space savings. For those, the estimate is the unused blocks plus the blocks in the 25-50% range, converted to megabytes using the tablespace block size. You can use whatever thresholds you like here. These are the ones that work for me. And no, I didn&#8217;t do any extensive research or testing with these figures.</p>
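<p>To see how the thresholds behave, here is a small self-contained sketch that applies the same case expression to three made-up rows. The <code>sample_stats</code> rows and their figures are invented for illustration and stand in for <code>segment_space_stats</code> joined to <code>dba_tablespaces</code>; they are not taken from the report.</p>
<pre>
with sample_stats as (
  -- hypothetical figures, assuming an 8 KB block size
  select 'BIG_IDX' segment_name, 'INDEX' segment_type,
         100000 total_blocks, 60000 fs2_blocks,
         4000 unused_blocks, 8192 block_size
    from dual
  union all
  -- fewer than 256 blocks in total: ignored
  select 'SMALL_IDX', 'INDEX', 128, 100, 10, 8192 from dual
  union all
  -- fewer than 64 blocks in the 25-50% range: ignored
  select 'FEW_FS2_IDX', 'INDEX', 4096, 32, 50, 8192 from dual
)
select segment_name,
       case when segment_type = 'INDEX' and total_blocks &gt;= 256 and fs2_blocks &gt;= 64
                 then unused_blocks + fs2_blocks
            else 0
       end * block_size/1048576 estmtd_pssbl_svngs
  from sample_stats;
</pre>
<p>With these numbers, <code>BIG_IDX</code> comes out at (4000 + 60000) * 8192 / 1048576 = 500 MB, while the other two rows fail a threshold and report 0.</p>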
<p>A few things that you should be aware of:</p>
<ol>
<li><code>dbms_space.space_usage</code> works only for segments in ASSM (Automatic Segment Space Management) tablespaces.</li>
<li><code>dbms_space</code>&#8217;s <code>unused_space</code>, <code>space_usage</code> and <code>free_space</code> do not require a special license to use. The other procedures in that package are linked to the AWR licensing (at least they were the last time I checked).</li>
<li>The above code and procedures are examples only. Use them at your own risk, and tailor them as you wish.</li>
<li>Feel free to test my assumptions as extensively as you wish. I welcome your feedback.</li>
</ol>
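<p>Since <code>dbms_space.space_usage</code> only handles segments under automatic segment space management, it can help to check which tablespaces qualify before running the reports. A quick sketch, using the <code>segment_space_management</code> column of <code>dba_tablespaces</code>:</p>
<pre>
-- AUTO   = automatic segment space management (usable with space_usage)
-- MANUAL = freelist-managed segments (not supported by space_usage)
select tablespace_name, segment_space_management
  from dba_tablespaces
 order by tablespace_name;
</pre>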
<p>I hope you find this useful.</p>
<p>Enjoy!<br />
Marc</p>
]]></content:encoded>
			<wfw:commentRss>http://www.pythian.com/news/464/reporting-space-wasting-objects-in-oracle/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
	</channel>
</rss>
