I was asked, the other day, to automate the creation of a client’s weekly report, which is a pivot table of some aggregate data generated by a query.
As we know, prior to 11g, Oracle did not have a simple table pivot feature. 11g has changed that, and the pivot clause is certainly useful. It requires, however, an aggregation calculation in the intersection (at least that’s what I got from the documentation). But what if you already have the data to populate in the intersection area? Or, you may no longer have the raw data to aggregate it again. In that case, you are forced to trick it to get an aggregation in.
I’ve written a set of fairly simple PL/SQL code that handles this task in all versions of Oracle that supports associative nested VARRAYs (I have no clue when that started getting supported. It sure works great with 10g).
Here is how my code works:
- A function is used to pass in your query that retrieves your raw data.The function needs a query that returns three columns. The first one for the row labels, the second one for the column labels, and the third one for the intersection data.
- The function converts the result set of the supplied query into a pivot as CSV strings (don’t worry, I’ll explain later how to make those separate columns).
- Each line (row) is summarized in an extra trailing column.
- An extra trailing line is also generated with the column summaries and a Grand Total at the end.
- The function pipes the result out, as it gets readied, to the calling query, which receives it as a single column set of rows.
This is the core of the work. The resulting CSV strings can be spooled in a file and loaded into any decent spreadsheet editor.
Some of you may say, yeah okay, but I’m really looking at getting back a cursor with this data in separate columns. That’s not so difficult to do, assuming you are willing to live with some minor details, which I will explain later.
Alright, here is the function:
-- we need this TYPE created to return a cursor of varchar2. create type varchar2_type as table of varchar2(2000); / create or replace function pivot_three_cols_table_func (p_query_string varchar2, sum_cols_ind INTEGER default 0, sum_lines_ind INTEGER default 0) return varchar2_type pipelined as type tb1 is table of number index by varchar2(1000); type ntb1 is table of tb1 index by varchar2(500); nvar ntb1; type coltb is table of varchar2(20) index by varchar2(20); cols coltb; col_1 varchar2(1000); col_2 varchar2(500); col_3 number; line varchar2(1000); col varchar2(500); sum_lines number :=0; sum_cols number :=0; sum_pivot number :=0; type item_cur_type is ref cursor; item item_cur_type; str varchar2(32767); begin -- initialize the empty array nvar := nvar; -- load the array open item for p_query_string; loop fetch item into col_1, col_2, col_3; -- replace commas with space from the col_1 and col2 as -- these are used as column delimiters in the csv col_1 := replace(col_1,',',' '); col_2 := replace(col_2,',',' '); nvar(col_1)(col_2) := col_3; exit when item%notfound; end loop; close item; -- print the crosstab/pivot header -- -- first load the list of distinct column names in an associative array line := nvar.first; for t in 1..nvar.count loop col := nvar(line).first; for d in 1..nvar(line).count loop cols(col) := col; col := nvar(line).next(col); end loop; line := nvar.next(line); end loop; -- -- print the list of disctinct column names str := 'TITLE,'; col := cols.first; for d in 1..cols.count loop str := str||cols(col)||','; col := cols.next(col); end loop; if sum_cols_ind = 1 then str := str||'TOTAL'; else str:= substr(str,1,length(str)-1); end if; pipe row (str); -- sum and print the column data line := nvar.first; for t in 1..nvar.count loop str := line||','; sum_cols :=0; col := cols.first; for d in 1..cols.count loop begin str := str||to_char(nvar(line)(col))||','; if sum_cols_ind = 1 then sum_cols := sum_cols + nvar(line)(col); end if; exception when NO_DATA_FOUND then str := str||','; end; col := cols.next(col); end loop; if sum_cols_ind = 1 then str := str||to_char(sum_cols); else str:= substr(str,1,length(str)-1); end if; pipe row (str); line := nvar.next(line); end loop; if sum_lines_ind = 1 then -- sum and print the column totals str := 'TOTAL RESULT,'; line := nvar.first; col := cols.first; for d in 1..cols.count loop sum_lines :=0; line := nvar.first; for t in 1..nvar.count loop begin sum_lines := sum_lines + nvar(line)(col); exception when NO_DATA_FOUND then null; end; line := nvar.next(line); end loop; str := str||to_char(sum_lines)||','; sum_pivot := sum_pivot + sum_lines; col := cols.next(col); end loop; -- print the Grand Total str := str||to_char(sum_pivot); pipe row (str); end if; return; end; /
Lots to say about it:
- It does the job. At least it did–and very well–during my testing.
- It can summarize the intersection data. So that third column in your query must be a number. It would be pretty easy to change that for someone who wanted to show text data in the intersection.
- The first two columns must be characters. If you want numbers in the labels, simply get them converted in your driving query as in
- I have included parameters to turn on and off the SUM calculations. The default is to not sum nor print the column and line totals.
- The function produces CSV strings so that it will support any number of columns. No need to know how many columns it will have, nor what the names are. This meant that I had to make it replace all commas in the input strings with a space. Otherwise, it would have caused the CSV parsing to fail which would in turn have created some pretty ugly spreadsheets. So, your labels will lose any single quotes they may have (not a big deal in my opinion).
- Line labels can be as large as 1000 characters. That is huge and very inconvenient, I would argue, as it would make the pivot table difficult to read. It’s up to you to control the width of the data your query provides in the first column.
- Column labels can be as large as 500 characters. Same issue as for line labels. If you have wide column labels, your pivot table may be difficult to read, so set them appropriately in the second column.
- Each CSV string/line is limited to 32767 characters. This would certainly create a very ugly pivot table if you have too many columns or very wide data. But that’s the limit, which I haven’t actually tested.
Let’s see a sample query and output. It creates a pivot of Oracle-archived logs showing the sum of log sizes created by weekday across 24 hours for the last year. DBAs will be familiar with that.
First here’s the source query and a data subset:
SELECT To_Char(completion_time,'Day') WD, To_Char(completion_time,'HH24') HR, round((Sum(blocks)*block_size)/1048576) mb FROM v$archived_log WHERE completion_time >= SYSDATE - 365 GROUP BY to_Char(completion_time,'Day'), To_Char(completion_time,'HH24'), block_size; WD HR MB Friday 02 80 Friday 04 131 Thursday 17 128 Thursday 19 75 Thursday 18 130 Friday 03 75 Thursday 20 80 Thursday 21 542 Thursday 15 18 Thursday 22 207 Friday 00 85 Friday 06 56 Thursday 16 128 Thursday 23 88 Friday 01 92 Friday 05 72 ...
This data gets pivoted using this query:
select column_value from table(pivot_three_cols_table_func( 'select d, h, round((Sum(blocks)*block_size)/1048576) MB from (SELECT To_Char(completion_time,''Day'') D, To_Char(completion_time,''HH24'') H, blocks, block_size FROM v$archived_log WHERE completion_time >= SYSDATE - 365 ) GROUP BY d, h, block_size ',1,1)) ;
Note that single quotes needed to be doubled in the subquery string. Also, I’ve generated the line and column sums.
The output is as follows:
COLUMN_VALUE ------------ TITLE,00,01,02,03,04,05,06,07,08,09,10,11,12,13,14,15,16,17,18,19,20,21,22,23,TOTAL Friday ,299,261,233,216,404,242,265,328,420,482,443,465,465,396,403,412,309,263,221,168,142,1340,386,144,8707 Monday ,186,172,160,157,277,168,184,206,292,320,306,295,288,232,261,278,278,266,204,184,171,1730,436,192,7243 Saturday ,152,155,166,146,275,146,251,162,213,185,150,158,155,145,161,144,142,148,141,140,149,1854,278,153,5769 Sunday ,146,141,156,144,259,142,156,163,190,174,162,165,158,141,141,141,170,202,190,172,149,1815,240,161,5678 Thursday ,209,167,198,151,274,158,186,203,291,281,338,264,296,286,259,259,272,285,257,186,172,1641,620,251,7504 Tuesday ,160,161,164,162,280,146,174,206,269,314,313,328,276,302,284,279,287,266,261,182,199,989,362,179,6543 Wednesday,196,178,161,172,288,181,175,220,313,347,316,277,264,281,243,312,292,272,270,207,159,1026,360,165,6675 TOTAL RESULT,1348,1235,1238,1148,2057,1183,1391,1488,1988,2103,2028,1952,1902,1783,1752,1825,1750,1702,1544,1239,1141,10395,2682,1245,48119
Voila! The pivot is done and can be saved in a file and imported into a spreadsheet, or whatever other tools you use that reads CSV data.
Now, for those who want distinct columns for this data. I’ve created a simple CSV parser function that can be used to extract distinct columns. Here’s the code for this CSV parser function:
create or replace function csv_element(string varchar2, element_number number) return varchar2 as i number := element_number; r varchar2(2000); begin case when (i=1 and instr(string,',',1)=0) then r:= string; when (i=1) then r := substr(string,1,instr(string,',',1,1) -1 ); when (i>1 and (instr(string,',',1,i-1)>0 and instr(string,',',1,i)=0)) then r:= substr(string,instr(string,',',-1)+1); when (i>1) then r := substr(string,instr(string,',',1,i-1)+1,instr(string,',',1,i)-instr(string,',',1,i-1)-1 ); else r:= null; end case; return r; end; /
csv_element function handles strings with no commas, and returns that value as the first and only element. It also takes care of the trailing string, and it will also return a null value for columns that are beyond the number of elements in the strings. That is as sophisticated as I needed it to be.
Here’s the previous sample query using the function to return the data as separate columns:
select csv_element(column_value,1) c1, csv_element(column_value,2) c2, csv_element(column_value,3) c3, csv_element(column_value,4) c4, csv_element(column_value,5) c5, csv_element(column_value,6) c6, csv_element(column_value,7) c7, csv_element(column_value,8) c8, csv_element(column_value,9) c9, csv_element(column_value,10) c10, csv_element(column_value,11) c11, csv_element(column_value,12) c12, csv_element(column_value,13) c13, csv_element(column_value,14) c14, csv_element(column_value,15) c15, csv_element(column_value,16) c16, csv_element(column_value,17) c17, csv_element(column_value,18) c18, csv_element(column_value,19) c19, csv_element(column_value,20) c20, csv_element(column_value,21) c21, csv_element(column_value,22) c22, csv_element(column_value,23) c23, csv_element(column_value,24) c24, csv_element(column_value,25) c25, csv_element(column_value,26) c26, csv_element(column_value,27) c27, csv_element(column_value,28) c28 from table(pivot_three_cols_table_func( 'select d, h, round((Sum(blocks)*block_size)/1048576) MB from (SELECT To_Char(completion_time,''Day'') D, To_Char(completion_time,''HH24'') H, blocks, block_size FROM v$archived_log WHERE completion_time >= SYSDATE - 365 ) GROUP BY d, h, block_size ',1,1)) ;
I’ve deliberately added two extra columns in the query (27 and 28) so that you can see it handling it fine. Here’s the output of this query (with some columns removed for clearer formatting):
C1 C2 C3 C4 ... C23 C24 C25 C26 C27 C28 TITLE 00 01 02 ... 21 22 23 TOTAL Friday 299 261 233 ... 1340 386 144 8721 Monday 186 172 160 ... 1730 436 192 7243 Saturday 152 155 166 ... 1854 278 153 5769 Sunday 146 141 156 ... 1815 240 161 5678 Thursday 209 167 198 ... 1593 620 251 7456 Tuesday 160 161 164 ... 989 362 179 6543 Wednesday 196 178 161 ... 1026 360 165 6675 TOTAL RESULT 1348 1235 1238 ... 10347 2682 1245 48085
As you can see, columns 27 and 28 are there, but null. This way, you can have a generic wrapper query with a hundred columns if you’d like, without worrying about how many columns will be produced from your base query. Just make sure your app can handle that.
Interested in working with Marc? Schedule a tech call.