How to decipher oracle Internal datatype storage
Now that we have a table, let's take a look a the ranges of values. Note: I am using the _TAB_COLUMNS views for some queries just for simplification of the SQL for demonstration.[code language="SQL"] col low_value format a40 col high_value format a40 prompt prompt NUMERIC prompt select column_name, low_value, high_value from user_tab_columns where table_name = 'LOW_HIGH' and data_type = 'NUMBER' / prompt prompt VARCHAR2 prompt select column_name, low_value, high_value from user_tab_columns where table_name = 'LOW_HIGH' and data_type = 'VARCHAR2' / prompt prompt DATE prompt select column_name, low_value, high_value from user_tab_columns where table_name = 'LOW_HIGH' and data_type = 'DATE' / NUMERIC COLUMN LOW_VALUE HIGH_VALUE ------ ---------------------------------------- ---------------------------------------- ID C102 C30E0B49 N1 80 C2021C 2 rows selected. VARCHAR2 COLUMN LOW_VALUE HIGH_VALUE ------ ---------------------------------------- ---------------------------------------- C1 303132333435363738396162636465666768696A 666768696A6B6C6D6E6F70717273747576777879 1 row selected. DATE COLUMN LOW_VALUE HIGH_VALUE ------ ---------------------------------------- ---------------------------------------- D1 7871030D121C04 78730C07121C04 1 row selected. [/code] Clearly the values being stored for LOW_VALUE and HIGH_VALUE are of little use to us in their current format. What can we do? For the NUMBER and character data types (VARCHAR2, VARCHAR, CHAR) the package UTL_RAW can be used to get the actual values. Here is an example of converting some of these to a human readable format. [code language="SQL"] col low_value format 999999999999 col high_value format 999999999999 select column_name , utl_raw.cast_to_number(low_value) low_value , utl_raw.cast_to_number(high_value) high_value from user_tab_columns where table_name = 'LOW_HIGH' and data_type = 'NUMBER' / col low_value format a20 col high_value format a20 select column_name , utl_raw.cast_to_varchar2(low_value) low_value , utl_raw.cast_to_varchar2(high_value) high_value from user_tab_columns where table_name = 'LOW_HIGH' and data_type = 'VARCHAR2' / COLUMN LOW_VALUE HIGH_VALUE ------------------------------ ------------- ------------- N1 0 127 ID 1 131072 2 rows selected. COLUMN LOW_VALUE HIGH_VALUE ------------------------------ -------------------- -------------------- C1 0123456789abcdefghij fghijklmnopqrstuvwxy 1 row selected. [/code] These values can be verified, as shown here with the N1 column: [code language="SQL"] SQL> select min(n1), max(n1) from low_high; MIN(N1) MAX(N1) ---------- ---------- 0 127 1 row selected. [/code] So far I have done this only with these simple versions of these data types. Variations such as NVARCHAR2, BINARY_FLOAT and others may require different handling. What is missing? The DATE column has not yet been handled. Converting the raw date format to a readable date is not so straightforward as there does not seem to be any conversion function available for that (If you know of one, please write about it in the comments section of this article).
Oracle DATE Format
First it will be necessary to know how Oracle stores a date in the database. Oracle's internal date format has been documented a number of times and is well known, such as in the following Oracle Support Note: How does Oracle store the DATE datatype internally? (Doc ID 69028.1) Oracle dates consist of seven parts: century, year, month of the year, day of the month, and the hours, minutes and seconds after midnight. The internal representation of this format can be seen by running the script in Example 1.Example 1: dumping the internal date format
[code language="SQL"] alter session set nls_date_format = 'mm/dd/yyyy hh24:mi:ss'; col today format a40 drop table t1; create table t1 as select sysdate today from dual; select to_char(today) today from t1 union select dump(today) today from t1; TODAY ---------------------------------------- 12/09/2015 13:13:57 Typ=12 Len=7: 120,115,12,9,14,14,58 2 rows selected. [/code] The hour, minute and second are all stored in excess-1 notation, so 1 must be subtracted from them to get the correct time. Using excess-1 notation prevents a zero byte from being stored. The month and day are both stored with the actual value, which can be seen in the SELECT output. The values for the century and year are stored in excess-100 notation. This means that 100 must be subtracted from the value before using it. In the case of the date in Example 1 the year is clearly seen by subtracting 100 from 104. The century is somewhat different. Not only must 100 be subtracted from the value, it must then be multiplied by 100. The following example demontrates how the components of a date can be extracted from the information returned by the dump() function. [code language="SQL"] col cyear format 9999 col month format a2 col day format a2 col hour format 99 col minute format 99 col second format 99 select -- extract the century and year information from the -- internal date format -- century = (century byte -100) * 100 ( to_number( -- parse out integer appearing before first comma substr( startup_dump, 1, instr(startup_dump,',')-1) - 100 ) * 100 ) + -- year = year byte - 100 ( to_number( substr( startup_dump, -- get position of 2nd comma instr(startup_dump,',',2)+1, -- get position of 2nd comma - position of 1st comma instr(startup_dump,',',1,2) - instr(startup_dump,',',1,1) -1 ) ) - 100 ) cyear , substr( startup_dump, instr(startup_dump,',',1,2)+1, instr(startup_dump,',',1,3) - instr(startup_dump,',',1,2) -1 ) month , substr( startup_dump, instr(startup_dump,',',1,3)+1, instr(startup_dump,',',1,4) - instr(startup_dump,',',1,3) -1 ) day , to_number(substr( startup_dump, instr(startup_dump,',',1,4)+1, instr(startup_dump,',',1,5) - instr(startup_dump,',',1,4) -1 ))-1 hour , to_number(substr( startup_dump, instr(startup_dump,',',1,5)+1, instr(startup_dump,',',1,6) - instr(startup_dump,',',1,5) -1 ))-1 minute , to_number(substr( startup_dump, instr(startup_dump,',',1,6)+1 ))-1 second from ( -- return just the date bytes from the dump() select substr(dump(startup_time),15) startup_dump from v$instance ) a SQL> / CYEAR MO DA HOUR MINUTE SECOND ----- -- -- ---- ------ ------ 2015 11 18 17 33 32 1 row selected. [/code] Note: the internal format for SYSDATE is not the same as dates stored in a table. This is also true for TIMESTAMP and SYSTIMESTAMP. The internal format for TIMESTAMP columns can be seen in this OraFaq Article.Putting it All Together
So, now we can make use of this to examine the values Oracle stores to bind the ranges of columns, this time including the DATE columns. [code language="SQL"] col low_value format a20 col high_value format a20 col table_name format a10 head 'TABLE' col data_type format a20 col column_name format a6 head 'COLUMN' set linesize 200 trimspool on set pagesize 60 select us.table_name, uc.data_type, us.column_name, case when uc.data_type in ('VARCHAR2','VARCHAR','CHAR') then utl_raw.cast_to_varchar2(us.low_value) when uc.data_type = 'NUMBER' then to_char(utl_raw.cast_to_number(us.low_value) ) when uc.data_type = 'DATE' then -- extract the century and year information from the -- internal date format -- century = (century byte -100) * 100 to_char(( to_number( -- parse out integer appearing before first comma substr( substr(dump(us.low_value),15), 1, instr(substr(dump(us.low_value),15),',')-1) - 100 ) * 100 ) + -- year = year byte - 100 ( to_number( substr( substr(dump(us.low_value),15), -- get position of 2nd comma instr(substr(dump(us.low_value),15),',',2)+1, -- get position of 2nd comma - position of 1st comma instr(substr(dump(us.low_value),15),',',1,2) - instr(substr(dump(us.low_value),15),',',1,1) -1 ) ) - 100 )) --current_year || '-' || lpad( substr( substr(dump(us.low_value),15), instr(substr(dump(us.low_value),15),',',1,2)+1, instr(substr(dump(us.low_value),15),',',1,3) - instr(substr(dump(us.low_value),15),',',1,2) -1 ) -- month ,2,'0' ) || '-' || lpad( substr( substr(dump(us.low_value),15), instr(substr(dump(us.low_value),15),',',1,3)+1, instr(substr(dump(us.low_value),15),',',1,4) - instr(substr(dump(us.low_value),15),',',1,3) -1 ) -- day ,2,'0' ) || ' ' || lpad( to_char(to_number( substr( substr(dump(us.low_value),15), instr(substr(dump(us.low_value),15),',',1,4)+1, instr(substr(dump(us.low_value),15),',',1,5) - instr(substr(dump(us.low_value),15),',',1,4) -1 ) )-1) ,2,'0' ) -- hour || ':' || lpad( to_char( to_number( substr( substr(dump(us.low_value),15), instr(substr(dump(us.low_value),15),',',1,5)+1, instr(substr(dump(us.low_value),15),',',1,6) - instr(substr(dump(us.low_value),15),',',1,5) -1 ) )-1 ) ,2,'0' ) -- minute || ':' || lpad( to_char( to_number( substr( substr(dump(us.low_value),15), instr(substr(dump(us.low_value),15),',',1,6)+1 ) )-1 ) ,2,'0' ) --second else 'NOT SUPPORTED' end low_value, -- get the high value case when uc.data_type in ('VARCHAR2','VARCHAR','CHAR') then utl_raw.cast_to_varchar2(us.high_value) when uc.data_type = 'NUMBER' then to_char(utl_raw.cast_to_number(us.high_value) ) when uc.data_type = 'DATE' then -- extract the century and year information from the -- internal date format -- century = (century byte -100) * 100 to_char(( to_number( -- parse out integer appearing before first comma substr( substr(dump(us.high_value),15), 1, instr(substr(dump(us.high_value),15),',')-1) - 100 ) * 100 ) + -- year = year byte - 100 ( to_number( substr( substr(dump(us.high_value),15), -- get position of 2nd comma instr(substr(dump(us.high_value),15),',',2)+1, -- get position of 2nd comma - position of 1st comma instr(substr(dump(us.high_value),15),',',1,2) - instr(substr(dump(us.high_value),15),',',1,1) -1 ) ) - 100 )) --current_year || '-' || lpad( substr( substr(dump(us.high_value),15), instr(substr(dump(us.high_value),15),',',1,2)+1, instr(substr(dump(us.high_value),15),',',1,3) - instr(substr(dump(us.high_value),15),',',1,2) -1 ) -- month ,2,'0' ) || '-' || lpad( substr( substr(dump(us.high_value),15), instr(substr(dump(us.high_value),15),',',1,3)+1, instr(substr(dump(us.high_value),15),',',1,4) - instr(substr(dump(us.high_value),15),',',1,3) -1 ) -- day ,2,'0' ) || ' ' || lpad( to_char(to_number( substr( substr(dump(us.high_value),15), instr(substr(dump(us.high_value),15),',',1,4)+1, instr(substr(dump(us.high_value),15),',',1,5) - instr(substr(dump(us.high_value),15),',',1,4) -1 ) )-1) ,2,'0' ) -- hour || ':' || lpad( to_char( to_number( substr( substr(dump(us.high_value),15), instr(substr(dump(us.high_value),15),',',1,5)+1, instr(substr(dump(us.high_value),15),',',1,6) - instr(substr(dump(us.high_value),15),',',1,5) -1 ) )-1 ) ,2,'0' ) -- minute || ':' || lpad( to_char( to_number( substr( substr(dump(us.high_value),15), instr(substr(dump(us.high_value),15),',',1,6)+1 ) )-1 ) ,2,'0' ) --second else 'NOT SUPPORTED' end high_value from all_tab_col_statistics us join all_tab_columns uc on uc.owner = us.owner and uc.table_name = us.table_name and uc.column_name = us.column_name and us.owner = USER and us.table_name = 'LOW_HIGH' order by uc.column_id SQL> / TABLE DATA_TYPE COLUMN LOW_VALUE HIGH_VALUE ---------- -------------------- ------ -------------------- -------------------- LOW_HIGH NUMBER ID 1 131072 LOW_HIGH NUMBER N1 0 127 LOW_HIGH VARCHAR2 C1 0123456789abcdefghij fghijklmnopqrstuvwxy LOW_HIGH DATE D1 2013-03-13 17:27:03 2015-12-07 17:27:03 4 rows selected. Verify the D1 column values SQL> select min(d1) min_d1, max(d1) max_d1 from low_high; MIN_D1 MAX_D1 ------------------- ------------------- 2013-03-13 17:27:03 2015-12-07 17:27:03 1 row selected. [/code] And there you have it. We can now see in human readable form the low and high values that Oracle has stored for each column. While it is a rather complex SQL statement, it really is not difficult to understand once you know the purpose behind. And the beauty of this script is that no functions or procedures need to be created to make use of it. If you would like add TIMESTAMP or any other value to the script, please, do so! The SQL can be found here in the Low-High GitHub repo. Now that the values can be viewed, the next task will be to put the script to use by using some examples to see how Oracle handles predicates outside the known range of values.On this page
Share this
Share this
More resources
Learn more about Pythian by reading the following blogs and articles.
Enqueue bytes - is that a pun?
Enqueue bytes - is that a pun?
Nov 7, 2016 12:00:00 AM
5
min read
Making existing SQLPLUS Scripts 12c and Container DB (PDB) compatible
Making existing SQLPLUS Scripts 12c and Container DB (PDB) compatible
May 26, 2015 12:00:00 AM
7
min read
DBMS_SCHEDULER and Implicit Commits
DBMS_SCHEDULER and Implicit Commits
Feb 26, 2007 12:00:00 AM
2
min read
Ready to unlock value from your data?
With Pythian, you can accomplish your data transformation goals and more.