Decoding DBA_TAB_COLUMNS high_value and low_value

When performance tuning, it can be important to understand the statistics in the database. It is worth noting that every column records it lowest (low_value) and highest (high_value) values in DBA_TAB_COLUMNS when you gather stats, like a mini-histogram. If the optimizer believes it knows the range of values within a column, plus the number of different values (cardinality), it can make guesses about the suitability and cost of index access when a predicate references that column.

This is especially important when looking at values outside of the low_value and high_value range, as Oracle crudely calculates the probability that such a value exists. It does this by identifying the total range of values (high_value – low_value) and lowering the calculated cost for predicates increasingly further outside of the known range until it decides that there’s no realistic chance of hitting the value you have asked for.

For example:

If low_value was 40 and high_value was 50, this gives us a range of 10. Therefore, any predicate selecting outside of the values 40-to-50 will have it’s cost lowered by a percentage of the range (10). If we assume an access cost of 1000, this is how the index access cost may change.

where X = 40 ; cost = 1000 - within low-to-high value
where X = 50 ; cost = 1000 - within low-to-high value
where X = 55 ; cost = 500  - outsize of low-to-high value by 50% of the range
where X = 32 ; cost = 200  - outsize of low-to-high value by 80% of the range
where X = 61 ; cost = 1    - outside of low-to-high value by more than 100% of the range
where X = 29 ; cost = 1    - outside of low-to-high value by more than 100% of the range

This is called statistical decay – the calculated lowering of the cost because Oracle doesn’t think it will find the value in the index.

Querying the LOW_VALUE and HIGH_VALUE

The low_value and high_value in DBA_TAB_COLUMNS is stored in an Oracle internal format, so you need to use DBMS_STATS.CONVERT_RAW_VALUE to turn it into a human-readable format. Here’s an example:

Table Definition:

08:08:24 SQL> desc nc1
Name    Null?    Type
-----   -----    ----
C1               DATE
C2               TIMESTAMP(6)
C3               VARCHAR2(10)
C4               NUMBER
C5               BINARY_FLOAT
C6               BINARY_DOUBLE

Select all of the data out of the table so we can see what we have

C1                  C2                          C3                 C4         C5         C6
------------------- ----------------------------- ---------- ---------- ---------- ----------
21/07/0001 01:02:03 21/07/0001 01:02:03.456789    Lower               1   1.0E+000   1.0E+000
21/07/9999 01:02:03 21/07/9999 01:02:03.456789    Upper              99   9.9E+001   9.9E+001
31/12/2015 00:00:00 12/11/2015 08:08:24.586342    Now                50   5.0E+001   5.0E+001
31/12/2015 00:00:00 12/11/2015 08:08:24.585173    Now                50   5.0E+001   5.0E+001
31/12/2015 00:00:00 12/11/2015 08:08:24.579977    Now                50   5.0E+001   5.0E+001
31/12/2015 00:00:00 12/11/2015 08:08:24.585896    Now                50   5.0E+001   5.0E+001

Stats have been gathered, so we can look at DBA_TAB_COLUMNS (or USER_TAB_COLUMNS), and decode the high and low values

TABLE COLUM DATA_TYPE       LOW_VALUE                      HIGH_VALUE                     LOW_DECODE                     HIGH_DECODE
----- ----- --------------- ------------------------------ ------------------------------ ------------------------------ ------------------------------
NC1   C1    DATE            64650715020304                 C7C70715020304                 0001-07-21 01:02:03            9999-07-21 01:02:03
NC1   C2    TIMESTAMP(6)    646507150203041B3A0C08         C7C707150203041B3A0C08         0001-07-21 01:02:03            9999-07-21 01:02:03
NC1   C3    VARCHAR2        4C6F776572                     5570706572                     Lower                          Upper
NC1   C4    NUMBER          C102                           C164                           1                              99
NC1   C5    BINARY_FLOAT    BF800000                       C2C60000                       1.0E+000                       9.9E+001
NC1   C6    BINARY_DOUBLE   BFF0000000000000               C058C00000000000               1.0E+000                       9.9E+001

So how did I decode all of that? I used the 12C capability of putting in-line functions in a SQL statement using subquery refactoring. Pre-12C, you could create the functions as objects.

with
FUNCTION raw_to_date(i_var in raw) return date  as
o_var date;
begin
dbms_stats.convert_raw_value(i_var,o_var);
return o_var;
end;
FUNCTION raw_to_number(i_var in raw) return number  as
o_var number;
begin
dbms_stats.convert_raw_value(i_var,o_var);
return o_var;
end;
FUNCTION raw_to_varchar2(i_var in raw) return varchar2  as
o_var varchar2(32767);
begin
dbms_stats.convert_raw_value(i_var,o_var);
return o_var;
end;
FUNCTION raw_to_float(i_var in raw) return binary_float  as
o_var binary_float;
begin
dbms_stats.convert_raw_value(i_var,o_var);
return o_var;
end;
FUNCTION raw_to_double(i_var in raw) return binary_double  as
o_var binary_double;
begin
dbms_stats.convert_raw_value(i_var,o_var);
return o_var;
end;
select table_name,column_name,data_type,low_value,high_value,
decode(substr(data_type,1,9),
'DATE'     , to_char(raw_to_date(low_value),'YYYY-MM-DD HH24:MI:SS') ,
'TIMESTAMP', to_char(raw_to_date(low_value),'YYYY-MM-DD HH24:MI:SS') ,
'NUMBER',    to_char(raw_to_number(low_value)),
'VARCHAR2',  to_char(raw_to_varchar2(low_value)),
'BINARY_FL', to_char(raw_to_float(low_value)),
'BINARY_DO', to_char(raw_to_double(low_value)),'Unknown') low_decode,
decode(substr(data_type,1,9),
'DATE'     , to_char(raw_to_date(high_value),'YYYY-MM-DD HH24:MI:SS') ,
'TIMESTAMP', to_char(raw_to_date(high_value),'YYYY-MM-DD HH24:MI:SS') ,
'NUMBER',    to_char(raw_to_number(high_value)),
'VARCHAR2',  to_char(raw_to_varchar2(high_value)),
'BINARY_FL', to_char(raw_to_float(high_value)),
'BINARY_DO', to_char(raw_to_double(high_value)),'Unknown') high_decode
from user_tab_columns
where table_name = 'NC1'
/

NOTE: There is no function to decode TIMESTAMPs. You can use them as input to the DATE convert (as above) but you will lose precision, or you can convert them manually:

select
      column_name,
      low_value,
          lpad(to_char(100*to_number(substr(low_value,1,2),'XX')
                          +to_number(substr(low_value,3,2),'XX')) - 10100,4,'0') ||'-'
       ||lpad( to_number(substr(low_value, 5,2),'XX'),2,'0')||'-'
       ||lpad( to_number(substr(low_value, 7,2),'XX'),2,'0')||' '
       ||lpad((to_number(substr(low_value, 9,2),'XX')-1),2,'0')||':'
       ||lpad((to_number(substr(low_value,11,2),'XX')-1),2,'0')||':'
       ||lpad((to_number(substr(low_value,13,2),'XX')-1),2,'0')||'.'
       ||lpad((to_number(substr(low_value,15,8),'XXXXXXXX')-1),8,'0')
       low_decode,
      high_value,
          lpad(to_char(100*to_number(substr(high_value,1,2),'XX')
                          +to_number(substr(high_value,3,2),'XX')) - 10100,4,'0') ||'-'
       ||lpad( to_number(substr(high_value, 5,2),'XX'),2,'0')||'-'
       ||lpad( to_number(substr(high_value, 7,2),'XX'),2,'0')||' '
       ||lpad((to_number(substr(high_value, 9,2),'XX')-1),2,'0')||':'
       ||lpad((to_number(substr(high_value,11,2),'XX')-1),2,'0')||':'
       ||lpad((to_number(substr(high_value,13,2),'XX')-1),2,'0')||'.'
       ||lpad((to_number(substr(high_value,15,8),'XXXXXXXX')-1),8,'0')
       high_decode
from user_tab_columns
where data_type like 'TIMESTAMP%'
/
COLUM LOW_VALUE                      LOW_DECODE                     HIGH_VALUE                     HIGH_DECODE
----- ------------------------------ ------------------------------ ------------------------------ ------------------------------
C2    646507150203041B3A0C08         0001-07-21 01:02:03.45678899   C7C707150203041B3A0C08         9999-07-21 01:02:03.45678899

4 Responses to Decoding DBA_TAB_COLUMNS high_value and low_value

  1. Neil,
    a nice idea to use PL/SQL in the with clause in 12c. In earlier versions (and if I am allowed to create my own functions in the database) I think I would prefer not to create a lot of single type functions but rather use Greg Rahn’s display_raw: http://structureddata.org/2007/10/16/how-to-display-high_valuelow_value-columns-from-user_tab_col_statistics/.

    Regards
    Martin

    • Getting ‘DBA’ objects created is sometimes close to impossible, especially at short notice when things are going wrong and you’re trying to analyze a performance problem.
      Gregs code is good – never seen it before, or I might not have written this 🙂

  2. Hello Neil
    this is all very good, but how do you decode them low and high values in 11g?

    • Ca va!

      For dates and timestamps, use the code at the end of the blog post (remove the last decode if necessary).

      For most of the other you can use UTL_RAW to resolve the values, something like…

      ,decode(data_type
      ,’NUMBER’ ,to_char(utl_raw.cast_to_number(high_value))
      ,’VARCHAR2′ ,to_char(utl_raw.cast_to_varchar2(high_value))
      ,’NVARCHAR2′ ,to_char(utl_raw.cast_to_nvarchar2(high_value))
      ,’BINARY_DOUBLE’,to_char(utl_raw.cast_to_binary_double(high_value))
      ,’BINARY_FLOAT’ ,to_char(utl_raw.cast_to_binary_float(high_value))

      Neil

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: