Overriding DBMS_STATS Parameter Settings

Since it’s introduction in Oracle Database 8i, the DBMS_STATS package is Oracle’s preferred method for gathering statistics. With each new database release the DBMS_STATS package is extended to accommodate new approaches to gather statistics and new types of statistics.

Over the years, application developers and DBAs have written hundreds of scripts using the DBMS_STATS package to help gather and manage optimizer statistics effectively. However, once written these scripts are rarely modified to take advantage of the improvements in the DBMS_STATS package, which can result in suboptimal statistics.

Oracle Database 12 Release 2 makes it a lot easier to be able to manage this vast collection of scripts by includes a new DBMS_STATS preference called PREFERENCE_OVERRIDES_PARAMETER. When this preference is set to TRUE, it allows preference settings to override the parameter values specifically set in a DBMS_STATS command.

For example, if the global preference ESTIMATE_PERCENT is set to the default DBMS_STATS.AUTO_SAMPLE_SIZE, but a 10% sample size was specified in the in a DBMS_STATS.GATHER_TABLE_STATS command, setting PREFERENCE_OVERRIDES_PARAMETER to TRUE would automatically replace the 10% sample size with the default preference  DBMS_STATS.AUTO_SAMPLE_SIZE.

Let’s take a look at this in action. Imagine we have a SALES table with 1.1 million rows in it.

SELECT COUNT(*)
FROM sales;
 
COUNT(*)
------------
1164295

The following command is used to gather statistics on the SALES table.

BEGIN
DBMS_STATS.GATHER_TABLE_STATS(ownname=>'SH', tabname=>'SALES', estimate_percent=>10);
END;
/
 
PL/SQL PROCEDURE successfully completed.
 
Elapsed: 00:00:09.08

If we check the sample size used for gathering statistics, we see that only 10% of the rows were used as requested and that a FREQUENCY histogram has been created on the CUST_ID column of the SALES table.

SELECT TABLE_NAME, sample_size
FROM   user_tables
WHERE  TABLE_NAME= 'SALES';
 
TABLE_NAME                     SAMPLE_SIZE
------------------------------ -----------
SALES                           116665
 
SELECT column_name, num_distinct, num_nulls, histogram
FROM   user_tab_col_statistics
WHERE  TABLE_NAME='SALES';
 
COLUMN_NAME     NUM_DISTINCT   NUM_NULLS    HISTOGRAM
-------------- --------------- ------------ ---------- 
PROD_ID                  137       0         NONE
CUST_ID                  161       0         FREQUENCY
TIME_ID                  135       0         NONE
CHANNEL_ID                 5       0         NONE
PROMO_ID                  22       0         NONE
QUANTITY_SOLD             39       0         NONE
AMOUNT_SOLD              152       0         NONE

Unfortunately, these statistics are not that accurate as there are over 600 distinct PROD_IDs and CUST_IDs in the SALES table but the statistics gather missed this information because it looked at such a small sample.

SELECT COUNT(DISTINCT prod_id), COUNT(DISTINCT cust_id)
FROM   sales;
 
COUNT(DISTINCTPROD_ID)  COUNT(DISTINCTCUST_ID)
----------------------  ----------------------
651                      630

In order to improve the accuracy of the statistics, we should use the recommended best practice value for the ESTIMATE_PERECENT parameter, AUTO_SAMPLE_SIZE. Let’s check the preference value for the ESTIMATE_PERECENT parameter on the SALES table.

SELECT
DBMS_STATS.GET_PREFS(pname=>'ESTIMATE_PRECENT' , ownname=>'SH', tabname=>'SALES')
FROM Dual;
 
DBMS_STATS.GET_PREFS(PNAME=>'ESTIMATE_PERCENT',OWNNAME=>'SH',TABNAME=>'SALES')
--------------------------------------------------------------------------------
DBMS_STATS.AUTO_SAMPLE_SIZE

We see it’s actually set to DBMS_STATS.AUTO_SAMPLE_SIZE, which is good news for us.

Rather than having to find and edit all of the existing statistics gathering script that refer to the SALES table, we can set the DBMS_STATS preference PREFERENCE_OVERRIDES_PARAMETER to TRUE on the SALES table, which will automatically override all of the none default parameter values used in the DBMS_STATS command with the corresponding preferences values.

BEGIN
DBMS_STATS.SET_TABLE_PREFS(ownname=>'SH', tabname=>'SALES', 
                           pname=>'PREFERENCE_OVERRIDES_PARAMETER', pvalue=>'TRUE');
END;
/
 
PL/SQL PROCEDURE successfully completed.

So, in our case the ESTIMATE_PERCENT of 10% will actually be overridden and DBMS_STATS.AUTO_SAMPLE_SIZE will be used. Let’s see what impact that has on both the statistics gathering performance and the quality of the statistics gathered.

BEGIN
DBMS_STATS.GATHER_TABLE_STATS(ownname=>'SH', tabname=>'SALES', estimate_percent=>10);
END;
/
 
PL/SQL PROCEDURE successfully completed.
Elapsed: 00:00:06.59
SELECT TABLE_NAME, SAMPLE_SIZE
FROM   user_tables
WHERE  TABLE_NAME= 'SALES';
 
TABLE_NAME                      SAMPLE_SIZE
------------------------------ -----------
SALES                           1164295

So, you can see that the time it took to gather statistics was actually less than the time for the 10% but the sample size was the total number of rows in the table. But what about the quality of the statistics?

SELECT column_name, num_distinct, num_nulls, histogram
FROM  user_tab_col_statistics
WHERE TABLE_NAME='SALES';
 
COLUMN_NAME     NUM_DISTINCT   NUM_NULLS    HISTOGRAM
-------------- --------------- ------------ ---------- 
PROD_ID                  651       0         NONE
CUST_ID                  630       0         TOP-FREQUENCY
TIME_ID                  620       0         NONE
CHANNEL_ID               5         0         NONE
PROMO_ID                 116       0         NONE
QUANTITY_SOLD            44        0         NONE
AMOUNT_SOLD              583       0         NONE

As you can see, we have a much more accurate set of statistics this time with each of the number of distinct values being correct. You will also notice we got a new type of histogram a TOP-FREQUENCY histogram this time. That’s because this new type of histogram is only gathered if the ESTIMATE_PERCENT is set to the default.

Remember setting PREFERENCE_OVERRIDES_PARAMETER to TRUE will override all none default parameters set in the DBMS_STATS.GATHER_*_STATS command, so you may need to adjust the default preferences for some of the parameters, for example METHOD_OPT if you don’t want to revert to the default.

You can set a table preference using the DBMS_STATS.SET_TABLE_PREFS procedure.

BEGIN
DBMS_STATS.SET_TABLE_PREFS(ownname=>'SH', tabname=>'SALES', pname=>METHOD_OPT, 
                           pvalue=>'FOR ALL COLUMNS SIZE 1 FOR COLUMNS CUST_ID SIZE 254');
END;

Leave a Reply

Your email address will not be published. Required fields are marked *

%d bloggers like this: