Although there have been a number of significant changes to the Oracle Database in 12c, some of my favorite new features are actually the small enhancements, designed to make your life a little easier. So, I decided to spend some time this year sharing the details on these small but extremely useful features.
One such enhancement is Online Statistics Gathering.
Whenever an index is created, Oracle automatically gathers optimizer statistics for that index. The database piggybacks the statistics gather on the full data scan and sort operation necessary for the index creation. This approach has worked so well since it was introduced in 9i, few people even realize it’s happening.
SELECT TO_CHAR(SYSDATE,'DD-MON-YYYY HH24:MI:SS') WALL_CLOCK_TIME FROM dual; WALL_CLOCK_TIME ----------------------------- 03-JAN-2017 21:29:44 SELECT index_name, TO_CHAR(last_analyzed,'DD-MON-YYYY HH24:MI:SS') last_analyzed, leaf_blocks, distinct_keys, clustering_factor FROM user_indexes; no ROWS selected CREATE INDEX t_s_idx ON items (s); INDEX created. SELECT index_name, TO_CHAR(last_analyzed,'DD-MON-YYYY HH24:MI:SS') last_analyzed, leaf_blocks, distinct_keys, clustering_factor FROM user_indexes; INDEX_NAME LAST_ANALYZED LEAF_BLOCKS DISTINCT_KEYS CLUS_FACT ---------- -------------------- ----------- ------------- ---------- T_S_IDX 03-JAN-2017 21:29:45 6135 1000000 999839
In Oracle Database 12c, the same technique is now applied for direct path operations such as, Create Table As Select (CTAS) and Insert /+APPEND */ As Select (IAS) operations into an empty table or partition.
Piggybacking the statistics gather as part of the data loading operation, means no additional full data scan is required to have statistics available immediately after the data is loaded. The additional time spent on gathering statistics is small compared to a separate statistics collection process, and it guarantees accurate statistics are readily available from the get-go.
SELECT TO_CHAR(SYSDATE,'DD-MON-YYYY HH24:MI:SS') WALL_CLOCK_TIME FROM dual; WALL_CLOCK_TIME ----------------------------- 03-JAN-2017 21:40:07 CREATE TABLE items2 AS SELECT * FROM items; TABLE created. SELECT TABLE_NAME, num_rows, TO_CHAR(last_analyzed,'DD-MON-YYYY HH24:MI:SS') last_analyzed FROM user_tables; TABLE_NAME NUM_ROWS LAST_ANALYZED ---------- ---------- --------------------- ITEMS 1000000 20-OCT-2016 05:58:26 ITEMS2 1000000 03-JAN-2017 21:40:09 SELECT column_name, num_distinct, num_nulls, histogram, notes FROM user_tab_col_statistics WHERE TABLE_NAME='ITEMS2'; COLUMN_NAME NUM_DISTINCT NUM_NULLS HISTOGRAM NOTES ----------- ------------ ---------- --------------- -------------- N 624640 0 NONE STATS_ON_LOAD S 988416 0 NONE STATS_ON_LOAD
Be aware that online statistics gathering will not occur if any data exists in the table or partition, as the overhead of reading the existing data could greatly increase the time it takes to complete the data load. Only base table and column statistics are gathered. Histograms are not gathered automatically, as they require a second pass of the data, which again could adversely affect the elapse time for the database load.
Histograms can be gathered manually without re-gathering base column statistics by using new GATHER AUTO option in the DBMS_STATS package.
BEGIN dbms_stats.Gather_table_stats('SH','ITEMS2',options=>'GATHER AUTO'); END; /
If for some reason you wish to turn off this new automatic online statistics gathering, you can do so by setting the underscore parameter _optimizer_gather_stats_on_load to FALSE.
hi Maria,
Excuse me if my question is too basic, but to my knowledge, no question is stupid until I am not clear on it.
Though I understood the concept from this, Am still trying to understand from this post with what is the benefit that we are gaining from 12c? i.e., what was it before 12c and what is it with 12c. Please help me in understanding.
Many Thanks,
Hi Sunil,
Prior to 12c, if you created a table via a CTAS command or insert a set of data into an empty table or partition via an IAS command, there would be no statistics generated. All subsequent queries would have to use dynamic sampling during parse to generate some basic statistics about the table until either the DBA manually gathered statistics or the nightly statistics gathering task kicks in.
Having the statistics automatically gathered as part of the data load, means all subsequent queries will have accurate statistics immediately after the load completes. This reduces the parse time for the subsequent queries, as no dynamic sampling is required and ensures the optimizer has the information it needs to generate a optimal execution plan.
Thanks,
Maria
Thanks, Maria, Got it. So, like an Index, now from 12c if _optimizer_gather_stats_on_load parameter is set to true stats on the table will be available (histograms excepted) to get accurate run times.
Thank you got it.
This property is stored in which table? i.e., I want to set this property true at system level,
I can run a command
alter system set “_optimizer_gather_stats_on_load”=true;
But I want to see what is the current value for this property, where can I check the same.
Many Thanks,
Thanks for this post, as of 12c(12.1.0.2) this online stats gathering would work only for Heap table and not for IOT’s.
Ofcourse it is documented here
http://docs.oracle.com/database/121/TGSQL/tgsql_statscon.htm#GUID-CDDB5A54-0991-4E68-A9D7-2305777B608B
demo@ORA12C> select * from v$version;
BANNER
——————————————————————————– —
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 – 64bit Production
PL/SQL Release 12.1.0.2.0 – Production
CORE 12.1.0.2.0 Production
TNS for 64-bit Windows: Version 12.1.0.2.0 – Production
NLSRTL Version 12.1.0.2.0 – Production
5 rows selected.
demo@ORA12C> create table t(
2 object_id,
3 owner,
4 object_name,
5 object_type,
6 constraint t_pk primary key(object_id) )
7 organization index
8 as
9 select object_id,owner,object_name,object_type
10 from all_objects;
Table created.
demo@ORA12C> select num_rows,blocks,last_analyzed,avg_row_len
2 from user_tables
3 where table_name =’T’;
NUM_ROWS BLOCKS LAST_ANALYZ AVG_ROW_LEN
———- ———- ———– ———–
1 row selected.
demo@ORA12C> select column_name,num_distinct,num_nulls,last_analyzed
2 from user_tab_col_statistics
3 where table_name =’T’
4 order by column_name;
no rows selected
demo@ORA12C> select index_name, leaf_blocks,distinct_keys,
2 clustering_factor,num_rows,last_analyzed
3 from user_ind_statistics
4 where table_name =’T’;
INDEX_NAME LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR NUM_ROWS LAST_ANALYZ
———- ———– ————- —————– ———- ———–
T_PK
1 row selected.
demo@ORA12C> exec dbms_stats.gather_table_stats(user,’T’);
PL/SQL procedure successfully completed.
demo@ORA12C> select num_rows,blocks,last_analyzed,avg_row_len
2 from user_tables
3 where table_name =’T’;
NUM_ROWS BLOCKS LAST_ANALYZ AVG_ROW_LEN
———- ———- ———– ———–
77423 09-JAN-2017 46
1 row selected.
demo@ORA12C>
demo@ORA12C> select column_name,num_distinct,num_nulls,last_analyzed
2 from user_tab_col_statistics
3 where table_name =’T’
4 order by column_name;
COLUMN_NAM NUM_DISTINCT NUM_NULLS LAST_ANALYZ
———- ———— ———- ———–
OBJECT_ID 77423 0 09-JAN-2017
OBJECT_NAM 42128 0 09-JAN-2017
E
OBJECT_TYP 24 0 09-JAN-2017
E
OWNER 19 0 09-JAN-2017
4 rows selected.
demo@ORA12C> select index_name, leaf_blocks,distinct_keys,
2 clustering_factor,num_rows,last_analyzed
3 from user_ind_statistics
4 where table_name =’T’;
INDEX_NAME LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR NUM_ROWS LAST_ANALYZ
———- ———– ————- —————– ———- ———–
T_PK 564 77423 0 77423 09-JAN-2017
1 row selected.
demo@ORA12C>
Thanks Maria.
Hi Maria, does it make sense to disable online statistics gathering when you are importing via data pump and the next step is to import the schema stats from source database with DBMS_STATS.IMPORT_SCHEMA_STATS ?
I am trying to reduce migration time as much as possible…
Best regards,
Xabi.