How to identify which indexes can be safely dropped in Oracle

The longer an application has been successfully run, the more likely you are to have indexes that are no longer used or beneficial. Removing these indexes not only saves space but can also improve the performance of any DML operations.

But knowing which indexes can be safely dropped can be tricky.

In Oracle Database 12c Release 2 (12.2), things got a little easier with the introduction of a new view called DBA_INDEX_USAGE.

The DBA_INDEX_USAGE view displays cumulative index usage statistics, but unlike the index monitoring available in previous releases, it’s not just a binary value (YES or NO). This view shows how often an index was used, who accessed it, and how effective it was, via a histogram based on the number of rows returned. The index access information is captured in memory and flushed to disk every 15 minutes. You can query LAST_FLUSH_TIME in V$INDEX_USAGE_INFO to determine when it was last updated.
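
To check when the in-memory usage data was last written to disk, a minimal query along these lines should do. It assumes your session has SELECT privilege on the V$ view, and the columns shown are only a subset of what the view offers.

```sql
-- When was index usage information last flushed to disk,
-- and is usage tracking enabled at all?
SELECT index_stats_enabled,
       index_stats_collection_type,
       last_flush_time
FROM   v$index_usage_info;
```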

As always, it is easier to understand the benefits of a dictionary view by looking at an example. In my example below, I join the DBA_INDEX_USAGE view to DBA_INDEXES via an outer join to ensure I capture information on indexes that are never used, because these indexes won’t appear in DBA_INDEX_USAGE at all.

SELECT i.index_name, u.total_access_count tot_access, u.total_exec_count exec_cnt,
       u.bucket_0_access_count B0, u.bucket_1_access_count B1, u.bucket_2_10_access_count B2_10,
       u.bucket_11_100_access_count B11_100, u.bucket_101_1000_access_count B101_1K,
       u.bucket_1000_plus_access_count B1K, u.last_used
FROM   DBA_INDEX_USAGE u
RIGHT JOIN DBA_INDEXES i
ON     i.index_name = u.name
AND    i.owner      = u.owner   -- index names are only unique within a schema
WHERE  i.owner = 'MARIA'
ORDER BY u.total_access_count;

The output of this query should look something like this.

INDEX_NAME       TOT_ACCESS   EXEC_CNT      B0      B1   B2_10  B11_100  B101_1K     B1K LAST_USED
---------------- ---------- ---------- ------- ------- ------- -------- -------- ------- ---------
PROD_CUST_SALES           1          1       0       0       0        0        0       1 06-APR-23
INDX_LOC                  2          2       0       1       1        0        0       0 12-APR-23
INDX_DEPTNO              19         19      18       1       0        0        0       0 26-APR-23
PROD_SUP_INDX            27         27      25       0       0        0        2       0 26-APR-23
EMPNO_PK_IND             82         82      48      32       0        2        0       0 26-APR-23
CHAN_SOLD
PROD_SUB_IDX

As you can see in the output above, for each index, we get the total number of accesses (TOT_ACCESS), the execution count (EXEC_CNT), the last date the index was used (LAST_USED), and a histogram of how many rows were returned across several buckets.

Let’s look at the EMPNO_PK_IND index (the third row from the bottom). This index returned 0 rows 48 times out of 82 total accesses and a single row 32 times. So, it looks like this index is commonly used and valuable.

However, if we look at PROD_CUST_SALES (the first row), it was accessed only once and returned over 1,000 rows. Is that a helpful index?

Maybe, maybe not. It may be beneficial if this index access replaces a full table scan of an extremely wide table. If, on the other hand, it’s only marginally cheaper than a full table scan, it may not be worth the overhead of maintaining the index.

But perhaps the most exciting part of the query output above is that we have two indexes (CHAN_SOLD, PROD_SUB_IDX) that have never been accessed. These are good candidates to be dropped.
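
If you only want the list of candidates, a sketch like the one below returns just the indexes with no recorded usage. The owner MARIA comes from the example above. The second NOT EXISTS is my own assumption rather than part of the original query: it skips indexes that back a primary key or unique constraint, since those are needed regardless of how often queries use them.

```sql
-- Indexes owned by MARIA that have no row in DBA_INDEX_USAGE,
-- i.e. no access has been recorded for them.
SELECT i.index_name, i.table_name, i.uniqueness
FROM   dba_indexes i
WHERE  i.owner = 'MARIA'
AND    NOT EXISTS (SELECT 1
                   FROM   dba_index_usage u
                   WHERE  u.owner = i.owner
                   AND    u.name  = i.index_name)
-- Assumption: leave out indexes that enforce a constraint.
AND    NOT EXISTS (SELECT 1
                   FROM   dba_constraints c
                   WHERE  NVL(c.index_owner, c.owner) = i.owner
                   AND    c.index_name = i.index_name)
ORDER BY i.table_name, i.index_name;
```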

But dropping an index can be risky. What can you do to ensure you don’t get fired for dropping the wrong index?

Rather than dropping the index immediately, it is far safer to mark it invisible first. The optimizer will no longer see the index and won’t select it as part of any execution plan. However, the index will continue to be maintained, so it can be made visible again without a rebuild.

ALTER INDEX prod_sub_idx INVISIBLE;

If no one complains about their query performance regressing after some time, you can safely drop the index. If, on the other hand, someone does complain, you can make the index visible again in a matter of seconds.

ALTER INDEX prod_sub_idx VISIBLE;
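
To confirm the current state of an index before or after either command, you can check the VISIBILITY column in DBA_INDEXES. This sketch reuses the owner and index name from the example.

```sql
-- VISIBILITY is either VISIBLE or INVISIBLE
SELECT index_name, visibility
FROM   dba_indexes
WHERE  owner      = 'MARIA'
AND    index_name = 'PROD_SUB_IDX';
```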

You can also create a new index as invisible. The optimizer is unaware of the new index until you make it visible, which gives you time to verify that it improves performance. To test it, set the parameter OPTIMIZER_USE_INVISIBLE_INDEXES to TRUE within a single session. This allows the optimizer to consider the new index only for SQL statements issued within your session. All other sessions will continue to use the existing execution plans.

-- New indexes can be marked invisible until you have an opportunity to prove they improve performance
CREATE INDEX my_idx ON t(x, object_id) INVISIBLE; 
 
-- Test newly created invisible indexes by setting OPTIMIZER_USE_INVISIBLE_INDEXES to TRUE
ALTER SESSION SET optimizer_use_invisible_indexes = TRUE;
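
A full test cycle might look roughly like the sketch below. The table t and index my_idx come from the example above; the predicate values are hypothetical, so substitute a statement you actually expect to benefit from the index.

```sql
-- With the parameter set to TRUE in this session only,
-- check whether the optimizer picks the invisible index.
EXPLAIN PLAN FOR
SELECT * FROM t WHERE x = 1 AND object_id = 42;   -- hypothetical predicate values

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Return the session to the default behaviour.
ALTER SESSION SET optimizer_use_invisible_indexes = FALSE;

-- Once you are happy with the results, expose the index to everyone.
ALTER INDEX my_idx VISIBLE;
```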

Better diagnostics for SQL regressions in 19c and beyond #JoelKallmanDay

When diagnosing and correcting a performance regression for a SQL statement, it can often be challenging to find out what is happening during the execution and why your corrective measures are not working.

In today’s blog, I want to share several enhancements introduced in recent Oracle Database releases that help you understand how to improve a SQL statement and why your corrective measures aren’t working as you hoped.

Enhanced Execution Plan

One of the most valuable tools at our disposal is the execution plan. Over the last several years, the Oracle Optimizer team has made a considerable effort to share even more insights and guidance on what might be impacting the performance of a SQL statement via the execution plan.

The Note section under the plan

The note section under the execution plan contains valuable information on what has happened during parse and execution. Over the last several releases, you will find a lot more information appearing in the note section, including details on the following features:

  • Dynamic Sampling – indicates a non-default value for the parameter OPTIMIZER_DYNAMIC_SAMPLING or that one or more objects referenced in the query are missing statistics.
  • Plan Stability Features – indicates that a stored outline, SQL profile, SQL patch, SQL plan directive, or SQL plan baseline influenced the plan chosen.
  • Adaptive plan – indicates that the plan shown through the Explain Plan command may be different from the plan used during execution, as the Optimizer will use run-time statistics to make the final decision on what join types and aggregation methods to use. It is best to view the actual plan used via V$SQL_PLAN after the initial execution.
  • Statistics Feedback – indicates the plan was re-evaluated based on previous execution statistics, and a new plan was found.
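
One convenient way to see the plan that was actually used, along with its Note section, is DBMS_XPLAN.DISPLAY_CURSOR, which reads from the cursor cache. The sketch below is a minimal example; the SQL_ID is only a placeholder, and in SQL*Plus the first form expects SERVEROUTPUT to be off so that the last statement executed is the one you care about.

```sql
-- Plan used by the last statement executed in this session
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(format => 'TYPICAL'));

-- Plan for a specific cursor, looked up by SQL_ID (placeholder value)
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(sql_id => 'bsvavk15n7cra', format => 'TYPICAL'));
```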

Hint Usage Reporting in Oracle Database 19c

Starting in 19c, you will see a new hint usage report under the execution plan. The new report gives you a better understanding of what happened to each hint in a SQL statement.

Remember, the database doesn’t issue an error if a hint is invalid or not applicable for a SQL statement. The database will treat the hint as a comment, effectively ignoring it. Previously, the only way to find this information was via a 10053 trace file, and even then, it wasn’t very detailed.

The new hint report shows which hints were used and which were ignored and explains why hints weren’t used. The most common reasons for a hint not to be used are:

  • Syntax errors – the hint contains a typo or an invalid argument.
  • Unresolved hints – the hint contains an invalid argument or is not applicable to the SQL statement. For example, you request an index be used, but no index exists on the table.
  • Unused hints – the hint can’t be used in this specific scenario. For example, you requested a Hash Join for a non-equality join condition.
  • Conflicting hints – multiple hints provide conflicting directives.

In the example below, the hint NO_QURY_TRANSFORMATION was reported to contain a syntax error. The word query is misspelled in the hint, so the hint can’t be used.

--------------------------------------------------------------------------------
| Id  | Operation                      | Name     | Rows  | Bytes | Cost (%CPU)|
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |          |       |       |    47 (100)|
|   1 |  HASH GROUP BY                 |          |   269 | 37929 |    47   (7)|
|   2 |   HASH JOIN                    |          |   269 | 37929 |    46   (5)|
|   3 |     TABLE ACCESS STORAGE FULL  | SALES    | 10000 | 90000 |     5   (0)|
|*  4 |     TABLE ACCESS STORAGE FULL  | PRODUCTS | 43108 |  4841K|    40   (3)|
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("ITEM_1"="P"."PROD_ID")

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (E - Syntax error (1))
--------------------------------------------------------------------------
   0 -  SEL$1
         E -  NO_QURY_TRANSFORMATION

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

In this second example, I provided two hints on how to access the employees table. One hint requested that the primary key index be used, and the other requested that the access leverage parallel execution.

SELECT /*+ index(e empno_pk_ind) parallel(e 8) */ e.empno, ename
FROM   employees e
WHERE  e.empno < 7700;

----------------------------------------------------------------------------
| Id  | Operation                           | Name         | Rows  | Bytes |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |              |       |       |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| EMPLOYEES    |     8 |    80 |
|*  2 |   INDEX RANGE SCAN                  | EMPNO_PK_IND |     8 |       |
----------------------------------------------------------------------------

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
---------------------------------------------------------------------------
         U -  parallel(e 8)

The parallel hint is not used, as I’ve supplied an invalid combination of hints. An index range scan can’t be parallelized unless the index is partitioned, which is not true in this example. Therefore, the Optimizer cannot honor both hints.
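
If you want the report to list the hints that were honored as well as those that were not, you can ask for it explicitly through the format argument of DBMS_XPLAN. The sketch below reuses the statement from this example; as I understand it, the +HINT_REPORT modifier shows both used and unused hints, whereas the default output concentrates on the ones that were not used.

```sql
EXPLAIN PLAN FOR
SELECT /*+ index(e empno_pk_ind) parallel(e 8) */ e.empno, ename
FROM   employees e
WHERE  e.empno < 7700;

-- Request the full hint report alongside a basic plan
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY(format => 'BASIC +HINT_REPORT'));
```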

Note: Nigel Bayliss, the Optimizer product manager, has also blogged about Optimizer hint reporting and has shared details on some limitations regarding execution hints such as GATHER_OPTIMIZER_STATISTICS, APPEND, etc.

SQL Analysis Report in Oracle Database 23c

In Oracle Database 23c, the execution plan got another new section: a SQL Analysis Report. This handy addition helps you diagnose common problems that can cause suboptimal execution plans. For example, the new report will point out situations where you:

  • Are missing join conditions
  • Have a WHERE clause predicate that prevents an index from being used
  • Have a datatype mismatch in a WHERE clause predicate
  • Use a UNION instead of a UNION ALL

In the example below, applying the UPPER function to the STATE column prevents that column from being used as a key in an index range scan.

EXPLAIN PLAN FOR
SELECT * FROM addresses
 WHERE UPPER(state) = 'CA';
 
SELECT * FROM dbms_xplan.display();
 
Plan hash value: 3184888728
 
--------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)|
--------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |   239 | 13384 |    61   (0)| 
|*  1 |  TABLE ACCESS FULL| ADDRESSES |   239 | 13384 |    61   (0)| 
--------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(UPPER("state")='CA')
 
SQL Analysis Report (identified by operation id/Query Block Name/Object Alias):
-----------------------------------------------------------------
 
   1 -  SEL$1 / "ADDRESSES"@"SEL$1" 
-  The following columns have predicates which preclude their use as keys in index range scan. Consider rewriting the predicates. "STATE"

Again, Nigel Bayliss has blogged about this in more detail on the official Optimizer blog.

Note that SQL Monitor active reports also contain SQL Analysis reports. You can find them under the new SQL Analysis tab.

New Optimizer Dictionary Views

Along with enhancements to the execution plan in 23c, we added new data dictionary views to help identify what happens when we parse and execute a SQL statement.

Apps and users frequently change the value for one or more parameters that impact the Optimizer at a session level. As a DBA or performance engineer, you are often unaware of these logon triggers or ALTER SESSION commands. The first hint that the environment may have changed is when you see a plan change.

To check if a change in the optimizer environment caused the plan change, you can check the column OPTIMIZER_ENV_HASH_VALUE in V$SQL, V$SQLAREA, and DBA_HIST_SQLSTAT. However, this column doesn’t tell you what has changed.

Starting in 23c, you can query a new dictionary view, DBA_HIST_OPTIMIZER_ENV_DETAILS, to find out exactly what has changed in the optimizer environment when we parsed and executed a SQL statement.

Using the optimizer_env_hash_value for the original and new plan, you can query DBA_HIST_OPTIMIZER_ENV_DETAILS and get a list of the parameter settings for each scenario, which you can then compare to find the difference.

SELECT DISTINCT optimizer_env_hash_value
FROM  dba_hist_sqlstat
WHERE sql_id = 'bsvavk15n7cra'
ORDER BY 1;

OPTIMIZER_ENV_HASH_VALUE                                                        
------------------------                                                        
              1309615723                                                        
              2369923737 

SELECT name, value
FROM dba_hist_optimizer_env_details
WHERE optimizer_env_hash_value = 1309615723
ORDER BY 1;

NAME                                               VALUE                        
-------------------------------------------------- -------------------------    
BlockChain_ledger_infrastructure                   0                            
CLI_internal_cursor                                0                            
PMO_altidx_rebuild                                 0                            
_adaptive_window_consolidator_enabled              true                         
_add_stale_mv_to_dependency_list                   true  
   .
   .

You can get more information on how to leverage the DBA_HIST_OPTIMIZER_ENV_DETAILS view in MOS Note 2953121.1, “Examining the Optimizer Environment within Which a SQL Statement was Parsed in AWR”.
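
Rather than comparing the two parameter lists by eye, you could pivot them side by side and keep only the rows that differ. The query below is a rough sketch using the two hash values from the example above. It assumes VALUE is a character column (as the sample output suggests), and '<none>' is just a stand-in I chose for a parameter that is missing from one of the environments.

```sql
-- Show only the optimizer parameters whose value differs
-- between the two optimizer environments.
SELECT name,
       MAX(CASE WHEN optimizer_env_hash_value = 1309615723 THEN value END) AS original_value,
       MAX(CASE WHEN optimizer_env_hash_value = 2369923737 THEN value END) AS new_value
FROM   dba_hist_optimizer_env_details
WHERE  optimizer_env_hash_value IN (1309615723, 2369923737)
GROUP BY name
HAVING NVL(MAX(CASE WHEN optimizer_env_hash_value = 1309615723 THEN value END), '<none>')
    <> NVL(MAX(CASE WHEN optimizer_env_hash_value = 2369923737 THEN value END), '<none>')
ORDER BY name;
```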

How to use DBMS_STATS DIFF_TABLE_STATS functions

In 11g, Oracle introduced the DBMS_STATS.DIFF_TABLE_STATS functions to help you compare two sets of statistics for a table along with all its dependent objects (indexes, columns, partitions).

There are three versions of this function depending on where the statistics being compared are located:

  • DBMS_STATS.DIFF_TABLE_STATS_IN_HISTORY (compares statistics for a table from two timestamps in the past)
  • DBMS_STATS.DIFF_TABLE_STATS_IN_PENDING (compares pending statistics and statistics as of a timestamp or statistics from the data dictionary)
  • DBMS_STATS.DIFF_TABLE_STATS_IN_STATTAB (compares statistics from a user statistics table and the data dictionary, from two different user statistics tables, or a single user statistics table using two different STATSIDs)

The functions return a report that has three sections:

  1. Basic table statistics
    The report compares the basic table statistics (number of rows, blocks, etc.).
  2. Column statistics
    The second section of the report examines column statistics, including histograms.
  3. Index Statistics
    The final section of the report covers differences in index statistics.

Statistics will only be displayed in the report if the difference in the statistics exceeds a certain threshold (%). The threshold can be specified as an argument to the functions (PCTTHRESHOLD); the default value is 10%. The statistics corresponding to the first source, typically the current table stats in the data dictionary, will be used to compute the differential percentage.

The functions also return the MAXDIFFPCT (a number) along with the report. This is the maximum percentage difference between the statistics. These differences can come from the table, column, or index statistics.

Let’s look at an example.
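
As a minimal sketch of what such a comparison might look like, the query below compares the current statistics for a table with the statistics from a week ago. The owner and table name (MARIA.EMPLOYEES) and the seven-day window are assumptions chosen for illustration, and the statistics history must still be retained for the older timestamp.

```sql
-- Compare current statistics with those from 7 days ago,
-- reporting only differences greater than 10%.
SELECT report, maxdiffpct
FROM   TABLE(DBMS_STATS.DIFF_TABLE_STATS_IN_HISTORY(
                 ownname      => 'MARIA',
                 tabname      => 'EMPLOYEES',
                 time1        => SYSTIMESTAMP,
                 time2        => SYSTIMESTAMP - INTERVAL '7' DAY,
                 pctthreshold => 10));
```

The REPORT column is a CLOB, so in SQL*Plus you will probably want to increase SET LONG to see the whole report.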

SQL Tuning: How to determine why the Optimizer is picking the wrong index

One of the most common SQL Tuning challenges you will encounter with enterprise applications is a SQL statement where the Optimizer picks the wrong index. As tempting as it is to brute-force the plan you want via an index hint, it’s always better to understand why the Optimizer made the decision and provide the necessary information and access structures to allow the Optimizer to select the plan you want by default.

In the short video below, I explain how the Optimizer costs each of the index accesses available to it and provide you with a simple set of steps to help you identify these types of problems and guidance on how to create indexes so the Optimizer will automatically select them.

This post is part two of a series of blog posts on SQL Tuning. In part one, I shared some simple steps to help you tune a SQL Statement using the wrong Join Type.

SQL Tuning: How to tune a SQL Statement with the wrong Join Type

When it comes to SQL Tuning, I don’t typically recommend folks add one-off hints or look for magical underscore parameters to help improve their query performance.

Instead, I like to share some simple steps to

  1. Accurately characterise the problem
  2. Determine and apply a solution
  3. Measure the effectiveness of that solution

But instead of talking about my approach, I thought it would be more beneficial to show you how I do it. So in the video below, you will see the exact steps I used to determine what was causing a Sales Report to run slowly and the two alternative approaches you can take to resolve similar problems.

SQL Tuning Tips You Can’t Do Without

Last week, I enjoyed presenting at AIOUG Sangam 20 on one of my favorite topics, SQL Tuning.

Often, we are led to believe you need a degree in wizardry to tune a sub-optimal SQL statement, but in reality, you usually just need to know where to look.

In the session, I look at four different SQL statements with sub-optimal plans and share details on where I look for information to help me understand why. Once I know the root cause of a problem, it’s easy to apply the appropriate solution.

Those who couldn’t make the session in person can download the presentation file from here.

Explain the Explain Plan: Cardinality Estimates

In last week’s post, I began a series on how to read and interpret Oracle execution plans by explaining what an execution plan is and how to generate one. This week I’m going to tackle the most important piece of information the Optimizer shares with you via the execution plan: its cardinality estimates.

What is a Cardinality Estimate?

A cardinality estimate is the estimated number of rows the optimizer believes will be returned by a specific operation in the execution plan. The Optimizer determines the cardinality for each operation based on a complex set of formulas that use table and column level statistics as input (or the statistics derived by dynamic sampling). It’s considered the most important aspect of an execution plan because it strongly influences all of the other decisions the optimizer makes.

In part 4 of our series, I share some of the formulas used by the optimizer to estimate cardinalities, as well as showing you how to identify cardinalities in a plan. I also demonstrate multiple ways to determine if the cardinality estimates are accurate.
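
One common way to check the estimates, sketched here with the employees table used earlier on this page, is to run the statement with the GATHER_PLAN_STATISTICS hint and then compare the estimated rows (E-Rows) with the actual rows (A-Rows) for each operation. The query itself is only an illustration, so substitute the statement you are investigating.

```sql
-- Collect row-source execution statistics for this statement
SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*)
FROM   employees e
WHERE  e.empno < 7700;

-- Display estimated (E-Rows) and actual (A-Rows) cardinalities side by side
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(format => 'ALLSTATS LAST'));
```

A large gap between E-Rows and A-Rows on an operation is usually the first place to look when a plan goes wrong.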

What can cause a Cardinality Misestimate and how do I fix it?

Several factors can lead to incorrect cardinality estimates even when the basic table and column statistics are up to date. In part 5 of our series, I explain the leading causes of cardinality misestimates and how you can address them.

Next week’s instalment will be all about the different access methods available to the Optimizer and what you can do to encourage the optimizer to select the access method you want!

Don’t forget this series also covers how to read an explain plan as well as the different join methods and join orders.

Don’t forget more information on the Oracle Optimizer can always be found on the Optimizer blog.

How to read a Parallel Execution Plan in Oracle

The volume of data being stored in databases has grown exponentially in recent years. So too has the need to rapidly generate value or business insights from that data.

Parallel execution is the key to processing large volumes of diverse data quickly, as it subdivides complex tasks into a number of small tasks allowing multiple processes to accomplish a single complex task.

However, the use of parallelism can complicate the execution plan displayed. Oracle displays not only the operations needed to complete the SQL statement in the plan but also all of the communication steps between the parallel server processes.

So, how should you go about interpreting a parallel execution plan?

In the video below, I give you a step by step guide on how to read parallel plans and what additional information you can glean from them!

How to setup VNCServer on an OCI VM?

Now that we are all working from home, I’ve noticed that my Mac laptop is severely overloaded when I do live demos during webinars. After all, it’s running my camera, Zoom, PowerPoint, my Java app, and monitoring tools.

So, I decided it was time to move my demo environment to the Oracle Cloud, where I quickly provisioned a 2-OCPU VM running Linux following the instructions in my previous blog post.


Once I had my VM up and running, I wanted a proper desktop experience, so I needed VNC.

After a quick Google search, I found the video below, which provides a very easy-to-follow, step-by-step guide to installing and configuring TigerVNC VNCServer on OCI infrastructure. I followed all of these steps except for the final stage, where they describe adding the VNCServer to your firewall.

I’m married to a security expert, who strongly advised against this approach. He told me it would be far more secure to use an SSH tunnel instead of opening the firewall for the VNC port.

Below the video is the full set of commands I used in my setup, including how to establish the SSH tunnel, so you can quickly cut and paste them.


How to Provision a VM in the Oracle Cloud?

This blog post outlines the 10 simple steps necessary to provision, and connect to, a VM on Oracle Cloud Infrastructure (OCI).

          1. Confirm you have an SSH Public Key on your laptop or localhost.
            $ cd ~/.ssh
            # check whether you already have a key
            $ ls
            id_rsa    id_rsa.pub    known_hosts

            You’re looking for a file named either id_dsa or id_rsa and a matching file with a .pub extension. The .pub file is your public key, and the other file is the corresponding private key. If you don’t have these files or you don’t remember your passphrase, you will need to complete the steps outlined here.
            NOTE: You can’t move on unless you have your SSH Public key.
          2. Connect to the OCI console to begin the provisioning process.
            From the hamburger menu in the upper left-hand corner, select the Compute menu item followed by the Instances option.
          3. On the Instances page, click the Create Instance button.
          4. Specify a unique name for your instance and accept the default Oracle Linux image.
          5. Scroll down and click on the Change Shape button.
          6. To run my demos I typically use Swingbench, which needs a minimum of 2 OCPUs (4 OCPUs if you use the JSON workload). So, I select a Virtual Machine with Intel Skylake processors and 2 OCPUs. Then click the Select Shape button.
          7. I use the automatic defaults in the Configure Networking and Boot Volume sections and move on to the SSH Key section. Here I select Paste SSH Keys and cut and paste my public key from my .ssh directory into the window provided.
          8. Finally, click the Create button at the end of the page.
          9. Instantly you will see a new VM being provisioned for you. Once available, you can connect to the machine using the public IP address and the user OPC. You will find the IP address on the main Instance console page.
          10. Simply ssh into the machine from your laptop using the supplied OPC user.

            $ ssh opc@XXX.XXX.XX.XXX
            Enter passphrase for key '/Users/sqlmaria/.ssh/id_rsa':
            Last login: Tue Jul 28 16:39:35 2020 from XXXXX