At one point or another during your career, you will face a situation where the optimizer picks the wrong join method. As tempting as it is to brute-force the plan you want via a hint, it’s always better to understand why the Optimizer made a decision and provide the necessary information to allow the Optimizer to select the plan you want by default.
In this short video below, I explain how the Optimizer determines the join method based on the cardinality of the table on the left-hand side of the join. I also provide a simple set of steps to help you identify the common problems that can cause an incorrect join method to be used and guidance on how to supply the necessary information, so the Optimizer will automatically select the appropriate join method.
Adaptive joins, introduced in 12c, can help address some cases where the Optimizer selects the wrong join method but not in all cases. It’s always better to know how to save yourself rather than relying on the built-in safety net.
One of the most common SQL Tuning challenges you will encounter with enterprise applications is a SQL statement where the Optimizer picks the wrong index. As tempting as it is to brute-force the plan you want via an index hint, it’s always better to understand why the Optimizer made the decision and provide the necessary information and access structures to allow the Optimizer to select the plan you want by default.
In the short video below, I explain how the Optimizer costs each of the index accesses available to it and provide you with a simple set of steps to help you identify these types of problems and guidance on how to create indexes so the Optimizer will automatically select them.
This post is part two of a series of blog posts on SQL Tuning. In part one, I shared some simple steps to help you tune a SQL Statement using the wrong Join Type.
When it comes to SQL Tuning, I don’t typically recommend folks add one-off hints or look for magical underscore parameters to help improve their query performance.
Instead, I like to share some simple steps to
Accurately characterise the problem
Determine and apply a solution
Measure the effectiveness of that solution
But instead of talking about my approach, I thought it would be more beneficial to show you how I do it. So in the video below, you will see the exact steps I used to determine what was causing a Sales Report to run slowly and the two alternative approaches you can take to resolve similar problems.
The join order is the order in which the tables are joined together in a multi-table SQL statement. Ideally, a plan should start with the join that eliminates the most data to minimize the amount of data carried into the subsequent joins.
How is the Join Order Determined?
The join order is determined based on cost, which is strongly influenced by the cardinality estimates and available access paths. However, the Optimizer will also always adhere to some basic rules:
The Optimizer will always select a join that will produce at most 1 row as the initial join in the plan. For example, a join between two row sources that only have 1 row each. Like a primary key lookup or an index unique scan.
If a SQL statement uses an outer join, then the optimizer must obey the join order specified by the outer join. That is to say; the row preserving table must come after the other table in the predicate to ensure all of the additional rows that don’t satisfy the join condition can be added to the result set correctly. For example, with the Oracle syntax for outer joins, the table with the outer join operator must come after the other table in the predicate. Thus, in this example, the cites table must come before the countries table.
WHERE cities.country_id = countries.id(+);
For SQL statements that reference a database view, the Optimizer will attempt to do view merging, where the definition of the view is inserted into the rest of the SQL statement, and the entire expanded statement is optimized as a whole. However, there are a few cases where view merging isn’t possible. In these cases, the optimizer will join all of the tables in the view together before the resulting data set is joined to the tables outside the view.
When a subquery has been converted into an anti-join (NOT IN subquery) or semi-join (EXISTS subquery), the tables from the subquery must come after those tables in the outer query block to which they were connected or correlated. However, hash anti-joins and semi-joins can override this ordering condition under certain circumstances.
How to determine the Join Order in an execution plan
You can take several approaches to determine the Join Order in a plan, from looking at the indentation of the tables in the operation column to a depth-first search. To clearly explain how to identify the Join Order in an execution plan, I’ve created a short video demonstrating several approaches using real-world examples.
The primary focus of conventional data security technologies like passwords, firewalls, and data encryption is to keep criminals out of your company and your data stores.
But what protects your data, especially your essential asset (contracts, property titles, account statements, etc.), from being modified or even deleted by folks who gain access to your systems legitimately or illegitimately (hackers)?
Crypto-secure Data Management
This is where Blockchain can help. Layering Blockchain technologies on top of conventional data security features provide an extra level of protection that prevents illicit modifications or deletes of data.
What is Blockchain?
When we think of Blockchain, many of us instantly think of decentralized peer to peer apps that only permit consensus-based data changes. However, adopting these apps requires new development methodologies, speciality data stores and potentially new business practices, which is complicated and expensive!
But if we take a closer look at Blockchain technologies, we see four critical components; immutability, cryptographic digests, cryptographic signatures, and distributed systems. Each part works to protect against a different aspect of illicit data changes performed using legitimate user credentials or by hackers.
Integrating these Blockchain technologies into the Oracle Database brings the critical security benefits of Blockchain to mainstream applications with minimal or no changes required. Providing the full functionality of the world’s leading database on crypto-protected data.
In the video below, Juan Loaiza explains how Oracle implemented Blockchain technologies in the Oracle Database and how they can be used to protect your essential business data. I’ve also included a brief description of these features under the video.
How do Blockchain technologies work in the Oracle Database?
To protect against illicit data changes made by rogue insiders or malicious actors using insiders’ credentials, Oracle has introduced Immutable tables (insert-only tables) in Oracle Database 21c (21.3).
With an Immutable table, it is possible to insert new data, but existing data cannot be changed or deleted by anyone using the database, even the database administrators (SYSDBA). It is also impossible to change an immutable table’s definition or convert it to an updatable table. However, an Immutable table appears like any other table in the database from an application’s point of view. It can store both relational data and JSON documents, and it can be indexed and partitioned or used as the basis of a view.
To protect against illicit changes made by hackers, Oracle has introduced Blockchain tables. Blockchain tables are immutable tables that organize rows into several chains. Each row, except the first row in the chain, is chained to the previous row via a cryptographic digest or hash. The hash is automatically calculated on insert based on that row’s data and the hash value of the previous row in the chain. Timestamps are also recorded for each row on insertion.
Any modification to data in a Blockchain table breaks the cryptographic chain because the hash value of the row will change. You can verify the contents of a blockchain table have not been modified since they were inserted using the DBMS_BLOCKCHAIN_TABLE.VERIFY_ROWS procedure.
Even with Immutable or Blockchain tables, data can be falsely inserted in an end user’s name by someone using stolen credentials. To address this vulnerability, Oracle allows end-users to cryptographically sign the data they insert using their private key that is never passed to the database.
Each end-user registers a digital certificate containing their public key with this database. This digital certificate allows the database to validate the end-users signature when new data is inserted. Even if a hacker manages to steal a valid set of credentials without the private key, the data insert signature won’t match and will therefore not be accepted.
It’s also possible for end-users to ensure the database has received their changes by requesting Oracle countersign the newly inserted data. Oracle returns a crypto-receipt to the user, ensuring nothing on the mid-tier can filter specific data to prevent it from being recorded.
Distributing Cryptographic Digest
Even with cryptographically chained rows, sophisticated cyber-criminals or authorities could illicitly change data via a large-scale cover-up, where the entire database is replaced. To detect such a cover-up, Oracle enables schema owners to sign and distribute the cryptographic digest for a blockchain table periodically. Remember, the digest can’t be used to infer the data in the table, but authorized users can use it to validate the chain and confirm their newly inserted data is present. The crypto-digest can be posted to an independent public store or blockchain, like Ethereum or sent out by email or made available via a REST API.
A cover-up can easily be detected by comparing the previously published digests to the current table content. Also, distributing the publicly across multiple independent services prevents an authority or cyber-attacker from deleting all the separate copies.
Getting Started With Blockchain
Both Immutable and Blockchain tables are free features of the Oracle Database. No additional licenses or software is needed to take advantage of these new table types, which are completely transparent to all new and existing applications.
Also, note Oracle has backported Immutable tables and Blockchain tables to Oracle Database 19c (19.11 and 19.10, respectively). Please check My Oracle Support for more details before attempting to use Blockchain tables in 19.10.
Continuing my blog series on reading and interpreting Oracle execution plans, this week’s post covers the different Join Methods and types available to the Optimizer.
What is an Oracle Join Method?
Join Methods are the techniques used by the Oracle Optimizer to join data coming from two data producing operators in an execution plan. You can find the individual Join Methods chosen by the Optimizer in the Operations column of an Execution table.
How many Join Methods does the Optimizer have to choose from?
The Oracles Optimizer supports three join methods; Nested Loops, Hash Join and Sort Merge Join.
Nested Loops Join
Nested loops joins are useful when small subsets
of data are being joined and if there is an efficient way of accessing the second table (for example an index lookup).
For every row in the first table (the outer table), Oracle accesses all the rows in the second table (the inner table) looking for a match. You can think of it as a set of embedded FOR loops.
NOTE: In Oracle Database 11g the internal implementation for nested loop joins changed to reduce overall latency for physical I/O. You may see two NESTED LOOPS operators in the plan’s operations column, where you previously only saw one on earlier versions of Oracle. You can find more details on why there are two operators in the video below.
Hash joins are used for joining large data sets. The Optimizer uses the smaller of the two tables or data sources to build a hash table, based on the join key, in memory. It then scans the larger table and performs the same hashing algorithm on the join column(s). It then probes the previously built hash table for each value and if they match, it returns a row.
Sort Merge Joins
Sort Merge joins are useful when the join condition between two tables is an in-equality condition such as, <, <=, >, or >=. Sort merge joins can perform better than nested loop joins for large data sets. The join consists of two steps:
Sort join operation: Both the inputs are sorted on the join key.
Merge join operation: The sorted lists are merged together.
A Sort Merge join is more likely to be chosen if there is an index on one of the tables that will eliminate one of the sorts.
When will the Optimizer choose each of these methods, and what can I do to influence that decision?
To clearly explain how each of the Join Methods works and when they will be chosen, I’ve created the short video below.
What if I don’t get the Join Method I want?
The leading cause of getting the wrong Join Method is typically a cardinality misestimate on the table on the left-hand side of the join. That’s why Oracle introduced Adaptive Plans and more specifically Adaptive Join Methods in Oracle Database 12c to help automatically correct itself if the wrong Join Method is chosen.
How Adaptive Joins work
During the initial execution of a plan, if Oracle detects that the Optimizer’s cardinality estimates were wrong, the join method can be changed “on the fly” to a better option.
It’s possible to change an adaptive plan “on the fly” because it consists of a default plan, which is the plan that the Optimizer picks based on the current statistics and an alternative method for various portions of the plan. For example, the default plan could be a Nested Loops plan, and the alternative(subplan) would be a Hash join.
A new operated called a Statistics Collector is inserted into the plan, right above the table on the left-hand side of the join, which will buffer the rows coming out of the table until we can get a sense of how many rows will be returned. Once we know the number of rows returned or the number is above a certain threshold, the Optimizer will choose the final join method. After the initial execution, the Statistics Collector and the subplan components not chosen become no-ops, and the impact on execution plan performance is nill.
At the end of last year, I began a blog series on reading and interpreting Oracle execution plans. In this week’s post, I will tackle the aspect of execution plans that I get the most questions about, Access Methods.
What are Oracle Access Paths or Methods?
Access Methods or Access Paths are the techniques used by Oracle to access the data needed to answer a query. Access Methods can be divided into two categories; data accessed via a table scan or index access. You can find the individual Access Methods chosen by the Optimizer in the Operations column of an Execution table.
How Many Access Paths are available to the Optimizer?
Oracle supports nine different Access Methods today, as shown in the image below.
When will the Optimizer choose each of these methods, and what can I do to influence that decision?
To clearly explain how each of the Access Methods works and when it will be chosen, I’ve created a short video.
What if I don’t get the Access Method I want?
If the Access Method you see in an execution plan is not what you expect, check the cardinality estimates for that object are correct, and the join order allows the access method you desire. Remember, Optimizer transformations (the rewriting of your query to open up additional access methods) can also greatly impact the Access Method.
2020 has been a challenging year for all of us. I hope wherever you are, you have an opportunity to relax and recharge as we head into the holiday season and I look forward to sharing more technical posts and videos in the New Year!
It looks like the holidays have come early this year for those of you with an Oracle Cloud account because starting today you can now create an Oracle 21c database!
That’s right, Oracle Database 21c is now production in the Oracle Cloud on the Oracle Cloud Database Service and the Autonomous Database Free Tier Service in Ashburn (IAD), Phoenix (PHX), Frankfurt (FRA) and London (LHR) regions. General availability of Oracle Database 21c for on-prem platforms (including Exadata, Linux and Windows) will be in 2021.