Oracle Database SQL Tuning Guide, 18c
E84296-06
August 2021
Contributors: Hermann Baer, Bjorn Bolltoft, Ali Cakmak, Sunil Chakkappen, Immanuel Chan, Deba
Chatterjee, Chris Chiappa, Dinesh Das, Kurt Engeleiter, Leonidas Galanis, William Endress, Marcus Fallen,
Bruce Golbus, Katsumi Inoue, Praveen Kumar Tupati Jaganath, Mark Jefferys, Shantanu Joshi, Adam
Kociubes, Keith Laker, Allison Lee, Sue Lee, Cheng Li, David McDermid, Colin McGregor, Ajit Mylavarapu,
Ted Persky, Lei Sheng, Ekrem Soylemez, Hong Su, Murali Thiyagarajah, Randy Urbano, Sahil Vazirani,
Bharath Venkatakrishnan, Hailing Yu, John Zimmerman
This software and related documentation are provided under a license agreement containing restrictions on
use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your
license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license,
transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse
engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is
prohibited.
The information contained herein is subject to change without notice and is not warranted to be error-free. If
you find any errors, please report them to us in writing.
If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on
behalf of the U.S. Government, then the following notice is applicable:
U.S. GOVERNMENT END USERS: Oracle programs (including any operating system, integrated software,
any programs embedded, installed or activated on delivered hardware, and modifications of such programs)
and Oracle computer documentation or other Oracle data delivered to or accessed by U.S. Government end
users are "commercial computer software" or "commercial computer software documentation" pursuant to the
applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, the use,
reproduction, duplication, release, display, disclosure, modification, preparation of derivative works, and/or
adaptation of i) Oracle programs (including any operating system, integrated software, any programs
embedded, installed or activated on delivered hardware, and modifications of such programs), ii) Oracle
computer documentation and/or iii) other Oracle data, is subject to the rights and limitations specified in the
license contained in the applicable contract. The terms governing the U.S. Government’s use of Oracle cloud
services are defined by the applicable contract for such services. No other rights are granted to the U.S.
Government.
This software or hardware is developed for general use in a variety of information management applications.
It is not developed or intended for use in any inherently dangerous applications, including applications that
may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you
shall be responsible to take all appropriate fail-safe, backup, redundancy, and other measures to ensure its
safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this
software or hardware in dangerous applications.
Oracle, Java, and MySQL are registered trademarks of Oracle and/or its affiliates. Other names may be
trademarks of their respective owners.
Intel and Intel Inside are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are
used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Epyc,
and the AMD logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered
trademark of The Open Group.
This software or hardware and documentation may provide access to or information about content, products,
and services from third parties. Oracle Corporation and its affiliates are not responsible for and expressly
disclaim all warranties of any kind with respect to third-party content, products, and services unless otherwise
set forth in an applicable agreement between you and Oracle. Oracle Corporation and its affiliates will not be
responsible for any loss, costs, or damages incurred due to your access to or use of third-party content,
products, or services, except as set forth in an applicable agreement between you and Oracle.
Contents
Preface
Audience xxv
Documentation Accessibility xxv
Related Documents xxv
Conventions xxvi
2.2 Guidelines for Deploying Your Application 2-3
2.2.1 Guideline for Deploying in a Test Environment 2-3
2.2.2 Guidelines for Application Rollout 2-4
3 SQL Processing
3.1 About SQL Processing 3-1
3.1.1 SQL Parsing 3-2
3.1.1.1 Syntax Check 3-2
3.1.1.2 Semantic Check 3-2
3.1.1.3 Shared Pool Check 3-3
3.1.2 SQL Optimization 3-5
3.1.3 SQL Row Source Generation 3-5
3.1.4 SQL Execution 3-7
3.2 How Oracle Database Processes DML 3-8
3.2.1 How Row Sets Are Fetched 3-8
3.2.2 Read Consistency 3-9
3.2.3 Data Changes 3-9
3.3 How Oracle Database Processes DDL 3-9
4.4.1.1 About Adaptive Query Plans 4-13
4.4.1.2 Purpose of Adaptive Query Plans 4-13
4.4.1.3 How Adaptive Query Plans Work 4-14
4.4.1.4 When Adaptive Query Plans Are Enabled 4-20
4.4.2 Adaptive Statistics 4-21
4.4.2.1 Dynamic Statistics 4-21
4.4.2.2 Automatic Reoptimization 4-21
4.4.2.3 SQL Plan Directives 4-24
4.4.2.4 When Adaptive Statistics Are Enabled 4-24
4.5 About Approximate Query Processing 4-25
4.5.1 Approximate Query Initialization Parameters 4-26
4.5.2 Approximate Query SQL Functions 4-26
4.6 About SQL Plan Management 4-28
4.7 About the Expression Statistics Store (ESS) 4-29
5 Query Transformations
5.1 OR Expansion 5-1
5.2 View Merging 5-3
5.2.1 Query Blocks in View Merging 5-4
5.2.2 Simple View Merging 5-4
5.2.3 Complex View Merging 5-7
5.3 Predicate Pushing 5-9
5.4 Subquery Unnesting 5-11
5.5 Query Rewrite with Materialized Views 5-11
5.5.1 About Query Rewrite and the Optimizer 5-12
5.5.2 About Initialization Parameters for Query Rewrite 5-12
5.5.3 About the Accuracy of Query Rewrite 5-13
5.5.4 Example of Query Rewrite 5-14
5.6 Star Transformation 5-15
5.6.1 About Star Schemas 5-15
5.6.2 Purpose of Star Transformations 5-16
5.6.3 How Star Transformation Works 5-16
5.6.4 Controls for Star Transformation 5-16
5.6.5 Star Transformation: Scenario 5-17
5.6.6 Temporary Table Transformation: Scenario 5-20
5.7 In-Memory Aggregation (VECTOR GROUP BY) 5-22
5.8 Cursor-Duration Temporary Tables 5-22
5.8.1 Purpose of Cursor-Duration Temporary Tables 5-22
5.8.2 How Cursor-Duration Temporary Tables Work 5-23
5.8.3 Cursor-Duration Temporary Tables: Example 5-23
5.9 Table Expansion 5-25
5.9.1 Purpose of Table Expansion 5-25
5.9.2 How Table Expansion Works 5-25
5.9.3 Table Expansion: Scenario 5-26
5.9.4 Table Expansion and Star Transformation: Scenario 5-29
5.10 Join Factorization 5-31
5.10.1 Purpose of Join Factorization 5-31
5.10.2 How Join Factorization Works 5-31
5.10.3 Factorization and Join Orders: Scenario 5-32
5.10.4 Factorization of Outer Joins: Scenario 5-33
7.2.2.2 Viewing Parallel Queries with EXPLAIN PLAN: Example 7-7
7.2.3 Viewing Bitmap Indexes with EXPLAIN PLAN 7-8
7.2.4 Viewing Result Cache with EXPLAIN PLAN 7-9
7.2.5 Viewing Partitioned Objects with EXPLAIN PLAN 7-10
7.2.5.1 Displaying Range and Hash Partitioning with EXPLAIN PLAN: Examples 7-10
7.2.5.2 Pruning Information with Composite Partitioned Objects: Examples 7-12
7.2.5.3 Examples of Partial Partition-Wise Joins 7-14
7.2.5.4 Example of Full Partition-Wise Join 7-16
7.2.5.5 Examples of INLIST ITERATOR and EXPLAIN PLAN 7-17
7.2.5.6 Example of Domain Indexes and EXPLAIN PLAN 7-18
7.2.6 PLAN_TABLE Columns 7-18
7.3 Execution Plan Reference 7-29
7.3.1 Execution Plan Views 7-29
7.3.2 PLAN_TABLE Columns 7-30
7.3.3 DBMS_XPLAN Display Functions 7-39
8.3 B-Tree Index Access Paths 8-12
8.3.1 About B-Tree Index Access 8-13
8.3.1.1 B-Tree Index Structure 8-13
8.3.1.2 How Index Storage Affects Index Scans 8-14
8.3.1.3 Unique and Nonunique Indexes 8-15
8.3.1.4 B-Tree Indexes and Nulls 8-15
8.3.2 Index Unique Scans 8-17
8.3.2.1 When the Optimizer Considers Index Unique Scans 8-17
8.3.2.2 How Index Unique Scans Work 8-18
8.3.2.3 Index Unique Scans: Example 8-19
8.3.3 Index Range Scans 8-20
8.3.3.1 When the Optimizer Considers Index Range Scans 8-20
8.3.3.2 How Index Range Scans Work 8-21
8.3.3.3 Index Range Scan: Example 8-22
8.3.3.4 Index Range Scan Descending: Example 8-23
8.3.4 Index Full Scans 8-24
8.3.4.1 When the Optimizer Considers Index Full Scans 8-24
8.3.4.2 How Index Full Scans Work 8-24
8.3.4.3 Index Full Scans: Example 8-25
8.3.5 Index Fast Full Scans 8-26
8.3.5.1 When the Optimizer Considers Index Fast Full Scans 8-26
8.3.5.2 How Index Fast Full Scans Work 8-26
8.3.5.3 Index Fast Full Scans: Example 8-26
8.3.6 Index Skip Scans 8-27
8.3.6.1 When the Optimizer Considers Index Skip Scans 8-27
8.3.6.2 How Index Skip Scans Work 8-27
8.3.6.3 Index Skip Scans: Example 8-28
8.3.7 Index Join Scans 8-29
8.3.7.1 When the Optimizer Considers Index Join Scans 8-29
8.3.7.2 How Index Join Scans Work 8-30
8.3.7.3 Index Join Scans: Example 8-30
8.4 Bitmap Index Access Paths 8-31
8.4.1 About Bitmap Index Access 8-31
8.4.1.1 Differences Between Bitmap and B-Tree Indexes 8-32
8.4.1.2 Purpose of Bitmap Indexes 8-33
8.4.1.3 Bitmaps and Rowids 8-34
8.4.1.4 Bitmap Join Indexes 8-35
8.4.1.5 Bitmap Storage 8-36
8.4.2 Bitmap Conversion to Rowid 8-36
8.4.2.1 When the Optimizer Chooses Bitmap Conversion to Rowid 8-37
8.4.2.2 How Bitmap Conversion to Rowid Works 8-37
8.4.2.3 Bitmap Conversion to Rowid: Example 8-37
8.4.3 Bitmap Index Single Value 8-37
8.4.3.1 When the Optimizer Considers Bitmap Index Single Value 8-38
8.4.3.2 How Bitmap Index Single Value Works 8-38
8.4.3.3 Bitmap Index Single Value: Example 8-38
8.4.4 Bitmap Index Range Scans 8-39
8.4.4.1 When the Optimizer Considers Bitmap Index Range Scans 8-39
8.4.4.2 How Bitmap Index Range Scans Work 8-39
8.4.4.3 Bitmap Index Range Scans: Example 8-40
8.4.5 Bitmap Merge 8-40
8.4.5.1 When the Optimizer Considers Bitmap Merge 8-41
8.4.5.2 How Bitmap Merge Works 8-41
8.4.5.3 Bitmap Merge: Example 8-41
8.5 Table Cluster Access Paths 8-42
8.5.1 Cluster Scans 8-42
8.5.1.1 When the Optimizer Considers Cluster Scans 8-42
8.5.1.2 How a Cluster Scan Works 8-43
8.5.1.3 Cluster Scans: Example 8-43
8.5.2 Hash Scans 8-44
8.5.2.1 When the Optimizer Considers a Hash Scan 8-44
8.5.2.2 How a Hash Scan Works 8-44
8.5.2.3 Hash Scans: Example 8-45
9 Joins
9.1 About Joins 9-1
9.1.1 Join Trees 9-1
9.1.2 How the Optimizer Executes Join Statements 9-3
9.1.3 How the Optimizer Chooses Execution Plans for Joins 9-4
9.2 Join Methods 9-5
9.2.1 Nested Loops Joins 9-6
9.2.1.1 When the Optimizer Considers Nested Loops Joins 9-6
9.2.1.2 How Nested Loops Joins Work 9-7
9.2.1.3 Nested Nested Loops 9-7
9.2.1.4 Current Implementation for Nested Loops Joins 9-10
9.2.1.5 Original Implementation for Nested Loops Joins 9-13
9.2.1.6 Nested Loops Controls 9-14
9.2.2 Hash Joins 9-16
9.2.2.1 When the Optimizer Considers Hash Joins 9-16
9.2.2.2 How Hash Joins Work 9-17
9.2.2.3 How Hash Joins Work When the Hash Table Does Not Fit in the PGA 9-19
9.2.2.4 Hash Join Controls 9-20
9.2.3 Sort Merge Joins 9-20
9.2.3.1 When the Optimizer Considers Sort Merge Joins 9-21
9.2.3.2 How Sort Merge Joins Work 9-21
9.2.3.3 Sort Merge Join Controls 9-25
9.3 Join Types 9-26
9.3.1 Inner Joins 9-26
9.3.1.1 Equijoins 9-26
9.3.1.2 Nonequijoins 9-26
9.3.1.3 Band Joins 9-27
9.3.2 Outer Joins 9-31
9.3.2.1 Nested Loops Outer Joins 9-31
9.3.2.2 Hash Join Outer Joins 9-31
9.3.2.3 Sort Merge Outer Joins 9-34
9.3.2.4 Full Outer Joins 9-34
9.3.2.5 Multiple Tables on the Left of an Outer Join 9-35
9.3.3 Semijoins 9-36
9.3.3.1 When the Optimizer Considers Semijoins 9-36
9.3.3.2 How Semijoins Work 9-36
9.3.4 Antijoins 9-38
9.3.4.1 When the Optimizer Considers Antijoins 9-38
9.3.4.2 How Antijoins Work 9-39
9.3.4.3 How Antijoins Handle Nulls 9-41
9.3.5 Cartesian Joins 9-44
9.3.5.1 When the Optimizer Considers Cartesian Joins 9-44
9.3.5.2 How Cartesian Joins Work 9-44
9.3.5.3 Cartesian Join Controls 9-45
9.4 Join Optimizations 9-46
9.4.1 Bloom Filters 9-46
9.4.1.1 Purpose of Bloom Filters 9-46
9.4.1.2 How Bloom Filters Work 9-47
9.4.1.3 Bloom Filter Controls 9-47
9.4.1.4 Bloom Filter Metadata 9-48
9.4.1.5 Bloom Filters: Scenario 9-48
9.4.2 Partition-Wise Joins 9-50
9.4.2.1 Purpose of Partition-Wise Joins 9-50
9.4.2.2 How Partition-Wise Joins Work 9-51
9.4.3 In-Memory Join Groups 9-53
Part V Optimizer Statistics
11 Histograms
11.1 Purpose of Histograms 11-1
11.2 When Oracle Database Creates Histograms 11-1
11.3 How Oracle Database Chooses the Histogram Type 11-3
11.4 Cardinality Algorithms When Using Histograms 11-4
11.4.1 Endpoint Numbers and Values 11-4
11.4.2 Popular and Nonpopular Values 11-4
11.4.3 Bucket Compression 11-5
11.5 Frequency Histograms 11-6
11.5.1 Criteria For Frequency Histograms 11-6
11.5.2 Generating a Frequency Histogram 11-7
11.6 Top Frequency Histograms 11-10
11.6.1 Criteria For Top Frequency Histograms 11-10
11.6.2 Generating a Top Frequency Histogram 11-11
11.7 Height-Balanced Histograms (Legacy) 11-14
11.7.1 Criteria for Height-Balanced Histograms 11-14
11.7.2 Generating a Height-Balanced Histogram 11-15
11.8 Hybrid Histograms 11-18
11.8.1 How Endpoint Repeat Counts Work 11-18
11.8.2 Criteria for Hybrid Histograms 11-20
11.8.3 Generating a Hybrid Histogram 11-21
13 Gathering Optimizer Statistics
13.1 Configuring Automatic Optimizer Statistics Collection 13-1
13.1.1 About Automatic Optimizer Statistics Collection 13-1
13.1.2 Configuring Automatic Optimizer Statistics Collection Using Cloud Control 13-2
13.1.3 Configuring Automatic Optimizer Statistics Collection from the Command Line 13-4
13.2 Gathering Optimizer Statistics Manually 13-5
13.2.1 About Manual Statistics Collection with DBMS_STATS 13-6
13.2.2 Guidelines for Gathering Optimizer Statistics Manually 13-7
13.2.2.1 Guideline for Setting the Sample Size 13-7
13.2.2.2 Guideline for Gathering Statistics in Parallel 13-8
13.2.2.3 Guideline for Partitioned Objects 13-8
13.2.2.4 Guideline for Frequently Changing Objects 13-9
13.2.2.5 Guideline for External Tables 13-9
13.2.3 Determining When Optimizer Statistics Are Stale 13-9
13.2.4 Gathering Schema and Table Statistics 13-11
13.2.5 Gathering Statistics for Fixed Objects 13-11
13.2.6 Gathering Statistics for Volatile Tables Using Dynamic Statistics 13-12
13.2.7 Gathering Optimizer Statistics Concurrently 13-14
13.2.7.1 About Concurrent Statistics Gathering 13-14
13.2.7.2 Enabling Concurrent Statistics Gathering 13-16
13.2.7.3 Monitoring Statistics Gathering Operations 13-19
13.2.8 Gathering Incremental Statistics on Partitioned Objects 13-21
13.2.8.1 Purpose of Incremental Statistics 13-21
13.2.8.2 How DBMS_STATS Derives Global Statistics for Partitioned Tables 13-22
13.2.8.3 Gathering Statistics for a Partitioned Table: Basic Steps 13-25
13.2.8.4 Maintaining Incremental Statistics for Partition Maintenance Operations 13-28
13.2.8.5 Maintaining Incremental Statistics for Tables with Stale or Locked Partition Statistics 13-30
13.3 Gathering System Statistics Manually 13-32
13.3.1 About System Statistics 13-32
13.3.2 Guidelines for Gathering System Statistics 13-34
13.3.3 Gathering System Statistics with DBMS_STATS 13-34
13.3.3.1 About the GATHER_SYSTEM_STATS Procedure 13-34
13.3.3.2 Gathering Workload Statistics 13-36
13.3.3.3 Gathering Noworkload Statistics 13-40
13.3.4 Deleting System Statistics 13-42
13.4 Running Statistics Gathering Functions in Reporting Mode 13-42
14 Managing Extended Statistics
14.1 Managing Column Group Statistics 14-1
14.1.1 About Statistics on Column Groups 14-2
14.1.1.1 Why Column Group Statistics Are Needed: Example 14-2
14.1.1.2 Automatic and Manual Column Group Statistics 14-4
14.1.1.3 User Interface for Column Group Statistics 14-5
14.1.2 Detecting Useful Column Groups for a Specific Workload 14-6
14.1.3 Creating Column Groups Detected During Workload Monitoring 14-9
14.1.4 Creating and Gathering Statistics on Column Groups Manually 14-11
14.1.5 Displaying Column Group Information 14-12
14.1.6 Dropping a Column Group 14-13
14.2 Managing Expression Statistics 14-14
14.2.1 About Expression Statistics 14-14
14.2.1.1 When Expression Statistics Are Useful: Example 14-15
14.2.2 Creating Expression Statistics 14-15
14.2.3 Displaying Expression Statistics 14-16
14.2.4 Dropping Expression Statistics 14-17
16.2.2 Changing the Optimizer Statistics Retention Period 16-5
16.2.3 Purging Optimizer Statistics 16-6
16.3 Reporting on Past Statistics Gathering Operations 16-7
19.2.3 Choosing an Optimizer Goal 19-8
19.2.4 Controlling Adaptive Optimization 19-9
19.3 Influencing the Optimizer with Hints 19-11
19.3.1 About Optimizer Hints 19-11
19.3.1.1 Types of Hints 19-12
19.3.1.2 Scope of Hints 19-13
19.3.1.3 Guidelines for Hints 19-14
19.3.2 Guidelines for Join Order Hints 19-15
Part VII Monitoring and Tracing SQL
23 Performing Application Tracing
23.1 Overview of End-to-End Application Tracing 23-1
23.1.1 Purpose of End-to-End Application Tracing 23-1
23.1.2 End-to-End Application Tracing in a Multitenant Environment 23-2
23.1.3 Tools for End-to-End Application Tracing 23-3
23.1.3.1 Overview of the SQL Trace Facility 23-3
23.1.3.2 Overview of TKPROF 23-4
23.2 Enabling Statistics Gathering for End-to-End Tracing 23-4
23.2.1 Enabling Statistics Gathering for a Client ID 23-4
23.2.2 Enabling Statistics Gathering for Services, Modules, and Actions 23-5
23.3 Enabling End-to-End Application Tracing 23-6
23.3.1 Enabling Tracing for a Client Identifier 23-6
23.3.2 Enabling Tracing for a Service, Module, and Action 23-7
23.3.3 Enabling Tracing for a Session 23-8
23.3.4 Enabling Tracing for an Instance or Database 23-9
23.4 Generating Output Files Using SQL Trace and TKPROF 23-10
23.4.1 Step 1: Setting Initialization Parameters for Trace File Management 23-11
23.4.2 Step 2: Enabling the SQL Trace Facility 23-12
23.4.3 Step 3: Generating Output Files with TKPROF 23-13
23.4.4 Step 4: Storing SQL Trace Facility Statistics 23-14
23.4.4.1 Generating the TKPROF Output SQL Script 23-14
23.4.4.2 Editing the TKPROF Output SQL Script 23-15
23.4.4.3 Querying the Output Table 23-15
23.5 Guidelines for Interpreting TKPROF Output 23-17
23.5.1 Guideline for Interpreting the Resolution of Statistics 23-17
23.5.2 Guideline for Recursive SQL Statements 23-17
23.5.3 Guideline for Deciding Which Statements to Tune 23-17
23.5.4 Guidelines for Avoiding Traps in TKPROF Interpretation 23-18
23.5.4.1 Guideline for Avoiding the Argument Trap 23-18
23.5.4.2 Guideline for Avoiding the Read Consistency Trap 23-19
23.5.4.3 Guideline for Avoiding the Schema Trap 23-19
23.5.4.4 Guideline for Avoiding the Time Trap 23-20
23.6.1 Application Tracing Utilities 23-21
23.6.1.1 TRCSESS 23-21
23.6.1.1.1 Purpose 23-21
23.6.1.1.2 Guidelines 23-21
23.6.1.1.3 Syntax 23-21
23.6.1.1.4 Options 23-22
23.6.1.1.5 Examples 23-22
23.6.1.2 TKPROF 23-23
23.6.1.2.1 Purpose 23-23
23.6.1.2.2 Guidelines 23-23
23.6.1.2.3 Syntax 23-24
23.6.1.2.4 Options 23-24
23.6.1.2.5 Output 23-26
23.6.1.2.6 Examples 23-29
23.7.1 Views for Application Tracing 23-34
23.7.1.1 Views Relevant for Trace Statistics 23-34
23.7.1.2 Views Related to Enabling Tracing 23-34
25.1.3 SQL Tuning Advisor Operation 25-15
25.1.3.1 Automatic and On-Demand SQL Tuning 25-15
25.1.3.2 Local and Remote SQL Tuning 25-16
25.2 Managing the Automatic SQL Tuning Task 25-17
25.2.1 About the Automatic SQL Tuning Task 25-18
25.2.1.1 Purpose of Automatic SQL Tuning 25-18
25.2.1.2 Automatic SQL Tuning Concepts 25-18
25.2.1.3 Command-Line Interface to SQL Tuning Advisor 25-19
25.2.1.4 Basic Tasks for Automatic SQL Tuning 25-19
25.2.2 Enabling and Disabling the Automatic SQL Tuning Task 25-20
25.2.2.1 Enabling and Disabling the Automatic SQL Tuning Task Using Cloud Control 25-21
25.2.2.2 Enabling and Disabling the Automatic SQL Tuning Task from the Command Line 25-22
25.2.3 Configuring the Automatic SQL Tuning Task 25-23
25.2.3.1 Configuring the Automatic SQL Tuning Task Using Cloud Control 25-24
25.2.3.2 Configuring the Automatic SQL Tuning Task Using the Command Line 25-24
25.2.4 Viewing Automatic SQL Tuning Reports 25-26
25.2.4.1 Viewing Automatic SQL Tuning Reports Using the Command Line 25-27
25.3 Running SQL Tuning Advisor On Demand 25-30
25.3.1 About On-Demand SQL Tuning 25-30
25.3.1.1 Purpose of On-Demand SQL Tuning 25-30
25.3.1.2 User Interfaces for On-Demand SQL Tuning 25-30
25.3.1.3 Basic Tasks in On-Demand SQL Tuning 25-32
25.3.2 Creating a SQL Tuning Task 25-33
25.3.3 Configuring a SQL Tuning Task 25-35
25.3.4 Executing a SQL Tuning Task 25-37
25.3.5 Monitoring a SQL Tuning Task 25-38
25.3.6 Displaying the Results of a SQL Tuning Task 25-39
26.1.3.1 Accessing the SQL Access Advisor: Initial Options Page Using Cloud Control 26-7
26.1.3.2 Command-Line Interface to SQL Tuning Sets 26-8
26.2 Using SQL Access Advisor: Basic Tasks 26-8
26.2.1 Creating a SQL Tuning Set as Input for SQL Access Advisor 26-10
26.2.2 Populating a SQL Tuning Set with a User-Defined Workload 26-11
26.2.3 Creating and Configuring a SQL Access Advisor Task 26-14
26.2.4 Executing a SQL Access Advisor Task 26-15
26.2.5 Viewing SQL Access Advisor Task Results 26-16
26.2.6 Generating and Executing a Task Script 26-21
26.3 Performing a SQL Access Advisor Quick Tune 26-22
26.4 Using SQL Access Advisor: Advanced Tasks 26-23
26.4.1 Evaluating Existing Access Structures 26-23
26.4.2 Updating SQL Access Advisor Task Attributes 26-24
26.4.3 Creating and Using SQL Access Advisor Task Templates 26-25
26.4.4 Terminating SQL Access Advisor Task Execution 26-27
26.4.4.1 Interrupting SQL Access Advisor Tasks 26-27
26.4.4.2 Canceling SQL Access Advisor Tasks 26-28
26.4.5 Deleting SQL Access Advisor Tasks 26-29
26.4.6 Marking SQL Access Advisor Recommendations 26-30
26.4.7 Modifying SQL Access Advisor Recommendations 26-31
26.5 SQL Access Advisor Examples 26-32
26.6 SQL Access Advisor Reference 26-32
26.6.1 Action Attributes in the DBA_ADVISOR_ACTIONS View 26-32
26.6.2 Categories for SQL Access Advisor Task Parameters 26-34
26.6.3 SQL Access Advisor Constants 26-35
27.2.1 About SQL Profile Implementation 27-8
27.2.2 Implementing a SQL Profile 27-9
27.3 Listing SQL Profiles 27-9
27.4 Altering a SQL Profile 27-10
27.5 Dropping a SQL Profile 27-11
27.6 Transporting a SQL Profile 27-12
29.2.2 Managing the SPM Evolve Advisor Task 29-9
29.2.2.1 About the SPM Evolve Advisor Task 29-9
29.2.2.2 Enabling and Disabling the Automatic SPM Evolve Advisor Task 29-9
29.2.2.3 Configuring the Automatic SPM Evolve Advisor Task 29-10
29.3 Displaying Plans in a SQL Plan Baseline 29-13
29.4 Loading SQL Plan Baselines 29-14
29.4.1 About Loading SQL Plan Baselines 29-15
29.4.2 Loading Plans from AWR 29-16
29.4.3 Loading Plans from the Shared SQL Area 29-19
29.4.4 Loading Plans from a SQL Tuning Set 29-21
29.4.5 Loading Plans from a Staging Table 29-23
29.5 Evolving SQL Plan Baselines Manually 29-26
29.5.1 About the DBMS_SPM Evolve Functions 29-26
29.5.2 Managing an Evolve Task 29-28
29.6 Dropping SQL Plan Baselines 29-37
29.7 Managing the SQL Management Base 29-38
29.7.1 About Managing the SMB 29-38
29.7.2 Changing the Disk Space Limit for the SMB 29-39
29.7.3 Changing the Plan Retention Policy in the SMB 29-41
A.1.4.1 Guidelines for Choosing Keys for Composite Indexes A-4
A.1.4.2 Guidelines for Ordering Keys for Composite Indexes A-4
A.1.5 Guidelines for Writing SQL Statements That Use Indexes A-5
A.1.6 Guidelines for Writing SQL Statements That Avoid Using Indexes A-5
A.1.7 Guidelines for Avoiding Index Serialization on a Sequence-Generated Key A-5
A.1.8 Guidelines for Re-Creating Indexes A-7
A.1.9 Guidelines for Compacting Indexes A-7
A.1.10 Guidelines for Using Nonunique Indexes to Enforce Uniqueness A-8
A.1.11 Guidelines for Using Enabled Novalidated Constraints A-8
A.2 Guidelines for Using Function-Based Indexes for Performance A-9
A.3 Guidelines for Using Partitioned Indexes for Performance A-10
A.4 Guidelines for Using Index-Organized Tables for Performance A-11
A.5 Guidelines for Using Bitmap Indexes for Performance A-12
A.6 Guidelines for Using Bitmap Join Indexes for Performance A-12
A.7 Guidelines for Using Domain Indexes for Performance A-13
A.8 Guidelines for Using Table Clusters A-13
A.9 Guidelines for Using Hash Clusters for Performance A-14
Glossary
Index
Preface
This manual explains how to tune Oracle SQL.
This preface contains the following topics:
Audience
This document is intended for database administrators and application developers who
perform the following tasks:
• Generating and interpreting SQL execution plans
• Managing optimizer statistics
• Influencing the optimizer through initialization parameters or SQL hints
• Controlling cursor sharing for SQL statements
• Monitoring SQL execution
• Performing application tracing
• Managing SQL tuning sets
• Using SQL Tuning Advisor or SQL Access Advisor
• Managing SQL profiles
• Managing SQL baselines
Documentation Accessibility
For information about Oracle's commitment to accessibility, visit the Oracle Accessibility
Program website at http://www.oracle.com/pls/topic/lookup?ctx=acc&id=docacc.
Related Documents
This manual assumes that you are familiar with Oracle Database Concepts. The following
books are frequently referenced:
• Oracle Database Data Warehousing Guide
• Oracle Database VLDB and Partitioning Guide
Conventions
The following text conventions are used in this document:
Convention Meaning
boldface Boldface type indicates graphical user interface elements associated
with an action, or terms defined in text or the glossary.
italic Italic type indicates book titles, emphasis, or placeholder variables for
which you supply particular values.
monospace Monospace type indicates commands within a paragraph, URLs, code
in examples, text that appears on the screen, or text that you enter.
xxvi
Changes in This Release for Oracle Database
SQL Tuning Guide
This preface describes the most important changes in Oracle Database SQL Tuning Guide.
This preface contains the following topics:
Changes in Oracle Database Release 18c, Version 18.1

New Features
The following features are new in this release:
• Private temporary tables
Private temporary tables are temporary database objects that are automatically dropped
at the end of a transaction or a session. A private temporary table is stored in memory
and is visible only to the session that created it. Confining the scope of a temporary table
to a single session or transaction gives applications more coding flexibility and makes the
code easier to maintain.
See "Statistics for Global Temporary Tables".
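As a minimal sketch (table and column names are illustrative), a private temporary table is created with CREATE PRIVATE TEMPORARY TABLE; the name must carry the prefix set by the PRIVATE_TEMP_TABLE_PREFIX initialization parameter, which defaults to ORA$PTT_:

```sql
-- Transaction-scoped private temporary table; the ORA$PTT_ prefix
-- is required by the default PRIVATE_TEMP_TABLE_PREFIX setting.
CREATE PRIVATE TEMPORARY TABLE ora$ptt_sales_stage (
  sale_id NUMBER,
  amount  NUMBER(10,2)
) ON COMMIT DROP DEFINITION;

INSERT INTO ora$ptt_sales_stage VALUES (1, 99.50);

-- Both the rows and the table definition disappear at COMMIT.
COMMIT;
```

Use ON COMMIT PRESERVE DEFINITION instead to keep the table for the life of the session.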
• Approximate Top-N Query Processing
To obtain “top n” query results much faster than with traditional exact queries, use the
APPROX_SUM and APPROX_COUNT SQL functions with APPROX_RANK.
See "About Approximate Query Processing".
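A hedged example of the pattern, using hr-style sample names: APPROX_RANK appears in the HAVING clause to filter an approximate aggregation down to the top results.

```sql
-- Approximate "top 10 departments by employee count".
SELECT   department_id, APPROX_COUNT(*)
FROM     employees
GROUP BY department_id
HAVING   APPROX_RANK(ORDER BY APPROX_COUNT(*) DESC) <= 10;
```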
• SQL Tuning Advisor enhancements for Oracle Exadata Database Machine
SQL Tuning Advisor can recommend an Exadata-aware SQL profile. On Oracle Exadata
Database Machine, the cost of smart scans depends on the system statistics I/O seek
time (ioseektim), multiblock read count (mbrc), and I/O transfer speed (iotfrspeed). The
values of these statistics usually differ on Exadata and can thus influence the choice of
plan. If system statistics are stale, and if gathering them improves performance, then
SQL Tuning Advisor recommends accepting an Exadata-aware SQL profile.
See "Statistical Analysis" and "Statistics in SQL Profiles".
• New package for managing SQL tuning sets
You can use DBMS_SQLSET instead of DBMS_SQLTUNE to create, modify, drop, and perform
all other SQL tuning set operations.
See "Command-Line Interface to SQL Tuning Sets".
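For instance, a tuning set can be created and dropped through the new package rather than DBMS_SQLTUNE (the set name is illustrative):

```sql
-- Create a SQL tuning set with DBMS_SQLSET instead of DBMS_SQLTUNE.
BEGIN
  DBMS_SQLSET.CREATE_SQLSET(
    sqlset_name => 'my_sts',
    description => 'example tuning set');
END;
/

-- Remove it when no longer needed.
BEGIN
  DBMS_SQLSET.DROP_SQLSET(sqlset_name => 'my_sts');
END;
/
```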
• Scalable sequences
Scalable sequences alleviate index leaf block contention when loading data into
tables that use sequence values as keys.
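A scalable sequence is declared with the SCALE clause; the generated values carry an instance- and session-derived prefix, so concurrent inserts land in different index leaf blocks (the sequence name is illustrative):

```sql
-- SCALE spreads sequence-keyed inserts across index leaf blocks.
CREATE SEQUENCE order_id_seq
  START WITH 1
  INCREMENT BY 1
  SCALE;
```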
• Decoupling OPTIMIZER_ADAPTIVE_STATISTICS from performance feedback
Unlike in previous releases, setting the OPTIMIZER_ADAPTIVE_STATISTICS
initialization parameter to TRUE or FALSE now has no effect on performance
feedback.
Changes in Oracle Database 12c Release 2 (12.2.0.1)

New Features
The following features are new in this release:
• Advisor enhancements
– Optimizer Statistics Advisor
Optimizer Statistics Advisor is built-in diagnostic software that analyzes the
quality of statistics and statistics-related tasks. The advisor task runs
automatically in the maintenance window, but you can also run it on demand.
You can then view the advisor report. If the advisor makes recommendations,
then in some cases you can run system-generated scripts to implement them.
See "Analyzing Statistics Using Optimizer Statistics Advisor".
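A sketch of running the advisor on demand with the DBMS_STATS advisor functions (the task name is arbitrary):

```sql
-- Create and execute an Optimizer Statistics Advisor task, then
-- retrieve its report.
DECLARE
  tname VARCHAR2(128) := 'stats_advisor_demo';
BEGIN
  tname := DBMS_STATS.CREATE_ADVISOR_TASK(tname);
  tname := DBMS_STATS.EXECUTE_ADVISOR_TASK(tname);
END;
/
SELECT DBMS_STATS.REPORT_ADVISOR_TASK('stats_advisor_demo') AS report
FROM   DUAL;
```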
– Active Data Guard Support for SQL Tuning Advisor
Using database links, you can tune a standby database workload on a primary
database.
See "Local and Remote SQL Tuning".
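A rough sketch of the mechanism, with an assumed sql_id and database link name: on the read-only standby, the tuning task is created with a link over which the advisor's bookkeeping writes reach the primary.

```sql
-- Illustrative only: sql_id and link name are placeholders.
DECLARE
  tname VARCHAR2(128);
BEGIN
  tname := DBMS_SQLTUNE.CREATE_TUNING_TASK(
    sql_id           => 'fhba87zyxw123',
    database_link_to => 'LNK_TO_PRIMARY');
  DBMS_SQLTUNE.EXECUTE_TUNING_TASK(tname);
END;
/
```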
• DBMS_STATS enhancements
– DBMS_STATS preference for automatic column group statistics
If the DBMS_STATS preference AUTO_STAT_EXTENSIONS is set to ON (by default it
is OFF), then a SQL plan directive can automatically trigger the creation of
column group statistics based on usage of predicates in the workload.
See "Purpose of Optimizer Statistics Preferences".
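The preference is set through the usual DBMS_STATS preference interface, for example:

```sql
-- Allow SQL plan directives to trigger automatic column group
-- statistics (the preference defaults to OFF).
BEGIN
  DBMS_STATS.SET_GLOBAL_PREFS('AUTO_STAT_EXTENSIONS', 'ON');
END;
/

-- Verify the current setting.
SELECT DBMS_STATS.GET_PREFS('AUTO_STAT_EXTENSIONS') FROM DUAL;
```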
– DBMS_STATS support for external table scan rates and In-Memory column store
(IM column store) statistics
If the database uses an IM column store, then you can set the im_imcu_count
parameter to the number of IMCUs in the table or partition, and
im_block_count to the number of blocks. For an external table, scanrate
specifies the rate at which data is scanned in MB/second.
See "Guideline for External Tables".
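As an illustrative sketch (schema and table names are assumptions), these values are set with DBMS_STATS.SET_TABLE_STATS:

```sql
-- Set a 150 MB/s scan rate on an external table, and IMCU/block
-- counts on a table populated in the IM column store.
BEGIN
  DBMS_STATS.SET_TABLE_STATS(
    ownname  => 'SH',
    tabname  => 'SALES_EXT',
    scanrate => 150);

  DBMS_STATS.SET_TABLE_STATS(
    ownname        => 'SH',
    tabname        => 'SALES',
    im_imcu_count  => 20,
    im_block_count => 5000);
END;
/
```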
– DBMS_STATS statistics preference PREFERENCE_OVERRIDES_PARAMETER
The PREFERENCE_OVERRIDES_PARAMETER statistics preference determines
whether, when gathering optimizer statistics, to override the input value of a
parameter with the statistics preference. In this way, you control when the database
honors a parameter value passed to the statistics gathering procedures.
See "Statistics Preference Overrides".
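Enabling the override is a one-line preference change, for example:

```sql
-- Preferences now win over parameter values passed to the
-- DBMS_STATS gathering procedures (default is FALSE).
BEGIN
  DBMS_STATS.SET_GLOBAL_PREFS('PREFERENCE_OVERRIDES_PARAMETER', 'TRUE');
END;
/
```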
– Access to current statistics does not require FLUSH_DATABASE_MONITORING_INFO
You no longer need to ensure that view metadata is up-to-date by using
DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO to save monitoring information to
disk. The statistics shown in DBA_TAB_STATISTICS and DBA_IND_STATISTICS come
from the same source as DBA_TAB_MODIFICATIONS, which means these views show
statistics obtained from disk and memory.
See "Determining When Optimizer Statistics Are Stale".
• Separate controls for adaptive plans and adaptive statistics
The OPTIMIZER_ADAPTIVE_PLANS initialization parameter enables (default) or disables
adaptive plans. The OPTIMIZER_ADAPTIVE_STATISTICS initialization parameter enables or
disables (default) adaptive statistics.
See "When Adaptive Query Plans Are Enabled" and "When Adaptive Statistics Are
Enabled".
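Both controls are ordinary initialization parameters; the statements below set them to their documented defaults:

```sql
-- Defaults shown: adaptive plans on, adaptive statistics off.
ALTER SYSTEM SET OPTIMIZER_ADAPTIVE_PLANS = TRUE;
ALTER SYSTEM SET OPTIMIZER_ADAPTIVE_STATISTICS = FALSE;
```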
• Join enhancements
– Join groups
A join group is a user-created object that lists two columns that can be meaningfully
joined. In certain queries, join groups enable the database to eliminate the
performance overhead of decompressing and hashing column values. Join groups
require an IM column store.
See "In-Memory Join Groups".
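A sketch with sh-style names (illustrative): the join group names the two columns that queries join on, and both tables must be populated in the IM column store for it to help.

```sql
-- Declare that sales.product_id and products.product_id are joined,
-- so both columns can share a compression dictionary.
CREATE INMEMORY JOIN GROUP sales_products_jg
  (sales(product_id), products(product_id));
```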
– Band join enhancements
A band join is a special type of nonequijoin in which key values in one data set must
fall within the specified range (“band”) of the second data set. When the database
detects a band join, the database evaluates the costs of band joins more efficiently,
avoiding unnecessary scans of rows that fall outside the defined bands. In most
cases, optimized performance is comparable to an equijoin.
See "Band Joins".
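For example (hr-style names, illustrative), the BETWEEN range below defines the band, making this a band join rather than a general nonequijoin:

```sql
-- Pairs of employees whose salaries are within 100 of each other.
SELECT e1.last_name, e2.last_name
FROM   employees e1, employees e2
WHERE  e2.salary BETWEEN e1.salary - 100 AND e1.salary + 100;
```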
• Cursor management enhancements
– Cursor-duration temporary tables
To materialize the intermediate results of a query, Oracle Database may create a
cursor-duration temporary table in memory during query compilation. For complex
operations such as WITH clause queries and star transformations, this internal
optimization, which enhances the materialization of intermediate results from
repetitively used subqueries, improves performance and optimizes I/O.
See "Cursor-Duration Temporary Tables".
– Fine-grained cursor invalidation
Starting in this release, you can specify deferred invalidation on DDL statements.
When shared SQL areas are marked rolling invalid, the database assigns each one a
randomly generated time period. A hard parse occurs only if the query executes after
the time period has expired. In this way, the database can diffuse the performance
overhead of hard parsing over time.
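The rolling invalidation scheme can be sketched as follows (a toy Python model, not Oracle code): each shared SQL area receives a randomly chosen expiry within a grace period, so hard parses are spread out over time rather than occurring all at once.

```python
import random

class SharedSqlArea:
    """Toy model of rolling invalidation: when a cursor is marked
    rolling invalid, it receives a randomly generated grace period,
    and a hard parse occurs only on the first execution after the
    period expires."""
    def __init__(self):
        self.rolling_invalid_at = None  # expiry time, or None if valid

    def mark_rolling_invalid(self, now, max_grace):
        # each cursor gets its own random expiry time, diffusing
        # hard parses instead of causing a parse storm
        self.rolling_invalid_at = now + random.uniform(0, max_grace)

    def execute(self, now):
        if self.rolling_invalid_at is not None and now >= self.rolling_invalid_at:
            self.rolling_invalid_at = None  # cursor is rebuilt
            return "hard parse"
        return "soft parse"
```

Marking many cursors rolling invalid at the same moment therefore produces hard parses scattered across the grace period, one per cursor, rather than a simultaneous spike.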
Desupported Features
The following features are desupported in Oracle Database 12c Release 2 (12.2.0.1).
• The OPTIMIZER_ADAPTIVE_FEATURES initialization parameter
See Also:
Oracle Database Upgrade Guide for a list of desupported features
Other Changes
This topic describes additional changes in the release.
• New Real-World Performance content
New Features
The following features are new in this release.
• In-Memory aggregation
This optimization minimizes the join and GROUP BY processing required for each row when
joining a single large table to multiple small tables, as in a star schema. VECTOR GROUP BY
aggregation uses the infrastructure related to parallel query (PQ) processing, and blends
it with CPU-efficient algorithms to maximize the performance and effectiveness of the
initial aggregation performed before redistributing fact data.
See "In-Memory Aggregation (VECTOR GROUP BY)".
• SQL Monitor support for adaptive query plans
SQL Monitor supports adaptive query plans in the following ways:
– Indicates whether a query plan is adaptive, and shows its current status: resolving or resolved.
– Provides a list that enables you to select the current, full, or final query plans
See "Adaptive Query Plans" to learn more about adaptive query plans, and
"Monitoring SQL Executions Using Cloud Control" to learn more about SQL Monitor.
New Features
The following features are new in this release.
• Adaptive SQL Plan Management (SPM)
The SPM Evolve Advisor is a task infrastructure that enables you to schedule an evolve
task, rerun an evolve task, and generate persistent reports. The new automatic evolve
task, SYS_AUTO_SPM_EVOLVE_TASK, runs in the default maintenance window. This task
ranks all unaccepted plans and runs the evolve process for them. If the task finds a new plan that performs better than the existing plan, then the task automatically accepts the new plan. You
can also run evolution tasks manually using the DBMS_SPM package.
See "Managing the SPM Evolve Advisor Task".
• Adaptive query optimization
Regular frequency histograms have also been enhanced. The optimizer computes frequency histograms during NDV computation based on a full scan of the data rather than a small sample (when AUTO_SAMPLE_SIZE is used). The enhanced frequency histograms
ensure that even highly infrequent values are properly represented with accurate bucket
counts within a histogram.
See "Histograms ".
• Monitoring database operations
Real-Time Database Operations Monitoring enables you to monitor long running
database tasks such as batch jobs, scheduler jobs, and Extraction, Transformation, and
Loading (ETL) jobs as a composite business operation. This feature tracks the progress
of SQL and PL/SQL queries associated with the business operation being monitored. As
a DBA or developer, you can define business operations for monitoring by explicitly
specifying the start and end of the operation or implicitly with tags that identify the
operation.
See "Monitoring Database Operations ".
• Concurrent statistics gathering
You can concurrently gather optimizer statistics on multiple tables, table partitions, or
table subpartitions. By fully utilizing multiprocessor environments, the database can
reduce the overall time required to gather statistics. Oracle Scheduler and Advanced
Queuing create and manage jobs to gather statistics concurrently. The scheduler decides
how many jobs to execute concurrently, and how many to queue based on available
system resources and the value of the JOB_QUEUE_PROCESSES initialization parameter.
See "Gathering Optimizer Statistics Concurrently".
• Reporting mode for DBMS_STATS statistics gathering functions
You can run the DBMS_STATS functions in reporting mode. In this mode, the optimizer does
not actually gather statistics, but reports objects that would be processed if you were to
use a specified statistics gathering function.
See "Running Statistics Gathering Functions in Reporting Mode".
• Reports on past statistics gathering operations
You can use DBMS_STATS functions to report on a specific statistics gathering operation or
on operations that occurred during a specified time.
See "Reporting on Past Statistics Gathering Operations".
• Automatic column group creation
With column group statistics, the database gathers optimizer statistics on a group of
columns treated as a unit. Starting in Oracle Database 12c, the database automatically
determines which column groups are required in a specified workload or SQL tuning set,
and then creates the column groups. Thus, for any specified workload, you no longer
need to know which columns from each table must be grouped.
See "Detecting Useful Column Groups for a Specific Workload".
• Session-private statistics for global temporary tables
Starting in this release, global temporary tables have a different set of optimizer statistics
for each session. Session-specific statistics improve performance and manageability of
temporary tables because users no longer need to set statistics for a global temporary
table in each session or rely on dynamic statistics. The possibility of errors in cardinality
estimates for global temporary tables is lower, ensuring that the optimizer has the
necessary information to determine an optimal execution plan.
Deprecated Features
The following features are deprecated in this release, and may be desupported in a
future release.
• Stored outlines
See "Managing SQL Plan Baselines" for information about alternatives.
• The SIMILAR value for the CURSOR_SHARING initialization parameter
Desupported Features
Some features previously described in this document are desupported in Oracle Database
12c.
See Oracle Database Upgrade Guide for a list of desupported features.
Other Changes
This manual has the following additional changes in Oracle Database 12c.
• New tuning books
The Oracle Database 11g Oracle Database Performance Tuning Guide has been divided
into two books for Oracle Database 12c:
– Oracle Database Performance Tuning Guide, which contains only topics that pertain
to tuning the database
– Oracle Database SQL Tuning Guide, which contains only topics that pertain to tuning
SQL
Part I
SQL Performance Fundamentals
SQL tuning is the process of improving SQL statement performance to meet specific, measurable, and achievable goals.
This part contains the following chapters:
1
Introduction to SQL Tuning
SQL tuning is the attempt to diagnose and repair SQL statements that fail to meet a
performance standard.
This chapter contains the following topics:
Tasks and Tools for SQL Tuning
Review past execution history to find the statements responsible for a large share of the
application workload and system resources.
2. Gathering performance-related data
The optimizer statistics are crucial to SQL tuning. If these statistics do not exist or are no
longer accurate, then the optimizer cannot generate the best plan. Other data relevant to
SQL performance include the structure of tables and views that the statement accessed,
and definitions of any indexes available to the statement.
3. Determining the causes of the problem
Typically, causes of SQL performance problems include:
• Inefficiently designed SQL statements
If a SQL statement is written so that it performs unnecessary work, then the optimizer
cannot do much to improve its performance. Examples of inefficient design include:
– Neglecting to add a join condition, which leads to a Cartesian join
– Using hints to specify a large table as the driving table in a join
– Specifying UNION instead of UNION ALL
– Making a subquery execute for every row in an outer query
• Suboptimal execution plans
The query optimizer (also called the optimizer) is internal software that determines
which execution plan is most efficient. Sometimes the optimizer chooses a plan with
a suboptimal access path, which is the means by which the database retrieves data
from the database. For example, the plan for a query predicate with low selectivity
may use a full table scan on a large table instead of an index.
You can compare the execution plan of an optimally performing SQL statement to the
plan of the statement when it performs suboptimally. This comparison, along with
information such as changes in data volumes, can help identify causes of
performance degradation.
• Missing SQL access structures
Absence of SQL access structures, such as indexes and materialized views, is a
typical reason for suboptimal SQL performance. The optimal set of access structures
can improve SQL performance by orders of magnitude.
• Stale optimizer statistics
Statistics gathered by DBMS_STATS can become stale when the statistics maintenance
operations, either automatic or manual, cannot keep up with the changes to the table
data caused by DML. Because stale statistics on a table do not accurately reflect the
table data, the optimizer can make decisions based on faulty information and
generate suboptimal execution plans.
• Hardware problems
Suboptimal performance might be connected with memory, I/O, and CPU problems.
4. Defining the scope of the problem
The scope of the solution must match the scope of the problem. Consider a problem at
the database level and a problem at the statement level. For example, the shared pool is
too small, which causes cursors to age out quickly, which in turn causes many hard
parses. Using an initialization parameter to increase the shared pool size fixes the
problem at the database level and improves performance for all sessions. However, if a
single SQL statement is not using a helpful index, then changing the optimizer
initialization parameters for the entire database could harm overall performance. If
a single SQL statement has a problem, then an appropriately scoped solution
addresses just this problem with this statement.
5. Implementing corrective actions for suboptimally performing SQL statements
These actions vary depending on circumstances. For example, you might rewrite a
SQL statement to be more efficient, avoiding unnecessary hard parsing by
rewriting the statement to use bind variables. You might also use equijoins,
remove functions from WHERE clauses, and break a complex SQL statement into
multiple simple statements.
In some cases, you improve SQL performance not by rewriting the statement, but
by restructuring schema objects. For example, you might index a new access
path, or reorder columns in a concatenated index. You might also partition a table,
introduce derived values, or even change the database design.
6. Preventing SQL performance regressions
To ensure optimal SQL performance, verify that execution plans continue to
provide optimal performance, and choose better plans if they become available. You
can achieve these goals using optimizer statistics, SQL profiles, and SQL plan
baselines.
See Also:
Oracle Database Testing Guide
• AUTOTRACE
The AUTOTRACE command in SQL*Plus generates the execution plan and statistics
about the performance of a query. This command provides statistics such as disk
reads and memory reads. See SQL*Plus User's Guide and Reference.
• trcsess
This utility consolidates trace output from multiple trace files based on criteria such as
session ID, client ID, and service ID. After trcsess merges the trace information into a
single output file, you can format the output file with TKPROF. trcsess is useful for
consolidating the tracing of a particular session for performance or debugging purposes.
End-to-End Application Tracing simplifies the process of diagnosing performance problems in
multitier environments. In these environments, the middle tier routes a request from an end
client to different database sessions, making it difficult to track a client across database
sessions. End-to-End Application Tracing uses a client ID to uniquely trace a specific end-
client through all tiers to the database.
See Also:
Oracle Database PL/SQL Packages and Types Reference to learn more about
DBMS_MONITOR and DBMS_SESSION
Oracle recommends Cloud Control as the best interface for database administration
and tuning. In cases where the command-line interface better illustrates a particular
concept or task, this manual uses command-line examples. However, in these cases
the tuning tasks include a reference to the principal Cloud Control page associated
with the task.
2
SQL Performance Methodology
This chapter describes the recommended methodology for SQL tuning.
Note:
This book assumes that you have learned the Oracle Database performance
methodology described in Oracle Database 2 Day + Performance Tuning Guide.
Guidelines for Designing Your Application
SELECT *
FROM employees
WHERE last_name LIKE 'KING';
SELECT *
FROM employees
WHERE last_name LIKE :1;
The following example shows the results of some tests on a simple OLTP
application:
These tests were performed on a four-CPU computer. The differences increase as the number of CPUs on the system increases.
Part II
Query Optimizer Fundamentals
To tune Oracle SQL, you must understand the query optimizer. The optimizer is built-in
software that determines the most efficient method for a statement to access data.
This part contains the following chapters:
3
SQL Processing
This chapter explains how the database processes DDL statements to create objects, DML statements to modify data, and queries to retrieve data.
This chapter contains the following topics:
Figure: Stages of SQL processing — a SQL statement is parsed (syntax check, semantic check); a hard parse leads to optimization, in which multiple execution plans are generated; the chosen plan is then executed.
See Also:
Oracle Database Concepts to learn about deadlocks
A syntactically correct statement can fail a semantic check, as shown in the following
example of a query of a nonexistent table:
Note:
The database always performs a hard parse of DDL.
During the hard parse, the database accesses the library cache and data dictionary
cache numerous times to check the data dictionary. When the database accesses these
areas, it uses a serialization device called a latch on required objects so that their
definition does not change. Latch contention increases statement execution time and
decreases concurrency.
• Soft parse
A soft parse is any parse that is not a hard parse. If the submitted statement is the
same as a reusable SQL statement in the shared pool, then Oracle Database
reuses the existing code. This reuse of code is also called a library cache hit.
Soft parses can vary in how much work they perform. For example, configuring the
session shared SQL area can sometimes reduce the amount of latching in the soft
parses, making them "softer."
In general, a soft parse is preferable to a hard parse because the database skips
the optimization and row source generation steps, proceeding straight to
execution.
The following graphic is a simplified representation of a shared pool check of an
UPDATE statement in a dedicated server architecture.
Figure: Shared pool check — the user submits an UPDATE statement through a client process to a server process; the statement's hash value (for example, 3967354608) is compared against the shared SQL areas in the library cache of the shared pool (in the SGA, alongside the data dictionary cache, server result cache, and reserved pool), while the private SQL area and session memory reside in the PGA.
If a check determines that a statement in the shared pool has the same hash value,
then the database performs semantic and environment checks to determine whether
the statements have the same meaning. Identical syntax is not sufficient. For example,
suppose two different users log in to the database and issue the following SQL
statements:
The SELECT statements for the two users are syntactically identical, but two separate
schema objects are named my_table. This semantic difference means that the second
statement cannot reuse the code for the first statement.
Even if two statements are semantically identical, an environmental difference can
force a hard parse. In this context, the optimizer environment is the totality of session
settings that can affect execution plan generation, such as the work area size or optimizer
settings (for example, the optimizer mode). Consider the following series of SQL statements
executed by a single user:
In the preceding example, the same SELECT statement is executed in three different optimizer
environments. Consequently, the database creates three separate shared SQL areas for
these statements and forces a hard parse of each statement.
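The shared pool check can be sketched as follows (a simplified Python model, not Oracle code; the hash function and structures are stand-ins): identical text alone is not enough for a library cache hit — the resolved schema objects and the optimizer environment must also match.

```python
import hashlib

def sql_hash(text):
    # stand-in for the hash value computed over the statement text
    return hashlib.sha256(text.encode()).hexdigest()

class LibraryCache:
    """Toy shared pool check: identical text is necessary but not
    sufficient -- the schema objects and the optimizer environment
    must also match before an existing cursor can be reused."""
    def __init__(self):
        self.cursors = {}  # hash -> list of (schema, env) versions

    def parse(self, text, schema, env):
        key = sql_hash(text)
        for cached_schema, cached_env in self.cursors.get(key, []):
            if cached_schema == schema and cached_env == env:
                return "soft parse"   # library cache hit
        # no shareable cursor: build a new one
        self.cursors.setdefault(key, []).append((schema, env))
        return "hard parse"

cache = LibraryCache()
env = {"optimizer_mode": "all_rows"}
cache.parse("SELECT * FROM my_table", "user1", env)  # hard parse
cache.parse("SELECT * FROM my_table", "user2", env)  # hard parse: a different my_table
cache.parse("SELECT * FROM my_table", "user1", env)  # soft parse
```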
See Also:
• Oracle Database Concepts to learn about private SQL areas and shared SQL
areas
• Oracle Database Performance Tuning Guide to learn how to configure the
shared pool
• Oracle Database Concepts to learn about latches
Execution Plan
----------------------------------------------------------
Plan hash value: 975837011
---------------------------------------------------------------------------
| Id| Operation                     | Name        |Rows|Bytes|Cost(%CPU)| Time     |
---------------------------------------------------------------------------
|  0| SELECT STATEMENT              |             |   3|  189|    7 (15)| 00:00:01 |
|* 1|  HASH JOIN                    |             |   3|  189|    7 (15)| 00:00:01 |
|* 2|   HASH JOIN                   |             |   3|  141|    5 (20)| 00:00:01 |
|  3|    TABLE ACCESS BY INDEX ROWID| EMPLOYEES   |   3|   60|    2  (0)| 00:00:01 |
|* 4|     INDEX RANGE SCAN          | EMP_NAME_IX |   3|     |    1  (0)| 00:00:01 |
|  5|    TABLE ACCESS FULL          | JOBS        |  19|  513|    2  (0)| 00:00:01 |
|  6|   TABLE ACCESS FULL           | DEPARTMENTS |  27|  432|    2  (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
   2 - access("E"."JOB_ID"="J"."JOB_ID")
   4 - access("E"."LAST_NAME" LIKE 'A%')
       filter("E"."LAST_NAME" LIKE 'A%')
Figure 3-3 (row source tree of the execution plan):

1 HASH JOIN
|-- 2 HASH JOIN
|   |-- 3 TABLE ACCESS BY INDEX ROWID (employees)
|   |   `-- 4 INDEX RANGE SCAN (emp_name_ix)
|   `-- 5 TABLE ACCESS FULL (jobs)
`-- 6 TABLE ACCESS FULL (departments)
In Figure 3-3, each node of the tree acts as a row source, which means that each step of the
execution plan in Example 3-1 either retrieves rows from the database or accepts rows from
one or more row sources as input. The SQL engine executes each row source as follows:
• Steps indicated by the black boxes physically retrieve data from an object in the
database. These steps are the access paths, or techniques for retrieving data from
the database.
– Step 6 uses a full table scan to retrieve all rows from the departments table.
– Step 5 uses a full table scan to retrieve all rows from the jobs table.
– Step 4 scans the emp_name_ix index in order, looking for each key that begins
with the letter A and retrieving the corresponding rowid. For example, the rowid
corresponding to Atkinson is AAAPzRAAFAAAABSAAe.
– Step 3 retrieves from the employees table the rows whose rowids were
returned by Step 4. For example, the database uses rowid
AAAPzRAAFAAAABSAAe to retrieve the row for Atkinson.
• Steps indicated by the clear boxes operate on row sources.
– Step 2 performs a hash join, accepting row sources from Steps 3 and 5,
joining each row from the Step 5 row source to its corresponding row in Step
3, and returning the resulting rows to Step 1.
For example, the row for employee Atkinson is associated with the job name
Stock Clerk.
– Step 1 performs another hash join, accepting row sources from Steps 2 and 6,
joining each row from the Step 6 source to its corresponding row in Step 2,
and returning the result to the client.
For example, the row for employee Atkinson is associated with the
department named Shipping.
In some execution plans the steps are iterative and in others sequential. The hash join
shown in Example 3-1 is sequential. The database completes the steps in their entirety
based on the join order. The database starts with the index range scan of
emp_name_ix. Using the rowids that it retrieves from the index, the database reads the
matching rows in the employees table, and then scans the jobs table. After it retrieves
the rows from the jobs table, the database performs the hash join.
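The hash join itself is a build phase followed by a probe phase. The following Python sketch (illustrative only, not Oracle code; the rows and column names follow the example) builds an in-memory hash table on one input and probes it with the other:

```python
def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Minimal in-memory hash join: build a hash table on one input,
    then probe it with each row of the other input."""
    table = {}
    for row in build_rows:                      # build phase
        table.setdefault(row[build_key], []).append(row)
    joined = []
    for row in probe_rows:                      # probe phase
        for match in table.get(row[probe_key], []):
            joined.append({**match, **row})     # merge matching rows
    return joined

# join the employees rows to jobs on job_id, as in Step 2 of the plan
employees = [{"last_name": "Atkinson", "job_id": "ST_CLERK"}]
jobs = [{"job_id": "ST_CLERK", "job_title": "Stock Clerk"}]
rows = hash_join(jobs, employees, "job_id", "job_id")
```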
During execution, the database reads the data from disk into memory if the data is not
in memory. The database also takes out any locks and latches necessary to ensure
data integrity and logs any changes made during the SQL execution. The final stage of
processing a SQL statement is closing the cursor.
In general, the database cannot determine for certain the number of rows to be retrieved by a
query until the last row is fetched. Oracle Database retrieves the data in response to fetch
calls, so that the more rows the database reads, the more work it performs. For some queries
the database returns the first row as quickly as possible, whereas for others it creates the
entire result set before returning the first row.
See Also:
Oracle Database Concepts to learn about multiversion read consistency
Typically, the database runs dozens of recursive statements to execute a single DDL statement such as CREATE TABLE. The recursive SQL performs actions such as the following:
• Issue a COMMIT before executing the CREATE TABLE statement
• Verify that user privileges are sufficient to create the table
• Determine which tablespace the table should reside in
See Also:
Oracle Database Development Guide to learn about processing DDL,
transaction control, and other types of statements
4
Query Optimizer Concepts
This chapter describes the most important concepts relating to the query optimizer, including
its principal components.
This chapter contains the following topics:
Introduction to the Query Optimizer
Note:
The optimizer may not make the same decisions from one version of Oracle
Database to the next. In recent versions, the optimizer might make different decisions because better information is available and more optimizer
transformations are possible.
Figure: The optimizer generates multiple plans and compares them — for example, plan 1 using nested loops joins and plan 2 using hash joins — using statistics as input.
Each SELECT block in the original SQL statement is represented internally by a query block. A
query block can be a top-level statement, subquery, or unmerged view.
Example 4-1 Query Blocks
The following SQL statement consists of two query blocks. The subquery in parentheses is
the inner query block. The outer query block, which is the rest of the SQL statement, retrieves
names of employees in the departments whose IDs were supplied by the subquery. The
query form determines how query blocks are interrelated.
See Also:
• "View Merging"
• Oracle Database Concepts for an overview of SQL processing
About Optimizer Components
The advisor picks the most efficient (lowest cost) overall route based on user-specified
goals and the available statistics about roads and traffic conditions. The more accurate
the statistics, the better the advice. For example, if the advisor is not frequently notified
of traffic jams, road closures, and poor road conditions, then the recommended route
may turn out to be inefficient (high cost).
Figure: Optimizer components — the parsed query (from the parser) enters the Query Transformer; the transformed query and estimates flow between the Estimator, which reads statistics from the data dictionary, and the Plan Generator, which produces the query plan.
A set of query blocks represents a parsed query, which is the input to the optimizer.
The following table describes the optimizer operations.
SELECT *
FROM sales
WHERE promo_id=33
OR prod_id=136;
Query Transformer
SELECT *
FROM sales
WHERE prod_id=136
UNION ALL
SELECT *
FROM sales
WHERE promo_id=33
AND LNNVL(prod_id=136);
4.2.2 Estimator
The estimator is the component of the optimizer that determines the overall cost of a given
execution plan.
The estimator uses three different measures to determine cost:
• Selectivity
The percentage of rows in the row set that the query selects, with 0 meaning no rows and
1 meaning all rows. Selectivity is tied to a query predicate, such as WHERE last_name = 'Smith'.
Note:
Selectivity is an internal calculation that is not visible in the execution
plans.
• Cardinality
The cardinality is the number of rows returned by each operation in an execution
plan. This input, which is crucial to obtaining an optimal plan, is common to all cost
functions. The estimator can derive cardinality from the table statistics collected by
DBMS_STATS, or derive it after accounting for effects from predicates (filter, join, and
so on), DISTINCT or GROUP BY operations, and so on. The Rows column in an
execution plan shows the estimated cardinality.
• Cost
This measure represents units of work or resource used. The query optimizer uses
disk I/O, CPU usage, and memory usage as units of work.
As shown in the following graphic, if statistics are available, then the estimator uses
them to compute the measures. The statistics improve the degree of accuracy of the
measures.
Figure: The estimator computes selectivity, cardinality, and cost measures for a plan (for example, a plan with GROUP BY and hash joins), using statistics as input, and produces a total cost.
For the query shown in Example 4-1, the estimator uses selectivity, estimated
cardinality (a total return of 10 rows), and cost measures to produce its total cost
estimate of 3:
---------------------------------------------------------------------------
|Id| Operation                    | Name              |Rows|Bytes|Cost %CPU| Time     |
---------------------------------------------------------------------------
| 0| SELECT STATEMENT             |                   |  10|  250|    3 (0)| 00:00:01 |
| 1|  NESTED LOOPS                |                   |    |     |         |          |
| 2|   NESTED LOOPS               |                   |  10|  250|    3 (0)| 00:00:01 |
|*3|    TABLE ACCESS FULL         | DEPARTMENTS       |   1|    7|    2 (0)| 00:00:01 |
|*4|    INDEX RANGE SCAN          | EMP_DEPARTMENT_IX |  10|     |    0 (0)| 00:00:01 |
| 5|   TABLE ACCESS BY INDEX ROWID| EMPLOYEES         |  10|  180|    1 (0)| 00:00:01 |
---------------------------------------------------------------------------
4.2.2.1 Selectivity
The selectivity represents a fraction of rows from a row set.
The row set can be a base table, a view, or the result of a join. The selectivity is tied to a
query predicate, such as last_name = 'Smith', or a combination of predicates, such as
last_name = 'Smith' AND job_id = 'SH_CLERK'.
Note:
Selectivity is an internal calculation that is not visible in execution plans.
A predicate filters a specific number of rows from a row set. Thus, the selectivity of a
predicate indicates how many rows pass the predicate test. Selectivity ranges from 0.0 to 1.0.
A selectivity of 0.0 means that no rows are selected from a row set, whereas a selectivity of
1.0 means that all rows are selected. A predicate becomes more selective as the value
approaches 0.0 and less selective (or more unselective) as the value approaches 1.0.
The optimizer estimates selectivity depending on whether statistics are available:
• Statistics not available
Depending on the value of the OPTIMIZER_DYNAMIC_SAMPLING initialization parameter, the
optimizer either uses dynamic statistics or an internal default value. The database uses
different internal defaults depending on the predicate type. For example, the internal
default for an equality predicate (last_name = 'Smith') is lower than for a range
predicate (last_name > 'Smith') because an equality predicate is expected to return a
smaller fraction of rows.
• Statistics available
When statistics are available, the estimator uses them to estimate selectivity. Assume
there are 150 distinct employee last names. For an equality predicate last_name =
'Smith', selectivity is the reciprocal of the number n of distinct values of last_name,
which in this example is .006 because the query selects rows that contain 1 out of 150
distinct values.
If a histogram exists on the last_name column, then the estimator uses the histogram
instead of the number of distinct values. The histogram captures the distribution of
different values in a column, so it yields better selectivity estimates, especially for
columns that have data skew.
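The arithmetic behind these estimates can be sketched as follows (a simplified Python model, not the optimizer's actual formulas; the histogram here is a plain frequency count):

```python
def equality_selectivity(num_distinct, histogram=None, value=None, num_rows=None):
    """Estimate the selectivity of an equality predicate. Without a
    histogram, assume a uniform distribution (1/NDV); with a
    frequency histogram, use the observed count for the value."""
    if histogram is not None:
        return histogram.get(value, 0) / num_rows
    return 1 / num_distinct

# uniform assumption: 150 distinct last names -> 1/150 (about .006)
uniform = equality_selectivity(num_distinct=150)

# skewed column: a frequency histogram gives a better estimate
hist = {"Smith": 40, "King": 5, "Lee": 5}
skewed = equality_selectivity(150, histogram=hist, value="Smith", num_rows=50)
```

With skewed data, the uniform 1/NDV assumption would badly underestimate a popular value such as Smith, which is why a histogram yields better estimates.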
See Also:
• "Histograms "
• Oracle Database Reference to learn more about
OPTIMIZER_DYNAMIC_SAMPLING
4.2.2.2 Cardinality
The cardinality is the number of rows returned by each operation in an execution
plan.
For example, if the optimizer estimate for the number of rows returned by a full table
scan is 100, then the cardinality estimate for this operation is 100. The cardinality
estimate appears in the Rows column of the execution plan.
The optimizer determines the cardinality for each operation based on a complex set of
formulas that use both table and column level statistics, or dynamic statistics, as input.
The optimizer uses one of the simplest formulas when a single equality predicate
appears in a single-table query, with no histogram. In this case, the optimizer assumes
a uniform distribution and calculates the cardinality for the query by dividing the total
number of rows in the table by the number of distinct values in the column used in the
WHERE clause predicate.
The employees table contains 107 rows. The current database statistics indicate that
the number of distinct values in the salary column is 58. Therefore, the optimizer
estimates the cardinality of the result set as 2, using the formula 107/58=1.84.
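The following sketch shows this arithmetic (a simplified model; rounding the estimate up to a whole number of rows is an assumption of this sketch):

```python
import math

def cardinality_estimate(num_rows, num_distinct):
    """Cardinality of an equality predicate on a column with no
    histogram: assume a uniform distribution, rows/NDV, rounded
    up to a whole number of rows."""
    return math.ceil(num_rows / num_distinct)

# employees has 107 rows and 58 distinct salary values
est = cardinality_estimate(107, 58)  # 107/58 = 1.84, rounded to 2
```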
4.2.2.3 Cost
The optimizer cost model accounts for the machine resources that a query is
predicted to use.
The cost is an internal numeric measure that represents the estimated resource usage
for a plan. The cost is specific to a query in an optimizer environment. To estimate
cost, the optimizer considers factors such as the following:
• System resources, which includes estimated I/O, CPU, and memory
• Estimated number of rows returned (cardinality)
Note:
The cost is an internal measure that the optimizer uses to compare different plans
for the same query. You cannot tune or change cost.
The execution time is a function of the cost, but cost does not equate directly to time. For
example, if the plan for query A has a lower cost than the plan for query B, then the following
outcomes are possible:
• A executes faster than B.
• A executes slower than B.
• A executes in the same amount of time as B.
Therefore, you cannot compare the costs of different queries with one another. Also, you
cannot compare the costs of semantically equivalent queries that use different optimizer
modes.
[Figure: For the query SELECT e.last_name, d.department_name FROM hr.employees e, hr.departments d WHERE e.department_id = d.department_id, the query transformer passes the statement to the optimizer, which considers alternative join orders (departments, employees and employees, departments), access paths (full table scan or index), and join methods (hash join, nested loops, or sort merge).]
The following snippet from an optimizer trace file shows some computations that the
optimizer performs:
GENERAL PLANS
***************************************
Considering cardinality-based initial join order.
Permutations for Starting Table :0
Join order[1]: DEPARTMENTS[D]#0 EMPLOYEES[E]#1
***************
Now joining: EMPLOYEES[E]#1
***************
NL Join
Outer table: Card: 27.00 Cost: 2.01 Resp: 2.01 Degree: 1 Bytes: 16
Access path analysis for EMPLOYEES
. . .
Best NL cost: 13.17
. . .
SM Join
SM cost: 6.08
resc: 6.08 resc_io: 4.00 resc_cpu: 2501688
resp: 6.08 resp_io: 4.00 resp_cpu: 2501688
. . .
SM Join (with index on outer)
Access Path: index (FullScan)
. . .
HA Join
HA cost: 4.57
resc: 4.57 resc_io: 4.00 resc_cpu: 678154
resp: 4.57 resp_io: 4.00 resp_cpu: 678154
Best:: JoinMethod: Hash
Cost: 4.57 Degree: 1 Resp: 4.57 Card: 106.00 Bytes: 27
. . .
***********************
Join order[2]: EMPLOYEES[E]#1 DEPARTMENTS[D]#0
. . .
***************
Now joining: DEPARTMENTS[D]#0
***************
. . .
HA Join
HA cost: 4.58
resc: 4.58 resc_io: 4.00 resc_cpu: 690054
resp: 4.58 resp_io: 4.00 resp_cpu: 690054
Join order aborted: cost > best plan cost
***********************
The trace file shows the optimizer first trying the departments table as the outer table in the join. The optimizer calculates the cost for three different join methods: nested loops join (NL), sort merge (SM), and hash join (HA), and picks the hash join as the most efficient method.
The optimizer then tries a different join order, using employees as the outer table. This join
order costs more than the previous join order, so it is abandoned.
The optimizer uses an internal cutoff to reduce the number of plans it tries when finding the
lowest-cost plan. The cutoff is based on the cost of the current best plan. If the current best
cost is large, then the optimizer explores alternative plans to find a lower cost plan. If the
current best cost is small, then the optimizer ends the search swiftly because further cost
improvement is not significant.
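The cutoff behavior can be sketched in Python. This is purely illustrative: the step costs below are made-up numbers, and Oracle's real search uses its internal cost model rather than a lookup table. A join order is abandoned as soon as its accumulated cost exceeds the best complete plan found so far, matching the "Join order aborted: cost > best plan cost" message in the trace.

```python
from itertools import permutations

# Hypothetical per-step costs for each join order (made-up numbers).
JOIN_COST = {
    ('departments', 'employees'): [2.01, 4.57],
    ('employees', 'departments'): [3.00, 4.58],
}

def best_join_order(tables):
    best_order, best_cost = None, float('inf')
    for order in permutations(tables):
        total = 0.0
        for step_cost in JOIN_COST[order]:
            total += step_cost
            if total > best_cost:
                break  # "Join order aborted: cost > best plan cost"
        else:
            best_order, best_cost = order, total
    return best_order, best_cost

print(best_join_order(('departments', 'employees')))
```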
Chapter 4
About Adaptive Query Optimization
improve the plan produced in normal mode. The optimizer output is not an
execution plan, but a series of actions, along with their rationale and expected
benefit for producing a significantly better plan.
[Figure: Adaptive query optimization comprises adaptive plans and adaptive statistics.]
Dynamic Plans
To change plans at runtime, adaptive query plans use a dynamic plan, which is represented
as a set of subplan groups. A subplan group is a set of subplans. A subplan is a portion of a
plan that the optimizer can switch to as an alternative at run time. For example, a nested
loops join could switch to a hash join during execution.
The optimizer decides which subplan to use at run time. When notified of a new statistic value relevant to a subplan group, the coordinator dispatches it to the handler function for that subplan group.
[Figure: A dynamic plan represented as subplan groups, where each subplan under a GROUP BY operation is an alternative portion of the plan that the optimizer can switch to at run time.]
plan at run time based on actual execution statistics results in a more optimal final
plan. After choosing the final plan, the optimizer uses it for subsequent executions,
thus ensuring that the suboptimal plan is not reused.
SELECT product_name
FROM order_items o, prod_info p
WHERE o.unit_price = 15
AND quantity > 1
AND p.product_id = o.product_id
An adaptive query plan for this statement shows two possible plans, one with a nested loops
join and the other with a hash join:
-----------------------------------------------------------------------------
| Id | Operation | Name |Rows|Bytes|Cost (%CPU)|Time|
-----------------------------------------------------------------------------
| 0| SELECT STATEMENT | | | |7(100)| |
| * 1| HASH JOIN | |4| 128 | 7 (0)|00:00:01|
|- 2| NESTED LOOPS | |4| 128 | 7 (0)|00:00:01|
|- 3| NESTED LOOPS | |4| 128 | 7 (0)|00:00:01|
|- 4| STATISTICS COLLECTOR | | | | | |
| * 5| TABLE ACCESS FULL | ORDER_ITEMS |4| 48 | 3 (0)|00:00:01|
|-* 6| INDEX UNIQUE SCAN | PROD_INFO_PK |1| | 0 (0)| |
|- 7| TABLE ACCESS BY INDEX ROWID| PROD_INFO |1| 20 | 1 (0)|00:00:01|
| 8| TABLE ACCESS FULL | PROD_INFO |1| 20 | 1 (0)|00:00:01|
-----------------------------------------------------------------------------
1 - access("P"."PRODUCT_ID"="O"."PRODUCT_ID")
5 - filter(("O"."UNIT_PRICE"=15 AND "QUANTITY">1))
6 - access("P"."PRODUCT_ID"="O"."PRODUCT_ID")
Note
-----
- this is an adaptive plan (rows marked '-' are inactive)
A nested loops join is preferable if the database can avoid scanning a significant portion of
prod_info because its rows are filtered by the join predicate. If few rows are filtered,
however, then scanning the right table in a hash join is preferable.
The following graphic shows the adaptive process. For the query in the preceding example,
the adaptive portion of the default plan contains two subplans, each of which uses a different
join method. The optimizer automatically determines when each join method is optimal,
depending on the cardinality of the left side of the join.
The statistics collector buffers enough rows coming from the order_items table to determine
which join method to use. If the row count is below the threshold determined by the optimizer,
then the optimizer chooses the nested loops join; otherwise, the optimizer chooses the hash
join. In this case, the row count coming from the order_items table is above the threshold, so
the optimizer chooses a hash join for the final plan, and disables buffering.
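The buffering decision can be sketched as follows. This is an illustrative sketch only, not Oracle internals: rows from the driving row source are buffered until either the optimizer-determined inflection threshold is crossed (choose the hash join) or the source is exhausted below the threshold (choose the nested loops join).

```python
def choose_join_method(row_source, threshold):
    # Buffer rows until we either cross the threshold or exhaust the source.
    buffered = []
    for row in row_source:
        buffered.append(row)
        if len(buffered) > threshold:
            return 'HASH JOIN'      # too many rows: stop buffering, scan instead
    return 'NESTED LOOPS'           # row count stayed below the threshold

print(choose_join_method(range(500), threshold=100))  # HASH JOIN
print(choose_join_method(range(10), threshold=100))   # NESTED LOOPS
```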
[Figure: The statistics collector buffers rows and chooses between a nested loops join and a hash join. The optimizer disables the statistics collector after making the decision, and lets the rows pass through.]
The Note section of the execution plan indicates whether the plan is adaptive, and
which rows in the plan are inactive.
Broadcast Distribution
The following graphic depicts a hybrid hash join between the departments and employees
tables, with a query coordinator directing 8 parallel server processes: P5-P8 are producers,
whereas P1-P4 are consumers. Each producer has its own consumer.
[Figure 4-9: A hybrid hash join between the departments and employees tables. The query coordinator directs consumers P1-P4 and producers P5-P8, which scan departments and employees.]
The database inserts a statistics collector in front of each producer process scanning
the departments table. The query coordinator aggregates the collected statistics. The
distribution method is based on the run-time statistics. In Figure 4-9, the number of
rows is below the threshold (8), which is twice the DOP (4), so the optimizer chooses a
broadcast technique for the departments table.
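The distribution decision can be sketched in Python. This is an illustrative sketch using the numbers from the example above, not Oracle's actual implementation: when the aggregated row count falls below twice the degree of parallelism, the smaller table is broadcast; otherwise it is hash-distributed.

```python
def choose_distribution(row_count, dop):
    # Threshold from the example above: twice the degree of parallelism.
    threshold = 2 * dop
    return 'BROADCAST' if row_count < threshold else 'HASH'

print(choose_distribution(row_count=7, dop=4))     # BROADCAST (7 < 8)
print(choose_distribution(row_count=1000, dop=4))  # HASH
```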
See Also:
Oracle Database VLDB and Partitioning Guide to learn more about parallel
data redistribution techniques
The following sample execution plan shows that the query generated no rows for the bitmap
node in Step 12 and Step 17. The adaptive optimizer determined that filtering rows by using
the CAR_MODEL_IDX and CAR_MAKE_IDX indexes was inefficient. The query did not use the
steps in the plan that begin with a dash (-).
-----------------------------------------------------------
| Id | Operation | Name |
-----------------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | SORT GROUP BY NOSORT | |
| 2 | HASH JOIN | |
| 3 | VIEW | VW_ST_5497B905 |
| 4 | NESTED LOOPS | |
| 5 | BITMAP CONVERSION TO ROWIDS | |
| 6 | BITMAP AND | |
| 7 | BITMAP MERGE | |
| 8 | BITMAP KEY ITERATION | |
| 9 | TABLE ACCESS FULL | COLORS |
| 10 | BITMAP INDEX RANGE SCAN | CAR_COLOR_IDX |
|- 11 | STATISTICS COLLECTOR | |
|- 12 | BITMAP MERGE | |
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
- star transformation used for this statement
- this is an adaptive plan (rows marked '-' are inactive)
2. Querying the plan in the cursor shows that the estimated rows (E-Rows) are far fewer than the actual rows (A-Rows).
--------------------------------------------------------------------------------------------------
| Id | Operation | Name |Starts|E-Rows|A-Rows|A-Time|Buffers|OMem|1Mem|O/1/M|
--------------------------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 1| | 269 |00:00:00.14|1338| | | |
| 1| NESTED LOOPS | | 1| 1 | 269 |00:00:00.14|1338| | | |
| 2| MERGE JOIN CARTESIAN| | 1| 4 |9135 |00:00:00.05| 33| | | |
|*3| TABLE ACCESS FULL |PRODUCT_INFORMATION| 1| 1 | 87 |00:00:00.01| 32| | | |
| 4| BUFFER SORT | | 87| 105 |9135 |00:00:00.02| 1|4096|4096|1/0/0|
| 5| INDEX FULL SCAN |ORDER_PK | 1| 105 | 105 |00:00:00.01| 1| | | |
|*6| INDEX UNIQUE SCAN |ORDER_ITEMS_UK |9135| 1 | 269 |00:00:00.04|1305| | | |
--------------------------------------------------------------------------------------------------
4. Querying the plan in the cursor shows that the optimizer used statistics feedback (shown
in the Note) for the second execution, and also chose a different plan.
--------------------------------------------------------------------------------------------------
|Id | Operation | Name | Starts |E-Rows|A-Rows|A-Time|Buffers|Reads|OMem|1Mem|O/1/M|
--------------------------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 1| | 269 |00:00:00.05|60|1| | | |
| 1| NESTED LOOPS | | 1|269| 269 |00:00:00.05|60|1| | | |
|*2| HASH JOIN | | 1|313| 269 |00:00:00.05|39|1|1398K|1398K|1/0/0|
|*3| TABLE ACCESS FULL |PRODUCT_INFORMATION| 1| 87| 87 |00:00:00.01|15|0| | | |
| 4| INDEX FAST FULL SCAN|ORDER_ITEMS_UK | 1|665| 665 |00:00:00.01|24|1| | | |
|*5| INDEX UNIQUE SCAN |ORDER_PK |269| 1| 269 |00:00:00.01|21|0| | | |
--------------------------------------------------------------------------------------------------
2 - access("P"."PRODUCT_ID"="O"."PRODUCT_ID")
3 - filter(("MIN_PRICE"<40 AND "LIST_PRICE"<50))
5 - access("O"."ORDER_ID"="ORDER_ID")
Note
-----
- statistics feedback used for this statement
In the preceding output, the estimated number of rows (269) in Step 1 matches the actual
number of rows.
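The feedback mechanism can be sketched as follows. This is an illustrative sketch only: the "significant difference" test here is a hypothetical ratio, whereas Oracle's actual criterion for marking a cursor for re-optimization is internal.

```python
def needs_reoptimization(estimated, actual, factor=8):
    # Hypothetical criterion: re-optimize when the actual row count differs
    # from the estimate by more than the given factor in either direction.
    if estimated == 0:
        return actual > 0
    ratio = actual / estimated
    return ratio > factor or ratio < 1.0 / factor

print(needs_reoptimization(estimated=1, actual=269))    # True: re-parse with feedback
print(needs_reoptimization(estimated=269, actual=269))  # False: estimate now matches
```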
Note:
Even if PARALLEL_DEGREE_POLICY is not set to ADAPTIVE, statistics feedback
may influence the degree of parallelism chosen for a statement.
Chapter 4
About Approximate Query Processing
Note:
Setting OPTIMIZER_ADAPTIVE_STATISTICS to FALSE preserves statistics feedback for
single-table cardinality misestimates.
See Also:
"NDV Algorithms: Adaptive Sampling and HyperLogLog"
Set these parameters at the database or session level. The following table describes
initialization parameters and SQL functions relevant to approximation techniques.
APPROX_COUNT_DISTINCT
Returns the approximate number of rows that contain distinct values of an expression. (Oracle Database SQL Language Reference)

APPROX_COUNT_DISTINCT_AGG
Aggregates the precomputed approximate count distinct synopses to a higher level. (Oracle Database SQL Language Reference)

APPROX_COUNT_DISTINCT_DETAIL
Returns the synopses of the APPROX_COUNT_DISTINCT function as a BLOB. The database can persist the returned result to disk for further aggregation. (Oracle Database SQL Language Reference)

APPROX_MEDIAN
Accepts a numeric or date-time value, and returns an approximate middle or approximate interpolated value that would be the middle value when the values are sorted. This function provides an alternative to the MEDIAN function. (Oracle Database SQL Language Reference)

APPROX_PERCENTILE
Accepts a percentile value and a sort specification, and returns an approximate interpolated value that falls into that percentile value with respect to the sort specification. This function provides an alternative to the PERCENTILE_CONT function. (Oracle Database SQL Language Reference)
Chapter 4
About SQL Plan Management
See Also:
Oracle Database Data Warehousing Guide to learn more about approximate
query processing
Chapter 4
About the Expression Statistics Store (ESS)
plan history and SQL plan baselines of SQL statements. The main objectives are as follows:
• Identify repeatable SQL statements
• Maintain plan history, and possibly SQL plan baselines, for a set of SQL statements
• Detect plans that are not in the plan history
• Detect potentially better plans that are not in the SQL plan baseline
The optimizer uses the normal cost-based search method.
5
Query Transformations
The optimizer employs many query transformation techniques. This chapter describes some
of the most important.
This chapter contains the following topics:
5.1 OR Expansion
In OR expansion, the optimizer transforms a query block containing top-level disjunctions into
the form of a UNION ALL query that contains two or more branches.
The optimizer achieves this goal by splitting the disjunction into its components, and then
associating each component with a branch of a UNION ALL query. The optimizer can choose
OR expansion for various reasons. For example, it may enable more efficient access paths or
alternative join methods that avoid Cartesian products. As always, the optimizer performs the
expansion only if the cost of the transformed statement is lower than the cost of the original
statement.
In previous releases, the optimizer used the CONCATENATION operator to perform the OR
expansion. Starting in Oracle Database 12c Release 2 (12.2), the optimizer uses the UNION-
ALL operator instead. The framework provides the following enhancements:
Chapter 5
OR Expansion
/
COMMIT;
EXEC DBMS_STATS.GATHER_TABLE_STATS ( ownname => 'hr', tabname => 'employees');
You then connect as the user hr, and execute the following query, which joins the
employees and departments tables:
SELECT *
FROM employees e, departments d
WHERE (e.email='SSTILES' OR d.department_name='Treasury')
AND e.department_id = d.department_id;
With OR expansion, the optimizer breaks the disjunctive predicate into two independent
predicates, as shown in the following example:
SELECT *
FROM employees e, departments d
WHERE e.email = 'SSTILES'
AND e.department_id = d.department_id
UNION ALL
SELECT *
FROM employees e, departments d
WHERE d.department_name = 'Treasury'
AND e.department_id = d.department_id;
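The semantics of the transformation can be sketched in plain Python. This is illustrative only, with made-up rows rather than the HR schema data: the second branch excludes rows that the first branch already returned, which is the role the LNNVL filter plays in the execution plan, so the UNION ALL returns exactly the rows of the original disjunction.

```python
# Made-up joined rows, each standing in for an employee/department pair.
rows = [
    {'email': 'SSTILES', 'department_name': 'Treasury'},
    {'email': 'SSTILES', 'department_name': 'IT'},
    {'email': 'AHUNOLD', 'department_name': 'Treasury'},
    {'email': 'AHUNOLD', 'department_name': 'IT'},
]

branch1 = [r for r in rows if r['email'] == 'SSTILES']
branch2 = [r for r in rows
           if r['department_name'] == 'Treasury'
           and not (r['email'] == 'SSTILES')]   # LNNVL-style exclusion

union_all = branch1 + branch2
disjunction = [r for r in rows
               if r['email'] == 'SSTILES' or r['department_name'] == 'Treasury']

print(len(union_all), len(disjunction))  # 3 3: the same result set
```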
-------------------------------------------------------------------------------------------
| Id| Operation                     | Name           |Rows|Bytes|Cost(%CPU)| Time    |
-------------------------------------------------------------------------------------------
| 0 |SELECT STATEMENT               |                |    |     |122(100)  |         |
| 1 | VIEW                          |VW_ORE_19FF4E3E |9102|1679K|122   (5) |00:00:01 |
| 2 |  UNION-ALL                    |                |    |     |          |         |
| 3 |   NESTED LOOPS                |                |  1 |  78 |  4   (0) |00:00:01 |
| 4 |    TABLE ACCESS BY INDEX ROWID| EMPLOYEES      |  1 |  57 |  3   (0) |00:00:01 |
-------------------------------------------------------------------------------------------
Chapter 5
View Merging
5 - access("E"."EMAIL"='SSTILES')
7 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
10 - access("D"."DEPARTMENT_NAME"='Treasury')
11 - filter(LNNVL("E"."EMAIL"='SSTILES'))
12 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
35 rows selected.
Note:
You can use hints to override view merging that is rejected because of cost or heuristics, but you cannot override view merging that is rejected because of validity.
See Also:
Simple view merging frequently results in a better plan because of the additional join orders and access paths available after the merge. A view may not be valid for simple view merging because:
• The view contains constructs not included in select-project-join views, including:
– GROUP BY
– DISTINCT
– Outer join
– MODEL
– CONNECT BY
– Set operators
– Aggregation
• The view appears on the right side of a semijoin or antijoin.
• The view contains subqueries in the SELECT list.
The database can execute the preceding query by joining departments and locations to
generate the rows of the view, and then joining this result to employees. Because the query
contains the view dept_locs_v, and this view contains two tables, the optimizer must use one
of the following join orders:
• employees, dept_locs_v (departments, locations)
• employees, dept_locs_v (locations, departments)
• dept_locs_v (departments, locations), employees
• dept_locs_v (locations, departments), employees
Join methods are also constrained. The index-based nested loops join is not feasible for join
orders that begin with employees because no index exists on the column from this view.
Without view merging, the optimizer generates the following execution plan:
-----------------------------------------------------------------
| Id | Operation | Name | Cost (%CPU)|
-----------------------------------------------------------------
| 0 | SELECT STATEMENT | | 7 (15)|
|* 1 | HASH JOIN | | 7 (15)|
| 2 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES | 2 (0)|
|* 3 | INDEX RANGE SCAN | EMP_NAME_IX | 1 (0)|
| 4 | VIEW | | 5 (20)|
|* 5 | HASH JOIN | | 5 (20)|
| 6 | TABLE ACCESS FULL | LOCATIONS | 2 (0)|
| 7 | TABLE ACCESS FULL | DEPARTMENTS | 2 (0)|
-----------------------------------------------------------------
View merging merges the tables from the view into the outer query block, removing the
inner query block. After view merging, the query is as follows:
Because all three tables appear in one query block, the optimizer can choose from the
following six join orders:
• employees, departments, locations
• employees, locations, departments
• departments, employees, locations
• departments, locations, employees
• locations, employees, departments
• locations, departments, employees
The joins to employees and departments can now be index-based. After view merging,
the optimizer chooses the following more efficient plan, which uses nested loops:
-------------------------------------------------------------------
| Id | Operation | Name | Cost (%CPU)|
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 (0)|
| 1 | NESTED LOOPS | | |
| 2 | NESTED LOOPS | | 4 (0)|
| 3 | NESTED LOOPS | | 3 (0)|
| 4 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES | 2 (0)|
|* 5 | INDEX RANGE SCAN | EMP_NAME_IX | 1 (0)|
| 6 | TABLE ACCESS BY INDEX ROWID| DEPARTMENTS | 1 (0)|
|* 7 | INDEX UNIQUE SCAN | DEPT_ID_PK | 0 (0)|
|* 8 | INDEX UNIQUE SCAN | LOC_ID_PK | 0 (0)|
| 9 | TABLE ACCESS BY INDEX ROWID | LOCATIONS | 1 (0)|
-------------------------------------------------------------------
See Also:
The Oracle Optimizer blog at https://blogs.oracle.com/optimizer/ to
learn about outer join view merging, which is a special case of simple view
merging
The following query finds all of the customers from the United States who have bought at
least 100 fur-trimmed sweaters:
The cust_prod_totals_v view is eligible for complex view merging. After merging, the query
is as follows:
c.cust_last_name,
c.cust_first_name, c.cust_id
HAVING SUM(s.quantity_sold) > 100;
The transformed query is cheaper than the untransformed query, so the optimizer
chooses to merge the view. In the untransformed query, the GROUP BY operator applies
to the entire sales table in the view. In the transformed query, the joins to products
and customers filter out a large portion of the rows from the sales table, so the GROUP
BY operation is lower cost. The join is more expensive because the sales table has not
been reduced, but it is not much more expensive because the GROUP BY operation
does not reduce the size of the row set very much in the original query. If any of the
preceding characteristics were to change, merging the view might no longer be lower
cost. The final plan, which does not include a view, is as follows:
--------------------------------------------------------
| Id | Operation | Name | Cost (%CPU)|
--------------------------------------------------------
| 0 | SELECT STATEMENT | | 2101 (18)|
|* 1 | FILTER | | |
| 2 | HASH GROUP BY | | 2101 (18)|
|* 3 | HASH JOIN | | 2099 (18)|
|* 4 | HASH JOIN | | 1801 (19)|
|* 5 | TABLE ACCESS FULL| PRODUCTS | 96 (5)|
| 6 | TABLE ACCESS FULL| SALES | 1620 (15)|
|* 7 | TABLE ACCESS FULL | CUSTOMERS | 296 (11)|
--------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(SUM("QUANTITY_SOLD")>100)
3 - access("C"."CUST_ID"="CUST_ID")
4 - access("PROD_ID"="P"."PROD_ID")
5 - filter("P"."PROD_NAME"='T3 Faux Fur-Trimmed Sweater')
7 - filter("C"."COUNTRY_ID"='US')
After determining that view merging produces a lower-cost plan, the optimizer rewrites
the query into this equivalent query:
Chapter 5
Predicate Pushing
-------------------------------------------
| Id | Operation | Name |
-------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | VIEW | VM_NWVW_1 |
| 2 | HASH UNIQUE | |
|* 3 | HASH JOIN | |
|* 4 | HASH JOIN | |
|* 5 | TABLE ACCESS FULL| PRODUCTS |
| 6 | TABLE ACCESS FULL| SALES |
|* 7 | TABLE ACCESS FULL | CUSTOMERS |
-------------------------------------------
The preceding plan contains a view named vm_nwvw_1, known as a projection view, even
after view merging has occurred. Projection views appear in queries in which a DISTINCT
view has been merged, or a GROUP BY view is merged into an outer query block that also
contains GROUP BY, HAVING, or aggregates. In the latter case, the projection view contains the
GROUP BY, HAVING, and aggregates from the original outer query block.
In the preceding example of a projection view, when the optimizer merges the view, it moves
the DISTINCT operator to the outer query block, and then adds several additional columns to
maintain semantic equivalence with the original query. Afterward, the query can select only
the desired columns in the SELECT list of the outer query block. The optimization retains all of
the benefits of view merging: all tables are in one query block, the optimizer can permute
them as needed in the final join order, and the DISTINCT operation has been delayed until
after all of the joins complete.
You create a view that references employees and contract_workers. The view is
defined with a query that uses the UNION set operator, as follows:
SELECT last_name
FROM all_employees_vw
WHERE department_id = 50;
Because the view is a UNION set query, the optimizer cannot merge the view's query
into the accessing query block. Instead, the optimizer can transform the accessing
statement by pushing its predicate, the WHERE clause condition department_id=50, into
the view's UNION set query. The equivalent transformed query is as follows:
SELECT last_name
FROM ( SELECT employee_id, last_name, job_id, commission_pct,
department_id
FROM employees
WHERE department_id=50
UNION
SELECT employee_id, last_name, job_id, commission_pct,
department_id
FROM contract_workers
WHERE department_id=50 );
The transformed query can now consider index access in each of the query blocks.
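The benefit of the transformation can be sketched in plain Python. This is an illustrative sketch with made-up table contents, and it uses UNION ALL semantics for simplicity (all rows here are distinct): pushing the filter into each branch produces the same rows as filtering the union afterward, while letting each branch apply the predicate before the union is materialized.

```python
# Made-up rows; the department_id distribution is arbitrary.
employees = [{'last_name': 'E%d' % i, 'department_id': i % 100}
             for i in range(1000)]
contract_workers = [{'last_name': 'C%d' % i, 'department_id': i % 100}
                    for i in range(1000)]

# Pushed: the filter runs inside each branch (where an index could be used).
pushed = ([r for r in employees if r['department_id'] == 50] +
          [r for r in contract_workers if r['department_id'] == 50])

# Not pushed: the union materializes all 2000 rows, then filters.
unpushed = [r for r in employees + contract_workers
            if r['department_id'] == 50]

print(pushed == unpushed)  # True: same rows either way
```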
Chapter 5
Subquery Unnesting
For example, suppose you connect as user sh and execute the following query:
SELECT *
FROM sales
WHERE cust_id IN ( SELECT cust_id
FROM customers );
Because the customers.cust_id column is a primary key, the optimizer can transform the
complex query into the following join statement that is guaranteed to return the same data:
SELECT sales.*
FROM sales, customers
WHERE sales.cust_id = customers.cust_id;
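The reason the primary key matters can be sketched in plain Python with made-up data: because each cust_id value appears at most once among the customers, the join matches each sales row at most once and therefore cannot duplicate rows, making it equivalent to the IN subquery.

```python
# Made-up rows; cust_id 999 has no matching customer.
sales = [{'sale_id': 1, 'cust_id': 101},
         {'sale_id': 2, 'cust_id': 102},
         {'sale_id': 3, 'cust_id': 999}]
customers = {101, 102}   # primary key values: each appears exactly once

in_subquery = [s for s in sales if s['cust_id'] in customers]
join = [s for s in sales for c in customers if s['cust_id'] == c]

print(in_subquery == join)  # True only because cust_id values are unique
```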
If the optimizer cannot transform a complex statement into a join statement, it selects
execution plans for the parent statement and the subquery as though they were separate
statements. The optimizer then executes the subquery and uses the rows returned to execute
the parent query. To improve execution speed of the overall execution plan, the optimizer
orders the subplans efficiently.
See Also:
Oracle Database Data Warehousing Guide to learn more about query rewrite
Chapter 5
Query Rewrite with Materialized Views
Dimensions, constraints, and rewrite integrity levels affect whether a query is rewritten
to use materialized views. Additionally, query rewrite can be enabled or disabled by
REWRITE and NOREWRITE hints and the QUERY_REWRITE_ENABLED session parameter.
Table 5-1 (Cont.) Initialization Parameters that Control Query Rewrite Behavior
Related Topics
• About the Accuracy of Query Rewrite
Query rewrite offers three levels of rewrite integrity that are controlled by the initialization
parameter QUERY_REWRITE_INTEGRITY.
The values that you can set for the QUERY_REWRITE_INTEGRITY parameter are as follows:
• ENFORCED
This is the default mode. The optimizer uses only fresh data from the materialized views and only those relationships that are based on ENABLED VALIDATED primary, unique, or foreign key constraints.
• TRUSTED
In TRUSTED mode, the optimizer trusts that the relationships declared in dimensions and
RELY constraints are correct. In this mode, the optimizer also uses prebuilt materialized
views or materialized views based on views, and it uses relationships that are not
enforced as well as those that are enforced. It also trusts declared but not ENABLED
VALIDATED primary or unique key constraints and data relationships specified using
dimensions. This mode offers greater query rewrite capabilities but also creates
the risk of incorrect results if any of the trusted relationships you have declared are
incorrect.
• STALE_TOLERATED
In STALE_TOLERATED mode, the optimizer uses materialized views that are valid but
contain stale data as well as those that contain fresh data. This mode offers the
maximum rewrite capability but creates the risk of generating inaccurate results.
If rewrite integrity is set to the safest level, ENFORCED, the optimizer uses only enforced
primary key constraints and referential integrity constraints to ensure that the results of
the query are the same as the results when accessing the detail tables directly.
If the rewrite integrity is set to levels other than ENFORCED, there are several situations
where the output with rewrite can be different from that without it:
• A materialized view can be out of synchronization with the master copy of the
data. This generally happens because the materialized view refresh procedure is
pending following bulk load or DML operations to one or more detail tables of a
materialized view. At some data warehouse sites, this situation is desirable
because it is not uncommon for some materialized views to be refreshed at certain
time intervals.
• The relationships implied by the dimension objects are invalid. For example,
values at a certain level in a hierarchy do not roll up to exactly one parent value.
• The values stored in a prebuilt materialized view table might be incorrect.
• A wrong answer can occur because of bad data relationships defined by
unenforced table or view constraints.
You can set QUERY_REWRITE_INTEGRITY either in your initialization parameter file or
using an ALTER SYSTEM or ALTER SESSION statement.
Assume that, in a typical month, the store records around one million sales. The materialized aggregate view therefore contains the precomputed dollar amount sold for each month.
Consider the following query, which asks for the sum of the amount sold at the store
for each calendar month:
Chapter 5
Star Transformation
In the absence of the previous materialized view and the query rewrite feature, Oracle Database must access the sales table directly and compute the sum of the amount sold to return the results. This involves reading many millions of rows from the sales table, which invariably increases query response time because of the disk access. The join in the query slows the response further because it must be computed on many millions of rows.
In the presence of the materialized view cal_month_sales_mv, query rewrite will transparently
rewrite the previous query into the following query:
Because there are only a few dozen rows in the materialized view cal_month_sales_mv and
no joins, Oracle Database returns the results instantly.
[Figure: A star schema with the sales fact table (amount_sold, quantity_sold) joined to the products, times, customers, and channels dimension tables.]
A snowflake schema is a star schema in which the dimension tables reference other tables. A
snowstorm schema is a combination of snowflake schemas.
See Also:
Oracle Database Data Warehousing Guide to learn more about star
schemas
• TEMP_DISABLE
This value is identical to TRUE, except that the optimizer does not attempt temporary table transformation.
See Also:
Oracle Database Reference to learn about the STAR_TRANSFORMATION_ENABLED
initialization parameter
SELECT c.cust_city,
t.calendar_quarter_desc,
SUM(s.amount_sold) sales_amount
FROM sales s,
times t,
customers c,
channels ch
WHERE s.time_id = t.time_id
AND s.cust_id = c.cust_id
AND s.channel_id = ch.channel_id
AND c.cust_state_province = 'CA'
AND ch.channel_desc = 'Internet'
AND t.calendar_quarter_desc IN ('1999-01','1999-02')
GROUP BY c.cust_city, t.calendar_quarter_desc;
In this example, sales is the fact table, and the other tables are dimension tables. The sales
table contains one row for every sale of a product, so it could conceivably contain billions of
sales records. However, only a few products are sold to customers in California through the
Internet for the specified quarters.
Each bit in the bitmap corresponds to a row in the fact table. The bit is set when the key value from the subquery is the same as the value in the row of the fact table. For example, in the bitmap 101000... (the ellipsis indicates that the values for the remaining rows are 0), rows 1 and 3 of the fact table have matching key values from the subquery.
The operations in lines 12, 17, and 22 iterate over the keys from the subqueries and
retrieve the corresponding bitmaps. In Example 5-6, the customers subquery seeks
the IDs of customers whose state or province is CA. Assume that the bitmap 101000...
corresponds to the customer ID key value 103515 from the customers table subquery.
Also assume that the customers subquery produces the key value 103516 with the
bitmap 010000..., which means that only row 2 in sales has a matching key value
from the subquery.
The database merges (using the OR operator) the bitmaps for each subquery (lines 11,
16, 21). In our customers example, the database produces a single bitmap 111000...
for the customers subquery after merging the two bitmaps:
101000... # bitmap for customer key 103515
010000... # bitmap for customer key 103516
---------
111000... # result of OR operation
In line 10, the database applies the AND operator to the merged bitmaps. Assume that after the database has performed all OR operations, the resulting bitmap for channels is 100000... If the database performs an AND operation on this bitmap and the bitmap from the customers subquery (111000...), then the result is 100000..., meaning that only row 1 satisfies the predicates from both subqueries.
In line 9, the database generates the corresponding rowids of the final bitmap. The database retrieves rows from the sales fact table using the rowids (line 26). In our example, the database generates only one rowid, which corresponds to the first row, and thus fetches only a single row instead of scanning the entire sales table.
---------------------------------------------------------------------------
| Id | Operation | Name
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT |
| 1 | HASH GROUP BY |
|* 2 | HASH JOIN |
|* 3 | TABLE ACCESS FULL | CUSTOMERS
|* 4 | HASH JOIN |
|* 5 | TABLE ACCESS FULL | TIMES
| 6 | VIEW | VW_ST_B1772830
| 7 | NESTED LOOPS |
| 8 | PARTITION RANGE SUBQUERY |
| 9 | BITMAP CONVERSION TO ROWIDS|
| 10 | BITMAP AND |
| 11 | BITMAP MERGE |
| 12 | BITMAP KEY ITERATION |
| 13 | BUFFER SORT |
|* 14 | TABLE ACCESS FULL | CHANNELS
|* 15 | BITMAP INDEX RANGE SCAN| SALES_CHANNEL_BIX
| 16 | BITMAP MERGE |
| 17 | BITMAP KEY ITERATION |
| 18 | BUFFER SORT |
|* 19 | TABLE ACCESS FULL | TIMES
|* 20 | BITMAP INDEX RANGE SCAN| SALES_TIME_BIX
| 21 | BITMAP MERGE |
| 22 | BITMAP KEY ITERATION |
| 23 | BUFFER SORT |
|* 24 | TABLE ACCESS FULL | CUSTOMERS
|* 25 | BITMAP INDEX RANGE SCAN| SALES_CUST_BIX
| 26 | TABLE ACCESS BY USER ROWID | SALES
---------------------------------------------------------------------------
2 - access("ITEM_1"="C"."CUST_ID")
3 - filter("C"."CUST_STATE_PROVINCE"='CA')
4 - access("ITEM_2"="T"."TIME_ID")
5 - filter(("T"."CALENDAR_QUARTER_DESC"='1999-01'
OR "T"."CALENDAR_QUARTER_DESC"='1999-02'))
14 - filter("CH"."CHANNEL_DESC"='Internet')
15 - access("S"."CHANNEL_ID"="CH"."CHANNEL_ID")
19 - filter(("T"."CALENDAR_QUARTER_DESC"='1999-01'
OR "T"."CALENDAR_QUARTER_DESC"='1999-02'))
20 - access("S"."TIME_ID"="T"."TIME_ID")
24 - filter("C"."CUST_STATE_PROVINCE"='CA')
25 - access("S"."CUST_ID"="C"."CUST_ID")
Note
-----
- star transformation used for this statement
If the optimizer cannot eliminate the join back, however, then the database stores the
subquery results in a temporary table to avoid rescanning the dimension table for
bitmap key generation and join back. Also, if the query runs in parallel, then the
database materializes the results so that each parallel execution server can select the
results from the temporary table instead of executing the subquery again.
Example 5-8 Star Transformation Using Temporary Table
In this example, the database materializes the results of the subquery on customers
into a temporary table:
These columns correspond to cust_id and cust_city of the customers table. The database
populates the temporary table by executing the following query at the beginning of the
execution of the previous query:
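The population query is not reproduced here; it likely has the following form (a sketch only, because the actual target table name is system-generated):

```sql
-- Hypothetical sketch of the query that materializes the customers
-- subquery into the cursor-duration temporary table.
SELECT c.cust_id, c.cust_city
FROM   customers c
WHERE  c.cust_state_province = 'CA';
```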
Example 5-9 Partial Execution Plan for Star Transformation Using Temporary Table
The following example shows an edited version of the execution plan for the query in
Example 5-8:
---------------------------------------------------------------------------
| Id | Operation | Name
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT |
| 1 | TEMP TABLE TRANSFORMATION |
| 2 | LOAD AS SELECT |
|* 3 | TABLE ACCESS FULL | CUSTOMERS
| 4 | HASH GROUP BY |
|* 5 | HASH JOIN |
| 6 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6613_C716F
|* 7 | HASH JOIN |
|* 8 | TABLE ACCESS FULL | TIMES
| 9 | VIEW | VW_ST_A3F94988
| 10 | NESTED LOOPS |
| 11 | PARTITION RANGE SUBQUERY |
| 12 | BITMAP CONVERSION TO ROWIDS|
| 13 | BITMAP AND |
| 14 | BITMAP MERGE |
| 15 | BITMAP KEY ITERATION |
| 16 | BUFFER SORT |
|* 17 | TABLE ACCESS FULL | CHANNELS
|* 18 | BITMAP INDEX RANGE SCAN| SALES_CHANNEL_BIX
| 19 | BITMAP MERGE |
| 20 | BITMAP KEY ITERATION |
| 21 | BUFFER SORT |
|* 22 | TABLE ACCESS FULL | TIMES
|* 23 | BITMAP INDEX RANGE SCAN| SALES_TIME_BIX
| 24 | BITMAP MERGE |
| 25 | BITMAP KEY ITERATION |
| 26 | BUFFER SORT |
| 27 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6613_C716F
|* 28 | BITMAP INDEX RANGE SCAN| SALES_CUST_BIX
| 29 | TABLE ACCESS BY USER ROWID | SALES
---------------------------------------------------------------------------
3 - filter("C"."CUST_STATE_PROVINCE"='CA')
5 - access("ITEM_1"="C0")
7 - access("ITEM_2"="T"."TIME_ID")
8 - filter(("T"."CALENDAR_QUARTER_DESC"='1999-01' OR
"T"."CALENDAR_QUARTER_DESC"='1999-02'))
17 - filter("CH"."CHANNEL_DESC"='Internet')
18 - access("S"."CHANNEL_ID"="CH"."CHANNEL_ID")
22 - filter(("T"."CALENDAR_QUARTER_DESC"='1999-01' OR
"T"."CALENDAR_QUARTER_DESC"='1999-02'))
23 - access("S"."TIME_ID"="T"."TIME_ID")
28 - access("S"."CUST_ID"="C0")
Lines 1, 2, and 3 of the plan materialize the customers subquery into the temporary
table. In line 27, the database scans the temporary table (instead of the subquery) to
build the bitmap from the fact table. Line 6 scans the temporary table for the join back
instead of scanning customers. The database does not need to apply the filter on
customers to the temporary table because the filter was already applied when the
temporary table was materialized.
See Also:
Oracle Database In-Memory Guide to learn more about in-memory
aggregation
Note:
The metadata for the cursor-duration temporary table stays in memory as long as
the cursor is in memory. The metadata is not stored in the data dictionary, which
means it is not visible through data dictionary views. You cannot drop the metadata
explicitly.
The preceding scenario depends on the availability of memory. For serial queries, the
temporary tables use PGA memory.
The implementation of cursor-duration temporary tables is similar to that of sorts. If no
more memory is available, then the database writes data to temporary segments. For
cursor-duration temporary tables, the differences are as follows:
• The database releases memory and temporary segments at the end of the query rather
than when the row source is no longer active.
• Data in memory stays in memory, unlike in sorts where data can move between memory
and temporary segments.
When the database uses cursor-duration temporary tables, the keyword CURSOR DURATION
MEMORY appears in the execution plan.
WITH
  q1 AS (SELECT department_id, SUM(salary) sum_sal
         FROM hr.employees
         GROUP BY department_id),
  q2 AS (SELECT * FROM q1),
  q3 AS (SELECT department_id, sum_sal FROM q1)
SELECT * FROM q1
UNION ALL
SELECT * FROM q2
UNION ALL
SELECT * FROM q3;
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------
| Id | Operation                                 | Name                      | Rows |Cost (%CPU)|
-----------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                          |                           |      |  6 (100)  |
|  1 |  TEMP TABLE TRANSFORMATION                |                           |      |           |
|  2 |   LOAD AS SELECT (CURSOR DURATION MEMORY) | SYS_TEMP_0FD9D6606_1AE004 |      |           |
|  3 |    HASH GROUP BY                          |                           |  11  |  3  (34)  |
|  4 |     TABLE ACCESS FULL                     | EMPLOYEES                 | 107  |  2   (0)  |
|  5 |   UNION-ALL                               |                           |      |           |
|  6 |    VIEW                                   |                           |  11  |  2   (0)  |
|  7 |     TABLE ACCESS FULL                     | SYS_TEMP_0FD9D6606_1AE004 |  11  |  2   (0)  |
|  8 |    VIEW                                   |                           |  11  |  2   (0)  |
|  9 |     TABLE ACCESS FULL                     | SYS_TEMP_0FD9D6606_1AE004 |  11  |  2   (0)  |
| 10 |    VIEW                                   |                           |  11  |  2   (0)  |
| 11 |     TABLE ACCESS FULL                     | SYS_TEMP_0FD9D6606_1AE004 |  11  |  2   (0)  |
-----------------------------------------------------------------------------------------------
In the preceding plan, TEMP TABLE TRANSFORMATION in Step 1 indicates that the
database used cursor-duration temporary tables to execute the query. The CURSOR
DURATION MEMORY keyword in Step 2 indicates that the database used memory, if
available, to store the results of SYS_TEMP_0FD9D6606_1AE004. If memory was
unavailable, then the database wrote the temporary data to disk.
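A plan such as the preceding one can be displayed immediately after executing the query, for example:

```sql
-- Display the plan of the most recently executed statement in this session.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR);
```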
Assumptions
This scenario assumes the following:
• You want to run a star query against the sh.sales table, which is range-partitioned
on the time_id column.
• You want to disable indexes on specific partitions to see the benefits of table
expansion.
SELECT *
FROM   sales
WHERE  time_id >= TO_DATE('2000-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS')
AND    prod_id = 38;
As shown in the Pstart and Pstop columns in the following plan, the optimizer
determines from the filter that only 16 of the 28 partitions in the table must be
accessed:
------------------------------------------------------------------------
|Id| Operation                                  | Name          |Pstart|Pstop|
------------------------------------------------------------------------
| 0| SELECT STATEMENT                           |               |      |     |
| 1|  PARTITION RANGE ITERATOR                  |               |  13  |  28 |
| 2|   TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| SALES         |  13  |  28 |
| 3|    BITMAP CONVERSION TO ROWIDS             |               |      |     |
|*4|     BITMAP INDEX SINGLE VALUE              | SALES_PROD_BIX|  13  |  28 |
------------------------------------------------------------------------
4 - access("PROD_ID"=38)
After the optimizer has determined the partitions to be accessed, it considers any index
that is usable on all of those partitions. In the preceding plan, the optimizer chose to use
the sales_prod_bix bitmap index.
4. Disable the index on the SALES_1995 partition of the sales table:
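The DDL is not reproduced here; it likely resembles the following (index and partition names follow the sh sample schema):

```sql
-- Mark the bitmap index partition for SALES_1995 unusable so that the
-- optimizer cannot use it for this partition.
ALTER INDEX sales_prod_bix MODIFY PARTITION sales_1995 UNUSABLE;
```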
The preceding DDL disables the index on partition 1, which contains all sales from before
1996.
Note:
You can obtain the partition information by querying the USER_IND_PARTITIONS
view.
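For example, a query of the following form (illustrative) lists the status of each partition of the index:

```sql
-- Check which partitions of the index are usable.
SELECT index_name, partition_name, status
FROM   user_ind_partitions
WHERE  index_name = 'SALES_PROD_BIX'
ORDER  BY partition_position;
```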
5. Execute the query of sales again, and then query DBMS_XPLAN to obtain the plan.
The output shows that the plan did not change:
------------------------------------------------------------------------
|Id| Operation | Name |Pstart|Pstop
------------------------------------------------------------------------
| 0| SELECT STATEMENT | | | |
| 1| PARTITION RANGE ITERATOR | |13|28 |
| 2| TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| SALES |13|28 |
| 3| BITMAP CONVERSION TO ROWIDS | | | |
|*4| BITMAP INDEX SINGLE VALUE | SALES_PROD_BIX|13|28 |
------------------------------------------------------------------------
4 - access("PROD_ID"=38)
The plan is the same because the disabled index partition is not relevant to the query. If
all partitions that the query accesses are indexed, then the database can answer the
query using the index. Because the query accesses only partitions 13 through 28,
disabling the index on partition 1 does not affect the plan.
6. Disable the indexes for partition 28 (SALES_Q4_2003), which is a partition that the
query needs to access:
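The statements are not reproduced here; they likely resemble the following:

```sql
-- Hypothetical sketch: disable both local bitmap index partitions for
-- the last partition that the query accesses.
ALTER INDEX sales_prod_bix MODIFY PARTITION sales_q4_2003 UNUSABLE;
ALTER INDEX sales_time_bix MODIFY PARTITION sales_q4_2003 UNUSABLE;
```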
Because the indexes on a partition that the query must access are now disabled, the
query can no longer use the index (without table expansion).
7. Query the plan using DBMS_XPLAN.
As shown in the following plan, the optimizer does not use the index:
------------------------------------------------------------------------
| Id| Operation                 | Name  |Pstart|Pstop|
------------------------------------------------------------------------
| 0 | SELECT STATEMENT          |       |      |     |
| 1 |  PARTITION RANGE ITERATOR |       |  13  |  28 |
|*2 |   TABLE ACCESS FULL       | SALES |  13  |  28 |
------------------------------------------------------------------------
8. Rewrite the query as a UNION ALL of one query block over the indexed partitions and
one over the unindexed partition:
SELECT *
FROM   sales
WHERE  time_id >= TO_DATE('2000-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS')
AND    time_id <  TO_DATE('2003-10-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS')
AND    prod_id = 38
UNION ALL
SELECT *
FROM   sales
WHERE  time_id >= TO_DATE('2003-10-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS')
AND    time_id <  TO_DATE('2004-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS')
AND    prod_id = 38;
In the preceding query, the first query block in the UNION ALL accesses the partitions that
are indexed, while the second query block accesses the partition that is not. The two
subqueries enable the optimizer to use the index in the first query block if doing so is
more efficient than a full scan of all of the accessed partitions.
9. Query the plan using DBMS_XPLAN.
The plan appears as follows:
------------------------------------------------------------------------
|Id| Operation |Name |Pstart|Pstop|
------------------------------------------------------------------------
| 0|SELECT STATEMENT | | | |
| 1| VIEW |VW_TE_2 | | |
| 2| UNION-ALL | | | |
| 3| PARTITION RANGE ITERATOR | |13| 27|
| 4| TABLE ACCESS BY LOCAL INDEX ROWID BATCHED|SALES |13| 27|
| 5| BITMAP CONVERSION TO ROWIDS | | | |
|*6| BITMAP INDEX SINGLE VALUE |SALES_PROD_BIX|13| 27|
| 7| PARTITION RANGE SINGLE | |28| 28|
|*8| TABLE ACCESS FULL |SALES |28| 28|
------------------------------------------------------------------------
6 - access("PROD_ID"=38)
8 - filter("PROD_ID"=38)
As shown in the preceding plan, the optimizer uses a UNION ALL for two query blocks
(Step 2). The optimizer chooses an index to access partitions 13 to 27 in the first query
block (Step 6). Because no index is available for partition 28, the optimizer chooses a full
table scan in the second query block (Step 8).
Assumptions
This scenario assumes the following:
• You query the same schema used in "Star Transformation: Scenario".
• The last partition of sales is actively being updated, as is often the case with time-
partitioned tables.
• You want the optimizer to take advantage of table expansion.
3. Query the cursor using DBMS_XPLAN, which shows the following plan:
---------------------------------------------------------------------------
|Id| Operation | Name | Pstart| Pstop |
---------------------------------------------------------------------------
| 0| SELECT STATEMENT | | | |
| 1| HASH GROUP BY | | | |
| 2| VIEW |VW_TE_14 | | |
| 3| UNION-ALL | | | |
| 4| HASH JOIN | | | |
| 5| TABLE ACCESS FULL |TIMES | | |
| 6| VIEW |VW_ST_1319B6D8 | | |
| 7| NESTED LOOPS | | | |
| 8| PARTITION RANGE SUBQUERY | |KEY(SQ)|KEY(SQ)|
| 9| BITMAP CONVERSION TO ROWIDS| | | |
|10| BITMAP AND | | | |
|11| BITMAP MERGE | | | |
|12| BITMAP KEY ITERATION | | | |
|13| BUFFER SORT | | | |
|14| TABLE ACCESS FULL |CHANNELS | | |
|15| BITMAP INDEX RANGE SCAN|SALES_CHANNEL_BIX|KEY(SQ)|KEY(SQ)|
|16| BITMAP MERGE | | | |
|17| BITMAP KEY ITERATION | | | |
|18| BUFFER SORT | | | |
|19| TABLE ACCESS FULL |TIMES | | |
|20| BITMAP INDEX RANGE SCAN|SALES_TIME_BIX |KEY(SQ)|KEY(SQ)|
|21| BITMAP MERGE | | | |
|22| BITMAP KEY ITERATION | | | |
|23| BUFFER SORT | | | |
|24| TABLE ACCESS FULL |CUSTOMERS | | |
The preceding plan uses table expansion. The UNION ALL branch that is accessing every
partition except the last partition uses star transformation. Because the indexes on
partition 28 are disabled, the database accesses the final partition using a full table scan.
In the preceding query, table t1 appears in both UNION ALL branches, as does the filter
predicate t1.c1 > 1 and the join predicate t1.c1 = t2.c1. Without any
transformation, the database must perform the scan and the filtering on table t1 twice,
once for each branch.
Example 5-11 Factorized Query
This is a factorized version of the query in Example 5-10. Because table t1 is factorized,
the database performs the table scan and the filtering on t1 only once. If t1 is large,
then this factorization avoids the huge performance cost of scanning and filtering t1
twice.
Note:
If the branches in a UNION ALL query have clauses that use the DISTINCT
keyword, then join factorization is not valid.
SELECT *
FROM t5, (SELECT t1.c1, t2.c2
          FROM t1, t2, t3
          WHERE t1.c1 = t2.c1
          AND t1.c1 > 1
          AND t2.c2 = 2
          AND t2.c2 = t3.c2
          UNION ALL
          SELECT t1.c1, t2.c2
          FROM t1, t2, t4
          WHERE t1.c1 = t2.c1
          AND t1.c1 > 1
          AND t2.c3 = t4.c3) V
WHERE t5.c1 = V.c1
Example 5-13 Factorization of t1 from View V
If join factorization factorizes t1 from view V, as shown in the following query, then the
database can join t1 with t5:
SELECT *
FROM t5, ( SELECT t1.c1, VW_JF_1.item_2
FROM t1, (SELECT t2.c1 item_1, t2.c2 item_2
FROM t2, t3
WHERE t2.c2 = t3.c2
AND t2.c2 = 2
UNION ALL
SELECT t2.c1 item_1, t2.c2 item_2
FROM t2, t4
WHERE t2.c3 = t4.c3) VW_JF_1
WHERE t1.c1 = VW_JF_1.item_1
                 AND t1.c1 > 1) V
WHERE t5.c1 = V.c1
The preceding query transformation opens up new join orders. However, join factorization
imposes specific join orders. For example, in the preceding query, tables t2 and t3 appear in
the first branch of the UNION ALL query in view VW_JF_1. The database must join t2 with t3
before it can join with t1, which is not defined within the VW_JF_1 view. The imposed join
order may not necessarily be the best join order. For this reason, the optimizer performs join
factorization using the cost-based transformation framework. The optimizer calculates the
cost of the plans with and without join factorization, and then chooses the cheapest plan.
Example 5-14 Factorization of t1 from View V with View Definition Removed
The following query is the same query in Example 5-13, but with the view definition removed
so that the factorization is easier to see:
SELECT *
FROM t5, (SELECT t1.c1, VW_JF_1.item_2
FROM t1, VW_JF_1
WHERE t1.c1 = VW_JF_1.item_1
          AND t1.c1 > 1) V
WHERE t5.c1 = V.c1
For example, join factorization can transform a UNION ALL query by factorizing t2. After
the transformation, table t2 no longer appears in the UNION ALL branches of the
subquery.
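The transformation can be sketched with hypothetical tables (column names are illustrative; the view name mimics the VW_JF_n naming convention shown earlier):

```sql
-- Before: t2 is scanned and joined in both UNION ALL branches.
SELECT t2.c1, t1.c2 FROM t1, t2 WHERE t1.c1 = t2.c1 AND t1.c2 = 1
UNION ALL
SELECT t2.c1, t1.c2 FROM t1, t2 WHERE t1.c1 = t2.c1 AND t1.c2 = 2;

-- After join factorization: t2 is factorized out and scanned only once.
SELECT VW_JF_1.item_1, VW_JF_1.item_2
FROM   t2, (SELECT t1.c1 item_1, t1.c2 item_2
            FROM   t1 WHERE t1.c2 = 1
            UNION ALL
            SELECT t1.c1 item_1, t1.c2 item_2
            FROM   t1 WHERE t1.c2 = 2) VW_JF_1
WHERE  VW_JF_1.item_1 = t2.c1;
```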
Part III
Query Execution Plans
If a query has suboptimal performance, the execution plan is the key tool for understanding
the problem and supplying a solution.
This part contains the following chapters:
6
Generating and Displaying Execution Plans
A thorough understanding of execution plans is essential to SQL tuning.
This chapter contains the following topics:
6-1
Chapter 6
About Plan Generation and Display
Note:
To avoid possible SQL performance regression that may result from
execution plan changes, consider using SQL plan management.
V$SQL_PLAN contains the execution plan for every statement stored in the shared SQL
area. Its definition is similar to PLAN_TABLE.
The advantage of V$SQL_PLAN over EXPLAIN PLAN is that you do not need to know the
compilation environment that was used to execute a particular statement. For EXPLAIN
PLAN, you would need to set up an identical environment to get the same plan when
executing the statement.
The V$SQL_PLAN_STATISTICS view provides the actual execution statistics for every
operation in the plan, such as the number of output rows and elapsed time. All
statistics, except the number of output rows, are cumulative. For example, the
statistics for a join operation also include the statistics for its two inputs. The statistics
in V$SQL_PLAN_STATISTICS are available for cursors that have been compiled with the
STATISTICS_LEVEL initialization parameter set to ALL.
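For example, a query of the following form (a sketch; substitute the SQL ID of interest for the substitution variable) correlates each plan operation with its actual row counts:

```sql
-- Join the plan with its per-operation execution statistics.
SELECT p.id, p.operation, p.object_name,
       s.last_output_rows, s.last_elapsed_time
FROM   v$sql_plan p, v$sql_plan_statistics s
WHERE  p.sql_id       = s.sql_id
AND    p.child_number = s.child_number
AND    p.id           = s.operation_id
AND    p.sql_id       = '&sql_id'   -- SQL*Plus substitution variable
ORDER  BY p.id;
```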
See Also:
• "PLAN_TABLE Columns"
• "Monitoring Database Operations " for information about the
V$SQL_PLAN_MONITOR view
• Oracle Database Reference for more information about V$SQL_PLAN
views
• Oracle Database Reference for information about the STATISTICS_LEVEL
initialization parameter
6-4
Chapter 6
Generating Plan Output Using the EXPLAIN PLAN Statement
PLAN_TABLE is the default output table into which the EXPLAIN PLAN statement inserts rows
describing execution plans. This global temporary table holds the output of EXPLAIN PLAN
statements for all users.
While a PLAN_TABLE table is automatically set up for each user, you can use the SQL script
catplan.sql to manually create the global temporary table and the PLAN_TABLE synonym.
The name and location of this script depends on your operating system. On UNIX and Linux,
the script is located in the $ORACLE_HOME/rdbms/admin directory.
For example, start a SQL*Plus session, connect with SYSDBA privileges, and run the script as
follows:
@$ORACLE_HOME/rdbms/admin/catplan.sql
Oracle recommends that you drop and rebuild your local PLAN_TABLE table after upgrading
the database because the table columns might change. Outdated columns can cause your
scripts, or TKPROF if you specify the table, to fail.
If you do not want to use the name PLAN_TABLE, create a new synonym after running the
catplan.sql script. For example:
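The statement likely resembles the following (the synonym name is illustrative; plan_table$ is the global temporary table created by catplan.sql):

```sql
-- Point a synonym of your choosing at the global temporary table.
CREATE OR REPLACE PUBLIC SYNONYM my_plan_table FOR plan_table$;
```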
• Use the SQL script catplan.sql to create a sample output table called
PLAN_TABLE in your schema.
• Include the EXPLAIN PLAN FOR clause before the SQL statement.
• After issuing the EXPLAIN PLAN statement, use a script or package provided by
Oracle Database to display the most recent plan table output.
• The execution order in EXPLAIN PLAN output begins with the line that is the furthest
indented to the right. The next step is the parent of that line. If two lines are
indented equally, then the top line is normally executed first.
Note:
– The EXPLAIN PLAN output tables in this chapter were displayed with
the utlxpls.sql script.
– The steps in the EXPLAIN PLAN output in this chapter may be
different on your database. The optimizer may choose different
execution plans, depending on database configurations.
To explain a SQL statement, use the EXPLAIN PLAN FOR clause immediately before the
statement. For example:
EXPLAIN PLAN FOR
SELECT last_name FROM employees;
The preceding statement explains the plan and stores the output in the PLAN_TABLE table.
You can then select the execution plan from PLAN_TABLE.
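For example, the simplest way to display the most recent plan is:

```sql
-- Show the most recent EXPLAIN PLAN output stored in PLAN_TABLE.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```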
6-6
Chapter 6
Displaying PLAN_TABLE Output
EXPLAIN PLAN
SET STATEMENT_ID = 'st1' FOR
SELECT last_name FROM employees;
EXPLAIN PLAN
INTO my_plan_table FOR
SELECT last_name FROM employees;
You can specify a statement ID when using the INTO clause, as in the following statement:
EXPLAIN PLAN
SET STATEMENT_ID = 'st1'
INTO my_plan_table FOR
SELECT last_name FROM employees;
See Also:
Oracle Database SQL Language Reference for a complete description of EXPLAIN
PLAN syntax.
SELECT PLAN_TABLE_OUTPUT
FROM TABLE(DBMS_XPLAN.DISPLAY('MY_PLAN_TABLE', 'st1','TYPICAL'));
See Also:
Oracle Database PL/SQL Packages and Types Reference for more
information about the DBMS_XPLAN package
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 189 | 10 (10)|
| 1 | NESTED LOOPS | | 3 | 189 | 10 (10)|
| 2 | NESTED LOOPS | | 3 | 141 | 7 (15)|
|* 3 | TABLE ACCESS FULL | EMPLOYEES | 3 | 60 | 4 (25)|
| 4 | TABLE ACCESS BY INDEX ROWID| JOBS | 19 | 513 | 2 (50)|
|* 5 | INDEX UNIQUE SCAN | JOB_ID_PK | 1 | | |
-----------------------------------------------------------------------------------------------
| Id | Operation                       | Name          | Rows | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                |               |    3 |   189 |     8  (13)| 00:00:01 |
|  1 |  NESTED LOOPS                   |               |      |       |            |          |
|  2 |   NESTED LOOPS                  |               |    3 |   189 |     8  (13)| 00:00:01 |
|  3 |    MERGE JOIN                   |               |    3 |   141 |     5  (20)| 00:00:01 |
|  4 |     TABLE ACCESS BY INDEX ROWID | JOBS          |   19 |   513 |     2   (0)| 00:00:01 |
|  5 |      INDEX FULL SCAN            | JOB_ID_PK     |   19 |       |     1   (0)| 00:00:01 |
|* 6 |     SORT JOIN                   |               |    3 |    60 |     3  (34)| 00:00:01 |
|  7 |      TABLE ACCESS BY INDEX ROWID| EMPLOYEES     |    3 |    60 |     2   (0)| 00:00:01 |
|* 8 |       INDEX RANGE SCAN          | EMP_EMP_ID_PK |    3 |       |     1   (0)| 00:00:01 |
|* 9 |    INDEX UNIQUE SCAN            | DEPT_ID_PK    |    1 |       |     0   (0)| 00:00:01 |
| 10 |   TABLE ACCESS BY INDEX ROWID   | DEPARTMENTS   |    1 |    16 |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
6 - access("E"."JOB_ID"="J"."JOB_ID")
filter("E"."JOB_ID"="J"."JOB_ID")
8 - access("E"."EMPLOYEE_ID"<103)
9 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
For example:
• Start with ID = 0 and given STATEMENT_ID.
• Use the CONNECT BY clause to walk the tree from parent to child, the join keys
being STATEMENT_ID = PRIOR STATEMENT_ID and PARENT_ID = PRIOR ID.
• Use the pseudo-column LEVEL (associated with CONNECT BY) to indent the children.
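A query that implements these steps might look like the following (a sketch; 'st1' is the statement ID used earlier):

```sql
-- Walk the plan tree from the root (ID = 0), indenting each child.
SELECT LPAD(' ', 2*(LEVEL-1)) || operation || ' ' || options || ' ' ||
       object_name AS plan
FROM   plan_table
START  WITH id = 0 AND statement_id = 'st1'
CONNECT BY PRIOR id = parent_id AND statement_id = 'st1';
```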
Rows Plan
------- ----------------------------------------
SELECT STATEMENT
TABLE ACCESS FULL EMPLOYEES
The NULL in the Rows column indicates that the optimizer does not have any
statistics on the table. Analyzing the table shows the following:
Rows Plan
------- ----------------------------------------
16957 SELECT STATEMENT
16957 TABLE ACCESS FULL EMPLOYEES
You can also select the COST. This is useful for comparing execution plans or for
understanding why the optimizer chooses one execution plan over another.
Note:
These simplified examples are not valid for recursive SQL.
7
Reading Execution Plans
Execution plans are represented as a tree of operations.
This chapter contains the following topics:
SELECT PLAN_TABLE_OUTPUT
FROM TABLE(DBMS_XPLAN.DISPLAY(NULL, 'statement_id','BASIC'));
Examples of the output from this statement are shown in Example 7-4 and Example 7-1.
Example 7-1 EXPLAIN PLAN for Statement ID ex_plan1
The following plan shows the execution of a SELECT statement. The table employees is
accessed using a full table scan. Every row in the table employees is accessed, and the
WHERE clause criterion is evaluated for every row.
EXPLAIN PLAN
SET statement_id = 'ex_plan1' FOR
SELECT phone_number
FROM employees
WHERE phone_number LIKE '650%';
---------------------------------------
| Id | Operation | Name |
---------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | TABLE ACCESS FULL| EMPLOYEES |
---------------------------------------
EXPLAIN PLAN
SET statement_id = 'ex_plan2' FOR
SELECT last_name
FROM employees
WHERE last_name LIKE 'Pe%';
SELECT PLAN_TABLE_OUTPUT
FROM TABLE(DBMS_XPLAN.DISPLAY(NULL, 'ex_plan2','BASIC'));
7-1
Chapter 7
Reading Execution Plans: Advanced
----------------------------------------
| Id | Operation | Name |
----------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | INDEX RANGE SCAN| EMP_NAME_IX |
----------------------------------------
Assumptions
This tutorial assumes the following:
• The STATISTICS_LEVEL initialization parameter is set to ALL.
• The database uses the default settings for adaptive execution.
• As user oe, you want to issue the following separate queries:
SELECT o.order_id, v.product_name
FROM orders o,
( SELECT order_id, product_name
FROM order_items o, product_information p
WHERE p.product_id = o.product_id
AND list_price < 50
AND min_price < 40 ) v
WHERE o.order_id = v.order_id
SELECT product_name
FROM order_items o, product_information p
WHERE o.unit_price = 15
AND quantity > 1
AND p.product_id = o.product_id
• Before executing each query, you want to query DBMS_XPLAN.DISPLAY_PLAN to see
the default plan, that is, the plan that the optimizer chose before applying its
adaptive mechanism.
• After executing each query, you want to query DBMS_XPLAN.DISPLAY_CURSOR to see the
final plan and adaptive query plan.
• SYS has granted oe the following privileges:
– GRANT SELECT ON V_$SESSION TO oe
– GRANT SELECT ON V_$SQL TO oe
– GRANT SELECT ON V_$SQL_PLAN TO oe
– GRANT SELECT ON V_$SQL_PLAN_STATISTICS_ALL TO oe
The following sample output has been reformatted to fit on the page. In this plan, the
optimizer chooses a nested loops join. The original optimizer estimates are shown in the
E-Rows column, whereas the actual statistics gathered during execution are shown in the
A-Rows column. In the MERGE JOIN operation, the difference between the estimated and
actual number of rows is significant.
--------------------------------------------------------------------------------------------
|Id| Operation | Name |Start|E-Rows|A-Rows|A-Time|Buff|OMem|1Mem|O/1/M|
--------------------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 1| | 269|00:00:00.09|1338| | | |
| 1| NESTED LOOPS | | 1| 1| 269|00:00:00.09|1338| | | |
| 2| MERGE JOIN CARTESIAN| | 1| 4|9135|00:00:00.03| 33| | | |
|*3| TABLE ACCESS FULL |PRODUCT_INFORMAT| 1| 1| 87|00:00:00.01| 32| | | |
| 4| BUFFER SORT | | 87|105|9135|00:00:00.01| 1|4096|4096|1/0/0|
| 5| INDEX FULL SCAN | ORDER_PK | 1|105| 105|00:00:00.01| 1| | | |
|*6| INDEX UNIQUE SCAN | ORDER_ITEMS_UK |9135| 1| 269|00:00:00.03|1305| | | |
--------------------------------------------------------------------------------------------
5. View the execution plan in the cursor by using the same SELECT statement that you
ran in Step 3.
The following example shows that the optimizer has chosen a different plan, using
a hash join. The Note section shows that the optimizer used statistics feedback to
adjust its cost estimates for the second execution of the query, thus illustrating
automatic reoptimization.
--------------------------------------------------------------------------------------------
|Id| Operation |Name |Start|E-Rows|A-Rows|A-Time|Buff|Reads|OMem|1Mem|O/1/M|
--------------------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | 1 | |269|00:00:00.02|60|1| | | |
| 1| NESTED LOOPS | | 1 |269|269|00:00:00.02|60|1| | | |
|*2| HASH JOIN | | 1 |313|269|00:00:00.02|39|1|1000K|1000K|1/0/0|
|*3| TABLE ACCESS FULL |PRODUCT_INFORMA| 1 | 87| 87|00:00:00.01|15|0| | | |
| 4| INDEX FAST FULL SCAN|ORDER_ITEMS_UK | 1 |665|665|00:00:00.01|24|1| | | |
|*5| INDEX UNIQUE SCAN |ORDER_PK |269| 1|269|00:00:00.01|21|0| | | |
--------------------------------------------------------------------------------------------
2 - access("P"."PRODUCT_ID"="O"."PRODUCT_ID")
3 - filter(("MIN_PRICE"<40 AND "LIST_PRICE"<50))
5 - access("O"."ORDER_ID"="ORDER_ID")
Note
-----
- statistics feedback used for this statement
6. Query V$SQL to verify the performance improvement.
The following query shows the performance of the two statements (sample output
included).
SELECT CHILD_NUMBER, CPU_TIME, ELAPSED_TIME, BUFFER_GETS
FROM V$SQL
WHERE SQL_ID = 'gm2npz344xqn8';
The second statement executed, which is child number 1, used statistics feedback.
CPU time, elapsed time, and buffer gets are all significantly lower.
7. Explain the plan for the query of order_items.
For example, use the following statement:
EXPLAIN PLAN FOR
SELECT product_name
FROM order_items o, product_information p
WHERE o.unit_price = 15
AND quantity > 1
AND p.product_id = o.product_id
8. View the plan in the plan table.
For example, run the following statement:
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
Sample output appears below. Based on statistics collected at run time (Step 4), the
optimizer chose a hash join rather than the nested loops join. The dashes (-) indicate the
steps in the nested loops plan that the optimizer considered but did not ultimately choose.
The switch illustrates the adaptive query plan feature.
-------------------------------------------------------------------------------
|Id | Operation | Name |Rows|Bytes|Cost(%CPU)|Time |
-------------------------------------------------------------------------------
| 0| SELECT STATEMENT | |4|128|7(0)|00:00:01|
| *1| HASH JOIN | |4|128|7(0)|00:00:01|
|- 2| NESTED LOOPS | | | | | |
|- 3| NESTED LOOPS | | |128|7(0)|00:00:01|
|- 4| STATISTICS COLLECTOR | | | | | |
| *5| TABLE ACCESS FULL | ORDER_ITEMS |4| 48|3(0)|00:00:01|
|-*6| INDEX UNIQUE SCAN | PRODUCT_INFORMATI_PK|1| |0(0)|00:00:01|
|- 7| TABLE ACCESS BY INDEX ROWID| PRODUCT_INFORMATION |1| 20|1(0)|00:00:01|
| 8| TABLE ACCESS FULL | PRODUCT_INFORMATION |1| 20|1(0)|00:00:01|
-------------------------------------------------------------------------------
Note
-----
- this is an adaptive plan (rows marked '-' are inactive)
In this example, customer is the smallest table, whereas transaction is the largest
table. A typical OLTP query retrieves transaction information about a specific customer
account. The query drives from the customer table. The goal is to minimize logical I/O,
which typically minimizes other critical resources including physical I/O and CPU time.
For parallel queries, the driving table is usually the largest table. It would not be
efficient to use parallel query in this case because only a few rows from each table are
accessed. However, what if it were necessary to identify all customers who had transactions
of a certain type last month? It would be more efficient to drive from the transaction table
because no limiting conditions exist on the customer table. The database would join rows
from the transaction table to the account table, and then finally join the result set to the
customer table. In this case, the indexes used on the account and customer tables are probably highly
selective primary key or unique indexes rather than the non-unique indexes used in the first
query. Because the transaction table is large and the column is not selective, it would be
beneficial to use parallel query driving from the transaction table.
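A query of the kind described above might be written as follows. All object and column names here are hypothetical, not objects from the sample schemas:

```sql
-- Hypothetical sketch: the tables (transactions, accounts, customers) and
-- columns (transaction_type, transaction_date) are assumptions.
SELECT /*+ PARALLEL(t) */ c.customer_id
FROM   transactions t, accounts a, customers c
WHERE  t.transaction_type = 'DEPOSIT'
AND    t.transaction_date >= ADD_MONTHS(TRUNC(SYSDATE, 'MM'), -1)
AND    a.account_id       = t.account_id
AND    c.customer_id      = a.customer_id;
```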
See Also:
The OTHER_TAG column in "PLAN_TABLE Columns"
-------------------------------------------------------------------------------------
|Id | Operation | Name |Rows| Bytes |Cost %CPU| TQ |IN-OUT|PQ Distrib|
-------------------------------------------------------------------------------------
|0| SELECT STATEMENT | |107| 2782 | 3 (34) | | | |
|1| PX COORDINATOR | | | | | | | |
|2| PX SEND QC (RANDOM) |:TQ10001|107| 2782 | 3 (34) | Q1,01 | P->S |QC (RAND) |
|3| HASH GROUP BY | |107| 2782 | 3 (34) | Q1,01 | PCWP | |
|4| PX RECEIVE | |107| 2782 | 3 (34) | Q1,01 | PCWP | |
|5| PX SEND HASH |:TQ10000|107| 2782 | 3 (34) | Q1,00 | P->P |HASH |
|6| HASH GROUP BY | |107| 2782 | 3 (34) | Q1,00 | PCWP | |
|7| PX BLOCK ITERATOR | |107| 2782 | 2 (0) | Q1,00 | PCWP | |
|8| TABLE ACCESS FULL|EMP2 |107| 2782 | 2 (0) | Q1,00 | PCWP | |
-------------------------------------------------------------------------------------
One set of parallel execution servers scans EMP2 in parallel, while the second set
performs the aggregation for the GROUP BY operation. The PX BLOCK ITERATOR row
source represents the splitting up of the table EMP2 into pieces to divide the scan
workload between the parallel execution servers. The PX SEND and PX RECEIVE row
sources represent the pipe that connects the two sets of parallel execution servers as
rows flow up from the parallel scan, are repartitioned through the HASH table queue, and
are then read and aggregated by the top set. The PX SEND QC row source
represents the aggregated values being sent to the QC in random (RAND) order. The
PX COORDINATOR row source represents the QC or Query Coordinator which controls
and schedules the parallel plan appearing below it in the plan tree.
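The statement that produced this plan is not shown in this excerpt; a parallel aggregation of this shape might look like the following, where the column names and the degree of parallelism are assumptions:

```sql
-- Hypothetical statement: emp2 is assumed to be a copy of the employees
-- table; the parallel degree of 2 is an assumption.
SELECT /*+ PARALLEL(emp2 2) */ department_id, SUM(salary)
FROM   emp2
GROUP  BY department_id;
```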
Note:
Queries that use a bitmap join index indicate the bitmap join index access path.
The operation for a bitmap join index is the same as for a bitmap index.
SELECT STATEMENT
TABLE ACCESS T BY INDEX ROWID
BITMAP CONVERSION TO ROWID
BITMAP OR
BITMAP MINUS
BITMAP MINUS
BITMAP INDEX C1_IND SINGLE VALUE
BITMAP INDEX C2_IND SINGLE VALUE
BITMAP INDEX C2_IND SINGLE VALUE
BITMAP MERGE
BITMAP INDEX C3_IND RANGE SCAN
To view the EXPLAIN PLAN for this query, use the following command:
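The command itself does not appear in this excerpt; the standard way to display the most recently explained statement is:

```sql
SELECT PLAN_TABLE_OUTPUT
FROM   TABLE(DBMS_XPLAN.DISPLAY());
```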
The EXPLAIN PLAN output for this query should look similar to the following:
----------------------------------------------------------------------------------------
| Id | Operation           | Name                       |Rows|Bytes|Cost(%CPU)| Time    |
----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT    |                            | 11 |  77 |  4  (25) |00:00:01 |
|  1 |  RESULT CACHE       | b06ppfz9pxzstbttpbqyqnfbmy |    |     |          |         |
|  2 |   HASH GROUP BY     |                            | 11 |  77 |  4  (25) |00:00:01 |
|  3 |    TABLE ACCESS FULL| EMP                        |107 | 749 |  3   (0) |00:00:01 |
----------------------------------------------------------------------------------------
In this EXPLAIN PLAN, the ResultCache operator is identified by its CacheId, which is
b06ppfz9pxzstbttpbqyqnfbmy. You can now run a query on the
V$RESULT_CACHE_OBJECTS view by using this CacheId.
A join is implemented using partial partition-wise join if the DISTRIBUTION column of the
plan table of one of the joined tables contains PARTITION(KEY). Partial partition-wise
join is possible if one of the joined tables is partitioned on its join column and the table
is parallelized.
A join is implemented using full partition-wise join if the partition row source appears
before the join row source in the EXPLAIN PLAN output. Full partition-wise joins are
possible only if both joined tables are equipartitioned on their respective join columns.
Examples of execution plans for several types of partitioning follow.
This section contains the following topics:
Assume that the tables employees and departments from the Oracle Database sample
schema exist.
YYYY')),
PARTITION emp_p4 VALUES LESS THAN (TO_DATE('1-JAN-1998','DD-MON-YYYY')),
PARTITION emp_p5 VALUES LESS THAN (TO_DATE('1-JAN-2001','DD-MON-YYYY'))
)
AS SELECT * FROM employees;
--------------------------------------------------------------------
|Id| Operation | Name |Rows| Bytes|Cost|Pstart|Pstop|
--------------------------------------------------------------------
| 0| SELECT STATEMENT | | 105| 13965 | 2 | | |
| 1| PARTITION RANGE ALL| | 105| 13965 | 2 | 1 | 5 |
| 2| TABLE ACCESS FULL | EMP_RANGE | 105| 13965 | 2 | 1 | 5 |
--------------------------------------------------------------------
The database creates a partition row source on top of the table access row source. It iterates
over the set of partitions to be accessed. In this example, the partition iterator covers all
partitions (option ALL), because a predicate was not used for pruning. The PARTITION_START
and PARTITION_STOP columns of the PLAN_TABLE show access to all partitions from 1 to 5.
-----------------------------------------------------------------------
| Id | Operation | Name |Rows|Bytes|Cost|Pstart|Pstop|
-----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 399 | 2 | | |
| 1 | PARTITION RANGE ITERATOR| | 3 | 399 | 2 | 4 | 5 |
| *2 | TABLE ACCESS FULL |EMP_RANGE| 3 | 399 | 2 | 4 | 5 |
-----------------------------------------------------------------------
In the previous example, the partition row source iterates from partition 4 to 5 because the
database prunes the other partitions using a predicate on hire_date.
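The query itself is not shown in this excerpt. Given the partition bounds above, a predicate that prunes to partitions 4 and 5 might look like this; the cutoff date is an assumption:

```sql
-- Hypothetical query; any hire_date cutoff falling within partition 4's
-- bounds would produce the same pruning to partitions 4 and 5.
SELECT *
FROM   emp_range
WHERE  hire_date >= TO_DATE('1-JAN-1996', 'DD-MON-YYYY');
```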
-----------------------------------------------------------------------
| Id | Operation             | Name      |Rows|Bytes|Cost|Pstart|Pstop|
-----------------------------------------------------------------------
|  0 | SELECT STATEMENT      |           |  1 | 133 |  2 |      |     |
|  1 | PARTITION RANGE SINGLE|           |  1 | 133 |  2 |   1  |  1  |
|* 2 |  TABLE ACCESS FULL    | EMP_RANGE |  1 | 133 |  2 |   1  |  1  |
-----------------------------------------------------------------------
In the previous example, only partition 1 is accessed and known at compile time; thus,
there is no need for a partition row source.
Note:
Oracle Database displays the same information for hash partitioned objects,
except the partition row source name is PARTITION HASH instead of
PARTITION RANGE. Also, with hash partitioning, pruning is only possible using
equality or IN-list predicates.
-----------------------------------------------------------------------
|Id| Operation | Name | Rows | Bytes |Cost|Pstart|Pstop|
-----------------------------------------------------------------------
| 0| SELECT STATEMENT | | 10120 | 1314K| 78 | | |
| 1| PARTITION RANGE ALL| | 10120 | 1314K| 78 | 1 | 5 |
| 2| PARTITION HASH ALL| | 10120 | 1314K| 78 | 1 | 3 |
| 3| TABLE ACCESS FULL| EMP_COMP | 10120 | 1314K| 78 | 1 | 15 |
-----------------------------------------------------------------------
This example shows the plan when Oracle Database accesses all subpartitions of all
partitions of a composite object. The database uses two partition row sources for this
purpose: a range partition row source to iterate over the partitions, and a hash partition row
source to iterate over the subpartitions of each accessed partition.
In the following example, the range partition row source iterates from partition 1 to 5, because
the database performs no pruning. Within each partition, the hash partition row source
iterates over subpartitions 1 to 3 of the current partition. As a result, the table access row
source accesses subpartitions 1 to 15. In other words, the database accesses all
subpartitions of the composite object.
-----------------------------------------------------------------------
| Id | Operation | Name |Rows|Bytes |Cost|Pstart|Pstop|
-----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20 | 2660 | 17 | | |
| 1 | PARTITION RANGE SINGLE| | 20 | 2660 | 17 | 5 | 5 |
| 2 | PARTITION HASH ALL | | 20 | 2660 | 17 | 1 | 3 |
|* 3 | TABLE ACCESS FULL | EMP_COMP | 20 | 2660 | 17 | 13 | 15 |
-----------------------------------------------------------------------
In the previous example, only the last partition, partition 5, is accessed. This partition is
known at compile time, so the database does not need to show it in the plan. The hash
partition row source shows accessing of all subpartitions within that partition; that is,
subpartitions 1 to 3, which translates into subpartitions 13 to 15 of the emp_comp table.
------------------------------------------------------------------------
| Id | Operation |Name |Rows | Bytes |Cost|Pstart|Pstop|
------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 101 | 13433 | 78 | | |
| 1 | PARTITION RANGE ALL | | 101 | 13433 | 78 | 1 | 5 |
| 2 | PARTITION HASH SINGLE| | 101 | 13433 | 78 | 3 | 3 |
|* 3 | TABLE ACCESS FULL | EMP_COMP | 101 | 13433 | 78 | | |
------------------------------------------------------------------------
In the previous example, the predicate department_id=20 enables pruning on the hash dimension
within each partition. Therefore, Oracle Database only needs to access a single subpartition.
The number of this subpartition is known at compile time, so the hash partition row source is
not needed.
Finally, consider the following statement:
FROM emp_comp
WHERE department_id = :dno;
-----------------------------------------------------------------------
| Id| Operation | Name |Rows| Bytes |Cost|Pstart|Pstop|
-----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 101| 13433 | 78 | | |
| 1 | PARTITION RANGE ALL | | 101| 13433 | 78 | 1 | 5 |
| 2 | PARTITION HASH SINGLE| | 101| 13433 | 78 | KEY | KEY |
|*3 | TABLE ACCESS FULL | EMP_COMP | 101| 13433 | 78 | | |
-----------------------------------------------------------------------
The last two examples are the same, except that department_id = :dno replaces
department_id=20. In this last case, the subpartition number is unknown at compile time, and
a hash partition row source is allocated. The option is SINGLE for this row source
because Oracle Database accesses only one subpartition within each partition. In Step
2, both PARTITION_START and PARTITION_STOP are set to KEY. This value means that
Oracle Database determines the number of subpartitions at run time.
The execution plan shows that the table dept2 is scanned serially and all rows with the same
partitioning column value of emp_range_did (department_id) are sent through a PART
(KEY), or partition key, table queue to the same parallel execution server doing the partial
partition-wise join.
Example 7-6 Partial Partition-Wise Join with Composite Partition
In the following example, emp_comp is joined on its partitioning column and the join is
parallelized, enabling use of a partial partition-wise join. Because dept2 is not partitioned
on the join column, the database dynamically partitions dept2 before the join.
-------------------------------------------------------------------------------------------
| Id| Operation | Name |Rows |Bytes |Cost|Pstart|Pstop|TQ |IN-OUT|PQ Distrib|
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 445 |17800| 5 | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) |:TQ10001| 445 |17800| 5 | | | Q1,01 |P->S| QC (RAND)|
|*3 | HASH JOIN | | 445 |17800| 5 | | | Q1,01 |PCWP| |
| 4 | PX PARTITION RANGE ALL | | 107 | 1070| 3 |1 | 5 | Q1,01 |PCWC| |
| 5 | PX PARTITION HASH ALL | | 107 | 1070| 3 |1 | 3 | Q1,01 |PCWC| |
| 6 | TABLE ACCESS FULL |EMP_COMP| 107 | 1070| 3 |1 | 15| Q1,01 |PCWP| |
| 7 | PX RECEIVE | | 21 | 630| 1 | | | Q1,01 |PCWP| |
| 8 | PX SEND PARTITION (KEY)|:TQ10000| 21 | 630| 1 | | | Q1,00 |P->P|PART (KEY)|
| 9 | PX BLOCK ITERATOR | | 21 | 630| 1 | | | Q1,00 |PCWC| |
|10 | TABLE ACCESS FULL |DEPT2 | 21 | 630| 1 | | | Q1,00 |PCWP| |
-------------------------------------------------------------------------------------------
The plan shows that the optimizer selects a partial partition-wise join.
The PX SEND node type is PARTITION (KEY) and the PQ Distrib column contains the text
PART (KEY), or partition key. This implies that the table dept2 is re-partitioned based on the
join column department_id to be sent to the parallel execution servers executing the scan of
EMP_COMP and the join.
--------------------------------------------------------------------------------------------------
| Id| Operation                  | Name      |Rows|Bytes|Cost|Pstart|Pstop| TQ    |IN-OUT|PQ Distrib|
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT           |           | 106| 2544|  8 |      |     |       |      |          |
| 1 |  PX COORDINATOR            |           |    |     |    |      |     |       |      |          |
| 2 |   PX SEND QC (RANDOM)      | :TQ10000  | 106| 2544|  8 |      |     | Q1,00 | P->S |QC (RAND) |
| 3 |    PX PARTITION HASH ALL   |           | 106| 2544|  8 |   1  |  3  | Q1,00 | PCWC |          |
|*4 |     HASH JOIN              |           | 106| 2544|  8 |      |     | Q1,00 | PCWP |          |
| 5 |      PX PARTITION RANGE ALL|           | 107| 1070|  3 |   1  |  5  | Q1,00 | PCWC |          |
| 6 |       TABLE ACCESS FULL    | EMP_COMP  | 107| 1070|  3 |   1  | 15  | Q1,00 | PCWP |          |
| 7 |      TABLE ACCESS FULL     | DEPT_HASH |  27|  378|  4 |   1  |  3  | Q1,00 | PCWP |          |
--------------------------------------------------------------------------------------------------
The PX PARTITION HASH row source appears on top of the join row source in the plan
table output while the PX PARTITION RANGE row source appears over the scan of
emp_comp. Each parallel execution server performs the join of an entire hash partition
of emp_comp with an entire partition of dept_hash.
The INLIST ITERATOR operation iterates over the next operation in the plan for each value in
the IN-list predicate. The following sections describe the three possible types of IN-list
columns for partitioned tables and indexes.
This section contains the following topics:
The KEY(INLIST) designation for the partition start and stop keys specifies that an IN-list
predicate appears on the index start and stop keys.
7.2.5.5.2 When the IN-List Column is an Index and a Partition Column: Example
If empno is an indexed and a partition column, then the plan contains an INLIST ITERATOR
operation before the partition operation.
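As a hypothetical sketch (the emp table, its empno partitioning, and the listed values are assumptions, not from the sample schemas), such a query might be:

```sql
-- Hypothetical: emp is partitioned on empno, and empno is also indexed,
-- so the plan shows INLIST ITERATOR above the partition operation.
SELECT *
FROM   emp
WHERE  empno IN (7076, 7009, 7902);
```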
Table 7-2 describes the values that can appear in the DISTRIBUTION column:
Table 7-3 lists each combination of OPERATION and OPTIONS produced by the EXPLAIN PLAN
statement and its meaning within an execution plan.
Execution Plan Reference
See Also:
Oracle Database Reference for more information about PLAN_TABLE
View Description
V$SQL_SHARED_CURSOR Explains why a particular child cursor is not shared with
existing child cursors. Each column identifies a specific
reason why the cursor cannot be shared.
The USE_FEEDBACK_STATS column shows whether a
child cursor fails to match because of reoptimization.
V$SQL_PLAN Includes a superset of all rows appearing in all final
plans. PLAN_LINE_ID is consecutively numbered, but
for a single final plan, the IDs may not be consecutive.
V$SQL_PLAN_STATISTICS_ALL Contains memory usage statistics for row sources that
use SQL memory (sort or hash join). This view
concatenates information in V$SQL_PLAN with
execution statistics from V$SQL_PLAN_STATISTICS
and V$SQL_WORKAREA.
Table 7-7 lists each combination of OPERATION and OPTIONS produced by the EXPLAIN
PLAN statement and its meaning within an execution plan.
MERGE JOIN  OUTER      Merge join operation to perform an outer join statement.
MERGE JOIN  ANTI       Merge antijoin.
MERGE JOIN  SEMI       Merge semijoin.
MERGE JOIN  CARTESIAN  Can result from one or more of the tables not having any join
                       conditions to any other tables in the statement. Can occur even
                       with a join, and it may not be flagged as CARTESIAN in the plan.
CONNECT BY             Retrieval of rows in hierarchical order for a query containing a
                       CONNECT BY clause.
MAT_VIEW REWRITE ACCESS  FULL     Retrieval of all rows from a materialized view.
(These are access methods.)
MAT_VIEW REWRITE ACCESS  SAMPLE   Retrieval of sampled rows from a materialized view.
MAT_VIEW REWRITE ACCESS  CLUSTER  Retrieval of rows from a materialized view based on a
                                  value of an indexed cluster key.
MAT_VIEW REWRITE ACCESS  HASH     Retrieval of rows from a materialized view based on a
                                  hash cluster key value.
See Also:
Oracle Database Reference for more information about PLAN_TABLE
The display functions accept options for displaying the plan table output. You can specify:
• A plan table name if you are using a table different from PLAN_TABLE
• A statement ID if you have set a statement ID with the EXPLAIN PLAN
• A format option that determines the level of detail: BASIC, SERIAL, TYPICAL, ALL, and in
some cases ADAPTIVE
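For example, assuming a statement was explained with the statement ID 'st1' (an assumption for illustration), all three options can be combined as follows:

```sql
-- Display the plan stored in PLAN_TABLE for statement ID 'st1'
-- at the TYPICAL level of detail.
SELECT PLAN_TABLE_OUTPUT
FROM   TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'st1', 'TYPICAL'));
```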
See Also:
Oracle Database PL/SQL Packages and Types Reference to learn more about
DBMS_XPLAN display functions
Part IV
SQL Operators: Access Paths and Joins
A row source is a set of rows returned by a step in the execution plan. A SQL operator acts
on a row source.
A unary operator acts on one input, as with access paths. A binary operator acts on two
inputs, as with joins.
This part contains the following chapters:
8
Optimizer Access Paths
An access path is a technique used by a query to retrieve rows from a row source.
This chapter contains the following topics:
The optimizer considers different possible execution plans, and then assigns each plan a
cost. The optimizer chooses the plan with the lowest cost. In general, index access paths are
more efficient for statements that retrieve a small subset of table rows, whereas full table
scans are more efficient when accessing a large portion of a table.
8-1
Chapter 8
Table Access Paths
See Also:
• "Joins"
• "Cost-Based Optimization"
• Oracle Database Concepts for an overview of these structures
allocates one or more extents to form the data segment for a table. The database also
allocates one or more extents to form the index segment for a table.
By default, the database uses automatic segment space management (ASSM) for
permanent, locally managed tablespaces. When a session first inserts data into a table, the
database formats a bitmap block. The bitmap tracks the blocks in the segment. The database
uses the bitmap to find free blocks and then formats each block before writing to it. ASSM
spreads out inserts among blocks to avoid concurrency issues.
The high water mark (HWM) is the point in a segment beyond which data blocks are
unformatted and have never been used. Below the HWM, a block may be formatted and
written to, formatted and empty, or unformatted. The low high water mark (low HWM) marks
the point below which all blocks are known to be formatted because they either contain data
or formerly contained data.
During a full table scan, the database reads all blocks up to the low HWM, which are known
to be formatted, and then reads the segment bitmap to determine which blocks between the
HWM and low HWM are formatted and safe to read. The database knows not to read past
the HWM because these blocks are unformatted.
See Also:
Oracle Database Concepts to learn about data block storage
See Also:
Oracle Database Concepts to learn about rowids
Situations in which Oracle Database may perform direct path reads include:
• Execution of a CREATE TABLE AS SELECT statement
• Execution of an ALTER INDEX ... REBUILD or ALTER TABLE ... MOVE statement
• Reads from a temporary tablespace
• Parallel queries
• Reads from a LOB segment
See Also:
Oracle Database Performance Tuning Guide to learn about wait events for
direct path reads
Because the blocks are adjacent, the database can speed up the scan by making I/O calls
larger than a single block, known as a multiblock read. The size of a read call ranges from
one block to the number of blocks specified by the DB_FILE_MULTIBLOCK_READ_COUNT
initialization parameter. For example, setting this parameter to 4 instructs the database to
read up to 4 blocks in a single call.
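For example, the parameter can be set at the session level as follows; whether 4 is an appropriate value depends on your I/O subsystem, so treat it as illustrative only:

```sql
-- Session-level setting; the value 4 is only illustrative.
ALTER SESSION SET DB_FILE_MULTIBLOCK_READ_COUNT = 4;
```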
The algorithms for caching blocks during full table scans are complex. For example, the
database caches blocks differently depending on whether tables are small or large.
See Also:
• "Table 19-2"
• Oracle Database Concepts for an overview of the default caching mode
• Oracle Database Reference to learn about the
DB_FILE_MULTIBLOCK_READ_COUNT initialization parameter
SELECT salary
FROM hr.employees
WHERE salary > 4000;
---------------------------------------------------------------------------
| Id| Operation | Name | Rows | Bytes |Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0| SELECT STATEMENT | | | | 3 (100)| |
|* 1| TABLE ACCESS FULL| EMPLOYEES | 98 | 6762 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
1 - filter("SALARY">4000)
Note:
Rowids can change between versions. Accessing data based on position is
not recommended because rows can move.
See Also:
Oracle Database Development Guide to learn more about rowids
SELECT *
FROM employees
WHERE employee_id > 190;
Step 2 of the following plan shows a range scan of the emp_emp_id_pk index on the
hr.employees table. The database uses the rowids obtained from the index to find the
corresponding rows from the employees table, and then retrieve them. The BATCHED access
shown in Step 1 means that the database retrieves a few rowids from the index, and then
attempts to access rows in block order to improve the clustering and reduce the number of
times that the database must access a block.
--------------------------------------------------------------------------------
|Id| Operation | Name |Rows|Bytes|Cost(%CPU)|Time|
--------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | | |2(100)| |
| 1| TABLE ACCESS BY INDEX ROWID BATCHED|EMPLOYEES |16|1104|2 (0)|00:00:01|
|*2| INDEX RANGE SCAN |EMP_EMP_ID_PK|16| |1 (0)|00:00:01|
--------------------------------------------------------------------------------
2 - access("EMPLOYEE_ID">190)
• SAMPLE (sample_percent)
The database reads a specified percentage of rows in the table to perform a sample table
scan.
• SAMPLE BLOCK (sample_percent)
The database reads a specified percentage of table blocks to perform a sample table
scan.
The sample_percent specifies the percentage of the total row or block count to include in the
sample. The value must be in the range .000001 up to, but not including, 100. This
percentage indicates the probability of each row, or each cluster of rows in block sampling,
being selected for the sample. It does not mean that the database retrieves exactly
sample_percent of the rows.
Note:
Block sampling is possible only during full table scans or index fast full
scans. If a more efficient execution path exists, then the database does not
sample blocks. To guarantee block sampling for a specific table or index, use
the FULL or INDEX_FFS hint.
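The sampled statement itself is not included in this excerpt; a sample table scan can be requested as follows, where the table and the 1 percent rate are assumptions:

```sql
-- Read roughly 1% of rows; use SAMPLE BLOCK (1) to sample blocks instead.
SELECT *
FROM   hr.employees SAMPLE (1);
```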
The EXPLAIN PLAN output for this statement might look as follows:
-------------------------------------------------------------------------
| Id | Operation            | Name      | Rows | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|  0 | SELECT STATEMENT     |           |    1 |    68 |     3  (34)|
|  1 |  TABLE ACCESS SAMPLE | EMPLOYEES |    1 |    68 |     3  (34)|
-------------------------------------------------------------------------
See Also:
Oracle Database In-Memory Guide for an introduction to the IM column store
Chapter 8
B-Tree Index Access Paths
SELECT *
FROM oe.product_information
WHERE list_price > 10
ORDER BY product_id;
The plan for this statement might look as follows, with the INMEMORY keyword in Step 2
indicating that some or all of the object was accessed from the IM column store:
Figure 8-3 (not reproduced here) depicts the B-tree index structure: branch blocks hold
key ranges (0..40, 41..80, 81..120, ..., 200..250, with narrower ranges such as 200..209,
210..220, 221..228, and 246..250 at the next level) that direct the search, and leaf
blocks hold the individual key,rowid entries (for example, 0,rowid through 250,rowid)
linked in key order.
8.3.1.2 How Index Storage Affects Index Scans
Index blocks can appear anywhere in the index segment.
Figure 8-3 shows the leaf blocks as adjacent to each other. For example, the 1-10
block is next to and before the 11-19 block. This sequencing illustrates the linked lists
that connect the index entries. However, index blocks need not be stored in order
within an index segment. For example, the 246-250 block could appear anywhere in
the segment, including directly before the 1-10 block. For this reason, ordered index
scans must perform single-block I/O. The database must read an index block to
determine which index block it must read next.
The index block body stores the index entries in a heap, just like table rows. For
example, if the value 10 is inserted first into a table, then the index entry with key 10
might be inserted at the bottom of the index block. If 0 is inserted next into the table,
then the index entry for key 0 might be inserted on top of the entry for 10. Thus, the
index entries in the block body are not stored in key order. However, within the index
block, the row header stores records in key order. For example, the first record in the
header points to the index entry with key 0, and so on sequentially up to the record
that points to the index entry with key 10. Thus, index scans can read the row header
to determine where to begin and end range scans, avoiding the necessity of reading
every entry in the block.
See Also:
Oracle Database Concepts to learn about index blocks
0,AAAPvCAAFAAAAFaAAa
0,AAAPvCAAFAAAAFaAAg
0,AAAPvCAAFAAAAFaAAl
2,AAAPvCAAFAAAAFaAAm
In a unique index, the index key does not include the rowid. The database sorts the data only
by the index key values, such as 0, 1, 2, and so on.
See Also:
Oracle Database Concepts for an overview of unique and nonunique indexes
COUNT(*)
----------
1
COUNT(*)
----------
0
The following example shows that the optimizer chooses a full table scan for a query
of all department IDs in hr.employees. The optimizer cannot use the index on
employees.department_id because the index is not guaranteed to include entries for
every row in the table.
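The explained statement is not shown in this excerpt; it was presumably of the following form:

```sql
-- Hypothetical reconstruction of the explained statement.
EXPLAIN PLAN FOR
  SELECT department_id
  FROM   hr.employees;
```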
Explained.
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
Plan hash value: 3476115102
---------------------------------------------------------------------------
| Id | Operation         | Name      | Rows | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|  0 | SELECT STATEMENT  |           |  107 |   321 |     2   (0)| 00:00:01 |
|  1 |  TABLE ACCESS FULL| EMPLOYEES |  107 |   321 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------
The following example shows that the optimizer can use the index on department_id for a
query of a specific department ID because all non-null rows are indexed.
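The explained statement is not shown in this excerpt; it was presumably similar to the following, where the department ID value is an assumption:

```sql
-- Hypothetical reconstruction; the value 50 is an assumption.
EXPLAIN PLAN FOR
  SELECT department_id
  FROM   hr.employees
  WHERE  department_id = 50;
```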
Explained.
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
Plan hash value: 67425611
---------------------------------------------------------------------------
| Id | Operation        | Name              | Rows | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|  0 | SELECT STATEMENT |                   |    1 |     3 |     1   (0)| 00:00:01 |
|* 1 |  INDEX RANGE SCAN| EMP_DEPARTMENT_IX |    1 |     3 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------
The following example shows that the optimizer chooses an index scan when the predicate
excludes null values:
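The explained statement is not shown in this excerpt; a predicate that excludes null values might be:

```sql
-- Hypothetical reconstruction of the explained statement.
EXPLAIN PLAN FOR
  SELECT department_id
  FROM   hr.employees
  WHERE  department_id IS NOT NULL;
```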
Explained.
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
Plan hash value: 1590637672
---------------------------------------------------------------------------
| Id| Operation | Name |Rows|Bytes| Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0| SELECT STATEMENT |                   |106| 318 | 1 (0)| 00:00:01 |
|*1| INDEX FULL SCAN  | EMP_DEPARTMENT_IX |106| 318 | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------
A unique or primary key constraint is insufficient by itself to produce an index unique scan
because a non-unique index on the column may already exist. Consider the following
example, which creates the t_table table and then creates a non-unique index on numcol:
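The statements themselves are not included in this excerpt; they were presumably similar to the following:

```sql
-- Hypothetical reconstruction: create the table, a non-unique index,
-- and check the index uniqueness in the data dictionary.
CREATE TABLE t_table (numcol NUMBER);
CREATE INDEX t_table_idx ON t_table (numcol);

SELECT UNIQUENESS
FROM   USER_INDEXES
WHERE  INDEX_NAME = 'T_TABLE_IDX';
```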
UNIQUENES
---------
NONUNIQUE
The following code creates a primary key constraint on a column with a non-unique
index, resulting in an index range scan rather than an index unique scan:
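The code is not included in this excerpt; it was presumably similar to the following, where the constraint name and queried value are assumptions:

```sql
-- The primary key constraint reuses the existing non-unique index,
-- so queries on numcol use an index range scan.
ALTER TABLE t_table ADD CONSTRAINT t_table_pk PRIMARY KEY (numcol);

SET AUTOTRACE TRACEONLY EXPLAIN
SELECT * FROM t_table WHERE numcol = 1;
```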
Execution Plan
----------------------------------------------------------
Plan hash value: 868081059
---------------------------------------------------------------------------
| Id | Operation        | Name        | Rows | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|  0 | SELECT STATEMENT |             |    1 |    13 |     1   (0)| 00:00:01 |
|* 1 | INDEX RANGE SCAN | T_TABLE_IDX |    1 |    13 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------
You can use the INDEX(alias index_name) hint to specify the index to use, but not a
specific type of index access path.
See Also:
• Oracle Database Concepts for more details on index structures and for
detailed information on how a B-tree is searched
• Oracle Database SQL Language Reference to learn more about the
INDEX hint
[Figure: B-tree index structure, with branch blocks covering the key ranges 0..40, 41..80, 81..120, ..., 200..250, and leaf blocks beneath them]
The following statement queries the record for product 19 in the sh.products table:
SELECT *
FROM sh.products
WHERE prod_id = 19;
Because a primary key index exists on the products.prod_id column, and the WHERE clause
references all of the columns using an equality operator, the optimizer chooses a unique
scan:
---------------------------------------------------------------------------
| Id| Operation                   | Name        |Rows|Bytes|Cost (%CPU)|Time    |
---------------------------------------------------------------------------
|  0| SELECT STATEMENT            |             |    |     |    1 (100)|        |
|  1| TABLE ACCESS BY INDEX ROWID | PRODUCTS    |  1 | 173 |    1   (0)|00:00:01|
|* 2| INDEX UNIQUE SCAN           | PRODUCTS_PK |  1 |     |    0   (0)|        |
---------------------------------------------------------------------------

   2 - access("PROD_ID"=19)
Note:
For the optimizer to consider a range scan, the wildcard must not be in a leading
position. A predicate such as col1 LIKE '%ASD' cannot use an index range scan on col1.
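To illustrate, a trailing wildcard permits a range scan, whereas a leading wildcard does not. The following is a sketch against the hr sample schema, assuming an index on last_name:

```sql
-- Wildcard in a trailing position: an index range scan on last_name is possible.
SELECT * FROM employees WHERE last_name LIKE 'A%';

-- Wildcard in a leading position: a range scan on last_name is not possible.
SELECT * FROM employees WHERE last_name LIKE '%son';
```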
Tip:
If you require sorted data, then use the ORDER BY clause, and do not rely on an
index. If an index can satisfy an ORDER BY clause, then the optimizer uses this option
and thereby avoids a sort.
The optimizer considers an index range scan descending when an index can satisfy an ORDER
BY DESCENDING clause.
If the optimizer chooses a full table scan or another index, then a hint may be required to
force this access path. The INDEX(tbl_alias ix_name) and INDEX_DESC(tbl_alias ix_name)
hints instruct the optimizer to use a specific index.
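For example, a hint of the following form might request a descending range scan of the index on department_id; the alias e and the index name emp_department_ix are assumptions based on the hr schema used elsewhere in this chapter:

```sql
SELECT /*+ INDEX_DESC(e emp_department_ix) */ *
FROM   employees e
WHERE  department_id < 20;
```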
See Also:
Oracle Database SQL Language Reference to learn more about the INDEX and
INDEX_DESC hints
Note:
In some cases, an index scan reads a set of index blocks, sorts the rowids, and
then reads a set of table blocks.
Thus, to scan the index, the database moves backward or forward through the leaf blocks.
For example, a scan for IDs between 20 and 40 locates the first leaf block that has the lowest
key value that is 20 or greater. The scan proceeds horizontally through the linked list of
leaf nodes until it finds a value greater than 40, and then stops.
The following figure illustrates an index range scan using ascending order. A
statement requests the employees records with the value 20 in the department_id
column, which has a nonunique index. In this example, 2 index entries for department
20 exist.
[Figure: index range scan through a B-tree index, with branch blocks covering the key ranges 0..40, 41..80, 81..120, ..., 200..250, and leaf blocks beneath them]
SELECT *
FROM employees
WHERE department_id = 20
AND salary > 1000;
The preceding query has low cardinality (returns few rows), so the query uses the
index on the department_id column. The database scans the index, fetches the
records from the employees table, and then applies the salary > 1000 filter to these fetched
records to generate the result.
-------------------------------------------------------------------------------------------
|Id | Operation | Name |Rows|Bytes|Cost(%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 2 (100)| |
|*1 | TABLE ACCESS BY INDEX ROWID BATCHED| EMPLOYEES | 2 | 138 | 2 (0)|00:00:01|
|*2 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX| 2 | | 1 (0)|00:00:01|
-------------------------------------------------------------------------------------------
1 - filter("SALARY">1000)
2 - access("DEPARTMENT_ID"=20)
The following statement queries the records for employees in departments with an ID less
than 20, in descending order:
SELECT *
FROM employees
WHERE department_id < 20
ORDER BY department_id DESC;
The preceding query has low cardinality, so the query uses the index on the department_id
column.
2 - access("DEPARTMENT_ID"<20)
The database locates the first index leaf block that contains the highest key value that
is 20 or less. The scan then proceeds horizontally to the left through the linked list of
leaf nodes. The database obtains the rowid from each index entry, and then retrieves
the row specified by the rowid.
[Figure: index range scan descending through a B-tree index, with branch blocks covering the key ranges 0..40, 41..80, 81..120, ..., 200..250, and leaf blocks beneath them]
The following statement queries the ID and name for departments in order of department ID:
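A sketch of such a statement, assuming the standard hr schema column names department_id and department_name:

```sql
SELECT department_id, department_name
FROM   departments
ORDER BY department_id;
```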
The following plan shows that the optimizer chose an index full scan:
------------------------------------------------------------------------
|Id | Operation | Name |Rows|Bytes|Cost(%CPU)|Time |
------------------------------------------------------------------------
|0| SELECT STATEMENT | | | |2 (100)| |
|1| TABLE ACCESS BY INDEX ROWID|DEPARTMENTS |27 |432|2 (0)|00:00:01 |
The database locates the first index leaf block, and then proceeds horizontally to the
right through the linked list of leaf nodes. For each index entry, the database obtains
the rowid from the entry, and then retrieves the table row specified by the rowid.
Because the index is sorted on department_id, the database avoids a separate
operation to sort the retrieved rows.
Note:
Unlike a full scan, a fast full scan cannot eliminate a sort operation because it
does not read the index in order.
See Also:
Oracle Database SQL Language Reference to learn more about the INDEX
hint
The following plan shows that the optimizer chose a fast full index scan:
See Also:
Oracle Database Concepts
The customers table contains a column cust_gender whose values are either M or F.
While logged in to the database as user sh, you create a composite index on the
columns (cust_gender, cust_email) as follows:
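The CREATE INDEX statement might look like the following sketch; the index name is assumed to match the CUST_GENDER_EMAIL_IX name that appears in the execution plan later in this section:

```sql
CREATE INDEX cust_gender_email_ix
  ON sh.customers (cust_gender, cust_email);
```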
Conceptually, a portion of the index might look as follows, with the gender value of F or
M as the leading edge of the index.
F,Wolf@company.example.com,rowid
F,Wolsey@company.example.com,rowid
F,Wood@company.example.com,rowid
F,Woodman@company.example.com,rowid
F,Yang@company.example.com,rowid
F,Zimmerman@company.example.com,rowid
M,Abbassi@company.example.com,rowid
M,Abbey@company.example.com,rowid
You run the following query for a customer in the sh.customers table:
SELECT *
FROM sh.customers
WHERE cust_email = 'Abbey@company.example.com';
The database can use a skip scan of the customers_gender_email index even though
cust_gender is not specified in the WHERE clause. In the sample index, the leading
column cust_gender has two possible values: F and M. The database logically splits
the index into two. One subindex has the key F, with entries in the following form:
F,Wolf@company.example.com,rowid
F,Wolsey@company.example.com,rowid
F,Wood@company.example.com,rowid
F,Woodman@company.example.com,rowid
F,Yang@company.example.com,rowid
F,Zimmerman@company.example.com,rowid
The second subindex has the key M, with entries in the following form:
M,Abbassi@company.example.com,rowid
M,Abbey@company.example.com,rowid
When searching for the record for the customer whose email is
Abbey@company.example.com, the database searches the subindex with the leading
value F first, and then searches the subindex with the leading value M. Conceptually, the
database processes the query as follows:
( SELECT *
FROM sh.customers
WHERE cust_gender = 'F'
AND cust_email = 'Abbey@company.example.com' )
UNION ALL
( SELECT *
FROM sh.customers
WHERE cust_gender = 'M'
AND cust_email = 'Abbey@company.example.com' )
-----------------------------------------------------------------------------------------
|Id| Operation | Name |Rows|Bytes|Cost(%CPU)|Time|
-----------------------------------------------------------------------------------------
| 0|SELECT STATEMENT | | | |10(100)| |
| 1| TABLE ACCESS BY INDEX ROWID BATCHED| CUSTOMERS |33|6237| 10(0)|00:00:01|
|*2| INDEX SKIP SCAN | CUST_GENDER_EMAIL_IX |33| | 4(0)|00:00:01|
-----------------------------------------------------------------------------------------
2 - access("CUST_EMAIL"='Abbey@company.example.com')
filter("CUST_EMAIL"='Abbey@company.example.com')
See Also:
Oracle Database Concepts to learn more about skip scans
See Also:
Oracle Database SQL Language Reference
Banda,Amit,AAAVgdAALAAAABSABD
Bates,Elizabeth,AAAVgdAALAAAABSABI
Bell,Sarah,AAAVgdAALAAAABSABc
Bernstein,David,AAAVgdAALAAAABSAAz
Bissot,Laura,AAAVgdAALAAAABSAAd
Bloom,Harrison,AAAVgdAALAAAABSABF
Bull,Alexis,AAAVgdAALAAAABSABV
8-30
Chapter 8
Bitmap Index Access Paths
ABANDA,AAAVgdAALAAAABSABD
ABULL,AAAVgdAALAAAABSABV
ACABRIO,AAAVgdAALAAAABSABX
AERRAZUR,AAAVgdAALAAAABSAAv
AFRIPP,AAAVgdAALAAAABSAAV
AHUNOLD,AAAVgdAALAAAABSAAD
AHUTTON,AAAVgdAALAAAABSABL
The following example retrieves the plan using the DBMS_XPLAN.DISPLAY_CURSOR function.
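One way to call it after executing the query is the following sketch; the FORMAT value is illustrative, and with no SQL ID specified DISPLAY_CURSOR reports on the last statement executed in the session:

```sql
SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR(FORMAT => 'BASIC +PREDICATE'));
```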
The database retrieves all rowids in the emp_email_uk index, and then retrieves rowids in
emp_name_ix for last names that begin with A. The database uses a hash join to search both
sets of rowids for matches. For example, rowid AAAVgdAALAAAABSABD occurs in both sets of
rowids, so the database probes the employees table for the record corresponding to this
rowid.
Example 8-4 Index Join Scan
The database stores at least one bitmap for each index key. Each value in the bitmap,
which is a series of 1 and 0 values, points to a row within a rowid range. Thus, in a
bitmap index, one index entry points to a set of rows rather than a single row.
This section contains the following topics:
For example, an entry in a unique B-tree index might have the form 101,AAAPvCAAFAAAAFaAAa,
where the indexed value 101 is the key and the rowid is the data. In a nonunique B-tree
index, such as the index on the employees.last_name column, the indexed data combined with
the rowid forms the key, as in Smith,AAAPvCAAFAAAAFaAAa, and there is no data. A bitmap
index entry has the form M,low-rowid,high-rowid,1000101010101010, where the key value and
the rowid range form the key, and the bitmap is the data.
The database stores a bitmap index in a B-tree structure. The database can search
the B-tree quickly on the first part of the key, which is the set of attributes on which the
index is defined, and then obtain the corresponding rowid range and bitmap.
See Also:
• "Bitmap Storage"
• Oracle Database Concepts for an overview of bitmap indexes
• Oracle Database Data Warehousing Guide for more information about
bitmap indexes
Bitmap indexes are a useful way to speed ad hoc queries in a data warehouse. They are
fundamental to star transformations. Specifically, bitmap indexes are useful in queries that
contain the following:
• Multiple conditions in the WHERE clause
Before the table itself is accessed, the database filters out rows that satisfy some, but not
all, conditions.
• AND, OR, and NOT operations on columns with low or medium NDV
Combining bitmap indexes makes these operations more efficient. The database can
merge bitmaps from bitmap indexes very quickly. For example, if bitmap indexes exist on
the customers.state and customers.county columns, then these indexes can
enormously improve the performance of the following query:
SELECT *
FROM customers
WHERE state = 'CA'
AND county = 'San Mateo'
The database can convert 1 values in the merged bitmap into rowids efficiently.
• The COUNT function
The database can scan the bitmap index without needing to scan the table.
• Predicates that select for null values
Unlike B-tree indexes, bitmap indexes can contain nulls. Queries that count the number
of nulls in a column can use the bitmap index without scanning the table.
• Columns that do not experience heavy DML
The reason is that one index key points to many rows. If a session modifies the indexed
data, then the database cannot lock a single bit in the bitmap: rather, the database locks
the entire index entry, which in practice locks the rows pointed to by the bitmap. For
example, if the county of residence for a specific customer changes from San Mateo to
Alameda, then the database must get exclusive access to the San Mateo index entry and
Alameda index entry in the bitmap. Rows containing these two values cannot be modified
until COMMIT.
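For instance, the null-counting case above might be exercised with a query like the following sketch, which assumes a bitmap index on sh.customers.cust_marital_status; the database can answer it from the bitmap index without scanning the table:

```sql
SELECT COUNT(*)
FROM   sh.customers
WHERE  cust_marital_status IS NULL;
```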
See Also:
• "Star Transformation"
• Oracle Database SQL Language Reference to learn about the COUNT
function
Note:
The Hakan factor is an optimization used by the bitmap index algorithms to
limit the number of rows that Oracle Database assumes can be stored in a
single block. By artificially limiting the number of rows, the database reduces
the size of the bitmaps.
As shown in Table 8-4, bitmap indexes can include keys that consist entirely of null
values, unlike B-tree indexes. In Table 8-4, the null has a value of 1 for the 6th row in
the range, which means that the cust_marital_status value is null for the 6th row in
the range. Indexing nulls can be useful for some SQL statements, such as queries with
the aggregate function COUNT.
See Also:
Oracle Database Concepts to learn about rowid formats
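A bitmap join index of this kind might be created with a statement along the following lines; this is a sketch in which the index name matches the CUST_SALES_BJI name in the execution plan below, and the LOCAL clause assumes that sales is partitioned:

```sql
CREATE BITMAP INDEX cust_sales_bji
  ON sales (c.cust_city)
  FROM sales s, customers c
  WHERE c.cust_id = s.cust_id
  LOCAL;
```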
The FROM and WHERE clauses in the preceding CREATE statement represent the join condition
between the tables. The customers.cust_city column is the index key.
Each key value in the index represents a possible city in the customers table. Conceptually,
key values for the index might look as follows, with one bitmap associated with each key
value:
San Francisco 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0 . . .
San Mateo 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 . . .
Smithville 1 0 0 0 1 0 0 1 0 0 1 0 1 0 0 . . .
.
.
.
Each bit in a bitmap corresponds to one row in the sales table. In the Smithville key, the
1 in the first position means that the first row in the sales table corresponds to a
product sold to a Smithville customer, whereas the 0 in the second position means that the
second row does not correspond to a product sold to a Smithville customer.
Consider the following query of the number of separate sales to Smithville customers:
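The query is a sketch of the form described; the join condition and city value are assumptions based on the surrounding text:

```sql
SELECT COUNT(*)
FROM   sh.sales s, sh.customers c
WHERE  c.cust_id = s.cust_id
AND    c.cust_city = 'Smithville';
```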
The following plan shows that the database reads the Smithville bitmap to derive the
number of Smithville sales (Step 4), thereby avoiding a join of the customers and sales
tables.
------------------------------------------------------------------------------------
|Id| Operation | Name |Rows|Bytes|Cost (%CPU)| Time|Pstart|Pstop|
------------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | | |29 (100)| | | |
| 1| SORT AGGREGATE | | 1 | 5| | | | |
| 2| PARTITION RANGE ALL | | 1708|8540|29 (0)|00:00:01|1|28|
| 3| BITMAP CONVERSION COUNT | | 1708|8540|29 (0)|00:00:01| | |
|*4| BITMAP INDEX SINGLE VALUE|CUST_SALES_BJI| | | | |1|28|
------------------------------------------------------------------------------------
4 - access("S"."SYS_NC00008$"='Smithville')
See Also:
Oracle Database Concepts to learn about the CREATE INDEX statement
The following plan shows that the database uses a range scan to find all key values less than
1918 (Step 3), converts the 1 values in the bitmap to rowids (Step 2), and then uses the
rowids to obtain the rows from the customers table (Step 1):
-------------------------------------------------------------------------------------------
|Id| Operation | Name |Rows|Bytes|Cost(%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | | |421 (100)| |
| 1| TABLE ACCESS BY INDEX ROWID BATCHED| CUSTOMERS |3604|68476|421 (1)|00:00:01|
| 2| BITMAP CONVERSION TO ROWIDS | | | | | |
|*3| BITMAP INDEX RANGE SCAN | CUSTOMERS_YOB_BIX| | | | |
-------------------------------------------------------------------------------------------
The database only needs to process a single bitmap. For example, the following table
represents the bitmap index (in two bitmap pieces) for the value widowed in the
sh.customers.cust_marital_status column. To satisfy a query of customers with the
status widowed, the database can search for each 1 value in the widowed bitmap and
find the rowid of the corresponding row.
SELECT *
FROM customers
WHERE cust_marital_status = 'Widowed';
The following plan shows that the database reads the entry with the Widowed key in the
customers bitmap index (Step 3), converts the 1 values in the bitmap to rowids (Step
2), and then uses the rowids to obtain the rows from the customers table (Step 1):
-------------------------------------------------------------------------------------------
| 0| SELECT STATEMENT                   |                      |    |     | 412(100)|        |
| 1| TABLE ACCESS BY INDEX ROWID BATCHED|CUSTOMERS             |3461| 638K| 412  (2)|00:00:01|
| 2| BITMAP CONVERSION TO ROWIDS        |                      |    |     |         |        |
|*3| BITMAP INDEX SINGLE VALUE          |CUSTOMERS_MARITAL_BIX |    |     |         |        |
-------------------------------------------------------------------------------------------
3 - access("CUST_MARITAL_STATUS"='Widowed')
See Also:
"Index Range Scans"
Column  Start Rowid  End Rowid  1st Row   2nd Row   3rd Row   4th Row   5th Row   6th Row
Value   in Range     in Range   in Range  in Range  in Range  in Range  in Range  in Range
------  -----------  ---------  --------  --------  --------  --------  --------  --------
1913    AAA ...      CCC ...    0         0         0         0         0         1
1917    AAA ...      CCC ...    1         0         1         1         1         0
1918    AAA ...      CCC ...    0         1         0         0         0         0
1918    DDD ...      EEE ...    1         0         1         0         1         1
See Also:
"Index Range Scans"
The following plan shows that the database obtains all bitmaps for
cust_year_of_birth keys lower than 1918 (Step 3), converts the bitmaps to rowids
(Step 2), and then fetches the rows (Step 1):
1917 1 0 1 0 0 0 0 0 0 0 0 0 0 1
1916 0 1 0 0 0 0 0 0 0 0 0 0 0 0
1915 0 0 0 0 0 0 0 0 1 0 0 0 0 0
------------------------------------
merged: 1 1 1 0 0 0 0 0 1 0 0 0 0 1
The 1 values in the resulting bitmap correspond to rows that contain the values 1915, 1916,
or 1917.
The following plan shows that the database obtains all bitmaps for cust_year_of_birth keys
lower than 1918 (Step 6), and then merges these bitmaps using OR logic to create a single
bitmap (Step 5). The database obtains a single bitmap for the cust_gender key of F (Step 4),
and then performs an AND operation on these two bitmaps. The result is a single bitmap that
contains 1 values for the requested rows (Step 3).
8-41
Chapter 8
Table Cluster Access Paths
--------------------------------------------------------------------------------------
| 0|SELECT STATEMENT                    |                    |    |     | 288(100)|        |
| 1| TABLE ACCESS BY INDEX ROWID BATCHED|CUSTOMERS           |1802|37842| 288  (1)|00:00:01|
| 2|  BITMAP CONVERSION TO ROWIDS       |                    |    |     |         |        |
| 3|   BITMAP AND                       |                    |    |     |         |        |
|*4|    BITMAP INDEX SINGLE VALUE       |CUSTOMERS_GENDER_BIX|    |     |         |        |
| 5|    BITMAP MERGE                    |                    |    |     |         |        |
|*6|     BITMAP INDEX RANGE SCAN        |CUSTOMERS_YOB_BIX   |    |     |         |        |
--------------------------------------------------------------------------------------
See Also:
Oracle Database Concepts for an overview of table clusters
30,AADAAAA9d
When a user requests rows in the cluster, the database scans the index to obtain the data
block addresses (DBAs) of the blocks containing the rows. Oracle Database then locates the
rows based on these DBAs.
SELECT *
FROM employees2
WHERE department_id = 30;
To perform the scan, Oracle Database first obtains the rowid of the row describing
department 30 by scanning the cluster index (Step 2). Oracle Database then locates the rows
in employees2 using this rowid (Step 1).
---------------------------------------------------------------------------
|Id| Operation           | Name                |Rows|Bytes|Cost(%CPU)| Time   |
---------------------------------------------------------------------------
| 0| SELECT STATEMENT    |                     |    |     |  2 (100)|        |
| 1| TABLE ACCESS CLUSTER| EMPLOYEES2          |  6 | 798 |  2   (0)|00:00:01|
|*2| INDEX UNIQUE SCAN   | IDX_EMP_DEPT_CLUSTER|  1 |     |  1   (0)|00:00:01|
---------------------------------------------------------------------------
2 - access("DEPARTMENT_ID"=30)
See Also:
Oracle Database Concepts to learn about indexed clusters
SELECT *
FROM employees2
WHERE department_id = 30;
To perform a hash scan, Oracle Database first obtains the hash value by applying a hash
function to the key value 30, and then uses this hash value to scan the data blocks and
retrieve the rows (Step 1).
----------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
----------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 1 |
|* 1 | TABLE ACCESS HASH| EMPLOYEES2 | 10 | 1330 | |
----------------------------------------------------------------
1 - access("DEPARTMENT_ID"=30)
See Also:
Oracle Database Concepts to learn about hash clusters
9
Joins
Oracle Database provides several optimizations for joining row sets.
This chapter contains the following topics:
See Also:
• "Cartesian Joins"
• Oracle Database SQL Language Reference for a concise discussion of joins in
Oracle SQL
[Figure: a simple join tree, in which table1 and table2 are joined to produce the result set]
9-1
Chapter 9
About Joins
The input of a join can be the result set from a previous join. If the right child of every
internal node of a join tree is a table, then the tree is a left deep join tree, as shown in
the following example. Most join trees are left deep joins.
[Figure: left deep join tree, in which ((table1 JOIN table2) JOIN table3) JOIN table4 produces the result set]
If the left child of every internal node of a join tree is a table, then the tree is called a
right deep join tree, as shown in the following diagram.
[Figure: right deep join tree, in which table1 JOIN (table2 JOIN (table3 JOIN table4)) produces the result set]
If the left or the right child of an internal node of a join tree can be a join node, then the
tree is called a bushy join tree. In the following example, table4 is a right child of a join
node, table1 is the left child of a join node, and table2 is the left child of a join node.
[Figure: bushy join tree, in which (table1 JOIN (table2 JOIN table3)) JOIN table4 produces the result set]
In yet another variation, both inputs of a join are the results of a previous join.
9-4
Chapter 9
Join Methods
• The cost of a hash join largely depends on the cost of building a hash table on one of the
input sides to the join and using the rows from the other side of the join to probe it.
Example 9-1 Estimating Costs for Join Order and Method
Conceptually, the optimizer constructs a matrix of join orders and methods and the cost
associated with each. For example, the optimizer must determine how best to join the
date_dim and lineorder tables in a query. The following table shows the possible variations
of methods and orders, and the cost for each. In this example, a nested loops join in the
order date_dim, lineorder has the lowest cost.
Table 9-2 Sample Costs for Join of date_dim and lineorder Tables
[Table 9-2 columns: Join Order, Join Method (Nested Loops, Hash Join, or Sort Merge), and Cost]
Note:
The number of rows expected from the join is what drives the optimizer
decision, not the size of the underlying tables. For example, a query might
join two tables of a billion rows each, but because of the filters the optimizer
expects data sets of 5 rows each.
In general, nested loops joins work best on small tables with indexes on the join
conditions. If a row source has only one row, as with an equality lookup on a primary
key value (for example, WHERE employee_id=101), then the join is a simple lookup. The
optimizer always tries to put the smallest row source first, making it the driving table.
Various factors enter into the optimizer decision to use nested loops. For example, the
database may read several rows from the outer row source in a batch. Based on the
number of rows retrieved, the optimizer may choose either a nested loop or a hash join
to the inner row source. For example, if a query joins departments to driving table
employees, and if the predicate specifies a value in employees.last_name, then the
database might read enough entries in the index on last_name to determine whether
an internal threshold is passed. If the threshold is not passed, then the optimizer picks
a nested loop join to departments, and if the threshold is passed, then the database
performs a hash join, which means reading the rest of employees, hashing it into
memory, and then joining to departments.
If the access path for the inner loop is not dependent on the outer loop, then the result
can be a Cartesian product: for every iteration of the outer loop, the inner loop
produces the same set of rows. To avoid this problem, use other join methods to join
two independent row sources.
See Also:
• "Table 19-2"
• "Adaptive Query Plans"
For example, if a query joins employees and departments, then a nested loop in pseudocode
might be:
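One possible shape for that pseudocode is the following sketch (illustrative, not executable; "X=1" stands for an arbitrary filter on employees):

```
FOR erow IN (SELECT * FROM employees WHERE X = 1) LOOP
  FOR drow IN (SELECT * FROM departments WHERE erow is matched) LOOP
    output values from erow and drow
  END LOOP
END LOOP
```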
The inner loop is executed for every row of the outer loop. The employees table is the "outer"
data set because it is in the exterior for loop. The outer table is sometimes called a driving
table. The departments table is the "inner" data set because it is in the interior for loop.
NESTED LOOPS
outer_loop
inner_loop
3. For every fetch request from the client, the basic process is as follows:
a. Fetch a row from the outer row source
b. Probe the inner row source to find rows that match the predicate criteria
c. Repeat the preceding steps until all rows are obtained by the fetch request
Sometimes the database sorts rowids to obtain a more efficient buffer access pattern.
The database can nest two or more outer loops to join as many tables as needed.
Each loop is a data access method. The following template shows how the database
iterates through three nested loops:
SELECT STATEMENT
NESTED LOOPS 3
NESTED LOOPS 2 - Row source becomes OUTER LOOP 3.1
NESTED LOOPS 1 - Row source becomes OUTER LOOP 2.1
OUTER LOOP 1.1
INNER LOOP 1.2
INNER LOOP 2.2
INNER LOOP 3.2
NESTED LOOPS 1
OUTER LOOP 1.1
INNER LOOP 1.2
NESTED LOOPS 2
OUTER LOOP 2.1 - Row source generated by NESTED LOOPS 1
INNER LOOP 2.2
NESTED LOOPS 3
OUTER LOOP 3.1 - Row source generated by NESTED LOOPS 2
INNER LOOP 3.2
The plan reveals that the optimizer chose two nested loops (Step 1 and Step 2) to access the
data:
----------------------------------------------------------------------------------
|Id| Operation |Name |Rows|Bytes|Cost(%CPU)|Time|
----------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | | |5 (100)| |
| 1| NESTED LOOPS | | | | | |
| 2| NESTED LOOPS | | 3|102|5 (0)|00:00:01|
| 3| TABLE ACCESS BY INDEX ROWID BATCHED| EMPLOYEES | 3| 54|2 (0)|00:00:01|
|*4| INDEX RANGE SCAN | EMP_NAME_IX | 3| |1 (0)|00:00:01|
|*5| INDEX UNIQUE SCAN | DEPT_ID_PK | 1| |0 (0)| |
| 6| TABLE ACCESS BY INDEX ROWID | DEPARTMENTS | 1| 16|1 (0)|00:00:01|
----------------------------------------------------------------------------------
Abel,employees_rowid
Ande,employees_rowid
Atkinson,employees_rowid
Austin,employees_rowid
b. Using the rowids from the previous step, the database retrieves a batch of rows from
the employees table (Step 3). For example:
Abel,Ellen,80
Abel,John,50
These rows become the outer row source for the innermost nested loop.
The batch step is typically part of adaptive execution plans. To determine whether a
nested loops join is better than a hash join, the optimizer needs to determine how many
rows come back from the row source. If too many rows are returned, then the optimizer
switches to a different join method.
c. For each row in the outer row source, the database scans the dept_id_pk
index to obtain the rowid in departments of the matching department ID (Step
5), and joins it to the employees rows. For example:
Abel,Ellen,80,departments_rowid
Ande,Sundar,80,departments_rowid
Atkinson,Mozhe,50,departments_rowid
Austin,David,60,departments_rowid
These rows become the outer row source for the outer nested loop (Step 1).
2. The database iterates through the outer nested loop as follows:
a. The database reads the first row in outer row source.
For example:
Abel,Ellen,80,departments_rowid
b. The database uses the departments rowid to retrieve the corresponding row
from departments (Step 6), and then joins the result to obtain the requested
values (Step 1).
For example:
Abel,Ellen,80,Sales
c. The database reads the next row in the outer row source, uses the
departments rowid to retrieve the corresponding row from departments (Step
6), and iterates through the loop until all rows are retrieved.
The result set has the following form:
Abel,Ellen,80,Sales
Ande,Sundar,80,Sales
Atkinson,Mozhe,50,Shipping
Austin,David,60,IT
Consider the query in "Original Implementation for Nested Loops Joins". In the current
implementation, the execution plan for this query might be as follows:
-------------------------------------------------------------------------------------
| Id | Operation | Name |Rows|Bytes|Cost%CPU| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 19 | 722 | 3 (0)|00:00:01|
| 1 | NESTED LOOPS | | | | | |
| 2 | NESTED LOOPS | | 19 | 722 | 3 (0)|00:00:01|
|* 3 | TABLE ACCESS FULL | DEPARTMENTS | 2 | 32 | 2 (0)|00:00:01|
|* 4 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 10 | | 0 (0)|00:00:01|
| 5 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES | 10 | 220 | 1 (0)|00:00:01|
-------------------------------------------------------------------------------------
In this case, rows from the hr.departments table form the outer row source (Step 3) of the
inner nested loop (Step 2). The index emp_department_ix is the inner row source (Step 4) of
the inner nested loop. The results of the inner nested loop form the outer row source
(Step 2) of the outer nested loop (Step 1). The hr.employees table is the inner row source
(Step 5) of the outer nested loop.
For each fetch request, the basic process is as follows:
1. The database iterates through the inner nested loop (Step 2) to obtain the rows
requested in the fetch:
a. The database reads the first row of departments to obtain the department IDs for
departments named Marketing or Sales (Step 3). For example:
Marketing,20
This row set is the outer loop. The database caches the data in the PGA.
b. The database scans emp_department_ix, which is an index on the employees table,
to find employees rowids that correspond to this department ID (Step 4), and then
joins the result (Step 2).
The result set has the following form:
Marketing,20,employees_rowid
Marketing,20,employees_rowid
Marketing,20,employees_rowid
c. The database reads the next row of departments, scans emp_department_ix to find
employees rowids that correspond to this department ID, and then iterates through
the loop until the client request is satisfied.
In this example, the database only iterates through the outer loop twice
because only two rows from departments satisfy the predicate filter.
Conceptually, the result set has the following form:
Marketing,20,employees_rowid
Marketing,20,employees_rowid
Marketing,20,employees_rowid
.
.
.
Sales,80,employees_rowid
Sales,80,employees_rowid
Sales,80,employees_rowid
.
.
.
These rows become the outer row source for the outer nested loop (Step 1).
This row set is cached in the PGA.
2. The database organizes the rowids obtained in the previous step so that it can
more efficiently access them in the cache.
3. The database begins iterating through the outer nested loop as follows:
a. The database retrieves the first row from the row set obtained in the previous
step, as in the following example:
Marketing,20,employees_rowid
b. Using the rowid, the database retrieves a row from employees to obtain the
requested values (Step 1), as in the following example:
Michael,Hartstein,13000,Marketing
c. The database retrieves the next row from the row set, uses the rowid to probe
employees for the matching row, and iterates through the loop until all rows are
retrieved.
The result set has the following form:
Michael,Hartstein,13000,Marketing
Pat,Fay,6000,Marketing
John,Russell,14000,Sales
Karen,Partners,13500,Sales
Alberto,Errazuriz,12000,Sales
.
.
.
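The two-phase fetch processing described above can be sketched in Python. This is a simplified, illustrative model only — the table contents, rowids, and function names are invented for the example, and real Oracle structures work differently:

```python
# Simplified model of the batched nested loops join described above.
# The inner nested loop joins departments to the index to produce rowids;
# the outer nested loop then fetches employees rows by rowid.

departments = [("Marketing", 20), ("Sales", 80), ("IT", 60)]
# Index on employees(department_id): department_id -> employees rowids
emp_department_ix = {20: ["AAA1", "AAA2"], 80: ["AAA7", "AAA3"], 60: ["AAA9"]}
# Employees rows addressable by rowid
employees = {"AAA1": ("Michael", "Hartstein"), "AAA2": ("Pat", "Fay"),
             "AAA3": ("John", "Russell"), "AAA7": ("Karen", "Partners"),
             "AAA9": ("Diana", "Lorentz")}

def nested_loops_batched(predicate):
    # Inner nested loop: scan departments, probe the index for rowids.
    outer_rows = []
    for dept_name, dept_id in departments:
        if not predicate(dept_name):
            continue
        for rowid in emp_department_ix.get(dept_id, []):
            outer_rows.append((dept_name, dept_id, rowid))
    # Organize the cached rowids for more efficient access (here: sort them).
    outer_rows.sort(key=lambda r: r[2])
    # Outer nested loop: fetch each employees row by rowid.
    result = []
    for dept_name, dept_id, rowid in outer_rows:
        first, last = employees[rowid]
        result.append((first, last, dept_name))
    return result

rows = nested_loops_batched(lambda name: name in ("Marketing", "Sales"))
```

Sorting the batched rowids before the table access is what lets the database visit table blocks in a cache-friendly order.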
In some cases, a second join row source is not allocated, and the execution plan looks
the same as it did before Oracle Database 11g. The following list describes such
cases:
• All of the columns needed from the inner side of the join are present in the index, and
there is no table access required. In this case, Oracle Database allocates only one join
row source.
• The order of the rows returned might be different from the order returned in releases
earlier than Oracle Database 12c. Thus, when Oracle Database tries to preserve a
specific ordering of the rows, for example to eliminate the need for an ORDER BY sort,
Oracle Database might use the original implementation for nested loops joins.
• The OPTIMIZER_FEATURES_ENABLE initialization parameter is set to a release before
Oracle Database 11g. In this case, Oracle Database uses the original implementation for
nested loops joins.
In releases before Oracle Database 11g, the execution plan for this query might appear as
follows:
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |Cost (%CPU)|Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 19 | 722 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES | 10 | 220 | 1 (0)| 00:00:01 |
| 2 | NESTED LOOPS | | 19 | 722 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL | DEPARTMENTS | 2 | 32 | 2 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 10 | | 0 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
This row set is the outer loop. The database caches the rows in the PGA. Conceptually, the row set has the following form:
Marketing,20,employees_rowid
Marketing,20,employees_rowid
Marketing,20,employees_rowid
Marketing,20,employees_rowid
Marketing,20,employees_rowid
Marketing,20,employees_rowid
.
.
.
Sales,80,employees_rowid
Sales,80,employees_rowid
Sales,80,employees_rowid
.
.
.
2. Depending on the circumstances, the database may organize the cached rowids
obtained in the previous step so that it can more efficiently access them.
3. For each employees rowid in the result set generated by the nested loop, the
database retrieves a row from employees to obtain the requested values (Step 1).
Thus, the basic process is to read a rowid and retrieve the matching employees
row, read the next rowid and retrieve the matching employees row, and so on.
Conceptually, the result set has the following form:
Michael,Hartstein,13000,Marketing
Pat,Fay,6000,Marketing
John,Russell,14000,Sales
Karen,Partners,13500,Sales
Alberto,Errazuriz,12000,Sales
.
.
.
The related USE_NL_WITH_INDEX(table index) hint instructs the optimizer to join the
specified table to another row source with a nested loops join, using the specified table as the
inner table. The index is optional. If no index is specified, then the nested loops join uses an
index with at least one join predicate as the index key.
Example 9-3 Nested Loops Hint
Assume that the optimizer chooses a hash join for the following query:
---------------------------------------------------------------------------
|Id | Operation | Name | Rows| Bytes |Cost(%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 5 (100)| |
|*1 | HASH JOIN | | 106 | 2862 | 5 (20)| 00:00:01 |
| 2 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 432 | 2 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| EMPLOYEES | 107 | 1177 | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------
To force a nested loops join using departments as the inner table, add the USE_NL hint as in
the following query:
---------------------------------------------------------------------------
| Id | Operation | Name | Rows |Bytes |Cost (%CPU)|Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 34 (100)| |
| 1 | NESTED LOOPS | | 106 | 2862 | 34 (3)| 00:00:01 |
| 2 | TABLE ACCESS FULL| EMPLOYEES | 107 | 1177 | 2 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| DEPARTMENTS | 1 | 16 | 0 (0)| |
---------------------------------------------------------------------------
3 - filter("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
The basic process is as follows:
1. The database reads a row in employees to obtain the last name and department ID
(Step 2). For example:
De Haan,90
2. For the row obtained in the previous step, the database scans departments to find
the department name that matches the employees department ID (Step 3), and
joins the result (Step 1). For example:
De Haan,Executive
3. The database retrieves the next row in employees, retrieves the matching row from
departments, and then repeats this process until all rows are retrieved.
The result set has the following form:
De Haan,Executive
Kochhar,Executive
Baer,Public Relations
King,Executive
.
.
.
See Also:
• "Guidelines for Join Order Hints" to learn more about the USE_NL hint
• Oracle Database SQL Language Reference to learn about the USE_NL
hint
to the temporary tablespace. This method can still be the most cost effective, especially when
the database uses parallel query servers.
The database applies the hash function to each department_id in the table, generating a
hash value for each. For this illustration, the hash table has 5 slots (it could have more or
fewer). Because n is 5, the possible hash values range from 1 to 5. The hash function might
generate the following values for the department IDs:
f(10) = 4
f(20) = 1
f(30) = 4
f(40) = 2
f(50) = 5
Note that the hash function happens to generate the same hash value of 4 for departments
10 and 30. This is known as a hash collision. In this case, the database puts the records for
departments 10 and 30 in the same slot, using a linked list. Conceptually, the hash table looks
as follows:
1 20,Marketing,201,1800
2 40,Human Resources,203,2400
3
4 10,Administration,200,1700 -> 30,Purchasing,114,1700
5 50,Shipping,121,1500
2. The database probes the second data set, called the probe table, using
whichever access mechanism has the lowest cost.
Typically, the database performs a full scan of both the smaller and larger data set.
For each row retrieved from the larger data set, the database does the following:
a. Applies the same hash function to the join column or columns to calculate the
number of the relevant slot in the hash table.
For example, to probe the hash table for department ID 30, the database
applies the hash function to 30, which generates the hash value 4.
b. Probes the hash table to determine whether rows exist in the slot.
If no rows exist, then the database processes the next row in the larger data
set. If rows exist, then the database proceeds to the next step.
c. Checks the join column or columns for a match. If a match occurs, then the
database either reports the rows or passes them to the next step in the plan,
and then processes the next row in the larger data set.
If multiple rows exist in the hash table slot, the database walks through the
linked list of rows, checking each one. For example, if department 30 hashes
to slot 4, then the database checks each row until it finds 30.
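The build and probe phases described above can be sketched in Python. This is a simplified model; the table data is illustrative, and the hash function simply reproduces the hash values from the illustration (including the collision of departments 10 and 30 in slot 4):

```python
# Simplified model of the hash join build and probe phases described above.

departments = [(10, "Administration"), (20, "Marketing"),
               (30, "Purchasing"), (40, "Human Resources"), (50, "Shipping")]
employees = [("Whalen", 10), ("Hartstein", 20), ("Fay", 20), ("Raphaely", 30)]

def f(dept_id):
    # The hash values from the illustration above; 10 and 30 collide in slot 4.
    return {10: 4, 20: 1, 30: 4, 40: 2, 50: 5}[dept_id]

# Build phase: hash the smaller data set; colliding rows chain within a slot.
hash_table = {}
for dept_id, dept_name in departments:
    hash_table.setdefault(f(dept_id), []).append((dept_id, dept_name))

# Probe phase: scan the larger data set, hash each join key, and walk the
# chain in the matching slot, checking the join column for a real match.
result = []
for last_name, dept_id in employees:
    for build_id, dept_name in hash_table.get(f(dept_id), []):
        if build_id == dept_id:
            result.append((last_name, dept_name))
```

Walking the chain in slot 4 is why a collision costs extra comparisons but never produces a wrong answer: the join column itself is always checked.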
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 665 | 13300 | 8 (25)|
|* 1 | HASH JOIN | | 665 | 13300 | 8 (25)|
| 2 | TABLE ACCESS FULL | ORDERS | 105 | 840 | 4 (25)|
| 3 | TABLE ACCESS FULL | ORDER_ITEMS | 665 | 7980 | 4 (25)|
--------------------------------------------------------------------------
Because the orders table is small relative to the order_items table, which is 6 times larger,
the database hashes orders. In a hash join, the data set for the build table always appears
first in the list of operations (Step 2). In Step 3, the database performs a full scan of the larger
order_items table, probing the hash table for each row.
9.2.2.3 How Hash Joins Work When the Hash Table Does Not Fit in the PGA
The database must use a different technique when the hash table does not fit entirely in the
PGA. In this case, the database uses a temporary space to hold portions (called partitions) of
the hash table, and sometimes portions of the larger table that probes the hash table.
The basic process is as follows:
1. The database performs a full scan of the smaller data set, and then builds an array of
hash buckets in both the PGA and on disk.
When the PGA hash area fills up, the database finds the largest partition within the hash
table and writes it to temporary space on disk. The database stores any new row that
belongs to this on-disk partition on disk, and all other rows in the PGA. Thus, part of the
hash table is in memory and part of it on disk.
2. The database takes a first pass at reading the other data set.
For each row, the database does the following:
a. Applies the same hash function to the join column or columns to calculate the
number of the relevant hash bucket.
b. Probes the hash table to determine whether rows exist in the bucket in memory.
If the hashed value points to a row in memory, then the database completes the join
and returns the row. If the value points to a hash partition on disk, however, then the
database stores this row in the temporary tablespace, using the same
partitioning scheme used for the original data set.
3. The database reads each on-disk temporary partition one by one.
4. The database joins each partition row to the row in the corresponding on-disk
temporary partition.
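The on-disk case can be modeled by partitioning both inputs with the same hash function and joining matching partitions pairwise. In this sketch the "disk" partitions are plain in-memory lists purely for illustration, and the tables and names are invented:

```python
# Simplified model of a hash join whose hash table does not fit in memory:
# both data sets are split into partitions with the same hash function, and
# each build partition is joined with the corresponding probe partition.

N_PARTITIONS = 4

def partition(rows, key):
    # "Disk" partitions modeled as in-memory lists.
    parts = [[] for _ in range(N_PARTITIONS)]
    for row in rows:
        parts[hash(key(row)) % N_PARTITIONS].append(row)
    return parts

def hash_join_partitioned(build_rows, probe_rows, build_key, probe_key):
    build_parts = partition(build_rows, build_key)
    probe_parts = partition(probe_rows, probe_key)
    result = []
    for b_part, p_part in zip(build_parts, probe_parts):
        table = {}                       # each partition fits in memory
        for row in b_part:
            table.setdefault(build_key(row), []).append(row)
        for row in p_part:
            for match in table.get(probe_key(row), []):
                result.append(match + row)
    return result

orders = [(1, "Smith"), (2, "Jones")]                    # (order_id, customer)
items = [(1, "widget"), (1, "gadget"), (2, "sprocket")]  # (order_id, product)
joined = hash_join_partitioned(orders, items,
                               build_key=lambda r: r[0],
                               probe_key=lambda r: r[0])
```

Because the same hash function partitions both inputs, rows that can join always land in partitions with the same number, so each partition pair can be joined independently.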
For example, the following table shows sorted values in two data sets: temp_ds1 and
temp_ds2.
temp_ds1 temp_ds2
10 20
20 20
30 40
40 40
50 40
60 40
70 40
. 60
. 70
. 70
As shown in the following table, the database begins by reading 10 in temp_ds1, and
then reads the first value in temp_ds2. Because 20 in temp_ds2 is higher than 10 in
temp_ds1, the database stops reading temp_ds2.
The database proceeds to the next value in temp_ds1, which is 20. The database
proceeds through temp_ds2 as shown in the following table.
The database proceeds to the next row in temp_ds1, which is 30. The database starts at the
number of its last match, which was 20, and then proceeds through temp_ds2 looking for a
match, as shown in the following table.
The database proceeds to the next row in temp_ds1, which is 40. As shown in the following
table, the database starts at the number of its last match in temp_ds2, which was 20, and then
proceeds through temp_ds2 looking for a match.
The database continues in this way until it has matched the final 70 in temp_ds2. This
scenario demonstrates that the database, as it reads through temp_ds1, does not need
to read every row in temp_ds2. This is an advantage over a nested loops join.
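The scan pattern described above can be sketched in Python: as the database reads through temp_ds1, it resumes temp_ds2 at the position of its last match rather than rescanning from the beginning. This is a simplified model of a merge scan, not Oracle's implementation:

```python
# Simplified model of the sort merge scan described above. Both inputs must
# already be sorted. For each value in ds1, the scan of ds2 resumes from the
# position of the last match instead of restarting at the beginning.

def sort_merge_join(ds1, ds2):
    matches = []
    start = 0                        # resume position in ds2
    for v1 in ds1:
        i = start
        while i < len(ds2) and ds2[i] < v1:   # skip values below v1
            i += 1
        start = i                    # remember where this run begins
        while i < len(ds2) and ds2[i] == v1:  # collect the run of equals
            matches.append((v1, ds2[i]))
            i += 1                   # then stop reading ds2
    return matches

temp_ds1 = [10, 20, 30, 40, 50, 60, 70]
temp_ds2 = [20, 20, 40, 40, 40, 40, 40, 60, 70, 70]
pairs = sort_merge_join(temp_ds1, temp_ds2)
```

Because both inputs are sorted, no ds1 value ever needs to revisit ds2 rows before the start of its predecessor's match run — the advantage over a nested loops join noted above.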
A query of DBMS_XPLAN.DISPLAY_CURSOR shows that the plan uses a sort merge join:
---------------------------------------------------------------------------
| Id| Operation                    | Name        |Rows|Bytes|Cost (%CPU)|Time    |
---------------------------------------------------------------------------
| 0| SELECT STATEMENT              |             |    |     | 5 (100)|        |
| 1|  MERGE JOIN                   |             |106 |4028 | 5  (20)|00:00:01|
| 2|   TABLE ACCESS BY INDEX ROWID | DEPARTMENTS | 27 | 432 | 2   (0)|00:00:01|
| 3|    INDEX FULL SCAN            | DEPT_ID_PK  | 27 |     | 1   (0)|00:00:01|
|*4|   SORT JOIN                   |             |107 |2354 | 3  (34)|00:00:01|
| 5|    TABLE ACCESS FULL          | EMPLOYEES   |107 |2354 | 2   (0)|00:00:01|
---------------------------------------------------------------------------
4 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
filter("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
The two data sets are the departments table and the employees table. Because an
index orders the departments table by department_id, the database can read this
index and avoid a sort (Step 3). The database only needs to sort the employees table
(Step 4), which is the most CPU-intensive operation.
A query of DBMS_XPLAN.DISPLAY_CURSOR shows that the plan uses a sort merge join:
---------------------------------------------------------------------------
| Id| Operation | Name | Rows| Bytes|Cost (%CPU)|Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 6 (100)| |
| 1 | MERGE JOIN | | 106 | 9540 | 6 (34)| 00:00:01|
| 2 | SORT JOIN | | 27 | 567 | 3 (34)| 00:00:01|
| 3 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 567 | 2 (0)| 00:00:01|
|*4 | SORT JOIN | | 107 | 7383 | 3 (34)| 00:00:01|
| 5 | TABLE ACCESS FULL| EMPLOYEES | 107 | 7383 | 2 (0)| 00:00:01|
---------------------------------------------------------------------------
4 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
filter("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
In some situations it may make sense to override the optimizer with the USE_MERGE hint. For
example, the optimizer can choose a full scan on a table and avoid a sort operation in a
query. However, there is an increased cost because a large table is accessed through an
index and single block reads, as opposed to faster access through a full table scan.
See Also:
Oracle Database SQL Language Reference to learn about the USE_MERGE hint
Chapter 9
Join Types
9.3.1.1 Equijoins
An equijoin is an inner join whose join condition contains an equality operator.
For example, a join of employees and departments on e.department_id =
d.department_id is an equijoin because the join condition contains only an equality operator.
9.3.1.2 Nonequijoins
A nonequijoin is an inner join whose join condition contains an operator that is not an
equality operator.
The following query lists all employees whose hire date occurred when employee 176
(who is listed in job_history because he changed jobs in 2007) was working at the
company.
In the preceding example, the condition joining employees and job_history does not
contain an equality operator, so it is a nonequijoin. Nonequijoins are relatively rare.
Note that a hash join requires at least a partial equijoin. The following SQL script
contains an equality join condition (e1.empno = e2.empno) and a nonequality
condition on the hiredate column.
The optimizer chooses a hash join for the preceding query, as shown in the following plan:
Execution Plan
----------------------------------------------------------
Plan hash value: 3638257876
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 174 | 5 (20)| 00:00:01 |
|* 1 | HASH JOIN | | 1 | 174 | 5 (20)| 00:00:01 |
| 2 | TABLE ACCESS FULL| EMP | 14 | 1218 | 2 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| EMP | 14 | 1218 | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------
1 - access("E1"."EMPNO"="E2"."EMPNO")
filter("E1"."HIREDATE">=INTERNAL_FUNCTION("E2"."HIREDATE")-1 AND
"E1"."HIREDATE"<=INTERNAL_FUNCTION("E2"."HIREDATE")+1)
SELECT e1.last_name ||
' has salary between 100 less and 100 more than ' ||
e2.last_name AS "SALARY COMPARISON"
FROM employees e1,
employees e2
WHERE e1.salary
BETWEEN e2.salary - 100
AND e2.salary + 100;
SALARY COMPARISON
-------------------------------------------------------------
King has salary between 100 less and 100 more than King
Kochhar has salary between 100 less and 100 more than Kochhar
Kochhar has salary between 100 less and 100 more than De Haan
De Haan has salary between 100 less and 100 more than Kochhar
De Haan has salary between 100 less and 100 more than De Haan
Russell has salary between 100 less and 100 more than Russell
Partners has salary between 100 less and 100 more than Partners
...
------------------------------------------
PLAN_TABLE_OUTPUT
------------------------------------------
------------------------------------------
| Id | Operation | Name |
------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | MERGE JOIN | |
| 2 | SORT JOIN | |
| 3 | TABLE ACCESS FULL | EMPLOYEES |
|* 4 | FILTER | |
|* 5 | SORT JOIN | |
| 6 | TABLE ACCESS FULL| EMPLOYEES |
------------------------------------------
In this plan, Step 2 sorts the e1 row source, and Step 5 sorts the e2 row source. The
sorted row sources are illustrated in the following table.
The join begins by iterating through the sorted input (e1), which is the left branch of the
join, corresponding to Step 2 of the plan. The original query contains two predicates:
• e1.sal >= e2.sal–100, which is the Step 5 filter
• e1.sal <= e2.sal+100, which is the Step 4 filter
The following table shows the first iteration of e1, which begins with 24000 (King) in data set
e1. The database determines that the first row in e2, which is 24000 (King), passes the Step
5 filter. The database then sends the row to the Step 4 filter, e1.sal <= e2.sal+100, which
also passes. The database sends this row to the MERGE row source. Next, the database
checks 17000 (Kochhar) against the Step 5 filter, which also passes. However, the row fails
the Step 4 filter, and is discarded. The database proceeds to test 17000 (De Haan) against
the Step 5 filter.
Table 9-9 First Iteration of e1: Separate SORT JOIN and FILTER

Scan e2           Step 5 Filter (e1.sal >= e2.sal–100)   Step 4 Filter (e1.sal <= e2.sal+100)
----------------  -------------------------------------  --------------------------------------
24000 (King)      Pass because 24000 >= 23900.           Pass because 24000 <= 24100.
                  Send to Step 4 filter.                 Return row for merging.
17000 (Kochhar)   Pass because 24000 >= 16900.           Fail because 24000 <= 17100 is false.
                  Send to Step 4 filter.                 Discard row. Scan next row in e2.
17000 (De Haan)   Pass because 24000 >= 16900.           Fail because 24000 <= 17100 is false.
                  Send to Step 4 filter.                 Discard row. Scan next row in e2.
14000 (Russell)   Pass because 24000 >= 13900.           Fail because 24000 <= 14100 is false.
                  Send to Step 4 filter.                 Discard row. Scan next row in e2.
13500 (Partners)  Pass because 24000 >= 13400.           Fail because 24000 <= 13600 is false.
                  Send to Step 4 filter.                 Discard row. Scan next row in e2.
As shown in the preceding table, every e2 row necessarily passes the Step 5 filter because
the e2 salaries are sorted in descending order. Thus, the Step 5 filter always sends the row to
the Step 4 filter. Because the e2 salaries are sorted in descending order, the Step 4 filter
necessarily fails every row starting with 17000 (Kochhar). The inefficiency occurs because
the database tests every subsequent row in e2 against the Step 5 filter, which necessarily
passes, and then against the Step 4 filter, which necessarily fails.
Example 9-8 Query With Band Join Optimization
Starting in Oracle Database 12c Release 2 (12.2), the database optimizes the band join by
using the following plan, which does not have a separate FILTER operation:
------------------------------------------
PLAN_TABLE_OUTPUT
------------------------------------------
| Id | Operation | Name |
------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | MERGE JOIN | |
| 2 | SORT JOIN | |
| 3 | TABLE ACCESS FULL | EMPLOYEES |
|* 4 | SORT JOIN | |
| 5 | TABLE ACCESS FULL | EMPLOYEES |
------------------------------------------
The difference is that Step 4 uses Boolean AND logic for the two predicates to create a
single filter. Instead of checking a row against one filter, and then sending it to a
different row source for checking against a second filter, the database performs one
check against one filter. If the check fails, then processing stops.
In this example, the query begins the first iteration of e1, which begins with 24000
(King). The range (band) for this row is 23900 through 24100: e2 values below 23900 and
above 24100 fall outside the range.
The following table shows that the database tests the first row of e2, which is 24000
(King), against the Step 4 filter. The row passes the test, so the database sends the
row to be merged. The next row in e2 is 17000 (Kochhar). This row falls outside of the
range (band) and thus does not satisfy the filter predicate, so the database stops
testing e2 rows in this iteration. The database stops testing because the descending
sort of e2 ensures that all subsequent rows in e2 fail the filter test. Thus, the database
can proceed to the second iteration of e1.
Scan e2           Filter 4: (e1.sal >= e2.sal – 100) AND (e1.sal <= e2.sal + 100)
----------------  ---------------------------------------------------------------
24000 (King)      Passes test because it is true that (24000 >= 23900) AND
                  (24000 <= 24100). Send row to MERGE. Test next row.
17000 (Kochhar)   Fails test because it is false that (24000 >= 16900) AND
                  (24000 <= 17100). Stop scanning e2. Begin next iteration of e1.
17000 (De Haan)   n/a
14000 (Russell)   n/a
13500 (Partners)  n/a
In this way, the band join optimization eliminates unnecessary processing. Instead of
scanning every row in e2 as in the unoptimized case, the database scans only the
minimum two rows.
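The optimized scan can be sketched in Python. This is a simplified model of the behavior described above, with e2 sorted in descending order; the salary values come from the example, and the function name is illustrative:

```python
# Simplified model of the band join scan described above. For each e1 salary,
# the database scans e2 (sorted descending) and stops at the first row below
# the band, because the descending sort guarantees all later rows also fail.

def band_join_scan(e1_salary, e2_sorted_desc, band=100):
    matches, tested = [], 0
    for sal in e2_sorted_desc:
        tested += 1
        if sal > e1_salary + band:    # above the band: keep scanning
            continue
        if sal < e1_salary - band:    # below the band: stop, rest must fail
            break
        matches.append(sal)           # inside [e1_salary-band, e1_salary+band]
    return matches, tested

e2 = [24000, 17000, 17000, 14000, 13500]   # King, Kochhar, De Haan, ...
matches, tested = band_join_scan(24000, e2)
```

For the first e1 row (24000), the scan touches only two rows — the single match and the first row below the band — matching the "minimum two rows" behavior described above.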
Outer joins require the outer-joined table to be the driving table. In the preceding example,
employees is the driving table, and departments is the driven-to table.
The cost determines the order of tables. The outer table, including preserved rows,
may be used to build the hash table, or it may be used to probe the hash table.
Example 9-9 Hash Join Outer Joins
This example shows a typical hash join outer join query, and its execution plan. In this
example, all the customers with credit limits greater than 1000 are queried. An outer
join is needed so that the query captures customers who have no orders.
• The outer table is customers.
• The inner table is orders.
• The join preserves the customers rows, including those rows without a
corresponding row in orders.
You could use a NOT EXISTS subquery to return the rows. However, because you are
querying all the rows in the table, the hash join performs better (unless the NOT EXISTS
subquery is not nested).
------------------------------------------------------------------------
| Id | Operation           | Name      | Rows | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------
|  0 | SELECT STATEMENT    |           |      |       |    7 (100)|          |
|  1 |  HASH GROUP BY      |           |  168 |  3192 |    7  (29)| 00:00:01 |
|* 2 |   HASH JOIN OUTER   |           |  318 |  6042 |    6  (17)| 00:00:01 |
|* 3 |    TABLE ACCESS FULL| CUSTOMERS |  260 |  3900 |    3   (0)| 00:00:01 |
|* 4 |    TABLE ACCESS FULL| ORDERS    |  105 |   420 |    2   (0)| 00:00:01 |
------------------------------------------------------------------------

2 - access("C"."CUSTOMER_ID"="O"."CUSTOMER_ID")
3 - filter("C"."CREDIT_LIMIT">1000)
4 - filter("O"."CUSTOMER_ID">0)
9-32
Chapter 9
Join Types
The query looks for customers that satisfy various conditions. An outer join returns NULL for
the inner table columns along with the outer (preserved) table rows when it does not find any
corresponding rows in the inner table. This operation finds all the customers rows that do not
have any orders rows.
customers.customer_id = orders.customer_id(+)
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 144 | 4608 | 16 (32)|
| 1 | HASH GROUP BY | | 144 | 4608 | 16 (32)|
|* 2 | HASH JOIN OUTER | | 663 | 21216 | 15 (27)|
|* 3 | TABLE ACCESS FULL | CUSTOMERS | 195 | 2925 | 6 (17)|
| 4 | VIEW | V_ORDERS | 665 | 11305 | |
| 5 | HASH GROUP BY | | 665 | 15960 | 9 (34)|
|* 6 | HASH JOIN | | 665 | 15960 | 8 (25)|
|* 7 | TABLE ACCESS FULL| ORDERS | 105 | 840 | 4 (25)|
| 8 | TABLE ACCESS FULL| ORDER_ITEMS | 665 | 10640 | 4 (25)|
---------------------------------------------------------------------------
DEPARTMENT_ID EMPLOYEE_ID
------------- -----------
           10         200
           20         201
           20         202
           30         114
           30         115
           30         116
...
          270
          280
                      178
                      207
---------------------------------------------------------------------------
| Id| Operation                | Name       |Rows|Bytes |Cost (%CPU)|Time    |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT         |            |122 | 4758 | 6  (34)|00:00:01|
| 1 |  SORT ORDER BY           |            |122 | 4758 | 6  (34)|00:00:01|
| 2 |   VIEW                   | VW_FOJ_0   |122 | 4758 | 5  (20)|00:00:01|
|*3 |    HASH JOIN FULL OUTER  |            |122 | 1342 | 5  (20)|00:00:01|
| 4 |     INDEX FAST FULL SCAN | DEPT_ID_PK | 27 |  108 | 2   (0)|00:00:01|
| 5 |     TABLE ACCESS FULL    | EMPLOYEES  |107 |  749 | 2   (0)|00:00:01|
---------------------------------------------------------------------------
HASH JOIN FULL OUTER is included in the preceding plan (Step 3), indicating that the query
uses the hash full outer join execution method. Typically, when the full outer join condition
between two tables is an equijoin, the hash full outer join execution method is possible, and
Oracle Database uses it automatically.
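A hash full outer join can be sketched as an ordinary hash join that also tracks which build rows found a match. This is a simplified model with invented table data, not Oracle's implementation:

```python
# Simplified model of a hash full outer join: probe rows with no match are
# returned with NULLs on the build side, and build rows never matched are
# returned at the end with NULLs on the probe side.

def hash_full_outer_join(build, probe, build_key, probe_key):
    table = {}
    for row in build:
        table.setdefault(build_key(row), []).append(row)
    matched_keys = set()
    result = []
    for row in probe:
        k = probe_key(row)
        if k in table:
            matched_keys.add(k)
            for b in table[k]:
                result.append(b + row)
        else:
            result.append((None,) * 2 + row)    # preserved probe row
    for k, rows_ in table.items():
        if k not in matched_keys:
            for b in rows_:
                result.append(b + (None,) * 2)  # preserved build row
    return result

# Row widths (2 columns per side) are hardcoded for this illustration.
departments = [(10, "Administration"), (20, "Marketing"), (30, "Purchasing")]
employees = [("Whalen", 10), ("Hartstein", 20), ("Grant", None)]
rows = hash_full_outer_join(departments, employees,
                            build_key=lambda d: d[0],
                            probe_key=lambda e: e[1])
```

Both preserved sides appear in the output: the employee with no department, and the department with no employees.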
To instruct the optimizer to consider using the hash full outer join execution method, apply the
NATIVE_FULL_OUTER_JOIN hint. To instruct the optimizer not to consider using the hash full
outer join execution method, apply the NO_NATIVE_FULL_OUTER_JOIN hint. The
NO_NATIVE_FULL_OUTER_JOIN hint instructs the optimizer to exclude the native execution
method when joining each specified table. Instead, the full outer join is executed as a union of
a left outer join and an antijoin.
9.3.3 Semijoins
A semijoin is a join between two data sets that returns a row from the first set when a
matching row exists in the subquery data set.
The database stops processing the second data set at the first match. Thus, this
optimization does not duplicate rows from the first data set when multiple rows in the
second data set satisfy the subquery criteria.
Note:
Semijoins and antijoins are considered join types even though the SQL
constructs that cause them are subqueries. They are internal algorithms that
the optimizer uses to flatten subquery constructs so that they can be
resolved in a join-like way.
FOR ds1_row IN ds1 LOOP
    match := false;
    FOR ds2_row IN ds2_subquery LOOP
        IF (ds1_row matches ds2_row) THEN
            match := true;
            EXIT;
        END IF
    END LOOP
    IF (match = true) THEN
        RETURN ds1_row
    END IF
END LOOP
In the preceding pseudocode, ds1 is the first data set, and ds2_subquery is the subquery data
set. The code obtains the first row from the first data set, and then loops through the
subquery data set looking for a match. The code exits the inner loop as soon as it finds a
match, and then begins processing the next row in the first data set.
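The semijoin behavior described above can be sketched in Python. This is a simplified illustrative model with invented table data:

```python
# Simplified model of the semijoin described above: the scan of the subquery
# data set stops at the first match, so each row from the first data set is
# returned at most once no matter how many rows match it.

def semijoin(ds1, ds2_subquery, key1, key2):
    result = []
    for row in ds1:
        for sub in ds2_subquery:
            if key1(row) == key2(sub):
                result.append(row)   # return the row ...
                break                # ... and stop probing at the first match
    return result

departments = [(10, "Administration"), (20, "Marketing"), (40, "HR")]
employees = [("Whalen", 10), ("Hartstein", 20), ("Fay", 20)]
rows = semijoin(departments, employees,
                key1=lambda d: d[0], key2=lambda e: e[1])
```

Note that Marketing appears once even though two employees match it, which is exactly why no duplication occurs.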
Example 9-13 Semijoin Using WHERE EXISTS
The following query uses a WHERE EXISTS clause to list only the departments that contain
employees:
---------------------------------------------------------------------------
| Id| Operation | Name |Rows|Bytes|Cost (%CPU)|Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 2 (100)| |
| 1 | NESTED LOOPS SEMI | |11 | 209 | 2 (0)|00:00:01 |
| 2 | TABLE ACCESS FULL| DEPARTMENTS |27 | 432 | 2 (0)|00:00:01 |
|*3 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX |44 | 132 | 0 (0)| |
---------------------------------------------------------------------------
For each row in departments, which forms the outer loop, the database obtains the
department ID, and then probes the employees.department_id index for matching entries.
Conceptually, the index looks as follows:
10,rowid
10,rowid
10,rowid
10,rowid
30,rowid
30,rowid
30,rowid
...
If the first entry in the departments table is department 30, then the database performs a
range scan of the index until it finds the first 30 entry, at which point it stops reading the index
and returns the matching row from departments. If the next row in the outer loop is
department 20, then the database scans the index for a 20 entry, and not finding any
matches, performs the next iteration of the outer loop. The database proceeds in this way
until all matching rows are returned.
Example 9-14 Semijoin Using IN
A query that uses the IN operator instead of WHERE EXISTS to list the departments that
contain employees generates the same plan:
---------------------------------------------------------------------------
| Id| Operation         | Name              |Rows|Bytes|Cost (%CPU)|Time    |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT  |                   |    |     | 2 (100)|        |
| 1 | NESTED LOOPS SEMI |                   | 11 | 209 | 2   (0)|00:00:01|
| 2 |  TABLE ACCESS FULL| DEPARTMENTS       | 27 | 432 | 2   (0)|00:00:01|
|*3 |  INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 44 | 132 | 0   (0)|        |
---------------------------------------------------------------------------
9.3.4 Antijoins
An antijoin is a join between two data sets that returns a row from the first set when a
matching row does not exist in the subquery data set.
Like a semijoin, an antijoin stops processing the subquery data set when the first
match is found. Unlike a semijoin, the antijoin only returns a row when no match is
found.
Execution Plan
----------------------------------------------------------
Plan hash value: 1543991079
------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |Cost (%CPU)|Time |
------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 1400 | 5 (20)| 00:00:01 |
|* 1 | HASH JOIN ANTI | | 14 | 1400 | 5 (20)| 00:00:01 |
| 2 | TABLE ACCESS FULL| EMP | 14 | 1218 | 2 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| DEPT | 4 | 52 | 2 (0)| 00:00:01 |
------------------------------------------------------------------------
1 - access("EMP"."DEPTNO"="DEPT"."DEPTNO")
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
FOR ds1_row IN ds1 LOOP
    match := false;
    FOR ds2_row IN ds2 LOOP
        IF (ds1_row matches ds2_row) THEN
            match := true;
            EXIT;
        END IF
    END LOOP
    IF (match = false) THEN
        RETURN ds1_row
    END IF
END LOOP
In the preceding pseudocode, ds1 is the first data set, and ds2 is the second data set.
The code obtains the first row from the first data set, and then loops through the
second data set looking for a match. The code exits the inner loop as soon as it finds a
match, discards the current row, and begins processing the next row in the first data
set. A row from the first data set is returned only when no match is found.
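The antijoin behavior described above can be sketched in Python. This is a simplified illustrative model with invented table data:

```python
# Simplified model of the antijoin described above: a row from the first data
# set is returned only when the scan of the second data set finds no match,
# and the inner scan stops at the first match.

def antijoin(ds1, ds2, key1, key2):
    result = []
    for row in ds1:
        for other in ds2:
            if key1(row) == key2(other):
                break                # match found: discard the row
        else:
            result.append(row)       # no match found: return the row
    return result

departments = [(10, "Administration"), (20, "Marketing"), (40, "HR")]
employees = [("Whalen", 10), ("Hartstein", 20)]
rows = antijoin(departments, employees,
                key1=lambda d: d[0], key2=lambda e: e[1])
```

Only the department with no matching employee survives the antijoin.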
Example 9-15 Antijoin Using WHERE NOT EXISTS
The following query uses a WHERE NOT EXISTS clause to list only the departments that
contain no employees:
---------------------------------------------------------------------------
| Id| Operation         | Name              |Rows|Bytes|Cost (%CPU)|Time    |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT  |                   |    |     | 2 (100)|        |
| 1 | NESTED LOOPS ANTI |                   | 17 | 323 | 2   (0)|00:00:01|
| 2 |  TABLE ACCESS FULL| DEPARTMENTS       | 27 | 432 | 2   (0)|00:00:01|
|*3 |  INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 41 | 123 | 0   (0)|        |
---------------------------------------------------------------------------
For each row in departments, which forms the outer loop, the database obtains the
department ID, and then probes the employees.department_id index for matching
entries. Conceptually, the index looks as follows:
10,rowid
10,rowid
10,rowid
10,rowid
30,rowid
30,rowid
30,rowid
...
If the first record in the departments table is department 30, then the database
performs a range scan of the index until it finds the first 30 entry, at which point it stops
reading the index and discards the departments row. If the next row in the outer loop is
department 20, then the database scans the index for a 20 entry, and not finding any
matches, returns the departments row. The database proceeds in this way until all
unmatched rows are returned.
The preceding query returns no rows even though several departments contain no
employees. This result, which was not intended by the user, occurs because the
employees.department_id column is nullable.
For the entire expression to be true, each individual condition must be true. However, a null
value cannot be compared to another value, so the department_id != null condition can
never be true, and thus the whole expression is always false. The following techniques enable
a statement to return records even when nulls are returned to the NOT IN operator:
The execution plan reveals a NESTED LOOPS ANTI SNA operation in Step 2:
---------------------------------------------------------------------------
| Id| Operation               | Name              |Rows|Bytes|Cost (%CPU)| Time    |
---------------------------------------------------------------------------
| 0| SELECT STATEMENT         |                   |    |     |  4 (100)  |         |
|*1|  FILTER                  |                   |    |     |           |         |
| 2|   NESTED LOOPS ANTI SNA  |                   | 17 | 323 |  4  (50)  |00:00:01 |
| 3|    TABLE ACCESS FULL     | DEPARTMENTS       | 27 | 432 |  2   (0)  |00:00:01 |
|*4|    INDEX RANGE SCAN      | EMP_DEPARTMENT_IX | 41 | 123 |  0   (0)  |         |
|*5|   TABLE ACCESS FULL      | EMPLOYEES         |  1 |   3 |  2   (0)  |00:00:01 |
---------------------------------------------------------------------------
ANTI SNA stands for "single null-aware antijoin," and ANTI NA stands for "null-aware
antijoin." The null-aware operation enables the optimizer to use the antijoin
optimization even on a nullable column. In releases earlier than Oracle Database 11g,
the database could not perform antijoins on NOT IN queries when nulls were possible.
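The null behavior that makes the null-aware antijoin necessary can be modeled with SQL's three-valued logic. The following Python sketch is purely illustrative: the NULL marker and the ne and not_in helpers are hypothetical, not Oracle code.

```python
# Illustrative sketch of SQL three-valued logic for NOT IN (hypothetical
# helpers, not Oracle code). col NOT IN (v1, v2, ...) means
# col != v1 AND col != v2 AND ...; any comparison with NULL is UNKNOWN,
# so a single NULL in the subquery makes the predicate never TRUE.

NULL = None

def ne(a, b):
    """Three-valued '!=': True, False, or None (UNKNOWN)."""
    if a is NULL or b is NULL:
        return None
    return a != b

def not_in(value, subquery_values):
    result = True
    for v in subquery_values:
        c = ne(value, v)
        if c is False:
            return False              # definite match: NOT IN is FALSE
        if c is None:
            result = None             # UNKNOWN poisons the conjunction
    return result

dept_ids = [10, 20, 30, 40]
emp_dept_ids = [10, 20, NULL]         # nullable employees.department_id

# A row qualifies only when the predicate is TRUE, never when UNKNOWN:
print([d for d in dept_ids if not_in(d, emp_dept_ids) is True])       # []
# Filtering out NULLs (the IS NOT NULL rewrite) restores the result:
non_null = [v for v in emp_dept_ids if v is not NULL]
print([d for d in dept_ids if not_in(d, non_null) is True])           # [30, 40]
```

The second query mirrors the IS NOT NULL rewrite discussed in this section: once nulls cannot reach the NOT IN operator, the antijoin returns the expected rows.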
Suppose that the user rewrites the query by applying an IS NOT NULL condition to the
subquery:
The preceding query returns 16 rows, which is the expected result. Step 1 in the plan
shows a standard NESTED LOOPS ANTI join instead of an ANTI NA or ANTI SNA join
because the subquery cannot return nulls:
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("DEPARTMENT_ID"="DEPARTMENT_ID")
filter("DEPARTMENT_ID" IS NOT NULL)
The preceding query avoids the null problem for NOT IN clauses. Thus, even though the
employees.department_id column is nullable, the statement returns the desired result.
Step 1 of the execution plan reveals a NESTED LOOPS ANTI operation, not the ANTI NA variant
that is required for NOT IN when nulls are possible:
---------------------------------------------------------------------------
| Id| Operation | Name |Rows|Bytes| Cost (%CPU)|Time|
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 2 (100)| |
| 1 | NESTED LOOPS ANTI | | 17 | 323 | 2 (0)|00:00:01|
| 2 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 432 | 2 (0)|00:00:01|
|*3 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 41 | 123 | 0 (0)| |
---------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
Note:
If a Cartesian join appears in a query plan, it could be caused by an
inadvertently omitted join condition. In general, if a query joins n tables,
then n-1 join conditions are required to avoid a Cartesian join.
At a high level, the algorithm for a Cartesian join looks as follows, where ds1 is
typically the smaller data set, and ds2 is the larger data set:
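As an illustrative sketch in Python (not the guide's pseudocode; the function and data set names are hypothetical), the algorithm buffers the larger data set once and pairs every ds1 row with every buffered ds2 row:

```python
# Illustrative sketch of a Cartesian join (not Oracle internals).
# Every row of ds1 is paired with every row of ds2, so the result has
# len(ds1) * len(ds2) rows. ds2 is read once and buffered, mirroring the
# BUFFER SORT step that appears in the execution plan below.

def cartesian_join(ds1, ds2):
    buffered_ds2 = list(ds2)          # buffer the larger data set once
    result = []
    for row1 in ds1:                  # outer loop over the smaller data set
        for row2 in buffered_ds2:     # pair with every buffered row
            result.append((row1, row2))
    return result

rows = cartesian_join(range(27), range(107))
print(len(rows))   # 27 * 107 = 2889
```

The row count, 27 * 107 = 2889, is exactly the Rows estimate that the optimizer produces for the MERGE JOIN CARTESIAN example that follows.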
---------------------------------------------------------------------------
| Id| Operation | Name | Rows | Bytes |Cost (%CPU)|Time|
---------------------------------------------------------------------------
| 0| SELECT STATEMENT | | | |11 (100)| |
| 1| MERGE JOIN CARTESIAN | | 2889 |57780 |11 (0)|00:00:01|
| 2| TABLE ACCESS FULL | DEPARTMENTS | 27 | 324 | 2 (0)|00:00:01|
| 3| BUFFER SORT | | 107 | 856 | 9 (0)|00:00:01|
| 4| INDEX FAST FULL SCAN| EMP_NAME_IX | 107 | 856 | 0 (0)| |
---------------------------------------------------------------------------
In Step 1 of the preceding plan, the CARTESIAN keyword indicates the presence of a Cartesian
join. The number of rows (2889) is the product of 27 and 107.
In Step 3, the BUFFER SORT operation indicates that the database is copying the data blocks
obtained by the scan of emp_name_ix from the SGA to the PGA. This strategy avoids multiple
scans of the same blocks in the database buffer cache, which would generate many logical
reads and could lead to resource contention.
The following execution plan shows a Cartesian product (Step 3) between locations (Step 6)
and employees (Step 4), which is then joined to the departments table (Step 2):
---------------------------------------------------------------------------
| Id| Operation | Name |Rows | Bytes |Cost (%CPU)|Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | |37 (100)| |
|*1 | HASH JOIN | | 106 | 4664 |37 (6)|00:00:01 |
| 2 | TABLE ACCESS FULL | DEPARTMENTS | 27 | 513 | 2 (0)|00:00:01 |
9-45
Chapter 9
Join Optimizations
See Also:
Oracle Database SQL Language Reference to learn about the ORDERED hint
• Filter members in Exadata cells, especially when joining a large fact table and small
dimension tables in a star schema
Bloom filters can occur in both parallel and serial processing.
e1 e2 e3 e4 e5 e6 e7 e8
0 0 0 0 0 0 0 0
This array represents a set. To represent an input value i in this array, three separate hash
functions (three is arbitrary) are applied to i, each generating a hash value between 1 and 8:
f1(i) = h1
f2(i) = h2
f3(i) = h3
For example, to store the value 17 in this array, the three hash functions are applied to
i = 17 and return the following hash values:
f1(17) = 5
f2(17) = 3
f3(17) = 5
In the preceding example, two of the hash functions happened to return the same value of 5,
known as a hash collision. Because the distinct hash values are 5 and 3, the 5th and 3rd
elements in the array are set to 1:
e1 e2 e3 e4 e5 e6 e7 e8
0 0 1 0 1 0 0 0
Testing the membership of 17 in the set reverses the process. To test whether the set
excludes the value 17, element 3 or element 5 must contain a 0. If a 0 is present in either
element, then the set cannot contain 17. No false negatives are possible.
To test whether the set includes 17, both element 3 and element 5 must contain 1 values.
However, if the test indicates a 1 for both elements, then it is still possible for the set not to
include 17. False positives are possible. For example, the following array might represent the
value 22, which also has a 1 for both element 3 and element 5:
e1 e2 e3 e4 e5 e6 e7 e8
1 0 1 0 1 0 0 0
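A bit array with hash functions of this kind can be sketched in Python. This is an illustrative model only: the three hash functions below are arbitrary choices, not the ones in the example above, and Oracle's internal implementation differs.

```python
# Minimal Bloom filter sketch mirroring the eight-element array above.
# Illustrative only: the hash functions are arbitrary, not Oracle's.

ARRAY_SIZE = 8

def _hashes(value):
    # Three independent hash functions, each selecting a slot in 0..7.
    return [(value * 7 + 1) % ARRAY_SIZE,
            (value * 13 + 3) % ARRAY_SIZE,
            (value * 31 + 5) % ARRAY_SIZE]

class BloomFilter:
    def __init__(self):
        self.bits = [0] * ARRAY_SIZE      # e1..e8, all initially 0

    def add(self, value):
        for h in _hashes(value):
            self.bits[h] = 1              # set every selected slot

    def might_contain(self, value):
        # False means "definitely absent" (no false negatives);
        # True means "possibly present" (false positives are possible).
        return all(self.bits[h] == 1 for h in _hashes(value))

bf = BloomFilter()
bf.add(17)
print(bf.might_contain(17))   # True: an added value always tests positive
print(bf.might_contain(2))    # False: a selected slot is 0, so 2 is absent
print(bf.might_contain(1))    # True: a false positive caused by collisions
```

Because slots can collide, a value that was never added (here, 1) may still test positive, while a 0 in any selected slot (here, for 2) proves absence. This is exactly the asymmetry described above: no false negatives, but false positives are possible.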
See Also:
Oracle Database SQL Language Reference to learn more about the Bloom
filter hints
---------------------------------------------------------------------------
| Id | Operation                | Name     |    TQ   |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------
...
|  9 |    JOIN FILTER CREATE    | :BF0000  |  Q1,03  | PCWP |            |
In the Predicate Information section of the plan, filters that contain functions
beginning with the string SYS_OP_BLOOM_FILTER indicate use of a Bloom filter.
SELECT * FROM
TABLE(DBMS_XPLAN.DISPLAY_CURSOR(format => 'BASIC,+PARALLEL,+PREDICATE'));
---------------------------------------------------------------------------
| Id | Operation | Name | TQ |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | |
| 1 | PX COORDINATOR | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10003 | Q1,03 | P->S | QC (RAND) |
|* 3 | HASH JOIN BUFFERED | | Q1,03 | PCWP | |
| 4 | PX RECEIVE | | Q1,03 | PCWP | |
| 5 | PX SEND BROADCAST | :TQ10001 | Q1,01 | S->P | BROADCAST |
| 6 | PX SELECTOR | | Q1,01 | SCWC | |
| 7 | TABLE ACCESS FULL | PRODUCTS | Q1,01 | SCWP | |
|* 8 | HASH JOIN | | Q1,03 | PCWP | |
| 9 | JOIN FILTER CREATE | :BF0000 | Q1,03 | PCWP | |
| 10 | BUFFER SORT | | Q1,03 | PCWC | |
| 11 | PX RECEIVE | | Q1,03 | PCWP | |
| 12 | PX SEND HYBRID HASH| :TQ10000 | | S->P | HYBRID HASH|
|*13 | TABLE ACCESS FULL | TIMES | | | |
| 14 | PX RECEIVE | | Q1,03 | PCWP | |
| 15 | PX SEND HYBRID HASH | :TQ10002 | Q1,02 | P->P | HYBRID HASH|
| 16 | JOIN FILTER USE | :BF0000 | Q1,02 | PCWP | |
| 17 | PX BLOCK ITERATOR | | Q1,02 | PCWC | |
|*18 | TABLE ACCESS FULL | SALES | Q1,02 | PCWP | |
---------------------------------------------------------------------------
3 - access("S"."PROD_ID"="P"."PROD_ID")
8 - access("S"."TIME_ID"="T"."TIME_ID")
13 - filter("T"."FISCAL_WEEK_NUMBER"=18)
18 - access(:Z>=:Z AND :Z<=:Z)
filter(SYS_OP_BLOOM_FILTER(:BF0000,"S"."TIME_ID"))
A single server process scans the times table (Step 13), and then uses a hybrid hash
distribution method to send the rows to the parallel execution servers (Step 12). The
processes in set Q1,03 create a Bloom filter (Step 9). The processes in set Q1,02 scan sales
in parallel (Step 18), and then use the Bloom filter to discard rows from sales (Step 16)
before sending them on to set Q1,03 using hybrid hash distribution (Step 15). The processes
in set Q1,03 hash join the times rows to the filtered sales rows (Step 8). The processes in set
Q1,01 scan products (Step 7), and then send the rows to Q1,03 (Step 5). Finally, the
processes in Q1,03 join the products rows to the rows generated by the previous hash
join (Step 3).
The following figure illustrates the basic process.
[Figure: the Q1,01 and Q1,02 server sets feed the Q1,03 set, which creates Bloom filter :BF0000]
See Also:
Oracle Database VLDB and Partitioning Guide explains partition-wise joins in
detail
[Figure: a single server process joins each partition of table t1 to the corresponding partition of table t2 in turn]
[Figure: the PE coordinator assigns pairs of corresponding t1 and t2 partitions to four PE servers, which join them in parallel]
A full partition-wise join can also join partitions to subpartitions, which is useful when
the tables use different partitioning methods. For example, customers is partitioned by
hash, but sales is partitioned by range. If you subpartition sales by hash, then the
database can perform a full partition-wise join between the hash partitions of the
customers and the hash subpartitions of sales.
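A full partition-wise join can be sketched in Python. This is an illustrative model with hypothetical names; Oracle performs the partitioning and per-partition joins natively on partitioned tables:

```python
# Illustrative sketch of a full partition-wise join (not Oracle internals).
# Both tables are hash-partitioned on the join key, so partition i of t1
# can only match partition i of t2, and each partition pair is joined
# independently (and potentially in parallel).

N_PARTITIONS = 4

def hash_partition(rows, key):
    parts = [[] for _ in range(N_PARTITIONS)]
    for row in rows:
        parts[hash(row[key]) % N_PARTITIONS].append(row)
    return parts

def partition_wise_join(t1, t2, key):
    p1, p2 = hash_partition(t1, key), hash_partition(t2, key)
    result = []
    for part1, part2 in zip(p1, p2):      # join only matching partitions
        for r1 in part1:
            for r2 in part2:
                if r1[key] == r2[key]:
                    result.append((r1, r2))
    return result

customers = [{"cust_id": i} for i in range(6)]
sales = [{"cust_id": i % 3, "amount": 100 * i} for i in range(6)]
joined = partition_wise_join(customers, sales, "cust_id")
print(len(joined))   # 6: the same pairs a plain join would produce
```

The result is identical to a plain join, but no partition pair other than (i, i) ever needs to be examined, which is the source of the optimization.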
In the execution plan, the presence of a partition operation before the join signals the
presence of a full partition-wise join, as in the following snippet:
| 8 | PX PARTITION HASH ALL|
|* 9 | HASH JOIN |
See Also:
Oracle Database VLDB and Partitioning Guide explains full partition-wise
joins in detail, and includes several examples
[Figure: in a partial partition-wise join, one set of PE servers dynamically creates partitions from t2 while another set joins the t1 partitions to the dynamically created partitions, coordinated by the PE coordinator]
Because t2 is not partitioned, a set of parallel execution servers must generate partitions
from t2 as needed. A different set of parallel execution servers then joins the t1 partitions to
the dynamically generated partitions. The parallel execution coordinator assembles the result.
In the execution plan, the operation PX SEND PARTITION (KEY) signals a partial partition-wise
join, as in the following snippet:
See Also:
Oracle Database VLDB and Partitioning Guide explains partial partition-wise joins in
detail, and includes several examples
See Also:
Oracle Database In-Memory Guide to learn how to optimize In-Memory
queries with join groups
Part V
Optimizer Statistics
The accuracy of an execution plan depends on the quality of the optimizer statistics.
This part contains the following chapters:
10
Optimizer Statistics Concepts
Oracle Database optimizer statistics describe details about the database and its objects.
This chapter includes the following topics:
10-1
Chapter 10
About Optimizer Statistics Types
Note:
The optimizer statistics are different from the performance statistics visible
through V$ views.
[Figure: the optimizer reads optimizer statistics for the PERSON table and PERSON_ID_IX index from the data dictionary and uses them to generate an execution plan (GB and HJ operations)]
• Number of rows
The database uses the row count stored in DBA_TAB_STATISTICS when determining
cardinality.
• Average row length
• Number of data blocks
The optimizer uses the number of data blocks with the DB_FILE_MULTIBLOCK_READ_COUNT
initialization parameter to determine the base table access cost.
• Number of empty data blocks
DBMS_STATS.GATHER_TABLE_STATS commits before gathering statistics on permanent tables.
Note:
You cannot collect statistics for private temporary tables.
The following table shows how global temporary tables differ in how they gather and store
optimizer statistics, depending on whether the tables are scoped to a transaction or session.
The following procedures do not commit for transaction-specific temporary tables, so that
rows in these tables are not deleted:
• GATHER_TABLE_STATS
Note:
In releases before Oracle Database 12c, the database did not maintain
optimizer statistics for global temporary tables and non-global temporary
tables differently. The database maintained one version of the statistics
shared by all sessions, even though data in different sessions could differ.
• CREATE ... AS SELECT automatically gathers optimizer statistics. When
GLOBAL_TEMP_TABLE_STATS is set to SHARED, however, you must gather statistics manually
using DBMS_STATS.
• Pending statistics are not supported.
• Other sessions do not share a cursor that uses the session-specific statistics.
Different sessions can share a cursor that uses shared statistics, as in releases earlier
than Oracle Database 12c. The same session can share a cursor that uses session-
specific statistics.
• By default, GATHER_TABLE_STATS for the temporary table immediately invalidates previous
cursors compiled in the same session. However, this procedure does not invalidate
cursors compiled in other sessions.
See Also:
• Oracle Database PL/SQL Packages and Types Reference to learn about the
GLOBAL_TEMP_TABLE_STATS preference
• Oracle Database Reference for a description of the DBA_TAB_STATISTICS view
See Also:
• "Histograms "
• "About Statistics on Column Groups"
• Oracle Database Reference for a description of the DBA_TAB_COL_STATISTICS
view
See Also:
Oracle Database Reference for a description of the DBA_IND_STATISTICS
view
4. Create a new copy of the customers table, with rows ordered by cust_last_name.
For example, execute the following statements:
The table customers3 has the same data as the original customers table, but the index
on customers3 has a much lower clustering factor because the data in the table is
ordered by cust_last_name. The clustering factor is now about 10 times the number
of blocks instead of 70 times.
9. Query the customers table.
For example, execute the following query (sample output included):
CUST_FIRST_NAME      CUST_LAST_NAME
-------------------- ----------------------------------------
Vida                 Puleo
Harriett             Quinlan
Madeleine            Quinn
Caresse              Puleo
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows |Bytes|Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0| SELECT STATEMENT | | | | 405 (100)| |
|* 1| TABLE ACCESS STORAGE FULL| CUSTOMERS | 2335|35025| 405 (1)|00:00:01|
-------------------------------------------------------------------------------
The preceding plan shows that the optimizer did not use the index on the original
customers table.
11. Query the customers3 table.
For example, execute the following query (partial sample output included):
CUST_FIRST_NAME      CUST_LAST_NAME
-------------------- ----------------------------------------
Vida                 Puleo
Harriett             Quinlan
Madeleine            Quinn
Caresse              Puleo
-----------------------------------------------------------------------------------
|Id| Operation                    | Name                     |Rows|Bytes|Cost(%CPU)| Time    |
-----------------------------------------------------------------------------------
| 0| SELECT STATEMENT             |                          |    |     | 69 (100) |         |
| 1| TABLE ACCESS BY INDEX ROWID  | CUSTOMERS3               |2335|35025| 69   (0) |00:00:01 |
|*2|  INDEX RANGE SCAN            | CUSTOMERS3_LAST_NAME_IDX |2335|     |  7   (0) |00:00:01 |
-----------------------------------------------------------------------------------
The result set is the same, but the optimizer chooses the index. The plan cost is
much less than the cost of the plan used on the original customers table.
13. Query customers with a hint that forces the optimizer to use the index.
For example, execute the following query (partial sample output included):
CUST_FIRST_NAME      CUST_LAST_NAME
-------------------- ----------------------------------------
Vida                 Puleo
Caresse              Puleo
Harriett             Quinlan
Madeleine            Quinn
-----------------------------------------------------------------------------------
| Id| Operation                   | Name                    |Rows|Bytes|Cost(%CPU)| Time    |
-----------------------------------------------------------------------------------
|  0| SELECT STATEMENT            |                         |    |     |422 (100) |         |
|  1| TABLE ACCESS BY INDEX ROWID | CUSTOMERS               |2335|35025|422   (0) |00:00:01 |
|* 2|  INDEX RANGE SCAN           | CUSTOMERS_LAST_NAME_IDX |2335|     |  7   (0) |00:00:01 |
-----------------------------------------------------------------------------------
The preceding plan shows that the cost of using the index on customers is higher than
the cost of a full table scan. Thus, using an index does not necessarily improve
performance. The index clustering factor is a measure of whether an index scan is more
effective than a full table scan.
In this example, the index clustering factor for col1_idx is low. The rows that have the same
indexed column values for col1 are in the same data blocks in the table. Thus, the cost of
using an index range scan to return all rows with value A is low because only one block in the
table must be read.
Example 10-5 Scattered Data
Assume that the same rows are scattered across the data blocks as follows:
In this example, the index clustering factor for col1_idx is higher. The database must read all
three blocks in the table to retrieve all rows with the value A in col1.
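The way the clustering factor reflects block transitions can be sketched in Python. This is an illustrative model only; Oracle computes the statistic internally when gathering index statistics:

```python
# Illustrative sketch of how a clustering factor is computed (not Oracle
# code): walk the index in key order and count how often the table block
# changes between consecutive entries. A value near the block count means
# rows with the same key are co-located; a value near the row count means
# the rows are scattered.

def clustering_factor(index_entries):
    """index_entries: (key, block_id) pairs, sorted by key."""
    factor = 0
    prev_block = None
    for _key, block in index_entries:
        if block != prev_block:       # a block switch costs another read
            factor += 1
            prev_block = block
    return factor

# Clustered: all 'A' rows in block 1, all 'B' rows in block 2.
clustered = [("A", 1), ("A", 1), ("A", 1), ("B", 2), ("B", 2), ("B", 2)]
# Scattered: consecutive index entries land in different blocks.
scattered = [("A", 1), ("A", 2), ("A", 3), ("B", 1), ("B", 2), ("B", 3)]

print(clustering_factor(clustered))   # 2: roughly one read per block
print(clustering_factor(scattered))   # 6: roughly one read per row
```

The low value for the clustered layout corresponds to the cheap index range scan in the customers3 example, and the high value for the scattered layout corresponds to the expensive scan on the original customers table.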
See Also:
Oracle Database Reference for a description of the DBA_INDEXES view
10-13
Chapter 10
How the Database Gathers Optimizer Statistics
statements when updating system statistics. The database parses all new SQL
statements using new statistics.
This package enables you to control what and how statistics are collected, including
the degree of parallelism for statistics collection, sampling methods, granularity of
statistics collection in partitioned tables, and so on.
Note:
Do not use the COMPUTE and ESTIMATE clauses of the ANALYZE statement to collect
optimizer statistics. These clauses have been deprecated. Instead, use DBMS_STATS.
Statistics gathered with the DBMS_STATS package are required for the creation of accurate
execution plans. For example, table statistics gathered by DBMS_STATS include the number of
rows, number of blocks, and average row length.
By default, Oracle Database uses automatic optimizer statistics collection. In this case, the
database automatically runs DBMS_STATS to collect optimizer statistics for all schema objects
for which statistics are missing or stale. The process eliminates many manual tasks
associated with managing the optimizer, and significantly reduces the risks of generating
suboptimal execution plans because of missing or stale statistics. You can also update and
manage optimizer statistics by manually executing DBMS_STATS.
Note:
Dynamic statistics augment statistics rather than providing an alternative to them.
Dynamic statistics supplement optimizer statistics such as table and index block counts, table
and join cardinalities (estimated number of rows), join column statistics, and GROUP BY
statistics. This information helps the optimizer improve plans by making better estimates for
predicate cardinality.
Dynamic statistics are beneficial in the following situations:
• An execution plan is suboptimal because of complex predicates.
• The sampling time is a small fraction of total execution time for the query.
• The query executes many times so that the sampling time is amortized.
Note:
By default, a parallel insert uses a direct path insert. You can force a direct
path insert by using the /*+APPEND*/ hint.
See Also:
Oracle Database Data Warehousing Guide to learn more about bulk loads
the empty partition during the insert. However, the database does not gather global statistics.
Assume that you run INSERT INTO sales PARTITION (sales_q4_2000) SELECT. If partition
sales_q4_2000 is empty before the insert (other partitions need not be empty), then the
database gathers statistics during the insert. Moreover, if the INCREMENTAL preference is
enabled for sales, then the database also gathers a synopsis for sales_q4_2000. Statistics
are immediately available after the INSERT statement. However, if you roll back the
transaction, then the database automatically deletes statistics gathered during the bulk load.
The preceding example gathers only missing or stale statistics. The database does not
regather the table and basic column statistics that were collected during the bulk load.
Note:
You can set the table preference options to GATHER AUTO on the tables that you
plan to bulk load. In this way, you need not explicitly set the options parameter
when running GATHER_TABLE_STATS.
Specifically, bulk loads do not gather statistics automatically when any of the following
conditions applies to the target table, partition, or subpartition:
• It is not empty, and you perform an INSERT INTO ... SELECT.
In this case, an OPTIMIZER STATISTICS GATHERING row source appears in the
plan, but this row source is only a pass-through. The database does not actually
gather optimizer statistics.
Note:
The DBA_TAB_COL_STATISTICS.NOTES column is set to STATS_ON_LOAD by
a bulk load into an empty table. However, subsequent bulk loads into the
non-empty table do not reset the NOTES column. One technique for
determining whether the database gathered statistics is to query the
USER_TAB_MODIFICATIONS.INSERTS column. If the query returns a row
indicating the number of rows loaded, then the most recent bulk load did
not gather statistics automatically.
• It is loaded using an INSERT INTO ... SELECT, and neither of the following
conditions is true: all columns of the target table are specified, or a subset of the
target columns are specified and the unspecified columns have default values.
Put differently, the database only gathers statistics automatically for bulk loads
when either all columns of the target table are specified, or a subset of the target
columns are specified and the unspecified columns have default values. For
example, the sales table has only columns c1, c2, c3, and c4. The column c4 does
not have a default value. You load sales_copy by executing INSERT /*+ APPEND
*/ INTO sales_copy SELECT c1, c2, c3 FROM sales. In this case, the database
does not gather online statistics for sales_copy. The database would gather
statistics if c4 had a default value or if it were included in the SELECT list.
• It is in an Oracle-owned schema such as SYS.
• It is one of the following types of tables: nested table, index-organized table (IOT),
external table, or global temporary table defined as ON COMMIT DELETE ROWS.
• It has a PUBLISH preference set to FALSE.
• Its statistics are locked.
• It is partitioned, INCREMENTAL is set to true, and partition-extended syntax is not
used.
For example, assume that you execute DBMS_STATS.SET_TABLE_PREFS(null,
'sales', 'incremental', 'true'). In this case, the database does not gather
statistics for INSERT INTO sales SELECT, even when sales is empty. However, the
database does gather statistics automatically for INSERT INTO sales PARTITION
(sales_q4_2000) SELECT.
• It is loaded using a multitable INSERT statement.
10-18
Chapter 10
When the Database Gathers Optimizer Statistics
For example, the following statement disables online statistics gathering for bulk loads:
See Also:
Oracle Database SQL Language Reference to learn about the
GATHER_OPTIMIZER_STATISTICS and NO_GATHER_OPTIMIZER_STATISTICS hints
reparsing at the next statement execution. If the cursor is marked for reparsing,
then the optimizer uses actual row counts from the previous execution instead of
estimates.
• SQL profiles
A SQL profile is a collection of auxiliary statistics on a query. The profile stores
these supplemental statistics in the data dictionary. The optimizer uses SQL
profiles during optimization to select the optimal plan.
The database stores optimizer statistics in the data dictionary and updates or replaces
them as needed. You can query statistics in data dictionary views.
See Also:
Oracle Database Reference to learn more about DBA_SQL_PLAN_DIRECTIVES
Note:
Currently, the optimizer monitors only column groups. The optimizer does not
create an extension on expressions.
When the problem that occasioned a directive is solved, either because a better directive
exists or because a histogram or extension exists, the DBA_SQL_PLAN_DIRECTIVES.STATE
value changes from USABLE to SUPERSEDED. More information about the directive state is
exposed in the DBA_SQL_PLAN_DIRECTIVES.NOTES column.
You can manage directives by using the DBMS_SPD package. For example, you can:
See Also:
Oracle Database PL/SQL Packages and Types Reference to learn about the
DBMS_SPD package
Assumptions
You plan to run queries against the sh schema, and you have privileges on this
schema and on data dictionary and V$ views.
SELECT /*+gather_plan_statistics*/ *
FROM customers