2. If an access plan is not optimal it is because the cardinality estimate for ont or more of the row sources is grossly incorrect
만일 실행 계획이 불량하다면 Row Source(실행 계획의 각 단계)들 중 하나 이상에서 예측 Row 수가 매우 부정확하기 때문이다.Technical Reviewer - Wolfgang Breitling
1. Cardinality of a set is the number of elements of the set
Cardinality는 특정 집합에 속한 원소(element)의 수를 의미한다.
2. Oracle에서는 Cardinality 의미를 다음의 4가지로 사용
3. 이 책에서는 Calculated Cardinality + Estimated Cardinality
SQL> explain plan for
2 select count(*)
3 from t1
4 where c1 between 1 and 10
5 ;
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 14 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 4 | | |
|* 2 | TABLE ACCESS FULL| T1 | 10 | 40 | 14 (0)| 00:00:01 |
---------------------------------------------------------------------------
explain plan에서의 Rows는 Estimated Cardinality를 의미한다.
현재 10건 정도의 row가 추출될 것으로 추측
SQL> select /*+ gather_plan_statistics */
2 count(*)
3 from t1
4 where c1 between 1 and 10
5 ;
COUNT(*)
----------
10
--------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |
--------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |
|* 2 | TABLE ACCESS FULL| T1 | 1 | 10 | 14 (0)| 10 |
--------------------------------------------------------------------------------
A-row와 E-row가 정확히 일치함을 볼수 있다. 그러나 항상 일치하는 것은 아니다.
대부분의 경우에 E-row는 부정확하다.
SQL> explain plan for
2 select count(*)
3 from t1
4 where c2 = 1
5 ;
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 14 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T1 | 1000 | 3000 | 14 (0)| 00:00:01 |
---------------------------------------------------------------------------
-------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |
--------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |
|* 2 | TABLE ACCESS FULL| T1 | 1 | 1000 | 14 (0)| 960 |
--------------------------------------------------------------------------------
위의 경우에는 A-row와 E-row가 불일치 함을 볼수 있다.
Density 와 Selectivity |
데이터 생성 결과
TABLE_NAME : T1
COLUMN_NAME : C1
NUM_DISTINCT : 10000
NUM_NULLS : 0
DENSITY : .0001
LOW_VALUE : C102
HIGH_VALUE : C302
HISTOGRAM : NONE
-----------------
TABLE_NAME : T1
COLUMN_NAME : C2
NUM_DISTINCT : 10
NUM_NULLS : 0
DENSITY : .1
LOW_VALUE : 80
HIGH_VALUE : C10A
HISTOGRAM : NONE
-----------------
TABLE_NAME : T1
COLUMN_NAME : C3
NUM_DISTINCT : 100
NUM_NULLS : 0
DENSITY : .01
LOW_VALUE : 786B0B17010101
HIGH_VALUE : 786C0301010101
HISTOGRAM : NONE
-----------------
TABLE_NAME : T1
COLUMN_NAME : C4
NUM_DISTINCT : 100
NUM_NULLS : 0
DENSITY : .01
LOW_VALUE : 3230303731313233
HIGH_VALUE : 3230303830333031
HISTOGRAM : NONE
Selectivity는 Predicate에 따라 변하는 속성이다.
1. c1 컬럼에 literal
SQL> explain plan for
2 select *
3 from t1
4 where c1 = 1
5 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 23 | 14 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 1 | 23 | 14 (0)| 00:00:01 |
--------------------------------------------------------------------------
Cardinality = 10000 * Selectivity = 10000 * 0.0001 = 1
2. c2 컬럼에 literal
SQL> explain plan for
2 select *
3 from t1
4 where c2 = 1
5 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 23000 | 14 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 1000 | 23000 | 14 (0)| 00:00:01 |
--------------------------------------------------------------------------
Cardinality = 10000 * Selectivity = 10000 * 0.1 = 1000
3. c1 컬럼에 Bind 변수
SQL> explain plan for
2 select *
3 from t1
4 where c1 = :b1
5 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 23 | 15 (7)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 1 | 23 | 15 (7)| 00:00:01 |
--------------------------------------------------------------------------
Cardinality = 10000 * Selectivity = 10000 * 0.0001 = 1
SQL> explain plan for
2 select *
3 from t1
4 where c2 = :b1
5 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 23000 | 15 (7)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 1000 | 23000 | 15 (7)| 00:00:01 |
--------------------------------------------------------------------------
Cardinality = 10000 * Selectivity = 10000 * 0.1 = 1000
HISTOGRAM 이 없으므로 literal과 차이가 없다.
3. Between 연산
SQL> explain plan for
2 select *
3 from t1
4 where c1 between 1 and 100
5 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100 | 2300 | 14 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 100 | 2300 | 14 (0)| 00:00:01 |
--------------------------------------------------------------------------
Cardinality = Base Cardinality * (Selectivity(C1 BETWEEN X AND Y))
= Base Cardinality * (Selectivity(C1 = X) + Selectivity(C1 = Y) + Selectivity(X < C1 < Y))
= Base Cardinality * (Selectivity(C1 = X) + Selectivity(C1 = Y) + (Y - X)/(MAX - MIN))
= Base Cardinality * (Selectivity(C1 = 1) + Selectivity(C1 = 100) + (100 - 1)/(10000 - 1))
= 10000 * (0.0001 + 0.0001 + (99)/(9999))
= 10000 * (0.0001 + 0.0001 + 0.00990099)
= 10000 * 0.01000099
= 100
3. 2개 이상의 Predicate
SQL> explain plan for
2 select *
3 from t1
4 where c1 between 1 and 100 and
5 c2 = 1
6 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 230 | 14 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 10 | 230 | 14 (0)| 00:00:01 |
--------------------------------------------------------------------------
Cardinality = Base Cardinality * (Selectivity(C1 BETWEEN X AND Y)) * (SELECTIVITY(C2 = 1))
= 10000 * 0.01 * 0.1
= 0.001
SQL> explain plan for
2 select *
3 from t1
4 where c1 between 1 and 100 or
5 c2 = 1
6 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1090 | 25070 | 15 (7)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 1090 | 25070 | 15 (7)| 00:00:01 |
--------------------------------------------------------------------------
Cardinality = Base Cardinality * (Selectivity(P1 OR P2))
= Base Cardinality * (Selectivity(P1) + Selectivity(P2) - Selectivity(P1 AND P2))
= Base Cardinality * (Selectivity(P1) + Selectivity(P2) - (Selectivity(P1) * Selectivity(P2)))
= 10000 * (0.01 + 0.1 - (0.01 * 0.1))
= 10000 * (0.109) = 1090
3. date Type
SQL> explain plan for
2 select *
3 from t1
4 where c3 = to_date('20080201','yyyymmdd')
5 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100 | 2300 | 14 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 100 | 2300 | 14 (0)| 00:00:01 |
--------------------------------------------------------------------------
Cardinality = Base Cardinality * Selectivity
= 10000 * 0.01
= 100
SQL> explain plan for
2 select *
3 from t1
4 where c4 = '20080201'
5 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100 | 2300 | 14 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 100 | 2300 | 14 (0)| 00:00:01 |
--------------------------------------------------------------------------
Cardinality = Base Cardinality * Selectivity
= 10000 * 0.01
= 100
SQL> explain plan for
2 select *
3 from t1
4 where c3 between to_date('20071231','yyyymmdd')
5 and to_date('20080110','yyyymmdd')
6 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1210 | 27830 | 14 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 1210 | 27830 | 14 (0)| 00:00:01 |
--------------------------------------------------------------------------
Cardinality = Base Cardinality * (Selectivity(C3 BETWEEN P1 AND P2))
= Base Cardinality * (Selectivity(P1) + Selectivity(P2) + Selectivity(P1 < C1 < P2))
= Base Cardinality * (Selectivity(P1) + Selectivity(P2) + (P2 - P1)/(MAX - MIN))
= 10000 * (0.01 + 0.01 + (2008/01/10 - 2007/12/31) / (2008/03/01 - 2007/11/23))
= 10000 * (0.01 + 0.01 + (10)/(99))
= 10000 * 0.121010101
= 1210
SQL> explain plan for
2 select *
3 from t1
4 where c4 between '20071231'
5 and '20080110'
6 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9874 | 221K| 15 (7)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 9874 | 221K| 15 (7)| 00:00:01 |
--------------------------------------------------------------------------
Cardinality = Base Cardinality * (Selectivity(C4 BETWEEN P1 AND P2))
= Base Cardinality * (Selectivity(P1) + Selectivity(P2) + Selectivity(P1 < C4 < P2))
= Base Cardinality * (Selectivity(P1) + Selectivity(P2) + (P2 - P1)/(MAX - MIN))
= 10000 * (0.01 + 0.01 + (20080110 - 20071231) / (20080301 - 20071123))
= 10000 * (0.01 + 0.01 + (8879) / (9178))
= 10000 * 0.987422096
= 9874
Magic Numbers |
1. Bind 변수와 Range Predicate가 함께 사용된 경우
SQL> explain plan for
2 select *
3 from t1
4 where c1 > :b1
5 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 500 | 11500 | 15 (7)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 500 | 11500 | 15 (7)| 00:00:01 |
--------------------------------------------------------------------------
Rows = 500 즉, Cardinality는 500 이다.
이것은 5%의 magic number를 사용한 것을 의미한다
Cardinality = 10000 * 5% = 10000 * 0.05 = 500
SQL> explain plan for
2 select *
3 from t1
4 where c1 >= :b1
5 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 500 | 11500 | 15 (7)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 500 | 11500 | 15 (7)| 00:00:01 |
--------------------------------------------------------------------------
SQL> explain plan for
2 select *
3 from t1
4 where c2 > :b1
5 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 500 | 11500 | 15 (7)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 500 | 11500 | 15 (7)| 00:00:01 |
--------------------------------------------------------------------------
2. Bind 변수와 Between Predicate가 함께 사용된 경우
다음 c1 between :b1 and :b2 의 조건이 |c1 >= :b1| and |c2 <= :b2| 로 치환된다
Selectivity = 5%* 5% = 0.25%가 된다.
Cardinality = 10000 * 25% = 10000 * 0.0025 = 25
SQL> explain plan for
2 select *
3 from t1
4 where c1 between :b1 and :b2
5 ;
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 25 | 575 | 15 (7)| 00:00:01 |
|* 1 | FILTER | | | | | |
|* 2 | TABLE ACCESS FULL| T1 | 25 | 575 | 15 (7)| 00:00:01 |
---------------------------------------------------------------------------
3. Function Predicate가 사용된 경우의 magic Number
SQL> create or replace function f1(v1 int)
2 return int
3 is
4 begin
5 return v1;
6 end f1;
7 /
함수가 생성되었습니다.
SQL> explain plan for
2 select *
3 from t1
4 where f1(c1) = 1
5 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100 | 2300 | 19 (27)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 100 | 2300 | 19 (27)| 00:00:01 |
--------------------------------------------------------------------------
Cardinality = 10000 * 1% = 10000 * 0.01 = 100
다른 경우들도 같음을 확인할 수 있다.
SQL> explain plan for
2 select *
3 from t1
4 where f1(c2) = 1
5 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100 | 2300 | 19 (27)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 100 | 2300 | 19 (27)| 00:00:01 |
--------------------------------------------------------------------------
4. Function Predicate가 Range 연산과 함께 사용된 경우는 다시 5%의 Rule 이 적용됩니다.
SQL> explain plan for
2 select *
3 from t1
4 where f1(c1) > 1
5 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 500 | 11500 | 19 (27)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 500 | 11500 | 19 (27)| 00:00:01 |
--------------------------------------------------------------------------
Cardinality = 10000 * 5% = 10000 * 0.05 = 500
SQL> explain plan for
2 select *
3 from t1
4 where f1(c1) between 1 and 100
5 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 25 | 575 | 20 (30)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 25 | 575 | 20 (30)| 00:00:01 |
--------------------------------------------------------------------------
Cardinality = 10000 * 25% = 10000 * 0.0025 = 25
5. Function Predicate가 Range 연산, Bind 변수가 함께 사용된 경우도 5%의 Rule 이 적용됩니다.
SQL> explain plan for
2 select *
3 from t1
4 where f1(c1) between :b1 and :b2
5 ;
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 25 | 575 | 20 (30)| 00:00:01 |
|* 1 | FILTER | | | | | |
|* 2 | TABLE ACCESS FULL| T1 | 25 | 575 | 20 (30)| 00:00:01 |
---------------------------------------------------------------------------
Cardinality = 10000 * 25% = 10000 * 0.0025 = 25
function연산에서의 cardinality는 부정확할 수밖에 없다. 실제 데이터 분포와 상관없이
1% or 5% magic number를 사용하기 때문이다.
이것의 해결법은 Function Based Index 를 사용하는 것이다.
6. Function Predicate에 사용되는 Magic Number는 Function에 의한 변형이 발생한 경우에만 해당한다
단순히 값을 제공하는 경우에는 Magic Number가 아닌 density가 사용 된다.
SQL> explain plan for
2 select *
3 from t1
4 where c1 = f1(c1)
5 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 23 | 19 (27)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 1 | 23 | 19 (27)| 00:00:01 |
--------------------------------------------------------------------------
c1의 density(c1) = 1/10000 = 0.0001 이다
Cardinality = 10000 * 1% = 10000 * 0.0001 = 1
7. Like Predicate가 사용되는 예
7-1. 단순 상수와 사용된 예
SQL> explain plan for
2 select *
3 from t1
4 where c4 like '20080101'
5 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100 | 2300 | 14 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 100 | 2300 | 14 (0)| 00:00:01 |
--------------------------------------------------------------------------
Cardinality = 10000 * 1% = 10000 * 0.01 = 100
7-2. Bind변수와 함께 사용된 예
SQL> explain plan for
2 select *
3 from t1
4 where c4 like :b1
5 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 500 | 11500 | 15 (7)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 500 | 11500 | 15 (7)| 00:00:01 |
--------------------------------------------------------------------------
Cardinality = 10000 * 5% = 10000 * 0.05 = 500
7-3. Parameter 제어
SQL> explain plan for
2 select /*+ opt_param('_like_with_bind_as_equality', 'true') */
3 *
4 from t1
5 where c4 like :b1
6 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 101 | 2323 | 15 (7)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 101 | 2323 | 15 (7)| 00:00:01 |
--------------------------------------------------------------------------
7-4. '%' 앞에 등장하는 경우 5% Rule
SQL> explain plan for
2 select *
3 from t1
4 where c4 like '%2008'
5 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 500 | 11500 | 15 (7)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 500 | 11500 | 15 (7)| 00:00:01 |
--------------------------------------------------------------------------
Cardinality = 10000 * 5% = 10000 * 0.05 = 500
7-5. '%' 앞뒤로 등장하는 경우 5% Rule
SQL> explain plan for
2 select *
3 from t1
4 where c4 like '%2008%'
5 ;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 500 | 11500 | 15 (7)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 500 | 11500 | 15 (7)| 00:00:01 |
--------------------------------------------------------------------------
Cardinality = 10000 * 5% = 10000 * 0.05 = 500