05 카디널러티
(1) 선택도
- 선택도 (Selectivity) : 전체 대상 레코드 중 특정 조건에 의해 선택될 것으로 예상되는 레코드 비율
- 선택도 -> 카디널러티 -> 비용 -> 엑세스 방식, 조인순서, 조인 방법 등 결정
- 히스토그램이 없거나, 조건절에 바인드 변수 사용 시 옵티마이저는 데이터 분포가 균일하다고 가정한 상태에서 선택도 구함
- 히스토그램이 없을 때 등치(=) 조건에 대한 선택도 = 1/Distinct Value 개수 = 1/num_distinct
- 히스토그램이 없을때 범위검색(ex. between) 조건에 대한 선택도 = 조건절에서 요청한 값 범위/전체 값 범위
* num_rows : 1,000,000
* num_distinct : 100
* low_value : 1
* high_value : 1,000
- no > 500 : (1,000-500) / (1,000-1) = 0.5
- no >= 500 : (1,000-500) / (1,000-1) + 1/100 = 0.51
- no bwtwwen 500 and 1000 : (1,000-500) / (1,000-1) + 1/100 + 1/100 = 0.52
* no > 500
선택도 = 1,000 - 500 ( high_value - 비교값 )/ 1,000 - 1 ( high_value - low_value ) = 0.5
* no < 500
선택도 = 500 - 1 ( 비교값 - 1 )/1,000 - 1 ( high_value - low_value ) = 0.499
- 컬럼 히스토그램이 없을 떄 옵티마이저는 '조건절에서 요청한 값 범위'에 속한 값들이 '전체 값 범위'에 고르게 분포돼 있음을 가정하고 선택도 구함
(2) 카디널러티
- 카디널러티 :특정 액세스 단계를 거치고 나서 출력될 것으로 예상되는 결과 건수
- 카디널리티 = 총 로우수 * 선택도 = num_rows/num_distinct
- Distinct Value = 10 이면 선택도 = 0.1이고, 총 로우 수가 1,000이라면 카디널리티 100이 됨.
- 선택도 및 카디널리티 계산식 테스트
SQL> CREATE TABLE T_EMP AS
2 SELECT B.NO, A.*
3 FROM (SELECT * FROM SCOTT.EMP WHERE ROWNUM <= 10 ) A
4 , (SELECT ROWNUM NO FROM DUAL CONNECT BY LEVEL <= 100) B;
테이블이 생성되었습니다.
SQL> begin
2 dbms_stats.gather_table_stats( user, 'T_EMP', method_opt=> 'for all columns size 1' );
3 end;
4 /
PL/SQL 처리가 정상적으로 완료되었습니다.
SQL> select job, count(*) from t_emp group by job order by job;
JOB COUNT(*)
--------- ----------
ANALYST 100
CLERK 100
MANAGER 300
PRESIDENT 100
SALESMAN 400
SQL> SELECT COUNT(*) FROM T_EMP WHERE JOB = 'SALESMAN';
COUNT(*)
----------
400
SQL> select * from table( dbms_xplan.display_cursor( null, null, 'ALLSTATS LAST'));
--------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 8 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 8 |
|* 2 | TABLE ACCESS FULL| T_EMP | 1 | 200 | 400 |00:00:00.01 | 8 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("JOB"='SALESMAN')
SQL> SELECT T.NUM_ROWS, C.NUM_NULLS, C.NUM_DISTINCT
2 , 1/C.NUM_DISTINCT SELECTIVITY
3 , NUM_ROWS/C.NUM_DISTINCT AS CRADINALITY
4 FROM USER_TABLES T, USER_TAB_COLUMNS C
5 WHERE T.TABLE_NAME = 'T_EMP'
6 AND C.TABLE_NAME = T.TABLE_NAME
7 AND C.COLUMN_NAME = 'JOB'
8 ;
NUM_ROWS NUM_NULLS NUM_DISTINCT SELECTIVITY CRADINALITY
---------- --------- ------------ ----------- -----------
1000 0 5 .2 200
* 히스토그램이 있는 경우
SQL> begin
2 dbms_stats.gather_table_stats( user, 'T_EMP', method_opt=> 'for all columns size 5' ); --도수 분포 히스토그램
3 end;
4 /
SQL> SELECT COUNT(*) FROM T_EMP WHERE JOB = 'CLERK';
COUNT(*)
----------
100
SQL> @XPLAN
--------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 8 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 8 |
|* 2 | TABLE ACCESS FULL| T_EMP | 1 | 100 | 100 |00:00:00.01 | 8 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("JOB"='CLERK')
SQL> SELECT COUNT(*) FROM T_EMP WHERE JOB = 'SALESMAN';
COUNT(*)
----------
400
SQL> @XPLAN
--------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 8 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 8 |
|* 2 | TABLE ACCESS FULL| T_EMP | 1 | 400 | 400 |00:00:00.01 | 8 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("JOB"='SALESMAN')
-- 히스토그램 바인드
SQL> SELECT COUNT(*) FROM T_EMP WHERE JOB = :JOB;
COUNT(*)
----------
0
SQL> @XPLAN
----------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 |
|* 2 | TABLE ACCESS FULL| T_EMP | 1 | 200 | 0 |00:00:00.01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("JOB"=:JOB)
(3) NULL 값을 포함할 때
SQL> update t_emp set job = null where no <= 50;
500 행이 갱신되었습니다.
SQL> commit;
커밋이 완료되었습니다.
SQL> select count(*) from t_emp where job is null;
COUNT(*)
----------
500
SQL> begin
2 dbms_stats.gather_table_stats( user, 'T_EMP', method_opt=> 'for all columns size 1' ); --히스토그램 제거
3 end;
4 /
PL/SQL 처리가 정상적으로 완료되었습니다.
SQL> COLUMN NUM_NULLS FORMAT 9999999
SQL>
SQL> SELECT NUM_NULLS, NUM_DISTINCT
2 FROM USER_TAB_COLUMNS
3 WHERE TABLE_NAME = 'T_EMP'
4 AND COLUMN_NAME = 'JOB';
NUM_NULLS NUM_DISTINCT
--------- ------------
500 5
SQL>
1 NULL 값을 제외한 로우 수
선택도 = -------------------- * --------------------
Distinct Value 개수 총 로우 수
1 (NUM_ROWS - NUM_NULLS)
= -------------------- * ------------------------
NUM_DISTINCT NUM_ROWS
1 - ( NUM_NULLS / NUM_ROWS )
= ----------------------------
NUM_DISTINCT
1 - ( 500 / 1000 )
= ---------------------------- = 0.1
5
카디널리티 = 1000 * 0.1 = 100
SQL> SELECT COUNT(*) FROM T_EMP WHERE JOB = :JOB;
COUNT(*)
----------
0
SQL> @XPLAN
----------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 |
|* 2 | TABLE ACCESS FULL| T_EMP | 1 | 100 | 0 |00:00:00.01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("JOB"=:JOB)
{CODE}
h3. (4) 조건절이 두개 이상일 때
- 카디널리티 : 각 컬럼의 선택도 * 전체 로우수
{code:sql}
SQL> SELECT T.NUM_ROWS, C.NUM_NULLS, C.NUM_DISTINCT
2 , ( 1-C.NUM_NULLS/T.NUM_ROWS)/C.NUM_DISTINCT SELECTIVITY
3 FROM USER_TABLES T, USER_TAB_COLUMNS C
4 WHERE T.TABLE_NAME = 'T_EMP'
5 AND C.TABLE_NAME = T.TABLE_NAME
6 AND C.COLUMN_NAME IN( 'JOB', 'DEPTNO' )
7 ;
NUM_ROWS NUM_NULLS NUM_DISTINCT SELECTIVITY
---------- --------- ------------ -----------
1000 500 5 .1
1000 0 3 .333333333
SQL> SELECT COUNT(*) FROM T_EMP WHERE JOB = :JOB AND DEPTNO = :DEPTNO;
COUNT(*)
----------
0
SQL> @XPLAN
----------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 |
|* 2 | TABLE ACCESS FULL| T_EMP | 1 | 33 | 0 |00:00:00.01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(("JOB"=:JOB AND "DEPTNO"=:DEPTNO))
(5) 범위 검색 조건일 때
선택도 = 조건절에서 요청한 값 범위/전체 값 범위
SQL> CREATE TABLE T
2 AS
3 SELECT ROWNUM NO1
4 , CASE WHEN ROWNUM <= 1000 OR ROWNUM > 9000 THEN ROWNUM ELSE 5000 END NO2
5 FROM DUAL
6 CONNECT BY LEVEL <= 10000;
테이블이 생성되었습니다.
SQL> begin
2 dbms_stats.gather_table_stats( user, 'T', method_opt=> 'for all columns size 1' ); --히스토그램 제거
3 end;
4 /
PL/SQL 처리가 정상적으로 완료되었습니다.
SQL> SELECT MAX( NO1 ) AS MAX_1, MIN( NO1 ) AS MIN_1
2 , MAX( NO2 ) AS MAX_2, MIN( NO2 ) AS MIN_2
3 FROM T;
MAX_1 MIN_1 MAX_2 MIN_2
---------- ---------- ---------- ----------
10000 1 10000 1
-- NO1 : 1 ~ 10000
-- NO2 : 1~1000, 9001 ~ 10000
SQL> SELECT COUNT(*) FROM T WHERE NO1 > 3000;
COUNT(*)
----------
7000
SQL> @XPLAN
-------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 19 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 19 |
|* 2 | TABLE ACCESS FULL| T | 1 | 7001 | 7000 |00:00:00.01 | 19 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("NO1">3000)
19 개의 행이 선택되었습니다.
SQL> SELECT COUNT(*) FROM T WHERE NO2 > 3000;
COUNT(*)
----------
9000
SQL> @XPLAN
-------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 19 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 19 |
|* 2 | TABLE ACCESS FULL| T | 1 | 7001 | 9000 |00:00:00.01 | 19 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("NO2">3000)
SQL> begin
2 dbms_stats.gather_table_stats( user, 'T', method_opt=> 'for all columns size 254' ); --히스토그램 생성
3 end;
4 /
PL/SQL 처리가 정상적으로 완료되었습니다.
SQL> SELECT COUNT(*) FROM T WHERE NO1 > 3000;
COUNT(*)
----------
7000
SQL> @XPLAN
-------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 19 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 19 |
|* 2 | TABLE ACCESS FULL| T | 1 | 7001 | 7000 |00:00:00.01 | 19 |
-------------------------------------------------------------------------------------
SQL> SELECT COUNT(*) FROM T WHERE NO2 > 3000;
-------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 19 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 19 |
|* 2 | TABLE ACCESS FULL| T | 1 | 7001 | 9000 |00:00:00.01 | 19 |
-------------------------------------------------------------------------------------
SQL> SELECT COUNT(*) FROM T WHERE NO1 > 5000;
-------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 19 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 19 |
|* 2 | TABLE ACCESS FULL| T | 1 | 5047 | 5000 |00:00:00.01 | 19 |
-------------------------------------------------------------------------------------
SQL> SELECT COUNT(*) FROM T WHERE NO2 > 5000;
-------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 19 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 19 |
|* 2 | TABLE ACCESS FULL| T | 1 | 1004 | 1000 |00:00:00.01 | 19 |
-------------------------------------------------------------------------------------
{CODE}
h3. (6)cardinality 힌트를 이용한 실행계획 제어
{code:sql}
SQL> SELECT /*+ USE_HASH( D E ) */ * FROM SCOTT.DEPT D, SCOTT.EMP E
2 WHERE D.DEPTNO = E.DEPTNO
3 ;
...
----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 14 |00:00:00.01 | 14 | | | |
|* 1 | HASH JOIN | | 1 | 14 | 14 |00:00:00.01 | 14 | 1000K| 1000K| 747K (0)|
| 2 | TABLE ACCESS FULL| DEPT | 1 | 4 | 4 |00:00:00.01 | 7 | | | |
| 3 | TABLE ACCESS FULL| EMP | 1 | 14 | 14 |00:00:00.01 | 7 | | | |
----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("D"."DEPTNO"="E"."DEPTNO")
SQL> SELECT /*+ USE_HASH( D E ) CARDINALITY(D 16) */ * FROM SCOTT.DEPT D, SCOTT.EMP E
2 WHERE D.DEPTNO = E.DEPTNO
3 ;
...
----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 14 |00:00:00.01 | 14 | | | |
|* 1 | HASH JOIN | | 1 | 56 | 14 |00:00:00.01 | 14 | 865K| 865K| 664K (0)|
| 2 | TABLE ACCESS FULL| EMP | 1 | 14 | 14 |00:00:00.01 | 6 | | | |
| 3 | TABLE ACCESS FULL| DEPT | 1 | 16 | 4 |00:00:00.01 | 8 | | | |
----------------------------------------------------------------------------------------------------------------
SQL> SELECT /*+ USE_HASH( D E ) OPT_ESTIMATE( TABLE, D, SCALE_ROWS=4 ) */ * FROM SCOTT.DEPT D, SCOTT.EMP E
2 WHERE D.DEPTNO = E.DEPTNO ;
...
----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 14 |00:00:00.01 | 14 | | | |
|* 1 | HASH JOIN | | 1 | 5726 | 14 |00:00:00.01 | 14 | 865K| 865K| 632K (0)|
| 2 | TABLE ACCESS FULL| EMP | 1 | 14 | 14 |00:00:00.01 | 6 | | | |
| 3 | TABLE ACCESS FULL| DEPT | 1 | 1636 | 4 |00:00:00.01 | 8 | | | |
----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("D"."DEPTNO"="E"."DEPTNO")