h1.목차
골치아픈 Case
I. Skewed Data
II. Correlated Columns
III. Join Cardanality
IV. Partition KEY
골치아픈 Case
구분 | 설명 |
---|
Skewed Data | - Oracle은 값에 따른 Data 분포의 편차가 큰 경우 잘 해석하지 못함.
- Skew 문제를 해결할 수 있는 유일한 방법은 Histogram
- 하지만 Histogram으로도 100% 해결할 수 없는 Case가 있음
|
Correlated Columns | - Oracle이 Cardinality를 계산할 때 사용하는 공식들은 Column들이 서로 독립적인것을 전제로 함.
- Oracle은 이 가정에 위배되는 Data가 존재하는 경우 잘 해석하지 못함.
|
Join Cardinality | - Skewed Data를 잘 해석하지 못하는 것처럼 Join시에도 Join Column들의 Data 편차가 있을 경우
잘 해석하지 못함.
|
Partition Key | - Partition Table에 대한 Cardinality 예측 또한 예상치 못한 결과를 보임
- Partition Elimination이 가능한 경우와 불가능한 경우 많은 차이를 보이며, 가능하더라도 단일 Partition
을 Access 하는 경우와 다중 Partiton을 Access 하는 경우에 다른 Patten의 Cardanality 계산이 이루어짐
|
I. Skewed Data
I.1. 테스트 데이터 생성
-- create object
DROP TABLE T1 PURGE;
CREATE TABLE T1(C1 VARCHAR2(1), C2 INT);
CREATE INDEX T1_N1 ON T1(C1);
-- create data
INSERT INTO T1
SELECT 'A',
LEVEL
FROM DUAL
CONNECT BY LEVEL <= 10000
UNION ALL
SELECT 'B', LEVEL
FROM DUAL
CONNECT BY LEVEL <= 10
;
COMMIT;
-- gather stats "without" histogram
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T1', METHOD_OPT => 'FOR ALL COLUMNS SIZE 1', NO_INVALIDATE => FALSE);
I.2. 통계정보 확인
-- Table
SELECT TABLE_NAME,
NUM_ROWS,
BLOCKS,
SAMPLE_SIZE,
TO_CHAR(LAST_ANALYZED, 'YYYY/MM/DD HH24:MI:SS') AS LAST_ANAL
FROM USER_TAB_STATISTICS
WHERE TABLE_NAME = UPPER('T1')
;
TABLE_NAME | NUM_ROWS | BLOCKS | SAMPLE_SIZE | LAST_ANAL |
---|
T1 | 10010 | 28 | 10010 | 2009-03-01 1:43:32 PM |
-- Column
SELECT S.TABLE_NAME,
S.COLUMN_NAME,
S.NUM_DISTINCT,
S.NUM_NULLS,
S.DENSITY,
S.LOW_VALUE,
S.HIGH_VALUE,
S.HISTOGRAM
FROM USER_TAB_COLS S
WHERE S.TABLE_NAME = UPPER('T1')
;
TABLE_NAME | COLUMN_NAME | NUM_DISTINCT | NUM_NULLS | DENSITY | LOW_VALUE | HIGH_VALUE | HISTOGRAM | T1 | C2 | 10000 | 0 | 0.0001 | C102 | C302 | NONE |
---|
T1 | C1 | 2 | 0 | 0.5 | 41 | 42 | NONE |
-- Histogram
SELECT TABLE_NAME,
COLUMN_NAME,
ENDPOINT_NUMBER,
ENDPOINT_VALUE || '(' || ENDPOINT_ACTUAL_VALUE || ')' AS ENDPOINT_VALUE
FROM USER_TAB_HISTOGRAMS
WHERE TABLE_NAME = UPPER('T1')
ORDER BY COLUMN_NAME,
ENDPOINT_NUMBER
;
TABLE_NAME | COLUMN_NAME | ENDPOINT_NUMBER | ENDPOINT_VALUE |
---|
T1 | C1 | 0 | 337499295804764000000000000000000000() |
T1 | C1 | 1 | 342691592663299000000000000000000000() |
T1 | C2 | 0 | 1() |
T1 | C2 | 1 | 10000() |
I.3. 테스트
1) C1 Columns 'A' Cardanality
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T1
WHERE C1 = 'A'
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 31 |
|* 2 | TABLE ACCESS FULL| T1 | 1 | 10000 | 9 (0)| 10000 |00:00:00.01 | 31 |
--------------------------------------------------------------------------------------------------
2) C1 Columns 'B' Cardanality
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T1
WHERE C1 = 'B'
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 31 |
|* 2 | TABLE ACCESS FULL| T1 | 1 | 5005 | 9 (0)| 10 |00:00:00.01 | 31 |
--------------------------------------------------------------------------------------------------
3) gather stats "with" histogram(SKEWONLY : 데이터 분산도에 따라 생성 결정)
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T1', METHOD_OPT => 'FOR ALL COLUMNS SIZE SKEWONLY', NO_INVALIDATE => FALSE);
-- Histogram
SELECT S.TABLE_NAME,
S.COLUMN_NAME,
S.NUM_DISTINCT,
S.NUM_NULLS,
S.DENSITY,
S.LOW_VALUE,
S.HIGH_VALUE,
S.HISTOGRAM
FROM USER_TAB_COLS S
WHERE S.TABLE_NAME = UPPER('T1')
;
TABLE_NAME | COLUMN_NAME | NUM_DISTINCT | NUM_NULLS | DENSITY | LOW_VALUE | HIGH_VALUE | HISTOGRAM | T1 | C2 | 10000 | 0 | 0.0001 | C102 | C302 | NONE |
---|
T1 | C1 | 2 | 0 | 00004995004995005 | 41 | 42 | FREQUENCY |
4) C1 Columns 'A' Cardanality with histogram
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T1
WHERE C1 = 'A'
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 31 |
|* 2 | TABLE ACCESS FULL| T1 | 1 | 10000 | 9 (0)| 10000 |00:00:00.01 | 31 |
--------------------------------------------------------------------------------------------------
;
5) C1 Columns 'B' Cardanality with histogram
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T1
WHERE C1 = 'B'
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 2 |
|* 2 | INDEX RANGE SCAN| T1_N1 | 1 | 10 | 1 (0)| 10 |00:00:00.01 | 2 |
--------------------------------------------------------------------------------------------------
;
II.1. 테스트 데이터 생성
-- create object
DROP TABLE T1 PURGE;
CREATE TABLE T1(C1 VARCHAR2(1), C2 INT);
-- create data
INSERT INTO T1
SELECT 'A', 1
FROM DUAL
CONNECT BY LEVEL <= 1000
UNION ALL
SELECT 'B', 2
FROM DUAL
CONNECT BY LEVEL <= 100
UNION ALL
SELECT 'C', 2
FROM DUAL
CONNECT BY LEVEL <= 100
;
COMMIT;
-- gather stats "with" histogram
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T1', METHOD_OPT => 'FOR ALL COLUMNS SIZE SKEWONLY', NO_INVALIDATE => FALSE);
II.2. 통계정보 확인
-- Table
SELECT TABLE_NAME,
NUM_ROWS,
BLOCKS,
SAMPLE_SIZE,
TO_CHAR(LAST_ANALYZED, 'YYYY/MM/DD HH24:MI:SS') AS LAST_ANAL
FROM USER_TAB_STATISTICS
WHERE TABLE_NAME = UPPER('T1')
;
TABLE_NAME | NUM_ROWS | BLOCKS | SAMPLE_SIZE | LAST_ANAL |
---|
T1 | 1200 | 13 | 1200 | 2009-03-01 2:39:21 PM |
-- Column
SELECT S.TABLE_NAME,
S.COLUMN_NAME,
S.NUM_DISTINCT,
S.NUM_NULLS,
S.DENSITY,
S.LOW_VALUE,
S.HIGH_VALUE,
S.HISTOGRAM
FROM USER_TAB_COLS S
WHERE S.TABLE_NAME = UPPER('T1')
;
TABLE_NAME | COLUMN_NAME | NUM_DISTINCT | NUM_NULLS | DENSITY | LOW_VALUE | HIGH_VALUE | HISTOGRAM | T1 | C2 | 2 | 0 | 0.000416666666666667 | C102 | C103 | FREQUENCY |
---|
T1 | C1 | 3 | 0 | 0.000416666666666667 | 41 | 43 | FREQUENCY |
-- Histogram
SELECT TABLE_NAME,
COLUMN_NAME,
ENDPOINT_NUMBER,
ENDPOINT_VALUE || '(' || ENDPOINT_ACTUAL_VALUE || ')' AS ENDPOINT_VALUE
FROM USER_TAB_HISTOGRAMS
WHERE TABLE_NAME = UPPER('T1')
ORDER BY COLUMN_NAME,
ENDPOINT_NUMBER
;
TABLE_NAME | COLUMN_NAME | ENDPOINT_NUMBER | ENDPOINT_VALUE |
---|
T1 | C1 | 1000 | 337499295804764000000000000000000000() |
T1 | C1 | 1100 | 342691592663299000000000000000000000() |
T1 | C1 | 1200 | 347883889521833000000000000000000000() |
T1 | C2 | 1000 | 1() |
T1 | C2 | 1200 | 2() |
II.3. 테스트
1) C1 Columns 'A' Cardanality with histogram
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T1
WHERE C1 = 'A'
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 15 |
|* 2 | TABLE ACCESS FULL| T1 | 1 | 1000 | 5 (0)| 1000 |00:00:00.01 | 15 |
--------------------------------------------------------------------------------------------------
;
2) C1 = 'A' AND C2 = 1
= Base Cardinality * Selectivity
= Base Cardinality * Selectivity(C1 = 'A' AND C2 = 1)
= Base Cardinality * Selectivity(C1 = 'A') * Selectivity(C2 = 1)
= 1200 * 1000/1200 * 1000/1200
= 833
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T1
WHERE C1 = 'A'
AND C2 = 1
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 15 |
|* 2 | TABLE ACCESS FULL| T1 | 1 | 833 | 5 (0)| 1000 |00:00:00.01 | 15 |
--------------------------------------------------------------------------------------------------
;
3) C1 = 'A' AND C2 = 2
= Base Cardinality * Selectivity
= Base Cardinality * Selectivity(C1 = 'A' AND C2 = 2)
= Base Cardinality * Selectivity(C1 = 'A') * Selectivity(C2 = 2)
= 1200 * 1000/1200 * 200/1200
= 166
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T1
WHERE C1 = 'A'
AND C2 = 2
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 15 |
|* 2 | TABLE ACCESS FULL| T1 | 1 | 166 | 5 (0)| 0 |00:00:00.01 | 15 |
--------------------------------------------------------------------------------------------------
;
4. Exetended Statistics
1) 10g까지는 Column들간으 상호 의존성을 Oracle이 인식하도록 할 수 있는 방법은 없다
2) 11g부터는 Extended Statistics를 이용해 Column들간의 상호 의존성을 통계 정보에 저장 가능
3) 구문
- EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T1', METHOD_OPT => 'FOR COLUMNS (C1, C2) SIZE SKEWONLY', NO_INVALIDATE => FALSE);
III. Join Cardanality
III.1. 테스트 데이터 생성
-- create object
DROP TABLE T1 PURGE;
DROP TABLE T2 PURGE;
CREATE TABLE T1(C1 INT,
C2 INT,
C3 INT,
C4 INT);
CREATE TABLE T2(C1 INT,
C2 INT,
C3 INT,
C4 VARCHAR2(1));
-- create data
INSERT INTO T1
SELECT 1,
LEVEL,
MOD(LEVEL, 100) + 1,
'1'
FROM DUAL
CONNECT BY LEVEL <= 1000
;
INSERT INTO T2
SELECT 2,
LEVEL,
CASE WHEN LEVEL <= 99 THEN LEVEL
ELSE 0
END,
'2'
FROM DUAL
CONNECT BY LEVEL <= 1000
;
COMMIT;
-- gather stats "without" histogram
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T1', METHOD_OPT => 'FOR ALL COLUMNS SIZE 1', NO_INVALIDATE => FALSE);
-- gather stats "without" histogram
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T2', METHOD_OPT => 'FOR ALL COLUMNS SIZE 1', NO_INVALIDATE => FALSE);
III.2. 통계정보 확인
-- Table
SELECT TABLE_NAME,
NUM_ROWS,
BLOCKS,
SAMPLE_SIZE,
TO_CHAR(LAST_ANALYZED, 'YYYY/MM/DD HH24:MI:SS') AS LAST_ANAL
FROM USER_TAB_STATISTICS
WHERE TABLE_NAME IN ('T1', 'T2')
;
TABLE_NAME | NUM_ROWS | BLOCKS | SAMPLE_SIZE | LAST_ANAL |
---|
T1 | 1000 | 13 | 1000 | 2009-03-15 11:51:32 AM |
T2 | 1000 | 13 | 1000 | 2009-03-15 11:51:33 AM |
-- Column
SELECT S.TABLE_NAME,
S.COLUMN_NAME,
S.NUM_DISTINCT,
S.NUM_NULLS,
S.DENSITY,
S.LOW_VALUE,
S.HIGH_VALUE,
S.HISTOGRAM
FROM USER_TAB_COLS S
WHERE S.TABLE_NAME IN ('T1', 'T2')
;
TABLE_NAME | COLUMN_NAME | NUM_DISTINCT | NUM_NULLS | DENSITY | LOW_VALUE | HIGH_VALUE | HISTOGRAM | T1 | C4 | 1 | 0 | 1 | C102 | C102 | NONE |
---|
T1 | C3 | 100 | 0 | 0.01 | C102 | C202 | NONE |
T1 | C2 | 1000 | 0 | 0.001 | C102 | C20B | NONE |
T1 | C1 | 1 | 0 | 1 | C102 | C102 | NONE |
T2 | C4 | 1 | 0 | 1 | 32 | 32 | NONE |
T2 | C3 | 100 | 0 | 0.01 | 80 | C164 | NONE |
T2 | C2 | 1000 | 0 | 0.001 | C102 | C20B | NONE |
T2 | C1 | 1 | 0 | 1 | C103 | C103 | NONE |
-- Histogram
SELECT TABLE_NAME,
COLUMN_NAME,
ENDPOINT_NUMBER,
ENDPOINT_VALUE || '(' || ENDPOINT_ACTUAL_VALUE || ')' AS ENDPOINT_VALUE
FROM USER_TAB_HISTOGRAMS
WHERE TABLE_NAME IN ('T1', 'T2')
ORDER BY COLUMN_NAME,
ENDPOINT_NUMBER
;
TABLE_NAME | COLUMN_NAME | ENDPOINT_NUMBER | ENDPOINT_VALUE |
---|
T1 | C1 | 0 | 1() |
T1 | C1 | 1 | 1() |
T1 | C2 | 0 | 1() |
T1 | C2 | 1 | 1000() |
T1 | C3 | 0 | 1() |
T1 | C3 | 1 | 100() |
T1 | C4 | 0 | 1() |
T1 | C4 | 1 | 1() |
T2 | C1 | 0 | 2() |
T2 | C1 | 1 | 2() |
T2 | C2 | 0 | 1() |
T2 | C2 | 1 | 1000() |
T2 | C3 | 0 | 0() |
T2 | C3 | 1 | 99() |
T2 | C4 | 0 | 259614842926741000000000000000000000() |
T2 | C4 | 1 | 259614842926741000000000000000000000() |
III.3. 테스트
1) Join Column 'C1' Cardinality without Histogram
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T1,
T2
WHERE T1.C1 = T2.C1
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 30 |
|* 2 | HASH JOIN | | 1 | 1 | 11 (10)| 0 |00:00:00.01 | 30 |
| 3 | TABLE ACCESS FULL| T1 | 1 | 1000 | 5 (0)| 1000 |00:00:00.01 | 15 |
| 4 | TABLE ACCESS FULL| T2 | 1 | 1000 | 5 (0)| 1000 |00:00:00.01 | 15 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T1"."C1"="T2"."C1")
;
2) Join Column 'C2' Cardinality without Histogram
= Cardinality(T1.C2 = T2.C2)
= Cardinality(T1.C2) * Cardinality(T2.C2) * MIN(Selectivity(T1.C2), Selectivity(T2.C2))
= 1000 * 1000 * MIN(0.001, 0.001)
= 1000
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T1,
T2
WHERE T1.C2 = T2.C2
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.02 | 30 |
|* 2 | HASH JOIN | | 1 | 1000 | 11 (10)| 1000 |00:00:00.02 | 30 |
| 3 | TABLE ACCESS FULL| T1 | 1 | 1000 | 5 (0)| 1000 |00:00:00.01 | 15 |
| 4 | TABLE ACCESS FULL| T2 | 1 | 1000 | 5 (0)| 1000 |00:00:00.01 | 15 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T1"."C2"="T2"."C2")
;
3) Join Column 'C3' Cardinality without Histogram
= Cardinality(T1.C3 = T2.C3)
= Cardinality(T1.C3) * Cardinality(T2.C3) * MIN(Selectivity(T1.C3), Selectivity(T2.C3))
= 1000 * 1000 * MIN(0.01, 0.01)
= 10000
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T1,
T2
WHERE T1.C3 = T2.C3
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.02 | 30 |
|* 2 | HASH JOIN | | 1 | 10000 | 11 (10)| 990 |00:00:00.02 | 30 |
| 3 | TABLE ACCESS FULL| T1 | 1 | 1000 | 5 (0)| 1000 |00:00:00.01 | 15 |
| 4 | TABLE ACCESS FULL| T2 | 1 | 1000 | 5 (0)| 1000 |00:00:00.01 | 15 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T1"."C3"="T2"."C3")
;
- T1.C3, T2.C3 컬럼의 데이터들은 Skew되어 있기 때문에 정확한 Cardinality를 구할 수 없으며
이를 해결하기 위해서는 Histogram을 생성해야 한다.
4) Create 'C3' Column Histogram
-- gather stats "with" histogram
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T1', METHOD_OPT => 'FOR COLUMNS C3 SIZE SKEWONLY', NO_INVALIDATE => FALSE);
-- gather stats "with" histogram
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T2', METHOD_OPT => 'FOR COLUMNS C3 SIZE SKEWONLY', NO_INVALIDATE => FALSE);
5) onemore Test for Join Column 'C3' Cardinality with Histogram
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T1,
T2
WHERE T1.C3 = T2.C3
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.02 | 30 |
|* 2 | HASH JOIN | | 1 | 496 | 11 (10)| 990 |00:00:00.02 | 30 |
| 3 | TABLE ACCESS FULL| T1 | 1 | 1000 | 5 (0)| 1000 |00:00:00.01 | 15 |
| 4 | TABLE ACCESS FULL| T2 | 1 | 1000 | 5 (0)| 1000 |00:00:00.01 | 15 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T1"."C3"="T2"."C3")
;
6) mismatch datatype 'C4' Column Cardinality without Histogram
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T1,
T2
WHERE T1.C4 = T2.C4
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 30 |
|* 2 | HASH JOIN | | 1 | 1000K| 20 (50)| 0 |00:00:00.01 | 30 |
| 3 | TABLE ACCESS FULL| T1 | 1 | 1000 | 5 (0)| 1000 |00:00:00.01 | 15 |
| 4 | TABLE ACCESS FULL| T2 | 1 | 1000 | 5 (0)| 1000 |00:00:00.01 | 15 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T1"."C4"=TO_NUMBER("T2"."C4"))
;
7) Semi Join 'C1' Column Cardinality without Histogram
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T1
WHERE C1 IN (SELECT C1
FROM T2)
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 30 |
|* 2 | HASH JOIN SEMI | | 1 | 1000 | 11 (10)| 0 |00:00:00.01 | 30 |
| 3 | TABLE ACCESS FULL| T1 | 1 | 1000 | 5 (0)| 1000 |00:00:00.01 | 15 |
| 4 | TABLE ACCESS FULL| T2 | 1 | 1000 | 5 (0)| 1000 |00:00:00.01 | 15 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("C1"="C1")
;
8) Semi Join 'C3' Column Cardinality with Histogram
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T1
WHERE C3 IN (SELECT C3
FROM T2)
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.02 | 30 |
|* 2 | HASH JOIN SEMI | | 1 | 990 | 11 (10)| 990 |00:00:00.02 | 30 |
| 3 | TABLE ACCESS FULL| T1 | 1 | 1000 | 5 (0)| 1000 |00:00:00.01 | 15 |
| 4 | TABLE ACCESS FULL| T2 | 1 | 1000 | 5 (0)| 1000 |00:00:00.01 | 15 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("C3"="C3")
;
9) Filter Operation Cardinality
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM (SELECT /*+ NO_MERGE */
T1.C1,
T1.C2
FROM T1
WHERE C3 IN (SELECT /*+ NO_UNNEST */
C3
FROM T2)
) V,
T2
WHERE V.C1 = T2.C1
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.02 | 1557 |
|* 2 | HASH JOIN | | 1 | 1 | 514 (1)| 0 |00:00:00.02 | 1557 |
| 3 | VIEW | | 1 | 10 | 508 (1)| 990 |00:00:00.02 | 1542 |
|* 4 | FILTER | | 1 | | | 990 |00:00:00.02 | 1542 |
| 5 | TABLE ACCESS FULL| T1 | 1 | 1000 | 5 (0)| 1000 |00:00:00.01 | 15 |
|* 6 | TABLE ACCESS FULL| T2 | 190 | 10 | 5 (0)| 189 |00:00:00.01 | 1527 |
| 7 | TABLE ACCESS FULL | T2 | 1 | 1000 | 5 (0)| 1000 |00:00:00.01 | 15 |
-----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("V"."C1"="T2"."C1")
4 - filter( IS NOT NULL)
6 - filter("C3"=:B1)
;
10) Not In Operation 'C1' Column Cardinality
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T1
WHERE C1 NOT IN (SELECT C1
FROM T2
WHERE C1 IS NOT NULL)
AND C1 IS NOT NULL
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.02 | 30 |
|* 2 | HASH JOIN ANTI | | 1 | 1 | 11 (10)| 1000 |00:00:00.02 | 30 |
|* 3 | TABLE ACCESS FULL| T1 | 1 | 1000 | 5 (0)| 1000 |00:00:00.01 | 15 |
|* 4 | TABLE ACCESS FULL| T2 | 1 | 1000 | 5 (0)| 1000 |00:00:00.01 | 15 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("C1"="C1")
3 - filter("C1" IS NOT NULL)
4 - filter("C1" IS NOT NULL)
;
11) Not In Operation 'C3' Column Cardinality
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T1
WHERE C3 NOT IN (SELECT C3
FROM T2
WHERE C1 IS NOT NULL)
AND C3 IS NOT NULL
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 1542 |
|* 2 | FILTER | | 1 | | | 10 |00:00:00.01 | 1542 |
|* 3 | TABLE ACCESS FULL| T1 | 1 | 1000 | 5 (0)| 1000 |00:00:00.01 | 15 |
|* 4 | TABLE ACCESS FULL| T2 | 190 | 10 | 5 (0)| 189 |00:00:00.01 | 1527 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter( IS NULL)
3 - filter("C3" IS NOT NULL)
4 - filter(("C1" IS NOT NULL AND LNNVL("C3"<>:B1)))
;
IV. Partition KEY
IV.1. 테스트 데이터 생성
-- create partition table T1
DROP TABLE T1 PURGE;
CREATE TABLE T1(C1 INT,
C2 INT)
PARTITION BY LIST(C1)
(PARTITION P1 VALUES (1),
PARTITION P2 VALUES (2),
PARTITION P3 VALUES (3),
PARTITION P4 VALUES (4),
PARTITION P5 VALUES (5)
);
-- create data
INSERT INTO T1
SELECT 1, LEVEL FROM DUAL CONNECT BY LEVEL <= 10000 UNION ALL
SELECT 2, LEVEL FROM DUAL CONNECT BY LEVEL <= 1000 UNION ALL
SELECT 3, LEVEL FROM DUAL CONNECT BY LEVEL <= 100 UNION ALL
SELECT 4, LEVEL FROM DUAL CONNECT BY LEVEL <= 10 UNION ALL
SELECT 5, LEVEL FROM DUAL CONNECT BY LEVEL <= 1
;
COMMIT;
-- gather stats "without" histogram
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T1', METHOD_OPT => 'FOR ALL COLUMNS SIZE 1', NO_INVALIDATE => FALSE);
-- create table T4
DROP TABLE T4 PURGE;
CREATE TABLE T4(C1 INT,
C2 INT)
;
-- create data
INSERT INTO T4
SELECT 1, LEVEL FROM DUAL CONNECT BY LEVEL <= 10000 UNION ALL
SELECT 2, LEVEL FROM DUAL CONNECT BY LEVEL <= 1000 UNION ALL
SELECT 3, LEVEL FROM DUAL CONNECT BY LEVEL <= 100 UNION ALL
SELECT 4, LEVEL FROM DUAL CONNECT BY LEVEL <= 10 UNION ALL
SELECT 5, LEVEL FROM DUAL CONNECT BY LEVEL <= 1
;
COMMIT;
-- gather stats "without" histogram
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T4', METHOD_OPT => 'FOR ALL COLUMNS SIZE 1', NO_INVALIDATE => FALSE);
-- disable bind peeking
ALTER SYSTEM SET "_OPTIM_PEEK_USER_BINDS" = FALSE;
IV.2. 통계정보 확인
-- Table
SELECT TABLE_NAME,
NUM_ROWS,
BLOCKS,
SAMPLE_SIZE,
TO_CHAR(LAST_ANALYZED, 'YYYY/MM/DD HH24:MI:SS') AS LAST_ANAL
FROM USER_TAB_STATISTICS
WHERE TABLE_NAME IN ('T1', 'T4')
;
TABLE_NAME | NUM_ROWS | BLOCKS | SAMPLE_SIZE | LAST_ANAL |
---|
T1 | 11111 | 80 | 11111 | 2009-03-15 12:54 |
T1 | 1000 | 13 | 1000 | 2009-03-15 12:54 |
T1 | 100 | 13 | 100 | 2009-03-15 12:54 |
T1 | 10 | 13 | 10 | 2009-03-15 12:54 |
T1 | 1 | 13 | 1 | 2009-03-15 12:54 |
T1 | 10000 | 28 | 10000 | 2009-03-15 12:54 |
T4 | 11111 | 28 | 11111 | 2009-03-15 12:54 |
-- Column
SELECT S.TABLE_NAME,
S.COLUMN_NAME,
S.NUM_DISTINCT,
S.NUM_NULLS,
S.DENSITY,
S.LOW_VALUE,
S.HIGH_VALUE,
S.HISTOGRAM
FROM USER_TAB_COLS S
WHERE S.TABLE_NAME IN ('T1', 'T4')
;
TABLE_NAME | COLUMN_NAME | NUM_DISTINCT | NUM_NULLS | DENSITY | LOW_VALUE | HIGH_VALUE | HISTOGRAM | T1 | C2 | 10000 | 0 | 0.0001 | C102 | C302 | NONE |
---|
T1 | C1 | 5 | 0 | 0.2 | C102 | C106 | NONE |
T4 | C2 | 10000 | 0 | 0.0001 | C102 | C302 | NONE |
T4 | C1 | 5 | 0 | 0.2 | C102 | C106 | NONE |
-- Histogram
SELECT TABLE_NAME,
COLUMN_NAME,
ENDPOINT_NUMBER,
ENDPOINT_VALUE || '(' || ENDPOINT_ACTUAL_VALUE || ')' AS ENDPOINT_VALUE
FROM USER_TAB_HISTOGRAMS
WHERE TABLE_NAME IN ('T1', 'T4')
ORDER BY COLUMN_NAME,
ENDPOINT_NUMBER
;
TABLE_NAME | COLUMN_NAME | ENDPOINT_NUMBER | ENDPOINT_VALUE |
---|
T1 | C1 | 0 | 1() |
T1 | C1 | 1 | 5() |
T1 | C2 | 0 | 1() |
T1 | C2 | 1 | 10000() |
T4 | C1 | 0 | 1() |
T4 | C1 | 1 | 5() |
T4 | C2 | 0 | 1() |
T4 | C2 | 1 | 10000() |
IV.3. 테스트
1) Partition Table 'T1' with Bind Variable 'C1' Column
= Base Cardinality * Selectivity(C1)
= 11111 * 0.2
= 2222
VAR B1 NUMBER;
VAR B2 NUMBER;
EXEC :B1 := 1;
EXEC :B2 := 3;
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T1
WHERE C1 = :B1
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
------------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 31 |
| 2 | PARTITION LIST SINGLE| | 1 | 2222 | 6 (0)| 10000 |00:00:00.01 | 31 |
| 3 | TABLE ACCESS FULL | T1 | 1 | 2222 | 6 (0)| 10000 |00:00:00.01 | 31 |
------------------------------------------------------------------------------------------------------
2) Normal Table 'T4' with Bind Variable 'C1' COLUMN
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T4
WHERE C1 = :B1
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 31 |
|* 2 | TABLE ACCESS FULL| T4 | 1 | 2222 | 9 (0)| 10000 |00:00:00.01 | 31 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("C1"=:B1)
;
3) Partition Table 'T1' with Literal Variable 'C1' Column
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T1
WHERE C1 = 1
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
------------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 31 |
| 2 | PARTITION LIST SINGLE| | 1 | 10000 | 9 (0)| 10000 |00:00:00.01 | 31 |
| 3 | TABLE ACCESS FULL | T1 | 1 | 10000 | 9 (0)| 10000 |00:00:00.01 | 31 |
------------------------------------------------------------------------------------------------------
4) Partition Table 'T4' with Literal Variable 'C1' Column
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T4
WHERE C1 = 1
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 31 |
|* 2 | TABLE ACCESS FULL| T4 | 1 | 2222 | 9 (0)| 10000 |00:00:00.01 | 31 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("C1"=1)
;
5) Partition Table 'T1' access range with Literal Variable 'C1' Column
= Cardinality(C1 BETWEEN 1 AND 3)
= Cardinality(C1 = 1) + Cardinality(C1 = 3) + Cardinality(1 < C1 < 3)
= 0 + (11111 * 0.2) + ((3 - 1) * (5 - 1) * 11111)
= 7777.7 = 7778
- 다중 Partition에 대한 Access가 이루어지는 경우는 Partition Statistics가 아닌 Global Statistics가 사용된다.
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T1
WHERE C1 BETWEEN 1
AND 3
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 61 |
| 2 | PARTITION LIST ITERATOR| | 1 | 7778 | 16 (0)| 11100 |00:00:00.01 | 61 |
| 3 | TABLE ACCESS FULL | T1 | 3 | 7778 | 16 (0)| 11100 |00:00:00.01 | 61 |
--------------------------------------------------------------------------------------------------------
6) Normal Table 'T4' access range with Literal Variable 'C1' Column
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T4
WHERE C1 BETWEEN 1
AND 3
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 31 |
|* 2 | TABLE ACCESS FULL| T4 | 1 | 7778 | 9 (0)| 11100 |00:00:00.01 | 31 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(("C1"<=3 AND "C1">=1))
;
7) Partition Table 'T1' access range with Bind Variable 'C1' Column
= Cardinality = 11111 * 0.2 * 0.2 = 444.4 = 445
- Distinct Count가 20보다 큰 경우 : Density가 0.05보다 작은 경우에는 5%의 Rule을 사용
- Distinct Count가 20보다 작은 경우 : Density가 0.05보다 큰 경우에는 Density를 사용
SELECT /*+ GATHER_PLAN_STATISTICS */
COUNT(*)
FROM T1
WHERE C1 BETWEEN :B1
AND :B2
;
select * from table(dbms_xplan.display_cursor(null, null, 'iostats cost last'));
---------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 61 |
|* 2 | FILTER | | 1 | | | 11100 |00:00:00.01 | 61 |
| 3 | PARTITION LIST ITERATOR| | 1 | 445 | 23 (0)| 11100 |00:00:00.01 | 61 |
| 4 | TABLE ACCESS FULL | T1 | 3 | 445 | 23 (0)| 11100 |00:00:00.01 | 61 |
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(:B1<=:B2)
문서에 대하여
- 최초작성자 : 강정식
- 최초작성일 : 2009년 4월 4일
- 이 문서는 오라클클럽 코어 오라클 데이터베이스 스터디 모임에서 작성하였습니다.
- {*}이 문서의 내용은 조동욱님의 'Optimizing Oracle Optimizer'를 참고하였습니다.*