비용기반의 오라클 원리 (2009년)
도수분포 히스토그램(Frequency Histogram) 0 0 56,661

by 구루비스터디 히스토그램 Histogram [2023.09.23]


IV. 도수분포 히스토그램(Frequency Histogram)

1. 데이터 생성



-- 0. Oracle Version
SELECT *
FROM   V$VERSION
;


BANNER
------------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit
PL/SQL Release 10.2.0.3.0 - Production
CORE    10.2.0.3.0      Production
NLSRTL Version 10.2.0.3.0 - Production
;


-- 1. 테이블 T1 생성
DROP TABLE T1;
DROP TABLE GENERATOR;

CREATE TABLE GENERATOR AS
SELECT ROWNUM ID
FROM   ALL_OBJECTS
WHERE  ROWNUM <= 2000
;

CREATE TABLE T1 (SKEW NOT NULL,
	             PADDING)
AS
SELECT /*+ ORDERED USE_NL(V2) */
	   V1.ID,
	   RPAD('X',400)
FROM   GENERATOR	V1,
	   GENERATOR	V2
WHERE  V1.ID <= 80
AND	   V2.ID <= 80
AND	   V2.ID <= V1.ID
ORDER BY  V2.ID,V1.ID
;

CREATE INDEX T1_I1 ON T1(SKEW);



-- 2. 테이블 T1 데이터 조회
SELECT COUNT(*)
FROM   T1
;

  COUNT(*)
----------
      3240

SELECT SKEW,
       COUNT(*)
FROM   T1
GROUP  BY SKEW
ORDER  BY SKEW
;

      SKEW   COUNT(*)
---------- ----------
         1          1
         2          2
.....................
        79         79
        80         80


-- 3. 테이블 T1 Height-Balance Histogram 포함된 통계정보 생성
-- Height-Balance Histogram 생성 시 Rows가 80개인 생태에서 Bucket을 120개 요청했지만 결과는 80개 로우만 생성
BEGIN
	DBMS_STATS.GATHER_TABLE_STATS(USER,
                            	  'T1',
                            	  CASCADE => TRUE,
                            	  ESTIMATE_PERCENT => NULL,
                            	  METHOD_OPT => 'FOR ALL COLUMNS SIZE 120');
END;
/

SELECT ROWNUM,
       ENDPOINT_NUMBER,
       ENDPOINT_VALUE
FROM   USER_TAB_HISTOGRAMS
WHERE  COLUMN_NAME = 'SKEW'
AND    TABLE_NAME  = 'T1'
ORDER  BY ENDPOINT_NUMBER
;

ROWNUM ENDPOINT_NUMBER ENDPOINT_VALUE
------ --------------- --------------
     1               1              1
     2               3              2
.....................................
    79            3160             79
    80            3240             80
;


-- 4. USER_TAB_HISTOGRAMS 조회하여 분포도 확인
SELECT ENDPOINT_VALUE ROW_VALUE,
       CURR_NUM - NVL(PREV_NUM, 0) ROW_COUNT
FROM   (SELECT ENDPOINT_VALUE,
               ENDPOINT_NUMBER CURR_NUM,
               LAG(ENDPOINT_NUMBER, 1) OVER(ORDER BY ENDPOINT_NUMBER) PREV_NUM
        FROM   USER_TAB_HISTOGRAMS
        WHERE  COLUMN_NAME = 'SKEW'
        AND    TABLE_NAME  = 'T1')
ORDER  BY ENDPOINT_VALUE
;

 ROW_VALUE  ROW_COUNT
---------- ----------
         1          1
         2          2
.....................
        79         79
        80         80
;



2. Frequency Histogram을 이용한 테스트


조건절설명CBO사람
SKEW = 40상수 값4040
SKEW = 40.5존재하기는 않지만 범위에 포함10
SKEW BETWEEN 21 AND 24범위 내에 포함되는 BETWEEN 조건9090
SKEW BETWEEN 20.5 AND 24.5범위 내에 포함되는 BETWEEN 조건9090
SKEW BETWEEN 1 AND 2시작점을 포함한 BETWEEN 조건33
SKEW BETWEEN 79 AND 80종료점을 포함한 BETWEEN 조건159159
SKEW > 4 AND SKEW < 8">" AND "<" 범위 조건1818
SKEW = -10최소값 미만10
SKEW = 100최대값 미만10
SKEW BETWEEN -5 AND -3최소값 미만의 범위 조건10
SKEW BETWEEN 92 AND 94최대값 미만의 범위 조건10
SKEW BETWEEN 79 AND 82경계 값에 걸치는 범위 조건159159
SKEW = :B1바인드 변수41???
SKEW BETWEEN :B1 AND :B2바인드 변수의 BETWEEN 조건8???




-- 0. Oracle Version
SELECT *
FROM   V$VERSION
;

BANNER
------------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit
PL/SQL Release 10.2.0.3.0 - Production
CORE    10.2.0.3.0      Production
NLSRTL Version 10.2.0.3.0 - Production
;


-- 1. SKEW = 40
EXPLAIN PLAN FOR
SELECT	*
FROM	T1
WHERE	SKEW = 40
;

@xplan
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    40 | 16160 |    56   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |    40 | 16160 |    56   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("SKEW"=40)
;


-- 2. SKEW = 40.5
EXPLAIN PLAN FOR
SELECT	*
FROM	T1
WHERE	SKEW = 40.5
;

@xplan
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   404 |    56   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   404 |    56   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("SKEW"=40.5)
;


-- 3. SKEW BETWEEN 21 AND 24
EXPLAIN PLAN FOR
SELECT	*
FROM	T1
WHERE	SKEW BETWEEN 21 AND 24
;

@xplan
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    90 | 36360 |    56   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |    90 | 36360 |    56   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("SKEW"<=24 AND "SKEW">=21)
;


-- 4. SKEW BETWEEN 20.5 AND 24.5
EXPLAIN PLAN FOR
SELECT	*
FROM	T1
WHERE	SKEW BETWEEN 20.5 AND 24.5
;

@xplan
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    90 | 36360 |    56   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |    90 | 36360 |    56   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("SKEW"<=24.5 AND "SKEW">=20.5)
;


-- 5. SKEW BETWEEN 1 AND 2
EXPLAIN PLAN FOR
SELECT	*
FROM	T1
WHERE	SKEW BETWEEN 1 AND 2
;

@xplan
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    90 | 36360 |    56   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |    90 | 36360 |    56   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("SKEW"<=24.5 AND "SKEW">=20.5)
;


-- 6. SKEW BETWEEN 79 AND 80
EXPLAIN PLAN FOR
SELECT	*
FROM	T1
WHERE	SKEW BETWEEN 79 AND 80
;

@xplan
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   158 | 63832 |    56   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   158 | 63832 |    56   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("SKEW">=79 AND "SKEW"<=80)
;


-- 7. SKEW > 4 AND SKEW < 8
EXPLAIN PLAN FOR
SELECT	*
FROM	T1
WHERE	SKEW > 4 AND SKEW < 8
;

@xplan
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    18 |  7272 |    56   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |    18 |  7272 |    56   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("SKEW"<8 AND "SKEW">4)
;


-- 8. SKEW = -10
EXPLAIN PLAN FOR
SELECT	*
FROM	T1
WHERE	SKEW = -10
;

@xplan
-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |   404 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |   404 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_I1 |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("SKEW"=(-10))
;


-- 9. SKEW= 100
EXPLAIN PLAN FOR
SELECT	*
FROM	T1
WHERE	SKEW = 100
;

@xplan
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   404 |    56   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   404 |    56   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("SKEW"=100)
;


-- 10. SKEW BETWEEN -5 AND -3
EXPLAIN PLAN FOR
SELECT	*
FROM	T1
WHERE	SKEW BETWEEN -5 AND -3
;

@xplan
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   404 |    56   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   404 |    56   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("SKEW"<=(-3) AND "SKEW">=(-5))
;


-- 11. SKEW BETWEEN 92 AND 94
EXPLAIN PLAN FOR
SELECT	*
FROM	T1
WHERE	SKEW BETWEEN 92 AND 94
;

@xplan
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   404 |    56   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   404 |    56   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("SKEW">=92 AND "SKEW"<=94)
;


-- 12. SKEW BETWEEN 79 AND 82
EXPLAIN PLAN FOR
SELECT	*
FROM	T1
WHERE	SKEW BETWEEN 79 AND 82
;

@xplan
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   158 | 63832 |    56   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   158 | 63832 |    56   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("SKEW">=79 AND "SKEW"<=82)
;


-- 13. SKEW = :B1
EXPLAIN PLAN FOR
SELECT	*
FROM	T1
WHERE	SKEW = :B1
;

@xplan
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    41 | 16564 |    56   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |    41 | 16564 |    56   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("SKEW"=TO_NUMBER(:B1))
;


-- 14. SKEW BETWEEN :B1 AND :B2
EXPLAIN PLAN FOR
SELECT	*
FROM	T1
WHERE	SKEW BETWEEN :B1 AND :B2
;

@xplan
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     8 |  3232 |    56   (0)| 00:00:01 |
|*  1 |  FILTER            |      |       |       |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |     8 |  3232 |    56   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TO_NUMBER(:B1)<=TO_NUMBER(:B2))
   2 - filter("SKEW">=TO_NUMBER(:B1) AND "SKEW"<=TO_NUMBER(:B2))
;


0의 Cardinality
  • 일반적으로, 옵티마이저는 카디널리티 계산 과정에서 0이 전달되는 것을 허용하지 않는다.
  • 계산된 카디널리티가 0이 될 때마다, 옵티마이저는 안전하게 1의 카디널리티를 사용한다.
  • 하지만 조건절이 '1 = 0'처럼 논리적으로 모순된 경우에는 0 카디널리티가 허용된다.


3. Frequency Histogram 사용시 고려해야 할 사항

  1. 통계정보 수집 시 버킷 개수는 254 이하를 사용해야 값을 놓칠 위험히 적다.
  2. 데이터의 중요한 값이 변경되면, 히스토그램을 재생성해 주어야 한다.
  3. Frequency Histogram은 바인드 변수를 사용하게 되면 1/NDV 값을 사용한다.
  4. 적절한 히스토그램이 존재할 경우, CBO는 범위조건이 LOW/HIGH 값을 벗어날 때 더 잘 처리한다.
"코어 오라클 데이터베이스 스터디모임" 에서 2009년에 "비용기반의 오라클 원리 " 도서를 스터디하면서 정리한 내용 입니다.

- 강좌 URL : http://www.gurubee.net/lecture/4415

- 구루비 강좌는 개인의 학습용으로만 사용 할 수 있으며, 다른 웹 페이지에 게재할 경우에는 출처를 꼭 밝혀 주시면 고맙겠습니다.~^^

- 구루비 강좌는 서비스 제공을 위한 목적이나, 학원 홍보, 수익을 얻기 위한 용도로 사용 할 수 없습니다.

댓글등록
SQL문을 포맷에 맞게(깔끔하게) 등록하려면 code() 버튼을 클릭하여 작성 하시면 됩니다.
로그인 사용자만 댓글을 작성 할 수 있습니다. 로그인, 회원가입