h1.06 Sort Area를 적게 사용하도록 SQL 작성

  • 소트연산이 불가피하다면 메모리내에서 Sort Area를 적게 사용하도록 처리해야 한다.

h3.(1) 소트를 완료하고 나서 데이터 가공하기


[1번 쿼리]
SELECT LPAD(상품번호, 30) || LPAD(상품명, 30) || LPAD(고객id, 10)
       || LPAD(고객명, 20) || TO_CHAR(주문일시, 'yyyymmdd hh24:mi:ss')
FROM   주문상품
WHERE  주문일시 BETWEEN :start AND :end
ORDER  BY 상품번호;


[2번 쿼리]
SELECT LPAD(상품번호, 30) || LPAD(상품명, 30) || LPAD(고객ID, 10)
       || LPAD(고객명, 20) || TO_CHAR(주문일시, 'YYYYMMDD HH24:MI:SS')
FROM 
(
  SELECT 상품번호, 상품명, 고객ID, 고객명, 주문일시
  FROM 주문상품
  WHERE  주문일시 BETWEEN :START AND :END
  ORDER  BY 상품번호
);

- 가공된 결과치를 Sort Area에 담아 정렬 출력하는 1번 쿼리 보다 정렬을 완료후 최종 출력을 가공하는 inline view를 사용한 2번 쿼리가 Sort Area를 적게 사용한다.

(2) Top-N 쿼리

h5.# Top-N 쿼리 알고리즘

  • rownum <= 10 일때
    1. 처음 읽은 10개의 레코드를 오름차순으로 읽는다.
    2. 이후의 레코드에 대해서 정렬된 맨 오른쪽 값과 비교 하여 작은 값이 나타나면 새로 정렬을 한후 맨 오른쪽 값은 버린다.
    3. 2번 과정 반복
    4. 따로 정렬하지 않고도 오름차순으로 10개의 값을 찾을수 있어 소트연산 횟수와 Sort Area 사용량을 줄일 수 있다.

select *
from   
   (
        select 거래일시, 채결건수, 체수량, 거래대금
        where  종목코드 = 'kr123456'
        and    거래일시 >= '20080304'
        order  거래일시
   )
where  rownum <= 10;



Execution Plan
------------------------------------------------------------------------------
0    SELECT STATEMENT Optimizer=ALL_ROWS
1  0   COUNT (STOPKEY)
2  1     VIEW
3  2       TABLE ACCESS (BY INDEX ROWID) OF '시간별종목거래' (TABLE)
4  3         INDEX (RANGE SCAN) OF '시간별종목거래_PK' (INDEX (UNIQUE))


- Top-N 쿼리 사용시 장점!!!!

1. 위의 SQL처럼 inline view 사용함으로 Sort Area를 줄일수 있다.
2. 만약 종목코드 + 거래일시 순으로 인덱스가 존재할 경우 인덱스가 order by 연산을 대체할 수 있다.
3. rownum 사용으로 N건에서 멈추도록 했으므로 매우 빠른 수행 속도를 낼수 있다 (COUNT (STOPKEY))

Top-N쿼리의 효과 측정


- 설정 - 
SQL> create table t as select * from all_objects;

Table created.

SQL> alter session set workarea_size_policy=manual;

Session altered.

SQL> alter session set sort_area_size=524288;             

Session altered.

SQL> select  count(*) from t;

  COUNT(*)
----------
     67616

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        967  consistent gets
        964  physical reads
          0  redo size
        528  bytes sent via SQL*Net to client
        519  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed


h3.# Top-N 쿼리가 작동할 때


select * 
from ( 
	select * 
	from t
	order by object_name
      )
where rownum <=10;


Execution Plan
----------------------------------------------------------
Plan hash value: 3299198703

----------------------------------------------------------------------------------------
| Id  | Operation               | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |    10 |  1580 |       |  3581   (1)| 00:00:43 |
|*  1 |  COUNT STOPKEY          |      |       |       |       |            |          |
|   2 |   VIEW                  |      | 70368 |    10M|       |  3581   (1)| 00:00:43 |
|*  3 |    
Y|      | 70368 |    10M|    13M|  3581   (1)| 00:00:43 |
|   4 |     TABLE ACCESS FULL   | T    | 70368 |    10M|       |   270   (1)| 00:00:04 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=10)
   3 - filter(ROWNUM<=10)

Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
         22  recursive calls
          0  db block gets
       1137  consistent gets
        964  physical reads
          0  redo size
       2259  bytes sent via SQL*Net to client
        519  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
         10  rows processed


h3.# Top-N 쿼리가 작동하지 않을 때


select * 
from ( 
	select a.*, rownum no 
	from (
	select * from t order by object_name
	)a
      )
where rownum <=10;


Execution Plan
----------------------------------------------------------
Plan hash value: 301548906

---------------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |    10 |  1710 |       |  3581   (1)| 00:00:43 |
|*  1 |  COUNT STOPKEY         |      |       |       |       |            |          |
|   2 |   VIEW                 |      | 70368 |    11M|       |  3581   (1)| 00:00:43 |
|   3 |    COUNT               |      |       |       |       |            |          |
|   4 |     VIEW               |      | 70368 |    10M|       |  3581   (1)| 00:00:43 |
|   5 |      SORT ORDER BY     |      | 70368 |    10M|    13M|  3581   (1)| 00:00:43 |
|   6 |       TABLE ACCESS FULL| T    | 70368 |    10M|       |   270   (1)| 00:00:04 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=10)

Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
         15  recursive calls
         18  db block gets
       1107  consistent gets
       1013  physical reads
          0  redo size
       2293  bytes sent via SQL*Net to client
        519  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          1  sorts (disk)
         10  rows processed

(3) 분석함수에서의 Top-N 쿼리

  • rank(), row_number()를 쓰면 Top-N 쿼리 알고리즘이 작동해 max() 함수 쓸때보다 소트 부하를 줄여준다

# 설정

- 같은 ID가 10개이며 seq컬럼을 두어 ID가 같을때 레코드를 식별할 수 있는 table 생성
- disk sort가 발생하도록 sort_area_size를 줄여 테스트함


create table tt
as
select 1 id, rownum seq, owner, object_name, object_type, created, status 
from   all_objects ;



begin 
  for i in 1..9
  loop
    insert into tt
    select i+1 id, rownum seq
         , owner, object_name, object_type, created, status 
    from   tt
    where id  = 1;
    commit;
  end loop;
end;
/


alter session set workarea_size_policy = manual;
alter session set sort_area_size = 1048576;

# max() 함수 사용시


- physical read 18578 발생!! 


SQL> select id, seq, owner, object_name, object_type, created, status 
  2  from  (select id, seq
  3              , max(seq) over (partition by id) last_seq
  4              , owner, object_name, object_type, created, status 
  5         from tt)
  6  where  seq = last_seq;



Execution Plan
----------------------------------------------------------
Plan hash value: 4234058736

------------------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |   626K|    58M|       | 18360   (1)| 00:03:41 |
|*  1 |  VIEW               |      |   626K|    58M|       | 18360   (1)| 00:03:41 |
|   2 |   WINDOW SORT       |      |   626K|    50M|    62M| 18360   (1)| 00:03:41 |
|   3 |    TABLE ACCESS FULL| TT   |   626K|    50M|       |  1718   (1)| 00:00:21 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("SEQ"="LAST_SEQ")

Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
         54  recursive calls
         14  db block gets
       6487  consistent gets
      18578  physical reads
          0  redo size
       1121  bytes sent via SQL*Net to client
        519  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          1  sorts (disk)
         10  rows processed

h3.# rank() 함수 사용시


 - 동일한 결과 값을 내는 rank() 함수를 쓴 SQL이 위의 max() 함수를 쓴 SQL에 비해 physical read 발생이 58번 밖에 안된다.!!! 

SQL> select id, seq, owner, object_name, object_type, created, status 
  2  from  (select id, seq
  3              , rank() over (partition by id order by seq desc) rnum
  4              , owner, object_name, object_type, created, status 
  5         from   tt)
  6  where rnum = 1;  


Execution Plan
----------------------------------------------------------
Plan hash value: 437768642

-----------------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |   626K|    58M|       | 18360   (1)| 00:03:41 |
|*  1 |  VIEW                    |      |   626K|    58M|       | 18360   (1)| 00:03:41 |
|*  2 |   WINDOW SORT PUSHED RANK|      |   626K|    50M|    62M| 18360   (1)| 00:03:41 |
|   3 |    TABLE ACCESS FULL     | TT   |   626K|    50M|       |  1718   (1)| 00:00:21 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RNUM"=1)
   2 - filter(RANK() OVER ( PARTITION BY "ID" ORDER BY INTERNAL_FUNCTION("SEQ")
              DESC )<=1)

Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
          5  recursive calls
         61  db block gets
       6487  consistent gets
         58  physical reads
          0  redo size
       1121  bytes sent via SQL*Net to client
        519  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          1  sorts (disk)
         10  rows processed