PGA 메모리가 비정상적으로 커지는 현상 분석

{note:title=상황가정}
* 특정 프로세스가 특정 작업을 수행할 때 PGA 메모리의 크기가 급증하고, 그로 인해 시스템 성능이 전반적으로 저하되는 현상이 종종 발생
* PGA 메모리 크기가 임계치(500 MB) 이상이 되면 프로세스의 작업을 강제 중단 시키고, PGA 힙 덤프를 수행하도록 설정 하고 싶음
* 위와 같은 문제가 발생하는 시간이 주로 새벽 시간이기 때문에 위의 작업을 자동화 하고 싶음
{note}
{info:title=10261 진단 이벤트 : PGA의 크기가 특정 임계점을 넘어가면 ORA-600 에러 발생 시킴}
ORA-10261: Limit the size of the PGA heap
       Cause: the limit is one kilobyte times the level of the event. If the pga grows bigger than this signal an internal error.
       Action: None
{info}
10261 진단 이벤트 데모
{code:noneborderStyle=solid}
SQL>


















--
SQL> -- 10261 diag event
SQL> -- when pga size is over 100MB, you would be hit by ORA-600 error
SQL> alter session set events '10261 trace name context forever, level 100000';

Session altered.

SQL>




















--
SQL> -- with 600 error, dump heap dump at level 0x20000001
SQL> alter session set events '600 trace name heapdump level 0x20000001';

Session altered.

|
|{code:none|borderStyle=solid}
>> {PGA 의 크기가 임계점 초과} → {10261 이벤트에 의해 ORA-600 에러 발생} → {600 에러 진단 이벤트에 의해 PGA 힙 덤프 수행}
>> "특정 프로세스의 PGA 크기가 지나치게 커지면 PGA 힙 덤프를 수행함으로써 이 문제를 트러블슈팅할 수 있도록 하라"

|

10261 진단 이벤트 사례
{code:noneborderStyle=solid}
SQL> create or replace procedure proc_array(len number)
is
type vtable is table of varchar2(1000);
vt vtable := vtable();
begin
for idx in 1 .. len loop
vt.extend;
vt(idx) := rpad('x', 1000, 'x');
end loop;
end;
/ 2 3 4 5 6 7 8 9 10 11

Procedure created.

SQL> alter session set events '10261 trace name context forever, level 100000';

Session altered.

SQL> alter session set events '600 trace name heapdump level 0x20000001';

Session altered.

SQL> exec proc_array(20000000);
BEGIN proc_array(20000000); END;

*
ERROR at line 1:
ORA-00600: internal error code, arguments: 723, 16424, pga heap, [], [],
[], [], []

SQL> alter session set events '10261 trace name context off';

Session altered.

SQL> alter session set events '600 trace name context off';

Session altered.

|
|{code:none|borderStyle=solid}
Fri Apr 27 22:06:39 2012
Errors in file /home/owen/app/owen/diag/rdbms/orcl/orcl/incident/incdir_4945/orcl_ora_2808_i4945.trc:
ORA-00600: internal error code, arguments: [723], [82452], [pga heap], [], [], [], [], []
ORA-00600: internal error code, arguments: [723], [16424], [pga heap], [], [], [], [], []
Trace dumping is performing id=[cdmp_20120427220640]

|

{code:noneborderStyle=solid}
owen@ol5u811gr1 ~$ more /home/owen/app/owen/diag/rdbms/orcl/orcl/incident/incdir_4945/orcl_ora_2808_i4945.trc
Dump file /home/owen/app/owen/diag/rdbms/orcl/orcl/incident/incdir_4945/orcl_ora_2808_i4945.trc
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /home/owen/app/owen/product/11.1.0/db_1
System name: Linux
Node name: ol5u811gr1.oracle.com
Release: 2.6.32-300.10.1.el5uek
Version: #1 SMP Wed Feb 22 17:22:40 EST 2012
Machine: i686
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 18
Unix process pid: 2808, image: oracle@ol5u811gr1.oracle.com (TNS V1-V3)
...
Dump continued from file: /home/owen/app/owen/diag/rdbms/orcl/orcl/trace/orcl_ora_2808.trc
ORA-00600: internal error code, arguments: 723, 16424, pga heap, [], [], [], [], []

========= Dump for incident 4945 (ORA 600 723) ========


- Beginning of Customized Incident Dump(s)
-

            • ERROR: PGA size limit exceeded: 102455604 > 102400000 *****
              ******************************************************
              ...
|
|{code:none|borderStyle=solid}
SQL> select * from table(tpack.heap_file_report('/home/owen/app/owen/diag/rdbms/orcl/orcl/incident/incdir_4945/orcl_ora_2808_i4945.trc'));

TYPE		     HEAP_NAME		  ITEM		       ITEM_COUNT  ITEM_SIZE  HEAP_SIZE      RATIO
-------------------- -------------------- -------------------- ---------- ---------- ---------- ----------
HEAP		     pga heap						0      95.73	  95.73        100
...
OBJECT		     pga heap		  koh-kghu call 	     4584	71.7	   95.7       74.8
OBJECT		     pga heap		  KGNFS pcontext		1	   0	   95.7 	 0
OBJECT		     pga heap		  dbgdInitEventGr		1	   0	   95.7 	 0
OBJECT		     pga heap		  Alloc environm		7	   0	   95.7 	 0
OBJECT		     pga heap		  allocate kzthsm		1	   0	   95.7 	 0
OBJECT		     pga heap		  none			     1532	23.6	   95.7       24.7
...

|

{code:noneborderStyle=solid}
>> 이러한 간단한 정보도 힙 덤프 분석 없이는 얻을 수 없다.
{code}

PGA 메모리의 점진적인 증가현상 분석

조금 더 복잡한 데모
{code:noneborderStyle=solid}
SQL> -- t1 생성
SQL> create table t1(c1, c2)
as
select level, rpad('x', 1000)
from dual
connect by level <= 1000
; 2 3 4 5 6

Table created.

SQL> exec dbms_application_info.set_client_info('session1');

PL/SQL procedure successfully completed.

SQL> set serveroutput on

SQL> declare
2 type varchar2_array is table of varchar2(30000) index by pls_integer;
va varchar2_array;
v_start_time number := dbms_utility.get_time;
v_temp number;
begin
for idx in 1 .. 5000 loop

-- allocate memory
va(idx) := rpad('x', 30000);

-- search memory
for idx2 in 1 .. 10 loop
for idx3 in 1 .. idx loop
v_temp := length(va(idx3));
end loop;
end loop;

-- do logical reads
for r in (select count(*) from t1) loop
null;
end loop;

if mod(idx, 100) = 0 then
dbms_output.put_line(idx || 'th elapsed time =' || (dbms_utility.get_time - v_start_time) || '(cs)');
v_start_time := dbms_utility.get_time;
end if;
end loop;
end; 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
30 /
100th elapsed time =6(cs)
200th elapsed time =12(cs)
300th elapsed time =20(cs)
400th elapsed time =27(cs)
500th elapsed time =12(cs)
...
4600th elapsed time =63(cs)
4700th elapsed time =58(cs)
4800th elapsed time =60(cs)
4900th elapsed time =60(cs)
5000th elapsed time =61(cs)

PL/SQL procedure successfully completed.

SQL>

|
|{code:none|borderStyle=solid}
>> 시간이 지날수록 성능이 저하 (메모리 릭 가능성 있음)
>> 세션 스냅샷 리포트 , 힙 덤프 분석을 통해서 원인 분석 가능

|

{code:noneborderStyle=solid}
-- 세션#1 에서 위의 작업이 진행 되는 동안 세션#2 에서 1초 간격으로 총 6번의 스냅샷을 생성

SQL> col sid new_value sid
SQL> select sid from v$session where client_info = 'session1';

SID



--
170

SQL> col type format a10
SQL> col item format a35
SQL> col deltas format a40
SQL> select type, item, deltas from table(tpack.session_snapshot_report(session_id=>&sid, snapshot_count=>5, sleep_time=>1));
old 1: select type, item, deltas from table(tpack.session_snapshot_report(session_id=>&sid, snapshot_count=>5, sleep_time=>1))
new 1: select type, item, deltas from table(tpack.session_snapshot_report(session_id=> 170, snapshot_count=>5, sleep_time=>1))

TYPE ITEM DELTAS



--







---










STAT session pga memory 33,816,576->36,831,232->26,411,008->22,3
47,776->25,427,968

STAT table scan rows gotten 511,000->562,000->403,000->341,000->388,
000

STAT session logical reads 76,146->82,614->59,241->50,127->57,036
STAT consistent gets 75,852->82,614->59,241->50,127->57,036
STAT consistent gets from cache (fastpat 75,852->82,614->59,241->50,127->57,036
h)

TYPE ITEM DELTAS



--







---










STAT consistent gets from cache 75,852->82,614->59,241->50,127->57,036
STAT no work - consistent read gets 73,359->80,366->57,629->48,763->55,484
STAT table scan blocks gotten 73,073->80,366->57,629->48,763->55,484
...

|
|{code:none|borderStyle=solid}
>> 매 스냅샷 마다 PGA 영역의 크기가 계속해서 증가하는 것을 알 수 있다(?)
>> 이후 힙 덤프를 통해서 어떤 유형의 객체 때문인지 추적 가능

|

Oracle 11.1.0.6 의 트리거 버그 분석 사례 (Memory Leak)
{code:noneborderStyle=solid}
SQL> select * from v$version;

BANNER






















Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
PL/SQL Release 11.1.0.6.0 - Production
CORE 11.1.0.6.0 Production
TNS for Linux: Version 11.1.0.6.0 - Production
NLSRTL Version 11.1.0.6.0 - Production

SQL> create table t1(c1 int);

Table created.

SQL> create table t2(c1 int);

Table created.

SQL> -- create triggers
SQL> create or replace trigger t1_trg1 after insert on t1 for each row
begin
insert into t2 values (:new.c1);
end;
/
2 3 4 5
Trigger created.

SQL> create or replace trigger t2_trg1 after insert on t2 for each row
begin
declare
numrows number;
begin
numrows := 1;
end;
end;
/ 2 3 4 5 6 7 8 9

Trigger created.

SQL> -- insertions get slow down
SQL> declare
2 v_start_time number := dbms_utility.get_time;
begin
for idx in 1 .. 100000 loop
insert into t1 values (idx);

if mod(idx, 1000) = 0 then
dbms_output.put_line(idx || 'th exe = ' || (dbms_utility.get_time - v_start_time)/100);
v_start_time := dbms_utility.get_time;
end if;
end loop;
end;
/ 3 4 5 6 7 8 9 10 11 12 13

1000th exe = .3
2000th exe = .33
3000th exe = .32
4000th exe = .35
5000th exe = .35
...
53000th exe = .85
54000th exe = .8
55000th exe = .85
56000th exe = 1.24
57000th exe = .89
...
96000th exe = 2.71
97000th exe = 2.15
98000th exe = 2.23
99000th exe = 2.2
100000th exe = 2.23

|
|{code:none|borderStyle=solid}
SQL> select type, item, deltas from table(tpack.session_snapshot_report(session_id=>&sid, snapshot_count=>5, sleep_time=>1));
old   1: select type, item, deltas from table(tpack.session_snapshot_report(session_id=>&sid, snapshot_count=>5, sleep_time=>1))
new   1: select type, item, deltas from table(tpack.session_snapshot_report(session_id=>       170, snapshot_count=>5, sleep_time=>1))

TYPE	   ITEM 			       DELTAS
---------- ----------------------------------- ----------------------------------------
STAT	   session uga memory max	       523,712->589,176->851,032->654,640->589,
					       176

STAT	   session uga memory		       523,712->589,176->851,032->654,640->589,
					       176

STAT	   session pga memory max	       524,288->589,824->851,968->655,360->458,
					       752

STAT	   session pga memory		       524,288->589,824->851,968->524,288->589,
					       824

|

{code:noneborderStyle=solid}
SQL> select * from table(tpack.pga_heap_report(session_id=>&sid, subheap_depth=>2));
old 1: select * from table(tpack.pga_heap_report(session_id=>&sid, subheap_depth=>2))
new 1: select * from table(tpack.pga_heap_report(session_id=> 170, subheap_depth=>2))
select * from table(tpack.pga_heap_report(session_id=> 170, subheap_depth=>2))
*
ERROR at line 1:
ORA-29532: Java call terminated by uncaught Java exception: java.io.IOException: Exception during creation of the process
ORA-06512: at "SYS.TPACK", line 61
ORA-06512: at "SYS.TPACK", line 269
ORA-06512: at "SYS.TPACK", line 4936
ORA-06512: at line 1
ORA-06512: at "SYS.TPACK", line 5039
ORA-06512: at line 1
{code}