Anti Join

  • 역조인
  • Not In, Not Exists Operation

{section}
{column:width=50}

Not In & Filter


SQL> SELECT /*+ gather_plan_statistics */
  2         t1.c1, t1.c2
  3    FROM t1
  4   WHERE t1.c1 NOT IN (SELECT t2.c1 FROM t2)
  5  ;
      1001 dummy
      1002 dummy
      ...
      9769 dummy
      9770 dummy

9000 개의 행이 선택되었습니다.

SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor(null,null,'allstats cost last'));
SQL_ID  73szqa0z3h6tq, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */        t1.c1, t1.c2   FROM t1  WHERE t1.c1 NOT IN
(SELECT t2.c1 FROM t2)

Plan hash value: 895956251

--------------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------
|*  1 |  FILTER            |      |      1 |        |            |   9000 |00:00:00.82 |   68908 |
|   2 |   TABLE ACCESS FULL| T1   |      1 |  10000 |    11   (0)|  10000 |00:00:00.01 |     638 |
|*  3 |   TABLE ACCESS FULL| T2   |  10000 |      1 |     3   (0)|   1000 |00:00:00.83 |   68270 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( IS NULL)
   3 - filter(LNNVL("T2"."C1"<>:B1))


21 개의 행이 선택되었습니다.


{column}
{column:width=50}
{column}
{section}

  • Subquery Unnesting 이 이루어지지 않았다. Why?
    • Anti Join 은 Null 값이 존재 하지 않는다는 것이 보장될 때만 동작한다.
    • Filter 로 동작하면서 9000 Row 읽는데 68908 Block I/O 가 발생했다.
    • 따라서 IS NOT NULL 조건을 추가하던가, 컬럼속성을 NOT NULL 로 변경하면 된다.

{section}
{column:width=50}

Not In & Is Not Null & Anti Join


SQL> -- Not In & Is Not Null & Anti Join --
SQL> SELECT /*+ gather_plan_statistics */
  2         t1.c1, t1.c2
  3    FROM t1
  4   WHERE t1.c1 NOT IN (SELECT t2.c1 FROM t2 WHERE t2.c1 IS NOT NULL)
  5     AND t1.c1 IS NOT NULL
  6  ;
      1001 dummy
      1002 dummy
      ...
      9769 dummy
      9770 dummy

9000 개의 행이 선택되었습니다.

SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor(null,null,'allstats cost last'));
SQL_ID  2j5s2ad2pwbzb, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */        t1.c1, t1.c2   FROM t1  WHERE t1.c1 NOT IN (SELECT t2.c1 FROM t2 WHERE t2.c1
IS NOT NULL)    AND t1.c1 IS NOT NULL

Plan hash value: 667136805

---------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------
|*  1 |  HASH JOIN RIGHT ANTI |       |      1 |   9001 |    14   (8)|   9000 |00:00:00.01 |     645 |  1066K|  1066K| 1186K (0)|
|*  2 |   INDEX FAST FULL SCAN| T2_N1 |      1 |   1000 |     2   (0)|   1000 |00:00:00.01 |       7 |       |       |          |
|*  3 |   TABLE ACCESS FULL   | T1    |      1 |  10000 |    11   (0)|  10000 |00:00:00.01 |     638 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."C1"="T2"."C1")
   2 - filter("T2"."C1" IS NOT NULL)
   3 - filter("T1"."C1" IS NOT NULL)


22 개의 행이 선택되었습니다.


{column}
{column:width=50}
{column}
{section}

  • IS NOT NULL 조건 추가에 따른 효과.
    • Subquery Unnesting 이 이루어졌다.
    • HASH JOIN RIGHT ANTI 이 되면서 일량이 645 Block로 현격하게 개선되었다.
    • 테이블 스캔 대신 인덱스 스캔을 사용한다.
    • 따라서 Subquery Unnesting 이 되지 않더라도 IS NOT NULL 조건을 명확하게 추가하는것이 유리하다.

{section}
{column:width=50}

Not In & Is Not Null & No_Unnest


SQL> -- Not In & Is Not Null & No_Unnest --
SQL> SELECT /*+ gather_plan_statistics */
  2         t1.c1, t1.c2
  3    FROM t1
  4   WHERE t1.c1 NOT IN (SELECT /*+ No_Unnest */ t2.c1 FROM t2 WHERE t2.c1 IS NOT NULL)
  5     AND t1.c1 IS NOT NULL
  6  ;
      1001 dummy
      1002 dummy
      ...
      9769 dummy
      9770 dummy

9000 개의 행이 선택되었습니다.

SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor(null,null,'allstats cost last'));
SQL_ID  7jdn0xgkm8gzm, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */        t1.c1, t1.c2   FROM t1  WHERE t1.c1 NOT IN
(SELECT /*+ No_Unnest */ t2.c1 FROM t2 WHERE t2.c1 IS NOT NULL)    AND t1.c1 IS NOT NULL

Plan hash value: 2243065295

---------------------------------------------------------------------------------------------------
| Id  | Operation          | Name  | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------
|*  1 |  FILTER            |       |      1 |        |            |   9000 |00:00:00.05 |   20638 |
|*  2 |   TABLE ACCESS FULL| T1    |      1 |  10000 |    11   (0)|  10000 |00:00:00.01 |     638 |
|*  3 |   INDEX RANGE SCAN | T2_N1 |  10000 |      1 |     1   (0)|   1000 |00:00:00.04 |   20000 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( IS NULL)
   2 - filter("T1"."C1" IS NOT NULL)
   3 - access("T2"."C1"=:B1)
       filter("T2"."C1" IS NOT NULL)


23 개의 행이 선택되었습니다.


{column}
{column:width=50}
{column}
{section}

  • IS NOT NULL 조건 추가에 따른 효과.
    • Subquery Unnesting 이 되지 않더라도 인덱스 스캔만으로도 일량이 20638로 68908에 비해 큰 개선효과가 있다.

{section}
{column:width=50}

Not In & 11G


SQL> -- Not In & 11G --
SQL> SELECT /*+ gather_plan_statistics */
  2         t1.c1, t1.c2
  3    FROM t1
  4   WHERE t1.c1 NOT IN (SELECT t2.c1 FROM t2)
  5  ;
      1001 dummy
      1002 dummy
      ...
      9999 dummy
     10000 dummy

9000 개의 행이 선택되었습니다.

SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor(null,null,'allstats cost last'));
SQL_ID  73szqa0z3h6tq, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */        t1.c1, t1.c2   FROM t1
WHERE t1.c1 NOT IN (SELECT t2.c1 FROM t2)

Plan hash value: 2739594415

----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |      1 |        |    15 (100)|   9000 |00:00:00.05 |     645 |       |       |          |
|*  1 |  HASH JOIN RIGHT ANTI NA|      |      1 |   9001 |    15   (7)|   9000 |00:00:00.05 |     645 |  1066K|  1066K| 1227K (0)|
|   2 |   TABLE ACCESS FULL     | T2   |      1 |   1000 |     3   (0)|   1000 |00:00:00.01 |       7 |       |       |          |
|   3 |   TABLE ACCESS FULL     | T1   |      1 |  10000 |    11   (0)|  10000 |00:00:00.04 |     638 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."C1"="T2"."C1")


21 개의 행이 선택되었습니다.

SQL>

{column}
{column:width=50}
{column}
{section}

  • Oracle 11g 에서는 Anti Join을 사용하기 위해 더이상 NOT NULL 을 보장할 필요가 없다.
    • Null Aware Anti Join(ANTI NA) 이라는 새로운 Join Operation 추가.