2017/07/21

Wrong result when hierarchical query uses rownum filter and subquery filter over dblink

Here is a test case to reproduce the issue. It uses well-known EMP and DEPT tables. If you need a script to create and populate it, see sql-script at the end of this article.
Additionally, loopback database link is needed:
create database link dbloop connect to USERNAME identified by PASSWORD using 'localhost:1521/SERVICENAME';

And here is a result:

SQL> select
  2    LPAD(' ',2*(LEVEL-1)) || ename ename, job
  3  from emp
  4  where
  5      deptno not in (select deptno from dept where loc='NEW YORK')
  6  and rownum=1
  7  start with mgr is null
  8  connect by prior empno=mgr;

ENAME           JOB
--------------- ---------
  JONES         MANAGER

SQL> select
  2    LPAD(' ',2*(LEVEL-1)) || ename ename, job
  3  from emp
  4  where
  5      deptno not in (select /*+ driving_site(dept) */ deptno from dept@dbloop where loc='NEW YORK')
  6  and rownum=1
  7  start with mgr is null
  8  connect by prior empno=mgr;

no rows selected

So, the only difference is that in the second query NOT IN subquery uses a table DEPT  behind dblink and DRIVING_SITE hint makes Oracle execute the whole statement remotely.

The execution plan for the second query gives some clue about the root cause of the wrong result:

Plan hash value: 1623810261

--------------------------------------------------------------------
| Id  | Operation              | Name | Cost (%CPU)| Inst   |IN-OUT|
--------------------------------------------------------------------
|   0 | SELECT STATEMENT REMOTE|      |     3  (34)|        |      |
|   1 |  COUNT                 |      |            |        |      |
|*  2 |   FILTER               |      |            |        |      |
|   3 |    REMOTE              | EMP  |            |      ! | R->S |
|*  4 |    TABLE ACCESS FULL   | DEPT |     3   (0)| DB12C~ |      |
--------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter( NOT EXISTS (SELECT 0 FROM "DEPT" "A2" WHERE 
              "A2"."LOC"='NEW YORK' AND LNNVL("A2"."DEPTNO"<>:B1)))
   4 - filter("A2"."LOC"='NEW YORK' AND LNNVL("A2"."DEPTNO"<>:B1))

Remote SQL Information (identified by operation id):
----------------------------------------------------

   3 - SELECT "MGR","EMPNO","DEPTNO","JOB","ENAME",PRIOR "EMPNO",LEVEL 
       FROM "EMP" "A7" WHERE ROWNUM=1 START WITH ("MGR" IS NULL) 
  CONNECT BY ("MGR"=PRIOR "EMPNO") (accessing '!' )


Note
-----
   - fully remote statement
   - dynamic statistics used: dynamic sampling (level=2)

It shows that ROWNUM=1 predicate is executed on step 3 and before NOT IN predicate on step 2, which is wrong.

Checked on 11.2.0.3, 12.1.0.2, 12.2.0.1.

P.S. I have not managed to find anything matching on MOS.

=============================================================

create table dept(  
  deptno     number(2,0),  
  dname      varchar2(14),  
  loc        varchar2(13),  
  constraint pk_dept primary key (deptno)  
);
create table emp(  
  empno    number(4,0),  
  ename    varchar2(10),  
  job      varchar2(9),  
  mgr      number(4,0),  
  hiredate date,  
  sal      number(7,2),  
  comm     number(7,2),  
  deptno   number(2,0),  
  constraint pk_emp primary key (empno),  
  constraint fk_deptno foreign key (deptno) references dept (deptno)  
);

insert into DEPT (DEPTNO, DNAME, LOC) values(10, 'ACCOUNTING', 'NEW YORK');
insert into dept  values(20, 'RESEARCH', 'DALLAS');
insert into dept  values(30, 'SALES', 'CHICAGO');
insert into dept  values(40, 'OPERATIONS', 'BOSTON');
insert into emp   values(7839, 'KING', 'PRESIDENT',  null,   to_date('17-11-1981','dd-mm-yyyy'),5000, null, 10  );
insert into emp   values(7698, 'BLAKE', 'MANAGER',   7839,   to_date('1-5-1981','dd-mm-yyyy'),  2850, null, 30  );
insert into emp   values(7782, 'CLARK', 'MANAGER',   7839,   to_date('9-6-1981','dd-mm-yyyy'),  2450, null, 10  );
insert into emp   values(7566, 'JONES', 'MANAGER',   7839,   to_date('2-4-1981','dd-mm-yyyy'),  2975, null, 20  );
insert into emp   values(7788, 'SCOTT', 'ANALYST',   7566,   to_date('30-MAR-87','dd-mm-rr'),   3000, null, 20  );
insert into emp   values(7902, 'FORD', 'ANALYST',    7566,   to_date('3-12-1981','dd-mm-yyyy'), 3000, null, 20  );
insert into emp   values(7369, 'SMITH', 'CLERK',     7902,   to_date('17-12-1980','dd-mm-yyyy'),800,  null, 20  );
insert into emp   values(7499, 'ALLEN', 'SALESMAN',  7698,   to_date('20-2-1981','dd-mm-yyyy'), 1600, 300, 30  );
insert into emp   values(7521, 'WARD', 'SALESMAN',   7698,   to_date('22-2-1981','dd-mm-yyyy'), 1250, 500, 30  );
insert into emp   values(7654, 'MARTIN', 'SALESMAN', 7698,   to_date('28-9-1981','dd-mm-yyyy'), 1250, 1400, 30  );
insert into emp   values(7844, 'TURNER', 'SALESMAN', 7698,   to_date('8-9-1981','dd-mm-yyyy'),  1500, 0, 30  );
insert into emp   values(7876, 'ADAMS', 'CLERK',     7788,   to_date('21-APR-87', 'dd-mm-rr'),  1100, null, 20  );
insert into emp   values(7900, 'JAMES', 'CLERK',     7698,   to_date('3-12-1981','dd-mm-yyyy'), 950,  null, 30  );
insert into emp   values(7934, 'MILLER', 'CLERK',    7782,   to_date('23-1-1982','dd-mm-yyyy'), 1300, null, 10  );
commit;

select
  LPAD(' ',2*(LEVEL-1)) || ename ename, job
from emp
where
    deptno not in (select deptno from dept where loc='NEW YORK')
and rownum=1
start with mgr is null
connect by prior empno=mgr;

select 
  LPAD(' ',2*(LEVEL-1)) || ename ename, job
from emp
where
    deptno not in (select /*+ driving_site(dept) */ deptno from dept@dbloop where loc='NEW YORK')
and rownum=1
start with mgr is null
connect by prior empno=mgr;