2017/01/18

MERGE statement can fail with ORA-00001: unique constraint violated

How can MERGE statement fail with ORA-00001: unique constraint violated if all other things have been done right? I mean both source and target tables have primary key constraints and all columns from that constraints are mentioned in ON clause.
There is a tiny window of possibility. Suppose the first session inserts a row into source table with commit, then the second session starts long-running MERGE into the target table. While it is running, the first session inserts the same row into target table and commits it. When the second session reaches that new row, it gets ORA-00001.
When the second session started MERGE statement, the first thing it did was that it outer-joined the source and target tables. At that moment questioned row existed in source table, but did not exist in target table (or invisible due to uncommited transaction), so, MERGE decided to put the row into insert branch of MERGE statement.

Here is a test-case for reproducing this effect.

First session:
--preparation
drop table t1 purge;
drop table t2 purge;

create table t1 (
 OBJECT_ID     NUMBER,
 OBJECT_NAME   VARCHAR2(128),
 OBJECT_TYPE   VARCHAR2(23),
 filler        char(1000) default 'X'
 );

insert into t1 select OBJECT_ID,OBJECT_NAME,OBJECT_TYPE, 'Y' from all_objects where rownum < 5000;
create table t2 as select * from t1 where 1=2;
alter table t1 add constraint t1_pk primary key (object_id);
alter table t2 add constraint t2_pk primary key (object_id);

--start of the experiment
insert into t1 (object_id, object_name, object_type) values (100000,'test_obj', 'test_type');
commit;
insert into t2 (object_id, object_name, object_type) values (100000,'test_obj', 'test_type');

Second session:
merge into t2 trg
 using t1 src
 on (trg.object_id=src.object_id)
 when matched then update set
 trg.object_name=src.object_name
 when not matched then insert
 (trg.object_id,trg.object_name, trg.object_type)
 values

 (src.object_id,src.object_name,src.object_type);

First session:
commit;

Second session:
merge into t2 trg
*
ERROR at line 1:
ORA-00001: unique constraint (SCOTT.T2_PK) violated

2017/01/12

Swallowing exceptions

Have you heard about Oracle can hide some exceptions raised from user code?
One well-known situation is described here.

But there is one more case I've just found.
Try this code:

create or replace procedure test1 is
begin
  raise no_data_found;
end;
/
begin test1; end;
/
begin
      dbms_scheduler.create_job(job_name        => 'T1',
                                job_type        => 'STORED_PROCEDURE',
                                job_action      => 'TEST1',
                                enabled         => TRUE,
                                start_date      => systimestamp,
                                auto_drop       => true);
END;
/
select ACTUAL_START_DATE, LOG_DATE,STATUS,ERRORS from user_scheduler_job_run_details where job_name='T1';

The final query shows STATUS='SUCCEEDED' even though call of test1() raises a no_data_found exception.

Btw, if job is defined slightly differently

  job_type        => 'PLSQL_BLOCK',
  job_action      => 'BEGIN raise no_data_found; end;',

all works as expected, i.e. STATUS='FAILED'.

Checked on 11.2.0.3/4 and 12.1.0.2.

UPD, see MOS doc 1331778.1 for more details.

2016/12/13

How many times is pl/sql function called

Recently, a query caught my eye with some strange overuse of resources (LIO in that case). As usual, there were several calls of pretty complex pl/sql user-defined functions. Also an analytic function row_numbe() was used there to eliminate possible row duplicates. The query was similar to the following one:

select id, val
  from (select id, val, row_number() over (partition by id order by val) as n
    from (select id,
                 tc1$p1.test1(id) as val
            from tc1$t1
         )
       ) where n=1;

As it appeared, for the table tc1$t1, which contains N unique values of id column, the function tc1$p1.test1(id) is called  almost 2*N times.

It is not clear what is going on behind the scene. For example, if I use trace event 10032 to get some sorting statistics, I can see no correlation between the number of comparisons ("Total number of comparisons performed" statistic) and the number of function calls while adding some duplicates or changing clustering of the data in tc1$t1 (loading the table with different order by clause).

For those who hits this issue, a very simple workaround exists: just wrap function call with subquery from dual like in the following example:

select id, val
  from (select id, val, row_number() over (partition by id order by val) as n
    from (select id,
                 (select tc1$p1.test1(id) from dual) as val
            from tc1$t1
         )
       ) where n=1;

Here is a script to reproduce the described behavior. I've checked it on 11.2.0.4 and 12.1.0.2, the output was the same:

===========================================================
How many times is user-defined PL/SQL function (UDF) called
for table tc1$t1 with                   100000 rows in it
===========================================================
#1 Using analytic function
===========================================================
Counter (start): 0
  1  select id, val
  2    from (select id, val, row_number() over (partition by id order by val) as n
  3      from (select id,
  4                   tc1$p1.test1(id) as val
  5              from tc1$t1
  6           )
  7*        ) where n=1
Counter (end), UDF function calls: 200040
===========================================================
#2 Without analytic function
Counter (start): 0
  1  SELECT id,
  2         tc1$p1.test1(id) AS val
  3*   FROM tc1$t1
Counter (end), UDF function calls: 100000
===========================================================
#3 WA for analytic function
Counter (start): 0
  1  select id, val
  2    from (select id, val, row_number() over (partition by id order by val) as n
  3      from (select id,
  4                   (select tc1$p1.test1(id) from dual) as val
  5              from tc1$t1
  6           )
  7*        ) where n=1
Counter (end), UDF function calls: 100000
===========================================================


2016/11/29

How much redo data does you query generate (not about delayed block cleanup)

I want to show a tiny effect of materializing factored subquery, which can lead to huge redo generation.
It is also a good time to taste new feature of defining PL/SQL function in the WITH clause. But it is used here only for convenience.

So, let's check how much redo is generated for materializing just one row and just one time

SQL> with function calc(p_in in number) return number
  2  is
  3    res number;
  4  begin
  5    with a as (select /*+ materialize */ 1 c1 from dual)
  6    select calc.p_in + c1 into res from a;
  7    return res;
  8  end;
  9  select calc(1) from dual;
 10  /

Statistics
----------------------------------------------------------
          3  recursive calls
          8  db block gets
          5  consistent gets
          1  physical reads
        572  redo size
        357  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Nothing to worry about.
But what if you have some PL/SQL function, which employs materialization, and this function is called for each 100K rows output of your report query.

SQL> with function calc(p_in in number) return number
  2  is
  3    res number;
  4  begin
  5    with a as (select /*+ materialize */ 1 c1 from dual)
  6    select calc.p_in + c1 into res from a;
  7    return res;
  8  end;
  9  select calc(rownum) from dual connect by level <=1e5;
 10  /

Statistics
----------------------------------------------------------
     300001  recursive calls
     802222  db block gets
     500000  consistent gets
     100000  physical reads
   57171620  redo size
     611083  bytes sent via SQL*Net to client
        708  bytes received via SQL*Net from client
         21  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
     100000  rows processed

Things become much more interesting. And it is just for one row stored in the temporary table.

One question remains, what exactly causes redo generation. Unfortunately, SQL Trace does not show anything about materialization. But doing the following run, I can infer that redo data is generated for DDL part of operation only.

SQL> with function calc(p_in in number) return number
  2  is
  3    res number;
  4  begin
  5    with a as (select /*+ materialize */ 1 c1 from dual connect by level<=1e5)
  6    select calc.p_in + sum(c1) into res from a;
  7    return res;
  8  end;
  9  select calc(1) from dual;
 10  /

Statistics
----------------------------------------------------------
          5  recursive calls
        161  db block gets
        157  consistent gets
        152  physical reads
        572  redo size
        359  bytes sent via SQL*Net to client
        499  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

So, materialization feature sometimes is very useful, but it has to be applied with caution.

2016/11/18

Including files in SQL*Plus script

Do you know that @ (at sign) can also work as an include directive?
Suppose you have a script for getting your session statistics from v$mystat. But you need a particular list of statistics. Furthermore, you want to use the same list in several places of your script without writing it multiple times.
It is easy to do using @ (at sign).
Here is an example of a script which needs to include another script:

select m.statistic#,n.name,value from v$mystat m, v$statname n
where n.statistic#=m.statistic# and
n.name in (
@mystat_list
);

And here is content of mystat_list.sql
'session logical reads',
'CPU used by this session',
'session connect time',
'user I/O wait time',
'session pga memory max',
'consistent gets',
'physical reads',
'physical writes',
'physical reads direct temporary tablespace',
'physical writes direct temporary tablespace',
'lob reads',
'lob writes',
'parse time cpu',
'parse time elapsed',
'sorts (memory)',
'sorts (disk)',
'sorts (rows)'

The result is:
SQL> select m.statistic#,n.name,value from v$mystat m, v$statname n
  2  where n.statistic#=m.statistic# and
  3  n.name in (
  4  @mystat_list
 21  );

 STATISTIC# NAME                                                VALUE
----------- --------------------------------------------- -----------
         14 session logical reads                                 203
         19 CPU used by this session                                8
         24 user I/O wait time                                      6
         29 session connect time                           1479453315
         38 session pga memory max                            3395896
        132 consistent gets                                       200
        141 physical reads                                         20
        152 physical writes                                         0
        157 physical reads direct temporary tablespace              0
        158 physical writes direct temporary tablespace             0
        743 lob reads                                               0
        744 lob writes                                              0
       1082 parse time cpu                                          2
       1083 parse time elapsed                                      8
       1100 sorts (memory)                                         23
       1101 sorts (disk)                                            0
       1102 sorts (rows)                                           23

17 rows selected.

So, you can include whatever you need wherever you need. It can be part of PL/SQL anonymous block, part of a query or something else. Just make sure that resulted text will be syntactically correct. 

2016/07/22

Why you don't want to parse clob using instr/substr functions

A very strange issue we have hit recently with a report which completion time could suddenly rise two or three times and then drop to some normal half an hour.
The report is a pretty complex query with lots of PL/SQL function calls, which in its turn calls another query with another PL/SQL function and so on. There were three levels of recursive pl/sql calls: sql(top)->pl/sql->sql(1)->pl/sql->sql(2)->pl/sql->sql & pl/sql in a jumble.

As it appeared, when the report is slow, it consumes CPU only, and all V$ views (like ASH or v$sql) show as active only the query from the level "sql(2)". Also v$sesstat didn't show any interesting anomalies (except some lob and data changing activities (db block changes), which did not give any new idea at that time).

As Tanel Poder once said, if your code constantly burns CPU, you have to trace Oracle kernel (it isn't the exact quote, just my attempt to express his message).

I started with truss, just to learn that it produced almost empty output in my case. No OS calls were observed.

pstack was not available, the last resort was oradebug.

I ended up with the following command:

oradebug PDUMP interval=1 ndumps=60 CALLSTACK 1

So, what? What can I get from that highly encrypted trace files?
The first thing I observe was "plsql_run" kernel function, which appeared exactly the same number of times as pl/sql recursive calls in my report. Actually, when I tested the report and made an oradebug trace, I run the report from anonymous PL/SQL block, so the trace below will show one additional "plsql_run" call in the top (actually bottom) of the stack.
Observation of "plsql_run" gave me some confidence that I can decrypt this trace somehow. To move further I had to find well known but unavailable on MOS Note 175982.1. And, thank God, it can be googled easily. Also, it was worth to refresh my knowledge about how to use all this stuff. Here it is.

Below is edited stack trace of Oracle kernel calls. At the right of some kernel function, I placed, what I believe, corresponds to my report's sql and pl/sql calls. It was not a rocket science after all. I gathered pl/sql source codes, queries, query plans and put it all together.

Received ORADEBUG command (#64) 'DUMP CALLSTACK 1 0x0 _seq=60 ndumps=60' from process 'Unix process pid: 6815810, image: '
Executing periodic dump 60 of 60

----- Call Stack Trace -----
calling              
location             
-------------------- 
skdstdst()+40        
ksedst1()+112        
ksedst_tracecb()+48  
ksdxdmp()+544        
ksdxfdmp()+132       
ksdxcb()+3432        
sspuser()+116        
__sighandler()       oradebug dump CALLSTACK
kcbinvbrhb()+312     
kcbivbr()+596        
kcblibr()+76         manages Oracle's buffer cache operation
kdlrsp()+924         
kdl_destroy()+928    
kdlt_freetemp()+148  lob and temporary lob management
kokltfr()+448        
kokltfc()+16         lob access
koltitfr()+612       
kolttfr()+136        support for object Lob buffering PKGPROV.get_last_ord() temporary_quote.tmp_quote (CLOB field operations)
pfrcvfc_format_conv  
pevm_CNVMSC()+28     
pfrinstr_CNVMSC()+6  
pfrrun_no_tool()+96  
pfrrun()+1032        
plsql_run()+616      PKGPROV.get_last_ord
peidxr_run()+280     
peidxexe()+104       
kkxdexe()+352        
kkxmpexe()+244       
kgmexwi()+584        
kgmexec()+1388       
evapls()+736         
evaopn2()+1284       
evaopn2()+4492       
evaopn2()+4224       
evaopn2()+4224       
evaopn2()+4224       
evaopn2()+4224       
evaopn2()+3712      
qerixGetKey()+4132   Index row source |* 68 |            INDEX UNIQUE SCAN          | XPKTAB_OBJ
qerixStart()+1832    
qerjotFetch()+1096   
qerjotFetch()+392    
qerghFetch()+324     
qervwFetch()+148     
rwsfcd()+292         
qeruaFetch()+404    
opifch2()+2312       row source : union-All SQL_ID  7p6225u9gc7ka |   1 |  UNION-ALL
opiall0()+2196       
opikpr()+460         
opiodr()+908         
rpidrus()+216        
skgmstack()+160      
rpidru()+132         
rpiswu2()+520        
kprball()+1044       
kkxsarfch()+568      
kkxsfcn()+556        
pevm_icd_call_commo  
pfrinstr_ICAL()+140  
pfrrun_no_tool()+96  
pfrrun()+1032        
plsql_run()+616      table(CALCS.GET_PAR(ci.id, 113852902)) t
peicnt()+324        
kkxuexe()+468        
kkxmpsexe()+400      
kgmexwi()+584        
kgmexec()+1388       
evapls()+736         
evaopn2()+1284       
qerocImageIterStart  object collection iterator row source |  38 | COLLECTION ITERATOR PICKLER FETCH| GETCALCVAL
qerocStart()+504     
qerjotFetch()+1096   
qerjotFetch()+392    
qerjotFetch()+392    
qernsFetch()+624     
rwsfcd()+100         
qeruaFetch()+404     
subsr1()+288         
subsr3()+364         
IPRA.$evaopn3()+697  
evanvl()+956         
evaopn2()+1284       
opifcr()+5500        
qervwRowProcedure()  
qerghFetch()+1372    
qervwFetch()+148     qervw - view row source SQL_ID  bb8mfnvw4agxp |  41 |  VIEW | VM_NWVW_3
opifch2()+2312       
opiall0()+2196       
opikpr()+460         
opiodr()+908         
rpidrus()+216        
skgmstack()+160      
rpidru()+132         
rpiswu2()+520        
kprball()+1044       
kkxsarfch()+568      
kkxsfcn()+556        
pevm_icd_call_commo  
pfrinstr_ICAL()+140  
pfrrun_no_tool()+96  
pfrrun()+1032        
plsql_run()+616      CALCS.GET_PAR(P_ID => A.OBJ_ID, P_A_ID => 313006642)
peicnt()+324        
kkxuexe()+468        
kkxmpsexe()+400      
kgmexwi()+584        
kgmexec()+1388       
evapls()+736         
evaopn2()+1284       
qerocImageIterStart  SQL_ID  3kp00z3rvsjf0 |   1 |  COLLECTION ITERATOR PICKLER FETCH| GET_PAR | 1 |
qerocStart()+504     object collection iterator row source
subsr1()+208         
subsr3()+364         
IPRA.$evaopn3()+697  
evaopn3()+92         
opifcr()+5808        
qertbFetchByRowID()  tab by rowid SQL_ID  3kp00z3rvsjf0 |*  2 |  TABLE ACCESS BY INDEX ROWID | TAB_OBJ
opifch2()+2312       
opifch()+48          
opipls()+4248        
opiodr()+908         
rpidrus()+216        
skgmstack()+160    
rpidru()+132         
rpiswu2()+520        
rpidrv()+1060      recurcive sql
psddr0()+448         
psdnal()+472       binds
pevm_BFTCHC()+284    
pfrinstr_BFTCHC()+1  
pfrrun_no_tool()+96  
pfrrun()+1032        
plsql_run()+616    top level anonymous block
peicnt()+324         
kkxexe()+484         
opiexe()+17356       
kpoal8()+1872        
opiodr()+908         
ttcpip()+1028        
opitsk()+1612        
opiino()+940         
opiodr()+908         
opidrv()+1132        
sou2o()+136          
opimai_real()+560    
ssthrdmain()+276     
main()+204           
__start()+112        

So, my report was constantly doing some LOB manipulation! Very soon I found some CLOB column which was fetched inside PKGPROV.get_last_ord(). And that CLOB was parsed using instr() and substr() functions. It was pretty good example of the "SQL Semantics" usage. It works, but how much does it cost? You can read more about this in the famous "Oracle PL/SQL Programming, Sixth Edition by Steven Feuerstein with Bill Pribyl" book, paragraph "SQL semantics may yield temporary LOBs"

Taking this, a quick workaround appeared in my mind. In my case CLOB column temporary_quote.tmp_quote contained not so long pieces of text, 6K at maximum. It was possible to declare a local variable with VARCHAR2(32768), assign CLOB column content to it, and just do the same parsing but with local VARCHAR2 variable.

See below Tomas Kyte's RUNSTATS utility style comparison for CLOB manipulation part of the report. Run1 is the original approach, Run2 is using a varchar2 local variable. The difference is huge!

Statistic                                     Run1             Run2
---------------------------------- --------------- ----------------
Elapsed (santisec)...............:          124349              743
CPU (santisec)...................:           69551              404

One more thing. If you have noted that CPU time is much smaller than elapsed time, it is not due to some IO waiting time or similar. All these experiments were conducted on AIX Power 7 server. So, this article gives some explanation.

2016/03/10

Underscore (hidden) parameters in Multitenant database

It is pretty usual when your application needs some underscore (hidden) parameter to be set.
But what can happen in new Multitenant database.

[oracle@ora12c ~]$ sqlplus / as sysdba                                          

SQL*Plus: Release 12.1.0.2.0 Production on Thu Mar 10 15:45:21 2016

Copyright (c) 1982, 2014, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options                                         

SQL> sho con_name

CON_NAME
------------------------------
CDB$ROOT                      

SQL> alter session set container=pdb1;

Session altered.

SQL> sho con_name

CON_NAME
------------------------------
PDB1
SQL> sho parameter _connect_by_use_union_all
SQL> alter system set "_connect_by_use_union_all" = 'old_plan_mode';

System altered.

SQL> sho parameter _connect_by_use_union_all

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
_connect_by_use_union_all            string      old_plan_mode
SQL> select value from v$spparameter where name='_connect_by_use_union_all';

VALUE
------------------------------------
old_plan_mode

SQL> alter pluggable database close immediate;

Pluggable database altered.

SQL> alter pluggable database open read write;

Pluggable database altered.

SQL> sho con_name

CON_NAME
------------------------------
PDB1
SQL> sho parameter _connect_by_use_union_all
SQL> select value from v$spparameter where name='_connect_by_use_union_all';

VALUE
------------------------------------
old_plan_mode

SQL>

So, you can set underscore parameter on PDB level but it is "disappeared" after reboot either PDB or whole CDB.
So far the only workaround I`ve found is to create LOGON trigger with corresponding parameter set by alter session statement.