пятница, 22 июля 2016 г.

Why you don't want to parse clob using instr/substr functions

A very strange issue we have hit recently with a report which completion time could suddenly rise two or three times and then drop to some normal half an hour.
The report is a pretty complex query with lots of PL/SQL function calls, which in its turn calls another query with another PL/SQL function and so on. There were three levels of recursive pl/sql calls: sql(top)->pl/sql->sql(1)->pl/sql->sql(2)->pl/sql->sql & pl/sql in a jumble.

As it appeared, when the report is slow, it consumes CPU only, and all V$ views (like ASH or v$sql) show as active only the query from the level "sql(2)". Also v$sesstat didn't show any interesting anomalies (except some lob and data changing activities (db block changes), which did not give any new idea at that time).

As Tanel Poder once said, if your code constantly burns CPU, you have to trace Oracle kernel (it isn't the exact quote, just my attempt to express his message).

I started with truss, just to learn that it produced almost empty output in my case. No OS calls were observed.

pstack was not available, the last resort was oradebug.

I ended up with the following command:

oradebug PDUMP interval=1 ndumps=60 CALLSTACK 1

So, what? What can I get from that highly encrypted trace files?
The first thing I observe was "plsql_run" kernel function, which appeared exactly the same number of times as pl/sql recursive calls in my report. Actually, when I tested the report and made an oradebug trace, I run the report from anonymous PL/SQL block, so the trace below will show one additional "plsql_run" call in the top (actually bottom) of the stack.
Observation of "plsql_run" gave me some confidence that I can decrypt this trace somehow. To move further I had to find well known but unavailable on MOS Note 175982.1. And, thank God, it can be googled easily. Also, it was worth to refresh my knowledge about how to use all this stuff. Here it is.

Below is edited stack trace of Oracle kernel calls. At the right of some kernel function, I placed, what I believe, corresponds to my report's sql and pl/sql calls. It was not a rocket science after all. I gathered pl/sql source codes, queries, query plans and put it all together.

Received ORADEBUG command (#64) 'DUMP CALLSTACK 1 0x0 _seq=60 ndumps=60' from process 'Unix process pid: 6815810, image: '
Executing periodic dump 60 of 60

----- Call Stack Trace -----
calling              
location             
-------------------- 
skdstdst()+40        
ksedst1()+112        
ksedst_tracecb()+48  
ksdxdmp()+544        
ksdxfdmp()+132       
ksdxcb()+3432        
sspuser()+116        
__sighandler()       oradebug dump CALLSTACK
kcbinvbrhb()+312     
kcbivbr()+596        
kcblibr()+76         manages Oracle's buffer cache operation
kdlrsp()+924         
kdl_destroy()+928    
kdlt_freetemp()+148  lob and temporary lob management
kokltfr()+448        
kokltfc()+16         lob access
koltitfr()+612       
kolttfr()+136        support for object Lob buffering PKGPROV.get_last_ord() temporary_quote.tmp_quote (CLOB field operations)
pfrcvfc_format_conv  
pevm_CNVMSC()+28     
pfrinstr_CNVMSC()+6  
pfrrun_no_tool()+96  
pfrrun()+1032        
plsql_run()+616      PKGPROV.get_last_ord
peidxr_run()+280     
peidxexe()+104       
kkxdexe()+352        
kkxmpexe()+244       
kgmexwi()+584        
kgmexec()+1388       
evapls()+736         
evaopn2()+1284       
evaopn2()+4492       
evaopn2()+4224       
evaopn2()+4224       
evaopn2()+4224       
evaopn2()+4224       
evaopn2()+3712      
qerixGetKey()+4132   Index row source |* 68 |            INDEX UNIQUE SCAN          | XPKTAB_OBJ
qerixStart()+1832    
qerjotFetch()+1096   
qerjotFetch()+392    
qerghFetch()+324     
qervwFetch()+148     
rwsfcd()+292         
qeruaFetch()+404    
opifch2()+2312       row source : union-All SQL_ID  7p6225u9gc7ka |   1 |  UNION-ALL
opiall0()+2196       
opikpr()+460         
opiodr()+908         
rpidrus()+216        
skgmstack()+160      
rpidru()+132         
rpiswu2()+520        
kprball()+1044       
kkxsarfch()+568      
kkxsfcn()+556        
pevm_icd_call_commo  
pfrinstr_ICAL()+140  
pfrrun_no_tool()+96  
pfrrun()+1032        
plsql_run()+616      table(CALCS.GET_PAR(ci.id, 113852902)) t
peicnt()+324        
kkxuexe()+468        
kkxmpsexe()+400      
kgmexwi()+584        
kgmexec()+1388       
evapls()+736         
evaopn2()+1284       
qerocImageIterStart  object collection iterator row source |  38 | COLLECTION ITERATOR PICKLER FETCH| GETCALCVAL
qerocStart()+504     
qerjotFetch()+1096   
qerjotFetch()+392    
qerjotFetch()+392    
qernsFetch()+624     
rwsfcd()+100         
qeruaFetch()+404     
subsr1()+288         
subsr3()+364         
IPRA.$evaopn3()+697  
evanvl()+956         
evaopn2()+1284       
opifcr()+5500        
qervwRowProcedure()  
qerghFetch()+1372    
qervwFetch()+148     qervw - view row source SQL_ID  bb8mfnvw4agxp |  41 |  VIEW | VM_NWVW_3
opifch2()+2312       
opiall0()+2196       
opikpr()+460         
opiodr()+908         
rpidrus()+216        
skgmstack()+160      
rpidru()+132         
rpiswu2()+520        
kprball()+1044       
kkxsarfch()+568      
kkxsfcn()+556        
pevm_icd_call_commo  
pfrinstr_ICAL()+140  
pfrrun_no_tool()+96  
pfrrun()+1032        
plsql_run()+616      CALCS.GET_PAR(P_ID => A.OBJ_ID, P_A_ID => 313006642)
peicnt()+324        
kkxuexe()+468        
kkxmpsexe()+400      
kgmexwi()+584        
kgmexec()+1388       
evapls()+736         
evaopn2()+1284       
qerocImageIterStart  SQL_ID  3kp00z3rvsjf0 |   1 |  COLLECTION ITERATOR PICKLER FETCH| GET_PAR | 1 |
qerocStart()+504     object collection iterator row source
subsr1()+208         
subsr3()+364         
IPRA.$evaopn3()+697  
evaopn3()+92         
opifcr()+5808        
qertbFetchByRowID()  tab by rowid SQL_ID  3kp00z3rvsjf0 |*  2 |  TABLE ACCESS BY INDEX ROWID | TAB_OBJ
opifch2()+2312       
opifch()+48          
opipls()+4248        
opiodr()+908         
rpidrus()+216        
skgmstack()+160    
rpidru()+132         
rpiswu2()+520        
rpidrv()+1060      recurcive sql
psddr0()+448         
psdnal()+472       binds
pevm_BFTCHC()+284    
pfrinstr_BFTCHC()+1  
pfrrun_no_tool()+96  
pfrrun()+1032        
plsql_run()+616    top level anonymous block
peicnt()+324         
kkxexe()+484         
opiexe()+17356       
kpoal8()+1872        
opiodr()+908         
ttcpip()+1028        
opitsk()+1612        
opiino()+940         
opiodr()+908         
opidrv()+1132        
sou2o()+136          
opimai_real()+560    
ssthrdmain()+276     
main()+204           
__start()+112        

So, my report was constantly doing some LOB manipulation! Very soon I found some CLOB column which was fetched inside PKGPROV.get_last_ord(). And that CLOB was parsed using instr() and substr() functions. It was pretty good example of the "SQL Semantics" usage. It works, but how much does it cost? You can read more about this in the famous "Oracle PL/SQL Programming, Sixth Edition by Steven Feuerstein with Bill Pribyl" book, paragraph "SQL semantics may yield temporary LOBs"

Taking this, a quick workaround appeared in my mind. In my case CLOB column temporary_quote.tmp_quote contained not so long pieces of text, 6K at maximum. It was possible to declare local variable with VARCHAR2(32768), assign CLOB column content to it, and just do the same parsing but with local VARCHAR2 variable.

See below Tomas Kyte's RUNSTATS utility style comparison for CLOB manipulation part of the report. Run1 is the original approach, Run2 is using varchar2 local variable. The difference is huge!

Statistic                                            Run1             Run2
----------------------------------------- --------------- ----------------
Elapsed (santisec)......................:          124349              743
CPU (santisec)..........................:           69551              404

One more thing. If you have noted that CPU time is much smaller than elapsed time, it is not due to some IO waiting time or similar. All these experiments were conducted on AIX Power 7 server. So, this article gives some explanation.

четверг, 10 марта 2016 г.

Underscore (hidden) parameters in Multitenant database

It is pretty usual when your application needs some underscore (hidden) parameter to be set.
But what can happen in new Multitenant database.

[oracle@ora12c ~]$ sqlplus / as sysdba                                          

SQL*Plus: Release 12.1.0.2.0 Production on Thu Mar 10 15:45:21 2016

Copyright (c) 1982, 2014, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options                                         

SQL> sho con_name

CON_NAME
------------------------------
CDB$ROOT                      

SQL> alter session set container=pdb1;

Session altered.

SQL> sho con_name

CON_NAME
------------------------------
PDB1
SQL> sho parameter _connect_by_use_union_all
SQL> alter system set "_connect_by_use_union_all" = 'old_plan_mode';

System altered.

SQL> sho parameter _connect_by_use_union_all

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
_connect_by_use_union_all            string      old_plan_mode
SQL> select value from v$spparameter where name='_connect_by_use_union_all';

VALUE
------------------------------------
old_plan_mode

SQL> alter pluggable database close immediate;

Pluggable database altered.

SQL> alter pluggable database open read write;

Pluggable database altered.

SQL> sho con_name

CON_NAME
------------------------------
PDB1
SQL> sho parameter _connect_by_use_union_all
SQL> select value from v$spparameter where name='_connect_by_use_union_all';

VALUE
------------------------------------
old_plan_mode

SQL>

So, you can set underscore parameter on PDB level but it is "disappeared" after reboot either PDB or whole CDB.
So far the only workaround I`ve found is to create LOGON trigger with corresponding parameter set by alter session statement.

вторник, 8 декабря 2015 г.

Do not use instance_id attribute of dbms_scheduler

You must not use dbms_scheduler.set_attribute('YOUR_JOB', 'instance_id', ); for your production system.
If DBA makes some maintenance on your job`s RAC node, the job will just not run during that.
So, if you want your job to run on particular instance(s), you ought to use JOB_CLASS and SERVICES to achieve this.

Another side effect can happen if you run your job through dbms_schdeuler.run_job() with USE_CURRENT_SESSION => FALSE.
Suppose your job attached to node 1 and your session with dbms_schdeuler.run_job happen to connect to node 2. Your job will not run in this circumstance. It will not produce any exceptions or log record with an error.

пятница, 11 июля 2014 г.

Materialize subquery on Active DataGuard

Due to some bug Oracle can fail to materialize subquery from with ... as clause on Active DataGuard with error
ORA-00604: error occurred at recursive SQL level 2
ORA-16000: database open for read-only access

If subquery does not have materialize hint, the situation can be fixed by setting "_with_subquery"=INLINE on session or system level.

Please consult Oracle Support before using this on production system.

вторник, 8 июля 2014 г.

Oracle PL/SQL programming 6th edition

There are several statement from the book "Oracle PL/SQL programming 6th edition" which I disagree with.

1) page 475 (paper)
SAVEPOINT gives a name to, and marks a point in, the processing of your transaction.
This marker allows you to ROLLBACK TO that point, undoing any changes and releasing
any locks issued after that savepoint, but preserving any changes and locks that
occurred before you marked the savepoint.

I think that it is worth to be mentioned that sessions which was waiting on locks released after rollback to savepoint remain waiting, although new sessions can lock that resources.
By the way, implicit savepoint before anonymous block works in the same manner.

2) page 476 (paper)
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE | READ COMMITTED
This version defines how transactions that modify the database should be handled.
You can specify a serializable or read-committed isolation level. When you specify
SERIALIZABLE, a DML statement that attempts to modify a table that has already
been modified in an uncommitted transaction will fail.

Serialized transaction will fail with "ORA-08177: can't serialize access for this transaction" only after the uncommitted transaction eventually becomes committed, if it rolls back then serialized is successful.

to be continued...



четверг, 12 декабря 2013 г.

FOR UPDATE SKIP LOCKED

One new thing I`ve learned from recent PL/SQL Challenge quiz is about how SELECT FOR UPDATE SKIP LOCKED works.

It does not really lock rows on EXEC phase of query execution (open cursor), when ordinary FOR UPDATE does.

Here are two excerpts from SQL Trace:
PARSING IN CURSOR #47354360485408 len=72 dep=1 uid=63 oct=3 lid=63 tim=1386839485248389 hv=4069083409 ad='539eab670' sqlid='csw3rbvt8kk8j'
SELECT * FROM PLCH_EMPLOYEES ORDER BY EMPLOYEE_ID FOR UPDATE SKIP LOCKED
END OF STMT
EXEC #47354360485408:c=0,e=31,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,plh=1212233107,tim=1386839485248389

PARSING IN CURSOR #47085989084664 len=60 dep=1 uid=63 oct=3 lid=63 tim=1386839807239479 hv=3207034820 ad='539de1f98' sqlid='aqqrp16zkfxy4'
SELECT * FROM PLCH_EMPLOYEES ORDER BY EMPLOYEE_ID FOR UPDATE
END OF STMT
PARSE #47085989084664:c=3000,e=2655,p=0,cr=8,cu=0,mis=1,r=0,dep=1,og=1,plh=1212233107,tim=1386839807239478
EXEC #47085989084664:c=0,e=199,p=0,cr=7,cu=4,mis=0,r=0,dep=1,og=1,plh=1212233107,tim=1386839807239715 

вторник, 25 июня 2013 г.

SQLPLUS ARRAYSIZE and LOB

SQLPLUS ARRAYSIZE drops to 1 if your query select list contains LOB column(s).

Extended SQL Trace file contains following rows for such queries (on my 11.2.0.3 Linux box)



FETCH #4013088:c=24997,e=24286,p=0,cr=384,cu=394,mis=0,r=1,dep=0,og=1,plh=1447629766,tim=1372161822772660
WAIT #4013088: nam='SQL*Net message from client' ela= 315 driver id=1413697536 #bytes=1 p3=0 obj#=3 tim=1372161822773140
WAIT #0: nam='SQL*Net message to client' ela= 5 driver id=1413697536 #bytes=1 p3=0 obj#=3 tim=1372161822773212
LOBREAD: c=0,e=57,p=0,cr=2,cu=0,tim=1372161822773231
WAIT #0: nam='SQL*Net message from client' ela= 385 driver id=1413697536 #bytes=1 p3=0 obj#=3 tim=1372161822773655
WAIT #0: nam='SQL*Net message to client' ela= 5 driver id=1413697536 #bytes=1 p3=0 obj#=3 tim=1372161822773720
LOBREAD: c=0,e=54,p=0,cr=2,cu=0,tim=1372161822773739
WAIT #0: nam='SQL*Net message from client' ela= 1432 driver id=1413697536 #bytes=1 p3=0 obj#=3 tim=1372161822775210
LOBTMPFRE: c=0,e=21,p=0,cr=0,cu=0,tim=1372161822775271
WAIT #0: nam='SQL*Net message to client' ela= 5 driver id=1413697536 #bytes=1 p3=0 obj#=3 tim=1372161822775301
WAIT #0: nam='SQL*Net message from client' ela= 359 driver id=1413697536 #bytes=1 p3=0 obj#=3 tim=1372161822775677
WAIT #4013088: nam='SQL*Net message to client' ela= 6 driver id=1413697536 #bytes=1 p3=0 obj#=3 tim=1372161822795869
FETCH #4013088:c=19997,e=20226,p=0,cr=389,cu=394,mis=0,r=1,dep=0,og=1,plh=1447629766,tim=1372161822795935