Tuesday, June 12, 2018

Database Incomplete recovery lead to ORA-00600: internal error code, arguments: [13013] ... On SYS related CLUSTER table.

An Oracle database hosting SAP SLT Application (SAP Landscape Transformation) got corrupted due to unexpected crash of ASM disk groups.

The most corrupted disks were related to FRA where the ORL and Archive logs placed upon.



When Oracle Support confirmed that these disks are no more in a state of recovery, One of my buddy Database Administrator Performed the  Incomplete Recovery.Post of incomplete DB recovery, all were in an assumption that everything is working fine.  



But The SAP SLT application users started complaining as below in their own words. 

"We are seeing lot of db oracle related error messages in the application level, which hampers almost all activity in the system. Kindly check and do the needful."
Then myself started investigating. In my initial investigation itself  found that below error was being triggered since above mentioned Incomplete DB recovery. 

The Oracle Internal Error encountered as below.

ORA-00600: internal error code, arguments: [13013], [5001], [344], [8485480], [0], [8485480], [17], [], [], [], [], []


Oracle says if ORA-600 [13013] is due to corrupted Indexes and the following optional arguments should be considered as below(7 optional arguments), continued my investigation by matching the arguments as below.


Format: ORA-600 [13013] [a] [b] [c] [d] [e] [f]


    Arg [a] Passcount -- 5001

    Arg [b] Data Object number -- 344
    Arg [c] Tablespace Decimal Relative DBA (RDBA) of block containing the row to be updated -- 8485480
    Arg [d] Row Slot number -- [0]
    Arg [e] Decimal RDBA of block being updated (Typically same as [c]) -- [8485480]
    Arg [f] Code -- [17]

When i tried to catch  the problematic table using corrupted Indexes,  i found that is not a table but one of the SYS schema CLUSTER object.


A CLUSTER Object is a group of tables that share common columns and store related data in the same blocks, 

  • Identification of DB Object affected by above corruption.

SET LINESIZE 200
COL OBJECT_NAME FORMAT A10
COL OWNER FORMAT A15
SET LINESIZE 200
SELECT OBJECT_NAME,OWNER,OBJECT_ID,OBJECT_TYPE FROM DBA_OBJECTS WHERE OBJECT_ID=344;
OBJECT_NAME                    OWNER            OBJECT_ID OBJECT_TYPE
------------------------------ --------------- ---------- -----------------------
SMON_SCN_TO_TIME_AUX           SYS                    344 CLUSTER

The above cluster object is build up on the columns from a SYS table called SMON_SCN_TIME. I followed one odd way to find out the table building this cluster object by analyzing all the SYS related tables matching above CLUSTER name. Then i found the exact table and encountered the real issue when validating its structure.

  • Table affected by the above corruption


SQL> ANALYZE TABLE SYS."SMON_SCN_TIME" VALIDATE STRUCTURE CASCADE;

ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file

  • Issues got resolved post of rebuilding the corrupted index rebuilding below.

SQL> SELECT OWNER,INDEX_NAME FROM DBA_INDEXES WHERE TABLE_NAME='SMON_SCN_TIME';

OWNER      INDEX_NAME

---------- -------------------------
SYS        SMON_SCN_TIME_TIM_IDX
SYS        SMON_SCN_TIME_SCN_IDX

SQL> ANALYZE INDEX SMON_SCN_TIME_TIM_IDX VALIDATE STRUCTURE;


Index analyzed.


SQL> ANALYZE INDEX SMON_SCN_TIME_SCN_IDX VALIDATE STRUCTURE;


Index analyzed.


SQL> ALTER INDEX SMON_SCN_TIME_TIM_IDX REBUILD ONLINE;


Index altered.


SQL> ALTER INDEX SMON_SCN_TIME_SCN_IDX REBUILD ONLINE;


Index altered.


Finally After rebuilding above corrupted indexes got recovered and i could able to re-analyze the problematic table without any issues.



SQL> ANALYZE TABLE SYS."SMON_SCN_TIME" VALIDATE STRUCTURE CASCADE;

Table analyzed.


Post of above the  Oracle Internal error got disappeared from alert log and No complaints from Application user.


Happy reading...

Oracle RAC node unavailable with error: Server unexpectedly closed network connection6]clsc_connect: (0x251c670) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_node2_))

 Early midnight I received a call from the monitoring team that one of the critical production database node is not available. As I am aware...