Saturday, September 27, 2014

Data Pump import job stuck on a space quota wait

The import job got stuck, continuously raising the two errors below.

ORA-39171: Job is experiencing a resumable wait.
ORA-01536: space quota exceeded for tablespace 'SPDEFAULT'

Oracle documents these errors as follows.

----------------------------------------------------------------------------------------------------------------------------------------
ORA-39171: Job is experiencing a resumable wait.

Cause: The Data Pump job is stalled with one or more of its sessions having a resumable wait. Resumable waits are typically caused by a non-expandable tablespace running out of space. The follow-on message describes the nature of the wait.

Action: Correct the condition causing the wait. This will typically involve adding datafiles to the tablespace that is full.

ORA-01536: space quota exceeded for tablespace "string"

Cause: The space quota for the segment owner in the tablespace has been exhausted and the operation attempted the creation of a new segment extent in the tablespace.

Action: Either drop unnecessary objects in the tablespace to reclaim space or have a privileged user increase the quota on this tablespace for the segment owner.

----------------------------------------------------------------------------------------------------------------------------------------

However, neither alert pointed me to the proper fix: I needed to grant quota to some owner/schema, but the message does not say which one, and the import brings in a lot of users.

So I dug further and looked into the database to see what exactly was happening at that moment. A user called 'SP' was waiting to create an object in the SPDEFAULT tablespace, and I found that this user had no quota on SPDEFAULT.

set pages 50000 lines 32767
col "Last SQL" format a1000
SELECT t.inst_id,s.username, s.sid, s.serial#,t.sql_id,t.sql_text "Last SQL",t.module
FROM gv$session s, gv$sqlarea t
WHERE s.sql_address =t.address AND
s.sql_hash_value =t.hash_value;

   INST_ID USERNAME                              SID    SERIAL# SQL_ID        Last SQL
---------- ------------------------------ ---------- ---------- ------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
         1 EXP$ORACLE                             44        741 bjnvhuy0bayab CREATE TABLE "SP"."SPDBOWNER" ("DBNAME" VARCHAR2(8 BYTE) NOT NULL ENABLE, "OWNERID" VARCHAR2(8 BYTE) NOT NULL ENABLE) SEGMENT CREATION IMMEDIATE PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT) TABLESPACE "SPDEFAULT"


Finally I had the schema that needed the quota. After granting it, the import progressed.

ORA-39171: Job is experiencing a resumable wait.
ORA-01536: space quota exceeded for tablespace 'SPDEFAULT'
Processing object type DATABASE_EXPORT/SCHEMA/TABLE/TABLE_DATA
. . imported "SYSDMPS"."SP_GLSI_ADEX_DTL"                25.89 GB 102216301 rows
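For reference, the fix is a one-liner once the waiting owner is known. A minimal sketch, using the names from this case (UNLIMITED is one choice; a sized quota such as 500M works as well):

```sql
-- Check what quota, if any, the user currently has (no row = no quota).
SELECT tablespace_name, bytes, max_bytes
FROM   dba_ts_quotas
WHERE  username = 'SP';

-- Grant the quota so the resumable wait can continue.
ALTER USER sp QUOTA UNLIMITED ON spdefault;
```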

Wednesday, September 17, 2014

Operations on a DATABASE/TABLESPACE under begin backup / offline mode - understand them better!!


Even though RMAN has largely replaced traditional user-managed backups as the backup and recovery tool for Oracle databases, questions still come up (posed to me, or in my own mind) about how Oracle manages datafiles and data when tablespaces or the whole database are kept in BEGIN BACKUP mode. Let us explore it further.

Note: before testing these scenarios, make sure your database has a consistent backup if your data is important.

Let's put the USERS tablespace into begin backup mode:

SQL> alter tablespace users begin backup;

Tablespace altered.

Make sure the datafiles of this tablespace show ACTIVE status in V$BACKUP, which indicates that they are in active backup mode.

SQL> SELECT t.name AS "TB_NAME", d.file# as "DF#", d.name AS "DF_NAME", b.status
FROM   V$DATAFILE d, V$TABLESPACE t, V$BACKUP b
WHERE  d.TS#=t.TS#
AND    b.FILE#=d.FILE#
AND    b.STATUS='ACTIVE'; 

TB_NAME         DF#      DF_NAME                                  STATUS
------------- ---------- --------------------------------------   ------------------
USERS            4       /u01/app/oracle/oradata/SPRD/users01.dbf  ACTIVE

Can we add a new datafile to a tablespace if it is in Backup mode ? 

-- YES. However, the newly added datafile will not be in backup mode. To put it in backup mode you need to re-enable backup mode for the entire tablespace.

Example:
--------

SQL> alter tablespace users add datafile '/u01/app/oracle/oradata/SPRD/users02.dbf' size 5m;

Tablespace altered.

SQL> SELECT t.name AS "TB_NAME", d.file# as "DF#", d.name AS "DF_NAME", b.status
FROM   V$DATAFILE d, V$TABLESPACE t, V$BACKUP b
WHERE  d.TS#=t.TS#
AND    b.FILE#=d.FILE#
AND    b.STATUS='ACTIVE'; 

TB_NAME         DF#      DF_NAME                                  STATUS
------------- ---------- --------------------------------------   ------------------
USERS            4       /u01/app/oracle/oradata/SPRD/users01.dbf  ACTIVE


SQL> ALTER TABLESPACE USERS END BACKUP;

Tablespace altered.

SQL> ALTER TABLESPACE USERS BEGIN BACKUP;

Tablespace altered.


SELECT t.name AS "TB_NAME", d.file# as "DF#", d.name AS "DF_NAME", b.status
FROM   V$DATAFILE d, V$TABLESPACE t, V$BACKUP b
WHERE  d.TS#=t.TS#
AND    b.FILE#=d.FILE#
AND    b.STATUS='ACTIVE';

TB_NAME                               DF# DF_NAME                                                      STATUS
------------------------------ ---------- ------------------------------------------------------------ ------------------
USERS                                   4 /u01/app/oracle/oradata/SPRD/users01.dbf                     ACTIVE
USERS                                   5 /u01/app/oracle/oradata/SPRD/users02.dbf                     ACTIVE

Take a physical copy of the above two datafiles, which are in backup mode, for the further scenarios.

[oracle@oranode1 Manual_bkps]$ pwd
/fra/SPRD/Manual_bkps
[oracle@oranode1 Manual_bkps]$ cp /u01/app/oracle/oradata/SPRD/users01.dbf .
[oracle@oranode1 Manual_bkps]$ cp /u01/app/oracle/oradata/SPRD/users02.dbf .
[oracle@oranode1 Manual_bkps]$ ls -ltr
total 107652
-rw-r----- 1 oracle oinstall 104865792 Sep 16 22:09 users01.dbf
-rw-r----- 1 oracle oinstall   5251072 Sep 16 22:09 users02.dbf

Can we resize a datafile which is in backup mode?

Yes and NO...

Yes, if we are expanding the size of the datafile.

SQL> alter database datafile 4 resize 100m;


Database altered.

SQL> ALTER DATABASE DATAFILE '/u01/app/oracle/oradata/SPRD/users02.dbf' RESIZE 10M;

Database altered.


SQL> select file_name,bytes/1024/1024 from dba_data_files where tablespace_name='USERS';

FILE_NAME                                     BYTES/1024/1024
--------------------------------------------  ---------------
/u01/app/oracle/oradata/SPRD/users01.dbf                  100
/u01/app/oracle/oradata/SPRD/users02.dbf                   10


No, if we are shrinking the datafile below its existing size. Below I tried to reduce its size from
100M to 99M (1 MB less); in fact this datafile is almost empty.


SQL>  ALTER DATABASE DATAFILE '/u01/app/oracle/oradata/SPRD/users01.dbf' RESIZE 99M;
 ALTER DATABASE DATAFILE '/u01/app/oracle/oradata/SPRD/users01.dbf' RESIZE 99M
*
ERROR at line 1:
ORA-03298: cannot shrink datafile - file 4 is under hot backup
ORA-01110: data file 4: '/u01/app/oracle/oradata/SPRD/users01.dbf'

Can we drop a datafile ?

No. Here is an example.

SQL> ALTER TABLESPACE USERS DROP DATAFILE '/u01/app/oracle/oradata/SPRD/users02.dbf';
ALTER TABLESPACE USERS DROP DATAFILE '/u01/app/oracle/oradata/SPRD/users02.dbf'
*
ERROR at line 1:
ORA-01150: cannot prevent writes - file 5 has online backup set
ORA-01110: data file 5: '/u01/app/oracle/oradata/SPRD/users02.dbf'


This time I added one more datafile, put the whole tablespace into backup mode again, and copied only this newly created datafile at OS level (the two remaining datafiles were already backed up above, and our DB is in archivelog mode, so no issues).

SQL> select file_name,bytes/1024/1024 SIZE_MB from dba_data_files WHERE TABLESPACE_NAME='USERS';

FILE_NAME                                                       SIZE_MB
------------------------------------------------------------ ----------
/u01/app/oracle/oradata/SPRD/users01.dbf                            100
/u01/app/oracle/oradata/SPRD/users02.dbf                             10
/u01/app/oracle/oradata/SPRD/users03.dbf                              5

SQL> ALTER TABLESPACE USERS BEGIN BACKUP;

Tablespace altered.

[oracle@oranode1 Manual_bkps]$  cp /u01/app/oracle/oradata/SPRD/users03.dbf .

SQL> ALTER TABLESPACE USERS END BACKUP;

Tablespace altered.


Let's remove the datafiles at OS level for further investigation. All the above datafiles were removed from their original locations. (Make sure these datafiles are backed up at OS level while in backup mode, and keep the database in archivelog mode.)


[oracle@oranode1 SPRD]$ pwd

/u01/app/oracle/oradata/SPRD

[oracle@oranode1 SPRD]$ rm users*

Now the datafiles belonging to the USERS tablespace have been removed from their original locations, so let us examine how far we can still operate on the USERS tablespace.

Can we select from, or delete/update rows of, the EMP table?

select username,default_tablespace from dba_users where default_tablespace='USERS';

USERNAME                       DEFAULT_TABLESPACE
------------------------------ ------------------------------
ORACLE_OCM                     USERS
SPATIAL_CSW_ADMIN_USR          USERS
SCOTT                          USERS
APEX_PUBLIC_USER               USERS
DIP                            USERS
MDDATA                         USERS
XS$NULL                        USERS
SPATIAL_WFS_ADMIN_USR          USERS

SQL> select table_name,tablespace_name from dba_tables where owner='SCOTT' and table_name='EMP';

TABLE_NAME                     TABLESPACE_NAME
------------------------------ ------------------------------
EMP                            USERS

SQL> select B.FILE_NAME from dba_segments a, dba_data_files b where a.header_file=b.file_ID and segment_name='EMP';

FILE_NAME
--------------------------------------------------------------------------------
/u01/app/oracle/oradata/SPRD/users01.dbf

SQL> conn scott/tiger;
Connected.

SQL> show user;
USER is "SCOTT"

SQL> select * from tab;

TNAME                          TABTYPE  CLUSTERID
------------------------------ ------- ----------
BONUS                          TABLE
DEPT                           TABLE
EMP                            TABLE
SALGRADE                       TABLE


SQL> select * from emp;

     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      7369 SMITH      CLERK           7902 17-DEC-80        800                    20
      7499 ALLEN      SALESMAN        7698 20-FEB-81       1600        300         30
      7521 WARD       SALESMAN        7698 22-FEB-81       1250        500         30
      7566 JONES      MANAGER         7839 02-APR-81       2975                    20
      7654 MARTIN     SALESMAN        7698 28-SEP-81       1250       1400         30
      7698 BLAKE      MANAGER         7839 01-MAY-81       2850                    30
      7782 CLARK      MANAGER         7839 09-JUN-81       2450                    10
      7788 SCOTT      ANALYST         7566 19-APR-87       3000                    20
      7839 KING       PRESIDENT            17-NOV-81       5000                    10
      7844 TURNER     SALESMAN        7698 08-SEP-81       1500          0         30
      7876 ADAMS      CLERK           7788 23-MAY-87       1100                    20

     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      7900 JAMES      CLERK           7698 03-DEC-81        950                    30
      7902 FORD       ANALYST         7566 03-DEC-81       3000                    20
      7934 MILLER     CLERK           7782 23-JAN-82       1300                    10

Yes. DML operations also work against these datafiles; however, a datafile goes offline automatically once it can no longer be accessed.

SQL> delete emp where empno=7934;

1 row deleted.

SQL> commit;

Commit complete.

SQL> update emp set EMPNO=1000 where ename='SMITH';

1 row updated.

SQL> COMMIT;

Commit complete.

You can see the above changes in this table. One row deleted and one row updated.

SQL> select * from emp;

     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      1000 SMITH      CLERK           7902 17-DEC-80        800                    20
      7499 ALLEN      SALESMAN        7698 20-FEB-81       1600        300         30
      7521 WARD       SALESMAN        7698 22-FEB-81       1250        500         30
      7566 JONES      MANAGER         7839 02-APR-81       2975                    20
      7654 MARTIN     SALESMAN        7698 28-SEP-81       1250       1400         30
      7698 BLAKE      MANAGER         7839 01-MAY-81       2850                    30
      7782 CLARK      MANAGER         7839 09-JUN-81       2450                    10
      7788 SCOTT      ANALYST         7566 19-APR-87       3000                    20
      7839 KING       PRESIDENT            17-NOV-81       5000                    10
      7844 TURNER     SALESMAN        7698 08-SEP-81       1500          0         30
      7876 ADAMS      CLERK           7788 23-MAY-87       1100                    20

     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      7900 JAMES      CLERK           7698 03-DEC-81        950                    30
      7902 FORD       ANALYST         7566 03-DEC-81       3000                    20

13 rows selected.

It seems all of the above is going fine; however, the alert log records errors saying that these datafiles are going offline and cannot be accessed for DML operations. The reason is that Oracle automatically takes a tablespace offline when certain errors are encountered, for example when DBWR fails in several attempts to write to one of the tablespace's datafiles.

The question, then, is how the operations keep progressing even though the datafile is offline. Oracle's answer is:

When a tablespace goes offline, Oracle does not permit any subsequent SQL statements to reference objects contained in the tablespace. Active transactions with completed statements that refer to data in a tablespace that has been taken offline  are not affected at the transaction level. Oracle saves rollback data corresponding to statements that affect data in the offline tablespace in a deferred rollback segment (in the SYSTEM tablespace). When the tablespace is brought back online, Oracle applies the rollback data to the tablespace, if needed.

==============ALERT LOG FILE===========================

ORA-01116: error in opening database file 4
ORA-01110: data file 4: '/u01/app/oracle/oradata/SPRD/users01.dbf'

ORA-01116: error in opening database file 5
ORA-01110: data file 5: '/u01/app/oracle/oradata/SPRD/users02.dbf'

ORA-01135: file 6 accessed for DML/query is offline
ORA-01110: data file 6: '/u01/app/oracle/oradata/SPRD/users03.dbf'

======================================================================


SQL> select B.FILE_NAME from dba_segments a, dba_data_files b
  2  where a.header_file=b.file_ID and segment_name='EMP';

FILE_NAME
------------------------------------------------------------
/u01/app/oracle/oradata/SPRD/users01.dbf

SQL> ! ls -ltr /u01/app/oracle/oradata/SPRD/users01.dbf
ls: /u01/app/oracle/oradata/SPRD/users01.dbf: No such file or directory



SQL> select group#,members, status from v$log;

    GROUP#    MEMBERS STATUS
---------- ---------- ----------------
         1          1 INACTIVE
         2          1 INACTIVE
         3          1 CURRENT

SQL> select group#,status,member from v$logfile;

    GROUP# STATUS  MEMBER
---------- ------- ---------------------------------------
         3         /u01/app/oracle/oradata/SPRD/redo03.log
         2         /u01/app/oracle/oradata/SPRD/redo02.log
         1         /u01/app/oracle/oradata/SPRD/redo01.log



SQL> ALTER SYSTEM DUMP LOGFILE '/u01/app/oracle/oradata/SPRD/redo03.log';

System altered.


SQL> CONN SCOTT/tiger;
Connected.


Even table creation works in this situation, which means CREATE TABLE writes nothing physically to the missing datafile, and no segment has been allocated for the new table yet (hence DBA_SEGMENTS shows nothing for it).

SQL> CREATE TABLE T1 (id number);

Table created.

SQL> select * from tab;

TNAME                          TABTYPE  CLUSTERID
------------------------------ ------- ----------
BONUS                          TABLE
DEPT                           TABLE
EMP                            TABLE
SALGRADE                       TABLE
T1                             TABLE

SQL> select B.FILE_NAME from dba_segments a, dba_data_files b where a.header_file=b.file_ID and segment_name='T1';

no rows selected
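A likely explanation, inferred rather than shown by the trace: 11g deferred segment creation, where CREATE TABLE allocates no extents until the first row is inserted, so the statement never has to touch the missing datafile. The DBA_TABLES.SEGMENT_CREATED column can confirm it:

```sql
-- SEGMENT_CREATED = 'NO' means no extents have been allocated yet,
-- which matches DBA_SEGMENTS returning no rows for T1.
SELECT table_name, segment_created
FROM   dba_tables
WHERE  owner = 'SCOTT'
AND    table_name = 'T1';
```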

SQL> select B.FILE_NAME from dba_segments a, dba_data_files b where a.header_file=b.file_ID and segment_name='EMP';

FILE_NAME
--------------------------------------------------------------------------------
/u01/app/oracle/oradata/SPRD/users01.dbf

Let us insert a value into the just-created T1 to check how it progresses; it should throw an error like the one below. In contrast, you can still insert/delete/update rows of tables (e.g. EMP) whose segments already exist in the datafile (though the datafile is missing, the changes are recorded in the redo logs).


SQL> insert into t1 values (&id);
Enter value for id: 1
old   1: insert into t1 values (&id)
new   1: insert into t1 values (1)
insert into t1 values (1)
            *
ERROR at line 1:
ORA-01116: error in opening database file 4
ORA-01110: data file 4: '/u01/app/oracle/oradata/SPRD/users01.dbf'
ORA-27041: unable to open file
Linux Error: 2: No such file or directory
Additional information: 3

Now let us force Oracle to read the required data from the datafiles instead of memory, by flushing the shared pool and the buffer cache. This should throw an error, since it cannot read from a datafile that is missing from its original location.
                                           

SQL> alter system flush shared_pool;

System altered.

SQL> ALTER SYSTEM FLUSH BUFFER_CACHE;

System altered.

SQL> alter system switch logfile;

System altered.

SQL> select * from emp;
select * from emp
              *
ERROR at line 1:
ORA-01116: error in opening database file 4
ORA-01110: data file 4: '/u01/app/oracle/oradata/SPRD/users01.dbf'
ORA-27041: unable to open file
Linux Error: 2: No such file or directory
Additional information: 3

Finally, let us try to restore the datafiles that were deleted from their original locations.

SQL> select name,open_mode from v$database;

NAME      OPEN_MODE
--------- --------------------
SPRD      READ WRITE

SQL> shutdown immediate;
ORA-01116: error in opening database file 4
ORA-01110: data file 4: '/u01/app/oracle/oradata/SPRD/users01.dbf'
ORA-27041: unable to open file
Linux Error: 2: No such file or directory
Additional information: 3

SQL> select name,open_mode from v$database;

NAME      OPEN_MODE
--------- --------------------
SPRD      READ WRITE


SQL> shutdown abort;
ORACLE instance shut down.


[oracle@oranode1 Manual_bkps]$ cp user* /u01/app/oracle/oradata/SPRD/
[oracle@oranode1 Manual_bkps]$ cd /u01/app/oracle/oradata/SPRD/
[oracle@oranode1 SPRD]$ ls -ltr
total 1529216
-rw-r----- 1 oracle oinstall  20979712 Sep 16 22:01 temp01.dbf
-rw-r----- 1 oracle oinstall  52429312 Sep 16 22:28 redo02.log
-rw-r----- 1 oracle oinstall  52429312 Sep 16 23:04 redo03.log
-rw-r----- 1 oracle oinstall  52436992 Sep 16 23:17 undotbs01.dbf
-rw-r----- 1 oracle oinstall 713039872 Sep 16 23:17 system01.dbf
-rw-r----- 1 oracle oinstall 503324672 Sep 16 23:17 sysaux01.dbf
-rw-r----- 1 oracle oinstall  52429312 Sep 16 23:17 redo01.log
-rw-r----- 1 oracle oinstall   9748480 Sep 16 23:17 control02.ctl
-rw-r----- 1 oracle oinstall   9748480 Sep 16 23:17 control01.ctl
-rw-r----- 1 oracle oinstall   5251072 Sep 16 23:21 users03.dbf
-rw-r----- 1 oracle oinstall   5251072 Sep 16 23:21 users02.dbf
-rw-r----- 1 oracle oinstall 104865792 Sep 16 23:21 users01.dbf


SQL> col name format a60
SQL> set linesize 200
SQL> select NAME, CHECKPOINT_TIME, STATUS from v$datafile;

NAME                                                         CHECKPOIN STATUS
------------------------------------------------------------ --------- -------
/u01/app/oracle/oradata/SPRD/system01.dbf                    16-SEP-14 SYSTEM
/u01/app/oracle/oradata/SPRD/sysaux01.dbf                    16-SEP-14 ONLINE
/u01/app/oracle/oradata/SPRD/undotbs01.dbf                   16-SEP-14 ONLINE
/u01/app/oracle/oradata/SPRD/users01.dbf                     16-SEP-14 ONLINE
/u01/app/oracle/oradata/SPRD/users02.dbf                     16-SEP-14 ONLINE
/u01/app/oracle/oradata/SPRD/users03.dbf                     16-SEP-14 RECOVER

6 rows selected.


SQL> alter session set nls_date_format='yyyy-mm-dd hh24:mi:ss';

Session altered.

SQL> select NAME, CHECKPOINT_TIME, STATUS from v$datafile;

NAME                                                         CHECKPOINT_TIME     STATUS
------------------------------------------------------------ ------------------- -------
/u01/app/oracle/oradata/SPRD/system01.dbf                    2014-09-16 23:04:53 SYSTEM
/u01/app/oracle/oradata/SPRD/sysaux01.dbf                    2014-09-16 23:04:53 ONLINE
/u01/app/oracle/oradata/SPRD/undotbs01.dbf                   2014-09-16 23:04:53 ONLINE
/u01/app/oracle/oradata/SPRD/users01.dbf                     2014-09-16 23:04:53 ONLINE
/u01/app/oracle/oradata/SPRD/users02.dbf                     2014-09-16 23:04:53 ONLINE
/u01/app/oracle/oradata/SPRD/users03.dbf                     2014-09-16 22:20:40 RECOVER

6 rows selected.

SQL> select name,open_mode from v$database;

NAME                                                         OPEN_MODE
------------------------------------------------------------ --------------------
SPRD                                                         MOUNTED

SQL> alter database recover datafile '/u01/app/oracle/oradata/SPRD/users03.dbf';

Database altered.

SQL> select NAME, CHECKPOINT_TIME, STATUS from v$datafile;

NAME                                                         CHECKPOINT_TIME     STATUS
------------------------------------------------------------ ------------------- -------
/u01/app/oracle/oradata/SPRD/system01.dbf                    2014-09-16 23:04:53 SYSTEM
/u01/app/oracle/oradata/SPRD/sysaux01.dbf                    2014-09-16 23:04:53 ONLINE
/u01/app/oracle/oradata/SPRD/undotbs01.dbf                   2014-09-16 23:04:53 ONLINE
/u01/app/oracle/oradata/SPRD/users01.dbf                     2014-09-16 23:04:53 ONLINE
/u01/app/oracle/oradata/SPRD/users02.dbf                     2014-09-16 23:04:53 ONLINE
/u01/app/oracle/oradata/SPRD/users03.dbf                     2014-09-16 22:33:06 OFFLINE

6 rows selected.

SQL> alter database datafile '/u01/app/oracle/oradata/SPRD/users03.dbf' online;

Database altered.

SQL> select NAME, CHECKPOINT_TIME, STATUS from v$datafile;

NAME                                                         CHECKPOINT_TIME     STATUS
------------------------------------------------------------ ------------------- -------
/u01/app/oracle/oradata/SPRD/system01.dbf                    2014-09-16 23:04:53 SYSTEM
/u01/app/oracle/oradata/SPRD/sysaux01.dbf                    2014-09-16 23:04:53 ONLINE
/u01/app/oracle/oradata/SPRD/undotbs01.dbf                   2014-09-16 23:04:53 ONLINE
/u01/app/oracle/oradata/SPRD/users01.dbf                     2014-09-16 23:04:53 ONLINE
/u01/app/oracle/oradata/SPRD/users02.dbf                     2014-09-16 23:04:53 ONLINE
/u01/app/oracle/oradata/SPRD/users03.dbf                     2014-09-16 22:33:06 ONLINE

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01113: file 4 needs media recovery
ORA-01110: data file 4: '/u01/app/oracle/oradata/SPRD/users01.dbf'


COL DF# FORMAT 999
COL DF_NAME FORMAT A50
COL TBSP_NAME FORMAT A7
COL STATUS FORMAT A7
COL CHANGE# FORMAT 99999999
SELECT r.FILE# AS df#, d.NAME AS df_name, t.NAME AS tbsp_name, 
       d.STATUS, r.CHANGE#, r.TIME
FROM V$RECOVER_FILE r, V$DATAFILE d, V$TABLESPACE t
WHERE t.TS# = d.TS#
AND d.FILE# = r.FILE#;

 DF# DF_NAME                                            TBSP_NA STATUS    CHANGE# TIME
---- -------------------------------------------------- ------- ------- --------- -------------------
   1 /u01/app/oracle/oradata/SPRD/system01.dbf          SYSTEM  SYSTEM     825879 2014-09-16 23:04:53
   2 /u01/app/oracle/oradata/SPRD/sysaux01.dbf          SYSAUX  ONLINE     825879 2014-09-16 23:04:53
   3 /u01/app/oracle/oradata/SPRD/undotbs01.dbf         UNDOTBS ONLINE     825879 2014-09-16 23:04:53
   4 /u01/app/oracle/oradata/SPRD/users01.dbf           USERS   ONLINE     822715 2014-09-16 22:06:11
   5 /u01/app/oracle/oradata/SPRD/users02.dbf           USERS   ONLINE     822715 2014-09-16 22:06:11


SQL> alter database open;

ORA-01113: file 4 needs media recovery
ORA-01110: data file 4: '/u01/app/oracle/oradata/SPRD/users01.dbf'


COL DF# FORMAT 999
COL DF_NAME FORMAT A35
COL TBSP_NAME FORMAT A7
COL STATUS FORMAT A7
COL CHANGE# FORMAT 99999999
SELECT r.FILE# AS df#, d.NAME AS df_name, t.NAME AS tbsp_name, 
       d.STATUS, r.ERROR, r.CHANGE#, r.TIME
FROM V$RECOVER_FILE r, V$DATAFILE d, V$TABLESPACE t
WHERE t.TS# = d.TS#
AND d.FILE# = r.FILE#;

 DF# DF_NAME                             TBSP_NA STATUS  ERROR        CHANGE# TIME
---- ----------------------------------- ------- ------- ---------- --------- -------------------

   4 /u01/app/oracle/oradata/SPRD/users0 USERS   ONLINE                822715 2014-09-16 22:06:11
     1.dbf

   5 /u01/app/oracle/oradata/SPRD/users0 USERS   ONLINE                822715 2014-09-16 22:06:11
     2.dbf


The NULL (empty) values shown in the ERROR column indicate that these datafiles need recovery.

Let us recover datafiles 4 & 5 and open the database.


SQL> alter database recover datafile 4;

Database altered.

SQL> alter database recover datafile 5;

Database altered.

Make sure that no other datafiles need recovery...

SQL> COL DF# FORMAT 999
COL DF_NAME FORMAT A50
COL TBSP_NAME FORMAT A7
COL STATUS FORMAT A7
COL CHANGE# FORMAT 99999999
SELECT r.FILE# AS df#, d.NAME AS df_name, t.NAME AS tbsp_name, 
       d.STATUS, r.CHANGE#, r.TIME
FROM V$RECOVER_FILE r, V$DATAFILE d, V$TABLESPACE t
WHERE t.TS# = d.TS#
AND d.FILE# = r.FILE#;

no rows selected

SQL> alter database open;

Database altered.


Here is a snippet from the alert log showing what happens during datafile recovery.

======================ALERT LOG==========================
alter database recover datafile 4
Media Recovery Start
Serial Media Recovery started
Recovery of Online Redo Log: Thread 1 Group 2 Seq 5 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/SPRD/redo02.log
Recovery of Online Redo Log: Thread 1 Group 3 Seq 6 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/SPRD/redo03.log
Recovery of Online Redo Log: Thread 1 Group 1 Seq 7 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/SPRD/redo01.log
Media Recovery Complete (SPRD)
Completed: alter database recover datafile 4

alter database recover datafile 5
Media Recovery Start
Serial Media Recovery started
Recovery of Online Redo Log: Thread 1 Group 2 Seq 5 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/SPRD/redo02.log
Media Recovery Complete (SPRD)
Completed: alter database recover datafile 4
alter database recover datafile 5
Media Recovery Start
Serial Media Recovery started
Recovery of Online Redo Log: Thread 1 Group 2 Seq 5 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/SPRD/redo02.log
Recovery of Online Redo Log: Thread 1 Group 3 Seq 6 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/SPRD/redo03.log
Recovery of Online Redo Log: Thread 1 Group 1 Seq 7 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/SPRD/redo01.log
Media Recovery Complete (SPRD)
Completed: alter database recover datafile 5
Tue Sep 16 23:44:55 2014
alter database open
Beginning crash recovery of 1 threads
 parallel recovery started with 2 processes
Started redo scan
Completed redo scan
 read 4 KB redo, 11 data blocks need recovery
Started redo application at
 Thread 1: logseq 7, block 1170
Recovery of Online Redo Log: Thread 1 Group 1 Seq 7 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/SPRD/redo01.log

=============================================================================================

Enjoy....... Please Comment !!














Saturday, March 8, 2014

NFS locking issue while data pump export - Linux-x86_64 Error: 37: No locks available

Yesterday, as usual, the cron job triggered a Data Pump export job against a database on a Linux server.
The export job failed immediately after starting. When I looked into the export logfile, I found errors like the following.

Export: Release 11.2.0.3.0 - Production on Sat Mar 8 05:53:37 2014

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
;;;
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning and Automatic Storage Management options
ORA-39000: bad dump file specification
ORA-31641: unable to create dump file "/oraexp/NTLSNDB/Data_Pump_Export_NTLSNDB_FULL_030814_0553_01.dmp"
ORA-27086: unable to lock file - already in use
Linux-x86_64 Error: 37: No locks available
Additional information: 10

At the database level I verified the dump directory, its path, and that the proper read and write privileges were granted on the directory. Everything was fine at the database end.

I suspected an NFS mount option issue at the OS level. We use an NFS mount point shared across all of the servers; it must be mounted with the proper options on each server for the database to use it. I could see this mount point was mounted with the options Oracle recommends.

Then I checked the logs at OS level and found that the issue was related to the nfslock service, which was not running on this database server. This service lets the client lock a file on the NFS mount point so it can create the file and perform write operations.

>cat messages | grep lockd
Mar  8 04:03:31 demoserver kernel: lockd: cannot monitor 10.207.80.179
Mar  8 04:03:31 demoserver kernel: lockd: failed to monitor 10.207.80.179
Mar  8 04:20:27 demoserver kernel: lockd: cannot monitor 10.207.80.179
Mar  8 04:20:27 demoserver kernel: lockd: failed to monitor 10.207.80.179

Further, I came to know that the server had been rebooted a couple of days earlier for some reason, and after the reboot the nfslock service did not start automatically, so we started it manually. Note that if the nfslock service should auto-start after a reboot, run chkconfig nfslock on. This has since been taken care of, so from now on the nfslock service starts automatically whenever the server reboots.

cat messages | grep rpc
Mar  8 07:01:43 demoserver rpc.statd[12667]: Version 1.0.9 Starting
Mar  8 07:01:49 demoserver rpc.statd[12667]: Caught signal 15, un-registering and exiting.
Mar  8 07:01:49 demoserver rpc.statd[12745]: Version 1.0.9 Starting

You can manage the nfslock service with the commands below.

service nfslock status
service nfslock start
service nfslock stop

After making sure the service had started and the client was able to lock files on the NFS file system, we re-triggered the export job, and it completed successfully.



Wednesday, March 5, 2014

Killing process in Unix

To check the running processes in Unix,

Command- ps -ef

Here we can pipe the output to "grep" to filter for a particular process,

Example-

To find out running processes for apache,

root@sunpstsrv01# ps -ef | grep http



webservd   587   584   0   Sep 01 ?           0:00 /opt/csw/apache2/sbin/httpd -k start
    root   584     1   0   Sep 01 ?           0:47 /opt/csw/apache2/sbin/httpd -k start
  nobody  1498  1494   0   Sep 01 ?           0:00 /usr/local/apache2/bin/httpd -k start
webservd   586   584   0   Sep 01 ?           0:00 /opt/csw/apache2/sbin/httpd -k start
webservd   588   584   0   Sep 01 ?           0:00 /opt/csw/apache2/sbin/httpd -k start
  nobody  8860  1494   0   Sep 02 ?           0:00 /usr/local/apache2/bin/httpd -k start
  nobody  1499  1494   0   Sep 01 ?           0:00 /usr/local/apache2/bin/httpd -k start
  nobody  8861  1494   0   Sep 02 ?           0:00 /usr/local/apache2/bin/httpd -k start
  nobody  1500  1494   0   Sep 01 ?           0:00 /usr/local/apache2/bin/httpd -k start
  nobody  1501  1494   0   Sep 01 ?           0:00 /usr/local/apache2/bin/httpd -k start
  nobody  1502  1494   0   Sep 01 ?           0:00 /usr/local/apache2/bin/httpd -k start
  nobody  2832  1494   0   Sep 01 ?           0:00 /usr/local/apache2/bin/httpd -k start
webservd  6031   584   0   Sep 01 ?           0:00 /opt/csw/apache2/sbin/httpd -k start





To find out parent & child processes in Unix,

Command- ptree (print the process tree)



Example-

root@sunpstsrv01# ptree 8860

1494  /usr/local/apache2/bin/httpd -k start
  8860  /usr/local/apache2/bin/httpd -k start



Here in the above example we took process ID "8860" and ran the ptree command; we can see PID "1494" is the parent process of child process "8860".


Using the parent PID we can get the IDs of all its running child processes.


Example-



root@sunpstsrv01# ptree 1494

1494  /usr/local/apache2/bin/httpd -k start
  1498  /usr/local/apache2/bin/httpd -k start
  1499  /usr/local/apache2/bin/httpd -k start
  1500  /usr/local/apache2/bin/httpd -k start
  1501  /usr/local/apache2/bin/httpd -k start
  1502  /usr/local/apache2/bin/httpd -k start
  2832  /usr/local/apache2/bin/httpd -k start
  8860  /usr/local/apache2/bin/httpd -k start
  8861  /usr/local/apache2/bin/httpd -k start
  8862  /usr/local/apache2/bin/httpd -k start



Here we can see all child PIDs associated with parent process ID "1494".
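ptree is a Solaris command; on Linux, GNU ps can list the children of a given parent PID directly with --ppid. A minimal sketch, using two background sleep processes as stand-ins for the apache children:

```shell
# Spawn two background sleeps as stand-in children of this shell, then list
# the shell's children the way "ptree <parent PID>" would on Solaris.
sleep 30 & p1=$!
sleep 30 & p2=$!

# GNU ps (Linux) filters by parent PID; both sleep processes should appear.
ps -o pid=,comm= --ppid $$

kill "$p1" "$p2"    # clean up the stand-in children
```

Note that --ppid is a GNU procps option and is not available in Solaris ps.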


To kill Parent & child process,


Command- kill -9 <PID>
 

Example-



To kill apache process,



root@sunpstsrv01# kill -9 1494
 

Here we are killing the parent process running for apache.


Most of the time, killing the parent process also causes its child processes to exit (apache workers shut down when their parent goes away). Any children that survive are re-parented to init and must be killed individually.

We can confirm the result by using the "ps -ef" command.
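As a safe stand-in for killing an apache parent, the same kill-and-confirm cycle can be sketched with a disposable background process:

```shell
# Start a disposable background process (stand-in for an apache worker).
sleep 300 &
pid=$!

kill -9 "$pid"
wait "$pid" 2>/dev/null || true   # reap it so no zombie entry lingers

# kill -0 sends no signal; it only tests whether the PID still exists,
# which is a scriptable version of checking with "ps -ef".
if kill -0 "$pid" 2>/dev/null; then
  echo "process still running"
else
  echo "process killed"
fi
```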

  

Zombie process in Unix

 It is a process that has completed execution but still has an entry in the process table, allowing the process that started it to read its exit status.



When a process ends, all of the memory and resources associated with it are de-allocated so they can be used by other processes. However, the process entry in the process table remains. The parent is sent a SIGCHLD signal indicating that a child has died; the handler for this signal will typically execute the wait system call, which reads the exit status and removes the zombie.
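The reaping cycle described above can be sketched in the shell, where the wait built-in plays the role of the wait system call:

```shell
# The child exits immediately; until the parent waits on it, the kernel
# keeps its process-table entry so the exit status can still be collected.
false &                       # child process exits with status 1
pid=$!

# wait reaps the child: it returns the child's exit status and the kernel
# removes the (would-be zombie) process-table entry.
wait "$pid" && status=0 || status=$?
echo "child exited with status $status"
```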


Zombies can be identified in the output from the UNIX ps command by the presence of a “Z” in the STAT column.
 

Example-


ps -el | grep 'Z'

With a normal ps -el command you see output with the state of each process in the second column. Here are some common states:



S : sleeping

R : running

D : uninterruptible sleep (usually waiting for IO)

T : stopped (suspended) or traced

Z : zombie (defunct)

The output below is an example; we can see that dovecot-auth is the zombie.
 

[root@s324 /]# ps -el | grep 'Z'

F S   UID   PID  PPID  C PRI  NI ADDR    SZ WCHAN  TTY          TIME CMD

1 Z     0  1213   589  0  75   0    -     0 funct> ?        00:00:00 dovecot-auth

Here the "Z" in the second (state) column indicates a zombie process.
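The state letters above can be observed on any live process. A small sketch: a freshly started background sleep should report state S (sleeping) in the STAT column, where a zombie would show Z instead:

```shell
# Start a background sleep and inspect its STAT value with ps.
sleep 30 &
pid=$!
sleep 1                                    # give the child a moment to settle

state=$(ps -o stat= -p "$pid" | tr -d ' ')
echo "sleeper state: $state"               # expected to start with S

kill "$pid"
```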
 

Note that "kill -9" cannot remove a zombie: the process has already exited, and only its process-table entry remains. The entry disappears once the parent reaps the child with wait(); if the parent never does, kill or restart the application owning the parent process, so that init adopts the zombie and reaps it.

Oracle RAC node unavailable with error: Server unexpectedly closed network connection

clsc_connect: (0x251c670) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_node2_))

 Early midnight I received a call from the monitoring team that one of the critical production database node is not available. As I am aware...