Tuesday, April 12, 2011

Redologs & Archivelogs


0] Archivelogs over-generation

How to manage archivelog over-generation, a quick summary :

1/ Look at the volume generated in the past to pinpoint when the issue occurs
In following chapter 1] Volume / 1.1 Archivelog mode

-               per day, per hour
-               per day, specific hours

2/ Run Log Miner on several archivelogs to find out which usernames and segments are responsible for the redo vectors
In following chapter 2] Log Miner

3/ Find out, over the past, which segment has the most DB_BLOCK_CHANGES_TOTAL
With post [ Performance / AWR / 3.2 Redolog over generation ]

4/ Confirm in real time from the current sessions
With post [ Administration / Redologs & Archivelogs / 5] Archivelogs space over-filling : Kill the guilty session ]

5/ Set up an alert on such an event :
With post [ Administration / Redologs & Archivelogs / 4] Monitor excessive archive log generation to prevent ORA-16014 ]


1] Volume

Goal    : Redolog Volume Generation
Version : 10gR2
Views   : v$archived_log ( Archivelog )
          v$log , v$log_history ( Noarchivelog )

1.1 Archivelog mode

This is the real volume, given by the BLOCKS and BLOCK_SIZE columns.

-- by day, past 30 days
select trunc(completion_time) "day", min(sequence#) "num_log_begin_day", count(1) "nb_arc",
ceil(sum(blocks*block_size)/1024/1024/1024) Arclogs_GB from v$archived_log
where completion_time > SYSDATE-30 group by trunc(completion_time) order by 1 desc

day         num_log_begin_day  nb_arc  ARCLOGS_GB
----------  -----------------  ------  ----------
28/08/2015            1075875     188          78
27/08/2015            1075591     284         116
26/08/2015            1075358     233          87
25/08/2015            1075194     164          51
24/08/2015            1075043     151          42
23/08/2015            1074945      98           2
22/08/2015            1074841     104          13
21/08/2015            1074654     187          63
20/08/2015            1074477     177          62
19/08/2015            1074340     137          38
18/08/2015            1074205     135          36
17/08/2015            1074079     126          28

-- by hour, past 1 day
select trunc(completion_time,'HH24') hour_D, ceil(sum(blocks*block_size)/1024/1024) Arclogs_MB from v$archived_log
where completion_time > SYSDATE-1  group by trunc(completion_time,'HH24') order by 1 desc

-- by day, for a specific interval of hours
select trunc(hour_D) DAY_G, ceil(sum(Arclogs_MB/1024)) Arclogs_GB from
(
select trunc(completion_time,'HH24') hour_D, ceil(sum(blocks*block_size)/1024/1024) Arclogs_MB from v$archived_log
where completion_time > SYSDATE-30
and to_char(completion_time, 'HH24:MI') between '07:01' and '18:00'
-- I had to exclude days with issues
and to_char(completion_time, 'DD/MM/YYYY') not in ('27/08/2015')
group by trunc(completion_time,'HH24') order by 1 desc
) group by trunc(hour_D) order by 1 desc

-- by day, past 30 days, plus the volume needed on disk if we keep 2 days or 7 days of archivelogs
select "day" , num_log_begin_day , nb_arc , Arclogs_GB ,
Arclogs_GB+lag(Arclogs_GB, 1, -1) over(partition by RN order by "day") "2Days" ,
sum(Arclogs_GB) over(partition by RN order by "day" rows 6 preceding) "7Days"
from (
select '1' RN,trunc(completion_time) "day", min(sequence#) num_log_begin_day, count(1) nb_arc,
ceil(sum(blocks*block_size)/1024/1024/1024) Arclogs_GB from v$archived_log
where completion_time > SYSDATE-30 group by trunc(completion_time) order by 1 desc )

( Instance figures as of 12/06/13 ; MAX GB : 2Days = 71 , 7Days = 191 )

day         NUM_LOG_BEGIN_DAY  NB_ARC  ARCLOGS_GB  2Days  7Days
----------  -----------------  ------  ----------  -----  -----
13/05/2013             972884      25           3      2      3
14/05/2013             972909     102          16     19     19
15/05/2013             973011     106          19     35     38
16/05/2013             973117     107          20     39     58
17/05/2013             973224     101          14     34     72
18/05/2013             973325     100           9     23     81
19/05/2013             973425      97           1     10     82
20/05/2013             973522      98           4      5     83
21/05/2013             973620     110          23     27     90
22/05/2013             973730     102          17     40     88
23/05/2013             973832     120          28     45     96
24/05/2013             973952     105          19     47    101
25/05/2013             974057      99           9     28    101
26/05/2013             974156      97           1     10    101
27/05/2013             974253     114          23     24    120
28/05/2013             974367     110          22     45    119
29/05/2013             974477     104          17     39    119
30/05/2013             974581     135          37     54    128
31/05/2013             974716     133          34     71    143
01/06/2013             974849     108          24     58    158
02/06/2013             974957     100           3     27    160
03/06/2013             975057     122          27     30    164
04/06/2013             975179     137          40     67    182
05/06/2013             975316     114          26     66    191
06/06/2013             975430     104          19     45    173

If we are interested in open hours only :

-- by day, past 30 days, hours between 06:00 and 19:00
select trunc(hour_D) "day", ceil(sum(Arclogs_MB/1024)) Arclogs_GB from
(
select trunc(completion_time,'HH24') hour_D, ceil(sum(blocks*block_size)/1024/1024) Arclogs_MB from v$archived_log
where completion_time > SYSDATE-30
and to_char(completion_time, 'HH24:MI') between '06:00' and '19:00'
group by trunc(completion_time,'HH24') order by 1 desc
) group by trunc(hour_D) order by 1 desc

day         ARCLOGS_GB
----------  ----------
20/05/2015          20
19/05/2015           8
18/05/2015          11
17/05/2015           6
16/05/2015           7
15/05/2015          15
14/05/2015           7
13/05/2015          12
12/05/2015          10
11/05/2015          10
10/05/2015           5
09/05/2015           6
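
As a sanity check, the blocks * block_size arithmetic used in all the queries above can be reproduced by hand ; a minimal sketch with made-up figures ( 204800 blocks of 512 bytes, a hypothetical archivelog ) :

```shell
#!/bin/sh
# Hypothetical archivelog: 204800 blocks of 512 bytes each (made-up figures)
blocks=204800
block_size=512
# Same formula as the queries: blocks * block_size / 1024 / 1024 -> MB
echo $(( blocks * block_size / 1024 / 1024 ))
```

204800 * 512 bytes is 100 MB ; the queries simply sum this product per day or per hour over V$ARCHIVED_LOG.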

1.2 Noarchivelog mode

This is an estimated volume, computed from the number of log switches and the redolog file max size.

-- by day , past 30 days
with rdo_sz_GB as ( select max(bytes/1024/1024/1024) sz from v$log )
select "day" , num_log_begin_day , nb_log , ceil(nb_log*sz) Logs_GB
from ( select trunc(first_time) "day", min(sequence#) num_log_begin_day, count(1) nb_log
from v$log_history
where first_time > SYSDATE-30
group by trunc(first_time) ) , rdo_sz_GB
order by 1 desc

-- by hour , past 2 days
select hour_D , num_log_begin_day , nb_arc , nb_arc*(select max(bytes/1024/1024/1024) from v$log ) Logs_GB
from (
select trunc(first_time,'HH24') hour_D, min(sequence#) num_log_begin_day, count(1) nb_arc
from v$log_history
where first_time > SYSDATE-2  group by trunc(first_time,'HH24') order by 1 desc )
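
The estimate boils down to ( number of log switches ) x ( redolog max size ) ; a sketch with hypothetical figures ( 102 switches in a day, 512 MB redolog members ) :

```shell
#!/bin/sh
# Hypothetical day in noarchivelog mode: 102 log switches, 512 MB redologs
nb_log=102
redo_mb=512
# ceil( nb_log * redo_mb / 1024 ) -> GB rounded up, like ceil() in the query
echo $(( (nb_log * redo_mb + 1023) / 1024 ))
```

Each switch writes at most one full redolog, so the real volume can only be lower ; that is why this figure is an upper-bound estimate, not a measurement.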


2] Log Miner

Log Miner is a powerful tool that has helped me many times ; it allows the DBA to fully understand what is inside an archived log and dig out the logged information : by whom and when rows were deleted, what the complete INSERT statements were for rows accidentally removed, and so on.

Goal    : See Redo Entries
Version : 10gR2
Package : dbms_logmnr
Views   : V$LOGMNR_CONTENTS
Log as  : SYS


2.1 Using Online Dictionary

Be careful when mining on a Production database ( it is very resource-consuming ).

2.1.1 Access a few archivelogs

Start the Log Miner session as the SYS user.


-- add the log file to the log miner session
exec dbms_logmnr.add_logfile (LOGFILENAME => '/<log_file_path_name_i>.arc', OPTIONS => DBMS_LOGMNR.NEW) ;

EXEC DBMS_LOGMNR.ADD_LOGFILE(logfilename => '/<log_file_path_name_i+1>.arc', options => DBMS_LOGMNR.ADDFILE);

exec dbms_logmnr.start_logmnr(options => dbms_logmnr.dict_from_online_catalog) ;


Query the archivelog content.


select count(operation),operation,seg_owner,seg_name FROM V$LOGMNR_CONTENTS group by operation,seg_owner,seg_name order by 1 desc ;

select substr(sql_redo,1,54) , count(1) FROM V$LOGMNR_CONTENTS where seg_name = 'TAB01' group by substr(sql_redo,1,54) order by 2 desc ;

select SEG_OWNER, SEG_NAME , OPERATION from v$logmnr_contents where operation is null and rownum < 10000 ;


col SEG_NAME for a30
col SEG_OWNER for a15
col OPERATION for a15
select distinct SEG_OWNER, SEG_NAME , OPERATION, COUNT(*) from v$logmnr_contents
group by SEG_OWNER, SEG_NAME , OPERATION order by 4 desc ;


End Log Miner session.


exec dbms_logmnr.end_logmnr() ;


2.1.2 Access many archivelogs using the 10g "Continuous Mine" option

Note : even on the same database, "Continuous Mine" may not be possible :
for instance, when the references to the needed archivelogs have disappeared from the controlfile ( recycling ), or when the archivelogs were removed from disk :


ALTER SESSION SET NLS_DATE_FORMAT = 'DD-MON-YYYY HH24:MI:SS';
EXECUTE DBMS_LOGMNR.START_LOGMNR( -
   STARTTIME => '03-Dec-2012 08:00:00', -
   ENDTIME => '03-Dec-2012 12:00:00', -
   OPTIONS => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG + -
   DBMS_LOGMNR.CONTINUOUS_MINE);

ERROR at line 1:
ORA-01292: no log file has been specified for the current LogMiner session
ORA-06512: at "SYS.DBMS_LOGMNR", line 58
ORA-06512: at line 1

Working examples :


ALTER SESSION SET NLS_DATE_FORMAT = 'DD-MON-YYYY HH24:MI:SS';
EXECUTE DBMS_LOGMNR.START_LOGMNR( -
   STARTTIME => '04-Dec-2012 09:45:00', -
   ENDTIME => '04-Dec-2012 10:00:00', -
   OPTIONS => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG + -
   DBMS_LOGMNR.CONTINUOUS_MINE);

select TIMESTAMP, USERNAME, SEG_OWNER, SEG_NAME , OPERATION from v$logmnr_contents where seg_owner = 'APP' and seg_name = 'CITY'

TIMESTAMP            USERNAME  SEG_OWNER  SEG_NAME  OPERATION
-------------------- --------  ---------  --------  -----------
04-DEC-2012 09:57:11 UNKNOWN   APP        CITY      UNSUPPORTED
04-DEC-2012 09:57:13 UNKNOWN   APP        CITY      UPDATE
04-DEC-2012 09:57:37 UNKNOWN   APP        CITY      INSERT
04-DEC-2012 09:57:40 UNKNOWN   APP        CITY      UNSUPPORTED
04-DEC-2012 09:57:41 UNKNOWN   APP        CITY      DELETE


UNSUPPORTED = the change was caused by an operation not currently supported by LogMiner

EXEC DBMS_LOGMNR.START_LOGMNR(StartTime => To_Date('07/04/2013 13:10:00', 'MM/DD/YYYY HH24:MI:SS'), EndTime => To_Date('07/04/2013 14:45:00', 'MM/DD/YYYY HH24:MI:SS'), OPTIONS => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG+DBMS_LOGMNR.CONTINUOUS_MINE);

select username, operation, SQL_REDO FROM V$LOGMNR_CONTENTS where username = '<SOME_NAME>' -- AND SEG_OWNER = 'SRS' AND SEG_NAME = 'CONTRAT';

EXEC DBMS_LOGMNR.END_LOGMNR ;

2.2 Mining on a remote Database

It is wise to move the load of analyzing archivelogs to another server ( pre-production, for example ).
  • The remote machine should be on the same hardware platform
  • The database should have the same Oracle version and blocksize.

2.2.1 The Dictionary must first be extracted from the Production database
Pre-requisite : the "utl_file_dir" parameter must point to the extraction directory.
Connect as SYS. Note : it works even in 10g Express Edition.


$ sqlplus '/ as sysdba'
SQL*Plus: Release 10.2.0.1.0 - Production on Mon Sep 26 09:59:33 2011
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
Connected to:
Oracle Database 10g Express Edition Release 10.2.0.1.0 - Production

SQL> sho parameter utl
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
create_stored_outlines               string
utl_file_dir                         string      /tmp
SQL>

-- Extract Dictionary in a flat file

SQL> EXECUTE DBMS_LOGMNR_D.BUILD('<base>_dict.ora', '/tmp', DBMS_LOGMNR_D.STORE_IN_FLAT_FILE);

PL/SQL procedure successfully completed.



$ ls -l /tmp/<base>_dict.ora
-rw-r--r--    1 oracle   dba      16447504 sep 26 10:45 /tmp/<base>_dict.ora


Copy Dictionary file and Archivelogs to the remote server.

2.2.2 Mining a few archivelogs

We are searching for COMMITTED data only.


-- Add archivelogs to log miner session
SQL> EXECUTE DBMS_LOGMNR.ADD_LOGFILE(logfilename => '/tmp/arc1.log', options => DBMS_LOGMNR.NEW);
SQL> EXECUTE DBMS_LOGMNR.ADD_LOGFILE(logfilename => '/tmp/arc2.log', options => DBMS_LOGMNR.ADDFILE);
SQL> EXECUTE DBMS_LOGMNR.ADD_LOGFILE(logfilename => '/tmp/arc3.log', options => DBMS_LOGMNR.ADDFILE);

-- Start Log Miner
SQL> EXECUTE DBMS_LOGMNR.START_LOGMNR(DICTFILENAME =>'/tmp/<base>_dict.ora', -
                    StartTime => To_Date('09/07/2006 07:40:52', 'MM/DD/YYYY HH24:MI:SS'), -
                    EndTime => To_Date('09/07/2006 07:43:13', 'MM/DD/YYYY HH24:MI:SS'), -
                    OPTIONS => DBMS_LOGMNR.COMMITTED_DATA_ONLY );

-- Seek for DML
SQL> select username, operation, SQL_REDO FROM V$LOGMNR_CONTENTS where SEG_NAME = 'WK_SUNOPSIS_LAST_SESSION';

update "PIVOT_ENV"."WK_SUNOPSIS_LAST_SESSION" set "SESSION_ID" = '35960', "START
_DATE" = TO_DATE('07-SEP-06', 'DD-MON-RR'), "END_DATE" = NULL where "SESSION_ID"
 = '33982' and "START_DATE" = TO_DATE('31-AUG-06', 'DD-MON-RR') and "END_DATE" =
 TO_DATE('31-AUG-06', 'DD-MON-RR') and ROWID = 'AAACuSAAUAAAAO4AAF';

-- Close LogMiner session
SQL> EXECUTE Dbms_Logmnr.end_logmnr;


2.2.3 Mining a lot of archivelogs using a script

Argument 1 is the first archivelog sequence number ( DBMS_LOGMNR.NEW ) ; arguments 2 and 3 bound the DBMS_LOGMNR.ADDFILE loop [ Arg_2 = Arg_1 + 1 ].
A spool file named '<base>_<sequence_arg_1>-<sequence_arg_3>.lst' is created with the SQL_REDO statements.

$ cat mine_arc_<base>.sql
set serveroutput on
set pages 5000 lines 600
col sql_redo for a600

-- get first and last sequence number to analyze
define seq1=&1
define seq2=&2
define seq3=&3
-- spool
COLUMN timecol new_value spoolfile NOPRINT
SELECT '<base>_&seq1-&seq3..lst' AS timecol FROM sys.dual;

-- a/ first arc to mine
EXECUTE DBMS_LOGMNR.ADD_LOGFILE(logfilename => '/PATH/arch_appli/1_&seq1'||'_619093165.arch', options => DBMS_LOGMNR.NEW);

-- b/ loop other arcs
begin
--  for i in 55186..55361 loop
  for i in &seq2...&seq3 loop
        dbms_output.put_line('Adding Logfile = ' || i);
        DBMS_LOGMNR.ADD_LOGFILE(logfilename => '/PATH/arch_appli/1_'||i||'_619093165.arch', options => DBMS_LOGMNR.ADDFILE);
  end loop;
end;
/

-- c/ start session & query redo entries
spool &spoolfile
EXECUTE DBMS_LOGMNR.START_LOGMNR(DICTFILENAME =>'/tmp/<base>_dict.ora', OPTIONS => DBMS_LOGMNR.COMMITTED_DATA_ONLY);
select SQL_REDO from v$logmnr_contents
where seg_owner = 'APPLICATION_USRV5' and seg_name = 'ABONNES_CHEQUIER' and OPERATION = 'INSERT' ;

-- d/ end session
EXECUTE Dbms_Logmnr.end_logmnr;
spool off
exit
$


Another example :

set serveroutput on
set pages 5000 lines 600
col sql_redo for a600

-- get first and last sequence number to analyze
define seq1=&1
-- seq2 = seq1 + 1
define seq2=&2
define seq3=&3
define dbase=&4
-- spool
COLUMN timecol new_value spoolfile NOPRINT
SELECT '&dbase-&seq1-&seq3..lst' AS timecol FROM sys.dual;

-- a/ first arc to mine
EXECUTE DBMS_LOGMNR.ADD_LOGFILE(logfilename => '/arch/<base>/LOGMNR/A&seq1'||'1606572332.arch', options => DBMS_LOGMNR.NEW);

-- b/ loop other arcs
begin
--  for i in 55186..55361 loop
  for i in &seq2...&seq3 loop
        dbms_output.put_line('Adding Logfile = ' || i);
        DBMS_LOGMNR.ADD_LOGFILE(logfilename => '/arch/<base>/LOGMNR/A'||i||'1606572332.arch', options => DBMS_LOGMNR.ADDFILE);
  end loop;
end;
/

-- c/ start session & query redo entries
spool &spoolfile
EXECUTE DBMS_LOGMNR.START_LOGMNR(OPTIONS => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG + DBMS_LOGMNR.COMMITTED_DATA_ONLY);
--EXECUTE DBMS_LOGMNR.START_LOGMNR(DICTFILENAME =>'/tmp/&dbase_dict.ora', OPTIONS => DBMS_LOGMNR.COMMITTED_DATA_ONLY);

select SQL_REDO from v$logmnr_contents
where seg_owner = 'ITS' and seg_name = 'CARTOUCHE' and OPERATION = 'UPDATE' ;

-- d/ end session
EXECUTE Dbms_Logmnr.end_logmnr;
spool off
exit


I wrote a shell script because mining more than 20 archivelogs at a time crashed my instance ( XE ).
The ksh script takes the overall first and last archivelog sequence numbers as arguments and loops through them, calling mine_arc_<base>.sql for a batch of archivelogs whose size is given by the third argument ( step = 20, for example ).


[oracle@<host> arch_appli]$ cat mine_arc_<base>.sh
#!/bin/ksh
# loop on archivelogs # step at a time

sequence_1=$1
sequence_2=$2
step=$3

i=$sequence_1
(( j = i + step ))

while (( i <= sequence_2 )) ; do
if (( j > sequence_2 )) ; then (( endp = sequence_2 )) ; else (( endp = j )) ; fi
(( k = i + 1 ))
if (( k > endp )) ; then (( k = endp )) ; fi
print "\nMining sequence log number $i , $k to $endp"
sqlplus '/ as sysdba' @./mine_arc_<base>.sql $i $k $endp
(( i = j + 1 ))
(( j = i + step ))
done


exit 0
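
The batch arithmetic of that loop can be checked on its own ; this sketch reproduces the same logic in plain sh, printing the ranges instead of calling sqlplus, with the arguments of the execution example ( 54996 55006 4 ) hard-coded :

```shell
#!/bin/sh
# Same batching logic as mine_arc_<base>.sh, arguments hard-coded for the demo
sequence_1=54996 ; sequence_2=55006 ; step=4
i=$sequence_1
j=$(( i + step ))
while [ "$i" -le "$sequence_2" ] ; do
  # endp = end of this batch, capped at the overall last sequence
  if [ "$j" -gt "$sequence_2" ] ; then endp=$sequence_2 ; else endp=$j ; fi
  # k = second sequence of the batch (first ADDFILE), never past endp
  k=$(( i + 1 ))
  if [ "$k" -gt "$endp" ] ; then k=$endp ; fi
  echo "Mining sequence log number $i , $k to $endp"
  i=$(( j + 1 ))
  j=$(( i + step ))
done
```

With these figures it prints three batches ( 54996 + 54997-55000, 55001 + 55002-55005, then 55006 alone ), matching the execution output shown in this section.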


Execution
I was looking for the INSERT statements, to rebuild deleted rows that were no longer in the UNDO tablespace.


[oracle@<host> arch_appli]$ ./mine_arc_<base>.sh 54996 55006 4

Mining sequence log number 54996 , 54997 to 55000

Connected to:
Oracle Database 10g Express Edition Release 10.2.0.1.0 - Production

old   1: SELECT '<base>_&seq1-&seq3..lst' AS timecol FROM sys.dual
new   1: SELECT '<base>_54996-55000.lst' AS timecol FROM sys.dual

old   3: dbms_output.put_line('Adding First Logfile = &seq1');
new   3: dbms_output.put_line('Adding First Logfile = 54996');
old   4: DBMS_LOGMNR.ADD_LOGFILE(logfilename => '/PATH/arch_appli/1_&seq1'||'_619093165.arch', options => DBMS_LOGMNR.NEW);
new   4: DBMS_LOGMNR.ADD_LOGFILE(logfilename => '/PATH/arch_appli/1_54996'||'_619093165.arch', options => DBMS_LOGMNR.NEW);
old   8:   for i in &seq2...&seq3 loop
new   8:   for i in 54997..55000 loop

Adding First Logfile = 54996
Adding Logfile = 54997
Adding Logfile = 54998
Adding Logfile = 54999
Adding Logfile = 55000

PL/SQL procedure successfully completed.

PL/SQL procedure successfully completed.

no rows selected

PL/SQL procedure successfully completed.

Disconnected from Oracle Database 10g Express Edition Release 10.2.0.1.0 - Production

Mining sequence log number 55001 , 55002 to 55005

insert into "APPLICATION_USRV5"."ABONNES_CHEQUIER"("IDABONNE","CIVILITE","NOM","PRENOM","DATENAISSANCE","ADRESSENOM","ADRESSEPRENOM","ADRESSERUE","ADRESSECODEPOSTAL","ADRESSECITY","TELPORTABLE","MAIL","IDCLASSE","IDETABLISSEMENT","IDCITY","IDSTATUT","DATESTATUT","COMMENTAIRESTATUT","ADRESSESUITE") values ('6264','Mademoiselle','XXXXXXXXXX',…,'2','134','106','1',TO_DATE('06-SEP-11', 'DD-MON-RR'),NULL,NULL);
insert into "APPLICATION_USRV5"."ABONNES_CHEQUIER"("IDABONNE","CIVILITE","NOM","PRENOM","DATENAISSANCE","ADRESSENOM","ADRESSEPRENOM","ADRESSERUE","ADRESSECODEPOSTAL","ADRESSECITY","TELPORTABLE","MAIL","IDCLASSE","IDETABLISSEMENT","IDCITY","IDSTATUT","DATESTATUT","COMMENTAIRESTATUT","ADRESSESUITE") values ('6297','Mademoiselle',…,'2','70','80','1',TO_DATE('06-SEP-11', 'DD-MON-RR'),NULL,'DOMAINE DU GRAND PIN');

PL/SQL procedure successfully completed.

Disconnected from Oracle Database 10g Express Edition Release 10.2.0.1.0 - Production
[oracle@<host> arch_appli]$



Other query :

select SQL_REDO from v$logmnr_contents
where seg_owner = 'APPLICATION_USRV5' and seg_name = 'ABONNES_CHEQUIER' and OPERATION = 'DELETE'
and (
sql_redo like '%"IDABONNE" = ''2''%'
or sql_redo like '%"IDABONNE" = ''3''%'
or sql_redo like '%"IDABONNE" = ''4''%'
or sql_redo like '%"IDABONNE" = ''5''%'
or sql_redo like '%"IDABONNE" = ''6''%'
or sql_redo like '%"IDABONNE" = ''7''%'
or sql_redo like '%"IDABONNE" = ''8''%' )


3] Management

To increase the size of the redologs and/or move them to another location, nothing explains it better than the session log below :


$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Tue Apr 16 10:50:16 2013
Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> set pages 5000 lines 150
SQL> col MEMBER for a30

SQL> select * from v$logfile ;
    GROUP# STATUS  TYPE    MEMBER                         IS_
---------- ------- ------- ------------------------------ ---
         3         ONLINE  /data/<base>/log_03_1.rdo       NO
         2         ONLINE  /data/<base>/log_02_1.rdo       NO
         1         ONLINE  /data/<base>/log_01_1.rdo       NO

SQL> alter database drop logfile group 1 ;
Database altered.

SQL> ALTER DATABASE ADD LOGFILE GROUP 1 ('/rdo1/<base>/log_01_1.rdo','/rdo2/<base>/log_01_2.rdo') size 512M ;
Database altered.

SQL> alter database drop logfile group 3 ;
Database altered.

SQL> select * from v$logfile ;

    GROUP# STATUS  TYPE    MEMBER                         IS_
---------- ------- ------- ------------------------------ ---
         2         ONLINE  /data/<base>/log_02_1.rdo       NO
         1         ONLINE  /rdo1/<base>/log_01_1.rdo       NO
         4         ONLINE  /rdo1/<base>/log_04_1.rdo       NO
         4         ONLINE  /rdo2/<base>/log_04_2.rdo       NO
         1         ONLINE  /rdo2/<base>/log_01_2.rdo       NO

SQL> alter system switch logfile ;
System altered.

SQL> alter system checkpoint ;
System altered.

SQL> alter database drop logfile group 2 ;
Database altered.

SQL> ALTER DATABASE ADD LOGFILE GROUP 2 ('/rdo1/<base>/log_02_1.rdo','/rdo2/<base>/log_02_2.rdo') size 512M ;
Database altered.

SQL> ALTER DATABASE ADD LOGFILE GROUP 3  ('/rdo1/<base>/log_03_1.rdo','/rdo2/<base>/log_03_2.rdo') size 512M ;
Database altered.

SQL> !ls -l /data/<base>/log_*.rdo
-rw-r-----    1 oracle   dba       268435968 Apr 16 04:40 /data/<base>/log_01_1.rdo
-rw-r-----    1 oracle   dba       268435968 Apr 16 10:52 /data/<base>/log_02_1.rdo
-rw-r-----    1 oracle   dba       268435968 Apr 16 04:38 /data/<base>/log_03_1.rdo

SQL> !rm /data/<base>/log_*.rdo

SQL> select * from v$logfile order by 1,4 ;

    GROUP# STATUS  TYPE    MEMBER                         IS_
---------- ------- ------- ------------------------------ ---
         1         ONLINE  /rdo1/<base>/log_01_1.rdo       NO
         1         ONLINE  /rdo2/<base>/log_01_2.rdo       NO
         2         ONLINE  /rdo1/<base>/log_02_1.rdo       NO
         2         ONLINE  /rdo2/<base>/log_02_2.rdo       NO
         3         ONLINE  /rdo1/<base>/log_03_1.rdo       NO
         3         ONLINE  /rdo2/<base>/log_03_2.rdo       NO

6 rows selected.



4] Monitor excessive archive log generation to prevent ORA-16014

ARCH: Archival stopped, error occurred. Will continue retrying
ORA-16014: log 1 sequence# not archived, no available destinations

I wrote this query to detect an abnormal generation of archivelogs, based on the figures observed over the past 30 days :
  • We get the maximum volume generated per day up to the current hour
    ( past 30 days before the current day, between 00:00 and the current hour ).
  • We compare this maximum to the volume generated today.
  • Then DECODE compares the two numbers :
    Current_Arclog <= Max_Arclogs => 'OK'
    Current_Arclog >  Max_Arclogs => 'KO'

WITH
-- past 30 days to get the Max, between 00:00 and current hour
GLPD AS (
select trunc(hour_D) DAY_G, ceil(sum(Arclogs_MB/1024)) Arclogs_GB from
(
select trunc(completion_time,'HH24') hour_D, ceil(sum(blocks*block_size)/1024/1024) Arclogs_MB from v$archived_log
where completion_time > SYSDATE-30
and to_char(completion_time, 'HH24:MI') between '00:00' and '12:00'
-- may have to exclude days with issues
and to_char(completion_time, 'DD/MM/YYYY') not in ('15/05/2015','20/05/2015','22/05/2015','23/05/2015','26/05/2015','27/05/2015','28/05/2015','29/05/2015','30/05/2015','01/06/2015')
group by trunc(completion_time,'HH24') order by 1 desc
) group by trunc(hour_D) order by 1 desc
),
MGPLD AS (
-- past 30 days before current day , get max generated
select max(Arclogs_GB) MA from GLPD where DAY_G != trunc(sysdate) ),
CGPLD AS (
-- current day, get sum
select Arclogs_GB CA from GLPD where DAY_G = trunc(sysdate) )
select MA Max_GB_30Days , CA Current_GB from MGPLD , CGPLD
;

Max_GB_30Days  Current_GB
            9          10
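
The DECODE(SIGN(...)) construct is nothing more than this comparison ; a sketch with the figures of the sample output ( 9 GB maximum over 30 days, 10 GB already generated today ) :

```shell
#!/bin/sh
# Figures from the sample output above
max_gb_30days=9
current_gb=10
# Current <= Max -> OK ; Current > Max -> KO (same logic as the DECODE)
if [ "$current_gb" -gt "$max_gb_30days" ] ; then echo "KO" ; else echo "OK" ; fi
```

Today's 10 GB exceed the 30-day maximum of 9 GB, hence 'KO', which is also what the PL/SQL run in this section reports.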

Full PL/SQL script with a current_hour variable :

set serveroutput on

declare
RES varchar2(8) ;
cur_hour varchar2(16) ;

begin

select to_char(trunc(sysdate,'HH24'), 'HH24:MI') into cur_hour from dual ;

-- oracle compare two numbers with decode
-- Current_Arclog <= Max_Arclogs => 'OK'
-- Current_Arclog > Max_Arclogs => 'KO'

select decode(sign(Max_GB_30Days-Current_GB),-1,'KO','OK') into RES from (
WITH
-- past 30 days to get the Max, between 00:00 and current hour
GLPD AS (
select trunc(hour_D) DAY_G, ceil(sum(Arclogs_MB/1024)) Arclogs_GB from
(
select trunc(completion_time,'HH24') hour_D, ceil(sum(blocks*block_size)/1024/1024) Arclogs_MB from v$archived_log
where completion_time > SYSDATE-30
and to_char(completion_time, 'HH24:MI') between '00:00' and cur_hour
-- may have to exclude days with issues
and to_char(completion_time, 'DD/MM/YYYY') not in ('15/05/2015','20/05/2015','22/05/2015','23/05/2015','26/05/2015','27/05/2015','28/05/2015','29/05/2015','30/05/2015','01/06/2015')
group by trunc(completion_time,'HH24') order by 1 desc
) group by trunc(hour_D) order by 1 desc
),
MGPLD AS (
-- past 30 days before current day , get max generated
select max(Arclogs_GB) MA from GLPD where DAY_G != trunc(sysdate) ),
CGPLD AS (
-- current day, get sum
select Arclogs_GB CA from GLPD where DAY_G = trunc(sysdate) )
select MA Max_GB_30Days , CA Current_GB from MGPLD , CGPLD
) ;

DBMS_OUTPUT.PUT_LINE('RES Monitor Redo Generation = '||RES);

end;
/

RES Monitor Redo Generation = KO
PL/SQL procedure successfully completed.

And in a Function to be plugged into GRID’s User Defined Metrics :

-- To be plugged into User Defined Metric in GRID
-- see "Using Invoker's Rights or Definer's Rights (AUTHID Clause)". The default is DEFINER
-- A unit whose AUTHID value is CURRENT_USER is called an invoker's rights unit
-- grant select on SYS.v_$archived_log to SYSTEM
create or replace function SCH.ORA_ARC_LOG_GEN return varchar
AUTHID CURRENT_USER
AS
RES varchar2(4);
cur_hour varchar2(16) ;

begin

select to_char(trunc(sysdate,'HH24'), 'HH24:MI') into cur_hour from dual ;

-- oracle compare two numbers with decode
-- Current_Arclog <= Max_Arclogs => 'OK' return 0
-- Current_Arclog > Max_Arclogs =>  'KO' return 1 ( Warning )

select decode(sign(Max_GB_30Days-Current_GB),-1,'1','0') into RES from (
WITH
-- past 30 days to get the Max, between 00:00 and current hour
GLPD AS (
select trunc(hour_D) DAY_G, ceil(sum(Arclogs_MB/1024)) Arclogs_GB from
(
select trunc(completion_time,'HH24') hour_D, ceil(sum(blocks*block_size)/1024/1024) Arclogs_MB from v$archived_log
where completion_time > SYSDATE-30
and to_char(completion_time, 'HH24:MI') between '00:00' and cur_hour
-- may have to exclude days with issues
and to_char(completion_time, 'DD/MM/YYYY') not in ('15/05/2015','20/05/2015','22/05/2015','23/05/2015','26/05/2015','27/05/2015','28/05/2015','29/05/2015','30/05/2015','01/06/2015')
group by trunc(completion_time,'HH24') order by 1 desc
) group by trunc(hour_D) order by 1 desc
),
MGPLD AS (
-- past 30 days before current day , get max generated
select max(Arclogs_GB) MA from GLPD where DAY_G != trunc(sysdate) ),
CGPLD AS (
-- current day, get sum
select Arclogs_GB CA from GLPD where DAY_G = trunc(sysdate) )
select MA Max_GB_30Days , CA Current_GB from MGPLD , CGPLD
) ;

return RES ;

end;
/

Call it : select SCH.ORA_ARC_LOG_GEN from dual ;


5] Archivelogs space over-filling

Goal    : Avoid the instance hanging when the archivelog space fills up to 100 %
Version : 10gR2
Views   : V$ARCHIVED_LOG , V_$TRANSACTION , V_$SESS_IO
Log as  : SYS

  • First, monitor the archivelogs generation with
post [ Administration / Redo Logs / 4] Monitor excessive archive log generation to prevent ORA-16014 ]

  • Then, act : kill the guilty session

When the archivelog space occupation threshold is reached, kill the session responsible for the archivelogs over-generation.
We assume that the following SQL statements are triggered only in case of incident
( the archivelog space is correctly sized to retain at least 2 days of logs ).
See "How to Find Sessions Generating Lots of Redo or Archive logs" [ID 167492.1]

5.1 Pre-requisite

  • In addition to the grants already seen in the Services management post :

GRANT SELECT ON  SYS.V_$TRANSACTION  TO SYSTEM;
GRANT SELECT ON  SYS.V_$SESS_IO TO SYSTEM;
GRANT EXECUTE ON SYS.DBMS_SYSTEM TO SYSTEM;
GRANT EXECUTE ON SYS.DBMS_SERVICE TO SYSTEM;

  • 2 queries

-- 0/ Long running transaction
-- find top UNDO Consuming & Active session
select * from ( 
SELECT s.sid, s.serial#, s.username, s.program, ROUND(t.used_ublk*(SELECT VALUE FROM v$parameter WHERE NAME = 'db_block_size')/1024/1024,2) "UNDO Used Size (MB)"
FROM v$session s, v$transaction t
WHERE s.taddr = t.addr
ORDER BY 5 desc, 1, 2, 3, 4
) where rownum < 2 ;

-- 1/ Program with many Transactions ( intermediate Commit )
-- top Block Change session

-- ignore background processes
select * from (
SELECT s.sid, s.serial#, s.username, s.program, i.block_changes
-- very frequent commit, then is not seen in v$transaction ( only active txn )
FROM v$session s, v$sess_io i --, v$transaction t
WHERE s.sid = i.sid -- and s.taddr = t.addr
and service_name != 'SYS$BACKGROUND'
ORDER BY 5 desc, 1, 2, 3, 4
) where rownum < 2 ;

-- we keep the second query because it covers both profiles

  • To monitor :

select 'alter system kill session '''||SID||','||SERIAL#||'''' cmd , username , program , block_changes from (
SELECT s.sid, s.serial#, s.username, s.program, i.block_changes
-- very frequent commit, then is not seen in v$transaction ( only active txn )
FROM v$session s, v$sess_io i --, v$transaction t
WHERE s.sid = i.sid -- and s.taddr = t.addr
and service_name != 'SYS$BACKGROUND'
ORDER BY 5 desc, 1, 2, 3, 4
) where rownum < 2 ;

 SID  SERIAL#  USERNAME  PROGRAM                      BLOCK_CHANGES
----  -------  --------  ---------------------------  -------------
2172    18579  USER1     JDBC Thin Client                  30540017
1738    17990  USER1     JDBC Thin Client                  29152077
1756    31732  USER1     JDBC Thin Client                  27769616
1646    20165  USER1     JDBC Thin Client                  26376994
2082    11138  USER2     RTSDGN@sr-app-9 (TNS V1-V3)        6530220
2541     6858  USER3     mgARES02.exe                        367752

5.2 Procedure

Add the procedure to the package :

CREATE OR REPLACE package app_admin
as
-- 11/06/12 add kill_long_running_session
        procedure kill_long_running_session;
end;
/


procedure kill_long_running_session
is
-- kill the session generating the most redo entries
-- the procedure may be called by Nagios when reaching threshold WARNING and CRITICAL
-- usage : execute system.app_admin.kill_long_running_session;

sql_stmt varchar2(256);
cn       integer;
ret      integer;

cursor cur is
select 'alter system kill session '''||SID||','||SERIAL#||'''' cmd , username , program , block_changes from (
SELECT s.sid, s.serial#, s.username, s.program, i.block_changes
-- very frequent commit, then is not seen in v$transaction ( only active txn )
FROM v$session s, v$sess_io i --, v$transaction t
WHERE s.sid = i.sid -- and s.taddr = t.addr
and service_name != 'SYS$BACKGROUND'

ORDER BY 5 desc, 1, 2, 3, 4
) where rownum < 2 ;

begin
 dbms_output.put_line('Kill top Block Change session');
 -- This message goes to the alert log
 sys.dbms_system.ksdwrt(2, ' Called Procedure app_admin.kill_long_running_session: Kill top Block Change session');
 for c_tab in cur loop
  begin
    dbms_output.put_line('Kill session username: '||c_tab.username||' '||c_tab.cmd);
    -- This message goes to the alert log
    sys.dbms_system.ksdwrt(2, ' Procedure app_admin.kill_long_running_session: '||c_tab.username||' '||c_tab.program||' block_changes: '||c_tab.block_changes||' '||c_tab.cmd);
    sql_stmt := c_tab.cmd;
    cn := dbms_sql.open_cursor;
    dbms_sql.parse(cn, sql_stmt,dbms_sql.native);
    ret:=dbms_sql.execute(cn);
    dbms_sql.close_cursor(cn);
    exception
       when others then dbms_output.put_line(sqlerrm);
  end;
 end loop;

end kill_long_running_session;


5.3 Test

  • Block Change intensive PL/SQL

conn auc

create table t_intensive ( a int ) ;

DECLARE
  m NUMBER;
BEGIN
  FOR i IN 1..100000 LOOP
    FOR j IN 1..10000 LOOP
      m := sqrt(4567);
      insert into t_intensive values (m) ;
    END LOOP;
    commit ;
  END LOOP;
END;
/

  • KILL


$ sqlplus system

SQL*Plus: Release 10.2.0.5.0 - Production on Mon Jun 11 17:40:50 2012

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.

Enter password:

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> set serveroutput on

SQL> execute system.app_admin.kill_long_running_session ;
Kill top Block Change and Active session
Kill session username: SCH alter system kill session '481,38572'

PL/SQL procedure successfully completed.


BEGIN
*
ERROR at line 1:
ORA-00028: your session has been killed

  • Alert.log : the messages are written to the Oracle alert log by calling sys.dbms_system.ksdwrt


Mon Jun 11 17:41:49 GMT+02:00 2012
Procedure app_admin.kill_long_running_session: SCH sqlplus@sr-dev-1 (TNS V1-V3) block_changes: 1316160 alter system kill session '481,38572'


We may even run an RMAN backup to back up and remove the overflowing archivelogs : see the Kernel post.


6] Log write alignment

To avoid misalignment of redo vector writes ( multiple IOs instead of one ), use the 11g+ syntax when creating logfiles ; on AIX :


alter system set "_disk_sector_size_override" = true ;

ALTER DATABASE ADD LOGFILE GROUP 6 ('/rdo1/<db_name>/log_06_1.rdo','/rdo2/<db_name>/log_06_2.rdo') size 256M BLOCKSIZE 4096 ;



7] Loss or Corruption of a Redolog Member

Case 1. The "/rdo2" filesystem is no longer available ; the members on "/rdo1" still exist.
Mount the instance first, then drop the rdo2 members.

$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.5.0 - Production on Wed Nov 22 15:41:58 2017
Connected to an idle instance.
SQL> startup nomount
ORACLE instance started.

Total System Global Area 1610612736 bytes
Fixed Size                  2096736 bytes
Variable Size            1107296672 bytes
Database Buffers          486539264 bytes
Redo Buffers               14680064 bytes
SQL> !ls -l /rdo2/dbora/
total 720912
drwxr-xr-x  2 oracle dba       256 2010-03-22 15:00 ctl
-rw-r-----  1 oracle dba  67109376 2017-11-22 02:17 log_01_2.rdo
-rw-r-----  1 oracle dba  67109376 2017-11-22 02:04 log_02_2.rdo
-rw-r-----  1 oracle dba  67109376 2017-11-22 02:11 log_03_2.rdo
-rw-r-----  1 oracle dba 536871424 2017-11-22 14:55 log_05_2.rdo

SQL> ALTER DATABASE DROP LOGFILE MEMBER '/rdo2/dbora/log_01_2.rdo';
ALTER DATABASE DROP LOGFILE MEMBER '/rdo2/dbora/log_01_2.rdo'
*
ERROR at line 1:
ORA-01507: database not mounted

SQL> alter database mount ;
Database altered.

SQL> ALTER DATABASE DROP LOGFILE MEMBER '/rdo2/dbora/log_01_2.rdo';
Database altered.

SQL> ALTER DATABASE DROP LOGFILE MEMBER '/rdo2/dbora/log_02_2.rdo';
Database altered.

SQL> ALTER DATABASE DROP LOGFILE MEMBER '/rdo2/dbora/log_03_2.rdo';
Database altered.

SQL> ALTER DATABASE DROP LOGFILE MEMBER '/rdo2/dbora/log_05_2.rdo';
ALTER DATABASE DROP LOGFILE MEMBER '/rdo2/dbora/log_05_2.rdo'
*
ERROR at line 1:
ORA-01609: log 5 is the current log for thread 1 - cannot drop members
ORA-00312: online log 5 thread 1: '/rdo1/dbora/log_05_1.rdo'
ORA-00312: online log 5 thread 1: '/rdo2/dbora/log_05_2.rdo'


SQL> alter system switch logfile ;
alter system switch logfile
*
ERROR at line 1:
ORA-01109: database not open

SQL> alter database rename file '/rdo2/dbora/log_05_2.rdo' to '/rdo1/dbora/log_05_2.rdo';
alter database rename file '/rdo2/dbora/log_05_2.rdo' to '/rdo1/dbora/log_05_2.rdo'
*
ERROR at line 1:
ORA-01511: error in renaming log/data files
ORA-01512: error renaming log file /rdo2/dbora/log_05_2.rdo - new file
/rdo1/dbora/log_05_2.rdo not found
ORA-27037: unable to obtain file status
IBM AIX RISC System/6000 Error: 2: No such file or directory
Additional information: 3

SQL> !cp -p /rdo1/dbora/log_05_1.rdo /rdo1/dbora/log_05_2.rdo

SQL> alter database rename file '/rdo2/dbora/log_05_2.rdo' to '/rdo1/dbora/log_05_2.rdo';

Database altered.

SQL> alter database open ;
Database altered.

SQL>
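Once the database is open, a quick sketch to confirm the surviving members and their state, joining v$log to v$logfile:

```sql
-- One row per remaining member, with group and member status
select l.group#, l.status group_status, f.member, f.status member_status
from   v$log l, v$logfile f
where  l.group# = f.group#
order  by l.group#, f.member;
```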

Case 2. ORA-00341: Corruption, wrong log # 3 in header

ALTER DATABASE OPEN
Errors in file /work/oracle/dbora/diag/rdbms/dbora/DBORA/trace/DBORA_lgwr_9830604.trc:
ORA-00341: log 4 of thread 1, wrong log # 3 in header
ORA-00312: online log 4 thread 1: '/rdo1/dbora/log_03_2.rdo'
Errors in file /work/oracle/dbora/diag/rdbms/dbora/DBORA/trace/DBORA_lgwr_9830604.trc:
ORA-00341: log 4 of thread 1, wrong log # 3 in header
ORA-00312: online log 4 thread 1: '/rdo1/dbora/log_03_2.rdo'
Errors in file /work/oracle/dbora/diag/rdbms/dbora/DBORA/trace/DBORA_ora_8716718.trc:
ORA-00341: log 1 of thread , wrong log #  in header
ORA-00312: online log 4 thread 1: '/rdo1/dbora/log_03_2.rdo'
USER (ospid: 8716718): terminating the instance due to error 341
Wed Nov 22 18:07:29 2017
System state dump requested by (instance=1, osid=8716718), summary=[abnormal instance termination].
System State dumped to trace file /work/oracle/dbora/diag/rdbms/dbora/DBORA/trace/DBORA_diag_2556030.trc
Dumping diagnostic data in directory=[cdmp_20171122180729], requested by (instance=1, osid=8716718), summary=[abnormal instance termination].
Instance terminated by USER, pid = 8716718

$ /rdo1/dbora > rm /rdo1/dbora/log_03_2.rdo
$ /rdo1/dbora > cp -p /rdo1/dbora/log_03_1.rdo /data/dbora/log_03_2.rdo

$ /rdo1/dbora > sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Wed Nov 22 18:09:58 2017

Connected to an idle instance.

SQL> startup mount
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance

ORACLE instance started.

Total System Global Area 2137886720 bytes
Fixed Size                  2223208 bytes
Variable Size            1459618712 bytes
Database Buffers          671088640 bytes
Redo Buffers                4956160 bytes

Database mounted.
SQL> alter database rename file '/rdo2/dbora/log_03_2.rdo' to '/data/dbora/log_03_2.rdo';
alter database rename file '/rdo2/dbora/log_03_2.rdo' to '/data/dbora/log_03_2.rdo'
*
ERROR at line 1:
ORA-01511: error in renaming log/data files
ORA-01516: nonexistent log file, data file, or temporary file
"/rdo2/dbora/log_03_2.rdo"

SQL> !ls -l /data/dbora/log_03_2.rdo
-rw-r-----  1 oracle dba 268435968 2017-11-22 18:07 /data/dbora/log_03_2.rdo

SQL> alter database rename file '/rdo1/dbora/log_03_2.rdo' to '/data/dbora/log_03_2.rdo';
Database altered.

SQL> alter database open ;
alter database open
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 9437560
Session ID: 178 Serial number: 5


SQL> Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

$ /rdo1/dbora > sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Wed Nov 22 18:12:18 2017

Connected to an idle instance.

SQL> startup mount
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.

Total System Global Area 2137886720 bytes
Fixed Size                  2223208 bytes
Variable Size            1459618712 bytes
Database Buffers          671088640 bytes
Redo Buffers                4956160 bytes
Database mounted.
SQL> alter database open resetlogs
  2  ;
alter database open resetlogs
*
ERROR at line 1:
ORA-01139: RESETLOGS option only valid after an incomplete database recovery


SQL> recover database using backup controlfile until cancel ;
ORA-00279: change 40237970573 generated at 11/22/2017 14:55:49 needed for
thread 1
ORA-00289: suggestion : /data/dbora/arc/DBORA1_231_939491616.arc
ORA-00280: change 40237970573 for thread 1 is in sequence #231


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
CANCEL
Media recovery cancelled.
SQL> alter database open resetlogs ;

Database altered.

SQL>
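After OPEN RESETLOGS the database starts a new incarnation and the log sequence resets to 1, so take a fresh full backup as soon as possible. The new incarnation can be checked with (a sketch):

```sql
-- RESETLOGS SCN and time identify the new incarnation
select resetlogs_change#, resetlogs_time, open_mode from v$database;
```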