0] Archivelogs over-generation
How do I manage archivelog over-generation? A quick summary :
1/ Look at the volume generated in the past, to highlight when the issue occurs
In the following [ chapter 1] Volume / 1.1 Archivelog mode ]
- per day, per hour
- per day, specific hours
2/ Run Log Miner on several archivelogs to find out which username and segment are responsible for the redo vectors
In the following chapter 2] Log Miner
3/ Find, in the past, which segment had the most DB_BLOCK_CHANGES_TOTAL
With post [ Performance / AWR / 3.2 Redolog over generation ]
4/ Confirm in real time from current sessions
With post [ Administration / Redologs & Archivelogs / 5] Archivelogs space over-filling : Kill the guilty session ]
5/ Set up an alert on such an event
With post [ Administration / Redologs & Archivelogs / 4] Monitor excessive archive log generation to prevent ORA-16014 ]
1] Volume
Goal    | Redolog Volume Generation
Version | 10gR2
Views   | v$archived_log ( Archivelog ) ; v$log, v$log_history ( Noarchivelog )
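Which of the two computations applies depends on the database log mode ; a quick check first :
select log_mode from v$database ;
-- LOG_MODE is ARCHIVELOG or NOARCHIVELOG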
1.1 Archivelog mode
This is the real volume, given by the BLOCKS column.
-- by day, past 30 days
select trunc(completion_time) "day", min(sequence#) "num_log_begin_day", count(1) "nb_arc",
       ceil(sum(blocks*block_size)/1024/1024/1024) Arclogs_GB
from v$archived_log
where completion_time > SYSDATE-30
group by trunc(completion_time)
order by 1 desc ;
day        | num_log_begin_day | nb_arc | ARCLOGS_GB
28/08/2015 | 1075875           | 188    | 78
27/08/2015 | 1075591           | 284    | 116
26/08/2015 | 1075358           | 233    | 87
25/08/2015 | 1075194           | 164    | 51
24/08/2015 | 1075043           | 151    | 42
23/08/2015 | 1074945           | 98     | 2
22/08/2015 | 1074841           | 104    | 13
21/08/2015 | 1074654           | 187    | 63
20/08/2015 | 1074477           | 177    | 62
19/08/2015 | 1074340           | 137    | 38
18/08/2015 | 1074205           | 135    | 36
17/08/2015 | 1074079           | 126    | 28
-- by hour, past 1 day
select trunc(completion_time,'HH24') hour_D, ceil(sum(blocks*block_size)/1024/1024) Arclogs_MB
from v$archived_log
where completion_time > SYSDATE-1
group by trunc(completion_time,'HH24')
order by 1 desc ;

-- by day, for a specific interval of hours
select trunc(hour_D) DAY_G, ceil(sum(Arclogs_MB/1024)) Arclogs_GB from
(
  select trunc(completion_time,'HH24') hour_D, ceil(sum(blocks*block_size)/1024/1024) Arclogs_MB
  from v$archived_log
  where completion_time > SYSDATE-30
  and to_char(completion_time, 'HH24:MI') between '07:01' and '18:00'
  -- I had to exclude days with issues
  and to_char(completion_time, 'DD/MM/YYYY') not in ('27/08/2015')
  group by trunc(completion_time,'HH24') order by 1 desc
) group by trunc(hour_D) order by 1 desc ;
-- by day, past 30 days, plus the volume if we keep 2 days or 7 days on disk
select "day", num_log_begin_day, nb_arc, Arclogs_GB,
       Arclogs_GB + lag(Arclogs_GB, 1, -1) over(partition by RN order by "day") "2Days",
       sum(Arclogs_GB) over(partition by RN order by "day" rows 6 preceding) "7Days"
from (
  select '1' RN, trunc(completion_time) "day", min(sequence#) num_log_begin_day, count(1) nb_arc,
         ceil(sum(blocks*block_size)/1024/1024/1024) Arclogs_GB
  from v$archived_log
  where completion_time > SYSDATE-30
  group by trunc(completion_time) order by 1 desc ) ;
INSTANCE, as of 12/06/13 : MAX GB = 71 ( 2Days ) , 191 ( 7Days )

day        | NUM_LOG_BEGIN_DAY | NB_ARC | ARCLOGS_GB | 2Days | 7Days
13/05/2013 | 972884            | 25     | 3          | 2     | 3
14/05/2013 | 972909            | 102    | 16         | 19    | 19
15/05/2013 | 973011            | 106    | 19         | 35    | 38
16/05/2013 | 973117            | 107    | 20         | 39    | 58
17/05/2013 | 973224            | 101    | 14         | 34    | 72
18/05/2013 | 973325            | 100    | 9          | 23    | 81
19/05/2013 | 973425            | 97     | 1          | 10    | 82
20/05/2013 | 973522            | 98     | 4          | 5     | 83
21/05/2013 | 973620            | 110    | 23         | 27    | 90
22/05/2013 | 973730            | 102    | 17         | 40    | 88
23/05/2013 | 973832            | 120    | 28         | 45    | 96
24/05/2013 | 973952            | 105    | 19         | 47    | 101
25/05/2013 | 974057            | 99     | 9          | 28    | 101
26/05/2013 | 974156            | 97     | 1          | 10    | 101
27/05/2013 | 974253            | 114    | 23         | 24    | 120
28/05/2013 | 974367            | 110    | 22         | 45    | 119
29/05/2013 | 974477            | 104    | 17         | 39    | 119
30/05/2013 | 974581            | 135    | 37         | 54    | 128
31/05/2013 | 974716            | 133    | 34         | 71    | 143
01/06/2013 | 974849            | 108    | 24         | 58    | 158
02/06/2013 | 974957            | 100    | 3          | 27    | 160
03/06/2013 | 975057            | 122    | 27         | 30    | 164
04/06/2013 | 975179            | 137    | 40         | 67    | 182
05/06/2013 | 975316            | 114    | 26         | 66    | 191
06/06/2013 | 975430            | 104    | 19         | 45    | 173
If we’re interested in open hours only :
-- by day, past 30 days, hours between 06:00 and 19:00
select trunc(hour_D) "day", ceil(sum(Arclogs_MB/1024)) Arclogs_GB from
(
  select trunc(completion_time,'HH24') hour_D, ceil(sum(blocks*block_size)/1024/1024) Arclogs_MB
  from v$archived_log
  where completion_time > SYSDATE-30
  and to_char(completion_time, 'HH24:MI') between '06:00' and '19:00'
  group by trunc(completion_time,'HH24') order by 1 desc
) group by trunc(hour_D) order by 1 desc ;
day        | ARCLOGS_GB
20/05/2015 | 20
19/05/2015 | 8
18/05/2015 | 11
17/05/2015 | 6
16/05/2015 | 7
15/05/2015 | 15
14/05/2015 | 7
13/05/2015 | 12
12/05/2015 | 10
11/05/2015 | 10
10/05/2015 | 5
09/05/2015 | 6
1.2 Noarchivelog mode
This is an estimated volume, based on the maximum redolog file size.
-- by day, past 30 days
with rdo_sz_GB as ( select max(bytes/1024/1024/1024) sz from v$log )
select day, num_log_begin_day, nb_log, ceil(nb_log*sz) Logs_GB
from ( select trunc(first_time) day, min(sequence#) num_log_begin_day, count(1) nb_log
       from v$log_history
       where first_time > SYSDATE-30
       group by trunc(first_time) ) , rdo_sz_GB
order by 1 desc ;
-- by hour, past 2 days
select hour_D, num_log_begin_day, nb_arc, nb_arc*(select max(bytes/1024/1024/1024) from v$log) Logs_GB
from (
  select trunc(first_time,'HH24') hour_D, min(sequence#) num_log_begin_day, count(1) nb_arc
  from v$log_history
  where first_time > SYSDATE-2 group by trunc(first_time,'HH24') order by 1 desc ) ;
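The estimate above takes the largest group size ; if the groups are not all the same size, a quick look at v$log is worthwhile :
select group#, bytes/1024/1024 MB, members, status from v$log order by group# ;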
2] Log Miner
Log Miner is a powerful tool that has helped me many times ; it lets the DBA see exactly what is inside an archived log and recover the logged information : by whom and when were rows deleted ? What were the complete INSERT statements for rows now accidentally removed ? And so on…
Goal    | See Redo Entries
Version | 10gR2
Package | dbms_logmnr
Views   | V$LOGMNR_CONTENTS
Log as  | SYS
2.1 Using Online Dictionary
Be careful when mining on a Production database ( it is very resource-intensive ).
2.1.1 Access a few archivelogs
Start the Log Miner session as the SYS user.
-- add the log files to the Log Miner session
exec dbms_logmnr.add_logfile (LOGFILENAME => '/<log_file_path_name_i>.arc', OPTIONS => DBMS_LOGMNR.NEW) ;
exec dbms_logmnr.add_logfile (LOGFILENAME => '/<log_file_path_name_i+1>.arc', OPTIONS => DBMS_LOGMNR.ADDFILE) ;
exec dbms_logmnr.start_logmnr (OPTIONS => dbms_logmnr.dict_from_online_catalog) ;
Query the archivelog content.
select count(operation), operation, seg_owner, seg_name FROM V$LOGMNR_CONTENTS group by operation, seg_owner, seg_name order by 1 desc ;
select substr(sql_redo,1,54), count(1) FROM V$LOGMNR_CONTENTS where seg_name = 'TAB01' group by substr(sql_redo,1,54) order by 2 desc ;
select SEG_OWNER, SEG_NAME, OPERATION from v$logmnr_contents where operation is null and rownum < 10000 ;
col SEG_NAME for a30
col SEG_OWNER for a15
col OPERATION for a15
select SEG_OWNER, SEG_NAME, OPERATION, COUNT(*) from v$logmnr_contents
group by SEG_OWNER, SEG_NAME, OPERATION order by 4 desc ;
…
End the Log Miner session.
exec dbms_logmnr.end_logmnr() ;
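While the session is still open, v$logmnr_logs can be queried to double-check which files were added and the time range each one covers ( a quick sketch ) :
select filename, low_time, high_time from v$logmnr_logs order by low_scn ;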
2.1.2 Access many archivelogs using the 10g « Continuous Mine » option
Note : even in the same database, « Continuous Mine » may not be possible : for instance, when the references to the needed archivelogs have disappeared from the Controlfile ( recycling ), or when the archivelogs were removed from disk :
ALTER SESSION SET NLS_DATE_FORMAT = 'DD-MON-YYYY HH24:MI:SS';
EXECUTE DBMS_LOGMNR.START_LOGMNR( -
  STARTTIME => '03-Dec-2012 08:00:00', -
  ENDTIME   => '03-Dec-2012 12:00:00', -
  OPTIONS   => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG + -
               DBMS_LOGMNR.CONTINUOUS_MINE);
ERROR at line 1:
ORA-01292: no log file has been specified for the current LogMiner session
ORA-06512: at "SYS.DBMS_LOGMNR", line 58
ORA-06512: at line 1
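Before retrying, a quick sanity check ( a sketch ; adjust the time window ) that the Controlfile still references archivelogs covering the window, and that they are still on disk :
select sequence#, name, deleted, status, completion_time
from v$archived_log
where completion_time between to_date('03-Dec-2012 08:00:00','DD-Mon-YYYY HH24:MI:SS')
                          and to_date('03-Dec-2012 12:00:00','DD-Mon-YYYY HH24:MI:SS')
order by sequence# ;
-- DELETED = 'YES' or a null NAME means the file is gone as far as LogMiner is concerned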
Working examples :
ALTER SESSION SET NLS_DATE_FORMAT = 'DD-MON-YYYY HH24:MI:SS';
EXECUTE DBMS_LOGMNR.START_LOGMNR( -
  STARTTIME => '04-Dec-2012 09:45:00', -
  ENDTIME   => '04-Dec-2012 10:00:00', -
  OPTIONS   => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG + -
               DBMS_LOGMNR.CONTINUOUS_MINE);
select TIMESTAMP, USERNAME, SEG_OWNER, SEG_NAME, OPERATION from
v$logmnr_contents where seg_owner = 'APP' and seg_name = 'CITY' ;

TIMESTAMP            USERNAME  SEG_OWNER  SEG_NAME  OPERATION
-------------------- --------- ---------- --------- -----------
04-DEC-2012 09:57:11 UNKNOWN   APP        CITY      UNSUPPORTED
04-DEC-2012 09:57:13 UNKNOWN   APP        CITY      UPDATE
04-DEC-2012 09:57:37 UNKNOWN   APP        CITY      INSERT
04-DEC-2012 09:57:40 UNKNOWN   APP        CITY      UNSUPPORTED
04-DEC-2012 09:57:41 UNKNOWN   APP        CITY      DELETE

UNSUPPORTED = the change was caused by operations not currently supported by LogMiner

EXEC DBMS_LOGMNR.START_LOGMNR(StartTime => To_Date('07/04/2013 13:10:00', 'MM/DD/YYYY HH24:MI:SS'), EndTime => To_Date('07/04/2013 14:45:00', 'MM/DD/YYYY HH24:MI:SS'), OPTIONS => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG+DBMS_LOGMNR.CONTINUOUS_MINE);
select username, operation, SQL_REDO FROM V$LOGMNR_CONTENTS where username = '<SOME_NAME>' ; -- AND SEG_OWNER = 'SRS' AND SEG_NAME = 'CONTRAT'
EXEC DBMS_LOGMNR.END_LOGMNR ;
2.2 Mining on a remote Database
It is wise to move the load of analyzing archivelogs to another server ( pre-production, for example ).
- The remote machine should be on the same hardware platform
- The database should have the same Oracle version and blocksize.
2.2.1 The Dictionary must first be extracted from the Production database
Pre-requisite : the "utl_file_dir" parameter = extraction directory
Connect as SYS ( note : this works in 10g Express Edition ).
$ sqlplus '/ as sysdba'
SQL*Plus: Release 10.2.0.1.0 - Production on Mon Sep 26 09:59:33 2011
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
Connected to:
Oracle Database 10g Express Edition Release 10.2.0.1.0 - Production
SQL> sho parameter utl
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
create_stored_outlines               string
utl_file_dir                         string      /tmp
SQL>
-- Extract the Dictionary into a flat file
SQL> EXECUTE DBMS_LOGMNR_D.BUILD('<base>_dict.ora', '/tmp', DBMS_LOGMNR_D.STORE_IN_FLAT_FILE);
PL/SQL procedure successfully completed.
$ ls -l /tmp/<base>_dict.ora
-rw-r--r-- 1 oracle dba 16447504 sep 26 10:45 /tmp/<base>_dict.ora
Copy the Dictionary file and the archivelogs to the remote server.
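For instance ( host and paths are placeholders ) :
$ scp -p /tmp/<base>_dict.ora oracle@<remote_host>:/tmp/
$ scp -p /PATH/arch_appli/1_<seq>_619093165.arch oracle@<remote_host>:/tmp/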
2.2.2 Mining a few archivelogs
We are searching for COMMITTED data only.
-- Add archivelogs to the Log Miner session
SQL> EXECUTE DBMS_LOGMNR.ADD_LOGFILE(logfilename => '/tmp/arc1.log', options => DBMS_LOGMNR.NEW);
SQL> EXECUTE DBMS_LOGMNR.ADD_LOGFILE(logfilename => '/tmp/arc2.log', options => DBMS_LOGMNR.ADDFILE);
SQL> EXECUTE DBMS_LOGMNR.ADD_LOGFILE(logfilename => '/tmp/arc3.log', options => DBMS_LOGMNR.ADDFILE);
-- Start Log Miner
SQL> EXECUTE DBMS_LOGMNR.START_LOGMNR(DICTFILENAME =>'/tmp/<base>_dict.ora', -
       StartTime => To_Date('09/07/2006 07:40:52', 'MM/DD/YYYY HH24:MI:SS'), -
       EndTime   => To_Date('09/07/2006 07:43:13', 'MM/DD/YYYY HH24:MI:SS'), -
       OPTIONS   => DBMS_LOGMNR.COMMITTED_DATA_ONLY );
-- Look for DML
SQL> select username, operation, SQL_REDO FROM V$LOGMNR_CONTENTS where SEG_NAME = 'WK_SUNOPSIS_LAST_SESSION';
update "PIVOT_ENV"."WK_SUNOPSIS_LAST_SESSION" set "SESSION_ID" = '35960', "START_DATE" = TO_DATE('07-SEP-06', 'DD-MON-RR'), "END_DATE" = NULL where "SESSION_ID" = '33982' and "START_DATE" = TO_DATE('31-AUG-06', 'DD-MON-RR') and "END_DATE" = TO_DATE('31-AUG-06', 'DD-MON-RR') and ROWID = 'AAACuSAAUAAAAO4AAF';
-- Close the Log Miner session
SQL> EXECUTE Dbms_Logmnr.end_logmnr;
2.2.3 Mining a lot of archivelogs using a script
Archivelog sequence numbers : argument 1 ( DBMS_LOGMNR.NEW ), then arguments 2 & 3 ( DBMS_LOGMNR.ADDFILE ), with Arg_2 = Arg_1 + 1.
A spoolfile named '<base>_<sequence_arg_1>-<sequence_arg_3>.lst' is created with the SQL_REDO statements.
$ cat mine_arc_<base>.sql
set serveroutput on
set pages 5000 lines 600
col sql_redo for a600
-- get first and last sequence number to analyze
define seq1=&1
define seq2=&2
define seq3=&3
-- spool
COLUMN timecol new_value spoolfile NOPRINT
SELECT '<base>_&seq1-&seq3..lst' AS timecol FROM sys.dual;
-- a/ first arc to mine
EXECUTE DBMS_LOGMNR.ADD_LOGFILE(logfilename => '/PATH/arch_appli/1_&seq1'||'_619093165.arch', options => DBMS_LOGMNR.NEW);
-- b/ loop other arcs
begin
  -- for i in 55186..55361 loop
  for i in &seq2...&seq3 loop
    dbms_output.put_line('Adding Logfile = ' || i);
    DBMS_LOGMNR.ADD_LOGFILE(logfilename => '/PATH/arch_appli/1_'||i||'_619093165.arch', options => DBMS_LOGMNR.ADDFILE);
  end loop;
end;
/
-- c/ start session & query redo entries
spool &spoolfile
EXECUTE DBMS_LOGMNR.START_LOGMNR(DICTFILENAME =>'/tmp/<base>_dict.ora', OPTIONS => DBMS_LOGMNR.COMMITTED_DATA_ONLY);
select SQL_REDO from v$logmnr_contents
where seg_owner = 'APPLICATION_USRV5' and seg_name = 'ABONNES_CHEQUIER' and OPERATION = 'INSERT' ;
-- d/ end session
EXECUTE Dbms_Logmnr.end_logmnr;
spool off
exit
$
Another example :
set serveroutput on
set pages 5000 lines 600
col sql_redo for a600
-- get first and last sequence number to analyze
define seq1=&1
define seq2=&2 -- seq1+1
define seq3=&3
define dbase=&4
-- spool
COLUMN timecol new_value spoolfile NOPRINT
SELECT '&dbase-&seq1-&seq3..lst' AS timecol FROM sys.dual;
-- a/ first arc to mine
EXECUTE DBMS_LOGMNR.ADD_LOGFILE(logfilename => '/arch/<base>/LOGMNR/A&seq1'||'1606572332.arch', options => DBMS_LOGMNR.NEW);
-- b/ loop other arcs
begin
  -- for i in 55186..55361 loop
  for i in &seq2...&seq3 loop
    dbms_output.put_line('Adding Logfile = ' || i);
    DBMS_LOGMNR.ADD_LOGFILE(logfilename => '/arch/<base>/LOGMNR/A'||i||'1606572332.arch', options => DBMS_LOGMNR.ADDFILE);
  end loop;
end;
/
-- c/ start session & query redo entries
spool &spoolfile
EXECUTE DBMS_LOGMNR.START_LOGMNR(OPTIONS => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG + DBMS_LOGMNR.COMMITTED_DATA_ONLY);
--EXECUTE DBMS_LOGMNR.START_LOGMNR(DICTFILENAME =>'/tmp/&dbase_dict.ora', OPTIONS => DBMS_LOGMNR.COMMITTED_DATA_ONLY);
select SQL_REDO from v$logmnr_contents
where seg_owner = 'ITS' and seg_name = 'CARTOUCHE' and OPERATION = 'UPDATE' ;
-- d/ end session
EXECUTE Dbms_Logmnr.end_logmnr;
spool off
exit
I wrote a shell script because mining more than 20 archivelogs at a time crashed my instance ( XE ).
The ksh arguments are the overall first and last archivelog sequence numbers ; the script loops through them, calling mine_arc_<base>.sql for a bunch of archivelogs at a time, sized by the third argument ( step = 20, for example ).
[oracle@<host> arch_appli]$ cat mine_arc_<base>.sh
#!/bin/ksh
# loop on archivelogs, step at a time
sequence_1=$1
sequence_2=$2
step=$3
i=$sequence_1
(( j = i + step ))
while (( i <= sequence_2 )) ; do
  if (( j > sequence_2 )) ; then (( endp = sequence_2 )) ; else (( endp = j )) ; fi
  (( k = i + 1 ))
  if (( k > endp )) ; then (( k = endp )) ; fi
  print "\nMining sequence log number $i , $k to $endp"
  sqlplus '/ as sysdba' @./mine_arc_<base>.sql $i $k $endp
  (( i = j + 1 ))
  (( j = i + step ))
done
return 0
Execution
I was looking for INSERT statements, to retrieve deleted rows no longer available in the UNDO tablespace.
[oracle@<host> arch_appli]$ ./mine_arc_<base>.sh 54996 55006 4

Mining sequence log number 54996 , 54997 to 55000
Connected to:
Oracle Database 10g Express Edition Release 10.2.0.1.0 - Production
old 1: SELECT '<base>_&seq1-&seq3..lst' AS timecol FROM sys.dual
new 1: SELECT '<base>_54996-55000.lst' AS timecol FROM sys.dual
old 3: dbms_output.put_line('Adding First Logfile = &seq1');
new 3: dbms_output.put_line('Adding First Logfile = 54996');
old 4: DBMS_LOGMNR.ADD_LOGFILE(logfilename => '/PATH/arch_appli/1_&seq1'||'_619093165.arch', options => DBMS_LOGMNR.NEW);
new 4: DBMS_LOGMNR.ADD_LOGFILE(logfilename => '/PATH/arch_appli/1_54996'||'_619093165.arch', options => DBMS_LOGMNR.NEW);
old 8: for i in &seq2...&seq3 loop
new 8: for i in 54997..55000 loop
Adding First Logfile = 54996
Adding Logfile = 54997
Adding Logfile = 54998
Adding Logfile = 54999
Adding Logfile = 55000
PL/SQL procedure successfully completed.
PL/SQL procedure successfully completed.
no rows selected
PL/SQL procedure successfully completed.
Disconnected from Oracle Database 10g Express Edition Release 10.2.0.1.0 - Production

Mining sequence log number 55001 , 55002 to 55005
…
insert into "APPLICATION_USRV5"."ABONNES_CHEQUIER"("IDABONNE","CIVILITE","NOM","PRENOM","DATENAISSANCE","ADRESSENOM","ADRESSEPRENOM","ADRESSERUE","ADRESSECODEPOSTAL","ADRESSECITY","TELPORTABLE","MAIL","IDCLASSE","IDETABLISSEMENT","IDCITY","IDSTATUT","DATESTATUT","COMMENTAIRESTATUT","ADRESSESUITE") values ('6264','Mademoiselle','XXXXXXXXXX',…,'2','134','106','1',TO_DATE('06-SEP-11', 'DD-MON-RR'),NULL,NULL);
insert into "APPLICATION_USRV5"."ABONNES_CHEQUIER"("IDABONNE","CIVILITE","NOM","PRENOM","DATENAISSANCE","ADRESSENOM","ADRESSEPRENOM","ADRESSERUE","ADRESSECODEPOSTAL","ADRESSECITY","TELPORTABLE","MAIL","IDCLASSE","IDETABLISSEMENT","IDCITY","IDSTATUT","DATESTATUT","COMMENTAIRESTATUT","ADRESSESUITE") values ('6297','Mademoiselle',…,'2','70','80','1',TO_DATE('06-SEP-11', 'DD-MON-RR'),NULL,'DOMAINE DU GRAND PIN');
PL/SQL procedure successfully completed.
Disconnected from Oracle Database 10g Express Edition Release 10.2.0.1.0 - Production
[oracle@<host> arch_appli]$
Another query :
select SQL_REDO from v$logmnr_contents
where seg_owner = 'APPLICATION_USRV5' and seg_name = 'ABONNES_CHEQUIER' and OPERATION = 'DELETE'
and (    sql_redo like '%"IDABONNE" = ''2''%'
      or sql_redo like '%"IDABONNE" = ''3''%'
      or sql_redo like '%"IDABONNE" = ''4''%'
      or sql_redo like '%"IDABONNE" = ''5''%'
      or sql_redo like '%"IDABONNE" = ''6''%'
      or sql_redo like '%"IDABONNE" = ''7''%'
      or sql_redo like '%"IDABONNE" = ''8''%' ) ;
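The same filter can be written more compactly with a regular expression ( a sketch with the same semantics as the OR list above ) :
select SQL_REDO from v$logmnr_contents
where seg_owner = 'APPLICATION_USRV5' and seg_name = 'ABONNES_CHEQUIER' and OPERATION = 'DELETE'
and regexp_like(sql_redo, '"IDABONNE" = ''[2-8]''') ;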
3] Management
To increase the size and/or move the redologs to another location, nothing more is needed than :
$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Tue Apr 16 10:50:16 2013
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> set pages 5000 lines 150
SQL> col MEMBER for a30
SQL> select * from v$logfile ;
    GROUP# STATUS  TYPE    MEMBER                         IS_
---------- ------- ------- ------------------------------ ---
         3         ONLINE  /data/<base>/log_03_1.rdo      NO
         2         ONLINE  /data/<base>/log_02_1.rdo      NO
         1         ONLINE  /data/<base>/log_01_1.rdo      NO

SQL> alter database drop logfile group 1 ;
Database altered.
SQL> ALTER DATABASE ADD LOGFILE GROUP 1 ('/rdo1/<base>/log_01_1.rdo','/rdo2/<base>/log_01_2.rdo') size 512M ;
Database altered.
SQL> alter database drop logfile group 3 ;
Database altered.
-- ( a temporary group 4 was added the same way, and dropped later ; those steps were not captured here )
SQL> select * from v$logfile ;
    GROUP# STATUS  TYPE    MEMBER                         IS_
---------- ------- ------- ------------------------------ ---
         2         ONLINE  /data/<base>/log_02_1.rdo      NO
         1         ONLINE  /rdo1/<base>/log_01_1.rdo      NO
         4         ONLINE  /rdo1/<base>/log_04_1.rdo      NO
         4         ONLINE  /rdo2/<base>/log_04_2.rdo      NO
         1         ONLINE  /rdo2/<base>/log_01_2.rdo      NO

SQL> alter system switch logfile ;
System altered.
SQL> alter system checkpoint ;
System altered.
SQL> alter database drop logfile group 2 ;
Database altered.
SQL> ALTER DATABASE ADD LOGFILE GROUP 2 ('/rdo1/<base>/log_02_1.rdo','/rdo2/<base>/log_02_2.rdo') size 512M ;
Database altered.
SQL> ALTER DATABASE ADD LOGFILE GROUP 3 ('/rdo1/<base>/log_03_1.rdo','/rdo2/<base>/log_03_2.rdo') size 512M ;
Database altered.
SQL> !ls -l /data/<base>/log_*.rdo
-rw-r----- 1 oracle dba 268435968 Apr 16 04:40 /data/<base>/log_01_1.rdo
-rw-r----- 1 oracle dba 268435968 Apr 16 10:52 /data/<base>/log_02_1.rdo
-rw-r----- 1 oracle dba 268435968 Apr 16 04:38 /data/<base>/log_03_1.rdo
SQL> !rm /data/<base>/log_*.rdo
SQL> select * from v$logfile order by 1,4 ;
    GROUP# STATUS  TYPE    MEMBER                         IS_
---------- ------- ------- ------------------------------ ---
         1         ONLINE  /rdo1/<base>/log_01_1.rdo      NO
         1         ONLINE  /rdo2/<base>/log_01_2.rdo      NO
         2         ONLINE  /rdo1/<base>/log_02_1.rdo      NO
         2         ONLINE  /rdo2/<base>/log_02_2.rdo      NO
         3         ONLINE  /rdo1/<base>/log_03_1.rdo      NO
         3         ONLINE  /rdo2/<base>/log_03_2.rdo      NO
6 rows selected.
4] Monitor excessive archive log generation to prevent ORA-16014
ARCH: Archival stopped, error occurred. Will continue retrying
ORA-16014: log 1 sequence# not archived, no available destinations
I wrote this query to monitor abnormal archivelog generation, based on the numbers observed over the past 30 days :
We get the maximum volume generated per day, up to the current hour
-- past 30 days to get the Max, between 00:00 and current hour
-- past 30 days before current day, get max generated
We compare this number to the volume generated today
-- current day, get sum
Then
-- compare the two numbers with decode
-- Current_Arclog <= Max_Arclogs => 'OK'
-- Current_Arclog >  Max_Arclogs => 'KO'
WITH
-- past 30 days to get the Max, between 00:00 and the current hour
GLPD AS (
  select trunc(hour_D) DAY_G, ceil(sum(Arclogs_MB/1024)) Arclogs_GB from
  (
    select trunc(completion_time,'HH24') hour_D, ceil(sum(blocks*block_size)/1024/1024) Arclogs_MB
    from v$archived_log
    where completion_time > SYSDATE-30
    and to_char(completion_time, 'HH24:MI') between '00:00' and '12:00'
    -- may have to exclude days with issues
    and to_char(completion_time, 'DD/MM/YYYY') not in ('15/05/2015','20/05/2015','22/05/2015','23/05/2015','26/05/2015','27/05/2015','28/05/2015','29/05/2015','30/05/2015','01/06/2015')
    group by trunc(completion_time,'HH24') order by 1 desc
  ) group by trunc(hour_D) order by 1 desc
),
MGPLD AS (
  -- past 30 days before current day, get max generated
  select max(Arclogs_GB) MA from GLPD where DAY_G != trunc(sysdate) ),
CGPLD AS (
  -- current day, get sum
  select Arclogs_GB CA from GLPD where DAY_G = trunc(sysdate) )
select MA Max_GB_30Days , CA Current_GB from MGPLD , CGPLD ;

Max_GB_30Days | Current_GB
            9 | 10
-- here Current_GB exceeds the 30-day Max : the alert below returns KO
Full PL/SQL script, with a current_hour variable :
set serveroutput on
declare
  RES varchar2(8) ;
  cur_hour varchar2(16) ;
begin
  select to_char(trunc(sysdate,'HH24'), 'HH24:MI') into cur_hour from dual ;
  -- compare the two numbers with decode
  -- Current_Arclog <= Max_Arclogs => 'OK'
  -- Current_Arclog >  Max_Arclogs => 'KO'
  select decode(sign(Max_GB_30Days-Current_GB),-1,'KO','OK') into RES from (
    WITH
    -- past 30 days to get the Max, between 00:00 and the current hour
    GLPD AS (
      select trunc(hour_D) DAY_G, ceil(sum(Arclogs_MB/1024)) Arclogs_GB from
      (
        select trunc(completion_time,'HH24') hour_D, ceil(sum(blocks*block_size)/1024/1024) Arclogs_MB
        from v$archived_log
        where completion_time > SYSDATE-30
        and to_char(completion_time, 'HH24:MI') between '00:00' and cur_hour
        -- may have to exclude days with issues
        and to_char(completion_time, 'DD/MM/YYYY') not in ('15/05/2015','20/05/2015','22/05/2015','23/05/2015','26/05/2015','27/05/2015','28/05/2015','29/05/2015','30/05/2015','01/06/2015')
        group by trunc(completion_time,'HH24') order by 1 desc
      ) group by trunc(hour_D) order by 1 desc
    ),
    MGPLD AS (
      -- past 30 days before current day, get max generated
      select max(Arclogs_GB) MA from GLPD where DAY_G != trunc(sysdate) ),
    CGPLD AS (
      -- current day, get sum
      select Arclogs_GB CA from GLPD where DAY_G = trunc(sysdate) )
    select MA Max_GB_30Days , CA Current_GB from MGPLD , CGPLD
  ) ;
  DBMS_OUTPUT.PUT_LINE('RES Monitor Redo Generation = '||RES);
end;
/
RES Monitor Redo Generation = KO
PL/SQL procedure successfully completed.
And in a Function, to be plugged into GRID's User Defined Metrics :
-- To be plugged into a User Defined Metric in GRID
-- see "Using Invoker's Rights or Definer's Rights (AUTHID Clause)" ; the default is DEFINER
-- A unit whose AUTHID value is CURRENT_USER is called an invoker's rights unit
-- grant select on SYS.v_$archived_log to SYSTEM
create or replace function SCH.ORA_ARC_LOG_GEN return varchar
AUTHID CURRENT_USER
AS
  RES varchar2(4);
  cur_hour varchar2(16) ;
begin
  select to_char(trunc(sysdate,'HH24'), 'HH24:MI') into cur_hour from dual ;
  -- compare the two numbers with decode
  -- Current_Arclog <= Max_Arclogs => 'OK' return 0
  -- Current_Arclog >  Max_Arclogs => 'KO' return 1 ( Warning )
  select decode(sign(Max_GB_30Days-Current_GB),-1,'1','0') into RES from (
    WITH
    -- past 30 days to get the Max, between 00:00 and the current hour
    GLPD AS (
      select trunc(hour_D) DAY_G, ceil(sum(Arclogs_MB/1024)) Arclogs_GB from
      (
        select trunc(completion_time,'HH24') hour_D, ceil(sum(blocks*block_size)/1024/1024) Arclogs_MB
        from v$archived_log
        where completion_time > SYSDATE-30
        and to_char(completion_time, 'HH24:MI') between '00:00' and cur_hour
        -- may have to exclude days with issues
        and to_char(completion_time, 'DD/MM/YYYY') not in ('15/05/2015','20/05/2015','22/05/2015','23/05/2015','26/05/2015','27/05/2015','28/05/2015','29/05/2015','30/05/2015','01/06/2015')
        group by trunc(completion_time,'HH24') order by 1 desc
      ) group by trunc(hour_D) order by 1 desc
    ),
    MGPLD AS (
      -- past 30 days before current day, get max generated
      select max(Arclogs_GB) MA from GLPD where DAY_G != trunc(sysdate) ),
    CGPLD AS (
      -- current day, get sum
      select Arclogs_GB CA from GLPD where DAY_G = trunc(sysdate) )
    select MA Max_GB_30Days , CA Current_GB from MGPLD , CGPLD
  ) ;
  return RES ;
end;
/
Call it : select SCH.ORA_ARC_LOG_GEN from dual ;
5] Archivelogs space over-filling
Goal    | Avoid the instance hanging when the archivelog space fills up to 100 %
Version | 10gR2
Views   | V$ARCHIVED_LOG, V_$TRANSACTION, V_$SESS_IO
Log as  | SYS
- First, monitor archivelog generation with post [ Administration / Redo Logs / 4] Monitor excessive archive log generation to prevent ORA-16014 ]
- Then, act : kill the guilty session.
Archivelog space occupation threshold : kill the session responsible for the archivelog over-generation.
We assume the following SQL is triggered only in case of incident ( the archivelog space is correctly sized to retain at least 2 days of logs ).
See "How to Find Sessions Generating Lots of Redo or Archive logs" [ID 167492.1]
5.1 Pre-requisite
- In addition to the grants already seen in the Services management post :
GRANT SELECT ON SYS.V_$TRANSACTION TO SYSTEM;
GRANT SELECT ON SYS.V_$SESS_IO TO SYSTEM;
GRANT EXECUTE ON SYS.DBMS_SYSTEM TO SYSTEM;
GRANT EXECUTE ON SYS.DBMS_SERVICE TO SYSTEM;
- 2 queries
-- 0/ Long-running transaction
-- find the top UNDO-consuming, active session
select * from (
  SELECT s.sid, s.serial#, s.username, s.program,
         ROUND(t.used_ublk*(SELECT VALUE FROM v$parameter WHERE NAME = 'db_block_size')/1024/1024,2) "UNDO Used Size (MB)"
  FROM v$session s, v$transaction t
  WHERE s.taddr = t.addr
  ORDER BY 5 desc, 1, 2, 3, 4
) where rownum < 2 ;
-- 1/ Program with many transactions ( intermediate commits )
-- top Block Change session
-- ignore background processes
select * from (
  SELECT s.sid, s.serial#, s.username, s.program, i.block_changes
  -- with very frequent commits, the session is not seen in v$transaction ( active txn only )
  FROM v$session s, v$sess_io i --, v$transaction t
  WHERE s.sid = i.sid -- and s.taddr = t.addr
  and service_name != 'SYS$BACKGROUND'
  ORDER BY 5 desc, 1, 2, 3, 4
) where rownum < 2 ;
-- we keep the second query because it covers both profiles
- To monitor :
select 'alter system kill session '''||SID||','||SERIAL#||'''' cmd , username , program , block_changes from (
  SELECT s.sid, s.serial#, s.username, s.program, i.block_changes
  -- with very frequent commits, the session is not seen in v$transaction ( active txn only )
  FROM v$session s, v$sess_io i --, v$transaction t
  WHERE s.sid = i.sid -- and s.taddr = t.addr
  and service_name != 'SYS$BACKGROUND'
  ORDER BY 5 desc, 1, 2, 3, 4
) where rownum < 2 ;
SID  | SERIAL# | USERNAME | PROGRAM                     | BLOCK_CHANGES
2172 | 18579   | USER1    | JDBC Thin Client            | 30540017
1738 | 17990   | USER1    | JDBC Thin Client            | 29152077
1756 | 31732   | USER1    | JDBC Thin Client            | 27769616
1646 | 20165   | USER1    | JDBC Thin Client            | 26376994
2082 | 11138   | USER2    | RTSDGN@sr-app-9 (TNS V1-V3) | 6530220
2541 | 6858    | USER3    | mgARES02.exe                | 367752
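Caveat : block_changes in v$sess_io is cumulative since logon, so a long-lived batch can top the list without being the current offender. A sketch that diffs two samples taken one minute apart ( the staging table name is an assumption ; dbms_lock.sleep needs an extra grant ) :
-- snapshot of the current counters
create table t_sess_io_snap as select sid, block_changes from v$sess_io ;
-- wait one minute
exec dbms_lock.sleep(60) ;
-- delta = blocks changed during the interval only
select s.sid, s.serial#, s.username, s.program, i.block_changes - p.block_changes delta_changes
from v$session s, v$sess_io i, t_sess_io_snap p
where s.sid = i.sid and i.sid = p.sid
order by 5 desc ;
drop table t_sess_io_snap ;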
5.2 Procedure
Add to the package :
CREATE OR REPLACE package app_admin
as
  -- 11/06/12 add kill_long_running_session
  procedure kill_long_running_session;
  …
end;
/
procedure kill_long_running_session
is
  -- kill the session generating the most redo entries
  -- the procedure may be called by Nagios when reaching the WARNING and CRITICAL thresholds
  -- usage : execute system.app_admin.kill_long_running_session;
  cursor cur is
    select 'alter system kill session '''||SID||','||SERIAL#||'''' cmd , username , program , block_changes from (
      SELECT s.sid, s.serial#, s.username, s.program, i.block_changes
      -- with very frequent commits, the session is not seen in v$transaction ( active txn only )
      FROM v$session s, v$sess_io i --, v$transaction t
      WHERE s.sid = i.sid -- and s.taddr = t.addr
      and service_name != 'SYS$BACKGROUND'
      ORDER BY 5 desc, 1, 2, 3, 4
    ) where rownum < 2 ;
  sql_stmt varchar2(256) ; -- kill command to execute
  cn       integer ;       -- dbms_sql cursor handle
  ret      integer ;
begin
  dbms_output.put_line('Kill top Block Change session');
  -- This message goes to the alert log
  sys.dbms_system.ksdwrt(2, ' Called Procedure app_admin.kill_long_running_session: Kill top Block Change session');
  for c_tab in cur loop
    begin
      dbms_output.put_line('Kill session username: '||c_tab.username||' '||c_tab.cmd);
      -- This message goes to the alert log
      sys.dbms_system.ksdwrt(2, ' Procedure app_admin.kill_long_running_session: '||c_tab.username||' '||c_tab.program||' block_changes: '||c_tab.block_changes||' '||c_tab.cmd);
      sql_stmt := c_tab.cmd;
      cn := dbms_sql.open_cursor;
      dbms_sql.parse(cn, sql_stmt, dbms_sql.native);
      ret := dbms_sql.execute(cn);
      dbms_sql.close_cursor(cn);
    exception
      when others then dbms_output.put_line(sqlerrm);
    end;
  end loop;
end kill_long_running_session;
5.3 Test
- Block Change intensive PL/SQL
conn auc
create table t_intensive ( a int ) ;
DECLARE
  m NUMBER;
BEGIN
  FOR i IN 1..100000 LOOP
    FOR j IN 1..10000 LOOP
      m := sqrt(4567);
      insert into t_intensive values (m) ;
    END LOOP;
    commit ;
  END LOOP;
END;
/
- KILL
$ sqlplus system
SQL*Plus: Release 10.2.0.5.0 - Production on Mon Jun 11 17:40:50 2012
Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.
Enter password:
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> set serveroutput on
SQL> execute system.app_admin.kill_long_running_session ;
Kill top Block Change and Active session
Kill session username: SCH alter system kill session '481,38572'
PL/SQL procedure successfully completed.

-- in the killed session :
BEGIN
*
ERROR at line 1:
ORA-00028: your session has been killed

- Alert.log : we write to the Oracle alert log by calling sys.dbms_system.ksdwrt
Mon Jun 11 17:41:49 GMT+02:00 2012 Procedure app_admin.kill_long_running_session: SCH sqlplus@sr-dev-1 (TNS V1-V3) block_changes: 1316160 alter system kill session '481,38572'
We may even run an RMAN backup to back up and remove the overflowing archivelogs : see the Kernel post.
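A minimal sketch of such an emergency job ( the destination format string is an assumption ) :
$ rman target /
RMAN> backup archivelog all format '/backup/<base>/arc_%d_%s_%p.bkp' delete input ;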
6] Log write alignment
To avoid misaligned redo vector writes ( multiple IOs instead of one ) on 4K-sector devices, use the 11g+ BLOCKSIZE syntax when creating logfiles ; on AIX :
alter system set "_disk_sector_size_override" = true ;
ALTER DATABASE ADD LOGFILE GROUP 6 ('/rdo1/<db_name>/log_06_1.rdo','/rdo2/<db_name>/log_06_2.rdo') size 256M BLOCKSIZE 4096 ;
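To verify the result ( the BLOCKSIZE column appears in v$log from 11gR2 on ) :
select group#, bytes/1024/1024 MB, blocksize, status from v$log order by group# ;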
7] Oracle Loss or Corruption of a Redolog Member
Case 1. The « /rdo2 » filesystem is no longer available ; the members on « /rdo1 » still exist.
First mount the instance, then drop the /rdo2 members.
$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.5.0 - Production on Wed Nov 22 15:41:58 2017
Connected to an idle instance.
SQL> startup nomount
ORACLE instance started.
Total System Global Area 1610612736 bytes
Fixed Size                  2096736 bytes
Variable Size            1107296672 bytes
Database Buffers          486539264 bytes
Redo Buffers               14680064 bytes
SQL> !ls -l /rdo2/dbora/
total 720912
drwxr-xr-x 2 oracle dba       256 2010-03-22 15:00 ctl
-rw-r----- 1 oracle dba  67109376 2017-11-22 02:17 log_01_2.rdo
-rw-r----- 1 oracle dba  67109376 2017-11-22 02:04 log_02_2.rdo
-rw-r----- 1 oracle dba  67109376 2017-11-22 02:11 log_03_2.rdo
-rw-r----- 1 oracle dba 536871424 2017-11-22 14:55 log_05_2.rdo
SQL> ALTER DATABASE DROP LOGFILE MEMBER '/rdo2/dbora/log_01_2.rdo';
ALTER DATABASE DROP LOGFILE MEMBER '/rdo2/dbora/log_01_2.rdo'
*
ERROR at line 1:
ORA-01507: database not mounted
SQL> alter database mount ;
Database altered.
SQL> ALTER DATABASE DROP LOGFILE MEMBER '/rdo2/dbora/log_01_2.rdo';
Database altered.
SQL> ALTER DATABASE DROP LOGFILE MEMBER '/rdo2/dbora/log_02_2.rdo';
Database altered.
SQL> ALTER DATABASE DROP LOGFILE MEMBER '/rdo2/dbora/log_03_2.rdo';
Database altered.
SQL> ALTER DATABASE DROP LOGFILE MEMBER '/rdo2/dbora/log_05_2.rdo';
ALTER DATABASE DROP LOGFILE MEMBER '/rdo2/dbora/log_05_2.rdo'
*
ERROR at line 1:
ORA-01609: log 5 is the current log for thread 1 - cannot drop members
ORA-00312: online log 5 thread 1: '/rdo1/dbora/log_05_1.rdo'
ORA-00312: online log 5 thread 1: '/rdo2/dbora/log_05_2.rdo'
SQL> alter system switch logfile ;
alter system switch logfile
*
ERROR at line 1:
ORA-01109: database not open
SQL> alter database rename file '/rdo2/dbora/log_05_2.rdo' to '/rdo1/dbora/log_05_2.rdo';
alter database rename file '/rdo2/dbora/log_05_2.rdo' to '/rdo1/dbora/log_05_2.rdo'
*
ERROR at line 1:
ORA-01511: error in renaming log/data files
ORA-01512: error renaming log file /rdo2/dbora/log_05_2.rdo - new file /rdo1/dbora/log_05_2.rdo not found
ORA-27037: unable to obtain file status
IBM AIX RISC System/6000 Error: 2: No such file or directory
Additional information: 3
SQL> !cp -p /rdo1/dbora/log_05_1.rdo /rdo1/dbora/log_05_2.rdo
SQL> alter database rename file '/rdo2/dbora/log_05_2.rdo' to '/rdo1/dbora/log_05_2.rdo';
Database altered.
SQL> alter database open ;
Database altered.
SQL>
Case 2. ORA-00341: Corruption, wrong log # 3 in header
ALTER DATABASE OPEN
Errors in file /work/oracle/dbora/diag/rdbms/dbora/DBORA/trace/DBORA_lgwr_9830604.trc:
ORA-00341: log 4 of thread 1, wrong log # 3 in header
ORA-00312: online log 4 thread 1: '/rdo1/dbora/log_03_2.rdo'
Errors in file /work/oracle/dbora/diag/rdbms/dbora/DBORA/trace/DBORA_lgwr_9830604.trc:
ORA-00341: log 4 of thread 1, wrong log # 3 in header
ORA-00312: online log 4 thread 1: '/rdo1/dbora/log_03_2.rdo'
Errors in file /work/oracle/dbora/diag/rdbms/dbora/DBORA/trace/DBORA_ora_8716718.trc:
ORA-00341: log 1 of thread , wrong log # in header
ORA-00312: online log 4 thread 1: '/rdo1/dbora/log_03_2.rdo'
USER (ospid: 8716718): terminating the instance due to error 341
Wed Nov 22 18:07:29 2017
System state dump requested by (instance=1, osid=8716718), summary=[abnormal instance termination].
System State dumped to trace file /work/oracle/dbora/diag/rdbms/dbora/DBORA/trace/DBORA_diag_2556030.trc
Dumping diagnostic data in directory=[cdmp_20171122180729], requested by (instance=1, osid=8716718), summary=[abnormal instance termination].
Instance terminated by USER, pid = 8716718
$ /rdo1/dbora > rm /rdo1/dbora/log_03_2.rdo
$ /rdo1/dbora > cp -p /rdo1/dbora/log_03_1.rdo /data/dbora/log_03_2.rdo
$ /rdo1/dbora > sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Wed Nov 22 18:09:58 2017
Connected to an idle instance.
SQL> startup mount
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.
Total System Global Area 2137886720 bytes
Fixed Size                  2223208 bytes
Variable Size            1459618712 bytes
Database Buffers          671088640 bytes
Redo Buffers                4956160 bytes
Database mounted.
SQL> alter database rename file '/rdo2/dbora/log_03_2.rdo' to '/data/dbora/log_03_2.rdo';
alter database rename file '/rdo2/dbora/log_03_2.rdo' to '/data/dbora/log_03_2.rdo'
*
ERROR at line 1:
ORA-01511: error in renaming log/data files
ORA-01516: nonexistent log file, data file, or temporary file "/rdo2/dbora/log_03_2.rdo"
SQL> !ls -l /data/dbora/log_03_2.rdo
-rw-r----- 1 oracle dba 268435968 2017-11-22 18:07 /data/dbora/log_03_2.rdo
SQL> alter database rename file '/rdo1/dbora/log_03_2.rdo' to '/data/dbora/log_03_2.rdo';
Database altered.
SQL> alter database open ;
alter database open
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 9437560
Session ID: 178 Serial number: 5
SQL> Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
$ /rdo1/dbora > sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Wed Nov 22 18:12:18 2017
Connected to an idle instance.
SQL> startup mount
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.
Total System Global Area 2137886720 bytes
Fixed Size                  2223208 bytes
Variable Size            1459618712 bytes
Database Buffers          671088640 bytes
Redo Buffers                4956160 bytes
Database mounted.
SQL> alter database open resetlogs ;
alter database open resetlogs
*
ERROR at line 1:
ORA-01139: RESETLOGS option only valid after an incomplete database recovery
SQL> recover database using backup controlfile until cancel ;
ORA-00279: change 40237970573 generated at 11/22/2017 14:55:49 needed for thread 1
ORA-00289: suggestion : /data/dbora/arc/DBORA1_231_939491616.arc
ORA-00280: change 40237970573 for thread 1 is in sequence #231
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
CANCEL
Media recovery cancelled.
SQL> alter database open resetlogs ;
Database altered.
SQL>
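After an OPEN RESETLOGS a new database incarnation starts, so a fresh full backup is advisable ( a minimal sketch ) :
$ rman target /
RMAN> backup database plus archivelog ;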