Data Recovery Advisor in Oracle Database 11g Release 1
The Data Recovery Advisor automatically diagnoses corruption or loss of persistent data on disk, determines the appropriate repair options, and executes repairs at the user's request. This reduces the complexity of recovery process, thereby reducing the Mean Time To Recover (MTTR).Create a Failure
Before we can start identifying and repairing failures, we need to create one. Probably the easiest way to do this is to corrupt a datafile using the "echo" command. The following script navigates to the directory holding the datafiles, checks the current size of the users01.dbf file, echos nothing to it, then checks the file size once more.You can see the file size has been reduced to a single byte.$ cd /u01/app/oracle/oradata/DB11G $ ls -l users01.dbf -rw-r----- 1 oracle oinstall 57745408 Jan 3 11:42 users01.dbf $ echo > users01.dbf $ ls -l users01.dbf -rw-r----- 1 oracle oinstall 1 Jan 3 13:26 users01.dbf
Next, we connect to the database and try to create a table in the USERS tablespace.
The resulting error triggers a data integrity check that searches the database for failures related to the error and records them in the Automatic Diagnostic Repository (ADR).SQL> CREATE TABLE test_tab (id NUMBER) TABLESPACE USERS; CREATE TABLE test_tab (id NUMBER) TABLESPACE USERS * ERROR at line 1: ORA-01115: IO error reading block from file 4 (block # 3) ORA-01110: data file 4: '/u01/app/oracle/oradata/DB11G/users01.dbf' ORA-27072: File I/O error Additional information: 4 Additional information: 2
Relevant RMAN Functionality
LIST FAILURE
TheLIST FAILURE command displays any failures with a status OPEN and a priority of CRITICAL or HIGH in order of importance. If no such failures exist it will list LOW priority failures.TheRMAN> LIST FAILURE; List of Database Failures ========================= Failure ID Priority Status Time Detected Summary ---------- -------- --------- ------------- ------- 202 HIGH OPEN 03-JAN-08 One or more non-system datafiles are corrupt RMAN>
LIST FAILURE command has a number of options that can be combined to alter the failures listed. See the Syntax Diagram and Syntax Element Description for details.ADVISE FAILURE
TheADVISE FAILURE command, as the name implies, provides repair advice for failures listed by the LIST FAILURE command, as well as closing all open failures that are already repaired.If manual repair actions are produced you shoudl attempt them first, as they are likely to be less disruptive. If manual repair actions aren't present, or they do not fix the problem, you can use the automated repair option.RMAN> ADVISE FAILURE; List of Database Failures ========================= Failure ID Priority Status Time Detected Summary ---------- -------- --------- ------------- ------- 202 HIGH OPEN 03-JAN-08 One or more non-system datafiles are corrupt analyzing automatic repair options; this may take some time allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=124 device type=DISK analyzing automatic repair options complete Mandatory Manual Actions ======================== no manual actions available Optional Manual Actions ======================= no manual actions available Automated Repair Options ======================== Option Repair Description ------ ------------------ 1 Restore and recover datafile 4 Strategy: The repair includes complete media recovery with no data loss Repair script: /u01/app/oracle/diag/rdbms/db11g/DB11G/hm/reco_3657335472.hm RMAN>
The full syntax for the
ADVISE FAILURE command can be seen here.REPAIR FAILURE
TheREPAIR FAILURE command applies the repair scripts produced by the ADVISE FAILURE command. Using the PREVIEW option lists the contents of the repair script without applying it.By default, theRMAN> REPAIR FAILURE PREVIEW; Strategy: The repair includes complete media recovery with no data loss Repair script: /u01/app/oracle/diag/rdbms/db11g/DB11G/hm/reco_2408143298.hm contents of repair script: # restore and recover datafile sql 'alter database datafile 4 offline'; restore datafile 4; recover datafile 4; sql 'alter database datafile 4 online'; RMAN>
REPAIR FAILURE command prompts the user to confirm the repair, but this can be prevented using the NOPROMPT keyword.
RMAN> REPAIR FAILURE NOPROMPT;
Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/app/oracle/diag/rdbms/db11g/DB11G/hm/reco_2408143298.hm
contents of repair script:
# restore and recover datafile
sql 'alter database datafile 4 offline';
restore datafile 4;
recover datafile 4;
sql 'alter database datafile 4 online';
executing repair script
sql statement: alter database datafile 4 offline
Starting restore at 03-JAN-08
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00004 to /u01/app/oracle/oradata/DB11G/users01.dbf
channel ORA_DISK_1: reading from backup piece /u01/app/oracle/flash_recovery_area/DB11G/backupset/
2008_01_03/o1_mf_nnndf_BACKUP_DB11G.WORLD_0_3qsl2hy4_.bkp
channel ORA_DISK_1: piece handle=/u01/app/oracle/flash_recovery_area/DB11G/backupset/2008_01_03/
o1_mf_nnndf_BACKUP_DB11G.WORLD_0_3qsl2hy4_.bkp tag=BACKUP_DB11G.WORLD_010308113407
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:07
Finished restore at 03-JAN-08
Starting recover at 03-JAN-08
using channel ORA_DISK_1
starting media recovery
media recovery complete, elapsed time: 00:00:01
Finished recover at 03-JAN-08
sql statement: alter database datafile 4 online
repair failure complete
RMAN>
The full syntax for the REPAIR FAILURE command can be seen here.CHANGE FAILURE
TheCHANGE FAILURE command allows you to change the priority of a failure or close an open failure. You may wish to change the priority of a failure if it does not represent a problem to you. For example, a failure associated with a tablespace you know longer use may be listed as a high priority, when in fact it has no effect on the normal running of your system.It is unlikely you will need to close an open failure, as even manually repaired failures are closed implicitly, but the option is there should you need it.RMAN> CHANGE FAILURE 202 PRIORITY LOW;
See the Syntax Diagram and Syntax Element Description for details.
VALIDATE
TheVALIDATE command initiates data integrity checks, logging physical, and optionally logical, block corruptions of database files and backups in the V$DATABASE_BLOCK_CORRUPTION view and the Automatic Diagnostic Repository as one or more failures. The following code shows some of the possible syntax variations.The full syntax for the# Check for physical corruption of all database files. VALIDATE DATABASE; # Check for physical and logical corruption of a tablespace. VALIDATE CHECK LOGICAL TABLESPACE USERS; # Check for physical and logical corruption of a datafile. VALIDATE CHECK LOGICAL DATAFILE 4; # Check for physical corruption of all archived redo logs files. VALIDATE ARCHIVELOG ALL; # Check for physical and logical corruption of the controlfile. VALIDATE CHECK LOGICAL CURRENT CONTROLFILE; # Check for physical and logical corruption of a specific backupset. VALIDATE CHECK LOGICAL BACKUPSET 3;
VALIDATE command can be seen here.The
BACKUP VALIDATE and RESTORE VALIDATE commands perform the same checks as the VALIDATE command for the files targeted by the backup or restore command, but they don't actually perform the specified backup or restore operation. This allows you to check the integrity of a backup or restore operation before actually performing it. The following code shows some of the possible syntax variations.The full syntax for the# Check for physical corruption of files to be backed up. BACKUP VALIDATE DATABASE ARCHIVELOG ALL; # Check for physical and logical corruption of files to be backed up. BACKUP VALIDATE CHECK LOGICAL DATABASE ARCHIVELOG ALL; # Check for physical corruption of files to be restored. RESTORE VALIDATE DATABASE; # Check for physical and logical corruption of files to be restored. RESTORE VALIDATE CHECK LOGICAL DATABASE;
BACKUP and RESTORE commands can be seen here and here respectively.Health Monitor (DBMS_HM)
In the previous example the failure was detected and logged automatically in reaction to the table creation failure. If this were a low use tablespace, we might not have noticed the problem for some time. The Health Monitor also allows us to perform the same integrity checks manually, rather than waiting for the reactive tests to take place. This may help you identify and fix problems before they are ever noticed by users.Access to Health Monitor is available using the
DBMS_HM package or Enterprise Manager (see below). The available checks are displayed using the V$HM_CHECK view.Most of the health checks accept parameters, which are displayed using theSQL> SELECT name FROM v$hm_check WHERE internal_check='N'; NAME ---------------------------------- DB Structure Integrity Check Data Block Integrity Check Redo Integrity Check Transaction Integrity Check Undo Segment Integrity Check Dictionary Integrity Check 6 rows selected. SQL>
V$HM_CHECK_PARAM view.
SET LINESIZE 120
COLUMN check_name FORMAT A30
COLUMN parameter_name FORMAT A15
COLUMN type FORMAT A15
COLUMN default_value FORMAT A15
COLUMN description FORMAT A20
SELECT c.name check_name, p.name parameter_name, p.type, p.default_value, p.description
FROM v$hm_check_param p, v$hm_check c
WHERE p.check_id = c.id
AND c.internal_check = 'N'
ORDER BY c.name;
CHECK_NAME PARAMETER_NAME TYPE DEFAULT_VALUE DESCRIPTION
------------------------------ --------------- --------------- --------------- --------------------
Data Block Integrity Check BLC_DF_NUM DBKH_PARAM_UB4 Block Data File numb
er
Data Block Integrity Check BLC_BL_NUM DBKH_PARAM_UB4 Datablock number
Dictionary Integrity Check CHECK_MASK DBKH_PARAM_TEXT ALL Check Mask
Dictionary Integrity Check TABLE_NAME DBKH_PARAM_TEXT ALL_CORE_TABLES Table Name
Redo Integrity Check SCN_TEXT DBKH_PARAM_TEXT 0 SCN of the latest go
od redo (if known)
Transaction Integrity Check TXN_ID DBKH_PARAM_TEXT Transaction ID
Undo Segment Integrity Check USN NUMBER DBKH_PARAM_TEXT Undo Segment Number
7 rows selected.
SQL>
The DBMS_HM.RUN_CHECK procedure is used to run a specific check with the appropriate parameters.
BEGIN
DBMS_HM.run_check (
check_name => 'DB Structure Integrity Check',
run_name => 'my_test_run');
END;
/
PL/SQL procedure successfully completed.
SQL>
The findings of the health checks can be displayed using a variety of methods:- The
DBMS_HM.GET_RUN_REPORTfunction returns a report of the findings in TEXT, HTML or XML format. - The findings are visible using the RMAN
LIST FAILUREcommand, as seen previously. - The results can be queried directly from the
V$HM_RUN,V$HM_FINDING, andV$HM_RECOMMENDATIONviews. - Enterprise Manager supports the Data Recovery Advisor and Health Monitor. (see below)
- Automatic Diagnostic Repository Command Interpreter (ADRCI) utility.
DBMS_HM.GET_RUN_REPORT function.
SET LONG 100000
SET LONGCHUNKSIZE 1000
SET PAGESIZE 1000
SET LINESIZE 512
SELECT DBMS_HM.get_run_report('MY_TEST_RUN') FROM dual;
DBMS_HM.GET_RUN_REPORT('MY_TEST_RUN')
------------------------------------------------------------------------------
Basic Run Information
Run Name : my_test_run
Run Id : 330
Check Name : DB Structure Integrity Check
Mode : MANUAL
Status : COMPLETED
Start Time : 2008-01-04 11:30:27.293105 +00:00
End Time : 2008-01-04 11:30:27.345898 +00:00
Error Encountered : 0
Source Incident Id : 0
Number of Incidents Created : 0
Input Paramters for the Run
Run Findings And Recommendations
Finding
Finding Name : Corrupt Datafile
Finding ID : 334
Type : FAILURE
Status : OPEN
Priority : HIGH
Message : Datafile 4: '/u01/app/oracle/oradata/DB11G/users01.dbf' is
corrupt
Message : Some objects in tablespace USERS might be unavailable
SQL>
The ADR Command Interpreter (ADRCI) utility is initiated by issuing the "adrci" command on the command line.Health Monitor runs are displayed using the following command.$ adrci ADRCI: Release 11.1.0.6.0 - Beta on Fri Jan 4 12:01:20 2008 Copyright (c) 1982, 2007, Oracle. All rights reserved. ADR base = "/u01/app/oracle" adrci>
The relevant run name is then used to produce a report using the following commands.adrci> show hm_run
adrci> set homepath diag/rdbms/db11g/DB11G
adrci> create report hm_run my_test_run
adrci> show report hm_run my_test_run
<?xml version="1.0" encoding="US-ASCII"?>
<HM-REPORT REPORT_ID="my_test_run">
<TITLE>HM Report: my_test_run</TITLE>
<RUN_INFO>
<CHECK_NAME>DB Structure Integrity Check</CHECK_NAME>
<RUN_ID>330</RUN_ID>
<RUN_NAME>my_test_run</RUN_NAME>
<RUN_MODE>MANUAL</RUN_MODE>
<RUN_STATUS>COMPLETED</RUN_STATUS>
<RUN_ERROR_NUM>0</RUN_ERROR_NUM>
<SOURCE_INCIDENT_ID>0</SOURCE_INCIDENT_ID>
<NUM_INCIDENTS_CREATED>0</NUM_INCIDENTS_CREATED>
<RUN_START_TIME>2008-01-04 11:30:27.293105 +00:00</RUN_START_TIME>
<RUN_END_TIME>2008-01-04 11:30:27.345898 +00:00</RUN_END_TIME>
</RUN_INFO>
<RUN_PARAMETERS/>
<RUN-FINDINGS>
<FINDING>
<FINDING_NAME>Corrupt Datafile</FINDING_NAME>
<FINDING_ID>334</FINDING_ID>
<FINDING_TYPE>FAILURE</FINDING_TYPE>
<FINDING_STATUS>OPEN</FINDING_STATUS>
<FINDING_PRIORITY>HIGH</FINDING_PRIORITY>
<FINDING_CHILD_COUNT>0</FINDING_CHILD_COUNT>
<FINDING_CREATION_TIME>2008-01-04 11:30:27.341374 +00:00</FINDING_CREATION_TIME>
<FINDING_MESSAGE>Datafile 4: '/u01/app/oracle/oradata/DB11G/users01.dbf' is corrupt</FINDING_MESSAGE>
<FINDING_MESSAGE>Some objects in tablespace USERS might be unavailable</FINDING_MESSAGE>
</FINDING>
</RUN-FINDINGS>
</HM-REPORT>
adrci>
Oracle Enterprise Manager
Access to the Data Recovery Advisor is available from the "View and Manage Failures" screen (Advisor Central > Data Recovery Advisor).
The "Advise and Recover" button on the "Perform Recovery" screen (Availability > Perform Recovery) also links through to the "View and Manage Failures" screen.

Access to the Health Monitor is available from the "Checkers" screen (Advisor Central > Checkers (tab)).

For more information see:
- Diagnosing and Repairing Failures with Data Recovery Advisor
- Running Health Checks with Health Monitor
- DBMS_HM
- Validating Database Files and Backups
- RMAN Enhancements in Oracle Database 11g Release 1
Back to the Top.

