In this post I will some tips for setting up RMAN in RAC environment.
I will not cover topics about RMAN that can be configured in standalone environment (e.g Incremental backup, use of FRA, etc.)
First question: Is there a difference of setting up RMAN between standalone and RAC environments?
The answer is YES, not too much but some points must be observed.
First of all, In my point of view use RMAN Catalog is mandatory. Because it’s a HA environment and to restore a database without RMAN Catalog can take long time.
To protect and keep backup metadata for longer retention times than can be accommodated by the control file, you can create a recovery catalog. You should create the recovery catalog schema in a dedicated standalone database. Do not locate the recovery catalog with other production data. If you use Oracle Enterprise Manager, you can create the recovery catalog schema in the Oracle Enterprise Manager repository database.
About HA of RMAN Catalog?
I always recommend to place the Host that will hold RMAN Catalog on VirtualMachine, because is a machine which require low resource and disk space and have low activity.
In case of the failure of Host RMAN Catalog is easy move that host to another Physical Host or Recover the whole virtual machine.
But if the option of use a VM is not avaliable. Use another cluster (e.g Test Cluster) env if avaliable.
The Database of RMAN catalog must be in ARCHIVELOG. Why?
It’s a prod env (is enough), will generate very small amount of archivelogs, and in case of any corruption or user errors (e.g User generated new incarnation of Prod Database during a Test Validation of Backup in Test env) can be necessary recovery point in time.
Due a small database of low activity I see some customers not giving importance to this database. It’s should not happens.
High availability of execution of backup using RMAN:
We have some challenges:
- The backup must not affect the availability cluster, but the backup must be executed daily.
- The backup cannot be dependent of nodes (i.e backup must be able to execute in all nodes independently if have some nodes active or not)
- Where store the scripts of backup? Where store the Logs?
I don’t recommend use any nodes of cluster to start/store scripts backups. Due if that node fail backup will not be executed.
Use the Host where is stored RMAN Catalog to store your backup scripts too and start these scripts from this host, the utility RMAN works as client only… the backup is always performed on server side.
Doing this you will centralize all scripts and logs of backup from your environment. That will ease the management of backup.
Configuring the RMAN Snapshot Control File Location in a RAC 11.2
RMAN creates a copy of the control file for read consistency, this is the snapshot controlfile. Due to the changes made to the controlfile backup mechanism in 11gR2 any instances in the cluster may write to the snapshot controlfile. Therefore, the snapshot controlfile file needs to be visible to all instances.
The same happens when a backup of the controlfile is created directly from sqlplus any instance in the cluster may write to the backup controfile file.
In 11gR2 onwards, the controlfile backup happens without holding the control file enqueue. For non-RAC database, this doesn’t change anything.
But, for RAC database, the snapshot controlfile location must be in a shared file system that will be accessible from all the nodes.
The snapshot controlfile MUST be accessible by all nodes of a RAC database.
See how do that:
Since version 11.1 : Node Affinity Awareness of Fast Connections
In some cluster database configurations, some nodes of the cluster have faster access to certain data files than to other data files. RMAN automatically detects this, which is known as node affinity awareness. When deciding which channel to use to back up a particular data file, RMAN gives preference to the nodes with faster access to the data files that you want to back up. For example, if you have a three-node cluster, and if node 1 has faster read/write access to data files 7, 8, and 9 than the other nodes, then node 1 has greater node affinity to those files than nodes 2 and 3.
Channel Connections to Cluster Instances with RMAN
Channel connections to the instances are determined using the connect string defined by channel configurations. For example, in the following configuration, three channels are allocated using dbauser/pwd@service_name. If you configure the SQL Net service name with load balancing turned on, then the channels are allocated at a node as decided by the load balancing algorithm.
However, if the service name used in the connect string is not for load balancing, then you can control at which instance the channels are allocated using separate connect strings for each channel configuration. So,your backup scripts will fail if that node/instance is down.
So, my recommendation in admin-managed database environment is create a set of nodes to perform the backup.
E.g : If you have 3 nodes you should use one or two node to perform backup, while the other node is less loaded. If you are using Load Balance in your connection… the new connection will be directed to the least loaded node.
Autolocation for Backup and Restore Commands
RMAN automatically performs autolocation of all files that it must back up or restore. If you use the noncluster file system local archiving scheme, then a node can only read the archived redo logs that were generated by an instance on that node. RMAN never attempts to back up archived redo logs on a channel it cannot read.
During a restore operation, RMAN automatically performs the autolocation of backups. A channel connected to a specific node only attempts to restore files that were backed up to the node. For example, assume that log sequence 1001 is backed up to the drive attached to node1, while log 1002 is backed up to the drive attached to node2. If you then allocate channels that connect to each node, then the channel connected to node1 can restore log 1001 (but not 1002), and the channel connected to node2 can restore log 1002 (but not 1001).
Configuring Channels to Use Automatic Load Balancing
To configure channels to use automatic load balancing, use the following syntax:
CONFIGURE DEVICE TYPE [disk | sbt] PARALLELISM number_of_channels;
Where number_of_channels is the number of channels that you want to use for the operation. After you complete this one-time configuration, you can issue BACKUP or RESTORE commands.
Setup Parallelism on RMAN is not enough to keep a balance, because if you start the backup from remote host using default SERVICE_NAME and if you are using parallelism the RMAN can start a session in each node and the backup be performed by all nodes at same time, this is not a problem, but can cause a performance issue on your environment due high load.
Even at night the backup can cause performance problems due maintenance of the database (statistics gathering, verification of new SQL plans “automatic sql tuning set”, etc).
The bottlenecks are usually in or LAN or SAN, so use all nodes to perform backup can be a waste. If the backup is run via LAN you can gain by using more than one node, but the server that is receiving the backup data will become a bottleneck.
I really don’t like to use more than 50% of nodes of RAC to execute backup due it can increase the workload in all nodes of clusters and this can be a problem to the application or database.
So, thinking to prevent it we can configure a Database Service to control where backup will be performed.
Creating a Database Service to perform Backup
Before start I should explain about limitation of database service.
Some points about Oracle Services.
When a user or application connects to a database, Oracle recommends that you use a service for the connection. Oracle Database automatically creates one database service (default service is always the database name) when the database is created. For more flexibility in the management of the workload using the database, Oracle Database enables you to create multiple services and specify which database instances offer the services.
You can define services for both policy-managed and administrator-managed databases.
- Policy-managed database: When you define services for a policy-managed database, you assign the service to a server pool where the database is running. You can define the service as either uniform (running on all instances in the server pool) or singleton (running on only one instance in the server pool).
- Administrator-managed database: When you define a service for an administrator-managed database, you define which instances normally support that service. These are known as the
PREFERRED instances. You can also define other instances to support a service if the preferred instance fails. These are known as
About Service Failover in Administrator-Managed Databases
When you specify a preferred instance for a service, the service runs on that instance during normal operation. Oracle Clusterware attempts to ensure that the service always runs on all the preferred instances that have been configured for a service. If the instance fails, then the service is relocated to an available instance. You can also manually relocate the service to an available instance.
About Service Failover in Policy-Managed Databases
When you specify that a service is UNIFORM, Oracle Clusterware attempts to ensure that the service always runs on all the available instances for the specified server pool. If the instance fails, then the service is no longer available on that instance. If the cardinality of the server pool increases and a instance is added to the database, then the service is started on the new instance. You cannot manually relocate the service to a specific instance.
When you specify that a service is SINGLETON, Oracle Clusterware attempts to ensure that the service always runs on only one of the available instances for the specified server pool. If the instance fails, then the service fails overs to a different instance in the server pool. You cannot specify which instance in the server pool the service should run on.
For SINGLETON services, if a service fails over to an new instance, then the service is not moved back to its original instance when that instance becomes available again.
Summarizing about use Services
If your database is Administrator-Managed we can create a service and define where backup will be executed, and how much nodes we can use with preferred and available nodes.
If your database is Policy-Managed we cannot define where backup will be executed, but we can configure a service SINGLETON, that will be sure that backup will be executed in only node, if that node fail the service will be moved to another available node, but we cannot choose in which node backup will be performed.
For connections to the target and auxiliary databases, the following rules apply:
Starting with 10gR2, these connections can use a connect string that does not bind to any particular instance. This means you can use load balancing.
Once a connection is established, however, it must be a dedicated connection that cannot migrate to any other process or instance. This means that you still can’t use MTS or TAF.
Example creating service for Administrator-Managed Database
The backup will be executed on db11g2 and db11g3, but can be executed on db11g1 if db11g2 and db11g3 fail.
Set ORACLE_HOME to same used by Database
$ srvctl add service -d db_unique_name -s service_name -r preferred_list [-a available_list] [-P TAF_policy]
$ srvctl add service -d db11g -s srv_rman -r db11g2,db11g3 -a db11g1 -P NONE -j LONG
$ srvctl start service -d db11g -s srv_rman
Example creating service for Policy-Managed Database
Using a service SINGLETON the backup will be executed on node which service was started/assigned. The service will be changed to another host only if that node fail.
Set ORACLE_HOME to same used by Database
$ srvctl config database -d db11g |grep "Server pools"
Server pools: racdb11gsp
$ srvctl add service -d db11g -s srv_rman -g racdb11gsp -c SINGLETON
$ srvctl start service -d db11g -s srv_rman
If you have more than 2 nodes on cluster (with policy managed database) and you want use only 2 or more nodes to perform backup, you can choose the options below.
Configure a Service UNIFORM (the service will be available on all nodes) you can control how much instance will be used to perform backup, but you cannot choose in which node backup will be performed. In fact the service does not control anything, you will set PARALLELISM (RMAN) equal number of nodes wich you want use .
Ex: I have 4 Nodes but I want start backup in 2 nodes. I must choose parallelism 2. Remember that Oracle can start 2 Channel on same host, this depend on workload of each node.
Using Policy Managed Database you should be aware that you do not care where (node) each instance is running, you will have a pool with many nodes and Oracle will manage all resources inside that pool. For this reason is not possible to control where you will place a heavier load.
This will only work if you are performing online backup or are using Parallel Backup.
Configuring RMAN to Automatically Backup the Control File and SPFILE
If you set CONFIGURE CONTROLFILE AUTOBACKUP to ON, then RMAN automatically creates a control file and an SPFILE backup after you run the BACKUP or COPYcommands. RMAN can also automatically restore an SPFILE, if this is required to start an instance to perform recovery, because the default location for the SPFILE must be available to all nodes in your Oracle RAC database.
These features are important in disaster recovery because RMAN can restore the control file even without a recovery catalog. RMAN can restore an autobackup of the control file even after the loss of both the recovery catalog and the current control file. You can change the default name that RMAN gives to this file with the CONFIGURE CONTROLFILE AUTOBACKUP FORMAT command. Note that if you specify an absolute path name in this command, then this path must exist identically on all nodes that participate in backups.
RMAN performs the control file autobackup on the first allocated channel. Therefore, when you allocate multiple channels with different parameters, especially when you allocate a channel with the CONNECT command, determine which channel will perform the control file autobackup. Always allocate the channel for this node first.