Helping colleagues in OTN Forum I found a issue that have not been solved yet, but we must be aware of this issue before performing implementation.

When you are using policy-managed database Oracle Manage all instance automatically on demand without dba intervention. For the whole thing work well is mandatory use SCAN (IP and SCAN Listeners) and VIP (IP and Local Listeners) database use these resource to automatically register it's own services and that's way how the Oracle client will find a Oracle Instance.

When you Configure a SHARED SERVER on Policy-Managed Dababase one question arise:
SHARED SERVER don't use Parameter LOCAL_LISTENER to register database on Listener, but it use DISPATCHERS parameter.

DISPATCHERS parameter is NOT Like LOCAL_LISTENER parameter which automatically updated by Database Agent. Then you must set DISPATCHERS manually using VIP of current Node on each Instance. As explained in note How To Configure Shared Server Dispatchers For RAC Environment [ID 578524.1]. BUT this will work only if you have a ADMIN-MANAGED database because instance is fixed on that node.

Policy-managed database has not fixed a instance on specific node and also has not a specific INSTANCE_NAME, because INSTANCE_NAME can be changed automatically on demand or when you change configuration of your SERVER POOL.

The BIG QUESTION:
How configure DISPATCHERS parameter when database is POLICY-MANAGED ? Since DISPATCHERS can't be automatically configured dynamically, but the INSTANCE is dynamic and can be reloacated to any node on that server pool.

When I get the anwser I'll post here Oracle Solution. (If someone already has the answer please post here)

Backup execution time  is slow.. where is the problem?

When we perform backup using third party software and backup of database is slow, there always is some uncertain who is causing the slowness.

The Database Administrator (DBA) says: is Media Management Library (aka MML) and Backup Operator says: is RMAN.

To end this conflict I ​​will show how to identify where is the problem.

I'll use the term RMAN (like RMAN spend time), but actually means Database spend time. Because RMAN is only a client. So typically the database is slow and not RMAN Client is slow.

Note: I will not diagnose what is causing it, I will only  help you  identify whether the problem is, in  MML or RMAN.

Media Management

The Oracle Media Management Layer (MML) API lets third-party vendors build a media manager, software that works with RMAN and the vendor's hardware to allow backups to sequential media devices such as tape drives. A media manager handles loading, unloading, and labeling of sequential media such as tapes.

RMAN Interaction with a Media Manager

RMAN does not issue specific commands to load, label, or unload tapes. When backing up, RMAN gives the media manager a stream of bytes and associates a unique name with this stream. When RMAN must restore the backup, it asks the media manager to retrieve the byte stream. All details of how and where that stream is stored are handled entirely by the media manager. For example, the media manager labels and keeps track of the tape and names of files on each tape, and automatically loads and unloads tapes, or signals an operator to do so.

RMAN provides a list of files requiring backup or restore to the media manager, which in turn makes all decisions regarding how and when to move the data.

Before RMAN making a call to any of functions in the media management API, the server create a EVENT WAIT. So, These with  EVENT WAIT  is possible to get the number of  wait (in seconds or number) that the server has been waiting for this call to return.

So, we can calculate it and find out how much time RMAN spend waiting MML (e.g writing  or querying a backup piece filename) processing request and return to RMAN.

Complete list  EVENT of MML

Oracle 11.2 or above:

SELECT NAME
FROM   V$EVENT_NAME
WHERE  NAME LIKE '%MML%';
NAME
----------------------------------------
Backup: MML initialization
Backup: MML v1 open backup piece
Backup: MML v1 read backup piece
Backup: MML v1 write backup piece
Backup: MML v1 close backup piece
Backup: MML v1 query backup piece
Backup: MML v1 delete backup piece
Backup: MML create a backup piece
Backup: MML commit backup piece
Backup: MML command to channel
Backup: MML shutdown
Backup: MML obtain textual error
Backup: MML query backup piece
Backup: MML extended initialization
Backup: MML read backup piece
Backup: MML delete backup piece
Backup: MML restore backup piece
Backup: MML write backup piece
Backup: MML proxy initialize backup
Backup: MML proxy cancel
Backup: MML proxy commit backup piece
Backup: MML proxy session end
Backup: MML datafile proxy backup?
Backup: MML datafile proxy restore?
Backup: MML proxy initialize restore
Backup: MML proxy start data movement
Backup: MML data movement done?
Backup: MML proxy prepare to start
Backup: MML obtain a direct buffer
Backup: MML release a direct buffer
Backup: MML get base address
Backup: MML query for direct buffers

Previous version of Oracle Database 11.2 the Event name MML does not exists because it’s was changed on version 11.2 from %STB% to %MML%.

So, If you are using Oracle 11.1 or previous you can query V$EVENT_NAME where NAME like '%sbt%'.

SELECT NAME
FROM   V$EVENT_NAME
WHERE  NAME LIKE '%sbt%';

Backup: sbtinit
Backup: ssbtopen
Backup: ssbtread
Backup: ssbtwrite
Backup: ssbtbackup
.
.
.

So, lets start...
Oracle store statistics (cumulative, since database was started) of these wait on v$system_event. I always use GV$ because is very common we admin RAC env on this days.

Before start backup I'll take a snapshot intial of gv$system_event...by creating a table RMAN_MML_EVENT_T1.

Click on icon "View Source" to see formated text.

CREATE TABLE  RMAN_MML_EVENT_T1 AS
SELECT inst_id,
  event,
  TOTAL_WAITS,
  TOTAL_TIMEOUTS,
  TIME_WAITED,
  AVERAGE_WAIT,
  TIME_WAITED_MICRO,
  sysdate as SNAPSHOT_TIME
FROM gv$system_event
WHERE event LIKE 'Backup%';

SQL> select * from RMAN_MML_EVENT_T1;

   INST_ID EVENT                                   TOTAL_WAITS TOTAL_TIMEOUTS TIME_WAITED AVERAGE_WAIT TIME_WAITED_MICRO SNAPSHOT_TIME
---------- --------------------------------------- ----------- -------------- ----------- ------------ ----------------- -----------------
         1 Backup: MML initialization                      371              0       54365       146.54         543651136 08-08-12 17:11:05
         1 Backup: MML create a backup piece               450              0        4827        10.73          48270960 08-08-12 17:11:05
         1 Backup: MML commit backup piece                 450              0        7417        16.48          74172281 08-08-12 17:11:05
         1 Backup: MML shutdown                            371              0          47          .13            469267 08-08-12 17:11:05
         1 Backup: MML query backup piece                  894              0       11222        12.55         112222166 08-08-12 17:11:05
         1 Backup: MML extended initialization             371              0           0            0              3655 08-08-12 17:11:05
         1 Backup: MML delete backup piece                 444              0        5348        12.05          53480530 08-08-12 17:11:05
         1 Backup: MML write backup piece              1378078              0     3053683         2.22        3.0537E+10 08-08-12 17:11:05

8 rows selected.

I started backup using RMAN and MML (Tivoli Storage Manager). When backup finished you can query V$RMAN_BACKUP_JOB_DETAILS to get accurate time of backup

SELECT START_TIME,
  END_TIME,
  ROUND(INPUT_BYTES  /1024/1024/1024,2) IMPUT_GBYTES ,
  ROUND(OUTPUT_BYTES /1024/1024/1024,2) OUTPUT_GBYTES,
  INPUT_TYPE,
  ELAPSED_SECONDS
FROM V$RMAN_BACKUP_JOB_DETAILS
WHERE TRUNC(START_TIME) = TRUNC(SYSDATE)
AND INPUT_TYPE LIKE 'DB%';

START_TIME        END_TIME          IMPUT_GBYTES OUTPUT_GBYTES INPUT_TYPE    ELAPSED_SECONDS
----------------- ----------------- ------------ ------------- ------------- ---------------
08-08-12 17:23:44 08-08-12 17:26:38        12.85         10.06 DB FULL                   174

In my case the backup full take 174 seconds to backup read 12.85GB and Write on MML 10.06GB of data

So, after backup finish I take the 2nd snapshot by creating the table RMAN_SNAPSHOT_T2.


CREATE TABLE  RMAN_SNAPSHOT_T2 AS
SELECT inst_id,
  event,
  TOTAL_WAITS,
  TOTAL_TIMEOUTS,
  TIME_WAITED,
  AVERAGE_WAIT,
  TIME_WAITED_MICRO,
  sysdate as SNAPSHOT_TIME
FROM gv$system_event
WHERE event LIKE 'Backup%';

SQL> select * from RMAN_MML_EVENT_T2;

   INST_ID EVENT                                   TOTAL_WAITS TOTAL_TIMEOUTS TIME_WAITED AVERAGE_WAIT TIME_WAITED_MICRO SNAPSHOT_TIME
---------- --------------------------------------- ----------- -------------- ----------- ------------ ----------------- -----------------
         1 Backup: MML initialization                      373              0       54665       146.56         546652333 08-08-12 17:27:45
         1 Backup: MML create a backup piece               454              0        4860        10.71          48604759 08-08-12 17:27:45
         1 Backup: MML commit backup piece                 454              0        7482        16.48          74820999 08-08-12 17:27:45
         1 Backup: MML shutdown                            373              0          47          .13            471590 08-08-12 17:27:45
         1 Backup: MML query backup piece                  900              0       11281        12.53         112808077 08-08-12 17:27:45
         1 Backup: MML extended initialization             373              0           0            0              3665 08-08-12 17:27:45
         1 Backup: MML delete backup piece                 446              0        5373        12.05          53727006 08-08-12 17:27:45
         1 Backup: MML write backup piece              1419274              0     3067298         2.16        3.0673E+10 08-08-12 17:27:45

8 rows selected.

Now I can calculate the values from RMAN_MML_EVENT_T2 minus RMAN_MML_EVENT_T1 to get the real time spend on MML.
Note:
EVENT: Name of the wait event
TOTAL_WAITS: Total number of waits for the event
TOTAL_TIMEOUTS: Total number of timeouts for the event
TIME_WAITED: Total amount of time waited for the event (in hundredths of a second)
AVERAGE_WAIT: Average amount of time waited for the event (in hundredths of a second)
TIME_WAITED_MICRO: Total amount of time waited for the event (in microseconds)

SELECT t1.inst_id,
  t1.event,
  t2.total_waits       - t1.total_waits total_waits,
  t2.total_timeouts    -t1.total_timeouts total_timeouts,
  t2.time_waited       - t1.time_waited time_waited,
  t2.time_waited_micro - t1.time_waited_micro time_waited_micro
FROM RMAN_MML_EVENT_T1 T1,
  RMAN_MML_EVENT_T2 T2
WHERE t1.inst_id = t2.inst_id
AND t1.event     = t2.event;

   INST_ID EVENT                                   TOTAL_WAITS TOTAL_TIMEOUTS TIME_WAITED TIME_WAITED_MICRO
---------- --------------------------------------- ----------- -------------- ----------- -----------------
         1 Backup: MML initialization                        2              0         300           3001197
         1 Backup: MML create a backup piece                 4              0          33            333799
         1 Backup: MML commit backup piece                   4              0          65            648718
         1 Backup: MML shutdown                              2              0           0              2323
         1 Backup: MML query backup piece                    6              0          59            585911
         1 Backup: MML extended initialization               2              0           0                10
         1 Backup: MML delete backup piece                   2              0          25            246476
         1 Backup: MML write backup piece                41196              0       13615         136141912

8 rows selected.

As I can see above the MML spend more time writing backup piece.

So, I'll sum the time to get total time spend on MML.

SELECT SUM (total_waits) total_waits,
  SUM(total_timeouts) total_timeouts ,
  SUM (time_waited)/100 time_waited_in_second,
  SUM (time_waited_micro) time_waited_micro
FROM
  (SELECT t1.inst_id,
    t1.event,
    t2.total_waits       - t1.total_waits total_waits,
    t2.total_timeouts    -t1.total_timeouts total_timeouts,
    t2.time_waited       - t1.time_waited time_waited,
    t2.time_waited_micro - t1.time_waited_micro time_waited_micro
  FROM RMAN_MML_EVENT_T1 T1,
    RMAN_MML_EVENT_T2 T2
  WHERE t1.inst_id = t2.inst_id
  AND t1.event     = t2.event
  )

TOTAL_WAITS TOTAL_TIMEOUTS TIME_WAITED_IN_SECOND TIME_WAITED_MICRO
----------- -------------- --------------------- -----------------
      41218              0                140.97         140960346

Calculating time total of backup window, time spend on MML and time spend of RMAN.

Note: TIME_SPEND_BY_RMAN = (ELAPSED_SECOND_BACKUP-TIME_SPEND_BY_MML_SECOND)

ELAPSED_SECONDS_BACKUP          TIME_SPEND_BY_MML_SECOND       TIME_SPEND_BY_RMAN_SECOND
------------------------------ ------------------------------ -------------------
174                             140.97                         33.03

Summarizing:
Total time of backup : 174
Time spend by MML: 141
Time spend by RMAN : 33

If this backup is slow is because MML take (141*100/174) 81% of time spend of backup window.

Additional info:
As my backup was done over Lan:
(10.06GB * 1024 = 10301MB)
10301MB / 144 = 71Mbytes/second

As I'm using network interface of 1 Gbit I can consider a normal throughput.

Also you can monitoring in real time where is wait.

Just execute this script above:

Note : if you are using previous version of 11.2 change %MML% to %sbt%.

vi monitoring_mml.sh
sqlplus -s sys/<password>@<db_name> as sysdba<<EOF
set echo off
COLUMN EVENT FORMAT a17
COLUMN SECONDS_IN_WAIT FORMAT 999
COLUMN STATE FORMAT a15
COLUMN CLIENT_INFO FORMAT a30
set linesize 200

select to_char(sysdate,'dd-mm-yyyy hh24:mi:ss') actual_date from dual;

SELECT p.SPID, sw.EVENT, sw.SECONDS_IN_WAIT AS SEC_WAIT, 
       sw.STATE, CLIENT_INFO
FROM   gV\$SESSION_WAIT sw, gv\$SESSION s, gV\$PROCESS p
WHERE  sw.EVENT LIKE '%MML%'
AND    s.SID=sw.SID
AND    s.PADDR=p.ADDR;
EOF
exit

Using shell execute the command above, and you will see in real time the wait on MML.

while true
do
sh monitoring_mml.sh
sleep 1
done

.

Find us on Google+

Customers who experience Oracle Real Application Clusters (RAC) node evictions due to  excessive AIX kernel paging should carefully review and implement these recommended best practices. Testing and experience have found that memory over commitments may
cause scheduling delays for Oracle’s ‘oprocd’ process in Oracle RAC versions prior to 11.2 which may result in node evictions. Implementing all of these recommendations will reduce scheduling delays and corresponding oprocd initiated evictions for Oracle RAC versions prior to 11.2. For Oracle RAC versions 11.2 and later, implementing all of these recommendations will ensure optimal performance and scalability

 

http://www.oracle.com/technetwork/products/clusterware/overview/rac-aix-system-stability-131022.pdf

If link above is broke see below  click below:

rac-aix-system-stability-131022

Enjoy…

 

Oracle RDBMS Server 11gR2 Pre-Install RPM for Oracle Linux 6 has been released

By Lenz Grimmer

 Now that the certification of the Oracle Database 11g R2 with Oracle Linux 6 and the Unbreakable Enterprise Kernel has been announced, we are glad to announce the availability of oracle-rdbms-server-11gR2-preinstall, the Oracle RDBMS Server 11gR2 Pre-install RPM package (formerly known as oracle-validated). Designed specifically for Oracle Linux 6, this RPM aids in the installation of the Oracle Database.

In order to install the Oracle Database 11g R2 on Oracle Linux 6, your system needs to meet a few prerequisites, as outlined in the Linux Installation Guides. Using the Oracle RDBMS Server 11gR2 Pre-install RPM, you can complete most of the pre-installation configuration tasks. which is now available from the Unbreakable Linux Network, or via the Oracle public yum repository.

https://blogs.oracle.com/linux/entry/oracle_rdbms_server_11gr2_pre

Enjoy

I saw some doubts as to which utility to use and in what situation we should use.

I searched on some sites related to Oracle and saw that the people is still a bit confused about  which command we should use.

But before start there is a rule to a Clusterware Envorinment:

The “srvctl” is to be used to managed  resources with the prefix  ora.* resources and “crsctl” is to be used  to query or start/stop resources with prefix ora.*, but crsctl is not supported to modify or edit resources with prefix ora.* .

See this note on MOS:

Oracle Clusterware and Application Failover Management [ID 790189.1]

Using crs_* or crsctl commands on resources with the prefix ora.* (resources provided by Oracle) remains unsupported.

 

So, if you created a resource with “srvctl” this resource should be managed only by “srvctl”. If you create a resource with “crsctl” this resource should be managed using “crsctl” command.

Let’s talk about  the concept Policy-Based Cluster.

Oracle Clusterware 11g release 2 (11.2) introduces a different method of managing nodes and resources used by a database called policy-based management.

With Oracle Clusterware 11g release 2 (11.2) and later, resources managed by Oracle Clusterware are contained in logical groups of servers called server pools. Resources are hosted on a shared infrastructure and are contained within server pools. The resources are restricted with respect to their hardware resource (such as CPU and memory) consumption by policies, behaving as if they were deployed in a single-system environment.

Policy-based management:

  • Enables dynamic capacity assignment when needed to provide server capacity in accordance with the priorities you set with policies
  • Enables allocation of resources by importance, so that applications obtain the required minimum resources, whenever possible, and so that lower priority applications do not take resources from more important applications
  • Ensures isolation where necessary, so that you can provide dedicated servers in a cluster for applications and databases

Applications and databases running in server pools do not share resources. Because of this, server pools isolate resources where necessary, but enable dynamic capacity assignments as required. Together with role-separated management, this capability addresses the needs of organizations that have standardized cluster environments, but allow multiple administrator groups to share the common cluster infrastructure.

This is only a concept.

Therefore Oracle divided this concept to be used for two types of configuration

Policy-Managed Database and  Policy-Based Management to non-database.

Policy-Managed Database

A database that you define as a cluster resource. Management of the database is defined by how you configure the resource, including on which servers the database can run and how many instances of the database are necessary to support the expected workload.

To configure Policy managed database, Oracle already have pre-defined configuration for that.

So, the options are limited and specific to Database resources (such as Services,Database).

For that reason Oracle provided “srvctl add serverpool”.

 $ srvctl add serverpool -h

Adds a server pool to the Oracle Clusterware.

Usage: srvctl add srvpool -g <pool_name> [-l <min>] [-u <max>] [-i <importance>] [-n "<server_list>"] [-f]
 -g <pool_name> Server pool name
 -l <min> Minimum size of the server pool (Default value is 0)
 -u <max> Maximum size of the server pool (Default value is -1 for unlimited maximum size)
 -i <importance> Importance of the server pool (Default value is 0)
 -n "<server_list>" Comma separated list of candidate server names
 -f Force the operation even though some resource(s) will be stopped
 -h Print usage
 

http://docs.oracle.com/cd/E11882_01/rac.112/e16795/srvctladmin.htm#BAJDJFCJ

 Policy-Based Management to non-database.

To configure Non-Database resources, Oracle provided another command with much more options “crsctl add serverpool”.

This allow the DBA explore all options which Policy Managed can supply.

 $ crsctl add serverpool -h
 Usage:
 crsctl add serverpool <spName> [[-file <filePath>] | [-attr "<attrName>=<value>[,...]"]] [-i]
 where
 spName Add named server pool
 filePath Attribute file
 attrName Attribute name
 value Attribute value
 -i Fail if request cannot be processed immediately
 -f Force option

http://docs.oracle.com/cd/E11882_01/rac.112/e16794/crsref.htm#CWADD92179

So, we NEVER should not mix the serverpool used by database resource and serverpool used by non-database resource.

Also never use “crsctl” command to change Database Server Pool wich was created by “srvctl”.  Never put a database  in a serverpool created by using “crsctl” command.

Server Pool to database resource must be created by using “srvctl”.

Server Pool to non-database resource must be created by using “crsctl”

Question: Is possible change ora.* resources with “crsctl”?

Yes, It’s possible but not supported by Oracle.

Hope make this clear.

Enjoy…

In this post I will some tips for setting up RMAN in RAC environment.

I will not cover  topics about  RMAN that can be configured in standalone environment (e.g Incremental backup, use of FRA, etc.)

First question: Is there a difference of setting up RMAN between  standalone and RAC environments?

The answer is YES, not too much but some points must be observed.

RMAN Catalog

First of all, In my point of view use RMAN Catalog is mandatory. Because it’s a HA environment and  to restore a database without RMAN Catalog can take long time.

To protect and keep backup metadata for longer retention times than can be accommodated by the control file, you can create a recovery catalog. You should create the recovery catalog schema in a dedicated standalone database. Do not locate the recovery catalog with other production data. If you use Oracle Enterprise Manager, you can create the recovery catalog schema in the Oracle Enterprise Manager repository database.

About HA of RMAN Catalog?

I always recommend to place the Host that will hold RMAN Catalog on VirtualMachine, because  is a machine which require low resource and disk space and have low activity.

In case of the failure of Host RMAN Catalog is easy move that host to another Physical Host or Recover the whole virtual machine.

But if the option of use a VM is not avaliable. Use another cluster (e.g Test Cluster) env if avaliable.

The Database of RMAN catalog must be in ARCHIVELOG. Why?

It’s a prod env (is enough), will generate very small amount of  archivelogs, and in case of any corruption or user errors (e.g User generated new incarnation of Prod Database  during a Test Validation of Backup in Test env) can be necessary recovery point in time.

Due a small database of low activity I see some customers not giving importance to this database. It’s should not happens.

High availability of execution of backup using RMAN:

We have some challenges:

  1.  The backup must not affect the availability cluster, but the backup must be executed daily.
  2.  The backup cannot be dependent of nodes (i.e backup must be able to execute in all nodes independently if have some nodes active or not)
  3. Where store the scripts of backup? Where store the Logs?

I don’t recommend use any nodes of cluster to start/store scripts backups. Due if that node fail backup will not be executed.
Use the Host where is stored RMAN Catalog to store your backup scripts too and start these scripts from this host, the utility RMAN works as client only… the backup is always performed on server side.

Doing this you will centralize all scripts and logs of backup from your environment. That will ease the management of backup.

Configuring the RMAN Snapshot Control File Location in a RAC 11.2

RMAN creates a copy of the control file for read consistency, this is the snapshot controlfile. Due to the changes made to the controlfile backup mechanism in 11gR2 any instances in the cluster may write to the snapshot controlfile. Therefore, the snapshot controlfile file needs to be visible to all instances.
The same happens when a backup of the controlfile is created directly from sqlplus any instance in the cluster may write to the backup controfile file.
In 11gR2 onwards, the controlfile backup happens without holding the control file enqueue. For non-RAC database, this doesn’t change anything.
But, for RAC database, the snapshot controlfile location must be in a shared file system that will be accessible from all the nodes.
The snapshot controlfile MUST be accessible by all nodes of a RAC database.

See how do that:

https://forums.oracle.com/forums/thread.jspa?messageID=9997615

Since version 11.1 : Node Affinity Awareness of Fast Connections

In some cluster database configurations, some nodes of the cluster have faster access to certain data files than to other data files. RMAN automatically detects this, which is known as node affinity awareness. When deciding which channel to use to back up a particular data file, RMAN gives preference to the nodes with faster access to the data files that you want to back up. For example, if you have a three-node cluster, and if node 1 has faster read/write access to data files 7, 8, and 9 than the other nodes, then node 1 has greater node affinity to those files than nodes 2 and 3.

Channel Connections to Cluster Instances with RMAN

Channel connections to the instances are determined using the connect string defined by channel configurations. For example, in the following configuration, three channels are allocated using dbauser/pwd@service_name. If you configure the SQL Net service name with load balancing turned on, then the channels are allocated at a node as decided by the load balancing algorithm.

However, if the service name used in the connect string is not for load balancing, then you can control at which instance the channels are allocated using separate connect strings for each channel configuration. So,your backup scripts will fail if  that node/instance is down.

So, my recommendation in admin-managed database environment is create a set of nodes to perform the backup.

E.g : If  you have 3 nodes you should use one or two node to perform backup, while the other node is less loaded. If you are using Load Balance in your connection… the new connection will be directed to the least loaded node.

Autolocation for Backup and Restore Commands

RMAN automatically performs autolocation of all files that it must back up or restore. If you use the noncluster file system local archiving scheme, then a node can only read the archived redo logs that were generated by an instance on that node. RMAN never attempts to back up archived redo logs on a channel it cannot read.

During a restore operation, RMAN automatically performs the autolocation of backups. A channel connected to a specific node only attempts to restore files that were backed up to the node. For example, assume that log sequence 1001 is backed up to the drive attached to node1, while log 1002 is backed up to the drive attached to node2. If you then allocate channels that connect to each node, then the channel connected to node1 can restore log 1001 (but not 1002), and the channel connected to node2 can restore log 1002 (but not 1001).

Configuring Channels to Use Automatic Load Balancing

To configure channels to use automatic load balancing, use the following syntax:

CONFIGURE DEVICE TYPE [disk | sbt] PARALLELISM number_of_channels;

Where number_of_channels is the number of channels that you want to use for the operation. After you complete this one-time configuration, you can issue BACKUP or RESTORE commands.

Setup  Parallelism on RMAN  is not enough to keep a balance, because if you start the backup from remote host using default SERVICE_NAME and if you are using parallelism the RMAN can start a session in each node and the backup be performed by all nodes at same time, this is not a problem, but can cause a performance   issue on your environment due high load.

Even at night the backup can cause performance problems due maintenance of the database (statistics gathering, verification of new SQL plans “automatic sql tuning set”, etc).

The bottlenecks are usually in or LAN or SAN, so use all nodes to perform backup can be  a waste. If the backup is run via LAN you can gain by using more than one node, but the server that is receiving the backup data will become a bottleneck.

I really don’t like to use more than 50% of nodes of RAC to execute backup due it can increase the workload in all nodes of clusters and this can be a problem to the application or database.

So, thinking to prevent it we can configure a Database Service to control where backup will be performed.

Creating a Database Service to perform Backup

Before start I should explain about limitation of database service.

Some points about Oracle Services.

When a user or application connects to a database, Oracle recommends that you use a service for the connection. Oracle Database automatically creates one database service (default service is always the database  name)  when the database is created.   For more flexibility in the management of the workload using the database, Oracle Database enables you to create multiple services and specify which database instances offer the services.

You can define services for both policy-managed and administrator-managed databases.

  • Policy-managed database: When you define services for a policy-managed database, you assign the service to a server pool where the database is running. You can define the service as either uniform (running on all instances in the server pool) or singleton (running on only one instance in the server pool).
  • Administrator-managed database: When you define a service for an administrator-managed database, you define which instances normally support that service. These are known as the PREFERRED instances. You can also define other instances to support a service if the preferred instance fails. These are known as AVAILABLE instances.
About Service Failover in Administrator-Managed Databases

When you specify a preferred instance for a service, the service runs on that instance during normal operation. Oracle Clusterware attempts to ensure that the service always runs on all the preferred instances that have been configured for a service. If the instance fails, then the service is relocated to an available instance. You can also manually relocate the service to an available instance.

About Service Failover in Policy-Managed Databases

When you specify that a service is UNIFORM, Oracle Clusterware attempts to ensure that the service always runs on all the available instances for the specified server pool. If the instance fails, then the service is no longer available on that instance. If the cardinality of the server pool increases and a instance is added to the database, then the service is started on the new instance. You cannot manually relocate the service to a specific instance.

When you specify that a service is SINGLETON, Oracle Clusterware attempts to ensure that the service always runs on only one of the available instances for the specified server pool. If the instance fails, then the service fails overs to a different instance in the server pool. You cannot specify which instance in the server pool the service should run on.

For SINGLETON services, if a service fails over to an new instance, then the service is not moved back to its original instance when that instance becomes available again.

Summarizing about use Services

If your database is Administrator-Managed  we can create a service and define  where backup will be executed, and how much nodes we can use with preferred and available nodes.

If your database is Policy-Managed we cannot define where backup will be executed, but we can configure a service SINGLETON, that will be sure that backup  will be executed in  only node, if that node fail the service will be moved to another available node, but we cannot choose in which node backup will be performed.

Note:

For connections to the target and auxiliary databases, the following rules apply:

Starting with 10gR2, these connections can use a connect string that does not bind to any particular instance. This means you can use load balancing.

Once a connection is established, however, it must be a dedicated connection  that cannot migrate to any other process or instance. This means that you still can’t use MTS or TAF.

Example creating service for Administrator-Managed Database

The backup will be executed on db11g2 and db11g3, but can be executed on db11g1 if db11g2 and db11g3 fail.

Set ORACLE_HOME  to same used by Database

$ srvctl add service -d db_unique_name -s service_name -r preferred_list [-a available_list] [-P TAF_policy]

$ srvctl add service -d db11g -s srv_rman -r db11g2,db11g3 -a db11g1 -P NONE -j LONG

$ srvctl start service -d db11g -s srv_rman
Example creating service for Policy-Managed Database

Using a service SINGLETON the backup will be executed on node which service was started/assigned. The service will be changed to another host only if that node fail.

Set ORACLE_HOME  to same used by Database

$ srvctl config database -d db11g |grep "Server pools"

Server pools: racdb11gsp

$ srvctl add service -d db11g -s srv_rman -g racdb11gsp -c SINGLETON

$ srvctl start service -d db11g -s srv_rman

If you have more than 2 nodes on cluster (with policy managed database) and you want use only 2 or more nodes to perform backup, you can choose the options below.

Configure a Service UNIFORM (the service will be available on all nodes)   you can control how much instance will be used to perform backup, but you cannot choose in which node backup will be performed. In fact the service does not control anything, you will set  PARALLELISM (RMAN) equal number of nodes wich you want use .

Ex: I have 4 Nodes but I want start backup in 2 nodes. I must choose parallelism 2.  Remember that Oracle can start 2 Channel on same host, this depend on workload of each node.

Using Policy Managed Database you should be aware that you do not care where (node) each instance is running, you will have a pool with many nodes and Oracle will manage all resources inside that pool. For this reason is not possible to control where you will place a heavier load.

This will only work if you are performing online backup or are using Parallel Backup.

Configuring RMAN to Automatically Backup the Control File and SPFILE

If you set CONFIGURE CONTROLFILE AUTOBACKUP to ON, then RMAN automatically creates a control file and an SPFILE backup after you run the BACKUP or COPYcommands. RMAN can also automatically restore an SPFILE, if this is required to start an instance to perform recovery, because the default location for the SPFILE must be available to all nodes in your Oracle RAC database.

These features are important in disaster recovery because RMAN can restore the control file even without a recovery catalog. RMAN can restore an autobackup of the control file even after the loss of both the recovery catalog and the current control file. You can change the default name that RMAN gives to this file with the CONFIGURE CONTROLFILE AUTOBACKUP FORMAT command. Note that if you specify an absolute path name in this command, then this path must exist identically on all nodes that participate in backups.

RMAN performs the control file autobackup on the first allocated channel. Therefore, when you allocate multiple channels with different parameters, especially when you allocate a channel with the CONNECT command, determine which channel will perform the control file autobackup. Always allocate the channel for this node first.

Enjoy..

In this post I will show you how to setting up environment high availability without the option Oracle RAC.

Oracle Fail Safe is available only for Windows, for Unix / Linux would need third party software  Cluster to do the  Failover.

Good News From Oracle:

Oracle Clusterware

Oracle Clusterware provides cluster membership and high availability services. It provides the cluster membership for features such as Oracle Real Application Clusters and Oracle ASM. It includes the following features:

  • Application monitoring, restart, and failover
  • Cluster membership services
  • Server monitoring and fencing
  • Single Client Access Name (SCAN)
  • Server Pools
  • Grid Naming Services

Oracle Clusterware can be used to protect any application (restarting or failing over the application in the event of a failure), free of charge, if one or more of the following conditions are met:

  • The server OS is supported by a valid Oracle Unbreakable Linux support contract.
  • The product to be protected is either:
    • Any Oracle product (e.g. Oracle Applications, Siebel, Hyperion, Oracle Database EE, Oracle Database XE)
    • Any third-party product that directly or indirectly stores data in an Oracle database
  • At least one of the servers in the cluster is licensed for Oracle Database (SE or EE)

A cluster is defined to include all the machines that share the same Oracle Cluster Registry (OCR) and Voting Disk.

http://download.oracle.com/docs/cd/E11882_01/license.112/e10594/editions.htm

See step by step here using clusterware 11.1, we can improvise this setup to 11.2 using SCAN feature which is more easy.

http://www.oracle.com/technetwork/products/clusterware/overview/si-db-failover-11g-134623.pdf

If link above is off, click here