Monday 25 September 2017

Oracle 11gR2 RAC Architecture

Introduction 

Oracle RAC (Real Application Clusters) is a high availability and scalability solution provided by Oracle for enterprise applications. The feature was introduced in Oracle 9i and has evolved since then.
An Oracle RAC system enables multiple servers (nodes) to access the data of a single database that resides on shared storage, i.e. multiple Oracle database instances access a single database. Oracle 11gR2 introduced Grid Infrastructure, which is used to implement the cluster. The Grid Infrastructure software bundle includes both Oracle Clusterware and Oracle ASM.

What is SCAN in Oracle RAC?

SCAN (Single Client Access Name) provides a single name through which clients access the databases running in the cluster. When a new node joins the cluster, Clusterware obtains the VIP address, updates the cluster resource and makes it accessible through DNS. The advantage of using SCAN is that the connection information on the client does not need to change when a node is added to or removed from the cluster. SCAN also provides load balancing and failover for client connections to the database.
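
To make the client-side benefit concrete, here is a minimal Python sketch that builds an Easy Connect string against a SCAN. The SCAN name rac-scan.example.com, the service name orcl_svc and the default port 1521 are hypothetical placeholders, not values from this article.

```python
def scan_connect_string(scan_name: str, service_name: str, port: int = 1521) -> str:
    """Build an Easy Connect string of the form //host:port/service_name."""
    return f"//{scan_name}:{port}/{service_name}"


# Hypothetical values: the SCAN name and the service name are placeholders.
CLIENT_DSN = scan_connect_string("rac-scan.example.com", "orcl_svc")

# The string names only the SCAN, never an individual node, so it stays
# the same when nodes are added to or removed from the cluster.
print(CLIENT_DSN)
```

Because no node name or node VIP appears in the string, the same client setting keeps working as the cluster grows or shrinks.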

What is Cache fusion?

Oracle Cache Fusion is a mechanism by which Oracle RAC logically combines the buffer caches of all instances so that applications can process data as if it resided in a single cache. The components of the Cache Fusion mechanism are listed below
  • GES - Global Enqueue Service
  • GCS - Global Cache Service
  • GRD - Global Resource Directory
     The Cache Fusion process is managed by two services, GCS and GES. Both (GES and GCS) record the status of each cached block in the Global Resource Directory (GRD). The GRD contents are shared across all the members of the Oracle RAC cluster. These activities are performed by the background processes mentioned below

Background Processes for Cache Fusion

  • ACMS (Atomic Controlfile to Memory Service) - An agent process that runs on each node of the cluster and ensures that a distributed SGA update is either committed globally on success or aborted globally if a failure occurs
  • LMON (GES) - Maintains the GES memory structures in case of process failure. It handles lock reconfiguration and cluster reconfiguration whenever a node joins or leaves the cluster
  • LMS (GCS) - Maintains records of all the blocks that have been cached in memory; this information is stored in the GRD (Global Resource Directory). LMS also controls access to cached data blocks and transmits blocks between the buffer caches of different instances
  • LMD (GES) - Manages incoming remote resource requests within each instance
  • LCK0 (GES) - Manages non-Cache Fusion resource requests such as library cache and row cache requests.
  • GTX0-j: Global Transaction Process :- This process provides transparent support for XA global transactions in an Oracle RAC environment. The database auto tunes the number of these processes based on the workload of XA global transactions.
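
The idea behind Cache Fusion can be illustrated with a toy model in Python. This is only a sketch and not Oracle's implementation: the class ToyCluster, its dictionaries and the instance names are all made up for illustration, and locking, block modes and writes are left out. It shows a directory (standing in for the GRD) that remembers which instance caches a block, so a second instance can get the block from the first instance's cache instead of from disk.

```python
class ToyCluster:
    """Toy model of Cache Fusion; not Oracle's actual implementation."""

    def __init__(self, instance_names, disk):
        self.disk = dict(disk)                               # shared storage: block id -> data
        self.caches = {name: {} for name in instance_names}  # per-instance buffer cache
        self.grd = {}                                        # block id -> instances caching it

    def read(self, instance, block_id):
        """Return (data, source), where source says where the block came from."""
        cache = self.caches[instance]
        if block_id in cache:
            return cache[block_id], "local cache"
        holders = self.grd.get(block_id, set())
        if holders:
            # Another instance already caches the block: ship it across the
            # interconnect instead of reading from disk (the role LMS plays).
            donor = sorted(holders)[0]
            data = self.caches[donor][block_id]
            source = f"remote cache of {donor}"
        else:
            data = self.disk[block_id]
            source = "disk"
        cache[block_id] = data
        self.grd.setdefault(block_id, set()).add(instance)
        return data, source


cluster = ToyCluster(["inst1", "inst2"], disk={42: "row data"})
print(cluster.read("inst1", 42))  # first access goes to disk
print(cluster.read("inst2", 42))  # served from the cache of inst1
print(cluster.read("inst2", 42))  # now a local hit
```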

Components of Oracle Clusterware

  • Voting Disks
  • OCR - Oracle Cluster Registry
  • Oracle Clusterware Stack

Voting Disks

Voting disks store node membership information and provide fencing. Oracle recommends having at least three voting disks but not more than five; the maximum we can use is 15 voting disks. They are kept in a shared location (shared storage) within the cluster and are shared by all the nodes that belong to the cluster.
     In Oracle RAC, CSSD monitors the health of the RAC nodes via heartbeats (a network heartbeat and a disk heartbeat). Heartbeat information from each node is constantly written to the voting disks; if a node cannot access the voting disks, that node is evicted from the cluster. This is done to avoid the split-brain phenomenon. The voting disks decide which nodes should be part of the cluster. Nodes with quorum (a number, usually a majority of the members of a body) keep their active membership of the cluster, and the other node(s) are fenced/rebooted.
Possible failure scenarios are listed below
  • The network heartbeat is fine, but the disk heartbeat has failed
  • The disk heartbeat is fine, but the network heartbeat has failed
  • Both heartbeats have failed
  • The nodes in the cluster have split into subsets, each communicating within its own set but not with the other members
  • One of the nodes is unhealthy
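
The quorum idea above can be sketched in a few lines of Python. This is a simplified illustration and not Clusterware code. The function names are made up, and the tie-break rule in surviving_subcluster (largest sub-cluster wins, lowest node number wins a tie) is an assumption stated for this sketch.

```python
def has_voting_disk_majority(accessible: int, total: int) -> bool:
    """A node must reach a strict majority (more than half) of the voting disks."""
    return accessible > total // 2


def tolerated_disk_failures(total: int) -> int:
    """How many voting disks can be lost while a majority is still reachable."""
    return (total - 1) // 2


def surviving_subcluster(subclusters):
    """Pick the sub-cluster that keeps running after a network split.

    Assumed rule for this sketch: the largest sub-cluster wins, and a tie is
    broken in favour of the group that contains the lowest node number.
    """
    return max(subclusters, key=lambda group: (len(group), -min(group)))


for total in (1, 3, 5):
    print(total, "voting disks tolerate", tolerated_disk_failures(total), "failure(s)")

print(has_voting_disk_majority(accessible=1, total=3))  # one of three is not a majority
print(surviving_subcluster([{1, 2}, {3}]))              # the larger group survives
print(surviving_subcluster([{1}, {2}]))                 # tie: the group with node 1 survives
```

This also shows why an odd number of voting disks is the usual choice: going from three to four disks does not increase the number of disk failures that can be tolerated.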

OCR - Oracle Cluster Registry

Oracle Clusterware uses the OCR (Oracle Cluster Registry) to store and manage information about cluster resources such as Oracle RAC databases, listeners, VIPs etc. The OCR resides in a shared storage location that is accessible to all the nodes of the cluster.
     The OCR is a major component of the cluster, and it is automatically backed up every 4 hours. You can check the backups using the command
$ ocrconfig -showbackup 

     OCSSd (Oracle CSS daemon) uses the OCR extensively and writes changes to the registry. The OCR is loaded into the cache of each node; every node in the cluster updates its cache, and one node at a time writes the cache to the OCR file. CRSd updates the OCR with the status of each node in the cluster during reconfiguration and failures.

Oracle Clusterware Stack

Oracle Clusterware has two stacks: the Cluster Ready Services stack and the High Availability Services stack

Cluster Ready Services Stack

  1. Cluster Ready Services (CRS) :- The CRSd daemon manages cluster resources using the information stored in the OCR for each resource. It generates an event whenever the status of a resource changes. CRSd monitors resources such as the database, listener etc. and restarts these components if they fail.
  2. Cluster Synchronization Services (CSS) :- Manages node membership in the cluster and notifies the members whenever a node leaves or joins the cluster. CSSd monitors the cluster and provides I/O fencing. Prior to Oracle 11gR2 this was taken care of by the OPROCd daemon.
  3. Oracle ASM :- Provides disk management for the cluster and databases.
  4. Cluster Time Synchronization Service (CTSS) :- Provides time management for the cluster.
  5. Event Management (EVM) :- Publishes the events that Clusterware generates.
  6. Oracle Notification Service (ONS) :- A publish and subscribe service for Fast Application Notification (FAN) events.
  7. Oracle Agent (oraagent) :- Extends Clusterware to support Oracle-specific requirements and runs server callout scripts when FAN events occur.
  8. Oracle Root Agent (orarootagent) :- Helps CRSd manage all the resources owned by root, such as the network, Grid VIP etc.

High availability Services Stack

  1. Cluster Logger Service (ologgerd) :- Receives information from all the nodes and writes it to the CHM repository database. This service runs on only two nodes
  2. System Monitor Service (osysmond) :- Collects operating system metrics and monitoring data and sends this data to the Cluster Logger Service. It runs on all the nodes of the cluster
  3. Grid Plug and Play (GPNPD) :- Provides access to the Grid Plug and Play profile and coordinates profile updates among the nodes in the cluster. This ensures that all the nodes in the cluster have the most recent profile
  4. Grid Inter-process Communication (GIPC) :- Enables redundant interconnect usage
  5. Multicast Domain Name Service (mDNS) :- Used by Grid Plug and Play to locate the profiles in the cluster
  6. Oracle Grid Naming Service (GNS) :- Handles requests sent by external DNS servers and performs name resolution for names defined in the cluster

Utilities to Manage Oracle RAC Environment 

SRVCTL - Server Control

It is the command line utility used to manage resources such as databases, listeners and services in the cluster.

CRSCTL - Oracle Clusterware Control

This is the command line utility used to manage Clusterware and its resources.

OIFCFG - Oracle Interface Configuration Tool

This utility is used to allocate and de-allocate network interfaces to components of the cluster.

OCRCONFIG - Oracle Cluster Registry Configuration Tool

It is the command line utility used to manage the OCR component of RAC. We can also use OCRCHECK and OCRDUMP to troubleshoot issues related to the OCR.

CHM - Cluster Health Monitor

It is a tool that tracks operating system resource consumption such as processes, devices etc. It collects and analyzes clusterware data.

CVU - Cluster Verification Utility 

This is the command line utility used to verify cluster components such as shared storage devices, networking configurations, system requirements, Oracle Clusterware, and operating system groups and users.

OEM - Oracle Enterprise Manager

It is a GUI portal where we can monitor standalone and RAC databases along with all the components of Grid Infrastructure.
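
As a rough illustration of how the command line utilities above fit together, the following Python sketch lists one common read-only command per tool. It is only a sketch: the database unique name orcl is a hypothetical placeholder, the selection of checks is my own, and by default nothing is executed, the commands are only printed.

```python
import shutil
import subprocess

# Hypothetical database unique name; replace it with a real one.
DB_UNIQUE_NAME = "orcl"

HEALTH_CHECKS = {
    "database status": ["srvctl", "status", "database", "-d", DB_UNIQUE_NAME],
    "clusterware stack": ["crsctl", "check", "crs"],
    "voting disks": ["crsctl", "query", "css", "votedisk"],
    "OCR integrity": ["ocrcheck"],
    "OCR backups": ["ocrconfig", "-showbackup"],
    "network interfaces": ["oifcfg", "getif"],
    "node connectivity": ["cluvfy", "comp", "nodecon", "-n", "all"],
}


def run_checks(checks=HEALTH_CHECKS, dry_run=True):
    """Print each command; run it only if dry_run is False and the tool is on PATH."""
    results = {}
    for label, argv in checks.items():
        print(f"{label}: {' '.join(argv)}")
        if dry_run or shutil.which(argv[0]) is None:
            results[label] = None  # not executed
            continue
        completed = subprocess.run(argv, capture_output=True, text=True)
        results[label] = completed.returncode
    return results


run_checks()  # dry run: only lists the commands
```

On a cluster node, calling run_checks(dry_run=False) as a user with the Grid Infrastructure environment set would run the commands and collect their return codes; some of them may need elevated privileges.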
