Performance of Large Oracle RAC Databases on MPP Machines

Unlike a cluster, where only a few nodes are interconnected, massively parallel processing (MPP) systems allow a large number of nodes to be interconnected. There are MPP systems with more than one thousand nodes. Each of these nodes is a separate system with its own CPU, disks, controllers, memory, and internal system buses, and together they form a “loosely coupled”, shared-nothing architecture. All of the nodes are connected via a high-speed, high-bandwidth interconnect, and each node runs its own copy of the operating system. On an MPP architecture, Oracle is installed in RAC mode.

The Oracle instance on each node is responsible for all the resources it holds, has a view of the entire database, and can find out which node holds a lock on any part of the database. If an instance needs something that is locked by another node, it requires an inter-instance ping, in which the other instance has to write to disk all the changes it has made. A typical database query goes against one node; the data is picked up from the memory or disks of whichever nodes hold it and travels over the interconnect to the requesting node. Each node controls its own set of disks and can take over control of another set if a node fails. Thus, the nodes may all be configured as primary, or as a combination of primary and secondary. The database files are placed on the primary nodes, whereas the secondary nodes provide the redundancy needed to take over and replace primary nodes if the latter fail.
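The cost of an inter-instance ping can be illustrated with a small toy model. The sketch below is not Oracle's actual locking protocol or any Oracle API; the class `ToyCluster`, its fields, and the node names are hypothetical and exist only to show the idea described above: when a block is requested by a node other than the one holding it, a ping occurs, and if the holder has unwritten changes it must first write them to disk.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Toy model of block ownership across instances (illustrative only)."""
    holder: dict = field(default_factory=dict)  # block id -> node holding the lock
    dirty: set = field(default_factory=set)     # blocks with changes not yet on disk
    pings: int = 0                              # inter-instance pings so far
    disk_writes: int = 0                        # forced writes caused by pings

    def access(self, node: str, block: int, write: bool = False) -> None:
        owner = self.holder.get(block)
        if owner is not None and owner != node:
            # Another instance holds the lock, so it has to be pinged.
            self.pings += 1
            if block in self.dirty:
                # The holder writes its changes to disk before giving up the block.
                self.disk_writes += 1
                self.dirty.discard(block)
        self.holder[block] = node
        if write:
            self.dirty.add(block)


cluster = ToyCluster()
cluster.access("node1", 42, write=True)  # node1 takes the lock and modifies block 42
cluster.access("node1", 42)              # same node again: no ping
cluster.access("node2", 42)              # other node: ping, node1 flushes to disk
print(cluster.pings, cluster.disk_writes)  # prints: 1 1
```

In this model, repeated access from the same node is free, while every change of node for a modified block costs a ping and a disk write, which is why it matters where the work on a given piece of data is done.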

The biggest disadvantage of MPP architectures is not the architecture itself, but the application design flaws found in a typical implementation. Most application designers are well versed in implementing applications on SMP machines. MPP architectures, however, require a total paradigm shift: the designer has to analyze which data needs to be placed on which node in order to reduce data sharing and inter-instance pings across nodes. Since each node is highly independent in a shared-nothing architecture, any situation that causes nodes to exchange resources on a large scale because of data sharing results in heavy traffic across the interconnect, and this causes severe performance problems in Oracle RAC databases. RAC performance problems are also frequently the result of a wrong hardware architecture configuration. You should therefore consider the hardware architecture and the deployment of the operating system before proceeding with the Oracle Real Application Clusters configuration. Otherwise, you are likely to run into many problems after deployment, and changing the hardware configuration requires downtime of your highly available databases.
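The effect of data placement on interconnect traffic can be sketched with a simple count. The example below is a toy simulation under stated assumptions, not a measurement of a real RAC system: the function `count_transfers`, the four-node cluster, and the workload of 8 sessions each touching the same 100 blocks are all hypothetical. It compares routing work by session (the same block ends up being served by different nodes) with routing work by block (each block is always served by the same node).

```python
def count_transfers(requests, route):
    """Count how often a block has to move from one node to another.

    requests: iterable of (session_id, block_id) pairs
    route:    function (session_id, block_id) -> index of the serving node
    """
    holder = {}     # block id -> node that currently holds the block
    transfers = 0
    for session_id, block_id in requests:
        node = route(session_id, block_id)
        previous = holder.get(block_id)
        if previous is not None and previous != node:
            transfers += 1  # the block travels over the interconnect
        holder[block_id] = node
    return transfers


NODES = 4  # assumed cluster size for this illustration

# Hypothetical workload: 8 sessions, each touching the same 100 blocks.
requests = [(session, block) for session in range(8) for block in range(100)]

# Placement that ignores the data: sessions are spread over the nodes.
by_session = count_transfers(requests, lambda s, b: s % NODES)

# Placement that follows the data: each block always goes to the same node.
by_block = count_transfers(requests, lambda s, b: b % NODES)

print(by_session, by_block)  # prints: 700 0
```

With session-based routing every block moves seven times in this workload; with block-based routing it never moves. Real workloads are rarely this clean, but the direction of the effect is the point made above: the less data is shared across nodes, the less traffic crosses the interconnect.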