Blog | Pythian

Listener over Infiniband on Exadata (part 1)

Written by Franky Faust | Feb 6, 2019 5:00:00 AM

 

Working with Oracle Engineered Systems offers unique opportunities for optimization. One such optimization is configuring a listener over the InfiniBand network. But why would you do this, and how does it benefit your environment?

Background: Understanding InfiniBand and SDP

First, it is important to understand that InfiniBand (IB) is a computer-networking communications standard used in high-performance computing. It features very high throughput and very low latency, making it the ideal interconnect for data transfer among servers and storage systems.

To leverage this hardware, we enable the Sockets Direct Protocol (SDP). SDP presents a standard sockets interface to the application while carrying the traffic directly over RDMA, bypassing the kernel TCP stack. By offloading transport processing to the InfiniBand host channel adapter, SDP frees up CPU cycles, decreases latency, and improves overall performance.

This approach is applicable to most Oracle Engineered Systems, including Exadata (full rack or elastic), SuperCluster, Exalytics, and BDA.

Contextualization: Why Offload to InfiniBand?

Imagine you have a high-demand application, such as a GoldenGate (GG) instance in a downstream architecture, running on one of your compute nodes (e.g., exadb01). While the database uses the InfiniBand network to talk to storage cells, it typically uses the public 10Gbps network to ship data to the GoldenGate instance.

The Problem: Network Bottlenecks

In environments generating massive amounts of redo—for example, 21TB per day (~900GB per hour)—the public network can become a bottleneck. While 2Gbps might only be 20% of a 10Gbps link, that bandwidth is shared with every other database and client application.

The Solution: InfiniBand Offloading

By moving this traffic to the Private InfiniBand network:

  • Reduced Overhead: Traffic that consumed 20% of the public network might only consume 2.5% to 5% of the InfiniBand bandwidth.
  • Improved Efficiency: The public network "breathes" again, improving response times for client-facing applications.
  • Direct Communication: Since compute nodes already use IB for cluster communication, we are simply leveraging existing high-speed paths for application data.

Note: Do not implement this if your InfiniBand network is already saturated with I/O traffic.

Implementation: Configuring the OS for SDP

To configure a listener (or DBLinks) over InfiniBand, we must first prepare the operating system across all compute nodes.

1. Enable the SDP Module

Check the current status of SDP across your nodes using dcli:

[root@exadb01 ~]# dcli -l root -g ~/dbs_group grep -i sdp /etc/rdma/rdma.conf
exadb01: SDP_LOAD=no
...

If it is set to no, change it to yes:

[root@exadb01 ~]# dcli -g ~/dbs_group -l root sed -i -e 's/SDP_LOAD=no/SDP_LOAD=yes/g' /etc/rdma/rdma.conf 
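The sed substitution can be previewed locally on a scratch copy before pushing it through dcli (the file name below is an illustrative stand-in, not the real path):

```shell
# Scratch copy standing in for /etc/rdma/rdma.conf on one node
printf 'SDP_LOAD=no\n' > /tmp/rdma.conf.demo
# Same in-place substitution that dcli runs on every node
sed -i -e 's/SDP_LOAD=no/SDP_LOAD=yes/g' /tmp/rdma.conf.demo
grep SDP_LOAD /tmp/rdma.conf.demo
```

Re-running the same grep through dcli afterward is a cheap way to confirm the change took on every node.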

2. Configure Module Options

Verify if the SDP options are set in the Exadata configuration file:

[root@exadb01 ~]# dcli -l root -g ~/dbs_group grep -i options /etc/modprobe.d/exadata.conf 

If the ib_sdp options are missing, append them to the end of the file on all nodes:

[root@exadb01 ~]# dcli -l root -g ~/dbs_group "echo \"options ib_sdp sdp_zcopy_thresh=0 sdp_apm_enable=0 recv_poll=0\" >> /etc/modprobe.d/exadata.conf" 
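A word of caution on the >> append: re-running the dcli command duplicates the line. A guarded variant, sketched here against a scratch file, only appends when the options are absent, so the step is safe to re-run:

```shell
conf=/tmp/exadata.conf.demo    # scratch stand-in for /etc/modprobe.d/exadata.conf
: > "$conf"
line='options ib_sdp sdp_zcopy_thresh=0 sdp_apm_enable=0 recv_poll=0'
# Append only if the exact line is not already present
grep -qF "$line" "$conf" || echo "$line" >> "$conf"
grep -qF "$line" "$conf" || echo "$line" >> "$conf"   # second run is a no-op
grep -c '^options ib_sdp' "$conf"                     # still only one entry
```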

3. Verify libsdp Configuration

Ensure that SDP is enabled for both server and client roles in /etc/libsdp.conf:

[root@exadb01 ~]# dcli -g ~/dbs_group -l root grep ^use /etc/libsdp.conf
exadb01: use both server * *:*
exadb01: use both client * *:*
...
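If either line is missing, entries along these lines would need to be added to /etc/libsdp.conf on each node; the wildcard rules apply the policy to every program, address, and port, for both the server and client roles:

```
use both server * *:*
use both client * *:*
```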

Next Steps

Once these OS-level parameters are set, you must reboot the compute nodes for the changes to take effect. In the next part of this series, we will walk through the actual creation of the listener over the InfiniBand network.

