DNS Setup for Effective 11i DR Failover

Jul 28, 2008 / By Vasu Balla


One of the main goals in architecting a Disaster Recovery (DR) solution is to make a DR failover transparent to the end users. Too often, users must reboot their desktops, clear their browser cache and the JInitiator JAR cache, and so on, even when we have made sure that the post-failover URL of the 11i instance is the same. Only if users can carry on working after a failover of an 11i instance from the primary site to the DR site, without changing anything on their desktops, can we say that the goal is achieved.

In most cases the culprits are: forgetting the DNS setup for the hostnames of the middle tiers, or of the load balancer if one is used; and the caching of DNS entries at different levels of the network. A quick look at the caching section of Wikipedia’s page on DNS gives some idea of what I’m talking about. Because of the default settings, the old IP address gets cached on the user’s desktop and in caching DNS servers in the network. As a result, the user’s desktop keeps trying to reach the old server, which is now offline.

The best fix for these kinds of DNS side effects is to change the TTL (Time To Live) parameter of the DNS entry for the hostname from the default value to a smaller one. I prefer setting it to a value a little smaller than the time you take to fail over. That is, if you take 60 minutes to fail over from the primary to the secondary datacenter, then set the TTL to 50 minutes.

Let’s take an example here. Let’s say our 11i instance has the URL http://apps.example.com:8000, the primary instance being windsor, the secondary ottawa. And we have two load balancers: one at primary site and one at the secondary, with hostnames lb.windsor.example.com and lb.ottawa.example.com respectively. If the DNS is set up with default values, it will look like this:

hostname                 TTL     Type    value
----------------------------------------------
apps.example.com         86400   CNAME   lb.windsor.example.com
lb.windsor.example.com   86400   A       192.168.1.100
lb.ottawa.example.com    86400   A       192.168.2.100

apps.example.com is an alias (CNAME) to lb.windsor.example.com and the TTL value is set to 86400 seconds, i.e., 24 hours. That means this record gets cached for up to 24 hours on the user’s desktop and on any caching DNS servers used by the client. So at the time of failover, even though we change the DNS record of apps.example.com to point to the ottawa load balancer instead of windsor, the user’s browser will still be trying to reach the primary site load balancer, because the old record can stay cached for up to 24 hours.
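To make the alias chain concrete, here is a minimal sketch that resolves apps.example.com against an in-memory copy of the table above. The ZONE dictionary and the resolve function are hypothetical names used only for illustration; a real resolver does far more than this.

```python
# In-memory copy of the default DNS table: name -> (ttl_seconds, type, value).
# ZONE and resolve() are illustrative names, not part of any DNS library.
ZONE = {
    "apps.example.com": (86400, "CNAME", "lb.windsor.example.com"),
    "lb.windsor.example.com": (86400, "A", "192.168.1.100"),
    "lb.ottawa.example.com": (86400, "A", "192.168.2.100"),
}


def resolve(name, zone=ZONE, max_hops=8):
    """Follow CNAME records until an A record is found.

    Returns (ip_address, smallest TTL seen along the chain). The smallest
    TTL is how long a cache can keep the complete answer.
    """
    min_ttl = None
    for _ in range(max_hops):
        ttl, record_type, value = zone[name]
        min_ttl = ttl if min_ttl is None else min(min_ttl, ttl)
        if record_type == "A":
            return value, min_ttl
        name = value  # CNAME: continue with the target name
    raise RuntimeError("CNAME chain too long")


ip, ttl = resolve("apps.example.com")
print(ip, ttl)
```

With the default values, the answer for apps.example.com is the windsor address, and it can sit in a cache for the full 86400 seconds.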

As I suggested earlier, if we set the TTL of apps.example.com to 50 minutes (3000 seconds) and make the DNS changes as the first step in the failover procedure, then by the time we finish (which is supposed to take 60 minutes), the old DNS records in the user’s desktop cache and in the caching DNS server will have expired, and they will start seeing the new alias for apps.example.com: lb.ottawa.example.com.

hostname                 TTL     Type    value
----------------------------------------------
apps.example.com         3000    CNAME   lb.ottawa.example.com
lb.windsor.example.com   86400   A       192.168.1.100
lb.ottawa.example.com    86400   A       192.168.2.100
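The difference between the two TTL values can be illustrated with a toy cache simulation. This is only a sketch under stated assumptions: the DNS change happens at time 0, the failover takes 60 minutes, and the worst-case client cached the old record one second before the change. CachingResolver, make_authoritative and target_seen_after_failover are hypothetical names, not a real resolver API.

```python
class CachingResolver:
    """Toy cache: keeps an answer until its TTL expires, then asks again."""

    def __init__(self, authoritative):
        self.authoritative = authoritative  # callable: now -> (value, ttl)
        self.cached_value = None
        self.expires_at = None

    def lookup(self, now):
        # Refresh only when nothing is cached or the cached entry has expired.
        if self.expires_at is None or now >= self.expires_at:
            value, ttl = self.authoritative(now)
            self.cached_value = value
            self.expires_at = now + ttl
        return self.cached_value


def make_authoritative(ttl, change_at=0):
    """Authoritative answer for apps.example.com before and after the change."""
    def answer(now):
        if now < change_at:
            return "lb.windsor.example.com", ttl
        return "lb.ottawa.example.com", ttl
    return answer


def target_seen_after_failover(ttl, failover_seconds=3600):
    """What a worst-case client resolves once the failover has finished."""
    resolver = CachingResolver(make_authoritative(ttl, change_at=0))
    resolver.lookup(now=-1)  # cached the old record just before the change
    return resolver.lookup(now=failover_seconds)


print(target_seen_after_failover(86400))  # still the windsor load balancer
print(target_seen_after_failover(3000))   # already the ottawa load balancer
```

Note that the sketch assumes the 3000-second TTL was already published well before the failover started. A client that cached the record while the TTL was still 86400 would keep it for up to 24 hours regardless of a later change.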

Some of you might already be thinking, why not set it to an even lower value, like 5 minutes? The main problem with setting it that low is that it increases the load on the DNS server. If you have a single DNS server and very low TTL values, any kind of outage on the DNS server will affect your users immediately, as their desktops will be making DNS lookups much more frequently than before. So in cases where you have low TTL settings, make sure you have at least two DNS servers at two different locations.
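A rough back-of-the-envelope estimate shows how the lookup volume grows as the TTL shrinks. This sketch assumes every client re-resolves the name as soon as its cached entry expires, which is an upper bound, and the figure of 1000 clients is purely hypothetical.

```python
SECONDS_PER_DAY = 86400


def lookups_per_day(ttl_seconds, clients=1):
    """Upper bound on daily lookups: each client re-resolves once per TTL."""
    return clients * SECONDS_PER_DAY / ttl_seconds


# Compare the default TTL, the 50-minute TTL, and a 5-minute TTL.
for ttl in (86400, 3000, 300):
    print(ttl, lookups_per_day(ttl, clients=1000))
```

Under these assumptions, going from 24 hours to 50 minutes multiplies the lookups by about 29, and going down to 5 minutes multiplies them by 288.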

Please feel free to post your experiences related to DNS in the comments section. Any comments or suggestions are welcome!

9 Responses to “DNS Setup for Effective 11i DR Failover”

  • Madhu Sudhan says:

    Hi Vasu,

    >> One of the main goals in architecting a Disaster Recovery (DR) solution is to make a DR failover transparent to the end users.

    I doubt if 11i Instance fail over can be made transparent to the users. Is there such a technology…? You will have at least to bounce the middle tier services.

    Thanks,
    Madhu

  • Vasu Balla says:

    Hi Madhu,

    You are right. We cannot make an 11i failover 100% transparent. There is some downtime involved. My point was more in the context of the URLs that users use to log in to 11i after failover. I have seen clients who use a completely different URL for the DR failover instance. This makes it difficult for end users, who often bookmark the 11i URL in their browsers.

    thanks
    Vasu

  • yadava ladisetty says:

    Hi Vasu,

    I appreciate your effort in posting the DNS Setup for Effective 11i DR Failover. I am implementing a DNS-level failover for our 11i applications. The same server also hosts other third-party applications, like appxwors for scheduling jobs, and it does not use any load balancers as of now. My question is: if we use DNS failover, can end users access the secondary server with the same URL, or do they need to change the URL?

  • yadava ladisetty says:

    Hi Vasu,

    I appreciate your patience in presenting the article. Do you have any best practise documents available for implementing the DNS for 11i failover.

    thanks
    yadava

  • yadava ladisetty says:

    Hi Vasu

    Appreciate your effort in clarifying the below thing.
    I want to have a failover DNS, where one entry can point to two different IP addresses, but at any one time points only to the primary server.
    If the primary server is down or not available, the other server takes over and the DNS automatically changes the entry to point to the failover server.

    1. I want to know whether simplefailover can do or not?
    2. is it free software?
    3. how about maintenance. is it easy to maintain?

    thanks yadava

  • Vasu Balla says:

    Hi Yadava,

    You can make your DNS point to 2 IP addresses at the same time; the browser will connect to whichever server is listening on the web port. Have Apache running only on the server that is active. Make sure to test it before implementing.

    review the below link for better understanding of DNS
    http://www.tenereillo.com/GSLBPageOfShame.htm

    Vasu

  • yadava ladisetty says:

    Hi Vasu,

    thanks for the information

    regards
    yadava

  • yadava ladisetty says:

    Hi Vasu

    What are things we need to take care from Oracle Applications and Database side for the DNS Level failover.

    regards
    yadava

  • DNS Lookup says:

    My point was more in the context of URLs that users use to log in to 11i after failover. Yes, you are correct to make sure of maintaining at least two DNS servers.

    ——————-
    Stephen
