How EBS Concurrent Processing Should Run on Oracle RAC

Feb 19, 2013 / By Yury Velikanov

Suggested audience: Oracle Apps DBAs, Oracle E-Business Suite technical architects, and the Oracle ATG team.

Introduction

In this post, I describe the Oracle E-Business Suite (EBS) concurrent processing configuration options currently available in an Oracle Real Application Clusters (RAC) environment and share my views on how to improve them. My views are very close to enhancement request 4159920 APPSRAP:PCP/CONCURRENT MANAGER – RAC NODE AFFINITY & LISTENER LOAD BALANCING, which was originally submitted by Dell engineers in 2005. Please feel free to share your opinion on the PCP & RAC setup in the comments section below. Let me know if you share my vision and if it makes sense to you. Your opinion is important to me, and I will use it in further conversations with the Oracle ATG team. If you have time and think that your organization would benefit from enhanced functionality, please log an SR with Oracle Support asking them to make enhancement request 4159920 a priority, and add your business case. I hope you love EBS as much as I do and want to make it even better :)

What is wrong with the current options?

RAC was introduced for two main reasons: a) performance (scalability) and b) availability. An application that runs on RAC should be able to leverage that performance and ensure controlled fail-over in case of problems with any of the RAC nodes (read: availability). EBS components should switch to available nodes based on a well-designed and implemented fail-over plan. Below is a list of options available today for configuring the concurrent processing components in a RAC environment. If there is enough interest, I will provide more details on each of the options in a separate blog post. Please let me know if you’re interested using the comments section below.

  1. Today, the default option in EBS leverages < environment name >_BALANCE tns alias and load balances all services across all available database nodes [MOS Note ID 823587.1].
    • While this option works well for a small EBS implementation with over-provisioned hardware, it could lead to significant performance penalties in more active or less optimized environments. If we spread functionality across all available RAC nodes without functional partitioning, there is a very good chance that the interconnect network between database nodes will get overloaded. This significantly reduces RAC performance benefits and, in some extreme cases, makes performance worse than on a single instance (a single instance processes the load faster than several instances in a RAC configuration).
  2. Parallel Concurrent Processing (PCP) enables you to specify a tns alias per application node via the s_cp_twotask context file parameter. This way, if there are at least as many application nodes as database instances, we could point each application node to its own database instance and ensure functional partitioning. This separates different application modules by assigning each concurrent manager to its own application node using the “Primary Node” parameter (e.g. payroll, human resources, and financials each run on a separate application node and its corresponding RAC node).
    • This way we have better control over RAC interconnect load by ensuring that application functionality using the same data runs on the same RAC node.
    • In case of one database node’s failure, EBS switches all the concurrent managers running on the associated application node to the node specified by the “Secondary Node” parameter of each concurrent manager. Please note that this fail-over process works if the “Concurrent:PCP Instance Check” profile value is set to ON.
    • This method has one significant disadvantage: you need at least as many application nodes as RAC nodes. This wasn’t originally a big issue because most EBS customers executed concurrent managers on database nodes. In doing so, if a database node went down, it would stop application processes associated with that node, and PCP would migrate those processes to another concurrent processing node. Today, because most processing happens on the database side, there are often fewer application nodes than RAC nodes.
    • If an organization wants to benefit from application partitioning, it could implement the solution I describe under point 3 (below) or introduce, at a minimum, the same number of application nodes as RAC nodes. To save on hardware resources, application nodes could be virtualized; however, this doesn’t reduce the amount of operational effort needed to manage those nodes (e.g. patching).
  3. In EBS R12.1.3, Oracle introduced the “Target Instance” concurrent program level parameter (navigation path: Concurrent => Program => Define => Search => Session Control). This allows you to specify a preferred RAC instance at the concurrent program level. Oracle will try to execute the requests associated with the concurrent program on that RAC instance. If that is impossible, it will execute them on any available node (switching to the default mode).
    • This seems to be a very good option compared to other options we have talked about. It lets you assign a specific RAC node to all Payroll or Finance programs, which ensures that they are executed on a specific node, reduces interconnect load, and leverages RAC performance benefits.
    • There are a few disadvantages to this approach. One of them concerns the controlled fail-over I talked about earlier in this blog post. If a RAC node fails, there is no good control over where each concurrent request is executed. This may trigger a bit more chaos where it is least expected. One way to mitigate this problem is to update the “Target Instance” parameter immediately after the node failure using a semi-automated script; however, this requires customization, so it may not be a well-supported option.
    • The other challenge with this option is the complex maintenance. There are hundreds of concurrent programs, and it could be very time consuming to plan and assign RAC nodes to each or some of them.

Solution

I think that there should be an option to assign a tns alias (service) to an individual Concurrent Manager. It would significantly simplify the concurrent processing configuration in the RAC environments. Have a look at the diagram below. Let me go through some of the key components here:

  • I have highlighted the components related to different modules with different colors (e.g. payroll and human resources – green, finance – blue, etc.) to make the diagram easier to understand.
  • As soon as there is the option (as a result of enhancement request 4159920 implementation) to assign a tns alias to a concurrent manager, it will address the main limitations of the current configuration options listed in the first part of this blog.
    • We don’t need to worry about the number of application servers. Two application servers for redundancy purposes should be enough in a minimal configuration. Concurrent managers could establish connections to as many database instances as necessary.
    • The solution leverages Oracle Net Services. The fail-over configuration is done on the database side on a per-service basis, which allows you to design and implement a fail-over plan well before a fail-over happens. Fail-overs happen without any human intervention, preventing chaotic resource utilization.
    • It is simpler than assigning a “Target Instance” parameter to each concurrent program. Most EBS environments have queues and incompatibility parameters defined already. The rest is simple. Just add a tns alias and assign it to the concurrent manager. A single service or tns entry needs to be added to adjust the configuration if necessary.
  • In addition, the solution allows several other benefits, including:
    • A single service could be load balanced over several RAC instances, if necessary.
    • It is very easy to switch a service to run on one or a set of RAC instances, or to exclude an instance from executing concurrent processing.
    • Most of the changes could be executed dynamically without stopping or restarting concurrent managers or changing configuration files.
    • Services are an important Oracle 11gR2 cluster concept and should be leveraged by any application running on RAC.
    • Resource manager could be used to limit the amount of resources available to a certain module at any time during the life cycle. This is especially important when a database cluster, for one reason or another, runs in limited capacity.
    • Resources (e.g. CPU, IO, etc) could be controlled per service defined (e.g. payroll, finance, etc.).
    • To my knowledge, you can implement the feature without making many architectural changes. In fact, it looks to me like the “Target Instance” parameter is a more complex change as each concurrent request executed on a concurrent manager may need to create a connection to a different database instance. Proposed solution:
      • Internal Concurrent Manager should be able to monitor all concurrent managers the way it did before.
      • Service Managers should have no problem starting the concurrent managers by setting the appropriate TWO_TASK value.
      • Service Monitors monitor the Internal Concurrent Manager the same way they do today.
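
To make the per-service fail-over idea concrete, here is a minimal Python sketch of how a service with primary and fail-over instances could be placed as instances go down. It is a simplified model for illustration only, not how Oracle Clusterware is implemented. The `Service` class and its `running_on` method are hypothetical names, and the choice of PRD1 as the primary instance for PAY1, with PRD2 and PRD3 as fail-over instances, is an assumed example that reuses the service and instance names from the diagram in this post.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Service:
    """Hypothetical model of a database service with a fail-over plan."""

    name: str
    preferred: list[str]                      # primary instances
    available: list[str] = field(default_factory=list)  # fail-over instances

    def running_on(self, up_instances: set[str]) -> list[str]:
        """Return the instances that would host the service.

        Primary instances that are up are used first. For every primary
        instance that is down, one fail-over instance that is up takes over,
        in the order the fail-over instances were listed.
        """
        hosts = [i for i in self.preferred if i in up_instances]
        missing = len(self.preferred) - len(hosts)
        spares = [i for i in self.available if i in up_instances]
        return hosts + spares[:missing]


# Assumed example: the payroll service prefers PRD1 and may fail over
# to PRD2, then PRD3. PRD4 is never used for payroll.
pay = Service("PAY1", preferred=["PRD1"], available=["PRD2", "PRD3"])

print(pay.running_on({"PRD1", "PRD2", "PRD3", "PRD4"}))  # ['PRD1']
print(pay.running_on({"PRD2", "PRD3", "PRD4"}))          # ['PRD2']
```

The point of the model is that the fail-over order is decided once, per service, before anything fails. The concurrent manager only knows the service name, so nothing has to change on the application side when an instance goes down.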

Ideal Parallel Concurrent Processing setup in Real Applications Clusters environment

  • In case you have difficulty understanding the diagram, let me list the components used:
    • Payroll Worker Process, Payroll Message Report, etc – concurrent programs grouped using incompatibility parameters to be executed by one Concurrent Manager.
    • “Standard Manager (Payroll & HR)”, “Standard Manager (Financials)”, etc – are separate concurrent managers (queues).
    • PAY_STD1, OTH_STD1, FIN_STD1 - tnsnames.ora file’s entries on Application servers’ side.
    • PAY1, OTH1, FIN1 – services defined on the database cluster side. Each service has one or several primary instances and one or several fail-over instances. Each service can have certain resources assigned by Resource Manager.
    • RAC Node 1/2/3/4 – are RAC cluster nodes.
    • Instance PRD1/2/3/4 – are instances running on the RAC nodes.
    • PCP Node 1/2 – are concurrent processing application nodes.
  • Please note that:
    • There are fewer application nodes than database nodes, as concurrent processing no longer depends on which application node it runs on.
    • Because of PCP, the concurrent managers can still fail over to another application node based on the “Primary Node” and “Secondary Node” settings.
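
As an illustration of the tnsnames.ora entries listed above, the following Python sketch renders an alias such as PAY_STD1 that connects by service name only. It is a sketch under assumptions: the `tns_entry` helper, the host names `rac1-vip` to `rac4-vip`, and port 1521 are hypothetical and would differ in a real environment. What it tries to show is that the alias names the service (PAY1) and carries no INSTANCE_NAME, so the connection follows the service to whichever instance currently offers it.

```python
from __future__ import annotations


def tns_entry(alias: str, service: str, hosts: list[str], port: int = 1521) -> str:
    """Render a tnsnames.ora entry that connects by SERVICE_NAME only.

    No INSTANCE_NAME is written, so the listener decides which instance
    serves the connection, based on where the service is running.
    """
    addresses = "\n".join(
        f"      (ADDRESS = (PROTOCOL = TCP)(HOST = {host})(PORT = {port}))"
        for host in hosts
    )
    return (
        f"{alias} =\n"
        f"  (DESCRIPTION =\n"
        f"    (ADDRESS_LIST =\n"
        f"      (LOAD_BALANCE = YES)(FAILOVER = YES)\n"
        f"{addresses}\n"
        f"    )\n"
        f"    (CONNECT_DATA = (SERVICE_NAME = {service}))\n"
        f"  )\n"
    )


# Hypothetical listener addresses for the four RAC nodes in the diagram.
rac_hosts = ["rac1-vip", "rac2-vip", "rac3-vip", "rac4-vip"]

# One alias per concurrent manager queue, each mapped to its own service.
for alias, service in [("PAY_STD1", "PAY1"), ("OTH_STD1", "OTH1"), ("FIN_STD1", "FIN1")]:
    print(tns_entry(alias, service, rac_hosts))
```

Listing every node in each alias keeps the application side unaware of the fail-over plan: which instances actually run PAY1, OTH1, or FIN1 is decided entirely by the service definition on the database cluster.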

I am very interested to hear your thoughts and experience on how to configure and use Concurrent Processing in Real Application Clusters environments. Please feel free to comment or even send me an email or a message directly. Please keep in mind Dell engineers’ enhancement request 4159920 APPSRAP:PCP/CONCURRENT MANAGER – RAC NODE AFFINITY & LISTENER LOAD BALANCING. If you think this is an option your organisation could benefit from, feel free to log an SR with Oracle Support and ask to add your business case to the enhancement request/bug description.

Yury

30 Responses to “How EBS Concurrent Processing Should Run on Oracle RAC”

  • Thank you, Yury, for this amazing article describing concurrent processing availability solutions. I configure PCP for RAC and non-RAC EBS environments. As far as I know, when you configure PCP you have concurrent managers on more than one node, but shutting down one apps node, including its CM, for patching is, as far as I know, not supported. When you configure a concurrent manager, you assign a primary node for that manager. If that node goes down, then GSM starts a new manager on another node. I would like to know what options we have in this case.

  • Fantastic Yury! This is indeed a good solution for a large EBS implementation with heavy batch processing in the order management module, such as Dell’s. I am currently part of this architecture :) and the architecture mentioned in enhancement request 4159920 (in 2005) has now grown from 6-Node MT to 8-Node MT and from 6-Node RAC to 16-Node RAC. Below is how the solution currently looks.

    DR DB servers DB Tier Services DB Tier Services Failover
    ============== ================ ====================================
    DBNode-01 CM_PCP1| CM_PCP4 |CM_PCP5 |CM_PCP6 |CM_PCP10|
    DBNode-02 CM_PCP2|
    DBNode-03 CM_PCP3| CM_PCP9|
    DBNode-04 CM_PCP4| CM_PCP8|
    DBNode-05 CM_PCP5|
    DBNode-06 CM_PCP6|
    DBNode-07 CM_PCP7|
    DBNode-08 CM_PCP8| CM_PCP3|CM_PCP7|
    DBNode-09 CM_PCP9|
    DBNode-10 CM_PCP10|
    DBNode-11
    DBNode-12 CM_PCP1| CM_PCP1|
    DBNode-13
    DBNode-14 CM_PCP2|
    DBNode-15
    DBNode-16

    MT servers Middle Tier Services
    ========== ====================
    APPNode01 CM_PCP1|WEB_FORMS|
    APPNode02 CM_PCP2|WEB_FORMS|
    APPNode03 CM_PCP3|WEB_FORMS|
    APPNode04 CM_PCP4|WEB_FORMS|
    APPNode05 CM_PCP5|
    APPNode06 CM_PCP6|
    APPNode07 CM_PCP7|
    APPNode08 CM_PCP8|

    Thanks & Regards,
    Ajith Narayanan

    • Hello Ajith,

      Thanks for the input. I would be interested in working with Oracle and possibly getting them to implement the feature, which would simplify the setup of big EBS environments.
      I wonder if you could find who from Dell submitted the original enhancement request?
      We may work together and coordinate.

      Yury

    • BTW: I implemented a custom virtual Apps PCP node setup for one of my clients some time ago, where we had 3 Apps nodes and 6 DB RAC nodes. We created a setup where each Apps node ran 6 virtual PCP nodes (one per RAC node). This way we were able to run concurrent managers on any of the 3 Apps nodes and connect them to any RAC node.

      The virtual PCP was nothing but a set of scripts that lied to Oracle about the hostname it was running on :)

      • Dmitry Stepanov says:

        Could you provide more details about virtual PCP nodes ? How did you implement it?
        I suppose you used the LD_PRELOAD environment variable to point to your custom hostname-resolving library? Because conc managers depend on hostname/uname.

        • For that client I ended up with shell script hacks (11i). Later on, for another client, I implemented the LD_PRELOAD fake host procedure (1 min DR solution).

          I should blog about it.

          Y.

  • Baskar says:

    Yury,
    Very good article. Is your idea based on defining services, making one node preferred and the other nodes available, and then using those services for the managers?
    Baskar.l

    • Hello Baskar,

      If I understood your question correctly, then we do not need the same setup as we have for PCP Concurrent Manager fail-over. In PCP we have a Primary and a Secondary node, and a CM fails over to the Secondary node in case of an issue with the Primary node. Fail-over for DB services is defined on the DB side using srvctl. We don’t need the Primary/Secondary setup for a service on the Apps side. The only thing I would need is the ability to specify the TNS alias to be used per Concurrent Manager. The rest is configured on the TNS and DB side.

      Yury

  • Yury, it was way back in 2005, but let me try to find the original person who proposed the idea. I will get back to you.

    Thanks & Regards,
    Ajith Narayanan

  • Mark Burgess says:

    Really good post Yury. I would like to see Oracle using a SERVICE_NAME and database service based configuration as default not only for the concurrent manager connections but across the Forms and OACORE connections as well. Whilst this is achievable with the current AutoConfig implementation it does somewhat require a level of customisation that some people will not be comfortable with.

    Even on smaller sites the use of database services to place resource limitations on concurrent active sessions can be particularly useful.

    Hopefully this is something that we can look forward to in 12.2.

    • Hello Mark,

      Thanks for the comment. I wonder if we can leverage “Database Instance” (INSTANCE_PATH) profile value to specify a service for Forms users?
      For OACORE it could get complicated, and for most of the EBS environments I have worked with it wasn’t necessary, as the load from the Self-Service interface wasn’t too high compared with Concurrent Processing and other components.

      Yury

  • Gunes says:

    Hi Yury,
    We have a 2-node RAC system where each node has 200 GB of RAM and 60 CPUs. We have a 2-node application tier, with PCP also involved. We used to run our CMs without directing jobs to a specific DB node. But if you have a huge data volume like ours (DB size is almost 45 TB), this brings some performance issues such as RAC waits. To prevent RAC-related waits, we changed our CMs to run specific jobs on a specific DB node. This brings some advantages and also disadvantages. An advantage is that your system becomes HA: even if one node is down, some of your CMs can run on the other node. A disadvantage is that one node may be running at 100% while the other node is sleeping…
    This is just my experience with EBS on RAC systems, which I wanted to share with you.

    Regards
    Gunes

    • Thanks Gunes for your comment. A 2-node RAC is a bit of a special case, isn’t it? If one node fails, we don’t have any other choice but to run the whole load on the surviving one. However, I can see how a system like yours can benefit from the introduction of services. One of the advantages you may benefit from is monitoring and management of the resources consumed by different CM groups. Another advantage of using Oracle Net services: you can decide which nodes should run a concurrent manager and change the setup on the fly. In your case, it would be one of the nodes or both.

      Yury

  • Paul Ferguson says:

    Hi Yury,
    My name is Paul Ferguson and I am the CP development manager here in ATG. I really liked your post and you make some very good arguments.
    This is actually an ER we have been planning to do for some time, it has just gotten pushed aside for other things. It seems though that this is something we need to prioritize higher, and while I can’t tell you anything specific I can tell you that we will definitely be implementing this in the near future.

    • WOW! Thanks Paul for the great and to some extent unexpected comment. It made my day and gave me more energy to work on my “Oracle Cluster 11GR2 with Oracle Applications” paper for Collaborate 2013 this weekend. If you or your team members are going to be part of the conference, I would love to meet.

      Let me know if I can help you or your team to plan/design the feature. I have got in touch with Dell’s Apps team, who support the system the original ER comes from, and I am sure they would be glad to contribute too.

      Thanks once again for your attention and have a great weekend
      Yury

  • [...] How EBS Concurrent Processing should run on Oracle RAC? Pythian’s own Ace Director Yury Velikanov replies. [...]

  • Lou says:

    Yuri,

    Nice article. I am new to ESB/SOA… do you have any proc (I am asking for a procedure since on my site we use autosys to call anything) that will clean/purge the ESB logs?

    thanks…

    L

  • Nee says:

    Hi Yury,

    Very interesting article, I would be pleased to see you blog more about each option you listed.

    I have 4 RAC nodes and 4 Application nodes and see problems with the interconnect and gc type waits, as all concurrent processes have the s_cp_twotask context variable set to _BALANCE. Also, I have the Concurrent:PCP Instance Check profile value set to Off. I am still unclear how this works in practice and whether this configuration is optimal.

    For example, doco says Set profile option 'Concurrent: PCP Instance Check' to OFF if database instance-sensitive failover is not required. By setting it to 'ON', a concurrent manager will fail over to a secondary Application tier node if the database instance to which it is connected becomes unavailable for some reason.

    But in my case, with it set to off, what happens if DB instance becomes unavailable, well the conc mgr just uses load balancing anyway and will reach another available instance, so what?

    Also, with only one Standard Manager, all the STD Manager processes are only running on the one application server which is a waste of other server resources?

    Thanks again.

    • Hello Nee,

      Using the default _BALANCE configuration simplifies things a lot. You don’t need to think about all sorts of things and, as you mentioned, the fail-over is easy. The concurrent managers should migrate to the surviving database nodes.

      >> Also, with only one Standard Manager, all the STD Manager processes are only running on the one application server which is a waste of other server resources?
      I wouldn’t worry too much about the fact that CMs are running on one Apps node only, unless you see that this generates a significant load. From my experience, the bottleneck is typically on the DB side rather than on the Apps hosts running CM processes.

      However, if you see that a significant portion of database processing is spent waiting on interconnect traffic, you may want to implement a Parallel Concurrent Processing setup where each Apps node is connected to its own RAC node. Then you would divide the concurrent programs so that each group is executed by a specific concurrent manager. Each manager (queue) would run on a specific primary Apps node and its associated RAC node. This way you partition the load based on the data the concurrent programs work with and therefore reduce interconnect traffic.

      On top of better performance, PCP provides an additional HA option: you can specify a secondary node to fail over to if the associated Apps or DB node becomes unavailable.

      As for ‘Concurrent: PCP Instance Check’: in a PCP configuration with each Apps node connected to a separate RAC instance, I would switch the option ON, as it would ensure automatic availability for your CMs.

      Having said all of the above, I still think that Oracle should put in a bit of effort and introduce an option to point individual CMs to DB services.

      Thanks for your comments,
      Yury

  • Dmitry Stepanov says:

    Moreover, there is a lot of old program code in OEBS that cannot use load balancing at all. Check 279156.1

  • Lou says:

    Let me put my question here again for everyone. I am curious what everyone is using to clean up the ESB logs. I know they provide a proc to do the clean-up but, as you know, that job only does a delete and does not reclaim the space.

    Any suggestions….

    • Hello Lou,

      I am afraid you need to be a bit more specific about which log files you are talking about. There are many different types of log files in EBS, from Apache logs to Concurrent Request log files. It would be good if you mentioned the EBS version you are having the problem with.

      Yury

      • Lou says:

        Yuri,

        I am talking about the provided PL/SQL script that iterates through tables to delete instances matching specified conditions. I am using SOA Suite 11gR1 PS3.

        procedure delete_instances (
          min_creation_date           in timestamp,
          max_creation_date           in timestamp,
          batch_size                  in integer,
          max_runtime                 in integer,
          retention_period            in timestamp,
          purge_partitioned_component in boolean
        );

  • Aashish says:

    I like this architecture, but have the following question about it.

    1. What configuration changes are required to point a CM to a particular node using database services? s_cp_twotask will have to point to an alias, and the TNS entries on the EBS app server will look like (SERVICE_NAME=service_name)(INSTANCE_NAME=instance1), so I am still restricting application services to instance1. How will the CM fail over to instance 2 when there is no entry for the CM to fail over to instance 2? Can someone give me an example or point me to any note on how the context file entries should look for IAS TWO TASK, CP TWO TASK, and Tools TWO TASK?

    I want to achieve:
    1. Create specific services for CMs, module by module. How do I define s_cp_twotask to point to such a service on a particular node? TNS entries can be defined without instance_name (just the service name) so that the connection can fail over when the service fails over.

    Can someone refer me to any note/document/article that covers using database services with EBS?

  • Raj says:

    Hi Yury,

    Is it possible to define a concurrent manager in EBS 11i to run against only one specific node of an 11g RAC database? If yes, how? We don’t want to set a different two_task for the entire application server. Can we set a different two_task for a specific concurrent manager?

    • John Piwowar says:

      Hi Raj,

      I’m not Yury, but I can answer your question. Unfortunately, the answer is no. At this time, your best option is to define s_cp_twotask on a per-node basis.

      What you’re asking about is actually part of the enhancement that Yury proposes, but it doesn’t exist yet:
      “As soon as there is the option (as a result of enhancement request 4159920 implementation) to assign a tns alias to a concurrent manager, it will address the main limitations of the current configuration options listed in the first part of this blog. “

  • Naresh says:

    Oracle released the patch for this bug 4159920

    Patch 18803853: 1OFF:4159920:APPSRAP:PCP/CONCURRENT MANAGER – RAC NODE AFFINITY & LISTENER LOAD
