Upgraded to 11GR2? Congrats! You are in Direct Reads Trouble!
Nov 14, 2012 / By Yury Velikanov
I made this blog title intentionally provocative. However, it isn’t far from the truth. Don’t believe me? Continue reading :)
I came up with the graph below to demonstrate the problem. Yesterday I was called in on an emergency to help a desperate client get their Oracle e-Business Suite system’s performance under control. After several hours of research we found the root cause of the problem. The graph shows the direct reads percentage vs. all physical reads. You might say that looking at aggregated data and percentages is bad style and a waste of time. Well… I will leave that for you to decide. I think the graph demonstrates the problem very well, and that is all that counts.
You do not need to be a scientist to tell when the system was upgraded to the 11gR2 version. It is kind of obvious looking at the graph, right? Well, it took us a bit of time to find where the problem was. To save you time, I put together a SQL script that you can use to build a graph just like this one.
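The core of the idea is a sketch like the following against the AWR history views: take per-snapshot deltas of the cumulative "physical reads" and "physical reads direct" counters and express direct reads as a percentage. Treat this as a sketch, not the exact script from the post — adjust for RAC and your AWR retention.

```sql
-- Sketch: direct reads as % of all physical reads, per AWR snapshot.
-- Counters in DBA_HIST_SYSSTAT are cumulative, so we diff them with LAG.
WITH s AS (
  SELECT sn.snap_id,
         sn.end_interval_time,
         st.stat_name,
         st.value - LAG(st.value) OVER
           (PARTITION BY st.stat_name, st.instance_number
            ORDER BY sn.snap_id) AS delta
  FROM   dba_hist_sysstat st
         JOIN dba_hist_snapshot sn
           ON  sn.snap_id         = st.snap_id
           AND sn.instance_number = st.instance_number
  WHERE  st.stat_name IN ('physical reads', 'physical reads direct')
)
SELECT end_interval_time,
       ROUND(100 * SUM(CASE WHEN stat_name = 'physical reads direct'
                            THEN delta END)
                 / NULLIF(SUM(CASE WHEN stat_name = 'physical reads'
                                   THEN delta END), 0), 2) AS direct_reads_pct
FROM   s
GROUP BY end_interval_time
ORDER BY end_interval_time;
```

Feed the output into your favorite charting tool and you get a graph like the one above.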
Did you migrate your Oracle database to 11gR2 or a higher version recently? “Are you ready to see how deep the rabbit hole goes?” (c) The Matrix. Here you go :)
If you see a significant change in the pattern after the upgrade to 11g, you may be interested in having a look at the next set of statistics.
The blue line represents the Physical Reads statistic. The red line represents Direct Reads. Did you notice that the physical read volume increased from 100MB/s to 400MB/s? Well, my client didn’t just notice it. An expensive storage array’s resources got saturated, and the whole user community was paralyzed for several days. The client spent a good portion of the first day working with the storage vendor, as all other Oracle IOs (e.g. log file sync, db file sequential read, db file scattered read, etc.) became very slow. Then many hours of work with Oracle Support followed. After all that, the client called us, as they had run out of ideas. If we had had the SQL handy, we would have avoided a day of troubleshooting and focused our efforts on how to fix the issue. Keep the script handy. Who knows, it may save you some troubleshooting effort.
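If you want to chart the read volume itself rather than the percentage, a similar sketch works off the "physical read total bytes" statistic (again a sketch, with a single instance assumed — drop the instance filter and aggregate for RAC):

```sql
-- Sketch: physical read throughput in MB/s per AWR snapshot interval.
WITH r AS (
  SELECT sn.snap_id,
         sn.end_interval_time,
         -- interval length in days (DATE arithmetic), converted below
         CAST(sn.end_interval_time AS DATE)
           - CAST(sn.begin_interval_time AS DATE)             AS interval_days,
         st.value - LAG(st.value) OVER (ORDER BY sn.snap_id)  AS bytes_delta
  FROM   dba_hist_sysstat st
         JOIN dba_hist_snapshot sn
           ON  sn.snap_id         = st.snap_id
           AND sn.instance_number = st.instance_number
  WHERE  st.stat_name       = 'physical read total bytes'
    AND  st.instance_number = 1    -- single-instance assumption
)
SELECT end_interval_time,
       ROUND(bytes_delta / 1024 / 1024
             / NULLIF(interval_days * 86400, 0), 1) AS phys_read_mb_per_sec
FROM   r
WHERE  bytes_delta IS NOT NULL
ORDER BY end_interval_time;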
Another recently migrated client
Following up on the emergency call’s results, and out of curiosity, I decided to check another client’s environment that was recently migrated to the 11gR2 version. Have a look. I don’t think I need to make too many additional comments.
And the Direct Reads contribution to total Physical Reads graph for the same database looks like the following:
We are still discussing how to resolve the issue for good. The following two actions have been implemented as a quick fix to give the system’s user community a bit of relief.
- Disable the new Direct Reads functionality introduced in the 11g version. Google for the event number and you will find several related blog posts, including Tanel Poder’s post on the subject.
SQL> alter system set events '10949 trace name context forever';
- [see my comment below]
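Note that the ALTER SYSTEM ... events command above only lasts until the next instance restart. To make the setting persistent, a sketch assuming you run on an spfile:

```sql
-- Persists across restarts; takes effect at the next instance startup.
ALTER SYSTEM SET EVENT = '10949 trace name context forever' SCOPE = SPFILE;
```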
- Switch off Automatic Memory Management. IMHO, it should be switched off for any serious Oracle database. It introduces more problems than it adds value.
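For reference, switching AMM off typically looks like the following sketch. The SGA/PGA sizes below are placeholders, not recommendations — size them for your own system, and note that an instance restart is required:

```sql
-- Sketch, assumes an spfile. Disable AMM and fall back to ASMM + PGA target.
ALTER SYSTEM SET memory_target        = 0  SCOPE = SPFILE;
ALTER SYSTEM SET memory_max_target    = 0  SCOPE = SPFILE;
ALTER SYSTEM SET sga_target           = 8G SCOPE = SPFILE;  -- placeholder
ALTER SYSTEM SET pga_aggregate_target = 2G SCOPE = SPFILE;  -- placeholder
-- restart the instance for the changes to take effect
```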
See also the MOS note: High 'direct path read' waits in 11g [ID 793845.1]
- Be aware that the script is AWR based. Therefore the database must be licensed for the Diagnostics Pack in order to use it.
- If you don’t have the Diagnostics Pack license, just take the idea and convert the script to STATSPACK (and don’t forget to share it with me and others :)
- Please do not hesitate to drop me a message if you spot any mistakes (I know I make them all the time) or have related comments.
Let me know if the script I have introduced to you in this blog post helped you to spot something interesting in your system.
Added by Yury on 2012.11.24, as a result of an internal discussion with Pythian folks (special thanks to Marc Billette).
Additional Comments related to this blog post
- The % of DPR doesn’t mean anything on its own. A high number could be a good or a bad thing for different systems. The answer, as always… it depends :)
- However, in my client’s case the DPR increase was accompanied by a SIGNIFICANT physical IO increase (see the second graph in this blog post).
- As the change in behavior happened right after the 11g upgrade, it was enough to point us to the temporary fix.
- A single serial DPR is more efficient (cheaper from a resources point of view) than a traditional Oracle physical read (db file sequential read or db file scattered read).
- If it is a given that a process MUST read X physical blocks and none of the table’s blocks are cached at the moment, then DPR will be faster and use fewer resources on the server.
- However, DPR doesn’t cache anything. Therefore none of the other processes can share the work that the process did.
- Exadata is designed to do a lot of DPR in order to leverage the offloading of data processing to the cell hosts.
- In fact, I think the new serial direct reads decision mechanism was implemented because of and for Exadata (just guessing here).
- High DPR is just an indication that an environment doesn’t use the buffer cache. That’s all. Is it good or bad? As always… it depends :)
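If you want to see where your own system stands right now without touching AWR at all, the since-startup ratio is a one-liner against V$SYSSTAT:

```sql
-- Share of direct path reads in all physical reads since instance startup
SELECT ROUND(100 * d.value / NULLIF(t.value, 0), 2) AS direct_reads_pct
FROM   v$sysstat d
       JOIN v$sysstat t ON t.name = 'physical reads'
WHERE  d.name = 'physical reads direct';
```

Remember this is cumulative since startup, so it smooths over recent spikes; the AWR-based version above is what shows the change over time.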