THE WORLD DISCUSSES #PYTHIAN ON TWITTER. HAVE A QUESTION? USE OUR HASHTAG AND ASK AWAY.

Performance tuning: HugePages in Linux

Recently we quickly and efficiently resolved a major performance issue with one of our New York clients. In this blog, I will discuss about this performance issue and its solution.

Problem statement

The client’s central database was intermittently freezing because of high CPU usage, and their business severely affected. They had already worked with vendor support and the problem was still unresolved.

Symptoms

Intermittent High Kernel mode CPU usage was the symptom. The server hardware was 4 dual-core CPUs, hyperthreading enabled, with 20GB of RAM, running a Red Hat Linux OS with a 2.6 kernel.

During this database freeze, all CPUs were using kernel mode and the database was almost unusable. Even log-ins and simple SQL such as SELECT * from DUAL; took a few seconds to complete. A review of the AWR report did not help much, as expected, since the problem was outside the database.

Analyzing the situation, collecting system activity reporter (sar) data, we could see that at 08:32 and then at 8:40, CPU usage in kernel mode was almost at 70%. It is also interesting to note that, SADC (sar data collection) also suffered from this CPU spike, since SAR collection at 8:30 completed two minutes later at 8:32, as shown below.

A similar issue repeated at 10:50AM: Read the rest of this entry . . .

Riyaj’s Presentations at SIOUG and DOUG

I recently traveled to Europe to present at a few conferences. The Slovenia Oracle User Group (SIOUG) conducted an Oracle conference in Portoroz, a port city in Adriatic Sea.

Thanks to my friend Joze Senegacnik, I had wonderful time in Slovenia. We also visited the city of Venice while we were there. Of course, we all knew that Venice is the city of bridges—over a hundred small islands connected by bridges. But visiting Venice in person had a dramatic effect on us. We were mesmerized by the beauty, culture, and architecture, in particular the Basilica of St.Mark. Venice is indeed the Queen of the Adriatic.

Back to reality. I presented a few papers for the Dallas Oracle Users Group (DOUG) for their October tech meeting too. All these papers can be accessed following these links: Read the rest of this entry . . .

Oracle Performance Issue: High Kernel-Mode CPU Usage

Recently, I resolved a performance issue with one of our esteemed clients. I found the problem interesting and worth blogging about.

The problem

An application makes 300 static connections to database DB1 in the database server, say CLNTDB server. The application relies on database link and over a period of time, each session executes SQL through a database link, creating a connection in central database PROD1. So, there are 300 connections in the PROD1 database coming from the DB1 database through database links. Performance is fine during normal operation.

The problem starts when the application is shutdown. Shutting down the application in DB1 creates massive CPU consumption on the PROD1DB server. Unfortunately, this spike in CPU usage lasts for five to ten seconds and causes the ASM heartbeat to fail. Considering that PROD1 is a central database, this has a global effect on applications’ functionality. See the presentation below for graphical representation of this problem.

The symptoms

Looking at the symptoms in detail, we can see the CPU usage in sys mode. It is not uncommon to see high CPU usage during a storm of disconnects. But, this specific CPU usage is much higher and uses all CPUs in kernel-mode. If this is a problem due to process destruction, then this will show up on the CLNTDB server too, as it too faces 300 disconnects. But, this problem manifests only on the PROD1DB server.

mpstat output from the PROD1DB server shows the CPU usage in %sys mode. Notice the numbers below, under sys column—almost all CPUs are used in sys mode. We need to drill down further to understand why this many CPUs are used in sys mode. It can also be seen that no other columns have any abnormally higher values:

Read the rest of this entry . . .

How to Find Objects Creating nologging Changes

In an Oracle-l thread, this question was raised: how can you find objects creating nologging changes?

First, what is a ‘nologging’ change?

The redo logging mechanism plays critical role in media recovery. Media recovery relies on archivelog files generated to roll the database forward. A standby database or dataguard recovery also relies on archivelog files. It is possible to do DML changes with minimal logging, i.e., nologging changes, also known as direct mode inserts. For example, insert /*+ append */ can be used to populate the rows into a table without generating much redo. This can invalidate the standby database and might trigger rebuilding some or all parts of the standby database.

nologging changes generates minimal redo, since the blocks are pre-formatted and written to disk directly. A redo record is generated, invalidating a range of affected blocks. This invalidation redo record size is far smaller, for e.g. hundreds of blocks can be invalidated using just a single redo record. Of course, recovery is severely affected as the changes performed with nologging operations cannot be reapplied/recovered.

Internals of nologging changes

Since nologging is all about redo records, dumping the redo log file or archivelog file is a concrete way to see what happens under the hood. Let’s consider an example to explain the internals of nologging changes. We will create a table, insert rows, and closely review the redo records.

Read the rest of this entry . . .

Oracle Import Performance: Does Import Use Single-Row Inserts for Date Columns?

Recently, I was involved in a discussion about import performance in this OTN forum in which the original poster raised the issue of whether or not import will resort to single-row inserts for tables with date columns.

For array inserts, the buffer parameter essentially specifies the size of this array. We know, however, that if a table has lob columns, the import parameter buffer is not honored, and the import utility will use single-row inserts for those tables. But tables with date columns, so the claim goes, also must suffer single-row inserts. In this blog, I will probe this further and validate that claim.

Let’s create a table and populate it with 300K rows.

 create table t1 (n1 number, v1 varchar2 (512), d1 date);

 insert into t1
 select n1, lpad(n1, 500, 'x'), sysdate
 from (select level n1 from dual connect by level <=300003);
 commit;

 REM Creating an  export file..
 host exp userid=cbqt/cbqt file=exp_t1.dmp log=exp_t1.log tables=t1

The above code fragment created a table, inserted 300,000 rows, and exported that table to an export dump file. This dump file is ready to be imported. But, we need to trace the import to measure the effect of the buffer parameter. The problem, though, is how to trace the import session alone without generating every session in the database. This can be achieved by creating a logon trigger as below. Only sessions from a test user will have trace enabled from this trigger (the username is CBQT).

REM I could use "on schema clause too, but this is part of generic code that I use.
REM Riyaj Shamsudeen - To trace a session through logon trigger
create or replace trigger
set_system_event
after logon  on database
declare
v_user dba_users.username%TYPE:=user;
sql_stmt1 varchar2(256) :='alter session set events '||chr(39)||'10046 trace name context forever, level 12'||chr(39);
begin
  if (v_user = 'CBQT') THEN
      execute immediate sql_stmt1;
  end if;
end;
/

Let’s drop the table, then import with a default buffer size of 64K. Through the logon trigger, a new SQL trace file will be generated. That trace file will be analyzed with the tkprof utility as shown in the code fragment below:

Read the rest of this entry . . .

Tuning Latch Contention: Cache-buffers-chain latches

Recently, I had an opportunity to tune latch contention for cache buffers chain (CBC) latches. The problem was high CPU-usage combined with poor application performance. A quick review of the statspack report for 15 minutes showed a latch-free wait as the top event, consuming approximately 3600 seconds in an 8-CPU server. CPU usage was quite high, which is a typical symptom of latch contention, due to the spinning involved. v$session_wait showed that hundreds of sessions were waiting for latch free event.

SQL> @waits10g

   SID PID     EVENT         P1_P2_P3_TEXT
------ ------- ------------  --------------------------------------
   294  17189  latch free    address 15873156640-number 127-tries 0
   628  17187  latch free    address 15873156640-number 127-tries 0
....
   343  17191  latch free    address 15873156640-number 127-tries 0
   599  17199  latch: cache  address 17748373096-number 122-tries 0
               buffers chains
   337  17214  latch: cache  address 17748373096-number 122-tries 0
               buffers chains
.....
   695  17228  latch: cache  address 17748373096-number 122-tries 0
               buffers chains
....
   276  15153  latch: cache  address 19878655176-number 122-tries 1
               buffers chains

I will use a two-pronged approach to find the root cause scientifically. First, I’ll find the SQL suffering from latch contention and objects associated with the access plan for that SQL. Next,I will find the buffers involved in latch contention, and map that back to objects. Finally, I will match these two techniques to pinpoint the root cause.

Before I go any further, let’s do a quick summary of the internals of latch operations.

Read the rest of this entry . . .

Tuning ‘log file sync’ Event Waits

In this blog entry, I will discuss strategies and techniques to resolve ‘log file sync’ waits. This entry is intended to show an approach based upon scientific principles, not necessarily a step-by-step guide. Let’s understand how LGWR is inherent in implementing the commit mechanism first.

Commit Mechanism and LGWR Internals

At commit time, a process creates a redo record (containing commit opcodes) and copies that redo record into the log buffer. Then, that process signals LGWR to write the contents of log buffer. LGWR writes from the log buffer to the log file and signals user process back completing a commit. A commit is considered successful after the LGWR write is successful.

Of course, there are minor deviations from this general concept, such as latching, commits from a plsql block, or IMU-based commit generation, etc. But the general philosophy remains the same.

Signals, Semaphores and LGWR

The following section introduces the internal workings of commit and LGWR interation in Unix platforms. There are minor implementation differences between a few Unix flavours or platforms like NT/XP, for example the use of post-wait drivers instead of semaphores etc. This section will introduce, but not necessarily dive deep into, internals. I used truss to trace LGWR and user process. The command is:

truss -rall -wall -fall -vall -d -o /tmp/truss.log -p 22459

(A word of caution: don’t truss LGWR or any background process unless it is absolutely necessary. Doing so can cause performance issues, or worse, shutdown the database.)

Read the rest of this entry . . .

Resolving High Water Enqueue Contention

Recently, I had a few email exchanges — on the Oracle-l list and offline — about HW enqueue contention. A few interesting observations emerged from test cases I created during that discussion.

HW Enqueue

When a session needs access to a resource, it requests a lock on that resource in a specific mode. Internally, lock and resource structures are used to control access to a resource. Enqueues, as the name suggests, have a First In/First Out queuing mechanism. You can find more information about the internals of locks in my paper, printed in 2001, Internals of locks.

Segments have a High Water Mark (HWM) indicating that blocks below that HWM have been formatted. New tables or truncated tables (that is, truncated without a reuse storage clause), have the HWM value set to the segment header block — meaning, there are zero blocks below the HWM. As new rows are inserted or existing rows updated (increasing row length), more blocks are added to the free lists, and the HWM is bumped up to reflect these new blocks. HW enqueues are acquired in Exclusive mode before updating the HWM, and essentially, HW enqueues operate as a serializing mechanism for HWM updates.

In non-ASSM tablespaces, the HWM is bumped up by five blocks at a time. (Actually, the undocumented parameter _bump_highwater_mark_count controls this behavior, and it defaults to five.) Heavy inserts into a table can result in increased HWM activity leading to HW enqueue contention. This issue is prevalent if the table has LOB columns or if the row length is large.

Measuring HW Enqueue Contention

We will use a few test cases to see how the underlying extent size and table structures are affecting HW enqueue contention. But first we need to find a way to measure the total number of gets on HW enqueue. If the total number of gets on the HW enqueue is reduced, enqueue contention can be relieved.

The fixed table x$ksqst stores statistics about the total number of enqueue gets, and the successes and failures of those gets at the instance level. For example, to see total number of gets on HW enqueue, the following query can be used. (The ksqstreq column indicates total number of gets; ksqstwat shows the total number of waits.) Read the rest of this entry . . .

Identifying SQL Execution Bottlenecks Scientifically

A few days ago, a developer and I had an interesting conversation. The developer was trying to tune an expensive SQL statement, using following trial-and-error method:

loop until acceptable performance
    explain plan -> execute SQL with sql trace -> tkprof -> rewrite
end loop;

After looking at his method in amusement, I showed him how to identify and tune SQL statements scientifically, and decided to blog about it.

Let’s look at a simple case and then proceed to slightly more complex versions. The following code fragment creates test tables, indices, and collects statistics on those tables.

 create table t1_vc as
 select trunc(n/10000) n1, mod(n, 1000) n2 ,
          lpad( n,255) c_filler
 from (select level n from dual connect by level <= 100001);
 create index t1_vc_i1 on t1_vc (n1);
 create table t2_vc as
 select trunc(n/ 100) n1, mod(n, 10000) n2 ,
          lpad( n,255) c_filler
 from (select level n from dual connect by level   null, cascade => true);
 exec dbms_stats.gather_table_stats(user, 't2_vc',estimate_percent => null, cascade => true);
 null, cascade => true);
  exec dbms_stats.gather_table_stats(user, 't2_vc',estimate_percent => null, cascade => true);

Simple SQL, but I had to use hints to illustrate the point I’m driving at. Let’s do an explain plan on this SQL.

explain plan for
select /*+ use_nl (t1_vc, t2_vc ) */
t1_vc.n1 , t2_vc.n2
from  t1_vc, t2_vc where
t1_vc.n1 = t2_vc.n1 and t1_vc.n2 between 101 and 105 and t1_vc.n1=1
/
select * from table(dbms_xplan.display)
/

Plan hash value: 3808913109

------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |          |  5453 | 81795 |   643   (0)| 00:00:08 |
|   1 |  NESTED LOOPS                 |          |       |       |            |          |
|   2 |   NESTED LOOPS                |          |  5453 | 81795 |   643   (0)| 00:00:08 |
|*  3 |    TABLE ACCESS BY INDEX ROWID| T1_VC    |    55 |   385 |   368   (0)| 00:00:05 |
|*  4 |     INDEX RANGE SCAN          | T1_VC_I1 |  9091 |       |    18   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN           | T2_VC_I1 |   100 |       |     1   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID | T2_VC    |   100 |   800 |     5   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("T1_VC"."N2"=101)
   4 - access("T1_VC"."N1"=1)
   5 - access("T2_VC"."N1"=1)

20 rows selected.

The execution plan looks okay, but this statement is executed millions of times, so we need to reduce time as much as possible. Can this SQL be tuned further?

Read the rest of this entry . . .

Multi-Column Correlation and Extended Stats in Oracle 11g

We all have encountered this situation many times before: the cost-based optimizer assumes no correlation between two columns (until 11g), and this has the effect of erroneously reducing the cardinality of a row source. Incorrect cardinality estimates are one of many root causes for SQL performance issues. Consider the following example.

This code creates a table and populates data with four columns:

create table t_vc as
select mod(n, 100) n1, mod(n, 100) n2 ,
mod(n, 50) n3 , mod(n, 20) n4
from (select level n from dual connect by level <= 10001);

The first two columns, n1 and n2, have a strong correlation: n1 is always equals to n2.

Let’s collect statistics with histograms on all columns.

begin
dbms_stats.gather_Table_stats( user, 'T_VC', estimate_percent => null, method_opt => 'for all columns size 254');
end;
/

Let’s explain the plan for the query, specifying one predicate, exactly 100 rows with a value of 10.

Read the rest of this entry . . .

Start NowWith Pythian - database design, management and emergency handling capabilities...

Live Updates

pythian: RT @FN_Press2: Schooner Information Technology Teams with Pythian to Deliver Advanced Support and High... http://finanznachrichten.de/20
more



Testimonials

  • Serge Racine

    DBA, Brookfield Energy

    We are very satisfied by the service given to us by Andre and Shakir in support of our recent data quality and reorganization initiative.... more



Social links powered by Ecreative Internet Marketing