1.617.682.4508

Pythian Blog

The world discusses #Pythian on Twitter. Have a question? Use our hashtag and ask away.

Emergency

24x7 Support

Not a Pythian client but need help now? No problem. Click here.

Adding Columns with Default Values and Not Null in Oracle 11g

By: Christo Kutrovsky

Oracle 11g has a new performance enhancement when adding columns. In the pre-11g releases, adding a new not null column with a default value would have caused a massive update on the entire table, locking it for the operation and generating tons of undo and redo. I’ve seen this happening in production.

Oracle 11g has improved this behaviour by storing the default value in the metadata and making the column addition instantaneous. An example of this feature at work can be seen in 11g Enhancement for ALTER TABLE .. ADD COLUMN Functionality and some bugs regarding sysdate, as pointed out in the comments.

Although this is a welcomed enhancement, there are some unexpected aspects beyond the basic operations.

First, we know default values for new columns are stored in the metadata, but what happens when you change the default?

Read the rest of this entry »

RMOUG: Day 2

By: Christo Kutrovsky

Day 2 finished yesterday. It was quite a busy day, with some excellent sessions.

Battle of the Nodes: RAC Performance Myths — Riyaj Shamsudeen
A great presentation on popular RAC myths, with some great examples. Excellent visuals that made complex processes look simple. I really liked this one.

Getting the Most Out of AWR — Tim Gorman
A first-rate session attended by a lot of the conference. It went into detail on what scripts are available to extract AWR information without needing Grid Control or Database Control. For command-line lovers, it’s great.

The SAN is guilty… until proven otherwise — Gaja Krishna Vaidyanatha
A very important session for all DBAs, showing the end-to-end components involved in database I/O. There are so many more components that can cause problems between the database and the physical spindles. Concepts, case studies, plenty of information.

Understanding Oracle Execution Plans: How SQL is Really Executed - Tanel Poder
One of those eye-opening sessions, starting with how to read SQL Execution Plans, and moving to showing stack traces and mapping function calls, to Execution Plan steps. A must-see for everyone tuning SQL.

And that is it. As exhausting conferences are, I always wish for them to have been longer.

RMOUG: Day 1

By: Christo Kutrovsky

Day One at RMOUG in Denver is now over.

There were quite a few interesting presentations. Unfortunately, the very first I went to was canceled due to car trouble. I also found that several sessions of similar interest to me overlapped, so I had to choose my spots.

Advanced Oracle Troubleshooting
This presentation was particularly good. Tanel goes into detail on how to quickly asses a situation without going through a number of “health checks” and still be nowhere near solving the problem. His approach is to look directly at what a “hanging” session is waiting on, and to systematically determine the cause of the problem, with no time wasted.

Putting your database on a Diet: Oracle’s Data compression
A short overview of table compression. I found that that even though the presenter obviously had some experience with compression, there were hardly any examples nor anything mentioned about how to determine proper re-ordering to improve compression.

All About Oracle’s In-Memory Undo
An unusual topic—something that works so well that no one really talks about it. The presentation, however, was very short, and provided little new information. There was only one demonstrated test case. Although it went into detail about the difference between in-memory and standard undo, the other-than-obvious effects were omitted.

During lunch I took a picture that shows the entire RMOUG crowd:
RMOUG Day 1  Lunch

Tomorrow is Day 2, and I will be posting about it here.

Christo Kutrovsky Presenting at RMOUG

By: Christo Kutrovsky

I am back on the road, going to RMOUG Training Days to present The Answer to Free Memory, Swap, Oracle, and Everything.

I am quite excited, as the RMOUG schedule (PDF) looks quite promising, especially these presentations:

  • Further RMAN Optimizations in 11g — Stephan Haisley
  • Advanced Oracle Troubleshooting: No Magic is Needed — Tanel Poder
  • Understanding Oracle Execution Plans: How SQL is Really Executed — Tanel Poder
  • The SAN is Guilty until proven otherwise — Gaja Krishna Vaidyanatha

Some of these overlap, so I guess I will have to make a difficult choice.

I hope to see you all in Denver.

Interview: Kevin Closson on the Oracle Exadata Storage Server

By: Christo Kutrovsky

Last Friday (September 26), Paul Vallée and I were lucky enough to interview Kevin Closson about the Oracle Exadata Storage Server. A tidied-up stream of the audio is here: closson-interview.m3u.

The audio quality is a little spotty here and there, so you might like to follow the transcription below.

Paul gets the interview started.

Paul Vallée (PV): Christo Kutrovsky and myself, Paul Vallée. We’re on the line with Kevin Closson of Oracle (and prior to that with Hewlett-Packard, and prior to that with Polyserve, and prior to that with Sequent). A giant of our industry, and I’m honoured to be speaking to him. Kevin, hello.

Kevin Closson (KC): Well, they always say that flattery gets you nowhere, but apparently it’ll get you on the phone.

PV: [laughs] Very nice!

KC: No seriously, it’s more than a pleasure to be here. I like what you guys do, so this is good.

PV: Thank you, Kevin. So, we are here to talk about the work that Larry Ellison announced yesterday, specifically the work around the Oracle Database Machine and the Exadata Storage Server. Kevin, can you just quickly introduce yourself and how you came to be involved in the project?

KC: Right. So, I’m a performance architect with Oracle, and the project that I’m stationed on, if you will, is the development team for Oracle Exadata Storage Server. And the way I came to Oracle is, quite a few of the folks who are involved with the very genesis of Exadata are people that I’ve known and worked with closely dating back to the early ’90s. And after a fruitful endeavour as the chief software architect for Oracle solutions at Polyserve, it became an opportunity to latch onto Oracle, because we sold our company to them. So there we are.

PV: How exciting! Congratulations! So I noticed that there’s still a little, I guess a diversion in terms of the branding. Larry definitely introduced it as the Exadata Programmable Storage Server, and I double-checked the video. But in your blog, you’re calling it, for sure, just the Exadata Storage Server. Just how recently was the marketing/messaging developed for this?

KC: You know, I’m not a part of the Go-To-Market (GTM) efforts, but, you know, honestly, the way these things are brought to market . . .  They’re developed under a project name, and the project name remains the same for years. It was over the last few months that Marketing began cooking the name and what-have-you. Now, if you’re referring to something that Larry said in his keynotes, I have to admit I didn’t commit to photographic memory all the slides. And certainly, if he used the term “programmable”, I’m not going to correct Larry Ellison.

PV: [laughs] That would be risky.

Read the rest of this entry »

Analysis of the Oracle Exadata Storage Server and Database Machine

By: Christo Kutrovsky

*Updated* see comments.
Exadata — the smart storage server. I am definitely excited about this product, but my point of view is a bit different.

It’s fast, and much faster than anything out there right now. But how many shops will actually need this? How many shops can spend 2.2 million dollars on hardware and equipment?

What are the products, in a nutshell? The Oracle Exadata Storage Server (Data Sheet, PDF):

  • 2U Storage “unit” with either 1 TB SAS or 3.3 TB SATA redundant capacity. There is a query processor in the box that can “offload” tasks from the main database server. Primary filtering, decompression, joins, backups.
  • Storage units linked to database servers via dual Infiniband offering 20 Gbit/s (2.5 GBytes/sec) bandwidth

The Database Machine (Data Sheet, PDF):

  • A standard 42U rack with 8 database servers and 12 Exadata storage servers.
  • Pre-installed Linux and Oracle. Pre-configured.
  • In 8 servers — a total of 256GB RAM, 64 Intel cores @ 2.66 Ghz, InfiniBand-ed and gigabit-switched.

The cost for one Database Machine: $2.33M ($650,000 + $1,680,000 in software) as grabbed from Larry’s keynote (thank chet) I called the “call us now” phone mentioned on the Oracle Exadata website to ask them for pricing. They had no idea what I was asking about, and I’m still waiting on a salesperson to call me back. (Hint for Oracle — educate your sales staff about new products, just in case I decide to buy one the day after you announce it.)

You have to realize how “cheap” this is. It comes down to $25,000 per core for Oracle EE, RAC, and Partitioning! And extra “free” CPUs for decompressing, filtering and joining, and backups. That’s a good deal. Oh, did I mention you can interconnect several 42U racks?

Back to the main question, what problems does this product solve?

Read the rest of this entry »

Oracle’s Secret New Feature: Educated Guesses

By: Christo Kutrovsky

Larry Ellison is announcing a major new feature this Wednesday at Open World. For the first time in a while, his keynote is dedicated to the “database” as opposed to the usual high level ERP/Apps/Fusion. Even the title of his keynote is catchy — “Extreme Performance”.

Oracle has been keeping the new feature a secret. Even the 11gR2 beta program had very few participants to prevent information leaking out. It’s, “Something’s coming, but I am not telling what.”

Okay, it worked on me, I’m excited about it. Let’s think what it could be. What single database feature is so major, that Larry himself will announce it during OpenWorld?

What do we know so far?

  • Starting with the obvious, Larry’s keynote is “Extreme Performance”, so it’s related to performance.
  • We know Kevin Closson has worked on it - he had a blog entry saying “I am working on something big” that got pulled off the web. (Here’s Google’s cache.)

Given these two point, let’s further think about it. What do we know about Kevin?

  • He worked for PolyServe — a company whose main product is a cluster file system.
  • He worked for Sequent on NUMA systems, which in today’s world is pretty close to cluster software with a very fast, low latency interconnect.
  • He is an expert in storage systems and disk performance.
  • He joined Oracle recently, possibly to work on this secret project.
  • He must be really excited about it, to post anything on his blog under radio silence.

I think it’s something related to storage, something new and revolutionary about storage. But what?

We already know, from leaks on certain websites, that ASM will become a cluster filesystem which will allow storing OCR files, as well as user files, on the ASM disks.

But is this big enough? It’s definitely significant. Now you get a “free” reliable, cluster file system with Oracle. I don’t think it’s big enough though. Oracle already had OCFS and OCFS2. So it’s not something new to release a filesystem. And even if ASM becomes a true filesystem, that would not provide such a significant performance boost to warrant a keynote called “Extreme Performance”. An ASM filesystem would be a major manageability feature, not so much a performance feature.

That being ruled out, what could it be?

Recently, when setting up a new 11g database on a server with 128gb of RAM, I was setting up hugepages as usual, and thinking about how big my cache would be. It struck me that the cache will be bigger than the database for quite a while. Why do we even need the SAN/Datafiles?!

Then it hit me.

Read the rest of this entry »

Recent Spike Report from v$active_session_history (ASH)

By: Christo Kutrovsky

For the past few months I’ve been using a query that I refer to as “ash report - recent spike”. That’s the second thing I do when I get a call of the “the system is slow” type. The first thing I do is run “top” (or whichever alternative for the OS) and check the overall CPU usage.

The script is fully RAC-aware, and although it’s not 100% perfect, I use this imperfection to see if any particular node is doing something stupid. Although it is primarily targeted for OLTP systems, it can be useful for data warehouses as well, especially if they use the parallel option.

The query is to the database what “load” (uptime) for a Linux/Unix machine is, except it has much more detail. It is basically a summarization query of the v$active_session_history table. NOTE: you need to have the performance pack license to use it. It is not designed to be aligned, or read. The best is to leave it on just a few lines and concentrate on the results.

It has two “variables” that you can adjust: how far back to look (I use two hours), and how aggressively to look for problems (having count(*) >= 2).

An explanation of how to make sense of the results follows the query.

Read the rest of this entry »

How to Make an In-Database listener.log File

By: Christo Kutrovsky

Ever wished the listener.log file was a table in the database? Wish no more! About three years ago, I sent this recipe in an email to my co-workers. Just recently, Shakir re-sent it after using the method in an emergency. Since it seems to have proved its value, I now offer it to our readers.

Using Oracle’s external tables, we can “query” the listener.log file:

Step 1: Create an “oracle directory” of where your file is:

create directory TNSLOG as '$ORACLE_HOME/network/log';

Step 2: Create an external table definition. Note that no data is loaded, just the method reading the file:

CREATE TABLE listener_log (
timestamp date,
connect_data VARCHAR2(2000),
protocol_info VARCHAR2(80),
EVENT VARCHAR2(200),
SID VARCHAR2(200),
RETURN_CODE number
)
ORGANIZATION EXTERNAL
(TYPE oracle_loader
DEFAULT DIRECTORY TNSLOG
ACCESS PARAMETERS
(
RECORDS DELIMITED BY newline
NOBADFILE NODISCARDFILE NOLOGFILE
FIELDS TERMINATED BY "*" LRTRIM (timestamp char date_format DATE mask "DD-MON-YYYY hh24:mi:ss", connect_data, protocol_info, event,sid,return_code)
)
LOCATION ('listener.log')
)
REJECT LIMIT UNLIMITED;

Step 3: Query the table directly or load it into an Oracle table for better performance and consistency. You can limit your date range here to load only the period you need:

create table listener_log2 as
select TIMESTAMP,connect_data, event, sid, return_code,
substr(connect_data, instr(connect_data,'HOST=')+5, instr(connect_data,')', instr(connect_data,'HOST='))-instr(connect_data,'HOST=')-5)
as host from listener_log where timestamp >= sysdate - interval '3' day;

Note that I extract the “host” to see where connections are coming from.

Step 4: Query grouping by hour. If you need say by minute, replace hh with mi.

select host,trunc(timestamp,'hh'),count(*)-count(nullif(return_code,0)) as success, count(nullif(return_code,0)) as failure,count(*) as total
from listener_log2 group by host,trunc(timestamp,'hh') order by 2,1;

This a quick and dirty way of doing it, but it serves its purpose. Feel free to improve on this (perhaps have it extract all the connect data, including “service updates” etc.) and post the improved version in a comment.

Bind Peeking, Ad Hoc Queries, Stable Performance. On 10G you can only pick any two.

By: Christo Kutrovsky

I got to troubleshoot an amazing situation a few weeks ago. I think, we can all agree that assuming we are on a supported configuration of a production release of Oracle, it is essentially inconceivable that allowing a single query to run on your system can flip another query’s plans and cause major performance issues (and in this case even downtime!)

There have been plenty of posts about bind peeking. Alex Gorbachev wrote about it last year, and so did Jonathan Lewis. It’s a well known issue. However what hasn’t been written about is when it is expected to strike and cause you headaches.

A single query can change the plan of a number of other queries, but just sometimes.

One day you come to work (or get woken up) and you have a badly performing system. You look into it and it comes to one query. Nothing has changed, nothing has been modified.

I found the query no problems. I did an explain plan, comes up with an index range scan. Given the execution time however, it can’t be a range scan, it’s taking way too much time. Looking at v$sql_plan reveals the full table scan. At this point I was almost sure what the problem was.

How did I fix it? I did an “alter system flush shared_pool;” the query got reparsed, a new child cursor was created and all new executions of the query were using the expected and much faster index range scan.

Easy fix, but what happened? Nothing has been changed for months!

Sometimes it’s coincidence. The query just happens to run at the right time, or some job is delayed and queries run in a different order and the wrong plan is selected. Those are generally easy to catch as they happen early in the life of a system.

Sometimes it’s load. You decide to run an additional application, which causes more SQL statements to be aged out and re-parsed, creating a higher chance of parsing with “bad” values. Although the additional load will be blamed for the “slow” performance, the root cause will eventually be discovered. In a way, it will be expected that something will go bad when you increase the load.

And sometimes it’s a single ad hoc query with a new predicate that starts the slowly-ticking time bomb.

Here is how it happens . . .

Read the rest of this entry »

Start NowWith Pythian - database design, management and emergency handling capabilities...

Pythian Blog

Connecting to Oracle with SQL Server 2005 x64
The quirks of connecting to Oracle from SQL 2005 64
more



Live Updates

pythian: Pythian is now official members of the Microsoft Partner Program. Thanks Peter
more



RSSTestimonials

  • Casey Dyke

    Database Team Manager Service Delivery and Applications , Telstra

    Pythian were recently engaged to take a lead role in a high end infrastructure build project at Telstra. Our requirements were a combination of... more