THE WORLD DISCUSSES #PYTHIAN ON TWITTER. HAVE A QUESTION? USE OUR HASHTAG AND ASK AWAY.

RAC+ASM 3 years in production. Stories to share (slides from RMOUG10)

Here are the slides from my presentation at RMOUG 2010.

I am not sure how much sense all this will make without my comments. We may do it in a webinar if there is sufficient interest. Regardless I will probably be doing it again at some point in the future.

Oracle Parallel Query Sorting and Index Creation Performance Problems

Ever wondered why recreating certain indexes takes forever, even when you do so in parallel? Ever wondered why certain PQ queries just don’t run that fast?

Here’s a serious performance bug that’s been in Oracle for a while, and finally there are hints of it been fixed, but only partially.

The bug happens when performing sorting operations in parallel, and the source data is already well sorted. The “ranger” doesn’t do a good job of properly assigning row ranges to sorter processes, and ~90% of the rows end up being sent to the same parallel process, regardless of the level of parallelism. So even if you have 256 CPUs, only about 10% performance improvement is achieve, instead of a factor of your parallelism when running the query in parallel.

For example, if the non parallel sort/index creation took 45 minutes, running with parallel 32 will take 41 minutes, instead of the possible 1.4 minutes (assuming you have sufficient horsepower).

Read the rest of this entry . . .

Adding Columns with Default Values and Not Null in Oracle 11g

Oracle 11g has a new performance enhancement when adding columns. In the pre-11g releases, adding a new not null column with a default value would have caused a massive update on the entire table, locking it for the operation and generating tons of undo and redo. I’ve seen this happening in production.

Oracle 11g has improved this behaviour by storing the default value in the metadata and making the column addition instantaneous. An example of this feature at work can be seen in 11g Enhancement for ALTER TABLE .. ADD COLUMN Functionality and some bugs regarding sysdate, as pointed out in the comments.

Although this is a welcomed enhancement, there are some unexpected aspects beyond the basic operations.

First, we know default values for new columns are stored in the metadata, but what happens when you change the default?

Read the rest of this entry . . .

RMOUG: Day 2

Day 2 finished yesterday. It was quite a busy day, with some excellent sessions.

Battle of the Nodes: RAC Performance Myths — Riyaj Shamsudeen
A great presentation on popular RAC myths, with some great examples. Excellent visuals that made complex processes look simple. I really liked this one.

Getting the Most Out of AWR — Tim Gorman
A first-rate session attended by a lot of the conference. It went into detail on what scripts are available to extract AWR information without needing Grid Control or Database Control. For command-line lovers, it’s great.

The SAN is guilty… until proven otherwise — Gaja Krishna Vaidyanatha
A very important session for all DBAs, showing the end-to-end components involved in database I/O. There are so many more components that can cause problems between the database and the physical spindles. Concepts, case studies, plenty of information.

Understanding Oracle Execution Plans: How SQL is Really Executed – Tanel Poder
One of those eye-opening sessions, starting with how to read SQL Execution Plans, and moving to showing stack traces and mapping function calls, to Execution Plan steps. A must-see for everyone tuning SQL.

And that is it. As exhausting conferences are, I always wish for them to have been longer.

RMOUG: Day 1

Day One at RMOUG in Denver is now over.

There were quite a few interesting presentations. Unfortunately, the very first I went to was canceled due to car trouble. I also found that several sessions of similar interest to me overlapped, so I had to choose my spots.

Advanced Oracle Troubleshooting
This presentation was particularly good. Tanel goes into detail on how to quickly asses a situation without going through a number of “health checks” and still be nowhere near solving the problem. His approach is to look directly at what a “hanging” session is waiting on, and to systematically determine the cause of the problem, with no time wasted.

Putting your database on a Diet: Oracle’s Data compression
A short overview of table compression. I found that that even though the presenter obviously had some experience with compression, there were hardly any examples nor anything mentioned about how to determine proper re-ordering to improve compression.

All About Oracle’s In-Memory Undo
An unusual topic—something that works so well that no one really talks about it. The presentation, however, was very short, and provided little new information. There was only one demonstrated test case. Although it went into detail about the difference between in-memory and standard undo, the other-than-obvious effects were omitted.

During lunch I took a picture that shows the entire RMOUG crowd:
RMOUG Day 1  Lunch

Tomorrow is Day 2, and I will be posting about it here.

Christo Kutrovsky Presenting at RMOUG

I am back on the road, going to RMOUG Training Days to present The Answer to Free Memory, Swap, Oracle, and Everything.

I am quite excited, as the RMOUG schedule (PDF) looks quite promising, especially these presentations:

  • Further RMAN Optimizations in 11g — Stephan Haisley
  • Advanced Oracle Troubleshooting: No Magic is Needed — Tanel Poder
  • Understanding Oracle Execution Plans: How SQL is Really Executed — Tanel Poder
  • The SAN is Guilty until proven otherwise — Gaja Krishna Vaidyanatha

Some of these overlap, so I guess I will have to make a difficult choice.

I hope to see you all in Denver.

Interview: Kevin Closson on the Oracle Exadata Storage Server

Last Friday (September 26), Paul Vallée and I were lucky enough to interview Kevin Closson about the Oracle Exadata Storage Server. A tidied-up stream of the audio is here: closson-interview.m3u.

The audio quality is a little spotty here and there, so you might like to follow the transcription below.

Paul gets the interview started.

Paul Vallée (PV): Christo Kutrovsky and myself, Paul Vallée. We’re on the line with Kevin Closson of Oracle (and prior to that with Hewlett-Packard, and prior to that with Polyserve, and prior to that with Sequent). A giant of our industry, and I’m honoured to be speaking to him. Kevin, hello.

Kevin Closson (KC): Well, they always say that flattery gets you nowhere, but apparently it’ll get you on the phone.

PV: [laughs] Very nice!

KC: No seriously, it’s more than a pleasure to be here. I like what you guys do, so this is good.

PV: Thank you, Kevin. So, we are here to talk about the work that Larry Ellison announced yesterday, specifically the work around the Oracle Database Machine and the Exadata Storage Server. Kevin, can you just quickly introduce yourself and how you came to be involved in the project?

KC: Right. So, I’m a performance architect with Oracle, and the project that I’m stationed on, if you will, is the development team for Oracle Exadata Storage Server. And the way I came to Oracle is, quite a few of the folks who are involved with the very genesis of Exadata are people that I’ve known and worked with closely dating back to the early ’90s. And after a fruitful endeavour as the chief software architect for Oracle solutions at Polyserve, it became an opportunity to latch onto Oracle, because we sold our company to them. So there we are.

PV: How exciting! Congratulations! So I noticed that there’s still a little, I guess a diversion in terms of the branding. Larry definitely introduced it as the Exadata Programmable Storage Server, and I double-checked the video. But in your blog, you’re calling it, for sure, just the Exadata Storage Server. Just how recently was the marketing/messaging developed for this?

KC: You know, I’m not a part of the Go-To-Market (GTM) efforts, but, you know, honestly, the way these things are brought to market . . .  They’re developed under a project name, and the project name remains the same for years. It was over the last few months that Marketing began cooking the name and what-have-you. Now, if you’re referring to something that Larry said in his keynotes, I have to admit I didn’t commit to photographic memory all the slides. And certainly, if he used the term “programmable”, I’m not going to correct Larry Ellison.

PV: [laughs] That would be risky.

Read the rest of this entry . . .

Analysis of the Oracle Exadata Storage Server and Database Machine

*Updated* see comments.
Exadata — the smart storage server. I am definitely excited about this product, but my point of view is a bit different.

It’s fast, and much faster than anything out there right now. But how many shops will actually need this? How many shops can spend 2.2 million dollars on hardware and equipment?

What are the products, in a nutshell? The Oracle Exadata Storage Server (Data Sheet, PDF):

  • 2U Storage “unit” with either 1 TB SAS or 3.3 TB SATA redundant capacity. There is a query processor in the box that can “offload” tasks from the main database server. Primary filtering, decompression, joins, backups.
  • Storage units linked to database servers via dual Infiniband offering 20 Gbit/s (2.5 GBytes/sec) bandwidth

The Database Machine (Data Sheet, PDF):

  • A standard 42U rack with 8 database servers and 12 Exadata storage servers.
  • Pre-installed Linux and Oracle. Pre-configured.
  • In 8 servers — a total of 256GB RAM, 64 Intel cores @ 2.66 Ghz, InfiniBand-ed and gigabit-switched.

The cost for one Database Machine: $2.33M ($650,000 + $1,680,000 in software) as grabbed from Larry’s keynote (thank chet) I called the “call us now” phone mentioned on the Oracle Exadata website to ask them for pricing. They had no idea what I was asking about, and I’m still waiting on a salesperson to call me back. (Hint for Oracle — educate your sales staff about new products, just in case I decide to buy one the day after you announce it.)

You have to realize how “cheap” this is. It comes down to $25,000 per core for Oracle EE, RAC, and Partitioning! And extra “free” CPUs for decompressing, filtering and joining, and backups. That’s a good deal. Oh, did I mention you can interconnect several 42U racks?

Back to the main question, what problems does this product solve?

Read the rest of this entry . . .

Oracle’s Secret New Feature: Educated Guesses

Larry Ellison is announcing a major new feature this Wednesday at Open World. For the first time in a while, his keynote is dedicated to the “database” as opposed to the usual high level ERP/Apps/Fusion. Even the title of his keynote is catchy — “Extreme Performance”.

Oracle has been keeping the new feature a secret. Even the 11gR2 beta program had very few participants to prevent information leaking out. It’s, “Something’s coming, but I am not telling what.”

Okay, it worked on me, I’m excited about it. Let’s think what it could be. What single database feature is so major, that Larry himself will announce it during OpenWorld?

What do we know so far?

  • Starting with the obvious, Larry’s keynote is “Extreme Performance”, so it’s related to performance.
  • We know Kevin Closson has worked on it – he had a blog entry saying “I am working on something big” that got pulled off the web. (Here’s Google’s cache.)

Given these two point, let’s further think about it. What do we know about Kevin?

  • He worked for PolyServe — a company whose main product is a cluster file system.
  • He worked for Sequent on NUMA systems, which in today’s world is pretty close to cluster software with a very fast, low latency interconnect.
  • He is an expert in storage systems and disk performance.
  • He joined Oracle recently, possibly to work on this secret project.
  • He must be really excited about it, to post anything on his blog under radio silence.

I think it’s something related to storage, something new and revolutionary about storage. But what?

We already know, from leaks on certain websites, that ASM will become a cluster filesystem which will allow storing OCR files, as well as user files, on the ASM disks.

But is this big enough? It’s definitely significant. Now you get a “free” reliable, cluster file system with Oracle. I don’t think it’s big enough though. Oracle already had OCFS and OCFS2. So it’s not something new to release a filesystem. And even if ASM becomes a true filesystem, that would not provide such a significant performance boost to warrant a keynote called “Extreme Performance”. An ASM filesystem would be a major manageability feature, not so much a performance feature.

That being ruled out, what could it be?

Recently, when setting up a new 11g database on a server with 128gb of RAM, I was setting up hugepages as usual, and thinking about how big my cache would be. It struck me that the cache will be bigger than the database for quite a while. Why do we even need the SAN/Datafiles?!

Then it hit me.

Read the rest of this entry . . .

Recent Spike Report from v$active_session_history (ASH)

For the past few months I’ve been using a query that I refer to as “ash report – recent spike”. That’s the second thing I do when I get a call of the “the system is slow” type. The first thing I do is run “top” (or whichever alternative for the OS) and check the overall CPU usage.

The script is fully RAC-aware, and although it’s not 100% perfect, I use this imperfection to see if any particular node is doing something stupid. Although it is primarily targeted for OLTP systems, it can be useful for data warehouses as well, especially if they use the parallel option.

The query is to the database what “load” (uptime) for a Linux/Unix machine is, except it has much more detail. It is basically a summarization query of the v$active_session_history table. NOTE: you need to have the performance pack license to use it. It is not designed to be aligned, or read. The best is to leave it on just a few lines and concentrate on the results.

It has two “variables” that you can adjust: how far back to look (I use two hours), and how aggressively to look for problems (having count(*) >= 2).

An explanation of how to make sense of the results follows the query.

Read the rest of this entry . . .

Start NowWith Pythian - database design, management and emergency handling capabilities...

Live Updates

pythian: RT @sheeri: #confoo talk "Bending Queries to your Will with EXPLAIN" slides http://bit.ly/explainslides & handout
more



Testimonials

  • Serge Racine

    DBA, Brookfield Energy

    We are very satisfied by the service given to us by Andre and Shakir in support of our recent data quality and reorganization initiative.... more