THE WORLD DISCUSSES #PYTHIAN ON TWITTER. HAVE A QUESTION? USE OUR HASHTAG AND ASK AWAY.

RMOUG: Day 2

Day 2 finished yesterday. It was quite a busy day, with some excellent sessions.

Battle of the Nodes: RAC Performance Myths — Riyaj Shamsudeen
A great presentation on popular RAC myths, with some great examples. Excellent visuals that made complex processes look simple. I really liked this one.

Getting the Most Out of AWR — Tim Gorman
A first-rate session attended by a lot of the conference. It went into detail on what scripts are available to extract AWR information without needing Grid Control or Database Control. For command-line lovers, it’s great.

The SAN is guilty… until proven otherwise — Gaja Krishna Vaidyanatha
A very important session for all DBAs, showing the end-to-end components involved in database I/O. There are so many more components that can cause problems between the database and the physical spindles. Concepts, case studies, plenty of information.

Understanding Oracle Execution Plans: How SQL is Really Executed – Tanel Poder
One of those eye-opening sessions, starting with how to read SQL Execution Plans, and moving to showing stack traces and mapping function calls, to Execution Plan steps. A must-see for everyone tuning SQL.

And that is it. As exhausting conferences are, I always wish for them to have been longer.

RMOUG: Day 1

Day One at RMOUG in Denver is now over.

There were quite a few interesting presentations. Unfortunately, the very first I went to was canceled due to car trouble. I also found that several sessions of similar interest to me overlapped, so I had to choose my spots.

Advanced Oracle Troubleshooting
This presentation was particularly good. Tanel goes into detail on how to quickly asses a situation without going through a number of “health checks” and still be nowhere near solving the problem. His approach is to look directly at what a “hanging” session is waiting on, and to systematically determine the cause of the problem, with no time wasted.

Putting your database on a Diet: Oracle’s Data compression
A short overview of table compression. I found that that even though the presenter obviously had some experience with compression, there were hardly any examples nor anything mentioned about how to determine proper re-ordering to improve compression.

All About Oracle’s In-Memory Undo
An unusual topic—something that works so well that no one really talks about it. The presentation, however, was very short, and provided little new information. There was only one demonstrated test case. Although it went into detail about the difference between in-memory and standard undo, the other-than-obvious effects were omitted.

During lunch I took a picture that shows the entire RMOUG crowd:
RMOUG Day 1  Lunch

Tomorrow is Day 2, and I will be posting about it here.

Christo Kutrovsky Presenting at RMOUG

I am back on the road, going to RMOUG Training Days to present The Answer to Free Memory, Swap, Oracle, and Everything.

I am quite excited, as the RMOUG schedule (PDF) looks quite promising, especially these presentations:

  • Further RMAN Optimizations in 11g — Stephan Haisley
  • Advanced Oracle Troubleshooting: No Magic is Needed — Tanel Poder
  • Understanding Oracle Execution Plans: How SQL is Really Executed — Tanel Poder
  • The SAN is Guilty until proven otherwise — Gaja Krishna Vaidyanatha

Some of these overlap, so I guess I will have to make a difficult choice.

I hope to see you all in Denver.

Interview: Kevin Closson on the Oracle Exadata Storage Server

Last Friday (September 26), Paul Vallée and I were lucky enough to interview Kevin Closson about the Oracle Exadata Storage Server. A tidied-up stream of the audio is here: closson-interview.m3u.

The audio quality is a little spotty here and there, so you might like to follow the transcription below.

Paul gets the interview started.

Paul Vallée (PV): Christo Kutrovsky and myself, Paul Vallée. We’re on the line with Kevin Closson of Oracle (and prior to that with Hewlett-Packard, and prior to that with Polyserve, and prior to that with Sequent). A giant of our industry, and I’m honoured to be speaking to him. Kevin, hello.

Kevin Closson (KC): Well, they always say that flattery gets you nowhere, but apparently it’ll get you on the phone.

PV: [laughs] Very nice!

KC: No seriously, it’s more than a pleasure to be here. I like what you guys do, so this is good.

PV: Thank you, Kevin. So, we are here to talk about the work that Larry Ellison announced yesterday, specifically the work around the Oracle Database Machine and the Exadata Storage Server. Kevin, can you just quickly introduce yourself and how you came to be involved in the project?

KC: Right. So, I’m a performance architect with Oracle, and the project that I’m stationed on, if you will, is the development team for Oracle Exadata Storage Server. And the way I came to Oracle is, quite a few of the folks who are involved with the very genesis of Exadata are people that I’ve known and worked with closely dating back to the early ’90s. And after a fruitful endeavour as the chief software architect for Oracle solutions at Polyserve, it became an opportunity to latch onto Oracle, because we sold our company to them. So there we are.

PV: How exciting! Congratulations! So I noticed that there’s still a little, I guess a diversion in terms of the branding. Larry definitely introduced it as the Exadata Programmable Storage Server, and I double-checked the video. But in your blog, you’re calling it, for sure, just the Exadata Storage Server. Just how recently was the marketing/messaging developed for this?

KC: You know, I’m not a part of the Go-To-Market (GTM) efforts, but, you know, honestly, the way these things are brought to market . . .  They’re developed under a project name, and the project name remains the same for years. It was over the last few months that Marketing began cooking the name and what-have-you. Now, if you’re referring to something that Larry said in his keynotes, I have to admit I didn’t commit to photographic memory all the slides. And certainly, if he used the term “programmable”, I’m not going to correct Larry Ellison.

PV: [laughs] That would be risky.

Read the rest of this entry . . .

Analysis of the Oracle Exadata Storage Server and Database Machine

Pythian has a full-featured Oracle Exadata Practice complete with successful implementations and reference customers.

*Updated* see comments.
Exadata — the smart storage server. I am definitely excited about this product, but my point of view is a bit different.

It’s fast, and much faster than anything out there right now. But how many shops will actually need this? How many shops can spend 2.2 million dollars on hardware and equipment?

What are the products, in a nutshell? The Oracle Exadata Storage Server (Data Sheet, PDF):

  • 2U Storage “unit” with either 1 TB SAS or 3.3 TB SATA redundant capacity. There is a query processor in the box that can “offload” tasks from the main database server. Primary filtering, decompression, joins, backups.
  • Storage units linked to database servers via dual Infiniband offering 20 Gbit/s (2.5 GBytes/sec) bandwidth

The Database Machine (Data Sheet, PDF):

  • A standard 42U rack with 8 database servers and 12 Exadata storage servers.
  • Pre-installed Linux and Oracle. Pre-configured.
  • In 8 servers — a total of 256GB RAM, 64 Intel cores @ 2.66 Ghz, InfiniBand-ed and gigabit-switched.

The cost for one Database Machine: $2.33M ($650,000 + $1,680,000 in software) as grabbed from Larry’s keynote (thank chet) I called the “call us now” phone mentioned on the Oracle Exadata website to ask them for pricing. They had no idea what I was asking about, and I’m still waiting on a salesperson to call me back. (Hint for Oracle — educate your sales staff about new products, just in case I decide to buy one the day after you announce it.)

You have to realize how “cheap” this is. It comes down to $25,000 per core for Oracle EE, RAC, and Partitioning! And extra “free” CPUs for decompressing, filtering and joining, and backups. That’s a good deal. Oh, did I mention you can interconnect several 42U racks?

Back to the main question, what problems does this product solve?

Read the rest of this entry . . .

Oracle’s Secret New Feature: Educated Guesses

Larry Ellison is announcing a major new feature this Wednesday at Open World. For the first time in a while, his keynote is dedicated to the “database” as opposed to the usual high level ERP/Apps/Fusion. Even the title of his keynote is catchy — “Extreme Performance”.

Oracle has been keeping the new feature a secret. Even the 11gR2 beta program had very few participants to prevent information leaking out. It’s, “Something’s coming, but I am not telling what.”

Okay, it worked on me, I’m excited about it. Let’s think what it could be. What single database feature is so major, that Larry himself will announce it during OpenWorld?

What do we know so far?

  • Starting with the obvious, Larry’s keynote is “Extreme Performance”, so it’s related to performance.
  • We know Kevin Closson has worked on it – he had a blog entry saying “I am working on something big” that got pulled off the web. (Here’s Google’s cache.)

Given these two point, let’s further think about it. What do we know about Kevin?

  • He worked for PolyServe — a company whose main product is a cluster file system.
  • He worked for Sequent on NUMA systems, which in today’s world is pretty close to cluster software with a very fast, low latency interconnect.
  • He is an expert in storage systems and disk performance.
  • He joined Oracle recently, possibly to work on this secret project.
  • He must be really excited about it, to post anything on his blog under radio silence.

I think it’s something related to storage, something new and revolutionary about storage. But what?

We already know, from leaks on certain websites, that ASM will become a cluster filesystem which will allow storing OCR files, as well as user files, on the ASM disks.

But is this big enough? It’s definitely significant. Now you get a “free” reliable, cluster file system with Oracle. I don’t think it’s big enough though. Oracle already had OCFS and OCFS2. So it’s not something new to release a filesystem. And even if ASM becomes a true filesystem, that would not provide such a significant performance boost to warrant a keynote called “Extreme Performance”. An ASM filesystem would be a major manageability feature, not so much a performance feature.

That being ruled out, what could it be?

Recently, when setting up a new 11g database on a server with 128gb of RAM, I was setting up hugepages as usual, and thinking about how big my cache would be. It struck me that the cache will be bigger than the database for quite a while. Why do we even need the SAN/Datafiles?!

Then it hit me.

Read the rest of this entry . . .

Recent Spike Report from v$active_session_history (ASH)

For the past few months I’ve been using a query that I refer to as “ash report – recent spike”. That’s the second thing I do when I get a call of the “the system is slow” type. The first thing I do is run “top” (or whichever alternative for the OS) and check the overall CPU usage.

The script is fully RAC-aware, and although it’s not 100% perfect, I use this imperfection to see if any particular node is doing something stupid. Although it is primarily targeted for OLTP systems, it can be useful for data warehouses as well, especially if they use the parallel option.

The query is to the database what “load” (uptime) for a Linux/Unix machine is, except it has much more detail. It is basically a summarization query of the v$active_session_history table. NOTE: you need to have the performance pack license to use it. It is not designed to be aligned, or read. The best is to leave it on just a few lines and concentrate on the results.

It has two “variables” that you can adjust: how far back to look (I use two hours), and how aggressively to look for problems (having count(*) >= 2).

An explanation of how to make sense of the results follows the query.

Read the rest of this entry . . .

How to Make an In-Database listener.log File

Ever wished the listener.log file was a table in the database? Wish no more! About three years ago, I sent this recipe in an email to my co-workers. Just recently, Shakir re-sent it after using the method in an emergency. Since it seems to have proved its value, I now offer it to our readers.

Using Oracle’s external tables, we can “query” the listener.log file:

Step 1: Create an “oracle directory” of where your file is:

create directory TNSLOG as '$ORACLE_HOME/network/log';

Step 2: Create an external table definition. Note that no data is loaded, just the method reading the file:

CREATE TABLE listener_log (
timestamp date,
connect_data VARCHAR2(2000),
protocol_info VARCHAR2(80),
EVENT VARCHAR2(200),
SID VARCHAR2(200),
RETURN_CODE number
)
ORGANIZATION EXTERNAL
(TYPE oracle_loader
DEFAULT DIRECTORY TNSLOG
ACCESS PARAMETERS
(
RECORDS DELIMITED BY newline
NOBADFILE NODISCARDFILE NOLOGFILE
FIELDS TERMINATED BY "*" LRTRIM (timestamp char date_format DATE mask "DD-MON-YYYY hh24:mi:ss", connect_data, protocol_info, event,sid,return_code)
)
LOCATION ('listener.log')
)
REJECT LIMIT UNLIMITED;

Step 3: Query the table directly or load it into an Oracle table for better performance and consistency. You can limit your date range here to load only the period you need:

create table listener_log2 as
select TIMESTAMP,connect_data, event, sid, return_code,
substr(connect_data, instr(connect_data,'HOST=')+5, instr(connect_data,')', instr(connect_data,'HOST='))-instr(connect_data,'HOST=')-5)
as host from listener_log where timestamp >= sysdate - interval '3' day;

Note that I extract the “host” to see where connections are coming from.

Step 4: Query grouping by hour. If you need say by minute, replace hh with mi.

select host,trunc(timestamp,'hh'),count(*)-count(nullif(return_code,0)) as success, count(nullif(return_code,0)) as failure,count(*) as total
from listener_log2 group by host,trunc(timestamp,'hh') order by 2,1;

This a quick and dirty way of doing it, but it serves its purpose. Feel free to improve on this (perhaps have it extract all the connect data, including “service updates” etc.) and post the improved version in a comment.

Bind Peeking, Ad Hoc Queries, Stable Performance. On 10G you can only pick any two.

I got to troubleshoot an amazing situation a few weeks ago. I think, we can all agree that assuming we are on a supported configuration of a production release of Oracle, it is essentially inconceivable that allowing a single query to run on your system can flip another query’s plans and cause major performance issues (and in this case even downtime!)

There have been plenty of posts about bind peeking. Alex Gorbachev wrote about it last year, and so did Jonathan Lewis. It’s a well known issue. However what hasn’t been written about is when it is expected to strike and cause you headaches.

A single query can change the plan of a number of other queries, but just sometimes.

One day you come to work (or get woken up) and you have a badly performing system. You look into it and it comes to one query. Nothing has changed, nothing has been modified.

I found the query no problems. I did an explain plan, comes up with an index range scan. Given the execution time however, it can’t be a range scan, it’s taking way too much time. Looking at v$sql_plan reveals the full table scan. At this point I was almost sure what the problem was.

How did I fix it? I did an “alter system flush shared_pool;” the query got reparsed, a new child cursor was created and all new executions of the query were using the expected and much faster index range scan.

Easy fix, but what happened? Nothing has been changed for months!

Sometimes it’s coincidence. The query just happens to run at the right time, or some job is delayed and queries run in a different order and the wrong plan is selected. Those are generally easy to catch as they happen early in the life of a system.

Sometimes it’s load. You decide to run an additional application, which causes more SQL statements to be aged out and re-parsed, creating a higher chance of parsing with “bad” values. Although the additional load will be blamed for the “slow” performance, the root cause will eventually be discovered. In a way, it will be expected that something will go bad when you increase the load.

And sometimes it’s a single ad hoc query with a new predicate that starts the slowly-ticking time bomb.

Here is how it happens . . .

Read the rest of this entry . . .

Apartment Hunt in Dubai

As I mentioned earlier, the Pythian Dubai office is now open. I only mentioned it took me 5 days to find an apartment here, so I would like to share my experience.

Initially I was looking for apartments mainly in the Dubai Marina area (click here for interactive map with building names and pictures). It looked to be the closest to the Dubai Internet City, where the Pythian Dubai Office is. Ideally it would’ve been in walking distance.

I did find some apartments in the Dubai Marina, and they have some really nice buildings, with gyms, playgrounds, swimming pools on the roof (yes more then one). It’s quite nice, however the Dubai Marina is still under construction. At least half of it is with cranes, workers, heavy trucks. Half the streets don’t have sidewalks, and you have to share the street with zooming cars and heavy trucks.
Not the best place for walks, especially with a baby stroller. And “green” areas are almost non-existent.

I had to refocus on The Greens which is just across Sheikh Zayed road, the main highway here. It is actually closer to the Internet City. The first pictures I posted were from my temporary apartment, and I could see the office from it, however there is no way for a pedestrian to cross the highway (which is not elevated in fact). They are building a metro (which is elevated) and I am guessing when it’s ready, the metro station will have a bridge to both sides of Sheikh Zayed road and can be used as a bridge.

The Greens is a very nice neighborhood, but still full with construction. The difference is that the construction here is on the outskirts of The Greens, and the core is finished. There is a whole new area called TECOM build right next to The Greens. It’s impressive to see a whole neighborhood of 20 buildings 30 floors each been build at the same time. Now that I am writing this, I should’ve taken pictures … but I don’t go to this area, there’s nothing but construction there.

The furnished apartment I found is in not in a high-rise building. It’s a 6 floor building, or actually 4 buildings, arranged in a square. In the middle, there’s swimming pool, gym, playground, and general “play” area. You can’t get inside this area without access. It’s quite nice. Unfortunately my apartment does not look at the inside or “pool view” as it’s called.

I like the place. It reminds me of the “old times” when I was going in vacation resorts at the black sea. Almost feels like I am on vacation.

Start NowWith Pythian - database design, management and emergency handling capabilities...

Live Updates

pythian: RT @pythianfielding: My #ukoug2011 #Exadata IORM presentation starts in a few mins in hall 7A
more



Testimonials

  • Serge Racine

    DBA, Brookfield Energy

    We are very satisfied by the service given to us by Andre and Shakir in support of our recent data quality and reorganization initiative.... more