Secrets of Oracle’s Automatic Degree of Parallelism

Automatic degree of parallelism, or Auto DOP, is a new feature in 11gR2 that promises to help manage systems where a large subset of the workload runs with parallel processing. In this post I’ll introduce the feature and share some very useful tips on how to use it that I got from Oracle’s Real World Performance expert Greg Rahn. So this is worth reading even if you are already familiar with the feature.

The problem is fairly well known: your system has only a finite amount of resources. There are only so many CPUs, and only so many disks capable of delivering only so many IO/s and MB/s. A certain query may have amazing performance when running with 32 parallel processes all alone on your test system. When 5 people need to run it at once, and at the same time there are two scheduled jobs running, each with its own parallel processes, there are two likely outcomes:

  1. You will run more parallel processes than your system is capable of serving, resulting in long queues on the CPU and storage and overall performance degradation.
  2. You limit the maximum number of parallel processes to protect the database resources, and some of the queries degrade. If you don’t detect it, the ETL process that should have finished in two hours takes 24, which means that the daily report sent to the CEO is missing some of the data. Ouch.
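To put rough numbers on the two outcomes above, here is a minimal back-of-the-envelope sketch. Only the 32-way query, the 5 users and the 2 scheduled jobs come from the scenario; the per-job degree of parallelism, the core count and the process limit are made-up assumptions. The proportional downgrade is a deliberately naive model for illustration, not how Oracle actually decides on a degree of parallelism.

```python
# Figures from the scenario in the text.
QUERY_DOP = 32         # parallel processes per query on the test system
CONCURRENT_USERS = 5   # people running the query at once
SCHEDULED_JOBS = 2     # scheduled jobs running at the same time

# Hypothetical figures, assumed purely for illustration.
JOB_DOP = 16           # assumed parallel processes per scheduled job
CPU_CORES = 64         # assumed number of cores on the server
PROCESS_LIMIT = 64     # assumed cap on parallel processes (outcome 2)


def total_requested(query_dop, users, job_dop, jobs):
    """Parallel processes requested if everything runs at its full degree."""
    return query_dop * users + job_dop * jobs


def oversubscription(requested, cpu_cores):
    """Outcome 1: how many parallel processes compete for each core."""
    return requested / cpu_cores


def capped_dop(requested_dop, total, limit):
    """Outcome 2 (naive model): shrink each statement proportionally so the
    total fits under the limit. Not Oracle's real downgrade algorithm."""
    if total <= limit:
        return requested_dop
    return max(1, requested_dop * limit // total)


if __name__ == "__main__":
    total = total_requested(QUERY_DOP, CONCURRENT_USERS, JOB_DOP, SCHEDULED_JOBS)
    print("requested processes:", total)
    print("processes per core:", oversubscription(total, CPU_CORES))
    print("query DOP under the cap:", capped_dop(QUERY_DOP, total, PROCESS_LIMIT))
```

With these assumed numbers, the query that was tested with 32 processes either fights for an oversubscribed machine or runs with roughly a third of the parallelism it was tuned for. That is the trade-off Auto DOP is meant to manage.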

Read the rest of this entry . . .

OOW11: Exalytics Hits The Stage — In-Memory Analytics

News from Oracle OpenWorld flor…

What is Exalytics?

It’s a BI appliance: something like an application middle tier for complete Business Intelligence data warehousing solutions. You put it in front of Exadata and users get all the tools to work with that data: analyze, predict, run reports, and so on.

Exalytics is a server running the Oracle BI Suite (OBIEE) side by side with Essbase. OBI works with a relational data warehouse such as Exadata.

Exalytics features Essbase, which can pull the data from Exadata and use its in-memory analytical capabilities to give users richer functionality. Real-time analytics is something SAP HANA seems to target as well.

I’m still not sure how TimesTen fits here but we will learn soon.

It’s one server with 1TB of DRAM and four 10-core Xeon CPUs, plus, of course, InfiniBand to connect to the Exadata back end.

My note from the LJE comments – go easy with in-memory data compression – it’s expensive to decompress it each time.

I’m interested to see how Exalytics handles unstructured data analysis.

Data Warehousing Best Practices: Comparing Oracle to MySQL, part 2 (partitioning)

At Kscope this year, I attended a half day in-depth session entitled Data Warehousing Performance Best Practices, given by Maria Colgan of Oracle. My impression, which was confirmed by folks in the Oracle world, is that she knows her way around the Oracle optimizer.

See part 1 for the introduction and the discussion of power and hardware. This part will go over the 2nd “P”, partitioning. Learning about Oracle’s partitioning has gotten me more interested in how MySQL’s partitioning works, and I do hope that MySQL partitioning will develop to the level that Oracle partitioning has reached, because Oracle’s partitioning looks very nice (then again, that’s why it costs so much I guess).
Read the rest of this entry . . .

Data Warehousing Best Practices: Comparing Oracle to MySQL, part 1 (introduction and power)

At Kscope this year, I attended a half day in-depth session entitled Data Warehousing Performance Best Practices, given by Maria Colgan of Oracle. My impression, which was confirmed by folks in the Oracle world, is that she knows her way around the Oracle optimizer.

These are my notes from the session, which include comparisons of how Oracle works (which Maria gave) and how MySQL works (which I researched to figure out the difference, which is why this blog post took a month after the conference to write). Note that I am not an expert on data warehousing in either Oracle or MySQL, so these are more concepts to think about than hard-and-fast advice. In some places, I still have questions, and I am happy to have folks comment and contribute what they know.
Read the rest of this entry . . .

Sydney Oracle Meetup #8 — Exadata Extravaganza

What: Sydney Oracle Meetup #8 — Exadata Extravaganza

When: Friday, July 17, 2009 5:30 PM (please, make sure to RSVP yes/no/maybe)

Where: Sydney CBD. Join the meetup for the detailed location.

The topic for this meetup is quite exciting: Oracle Exadata and everything about it. David Centellas, Senior Database Consultant from Oracle, will give a technical presentation on Exadata and, after the break, we will have an open forum discussion where two of Oracle’s Enterprise Architects, Tim Rubin and Chris Jones, will answer our questions and share their real-world experience.

Schedule:

  • 5:30pm – 6:00pm — Networking with food and refreshments
  • 6:00pm – 7:00pm — Presentation on Exadata by David Centellas
  • 7:00pm – 7:30pm — Break and informal networking
  • 7:30pm – 8:30pm — Open Forum based on real-world Exadata experience with David Centellas, Tim Rubin and Chris Jones
  • Post-event: optional gathering in the nearby pub. It’s Friday night, after all!

Read the rest of this entry . . .

Real Time Data Warehousing Presentation and Video

At the March Boston MySQL User Group meeting, Jacob Nikom of MIT’s Lincoln Laboratory presented “Optimizing Concurrent Storage and Retrieval Operations for Real-Time Surveillance Applications.” In the middle of the talk, Jacob said he sometimes calls what he did in this application “real-time data warehousing”, which was so accurate that I decided to give that title to this blog post.

The slides can be downloaded in PDF format (1.3 Mb) at http://www.technocation.org/files/doc/Concurrent_database_performance_02.pdf.

This talk discussed how to do real-time retrieval operations while doing concurrent high volume insertion, including:

  • how to keep up with an incoming data stream of 1.5 Mb/second per server
  • server hardware comparison between a multi-core AMD Opteron and a multi-core Intel Xeon
  • MySQL/Postgres comparison
  • schema design
  • design of the storage/retrieval benchmark
  • tuning MySQL

Read the rest of this entry . . .

Oracle Open World 2008 Diaries: HP Oracle Database Machine

For those of you who didn’t see Larry Ellison’s keynote, here it is, courtesy of Sheeri.

We cut out the HP part, but I don’t think anyone will complain. It’s not the best angle, because we didn’t get there early enough to secure the right location for the camera.

Read the rest of this entry . . .

Analysis of the Oracle Exadata Storage Server and Database Machine

Pythian has a full-featured Oracle Exadata Practice complete with successful implementations and reference customers.

*Updated* see comments.
Exadata — the smart storage server. I am definitely excited about this product, but my point of view is a bit different.

It’s fast, and much faster than anything out there right now. But how many shops will actually need this? How many shops can spend 2.2 million dollars on hardware and equipment?

What are the products, in a nutshell? The Oracle Exadata Storage Server (Data Sheet, PDF):

  • 2U storage “unit” with either 1 TB SAS or 3.3 TB SATA redundant capacity. A query processor in the box can “offload” tasks from the main database server: primary filtering, decompression, joins, and backups.
  • Storage units linked to database servers via dual Infiniband offering 20 Gbit/s (2.5 GBytes/sec) bandwidth

The Database Machine (Data Sheet, PDF):

  • A standard 42U rack with 8 database servers and 12 Exadata storage servers.
  • Pre-installed Linux and Oracle. Pre-configured.
  • Across the 8 servers: a total of 256GB RAM, 64 Intel cores @ 2.66 GHz, InfiniBand-ed and gigabit-switched.

The cost for one Database Machine: $2.33M ($650,000 + $1,680,000 in software), as grabbed from Larry’s keynote (thanks, Chet). I called the “call us now” phone number mentioned on the Oracle Exadata website to ask them for pricing. They had no idea what I was asking about, and I’m still waiting on a salesperson to call me back. (Hint for Oracle: educate your sales staff about new products, just in case I decide to buy one the day after you announce it.)

You have to realize how “cheap” this is. It comes down to $25,000 per core for Oracle EE, RAC, and Partitioning! And extra “free” CPUs for decompressing, filtering and joining, and backups. That’s a good deal. Oh, did I mention you can interconnect several 42U racks?
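For anyone who wants to check the per-core arithmetic, here is a quick sketch using only the figures quoted above. It assumes the divisor is the 64 database-server cores from the data sheet summary; that is my reading, not something the keynote spelled out, and the function name is just for illustration.

```python
# Figures quoted in the post.
HARDWARE_USD = 650_000
SOFTWARE_USD = 1_680_000
DB_SERVER_CORES = 64   # assumption: divide by the 64 database-server cores


def per_core(amount_usd, cores):
    """Spread a dollar amount evenly across the given number of cores."""
    return amount_usd / cores


if __name__ == "__main__":
    total = HARDWARE_USD + SOFTWARE_USD
    print(f"total:             ${total:,}")
    print(f"software per core: ${per_core(SOFTWARE_USD, DB_SERVER_CORES):,.2f}")
    print(f"total per core:    ${per_core(total, DB_SERVER_CORES):,.2f}")
```

On these numbers the software works out to $26,250 per core, which is in the same ballpark as the rounded $25,000 figure, and hardware plus software comes to about $36,400 per core.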

Back to the main question, what problems does this product solve?

Read the rest of this entry . . .

Implementing Many-to-many Relationships in Data Warehousing

This article will discuss how to make many-to-many relationships in data warehousing easily queried by novice SQL users using point-and-click query tools.

This is a big problem with Oracle Discoverer-like tools where the metadata layer is basically a set of pre-joined tables from which the user simply clicks on columns and hits the run button. You can create custom complex queries that they can run, but then every query is custom, which defeats the purpose of the tool in the first place.

The design goal is to create a structure that is simple for the end user, which normally translates to something as flat as possible. This article will go through the different methods of implementing many-to-many relationships, and look at their effect on query complexity, especially for someone who uses a tool that hides the SQL.

Read the rest of this entry . . .
