Today I want to show what kind of IO performance we can get from Oracle Database Appliance (ODA). In this part, I will focus on hard disks. That’s right — those good old brown spinning disks.
I often use Oracle ORION tool to stress-test an IO subsystem and find it’s limits. It’s a very simple and handy tool and usually provide most of the IO simulation I need.
I usually benchmark for small random IOs and for large sequential reads. This gives me a good idea what I can expect from this IO subsystem for OLTP workloads as well as bulk data processing workloads including data warehousing, backups and batch activity. I usually don’t stress test mixed workload until I know what’s the profile of the application that I will run on this platform. In this particular case, I’m more after generic IO stress test and finding the limit.
Today, let’s talk about small random IOs which is the attribute of the OLTP workloads. I’m interested in single IO response time and IOs per second (IOPS).
When I stress test an IO subsystem I usually process average numbers but I always remember that averages are just that — averages. Because my artificial ORION workload is pretty randomly distributed and I use reasonably small intervals, the results have good confidence for me but in some cases I would want to dig further and collect some histograms of IO latencies. I haven’t done it for Oracle Database Appliance though and knowing what’s behind I expect response time to be quite consistent – there is no disk cache or something similar that skews response time.
I should note that I use term IO response time and IO latency interchangeable here in case you are using these terms differently. It might be a bad habit but that’s what I do.
Before I stress test an IO subsystem, I usually set some expectations. Let’s do the same here. ODA has 20 disks – 15K RPM SAS disks. My experience tells me that I should expect very good single IO latency (below 10ms) from these disks serving at least 100 IOPS each. I also expect that these disks will still provide reasonable response time if you crank up the workload to about 200 IOPS but this is where I would see much higher response times — getting into 20ms range. Now, I know that 15K RPM SAS disk can deliver even more IOPS each but then IO response time becomes generally unacceptable for OLTP systems. In fact, 10 ms target is what’s been a good rule of thumb in the last decade. Read the rest of this entry . . .
At Kscope this year, I attended a half day in-depth session entitled Data Warehousing Performance Best Practices, given by Maria Colgan of Oracle. In that session, there was a section on how to determine I/O throughput for a system, because in data warehousing I/O per second (iops) is less important than I/O throughput (how much actual data goes through, not just how many reads/writes).
The section contained an Oracle-specific in-database tool, and a standalone tool that can be used on many operating systems, regardless of whether or not a database exists: Read the rest of this entry . . .
For the past two months, I have been running tests on TokuDB in my free time. TokuDB is a storage engine put out by Tokutek. TokuDB uses fractal tree indexes instead of B-tree indexes to improve performance, which is dramatically noticeable when dealing with large tables (over 100 million rows).
For those that like the information “above the fold”, here is a table with results from a test comparing InnoDB and TokuDB. All the steps are explained in the post below, if you want more details, but here’s the table:
Action
InnoDB
TokuDB
Importing ~40 million rows
119 min 20.596 sec
69 min 1.982 sec
INSERTing again, ~80 million rows total
5 hours 13 min 52.58 sec
56 min 44.56 sec
INSERTing again, ~160 million rows total
20 hours 10 min 32.35 sec
2 hours 2 min 11.95 sec
Size of table on disk
42 Gb
15 Gb
COUNT(*) query with GROUP BY
58 min 10.11 sec
5 min 3.21 sec
DELETE query
2 hours 46 min 18.13 sec
1 hour 14 min 57.75 sec
Size of table on disk
42 Gb
12 Gb
OPTIMIZE TABLE
1 day 2 hours 19 min 21.96 sec
21 min 4.41 sec
Size of table on disk
41 Gb
12 Gb
TRUNCATE TABLE
1 min 0.13 sec
0.27 sec
Size of table on disk
41 Gb
193 Mb (after waiting 60 seconds before doing an ls -l)
However, the tests are skewed — not on purpose, I am sure, and Mark admits he has not used the query cache before — but they are skewed all the same. Mark’s error was that he assumed he could just turn on the query cache and see if it works. Most features of MySQL do not work that way — you have to understand the strengths and weaknesses of the feature in order to use it properly.
Mark’s benchmark definitely reinforces that turning on the query cache without any knowledge of your system is a bad idea, and I agree with him on that. But it does not in any way mean that the query cache is always a bad idea. In fact, the MySQL manual page at http://dev.mysql.com/doc/refman/5.1/en/query-cache.html has some caveats about when the query cache is useful and when it is not useful.
It is important to know how the MySQL query cache works, so I will first explain that, and then explain why Mark’s test is not a very good broad generalization. MySQL’s query cache is not useful all the time, but it can be quite useful! If you are already familiar with the query cache, you can skip ahead to why Mark’s benchmark is skewed.
At the March Boston MySQL User Group meeting, Jacob Nikom of MIT’s Lincoln Laboratory presented “Optimizing Concurrent Storage and Retrieval Operations for Real-Time Surveillance Applications.” In the middle of the talk, Jacob said he sometimes calls what he did in this application as “real-time data warehousing”, which was so accurate I decided to give that title to this blog post.
Recently, I had an opportunity to participate in a stress-testing exercise. By stress-testing, I mean simulating an expected peak load on the database and observing how the database performs. The objective is to make sure the particular hardware can handle the expected load and also to test where the physical limit of the machine lies if we keep increasing the number of concurrent users.
I searched for tools available out there. On forums, most people were talking about Hammerora, and although some people recommended high end tools for big bucks, I decided to try this free tool that promised to do exactly what I needed it to do. The open source Hammerora turned out to be excellent tool for the purpose of simulating a typical transactions load in any number of threads. It’s built with the Tcl scripting language and it can simulate a real workload with as many client user sessions as your system can handle.
So if you’re looking for an easy way to measure throughput performance of your database, keep reading.