Posted by Sheeri Cabral on Sep 28, 2009
I spent Friday examining the systems for a more traditional consulting gig (in case you did not know, Pythian has offered more traditional consulting and emergency services in addition to our remote DBA offering for our entire 12 year history). It is a familiar story to many people — the database performance was unacceptable.
The company had a few log tables that had huge amounts of inserts and foreign keys, so they used InnoDB. Unfortunately, they also used a GUID as a primary key (which is getting more and more common these days, but long primary keys with InnoDB slow down INSERT, UPDATE and DELETE commands a great deal) — varchar(32) and utf8.
That’s right — their primary key for many of these tables was 96 bytes long (32 characters * 3 bytes per character), and as an InnoDB table, the primary key is clustered with every other index. One of the more frequently used tables had 8 indexes other than the PRIMARY KEY — 7 were indexes on varchar(32) fields, the other was on an int(11) field (signed integer).
The query load on the server was 99% writes, 1% reads. The log tables only had rows inserted, not deleted or updated. Thus, we offered a few different solutions (in no particular order):
1) Use a BLACKHOLE table on the master and replicate to a slave server. This would make the INSERTs to the master fast, though the slave would lag behind. The slave could be changed to MyISAM or ARCHIVE or some other storage engine.
2) Use MyISAM on the master and slave. This was the solution we went to, because the log tables could be re-created with no data and moved into place, and the loss of foreign keys was not a big deal, since the application checked for them anyway. ARCHIVE could also be used, it is faster than MyISAM on inserts because it only allows one index, the PRIMARY KEY. concurrent_insert was set at the default of 1, but we changed it to 2 just in case there was an errant UPDATE or DELETE causing a gap in the MyISAM table.
3) Add a PRIMARY KEY that is AUTO_INCREMENT, keeping the GUID a UNIQUE KEY. This would speed up the writes without sacrificing any data. This can be combined with dropping unnecessary foreign keys to make the inserts faster.
4) Move the log tables to a different machine entirely. This would require the application to have two different database handlers.
5) Use INSERT DELAYED into the logging tables.
We discussed these scenarios and the pros and cons, and eventually we chose to do #2, which we could do right away, and indeed, we had an immediate positive impact. Almost all the unnacceptable performance went away.
We were also able to audit the system and provide several good recommendations for ways they can improve their system. It is exciting for me to be able to help people out in a very tangible way — often we do not see some of the really crazy problems that affect few people (such as slave lag getting further and further behind) because our remote DBA clients have the benefit of ongoing help from us, and we generally find the smaller degradations in performance before they become a large issue.
Being able to spend a few hours with a new environment and really help them out was extremely fulfilling for me. I love that MySQL has many different options (and there are likely more that I did not think of) that all would work to achieve better performance. I think people like to have a choice, too, so they do not feel they were forced to do something they did not want to do. Especially when there are solutions on both the DBA and developer side.