THE WORLD DISCUSSES #PYTHIAN ON TWITTER. HAVE A QUESTION? USE OUR HASHTAG AND ASK AWAY.

There is more than one way to do it….

I spent Friday examining the systems for a more traditional consulting gig (in case you did not know, Pythian has offered more traditional consulting and emergency services in addition to our remote DBA offering for our entire 12 year history). It is a familiar story to many people — the database performance was unacceptable.

The company had a few log tables that had huge amounts of inserts and foreign keys, so they used InnoDB. Unfortunately, they also used a GUID as a primary key (which is getting more and more common these days, but long primary keys with InnoDB slow down INSERT, UPDATE and DELETE commands a great deal) — varchar(32) and utf8.

That’s right — their primary key for many of these tables was 96 bytes long (32 characters * 3 bytes per character), and as an InnoDB table, the primary key is clustered with every other index. One of the more frequently used tables had 8 indexes other than the PRIMARY KEY — 7 were indexes on varchar(32) fields, the other was on an int(11) field (signed integer).

The query load on the server was 99% writes, 1% reads. The log tables only had rows inserted, not deleted or updated. Thus, we offered a few different solutions (in no particular order):

1) Use a BLACKHOLE table on the master and replicate to a slave server. This would make the INSERTs to the master fast, though the slave would lag behind. The slave could be changed to MyISAM or ARCHIVE or some other storage engine.

2) Use MyISAM on the master and slave. This was the solution we went to, because the log tables could be re-created with no data and moved into place, and the loss of foreign keys was not a big deal, since the application checked for them anyway. ARCHIVE could also be used, it is faster than MyISAM on inserts because it only allows one index, the PRIMARY KEY. concurrent_insert was set at the default of 1, but we changed it to 2 just in case there was an errant UPDATE or DELETE causing a gap in the MyISAM table.

3) Add a PRIMARY KEY that is AUTO_INCREMENT, keeping the GUID a UNIQUE KEY. This would speed up the writes without sacrificing any data. This can be combined with dropping unnecessary foreign keys to make the inserts faster.

4) Move the log tables to a different machine entirely. This would require the application to have two different database handlers.

5) Use INSERT DELAYED into the logging tables.

We discussed these scenarios and the pros and cons, and eventually we chose to do #2, which we could do right away, and indeed, we had an immediate positive impact. Almost all the unnacceptable performance went away.

We were also able to audit the system and provide several good recommendations for ways they can improve their system. It is exciting for me to be able to help people out in a very tangible way — often we do not see some of the really crazy problems that affect few people (such as slave lag getting further and further behind) because our remote DBA clients have the benefit of ongoing help from us, and we generally find the smaller degradations in performance before they become a large issue.

Being able to spend a few hours with a new environment and really help them out was extremely fulfilling for me. I love that MySQL has many different options (and there are likely more that I did not think of) that all would work to achieve better performance. I think people like to have a choice, too, so they do not feel they were forced to do something they did not want to do. Especially when there are solutions on both the DBA and developer side.

Using the Sphinx Search Engine with MySQL

MySQL Full Text Search Limitations

Suppose you have a MyISAM table containing a column with a full text index. This table starts to grow to a significant size (millions of rows) and gets updated fairly frequently. Chances are that you’ll start to see some bottlenecks when accessing this table, since without row level locking, the reading and writing operations will be blocking each other.

A solution that many people would suggest right away is to use the master for writes and a slave for reads, but this only masks the problem, and it won’t take long before enough read traffic on the slave starts causing slave lags.

Why Sphinx?

Read the rest of this entry . . .

MySQL Schema Synchronization and GUI Tools

Or, There is a Use for GUI Tools!

Recently I was working on a problem for a customer. They are converting a server with two InnoDB data files to innodb_file_per_table:

$ ls -l ibdata*
-rwxr-xr-x  1 mysql  mysql  2.0G Oct 14 13:10 ibdata1
-rwxr-xr-x  1 mysql  mysql  350G Oct 14 13:10 ibdata2

I honestly don’t recall ever seeing (or hearing about) so large a data file.

The method chosen for conversion boils down to this:

  1. stop and start the server to enable innodb_file_per_table
  2. alter all tables to myisam
  3. stop server and delete ibdata file
  4. restart server
  5. convert tables back to InnoDB
  6. add foreign keys

This post isn’t about how we did it, or about whether or not it was the best way, or anything like that. This post is really about the the last two steps.

There were over 500 InnoDB tables and something like 375 foreign keys. So I needed to somehow create the ALTER TABLE statements for these 500 servers. To drop the Foreign Keys, I used the following SELECT statement:

SELECT CONCAT( "mysql -u root -pPASS ", table_schema," -e 'ALTER TABLE ",
table_name, ' DROP FOREIGN KEY ', constraint_name,';'' &')
AS list_of_fks FROM information_schema.KEY_COLUMN_USAGE
WHERE REFERENCED_TABLE_NAME is not null
ORDER BY TABLE_NAME, COLUMN_NAME;

This created a list of statements that I could then put in a batch file, edit a little, and have them run in parallel.

I used the following SELECT statement to find all InnoDB tables and prepare a command to convert them to MyISAM:

SELECT CONCAT("mysql -u root -pPASS ", TABLE_SCHEMA, " -e 'ALTER TABLE ",
TABLE_NAME, " ENGINE=MYISAM;' & ")
FROM INFORMATION_SCHEMA.TABLES WHERE ENGINE='INNODB';

Again, the output can be put in a batch file, modified a little, and run in parallel.

Once everything was converted to MyISAM, I shut down the database and removed the InnoDB log and data files and modified the my.cnf to so I had file_per_table and a much smaller InnoDB data file. That was the fairly straightforward part. Now the fun began. Read the rest of this entry . . .

MySQL Back to Basics: Analyze, Check, Optimize, and Repair

It felt like the right time for us to look back at some useful commands for table maintenance that some of us may not have mastered as much as we might like to think.

In my post about gathering index statistics, I referred to OPTIMIZE TABLE, ANALYZE TABLE, and REPAIR TABLE — but I never explained in depth what the different commands do, and what the differences between them are. That is what I thought I would do with this post, focusing on InnoDB and MyISAM, and the differences in how they treat those commands. I will also look at different cases and see which one is right for in each case.

Read the rest of this entry . . .

MyISAM Statistics Gathering

So, how does one gather statistics on indexes? With InnoDB it’s one thing, with MyISAM it’s another thing. This post however, will focus on MyISAM. There are various ways, each having their own drawbacks and positive sides.

What about ANALYZE TABLE? It’s great — very fast in InnoDB. But with MyISAM a lock occurs while scanning the table.

OPTIMIZE TABLE? Still relatively quick with MyISAM, unless the table in question is huge. InnoDB requires an ALTER TABLE — it might take forever, depending on the server configuration, as OPTIMIZE TABLE for InnoDB maps to a ALTER TABLE tableName ENGINE=InnoDB

We all know how ANALYZE TABLE, REPAIR TABLE, and OPTIMIZE TABLE work with MyISAM. However, there’s a less explored way, for MyISAM — the myisam_stats_method.

Read the rest of this entry . . .

Start NowWith Pythian - database design, management and emergency handling capabilities...

Live Updates

pythian: RT @sheeri: #confoo talk "Bending Queries to your Will with EXPLAIN" slides http://bit.ly/explainslides & handout
more



Testimonials

  • Serge Racine

    DBA, Brookfield Energy

    We are very satisfied by the service given to us by Andre and Shakir in support of our recent data quality and reorganization initiative.... more