Author Archive

Overloading BINARY

By Sheeri Cabral November 19th, 2008 at 5:28 am
Posted in MySQLNon-Tech Articles
Tags:

“There are 10 types of people in the world — those who understand binary, and those who don’t.”

The term “binary” in MySQL has many different meanings. How many can you come up with? I have 6, but I am willing to believe there are more!

0) “Binary distribution” is the name for the package that contains a binary. Another use is “binary installation” but that’s pretty much the same usage pattern as “binary distribution”, so I won’t count “binary installation” as a separate usage.
1) “Server binary” or “client binary” is the actual program (mysqld, mysql).
2) “Binary format” is a compressed format. For example, DECIMAL is stored in a “binary format” — each group of nine digits is compressed into 4 bytes.
3) “Binary log” is the change log. You can argue that this is an extension of #3, because the binary log is a compressed log, but “binary log” is encountered ubiquitously in the MySQL world, and “binary format” is somewhat obscure knowledge.
4) “Binary CHARSET” - sets the collation to be case-insensitive. For instance, utf8_bin and latin1_bin are the binary collations for the utf8 and latin1 character sets, respectively.
5) “Binary string” - a byte string. This is also known as the BINARY data type. There is some kind of wit/pun in the fact that a number in binary is actually a “bit string”.

Any more I missed? There are over 1600 mentions of the word “binary” in the 5.0 manual!

The State of Open Source Databases: OpenSQL Camp Keynote Featuring Brian Aker

By Sheeri Cabral November 16th, 2008 at 8:37 am
Posted in MySQLNon-Tech Articles
Tags:

Brian Aker delivers the keynote speech at OpenSQL Camp: State of the Open Source Databases. The presentation begins with a disclaimer:
“There is no way I’m going to tell you exactly where the future of databases go. We have way too many egos in the room to ever even begin a discussion…”
and ends with Aker saying,
“What the hell does that mean?”

View the video online at
http://technocation.org/node/649/play
or download the 42.6 MB Flash video file (.flv) directly at http://technocation.org/node/649/download.

My summary: (more…)

MySQL 5.1 GA Release

By Sheeri Cabral November 16th, 2008 at 6:52 am
Posted in MySQLNon-Tech ArticlesNot on Homepage
Tags:

The MySQL 5.1 GA Release will be on or about Dec. 6th, 2008.

How do I know?
(more…)

OpenSQL Camp is in full swing!

By Sheeri Cabral November 15th, 2008 at 11:35 am
Posted in MySQLNon-Tech Articles
Tags:

(one note, I have updated a previous blog post by adding the video for the LISA presentation I gave entitled “How to Stop Hating MySQL”)

OpenSQL Camp is in full swing! Baron Schwartz has done an amazing job organizing this free unconference.

We are well into the 2nd session of the day, and the quality of the presentations is excellent (though I expected that!) and it is always great to see people.

Some pictures….
(more…)

Video: How to Stop Hating MySQL

By Sheeri Cabral November 12th, 2008 at 5:02 pm
Posted in MySQL
Tags:

(Note: updated with the presentation video on 11/15/2008)

At LISA 2008, I gave a presentation entitled “How to Stop Hating MySQL: Fixing Common Mistakes and Myths”.

The presentation slides can be downloaded as a PDF at:

http://technocation.org/files/doc/stophatingmysql.pdf

View the video online at http://technocation.org/node/646/play or download the 202.5 MB Flash video file (.flv) directly at http://technocation.org/node/646/download.

Here are some notes and links I referred to:

Technocation, Inc containing free videos, a MySQL podcast (currently on hiatus) and blog posts about MySQL.

Why you want to switch to MySQL 5.1

How MySQL Uses Memory

Query Cache Configuration

EXPLAIN manual page

EXPLAIN cheatsheet

Max VARCHAR Size

By Sheeri Cabral November 8th, 2008 at 5:13 pm
Posted in MySQL
Tags:

The maximum length of a VARCHAR is only restricted by the maximum row length. In most storage engines, the maximum row length is the maximum allowed by MySQL, which is 65,535 bytes. Only the NDB storage engine has a different maximum value.

The VARCHAR data type has some overhead. If the length of VARCHAR is less than 255, one byte per row is used to store the actual length of the string. If the length of VARCHAR is greater than 255, the overhead cost of storing the string length is two bytes per row. Thus, the maximum length of a VARCHAR should be 65,533 bytes.

However, that is not the case:

mysql> CREATE TABLE max_len_varchar(fld VARCHAR(65533) CHARSET latin1);
ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. You have to change some columns to TEXT or BLOBs

mysql> CREATE TABLE max_len_varchar(fld VARCHAR(65532) CHARSET latin1);
Query OK, 0 rows affected (0.16 sec)

Why is that? I believe that is because there is also a 1-byte overhead per row, though I have not confirmed that. There is some allusion to that on the Forge Internals page……but I have not confirmed that it is the case.

(one of the reasons I believe this is that there’s a different error message when the character set is multi-byte:

mysql> CREATE TABLE max_len_varchar(fld VARCHAR(16384) CHARSET utf8);
ERROR 1074 (42000): Column length too big for column 'fld' (max = 16383); use BLOB or TEXT instead

This actually gives the maximum length you can use. Note that the maximum usable length still seems to be missing 1 byte from the theoretical max of 65,535:

mysql> SELECT 16383*4+2;
+-----------+
| 16383*4+2 |
+-----------+
|     65534 |
+-----------+
1 row in set (0.00 sec)

)

(and yes, you can use TEXT and BLOB fields if you actually want a variable-length string with more than 65,532 bytes, that’s not really the point of this article)

Happy Birthday to Wonderful Community Organizers!

By Sheeri Cabral November 5th, 2008 at 9:46 am
Posted in Non-Tech ArticlesNot on Homepage
Tags:

Happy birthday to Giuseppe Maxia, a wonderful community organizer for MySQL.

And happy birthday to Leslie Hawthorn, Google’s geek herder.

Thank you for making life better for open source volunteers like me!

Virtualizing MySQL

By Sheeri Cabral October 27th, 2008 at 8:04 pm
Posted in MySQLNon-Tech Articles
Tags:

I had so much to say in response to a recent post asking about virtualization from Jennifer Glore that I realized it was long enough to be a blog post.

It really depends on what you’re looking to do. Many companies don’t have the money and staff to have an in-house data center with proper power and network redundancy; others don’t want the depreciation associated with owning computer hardware (even if they leased space in a data center, they’d have to buy equipment to put in it).

Some reasons to virtualize:
1) you need a fresh machine and cannot wait to order a new one or re-purpose an older one.
2) your need for machines/services fluctuates (and again, re-purposing takes time). This need can be as broad as employee desktops or as specialized as needing extra machines for a qa cycle.
3) you own resources that are not being utilized to the fullest extent — virtualization can sometimes make more or better use of these resources.
4) you need to easily re-create an exact environment and do not want to worry about hardware differences.

Here at The Pythian Group, we have clients using many different types of virtualization.

A few clients are using MySQL on Amazon’s EC2 platform. The biggest advantage is also one of the biggest disadvantages — before EBS (elastic block storage) was offered, the threat of a reboot wiping the filesystem clean meant that we really had to ensure that we had:

1) redundancy
2) a catalog of what was needed on the machine — everything from users to perl modules. This can be done either by using a machine image, documenting a setup and recreating the machine manually, or via automation. By using tools like CFEngine or Puppet to control machine configurations, our clients have the added benefit of more standardized installations and layouts. As well, pushing a change (say, adding a new hire’s public key into an authorized_keys file) is made much much more easy.

The downside is, of course, the work to set up tools such as this. However, they are fairly common best practices, and are almost always good to implement.

The clients that are using Amazon EC2 are happy with the service they get; One client I work with especially closely moved to EC2 because they had a bad experience with their hosting provider. I cannot speak to pricing, but I do know that being able to just *have* another machine up and running in minutes is very useful … periodic work such as load testing and qa cycles work really well.

We also have many clients who “virtualize” MySQL by running more than one instance on a server. A few clients have a replication slave that has 2 instances — one as a read-only reporting instance, and another as an instance to backup, doing a cold backup — stopping the instance, copying the files, starting the instance back up. Other clients have 4-5 MySQL instances running on one machine, for developers to have individual database instances to write code against and as backups for 4-5 different physical machines.

IMPORTANT: Registration for OpenSQL Camp

By Sheeri Cabral October 27th, 2008 at 7:42 pm
Posted in MySQLNon-Tech ArticlesPostgreSQL
Tags:

OpenSQL Camp is a free unconference happening Friday, Nov. 14th through Sunday, Nov. 16th in Charlottesville, Virginia, USA. More details can be found at http://www.opensqlcamp.org.

Whether or not you have already registered for OpenSQL Camp by putting your name here:

http://www.opensqlcamp.org/index.php?title=Events/2008/AttendeeList

you should be aware of very important information.
(more…)

Liveblogging: A Five-Step Framework for Achieving the Strategic Value of Cloud Computing

By Sheeri Cabral October 23rd, 2008 at 2:07 pm
Posted in Non-Tech Articles
Tags:

I took part in a webinar on cloud computing today, including some of the top names in cloud computing services. As Pythian has some MySQL clients using cloud computing, I was particularly interested…

I was interested by the many levels of what cloud computing means, including such categorizations as Facebook apps being a part of the cloud. I think many of us consider cloud computing to mean “virtual infrastructure as a service” and overlook some pretty robust cloud computing that’s already out there, such as “application components as a service” and “software [platform] as a service”.

Following are my notes:

“Our objective today is to cut through some of the noise associated with ‘cloud’ and get to a real world approach for getting some serious value from the cloud.”
(more…)