Kenny Tilton posted about database troubleshooting, using an anecdote to illustrate and elaborate on a law of troubleshooting that I strongly agree with: Always solve the first problem. The corollary to his law is that “there only is the first problem.” I’m not sure I entirely agree with that one, but I will admit that the corollary holds at least 90% of the time, which is often enough to make it an incredibly useful insight.
I have been automating tasks to make my team members’ lives easier. As part of the DBA service, we offer monitoring in the form of alerting, and also in the form of daily checks to ensure everything is running smoothly. Daily checks cover things that do not need to be checked every minute, but should be checked frequently. They are very valuable for ensuring that no changes get lost. A DBA might adjust the running configuration but forget to put the final changes in the config file. In that case, the next day our daily checks will throw a warning, and that DBA will say “oh yeah, I forgot to put that into the config file!”
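Here is a minimal sketch of what such a daily check could look like. It is not our actual tooling: the function names `parse_config` and `find_drift` and the sample values are hypothetical. It assumes the running settings have already been fetched from the server as a dictionary of strings (for MySQL, something like the output of `SHOW GLOBAL VARIABLES`) and that the config file uses simple `key = value` lines.

```python
def parse_config(text):
    """Parse simple key = value lines; skip blanks, comments, and [section] headers."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";", "[")):
            continue
        key, _, value = line.partition("=")
        # Option files may spell names with dashes; normalize to underscores.
        settings[key.strip().replace("-", "_")] = value.strip()
    return settings


def find_drift(running, on_disk):
    """Return {name: (running_value, file_value)} for settings that differ."""
    drift = {}
    for name, file_value in on_disk.items():
        running_value = running.get(name)
        if running_value is not None and running_value != file_value:
            drift[name] = (running_value, file_value)
    return drift


# Hypothetical sample data: a DBA raised max_connections on the live server
# but never updated the config file.
config_text = """
[mysqld]
max-connections = 200
query_cache_size = 0
"""
running_values = {"max_connections": "500", "query_cache_size": "0"}

warnings = find_drift(running_values, parse_config(config_text))
for name, (live, saved) in warnings.items():
    print(f"WARNING: {name} is {live} on the server but {saved} in the config file")
```

This sketch only looks at settings that appear in the config file, so a change to a setting that was never written to the file would go unnoticed. It also compares values as plain strings, so a real check would need to normalize things like size suffixes before comparing.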
Open Source means that the source code is open. There are many inferences that can be made from this, and many stereotypes that can be applied, but in the end, all it means is that you can read the source code as well as use the binaries. One of my team’s current tasks is to restore a compressed backup (made with InnoDB Hot Backup) from a client’s production machine to a development instance every week, so we want to automate it. We got everything going well, except uncompressing the production backup and applying any logs (with ibbackup).
I was recently asked a question by someone who had attended my Shmoocon talk entitled “Why are Databases So Hard to Secure?”. (PDF slides are available). I was going to put this into a more formal structure, but the conversational nature works really well. I would love to see comments reflecting others’ thoughts.
On Monday, March 10th, Sun makes a stop in Boston on its world tour of “Mashup Meetups”. If you can’t make it in person, join us on the live ustream videocast. We will have Sun and MySQL employees present, and there will be a short (30 min max) presentation on “What is Cluster Good For?”. We will try to broadcast live on ustream.tv if possible.
Welcome to the 87th edition of Log Buffer, the weekly review of database blogs.
This Sunday, March 9th, most locales in Canada and the US start to “save daylight” by “springing ahead” one hour. At 2 am local time (that would be “really late Saturday night” for the party-goers), the time jumps ahead one hour.
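To see the jump concretely, here is a small sketch using Python's standard `zoneinfo` module. The choice of `America/New_York` is an assumption for illustration (any zone that observed the change on March 9th, 2008 would do), and `add_elapsed` is a hypothetical helper. It adds real elapsed time by going through UTC, since plain arithmetic on a zone-aware `datetime` moves the wall clock without accounting for the jump.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # Python 3.9+; needs time zone data on the system


def add_elapsed(local_dt, delta):
    """Add real elapsed time to an aware datetime by doing the math in UTC."""
    return (local_dt.astimezone(timezone.utc) + delta).astimezone(local_dt.tzinfo)


eastern = ZoneInfo("America/New_York")  # assumed zone, for illustration only

# One minute before the change, on the morning of Sunday, March 9th, 2008.
before = datetime(2008, 3, 9, 1, 59, tzinfo=eastern)
after = add_elapsed(before, timedelta(minutes=1))

print(before.strftime("%H:%M %Z"))  # local time just before the jump
print(after.strftime("%H:%M %Z"))   # local time one real minute later
```

One real minute after 1:59 am the local clock reads 3:00 am, so any job scheduled for a local time between 2:00 and 2:59 that morning has no moment at which to run.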
Have you ever heard the one about throwing hardware at a software problem? I have this nifty RAC system that supports some very public and very mission-critical apps, and one day (it was Sunday night) it starts choking. We’re getting enqueues. Slowly they start climbing. Ten nodes came to a crashing halt. I have now seen a ten-node RAC cluster come to a crashing halt and completely lock up. Why, you ask? A simple SQL statement: DELETE FROM a WHERE b=c AND d=e;.
Today is Hotsos Symposium 2008 Training Day: one full day with Tom Kyte. Find out how I spent my last day at the Hotsos Symposium 2008.
I have been a MySQL DBA at The Pythian Group for three months (and 2 days) now. At most companies that is the probationary period, and I am still here, so that is a good sign. So, after three months, how do I like it? Glad you asked!