Log Buffer #99: A Carnival of the Vanities for DBAs
May 30, 2008 / By David Edwards
Welcome the the 99th edition of Log Buffer, the weekly review of database blogs.
Since PGCon recently happened right here in Ottawa, let’s start with some posts about it, and about PostgreSQL. Josh Berkus came to the conference with his Database Soup. It sounds like he enjoyed himself: “So, that’s pgCon. It was exciting and fun. All of you PG geeks who missed it should be kicking yourselves about now, and putting in budget requests for next year.” He has day one highlights; day two highlights, and also some photos from the conference.
Robert Treat of zillablog also has his take-aways. “The biggest thing to come out of this past week to me was recognition of the continued growth of the postgresql community. With a number of regional conferences springing up over the past year, I wondered if PGCon would be able to match the experience from last time, and was happy to see that it far exceeded it. . . . One thing I noted was that we have a very large presence of Postgres in Ottawa, I think larger than what we had at PGEast for the Baltimore/Washington Area.”
On Greg’s Postgres stuff, Greg Sabino Mullane has an item on verifying master-slave replication with check_postgres.pl, a montoring script for Nagios and other systems.
Hubert Lubaciewski of select * from depesz; looks at different approaches to counting the number of distinct elements, such as distinct sessions per day.
On An Expert’s Guide to Oracle Technology, Lewis Cunningham examines the recent news of the worlds largest database running on Postgres. How large? 2 Petabytes! “Let’s put that in perspective,” writes Lewis. “1 petabyte is 1,000,000,000,000 bytes. Yahoo has two of those. . . . [They use] this database to analyze the browsing habits of its half a billion monthly visitors. How would you like to tune those queries?” Lewis also notes that this is not the plain-vanilla PostgreSQL that you or I would download and run on our website box.
There are conferences on the horizon of the Oracle world too. Dan Norris reminds us that ODTUG is just around the corner. That’s the Oracle Developer Tools User Group’s Kaleidoscope 2008 event, taking place in mid-June in New Orleans.
Gareth Roberts of In Depth Apps announces the call-for-papers of the New Zealand Oracle Users Group’s (NZOUG) 2008 conference, coming in October.
In other Oracle matters — Chen Shapira, just a simple DBA on a complex production system, has some thoughts on cumulative distribution. “How do we calculate Cumulative Normal Distribution? This means calculating the probability that a random variable from a normal distribution, will end up equal to or less than X.”
Slavik Markovich looks at removal of sensitive data, writing, “Most DBAs I know copy database files from production to create staging, integration and test environments. Those environments contain a lot of sensitive information (PII, CC, etc.) which is usually either deleted, scrambled or truncated. The problem with these solutions is that most DBAs forget that the database performs logical deletes and not physical deletes. . . . DBAs out there – what do you do to remove sensitive information from your non-production environments?
Tanel Poder looks into using autonomous transactions for sleeping. (No, Tanel. You need healthy, natural sleep. Chew valerian root and get more exercise.) But seriously, he says, “Itâ€™s more a hack than a real solution though, but may be useful for someone else too.”
Tanel is also looking for help with a weird MS SQL Server bug. He begins, “Weâ€™ve created an encrypted database level trigger on DDL operations in all databases including the model database so that when a new database is created the trigger will be created in the new database as well. The problem weâ€™ve encountered is that the encrypted triggers are not correctly copied to the new database.”
Alexander Kuznetsov shows how to reuse your code with table-valued UDFs in SQL Server 2005. In the comments, Adam Machanic points to a couple other posts discussing the performance hit associated with UDFs.
Kalen Delaney also is concerned with performance, asking What’s Worse Than a Table Scan? “I have frequently heard SQL Server developers and DBAs gasp when a query plan is indicating that SQL Server is performing a table scan, thinking that is the worst thing that could ever happen to a query. The truth is, it’s far from the worst thing and in addition, not all table scans are created equal. . . . I’m going to show you that two different table scans on the same data in a heap can give very different performance.”
Denis Gobo’s blog features two interviews with SQL Server . The first, an interview with Craig Freedman about indexing, query plans and performance; the second, with Erland Sommarskog about SQL Server and Transact SQL.
On Ramblings of a DBA, Tara Kizer gives us a howto on troubleshooting SQL Server connectivity problems. SQL Protocols covers some similar ground with their SQL Server 2005 Remote Connectivity Issue TroubleShoot.
On the MySQL side of the street a couple bloggers had something to say about Twitter’s recent downtime. Sheeri K. Cabral says Twitter should get back to basics, laying out what seems to have happened and a remedy or two, including a solicitation of help. High Availability MySQL’s Mark Callaghan suggests that perhaps MySQL can learn from Titter’s misfortune: “Is it time to stage an intervention before PostgreSQL or Oracle get the account,” he asks.
Also into the clusters is Jeremy Cole, who notes that MySQL Cluster has been split from the main MySQL distribution. “Overall, this seems like a good idea â€” the needs for releases, schedules, pressures, etc., are at odds between MySQL Cluster and MySQLâ€™s core,” he says. “I am, however, baffled by the decision of how to release the new product: coupled with MySQL as a single monolithic package with compiled-in storage engine. After all, 5.1 has long been touted to have this amazing new pluggable storage engine architecture. Why not use it?”
On the MySQL Performance Blog, Baron Schwartz asks, can MySQL temporary tables be made safe for statement-based replication?. “A while ago I wrote about how to make MySQL replication reliable, part of which is to eliminate temporary tables. . . . This is a contentious topic. People love their temporary tables and will ask hopefully “are you sure this isn’t safe?” They’ll propose all sorts of ways to mitigate the danger, and I’ve heard many of them. But I recently heard an angle on this I had not heard before.”
“If you need to combine two scripts with MySQL Proxy,” writes Giuseppe Maxia, “you have three choices. You can manually rewrite the two scripts (good luck!)[;] you can use the load-multi module to load scripts on demand; or you can use the proxy chaining technique.” Giuseppe provides illustrates how to chain proxies.
And that is all. Next week, The Big 100!
And it’s still up for grabs! So, if you think you want to welcome the DBA world to your blog for this landmark Log Buffer, send me an email. (And send me one anyway if you’d like to get involved some other time.)