Archive for the ‘Group Blog Posts’ Category

The Guru is In: Usenix 2008, Boston

By Sheeri Cabral June 23rd, 2008 at 7:42 am
Posted in Group Blog Posts, MySQL, Non-Tech Articles

If you are attending Usenix 2008 at the Sheraton Hotel in downtown Boston, you can meet me and ask your burning MySQL questions at my “The Guru is In” session. On Friday, June 27th, 2008 from 2 - 3:30 pm in Constitution B, I will be helping folks out by optimizing queries and schemas, teaching general principles of working with MySQL databases, and answering (to the best of my ability) any other question they may throw at me.

The event details are at:
http://www.usenix.org/events/usenix08/tech/#fri

Hope to see you there!

Pythian in eWeek, the backstory

By Paul Vallee June 20th, 2008 at 4:08 pm
Posted in Group Blog Posts, Pythian

I was happy to be invited by Brian Prince at eWeek to answer some questions he had posed to Pythian, NTirety and industry analysts Noel Yuhanna of Forrester and Peter O’Kelley of the Burton Group.

You can take a look at the end result here: How to Decide if Remote Database Admins are Right for you.

I found the process interesting. I had actually provided a lot more content, which I include below, and I strenuously disagree with some of the analyst statements, especially the statement that the processes must be totally licked before engaging an external vendor. Are all the other vendors staffed with rookies following established processes and that’s it? It’s a very strange statement to make given that many, indeed most, of our customers turn to us to help them define best processes in terms of capacity planning, availability optimization, and security. It’s specifically contradicted by Corey’s statement about how his shop optimizes and automates processes. We do the same thing, of course, and we routinely inherit shops with tons of low-hanging fruit where we can dramatically streamline the efforts.

Then again, maybe Noel is thinking of dbaDirect. If I may say, “eeks!”. Anyway, not all companies in this space are made alike, I guess.

I think it would be interesting for most to read the article linked above, and compare with the following answers to his questions that I provided to Brian, mostly because it generates some respect for the challenge of cutting down content, and also because it illustrates to what degree the media selects sources and answers in support of its pre-established story. Fascinating, no?

1) What are some of the benefits of taking a remote DBA approach to database administration?

There are several benefits that I could list, but ultimately it becomes a matter of resource availability and agility. With an outsourced provider, more technical skills are on tap, at all hours of the day, with more escalation support and for more weeks of the year than with a fully insourced strategy. Furthermore, agility is greatly increased as the service provider can scale with the project needs and the likelihood that the service provider has already performed a task or a project is much, much higher than for a single in-house hire. This can dramatically reduce technology adoption inertia. In larger teams, I might mention that Pythian’s blended insourced/outsourced model allows a best of both worlds strategy to be implemented.

2) Talk about Pythian’s business model. How do you price your services to make them competitive with paying a FT DBA?

Pythian’s service model is a no lock-in, scopeless, linear cost-to-effort model that is disruptive to the traditional outsourcing model of flat monthly rates over a lock-in period.

In the traditional outsourcing contract, services are limited to a service level agreement (SLA) covering a strict scope of work, and the vendor’s profit model is centered around minimizing their costs associated with delivering that scope of work. This means that in traditional outsourcing, the vendor is literally motivated by the contract to deliver as little value-add as possible while still satisfying the arbitrary SLA, which is fixed for the lifetime of the contract, sometimes as long as 5 or more years. This problem is at the heart of what is wrong with outsourcing as currently conceived: no matter how successful the vendor is at automating, streamlining, tuning, and improving processes, the customer does not see any of those savings as they are paying a previously-negotiated rate.

Pythian’s model allows customers to subscribe to a fraction of a small team of engineers, with no lock-in whatsoever, so that the quantity of effort the customer requires is changeable on 30 days’ notice or can be cancelled for convenience at any time. This means that Pythian is constantly earning our next month’s renewal, which keeps us on our toes and keeps us motivated to add as much value as we can within the allotted effort level. Our profit model is customer-friendly and well understood: it is a simple mark-up model on our costs of service delivery. This means that as we automate, streamline, tune, and improve processes the customer gets to tap into those efficiencies, either by re-sizing the contract downwards to claim cash savings with no morale penalty, or by increasing the responsibilities that flow to Pythian without needing to increase the allotment of hours.

In smaller shops that might only need one or fewer full-time DBAs, our model is cost competitive because only the effort required to run the shop needs to be provisioned, and our processes, delivery model and expertise all combine to outpace our markup quite handily. Often, it is not an insourced vs. outsourced decision, however. In many larger shops, Pythian is engaged alongside the in-house team in order to dramatically increase the team’s capabilities, technical experience, availability at all hours and project support. Both use cases work very well.

3) How does an organization get in touch with Pythian in the event of an emergency (i.e., the database suddenly fails)? How fast can they expect a response?

In the event of an emergency, the likelihood is that our monitoring software will have already alerted us and we will be on the job immediately. Generally, Pythian acts as an extension of the customer’s existing team and as such any method that the customer has adopted to collaborate is supported by Pythian. For instance, we use every third party instant messaging platform alongside internal platforms, email of course and telephones, and also videoconferencing and telepresence technology. Whatever the customer is using to collaborate with a resource working on another floor of the same building, we are also using. It is completely seamless.

Our response guidelines set the expectation that, in an emergency, a Pythian resource should be available on the affected system in single-digit minutes or less, all year round. Customers have live access to a backup engineer at all times as well as to a service delivery manager, for a total of three resources on-call. Our escalation guidelines allow our customers to engage the backup engineer and the service delivery manager immediately with no mandatory waiting period. Some of the systems we manage have downtime costs in the six figures per hour range, so this is the standard of care that we have implemented as a result.

4) What about other functions DBAs perform like providing end user and developer support?

This is a key advantage of the scopeless model that Pythian champions. Any work that the client wishes to flow to Pythian, we can do. This includes end user support, developer support, back-end development of triggers and functions, data modeling, data warehousing dimensional modeling, building transforms, input on best practices for security, business continuity planning and change control, you name it.

5) How does the process work with Pythian - are customers assigned a specific DBA for all their needs?

Customers are assigned to a small team of three to five engineers with an appropriate skillset to support the customer’s implemented technology. That team is led by a team lead who takes primary responsibility for the excellence in service delivery for the team. However, a primary DBA is specifically not assigned, because one of the goals of the service is to isolate the customer from the pain of turnover, vacation, sick days, etc., and that goal would be compromised if we failed to disseminate the client knowledge evenly over a large enough team to be able to absorb that kind of change. Each team maintains an on-call schedule of its own, so that it’s always the same small group of people doing the customer’s support, which creates a client intimacy that results in Pythian becoming a seamless extension of the client’s tech team. Furthermore, each team reports to a service delivery manager, who has on-call responsibilities as well, so that the client has management support from Pythian at any time.

6) Doesn’t doing this effectively require an understanding of a business’s apps and processes? How does Pythian deal with that?

Of course. In this regard, engaging the Pythian team is no different than making a new in-house hire that happens to work in another office, or another floor of the same office. On day one, we will be contributing primarily our technical expertise while being complete rookies on the internal company-specific applications and processes. As time goes on, however, our expertise on the in-house specifics will increase much in the same way a new hire will gain that expertise over time. Among our 110 customers, Pythian has two customers that have been customers continuously since 1999, and ten customers that have been customers continuously since 2002 or earlier. For those customers, we are in a real sense the “old-timers” in the shop and are a key source of organizational knowledge for application structure, data model, and processes!

Billy Joel and Databases

By Sheeri Cabral June 6th, 2008 at 3:48 pm
Posted in Group Blog Posts, Non-Tech Articles, Not on Homepage

So, we have all heard that Billy Joel played a concert at Oracle’s OpenWorld in 2007.

What follows is an actual IRC conversation among Don Seiler, Dave Edwards, and myself:

(4:02:46 PM) don: ha @ Billy Joel at OOW
(4:03:38 PM) dave: “We didn’t fire the startup…”
(4:07:53 PM) don: “we didn’t start the backup”?
(4:12:53 PM) dave: “Don’t go changin’ . . . your slave and master”
(4:20:19 PM) ***sheeri shoots Dave
(4:20:49 PM) sheeri: “I don’t want clever replication, we never could have come this far”
(4:24:05 PM) sheeri: “And the server sounds like an aero-plane, and replication chugs along as it must…and the inserts go on, replication corrupts, and I say “Man, now I’m workin’ all night!”

(4:24:29 PM) dave: “I said ‘ls -u’ . . . that’s for access”
[“I said I love you . . . that’s forever”]

(4:24:30 PM) don: UP-TIME GIRL
(4:34:09 PM) dave: “Say it’s not wrong, execution plan!”
(4:43:39 PM) sheeri: Where’s my execution plan, oh man?
[Sing us a song of a piano man]
(4:45:52 PM) sheeri: Go ahead with your schema, leave me alone!

Comment here with your own database-themed parody of a Billy Joel song. Perhaps if we get enough MySQL-themed entries, we can get him to come to the MySQL Conference in April.

That and maybe thousands of dollars………..

What Does Open Source Mean?

By Sheeri Cabral June 4th, 2008 at 9:47 pm
Posted in Group Blog Posts

At last night’s event, a lot of the questions were really implicitly asking, “Is open source better? Why?”

The first answer everyone comes up with is that it’s free, and that’s better.

However, that is neither necessary nor sufficient to deem it “better”.

If MySQL did exactly the same tasks Oracle did, but was free, there’s still a huge amount of money involved when migrating. Merely staffing the migration costs a lot of money.

Companies using open source technologies because they are free are (probably) making the right software choice for the wrong reason.

Firstly, open source does not have to be free — MySQL proves that. Their Enterprise source code is free to paying customers (and whoever paying customers distribute to, but that is not the issue).

Secondly, open source’s benefits far outweigh mere license costs, though the license cost is definitely the most tangible benefit.

I realized while the benefits of open source were being touched upon that the benefits are not lacking in the closed software world; they are simply much harder to come by. For instance, there are companies that reverse engineer solutions and develop their own in-house solutions without being able to read a line of original code. Surely it is easier to build a home-grown solution when the code is readable to begin with.

As well, the talent pool for open source is greater, because there is a lower barrier to entry. It’s still just as difficult to separate the wheat from the chaff as it is in a closed source world. However, if your company is willing to hire the top 10%, I’d rather try to find the top 10% from a pool of tens of thousands of people than from a pool of thousands.

The oft-quoted “you can hack it yourself if you want” still applies, and more so the idea that “even if the company goes out of business, or the core developers stop developing, others can pick up where the previous developers left off.”

One issue we did not touch upon was that open source tends to follow a popular concept in “extreme programming” — the idea that the software is always working. It may not have all the features, maybe it’s not much more than “hello world”, but it works. A feature is added, the code integrated, and it still works, now with +1 feature.

I think the issue is that in general, it is *easier* to reap these benefits from open source than from closed. It makes the argument more difficult, because it’s *possible* to reap similar (or the same) benefits from closed source, but it’s easier with open source.

MySQL Focuses on Community

By Sheeri Cabral June 4th, 2008 at 9:28 pm
Posted in Group Blog Posts

Post Summary: An apology with a lesson.

When Steve Curry contacted me just after the MySQL Conference and Expo asking me if I’d be interested in a community roundtable, I was excited. Not just because Steve Curry brought me an inflatable pink dolphin after I squee’d that I needed one, although I never forget when someone does me a favor.

However, a few weeks ago it seemed like the event was more of a PR gathering than a community roundtable. I was disappointed, and told Steve as much.

And then, one of two things happened:

1) My concerns were brought up, discussed and folks decided a roundtable involving community was a good idea;
or
2) I had come up with two different pictures of the event in my mind, based on my expectations of “community roundtable” at first and “event with businesses and PR, to include community” as the final description.

Now, last night was an excellent opportunity for me and also a lot of fun. A lot of the questions were really implicitly asking, “Is open source better? Why?” More on that in the next post, I promise.

So I wanted to say to MySQL that I was wrong.

I am sorry.

Sure, MySQL did not know what I was thinking. And certainly the event could have turned out to be one I did not enjoy.

The lesson to learn from this is that sometimes we get upset at our perception of reality, and not reality itself.

And to follow up on my cranky post where I was annoyed at the MySQL website’s lack of functionality at http://www.pythian.com/blogs/1016/mysql-website-a-reflection-of-values, I feel I should note that I got a call later that day from MySQL’s web designer telling me that my concerns were valid and MySQL was actively working on them. Indeed, www.mysql.com has added a “Documentation” link in the orange submenu (first is “Products” and second is “Downloads”, so I completely agree with their prioritization as well).

The other lesson: Always trade business cards with people, so they have your contact information when they want to contact you. A phone call was so much more powerful than an e-mail ever could have been.

Oracle Open World 2008 Sessions — Vote on Oracle Mix

By Alex Gorbachev June 2nd, 2008 at 10:01 pm
Posted in Group Blog Posts, Oracle

In recent months, several Oracle community web sites were created. Well, I remember I registered on one or two, but I couldn’t really keep an eye on many, so I decided to wait and see which one wins. Turns out that Oracle Mix came out as a winner. Maybe it’s just my impression.

But I digress… I just wanted to make a quick note that Oracle Mix organized an interesting hybrid between call for papers and abstract judging. Anyone registered at Oracle Mix can propose a session abstract to present themselves or as an idea for others. Everyone can give their votes to the proposed sessions. At the end of the voting deadline (25th of June) Oracle will select the top sessions to be included in the Oracle Open World schedule.

So what? Well, I did send mine a few days ago: Demystifying Workload Management with Oracle RAC, based on my Hotsos Symposium 2008 presentation. I’m not sure how wide the potential audience for this session is, since it’s far from a beginners’ session and is specific to RAC. However, I do believe that this topic is often misunderstood and there is very good potential to spread the knowledge. So if you are coming to Oracle Open World and are interested in that topic, go ahead and vote. If not for mine, there are plenty of others.

PS: I managed to make 2 (!) typos in the title and can’t edit it anymore. I don’t get any error on update, but the title comes back unchanged. I have already filed a bug report - let’s see if it gets fixed.

BigTable Thoughts

By Sheeri Cabral May 31st, 2008 at 11:09 am
Posted in Group Blog Posts, Non-Tech Articles

So, Paul’s blog post pointing to Todd’s blog post got me thinking.

The main point Paul summarized was that duplicating data is a great way to scale. He used Todd’s reference to Flickr, which in its partition-by-user scheme puts a comment in the commenter’s shard as well as in the commentee’s shard.

In my recent post about Twitter, I wrote:

Now, I understand that it is hard to get all the histories for the people I follow. But it only needs to be done once, and could then be cached as “Posts from who Sheeri follows on 5/20”. It would not be difficult, and I would be OK with the functionality changing such that “once you follow a new person, their tweets prior to when you followed them do not show up in the history.”

So using this thinking, every time someone I follow (say, @paulandstorm) makes a comment, it not only writes to their shard, but to mine. Now, that may not work given that the system also has to send messages at the same time, and that there can be numerous followers — dozens, hundreds, thousands.

The Flickr model works because it involves 2 writes to get the faster caching later, and there are more reads than writes. Twitter is more write-heavy, and likely has more writes than reads, considering that many folks do not visit an historical website to see their history.
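To make the write-versus-read trade-off concrete, here is a minimal Python sketch of the duplicate-on-write idea. It is only an illustration: the in-memory `shards` dictionary and the function names `add_comment`, `post_update`, and `read_history` are hypothetical stand-ins, not anything Flickr or Twitter actually runs.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for per-user shards: user -> list of records.
shards = defaultdict(list)


def add_comment(commenter, commentee, text):
    """Flickr-style duplication: store the comment in both users' shards."""
    record = (commenter, commentee, text)
    shards[commenter].append(record)
    shards[commentee].append(record)
    return 2  # number of writes performed


def post_update(author, followers, text):
    """Twitter-style fan-out: one write per follower, plus the author's own."""
    record = (author, text)
    shards[author].append(record)
    for follower in followers:
        shards[follower].append(record)
    return 1 + len(followers)  # write cost grows with the follower count


def read_history(user):
    """A read touches a single shard, no matter who wrote the records."""
    return list(shards[user])
```

A comment always costs 2 writes, while an update from someone with 1,000 followers costs 1,001 writes in this sketch. That growth is the reason the approach may suit a read-heavy site better than a write-heavy one.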

This particular idea may not work for Twitter. But I’ve picked on Twitter enough….

I thought about livejournal. I’ve been a livejournal member since 2001 — after 2 months of writing my own journaling system with comments, I got wind that a system already existed, so started to use that.

Now, I can go and pick specific entries from specific days, or I can read my “friends list”. I specify my friends and livejournal dynamically populates pages of my friends list, with the number of entries per page that I specify.

Livejournal could also use the idea presented above, as well as the concept of semi-dynamic data. Instead of dynamically generating the last, let’s say 20, entries of my “friends list”, livejournal could be making my friends list as it gets written to. A friend makes a post and it gets added to my shard, whether or not I read it. Once the count gets up to 20, a new cache page is generated.
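The semi-dynamic friends list described above could be sketched roughly as follows. This is a toy model under stated assumptions: the `FriendsListCache` class, its method names, and the in-memory lists are hypothetical, and the page size of 20 simply mirrors the example in the text.

```python
PAGE_SIZE = 20  # entries per friends-list page, as in the example above


class FriendsListCache:
    """Hypothetical per-reader cache, filled as friends post (not at read time)."""

    def __init__(self, page_size=PAGE_SIZE):
        self.page_size = page_size
        self.pages = []    # completed pages, oldest first; never rebuilt
        self.current = []  # entries waiting to fill the next page

    def add_entry(self, entry):
        """Called once per reader whenever one of their friends posts."""
        self.current.append(entry)
        if len(self.current) == self.page_size:
            self.pages.append(self.current)  # freeze the full page
            self.current = []

    def latest(self):
        """Return the open page, or the last full page if the open one is empty."""
        if self.current:
            return list(self.current)
        return list(self.pages[-1]) if self.pages else []
```

Completed pages never change once frozen, so they can be served as static data; only the partially filled page is still being written.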

Now, livejournal already has great caching, and has indeed had the growing pains Twitter is seeing. And for either livejournal or Twitter to take advantage of these concepts, they would likely require a rewrite from the ground up. So it’s not that I am suggesting this. I just think it’s a great idea, and if you are working on a project, think of where it might be useful to apply…..again, it may not be applicable in all situations. Like Twitter users, livejournal users may have many “friends”, so doing 100 or 1,000 writes every time a post is made may not actually be feasible.

Twitter Should Get Back to Basics

By Sheeri Cabral May 29th, 2008 at 4:17 pm
Posted in Group Blog Posts, Non-Tech Articles

Twitter has had many outages recently. On May 17th, 2008 http://blog.twitter.com/2007/05/devils-in-details.html was posted and says:

What went wrong? We checked in code to provide more accurate pagination, to better distribute and optimize our messaging system—basically we just kept tweaking when we should have called it a day. Details are great but getting too caught up in them is a mistake. I’ve been CEO of Twitter for two months now and this an awesome lesson learned. We’re seeing the bigger picture and Twitter is back. Please contact us if something isn’t working right (with Twitter that is).

(In other news, that post was made on May 17th and does not show up on http://blog.twitter.com, where it should appear between the May 16th and May 19th posts. I found a reference in other posts and had to search the site to find it.)

A real “awesome lesson learned” is “do not tweak production without testing first.” In every job I have had I have first learned and then taught the concept of “test everything possible.” Twitter has not learned this yet, because http://blog.twitter.com/2008/05/not-true.html, posted on Tuesday May 20th, states:

We caused a database to fail during a routine update early this afternoon.

As someone who has years of experience working with MySQL, and before that was a systems administrator; as someone who was referred to as “the MySQL Queen” yesterday (by someone who wanted me to test their product, so yes, they were flattering me); I can assure you:

no matter how “routine” a change is, if you do it on production without testing it first, you are playing with fire, and 95% of the fires caused by not testing first are completely preventable.

I will repeat this, because repetition is important to learning concepts.

no matter how “routine” a change is, if you do it on production without testing it first, you are playing with fire, and 95% of the fires caused by not testing first are completely preventable.

With a proper testing environment, 19 out of 20 “whoops, didn’t expect THAT from a routine change!” issues are caught. And I can tell you that often “routine changes” cause unexpected results.

Now, I was online during an outage, and http://twitter.com/home was showing their “site isn’t working” page for at least 3 hours between 2 and 5 am EDT yesterday (Tuesday, May 20th, 2008).

So…..there is no read-only copy around that Twitter could use? Maybe I cannot tweet, but I should at least be able to read what was done before!

Of course, since last week Twitter has done the opposite: often I can see the most recent 20 or so posts, but not anything prior. Now, I understand that it is hard to get all the histories for the people I follow. But it only needs to be done once, and could then be cached as “Posts from who Sheeri follows on 5/20”. It would not be difficult, and I would be OK with the functionality changing such that “once you follow a new person, their tweets prior to when you followed them do not show up in the history.”

Alternatively, you could go the snarky way, as http://www.techcrunch.com/2008/05/20/twitter-something-is-technically-wrong/ does:
What would be great is if Twitter just moved their blog to another platform so that it doesn’t fail when users need it most.

I am not a huge user of Rails. But I will say that given the content of the public announcements, the platform is not the problem. It is the code release process that is the problem. Maybe there’s “agile development” happening, with pair programming and code reviews. But there is not adequate testing.

Twitter — if you truly need scaling help, please ask for help — I know Pythian would be happy to help. However, if it really is as it seems — that basic good practice is not being followed — I would like to remind you that backups are really important too, just on the off chance that backups are not happening.

When Your Hard Drive Comes Knocking… A Cautionary Tale

By machanic May 25th, 2008 at 2:24 pm
Posted in Group Blog Posts

Imagine yourself, happily computing (or whatever it is that you do with your computer). It’s a fine, sunny day, nary a cloud in the sky, and you’re happily typing along when all of a sudden you hear a rather alien sound emanating from your hard drive. Something that sounds, perhaps, like some combination of a roofer banging in a nail, and a miner’s pick as he works on releasing a stubborn piece of ore from a cave wall.

Certainly not a good sound to hear coming from the general region of your hard drive on a nice, sunny day. Especially when you have not taken a backup in over two years.

Now consider your options. You could…

  1. Turn off your computer immediately, Google “data recovery”, and call one of the multitude of companies that pop up. They will charge you in the neighborhood of $500 to get your data safely off of your damaged disk.
  2. Immediately stop working, grab an external USB drive, and transfer all of your valuable data onto the external drive. Order a new internal hard disk, and keep working, confident in the knowledge that your disk is going to die soon, but you’ve already salvaged the valuable data and a new disk is on the way.
  3. Try to get clever. Think to yourself “maybe if I run ‘chkdsk’ it will go through the bad areas of the disk, mark the sectors as unusable, and I can keep using this disk.”

Such was the scene at my desk last Friday. And you have probably already figured out where this is going. Yes, I ran ‘chkdsk’. The first three (of five) checks completed without error, and I thought everything was going to be fine. Then, during the fourth phase, after about an hour of intense clicking and banging noises, a message appeared on the screen, which I can only paraphrase at this point:

Not enough space is available on the disk to fix the bad sectors.

This was a concerning message, given that the disk was much less than half full.  But only a minute or two later my concerns were answered by another rather vague message:

Unspecified error has occurred. Aborting.

An unspecified error during a ‘chkdsk’ run is never a good thing. And so I rebooted, only to discover that my hard disk was no longer recognized by my system as a hard disk. I did what I should have done in the first place — called a couple of the friendly data recovery companies — and after listening to my story and, nicely enough, not laughing at me outright, they told me that my chances of data recovery were near zero percent. You see, when your disk is banging away like that, it’s the sound of the heads hitting the platters. And they’re not supposed to do that. When I ran ‘chkdsk’, I forced the heads to touch every surface of the platters, thereby scratching them into oblivion.

A quick trip to my neighborhood computer store, a new hard drive, and I’m more or less back in business, minus two years’ worth of documents, most of which I never bothered to back up. A bad way to start the weekend, but I actually can’t think of anything especially valuable or irreplaceable that was lost. This is more of an extreme annoyance.

So please, learn from my mistake, and next time you hear an odd clicking sound, don’t try to outsmart your already-broken hardware. Listen to your computer. It’s sending you a very clear (albeit perhaps Morse-coded) message. And back up your data now and again! I work hard to tell my customers about all of the great reasons they should back up their enterprise data; but like most people, I never think about applying those same behaviors to my personal machines.

As an aside, when I reinstalled Vista on the new drive I accidentally used the 32-bit rather than 64-bit media. After I realized my error I decided to stay on the 32-bit version for now and see how it performs. I was not happy with memory consumption before, and I suspect that the 32-bit version will be a bit leaner. Since I only have 2 GB of RAM in this machine, there is no great reason for me to run 64-bit anyway. I’ll post again in a while about my thoughts on 32-bit vs. 64-bit Vista once I’ve had a chance to work with it a bit more intensely.

Welcome ASH Masters!

By Alex Gorbachev May 21st, 2008 at 9:27 pm
Posted in Group Blog Posts, Oracle

I have already mentioned the excellent work that Kyle Hailey did around Active Session History (ASH). Kyle has also created ASHMON.

The latest news: there is a new web site, ashmasters.com. This is the place where you can leave your comments and questions about ASH and ASHMON. Wondering how you can query ASH data? There are some ready-to-use queries. Have a cool idea for how to use ASH and query ASH data? Share it there and have it added to the ASH Masters queries toolkit.

I hear some of you say: “Right… ASH… 10g… Diagnostic Pack…” Don’t panic, there is a poor man’s ASH: ASH Simulation. This is a way to get some of the benefits of ASH while still running Oracle 9i, or Oracle 10g without the Diagnostic Pack.

Sounds exciting? It does, indeed!