This IBM Storage Fails Too Often, so Let’s Switch to EMC and Be Done… NOT!

A couple weeks ago I did a short blog post about SAN storage failures and how people are blinded by all the bells and whistles that are supposed to make storage arrays 100% reliable and failsafe. My conclusion was that there is no way to avoid storage failures, and that a better way is to anticipate those failures and be ready to handle them with minimal service impact.

I referenced a wake-up call from the CTO of an Australian hosting company. Let me quote it again:

The outage, blamed on an IBM storage array, saw the company’s chief technology officer promise “significant changes to the way we deploy and manage our storage environment”.

Today, I stumbled across another article that shows their solution to the storage reliability problem. From Melbourne IT on $18m Oracle revamp:

… to improve the reliability of its operational support systems at a cost of $7 million over three years, which has also seen it switch storage vendors from IBM to EMC. Data corruption that had occurred on its IBM storage systems were blamed for a several day outage experienced at the company’s WebCentral web-hosting business.

So we see that, instead of learning the right lesson, they concluded, “This IBM storage stuff isn’t reliable, and the EMC sales folks convinced me that theirs is better. Now my storage will not fail.” The “significant changes to the way we deploy and manage our storage environment” amounted to a mere change of vendor.

Well, data recovery services will be flourishing!

Scalable Internet Architectures

My old friend and collaborator Theo Schlossnagle at OmniTI posted his slides from his Scalable Internet Architectures talk at VelocityConf 2009.

The slides are brilliant even without seeing Theo talk, and I highly recommend taking the time to flip through them if you are interested in systems performance. If anyone recorded an mp3 of this talk, please let me know; I’m dying to hear it.

For those of you unfamiliar with OmniTI, Theo is the CEO of this rather remarkable company specializing in Internet-scale architecture consulting. They generalize across Internet-scale architecture rather than focusing on one specific dimension, the way Pythian specializes in the database tier. This allows them to see Internet-scale workloads from a unique systemic, multidisciplinary point of view: from the user experience all the way down the stack, through the load balancer (or not), the front-end cache, the application server, the database server, the operating system, the storage, and so on. This approach lets them build Internet architectures and solve scalability problems in a unique, powerful, and holistic way.

Pythian first collaborated with OmniTI in 2001, and they deserve all the success and the profile they’ve built since then. Trivia: both Pythian and OmniTI were founded in September 1997, and both companies continue to be majority-owned and controlled by their founders (in Pythian’s case, yours truly).

Here’s the slide deck. Let me know your thoughts.

The Architecture Layer

Contemporary software engineering models include many loosely-defined layers. Database developers might help with other layers, but for the most part a database administrator’s domain is the persistence layer.


  • Presentation

  • Application

  • Business Logic

  • Persistence (also called Storage)

The Daily WTF has an article, The Mythical Business Layer, that makes the case for not separating the business layer and the application layer:

A good system (as in, one that’s maintainable by other people) has no choice but to duplicate, triplicate, or even-more-licate business logic. If Account_Number is a seven-digit required field, it should be declared as CHAR(7) NOT NULL in the database and have some client-side code to validate it was entered as seven digits. If the system allows data entry in other places by other means, that means more duplication of the Account_Number logic is required.

It almost goes without saying that business logic changes frequently and in unpredictable ways. The solution to this problem is not a cleverly coded business layer. Unfortunately, it’s much more boring than that. Accommodating change can only be accomplished through careful analysis and thorough testing.
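As a small illustration of the duplication the quote describes, here is a minimal sketch of the client-side half of the Account_Number rule, written in Python. The function name `is_valid_account_number` is invented for this example, and the database would still declare the column as CHAR(7) NOT NULL on its own side.

```python
import re

# Hypothetical client-side check that mirrors a CHAR(7) NOT NULL column.
# The same rule is deliberately duplicated here, as the quote suggests.
_ACCOUNT_NUMBER = re.compile(r"[0-9]{7}")


def is_valid_account_number(value):
    """Return True only for a string of exactly seven ASCII digits."""
    return isinstance(value, str) and _ACCOUNT_NUMBER.fullmatch(value) is not None
```

If the rule ever changes to eight digits, both this check and the column definition have to change, which is exactly the analysis-and-testing work the article says cannot be avoided.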

I will call this merged business/application layer the “functional layer.”

The serious scaling requirements posed by most applications these days call for partitioning, clustering, sharding or some other term for “dividing up the data so it does not become the bottleneck”. Enter the “architecture layer”.

“Wait a minute,” I hear you asking. “Isn’t that just the persistence layer?”

Yes and no. To me, there’s a difference between the storage and the architecture of said storage. The database schema for storing a user profile is a persistence layer issue. Figuring out which database instance to go to is an architecture layer issue.

This is an important distinction for me. Many folks are coding the architecture layer directly into the functional layer. A “save_profile()” API function might call an ORM to deal with the persistence, or it will have MySQL (or other database) connection handling and queries. However, the database will grow, and at some point you will find yourself wanting to split the data [more].

This type of information, like the presentation layer, needs to be separate. Why should the application care whether save_profile('Sheeri', 'hair color', 'blonde') accesses database1 or database2? More importantly, why should there be major code changes to the functional layer if the architecture changes? Just as no functionality changes when you change your website color from blue to red, no functionality changes when you go from splitting data between 2 database servers to splitting it among 3, or 10.
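To make the separation concrete, here is a rough sketch in Python. Everything in it is an assumption made for illustration: the `ArchitectureLayer` class, the hash-based routing rule, and the in-memory dictionaries standing in for real database connections. Only save_profile and the database1/database2 names come from the discussion above.

```python
import hashlib


class ArchitectureLayer:
    """Decides WHERE data lives. Hypothetical class, not a real library."""

    def __init__(self, instances):
        # List of database instance names, e.g. read from a config file.
        self.instances = list(instances)

    def instance_for(self, user):
        # A stable hash, so the same user always maps to the same instance.
        digest = hashlib.sha256(user.encode("utf-8")).hexdigest()
        return self.instances[int(digest, 16) % len(self.instances)]


def make_stores(names):
    # Plain dicts stand in for real database connections in this sketch.
    return {name: {} for name in names}


def save_profile(arch, stores, user, attribute, value):
    """Functional layer: has no idea how many databases exist."""
    store = stores[arch.instance_for(user)]
    store.setdefault(user, {})[attribute] = value


def load_profile(arch, stores, user):
    return stores[arch.instance_for(user)].get(user, {})


names = ["database1", "database2"]
arch = ArchitectureLayer(names)
stores = make_stores(names)
save_profile(arch, stores, "Sheeri", "hair color", "blonde")
```

Going from 2 servers to 3 only changes the list handed to `ArchitectureLayer`; the body of save_profile stays untouched. Note that this simple modulo rule would send some existing users to a different instance, so moving their data is a separate job the sketch does not cover.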

For me, the persistence layer is about how the data is stored. For the record, I believe it too should be separate from the functional layer: whether you store hair color and eye color in one table or two, the functionality of the application has not changed. All that’s needed is a change in how that data is stored and retrieved.
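The one-table-or-two point can be sketched the same way. The code below is only an illustration under stated assumptions: the class names, the table and column names, and the sample eye color are all invented, and SQLite in memory stands in for whatever database is really in use.

```python
import sqlite3


class OneTableProfiles:
    """Hair and eye color kept together in a single table (hypothetical)."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE profile (username TEXT PRIMARY KEY, hair TEXT, eyes TEXT)"
        )

    def save(self, username, hair, eyes):
        self.conn.execute(
            "INSERT OR REPLACE INTO profile VALUES (?, ?, ?)", (username, hair, eyes)
        )

    def load(self, username):
        # Returns a (hair, eyes) tuple, or None if the user is unknown.
        return self.conn.execute(
            "SELECT hair, eyes FROM profile WHERE username = ?", (username,)
        ).fetchone()


class TwoTableProfiles:
    """Same interface, but hair and eye color live in separate tables."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute("CREATE TABLE hair (username TEXT PRIMARY KEY, color TEXT)")
        conn.execute("CREATE TABLE eyes (username TEXT PRIMARY KEY, color TEXT)")

    def save(self, username, hair, eyes):
        self.conn.execute("INSERT OR REPLACE INTO hair VALUES (?, ?)", (username, hair))
        self.conn.execute("INSERT OR REPLACE INTO eyes VALUES (?, ?)", (username, eyes))

    def load(self, username):
        return self.conn.execute(
            "SELECT hair.color, eyes.color FROM hair "
            "JOIN eyes ON hair.username = eyes.username "
            "WHERE hair.username = ?",
            (username,),
        ).fetchone()


def describe(profiles, username):
    """Functional layer: identical code for either storage layout."""
    hair, eyes = profiles.load(username)
    return f"{username}: {hair} hair, {eyes} eyes"
```

Swapping `OneTableProfiles` for `TwoTableProfiles` changes how the data is stored and retrieved, while `describe` is none the wiser.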

The architecture layer is all about where the data is stored. Configuration files are an early form of the architecture layer, though most would not call them a “layer”. Database administrators should be able to change the architecture of the database system without mucking about in the application’s functional code.

Thoughts?
