Hardware Component Failures – Survey Results

May 10, 2012 / By Alex Gorbachev


When preparing for the IOUG Collaborate 12 deep dive on deploying Oracle Databases for High Availability, I wanted to provide some feedback on which hardware components fail most and least frequently. I believe I have a reasonably good idea of the answer, but I thought that providing some more objective data would be better. I couldn’t find results from more scientific research, so I decided to organize a poll. This blog post shows the results, which I promised to share with several groups.

The results are also in the presentation material, but they might be buried deep within the 100+ slides, so here is a dedicated blog post with some commentary.

I asked the following question: “Sort the hardware components in descending order of failure rates.” The list must be fully ordered, so there is exactly one component for each place (#1, #2 and so on), which gives us a sorted list of 7 components. I didn’t collect failure rate estimates or anything like that; the ranking is already perception-based, so asking for ratio estimates would definitely be asking for garbage data. Each component gets a weight based on its placement in the list. Place #1 (most frequently failing) gives weight 7, place #2 gives weight 6 and so on, down to place #7 with its mere weight of 1.

The resulting weights for each component are averaged, so the higher the average weight, the more frequently respondents think this hardware component fails. Again, this is perception-based, and I’m sure our brains make it quite subjective, but I hope that having 50+ people answer the survey provides a good indication. Please note that you can’t compare the failure rates of the components by comparing their average weights. Remember: a weight reflects only the placement on the list, and we didn’t collect any failure rates, so we can draw no conclusions about the rates themselves.
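To make the scoring concrete, here is a minimal sketch of how such average weights could be computed. The function name, the component names and the two sample rankings are made up for illustration; they are not the survey's actual components or responses.

```python
from collections import defaultdict


def average_weights(rankings):
    """Average the placement-based weights of each component.

    Each ranking lists components from most to least frequently failing.
    In a list of n components, place #1 gets weight n and place #n gets 1.
    """
    totals = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for place, component in enumerate(ranking, start=1):
            totals[component] += n - place + 1
    return {component: total / len(rankings) for component, total in totals.items()}


# Hypothetical responses from two respondents (NOT real survey data).
sample_rankings = [
    ["disk", "power supply", "fan", "memory", "network card", "switch", "CPU"],
    ["power supply", "disk", "memory", "fan", "switch", "network card", "CPU"],
]

# Print components from the highest to the lowest average weight.
scores = average_weights(sample_rankings)
for component, weight in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{component}: {weight:.1f}")
```

A higher average only means that respondents placed the component nearer the top of their lists; it says nothing about how much more often that component actually fails.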



I was pleased that the results were not too far from my own estimates, except that I wouldn’t have placed failures of network cards and switches as high. I’ll let you comment on that: Do the results match your estimates? Is there anything you found surprising?

The survey was answered mostly by Oracle technology users. I think the results should generally apply to other platforms as well. However, if we compare with, say, MySQL deployments, Oracle infrastructure is usually built on more enterprise-class hardware, while MySQL infrastructure usually runs on lower-grade servers. Still, I think the results should be useful to anyone; just keep in mind the source of the feedback.
