<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Determining I/O throughput for a system</title>
	<atom:link href="http://www.pythian.com/news/15161/determining-io-throughput-for-a-system/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.pythian.com/news/15161/determining-io-throughput-for-a-system/</link>
	<description>News and views from Pythian DBAs</description>
	<lastBuildDate>Fri, 10 Feb 2012 13:01:25 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.4</generator>
	<item>
		<title>By: Oracle Clinic</title>
		<link>http://www.pythian.com/news/15161/determining-io-throughput-for-a-system/#comment-474407</link>
		<dc:creator>Oracle Clinic</dc:creator>
		<pubDate>Mon, 15 Nov 2010 11:21:40 +0000</pubDate>
		<guid isPermaLink="false">http://www.pythian.com/news/?p=15161#comment-474407</guid>
		<description>&lt;strong&gt;IO Issues in Data Warehouse Environment...&lt;/strong&gt;

Client IO Performance Review Description of the physical hardware configuration including CPU, Memory, Disk, HBA, Storage Type, etc? Interview technical contacts to get a description of the problems. Measure IO using OS and Oracle tools available? Use R...</description>
		<content:encoded><![CDATA[<p><strong>IO Issues in Data Warehouse Environment&#8230;</strong></p>
<p>Client IO Performance Review Description of the physical hardware configuration including CPU, Memory, Disk, HBA, Storage Type, etc? Interview technical contacts to get a description of the problems. Measure IO using OS and Oracle tools available? Use R&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Greg Rahn</title>
		<link>http://www.pythian.com/news/15161/determining-io-throughput-for-a-system/#comment-448063</link>
		<dc:creator>Greg Rahn</dc:creator>
		<pubDate>Fri, 30 Jul 2010 10:21:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.pythian.com/news/?p=15161#comment-448063</guid>
		<description>Another IO tool that may be useful is &lt;a href=&quot;http://www.iometer.org/&quot; rel=&quot;nofollow&quot;&gt;Iometer&lt;/a&gt;.</description>
		<content:encoded><![CDATA[<p>Another IO tool that may be useful is <a href="http://www.iometer.org/" rel="nofollow">Iometer</a>.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sheeri Cabral</title>
		<link>http://www.pythian.com/news/15161/determining-io-throughput-for-a-system/#comment-447905</link>
		<dc:creator>Sheeri Cabral</dc:creator>
		<pubDate>Thu, 29 Jul 2010 20:56:26 +0000</pubDate>
		<guid isPermaLink="false">http://www.pythian.com/news/?p=15161#comment-447905</guid>
		<description>Justin, thanks!  I think you&#039;ll have a lot to say about the blog post I just published too -- http://www.pythian.com/news/15157

I look forward to your comments there.  I have tried in that post to specifically say what optimizations are Oracle-specific.</description>
		<content:encoded><![CDATA[<p>Justin, thanks!  I think you&#8217;ll have a lot to say about the blog post I just published too &#8212; <a href="http://www.pythian.com/news/15157" rel="nofollow">http://www.pythian.com/news/15157</a></p>
<p>I look forward to your comments there.  I have tried in that post to specifically say what optimizations are Oracle-specific.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Data Warehousing Best Practices: Comparing Oracle to MySQL, part 1 (introduction and power) &#124; The Pythian Blog</title>
		<link>http://www.pythian.com/news/15161/determining-io-throughput-for-a-system/#comment-447903</link>
		<dc:creator>Data Warehousing Best Practices: Comparing Oracle to MySQL, part 1 (introduction and power) &#124; The Pythian Blog</dc:creator>
		<pubDate>Thu, 29 Jul 2010 20:53:38 +0000</pubDate>
		<guid isPermaLink="false">http://www.pythian.com/news/?p=15161#comment-447903</guid>
		<description>[...] Determining I/O throughput for a system [...]</description>
		<content:encoded><![CDATA[<p>[...] Determining I/O throughput for a system [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Justin Swanhart</title>
		<link>http://www.pythian.com/news/15161/determining-io-throughput-for-a-system/#comment-447885</link>
		<dc:creator>Justin Swanhart</dc:creator>
		<pubDate>Thu, 29 Jul 2010 19:49:40 +0000</pubDate>
		<guid isPermaLink="false">http://www.pythian.com/news/?p=15161#comment-447885</guid>
		<description>Sheeri,

It depends on the schema and the workload.  For example, you might have a table such as the &#039;Ontime Flightstats&#039; data, which is really a star schema but it is flattened into a single table.

This type of schema might benefit from throughput more than IOPS, particularly if covering indexes can be scanned sequentially.  In fact, both MyISAM and InnoDB can take advantage of covering indexes to approximate the performance of a column store when scanning only the columns in the index.  If MySQL has to get the whole row from the table, there is likely going to be a random element (since keys are not in natural table order in most cases) and then throughput can become less important than IOPS.

For tables with joins, if you use FORCE INDEX to force a scan of the fact table, and the dimension tables fit into memory, then throughput may be more important than IOPS.  This is helped by a split LRU, since hopefully the scan of the fact table doesn&#039;t evict the dimensions.

Oracle also supports parallel query for partitioned tables.  This makes a tremendous difference in overall performance.  If you have lots of throughput, Oracle can sequentially scan more than one partition in parallel, leading to serious performance improvements.  My Google Code project &#039;Shard-Query&#039; shows that MySQL could be significantly faster if it supported this rather simple optimization.</description>
		<content:encoded><![CDATA[<p>Sheeri,</p>
<p>It depends on the schema and the workload.  For example, you might have a table such as the &#8216;Ontime Flightstats&#8217; data, which is really a star schema but it is flattened into a single table.</p>
<p>This type of schema might benefit from throughput more than IOPS, particularly if covering indexes can be scanned sequentially.  In fact, both MyISAM and InnoDB can take advantage of covering indexes to approximate the performance of a column store when scanning only the columns in the index.  If MySQL has to get the whole row from the table, there is likely going to be a random element (since keys are not in natural table order in most cases) and then throughput can become less important than IOPS.</p>
<p>For tables with joins, if you use FORCE INDEX to force a scan of the fact table, and the dimension tables fit into memory, then throughput may be more important than IOPS.  This is helped by a split LRU, since hopefully the scan of the fact table doesn&#8217;t evict the dimensions.</p>
<p>Oracle also supports parallel query for partitioned tables.  This makes a tremendous difference in overall performance.  If you have lots of throughput, Oracle can sequentially scan more than one partition in parallel, leading to serious performance improvements.  My Google Code project &#8216;Shard-Query&#8217; shows that MySQL could be significantly faster if it supported this rather simple optimization.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sheeri Cabral</title>
		<link>http://www.pythian.com/news/15161/determining-io-throughput-for-a-system/#comment-447863</link>
		<dc:creator>Sheeri Cabral</dc:creator>
		<pubDate>Thu, 29 Jul 2010 17:41:38 +0000</pubDate>
		<guid isPermaLink="false">http://www.pythian.com/news/?p=15161#comment-447863</guid>
		<description>Also, Justin, I&#039;d be interested to know whether, in general, you think that I/O throughput is not useful for MySQL and that IOPS is always better.  Is there any case in which I/O throughput is a better metric than IOPS?</description>
		<content:encoded><![CDATA[<p>Also, Justin, I&#8217;d be interested to know whether, in general, you think that I/O throughput is not useful for MySQL and that IOPS is always better.  Is there any case in which I/O throughput is a better metric than IOPS?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sheeri Cabral</title>
		<link>http://www.pythian.com/news/15161/determining-io-throughput-for-a-system/#comment-447861</link>
		<dc:creator>Sheeri Cabral</dc:creator>
		<pubDate>Thu, 29 Jul 2010 17:37:52 +0000</pubDate>
		<guid isPermaLink="false">http://www.pythian.com/news/?p=15161#comment-447861</guid>
		<description>Justin -- thank you!  I have a longer post I&#039;m in the middle of on more stuff in that presentation where I take some of the points raised and compare Oracle to MySQL -- this was harder to ferret out.  

Can MySQL use sequential scanning more if it&#039;s looking at MyISAM data?  After all, MyISAM is sequential by nature; and the underlying [default] BTREE structure of data and indexes lends itself well to sequential scans....

Also, how does something like page size come into play?  It&#039;s not really changeable with MyISAM (unless you change the OS page size) but with InnoDB you can change the page size....</description>
		<content:encoded><![CDATA[<p>Justin &#8212; thank you!  I have a longer post I&#8217;m in the middle of on more stuff in that presentation where I take some of the points raised and compare Oracle to MySQL &#8212; this was harder to ferret out.  </p>
<p>Can MySQL use sequential scanning more if it&#8217;s looking at MyISAM data?  After all, MyISAM is sequential by nature; and the underlying [default] BTREE structure of data and indexes lends itself well to sequential scans&#8230;.</p>
<p>Also, how does something like page size come into play?  It&#8217;s not really changeable with MyISAM (unless you change the OS page size) but with InnoDB you can change the page size&#8230;.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Justin Swanhart</title>
		<link>http://www.pythian.com/news/15161/determining-io-throughput-for-a-system/#comment-447847</link>
		<dc:creator>Justin Swanhart</dc:creator>
		<pubDate>Thu, 29 Jul 2010 16:21:01 +0000</pubDate>
		<guid isPermaLink="false">http://www.pythian.com/news/?p=15161#comment-447847</guid>
		<description>Do you have any objective evidence that a MySQL-based data warehouse benefits from throughput more than IOPS?

Because of nested loops, a MySQL DW will often have to perform many more random operations than a similar Oracle DB on the same schema/data.  This is because Oracle can switch to a hash join, and sequentially scan large chunks of data.  MySQL cannot do this.</description>
		<content:encoded><![CDATA[<p>Do you have any objective evidence that a MySQL-based data warehouse benefits from throughput more than IOPS?</p>
<p>Because of nested loops, a MySQL DW will often have to perform many more random operations than a similar Oracle DB on the same schema/data.  This is because Oracle can switch to a hash join, and sequentially scan large chunks of data.  MySQL cannot do this.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

