<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.3.2" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>
<channel>
	<title>Comments on: Bad SQL or MySQL Bug?</title>
	<link>http://www.pythian.com/blogs/903/bad-sql-or-mysql-bug</link>
	<description>News and views from Pythian DBAs</description>
	<pubDate>Fri, 22 Aug 2008 01:55:19 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.3.2</generator>
		<item>
		<title>By: paulm</title>
		<link>http://www.pythian.com/blogs/903/bad-sql-or-mysql-bug#comment-176158</link>
		<dc:creator>paulm</dc:creator>
		<pubDate>Fri, 04 Apr 2008 02:45:26 +0000</pubDate>
		<guid>http://www.pythian.com/blogs/903/bad-sql-or-mysql-bug#comment-176158</guid>
		<description>Interesting that the () do nothing to determine the scope of the SQL.

I ran this on Oracle as well and got the same result.

You might as well have written select fid from foo. So the SQL is wrong in that sense and the IN subquery is redundant.
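
One caveat on that equivalence: it only holds while bar has at least one row. Here is a rough sketch using Python's built-in sqlite3 module as a stand-in (an assumption, not the MySQL or Oracle sessions from this thread; the foo(fid) and bar(bid) layout, the row values and the helper name fids_returned are illustrative only):

```python
import sqlite3


def fids_returned(bar_rows):
    """Run the query from the post against an in-memory SQLite database.

    bar_rows is the list of bid values to load into bar (hypothetical data).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("create table foo (fid integer primary key)")
    conn.execute("create table bar (bid integer primary key)")
    conn.executemany("insert into foo values (?)", [(1,), (2,)])
    conn.executemany("insert into bar values (?)", [(b,) for b in bar_rows])
    # fid inside the subquery resolves to foo.fid because bar has no fid column
    rows = conn.execute(
        "select fid from foo where fid in (select fid from bar) order by fid"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(fids_returned([10, 20]))  # bar has rows
    print(fids_returned([]))        # bar is empty
```

With rows in bar, every fid comes back regardless of the bid values; with an empty bar the subquery yields nothing and no fid matches.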

Interestingly enough... if this is rewritten as a join

select fid from foo join bar on fid = fid; 
it results in a cartesian join which returns all rows in both tables.

Have Fun

Paul</description>
		<content:encoded><![CDATA[<p>Interesting that the () do nothing to determine the scope of the SQL.</p>
<p>I ran this on Oracle as well and got the same result.</p>
<p>You might as well have written select fid from foo. So the SQL is wrong in that sense and the IN subquery is redundant.</p>
<p>Interestingly enough&#8230; if this is rewritten as a join</p>
<p>select fid from foo join bar on fid = fid;<br />
it results in a cartesian join which returns all rows in both tables.</p>
<p>Have Fun</p>
<p>Paul</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ryan Lowe</title>
		<link>http://www.pythian.com/blogs/903/bad-sql-or-mysql-bug#comment-176151</link>
		<dc:creator>Ryan Lowe</dc:creator>
		<pubDate>Fri, 04 Apr 2008 02:13:34 +0000</pubDate>
		<guid>http://www.pythian.com/blogs/903/bad-sql-or-mysql-bug#comment-176151</guid>
		<description>There is no guessing here:

mysql&#62; explain extended select * from foo where fid in (select fid from bar);
+----+--------------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
&#124; id &#124; select_type        &#124; table &#124; type  &#124; possible_keys &#124; key     &#124; key_len &#124; ref  &#124; rows &#124; Extra                    &#124;
+----+--------------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
&#124;  1 &#124; PRIMARY            &#124; foo   &#124; index &#124; NULL          &#124; PRIMARY &#124; 1       &#124; NULL &#124;    2 &#124; Using where; Using index &#124; 
&#124;  2 &#124; DEPENDENT SUBQUERY &#124; bar   &#124; ALL   &#124; NULL          &#124; NULL    &#124; NULL    &#124; NULL &#124;    2 &#124; Using where              &#124; 
+----+--------------------+-------+-------+---------------+---------+---------+------+------+--------------------------+
2 rows in set, 2 warnings (0.00 sec)

mysql&#62; show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1276
Message: Field or reference 'test.foo.fid' of SELECT #2 was resolved in SELECT #1
*************************** 2. row ***************************
  Level: Note
   Code: 1003
Message: select `test`.`foo`.`fid` AS `fid` from `test`.`foo` where (`test`.`foo`.`fid`,(select 1 AS `Not_used` from `test`.`bar` where ((`test`.`foo`.`fid`) = `test`.`foo`.`fid`)))
2 rows in set (0.00 sec)

mysql&#62;</description>
		<content:encoded><![CDATA[<p>There is no guessing here:</p>
<p>mysql&gt; explain extended select * from foo where fid in (select fid from bar);<br />
+----+--------------------+-------+-------+---------------+---------+---------+------+------+--------------------------+<br />
| id | select_type        | table | type  | possible_keys | key     | key_len | ref  | rows | Extra                    |<br />
+----+--------------------+-------+-------+---------------+---------+---------+------+------+--------------------------+<br />
|  1 | PRIMARY            | foo   | index | NULL          | PRIMARY | 1       | NULL |    2 | Using where; Using index |<br />
|  2 | DEPENDENT SUBQUERY | bar   | ALL   | NULL          | NULL    | NULL    | NULL |    2 | Using where              |<br />
+----+--------------------+-------+-------+---------------+---------+---------+------+------+--------------------------+<br />
2 rows in set, 2 warnings (0.00 sec)</p>
<p>mysql&gt; show warnings\G<br />
*************************** 1. row ***************************<br />
  Level: Note<br />
   Code: 1276<br />
Message: Field or reference 'test.foo.fid' of SELECT #2 was resolved in SELECT #1<br />
*************************** 2. row ***************************<br />
  Level: Note<br />
   Code: 1003<br />
Message: select `test`.`foo`.`fid` AS `fid` from `test`.`foo` where (`test`.`foo`.`fid`,(select 1 AS `Not_used` from `test`.`bar` where ((`test`.`foo`.`fid`) = `test`.`foo`.`fid`)))<br />
2 rows in set (0.00 sec)</p>
<p>mysql&gt;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sheeri Cabral</title>
		<link>http://www.pythian.com/blogs/903/bad-sql-or-mysql-bug#comment-175992</link>
		<dc:creator>Sheeri Cabral</dc:creator>
		<pubDate>Thu, 03 Apr 2008 13:41:32 +0000</pubDate>
		<guid>http://www.pythian.com/blogs/903/bad-sql-or-mysql-bug#comment-175992</guid>
		<description>*nod*  The EXPLAIN was more than enough for me to see that it was a dependent subquery.  And empirically, I understood that it was the equivalent of

select fid from foo where fid in (select true);

I just wasn't sure how the optimizer got to that point -- Roland's comments helped the most for that.

Basically,
select fid from foo where fid in (select fid from bar)
turns into
select fid from foo where fid in (select 1 from bar)
select fid from foo where fid in (select 2 from bar)

(because 1 and 2 are the values of fid).</description>
		<content:encoded><![CDATA[<p>*nod*  The EXPLAIN was more than enough for me to see that it was a dependent subquery.  And empirically, I understood that it was the equivalent of</p>
<p>select fid from foo where fid in (select true);</p>
<p>I just wasn&#8217;t sure how the optimizer got to that point &#8212; Roland&#8217;s comments helped the most for that.</p>
<p>Basically,<br />
select fid from foo where fid in (select fid from bar)<br />
turns into<br />
select fid from foo where fid in (select 1 from bar)<br />
select fid from foo where fid in (select 2 from bar)</p>
<p>(because 1 and 2 are the values of fid).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Phil Hildebrand</title>
		<link>http://www.pythian.com/blogs/903/bad-sql-or-mysql-bug#comment-175928</link>
		<dc:creator>Phil Hildebrand</dc:creator>
		<pubDate>Thu, 03 Apr 2008 00:58:06 +0000</pubDate>
		<guid>http://www.pythian.com/blogs/903/bad-sql-or-mysql-bug#comment-175928</guid>
		<description>I think the issue is that fid in implies that fid should match the values from the subselect:

mysql&#62;  select * from foo where fid in (select 90 from bar);
Empty set (0.00 sec)

I'm in agreement with Gary and kimseong</description>
		<content:encoded><![CDATA[<p>I think the issue is that fid in implies that fid should match the values from the subselect:</p>
<p>mysql&gt;  select * from foo where fid in (select 90 from bar);<br />
Empty set (0.00 sec)</p>
<p>I&#8217;m in agreement with Gary and kimseong</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Phil Hildebrand</title>
		<link>http://www.pythian.com/blogs/903/bad-sql-or-mysql-bug#comment-175924</link>
		<dc:creator>Phil Hildebrand</dc:creator>
		<pubDate>Thu, 03 Apr 2008 00:45:41 +0000</pubDate>
		<guid>http://www.pythian.com/blogs/903/bad-sql-or-mysql-bug#comment-175924</guid>
		<description>I would say it's a bug...

My guess is, the optimizer is evaluating fid as a string, thus returning results:

mysql&#62; select * from foo where 'fid' in (select 'fid' from bar);
+-----+
&#124; fid &#124;
+-----+
&#124;   1 &#124;
&#124;   2 &#124;
+-----+
2 rows in set (0.00 sec)

You should enter a bug for it I think...</description>
		<content:encoded><![CDATA[<p>I would say it&#8217;s a bug&#8230;</p>
<p>My guess is, the optimizer is evaluating fid as a string, thus returning results:</p>
<p>mysql&gt; select * from foo where 'fid' in (select 'fid' from bar);<br />
+-----+<br />
| fid |<br />
+-----+<br />
|   1 |<br />
|   2 |<br />
+-----+<br />
2 rows in set (0.00 sec)</p>
<p>You should enter a bug for it I think&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Gary</title>
		<link>http://www.pythian.com/blogs/903/bad-sql-or-mysql-bug#comment-175919</link>
		<dc:creator>Gary</dc:creator>
		<pubDate>Thu, 03 Apr 2008 00:24:21 +0000</pubDate>
		<guid>http://www.pythian.com/blogs/903/bad-sql-or-mysql-bug#comment-175919</guid>
		<description>select * from foo where fid in (select fid from bar);  
should work because you also want this (and similar) to work....
select * from foo where 1 = (select count(*) from bar where fid=bid);
so fid and bid are both 'in scope' for the sub-query. 
what shouldn't work is
select * from foo where fid in (select bar.fid from bar);</description>
		<content:encoded><![CDATA[<p>select * from foo where fid in (select fid from bar);<br />
should work because you also want this (and similar) to work&#8230;.<br />
select * from foo where 1 = (select count(*) from bar where fid=bid);<br />
so fid and bid are both &#8216;in scope&#8217; for the sub-query.<br />
what shouldn&#8217;t work is<br />
select * from foo where fid in (select bar.fid from bar);</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: kimseong</title>
		<link>http://www.pythian.com/blogs/903/bad-sql-or-mysql-bug#comment-175911</link>
		<dc:creator>kimseong</dc:creator>
		<pubDate>Wed, 02 Apr 2008 23:35:39 +0000</pubDate>
		<guid>http://www.pythian.com/blogs/903/bad-sql-or-mysql-bug#comment-175911</guid>
		<description>It should be treated as a dependent subquery with fid referring to the table on the outer query. 
If table bar is empty, then there should not be any result.</description>
		<content:encoded><![CDATA[<p>It should be treated as a dependent subquery with fid referring to the table on the outer query.<br />
If table bar is empty, then there should not be any result.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: muzazzi</title>
		<link>http://www.pythian.com/blogs/903/bad-sql-or-mysql-bug#comment-175909</link>
		<dc:creator>muzazzi</dc:creator>
		<pubDate>Wed, 02 Apr 2008 23:30:11 +0000</pubDate>
		<guid>http://www.pythian.com/blogs/903/bad-sql-or-mysql-bug#comment-175909</guid>
		<description>Of course it should work. :)  This is basically saying "Give me all the fids if there exists at least one row in bar".  The subquery returns the current 'fid' for every row in bar, of which the current fid will always be a member.  I might note that it's the same behavior in Postgres as well.   Namespace works from inside out, fid isn't in the scope of the subquery, so it becomes dependent on the parent query.   If you explicitly defined the namespace as in:
select * from foo where fid in (select foo.fid from bar)

would you still not expect it to work?</description>
		<content:encoded><![CDATA[<p>Of course it should work. :)  This is basically saying &#8220;Give me all the fids if there exists at least one row in bar&#8221;.  The subquery returns the current &#8216;fid&#8217; for every row in bar, of which the current fid will always be a member.  I might note that it&#8217;s the same behavior in Postgres as well.   Namespace works from inside out, fid isn&#8217;t in the scope of the subquery, so it becomes dependent on the parent query.   If you explicitly defined the namespace as in:<br />
select * from foo where fid in (select foo.fid from bar)</p>
<p>would you still not expect it to work?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Roland Bouman</title>
		<link>http://www.pythian.com/blogs/903/bad-sql-or-mysql-bug#comment-175905</link>
		<dc:creator>Roland Bouman</dc:creator>
		<pubDate>Wed, 02 Apr 2008 23:22:44 +0000</pubDate>
		<guid>http://www.pythian.com/blogs/903/bad-sql-or-mysql-bug#comment-175905</guid>
		<description>Oh, I forgot to close the circle:

"So, should select * from foo where fid in (select fid from bar); work? If so, why is that?"

yes, it should work as the subquery always selects the exact same value for fid as present for the current row in the outer query. So, both occurrences of fid refer to the exact same value and thus, by definition fid in (...fid...) is always true.

Roland.</description>
		<content:encoded><![CDATA[<p>Oh, I forgot to close the circle:</p>
<p>&#8220;So, should select * from foo where fid in (select fid from bar); work? If so, why is that?&#8221;</p>
<p>yes, it should work as the subquery always selects the exact same value for fid as present for the current row in the outer query. So, both occurrences of fid refer to the exact same value and thus, by definition fid in (&#8230;fid&#8230;) is always true.</p>
<p>Roland.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Roland Bouman</title>
		<link>http://www.pythian.com/blogs/903/bad-sql-or-mysql-bug#comment-175903</link>
		<dc:creator>Roland Bouman</dc:creator>
		<pubDate>Wed, 02 Apr 2008 23:17:57 +0000</pubDate>
		<guid>http://www.pythian.com/blogs/903/bad-sql-or-mysql-bug#comment-175903</guid>
		<description>I cannot discover any logical error. Obviously, the query is most likely wrong semantically.

The short explanation is that 
1) fid is resolvable in the outer query - in other words, it already exists at the topmost level
2) If fid exists at the topmost level, it also exists at the levels beneath that unless masked by a nearer scope
3) the subquery on bar does not define a fid, so fid unambiguously points to the fid of the outermost query. 
4) the subquery on bar may reference fid from the outermost query.

That says it all.
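
A minimal sketch of these resolution rules, using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption: SQLite applies the same inner-to-outer name lookup here, but this is not the session from the post). The foo(fid) and bar(bid) layout, the row values and the function name run_scope_demo are illustrative only:

```python
import sqlite3


def run_scope_demo():
    """Show how an unqualified fid inside the subquery is resolved."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table foo (fid integer primary key)")
    conn.execute("create table bar (bid integer primary key)")
    conn.executemany("insert into foo values (?)", [(1,), (2,)])
    conn.executemany("insert into bar values (?)", [(1,), (2,)])

    # Points 1-4: bar defines no fid, so the name falls through to foo.fid
    rows = conn.execute(
        "select fid from foo where fid in (select fid from bar) order by fid"
    ).fetchall()

    # Qualifying the column pins it to bar, where it does not exist
    try:
        conn.execute(
            "select fid from foo where fid in (select bar.fid from bar)"
        )
        qualified_error = None
    except sqlite3.OperationalError as exc:
        qualified_error = str(exc)

    conn.close()
    return [row[0] for row in rows], qualified_error


if __name__ == "__main__":
    fids, error = run_scope_demo()
    print(fids)
    print(error)
```

The unqualified form returns every fid in foo, while the bar.fid form is rejected because the qualifier removes the fallback to the enclosing scope.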

As a slightly longer explanation, would you have been surprised that this works:

select fid
from  foo 
where exists (
    select null
    from  bar
    where bid = fid
)

Now if you accept this, what about this:

select fid
from  foo 
where exists (
    select fid
    from  bar
    where bid = fid
)

Logically this should not be objectionable - if the WHERE clause can refer to fid, why not the SELECT list? 

So if we accept this, then we should be able to accept that if 

    select fid
    from  bar
    where bid = fid

executes *within the current context* then surely, 

    select fid
    from  bar

must execute too within that same context. 

Your last query does not include that context though - fid is not defined when this is used as the outermost query - hence, you get the "unknown column" error.

You say first "certainly a subquery that cannot run" but the whole point is that *because it is a subquery* it can reference items defined elsewhere (in some enclosing scope). So it can run if it is embedded within its particular context. It may of course not be able to run when you strip away the context.</description>
		<content:encoded><![CDATA[<p>I cannot discover any logical error. Obviously, the query is most likely wrong semantically.</p>
<p>The short explanation is that<br />
1) fid is resolvable in the outer query - in other words, it already exists at the topmost level<br />
2) If fid exists at the topmost level, it also exists at the levels beneath that unless masked by a nearer scope<br />
3) the subquery on bar does not define a fid, so fid unambiguously points to the fid of the outermost query.<br />
4) the subquery on bar may reference fid from the outermost query.</p>
<p>That says it all.</p>
<p>As a slightly longer explanation, would you have been surprised that this works:</p>
<p>select fid<br />
from  foo<br />
where exists (<br />
    select null<br />
    from  bar<br />
    where bid = fid<br />
)</p>
<p>Now if you accept this, what about this:</p>
<p>select fid<br />
from  foo<br />
where exists (<br />
    select fid<br />
    from  bar<br />
    where bid = fid<br />
)</p>
<p>Logically this should not be objectionable - if the WHERE clause can refer to fid, why not the SELECT list? </p>
<p>So if we accept this, then we should be able to accept that if </p>
<p>    select fid<br />
    from  bar<br />
    where bid = fid</p>
<p>executes *within the current context* then surely, </p>
<p>    select fid<br />
    from  bar</p>
<p>must execute too within that same context. </p>
<p>Your last query does not include that context though - fid is not defined when this is used as the outermost query - hence, you get the &#8220;unknown column&#8221; error.</p>
<p>You say first &#8220;certainly a subquery that cannot run&#8221; but the whole point is that *because it is a subquery* it can reference items defined elsewhere (in some enclosing scope). So it can run if it is embedded within its particular context. It may of course not be able to run when you strip away the context.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
