Hi,
I've got the following query:
exec sp_executesql N'
select i.issuer_id,
i.issuer_name,
i.last_update ,
ia_bbeqtk.alias ,
ia_bbeqtk.last_update,
ia_bq.alias ,
ia_bq.last_update
from issuer i
left outer join issuer_alias as ia_bbeqtk
on ia_bbeqtk.issuer_id = i.issuer_id
and ia_bbeqtk.alias_type = ''BBEQTK''
left outer join issuer_alias as ia_bq
on ia_bq.issuer_id = i.issuer_id
and ia_bq.alias_type = ''BQUOTE''
where i.last_update > @.IssuerLastUpdate
or ia_bbeqtk.last_update > @.BlastUpdate
or ia_bq.last_update > @.BQLastUpdate
',N'@.IssuerLastUpdate datetime,@.BlastUpdate datetime,@.BQLastUpdate
datetime',@.IssuerLastUpdate=''2006-06-06
09:05:41:003'',@.BlastUpdate=''2006-06-01
18:09:17:937'',@.BQLastUpdate=''2006-06-06
09:05:41:003''
And here are the indexes for the 2 tables
Issuer (approx 4,000 rows)
^^^^
index_name index_description index_keys
IX_issuer nonclustered, stats no recompute located on PRIMARY last_update,
issuer_id
PK__issuer__3EFC4F81 clustered, unique, primary key located on
PRIMARY issuer_id
UQ__issuer__3FF073BA nonclustered, unique, unique key located on
PRIMARY issuer_name
Issuer_alias (approx 18,000 rows)
^^^^^^^
index_name index_description index_keys
IX_issuer_alias nonclustered, unique located on PRIMARY issuer_id, alias_typ
e
PK__issuer_alias__5303482E clustered, unique, primary key located on
PRIMARY alias_type, alias
Now when I execute the query it completely ignores the fact that last_update
column has more than 4000 rows with an update column of 29 May 2006 and the
rest 10-20 are after that. Instead it uses the clustered index on the isssue
r
table PK__issuer__3EFC4F81 instead of IX_issuer! Instead if it went for the
last_update column first you get 2 rows selectivity by searching the non
clustered and then it can run the query much quicker by linking on the
issuer_id.
I've also checked the stats on the table and they seem ok. Does anybody can
explain this behaviour which is not the most efficient in this case?
Thanks,
Panos.4000 rows are not too much for a database engine. So SQL Server, based on it
s
cost optimization algorithm, might have thought that it will be easier to
scan through the 4000 rows rather than reading the index and getting the
other coulmns from the table again. the query execution plan might (and I
guess will) change as the number of rows increases and when it reaches the
breakeven point.
Hope this helps.|||Hi Pano,
The the last two ("OR") predicates in your query prevent the usage of the
non-clustered index in the way that you describe. In other words the query
processor cannot push the filter on i.last_update below the left-outer join
and produce correct results (it would be filtering rows based on the
last_update column ,which could still qualify if they satisfy either the two
predicates on bbeqtk.last_update and bq.last_update).
However, if your predicates were joint with an AND instead of an OR you
should see the filter on last_update being pushed down below the join and
into an index s (assuming that the plan which uses the non-clustered
index s would be estimated as more efficient than the plan which uses the
clustered index scan by the query optimizer).
Regards,
Leo
"Panos Stavroulis." <PanosStavroulis@.discussions.microsoft.com> wrote in
message news:BCEC9194-2EFC-4957-8120-A2B29FCED4E2@.microsoft.com...
> Hi,
> I've got the following query:
> exec sp_executesql N'
> select i.issuer_id,
> i.issuer_name,
> i.last_update ,
> ia_bbeqtk.alias ,
> ia_bbeqtk.last_update,
> ia_bq.alias ,
> ia_bq.last_update
> from issuer i
> left outer join issuer_alias as ia_bbeqtk
> on ia_bbeqtk.issuer_id = i.issuer_id
> and ia_bbeqtk.alias_type = ''BBEQTK''
> left outer join issuer_alias as ia_bq
> on ia_bq.issuer_id = i.issuer_id
> and ia_bq.alias_type = ''BQUOTE''
> where i.last_update > @.IssuerLastUpdate
> or ia_bbeqtk.last_update > @.BlastUpdate
> or ia_bq.last_update > @.BQLastUpdate
> ',N'@.IssuerLastUpdate datetime,@.BlastUpdate datetime,@.BQLastUpdate
> datetime',@.IssuerLastUpdate=''2006-06-06
> 09:05:41:003'',@.BlastUpdate=''2006-06-01
> 18:09:17:937'',@.BQLastUpdate=''2006-06-06
> 09:05:41:003''
> And here are the indexes for the 2 tables
> Issuer (approx 4,000 rows)
> ^^^^
> index_name index_description index_keys
> IX_issuer nonclustered, stats no recompute located on PRIMARY last_update,
> issuer_id
> PK__issuer__3EFC4F81 clustered, unique, primary key located on
> PRIMARY issuer_id
> UQ__issuer__3FF073BA nonclustered, unique, unique key located on
> PRIMARY issuer_name
> Issuer_alias (approx 18,000 rows)
> ^^^^^^^
> index_name index_description index_keys
> IX_issuer_alias nonclustered, unique located on PRIMARY issuer_id,
> alias_type
> PK__issuer_alias__5303482E clustered, unique, primary key located on
> PRIMARY alias_type, alias
> Now when I execute the query it completely ignores the fact that
> last_update
> column has more than 4000 rows with an update column of 29 May 2006 and
> the
> rest 10-20 are after that. Instead it uses the clustered index on the
> isssuer
> table PK__issuer__3EFC4F81 instead of IX_issuer! Instead if it went for
> the
> last_update column first you get 2 rows selectivity by searching the non
> clustered and then it can run the query much quicker by linking on the
> issuer_id.
> I've also checked the stats on the table and they seem ok. Does anybody
> can
> explain this behaviour which is not the most efficient in this case?
> Thanks,
> Panos.
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment