Showing posts with label 2hrs. Show all posts
Showing posts with label 2hrs. Show all posts

Wednesday, March 21, 2012

Query Performance (tearing what's left of my hair out)

I really am on my last legs with this. I am trying to improve performance on
the following query. The query runs in 2hrs. When I remove the first two
references to a13.a13.Parent_County_desc and a13.County_desc the query
returns the same amount of rows in 1 minute!!! The only index on the
POt_lu_rd_post_code a13 table is as follows: clustered, unique located on
PRIMARY [Po_rd_postcode_id], [County_id]. The execution plan shows
a
distinct sort using 100% cpu when you include this select criteria and a nic
e
hash aggregate when we don't. The volume of data is quite large and we have
to use this table instead of a lookup due to a known bug. What can I do to
improve performance on this query. Please help.
select distinct coalesce(pa1.Ra_gender_id, pa2.Ra_gender_id,
pa3.Ra_gender_id) Ra_gender_id, a12.Ra_gender_desc
Ra_gender_desc, coalesce(pa1.Pr_cover_id, pa2.Pr_cover_id, pa3.Pr_cover_id)
Pr_cover_id,
a11.Pr_cover_desc Pr_cover_desc,
coalesce(pa1.Parent_County_id, pa2.Parent_County_id, pa3.Parent_County_id)
Parent_County_id,
a13.Parent_County_desc Parent_County_desc,
coalesce(pa1.County_id, pa2.County_id, pa3.County_id) County_id,
a13.County_desc County_desc,
coalesce(pa1.Area_Id, pa2.Area_Id, pa3.Area_Id) Area_Id,
a13.Area_Desc Area_Desc,
coalesce(pa1.Po_rd_rating_code, pa2.Po_rd_rating_code,
pa3.Po_rd_rating_code) Po_rd_rating_code,
coalesce(pa1.Week_id, pa2.Week_id, pa3.Week_id) Week_id,
a14.week_desc week_desc,
pa1.WJXBFS1 WJXBFS1,
pa2.WJXBFS1 WJXBFS2,
pa3.WJXBFS1 WJXBFS3
from marc_ZZT5J02027GMD000 pa1(nolock)
full outer join marc_ZZT5J02027GMD001 pa2(nolock)
on (pa1.Area_Id = pa2.Area_Id and
pa1.County_id = pa2.County_id and
pa1.Parent_County_id = pa2.Parent_County_id and
pa1.Po_rd_rating_code = pa2.Po_rd_rating_code and
pa1.Pr_cover_id = pa2.Pr_cover_id and
pa1.Ra_gender_id = pa2.Ra_gender_id and
pa1.Week_id = pa2.Week_id)
full outer join marc_ZZT5J02027GMD002 pa3(nolock)
on (coalesce(pa1.Area_Id, pa2.Area_Id) = pa3.Area_Id and
coalesce(pa1.County_id, pa2.County_id) = pa3.County_id and
coalesce(pa1.Parent_County_id, pa2.Parent_County_id) = pa3.Parent_County_id
and
coalesce(pa1.Po_rd_rating_code, pa2.Po_rd_rating_code) =
pa3.Po_rd_rating_code and
coalesce(pa1.Pr_cover_id, pa2.Pr_cover_id) = pa3.Pr_cover_id and
coalesce(pa1.Ra_gender_id, pa2.Ra_gender_id) = pa3.Ra_gender_id and
coalesce(pa1.Week_id, pa2.Week_id) = pa3.Week_id)
join PRt_lu_Cover a11
on (coalesce(pa1.Pr_cover_id, pa2.Pr_cover_id, pa3.Pr_cover_id) =
a11.Pr_cover_id)
join RAv_lu_Gender a12
on (coalesce(pa1.Ra_gender_id, pa2.Ra_gender_id, pa3.Ra_gender_id) =
a12.Ra_gender_id)
join POt_lu_rd_post_code a13
on (coalesce(pa1.Area_Id, pa2.Area_Id, pa3.Area_Id) = a13.Area_Id)
join TIt_lu_day a14
on (coalesce(pa1.Week_id, pa2.Week_id, pa3.Week_id) = a14.Week_id)marcmc
Don't you really need 'WHERE' condition , am I right?
"marcmc" <marcmc@.discussions.microsoft.com> wrote in message
news:122E9FF3-5A9A-4729-B652-EA1B6956D751@.microsoft.com...
> I really am on my last legs with this. I am trying to improve performance
on
> the following query. The query runs in 2hrs. When I remove the first two
> references to a13.a13.Parent_County_desc and a13.County_desc the query
> returns the same amount of rows in 1 minute!!! The only index on the
> POt_lu_rd_post_code a13 table is as follows: clustered, unique located on
> PRIMARY [Po_rd_postcode_id], [County_id]. The execution plan show
s a
> distinct sort using 100% cpu when you include this select criteria and a
nice
> hash aggregate when we don't. The volume of data is quite large and we
have
> to use this table instead of a lookup due to a known bug. What can I do to
> improve performance on this query. Please help.
>
> select distinct coalesce(pa1.Ra_gender_id, pa2.Ra_gender_id,
> pa3.Ra_gender_id) Ra_gender_id, a12.Ra_gender_desc
> Ra_gender_desc, coalesce(pa1.Pr_cover_id, pa2.Pr_cover_id,
pa3.Pr_cover_id)
> Pr_cover_id,
> a11.Pr_cover_desc Pr_cover_desc,
> coalesce(pa1.Parent_County_id, pa2.Parent_County_id, pa3.Parent_County_id)
> Parent_County_id,
> a13.Parent_County_desc Parent_County_desc,
> coalesce(pa1.County_id, pa2.County_id, pa3.County_id) County_id,
> a13.County_desc County_desc,
> coalesce(pa1.Area_Id, pa2.Area_Id, pa3.Area_Id) Area_Id,
> a13.Area_Desc Area_Desc,
> coalesce(pa1.Po_rd_rating_code, pa2.Po_rd_rating_code,
> pa3.Po_rd_rating_code) Po_rd_rating_code,
> coalesce(pa1.Week_id, pa2.Week_id, pa3.Week_id) Week_id,
> a14.week_desc week_desc,
> pa1.WJXBFS1 WJXBFS1,
> pa2.WJXBFS1 WJXBFS2,
> pa3.WJXBFS1 WJXBFS3
> from marc_ZZT5J02027GMD000 pa1(nolock)
> full outer join marc_ZZT5J02027GMD001 pa2(nolock)
> on (pa1.Area_Id = pa2.Area_Id and
> pa1.County_id = pa2.County_id and
> pa1.Parent_County_id = pa2.Parent_County_id and
> pa1.Po_rd_rating_code = pa2.Po_rd_rating_code and
> pa1.Pr_cover_id = pa2.Pr_cover_id and
> pa1.Ra_gender_id = pa2.Ra_gender_id and
> pa1.Week_id = pa2.Week_id)
> full outer join marc_ZZT5J02027GMD002 pa3(nolock)
> on (coalesce(pa1.Area_Id, pa2.Area_Id) = pa3.Area_Id and
> coalesce(pa1.County_id, pa2.County_id) = pa3.County_id and
> coalesce(pa1.Parent_County_id, pa2.Parent_County_id) =
pa3.Parent_County_id
> and
> coalesce(pa1.Po_rd_rating_code, pa2.Po_rd_rating_code) =
> pa3.Po_rd_rating_code and
> coalesce(pa1.Pr_cover_id, pa2.Pr_cover_id) = pa3.Pr_cover_id and
> coalesce(pa1.Ra_gender_id, pa2.Ra_gender_id) = pa3.Ra_gender_id and
> coalesce(pa1.Week_id, pa2.Week_id) = pa3.Week_id)
> join PRt_lu_Cover a11
> on (coalesce(pa1.Pr_cover_id, pa2.Pr_cover_id, pa3.Pr_cover_id) =
> a11.Pr_cover_id)
> join RAv_lu_Gender a12
> on (coalesce(pa1.Ra_gender_id, pa2.Ra_gender_id, pa3.Ra_gender_id) =
> a12.Ra_gender_id)
> join POt_lu_rd_post_code a13
> on (coalesce(pa1.Area_Id, pa2.Area_Id, pa3.Area_Id) = a13.Area_Id)
> join TIt_lu_day a14
> on (coalesce(pa1.Week_id, pa2.Week_id, pa3.Week_id) = a14.Week_id)
>
>|||Honestly I would say yes very much so but this query was generated by a
reporting application so unfortunately I am stuck with it!
"Uri Dimant" wrote:

> marcmc
> Don't you really need 'WHERE' condition , am I right?
> "marcmc" <marcmc@.discussions.microsoft.com> wrote in message
> news:122E9FF3-5A9A-4729-B652-EA1B6956D751@.microsoft.com...
> on
> nice
> have
> pa3.Pr_cover_id)
> pa3.Parent_County_id
>
>|||ps: the problem is massive amounts of data where the county descriptions are
involved and the distinct sort is grouping them.
"marcmc" wrote:

> I really am on my last legs with this. I am trying to improve performance
on
> the following query. The query runs in 2hrs. When I remove the first two
> references to a13.a13.Parent_County_desc and a13.County_desc the query
> returns the same amount of rows in 1 minute!!! The only index on the
> POt_lu_rd_post_code a13 table is as follows: clustered, unique located on
> PRIMARY [Po_rd_postcode_id], [County_id]. The execution plan show
s a
> distinct sort using 100% cpu when you include this select criteria and a n
ice
> hash aggregate when we don't. The volume of data is quite large and we hav
e
> to use this table instead of a lookup due to a known bug. What can I do to
> improve performance on this query. Please help.
>
> select distinct coalesce(pa1.Ra_gender_id, pa2.Ra_gender_id,
> pa3.Ra_gender_id) Ra_gender_id, a12.Ra_gender_desc
> Ra_gender_desc, coalesce(pa1.Pr_cover_id, pa2.Pr_cover_id, pa3.Pr_cover_id
)
> Pr_cover_id,
> a11.Pr_cover_desc Pr_cover_desc,
> coalesce(pa1.Parent_County_id, pa2.Parent_County_id, pa3.Parent_County_id)
> Parent_County_id,
> a13.Parent_County_desc Parent_County_desc,
> coalesce(pa1.County_id, pa2.County_id, pa3.County_id) County_id,
> a13.County_desc County_desc,
> coalesce(pa1.Area_Id, pa2.Area_Id, pa3.Area_Id) Area_Id,
> a13.Area_Desc Area_Desc,
> coalesce(pa1.Po_rd_rating_code, pa2.Po_rd_rating_code,
> pa3.Po_rd_rating_code) Po_rd_rating_code,
> coalesce(pa1.Week_id, pa2.Week_id, pa3.Week_id) Week_id,
> a14.week_desc week_desc,
> pa1.WJXBFS1 WJXBFS1,
> pa2.WJXBFS1 WJXBFS2,
> pa3.WJXBFS1 WJXBFS3
> from marc_ZZT5J02027GMD000 pa1(nolock)
> full outer join marc_ZZT5J02027GMD001 pa2(nolock)
> on (pa1.Area_Id = pa2.Area_Id and
> pa1.County_id = pa2.County_id and
> pa1.Parent_County_id = pa2.Parent_County_id and
> pa1.Po_rd_rating_code = pa2.Po_rd_rating_code and
> pa1.Pr_cover_id = pa2.Pr_cover_id and
> pa1.Ra_gender_id = pa2.Ra_gender_id and
> pa1.Week_id = pa2.Week_id)
> full outer join marc_ZZT5J02027GMD002 pa3(nolock)
> on (coalesce(pa1.Area_Id, pa2.Area_Id) = pa3.Area_Id and
> coalesce(pa1.County_id, pa2.County_id) = pa3.County_id and
> coalesce(pa1.Parent_County_id, pa2.Parent_County_id) = pa3.Parent_County_
id
> and
> coalesce(pa1.Po_rd_rating_code, pa2.Po_rd_rating_code) =
> pa3.Po_rd_rating_code and
> coalesce(pa1.Pr_cover_id, pa2.Pr_cover_id) = pa3.Pr_cover_id and
> coalesce(pa1.Ra_gender_id, pa2.Ra_gender_id) = pa3.Ra_gender_id and
> coalesce(pa1.Week_id, pa2.Week_id) = pa3.Week_id)
> join PRt_lu_Cover a11
> on (coalesce(pa1.Pr_cover_id, pa2.Pr_cover_id, pa3.Pr_cover_id) =
> a11.Pr_cover_id)
> join RAv_lu_Gender a12
> on (coalesce(pa1.Ra_gender_id, pa2.Ra_gender_id, pa3.Ra_gender_id) =
> a12.Ra_gender_id)
> join POt_lu_rd_post_code a13
> on (coalesce(pa1.Area_Id, pa2.Area_Id, pa3.Area_Id) = a13.Area_Id)
> join TIt_lu_day a14
> on (coalesce(pa1.Week_id, pa2.Week_id, pa3.Week_id) = a14.Week_id)
>
>|||When you say you must use this table instead of a lookup, can you elaborate?
Is it possible that you can use this without DISTINCT initially and
apply DISTINCT
elsewhere in your process where it will not require as much work? Also,
what is
the known bug you mention? Maybe someone knows it and has a workaround.
Steve Kass
Drew University
marcmc wrote:

>I really am on my last legs with this. I am trying to improve performance o
n
>the following query. The query runs in 2hrs. When I remove the first two
>references to a13.a13.Parent_County_desc and a13.County_desc the query
>returns the same amount of rows in 1 minute!!! The only index on the
>POt_lu_rd_post_code a13 table is as follows: clustered, unique located on
>PRIMARY [Po_rd_postcode_id], [County_id]. The execution plan shows
a
>distinct sort using 100% cpu when you include this select criteria and a ni
ce
>hash aggregate when we don't. The volume of data is quite large and we have
>to use this table instead of a lookup due to a known bug. What can I do to
>improve performance on this query. Please help.
>
>select distinct coalesce(pa1.Ra_gender_id, pa2.Ra_gender_id,
>pa3.Ra_gender_id) Ra_gender_id, a12.Ra_gender_desc
>Ra_gender_desc, coalesce(pa1.Pr_cover_id, pa2.Pr_cover_id, pa3.Pr_cover_id)
>Pr_cover_id,
>a11.Pr_cover_desc Pr_cover_desc,
>coalesce(pa1.Parent_County_id, pa2.Parent_County_id, pa3.Parent_County_id)
>Parent_County_id,
>a13.Parent_County_desc Parent_County_desc,
>coalesce(pa1.County_id, pa2.County_id, pa3.County_id) County_id,
>a13.County_desc County_desc,
>coalesce(pa1.Area_Id, pa2.Area_Id, pa3.Area_Id) Area_Id,
>a13.Area_Desc Area_Desc,
>coalesce(pa1.Po_rd_rating_code, pa2.Po_rd_rating_code,
>pa3.Po_rd_rating_code) Po_rd_rating_code,
>coalesce(pa1.Week_id, pa2.Week_id, pa3.Week_id) Week_id,
> a14.week_desc week_desc,
> pa1.WJXBFS1 WJXBFS1,
> pa2.WJXBFS1 WJXBFS2,
> pa3.WJXBFS1 WJXBFS3
>from marc_ZZT5J02027GMD000 pa1(nolock)
> full outer join marc_ZZT5J02027GMD001 pa2(nolock)
> on (pa1.Area_Id = pa2.Area_Id and
> pa1.County_id = pa2.County_id and
> pa1.Parent_County_id = pa2.Parent_County_id and
> pa1.Po_rd_rating_code = pa2.Po_rd_rating_code and
> pa1.Pr_cover_id = pa2.Pr_cover_id and
> pa1.Ra_gender_id = pa2.Ra_gender_id and
> pa1.Week_id = pa2.Week_id)
> full outer join marc_ZZT5J02027GMD002 pa3(nolock)
> on (coalesce(pa1.Area_Id, pa2.Area_Id) = pa3.Area_Id and
> coalesce(pa1.County_id, pa2.County_id) = pa3.County_id and
> coalesce(pa1.Parent_County_id, pa2.Parent_County_id) = pa3.Parent_County_i
d
>and
> coalesce(pa1.Po_rd_rating_code, pa2.Po_rd_rating_code) =
>pa3.Po_rd_rating_code and
> coalesce(pa1.Pr_cover_id, pa2.Pr_cover_id) = pa3.Pr_cover_id and
> coalesce(pa1.Ra_gender_id, pa2.Ra_gender_id) = pa3.Ra_gender_id and
> coalesce(pa1.Week_id, pa2.Week_id) = pa3.Week_id)
> join PRt_lu_Cover a11
> on (coalesce(pa1.Pr_cover_id, pa2.Pr_cover_id, pa3.Pr_cover_id) =
>a11.Pr_cover_id)
> join RAv_lu_Gender a12
> on (coalesce(pa1.Ra_gender_id, pa2.Ra_gender_id, pa3.Ra_gender_id) =
>a12.Ra_gender_id)
> join POt_lu_rd_post_code a13
> on (coalesce(pa1.Area_Id, pa2.Area_Id, pa3.Area_Id) = a13.Area_Id)
> join TIt_lu_day a14
> on (coalesce(pa1.Week_id, pa2.Week_id, pa3.Week_id) = a14.Week_id)
>
>
>|||thanks all. i have processed the change request and am going for the bug fix
on the dimension itself.
"Steve Kass" wrote:

> When you say you must use this table instead of a lookup, can you elaborat
e?
> Is it possible that you can use this without DISTINCT initially and
> apply DISTINCT
> elsewhere in your process where it will not require as much work? Also,
> what is
> the known bug you mention? Maybe someone knows it and has a workaround.
> Steve Kass
> Drew University
> marcmc wrote:
>
>