Please consider the following Select statement. I'm not sure how this is
going to end up getting formatted when posted but please notice the two
sub-queries in the main Select statement. They are identical except for the
"AS" clause. I have the following questions:
1) Will the SQL 2000 or 2005 optimizer see these are identical and execute
them only once instead of twice?
2) If the answer to #1 is that they will be executed twice, is there a way
to optimize this so that the two subqueries are only executed once?
SELECT
ProjectTran.ProjectRowID,
ProjectTran.RowID,
ProjectTran.ProjectWBSRowID,
PA.BillingTypeRowID,
'REVENUE',
ProjectTran.PriorInvoiceAmount-(SELECT
COALESCE(SUM(dbo.ProjectInvoiceDetail.InvoiceAmoun t),0)
FROM dbo.ProjectInvoiceDetail
INNER JOIN dbo.ProjectInvoice on
dbo.ProjectInvoiceDetail.ProjectInvoiceRowID = dbo.ProjectInvoice.RowID
Where dbo.ProjectInvoice.LedgerDocRowID IS NOT NULL
AND dbo.ProjectInvoice.StatusRowID = 1165
AND dbo.ProjectInvoiceDetail.InvoiceAmount <> 0
AND dbo.ProjectInvoiceDetail.ProjectTranRowID =
ProjectTran.RowID) AS Revenue,
ProjectTran.PriorInvoiceAmount -(SELECT
COALESCE(SUM(dbo.ProjectInvoiceDetail.InvoiceAmoun t),0)
FROM dbo.ProjectInvoiceDetail
INNER JOIN dbo.ProjectInvoice on
dbo.ProjectInvoiceDetail.ProjectInvoiceRowID = dbo.ProjectInvoice.RowID
Where dbo.ProjectInvoice.LedgerDocRowID IS NOT NULL
AND dbo.ProjectInvoice.StatusRowID = 1165
AND dbo.ProjectInvoiceDetail.InvoiceAmount <> 0
AND dbo.ProjectInvoiceDetail.ProjectTranRowID =
ProjectTran.RowID) AS BilledAmount
FROM ProjectTran
LEFT OUTER JOIN @.ProjectAccount PA ON
dbo.ProjectTran.ProjectAccountRowID = PA.ProjectAccountRowID
INNER JOIN dbo.ProjectDoc ON dbo.ProjectTran.ProjectDocRowID =
dbo.ProjectDoc.RowID
Where dbo.ProjectTran.ProjectRowID in (Select RowID From #ProjectTemp)
AND dbo.ProjectTran.ProjectDocRowID IS NOT NULL
AND dbo.ProjectDoc.StatusRowID = 1003
AND ProjectTran.RowID NOT IN (Select TransactionRowID From
@.ProjectDataDetail WHERE AccountTypeDescription = 'REVENUE')
In general, the query optimizer will not search for equivalent subqueries
and remove them if they are written twice in the same query, no. It is
fairly expensive in compilation time to try to do this, and it is so rarely
written that the payoff is very, very low to using this approach vs. the
general case.
You can:
1. Write the subquery once (somehow - there are multiple ways). For
example, I believe you can use nested blocks to write the subquery once,
give the result an alias, and then reference it in a subsequent block twice.
2. Use CROSS APPLY and make it a join, then reference the resulting column
twice. You should take care here to make sure that the join does not change
your output cardinality. This require some expertise, so try (1) unless you
are very comfortable with relational algebra equivalences.
Good luck.
Conor Cunningham
SQL Server Query Optimization Development Lead
"Amos Soma" <amos_j_soma@.yahoo.com> wrote in message
news:e1FxxLzMGHA.140@.TK2MSFTNGP12.phx.gbl...
> Please consider the following Select statement. I'm not sure how this is
> going to end up getting formatted when posted but please notice the two
> sub-queries in the main Select statement. They are identical except for
> the "AS" clause. I have the following questions:
> 1) Will the SQL 2000 or 2005 optimizer see these are identical and execute
> them only once instead of twice?
> 2) If the answer to #1 is that they will be executed twice, is there a way
> to optimize this so that the two subqueries are only executed once?
> SELECT
> ProjectTran.ProjectRowID,
> ProjectTran.RowID,
> ProjectTran.ProjectWBSRowID,
> PA.BillingTypeRowID,
> 'REVENUE',
> ProjectTran.PriorInvoiceAmount-(SELECT
> COALESCE(SUM(dbo.ProjectInvoiceDetail.InvoiceAmoun t),0)
> FROM dbo.ProjectInvoiceDetail
> INNER JOIN dbo.ProjectInvoice on
> dbo.ProjectInvoiceDetail.ProjectInvoiceRowID = dbo.ProjectInvoice.RowID
> Where dbo.ProjectInvoice.LedgerDocRowID IS NOT NULL
> AND dbo.ProjectInvoice.StatusRowID = 1165
> AND dbo.ProjectInvoiceDetail.InvoiceAmount <> 0
> AND dbo.ProjectInvoiceDetail.ProjectTranRowID =
> ProjectTran.RowID) AS Revenue,
> ProjectTran.PriorInvoiceAmount -(SELECT
> COALESCE(SUM(dbo.ProjectInvoiceDetail.InvoiceAmoun t),0)
> FROM dbo.ProjectInvoiceDetail
> INNER JOIN dbo.ProjectInvoice on
> dbo.ProjectInvoiceDetail.ProjectInvoiceRowID = dbo.ProjectInvoice.RowID
> Where dbo.ProjectInvoice.LedgerDocRowID IS NOT NULL
> AND dbo.ProjectInvoice.StatusRowID = 1165
> AND dbo.ProjectInvoiceDetail.InvoiceAmount <> 0
> AND dbo.ProjectInvoiceDetail.ProjectTranRowID =
> ProjectTran.RowID) AS BilledAmount
> FROM ProjectTran
> LEFT OUTER JOIN @.ProjectAccount PA ON
> dbo.ProjectTran.ProjectAccountRowID = PA.ProjectAccountRowID
> INNER JOIN dbo.ProjectDoc ON dbo.ProjectTran.ProjectDocRowID =
> dbo.ProjectDoc.RowID
> Where dbo.ProjectTran.ProjectRowID in (Select RowID From #ProjectTemp)
> AND dbo.ProjectTran.ProjectDocRowID IS NOT NULL
> AND dbo.ProjectDoc.StatusRowID = 1003
> AND ProjectTran.RowID NOT IN (Select TransactionRowID From
> @.ProjectDataDetail WHERE AccountTypeDescription = 'REVENUE')
>
|||It is highly unlikely that the subquery will be executed once, because
that is not how query plans work in SQL Server. However, the any I/O
will probably only be done once and retained in the data cache for the
second execution.
I am not sure if it is faster (you would have to test this), but you
could rewrite it as below:
SELECT
ProjectTran.ProjectRowID,
ProjectTran.RowID,
ProjectTran.ProjectWBSRowID,
PA.BillingTypeRowID,
'REVENUE',
ProjectTran.PriorInvoiceAmount - I.SumOfInvoiceAmount AS Revenue,
ProjectTran.PriorInvoiceAmount - I.SumOfInvoiceAmount AS BilledAmount
FROM ProjectTran
LEFT OUTER JOIN @.ProjectAccount PA
ON dbo.ProjectTran.ProjectAccountRowID = PA.ProjectAccountRowID
INNER JOIN dbo.ProjectDoc
ON dbo.ProjectTran.ProjectDocRowID = dbo.ProjectDoc.RowID
LEFT OUTER JOIN (
SELECT dbo.ProjectInvoiceDetail.ProjectTranRowID,
COALESCE(SUM(dbo.ProjectInvoiceDetail.InvoiceAmoun t),0) AS
SumOfInvoiceAmount
FROM dbo.ProjectInvoiceDetail
INNER JOIN dbo.ProjectInvoice
on dbo.ProjectInvoiceDetail.ProjectInvoiceRowID =
dbo.ProjectInvoice.RowID
Where dbo.ProjectInvoice.LedgerDocRowID IS NOT NULL
AND dbo.ProjectInvoice.StatusRowID = 1165
AND dbo.ProjectInvoiceDetail.InvoiceAmount <> 0
GROUP BY dbo.ProjectInvoiceDetail.ProjectTranRowID
) AS I
ON I.ProjectTranRowID = ProjectTran.RowID
Where dbo.ProjectTran.ProjectRowID in (
Select RowID
From #ProjectTemp
)
AND dbo.ProjectTran.ProjectDocRowID IS NOT NULL
AND dbo.ProjectDoc.StatusRowID = 1003
AND ProjectTran.RowID NOT IN (
Select TransactionRowID
From @.ProjectDataDetail
WHERE AccountTypeDescription = 'REVENUE'
)
HTH,
Gert-Jan
Amos Soma wrote:
> Please consider the following Select statement. I'm not sure how this is
> going to end up getting formatted when posted but please notice the two
> sub-queries in the main Select statement. They are identical except for the
> "AS" clause. I have the following questions:
> 1) Will the SQL 2000 or 2005 optimizer see these are identical and execute
> them only once instead of twice?
> 2) If the answer to #1 is that they will be executed twice, is there a way
> to optimize this so that the two subqueries are only executed once?
> SELECT
> ProjectTran.ProjectRowID,
> ProjectTran.RowID,
> ProjectTran.ProjectWBSRowID,
> PA.BillingTypeRowID,
> 'REVENUE',
> ProjectTran.PriorInvoiceAmount-(SELECT
> COALESCE(SUM(dbo.ProjectInvoiceDetail.InvoiceAmoun t),0)
> FROM dbo.ProjectInvoiceDetail
> INNER JOIN dbo.ProjectInvoice on
> dbo.ProjectInvoiceDetail.ProjectInvoiceRowID = dbo.ProjectInvoice.RowID
> Where dbo.ProjectInvoice.LedgerDocRowID IS NOT NULL
> AND dbo.ProjectInvoice.StatusRowID = 1165
> AND dbo.ProjectInvoiceDetail.InvoiceAmount <> 0
> AND dbo.ProjectInvoiceDetail.ProjectTranRowID =
> ProjectTran.RowID) AS Revenue,
> ProjectTran.PriorInvoiceAmount -(SELECT
> COALESCE(SUM(dbo.ProjectInvoiceDetail.InvoiceAmoun t),0)
> FROM dbo.ProjectInvoiceDetail
> INNER JOIN dbo.ProjectInvoice on
> dbo.ProjectInvoiceDetail.ProjectInvoiceRowID = dbo.ProjectInvoice.RowID
> Where dbo.ProjectInvoice.LedgerDocRowID IS NOT NULL
> AND dbo.ProjectInvoice.StatusRowID = 1165
> AND dbo.ProjectInvoiceDetail.InvoiceAmount <> 0
> AND dbo.ProjectInvoiceDetail.ProjectTranRowID =
> ProjectTran.RowID) AS BilledAmount
> FROM ProjectTran
> LEFT OUTER JOIN @.ProjectAccount PA ON
> dbo.ProjectTran.ProjectAccountRowID = PA.ProjectAccountRowID
> INNER JOIN dbo.ProjectDoc ON dbo.ProjectTran.ProjectDocRowID =
> dbo.ProjectDoc.RowID
> Where dbo.ProjectTran.ProjectRowID in (Select RowID From #ProjectTemp)
> AND dbo.ProjectTran.ProjectDocRowID IS NOT NULL
> AND dbo.ProjectDoc.StatusRowID = 1003
> AND ProjectTran.RowID NOT IN (Select TransactionRowID From
> @.ProjectDataDetail WHERE AccountTypeDescription = 'REVENUE')
|||If they are identical and you want to optimize it, take a CTE statement
to pre define the query. The CTE is available since SQL 2005
WITH T_Revenue
AS
(
SELECT COALESCE(SUM(dbo.ProjectInvoiceDetail.InvoiceAmoun t),0) AS Revenue
FROM dbo.ProjectInvoiceDetail PID
INNER JOIN dbo.ProjectInvoice PI
on PID.ProjectInvoiceRowID = PI.RowID
WHERE PI.LedgerDocRowID IS NOT NULL
AND PI.StatusRowID = 1165
AND PID.InvoiceAmount <> 0
AND PID.ProjectTranRowID = ProjectTran.RowID
)
SELECT PT.ProjectRowID,
PT.RowID,
PT.ProjectWBSRowID,
PA.BillingTypeRowID,
'REVENUE',
PT.PriorInvoiceAmount-(SELECT Revenue
FROM T_REvenue) AS Revenue,
PT.PriorInvoiceAmount - (SELECT Revenue
FROM T_REvenue) AS BilledAmount
FROM ProjectTran PT
LEFT OUTER JOIN @.ProjectAccount PA
ON PT.ProjectAccountRowID = PA.ProjectAccountRowID
INNER JOIN dbo.ProjectDoc PD
ON PT.ProjectDocRowID = PD.RowID
Where PT.ProjectRowID in (Select RowID
From #ProjectTemp)
AND PT.ProjectDocRowID IS NOT NULL
AND PD.StatusRowID = 1003
AND PT.RowID NOT IN (Select TransactionRowID
From @.ProjectDataDetail
WHERE AccountTypeDescription = 'REVENUE')
Take a look at the paper I wrote :
http://www.sqlservercentral.com/colu...server2005.asp
A +
Amos Soma a crit :
> Please consider the following Select statement. I'm not sure how this is
> going to end up getting formatted when posted but please notice the two
> sub-queries in the main Select statement. They are identical except for the
> "AS" clause. I have the following questions:
> 1) Will the SQL 2000 or 2005 optimizer see these are identical and execute
> them only once instead of twice?
> 2) If the answer to #1 is that they will be executed twice, is there a way
> to optimize this so that the two subqueries are only executed once?
> SELECT
> ProjectTran.ProjectRowID,
> ProjectTran.RowID,
> ProjectTran.ProjectWBSRowID,
> PA.BillingTypeRowID,
> 'REVENUE',
> ProjectTran.PriorInvoiceAmount-(SELECT
> COALESCE(SUM(dbo.ProjectInvoiceDetail.InvoiceAmoun t),0)
> FROM dbo.ProjectInvoiceDetail
> INNER JOIN dbo.ProjectInvoice on
> dbo.ProjectInvoiceDetail.ProjectInvoiceRowID = dbo.ProjectInvoice.RowID
> Where dbo.ProjectInvoice.LedgerDocRowID IS NOT NULL
> AND dbo.ProjectInvoice.StatusRowID = 1165
> AND dbo.ProjectInvoiceDetail.InvoiceAmount <> 0
> AND dbo.ProjectInvoiceDetail.ProjectTranRowID =
> ProjectTran.RowID) AS Revenue,
> ProjectTran.PriorInvoiceAmount -(SELECT
> COALESCE(SUM(dbo.ProjectInvoiceDetail.InvoiceAmoun t),0)
> FROM dbo.ProjectInvoiceDetail
> INNER JOIN dbo.ProjectInvoice on
> dbo.ProjectInvoiceDetail.ProjectInvoiceRowID = dbo.ProjectInvoice.RowID
> Where dbo.ProjectInvoice.LedgerDocRowID IS NOT NULL
> AND dbo.ProjectInvoice.StatusRowID = 1165
> AND dbo.ProjectInvoiceDetail.InvoiceAmount <> 0
> AND dbo.ProjectInvoiceDetail.ProjectTranRowID =
> ProjectTran.RowID) AS BilledAmount
> FROM ProjectTran
> LEFT OUTER JOIN @.ProjectAccount PA ON
> dbo.ProjectTran.ProjectAccountRowID = PA.ProjectAccountRowID
> INNER JOIN dbo.ProjectDoc ON dbo.ProjectTran.ProjectDocRowID =
> dbo.ProjectDoc.RowID
> Where dbo.ProjectTran.ProjectRowID in (Select RowID From #ProjectTemp)
> AND dbo.ProjectTran.ProjectDocRowID IS NOT NULL
> AND dbo.ProjectDoc.StatusRowID = 1003
> AND ProjectTran.RowID NOT IN (Select TransactionRowID From
> @.ProjectDataDetail WHERE AccountTypeDescription = 'REVENUE')
>
Frdric BROUARD, MVP SQL Server, expert bases de donnes et langage SQL
Le site sur le langage SQL et les SGBDR : http://sqlpro.developpez.com
Audit, conseil, expertise, formation, modlisation, tuning, optimisation
********************* http://www.datasapiens.com ***********************
|||Frdric (is that your name? MVP's do not usually post messages
anonymously),
Have you actually seen performance improvement using a CTE for this type
of problem?
I have tried to write a CTE in a simplified experiment (see below), and
it clearly shows that the CTE query is executed more than once. In the
example below, the first query requires 6 logical reads, the second only
3, which indicates that the CTE is executed twice in the first query.
-- create tables and load a few rows
CREATE TABLE projects
(row_id int not null PRIMARY KEY
,prior_invoice_amount int not null
)
CREATE TABLE project_invoice_details
(row_id int not null REFERENCES projects
,project_invoice_row_id int not null
,invoice_amount int not null
,CONSTRAINT PK_ProjectInvoiceDetails PRIMARY KEY
(row_id,project_invoice_row_id)
)
set nocount on
INSERT INTO projects VALUES (1, 0)
INSERT INTO projects VALUES (2,100)
INSERT INTO projects VALUES (3,200)
INSERT INTO project_invoice_details VALUES (1, 1, 10)
INSERT INTO project_invoice_details VALUES (1, 2, 20)
INSERT INTO project_invoice_details VALUES (1, 3, 30)
INSERT INTO project_invoice_details VALUES (2, 1, 25)
INSERT INTO project_invoice_details VALUES (3, 1, 80)
set nocount off
go
set statistics io on
go
-- query with 2 calls to the CTE
WITH T_revenue AS (
SELECT row_id, COALESCE(SUM(invoice_amount),0) AS revenue
FROM project_invoice_details PID
GROUP BY row_id
)
SELECT P.row_id,
P.prior_invoice_amount - (SELECT revenue
FROM T_revenue
WHERE T_revenue.row_id=P.row_id) AS
revenue,
P.prior_invoice_amount - (SELECT revenue
FROM T_revenue
WHERE T_revenue.row_id=P.row_id) AS
billed_amount
FROM projects P
go
-- query with only 1 call to the CTE
WITH T_revenue AS (
SELECT row_id, COALESCE(SUM(invoice_amount),0) AS revenue
FROM project_invoice_details PID
GROUP BY row_id
)
SELECT P.row_id,
P.prior_invoice_amount - (SELECT revenue
FROM T_revenue
WHERE T_revenue.row_id=P.row_id) AS
revenue,
P.prior_invoice_amount AS billed_amount
FROM projects P
go
set statistics io off
go
DROP TABLE project_invoice_details
DROP TABLE projects
Gert-Jan
P.S. I did not read your article, because the site requires
registration.
"SQLpro [MVP]" wrote:
> If they are identical and you want to optimize it, take a CTE statement
> to pre define the query. The CTE is available since SQL 2005
> WITH T_Revenue
> AS
> (
> SELECT COALESCE(SUM(dbo.ProjectInvoiceDetail.InvoiceAmoun t),0) AS Revenue
> FROM dbo.ProjectInvoiceDetail PID
> INNER JOIN dbo.ProjectInvoice PI
> on PID.ProjectInvoiceRowID = PI.RowID
> WHERE PI.LedgerDocRowID IS NOT NULL
> AND PI.StatusRowID = 1165
> AND PID.InvoiceAmount <> 0
> AND PID.ProjectTranRowID = ProjectTran.RowID
> )
> SELECT PT.ProjectRowID,
> PT.RowID,
> PT.ProjectWBSRowID,
> PA.BillingTypeRowID,
> 'REVENUE',
> PT.PriorInvoiceAmount-(SELECT Revenue
> FROM T_REvenue) AS Revenue,
> PT.PriorInvoiceAmount - (SELECT Revenue
> FROM T_REvenue) AS BilledAmount
> FROM ProjectTran PT
> LEFT OUTER JOIN @.ProjectAccount PA
> ON PT.ProjectAccountRowID = PA.ProjectAccountRowID
> INNER JOIN dbo.ProjectDoc PD
> ON PT.ProjectDocRowID = PD.RowID
> Where PT.ProjectRowID in (Select RowID
> From #ProjectTemp)
> AND PT.ProjectDocRowID IS NOT NULL
> AND PD.StatusRowID = 1003
> AND PT.RowID NOT IN (Select TransactionRowID
> From @.ProjectDataDetail
> WHERE AccountTypeDescription = 'REVENUE')
> Take a look at the paper I wrote :
> http://www.sqlservercentral.com/colu...server2005.asp
[snip]
|||I haven't try yet to experiment performances about CTE. I just post a
complete study about CTE and many ways to do with CTE and recursive queries.
The fact I post without my complete name is due to the fact I am
actually in a firm that is badly organize about Internet access and I
cannot have my own newreader with my personnal settings.
Yes I am Fred Brouard.
The ways you does it is interesting.
If I have time I will experiment performances about CTE in the next monthes.
A +
Gert-Jan Strik a crit :
> Frdric (is that your name? MVP's do not usually post messages
> anonymously),
> Have you actually seen performance improvement using a CTE for this type
> of problem?
> I have tried to write a CTE in a simplified experiment (see below), and
> it clearly shows that the CTE query is executed more than once. In the
> example below, the first query requires 6 logical reads, the second only
> 3, which indicates that the CTE is executed twice in the first query.
>
> -- create tables and load a few rows
> CREATE TABLE projects
> (row_id int not null PRIMARY KEY
> ,prior_invoice_amount int not null
> )
> CREATE TABLE project_invoice_details
> (row_id int not null REFERENCES projects
> ,project_invoice_row_id int not null
> ,invoice_amount int not null
> ,CONSTRAINT PK_ProjectInvoiceDetails PRIMARY KEY
> (row_id,project_invoice_row_id)
> )
> set nocount on
> INSERT INTO projects VALUES (1, 0)
> INSERT INTO projects VALUES (2,100)
> INSERT INTO projects VALUES (3,200)
> INSERT INTO project_invoice_details VALUES (1, 1, 10)
> INSERT INTO project_invoice_details VALUES (1, 2, 20)
> INSERT INTO project_invoice_details VALUES (1, 3, 30)
> INSERT INTO project_invoice_details VALUES (2, 1, 25)
> INSERT INTO project_invoice_details VALUES (3, 1, 80)
> set nocount off
> go
> set statistics io on
> go
> -- query with 2 calls to the CTE
> WITH T_revenue AS (
> SELECT row_id, COALESCE(SUM(invoice_amount),0) AS revenue
> FROM project_invoice_details PID
> GROUP BY row_id
> )
> SELECT P.row_id,
> P.prior_invoice_amount - (SELECT revenue
> FROM T_revenue
> WHERE T_revenue.row_id=P.row_id) AS
> revenue,
> P.prior_invoice_amount - (SELECT revenue
> FROM T_revenue
> WHERE T_revenue.row_id=P.row_id) AS
> billed_amount
> FROM projects P
> go
> -- query with only 1 call to the CTE
> WITH T_revenue AS (
> SELECT row_id, COALESCE(SUM(invoice_amount),0) AS revenue
> FROM project_invoice_details PID
> GROUP BY row_id
> )
> SELECT P.row_id,
> P.prior_invoice_amount - (SELECT revenue
> FROM T_revenue
> WHERE T_revenue.row_id=P.row_id) AS
> revenue,
> P.prior_invoice_amount AS billed_amount
> FROM projects P
> go
> set statistics io off
> go
> DROP TABLE project_invoice_details
> DROP TABLE projects
>
> Gert-Jan
> P.S. I did not read your article, because the site requires
> registration.
>
> "SQLpro [MVP]" wrote:
> [snip]
Frdric BROUARD, MVP SQL Server, expert bases de donnes et langage SQL
Le site sur le langage SQL et les SGBDR : http://sqlpro.developpez.com
Audit, conseil, expertise, formation, modlisation, tuning, optimisation
********************* http://www.datasapiens.com ***********************
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment