
Wednesday, March 21, 2012

Query Performance Problem

We have an application that's been performing reasonably well for quite some
time. Something seems to have changed one morning last week, as the entire
application slowed down significantly. I have no idea what could have
caused the problem, as there's nothing out of the ordinary in any of the
logs. I issued a DBCC CHECKDB to verify the integrity of the database, and
that didn't show any problems. I also ran DBCC DBREINDEX, DBCC UPDATEUSAGE,
and UPDATE STATISTICS, but that doesn't appear to have fixed anything
either.
During my investigation I was able to identify one query that was taking an
inordinate amount of time to execute. It is an in-line query (rather than
one that's invoked from within a stored procedure) that should have returned
in a couple of seconds, but was taking over 40 seconds. In trying to figure
out what was going on, I asked the server for an estimated execution plan,
and was surprised to find that it took just under 40 seconds for the server
to provide it. Once the plan was in the cache, asking for it again was
instantaneous and the query would execute in a couple of seconds. After the
query was removed from the cache (or if the query was modified slightly), it
took 40 seconds again.
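In case it helps, here is roughly how I've been reproducing it against a test copy of the database (the query itself is just a placeholder below, and you obviously wouldn't want to clear the cache on a production box):
DBCC FREEPROCCACHE   -- throw away all cached plans so the next request has to compile
GO
SET SHOWPLAN_ALL ON  -- ask for the estimated plan only; nothing gets executed
GO
SELECT ...           -- the problem query goes here
GO
SET SHOWPLAN_ALL OFF
GO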
It was my understanding that the server simply looked at indices and
statistics in order to come up with an ESTIMATED plan. Up until now I've
never seen it take more than a second to do so. Am I wrong about this?
Does this give anyone a clue as to what might be going on? The query is
pretty much a simple join between 3 tables and 3 views. (Each of the views
has 3-5 tables.) Any help would be appreciated.
Thanks!|||You cannot use a stored procedure directly inside an IF condition.
If you really want to call it within the IF condition, then create a function.
But remember that the function only works because what it calls is an extended proc.
CREATE function [dbo].[usp_FileExists] (@physname nvarchar(260))
returns int
as
begin
DECLARE @i int
EXEC master.dbo.usp_FileExists @physname, @i out
return(SELECT CASE WHEN @i = 1 THEN 1 ELSE 0 END)
end
GO
IF (dbo.usp_FileExists('D:\data\sql_data\myData_Data.MDF') = 0)
PRINT 'Does not Exist'
But again, unless you have a pressing reason to call it within the IF
condition, I would suggest you call the proc the same way the extended
proc is called within it.
Like this..
DECLARE @RC int
DECLARE @physname nvarchar(260)
set @physname = 'D:\data\sql_data\myData_Data.MDF'
EXEC @RC = [master].[dbo].[usp_FileExists1] @physname
if (@RC = 0)
PRINT 'Does not Exist'
-Omnibuzz (The SQL GC)
http://omnibuzz-sql.blogspot.com/|||Answered in the wrong post... sorry :(
--
-Omnibuzz (The SQL GC)
http://omnibuzz-sql.blogspot.com/
"Omnibuzz" wrote:

> You cannot use a stored procedure directly inside an IF condition.
> If you really want to call it within the IF condition, then create a function.
> But remember that the function only works because what it calls is an extended proc.
> CREATE function [dbo].[usp_FileExists] (@physname nvarchar(260))
> returns int
> as
> begin
> DECLARE @i int
> EXEC master.dbo.usp_FileExists @physname, @i out
> return(SELECT CASE WHEN @i = 1 THEN 1 ELSE 0 END)
> end
> GO
> IF (dbo.usp_FileExists('D:\data\sql_data\myData_Data.MDF') = 0)
> PRINT 'Does not Exist'
> But again, unless you have a pressing reason to call it within the IF
> condition, I would suggest you call the proc the same way the extended
> proc is called within it.
> Like this..
> DECLARE @RC int
> DECLARE @physname nvarchar(260)
> set @physname = 'D:\data\sql_data\myData_Data.MDF'
> EXEC @RC = [master].[dbo].[usp_FileExists1] @physname
> if (@RC = 0)
> PRINT 'Does not Exist'
>
> --
> -Omnibuzz (The SQL GC)
> http://omnibuzz-sql.blogspot.com/
>|||Mark Pauker (mpauker@.optonline.net) writes:
> During my investigation I was able to identify one query that was taking
> an inordinate amount of time to execute. It is an in-line query (rather
> than one that's invoked from within a stored procedure) that should have
> returned in a couple of seconds, but was taking over 40 seconds. In
> trying to figure out what was going on, I asked the server for an
> estimated execution plan, and was surprised to find that it took just
> under 40 seconds for the server to provide it. Once the plan was in the
> cache, asking for it again was instantaneous and the query would execute
> in a couple of seconds. After the query was removed from the cache (or
> if the query was modified slightly), it took 40 seconds again.
> It was my understanding that the server simply looked at indices and
> statistics in order to come up with an ESTIMATED plan. Up until now
> I've never seen it take more than a second to do so. Am I wrong about
> this? Does this give anyone a clue as to what might be going on? The
> query is pretty much a simple join between 3 tables and 3 views. (Each
> of the views has 3-5 tables.) Any help would be appreciated.
So 40 seconds for computing a query plan is indeed a bit extraordinary.
Then again a query like:
SELECT * FROM tbl WHERE col IN (val1, val2, ... val10000)
typically takes 15 seconds to compile on SQL 2000.
Since this is a query submitted from the application, I'm a little
nervous about what the code actually looks like. When you submit bare
queries from the client, it's essential that:
o No parameter values are interpolated into the query string.
o All tables are prefixed with owner/schema.
o The query text never changes.
When SQL Server looks up the query plan for a loose query in the cache,
it hashes the text. For this reason the cache is case- and space-sensitive.
That is, if a single space is added to the query, there will be a new
plan.
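If you want to see this effect for yourself, you can look in the plan cache. A quick sketch (the table name is just an example, and I'm quoting the syscacheobjects columns from memory):
-- Two queries that differ only by an extra space:
SELECT * FROM dbo.Orders WHERE OrderID = 1
GO
SELECT * FROM dbo.Orders  WHERE OrderID = 1
GO
-- On SQL 2000, each one should show up as its own ad hoc entry:
SELECT cacheobjtype, objtype, usecounts, sql
FROM   master.dbo.syscacheobjects
WHERE  sql LIKE '%dbo.Orders%'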
The reason that the owner/schema is important is that if the query is
SELECT ... FROM tbl
and is run by user joe, SQL Server will first have to check whether there
is a joe.tbl before it tries with dbo.tbl. Since joe.tbl could appear at
any moment, the consequence is that joe gets a private plan in the cache
for the query. If you say "SELECT ... FROM dbo.tbl" this does not happen.
Now, there are other reasons for recompilations as well, for instance
changed statistics. If any of the involved tables are moderate in size,
but updated frequently, recompilation will hit you time and again.
One option to address this is to add OPTION (KEEPFIXED PLAN) at the end
of the query, to prevent recompiles. But this still will not help if
the query involves a temp table that is recreated each time.
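Just to be concrete, I mean something along these lines (table and column names invented for the example):
SELECT O.OrderID, C.CustomerName
FROM   dbo.Orders O
JOIN   dbo.Customers C ON C.CustomerID = O.CustomerID
WHERE  O.OrderDate >= '20060101'
OPTION (KEEPFIXED PLAN)  -- keep the cached plan even if statistics change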
I understand that you would like to know why compilation takes such a
long time, but I don't really have an answer to that. I guess I would
need intimate knowledge of the involved tables to tell. And, of course,
which version of SQL Server you are using.
--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx|||> So 40 seconds for computing a query plan is indeed a bit extraordinary.
The bizarre thing is that this only started last week. Prior to that the
query executed in a few seconds. I'm trying to figure out what could have
caused this. Neither the executable nor any server configurations have
changed.

> Then again a query like:
> SELECT * FROM tbl WHERE col IN (val1, val2, ... val10000)
> typically takes 15 seconds to compile on SQL 2000.
I've never seen a query take anywhere near this amount of time. Why would it
take so long?

> When you submit bare queries from the client, it's essential that:
> o No parameter values are interpolated into the query string.
> o All tables are prefixed with owner/schema.
> o The query text never changes.
The query is built by the front end, with the "parameterized" data being
appended directly into the query string. I've been re-factoring the app to
remove this type of code, but there's way too much of it for me to get to in
a timely manner. While your suggestions make sense, the bottom line is that
before last week, the system was performing acceptably (even though it was
having to recompile the query each time it was submitted). I'm hoping to
identify a root cause of the slowdown so I can get performance back to the
way it was before all of this started last week. What could cause a server
to generate an estimated execution plan in 40 seconds when that same server
was generating estimated execution plans of one or two seconds just a few
days before?|||Mark Pauker wrote:
> The query is built by the front end, with the "parameterized" data being
> appended directly into the query string. I've been re-factoring the app to
> remove this type of code, but there's way too much of it for me to get to in
> a timely manner. While your suggestions make sense, the bottom line is that
> before last week, the system was performing acceptably (even though it was
> having to recompile the query each time it was submitted). I'm hoping to
> identify a root cause of the slowdown so I can get performance back to the
> way it was before all of this started last week. What could cause a server
> to generate an estimated execution plan in 40 seconds when that same server
> was generating estimated execution plans of one or two seconds just a few
> days before?
>
How often are you updating statistics on the database? Have you
recently made mass changes to any tables, i.e. deleted a bunch of
records, or imported a bunch? Kind of sounds like a statistics issue to
me...|||The query optimizer will compile based on the estimated cost of the query
(and its complexity). We implement this using "buckets" of optimizations
that we call stages. Cheap queries get only one stage. More expensive get
two stages. The most complex get 3 stages.
A query can get slightly more expensive and go from one bucket to the next.
That next bucket could take a lot longer, especially if you have a lot of
joins (the number of alternatives we have to consider explodes).
Another reason things could change like this is that the initial join order
picked could be less optimal. This could cause a higher estimated cost and
thus cause you to go through more optimization stages. This could be caused
by changes in the underlying data.
Another possible reason for longer compilation time is that something has
changed (which sounds less likely here). If you upgraded from one release
to another, perhaps there is a bug and we're not finding a solution fast
enough.
I usually try to simplify the query (remove pieces of it one at a time)
until I've found the culprit.
Conor
"Tracy McKibben" <tracy@.realsqlguy.com> wrote in message
news:OGudU7RmGHA.4032@.TK2MSFTNGP02.phx.gbl...
> Mark Pauker wrote:
> How often are you updating statistics on the database? Have you recently
> made mass changes to any tables, i.e. deleted a bunch of records, or
> imported a bunch? Kind of sounds like a statistics issue to me...|||Or perhaps it is auto-stats kicking in? (Doesn't sound like it from the orig
inal description, but
who knows...)
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
"Conor Cunningham [MS]" <conorc_removeme@.online.microsoft.com> wrote in message
news:eLcx2mTmGHA.4992@.TK2MSFTNGP03.phx.gbl...
> The query optimizer will compile based on the estimated cost of the query (and its complexity).
> We implement this using "buckets" of optimizations that we call stages. Cheap queries get only
> one stage. More expensive get two stages. The most complex get 3 stages.
> A query can get slightly more expensive and go from one bucket to the next. That next bucket
> could take a lot longer, especially if you have a lot of joins (the number of alternatives we
> have to consider explodes).
> Another reason things could change like this is that the initial join order picked could be less
> optimal. This could cause a higher estimated cost and thus cause you to go through more
> optimization stages. This could be caused by changes in the underlying data.
> Another possible reason for longer compilation time is that something has changed (which sounds
> less likely here). If you upgraded from one release to another, perhaps there is a bug and we're
> not finding a solution fast enough.
> I usually try to simplify the query (remove pieces of it one at a time) until I've found the
> culprit.
> Conor
> "Tracy McKibben" <tracy@.realsqlguy.com> wrote in message
> news:OGudU7RmGHA.4032@.TK2MSFTNGP02.phx.gbl...
>|||> A query can get slightly more expensive and go from one bucket to the
> next. That next bucket could take a lot longer, especially if you have a
> lot of joins (the number of alternatives we have to consider explodes).
This makes sense, but I still can't understand why this could take 30
seconds or more. Is this within a range that could be "as expected?" It
seems to me that something else must be going on. (The query in question
contains 6 tables and 3 views that are all left-joined together. Each view
has 4 or 5 tables.)
Keep in mind that the ENTIRE application has slowed down, not just this
query. I brought up the query for illustrative purposes, but it really
appears to be just a symptom of a more systemic problem. (I could rewrite
the query, but there would be a hundred more places that I'd need to fix...
And even then I wouldn't know that a larger problem wasn't simply being
masked.)
When the system started "misbehaving" last week, the processor utilization
went way up but the disk didn't get any more active. This is what makes me
think that the slow-down is related to the server's generating execution
plans. The query simply seems to bear that out. (Once the execution plan
has been generated, the query appears to execute normally.)
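(For what it's worth, this is roughly how I'm separating compile time from execution time; the query below is just a stand-in for the real one:)
SET STATISTICS TIME ON  -- reports parse/compile time separately from execution time
GO
DBCC FREEPROCCACHE      -- force a fresh compile; I'm only doing this on a test copy
GO
SELECT ...              -- the slow query goes here
GO
-- The "SQL Server parse and compile time" comes back around 40 seconds,
-- while the execution time itself is only a couple of seconds.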
I agree with Tracy's comment that it appears to be statistics-related, but
I'm not sure what to do with that information. I've updated them manually
(with and without FULLSCAN), and verified that they haven't been updated
automatically between requested execution plans. It's almost as if the
server is ignoring the statistics and trying to recalculate the plan from
scratch (although I guess I'd see more disk activity if this were true).
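(This is how I've been checking the statistics dates, in case I'm doing something wrong; the table and index names are changed:)
SELECT name AS index_name,
       STATS_DATE(id, indid) AS stats_last_updated
FROM   sysindexes
WHERE  id = OBJECT_ID('dbo.MyBigTable')
  AND  indid BETWEEN 1 AND 250
-- and for the detail on a particular index:
DBCC SHOW_STATISTICS ('dbo.MyBigTable', 'PK_MyBigTable')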

> Another reason things could change like this is that the initial join
> order picked could be less optimal. This could cause a higher estimated
> cost and thus cause you to go through more optimization stages. This
> could be caused by changes in the underlying data.
Is there any way to alleviate this if it turns out to be the case?

> Another possible reason for longer compilation time is that something has
> changed (which sounds less likely here). If you upgraded from one release
> to another, perhaps there is a bug and we're not finding a solution fast
> enough.
Sounds unlikely.

> I usually try to simplify the query (remove pieces of it one at a time)
> until I've found the culprit.
I can certainly do this (and probably will), but as I said, the problem is
occurring system-wide. I was hoping that there would be some known issue or
a suggestion that might address things at a more global level. I think that
the problem may lie in a single, very widely used table (that's involved in
this and a great number of other queries). The table is approaching 10M
records, occupying over 2G of space (independent of indices and statistics).
Any other thoughts would be greatly appreciated.
-- Mark|||Mark Pauker (mpauker@.optonline.net) writes:
> Keep in mind that the ENTIRE application has slowed down, not just this
> query. I brought up the query for illustrative purposes, but it really
> appears to be just a symptom of a more systemmic problem.
But if that query needs to be compiled again and again, since it has
interpolated queries, you could have several processes compiling different
versions of this query. And if there are other similar queries running,
it could all add up to this disaster.
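One thing you may be able to do even before the full refactoring is to have the front end send the queries parameterized, for instance through sp_executesql, so the query text stays constant and the plan can be reused. A sketch with made-up names:
DECLARE @sql nvarchar(2000), @custid int
SELECT @custid = 42
SELECT @sql = N'SELECT O.OrderID, O.OrderDate
                FROM   dbo.Orders O
                WHERE  O.CustomerID = @custid'
EXEC sp_executesql @sql, N'@custid int', @custid = @custid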

> Is there any way to alleviate this if it turns out to be the case?
Better indexing maybe?
It's awfully hard to give advice without access to the system, not knowing
the tables, the queries or the indexes.
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx

Query performance during incremental processing

We're doing some testing, and finding that query performance goes to hell while performing incremental processing on the partition (ProcessAdd). Specifically during the partition merge part of the processing, queries take from 3 to 30 times longer.

I guess it's not totally surprising that this would be the case, but can anyone confirm that this is the expected behavior? Is there anything that we can do to improve the situation?|||We have the same issue, and the same problem with Process Index as well. MS never gave us any solutions, so all processing is done off-hours.|||

This is probably your best bet.
Processing operation is quite heavy. It often requires tons of memory and I/O. It also tries to acquire a lock at its final stage, meaning your queries have to finish by then to allow processing to commit.

Many Analysis Services customers implement updates to their cubes during a special period of time, usually night hours. If your application requires keeping your cubes up to date with recent updates, you can utilize several techniques that require some careful planning on your side.

You partition your data, say by month, and set up a small partition that you will be fully processing (with the last day of data). Behind the scenes, ProcessAdd for a partition creates a new partition and then merges it with the original one. It is often faster to re-process a small daily partition fully.

You can also get yourself higher-end hardware that is going to sustain processing loads while users are querying your data.

Another solution is to have a dedicated processing machine that is synchronized with the production server from time to time.

Take a look at Project REAL for some clues on how to partition your data and implement batch updates.

Edward.
--
This posting is provided "AS IS" with no warranties, and confers no rights.

Friday, March 9, 2012

Query optimization question...

I'm trying to optimize some queries on an existing system, and I'm
noticing some odd behavior. I'm performing a join between several
tables, the final table being joined by the optimizer has a clustered
index on the field that it is using to join to the rest of the query,
but for some reason SQL Server doesn't seem to actually use this index
(it's doing an index scan instead of an index seek). Is there some
reason why SQL Server would not use a valid Clustered Index? I've
dropped and readded the index, but that doesn't seem to help. I don't
know if it would be relevant, but the tables I'm working on are fairly
fat (2 to 7K bytes/row).

This is happening for several tables. I've been able to get around it
for some of the tables by creating a non-clustered index on all the
fields that are being queried so that the leaf pages don't need to be
loaded, but this isn't a valid solution for all of the tables I'm
struggling with.

Any ideas? (and no, they aren't willing to redesign any of the
tables)."Mathew Relick" <ticars@.yahoo.com> wrote in message
news:dd84d8a7.0311181017.39d1c69@.posting.google.com...
> I'm trying to optimize some queries on an existing system, and I'm
> noticing some odd behavior. I'm performing a join between several
> tables, the final table being joined by the optimizer has a clustered
> index on the field that it is using to join to the rest of the query,
> but for some reason SQL Server doesn't seem to actually use this index
> (it's doing an index scan instead of an index seek). Is there some
> reason why SQL Server would not use a valid Clustered Index? I've
> dropped and readded the index, but that doesn't seem to help. I don't

This can happen if your table statistics are out of date. If SQL Server
determines that the join will require more than x % of the table to be
retrieved, then doing a clustered index scan may be faster than doing a
clustered index seek (faster because an index scan can read the rows in a
page sequentially, whereas an index seek has to traverse the B-tree
structure. Sequential read is faster because you cut down on seek time, and
also because you may be able to read more than one page in a single I/O
operation, since the data is sequential.) Try updating statistics and see
if it helps:

UPDATE STATISTICS <table_name> WITH FULLSCAN
GO

> know if it would be relevant, but the tables I'm working on are fairly
> fat (2 to 7K bytes/row).
> This is happening for several tables. I've been able to get around it
> for some of the tables by creating a non-clustered index on all the
> fields that are being queried so that the leaf pages don't need to be
> loaded, but this isn't a valid solution for all of the tables I'm
> struggling with.

This is because when you create a new index, new statistics are generated
for that index, so you have the most up-to-date statistics with your new
index. Same if you rebuild your existing indexes.
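If you want to refresh everything in one go on SQL 2000, something like this would do it (the table name is just an example):

DBCC DBREINDEX ('dbo.BigTable')  -- rebuilds all indexes on the table and refreshes their statistics
GO
EXEC sp_updatestats              -- refreshes out-of-date statistics across the whole database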

HTH,
Dave

>
> Any ideas? (and no, they aren't willing to redesign any of the
> tables).|||"Dave Hau" <nospam_dave_nospam_123@.nospam_netscape_nospam.net_ nospam> wrote
in message news:gCvub.33788$yj4.5497@.newssvr27.news.prodigy.c om...
> "Mathew Relick" <ticars@.yahoo.com> wrote in message
> news:dd84d8a7.0311181017.39d1c69@.posting.google.com...
> > I'm trying to optimize some queries on an existing system, and I'm
> > noticing some odd behavior. I'm performing a join between several
> > tables, the final table being joined by the optimizer has a clustered
> > index on the field that it is using to join to the rest of the query,
> > but for some reason SQL Server doesn't seem to actually use this index
> > (it's doing an index scan instead of an index seek). Is there some
> > reason why SQL Server would not use a valid Clustered Index? I've
> > dropped and readded the index, but that doesn't seem to help. I don't
> This can happen if your table statistics are out of date. If SQL Server
> determines that the join will require more than x % of the table to be
> retrieved, then doing a clustered index scan may be faster than doing a
> clustered index seek (faster because an index scan can read the rows in a
> page sequentially, whereas an index seek has to traverse the B-tree
> structure. Sequential read is faster because you cut down on seek time, and
> also because you may be able to read more than one page in a single I/O
> operation, since the data is sequential.) Try updating statistics and see
> if it helps:
> UPDATE STATISTICS <table_name> WITH FULLSCAN
> GO
> > know if it would be relevant, but the tables I'm working on are fairly
> > fat (2 to 7K bytes/row).
> > This is happening for several tables. I've been able to get around it
> > for some of the tables by creating a non-clustered index on all the
> > fields that are being queried so that the leaf pages don't need to be
> > loaded, but this isn't a valid solution for all of the tables I'm
> > struggling with.
> This is because when you create a new index, new statistics are generated
> for that index, so you have the most up-to-date statistics with your new
> index. Same if you rebuild your existing indexes.

I'm going to chime in because I think Dave has some good points here.
However, as I understand it, the original poster did rebuild the clustered
index, so there may be more to this problem than meets the eye.

One other thing that can happen is that the optimizer decides it's faster to
do a scan instead of a seek. This is particularly true if the result it
expects to return is a large percentage of the index. (i.e. if you have 100
rows and will return 80).

I'll be honest, I'm not sure exactly how this applies with a clustered
index.
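One way to sanity-check whether the scan decision is reasonable is to compare how much of the table the join actually touches against the table's size, for example (names made up):

SET STATISTICS IO ON   -- logical reads show how many pages the query really touches
GO
-- run the problem query here, then compare against the table's row count:
SELECT rows FROM sysindexes
WHERE  id = OBJECT_ID('dbo.FatTable') AND indid IN (0, 1)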

> HTH,
> Dave
> > Any ideas? (and no, they aren't willing to redesign any of the
> > tables).