Friday, March 23, 2012

Query performance.

I am working at a client site with an accounting database, and we are dealing with a simple query:

select count(*)
from Contract
inner join Charge on Contract.ContractID = Charge.ContractID

Oddly, when we run this on our production server it takes 60 to 90 seconds to complete, but on our test server it completes in 2 seconds.
The problem becomes worse with more complex procedures, such as our statement run that can exceed seven hours on the production server but completes in 45 minutes on our test server.

Our test server is an exact restore of our production server.

Contract and Charge are both indexed on ContractID, and each has about 500,000 rows.

The execution plans on both servers are identical, the longest step being a hash-match/Inner-join between the two tables (probably a result of the cardinality of ContractID in the Charge table).

Our production server is more powerful than our test server, with quad-processors and 3.5 gigs of RAM, while our test server has only 2 gigs. Both servers are set to dynamically configure memory, and I haven't seen our production server's memory use top 1.8 gigs.

Running trace through query analyzer shows one discrepancy: out test server performs only 24 reads to execute the query, while our productions server requires 220.

The problem occurs on our production server regardless of whether other users are using the system.

I have a couple questions:

1) Any ideas on why the query would run so much slower on our production server?

2) Would running SQL Profiler give me any additional information, and if so what settings should I trace (I've only used profiler once or twice). Also, is there anything I should be concerned about in running profiler on a production system? I've heard that it can have some impact on performance.

3) As part of the execution plan the optimizer performs a step called Bitmap/Bitmap Create prior to the hash. I can't find any documentation on this in Books Online or on Microsquashes website. Anybody know anything about it?

Thanks!

blindmanRunning trace through query analyzer shows one discrepancy: out test server performs only 24 reads to execute the query, while our productions server requires 220.
--------------
Check indexes for fragmentation.|||I tried dropping and recreating the indexes, to no effect.

I also just tried setting ContractID as the clustered index on the Charge table to see if that would change the execution plan, but it had no effect.

I am suspecting that there might be performance or contention issues with the drive, though supposedly there are no other services running on the serve besides MS SQL.

blindman|||Originally posted by blindman
I tried dropping and recreating the indexes, to no effect.

I also just tried setting ContractID as the clustered index on the Charge table to see if that would change the execution plan, but it had no effect.

I am suspecting that there might be performance or contention issues with the drive, though supposedly there are no other services running on the serve besides MS SQL.

blindman

Try to use performance monitor - very useful thing. Compare results for servers.|||uhhh...bounce the box?

This is MS BTW

Can you be down for a couple?

What's in the Error Log?

Did you do DBCC CHECKDB?|||blocking ?

Originally posted by blindman
I am working at a client site with an accounting database, and we are dealing with a simple query:

select count(*)
from Contract
inner join Charge on Contract.ContractID = Charge.ContractID

Oddly, when we run this on our production server it takes 60 to 90 seconds to complete, but on our test server it completes in 2 seconds.
The problem becomes worse with more complex procedures, such as our statement run that can exceed seven hours on the production server but completes in 45 minutes on our test server.

Our test server is an exact restore of our production server.

Contract and Charge are both indexed on ContractID, and each has about 500,000 rows.

The execution plans on both servers are identical, the longest step being a hash-match/Inner-join between the two tables (probably a result of the cardinality of ContractID in the Charge table).

Our production server is more powerful than our test server, with quad-processors and 3.5 gigs of RAM, while our test server has only 2 gigs. Both servers are set to dynamically configure memory, and I haven't seen our production server's memory use top 1.8 gigs.

Running trace through query analyzer shows one discrepancy: out test server performs only 24 reads to execute the query, while our productions server requires 220.

The problem occurs on our production server regardless of whether other users are using the system.

I have a couple questions:

1) Any ideas on why the query would run so much slower on our production server?

2) Would running SQL Profiler give me any additional information, and if so what settings should I trace (I've only used profiler once or twice). Also, is there anything I should be concerned about in running profiler on a production system? I've heard that it can have some impact on performance.

3) As part of the execution plan the optimizer performs a step called Bitmap/Bitmap Create prior to the hash. I can't find any documentation on this in Books Online or on Microsquashes website. Anybody know anything about it?

Thanks!

blindman|||Bouncing the server did speed up the processing the other day, but the performance quickly degraded again.

Problem with buffer pages, perhaps?

Nothing in the Error Log.

DBCC checks done regularly, and databases are spittin' images of eachother.

No blocks detected, and problem occurs regardless of whether other users are logged in.

blindman|||How big are the tranny logs?

How about the allocation to tempdb?

Is it a high level of OLTP?

But..

The execution plans on both servers are identical, the longest step being a hash-match/Inner-join between the two tables (probably a result of the cardinality of ContractID in the Charge table).

Should be an Index Scan...shouldn't it?

And 45 minutes is a long time....(7 hours is an eternity)...

Can it be the network? Are you executing localy on dev and remotley to PROD (even that shouldn't matter)

I'll keep thinking...

You can try the brain trust at:

http://www.sqlteam.com/Default.asp|||I know SQL Server is supposed to be doing this on its own every now and then, but what happens when you run

update statistics table with fullscan

on both tables? Kind of reaching here, because the test database is a restore of the other (and therefore identical). How does the hardware stack up?|||How does the hardware stack up?

...And if I read the REST of the post... Disregard that last bit, Blindman.

No comments:

Post a Comment