Showing posts with label employees. Show all posts

Friday, March 9, 2012

Query Optimization - Please Help

Hi,

Can anyone help me optimize the SELECT statements in the 3rd step? I am writing a monthly report. For each employee (500 employees), one row displays his attendance totals for every day of the month. The problem is that the 3rd step actually contains 31 SELECT statements, each assigned to one of 31 variables. After I assign these variables, I insert them into a table (4th step) and display it. The troublesome part is the 3rd step: with 500 employees, the variables are assigned 500x31 times before the rows are inserted. This takes more than 4 minutes, which I know is not required :). Can anyone help me optimize the SELECT statements in the 3rd step, or suggest a better approach?

DECLARE @EmpID, @DateFrom, @Total1 ...    -- declaring different variables

SELECT @DateFrom =    -- set to start of any month, e.g. 2007-06-01 ..... 1st

Loop (condition -- get all employees, working fine)
BEGIN
    SELECT @EmpID =    -- get EmployeeID ..... 2nd

    SELECT @Total1 = SUM (Abences)    -- ..... 3rd
    FROM Attendance
    WHERE employee_id_fk = @EmpID    -- from 2nd step
    AND Date_Absent = DATEADD ("day", 0, Convert (varchar, @DateFrom))    -- from 1st step

    SELECT @Total2 .....................    -- same as above
    SELECT @Total3 .....................    -- same as above

    INSERT IN @TABLE (@EmpID, @Total1, ..... @Total31)    -- ..... 4th

    Iterate (condition) to next employee    -- ..... 5th
END

It's only the loop that consumes the 4 minutes. If I can somehow optimize this part, I will be most satisfied. Thanks to anyone who helps me...

What does the Attendance table look like? I have some ideas for you but I need to know how the attendance is stored. Can you give us the schema of that table please? Thanks!

|||

See this sample:

========================================================================================

Declare @fromDate datetime,
@toDate datetime

Set @fromDate = '1-Aug-2007'
Set @toDate = '4-Aug-2007'


Select distinct a.dtAttendate,

'STATUS'=(select
(case
when lv_status='EL' then 'EL'
when lv_status='CL' then 'CL'
when lv_status='SL' then 'SL'
when lv_status='ML' then 'ML'
when (wk_status='S' AND log_status='P') then 'SB'
when (wk_status='S' AND log_status='A') then 'Absent (SB)'
when wk_status='N' then 'Weekend'
when bIsHoliday=1 then 'Holiday'
when log_status='A' then 'Absent'
--when mnyLateHrs>0 then 'Late'
when log_status='P' then 'Present'
end)
from tblLogAbsent
where dtAttendate = a.dtAttendate and intEmpCode=a.intEmpCode),

'TIMEIN'=(select dtEmpTimeIn from tblLogStatus where dtAttendate=a.dtAttendate and intEmpCode=a.intEmpCode),

'TIMEOUT'=(select dtEmpTimeOUT from tblLogStatus where dtAttendate=a.dtAttendate and intEmpCode=a.intEmpCode)

from tblLogStatus a
where a.dtAttendate between @fromDate and @toDate

group by a.dtAttendate, intEmpCode

========================================================================================

Here I use two different tables: "tblLogAbsent" for the attendance status and "tblLogStatus" (aliased as a) for additional information. This has not caused any problems for me. If you can't understand any line of the code, ask me again. Hopefully this will be helpful to you...

|||

Hmmm... I'm not sure how to use this information. In your initial code, you do a SELECT from a table called Attendance, which seems to have columns like Abences, employee_id_fk and Date_Absent. Can you give us more details on this table please? Thanks.

|||

Hi johram,

Thanks for replying. Yes you are right. The attendance table for employees has columns like EmpID_fk, Attendance_Date, ... , Attendance_Total. The Attendance_Total column is dependent on our business rules which include reason for signing in/out. e.g. If an employee has signed out for some official task, 1 is added to his Attendance_Total column. If he is going away for a business tour, 2 may be added to his Attendance_Total column. So in one day, an employee can have more than one record. The records for an employee may look like the following:

EmpID   Attendance_Date   ...   Attendance_Total
1001    04/28/2006              1
1001    04/28/2006              2
1001    04/28/2006              1

So on 28th April, the total for Emp (1001) = 4. There are other columns in there, but I'm only concerned with Attendance_Total. I need to display the SUM (Attendance_Total) for each day in a month for each employee.

If I further elaborate my report based on the above example, it may look something like:

EmpID   EmpName   D1   D2   D3   D4   D5   D6   ......
1001    ABC       1    0    4    4    1    0

I have tried a few techniques (within my experience :)), but computing the sum for each day takes almost 4-5 minutes, which I am sure nobody wants. Also, this report can be accessed at any time during the day, and employees keep coming and going on business, so I can't store the results and need to compute them every time the report is accessed.

Thanks to you all for helping me...

|||

Hi patuary,

Thanks for your reply as well. I have tried your technique and I think I understand it :) Anyway, after applying it, there are the following two problems:

1. All records are returned as rows, e.g. the records for Emp 1001 in a month are not in a single row; rather, separate rows are returned for each day.

2. Also, this query only returns data for days on which attendance was marked. If there is a weekend or the employee was absent, nothing is computed for that day because there are no records. In my case, if he was absent or no record is found on a given day, his attendance total for that day must be shown as 0.

Thanks again for your time...
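One set-based way to meet both requirements above (one row per employee, 0 for days without records) is conditional aggregation over a LEFT JOIN, which replaces the 500x31 single-value SELECTs with one statement. This is only a sketch under assumptions: the Employee table and its EmpID and EmpName columns are hypothetical names, and the Attendance columns follow the description earlier in the thread.

```sql
-- Sketch only. Employee(EmpID, EmpName) is an assumed table;
-- Attendance(EmpID_fk, Attendance_Date, Attendance_Total) follows the thread.
DECLARE @DateFrom datetime;
SET @DateFrom = '20070601';   -- first day of the report month

SELECT e.EmpID,
       e.EmpName,
       SUM(CASE WHEN DAY(a.Attendance_Date) = 1 THEN a.Attendance_Total ELSE 0 END) AS D1,
       SUM(CASE WHEN DAY(a.Attendance_Date) = 2 THEN a.Attendance_Total ELSE 0 END) AS D2,
       SUM(CASE WHEN DAY(a.Attendance_Date) = 3 THEN a.Attendance_Total ELSE 0 END) AS D3
       -- repeat the same expression for days 4 through 31
FROM Employee AS e
LEFT JOIN Attendance AS a
       ON a.EmpID_fk = e.EmpID
      AND a.Attendance_Date >= @DateFrom                      -- month start, inclusive
      AND a.Attendance_Date <  DATEADD(month, 1, @DateFrom)   -- next month start, exclusive
GROUP BY e.EmpID, e.EmpName
ORDER BY e.EmpID;
```

The date filter sits in the ON clause rather than in WHERE so that employees with no rows in the month still appear, with 0 in every day column. An index on Attendance (EmpID_fk, Attendance_Date) would probably help, although that has not been measured here.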

|||

Guys, please provide your valuable feedback...

|||

Is it important that you get a result with all the days, even if the sum is zero? Because that will make the SQL a bit more complicated. You can have the SQL report back all days that actually have a total (greater than zero), and then in your GUI you can render the rest of the days as empty. In that case, I think we can work out a solution for you. At least that's what we'll start with ;-) I'll see what I can do!

Also, what's the datatype of your Attendance_Date column?

|||

If you are using SQL 2005, there should be a new operator called PIVOT, although I could not find it in the T-SQL reference manual on MSDN. Maybe you are luckier than me ;-) Pivot is the term for shifting the layout of a table so that you look at it from a different perspective. In this case, you want to pivot the table on the date so that each date becomes a column rather than a row.
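For reference, a minimal sketch of what a PIVOT query for this report might look like. It assumes the Attendance columns described earlier in the thread (EmpID_fk, Attendance_Date, Attendance_Total) and a database running at SQL Server 2005 compatibility level or later. The day-number column DayNo and the D1..D3 aliases are illustrative names only.

```sql
-- Sketch only: pivot the per-day totals of one month into columns.
DECLARE @DateFrom datetime;
SET @DateFrom = '20070601';   -- first day of the report month

SELECT p.EmpID_fk,
       ISNULL(p.[1], 0) AS D1,
       ISNULL(p.[2], 0) AS D2,
       ISNULL(p.[3], 0) AS D3   -- continue through [31]
FROM (
    SELECT EmpID_fk,
           DAY(Attendance_Date) AS DayNo,   -- 1..31, becomes the column key
           Attendance_Total
    FROM Attendance
    WHERE Attendance_Date >= @DateFrom
      AND Attendance_Date <  DATEADD(month, 1, @DateFrom)
) AS src
PIVOT (
    SUM(Attendance_Total)
    FOR DayNo IN ([1], [2], [3])            -- continue through [31]
) AS p
ORDER BY p.EmpID_fk;
```

Because the column list is the fixed set of day numbers, no dynamic SQL is needed. Employees with no attendance rows in the month do not appear in this result, so they would still have to be added by joining from the employee list.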

Now, this can be done with a function called Crosstable, which was developed by the legendary Rob Volk. The source code for this function can be found here. Note that you need to change the column "pivot" to "tpivot" or something, since "pivot" is a keyword in SQL 2005.

This is the modified version of Crosstable that will work in SQL 2005:

ALTER PROCEDURE crosstab
    @select varchar(8000),
    @sumfunc varchar(100),
    @pivot varchar(100),
    @table varchar(100)
AS

DECLARE @sql varchar(8000), @delim varchar(1)
SET NOCOUNT ON
SET ANSI_WARNINGS OFF

EXEC ('SELECT ' + @pivot + ' AS tpivot INTO ##pivot FROM ' + @table + ' WHERE 1=2')
EXEC ('INSERT INTO ##pivot SELECT DISTINCT ' + @pivot + ' FROM ' + @table + ' WHERE ' + @pivot + ' Is Not Null')

SELECT @sql='', @sumfunc=stuff(@sumfunc, len(@sumfunc), 1, ' END)' )

SELECT @delim=CASE Sign( CharIndex('char', data_type)+CharIndex('date', data_type) )
    WHEN 0 THEN '' ELSE '''' END
FROM tempdb.information_schema.columns
WHERE table_name='##pivot' AND column_name='tpivot'

SELECT @sql=@sql + '''' + convert(varchar(100), tpivot) + ''' = ' +
    stuff(@sumfunc, charindex('(', @sumfunc )+1, 0, ' CASE ' + @pivot + ' WHEN '
    + @delim + convert(varchar(100), tpivot) + @delim + ' THEN ' ) + ', '
FROM ##pivot

DROP TABLE ##pivot

SELECT @sql=left(@sql, len(@sql)-1)
SELECT @select=stuff(@select, charindex(' FROM ', @select)+1, 0, ', ' + @sql + ' ')

EXEC (@select)
SET ANSI_WARNINGS ON

Now, to demonstrate the power of this function I made a quick sample for you to push you in the right direction:

EXECUTE Crosstab 'SELECT EmpId FROM Attendance GROUP BY EmpId', 'SUM(Attendance_total)', 'Attendance_date', 'attendance'

This will give you a matrix with all the employees vertically, and horizontally you will have all unique dates, with the respective attendance total for each employee on that day. As I said earlier, this will not give you all the days of the month, unless there are data for each day. So you might need to do some logic in your GUI to render "empty" days correctly. Good luck!
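To make the mechanics concrete, here is roughly the statement the procedure should build and execute for the call above, traced by hand from the string operations in the procedure. The two dates are illustrative, and the 'Apr 28 2006 12:00AM' format assumes the default CONVERT style for datetime values, so treat this as a sketch rather than captured output.

```sql
-- Approximate dynamic SQL generated by the Crosstab call above
-- (one column per distinct Attendance_date; dates here are made up).
SELECT EmpId,
       'Apr 28 2006 12:00AM' = SUM(CASE Attendance_date WHEN 'Apr 28 2006 12:00AM' THEN Attendance_total END),
       'Apr 29 2006 12:00AM' = SUM(CASE Attendance_date WHEN 'Apr 29 2006 12:00AM' THEN Attendance_total END)
FROM Attendance
GROUP BY EmpId;
```

Two things follow from this shape. The call as written pivots every distinct date in the table, not just one month, so a date filter is still needed somewhere. And if Attendance_date carries a time of day, each distinct timestamp would get its own column.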

|||

Hi Johram,

Once again, thanks for your time. I really appreciate all your help. Yes, you're right, we do need to handle all days in a month. But as you mentioned, this can be handled in my logic, so I'm less concerned about the days without any data.

Anyway, I'm aware of the PIVOT function. It's basically used to convert rows into columns. Indeed, the logic in my stored procedure does the same job: I have a temporary table into which I insert 31 rows for each employee, so for 500 employees I insert 500x31 rows. Later I convert the rows into columns and display them. I do not use PIVOT now, but I did try it once, and the processing time was similar to what I have at present.

Still, I'm not ready to back out and will definitely give your solution a try. Let me see what I can get out of it. By the way, Johram, if you have dealt with monthly or annual reports in the past, how long does such a report usually take to display? Do you think I'm being overambitious, or do such reports simply take their time...

Once again, thanks a lot for all the help you have provided...

|||

Sorry, I haven't done exactly this kind of report before. But it will depend on the amount of data you are trying to cover. Is it relevant to show ALL employees in a list/report? Maybe you should restrict it by region, or last name or something. Try to select a subset of the 500 if possible.

Although I haven't been able to compare this crosstab approach with your first query, I still think it might be faster. Try implementing it and see for yourself. Good luck!

|||

Hi Johram

Thanks a lot for your time and patience. I really appreciate your efforts and the help you have provided.

keep up the good work...

Monday, February 20, 2012

Query logic not working...

I have a little system of 3 tables: Job, employees and times. The times table has the fields times_id, employee_id and job_id.

I'm trying to write a query that pulls the employees that don't have a certain job_id yet. I'm going to put this data in a table so the user knows they are available for that job. The code I have isn't working, and I'm not sure why.

SELECT
DISTINCT times.employee_id, employee.employee_name
FROM employee
INNER JOIN times ON employee.employee_id = times.employee_id
WHERE (times.job_id <> @job_id)

Thanks in advance for any help. I'm sure I'm missing something silly, or maybe I need to have a stored procedure involved?... Thanks!

Try a subquery:

SELECT
DISTINCT employee_id, employee_name
FROM employee
WHERE employee_id not in
(SELECT employee_id FROM times
WHERE (job_id = @job_id) )

|||

That worked great, I had totally forgotten about sub-queries. Thanks a lot, Iori Jay.

|||

OR

SELECT
DISTINCT employee_id, employee_name
FROM employee
WHERE not exists (SELECT employee_id FROM times
WHERE (times.employee_id = employee.employee_id AND job_id = @job_id) )
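To see why the original INNER JOIN with <> gives the wrong list, here is a small self-contained sketch using table variables. The sample rows (Ann, Bob, Cid, job ids 10 and 20) are made up for illustration. The last query is a LEFT JOIN anti-join, a third way of expressing "employees without this job".

```sql
-- Hypothetical sample data, mirroring the employee and times tables in the question.
DECLARE @employee TABLE (employee_id int, employee_name varchar(50));
DECLARE @times TABLE (times_id int, employee_id int, job_id int);
DECLARE @job_id int;
SET @job_id = 10;

INSERT INTO @employee VALUES (1, 'Ann');
INSERT INTO @employee VALUES (2, 'Bob');
INSERT INTO @employee VALUES (3, 'Cid');

INSERT INTO @times VALUES (1, 1, 10);   -- Ann already has job 10
INSERT INTO @times VALUES (2, 1, 20);   -- Ann also has job 20
INSERT INTO @times VALUES (3, 2, 20);   -- Bob has only job 20; Cid has no rows

-- Original approach: keeps any employee who has at least one OTHER job,
-- so Ann is returned (through her job 20 row) and Cid is lost entirely.
SELECT DISTINCT e.employee_id, e.employee_name
FROM @employee AS e
INNER JOIN @times AS t ON e.employee_id = t.employee_id
WHERE t.job_id <> @job_id;

-- Anti-join: keep employees for whom no times row with this job exists.
SELECT e.employee_id, e.employee_name
FROM @employee AS e
LEFT JOIN @times AS t
       ON t.employee_id = e.employee_id
      AND t.job_id = @job_id
WHERE t.employee_id IS NULL;
```

With this data the first query returns Ann and Bob, while the anti-join returns Bob and Cid, the two employees who are actually available for job 10.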