Hi,
Can anyone help me optimize the SELECT statement in the 3rd step? I am actually writing a monthly report. So for each employee (500 employees) in a row, his attendance totals for all days in a month are displayed. The problem is that in the 3rd step, there are actually 31 SELECT statements which are assigned to 31 variables. After I assign these variable, I insert them in a Table (4th step) and display it. The troublesome part is the 3rd step. As there are 500 employees, then 500x31 times the variables are assigned and inserted in the table. This is taking more than 4 minutes which I know is not required :). Can anyone help me optimize the SELECT statements I have in the 3rd step or give a better suggestion.
DECLARE @.EmpID, @.DateFrom, @.Total1 ... // Declaring different variables
SELECT @.DateFrom = // Set to start of any month e.g. 2007-06-01 ..... 1st
Loop (condition -- Get all employees, working fine)
BEGIN
SELECT @.EmpID = // Get EmployeeID ..... 2nd
SELECT @.Total1 = SUM (Abences) ..... 3rd
FROM Attendance
WHERE employee_id_fk = @.EmpID (from 2nd step)
AND Date_Absent = DATEADD ("day", 0, Convert (varchar, @.DateFrom)) (from 1st step)
SELECT @.Total2 ..................... same as above
SELECT @.Total3 ..................... same as above
INSERT IN @.TABLE (@.EmpID, @.Total1, ..... @.Total31) ..... 4th
Iterate (condition) to next employee ..... 5th
END
It's only the loop which consumes the 4 minutes. If I can somehow optimize this part, I will be most satisfied. Thanks for anyone helping me...
What does the Attendance table look like? I have some ideas for you but I need to know how the attendance is stored. Can you give us the schema of that table please? Thanks!
|||
See this sample example-->
========================================================================================
Declare @.fromDate datetime,
@.toDate datetime
Set @.fromDate = '1-Aug-2007'
Set @.toDate = '4-Aug-2007'
Select distinct a.dtAttendate,
'STATUS'=(select
(case
when lv_status='EL' then 'EL'
when lv_status='CL' then 'CL'
when lv_status='SL' then 'SL'
when lv_status='ML' then 'ML'
when (wk_status='S' AND log_status='P') then 'SB'
when (wk_status='S' AND log_status='A') then 'Absent (SB)'
when wk_status='N' then 'Weekend'
when bIsHoliday=1 then 'Holiday'
when log_status='A' then 'Absent'
--when mnyLateHrs>0 then 'Late'
when log_status='P' then 'Present'
end)
from tblLogAbsent
where dtAttendate = a.dtAttendate and intEmpCode=a.intEmpCode),
'TIMEIN'=(select dtEmpTimeIn from tblLogStatus where dtAttendate=a.dtAttendate and intEmpCode=a.intEmpCode),
'TIMEOUT'=(select dtEmpTimeOUT from tblLogStatus where dtAttendate=a.dtAttendate and intEmpCode=a.intEmpCode)
from tblLogStatus a
where a. dtAttendate between @.fromDate and @.toDate
group by a.dtAttendate, intEmpCode
========================================================================================
here i use two different table "tbllogabsent" for his attendance status and "tbllogstatus" as a for additional information... this is not becoming problem... if u cant understand any line of code ask me again...hopefully this will be helpfull to u...
|||
Hmmm... I'm not sure how to use this information. In your initial code, you do a SELECT from a table called Attendance, which seems to have columns like Abences, employee_id_fk and Date_Absent. Can you give us more details on this table please? Thanks.
|||
Hi johram,
Thanks for replying. Yes you are right. The attendance table for employees has columns like EmpID_fk, Attendance_Date, ... , Attendance_Total. The Attendance_Total column is dependent on our business rules which include reason for signing in/out. e.g. If an employee has signed out for some official task, 1 is added to his Attendance_Total column. If he is going away for a business tour, 2 may be added to his Attendance_Total column. So in one day, an employee can have more than one record. The records for an employee may look like the following:
EmpID Attendance_Date ... Attendance_Total
1001 04/28/2006 1
1001 04/28/2006 2
1001 04/28/2006 1
So on 28th April, the total for Emp (1001) = 4. There are other columns in there but im only concerned with Attendance_Total. I need to display the SUM (Attendance_Total) for each day in a month for each employee.
If I further elaborate my report based on the above example, it may look something like:
EmpID EmpName D1 D2 D3 D4 D5 D6 ......
1001 ABC 1 0 4 4 1 0
I have tried a few techniques (under my experienceJ), but when it comes to computing the sum for each day, it takes almost 4-5 minutes which I am sure nobody wants. Also this report can be accessed anytime within a day and employees keep coming and going for business, so cant store the records and need to compute them everytime the report is accessed.
Thanks to you all for helping me...
|||
Hi patuary,
Thanks for your reply aswell. I have tried your technique and understand it - i bet :) Anyways, after applying your technique, there are the following two problems:
1. All records are being returned as rows e.g. The records for Emp 1001 in a month is not in a single row, rather seperate rows are returned for each day
2. Also, this query only returns data for days on which attendance may be marked. But if there is a weekend or the employee was absent, his attendance record for that day is not computed as there are no records. Whereas in my case, if he was absent or no record found on a given day, his attendance record must be marked as 0.
Thanks again for your time...
|||
Guys, please provide your valueable feedback...
|||
Is it important that you get a result with all the days, even if the sum is zero? Cause that will make it a bit more complicated in the SQL. You can have the SQL report back all days that actually have a total (greater than zero), and then in your GUI you can render the rest of the days as empty. In that case, I think we can work out a solution for you. At least that's what we'll start with ;-) I'll see what I can do!
Also, what's the datatype of you Attendance_Date column?
|||
If you are using SQL 2005, there should be a new statement called PIVOT, although it is nowhere to be found in the T-SQL reference manual on MSDN. Maybe you are luckier than me ;-) Pivot is the term for when you shift the layout of a table so that you look on it from a different perspective. In this case, you want to pivot the table on the date so that each date represents a column rather than a row.
Now, this can be done with a function calledCrosstable, which was developed by the legendary Rob Volk. The source code for this function can be foundhere. Note that you need to change the column "pivot" to "tpivot" or something, since "pivot" is a keyword in SQL 2005.
This is the modified version of Crosstable that will work in SQL 2005:
ALTER PROCEDURE crosstab @.select varchar(8000),@.sumfuncvarchar(100), @.pivotvarchar(100), @.table varchar(100)ASDECLARE @.sqlvarchar(8000), @.delimvarchar(1)SET NOCOUNT ONSET ANSI_WARNINGSOFFEXEC ('SELECT ' + @.pivot +' AS tpivot INTO ##pivot FROM ' + @.table +' WHERE 1=2')EXEC ('INSERT INTO ##pivot SELECT DISTINCT ' + @.pivot +' FROM ' + @.table +' WHERE ' + @.pivot +' Is Not Null')SELECT @.sql='', @.sumfunc=stuff(@.sumfunc,len(@.sumfunc), 1,' END)' )SELECT @.delim=CASE Sign( CharIndex('char', data_type)+CharIndex('date', data_type) )WHEN 0THEN''ELSE''''END FROM tempdb.information_schema.columnsWHERE table_name='##pivot'AND column_name='tpivot'SELECT @.sql=@.sql +'''' +convert(varchar(100), tpivot) +''' = ' + stuff(@.sumfunc,charindex('(', @.sumfunc )+1, 0,' CASE ' + @.pivot +' WHEN ' + @.delim +convert(varchar(100), tpivot) + @.delim +' THEN ' ) +', 'FROM ##pivotDROP TABLE ##pivotSELECT @.sql=left(@.sql,len(@.sql)-1)SELECT @.select=stuff(@.select, charindex(' FROM ', @.select)+1, 0,', ' + @.sql +' ')EXEC (@.select)SET ANSI_WARNINGSON
Now, to demonstrate the power of this function I made a quick sample for you to push you in the right direction:
EXECUTE Crosstab'SELECT EmpId FROM Attendance GROUP BY EmpId','SUM(Attendance_total)','Attendance_date','attendance'
This will give you a matrix with all the employees vertically, and horizontally you will have all unique dates, with the respective attendance total for each employee on that day. As I said earlier, this will not give you all the days of the month, unless there are data for each day. So you might need to do some logic in your GUI to render "empty" days correctly. Good luck!
|||
Hi Johram,
Once again thanks for your time. I really appreciate all your help. Yes your rite, we do need to handle all days in a month. But as you mentioned this can be handled in my logic so im lesser concerned about the days without any data.
Anyways, im aware of the Pivot function. Its basically used to convert rows into columns. Indeed the functionality i have in my stored procedure does the same job. What i do is that i have a temporary table in which i insert 31 rows for each employee. so for 500 employee, i insert 500x31 rows. Later i convert the rows into columns and display it. Although i do not use the Pivot function but i did once give it a try and the Processing Time was similar to what i have in there right now.
Still, im not ready to backout and will definitely give a try to your solution. Let me see what can i get out of it. By the way Johram, if you have dealt with any monthly or annual report in the past, usually how much time does it take to display such a report? Do you think that im being over ambitious in displaying such a report or such reports do take their time...
Once again, thanks alot for every help you have provided...
|||
Sorry, haven't done exactly this kind of report before. But it will depend on the amount of data you are trying to cover. Is it relevant to show ALL employees in a list/report? Maybe you should restrict it to region, or last name or something. Try to do a selection out of the 500 if it possible.
Although I havent been able to compare this crosstable thing with your first query, I still think that it might be faster. Try implement it and see for yourself. Good luck!
|||
Hi Johram
Thanks alot for your time and patience. I really appreciate your efforts and the help you have provided.
keep up the good work...