Checking Index Fragmentation

October 18, 2011

If your not monitoring your indexes on a regular basis shame on you, you should be. Depending on how many databases you are supporting this could end up being extremely tiedious. So here is what I do and you can tailor the process to your env.

1) Each production server I support has a DBA database on it.

2) Create the table IndexStats in each of the DBA databases (scripts below)

3) Create a job with the script below to capture the index statisics . (I do it once a week)

4) Now here it’s up to you on how you want to do this. I have a huge number of servers that I support so I transfer the statistics from each individual server to a central repository. I use an SSIS package which connects to the table in each database and dumps it into a table of the same format on my repository. The SSIS package uses Simple Data flows to do this and is pretty straight forward. I’m not going to go into the package in this post. If you need assistance on creating data flows in SSIS there are plenty of resources out there.

5) Create the report stored procedure to email you indexes which need to be rebuilt and schedule it based on your needs. I run it once a week then rebuild the indexes that weekend on the server in questions.

Script 1 Create Table to hold your statistics in your DBA database.

CREATE TABLE [dbo].[IndexStats](
[SQLInstance] [varchar](160) NULL,
[DatabaseName] [varchar](160) NULL,
[TableName] [varchar](160) NULL,
[IndexName] [varchar](160) NULL,
[IndexType] [varchar](160) NULL,
[Databaseid] [smallint] NULL,
[object_id] [int] NULL,
[index_id] [int] NULL,
[user_seeks] [bigint] NULL,
[user_scans] [bigint] NULL,
[user_lookups] [bigint] NULL,
[user_updates] [bigint] NULL,
[last_user_seek] [datetime] NULL,
[last_user_scan] [datetime] NULL,
[last_user_lookup] [datetime] NULL,
[last_user_update] [datetime] NULL,
[system_seeks] [bigint] NULL,
[system_scans] [bigint] NULL,
[system_lookups] [bigint] NULL,
[system_updates] [bigint] NULL,
[last_system_seek] [datetime] NULL,
[last_system_scan] [datetime] NULL,
[last_system_lookup] [datetime] NULL,
[last_system_update] [datetime] NULL,
[rows] [int] NULL,
[StatisticsCaptured] [date] NULL,
[avg_fragmentation_in_percent] [float] NULL,
[page_count] [int] NULL,
[Index_Depth] [int] NULL,
[Index_level] [int] NULL,
[Avg_record_Size_in_bytes] [int] NULL
) ON [PRIMARY]GO

SET ANSI_PADDING OFF
GO

ALTER TABLE [dbo].[IndexStats] ADD DEFAULT (getdate()) FOR [StatisticsCaptured]
GO

Script 2 Schedule this script from SQL Agent to populate your IndexStats Table. I delete data from this table since my data is moved into a central repository but if you are not doing that then you can remove the delete part of the script

insert into MyDBADB..IndexStats
exec
sp_msforeachdb
‘use ?
DECLARE @dbid int
select @dbid = db_id(”?”)
select
DISTINCT
@@SERVERNAME as SQLInstance,
”?” as DB,
OBJECT_NAME(a.object_id) as [TableName],
B.name as [Index],
B.type_desc as [IndexType],
A.*,
p.rows,
getdate(),
OS.avg_fragmentation_in_percent,
OS.page_count,
OS.Index_Depth,
OS.Index_level,
OS.Avg_record_Size_in_bytes
FROM ?.sys.dm_db_index_usage_stats a
INNER JOIN ?.sys.indexes B on a.object_id = B.object_id and a.index_id = B.index_id
inner join ?.sys.partitions p on B.object_id =p.object_id and B.index_id =p.index_id
INNER JOIN ?.sys.dm_db_index_physical_stats(@dbid,NULL,NULL,NULL,”SAMPLED”) OS on
A.object_id = OS.Object_id and A.index_id =OS.Index_id
WHERE
OBJECT_NAME(a.object_id) not like ”sys%” and OBJECT_NAME(a.object_id) not like ”spt%”
and OBJECT_NAME(a.object_id) not like ”queue%”
and a.database_id = DB_ID(”?”)
order by B.namedelete from MyDBADB..IndexStats
where StatisticsCaptured > GETDATE() -14

Script 3 Create the following stored procedure on your repository and schedule it to run as you see fit. This will alert you on the indexes with Fragmentation percent greater then 50.

/******************************************************************************************************
* Created By: MotherofAllGeeks.Com
* Created on: 3/2011
* Purpose gather space information for trending or possibly alerting
* Create it in master and it will gather stats for all databases
* Sample Call proc_getdatabasespace_r
* Prereq is that you must have a DBADB and you must have created FileStats table in MyDBADb.
* exec [proc_GetFragmentationReport_r] 7,’Mom@motherofallGeeks.com’, ‘MyMailProfile’
******************************************************************************************************/
CREATE PROCEDURE [dbo].[proc_GetFragmentationReport_r]
@days int ,
@recipients varchar(255),
@profile varchar(255)

AS

BEGIN
SET NOCOUNT ON

BEGIN TRY

BEGIN
DECLARE @tableHTML NVARCHAR(MAX), @currentDate date, @usagepercentage int

SET @tableHTML =
N’

‘+ N’

‘+N’Database’+N’CountofFragmenedIndexes’+N’

‘ +CAST ((SELECTtd = convert(varchar(30),sqlinstance) ,” ,td = convert(varchar(60),databasename),” ,td = convert(varchar(24),counttables) ,”FROM (select sqlinstance,databasename,count(tablename) counttablesfrom indexstatswhere avg_fragmentation_in_percent > 50and StatisticsCaptured > GETDATE() – @daysgroup by sqlinstance, databasename) Aorder by counttables descFOR XML PATH(‘tr’), TYPE) as NVARCHAR(max) ) + N’

SQLInstance

IF (@tableHTML is not null)
BEGIN

declare @subject varchar(55)
SELECT @subject = ‘Fragmented Indexes ‘
exec msdb.dbo.sp_send_dbmail
@profile_name = @profile,
@recipients = @recipients,
@subject = @subject,
@body = @tableHTML,
@body_format = ‘HTML’
END
END

END TRY

BEGIN CATCH

DECLARE @ErrorMessage varchar(max)

SELECT @ErrorMessage =
‘ErrorNumber:’ + convert(varchar(7),ERROR_NUMBER()) + ‘
‘ +
‘ErrorSeverity:’ + convert(varchar(7),ERROR_SEVERITY()) +’
‘+
‘ErrorState:’ + convert(varchar(7),ERROR_STATE()) +’
‘+
‘Calling Procedure:’ + CONVERT(VARCHAR(25), OBJECT_NAME(@@PROCID))+ ‘
‘+
‘ErrorProcedure:’+ ERROR_PROCEDURE() +’
‘+
‘ErrorLine:’ + convert(varchar(7),ERROR_LINE()) +’
‘+
‘ErrorMessage:’ + ERROR_MESSAGE() +’


RAISERROR(@ErrorMessage,16,1)

END CATCH

END

GO


Yet another Performance issue….

September 28, 2009

This article is related to the article I wrote a while back on parameter sniffing. We recently had another problem with a production application that started to experience poor performance. The C# web application using a SQL 2005 database starting timing out. The timeout was set to 30 seconds. I wasn’t seeing deadlocks on the database and we had some blocking but it wasn’t extreme. So this problem lasted over about two weeks. Here are the steps taken to troubleshoot.

1) Began running SQL Profiler. I started tracing only RPC calls using a filter on the trace for duration > 30000 (30 seconds). My first thoughts were I better figure out which procs are taking the time. The trace showed one particular proc which was executed many times and always took alot of time on the DB. This procedure was also blocking others. We spent some time tuning this procedure and testing it and pushed it into production.

2) The tuned procedure reduced the timeouts from around 150 a day to about 50 a day. A huge improvement but still not a great user experience. I changed my trace for duration > 10,000 and monitored it for many days. I started seeing that one proc to fetch information used for a drop down list was taking more and more time each day. If I took the call from the trace and executed it from SSMS the query would execute in 1 second. If I ran the same call from OSQL on a different machine it would take 30 seconds. The application timeout setting was also increased to 60 seconds.  

3) Next I checked my indexes for fragmentation and rebuilt them old style. I mean I dropped and recreated them I didn’t use DBCC commands.

4) I increased the number of times update statistics with full scan from once a week to twice a week.

We still had issues with this one proc and it is now causing the majority of the timeouts.  When I looked at the query plan from SSMS in the GUI format it’s doing two index seeks. But when I look at the plan generated from OSQL it’s doing an index scan and generating over 3 million reads on a table that has only about 400K rows in it.

5) At this point an advisory case with Microsoft was opened.  My gut was telling me that it has to be related to the query plan but I couldn’t understand why SSMS is picking an efficient plan and OSQL is not.

I checked the connection settings for the database properties coming from my application and I compared these to the settings used by SSMS and OSQL. They are all the same. To check the login set options you use profiler to trace the Audit:Login event. Microsoft told me that these settings can affect your query plan.

6) Next thing to do is start looking at the trace when I execute the stored procedure call from SSMS and OSQL. What I found partially answered my question as to why the procedure call executes quickly from SSMS.  I traced all the events under Stored Procedures for my spid and I found that whenever I execute the proc from SSMS it was doing a cache miss. AKA not finding it in the procedure cache. Ok Why? When I do a call from OSQL it’s also doing a cache miss but then picking an inefficient plan. Well at least I can recreate the issue….

7) Next thing is to start looking at is the plans that are actually in the cache. You can do this by using the Dynamic management views. Create the function and view below for an easy look into the cache.

create function SqlAndPlan(@handle varbinary(max))
returns table
as
return select sql.text, cp.usecounts,cp.cacheobjtype,
cp.objtype, cp.size_in_bytes,
qp.query_plan
 from
 sys.dm_exec_sql_text(@handle) as sql cross join
 sys.dm_exec_query_plan(@handle) as qp
 join sys.dm_exec_cached_plans as cp
 on cp.plan_handle = @handle;

create view PlanCache
as
select sp.* from sys.dm_exec_cached_plans   as cp
cross apply SqlAndPlan(cp.plan_handle) as sp

8 ) When viewing the results you will get the XMLversion of your query plan. If you click on the results you can open it and save it as .sqlplan and when you open it again it will be in the graphical format. However the XML Plan gives you views into some things that the graphical plan does not. Some of this information is extremely useful. The thing I personally thought is the greatest is that I can actually see what parameters are used for my procedure execution when the plan is stored in the cache. You can always reverse engineer your procedure calls from this. To find the parameter values search the xml plan for parameter list. You will see something similar to the following:

<ParameterList>ColumnReference Column=”@Default” ParameterCompiledValue=”(1)” />
ColumnReference Column=”@Role” ParameterCompiledValue=”(8)” />
ColumnReference Column=”@Page” ParameterCompiledValue=”Start” />
ColumnReference Column=”@user” ParameterCompiledValue=”(1)” />
…. n  

From this information I can see if my plan is an atypical call. For example if 90% of my users have role 1 and the plan cached is for role 8 perhaps you should remove this plan from the cache and replace it with a more efficient one. This happened to be the situation in my case. To change the value in the cache you need to recompile the stored proc and execute it with different set of parameters then verify that is stored in your cache. Another useful bit of information in the cache is the compile time. The proc which was causing us issues took .001 seconds to compile the plan. We decided with the advice of Microsoft to add the recompile option to this proc to ensure the correct plan is always chosen.  The over head in this case was .001 seconds. I don’t recommend adding this option all the time and this should be carefully weighed based on the stored procedure and other tuning options available.  


Follow

Get every new post delivered to your Inbox.