Thursday, October 1, 2009

When to Reorganize/Rebuild Index

I had posted about using SQL Server 2005 DMVs in the past, and I had been meaning to add more details for a long time. Finally, here it is.


As the data in the underlying tables changes, indexes become fragmented (i.e., the logical ordering of the index pages no longer matches their physical ordering). When indexes get fragmented, query performance suffers.


So most DBAs include index rebuild/reorganize steps in their maintenance jobs. Here are the differences between reorganizing and rebuilding an index.


Reorganize

  • Physically reorders the leaf-level pages of the index and compacts it.
  • Takes fewer system resources, and it is an online operation.

Rebuild

  • Drops the existing index and creates a new one.
  • Because the index is dropped and recreated, it is an offline operation.

Rebuild online in SQL Server 2005

  • The database engine creates a new index alongside the original, so it needs additional disk space to accommodate the two concurrent indexes. During the process the new index is write-only, and modifications to the original index are applied to the new one. Once the process is done, the original index is dropped.
  • The database engine keeps the original index available to users while it builds the new one. It uses row versioning to maintain transactional consistency and integrity.


I personally look at the page count along with avg_fragmentation_in_percent, because high fragmentation on large indexes has a much bigger performance impact than on small ones.
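A quick way to see both numbers together (a minimal sketch using the same DMV the script below relies on; it lists every index in the current database, largest first):

--Quick check: fragmentation and page count per index in the current database
SELECT OBJECT_NAME(S.object_id) AS TableName,
       I.name AS IndexName,
       S.page_count,
       S.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') S
JOIN sys.indexes I ON I.object_id = S.object_id AND I.index_id = S.index_id
WHERE S.index_id > 0 --skip heaps
ORDER BY S.page_count DESC;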


Even if you rebuild all the indexes, you may notice that fragmentation for some tables remains high. This is because fragmentation often cannot be reduced on small indexes (the usual guideline is fewer than 1,000 pages), whose pages are partly allocated from mixed extents. For such small tables, SQL Server will often prefer a table scan over using an index anyway, so the fragmentation rarely matters.


Here is the script that I use:

--Script to Reorganize/Rebuild indexes

DECLARE @cmd VARCHAR(1000)

DECLARE @Table VARCHAR(255)

DECLARE @SchemaName VARCHAR(255)

DECLARE @IndexName VARCHAR(255)

DECLARE @AvgFragmentationInPercent DECIMAL

DECLARE @fillfactor INT

DECLARE @ThresholdLowerLimit VARCHAR(10)

DECLARE @ThresholdUpperLimit VARCHAR(10)

DECLARE @PageCount INT

/* You can specify your own thresholds for reorganizing vs. rebuilding indexes:

If avg_fragmentation_in_percent >= 5 AND < 30 AND page_count > 1000, then reorganize the index

If avg_fragmentation_in_percent >= 30 AND page_count > 1000, then rebuild the index

*/

SET @fillfactor = 90 --you can specify your own fill factor value

SET @ThresholdLowerLimit = '5.0' -- this value is a percentage

SET @ThresholdUpperLimit = '30.0' -- this value is a percentage

SET @PageCount = 1000 --below this page count, fragmentation tends to stay the same even after maintenance; not a magical cutoff, just a starting point

-- ensure the temporary table does not exist

IF (SELECT OBJECT_ID('tempdb..#IndexList')) IS NOT NULL

DROP TABLE #IndexList;

--Select all the indexes with fragmentation >= the lower threshold and more than @PageCount pages

SELECT OBJECT_NAME(S.object_id) AS [TableName], SCHEMA_NAME(T.schema_id) AS [SchemaName],

I.name AS [IndexName], S.page_count, S.avg_fragmentation_in_percent, 0 AS IsDefraged INTO #IndexList

FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, NULL) S

JOIN sys.tables T WITH (NOLOCK) ON S.object_id = T.object_id

JOIN sys.indexes I WITH (NOLOCK) ON S.object_id = I.object_id AND S.index_id = I.index_id --the index name must come from sys.indexes, not OBJECT_NAME()

WHERE T.is_ms_shipped = 0 --We only need user tables

AND S.index_id > 0 --skip heaps; they have no index to reorganize or rebuild

AND S.avg_fragmentation_in_percent >= CONVERT(DECIMAL, @ThresholdLowerLimit)

AND S.page_count > @PageCount

ORDER BY S.page_count DESC

WHILE EXISTS ( SELECT 1 FROM #IndexList WHERE IsDefraged = 0 )

BEGIN

SELECT TOP 1 @Table = TableName, @AvgFragmentationInPercent = avg_fragmentation_in_percent,

@SchemaName = SchemaName, @IndexName = IndexName

FROM #IndexList

WHERE IsDefraged = 0

--Reorganize the index if avg_fragmentation_in_percent >= 5 and < 30

IF ((@AvgFragmentationInPercent >= CONVERT(DECIMAL, @ThresholdLowerLimit)) AND (@AvgFragmentationInPercent < CONVERT(DECIMAL, @ThresholdUpperLimit)))

BEGIN

SET @cmd = 'ALTER INDEX [' + @IndexName + '] ON [' + RTRIM(LTRIM(@SchemaName)) + '].[' + RTRIM(LTRIM(@Table)) + '] REORGANIZE'

EXEC (@cmd)

PRINT N'Reorganize Index completed successfully for [' + @Table + '] at ' + CONVERT(VARCHAR, GETDATE())

END

--Rebuild the index if avg_fragmentation_in_percent >= 30

ELSE IF (@AvgFragmentationInPercent >= CONVERT(DECIMAL, @ThresholdUpperLimit))

BEGIN

SET @cmd = 'ALTER INDEX [' + @IndexName + '] ON [' + RTRIM(LTRIM(@SchemaName)) + '].[' + RTRIM(LTRIM(@Table)) + '] REBUILD WITH (FILLFACTOR = ' + CONVERT(VARCHAR(3), @fillfactor) + ', STATISTICS_NORECOMPUTE = OFF)'

EXEC (@cmd)

PRINT N'Rebuild Index completed successfully for [' + @Table + '] at ' + CONVERT(VARCHAR, GETDATE())

END

--Update the temp table

UPDATE #IndexList

SET IsDefraged = 1

WHERE TableName = @Table

AND IndexName = @IndexName

END

DROP TABLE #IndexList

Wednesday, September 30, 2009

Get the id of the newly inserted row

There are times when we need to retrieve the identity value of the last inserted row. Here is a scenario:


  1. You need to insert a record into a table
  2. You need the Id of the row you inserted, so you can use it to insert rows into tables referencing the first row


SQL Server gives you these options:


SCOPE_IDENTITY() – Gives the last identity value generated in the current scope. In other words, it returns the identity value your own statement created, not one generated by a trigger that statement fired.



@@IDENTITY – Gives the last identity value generated in the current session, in any scope (so it can return a value generated by a trigger).


IDENT_CURRENT('tablename') – Gives the last identity value produced for the table, regardless of the connection or the scope.



In our scenario, SCOPE_IDENTITY() works perfectly to return the identity of the newly added row.
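
Here is a minimal sketch of the pattern (the Orders and OrderItems tables and their columns are hypothetical):

DECLARE @OrderId INT;

--Insert the parent row
INSERT INTO dbo.Orders (CustomerName)
VALUES ('Contoso');

--Capture the identity generated by the INSERT above, in this scope only
SET @OrderId = SCOPE_IDENTITY();

--Use it for the child rows that reference the new order
INSERT INTO dbo.OrderItems (OrderId, ProductName)
VALUES (@OrderId, 'Widget');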

Tuesday, June 16, 2009

SQL Server 2008 Resource Governor

Resource Governor enables you to manage SQL Server workloads and resource utilization.


Terminology:

Resource Pool – A collection of physical resources. SQL Server 2008 has two built-in pools, default and internal, and allows a maximum of 18 user-defined pools.


Workload Group – A container for session requests that are similar according to the classification criteria applied to each request. Like resource pools, there are two built-in workload groups, default and internal.


Classifier Function – A user-written function containing the criteria applied to incoming requests. Each user session is assigned to a particular workload group by the classifier function.


Limitations:

  1. Limited to CPU bandwidth and memory management only
  2. Resource utilization can be controlled only for the database engine; it cannot be used with SSIS, Analysis Services, or Reporting Services
  3. Only one instance of SQL Server can be managed


Here is an example of creating a resource pool, workload groups, and a classifier function:


--Enable Resource Governor

ALTER RESOURCE GOVERNOR RECONFIGURE

GO

USE [master]

GO

-- Create resource pool to be used

CREATE RESOURCE POOL poolAdhoc

WITH (

-- allocate at least 10% of the CPU bandwidth for ad hoc queries

MIN_CPU_PERCENT = 10

-- do not let them exceed 20% either

, MAX_CPU_PERCENT = 20

-- ad hoc queries should be small; allocate only about 5%

-- of the server memory for them

, MIN_MEMORY_PERCENT = 5

, MAX_MEMORY_PERCENT = 5 )

GO

-- Create adhoc workload group

CREATE WORKLOAD GROUP groupAdhoc

WITH (

-- mostly defaults; cap CPU time per request at 50 seconds

IMPORTANCE = MEDIUM

, REQUEST_MAX_MEMORY_GRANT_PERCENT =25

, REQUEST_MEMORY_GRANT_TIMEOUT_SEC = 0

, REQUEST_MAX_CPU_TIME_SEC = 50

, MAX_DOP = 0

, GROUP_MAX_REQUESTS = 0

)

USING

poolAdhoc

GO

-- Create admin workload group

CREATE WORKLOAD GROUP groupAdmin

WITH (

-- mostly defaults; cap CPU time per request at 30 seconds

IMPORTANCE = MEDIUM

, REQUEST_MAX_MEMORY_GRANT_PERCENT =25

, REQUEST_MEMORY_GRANT_TIMEOUT_SEC = 0

, REQUEST_MAX_CPU_TIME_SEC = 30

, MAX_DOP = 0

, GROUP_MAX_REQUESTS = 0

)

USING

poolAdhoc

GO

CREATE FUNCTION [dbo].[rgClassifier_Adhoc]()

RETURNS sysname

WITH SCHEMABINDING

AS

BEGIN

-- Define the return sysname variable for the function

DECLARE @grp_name AS sysname;

SET @grp_name = 'default';

-- Specify the T-SQL statements for mapping session information

-- with Workload Groups defined for the Resource Governor.

IF (IS_SRVROLEMEMBER ('sysadmin') = 1)

SET @grp_name = 'groupAdmin'
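
-- Note: these checks run in order with no ELSE, so a session matching
-- both conditions gets the group assigned last (groupAdhoc here)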

IF (APP_NAME() LIKE '%Management Studio%')

SET @grp_name = 'groupAdhoc'

RETURN @grp_name;

END

GO

-- Set the classifier function for Resource Governor

ALTER RESOURCE GOVERNOR

WITH (

CLASSIFIER_FUNCTION = dbo.[rgClassifier_Adhoc]

)

GO

-- Make changes effective

ALTER RESOURCE GOVERNOR RECONFIGURE

GO
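
As a quick sanity check (my own addition, using the Resource Governor DMVs), you can confirm the pools and groups are in place:

--Verify the configured pools and groups
SELECT pool_id, name FROM sys.dm_resource_governor_resource_pools;

SELECT group_id, name, pool_id FROM sys.dm_resource_governor_workload_groups;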

Friday, June 5, 2009

SQL Server 2008 – Where is Activity Monitor?

For a minute I thought they had deprecated the feature. I use it a lot to check the activity going on in the databases.

They moved it: in SSMS 2008 you launch it by right-clicking the server node in Object Explorer (or from the toolbar icon), and the new one is really cool. Here is a blog post with screenshots.

http://blogs.msdn.com/dtjones/archive/2008/07/10/where-the-heck-is-activity-monitor.aspx

Saturday, May 30, 2009

SQL Server 2005 Pivots with Dynamic columns

http://www.simple-talk.com/community/blogs/andras/archive/2007/09/14/37265.aspx
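
The core idea from the linked post, as a minimal sketch (the dbo.Sales table and its Region/Category/Amount columns are hypothetical): build the column list at runtime, then execute the PIVOT dynamically.

DECLARE @cols NVARCHAR(MAX), @sql NVARCHAR(MAX);

--Build a comma-separated, bracketed list of the distinct values to pivot on
SELECT @cols = STUFF((SELECT ',' + QUOTENAME(Category)
                      FROM (SELECT DISTINCT Category FROM dbo.Sales) AS C
                      FOR XML PATH('')), 1, 1, '');

--Splice the column list into a PIVOT query and run it
SET @sql = N'SELECT Region, ' + @cols + N'
FROM (SELECT Region, Category, Amount FROM dbo.Sales) AS S
PIVOT (SUM(Amount) FOR Category IN (' + @cols + N')) AS P;';

EXEC sp_executesql @sql;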

Report Development - Report Requirements Document

The purpose of a report requirements document is to define the requirements for each report or group of reports.


Personally, I don’t like to read 80 pages of technical documentation for a report. Nor do I want to scratch my head wondering what the report is doing vs. what it is supposed to do.


Here is a quick checklist of items to include in the report requirements documentation


  • Business need for the report
  • Requirements from the point of view of the end users
  • The selection and the filter criteria
  • Data elements in the report
  • Calculations if any
  • Parameters in the report
  • Sort, grouping and totaling requirements
  • Appearance and style requirements (e.g., layout of the report, column names to be shown, etc.)
  • Performance requirements (speed and latency)
  • Security requirements (who needs access to the report)

Wednesday, May 13, 2009

Bad Data Models

We all know what a good data model is, and we spend significant time designing one. And if there are good data models, there are bad data models too.

There is a difference between incorrect data models and bad data models. An incorrect data model does not satisfy the business requirements, and hence it never makes it to implementation. A bad data model, however, does get implemented, and it does more harm to the system: every part of the application has to compensate for it.

Bad data models evolve for various reasons – time and resource constraints, inexperience, over-engineered modeling, and sometimes sloppiness.

Thursday, May 7, 2009

SQL Server 2008 – Maximum Capacity Specs

I found this article which gives the maximum size specifications for SQL Server 2008

http://msdn.microsoft.com/en-us/library/ms143432.aspx

SQL Server 2008 Min and Max Date Ranges

Here are the minimum and maximum date ranges:

SQL Server 2005 (datetime)
Min Date: 1753-01-01 00:00:00.000
Max Date: 9999-12-31 23:59:59.997

SQL Server 2008 (datetime2)
Min Date: 0001-01-01 00:00:00.0000000
Max Date: 9999-12-31 23:59:59.9999999
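
A quick way to verify the boundaries yourself on SQL Server 2008, where both types exist (casting a value outside these ranges raises an out-of-range error):

SELECT CAST('1753-01-01 00:00:00.000' AS datetime) AS DatetimeMin,
       CAST('9999-12-31 23:59:59.997' AS datetime) AS DatetimeMax,
       CAST('0001-01-01 00:00:00.0000000' AS datetime2) AS Datetime2Min,
       CAST('9999-12-31 23:59:59.9999999' AS datetime2) AS Datetime2Max;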

Thursday, February 12, 2009

SSIS Configurations

There is a significant change in how SSIS applies configurations in SQL Server 2008. Configurations are applied twice – once before the command line options are applied and once after. So a configuration reloaded in the second pass can overwrite a value specified on the command line.

This is a nice improvement, since you can set up a package with SQL Server configurations and then change the connection string of the configuration database on the command line; the package will reload its configuration values from the new location.
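
For example, a run along these lines (the package name, connection manager name, and server are hypothetical; /Conn is the standard dtexec option for overriding a connection string):

REM Point the package's SQL Server configuration at a different configuration database
dtexec /File MyPackage.dtsx /Conn ConfigurationsDb;"Data Source=OtherServer;Initial Catalog=SSISConfig;Integrated Security=SSPI;"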

Check this link for more information


http://msdn.microsoft.com/en-us/library/cc671625.aspx



Sunday, February 1, 2009

Truncate Mirrored Database Log File

If you are running asynchronous database mirroring, then there could be a backlog of transaction log records that have not been sent from the principal to the mirror (called the database mirroring SEND queue). The transaction log records cannot be freed until they have been successfully sent. With a high rate of transaction log record generation and limited bandwidth on the network (or other hardware issues), the backlog can grow quite large and cause the transaction log to grow.



On a mirrored database, you cannot back up the log with TRUNCATE_ONLY. Here are the steps to shrink the log file for a database participating in mirroring:



  1. Back up the log file to a location

BACKUP Log YourDatabaseName TO DISK = 'D:\BACKUP\DBNAME_20090201.TRN'


  2. Check whether there is enough free space to perform the shrink operation

SELECT name ,size/128.0 - CAST(FILEPROPERTY(name, 'SpaceUsed') AS int)/128.0 AS AvailableSpaceInMB

FROM sys.database_files;


DBCC SQLPERF(LOGSPACE);


If there is not sufficient free space, the shrink operation cannot reduce the file size.


  3. Check whether all the transactions have been written to disk


DBCC LOGINFO('YourDatabaseName')


The Status of the last virtual log file (the last row returned) should be 0, meaning inactive. If not, back up the transaction log once again.


  4. Shrink the log file

DBCC SHRINKFILE(logfilename , target_size)


If the transaction log file does not shrink after performing the above steps, back up the log file again to make more of the virtual log files inactive.


Also check the log_reuse_wait_desc column in the sys.databases catalog view to see whether reuse of transaction log space is waiting on anything.
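
For example:

--What (if anything) is preventing log truncation for this database?
SELECT name, log_reuse_wait_desc
FROM sys.databases
WHERE name = 'YourDatabaseName';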


Check this link to find the factors that can delay log truncation

http://msdn.microsoft.com/en-us/library/ms345414(SQL.90).aspx

Tuesday, January 27, 2009

Connecting to Remote SSIS Server

If you get the “Access is denied” error while connecting to a remote SSIS server, you may have to perform the steps outlined in the following link:

http://msdn.microsoft.com/en-us/library/aa337083.aspx

Additional steps to perform that are not on the MSDN page:

  1. Add the user to the Distributed COM Users group on the server
  2. Restart the SSIS service after making changes
  3. At a minimum, add the user to the db_dtsltduser role in msdb so they can edit their own packages. Check this link for more permissions on the msdb database:

http://msdn.microsoft.com/en-us/library/ms141053.aspx