2010-01-09

Find index usage against a particular table, & find all unused indexes in the database

The environment: SQL Server 2005 or SQL Server 2008


Maybe you have a large database, and you're looking for ways to save space.  Maybe you're seeing a large, heavily-used table with many indexes, and you're wondering if you could improve write performance by getting rid of some of the indexes.

Before you start dropping indexes left and right, you should start by checking to see if, and how often, you'd obviously like to check which indexes are actually being used.  Starting in SQL Server 2005, there are DMV's (Dynamic Management Views) that make this possible - namely, sys.dm_db_index_usage_stats.  This DMV holds a row for each index which has been used since the last time the SQL Service starts up.  This information does not persist across service restarts, so I generally like to wait at least a month before I start looking at this DMV to get an idea of which indexes are being used, and how.

For more information on the SQL DMV's, check out Books Online:

Personal note: I've found that a month's worth of data is reasonably representative of your actual index usage - for example, the table in question may have a large index which is only used once a month - but it happens to be used by a query that is critical to your monthly billing system.


1. Check all of the indexes on a given table:
    select t.[name],
        i.[name],
        s.*
    from sys.tables t
        left outer join sys.indexes i on t.object_id = i.object_id
        left outer join sys.dm_db_index_usage_stats s on i.index_id = s.index_id
                                    and i.object_id = s.object_id
                                    and s.database_id = db_id()
    where i.is_hypothetical = 0    -- discard stats from consideration
        and t.[name] = 'TableName'                                   
    order by t.[name], i.[name]   
       
2. Check all tables in the database for small numbers of total seeks, scans and updates against any of their indexes:
    select t.[name],
        i.index_id,
        i.[name],
        s.*
    from sys.tables t
        left outer join sys.indexes i on t.object_id = i.object_id
        left outer join sys.dm_db_index_usage_stats s on i.index_id = s.index_id
                                    and i.object_id = s.object_id
                                    and s.database_id = db_id()
    where i.is_hypothetical = 0        --    not stats
        and
        (      --    has no index activity
            s.index_id is null              
            --    or has very little index activity       
            or ( s.user_seeks + s.user_scans + s.user_lookups ) < 100  
        )
        and i.[name] is not null       --    has indexes               
        and i.index_id <> 1            --    not a primary key           
    order by t.[name], i.[name]   

2010-01-07

Migrating Legacy DTS packages

The environment: SQL Server 2000, SQL Server 2005 and SQL Server 2008


Consider an environment in which you're asked to migrate a large number of DTS packages from one of your servers onto another server.  You could manually move each package, but if the following conditions are met, a simple insert statement into msdb..sysdtspackages can make your life a lot easier:

  • you don't need to convert your DTS packages to SSIS packages
  • all of your packages are stored on the server (rather than as a file)

use [msdb]
go

insert into sysdtspackages
    select [name],
        id,
        versionid,
        description,
        categoryid,
        createdate,
        owner,
        packagedata,
        owner_sid,
        packagetype
    from [linkedserver].msdb.dbo.sysdtspackages

 

2010-01-04

Choosing a monitoring solution for SQL Server

Depending on the size of your environment, you might find yourself in need of third-party software to monitor you various SQL Servers.  I'm going to examine various tools and outline my findings below.

It should be noted that there is not one solution for everyone - your solution can and should vary based on the size of your environment, budget, monitoring needs, and your personal preference.  I've discarded several of the solutions below because I simply did not like either the implementation or the interface.  Also, this is not at all a comprehensive list of monitoring software - this represents the result of a quick search of several SQL forums' recommendations, as well as a quick Google search.
I have need to monitor about 8-10 servers - 5 production clusters, 2 reporting repositories, and 1 dedicated replication distributor.  I need job monitoring, query collection, and counter monitoring.  Reports and trending would be nice (this is something I've found a bit lacking with our current solution).

** This post does not constitute a recommendation or endorsement of any of the products below.  I am not an employee or representative of any of the companies listed.  I am not being compensated in any fashion in conjunction with these recommendations.


SQL Diagnostic Manager
Pros: relatively intuitive dashboard views, multi-server views, very usable interface
Cons: occasionally buggy; I find some of the alerting annoying and obtrusive
Features: query collection at configurable intervals, SQL and Windows counter alerting and collection, job monitoring and alerting
Price Point: $2049 per instance, plus maintenance of $409.80
Notes: We are currently using this in our production environment (I'm examining other solutions as the license is up for renewal)  There are a few bugs in our version, but I've spoken with their support team, who is responsive and helpful, and it seems that these are resolved in the latest version they offer.


Confio (http://www.confio.com)
Ignite
Pros:
Cons: Java-based web client.  I prefer a standalone executable.
Features: Did not get as far as configuring this to run.
Price Point: N/A
Notes: As stated above, I installed this, but did not get past the configuration part for reasons stated (this is a web client, rather than a standalone executable) 


Quest (http://www.quest.com)
Spotlight on SQL Server
Pros:
Cons: The only con is that our shop is currently using Idera SQL Diagnostic Manager, which we can renew (and download the latest version) for far cheaper than we can purchase a new install of Spotlight for.  The outright purchase price is comparable with SQL Diagnostic Manger, though this seems to offer far more powerful tools with respect to reporting and trending.
Features:  Configurable alarms for Windows and SQL Server counters, fully configurable data collection scheduling, configurable custom counters, file monitoring, job monitoring, error log monitoring, and quite a bit more that I don't really have time to examine.  The default view is a little confusing and uninformative (consists of a lot of large, brightly colored circles and arrows and animations simulating network traffic) but the other views are very slick - there is reporting and trending for any of the counters (ex: Database File IO Statistics, Plan Cache - 2000 and 2005 only, for now).  It looks like the reporting and trending relies on a statistics repository that I've not yet configured.
Price Point: per server license of $1,995.00
  

Red Gate (http://www.red-gate.com)
SQL Response
Pros: Small, light-weight and easy-to-use.  Initial setup and configuration is minimal and intuituve.
Cons: lacks configurable trace - in the past, I have found it extremely useful to collect snapshots of running queries at set intervals, for the purpose of troubleshooting.  The "enable trace" option only collects for a given interval before/after an alert is raised, and retention is highly minimal.
Features:
Price Point: $619, with support/upgrades
Notes: I really like Red Gate's other products (SQL Compare, SQL Backup, etc) - they know how to create a good-looking, functional interface.  My experience with their sales and support staff has been overwhelmingly positive.  This would be my tool of choice at a smaller shop with less rigorous performance monitoring requirements.


Embarcadero
Performance Center 2.7
Pros:
Cons:
Features:
Price Point:
Notes: cancelled install when it prompted to install Apache

2010-01-03

Find the last full backup date for all databases on a server

Consider a situation in which you're taking over a number of SQL Servers, and you find yourself wondering about the backup situation - maybe you need to pull a full backup history for each database on each server, or maybe you just need to verify that your databases are being backed up.  This is a quick script that pulls the last full backup date for each database on the server - I'm including a SQL Server 2000-compatible version of the script:

SQL Server 2000
select d.[name],
    x.backup_date
from sysdatabases d
    left outer join
    (
        select database_name,
            convert(varchar(50),max(backup_finish_date),110) as backup_date
        from msdb..backupset
        where [type] = 'D'
        group by database_name
    ) x on x.database_name = d.[name]
order by d.[name]


SQL Server 2008 (should also work for SQL Server 2005)
;with x as
(
    select database_name,
        backup_size                                 as backup_size_bytes,
        backup_size / 1024                          as backup_size_kbytes,
        backup_size / (1024*1024)                   as backup_size_mbytes,
        compressed_backup_size,
        convert(varchar(50),backup_finish_date,110) as backup_date,
        [type],
        row_number() over ( partition by database_name order by backup_finish_date desc ) as rnk
    from msdb..backupset
    where [type] = 'D'       
)

select d.[name],
    d.recovery_model_desc,
    x.*
from sys.databases d
    left outer join x on x.database_name = d.[name]
                        and x.rnk = 1
order by d.[name]


Note that this is only the first step in your due diligence - you still have to determine where your databases are being backed up to, your retention window (locally and on any off-site tapes or servers), and (of course) run periodic fire drills to make sure your backup solution is adequate to the task - more on that in a later post.

2010-01-02

Find disk space usage for all tables and their indexes

The environment: SQL Server 2008 (this should work on SQL Server 2005 as well)


Consider a scenario in which you inherit a large database, or in which you revisit a database which has been sitting quietly in production for a while, just accumulating data.  This script compiles a full list of all of the tables in the database, along with their total reserved size in KB, their total index size in KB, and their total data size in KB.
create table #tmpsizes
(
    ID            int    identity(1,1) primary key,
    [name]        nvarchar(128),
    [rows]        char(11),
    reserved      varchar(18),
    data          varchar(18),
    index_size    varchar(18),
    unused        varchar(18),
)

create table #tmpsizes2
(   
    ID                int    identity(1,1) primary key,
    [name]            nvarchar(128),
    [rows]            int,
    reserved_kb       int,
    data_kb           int,
    index_size_kb     int,
    unused_kb         int
)


declare @objname nvarchar(128)
declare sizeCur cursor forward_only for
    select '[' + s.[name] + '].' + t.[name]
    from sys.tables t
        inner join sys.schemas s on t.schema_id = s.schema_id
open sizeCur
fetch next from sizeCur into @objname
while @@fetch_status = 0
begin
    insert into #tmpsizes
        exec sp_spaceused @objname

    fetch next from sizeCur into @objname
end
close sizeCur
deallocate sizeCur

insert into #tmpsizes2
    select [name],
        [rows],
        left(reserved, len(reserved)-3),
        left(data, len(data)-3),
        left(index_size, len(index_size)-3),
        left(unused, len(unused)-3)
    from #tmpsizes
 

-- show tables with the most rows at the top of the list      
select *
from #tmpsizes2
order by [rows] desc

-- show tables with highest ratios of index size to table size 
-- at the top of the list
select *,
    cast(index_size_kb as decimal(18,4))/cast(reserved_kb as decimal(18,4)) as index_percentage
from #tmpsizes2
where reserved_kb > 0
order by cast(index_size_kb as decimal(18,4))/cast(reserved_kb as decimal(18,4)) desc

drop table #tmpsizes
drop table #tmpsizes2

Boilerplate Code

I'm going to take advantage of this lazy Saturday afternoon to post some of my boilerplate code - this is code that I find use for on a repeat basis.  Whenever I post code that you're likely to want to save and re-use, I'll tag it as boilerplate code.


Some examples: 
  • find all the heaps in a database
  • search job code for a snippet of text
  • search proc/function code for a snippet of text
  • kill all of the open connections in a database (or all of the open connections from a given user)
  • find all of the table and index sizes in a database
Coming up: find all of the table and index sizes in a database!

2010-01-01

Replication: spot-check existing subscriptions for duplicate articles

The environment: SQL Server 2008 Enterprise 64-bit - transactional replication using a dedicated distributor.


SQL Server distribution agent generates error code 2627 with the following message:
Violation of PRIMARY KEY constraint 'PK_TableName'. Cannot insert duplicate key in object 'dbo.TableName'.

A quick manual inspection of the SQL Server error logs on the distributor reveals that there are two distribution agents generating the same error. If your publication is set to drop/recreate tables at the subscriber, it is possible to have duplicate replication - two different publications publishing the same article to the same destination server, database and table.


Removing one or both articles should resolve the issue (you may need to drop/add the article and re-snapshot). To guard against this error in the future, run the following query at the publisher database to determine if there are any other existing duplications:

select a.[name]     as articlename,
     s.srvname      as destination_servername,
     s.dest_db      as destination_database,
     a.dest_table   as destination_table,    
     count(*)       as numduplications
from syspublications p
     inner join sysarticles a on p.pubid = a.pubid
     inner join syssubscriptions s on s.artid = a.artid
group by a.[name],
     s.srvname,
     s.dest_db,
     a.dest_table
having count(*) >= 1


Before creating any new publications, or adding any articles to existing publications, run the following check for any tables you're adding to replication:
select a.[name]     as articlename,
     s.srvname      as destination_servername,
     s.dest_db      as destination_database,
     a.dest_table   as destination_table
from syspublications p
     inner join sysarticles a on p.pubid = a.pubid
     inner join syssubscriptions s on s.artid = a.artid
where a.[name] = 'tablename'

Welcome to my blog!

Hello everyone!

This is my first blog post - I'm just testing out the system right now.  I expect the first couple of posts here to concern replication, as it's something which I'm currently tweaking and streamlining.  However, I deal with everything from backups to performance tuning to development, so there should be a nice balance of topics.


A note on my environment - we're currently running transactional replication from 5-7 production servers through a single central distributor to multiple reporting and centralized business rules servers.  Most of the servers in my production environment are running SQL 2008 Enterprise 64-bit, and I anticipate that by mid-year, all of my remaining servers will be migrated to our SQL 2008 active passive clusters.


I've been a DBA for about 5 years now, working with SQL Server 2000/2005/2008, and every time I dig a little deeper into the SQL Server internals, I realize just how much there is to know.  I've been meaning to keep a blog for a while now, and it's kind of a New Years resolution - for myself as a kind of documentation, and for anyone out there who may stumble across any of the same issues.  Nothing infuriates me more than not knowing *why* something is happening, so expect a lot of detail in my posts.


D