65.9K
CodeProject 正在变化。 阅读更多。
Home

SQL Server 中 DEFAULT TRACE 的隐藏宝石

starIconstarIconstarIconstarIcon
emptyStarIcon
starIcon

4.98/5 (16投票s)

2016 年 3 月 15 日

CPOL

4分钟阅读

viewsIcon

30659

从 SQL Server 2005 开始,默认跟踪功能在每个 SQL Server 上默认启用。默认跟踪是一个轻量级的系统跟踪,它存储了大量关于系统变化的有用信息...

引言

从 SQL Server 2005 开始,默认跟踪功能在每个 SQL Server 上默认启用。默认跟踪是一个轻量级的系统跟踪,它存储了大量关于系统变化的有用信息。它位于 SQL Server 安装目录中,包含五个扩展名为 .trc 的文件。服务器每次启动时都会自动创建一个新的跟踪文件,最早的文件会被擦除。新事件始终记录到最新的文件中,该文件的大小限制为 20 MB。如果超出此大小,将创建一个新文件,并且此行为无法更改。唯一可以…启用或禁用此功能。

EXEC sys.sp_configure 'show advanced options', 1
GO
RECONFIGURE WITH OVERRIDE
GO

EXEC sys.sp_configure 'default trace enabled', 0
GO
RECONFIGURE WITH OVERRIDE
GO

使用以下查询检查默认跟踪是否已启用

SELECT name, value
FROM sys.configurations
WHERE configuration_id = 1568

默认跟踪目录

SELECT [path], start_time, last_event_time, event_count
FROM sys.traces
WHERE is_default = 1

让我们看看默认跟踪可以存储哪些类型的事件

SELECT e.trace_event_id, e.name, c.category_id, c.name
FROM sys.trace_categories c
JOIN sys.trace_events e ON c.category_id = e.category_id
trace_event_id name                       category_id name
-------------- -------------------------- ----------- -----------------------------------
196            CLR                        20          Assembly Load
92             Database                   2           Data File Auto Grow
93             Database                   2           Log File Auto Grow
94             Database                   2           Data File Auto Shrink
95             Database                   2           Log File Auto Shrink
79             Errors and Warnings        3           Missing Column Statistics
80             Errors and Warnings        3           Missing Join Predicate
67             Errors and Warnings        3           Execution Warnings
69             Errors and Warnings        3           Sort Warnings
55             Errors and Warnings        3           Hash Warning
21             Errors and Warnings        3           EventLog
22             Errors and Warnings        3           ErrorLog
213            Errors and Warnings        3           Database Suspect Data Page
214            Errors and Warnings        3           CPU threshold exceeded
46             Objects                    5           Object:Created
47             Objects                    5           Object:Deleted
164            Objects                    5           Object:Altered
...

1. 自动增长事件

任何事务都需要磁盘上一定空间(数据文件或日志)已不是秘密。如果空间不足,文件大小会自动增加。此时,文件会被锁定,SQL Server 将等待磁盘子系统执行必要的操作来分配磁盘空间。

默认情况下,SQL Server 会预先初始化磁盘上的新空间。对于数据文件,此行为可以通过 Instant File Initialization 禁用。然而,对于日志文件,初始化仍然会发生,这无疑会很慢。因此,我建议定期跟踪自动增长事件

SELECT
      StartTime
    , Duration = Duration / 1000
    , DatabaseName = DB_NAME(DatabaseID)
    , [FileName]
    , GrowType = CASE WHEN EventClass = 92 THEN 'DATA' ELSE 'LOG' END
FROM sys.traces i
CROSS APPLY sys.fn_trace_gettable([path], DEFAULT) t
WHERE t.EventClass IN (
            92, -- Data File Auto Grow
            93  -- Log File Auto Grow
        ) 
    AND i.is_default = 1

并且如果它们的数量急剧增加

StartTime                EndTime                  Duration    DatabaseName           FileName                  GrowType
------------------------ ------------------------ ----------- -------------------   -----------------------    --------
2016-01-16 02:52:48.127  2016-01-16 02:52:48.163  36          tempdb                 templog                   LOG
2016-01-16 02:52:49.090  2016-01-16 02:52:49.140  50          tempdb                 tempdev                   DATA
2016-01-16 02:52:50.687  2016-01-16 02:52:50.903  216         tempdb                 tempdev                   DATA
2016-01-16 02:52:51.213  2016-01-16 02:52:51.257  43          tempdb                 tempdev                   DATA
2016-01-16 02:52:52.300  2016-01-16 02:52:53.060  760         tempdb                 tempdev                   DATA
2016-01-16 02:52:54.090  2016-01-16 02:52:54.360  270         tempdb                 tempdev                   DATA
2016-01-16 02:52:55.407  2016-01-16 02:52:55.680  273         tempdb                 tempdev                   DATA
2016-01-16 02:52:56.720  2016-01-16 02:52:57.007  286         tempdb                 tempdev                   DATA
2016-01-16 02:52:58.047  2016-01-16 02:52:58.253  206         tempdb                 tempdev                   DATA
2016-01-16 02:52:59.317  2016-01-16 02:52:59.830  513         tempdb                 tempdev                   DATA
2016-01-16 02:53:01.917  2016-01-16 02:53:02.280  363         tempdb                 tempdev                   DATA
2016-01-16 02:53:03.387  2016-01-16 02:53:03.687  300         tempdb                 tempdev                   DATA
2016-01-16 02:53:05.757  2016-01-16 02:53:06.060  303         tempdb                 tempdev                   DATA
2016-01-16 02:53:08.143  2016-01-16 02:53:08.343  200         tempdb                 tempdev                   DATA
2016-01-16 02:53:10.440  2016-01-16 02:53:10.500  60          tempdb                 tempdev                   DATA
2016-01-16 02:53:12.703  2016-01-16 02:53:13.053  350         tempdb                 tempdev                   DATA
2016-01-16 02:53:15.150  2016-01-16 02:53:15.520  370         tempdb                 tempdev                   DATA
2016-01-16 02:53:17.653  2016-01-16 02:53:17.897  243         tempdb                 tempdev                   DATA
2016-01-16 02:53:21.030  2016-01-16 02:53:21.477  446         tempdb                 tempdev                   DATA
2016-01-16 02:53:25.490  2016-01-16 02:53:25.653  163         tempdb                 tempdev                   DATA
2016-01-16 02:53:30.870  2016-01-16 02:53:31.157  286         tempdb                 tempdev                   DATA
2016-01-16 02:53:36.707  2016-01-16 02:53:37.133  426         tempdb                 tempdev                   DATA
2016-01-16 02:53:42.663  2016-01-16 02:53:43.040  376         tempdb                 tempdev                   DATA
2016-01-16 02:53:49.787  2016-01-16 02:53:49.907  120         tempdb                 tempdev                   DATA
2016-01-16 02:53:58.640  2016-01-16 02:53:58.840  200         tempdb                 tempdev                   DATA
2016-01-16 02:54:06.870  2016-01-16 02:54:07.057  186         tempdb                 tempdev                   DATA
2016-01-16 02:54:17.063  2016-01-16 02:54:17.107  43          tempdb                 tempdev                   DATA
2016-01-16 02:54:27.363  2016-01-16 02:54:27.370  6           tempdb                 tempdev                   DATA
2016-01-16 02:54:42.107  2016-01-16 02:54:42.110  3           tempdb                 tempdev                   DATA
2016-01-16 02:55:04.883  2016-01-16 02:55:04.913  30          tempdb                 tempdev                   DATA
2016-01-16 14:19:47.607  2016-01-16 14:19:47.670  63          AdventureWorks2012     AdventureWorks2012_Log    LOG
2016-01-16 14:19:47.820  2016-01-16 14:19:47.840  20          AdventureWorks2012     AdventureWorks2012_Log    LOG
2016-01-16 16:51:32.980  2016-01-16 16:51:33.080  100         tempdb                 tempdev                   DATA
2016-02-16 17:31:02.260  2016-02-16 17:31:02.327  66          tempdb                 templog                   LOG
...

这不仅可能导致文件碎片,还可能导致查询性能大幅延迟

SELECT
      GrowType = CASE WHEN EventClass = 92 THEN 'DATA' ELSE 'LOG' END
    , GrowCount = COUNT(1)
    , Duration = SUM(Duration) / 1000
FROM sys.traces i
CROSS APPLY sys.fn_trace_gettable([path], DEFAULT) t
WHERE t.EventClass IN (92, 93) 
    AND i.is_default = 1
    AND t.DatabaseID = DB_ID('tempdb')
GROUP BY EventClass
GrowType GrowCount   Duration
-------- ----------- --------------------
DATA     36          7296
LOG      2           102

让我们看看问题数据库的设置

USE tempdb
GO

SELECT
      d.type_desc
    , d.name
    , d.physical_name
    , current_size_mb = ROUND(d.size * 8. / 1024, 0)
    , initial_size_mb = ROUND(m.size * 8. / 1024, 0) 
    , auto_grow =
        CASE WHEN d.is_percent_growth = 1
            THEN CAST(d.growth AS VARCHAR(10)) + '%'
            ELSE CAST(ROUND(d.growth * 8. / 1024, 0) AS VARCHAR(10)) + 'MB'
        END
FROM sys.database_files d
JOIN sys.master_files m ON d.[file_id] = m.[file_id]
WHERE m.database_id = DB_ID('tempdb')

初始数据库大小:数据文件 8 MB,日志文件 1 MB

type_desc   name       physical_name                  current_size_mb    initial_size_mb    auto_grow
----------- ---------- ------------------------------ ------------------ ------------------ ------------
ROWS        tempdev    D:\SQL_2012\DATA\tempdb.mdf    258.000000         8.000000           10%
LOG         templog    D:\SQL_2012\DATA\templog.ldf   3.000000           1.000000           1MB

…与当前大小相比,这显然不够。此外,我们必须记住,每次重新启动时,SQL Server 都会重新创建 tempdb。因此,下次启动 SQL Server 时,您将获得一个 9 MB 的文件、查询执行延迟以及默认跟踪中的新消息批次。

如何处理这个问题?监控文件大小并为它们预留空间

SELECT
      s.[file_id]
    , s.name
    , size = CAST(s.size * 8. / 1024 AS DECIMAL(18,2))
    , space_used = CAST(t.space_used * 8. / 1024 AS DECIMAL(18,2))
    , free_space = CAST((s.size - t.space_used) * 8. / 1024 AS DECIMAL(18,2))
    , used_percent = CAST(t.space_used * 100. / s.size AS DECIMAL(18,2))
    , s.max_size
    , s.growth
    , s.is_percent_growth
FROM sys.database_files s
CROSS APPLY (
    SELECT space_used = FILEPROPERTY(s.name, 'SpaceUsed')
) t

2. 自动收缩事件

最近,我写过 AUTO_CLOSE 选项会降低性能。那么,AUTO_SHRINK 则更糟。每 30 分钟,SQL Server 会尝试收缩 DB 文件中的可用空间。此过程非常消耗资源,并可能导致磁盘上的文件碎片。文件收缩可能导致高索引碎片,增加逻辑读取并降低查询性能。

SELECT
      StartTime
    , EndTime
    , Duration = Duration / 1000
    , DatabaseName = DB_NAME(DatabaseID)
    , [FileName]
    , GrowType = CASE WHEN EventClass = 94 THEN 'DATA' ELSE 'LOG' END
    , NTDomainName
    , ApplicationName
    , LoginName
    , TextData
    , IsSystem
FROM sys.traces i
CROSS APPLY sys.fn_trace_gettable([path], DEFAULT) t
WHERE t.EventClass IN (
            94, -- Data File Auto Shrink
            95  -- Log File Auto Shrink
        ) 
    AND i.is_default = 1
StartTime               EndTime                 Duration    DatabaseName         FileName                GrowType
----------------------- ----------------------- --------    -------------------  ----------------------- --------
2016-02-10 11:57:54.333 2016-02-10 11:57:54.450 116         AdventureWorks2012   AdventureWorks2012_Log  LOG
2016-02-10 14:58:21.087 2016-02-10 14:58:21.200 113         AdventureWorks2012   AdventureWorks2012_Log  LOG
2016-02-10 19:30:02.350 2016-02-10 19:30:02.463 113         AdventureWorks2012   AdventureWorks2012_Log  LOG
2016-02-10 21:00:26.083 2016-02-10 21:00:26.100 16          AdventureWorks2012   AdventureWorks2012_Log  LOG

我强烈建议为所有数据库禁用此选项

SELECT 'ALTER DATABASE ' + QUOTENAME(name) + ' SET AUTO_SHRINK OFF WITH NO_WAIT;'
FROM sys.databases
WHERE is_auto_shrink_on = 1

3. DBCC 事件

默认跟踪的另一个有用功能是能够跟踪何时以及谁运行了 DBCC 命令。不能责怪 CHECKDB 导致 DBCC,但如果有人在生产数据库上疯狂执行

DBCC SHRINKDATABASE
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS

这很容易被追踪

SELECT t.TextData, t.ApplicationName, t.LoginName, t.StartTime
FROM sys.traces i
CROSS APPLY sys.fn_trace_gettable([path], DEFAULT) t
WHERE i.is_default = 1
    AND t.EventClass = 116 -- Audit DBCC Event
    AND t.ApplicationName IS NOT NULL
TextData                ApplicationName     LoginName    StartTime               
----------------------- ------------------  -----------  ----------------------- 
DBCC SHRINKDATABASE(1)  SSMS - Query        PC\IgorS     2016-02-10 20:03:46.307 
DBCC FREEPROCCACHE      SSMS - Query        PC\IgorS     2016-02-10 20:03:43.430 
DBCC DROPCLEANBUFFERS   SSMS - Query        PC\IgorS     2016-02-10 20:03:44.767

4. 错误和警告事件

当 SQL Server 没有足够的可用内存容量来执行查询时,某些运算符的结果将在 tempdb 中处理。当优化器错误评估预期行数时,也会观察到相同的行为。

让我们尝试编写一个会调用 tempdb 溢出的查询

SELECT *
FROM Sales.SalesOrderHeader
WHERE DueDate > ShipDate
ORDER BY OrderDate DESC

多亏了默认跟踪中的一条消息,我们可以跟踪此类查询

SELECT TOP(10)
      EventName = e.name
    , t.TextData
    , t.ApplicationName
    , t.LoginName
    , t.StartTime
    , DatabaseName = DB_NAME(t.DatabaseID)
FROM sys.traces i
CROSS APPLY sys.fn_trace_gettable([path], DEFAULT) t
JOIN sys.trace_events e ON e.trace_event_id = t.EventClass
WHERE i.is_default = 1
    AND e.category_id = 3
ORDER BY t.StartTime DESC
EventName          ApplicationName   LoginName     StartTime               DatabaseName
------------------ ----------------- ------------- ----------------------- ---------------------
Sort Warnings      SSMS - Query      PC\SergeyS    2016-02-11 13:06:44.867 AdventureWorks2012

我们可以找到执行计划并优化查询

USE AdventureWorks2012
GO
SELECT TOP(5)
      p.query_plan
    , e.[text]
    , qyery_cost = p.query_plan.value(
            '(/*:ShowPlanXML/*:BatchSequence/*:Batch/*:Statements/*:StmtSimple/@StatementSubTreeCost)[1]',
            'FLOAT'
        )
    , s.last_execution_time
    , last_exec_ms = s.last_worker_time / 1000
    , s.execution_count
FROM sys.dm_exec_query_stats s
CROSS APPLY sys.dm_exec_query_plan(s.plan_handle) p
CROSS APPLY sys.dm_exec_sql_text(s.plan_handle) e
WHERE e.[text] NOT LIKE '%sys%'
    AND s.last_execution_time >= DATEADD(MS, -2500, '2016-02-10 19:41:45.983')
    AND e.[dbid] = DB_ID()
ORDER BY s.last_execution_time
query_plan        text                                     qyery_cost  last_execution_time     last_exec_ms   execution_count
----------------- ---------------------------------------- ----------- ----------------------- -------------- ----------------
ShowPlanXML ...  SELECT * FROM Sales.SalesOrderHeader...  1,10126     2016-02-11 13:06:44.850 1538           1

顺便说一句,实际执行计划在 DMV 中不可用。只能在查询执行后显示计划(Post Query Execution Showplan)阶段通过相应的跟踪事件获得,或者通过 SET STATISTICS XML ON。从 SQL Server 2012 开始,为此目的添加了一个新的 XEventpost_query_execution_showplan

我乐于能够跟踪某些类型的警告

SELECT DISTINCT d.SalesOrderID, d.UnitPrice, h.OrderDate
FROM Sales.SalesOrderHeader h
JOIN Sales.SalesOrderDetail d ON h.SalesOrderID = d.SalesOrderID
WHERE h.DueDate > h.ShipDate

EventName        ApplicationName   LoginName     StartTime               DatabaseName
---------------- ----------------- ------------- ----------------------- ---------------------
Hash Warning     SSMS - Query      PC\SergeyS    2016-02-11 13:14:44.433 AdventureWorks2012

例如,当你忘记指定表之间的 JOIN

SELECT *
FROM Sales.Currency c, Sales.CountryRegionCurrency r
--WHERE c.CurrencyCode = r.CurrencyCode

EventName               ApplicationName   LoginName     StartTime               DatabaseName
----------------------- ----------------- ------------- ----------------------- ---------------------
Missing Join Predicate  SSMS - Query      PC\SergeyS    2016-02-11 13:18:20.507 AdventureWorks2012

或者当过滤的列没有统计信息时

SELECT DatabaseLogID
FROM dbo.DatabaseLog
WHERE PostTime = '2012-03-14 13:14:18.847'

EventName                   TextData                                   .. LoginName    StartTime               DatabaseName
--------------------------- ------------------------------------------ .. ------------ ----------------------- --------------------
Missing Column Statistics   NO STATS:([dbo].[DatabaseLog].[PostTime])  .. PC\SergeyS   2016-02-11 13:19:39.813 AdventureWorks2012

5. 对象事件

在默认跟踪中,您可以跟踪对象更改

USE [master]
GO
IF DB_ID('test') IS NOT NULL
    DROP DATABASE [test]
GO
CREATE DATABASE [test]
GO
USE [test]
GO
CREATE TABLE dbo.tbl (ID INT)
GO
ALTER TABLE dbo.tbl ADD Col VARCHAR(20)
GO
CREATE UNIQUE CLUSTERED INDEX ix ON dbo.tbl (ID)
GO
USE [master]
GO
IF DB_ID('test') IS NOT NULL
    DROP DATABASE [test]
GO
SELECT
      EventType = e.name
    , t.DatabaseName
    , t.ApplicationName
    , t.LoginName
    , t.StartTime
    , t.ObjectName
    , ObjectType =
        CASE t.ObjectType
            WHEN 8259 THEN 'Check Constraint'
            WHEN 8260 THEN 'Default Constraint'
            WHEN 8262 THEN 'Foreign Key'
            WHEN 8272 THEN 'Stored Procedure'
            WHEN 8274 THEN 'Rule'
            WHEN 8275 THEN 'System Table'
            WHEN 8276 THEN 'Server Trigger'
            WHEN 8277 THEN 'Table'
            WHEN 8278 THEN 'View'
            WHEN 8280 THEN 'Extended Stored Procedure'
            WHEN 16724 THEN 'CLR Trigger'
            WHEN 16964 THEN 'Database'
            WHEN 17222 THEN 'FullText Catalog'
            WHEN 17232 THEN 'CLR Stored Procedure'
            WHEN 17235 THEN 'Schema'
            WHEN 17985 THEN 'CLR Aggregate Function'
            WHEN 17993 THEN 'Inline Table-valued SQL Function'
            WHEN 18000 THEN 'Partition Function'
            WHEN 18004 THEN 'Table-valued SQL Function'
            WHEN 19280 THEN 'Primary Key'
            WHEN 19539 THEN 'SQL Login'
            WHEN 19543 THEN 'Windows Login'
            WHEN 20038 THEN 'Scalar SQL Function'
            WHEN 20051 THEN 'Synonym'
            WHEN 20821 THEN 'Unique Constraint'
            WHEN 21075 THEN 'Server'
            WHEN 21076 THEN 'Transact-SQL Trigger'
            WHEN 21313 THEN 'Assembly'
            WHEN 21318 THEN 'CLR Scalar Function'
            WHEN 21321 THEN 'Inline scalar SQL Function'
            WHEN 21328 THEN 'Partition Scheme'
            WHEN 21333 THEN 'User'
            WHEN 21572 THEN 'Database Trigger'
            WHEN 21574 THEN 'CLR Table-valued Function'
            WHEN 21587 THEN 'Statistics'
            WHEN 21825 THEN 'User'
            WHEN 21827 THEN 'User'
            WHEN 21831 THEN 'User'
            WHEN 21843 THEN 'User'
            WHEN 21847 THEN 'User'
            WHEN 22601 THEN 'Index'
            WHEN 22611 THEN 'XMLSchema'
            WHEN 22868 THEN 'Type'
        END
FROM sys.traces i
CROSS APPLY sys.fn_trace_gettable([path], DEFAULT) t
JOIN sys.trace_events e ON t.EventClass = e.trace_event_id
WHERE e.name IN ('Object:Created', 'Object:Deleted', 'Object:Altered')
    AND t.ObjectType != 21587
    AND t.DatabaseID != 2
    AND i.is_default = 1
    AND t.EventSubClass = 1
EventType        DatabaseName   ApplicationName      LoginName     StartTime               ObjectName    ObjectType
---------------- -------------- -------------------- ------------- ----------------------- ------------- --------------
Object:Created   test           SSMS - Query         PC\SergeyS    2016-02-11 13:36:46.727 NULL          Database
Object:Created   test           SSMS - Query         PC\SergeyS    2016-02-11 13:36:46.760 tbl           Table
Object:Altered   test           SSMS - Query         PC\SergeyS    2016-02-11 13:36:46.803 tbl           Table
Object:Created   test           SSMS - Query         PC\SergeyS    2016-02-11 13:36:46.837 ix            Index
Object:Deleted   test           SSMS - Query         PC\SergeyS    2016-02-11 13:36:56.347 NULL          Database

6. 服务器事件

此外,您可以通过默认跟踪跟踪备份和还原数据库的操作以及执行者

SELECT
    CASE WHEN t.EventSubClass = 1
        THEN 'BACKUP' 
        ELSE 'RESTORE'
    END, t.TextData, t.ApplicationName, t.LoginName, t.StartTime
FROM sys.traces i
CROSS APPLY sys.fn_trace_gettable([path], DEFAULT) t
WHERE i.is_default = 1
    AND t.EventClass = 115 -- Audit Backup/Restore Event
TextData             ApplicationName                 LoginName   StartTime
-------------------- ------------------------------- ----------- -------------------------
BACKUP DATABASE      SSMS - Query                    PC\SergeyS  2016-01-21 13:05:26.390
RESTORE DATABASE     dbForge Studio for SQL Server   PC\SergeyS  2016-01-22 12:46:45.717
BACKUP DATABASE      SQLCMD                          sa          2016-01-24 10:16:40.317

或跟踪 SQL Server 的内存使用情况

SELECT
      t.StartTime
    , [Type] = CASE WHEN EventSubClass = 1 THEN 'UP' ELSE 'DOWN' END
    , t.IntegerData
FROM sys.traces i
CROSS APPLY sys.fn_trace_gettable([path], DEFAULT) t
WHERE t.EventClass = 81 -- Server Memory Change
    AND i.is_default = 1
StartTime                Type  IntegerData
------------------------ ----- -----------
2016-02-10 02:52:42.887  UP    191
2016-02-10 02:53:00.640  UP    371
2016-02-10 02:53:34.917  UP    734
2016-02-10 02:53:52.140  UP    916
2016-02-10 10:05:00.027  DOWN  736
2016-02-10 10:17:17.417  UP    921
2016-02-10 11:52:14.930  DOWN  735
2016-02-10 12:00:32.577  DOWN  553
2016-02-10 13:06:11.540  UP    751
2016-02-10 15:11:10.487  UP    936
2016-02-10 15:15:26.107  DOWN  714

Postscript

默认跟踪是一个很好的机会来监控服务器健康状况。当然,它有很多缺点,例如限制为 20MB,以及微软宣布自 SQL Server 2008 起此功能已被 弃用。然而,在诊断与 SQL Server 相关的问题的初始阶段,默认跟踪有时非常有用。

所有内容均在 Microsoft SQL Server 2012 (SP3) (KB3072779) - 11.0.6020.0 (X64) 上进行了测试。
所有执行计划均来自 dbForge Studio for SQL Server

© . All rights reserved.