Note: This applies largely to ring buffers, not trace files.
The last few weeks I've been capturing application-level database timeouts and dipped my toe into Extended Events to do so. Knowing that the default ADO.NET query timeout is 30s I decided to capture queries taking >= 27s to run and that finished with an error.
Note: This was done on SQL Server 2012/2014, and I also read that the order of the Where clause is important to improve performance which is why I've put the result check first.
If Exists (
Select Top 1 0
From sys.dm_xe_sessions
Where name = 'Query_Timeouts')
Begin
Alter Event Session [Query_Timeouts] On Server State = Stop
Drop Event Session [Query_Timeouts] On Server
End
Go
CREATE EVENT SESSION [Query_Timeouts] ON SERVER
ADD EVENT sqlserver.rpc_completed(
ACTION(sqlserver.client_app_name,sqlserver.client_hostname,sqlserver.database_name,sqlserver.plan_handle,sqlserver.query_hash,sqlserver.query_plan_hash,sqlserver.server_principal_name,sqlserver.session_id)
WHERE ([result]<>(0) AND [duration]>=(27000000))),
ADD EVENT sqlserver.sql_batch_completed(
ACTION(sqlserver.client_app_name,sqlserver.client_hostname,sqlserver.database_name,sqlserver.plan_handle,sqlserver.query_hash,sqlserver.query_plan_hash,sqlserver.server_principal_name,sqlserver.session_id)
WHERE ([result]<>(0) AND [duration]>=(27000000)))
ADD TARGET package0.ring_buffer(SET max_events_limit=(1000))
WITH (MAX_MEMORY=1024 KB,EVENT_RETENTION_MODE=ALLOW_MULTIPLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=OFF)
GO
ALTER EVENT SESSION [Query_Timeouts] ON SERVER STATE = START
GO
I needed some way to test this would pick up the kind of timeouts I was looking for and so wrote a little PowerShell script which will populate a hundred or so in 5m.
for ($i = 0; $i -lt 10; $i++) {
Start-Job {
for ($i = 0; $i -lt 10; $i++) {
Invoke-SqlCmd -Query "Waitfor Delay '00:01:00'" -ServerInstance .\SQL2014 -QueryTimeout 30 -ErrorAction:Continue
}
}
}
while (Get-Job -State Running) {
Start-Sleep -Seconds 1
}
Get-Job | Remove-Job
I don't pretend to have read the SQL Server XML books (who has the time…) but I can hack my way through and plenty of others have posted the code to shred an Extended Event. So I started shredding away and this is where things started going terribly wrong.
Note: I capture a little more in the event than I extract out just in case I need it later.
Select Switchoffset(c.value('(@timestamp)[1]', 'datetimeoffset'), Datepart(Tzoffset, Sysdatetimeoffset())) As timestamp_adjusted,
c.value('(@name)[1]', 'varchar(50)') As event_name,
c.value('(@package)[1]', 'varchar(50)') As package_name,
c.value('(action[@name = ''client_hostname'']/value)[1]', 'varchar(50)') As client_hostname,
c.value('(action[@name = ''server_principal_name'']/value)[1]', 'varchar(50)') As server_principal_name,
c.value('(action[@name = ''database_name'']/value)[1]', 'varchar(50)') As database_name,
Coalesce(c.value('(data[@name = ''batch_text'']/value)[1]', 'varchar(max)'), c.value('(data[@name = ''statement'']/value)[1]', 'varchar(max)')) As query,
c.value('(data[@name = ''result'']/text)[1]', 'varchar(50)') As result,
c.value('(data[@name = ''duration'']/value)[1]', 'bigint') / 1000000.0 As duration_s
From (
Select Cast(target_data As Xml) As target_data
From sys.dm_xe_sessions s
Join sys.dm_xe_session_targets st
On s.address = st.event_session_address
Where s.name = 'Query_Timeouts'
And st.target_name = 'ring_buffer'
) a
Cross Apply
target_data.nodes('RingBufferTarget/event') As b(c)
Performance is abysmal and takes 6 minutes to run in my VM (on a real server with more cores it was still taking 2-3 minutes to run). Unacceptable!
After reading numerous complaints online about slow XML I started playing around with some of the suggestions:
- Put the XML into a variable.
- Shred the XML rows into a temp table.
I found that I only received performance benefits when I used both together. This is what I ended up using:
If Object_Id('tempdb..#Xml') Is Not Null
Drop Table #Xml
Create Table #Xml (
[Id] Int Identity(1, 1) Primary Key,
[Event] Xml Not Null
)
Declare @Xml Xml
Select @Xml = Cast(target_data As Xml)
From sys.dm_xe_sessions s
Join sys.dm_xe_session_targets st
On s.address = st.event_session_address
Where s.name = 'Query_Timeouts'
And st.target_name = 'ring_buffer'
Insert #Xml (Event)
Select c.query('.') As Event
From @Xml.nodes('RingBufferTarget/event') As b(c)
-- Option (Optimize For (@Xml = Null)) -- Allegedly useful for SQL 2008
Select Switchoffset(Event.value('(event/@timestamp)[1]', 'datetimeoffset'), Datepart(Tzoffset, Sysdatetimeoffset())) As timestamp_adjusted,
Event.value('(event/@name)[1]', 'nvarchar(50)') As event_name,
Event.value('(event/action[@name = ''client_hostname'']/value)[1]', 'varchar(32)') As client_hostname,
Event.value('(event/action[@name = ''server_principal_name'']/value)[1]', 'varchar(128)') As server_principal_name,
Nullif(Event.value('(event/action[@name = ''plan_handle'']/value)[1]', 'varchar(88)'), Replicate('0', 88)) As plan_handle,
query_hash, -- These need to be converted later
query_plan_hash, -- These need to be converted later
Event.value('(event/action[@name = ''database_name'']/value)[1]', 'nvarchar(128)') As database_name,
Coalesce(Event.value('(event/data[@name = ''batch_text'']/value)[1]', 'nvarchar(max)'), Event.value('(event/data[@name = ''statement'']/value)[1]', 'nvarchar(max)')) As query,
Event.value('(event/data[@name = ''result'']/text)[1]', 'nvarchar(50)') As result,
Event.value('(event/data[@name = ''duration'']/value)[1]', 'bigint') / 1000000.0 As duration_s
From #Xml
Outer Apply (
Select Nullif(Event.value('(event/action[@name = ''query_hash'']/value)[1]', 'decimal(20)'), 0) As query_hash,
Nullif(Event.value('(event/action[@name = ''query_plan_hash'']/value)[1]', 'decimal(20)'), 0) As query_plan_hash
) a
Order By 1
That's an improvement of 6m down to 1s. I'd sure like to know how to improve performance of the first XML query without all the additional code though.