When did my Availability Group change primary replica states?

***UPDATE*** Script has been modified so that you don’t have to know the default location of your error log.
Within the first two weeks of coming to my new employer, I was tasked with developing a method of tracking and monitoring the which of the replicas in any given Availability Group was the primary and send a notification when the primary changed.

In order to quickly and simply detect a change, I connected to the Availability Group and query the sys.dm_hadr_availability_group_states DMV. If the primary replica in the DMV is the same as what’s in our CMDB, then we do nothing. If it’s different, make the requisite updates and send a page to the team.

What it doesn’t tell us is when the actual event occurred. We run this process every 15 minutes so that’s as granular as we could get if that was the only tool we used. However, SQL Server 2012 also includes an Extended Event Session called “AlwaysOn_health” that uses an event file target that we can query. We can leverage this session to figure out when a replica state change event occurred.

First, we need to determine the file name to which the Event Session is writing. Instead of searching every drive for a pattern like ‘AlwaysOn*_.xel’, we can use the sys.dm_xe_sessions and sys.dm_xe_session_targets DMVs.

With our file name in hand, we can query the event file:

    DECLARE @FileName NVARCHAR(4000)
    SELECT @FileName =
	target_data.value('(EventFileTarget/File/@name)[1]','nvarchar(4000)')
	FROM (
			SELECT
				CAST(target_data AS XML) target_data
				FROM sys.dm_xe_sessions s
				JOIN sys.dm_xe_session_targets t
					ON s.address = t.event_session_address
				WHERE s.name = N'AlwaysOn_health'
		) ft

	SELECT
		XEData.value('(event/@timestamp)[1]','datetime2(3)') AS event_timestamp,
		XEData.value('(event/data/text)[1]', 'VARCHAR(255)') AS previous_state,
		XEData.value('(event/data/text)[2]', 'VARCHAR(255)') AS current_state,
		ar.replica_server_name
	FROM (
		SELECT CAST(event_data AS XML) XEData, *
			FROM sys.fn_xe_file_target_read_file(@FileName, NULL, NULL, NULL)
			WHERE object_name = 'availability_replica_state_change'
		 ) event_data
	JOIN sys.availability_replicas ar
		ON ar.replica_id = XEData.value('(event/data/value)[5]', 'VARCHAR(255)')
	ORDER BY event_timestamp DESC;

If that query is executed against the Availability Group (by logging into the AG and not a node), you’ll get the events that have occurred on the node that is currently serving as the primary replica.

event_timestamp previous_state current_state replica_server_name
2013-08-02 17:10:33.106 PRIMARY_PENDING PRIMARY_NORMAL ag-server-node
2013-08-02 17:10:32.857 RESOLVING_NORMAL PRIMARY_PENDING ag-server-node
2013-08-02 17:10:32.705 RESOLVING_PENDING_FAILOVER RESOLVING_NORMAL ag-server-node
2013-08-02 17:10:20.794 SECONDARY_NORMAL RESOLVING_PENDING_FAILOVER ag-server-node

So starting at 5:10:20.794 PM, on August 2nd, 2013, this node (which is now primary. Remember, we know this because we’re logged in to the Availability Group which is running off of the primary replica) began its transition from secondary to primary replica. We see the course of action that it took which led to it ultimately settling in as PRIMARY_NORMAL its current state at 05:10:33.106 PM.

We could run the same query on any of the other replicas and see their histories as well, particularly the replica that transitioned from primary to secondary.

Advertisements

4 thoughts on “When did my Availability Group change primary replica states?

  1. i get this error when executing

    Msg 25718, Level 16, State 3, Line 13
    The log file name “(null)” is invalid. Verify that the file exists and that the SQL Server service account has access to it.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s