Thursday, 01 August 2013 10:16

Start-DatabaseAvailabilityGroup Fails to Add Cluster Nodes

Recently I was testing a datacenter failover (DR) of a Database Availability Group and ran into a few errors when adding the DAG members back after a successful failover. During the failback, I could add one of the two servers in the primary datacenter but not the other. The Start-DatabaseAvailabilityGroup cmdlet failed with the error pasted below:

WARNING: Server 'EX-01' failed to be started as a member of database availability group 'DAG-01'. Error: A server-side database availability group administrative operation failed. Error: The operation failed. CreateCluster errors may result from incorrectly configured static addresses. Error: An error occurred while attempting a cluster operation. Error: Node ex2010-01 is already joined to a cluster. [Server: Ex-DR.fabrikam.com]
WARNING: The operation wasn't successful because an error was encountered. You may find more details in log file "C:\ExchangeSetupLogs\DagTasks\dagtask_2013-07-30_06-10-35.992_start-databaseavailabilitygroup.log".
Start-DatabaseAvailabilityGroup failed to start server(s) 'EX-01' in database availability group 'DAG-01'.
   + CategoryInfo         : InvalidArgument: (:) [Start-DatabaseAvailabilityGroup], FailedToStartNodeException
   + FullyQualifiedErrorId : 811A6BB8,Microsoft.Exchange.Management.SystemConfigurationTasks.StartDatabaseAvailabilityGroup

Other symptoms and observations:

  1. The Cluster service fails frequently on the EX-01 server.
  2. The Start-DatabaseAvailabilityGroup cmdlet keeps failing.
  3. No network settings were changed since Stop-DatabaseAvailabilityGroup was run.
  4. Get-ClusterNode shows no state entry (Up, Joining, etc.) for EX-01.

This means EX-01 was cleanly evicted from the cluster and has no stale entry in the cluster configuration.

How to Fix:

When I checked the Exchange version of the DAG nodes, I found that the node that failed to join was at a higher rollup level than the others, although at the same service pack level. I raised the other nodes (EX-DR and EX-02) to the same rollup level as EX-01 (the server that failed to join), and Start-DatabaseAvailabilityGroup then ran without any issues. From this I conclude that when you add a DAG node with Start-DatabaseAvailabilityGroup, the PAM (Primary Active Manager) should be at the same or a higher rollup level than the node being added.

So, if you are facing a similar error, check the following:

  1. All DAG nodes are running the same service pack and rollup level.
  2. The OS patch levels of the nodes do not differ much.
  3. The NIC settings match those of the other nodes.
  4. The DAG node you are trying to add is listed in the StoppedMailboxServers list.
  5. The cluster node is not in the “Joining” state in the cluster configuration (use the Get-ClusterNode cmdlet).

The first three checks are easy to follow; checks 4 and 5 are explained below.
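For the first check, one possible way to compare build levels across the DAG members is sketched below. It is not the exact procedure used in this post. It assumes the Exchange Management Shell, PowerShell remoting enabled on every member, ExSetup.exe being on the PATH of each server, and that the ExSetup.exe file version reflects the installed rollup. The DAG name 'DAG-01' is taken from the error output above.

```powershell
# Sketch only: list the ExSetup.exe product version on every DAG member.
# Assumptions: Exchange Management Shell, PowerShell remoting enabled,
# ExSetup.exe resolvable through PATH on each server.
$names = (Get-DatabaseAvailabilityGroup -Identity 'DAG-01').Servers |
    ForEach-Object { $_.Name }

Invoke-Command -ComputerName $names -ScriptBlock {
    # FileVersionInfo comes from the ExSetup.exe binary on the remote server
    $info = (Get-Command ExSetup.exe).FileVersionInfo
    New-Object PSObject -Property @{
        Server         = $env:COMPUTERNAME
        ProductVersion = $info.ProductVersion
    }
} | Sort-Object Server | Format-Table Server, ProductVersion -AutoSize
```

If the versions differ between the node being added and the other members, bring them to the same level before retrying.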

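How to verify the current started and stopped mailbox servers in a DAG:

Run the cmdlet below; the wildcards select the StartedMailboxServers, StoppedMailboxServers, PrimaryActiveManager, and OperationalServers properties: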

Get-DatabaseAvailabilityGroup -Status | fl st*,pr*,op*

[Figure: DAG status - started servers]

Once you confirm that the server you are trying to add back to the DAG is listed in StoppedMailboxServers, you can safely run Start-DatabaseAvailabilityGroup. If the server is already in the StartedMailboxServers list, you might have to stop it before you can add it back. The configuration might have been updated by earlier runs of the Start-DatabaseAvailabilityGroup cmdlet.

If you do run Stop-DatabaseAvailabilityGroup, do not forget to run the Set-DatabaseAvailabilityGroup cmdlet afterwards so that the cluster properties are set correctly.
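As a rough sketch of that stop-and-reset sequence, assuming the names from the error output above (DAG-01, EX-01): the use of -ConfigurationOnly here is my assumption for the case where only the DAG configuration in Active Directory needs correcting; drop it if the server should actually be stopped in the cluster as well.

```powershell
# Sketch only; DAG-01 and EX-01 are the names from the error output above.
# Mark EX-01 as stopped in the DAG configuration (Active Directory) only.
Stop-DatabaseAvailabilityGroup -Identity 'DAG-01' -MailboxServer 'EX-01' -ConfigurationOnly

# Re-apply the DAG settings so the cluster properties are updated.
Set-DatabaseAvailabilityGroup -Identity 'DAG-01'

# Confirm that EX-01 now appears under StoppedMailboxServers.
Get-DatabaseAvailabilityGroup -Identity 'DAG-01' -Status |
    Format-List Name, StartedMailboxServers, StoppedMailboxServers
```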

How to verify the node status in the cluster configuration:

Run Get-ClusterNode to list all the nodes in the DAG's cluster. If the node you are trying to add is listed there with a state other than “Up”, you might have to remove it forcefully before running the Start-DatabaseAvailabilityGroup cmdlet.
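A minimal sketch of this check, assuming it is run on a server that is currently a member of the cluster and has the FailoverClusters module available. The eviction line is left commented out because it is destructive; the node name EX-01 is the one from the error output above.

```powershell
# Sketch only: run on a current cluster member.
Import-Module FailoverClusters

# List every node known to the cluster together with its state.
Get-ClusterNode | Format-Table Name, State -AutoSize

# Only if EX-01 is listed with a state other than Up (for example Joining),
# evict the stale entry before retrying Start-DatabaseAvailabilityGroup:
# Remove-ClusterNode -Name 'EX-01' -Force
```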

Final Step:

After these checks, you should be able to run the Start-DatabaseAvailabilityGroup cmdlet successfully, as shown below:

[Figure: DAG status - start and status after]
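In command form, the final step looks roughly like this, again using the names from the error output (DAG-01, EX-01) and assuming the Exchange Management Shell:

```powershell
# Sketch only: add EX-01 back to the DAG, then confirm the result.
Start-DatabaseAvailabilityGroup -Identity 'DAG-01' -MailboxServer 'EX-01'

# EX-01 should now appear under StartedMailboxServers.
Get-DatabaseAvailabilityGroup -Identity 'DAG-01' -Status |
    Format-List Name, StartedMailboxServers, StoppedMailboxServers
```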

Share your comments, and any errors you run into.

-Praveen

 

 

 
