Exchange Cluster IP address goes walkabout

A power outage at the weekend caused unscheduled shutdowns on our servers. All came back OK and were working fine. However, our backup of the Exchange 2010 server failed because Backup Exec could not connect to the database.
We have a DAG set up to a DR site. We fail over manually if required and when we need to do maintenance, and it all works rather well. A DAG has an IP address assigned to it, which actually belongs to the Windows Cluster service; if the servers are in different subnets, the address has to be different depending on which server is active.
I believe this is controlled by the Cluster service, and Exchange then updates DNS to point to the new server when you move to a different active server. Unfortunately, despite our not having automatic failover, the Cluster service moved the IP address. Exchange, however, did not update DNS because the active DB had not moved. Bringing the server back up didn't change anything.
So we now had the cluster IP address in the DR site and the DB on the production site. Clients could still connect to the DB OK, but BE was looking up the cluster IP address in DNS and could not connect to the DB to back it up, as the IP address is not bound to a NIC like a normal IP address. You can see it if you run ipconfig /all on the active server; it shouldn't be on the passive server.
After rebooting servers and moving the active DB to the DR site and back, there was no change in the cluster IP address.
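
To see where the cluster IP address has ended up, something like the following may help. It is only a sketch: it assumes Windows Server 2008 R2 with the FailoverClusters PowerShell module available, and 'DAG1' is a hypothetical DAG (cluster) name, so substitute your own.

```powershell
# Sketch only: 'DAG1' is a placeholder for your DAG/cluster name.
Import-Module FailoverClusters

$dag = 'DAG1'

# Which node currently owns the core cluster group (and with it the cluster IP)?
Get-ClusterGroup -Cluster $dag -Name 'Cluster Group' |
    Format-List Name, OwnerNode, State

# List the IP address resources in the cluster and their state.
Get-ClusterResource -Cluster $dag |
    Where-Object { $_.ResourceType -like '*IP Address*' } |
    Format-Table Name, State, OwnerGroup, OwnerNode -AutoSize

# What address does DNS currently return for the DAG name?
[System.Net.Dns]::GetHostAddresses($dag)
```

If the owner node is the DR server while the active DB is on the production server, you are in the same state we were in.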

There were also some of these errors in the event log:


Log Name:      System
Source:        Microsoft-Windows-FailoverClustering
Date:          6/18/2010 2:02:41 PM
Event ID:      1069
Task Category: Resource Control Manager
Level:         Error
Keywords:     
User:          SYSTEM
Computer:      node1.company.com
Description: Cluster resource ‘IPv4 DHCP Address 1 (Cluster Group)’ in clustered service or application ‘Cluster Group’ failed.
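
If you want to check for the same error on your own servers, a query along these lines should pull them out of the System log. This is a rough sketch rather than anything we ran at the time; the computer name is the one from the event above, so change it to one of your DAG members.

```powershell
# Sketch: list recent FailoverClustering resource-failure events (ID 1069)
# from the System log. 'node1.company.com' is taken from the event above;
# substitute your own node name.
$filter = @{
    LogName      = 'System'
    ProviderName = 'Microsoft-Windows-FailoverClustering'
    Id           = 1069
}

Get-WinEvent -ComputerName 'node1.company.com' -FilterHashtable $filter -MaxEvents 20 |
    Select-Object TimeCreated, Id, Message |
    Format-List
```

Note that Get-WinEvent reports an error rather than returning nothing if no matching events exist.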

After much Googling, head scratching and deciphering of PowerShell commands, we decided to set the DAG to use only the Production site IP address and then add the DR site address back in, using the following PowerShell commands.


Set-DatabaseAvailabilityGroup -Identity "Database_name" -DatabaseAvailabilityGroupIpAddresses "Prod_IP_Address"

Get-DatabaseAvailabilityGroup -identity "Database_name" | fl *ip*

DatabaseAvailabilityGroupIpv4Addresses : {Prod_IP_Address}
DatabaseAvailabilityGroupIpAddresses   : {Prod_IP_Address}


Set-DatabaseAvailabilityGroup -Identity "Database_name" -DatabaseAvailabilityGroupIpAddresses "Prod_IP_Address","DR_IP_Address"

Get-DatabaseAvailabilityGroup -identity "Database_name" | fl *ip*

DatabaseAvailabilityGroupIpv4Addresses : {Prod_IP_Address, DR_IP_Address}
DatabaseAvailabilityGroupIpAddresses   : {Prod_IP_Address, DR_IP_Address}


 
This worked as hoped, and BE can do its stuff once more.

If you have ever wondered what to do with MS OneNote, I use it to store every script I ever use for future reference: one tab at the top for each subject and one page for each script. It is very handy.

Comments

  1. I had the same issue today and found another solution.

    See the issue:
    cluster DAG1.domain.com group

    Resolve the issue:
    cluster.exe DAG1.domain.com group "cluster group" /moveto:prodserver

    http://blogs.technet.com/b/timmcmic/archive/2010/05/25/exchange-2010-cluster-core-resources-the-replication-service-and-active-manager.aspx
    http://blogs.technet.com/b/timmcmic/archive/2010/01/18/do-i-need-to-move-my-cluster-core-resources.aspx
