I’ve built a number of clusters before, so I have a fair amount of experience. Today I ran into the oddest issue I’ve ever had. I built the Microsoft Cluster and tested failover, making sure the drives would fail over to the passive server. All was good, so I went ahead and installed SQL Server. The install went well, so I tested failover again.
Well, boy, did I ever get a surprise. The Quorum and MSDTC drives failed over correctly, but the SQL Data and Log drives did not. The passive server that became active could see the drives but could not access them. In Cluster Administrator the drives failed to move over and eventually failed back, so the issue was with the passive server. Scratching my head, I tried some quick Google searches but didn’t find anything (most likely because I was frustrated and impatient). I did manage to fix the problem, though. Here is what I did:
- In Cluster Administrator, remove the data and log disk dependencies from the SQL resources that depend on them.
- Delete both the data and log drive resources.
- Recreate both resources (the active server should still see both drives).
- Reboot the passive server.
- After the reboot, test failover again.
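The order of the first three steps can be sketched in a few lines of Python. This is only a toy in-memory model, not a real cluster API: the `ToyCluster` class, its methods, and the resource names are all made up for illustration, and it assumes the cluster refuses to delete a resource that other resources still depend on. The post does not say whether the dependencies were re-added afterwards, so the sketch leaves them off.

```python
class ToyCluster:
    """Toy in-memory model of cluster resources; NOT a real cluster API."""

    def __init__(self):
        # Maps a resource name to the set of resource names it depends on.
        self.deps = {}

    def add(self, name, depends_on=()):
        self.deps[name] = set(depends_on)

    def dependents_of(self, name):
        # Every resource that lists `name` as one of its dependencies.
        return sorted(r for r, d in self.deps.items() if name in d)

    def remove_dependency(self, resource, dependency):
        self.deps[resource].discard(dependency)

    def delete(self, name):
        # Assumption: deletion is blocked while anything depends on `name`.
        blockers = self.dependents_of(name)
        if blockers:
            raise RuntimeError(f"{name} is still required by {blockers}")
        del self.deps[name]


def recreate_disks(cluster, disks):
    """Drop dependencies on each disk, delete it, then recreate it."""
    for disk in disks:
        for resource in cluster.dependents_of(disk):
            cluster.remove_dependency(resource, disk)
        cluster.delete(disk)
    for disk in disks:
        cluster.add(disk)


# Hypothetical resource names, chosen to match the post.
cluster = ToyCluster()
cluster.add("SQL Data")
cluster.add("SQL Log")
cluster.add("SQL Server", depends_on=["SQL Data", "SQL Log"])

# Deleting a disk first fails in this model, which is why step one comes first.
try:
    cluster.delete("SQL Data")
    delete_blocked = False
except RuntimeError:
    delete_blocked = True

recreate_disks(cluster, ["SQL Data", "SQL Log"])
```

In this model, going straight to the delete raises an error, while `recreate_disks` succeeds because it clears the dependencies first. The reboot of the passive server and the failover test are outside what the sketch covers.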
All was good after that, but I’ve never seen that happen before. I have to admit I was worried that I would have to rebuild everything. I’m not sure why the Data and Log disk resources had issues when the Quorum and MSDTC disk resources did not, but in any case this worked for me.