How to Skip Disk Checking On Azure Virtual Machines

  • Post last modified:September 3, 2023

In this article, we are going to look at how to skip disk checking on Azure virtual machines.

You may have found yourself in a difficult position. You have a critical workload running on an Azure virtual machine, the machine has rebooted and it has started a disk check with an ETA of 2 hours. According to your disaster recovery policy, 2 hours is not a long enough outage to start the disaster recovery process, but it is still an almost unmanageable amount of downtime for this critical workload. Putting aside what could have been done to prevent this outage, you must recover your workload quickly.

How does disk checking work?

The check disk utility (CHKDSK) usually runs when your operating system boots if a volume has been flagged as having NTFS issues or corruption. Its purpose is to resolve these NTFS issues to allow a clean system boot and prevent further damage.

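By default, CHKDSK works by verifying the file system's metadata (the records that describe your files and folders), not by reading every sector on your system disk. It will then attempt to repair any damage it finds. A full scan of every sector only happens when CHKDSK is run with the /r switch.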
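
To make the trigger a little more concrete: Windows records the need for a boot-time check in a per-volume "dirty" flag, which the command fsutil dirty query C: reports from an elevated prompt. The following is a minimal Python sketch, not an official tool, that interprets the text that command prints. It assumes the English-language wording ("is Dirty" / "is NOT Dirty"), and the function name is hypothetical.

```python
def is_volume_dirty(fsutil_output: str) -> bool:
    """Interpret the output of `fsutil dirty query <volume>`.

    Assumption: English-language Windows, where the output reads
    "Volume - C: is Dirty" or "Volume - C: is NOT Dirty".
    """
    text = fsutil_output.strip().lower()
    # Check the negative phrase first, otherwise "is dirty" could never
    # be told apart from "is not dirty" by a looser match.
    if "is not dirty" in text:
        return False
    if "is dirty" in text:
        return True
    raise ValueError(f"Unrecognised fsutil output: {fsutil_output!r}")


if __name__ == "__main__":
    sample = "Volume - C: is Dirty"  # sample text, not captured from a real VM
    print(is_volume_dirty(sample))
```

A volume that reports as dirty is the one you should expect to be checked on the next boot.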

What causes a disk check to run on boot?

As mentioned above, the purpose of the disk check (CHKDSK) utility is to repair file system damage. A disk check is therefore initiated when an event occurs that could have corrupted the file system. Common events that can cause a disk check include:

  • An unclean system shutdown. This includes loss of power, a forced shutdown or an interrupted shutdown.
  • Failing storage devices that result in corruption of system files.
  • Malware infections that damage or alter system files.

Why is disk checking important?

Disk checking not only looks for data corruption, it can also look for physical bad sectors on the disk, which indicate permanent or unfixable issues. Data corruption that can be fixed is called a ‘soft’ bad sector: the data was simply written badly and is usually recoverable. A physical error is called a ‘hard’ bad sector and usually indicates physical damage to the disk, in which case a replacement is usually necessary.

In our scenario, however, Azure virtual machine operating systems are stored on virtual disks, which means we do not manage the physical disks (in most scenarios). So if a disk check does run, it is best to leave it to repair what it finds.

In some scenarios, you may find that the disk check or repair is stuck in a permanent loop, in which case the next section covers what to do.

What are my options if my virtual machine boots into a long or recurring disk check?

In the worst case, you may find one of your Azure virtual machines stuck in a permanent disk check loop, where each time the repair finishes, it reboots straight back into another repair. Alternatively, you may find your VM has booted into a repair with an ETA of more than a couple of hours, which is not a manageable amount of downtime for your workload.

Option 1 – Hard Reboot

If your virtual machine is stuck in a long disk check, you may want to first try a hard power off of the virtual machine from the Azure portal, followed by a fresh start. Before you do, ensure you have a successful recent backup and that whoever you report to is aware of the risks, because interrupting a repair can leave the file system in a worse state. Hopefully, if this does happen to you, it happens first thing in the morning and you have a successful backup from the night before.
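
If you would rather script this than click through the portal, the same idea can be sketched with the Azure CLI. The Python below only builds the two command lines (an immediate power off using az vm stop with --skip-shutdown, then az vm start) and prints them; it does not run anything. The resource group and VM names are placeholders, and the helper name is hypothetical.

```python
import shlex


def hard_restart_commands(resource_group: str, vm_name: str) -> list[list[str]]:
    """Build Azure CLI argument lists for a hard power off followed by a start.

    Nothing is executed here; pass each list to subprocess.run() yourself
    once you are happy with it and have confirmed your backup.
    """
    target = ["--resource-group", resource_group, "--name", vm_name]
    return [
        # --skip-shutdown powers the VM off immediately instead of asking
        # the guest OS to shut down gracefully.
        ["az", "vm", "stop", *target, "--skip-shutdown"],
        ["az", "vm", "start", *target],
    ]


# "rg-prod" and "vm-app01" are placeholder names, not values from this article.
for argv in hard_restart_commands("rg-prod", "vm-app01"):
    print(shlex.join(argv))
```

Keeping the commands as lists rather than one long string avoids quoting problems if a resource name ever contains a space.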

Option 2 – Recovery Virtual Machine

Another step you can take is to create an additional, temporary virtual machine within your Azure tenant. You can then attach the OS disk of your problematic virtual machine to it and perform any maintenance or repair work on the operating system. Creating the virtual machine, attaching the disk and troubleshooting may take a little time, but at least you will be ready should this occur again. Take a look at the following Microsoft doc for instructions on how to do this: https://docs.microsoft.com/en-us/troubleshoot/azure/virtual-machines/troubleshoot-recovery-disks-portal-windows#manually-attach-a-failed-os-disk-to-a-repair-vm
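
As one example of the repair work you might do from the recovery VM: once the problem disk is attached as a data disk, it shows up under its own drive letter and you can run CHKDSK against it while the broken operating system is offline. The Python sketch below just builds that command line. It assumes the attached disk is not C: (normally the recovery VM's own OS disk), the drive letter F is purely illustrative, and the helper name is hypothetical.

```python
import string


def offline_chkdsk_command(drive_letter: str, force_dismount: bool = True) -> list[str]:
    """Build a CHKDSK command for a disk attached to a recovery VM.

    /f asks CHKDSK to fix the errors it finds; /x forces the volume to
    dismount first if something is holding it open.
    """
    letter = drive_letter.strip().rstrip(":").upper()
    if len(letter) != 1 or letter not in string.ascii_uppercase:
        raise ValueError(f"Not a drive letter: {drive_letter!r}")
    if letter == "C":
        # Assumption: C: is the recovery VM's own OS disk, so refuse it.
        raise ValueError("C: is normally the recovery VM's own OS disk")
    argv = ["chkdsk", f"{letter}:", "/f"]
    if force_dismount:
        argv.append("/x")
    return argv


# "F" is a placeholder; check Disk Management for the real letter.
print(" ".join(offline_chkdsk_command("F")))
```

Run the resulting command from an elevated prompt on the recovery VM, then detach the disk and swap it back onto the original virtual machine.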

Option 3 – Restore from Azure Backup

The last option is to restore from Azure Backup. Even if you use a third-party backup solution, you should still be utilising Azure Backup, even if only for the instant recovery snapshot feature. In this scenario, you should restore from the latest instant recovery snapshot.
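
Picking "the latest" restore point is easy to get wrong under pressure, so here is a small Python sketch of that step. It assumes you have already exported the recovery points as JSON (for example from az backup recoverypoint list) and that each entry carries a name and a properties.recoveryPointTime timestamp; treat those field names as an assumption and check them against your own output. The sample data is made up, and the sketch does not filter by snapshot tier.

```python
from datetime import datetime


def latest_recovery_point(points: list[dict]) -> dict:
    """Return the most recent recovery point from a list of dicts.

    Assumed shape of each entry (verify against your own JSON):
    {"name": "...", "properties": {"recoveryPointTime": "<ISO 8601>"}}
    """
    if not points:
        raise ValueError("No recovery points available")

    def taken_at(point: dict) -> datetime:
        stamp = point["properties"]["recoveryPointTime"]
        # Older Python versions do not accept a trailing "Z", so normalise it.
        return datetime.fromisoformat(stamp.replace("Z", "+00:00"))

    return max(points, key=taken_at)


# Made-up sample data for illustration only.
sample_points = [
    {"name": "rp-001", "properties": {"recoveryPointTime": "2023-09-01T23:00:05+00:00"}},
    {"name": "rp-002", "properties": {"recoveryPointTime": "2023-09-02T23:00:07+00:00"}},
]
print(latest_recovery_point(sample_points)["name"])
```

The name it returns is the value you would then hand to your restore job.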

Summary

Thank you for taking the time to read my post on how to skip disk checking on Azure virtual machines when it is stuck in a loop or simply taking a long time. If you found this helpful, please sign up to our mailing list to be notified when a new post is released.

Daniel Bradley

My name is Daniel Bradley and I work with Microsoft 365 and Azure as an Engineer and Consultant. I enjoy writing technical content for you and engaging with the community. All opinions are my own.
