Thursday, January 31, 2008

Fixing fatal FAT32 problem

I'm rebuilding an old-ish PC that's on its way from one family member to another. The PC is one of those that has the OS in a restore partition on the hard disk, rather than shipping with a CD. OK, no problem to re-install from the restore partition after performing an NTFS reformat on the main data partition. The problem I had was that the D: partition, which holds the restore files, had got some corrupted files – not the OS restore files, fortunately (then I would have been dead in the water), but the previous owner had managed to add some photos to D:, and these files were corrupted. Why a system restore partition is not read-only I have no idea.

OK, so disk problem. Not normally an issue: just run the 'error checking' tool from the Tools tab of the disk's properties dialog, then reboot. The disk check runs during the boot phase, so that it can get low-level access to the drive. Except that, in this case, the disk check would get about 90% complete, then some low-level fault in the file system would cause the machine to reboot, without marking the disk as clean. Indeed, the file system wasn't clean, as the dead files were still there. This happened on every boot unless I manually told the disk checker to skip the check, which only deferred it to the next boot. I didn't want to hand over the machine in this state.

Fortunately, there was a solution. I downloaded the latest version of Knoppix Linux, and burnt a boot CD. Knoppix is a "live" Linux: it runs from a CD without installing any files on the hard drive. Once in Knoppix, I launched a root shell and ran the FAT32 flavour of fsck on the dodgy partition, which appeared as /dev/hda1 on my system:

$ dosfsck -a -t -w -V /dev/hda1
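
If you'd rather see what dosfsck would change before letting it write anything, something like the following sketch might help. It is not what I ran at the time: the DEVICE and DRY_RUN variables and the run and check_and_repair helpers are names made up for illustration, and by default the script only prints the commands. The -n flag asks dosfsck to check without writing; the second command is the repair from above (-a repair automatically, -t mark unreadable clusters as bad, -w write changes immediately, -V do a verification pass).

```shell
#!/bin/sh
# Sketch only: preview, then repair, a FAT32 partition with dosfsck.
# DEVICE, DRY_RUN, run and check_and_repair are illustrative names,
# not part of dosfsck or of the original procedure.

DEVICE="${DEVICE:-/dev/hda1}"   # the partition from this post; yours may differ
DRY_RUN="${DRY_RUN:-1}"         # 1 = print the commands instead of running them

# Run a command, or just print it when DRY_RUN is 1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

check_and_repair() {
    # Read-only pass: report problems, write nothing.
    run dosfsck -n "$1"
    # Repair pass, same flags as the command above.
    run dosfsck -a -t -w -V "$1"
}

check_and_repair "$DEVICE"
```

To run it for real you would need a root shell, the partition must not be mounted, and you would set DRY_RUN=0. Running fdisk -l as root lists the partitions if you are not sure which device is the FAT32 one.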

It took about an hour in total, but managed to repair the broken FAT32 file system on what Windows sees as the D: drive. Thereafter, Windows still needed to run the disk checker once during boot, but found no errors and marked the drive clean. Subsequent boots were fine.