To attempt a recovery, boot with ydebug on the kernel command line. In the following example, the device name for an md-based RAID 1 device has shifted, resulting in a boot failure. When the underlying block device name changes, a Yaird image will fail because it hardcodes device names.
First, the short version. Pass ydebug to your kernel at boot. A dash console will be offered if mknod fails for any of the devices needed to mount your root filesystem. You will be offered a console again right before the root pivot takes place. If there were any errors, you’ll need to ensure your root filesystem is mounted before continuing, as no further attempt is made to mount root after the initial failure.
Failure during mknod process.
# cat /sys/block/hda/dev
33:0
# mknod /dev/hda b 33 0
# mknod /dev/hda1 b 33 1 (or cat /sys/block/hda/hda1/dev if you disbelieve)
Failure of mdadm during root filesystem mount.
One of the first three commands ought to work; then make sure you successfully mount your root filesystem.
# mdadm --assemble /dev/md0 /dev/hda1 /dev/hdc1 (or)
# mdadm -Ac partitions --uuid 0d6619bf:bc2a0f78:70d72fac:47f8c6bb (use your UUID, segfaults for me)
# mdadm --assemble --run /dev/md0 /dev/hda1 /dev/hdc1 (or if members are missing, force with --run)
# mount -nr -t ext3 -o noatime /dev/md0 /mnt
Done. Or read on for more details.
The whole procedure.
From your dash console (pretty bare metal, eh?) you’ll need to evaluate what’s failed. If the block device name has changed for one of your underlying RAID devices, the image will fail to mknod because the entry it seeks in /sys/block is possibly not present, or simply wrong. To manually discover the device major and minor numbers, assuming you know how your device names shifted, you can simply run
# cat /sys/block/hda/dev
33:0
Substitute hda with whatever your device actually is. Depending on the degree of device name shift, you may need to do this for multiple devices. Next, you’ll need to use mknod to create the device nodes you need to talk to your hardware block devices.
# mknod /dev/hda b 33 0
Executing the above will create a new block device with a block major of 33 and a block minor of 0. Your block device major and minor will likely differ, so ensure you use the correct one for your system and running kernel. Additionally, you will probably need a block device for the specific slice, or partition, that your root filesystem resides on. If you’re using LVM and not md, obviously your situation may differ, but generally the following is applicable, creating a device with a minor of 1 and the same major.
# mknod /dev/hda1 b 33 1
Once that’s complete, you can exit your dash session with the usual CTRL^D.
A Yaird initrd will attempt to create a device entry for each boot device in your array, so you may need to do the above again, but for your second device, if you’re running md RAID 1.
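If you have several devices to recreate, the lookup-then-mknod steps above can be sketched as a small shell function you paste into the console. This is only a sketch: print_mknod is a hypothetical helper name of my own, it assumes partitions show up under /sys/block/<disk>/ with names that start with the disk name, and it only prints the mknod commands rather than running them, so you can check them first.

```shell
# Hypothetical helper: print (do not run) the mknod commands for a disk and
# its partitions, using the major:minor pairs found in sysfs.
# $1 = disk name (e.g. hda), $2 = sysfs block directory (defaults to /sys/block)
print_mknod() {
    disk=$1
    sysblock=${2:-/sys/block}
    for devfile in "$sysblock/$disk/dev" "$sysblock/$disk/$disk"*/dev; do
        [ -r "$devfile" ] || continue   # also skips a glob that matched nothing
        name=${devfile%/dev}            # strip the trailing /dev
        name=${name##*/}                # keep the last path component
        IFS=: read -r major minor < "$devfile"
        echo "mknod /dev/$name b $major $minor"
    done
}
```

Run it as print_mknod hda, then again for the second member of your array, and type in whichever of the printed commands you actually need.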
If your root filesystem still cannot be mounted, you will be dropped back to dash. This is your final chance to recover the system before having to reboot, curse, and try again.
The output should clearly indicate if md was simply unable to initialize your RAID array. The likely culprit is Yaird having used the device names explicitly when building your initrd image. You can recover fairly easily at this point if you know the device names you should actually be using, which you may already have created above.
# mdadm --assemble /dev/md0 /dev/hda1 /dev/hdc1
You may also try using mdadm’s autodiscover feature. (The UUID of your array is listed in the output of the md failure or from cat’ing /init.)
# mdadm -Ac partitions --uuid 0d6619bf:bc2a0f78:70d72fac:47f8c6bb
However, on my system with mdadm 1.9.0, the above segfaults, which makes me nervous.
If you’ve suffered a disk failure, you may also find your system unbootable even though you’re running an md RAID 1 array. You can try forcing the issue by telling mdadm it’s okay to continue with one or more missing disks. (Don’t do this with a RAID 5 if you’re missing more than one member device, as it could get messy.)
# mdadm --assemble --run /dev/md0 /dev/hda1 /dev/hdc1
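Rather than taking mdadm’s word for it, you can also check what the kernel thinks. The sketch below reads /proc/mdstat using nothing but shell builtins; md_active is a hypothetical helper name of my own, and it assumes /proc is mounted inside your initrd, which I have not verified for every Yaird image.

```shell
# Hypothetical helper: succeed if the named array is listed as active.
# $1 = array name without /dev/ (e.g. md0)
# $2 = status file to read (defaults to /proc/mdstat)
md_active() {
    # Array lines look like: "md0 : active raid1 hdc1[1] hda1[0]"
    while read -r name sep state rest; do
        if [ "$name" = "$1" ] && [ "$sep" = ":" ] && [ "$state" = "active" ]; then
            return 0
        fi
    done < "${2:-/proc/mdstat}"
    return 1
}
```

For example, md_active md0 && echo "md0 is up" should print the message only once the array has really started.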
At this point, if mdadm claims it successfully started your array, you can perform the all-important step of mounting your root filesystem. You must complete this step yourself, as the initrd image has already tried and failed. Once you exit dash, it will immediately try to pivot to /mnt, which must be ready.
# mount -nr -t ext3 -o noatime /dev/md0 /mnt
After executing the above, substituting your filesystem type and root device (generally /dev/md0 when using Linux md RAID), you should be ready to rock. Press CTRL^D to finish up and the initrd image will attempt to pivot root. If you were successful, your system will boot.
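Since nothing will retry the mount for you, it may be worth confirming that /mnt really has a filesystem on it before you leave the console. Here is a minimal sketch that scans /proc/mounts with shell builtins only; is_mounted is a hypothetical helper name, and as before it assumes /proc is available inside the initrd.

```shell
# Hypothetical helper: succeed if something is mounted on the given directory.
# $1 = mount point (e.g. /mnt)
# $2 = mount table to read (defaults to /proc/mounts)
is_mounted() {
    # Each line is: device mountpoint fstype options dump pass
    while read -r dev dir fstype rest; do
        [ "$dir" = "$1" ] && return 0
    done < "${2:-/proc/mounts}"
    return 1
}
```

Something like is_mounted /mnt && echo "root is mounted" gives you a quick yes before you exit; silence means the mount did not take.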
Once the system has booted, make sure you create a new initrd image to reflect the system state if you’ve experienced a permanent change. If not, leave it alone so it’ll boot next time as it did in the past.