A few days ago, I started receiving desktop notifications on my Linux PC that my main drive was about to fail. A quick look at the GNOME “Disks” utility confirmed the diagnosis.
How did my PC know the drive was failing?
Modern drives, whether HDD (hard disk drive, a.k.a. magnetic spinning disk) or SSD (solid state drive), have Self-Monitoring, Analysis and Reporting Technology (SMART) built into them. Without getting too far into the weeds, the OS regularly queries the drive’s built-in SMART monitoring, and when SMART detects that the drive is starting to go south, it lets the OS know that. Why drives fail is beyond the scope of this post; suffice it to say that they wear out over time and one of the purposes of SMART is to warn when they’re reaching end-of-life.
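If you want to query SMART yourself from the command line, the smartmontools package can do it; a quick example (the device name /dev/sda is just a placeholder for whatever your drive actually is):
sudo smartctl -H /dev/sda # overall health assessment (PASSED / FAILED)
sudo smartctl -a /dev/sda # full SMART attributes, self-test results, and error log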
It’s important to note that drives can fail in both predictable (enough time to save the data) and unpredictable (no time to save the data!) ways, so you should never rely on SMART to protect important data. Back up your data! And follow the 3-2-1 rule: at least three copies of your data, on two different types of media, at least one of which is off-site.
The failing drive was an SSD, and while I didn’t have a spare SSD lying around, I did happen to have an unused HDD of the same capacity inside the PC. So I decided that, to be safe, I would migrate everything from the failing SSD to the old HDD, order a replacement SSD, and then migrate everything back to the new SSD.
Why I didn’t just clone the whole drive
For a PC with a simple storage configuration, I could have used Clonezilla to create an exact duplicate of the old drive on the new one, but I was hesitant to do that for several reasons:
- I use both LUKS and LVM, so cloning the drive would cause UUID conflicts since the cloned partitions would have the same UUIDs as the old ones.
- My /boot partition was actually on a different drive which I wanted to decommission, so this was an opportunity to move things around to get everything onto one drive.
- I think it’s useful to occasionally get closer to the metal to pick up bits of new knowledge and refresh my existing knowledge about how stuff like this works.
Having said that, if I had used Clonezilla, this is how I would have done it:
- Review /etc/fstab and /etc/crypttab to make sure all filesystems on the impacted drive were being referred to by UUID rather than /dev/sd*, because /dev/sd* device specifiers can change when drives are moved around (a quick check is sketched after this list).
- If any changes were necessary in step 1, then just to be safe, run grub-install /dev/sd[whatever], then update-grub, then update-initramfs -c -k all.
- Boot from Clonezilla on a thumb drive (see instructions here for how to create it).
- Using Clonezilla, copy the entire old drive device to the new one.
- Critical: get rid of the partitions from the old drive, so there aren’t UUID conflicts. There are three ways you can do this (be careful that you do it to the correct drive that you’re actually intending to erase!):
- do a quick erase on the drive (e.g., go to the command line in Clonezilla and write a new partition table to the drive with fdisk’s “g” command), making the old partitions invisible to the OS;
- for greater paranoia use dd if=/dev/zero of=/dev/sd[whatever] to erase the data on the drive completely; or
- shut down the computer and physically remove the drive from it (but be wary of disposing of the drive if you haven’t erased it!).
- Reboot, and everything should Just Work.
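For step 1, a quick check is to grep the two files for raw device names and compare against blkid’s output:
grep -n '/dev/sd' /etc/fstab /etc/crypttab # any hits are lines that should be converted to UUID= references
blkid # shows the UUID of each partition/filesystem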
Here are the steps I followed to do the migration. I’m posting them here both for my own personal reference and on the off chance that some of what I learned might be useful to others.
I recommend doing all of the following while booted from a live thumb drive. It’s possible to do some of the work on your real running system and just do the last bits while booted from the thumb drive, but it’s more straightforward to do everything there. I used a Debian live image for this. You can download one from here. My personal preference for writing the ISO to the thumb drive is to use the Restore Disk Image… tool available on the three-dots menu in the GNOME “Disks” utility.
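If you’d rather write the ISO from the command line, dd works fine with Debian’s hybrid ISOs. Something like this, assuming /dev/sdX is the thumb drive (double-check the device name, since this overwrites it) and substituting the actual ISO filename:
sudo dd if=debian-live-whatever.iso of=/dev/sdX bs=4M status=progress oflag=sync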
Create the partition table and primary partitions on the new drive
In gparted (which you may need to install if it isn’t already available in the live image you’re using), make sure you’ve got the new drive selected, then select Device ➤ Create Partition Table… For the partition table type select “gpt” (if your PC is so old that its BIOS doesn’t support GPT partition tables, then some of the instructions in this post won’t work for you; since I’ve not actually done this with a PC that old, I can’t say for sure which, so proceed with caution).
Now you need to create your EFI partition (again: if your BIOS doesn’t use EFI boot then the instructions on this page won’t completely match what you need to do, so proceed with caution). Create a 512MB partition with the name “EFI System Partition” and filesystem type “fat32”. Apply the changes, then right click on the new partition, select Manage Flags, and enable the “boot” flag, which will also enable “esp” automatically.
Now you need to create your /boot partition (this needs to be a separate partition so your root and data partitions can be encrypted). Create a 128MB ext4 partition named “boot”.
Now you need to create the LUKS partition that your root and data partitions will be stored inside of. Create a partition filling the rest of the drive with the name debian_crypt (or, if that’s already the name of your old encrypted partition, use something different like debian2_crypt) and file system type “unformatted”. Then apply all changes.
For the rest of these instructions, every time you see debian_crypt, use whatever name you used here.
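As an aside, if you’d rather create the partitions from the command line than in gparted, something roughly equivalent using sgdisk (from the gdisk package) and mkfs would look like this. Treat it as a sketch, not gospel: it assumes the new drive is /dev/sdb, and --zap-all destroys any existing partition table on it, so double-check the device name first:
sgdisk --zap-all /dev/sdb # wipe any existing partition table and start with a fresh GPT
sgdisk -n 1:0:+512M -t 1:ef00 -c 1:"EFI System Partition" /dev/sdb
sgdisk -n 2:0:+128M -t 2:8300 -c 2:boot /dev/sdb
sgdisk -n 3:0:0 -t 3:8309 -c 3:debian_crypt /dev/sdb # rest of the drive, typed as Linux LUKS
mkfs.vfat -F 32 /dev/sdb1 # the ESP needs a FAT32 filesystem
mkfs.ext4 /dev/sdb2 # the boot partition; the LUKS partition stays unformatted for now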
Set up LUKS
We need to leave gparted now since it doesn’t support setting up encrypted partitions. Move on over to the GNOME “Disks” utility, select the new drive, select the unformatted LUKS partition, click the little gears icon below and to the left of it, select Format Partition…, and set it up as follows:
- For the volume name enter the same name you used above.
- Select “Other” as the partition type.
- On the next screen, select “No Filesystem” and check the “Password protect volume (LUKS)” checkbox.
- On the next screen, enter your LUKS passphrase twice.
- From the command line, run as root (via sudo or sudo -s, here and in the rest of this post) dmsetup ls to get the name that was assigned to the new device, which will look like luks-[UUID].
- Run dmsetup rename luks-[UUID] debian_crypt.
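If you’d rather do this step from the command line too, cryptsetup can format and open the partition directly, which also saves you the dmsetup rename dance (assuming the unformatted partition ended up as /dev/sdb3; substitute your own device):
cryptsetup luksFormat /dev/sdb3 # sets the LUKS passphrase (this destroys anything on the partition)
cryptsetup open /dev/sdb3 debian_crypt # unlocks it as /dev/mapper/debian_crypt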
Set up LVM volumes
As I noted previously, I use separate root and data partitions on my PC, so these instructions reflect that. If you choose to instead just have one big partition, then adjust / ignore instructions below as appropriate.
pvcreate /dev/mapper/debian_crypt # turn the unlocked LUKS volume into an LVM physical volume
vgcreate debian_vg /dev/mapper/debian_crypt # create a volume group on it
lvcreate --size 100g -n root debian_vg # 100 GiB logical volume for the root filesystem
lvcreate -l 100%FREE -n data debian_vg # everything that's left for the data filesystem
mkfs.ext4 /dev/mapper/debian_vg-root
mkfs.ext4 /dev/mapper/debian_vg-data
Copy the data
mkdir /old /new
mount [device for old, bad root filesystem] /old
mount /dev/mapper/debian_vg-root /new
mount [device for old boot filesystem] /old/boot
mount [device for new boot filesystem] /new/boot
mount [device for old EFI filesystem] /old/boot/efi
mount [device for new EFI filesystem] /new/boot/efi
mount [device for old data filesystem] /old/[mount point for data filesystem]
mount /dev/mapper/debian_vg-data /new/[mount point for data filesystem]
rsync -a /old/ /new/ # consider adding -H, -A, -X, and -S if you need to preserve hard links, ACLs, extended attributes, or sparse files
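Note that the mount commands above assume the old LUKS container is already unlocked. If it isn’t (GNOME Disks can unlock it with a click), you can do it from the command line; a sketch, assuming the old LUKS partition is /dev/sda3 and mapping it as old_crypt:
cryptsetup open /dev/sda3 old_crypt # prompts for the old drive's passphrase
vgchange -ay # activate any LVM volume groups inside it so the old logical volumes appear under /dev/mapper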
Update /etc/fstab
Load /new/etc/fstab into an editor and update the lines for the /, /boot, /boot/efi, and your data filesystem so that the first field specifies which filesystem to mount using UUID=[UUID] syntax. For example, mine look like this (with the UUIDs obscured):
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx / ext4 noatime,errors=remount-ro 0 1
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx /boot ext4 noatime 0 2
UUID=XXXX-XXXX /boot/efi vfat umask=0077 0 1
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx /d ext4 noatime 0 2
You can get the correct UUIDs to use with the “blkid” command. Make sure to:
- specify the UUIDs for the new filesystems, not the old ones you’re getting rid of; and
- for the LVM filesystems, specify the UUIDs of the logical volumes, which will show up with type “ext4”, not the underlying LUKS partition or LVM PV.
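For example (the /dev/sdb* device names here are just illustrative; use whatever your new boot and EFI partitions actually are):
blkid /dev/mapper/debian_vg-root # UUID for the / line
blkid /dev/mapper/debian_vg-data # UUID for the data filesystem line
blkid /dev/sdb2 # UUID for /boot
blkid /dev/sdb1 # short XXXX-XXXX UUID for /boot/efi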
Note that I also added noatime to the ext4 filesystems because, as noted above, I am migrating (temporarily) from SSD to HDD, and I don’t want the HDD to have the extra performance burden of updating access times whenever files are accessed. When I migrate back to SSD I will probably remove noatime. Whether you use it or not is up to you.
Update /etc/crypttab
Add a line that looks something like this to /new/etc/crypttab:
debian_crypt UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx none luks,discard
- The first field is the name you gave to the LUKS partition, i.e., the name you used in the dmsetup rename command above.
- The UUID in the second field is for the LUKS partition, which you can again get from blkid (look for a filesystem with type “crypto_LUKS”, and make sure it’s the new one you just created, not the old one you’re replacing).
- You don’t need to specify discard if you’re using an HDD rather than an SSD, but it doesn’t hurt anything. See the crypttab(5) man page for information about the (minimal) privacy/security concerns with using discard.
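If you want blkid to show only the LUKS containers, you can filter on the filesystem type, e.g.:
blkid -t TYPE=crypto_LUKS # lists just the LUKS partitions; use the UUID of the new one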
Also delete or comment out the line in /new/etc/crypttab referring to the old disk that you’re replacing, so it won’t get unlocked and mounted on reboot.
Update grub and initramfs
for i in /sys /proc /run /dev; do mount --rbind $i /new$i; done # make the live system's kernel filesystems visible inside the chroot
chroot /new
grub-install /dev/sd? # where /dev/sd? is the new drive
update-grub # regenerate /boot/grub/grub.cfg
update-initramfs -c -k all # rebuild the initramfs for all installed kernels
If any of the above report warnings or errors, make sure you understand them before proceeding, because they may indicate something has gone wrong with the previous steps and your system won’t reboot properly onto the new drive.
Turn off the old boot partition
In gparted, right click on the old EFI partition, select Manage Flags, and turn off the boot flag on the partition.
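The same thing from the command line would be roughly this, assuming the old EFI partition is partition 1 on /dev/sda (adjust both to match your system):
parted /dev/sda set 1 boot off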
Boot onto the new drive
Shut down the computer entirely, unplug the thumb drive, and power back up; the system should boot using the new drive. If it doesn’t, something is wrong that you’ll need to troubleshoot based on how it fails.
Clean up old drive
- In gparted, delete all of the old drive’s partitions (LUKS, /boot, /boot/efi). It won’t let you delete them if they’re active, so if you’re prevented from deleting them, something is wrong that you need to figure out (e.g., the cutover to the new drive was not fully successful).
- If you’re going to dispose of the disk and you’re concerned about privacy, erase it in the GNOME Disks utility (you can also do this from the command line with dd, as sketched after this list, but I personally prefer to use the GUI):
- Select the old drive
- Select Format Disk… from the three-dots menu
- Change “Erase” to “Overwrite existing data with zeroes”
- Click the “Format” button and then the red “Format” button on the next screen
- Wait for the erase to finish
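The dd equivalent, for reference (be very sure /dev/sdX really is the old drive before you run it, because this wipes it irrecoverably):
dd if=/dev/zero of=/dev/sdX bs=4M status=progress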
Then reboot again and make sure everything still works. If not, time for more troubleshooting!
Bonus content: cleaning up old boot menu entries with efibootmgr
The PC I was doing all this on used to run Ubuntu. A couple of years ago I switched from Ubuntu to Debian, but even after doing that, whenever I pressed F12 during the boot sequence to open the EFI boot menu, “Ubuntu” kept showing up as an option.
While I was doing the migration described above I decided to dig around and try to figure out why this was happening. I found some files underneath /boot/efi/EFI which referred to Ubuntu, so I deleted them, thinking that this would solve the problem. Nope! Even after deleting those files “Ubuntu” still showed up on the boot menu.
Fortunately my web-searching skills are still effective enough that with a little sleuthing I was able to find this. TL;DR: EFI boot menu entries aren’t (just) stored in /boot/efi; they’re also stored in the motherboard firmware.
I used the efibootmgr command, as suggested on that page, to list all the boot menu items, and then efibootmgr -b # -B to delete the Ubuntu one.
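For reference, the sequence looks something like this (the entry number 0003 is made up; use whatever number shows up next to “Ubuntu” on your machine):
efibootmgr # list boot entries; note the number on the Boot00XX* Ubuntu line
efibootmgr -b 0003 -B # delete that entry (substitute your actual number)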