Friday, November 18, 2011

Disk backup with dd

This post describes how to create an image of a whole disk with dd. I use this to back up my Mac OS X system disk before a major change, for example upgrading from Snow Leopard to Lion. Since I see such operations as risky (I am not running on Apple hardware), I want to be able to go back to a working state easily.

This process is not limited to Mac OS X: you can follow this guide to create an image of whatever disk you want, whether it holds your Windows OS, your data, or anything else.

Here are the steps I am going to go through in this post:
  • start up in Linux,
  • find the device name of the disk to back up,
  • find the mount point of the disk where the backup will be saved,
  • run the dd command with the appropriate options,
  • test that the backup worked.

Start up in Linux

It feels safer to back up a system disk while the OS is not running. Therefore an OS running from a live CD seems a good solution. An Ubuntu live CD provides all the command line tools needed. Burn a CD, boot from it, and you are good to go.

Find your OS disk

Fire up a terminal. $ ls /dev shows a list of all devices on your machine.
Your disks are likely to be named with three-letter combinations, such as sda, sdb, sdc, and so on. They may also be called hda, hdb, if you are using IDE disks and an old version of the Linux kernel.
The listing also contains other devices, such as cdrom, dvd, spu, hpet.

Besides the disk names, the partitions on these disks show up too. For example in my installation, sda1 and sda2 are present, as partitions of the sda disk.

Check your drives with the following command: $ sudo fdisk -l

The result of this command gives you information about the physical volumes as well as their partitions.

ubuntu@ubuntu:~$ sudo fdisk -l

WARNING: GPT (GUID Partition Table) detected on '/dev/sda'! The util fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/sda: 120.0 GB, 120034123776 bytes
255 heads, 63 sectors/track, 14593 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Device Boot Start End Blocks Id System
/dev/sda1 1 14594 117220823+ ee GPT
/dev/sda2 * 1 1 0 0 Empty
Partition 2 does not end on cylinder boundary.

WARNING: GPT (GUID Partition Table) detected on '/dev/sdb'! The util fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000

Device Boot Start End Blocks Id System
/dev/sdb1 1 243202 1953514583+ ee GPT
Partition 1 does not start on physical sector boundary.

Apparently fdisk does not like the format of my Mac OS X partitions. It does not really matter: there is no need to understand the content of a disk to make a bit-for-bit copy of it.

The disks are now identified: sda is the OS disk (120 GB), sdb is the data disk (2 TB).
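When fdisk struggles with the partition table, the device names and sizes can also be read from sysfs. The following is a minimal sketch, assuming a Linux kernel that exposes /sys/block. The function name list_disks and its optional argument are made up for this example.

```shell
# list_disks [SYSFS_BLOCK_DIR]
# Print "name size_in_bytes" for every block device listed in sysfs.
# list_disks is a hypothetical helper; the argument defaults to /sys/block.
list_disks() {
    root=${1:-/sys/block}
    for dev in "$root"/*; do
        # Skip entries without a readable size file.
        [ -r "$dev/size" ] || continue
        # The size file counts 512-byte sectors, whatever the real sector size.
        sectors=$(cat "$dev/size")
        echo "$(basename "$dev") $((sectors * 512))"
    done
}

list_disks
```

The sizes are printed in bytes, so they can be compared directly with the byte counts in the fdisk output above. The list also includes devices that are not hard disks, such as loop devices.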

The content of the folder /media shows that both disks have been mounted:

ubuntu@ubuntu:~$ ls /media
cdrom data Snow Leopard


Unmounting

Check where the volumes are mounted by using the command mount with no parameter. Filter the output with grep if you find it convenient:
ubuntu@ubuntu:~$ mount | grep sda
/dev/sda2 on /media/Snow Leopard type hfsplus (rw,nosuid,nodev,uhelper=udisks)

Then unmount the partition (in this case Snow Leopard):
ubuntu@ubuntu:~$ umount /media/Snow\ Leopard

Note that in my version of Ubuntu, the disk is still present in the file explorer, and selecting it mounts the disk automatically. Close your file explorer if you want to make sure that the disk does not get mounted by mistake.

Note also that your disk's mount point might not be in /media, depending on your version of Linux. It could be called /mnt/sda, for example. You can check this in the output of the mount command.
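Because a file explorer can remount a partition silently, it can be worth checking again right before the copy. This is a minimal sketch, assuming a Linux system where /proc/mounts lists the current mounts. The name is_mounted is hypothetical, and the second argument exists only so the check can be tried on a sample file.

```shell
# is_mounted DEVICE [MOUNTS_FILE]
# Succeed if any mounted device name starts with DEVICE, so that
# /dev/sda also catches /dev/sda1, /dev/sda2, and so on.
# Caveat: the prefix match would also catch a disk named /dev/sdaa.
is_mounted() {
    mounts=${2:-/proc/mounts}
    [ -r "$mounts" ] || return 2
    # The first field of each line is the device name.
    awk -v d="$1" 'index($1, d) == 1 { found = 1 } END { exit !found }' "$mounts"
}

if is_mounted /dev/sda; then
    echo "a partition of /dev/sda is still mounted"
else
    echo "nothing from /dev/sda is mounted"
fi
```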

Destination

You need somewhere to put the backup image. Since the goal of the backup is to restore in case I screw up, the image could very well be placed on a disk that already is in the computer. Unfortunately all my disks are formatted for Mac OS X and seem to be mounted as read-only. I'll save the image on a USB disk.

If you want to save the image as is, you need a destination disk at least as big as your source disk. It does not matter if the source disk is nearly empty, since all bits (even the meaningless bits in the empty part of the disk) are going to be copied.
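The size requirement can be checked before starting. The sketch below is only an illustration: has_room is a made-up helper name, and it relies on the POSIX output layout of df -P. On a real system the source size in bytes could come from sudo blockdev --getsize64 /dev/sda; here the byte count from the fdisk output above is used.

```shell
# has_room NEEDED_BYTES DEST_DIR
# Succeed if the file system holding DEST_DIR has at least NEEDED_BYTES free.
# has_room is a hypothetical helper, not a standard command.
has_room() {
    needed=$1
    # df -P prints one line per file system; -k reports 1024-byte blocks.
    # Column 4 of the second line is the available space.
    avail_kb=$(df -P -k "$2" | awk 'NR == 2 { print $4 }')
    [ $((avail_kb * 1024)) -ge "$needed" ]
}

# Example with the 120 GB source disk from this post, checked against
# the current directory.
if has_room 120034123776 .; then
    echo "enough room for an uncompressed image"
else
    echo "not enough room: compress the image or pick another disk"
fi
```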

The output of dd can be compressed, as we'll see later.

After connecting the external hard drive, it gets mounted automatically. On my computer, the output of mount shows that the device is called sdc and that its mount point is /media/maxtor.

The dd command

Now we have:
  • an unmounted source disk,
  • the name of the source disk device,
  • a mounted external disk to save the backup to.

dd copies bits from a source to a destination. There are some jokes about dd standing for "disk destroyer" or "delete data"; more probable origins of the name would be "disk dump", "dump data", or "data definition" (after the DD statement of IBM's JCL). People cannot seem to agree on which.

The syntax of the command parameters is a bit unusual, which can lead to confusion. It has certainly led to data destruction: if you get it wrong you might copy your empty destination onto your precious source.
if is the source.
of is the destination.
If you do not specify the parameter if, dd takes the standard input. If you do not specify the parameter of, dd takes the standard output. This means that dd can be used in combination with pipes, typically to gzip.
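The three ways of wiring dd can be tried safely on ordinary files before touching a real disk. This is a small sketch with made-up file names in a temporary directory; nothing here reads or writes a device.

```shell
# Work on throwaway files in a temporary directory, never on a real disk.
work=$(mktemp -d)
printf 'pretend this is a disk\n' > "$work/source.bin"

# Explicit source and destination: if= is read, of= is written.
dd if="$work/source.bin" of="$work/copy1.bin" 2>/dev/null

# No of=: dd writes to standard output, which the shell redirects to a file.
dd if="$work/source.bin" 2>/dev/null > "$work/copy2.bin"

# No if=: dd reads standard input, here fed through a pipe.
cat "$work/source.bin" | dd of="$work/copy3.bin" 2>/dev/null

# cmp prints nothing and returns 0 when two files are identical.
for copy in copy1 copy2 copy3; do
    cmp "$work/source.bin" "$work/$copy.bin" && echo "$copy matches the source"
done
```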

Other options that I used with dd are:
- conv=sync,noerror: noerror makes dd continue after a read error, and sync pads each incomplete input block with NUL bytes, so that the data following an unreadable area keeps its offset in the image.
- bs=64K: large read chunks seem to speed up the process, although I have not tested any other value myself.

The command I finally used is the following:
sudo dd if=/dev/sda conv=sync,noerror bs=64K | gzip -c > /media/maxtor/mac_os_x.img.gz
  • if is the Mac OS X system disk (sda),
  • there is no of argument, instead the output of dd is directed to the standard output,
  • the pipe takes this standard output and directs it to gzip,
  • gzip compresses this output and places it in a compressed image file on my mounted external hard drive.
The resulting compressed image for my rather clean Mac OS X install (on a 120 GB disk) is around 14 GB. The disk was new just before I installed the OS, so its empty areas are presumably filled with a constant value (all 0s or all 1s), which makes the job of gzip easy.
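How much empty areas help gzip can be tried on a small file first. This is a sketch only: the file names are made up, and 8 MiB of zeroes from /dev/zero stands in for the unused part of a new disk.

```shell
work=$(mktemp -d)

# 128 blocks of 64 KiB of zeroes: a stand-in for an empty disk area.
dd if=/dev/zero of="$work/blank.img" bs=64K count=128 2>/dev/null

# Same pipeline shape as the backup command, applied to a file.
dd if="$work/blank.img" bs=64K 2>/dev/null | gzip -c > "$work/blank.img.gz"

raw_size=$(wc -c < "$work/blank.img" | tr -d ' ')
gz_size=$(wc -c < "$work/blank.img.gz" | tr -d ' ')
echo "raw: $raw_size bytes, compressed: $gz_size bytes"

# gzip -t checks the integrity of the archive without extracting it.
gzip -t "$work/blank.img.gz" && echo "archive is readable"
```

A disk that has been in use for a while contains leftovers of deleted files in its free space, so the ratio will be less favourable than in this toy case.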

Validate the backup

A backup solution is not valid until you have actually tried to restore.

Destroy

It felt a bit hard to willingly overwrite my working hackintosh system disk, but it would be harder to find out that the backup did not work at the moment I needed it.

After the image is created (it takes approximately 30 minutes for my system) it's time to erase the disk, filling it up with zeroes:
sudo dd if=/dev/zero of=/dev/sda bs=1M
I actually tried to boot on sda. Needless to say, it did not work.

Restore

Restoring runs the earlier operations in reverse:
  • uncompress the compressed image file with gzip and direct the result to the standard output,
  • pipe this standard output to the input of dd,
  • make dd put the resulting bit stream on the original (erased) Mac OS X disk.
Here is the command:
root@ubuntu:~# gunzip -c /media/maxtor/mac_os_x.img.gz | dd of=/dev/sda bs=64K
dd: writing `/dev/sda': No space left on device
942+3661267 records in
942+3661266 records out
120034123776 bytes (120 GB) copied, 945.293 s, 127 MB/s
If you cannot get a root shell with su, do not forget to put sudo directly before dd. A sudo at the beginning of the command line applies only to gunzip; it does not carry over to the other side of the pipe.
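The whole backup, destroy, restore cycle can be rehearsed on a regular file before trusting it with a system disk. The following is a sketch under that assumption: disk.img is a made-up file of random bytes that plays the role of /dev/sda, and cksum is used to compare the content before and after.

```shell
work=$(mktemp -d)

# 1 MiB of random bytes plays the role of the source disk.
dd if=/dev/urandom of="$work/disk.img" bs=64K count=16 2>/dev/null
before=$(cksum < "$work/disk.img")

# Backup: same pipeline shape as the command used for the real disk.
dd if="$work/disk.img" bs=64K 2>/dev/null | gzip -c > "$work/disk.img.gz"

# Destroy: overwrite the pretend disk with zeroes.
dd if=/dev/zero of="$work/disk.img" bs=64K count=16 2>/dev/null
wiped=$(cksum < "$work/disk.img")

# Restore: decompress to standard output and let dd write it back.
# On a real device the writing side needs root, for example:
#   gunzip -c image.gz | sudo dd of=/dev/sdX bs=64K   (sdX is a placeholder)
gunzip -c "$work/disk.img.gz" | dd of="$work/disk.img" bs=64K 2>/dev/null
after=$(cksum < "$work/disk.img")

if [ "$before" = "$after" ]; then
    echo "restore matches the original"
else
    echo "restore differs from the original"
fi
```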

Note that there is no conv=sync,noerror for dd this time. Running with these options did not work for me. I suppose that the erroneous areas were filled with NULL when the image was created, and did not need any special handling during the restore (writing NULL back to erroneous areas should not matter). I am not certain about this theory, please comment if you know better.
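A likely explanation lies in how conv=sync treats short reads. It pads every input block that comes in shorter than bs with NUL bytes, and reads from a pipe are often short: the "942+3661267 records in" line above reports 3661267 partial blocks. With conv=sync each of them would be padded, inserting NUL bytes in the middle of the restored data. The same padding probably explains the "No space left on device" message: 120034123776 bytes is not a multiple of 64K, so the last block read during the backup was padded and the image ended up slightly larger than the disk. The padding itself is easy to observe on a tiny input (a sketch, using an 8-byte block size purely for illustration):

```shell
# Three bytes arrive through a pipe; dd is asked to work in 8-byte blocks.
plain=$(printf 'abc' | dd bs=8 2>/dev/null | wc -c | tr -d ' ')
padded=$(printf 'abc' | dd bs=8 conv=sync 2>/dev/null | wc -c | tr -d ' ')

echo "without conv=sync: $plain bytes"   # 3 bytes, copied as they came
echo "with conv=sync:    $padded bytes"  # 8 bytes, padded with NUL bytes
```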

Booting on sda worked like a charm!

Note: after the restore, the permissions on the content of /System/Library/Extensions, which I have under version control, had changed. The files became executable. I don't know if dd did that, or if something else did. The content of /Extra/Extensions was untouched. If you have an explanation, please leave a comment.

