For the last while, I’ve been helping with the testing of the recently announced ZFS Boot bits that Lin putback yesterday.

We’ve completed the regression testing on these bits: the changes don’t break existing ZFS functionality, and we’ve validated that the basic functionality of ZFS bootable datasets works as designed.

I’m now looking at some additional tests for these bits, trying to boot mirrors with missing/detached disks, that sort of thing. (This week, I brought up a Thumper with root on a 47-way mirrored pool! :-)
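If you want to poke at that sort of thing yourself, plain zpool commands are all it takes. These are generic examples rather than our actual test procedure, and the pool and device names are just placeholders:

# take one side of the mirror offline to simulate a missing disk
zpool offline rootpool c2t1d0s0
# or remove it from the mirror entirely
zpool detach rootpool c2t1d0s0
# later, re-attach it and let it resilver
zpool attach rootpool c2t0d0s0 c2t1d0s0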

As with the ZFS Mountroot bits before, I thought that writing a script to automate the install of these bits would be pretty useful while we don’t yet have full ZFS support in the installer. Here it is: zfs-actual-root-install.sh.

This is how you use it:

root@usuki[88] ./zfs-actual-root-install.sh --help
Usage : zfs-actual-root-install.sh [options to pass to zpool]
eg. ./zfs-actual-root-install.sh mirror c0t0d0s0 c0t0d1s0
You need to be running a fresh install of at least snv_50
(with a BFU of Lin's zfsboot bits) for this to work.
Note also, you must supply a disk using slice notation: we need SMI
labels to boot, whereas "zpool create c0t0d0" would use EFI labels.
Only single disks, or mirrors are supported. No stripes or raidz please.
If you set the environment variable $ROOT_FS, we use that as the root
filesystem.
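
If I’m reading that usage message right, setting ROOT_FS overrides the default rootpool/rootfs dataset name, so an invocation along these lines should work (the dataset name here is just an example):

ROOT_FS=rootpool/solaris ./zfs-actual-root-install.sh mirror c2t0d0s0 c2t1d0s0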

As mentioned above, ZFS root boot only works with SMI labeled disks. If you’ve ever given ZFS the entire disk before, it’ll have put an EFI label on the disk, so you need to remove that using fdisk, then rewrite the label using format or fmthard (there’s an fmthard example after the format session below). Not too scary. Here’s me having just changed the disk type:

Total disk size is 8924 cylinders
Cylinder size is 16065 (512 byte) blocks

                                         Cylinders
   Partition   Status    Type          Start   End   Length    %
   =========   ======    ============  =====   ===   ======   ===
       1       Active    Solaris2          1  8923    8923    100
SELECT ONE OF THE FOLLOWING:
1. Create a partition
2. Specify the active partition
3. Delete a partition
4. Change between Solaris and Solaris2 Partition IDs
5. Exit (update disk configuration and exit)
6. Cancel (exit without updating disk configuration)
Enter Selection: 5
format> l
[0] SMI Label
[1] EFI Label
Specify Label type[1]: 0
Warning: This disk has an EFI label. Changing to SMI label will erase all
current partitions.
Continue? y
Auto configuration via format.dat[no]?
Auto configuration via generic SCSI-2[no]?
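
If you’d rather skip the interactive session, fmthard can write the VTOC non-interactively. One common idiom is to copy the label from a disk that already has an SMI label, for example the other side of the mirror (the device names here are just examples):

prtvtoc /dev/rdsk/c2t0d0s2 | fmthard -s - /dev/rdsk/c2t1d0s2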

Here’s the script in action:

root@usuki[92] ./zfs-actual-root-install.sh mirror c2t0d0s0 c2t1d0s0
Updating vfstab on UFS root
Starting to copy data from UFS root to /zfsroot - this may take some time.
.
.
.
10576640 blocks
.
.
There's a copy of the old UFS root in /zfsroot/etc/vfstab.old-ufs-root
diffs are new vs. old :
6a7
> /dev/dsk/c0d0s0       /dev/rdsk/c0d0s0        /       ufs     1       no      -
12,13c13
< rootpool/rootfs - / zfs - no -
---
> rootpool/rootfs - /zfsroot zfs - yes -
Creating ram disk for /zfsroot
updating /zfsroot/platform/i86pc/amd64/boot_archive...this may take a minute
updating /zfsroot/platform/i86pc/boot_archive...this may take a minute
Installing grub on /dev/rdsk/c2t0d0s0
stage1 written to partition 0 sector 0 (abs 16065)
stage2 written to partition 0, 260 sectors starting at 50 (abs 16115)
Installing grub on /dev/rdsk/c2t1d0s0
stage1 written to partition 0 sector 0 (abs 16065)
stage2 written to partition 0, 260 sectors starting at 50 (abs 16115)
Okay, assuming we haven't broken anything, when you next reboot, you
should be able to select a grub menu entry for ZFS on root!
Remember to report anything suspicious via bugster or
zfs-discuss@opensolaris.org.
If your boot device has changed because of this, remember to change your
bios settings.  (you should now boot from /dev/dsk/c2t0d0s0 /dev/dsk/c2t1d0s0)

And finally, here’s me booting with the new root:

# df -h /
Filesystem             size   used  avail capacity  Mounted on
rootpool/rootfs         67G   4.6G    62G     7%    /
# zfs list
NAME              USED  AVAIL  REFER  MOUNTPOINT
rootpool         4.57G  62.4G    24K  /rootpool
rootpool/rootfs  4.57G  62.4G  4.57G  legacy
# zpool status -v
  pool: rootpool
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        rootpool      ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c2t0d0s0  ONLINE       0     0     0
            c2t1d0s0  ONLINE       0     0     0

errors: No known data errors
#

You can install a root pool to a slice that isn’t slice 0, but in that case the script won’t work this out: it will just run installgrub on that slice. If that’s what you’ve done, you should manually run the installgrub command to put the new ZFS-capable grub on whatever your boot device actually is.
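
Something like this should do the trick; the stage1 and stage2 paths are the standard locations, but the device name below is just an example, so substitute whatever your BIOS actually boots from:

installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0t0d0s0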

One other thing to watch out for: if you’re BFUing development archives, you might run into 6528202, so take a copy of /boot/platform/i86pc/kernel/unix before you BFU! If you’re happy to wait for a full install of nv_62 or later, then you don’t need to worry about this step.
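
Taking that copy is just a one-liner (where you stash it is up to you):

cp -p /boot/platform/i86pc/kernel/unix /var/tmp/unix.pre-bfu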

I’ve said it before, but having ZFS on your root filesystem is just completely awesome: being able to incrementally back up, snapshot and roll back your root filesystem really is amazingly useful. I wrote mountrootadm to help out even more. Of course, eventually I suspect LiveUpgrade will handle all this for you, but in the meantime this does the trick.
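
For example, here’s the sort of safety net a root snapshot gives you. The snapshot names and the backup path are just made up for illustration:

# snapshot the root filesystem before doing anything risky
zfs snapshot rootpool/rootfs@before-bfu
# send an incremental stream based on an earlier snapshot somewhere safe
zfs send -i rootpool/rootfs@last-week rootpool/rootfs@before-bfu > /backup/rootfs.incr
# and if it all goes horribly wrong, roll back and reboot
zfs rollback rootpool/rootfs@before-bfu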

Let me know if you’ve any thoughts or comments about the script.
Happy rumbling!

Update: Bart quite rightly pointed out to us that there’s no need for all that mucking around with failsafe-boot in order to reconstruct the /dev and /devices filesystems. Much easier and faster is:

mkdir -p /zfs-root-tmp.$$
mount -F lofs -o nosub / /zfs-root-tmp.$$
(cd /zfs-root-tmp.$$; tar cvf - devices dev ) | (cd /zfsroot; tar xvf -)
umount /zfs-root-tmp.$$
rm -rf /zfs-root-tmp.$$

So I’ve updated the post above to change that, fixed the script and tested it – works just fine. Thanks Bart!

Update: Lin pointed out a typo in the create_dirs script where /tmp was being given the wrong permissions, so I’ve fixed that in this version of the script too.

Update: we were wrong about it being a typo. Normal service resuming...
