Summary

This page describes setting up RAID 1 (mirrored drives) on the NSLU2. The process starts with a blank drive and an unslung drive and ends with two mirrored drives containing all the data stored on the originally unslung drive. This is achieved by creating a RAID array with just the blank second drive at first, copying the entire contents of the unslung drive onto it, and then hot-adding the unslung drive to the RAID array.

Diversion scripts have been written to start the RAID array during the boot process and stop the array before shutdown. Unfortunately it is not possible to stop the array that contains the root filesystem during the shutdown process. This means that this array is marked as 'dirty' on shutdown and will resync when restarted. Resyncing can take hours for a large partition and no other process can run while the resyncing is happening. To get around this problem, this howto splits the root partition into a small partition containing the system files and a much larger partition to hold public files.

This howto uses custom partitions on the attached hard drives, making it impossible to return to the normal Unslung setup without repartitioning the drives and wiping all data. (Of course you can transfer the data off the drives before repartitioning.)

Remark

I'm just curious about the use of mirrored swap partitions. I would basically set up swap in striped mode (RAID 0) for performance. But if you take a look at http://linas.org/linux/Software-RAID/Software-RAID-8.html (Question 18) you'll see that you don't need to set up RAID 0 for swap because the Linux kernel does this automatically.

Initial Setup
# ipkg update
# ipkg install busybox-base
# ipkg install mdadm
# ipkg install kernel-module-md
# ipkg install kernel-module-raid1
# /sbin/insmod md.o
# /sbin/insmod raid1.o
Change Hard Drive Partitions

The standard NSLU2-formatted disk has three partitions: a 50MB swap partition, a 100MB config partition (mounted as /share/hdd/conf) and the rest of the disk as a data partition (mounted as /share/hdd/data). Unslung 5.5 also uses the data partition as the root filesystem.

When the NSLU2 is switched off or reboots it will try to stop all the RAID arrays. Any arrays that are not cleanly stopped will resync during the startup process. The resyncing process can take up to 10 hours for a 300GB hard drive and no other processes can run during that time. That would mean that your system is out of action for up to 10 hours every time you reboot. To avoid this I split the data partition into a 1GB root partition and a 299GB data partition.

This howto mirrors all four partitions. I had thought that mirroring the swap partition could have a performance cost when writing to disk (and indeed it does), but I believe there are more read operations on swap space (where there is a performance gain) than write operations. The files in the conf partition are used by Samba and the passwd utility, so the conf partition must be mirrored if the data partition is mirrored.

In order for the RAID arrays to work on a reboot we also need to change the partition types to 'Linux raid autodetect' (Id fd):

# /opt/bin/busybox fdisk /dev/sdb

Be careful to use /dev/sdb: the repartitioning process wipes all existing data from the drive.
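Since repartitioning wipes the drive, it may be worth confirming first that nothing on the target disk is mounted. The following is only a minimal sketch: it assumes the mount table lists devices under /dev/sdX names and that awk is available (for example through busybox), and the function names `mounted_parts` and `safe_to_partition` are hypothetical.

```shell
#!/bin/sh
# List "<device> <mount point>" for every mounted partition of a disk,
# reading a /proc/mounts-style table.
# $1: disk name without /dev/ (e.g. sdb)   -- illustrative argument
# $2: table file, defaults to /proc/mounts -- illustrative argument
mounted_parts() {
    disk="$1"
    table="${2:-/proc/mounts}"
    # Field 1 is the device, field 2 the mount point.
    awk -v d="/dev/$disk" 'index($1, d) == 1 { print $1, $2 }' "$table"
}

# Succeeds only if no partition of the disk appears in the table.
safe_to_partition() {
    [ -z "$(mounted_parts "$1" "$2")" ]
}
```

A possible use is `safe_to_partition sdb || echo "unmount /dev/sdb partitions first"` before starting fdisk.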
Then print the current partition table:

Disk /dev/sdb: 300.0 GB, 300090728448 bytes
255 heads, 63 sectors/track, 36483 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sdb1             1     36461 292872951   83  Linux
/dev/sdb2         36462     36476    120487+  83  Linux
/dev/sdb3         36477     36483     56227+  82  Linux swap

Use option 'd' to delete all three partitions. Now create four new partitions and set their types so that the partition table looks like this:

   Device Boot    Start       End    Blocks   Id  System
/dev/sdb1             1       150   1204843+  fd  Linux raid autodetect
/dev/sdb2           151       165    120487+  fd  Linux raid autodetect
/dev/sdb3           166       172     56227+  fd  Linux raid autodetect
/dev/sdb4           173     36483 291668107+  fd  Linux raid autodetect

The sizes of the first three partitions should match this table exactly; the size of the fourth partition will depend on the size of the hard drive. My hard drive is 300GB.

Now write the partition table. If fdisk requests a reboot for the changes to the partition table to become effective, do so. After rebooting, re-enabling telnet and logging in, don't forget to load the kernel modules again:

# /sbin/insmod md.o
# /sbin/insmod raid1.o

Create and Mount RAID arrays

Create one array for each partition (four in total). The device node for /dev/md4 is created with mknod first, and each array is created with its second device given as 'missing':

# mknod /dev/md4 b 9 4 2>/dev/null
# /opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md4 /dev/sdb4 missing
# /opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md3 /dev/sdb3 missing
# /opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md2 /dev/sdb2 missing
# /opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md1 /dev/sdb1 missing

Run the following to check the state of the arrays:

# cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md1 : active raid1 scsi/host0/bus0/target0/lun0/part1[0]
1204736 blocks [2/1] [U_]
md2 : active raid1 scsi/host0/bus0/target0/lun0/part2[0]
120384 blocks [2/1] [U_]
md3 : active raid1 scsi/host0/bus0/target0/lun0/part3[0]
56128 blocks [2/1] [U_]
md4 : active raid1 scsi/host0/bus0/target0/lun0/part4[0]
291668032 blocks [2/1] [U_]
unused devices: <none>
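In the listing above every array shows [2/1] [U_], meaning one of its two members is missing. As a rough sketch of how that check could be scripted (assuming awk is available; the function name `degraded_arrays` is hypothetical), the following prints the name of each array whose member map contains an underscore:

```shell
#!/bin/sh
# Print the name of every md array whose member map contains "_"
# (a missing or failed member), reading /proc/mdstat-style text.
# $1: file to read, defaults to /proc/mdstat -- illustrative argument
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }    # remember the current array
        /blocks/ && /\[[U_]*_[U_]*\]/ { print name }   # e.g. [U_] or [_U]
    ' "${1:-/proc/mdstat}"
}
```

Run against the listing above, this should name all four arrays; once both drives are in sync it should print nothing.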
There are other monitor functions that you can play around with.

Now we create the file systems on each of the four partitions, starting with the swap partition:

# /sbin/mkswap /dev/md3
# /sbin/swapon /dev/md3

Then the other three partitions:

# /usr/bin/mke2fs -j /dev/md2
# /usr/bin/mke2fs -j /dev/md1
# /usr/bin/mke2fs -j /dev/md4

Mount the new partitions on the 'flash' directory temporarily. We will remount them in their rightful place (/share/hdd/data) on reboot.

# mount -t ext3 /dev/md2 /share/flash/conf
# mount -t ext3 /dev/md1 /share/flash/data
# mkdir /share/flash/data/public
# chown admin.everyone /share/flash/data/public
# chmod 775 /share/flash/data/public
# mount -t ext3 /dev/md4 /share/flash/data/public

Copy entire file system to RAID partitions

The copy commands below are borrowed code:

# cd /share/hdd/conf
# /usr/bin/find ./ -print0 -mount | /usr/bin/cpio -p -0 -d -m -u /share/flash/conf
# cd /
# /usr/bin/find . -path './public' -prune -o -print0 -mount | /usr/bin/cpio -p -0 -d -m -u /share/flash/data
# cd /public
# /usr/bin/find ./ -print0 -mount | /usr/bin/cpio -p -0 -d -m -u /share/flash/data/public

The slug can manage about 10 Mbytes/sec at most, so this last command could take a long time if you have a lot of data. There might be a quicker way using the dd command but this way works for me. (I have found the dd command useful for files that are too large for the cp command, e.g. DVD ISO files.)

Diversion Scripts

In Unslung 5.5 the root filesystem is mounted on one of the hard drives on reboot. The script that actually does the mounting is called 'linuxrc'.
# cp /initrd/linuxrc /initrd/linuxrc.orig
/bin/mknod /dev/md4 b 9 4 2>/dev/null
/sbin/insmod /unslung/md.o
/sbin/insmod /unslung/raid1.o
/unslung/mdadm -A /dev/md4 -R /dev/sdb4
/unslung/mdadm -A /dev/md3 -R /dev/sdb3
/unslung/mdadm -A /dev/md2 -R /dev/sdb2
/unslung/mdadm -A /dev/md1 -R /dev/sdb1
/bin/sleep 5
/bin/mount -rt ext3 /dev/md1 /mnt
# ls -l /initrd/unslung
-rw-rw-r-- 1 root root 53392 Jul 19 22:15 md.o
-rwxrwxr-x 1 root root 121368 Jul 19 22:15 mdadm
-rw-rw-r-- 1 root root 20192 Jul 19 22:15 raid1.o
# ls -l /share/flash/data/unslung
-rw-r--r-- 1 root root 1902 Aug 29 11:34 rc.1
-rw-r--r-- 1 root root 1488 Aug 29 14:50 rc.halt
-rw-r--r-- 1 root root 1140 Aug 29 14:50 rc.reboot
-rw-r--r-- 1 root root 1437 Aug 29 11:37 rc.sysinit
# cd /
# umount /share/flash/conf
# umount /share/flash/data/public
# umount /share/flash/data
# swapoff /dev/md3
# /opt/sbin/mdadm -S /dev/md4
# /opt/sbin/mdadm -S /dev/md3
# /opt/sbin/mdadm -S /dev/md2
# /opt/sbin/mdadm -S /dev/md1
Resyncing the RAID Arrays

At this stage we have a working RAID array containing just a single drive. The second drive is still unslung and you could still return to your original configuration. The next step is to add the unslung drive to the RAID array to take it out of degraded mode.

This step in the process wipes all data from your unslung drive. If you're not confident about doing that for any reason then just perform these steps on another blank drive and leave your unslung drive untouched. That way, in order to return to your original configuration you just have to replace the original linuxrc file and reboot with the unslung drive attached.

While still running the slug from the second drive, prepare the partitions on the unslung drive (/dev/sda) in exactly the same way as was done for the second drive under "Change Hard Drive Partitions".
Now the array can be prepared to use both disks:
# mknod /dev/md4 b 9 4 2>/dev/null
# /unslung/mdadm -A /dev/md4 -R /dev/sdb4
# /unslung/mdadm -A /dev/md3 -R /dev/sdb3
# /unslung/mdadm -A /dev/md2 -R /dev/sdb2
# /unslung/mdadm -A /dev/md1 -R /dev/sdb1
# /unslung/mdadm -a /dev/md3 /dev/sda3
# /unslung/mdadm -a /dev/md2 /dev/sda2
# /unslung/mdadm -a /dev/md1 /dev/sda1
# /unslung/mdadm -a /dev/md4 /dev/sda4
The order is important here: the /dev/md4 array will take hours to resync, so you should do the other three first.
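The smallest-first ordering can also be derived from /proc/mdstat instead of being typed by hand. This is only a sketch under stated assumptions: array mdN is paired with partition N of the disk being added (as in this howto), awk and sort are available, and the function name `hotadd_plan` is hypothetical. It prints the commands rather than running them.

```shell
#!/bin/sh
# Print (not run) hot-add commands, smallest array first, from
# /proc/mdstat-style text.
# $1: disk being added, without /dev/ (e.g. sda) -- illustrative argument
# $2: file to read, defaults to /proc/mdstat     -- illustrative argument
hotadd_plan() {
    newdisk="$1"
    awk '
        /^md[0-9]+ :/ { name = $1 }          # remember the current array
        / blocks/     { print $1, name }     # emit "<blocks> <array>"
    ' "${2:-/proc/mdstat}" |
    sort -n |
    while read -r blocks name; do
        n="${name#md}"                       # md3 -> 3
        echo "/unslung/mdadm -a /dev/$name /dev/$newdisk$n"
    done
}
```

With the block counts shown earlier in this howto, `hotadd_plan sda` should list md3, md2, md1 and then md4, which is the same order as the commands above.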
# cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md1 : active raid1 scsi/host1/bus0/target0/lun0/part1[2] sdb1[0]
1204736 blocks [2/2] [UU]
md2 : active raid1 scsi/host1/bus0/target0/lun0/part2[1] sdb2[0]
120384 blocks [2/2] [UU]
md3 : active raid1 scsi/host1/bus0/target0/lun0/part3[1] sdb3[0]
56128 blocks [2/2] [UU]
md4 : active raid1 scsi/host1/bus0/target0/lun0/part4[2] sdb4[0]
291668032 blocks [2/1] [U_]
[>....................] recovery = 0.0% (25664/291668032) finish=569.7min speed=8554K/sec
unused devices: <none>
Yes, it really will take 569 minutes (almost 10 hours) to resync my two 300GB drives. For $70 you get 10MByte/sec throughput and that's it! While the disks are resyncing it's best not to touch the slug. I have found that if I try to perform any I/O tasks the speed of resyncing falls precipitously and never rises back up, so I just leave it alone (say, overnight) until it's finished. There are parameters that can be adjusted (Google for mdadm speed_limit_max) but I have found them to be ineffective.
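The finish estimate can be roughly reproduced from the other figures on the recovery line. The sketch below assumes that blocks in /proc/mdstat are 1K each and that the speed stays constant; the function name `resync_eta_min` is hypothetical.

```shell
#!/bin/sh
# Estimate the remaining resync time in whole minutes.
# $1: total blocks, $2: blocks already synced, $3: speed in K/sec
# (blocks are taken to be 1K each, matching the K/sec speed figure).
resync_eta_min() {
    total="$1"; synced="$2"; speed="$3"
    echo $(( (total - synced) / speed / 60 ))
}

# Figures taken from the recovery line above.
resync_eta_min 291668032 25664 8554   # prints 568
```

The result is within a couple of minutes of the 569.7min reported by the kernel, which bases its estimate on its own speed measurement.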
This is also the reason why the resyncing is done on the standard Linksys filesystem rather than the unslung filesystem: when running from the unslung filesystem the slug is constantly writing to it.

Once the resyncing has completed we create an mdadm.conf file, then stop the raid arrays and edit linuxrc.

Create mdadm.conf:

# /bin/echo "DEVICE /dev/sd[ab][1234]" > /unslung/mdadm.conf
# /unslung/mdadm --detail --scan >> /unslung/mdadm.conf

Mount the new root filesystem and copy mdadm.conf into place:

# mount -t ext3 /dev/md1 /share/hdd/data
# cp /unslung/mdadm.conf /share/hdd/data/opt/etc/mdadm.conf
# umount /share/hdd/data

Stop the raid arrays:

# /unslung/mdadm --stop --scan --config=/unslung/mdadm.conf

Edit linuxrc, replacing:

/unslung/mdadm -A /dev/md4 -R /dev/sdb4
/unslung/mdadm -A /dev/md3 -R /dev/sdb3
/unslung/mdadm -A /dev/md2 -R /dev/sdb2
/unslung/mdadm -A /dev/md1 -R /dev/sdb1

With:

/unslung/mdadm --assemble --scan --config=/unslung/mdadm.conf
/bin/sleep 140
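As an alternative to a fixed delay, one could poll /proc/mdstat until the root array is no longer resyncing. This is not what the howto does; it is a sketch of a swapped-in technique, and it assumes that /proc is mounted and that awk and sleep are usable at that point in the boot, which has not been verified on the slug. The function names `array_syncing` and `wait_for_sync` are hypothetical.

```shell
#!/bin/sh
# True if the lines belonging to the given array in mdstat-style text
# mention resync or recovery progress.
# $1: array name (e.g. md1), $2: file to read
array_syncing() {
    awk -v a="$1" '
        /^md[0-9]+ :/ { cur = $1 }                      # current array
        cur == a && /(resync|recovery) *=/ { found = 1 }
        END { exit !found }                             # 0 if syncing
    ' "$2"
}

# Wait until the array is idle, checking every 5 seconds.
# Returns 0 once idle, 1 if the limit (in seconds) is reached first.
# $1: array name, $2: limit in seconds, $3: file (default /proc/mdstat)
wait_for_sync() {
    array="$1"
    limit="$2"
    table="${3:-/proc/mdstat}"
    waited=0
    while array_syncing "$array" "$table"; do
        [ "$waited" -ge "$limit" ] && return 1
        sleep 5
        waited=$((waited + 5))
    done
    return 0
}
```

For example, `wait_for_sync md1 140` would return as soon as md1 is clean and would give up after roughly the same 140 seconds otherwise.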
I picked 140 seconds as that is enough time for my (overclocked) slug to resync the boot (1GB) partition.

Switch off the slug and switch it back on again, then telnet/ssh in and check that everything is running.

Err... that's it! You should have a working RAID 1 array.

Troubleshooting:

Reboot with one drive attached failed: Reinstall the RAID tools and kernel modules, then assemble the arrays from the second drive and mount the root array:
# /usr/bin/ipkg-cl update
# /usr/bin/ipkg-cl install mdadm
# /usr/bin/ipkg-cl install kernel-module-md
# /usr/bin/ipkg-cl install kernel-module-raid1
# mv /opt/sbin/mdadm /unslung/mdadm
# chmod 755 /unslung/mdadm
# cd /
# mv /lib/modules/2.4.22-xfs/kernel/drivers/md/md.o /unslung/md.o
# mv /lib/modules/2.4.22-xfs/kernel/drivers/md/raid1.o /unslung/raid1.o
# /sbin/insmod /unslung/md.o
# /sbin/insmod /unslung/raid1.o
# /unslung/mdadm -A /dev/md4 -R /dev/sdb4
# /unslung/mdadm -A /dev/md3 -R /dev/sdb3
# /unslung/mdadm -A /dev/md2 -R /dev/sdb2
# /unslung/mdadm -A /dev/md1 -R /dev/sdb1
# /bin/mount -t ext3 /dev/md1 /share/hdd/data
Return to original unslung configuration:
# cd /
# /bin/umount /share/hdd/data
# /unslung/mdadm -S /dev/md4
# /unslung/mdadm -S /dev/md3
# /unslung/mdadm -S /dev/md2
# /unslung/mdadm -S /dev/md1
# /unslung/mdadm --zero-superblock /dev/sdb4
# /unslung/mdadm --zero-superblock /dev/sdb3
# /unslung/mdadm --zero-superblock /dev/sdb2
# /unslung/mdadm --zero-superblock /dev/sdb1
Slug doesn't reboot with no drives attached: This is probably caused by an error in the linuxrc script. To fix it you will need to flash the 5.5 firmware onto the slug again. No data should have been lost at this stage and you can still return to your original unslung configuration. This section will not work unless you were unslung on 5.x (or maybe 4.x at a pinch).
# /usr/bin/ipkg-cl update
# /usr/bin/ipkg-cl install mdadm
# /usr/bin/ipkg-cl install kernel-module-md
# /usr/bin/ipkg-cl install kernel-module-raid1
failed to RUN_ARRAY: If, when you try to create the raid array, you get an error message that says

mdadm: failed to add /dev/sdb2 to /dev/md2: Invalid argument
mdadm: failed to RUN_ARRAY /dev/md2: Invalid argument

then the problem could be that you forgot to kill the usb_detect process and the slug has mounted the /dev/sd[ab][12] partitions. Unmount these and try again.

Failed Drive: Repeat the steps in this howto starting at "Resyncing the RAID Arrays".

Power Loss: RAID 1 obviously won't help if you lose both drives simultaneously, but unless you're very unlucky you shouldn't lose any data. The worst that might happen is that one of the disks will have to be replaced if the slug was performing an I/O operation on it when the power went. After a power loss you must restart the slug with no drives attached and then repeat the steps in this howto starting at "Resyncing the RAID Arrays". The reason that you cannot simply reboot is that all 4 raid arrays will try to resync and, because of the load on the CPU during startup, they will never finish the resync.

If you suspect that the slug was performing a disk read or write operation when you lost power then the disks might be corrupted. Read http://www.tldp.org/HOWTO/Software-RAID-0.4x-HOWTO-4.html and follow steps 1 to 3 of method 2 before repeating the steps in this howto starting at "Resyncing the RAID Arrays".

This leads to the obvious warning: never switch the slug off while it is reading from or writing to a raid array (drive lights flashing).
Last edited by nsc.
Based on work by PatrickSchneider, nsc, Torsten Bitz, and dcordes. Originally by nsc. Page last modified on June 03, 2007, at 10:33 AM