NSLU2-Linux

HowTo.Raid1onUnslung5 History


June 03, 2007, at 10:33 AM by nsc -- Additional troubleshooting
Added lines 308-311:

failed to RUN_ARRAY: If you try to create the raid array and get the error messages "mdadm: failed to add /dev/sdb2 to /dev/md2: Invalid argument" and "mdadm: failed to RUN_ARRAY /dev/md2: Invalid argument", the problem could be that you forgot to kill the usb_detect process and the slug has mounted the /dev/sd[ab][12] partitions. Unmount these and try again.
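
One way to see whether the slug has mounted any of the partitions before retrying is to look through /proc/mounts. The following is only a sketch and not part of the original howto: the function name mounted_raid_parts is made up for illustration, and it assumes the drives show up under the plain /dev/sda and /dev/sdb names (on some setups they may be listed under longer scsi/host... names instead) and that awk is available.

```shell
# List any /dev/sd[ab][1-4] partitions that are currently mounted.
# Reads a mounts table in /proc/mounts format; the file argument
# defaults to /proc/mounts. Hypothetical helper, not from the howto.
mounted_raid_parts() {
    mounts_file="${1:-/proc/mounts}"
    # Field 1 is the device, field 2 is the mount point.
    awk '$1 ~ /^\/dev\/sd[ab][1-4]$/ { print $1 " " $2 }' "$mounts_file"
}
```

If the function prints anything, umount each listed mount point and then rerun the mdadm --create command.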

May 29, 2006, at 02:19 PM by PatrickSchneider -- Added Remark about RAID1 swap
Added lines 8-11:

Remark

I'm just curious about the use of mirrored swap partitions. I would normally set up swap in striped mode (RAID0) for performance, but if you take a look at http://linas.org/linux/Software-RAID/Software-RAID-8.html (Question 18) you'll see that you don't need to set up RAID0 for swap because the Linux kernel does this automatically.

March 06, 2006, at 12:16 AM by nsc --
Changed line 184 from:
  • If there are any instances of usb_detect running (check with ps -ef) then kill them as they will attempt to mount your drives before we can sync them.
to:
  • If there are any instances of usb_detect running (check with ps -ef) then kill them as they will attempt to mount your drives before we can sync them. Also, you might get locked out from the slug because the 'root' password will change.
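
The usb_detect check can be scripted. This is a rough sketch rather than anything from the howto itself: the function name usb_detect_pids is hypothetical, and it assumes the PID is in column 2 of the ps output (the usual ps -ef layout). If the ps on your slug prints the PID in a different column, pass that column number as the first argument.

```shell
# Print the PIDs of any usb_detect processes found in `ps` output on stdin.
# The PID column is assumed to be column 2, as in the usual `ps -ef` layout;
# pass a different column number if your ps prints it elsewhere.
usb_detect_pids() {
    pid_col="${1:-2}"
    # Skip the awk process itself, whose command line also mentions usb_detect.
    awk -v col="$pid_col" '/usb_detect/ && !/awk/ { print $col }'
}
```

Typical use would be: ps -ef | usb_detect_pids | while read pid; do kill "$pid"; done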
January 22, 2006, at 12:38 PM by nsc -- last comment was for rc.1 not rc.sysinit
Changed lines 146-147 from:
  • If you are using SSH to access your slug (and I recommend you do) then please examine the rc.sysinit script closely. It creates a link to the root user's home directory called /root. If your root user's home directory is not /opt/user/root then you need to edit or comment out this line.
to:
  • If you are using SSH to access your slug (and I recommend you do) then please examine the rc.1 script closely. It creates a link to the root user's home directory called /root. If your root user's home directory is not /opt/user/root then you need to edit or comment out this line.
January 22, 2006, at 12:36 PM by nsc -- Added comment about /root link in rc.sysinit
Added lines 146-147:
  • If you are using SSH to access your slug (and I recommend you do) then please examine the rc.sysinit script closely. It creates a link to the root user's home directory called /root. If your root user's home directory is not /opt/user/root then you need to edit or comment out this line.
January 22, 2006, at 11:42 AM by nsc -- Add mknod call to linuxrc
Changed lines 122-123 from:
  • Using a text editor on /initrd/linuxrc, replace "/bin/mount -rt ext3 /dev/$prefroot /mnt" with these 8 lines:
to:
  • Using a text editor on /initrd/linuxrc, replace "/bin/mount -rt ext3 /dev/$prefroot /mnt" with these 9 lines:
/bin/mknod /dev/md4 b 9 4 2>/dev/null
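
For anyone wondering where "b 9 4" comes from: md devices are block devices with major number 9, and the minor number is the array number. The sketch below only prints the mknod commands so they can be checked before being run; the function name md_mknod_cmds is made up for this example.

```shell
# Print (not run) the mknod command for each md device from $1 to $2.
# md devices are block devices with major number 9; the minor number
# is the array number, which is where "b 9 4" for /dev/md4 comes from.
md_mknod_cmds() {
    n="$1"
    while [ "$n" -le "$2" ]; do
        echo "/bin/mknod /dev/md$n b 9 $n"
        n=$((n + 1))
    done
}
```

Calling md_mknod_cmds 4 4 prints the single command used in linuxrc, without the 2>/dev/null that hides the error when the node already exists.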
October 27, 2005, at 03:59 AM by Torsten Bitz -- detail addition
Changed line 226 from:

Edit linuxrc as follows:

to:

Edit linuxrc (at this point located in the filesystem root /) as follows:

October 26, 2005, at 05:01 AM by Torsten Bitz -- added necessary steps after fdisk-required reboot, changed order for partitioning unslung drive.
Changed lines 58-59 from:

Now use 'w' to write the new partition table to the disk & exit fdisk

to:

Now use 'w' to write the new partition table to the disk & exit fdisk.

If fdisk requests a reboot for the changes to the partition table to become effective, do so. After rebooting, re-enabling telnet and logging in, don't forget to load the kernel modules again:

 # /sbin/insmod md.o
 # /sbin/insmod raid1.o
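
To avoid running insmod on a module that is already loaded, the module list in /proc/modules can be checked first. This is a minimal sketch, assuming awk is available; module_loaded is a hypothetical helper name, not something shipped with Unslung.

```shell
# Succeed if module $1 appears in a /proc/modules-format file
# (default /proc/modules). The first field of each line is the module name.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}
```

For example: module_loaded md || /sbin/insmod md.o, followed by the same line for raid1 and raid1.o.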
Changed lines 172-174 from:
  • Switch off slug
  • Unplug both drives and switch back on
  • Telnet into slug (after using web interface to enable telnet)
to:

While still running the slug from the second drive, prepare the partitions on the unslung drive in exactly the same way as it was done for the second drive before:

Added lines 174-181:
  • Plug in the first drive (unslung drive) and wait 30sec for it to be recognised
  • Use /opt/bin/busybox fdisk /dev/sda to delete the existing partitions, create new partitions (same layout/sizes as done before for the second drive) and change the partition types of the drive to 'fd'. Finally write the new partition table to the disk and exit fdisk.

Now the array can be prepared to use both disks:

  • Switch off slug
  • Unplug both drives and switch back on
  • Telnet into slug (after using web interface to enable telnet)
  • If there are any instances of usb_detect running (check with ps -ef) then kill them as they will attempt to mount your drives before we can sync them.
Changed lines 183-184 from:
  • Use /opt/bin/busybox fdisk /dev/sda to change the partition types of the unslung drive to 'fd'
  • Create the 3 raid arrays in degraded form:
to:
  • Create the 4 raid arrays in degraded form:
September 02, 2005, at 08:04 AM by nsc --
Changed lines 54-55 from:
 /dev/sdb4             173     36483   292872951   fd  Linux raid autodetect
to:
 /dev/sdb4             173     36483   291668107+  fd  Linux raid autodetect
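
The corrected block count can be reproduced from the cylinder range. The sketch below assumes a geometry of 255 heads and 63 sectors per track (16065 sectors per cylinder); the howto does not state the geometry, but that assumption reproduces every figure in the partition table. The function name part_blocks is made up for this example.

```shell
# Blocks (1 KB units) for a partition spanning cylinders $1..$2, in the
# form fdisk prints them. Assumes 255 heads x 63 sectors = 16065 sectors
# per cylinder (an assumption, though it matches the figures in the table).
# $3 is an optional number of sectors to subtract, e.g. 63 for a first
# partition that starts one track into the disk.
part_blocks() {
    sectors=$(( ($2 - $1 + 1) * 16065 - ${3:-0} ))
    blocks=$(( sectors / 2 ))
    # fdisk appends "+" when an odd sector is left over.
    if [ $(( sectors % 2 )) -eq 1 ]; then
        echo "${blocks}+"
    else
        echo "$blocks"
    fi
}
```

With these assumptions, part_blocks 173 36483 gives 291668107+, while the old figure of 292872951 is what part_blocks 1 36461 63 gives, i.e. the size of the original single data partition.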
September 02, 2005, at 07:40 AM by nsc -- replaced copy with move in initrd
Changed line 288 from:
  • Find and copy required mdadm, md.o and raid1.o to /unslung
to:
  • Find and move (not copy) required mdadm, md.o and raid1.o to /unslung
September 02, 2005, at 07:30 AM by nsc -- Changed copy from initrd instructions
Changed line 240 from:
# cp /opt/sbin/mdadm /unslung/mdadm
to:
# mv /opt/sbin/mdadm /unslung/mdadm
Changed lines 243-244 from:
# /usr/bin/find . -name md.o -print0 -mount | /usr/bin/cpio -p -0 -d -m -u /unslung
# /usr/bin/find . -name raid1.o -print0 -mount | /usr/bin/cpio -p -0 -d -m -u /unslung
to:
# mv /lib/modules/2.4.22-xfs/kernel/drivers/md/md.o /unslung/md.o
# mv /lib/modules/2.4.22-xfs/kernel/drivers/md/raid1.o /unslung/raid1.o
September 01, 2005, at 03:41 PM by dcordes -- Stopping the swap on /dev/md3 so we can umount it.
Added lines 145-146:
  • Stop swapping on /dev/md3 so we can umount it:
# swapoff /dev/md3
September 01, 2005, at 03:15 PM by dcordes -- Adding the location of the kernel modules.
Changed line 126 from:
  • Copy mdadm, md.o and raid1.o into /initrd/unslung. Ensure that the new mdadm file has execute permissions (i.e. chmod 755 /initrd/unslung/mdadm).
to:
  • Copy mdadm, md.o and raid1.o into /initrd/unslung. Ensure that the new mdadm file has execute permissions (i.e. chmod 755 /initrd/unslung/mdadm). You'll find md.o and raid1.o in the /lib/modules/2.4.22-xfs/kernel/drivers/md directory.
September 01, 2005, at 10:56 AM by nsc -- Formatting Changes
Changed lines 141-144 from:
 # cd /
 # umount /share/flash/conf
 # umount /share/flash/data/public
 # umount /share/flash/data
to:
# cd /
# umount /share/flash/conf
# umount /share/flash/data/public
# umount /share/flash/data
Changed lines 146-149 from:
 # /opt/sbin/mdadm -S /dev/md4
 # /opt/sbin/mdadm -S /dev/md3
 # /opt/sbin/mdadm -S /dev/md2
 # /opt/sbin/mdadm -S /dev/md1
to:
# /opt/sbin/mdadm -S /dev/md4
# /opt/sbin/mdadm -S /dev/md3
# /opt/sbin/mdadm -S /dev/md2
# /opt/sbin/mdadm -S /dev/md1
Changed lines 184-186 from:
@@# cat /proc/mdstat
@@Personalities : [raid1]
@@read_ahead 1024 sectors
to:
# cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
Changed lines 216-219 from:
 /unslung/mdadm -A /dev/md4 -R /dev/sdb4
 /unslung/mdadm -A /dev/md3 -R /dev/sdb3
 /unslung/mdadm -A /dev/md2 -R /dev/sdb2
 /unslung/mdadm -A /dev/md1 -R /dev/sdb1
to:
/unslung/mdadm -A /dev/md4 -R /dev/sdb4
/unslung/mdadm -A /dev/md3 -R /dev/sdb3
/unslung/mdadm -A /dev/md2 -R /dev/sdb2
/unslung/mdadm -A /dev/md1 -R /dev/sdb1
Changed lines 221-222 from:
 /unslung/mdadm --assemble --scan --config=/unslung/mdadm.conf
 /bin/sleep 140
to:
/unslung/mdadm --assemble --scan --config=/unslung/mdadm.conf
/bin/sleep 140
Changed lines 234-242 from:
# /usr/bin/ipkg-cl update
# /usr/bin/ipkg-cl install mdadm
# /usr/bin/ipkg-cl install kernel-module-md
# /usr/bin/ipkg-cl install kernel-module-raid1
# cp /opt/sbin/mdadm /unslung/mdadm
# chmod 755 /unslung/mdadm
# cd /
# /usr/bin/find . -name md.o -print0 -mount | /usr/bin/cpio -p -0 -d -m -u /unslung
# /usr/bin/find . -name raid1.o -print0 -mount | /usr/bin/cpio -p -0 -d -m -u /unslung
to:
# /usr/bin/ipkg-cl update
# /usr/bin/ipkg-cl install mdadm
# /usr/bin/ipkg-cl install kernel-module-md
# /usr/bin/ipkg-cl install kernel-module-raid1
# cp /opt/sbin/mdadm /unslung/mdadm
# chmod 755 /unslung/mdadm
# cd /
# /usr/bin/find . -name md.o -print0 -mount | /usr/bin/cpio -p -0 -d -m -u /unslung
# /usr/bin/find . -name raid1.o -print0 -mount | /usr/bin/cpio -p -0 -d -m -u /unslung
Changed lines 246-251 from:
# /sbin/insmod /unslung/md.o
# /sbin/insmod /unslung/raid1.o
# /unslung/mdadm -A /dev/md4 -R /dev/sdb4
# /unslung/mdadm -A /dev/md3 -R /dev/sdb3
# /unslung/mdadm -A /dev/md2 -R /dev/sdb2
# /unslung/mdadm -A /dev/md1 -R /dev/sdb1
to:
# /sbin/insmod /unslung/md.o
# /sbin/insmod /unslung/raid1.o
# /unslung/mdadm -A /dev/md4 -R /dev/sdb4
# /unslung/mdadm -A /dev/md3 -R /dev/sdb3
# /unslung/mdadm -A /dev/md2 -R /dev/sdb2
# /unslung/mdadm -A /dev/md1 -R /dev/sdb1
Changed line 253 from:
# /bin/mount -t ext3 /dev/md1 /share/hdd/data
to:
# /bin/mount -t ext3 /dev/md1 /share/hdd/data
Changed lines 259-264 from:
# cd /
# /bin/umount /share/hdd/data
# /unslung/mdadm -S /dev/md4
# /unslung/mdadm -S /dev/md3
# /unslung/mdadm -S /dev/md2
# /unslung/mdadm -S /dev/md1
to:
# cd /
# /bin/umount /share/hdd/data
# /unslung/mdadm -S /dev/md4
# /unslung/mdadm -S /dev/md3
# /unslung/mdadm -S /dev/md2
# /unslung/mdadm -S /dev/md1
Changed lines 266-269 from:
# /unslung/mdadm --zero-superblock /dev/sdb4
# /unslung/mdadm --zero-superblock /dev/sdb3
# /unslung/mdadm --zero-superblock /dev/sdb2
# /unslung/mdadm --zero-superblock /dev/sdb1
to:
# /unslung/mdadm --zero-superblock /dev/sdb4
# /unslung/mdadm --zero-superblock /dev/sdb3
# /unslung/mdadm --zero-superblock /dev/sdb2
# /unslung/mdadm --zero-superblock /dev/sdb1
Changed lines 282-285 from:
# /usr/bin/ipkg-cl update
# /usr/bin/ipkg-cl install mdadm
# /usr/bin/ipkg-cl install kernel-module-md
# /usr/bin/ipkg-cl install kernel-module-raid1
to:
# /usr/bin/ipkg-cl update
# /usr/bin/ipkg-cl install mdadm
# /usr/bin/ipkg-cl install kernel-module-md
# /usr/bin/ipkg-cl install kernel-module-raid1
September 01, 2005, at 10:50 AM by nsc -- Formatting changes
Changed lines 15-19 from:
# ipkg update
# ipkg install busybox-base
# ipkg install mdadm
# ipkg install kernel-module-md
# ipkg install kernel-module-raid1
to:
# ipkg update
# ipkg install busybox-base
# ipkg install mdadm
# ipkg install kernel-module-md
# ipkg install kernel-module-raid1
Changed lines 21-23 from:
# /sbin/insmod md.o
# /sbin/insmod raid1.o
to:
# /sbin/insmod md.o
# /sbin/insmod raid1.o
Changed line 85 from:

Now we create the file systems on each of the three partitions, starting with the swap partition:

to:

Now we create the file systems on each of the four partitions, starting with the swap partition:

Changed line 116 from:
# cp /initrd/linuxrc /initrd/linuxrc.orig
to:
# cp /initrd/linuxrc /initrd/linuxrc.orig
Changed lines 118-125 from:
/sbin/insmod /unslung/md.o
/sbin/insmod /unslung/raid1.o
/unslung/mdadm -A /dev/md4 -R /dev/sdb4
/unslung/mdadm -A /dev/md3 -R /dev/sdb3
/unslung/mdadm -A /dev/md2 -R /dev/sdb2
/unslung/mdadm -A /dev/md1 -R /dev/sdb1
/bin/sleep 5
/bin/mount -rt ext3 /dev/md1 /mnt
to:
/sbin/insmod /unslung/md.o
/sbin/insmod /unslung/raid1.o
/unslung/mdadm -A /dev/md4 -R /dev/sdb4
/unslung/mdadm -A /dev/md3 -R /dev/sdb3
/unslung/mdadm -A /dev/md2 -R /dev/sdb2
/unslung/mdadm -A /dev/md1 -R /dev/sdb1
/bin/sleep 5
/bin/mount -rt ext3 /dev/md1 /mnt
Changed lines 128-132 from:
# ls -l /initrd/unslung
-rw-rw-r-- 1 root root 53392 Jul 19 22:15 md.o
-rwxrwxr-x 1 root root 121368 Jul 19 22:15 mdadm
-rw-rw-r-- 1 root root 20192 Jul 19 22:15 raid1.o
to:
# ls -l /initrd/unslung
-rw-rw-r-- 1 root root 53392 Jul 19 22:15 md.o
-rwxrwxr-x 1 root root 121368 Jul 19 22:15 mdadm
-rw-rw-r-- 1 root root 20192 Jul 19 22:15 raid1.o
Changed lines 134-139 from:
# ls -l /share/flash/data/unslung
-rw-r--r-- 1 root root 1902 Aug 29 11:34 rc.1
-rw-r--r-- 1 root root 1488 Aug 29 14:50 rc.halt
-rw-r--r-- 1 root root 1140 Aug 29 14:50 rc.reboot
-rw-r--r-- 1 root root 1437 Aug 29 11:37 rc.sysinit
to:
# ls -l /share/flash/data/unslung
-rw-r--r-- 1 root root 1902 Aug 29 11:34 rc.1
-rw-r--r-- 1 root root 1488 Aug 29 14:50 rc.halt
-rw-r--r-- 1 root root 1140 Aug 29 14:50 rc.reboot
-rw-r--r-- 1 root root 1437 Aug 29 11:37 rc.sysinit
Changed lines 172-176 from:
# mknod /dev/md4 b 9 4 2>/dev/null
# /unslung/mdadm -A /dev/md4 -R /dev/sdb4
# /unslung/mdadm -A /dev/md3 -R /dev/sdb3
# /unslung/mdadm -A /dev/md2 -R /dev/sdb2
# /unslung/mdadm -A /dev/md1 -R /dev/sdb1
to:
# mknod /dev/md4 b 9 4 2>/dev/null
# /unslung/mdadm -A /dev/md4 -R /dev/sdb4
# /unslung/mdadm -A /dev/md3 -R /dev/sdb3
# /unslung/mdadm -A /dev/md2 -R /dev/sdb2
# /unslung/mdadm -A /dev/md1 -R /dev/sdb1
Changed lines 178-181 from:
# /unslung/mdadm -a /dev/md3 /dev/sda3
# /unslung/mdadm -a /dev/md2 /dev/sda2
# /unslung/mdadm -a /dev/md1 /dev/sda1
# /unslung/mdadm -a /dev/md4 /dev/sda4
to:
# /unslung/mdadm -a /dev/md3 /dev/sda3
# /unslung/mdadm -a /dev/md2 /dev/sda2
# /unslung/mdadm -a /dev/md1 /dev/sda1
# /unslung/mdadm -a /dev/md4 /dev/sda4
Changed lines 184-197 from:
# cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md1 : active raid1 scsi/host1/bus0/target0/lun0/part1[2] sdb1[0]
1204736 blocks [2/2] [UU]
md2 : active raid1 scsi/host1/bus0/target0/lun0/part2[1] sdb2[0]
120384 blocks [2/2] [UU]
md3 : active raid1 scsi/host1/bus0/target0/lun0/part3[1] sdb3[0]
56128 blocks [2/2] [UU]
md4 : active raid1 scsi/host1/bus0/target0/lun0/part4[2] sdb4[0]
291668032 blocks [2/1] [U_]
[>....................] recovery = 0.0% (25664/291668032) finish=569.7min speed=8554K/sec
unused devices: <none>
to:
@@# cat /proc/mdstat
@@Personalities : [raid1]
@@read_ahead 1024 sectors
md1 : active raid1 scsi/host1/bus0/target0/lun0/part1[2] sdb1[0]
1204736 blocks [2/2] [UU]
md2 : active raid1 scsi/host1/bus0/target0/lun0/part2[1] sdb2[0]
120384 blocks [2/2] [UU]
md3 : active raid1 scsi/host1/bus0/target0/lun0/part3[1] sdb3[0]
56128 blocks [2/2] [UU]
md4 : active raid1 scsi/host1/bus0/target0/lun0/part4[2] sdb4[0]
291668032 blocks [2/1] [U_]
[>....................] recovery = 0.0% (25664/291668032) finish=569.7min speed=8554K/sec
unused devices: <none>
September 01, 2005, at 10:44 AM by nsc --
Changed lines 54-55 from:
 /dev/sdb1             173     36483   292872951   fd  Linux raid autodetect
to:
 /dev/sdb4             173     36483   292872951   fd  Linux raid autodetect
September 01, 2005, at 10:21 AM by nsc -- Fixed important error in "/opt/sbin/mdadm --create" statements
Changed lines 64-69 from:
 # /opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md4 /dev/sda4 missing 
 # /opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md3 /dev/sda3 missing
 # /opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md2 /dev/sda2 missing
 # /opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md1 /dev/sda1 missing
to:
 # /opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md4 /dev/sdb4 missing 
 # /opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md3 /dev/sdb3 missing
 # /opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md2 /dev/sdb2 missing
 # /opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md1 /dev/sdb1 missing
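
Since a single wrong device name here targets the wrong disk, it can help to generate the four commands from one disk name and read them over before running anything. This is only a sketch; raid_create_cmds is a hypothetical helper, and the options are exactly those used in the commands above.

```shell
# Print (not run) the mdadm --create command for each of the four arrays
# on disk $1 (e.g. sdb), highest-numbered array first as in the howto.
# Review the output, then run the lines by hand.
raid_create_cmds() {
    for n in 4 3 2 1; do
        echo "/opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md$n /dev/$1$n missing"
    done
}
```

Running raid_create_cmds sdb prints the four corrected commands for the second drive.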
September 01, 2005, at 10:16 AM by nsc --
Added lines 56-57:

The sizes of the first three partitions should match this table exactly, the size of the fourth partition will depend on the size of the hard drive. My hard drive is 300GB.

September 01, 2005, at 09:51 AM by nsc --
Changed lines 221-222 from:

Then switch off the slug and switch it back on again.

to:

I picked 140 seconds as that is enough time for my (overclocked) slug to resync the boot (1GB) partition. Switch off the slug and switch it back on again, telnet/ssh in and run cat /proc/mdstat to check that all four partitions are active and that none are resyncing.
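
The /proc/mdstat check can be reduced to one line per array. The following is a rough sketch, assuming the mdstat layout shown elsewhere on this page (an "mdN : active ..." line followed by a "blocks [2/2] [UU]" line and, during a resync, a progress line) and assuming awk is available; mdstat_summary is a made-up name.

```shell
# Summarise a /proc/mdstat-format file (default /proc/mdstat).
# Prints one line per array: "<name> ok", "<name> degraded" or
# "<name> resyncing". Exits non-zero unless every array is ok.
mdstat_summary() {
    awk '
        function emit() {
            if (name != "") { print name, state; if (state != "ok") bad = 1 }
        }
        /^md[0-9]+ :/         { emit(); name = $1; state = "ok" }
        /\[[U_]*_[U_]*\]/     { if (state == "ok") state = "degraded" }
        /recovery =|resync =/ { state = "resyncing" }
        END                   { emit(); exit bad + 0 }
    ' "${1:-/proc/mdstat}"
}
```

After the reboot all four arrays should be reported as ok; anything else means an array is still running on one disk or is still resyncing.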

Changed line 226 from:

Reboot Failed: If you were getting cat /proc/mdstat results similar to the examples listed above then the most likely cause of error is in the diversion scripts. To check these follow these steps:

to:

Reboot with one drive attached failed: If you were getting cat /proc/mdstat results similar to the examples listed above then the most likely cause of error is in the diversion scripts. To check these follow these steps:

Changed lines 231-240 from:
  • Check the unslung directory for the presence of mdadm, md.o and raid1.o
to:
  • Check the unslung directory for the presence of mdadm, md.o and raid1.o. If these files are missing you can run the following commands to create them (I have tested this and it doesn't fill the flash):
# /usr/bin/ipkg-cl update
# /usr/bin/ipkg-cl install mdadm
# /usr/bin/ipkg-cl install kernel-module-md
# /usr/bin/ipkg-cl install kernel-module-raid1
# cp /opt/sbin/mdadm /unslung/mdadm
# chmod 755 /unslung/mdadm
# cd /
# /usr/bin/find . -name md.o -print0 -mount | /usr/bin/cpio -p -0 -d -m -u /unslung
# /usr/bin/find . -name raid1.o -print0 -mount | /usr/bin/cpio -p -0 -d -m -u /unslung
September 01, 2005, at 09:39 AM by nsc --
Changed lines 229-230 from:
  • Switch back on and wait for it to reboot, then telnet in
  • Check your diversion scripts in /unslung. As you have no drives connected you are now on the Linksys-standard root file system so the scripts you see will be the ones that you put in /initrd/unslung.
to:
  • Switch back on and wait for it to reboot, then telnet in (password will be uNSLUng because no drives attached)
  • Check linuxrc for any errors
  • Check the unslung directory for the presence of mdadm, md.o and raid1.o
  • Check that mdadm has the executable bit set (maybe run chmod 755 /unslung/mdadm to be sure)
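
The file checks in the list above can be done in one go. This is a minimal sketch and not part of the howto; check_unslung_dir is a hypothetical name and the directory defaults to /unslung.

```shell
# Check that directory $1 (default /unslung) holds mdadm, md.o and raid1.o
# and that mdadm is executable. Prints a line per problem found and
# returns non-zero if there was any.
check_unslung_dir() {
    dir="${1:-/unslung}"
    rc=0
    for f in mdadm md.o raid1.o; do
        [ -f "$dir/$f" ] || { echo "missing: $dir/$f"; rc=1; }
    done
    if [ -f "$dir/mdadm" ] && [ ! -x "$dir/mdadm" ]; then
        echo "not executable: $dir/mdadm"
        rc=1
    fi
    return $rc
}
```

No output and a zero exit status mean all three files are present and mdadm can be executed.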
September 01, 2005, at 09:32 AM by nsc -- Added to power loss troubleshooting
Added line 60:
Added line 67:
Changed lines 198-199 from:

If you prematurely switch off the slug while this resync is writing to the disks nasty things can happen. I lost one of my harddrives by switching the slug off this way (I got impatient). Luckily I didn't lose any data as the other drive was ok but the damaged drive was rendered useless.

to:

If you prematurely switch off the slug while this resync is writing to the disks (drive lights flashing) nasty things can happen. I lost one of my harddrives by switching the slug off this way (I got impatient). Luckily I didn't lose any data as the other drive was ok but the damaged drive was rendered useless.

Changed lines 278-279 from:

Power Loss: Raid 1 obviously won't help if you lose both drive simultaneously. Read http://www.tldp.org/HOWTO/Software-RAID-0.4x-HOWTO-4.html and follow steps 1 to 3 of method 2 and then repeat the steps in this howto starting at "Resyncing the RAID arrays". PS The fsck function on the slug has been renamed fsck.ext3.

to:

Power Loss: Raid 1 obviously won't help if you lose both drives simultaneously but unless you're very unlucky you shouldn't lose any data. The worst that might happen is that one of the disks will have to be replaced if the slug was performing an I/O operation on it when the power went.

After a power loss you must restart the slug with no drives attached and then repeat the steps in this howto starting at "Resyncing the RAID arrays". The reason that you cannot simply reboot is that all 4 raid arrays will try to resync and, because of the load on the CPU during startup, they will never finish the resync.

If you suspect that the slug was performing a disk read or write operation when you lost power then the disks might be corrupted. Read http://www.tldp.org/HOWTO/Software-RAID-0.4x-HOWTO-4.html and follow steps 1 to 3 of method 2 before repeating the steps in this howto starting at "Resyncing the RAID arrays". PS The fsck function on the slug has been renamed fsck.ext3.

This leads to the obvious warning: Never switch the slug off while it is reading/writing to a raid array (Drive lights flashing).

August 29, 2005, at 04:54 PM by nsc -- Using 4 partitions to avoid resync problem
Changed lines 4-6 from:

Diversion scripts have been written to start the raid array during the boot process and stop the array before shutdown.

Initial Setup

to:

Diversion scripts have been written to start the raid array during the boot process and stop the array before shutdown. Unfortunately it is not possible to stop the array that contains the root filesystem during the shutdown process. This means that that array is marked as 'dirty' on shutdown and it will resync when restarted. Resyncing can take hours for a large partition and no other process can run while the resyncing is happening. To get around this problem this howto splits the root partition into a small partition containing the system files and a much larger partition to hold public files.

This howto uses custom partitions on the attached harddrives, making it impossible to return to the normal Unslung setup without repartitioning the drives and wiping all data. (Of course you can transfer the data off the drives before repartitioning.)

Initial Setup

Changed lines 12-13 from:
  • Format the second drive if not already formatted by the NSLU2. Note: due to a bug in the Linksys firmware the second drive appears to say 0MB capacity on the NSLU2 webpage, but if it is listed as 'Ready' then all is well.
to:
  • Format the second drive if not already formatted by the NSLU2. Note: due to a bug in the Linksys firmware the second drive may say 0MB capacity on the NSLU2 webpage, but if it is listed as 'Ready' then all is well.
  • This howto completely wipes the unslung-drive after all data has been copied to the raid array. If you want to maximise your ability to return to your original configuration (should something go wrong) then prepare a second blank, formatted drive, identical to the first and leave your unslung-drive untouched.
Changed line 16 from:
# ipkg install busybox
to:
# ipkg install busybox-base
Changed lines 24-27 from:

Change Partition Types

The standard NSLU2-formatted disk has three partitions: A 50MB swap partition, a 100MB config partition (mounted as /share/hdd/conf) and the rest of the disk as a data partition (mounted as /share/hdd/data). This howto mirrors all three partitions but it is possible that mirroring the swap partition is a mistake and will have a performance cost when writing to disk. The files in the conf partition are used by Samba and the passwd utility so the conf partition must be mirrored if mirroring the data partition.

In order for the RAID arrays to work on a reboot we need to change the partition types from 83 (Linux) and 82 (Linux swap) to fd (raid autodetect). We use the busybox version of fdisk to accomplish this (the standard version of fdisk has been heavily modified by Linksys) as follows:

to:

Change Hard Drive Partitions

The standard NSLU2-formatted disk has three partitions: A 50MB swap partition, a 100MB config partition (mounted as /share/hdd/conf) and the rest of the disk as a data partition (mounted as /share/hdd/data). Unslung 5.5 also uses the data partition as the root filesystem. When the NSLU2 is switched off or reboots it will try to stop all the raid arrays. Any arrays that are not cleanly stopped will resync during the startup process. The resyncing process can take up to 10 hours for a 300GB harddrive and no other processes can run during that time. That would mean that your system is out of action for up to 10 hours every time you reboot. To avoid this I split the data partition into a 1GB root partition and a 299GB data partition.
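
The 10 hour figure follows from the numbers that /proc/mdstat reports during a resync. As a rough sketch (resync_minutes is a made-up name, and the estimate simply divides the blocks still to copy by the reported speed, which will not match the kernel's own "finish" value exactly):

```shell
# Rough resync time in whole minutes: blocks still to copy divided by
# the speed in K/sec, both taken from a /proc/mdstat recovery line.
# $1 = blocks done, $2 = total blocks, $3 = speed in K/sec.
resync_minutes() {
    echo $(( ($2 - $1) / $3 / 60 ))
}
```

Using the figures from the mdstat listing on this page, resync_minutes 25664 291668032 8554 gives 568 minutes for the large data array, while resync_minutes 0 1204736 8554 gives about 2 minutes for the 1GB root array, which is why only the small root partition is left to resync at boot.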

This howto mirrors all four partitions. I had thought that mirroring the swap partition could have a performance cost when writing to disk (and indeed it does) but I believe there are more read operations (where there is a performance gain) than write operations on swap space. The files in the conf partition are used by Samba and the passwd utility so the conf partition must be mirrored if mirroring the data partition.

In order for the RAID arrays to work on a reboot we also need to change the partition types from 83 (Linux) and 82 (Linux swap) to fd (raid autodetect). We use the busybox version of fdisk to accomplish all this (the standard version of fdisk has been heavily modified by Linksys) as follows:

Changed lines 32-33 from:

Be careful to use '/dev/sdb'. You only want to change the partitions on the empty disk at this stage.

to:

Be careful to use /dev/sdb: the repartitioning process wipes all existing data from the drive.

Changed lines 44-50 from:

Now use option 't' to reset the partition types to 'fd' (these codes are in hex in case you're wondering). You have to do that for each partition: 1,2 and 3. Using 'p' again should show something like:

 Device Boot    Start       End    Blocks   Id  System
 /dev/sdb1               1       36461   292872951   fd  Linux raid autodetect
 /dev/sdb2           36462       36476      120487+  fd  Linux raid autodetect
 /dev/sdb3           36477       36483       56227+  fd  Linux raid autodetect
to:

Use option 'd' to delete all three partitions

Now use option 'n' to add the four partitions. On my system the first partition was 150 blocks, the second was 14 blocks, the third was 6 blocks and the final partition (/dev/sdb4) took up the remainder of the disk. The sizes of the second and third partitions match the standard Linksys sizes.

Now use option 't' to reset the partition types to 'fd' (these codes are in hex in case you're wondering). You have to do that for each partition: 1-4. Using 'p' again should show something like:

 Device Boot    Start       End     Blocks   Id  System
 /dev/sdb1               1       150     1204843+  fd  Linux raid autodetect
 /dev/sdb2             151       165      120487+  fd  Linux raid autodetect
 /dev/sdb3             166       172       56227+  fd  Linux raid autodetect
 /dev/sdb1             173     36483   292872951   fd  Linux raid autodetect
Changed lines 59-63 from:

Create one array for each partition (three in total). The missing parameter indicates that the array is incomplete and that we will supply the second device later. This is referred to as starting the raid in 'degraded' mode. Descriptions of the option parameters can be found in the mdadm man page.

 # /opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md3 /dev/sdb3 missing
 # /opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md2 /dev/sdb2 missing
 # /opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md1 /dev/sdb1 missing
to:

Create one array for each partition (four in total). The 'missing' parameter indicates that the array is incomplete and that we will supply the second device later. This is referred to as starting the raid in 'degraded' mode. Full descriptions of the option parameters can be found in the mdadm man page.

 # mknod /dev/md4 b 9 4 2>/dev/null
 # /opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md4 /dev/sda4 missing 
 # /opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md3 /dev/sda3 missing
 # /opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md2 /dev/sda2 missing
 # /opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md1 /dev/sda1 missing
Changed line 71 from:
       292872832 blocks [2/1] [U_]
to:
       1204736 blocks [2/1] [U_]
Added lines 76-77:
 md4 : active raid1 scsi/host0/bus0/target0/lun0/part4[0]
       291668032 blocks [2/1] [U_] 
Changed line 84 from:

Then the conf and data partitions. On my 300GB harddrive it takes 10mins to create the filesystem on the data partition, even with a TurboSlug.

to:

Then the conf and data partitions. On my 300GB harddrive it takes 10mins to create the filesystem on /dev/md4, even with an overclocked slug.

Changed lines 87-88 from:
to:
 # /usr/bin/mke2fs -j /dev/md4
Changed lines 92-96 from:
to:
 # mkdir /share/flash/data/public
 # chown admin.everyone /share/flash/data/public
 # chmod 775 /share/flash/data/public
 # mount -t ext3 /dev/md4 /share/flash/data/public
Changed line 98 from:

This code was nabbed from the unsling script. I'm sure the authors of that script will be quick to disclaim all responsibility for what you are about to do ;)

to:

This code was nabbed from the unsling script. I'm sure the authors of that script will be quick to disclaim all responsibility for what you are about to do ;)

Changed lines 102-104 from:
 # /usr/bin/find / -print0 -mount | /usr/bin/cpio -p -0 -d -m -u /share/flash/data

The slug can manage about 10 MBytes?/sec at most so this last command could take a long time if you have a lot of data. There might be a quicker way using the dd command but this way works for me.

to:
 # /usr/bin/find . -path './public' -prune -o -print0 -mount | /usr/bin/cpio -p -0 -d -m -u /share/flash/data
 # cd /public
 # /usr/bin/find ./ -print0 -mount | /usr/bin/cpio -p -0 -d -m -u /share/flash/data/public

The slug can manage about 10 Mbytes/sec at most so this last command could take a long time if you have a lot of data. There might be a quicker way using the dd command but this way works for me. (I have found the dd command useful for files that are too large for the cp command e.g. DVD ISO files)

Changed lines 109-110 from:

In Unslung 5.5 the root filesystem is mounted on one of the harddrives on reboot. The script that actually does the mounting is called 'linuxrc' and it can be found in /initrd. We have to modify this script to start the RAID arrays (with only one disk first, then 2 later).

to:

In Unslung 5.5 the root filesystem is mounted on one of the harddrives on reboot. The script that actually does the mounting is called 'linuxrc' and it can be found in /initrd. (There is another copy of linuxrc in the root directory but that is not used.) We have to modify this script to start the RAID arrays (with only one disk first, then 2 later).

Changed line 113 from:
  • Using a text editor on /initrd/linuxrc, replace "/bin/mount -rt ext3 /dev/$prefroot /mnt" with these 7 lines:
to:
  • Using a text editor on /initrd/linuxrc, replace "/bin/mount -rt ext3 /dev/$prefroot /mnt" with these 8 lines:
Added line 116:
/unslung/mdadm -A /dev/md4 -R /dev/sdb4
Changed line 123 from:
  • Copy /initrd/unslung/rc.halt and /initrd/unslung/rc.reboot from HowTo.Raid1DiversionScripts into /initrd/unslung. Here is a listing of my /initrd/unslung directory (file sizes may be slightly different from yours, mine have extra debugging lines):
to:
  • Here is a listing of my /initrd/unslung directory (file sizes may be slightly different from yours, mine have extra debugging lines):
Changed lines 128-130 from:
-rw-r--r-- 1 root root 299 Aug 11 08:41 rc.halt
-rw-r--r-- 1 root root 242 Aug 11 08:16 rc.reboot
  • Copy rc.sysinit, rc.1, rc.halt and rc.reboot into /share/flash/data/unslung. Exact permissions of the diversion scripts don't seem to matter. This is a listing of my unslung directory:
to:
  • Copy rc.sysinit, rc.1, rc.halt and rc.reboot into /share/flash/data/unslung. Exact permissions of the diversion scripts don't seem to matter. The diversion scripts can be found at HowTo.Raid1DiversionScripts. This is a listing of my unslung directory:
Changed lines 131-135 from:
-rw-r--r-- 1 root root 2025 Aug 7 22:35 rc.1
-rw-r--r-- 1 root root 1258 Aug 11 08:41 rc.halt
-rw-r--r-- 1 root root 166 Aug 7 21:12 rc.local
-rw-r--r-- 1 root root 1395 Aug 11 07:56 rc.reboot
-rw-r--r-- 1 root root 406 Aug 7 21:12 rc.sysinit
to:
-rw-r--r-- 1 root root 1902 Aug 29 11:34 rc.1
-rw-r--r-- 1 root root 1488 Aug 29 14:50 rc.halt
-rw-r--r-- 1 root root 1140 Aug 29 14:50 rc.reboot
-rw-r--r-- 1 root root 1437 Aug 29 11:37 rc.sysinit
  • Unmount the raid arrays:
# cd /
# umount /share/flash/conf
# umount /share/flash/data/public
# umount /share/flash/data
  • Stop the raid arrays:
# /opt/sbin/mdadm -S /dev/md4
# /opt/sbin/mdadm -S /dev/md3
# /opt/sbin/mdadm -S /dev/md2
# /opt/sbin/mdadm -S /dev/md1
Changed line 147 from:
  • Unplug DISK 1
to:
  • Unplug DISK 1 (The disk that is unslung)
Added lines 159-160:

This step in the process wipes all data from your unslung drive. If you're not confident about doing that for any reason then just perform these steps on another blank drive and leave your unslung drive untouched. That way, in order to return to your original configuration you just have to replace the original linuxrc file and reboot with the unslung drive attached.

Added lines 168-169:
# mknod /dev/md4 b 9 4 2>/dev/null
# /unslung/mdadm -A /dev/md4 -R /dev/sdb4
Changed lines 177-178 from:

The order is important here: The /dev/md1 array will take hours to resync so you should do the other two first.

to:
# /unslung/mdadm -a /dev/md4 /dev/sda4

The order is important here: The /dev/md4 array will take hours to resync so you should do the other three first.

Changed lines 184-185 from:
292872832 blocks [2/1] [U_]
[>....................] recovery = 0.0% (25664/292872832) finish=569.7min speed=8554K/sec
to:
1204736 blocks [2/2] [UU]
Added lines 189-191:
md4 : active raid1 scsi/host1/bus0/target0/lun0/part4[2] sdb4[0]
291668032 blocks [2/1] [U_]
[>....................] recovery = 0.0% (25664/291668032) finish=569.7min speed=8554K/sec
Changed lines 194-201 from:

Yes, it really will take 569 minutes (almost 10 hours) to resync my two 300GB drives. For $70 you get 10MByte/sec throughput and that's it! While the disks are resyncing I don't touch the slug. I have found that if I try to perform any I/O tasks the speed of resyncing falls precipitously and never rises back up, so I just leave it alone (say, overnight) until it's finished. If anyone knows why this is I would appreciate it if they posted it here. This is also the reason why the resyncing is done on the standard Linksys filesystem rather than the unslung filesystem: The slug is constantly writing to /var/log and running other cron jobs and this causes the resyncing to die. If you prematurely switch off the slug while this resync is writing to the disks nasty things can happen. I lost one of my hard drives by switching the slug off this way (I got impatient). Luckily I didn't lose any data as the other drive was ok.

Once the resyncing has completed we stop the raid arrays, edit linuxrc once again and reboot:

 # /unslung/mdadm -S /dev/md3
 # /unslung/mdadm -S /dev/md2
 # /unslung/mdadm -S /dev/md1
to:

Yes, it really will take 569 minutes (almost 10 hours) to resync my two 300GB drives. For $70 you get 10MByte/sec throughput and that's it! While the disks are resyncing it's best not to touch the slug. I have found that if I try to perform any I/O tasks the speed of resyncing falls precipitously and never rises back up, so I just leave it alone (say, overnight) until it's finished. There are parameters that can be adjusted (Google for mdadm speed_limit_max) but I have found them to be ineffective. This is also the reason why the resyncing is done on the standard Linksys filesystem rather than the unslung filesystem: The slug is constantly writing to /var/log and running other cron jobs and this would cause the resyncing to die. If you prematurely switch off the slug while this resync is writing to the disks nasty things can happen. I lost one of my hard drives by switching the slug off this way (I got impatient). Luckily I didn't lose any data as the other drive was ok, but the damaged drive was rendered useless.
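Since the advice is to leave the slug alone until the resync is done, it helps to have a way of telling when that is. The following is a minimal sketch, not part of the original howto: the function name resync_active is made up for illustration, and it assumes the progress line in /proc/mdstat contains "recovery =" or "resync =" as in the listing above.

```shell
#!/bin/sh
# Succeed (exit status 0) if the mdstat-style text on standard input
# still shows a rebuild in progress, fail otherwise.
# resync_active is a hypothetical helper name, not an mdadm feature.
resync_active() {
    grep -e 'recovery *=' -e 'resync *=' >/dev/null
}

# On the slug this could be polled once a minute (not run here):
#   while resync_active < /proc/mdstat; do /bin/sleep 60; done

# Demonstration on sample text in the format shown in this howto
if resync_active <<'EOF'
md4 : active raid1 scsi/host1/bus0/target0/lun0/part4[2] sdb4[0]
291668032 blocks [2/1] [U_]
[>....................] recovery = 0.0% (25664/291668032) finish=569.7min speed=8554K/sec
EOF
then
    echo "resync still running"
else
    echo "resync finished"
fi
```

Polling /proc/mdstat only reads kernel state, so it should not generate disk I/O of its own, but that has not been measured on a slug.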

Once the resyncing has completed we create a mdadm.conf file, then stop the raid arrays, edit linuxrc once again and reboot:

 # /bin/echo "DEVICE    /dev/sd[ab][1234]" > /unslung/mdadm.conf
 # /unslung/mdadm --detail --scan >> /unslung/mdadm.conf

Mount the new root filesystem and copy mdadm.conf into place:

 # mount -t ext3 /dev/md1 /share/hdd/data
 # cp /unslung/mdadm.conf /share/hdd/data/opt/etc/mdadm.conf
 # umount /share/hdd/data

Stop the raid arrays:

 # /unslung/mdadm --stop --scan --config=/unslung/mdadm.conf
Added line 212:
 /unslung/mdadm -A /dev/md4 -R /dev/sdb4
Changed lines 217-219 from:
 /unslung/mdadm -A /dev/md3 /dev/sdb3 /dev/sda3
 /unslung/mdadm -A /dev/md2 /dev/sdb2 /dev/sda2
 /unslung/mdadm -A /dev/md1 /dev/sdb1 /dev/sda1
to:
 /unslung/mdadm --assemble --scan --config=/unslung/mdadm.conf
 /bin/sleep 140
Added line 233:
# /unslung/mdadm -A /dev/md4 -R /dev/sdb4
Changed line 237 from:
  • Mount the data partition
to:
  • Mount the root partition
Added line 246:
# /unslung/mdadm -S /dev/md4
Added line 251:
# /unslung/mdadm --zero-superblock /dev/sdb4
Changed line 255 from:
  • Change the partition types back to 82 and 83 using busybox fdisk
to:
  • Delete all four partitions using busybox fdisk
Changed lines 257-258 from:
  • Switch off the slug, plug in your two disks and switch back on. You may need to reformat the second drive.
to:
  • Switch off the slug, plug in your two disks and switch back on. Use the web interface to reformat the second drive.
August 12, 2005, at 08:03 AM by nsc --
Changed line 119 from:
  1. TwonkyVision media server working
to:
  1. Twonkyvision media server working
August 12, 2005, at 08:02 AM by nsc -- Corrected reference to diversion script page
Changed line 99 from:
  • Copy /initrd/unslung/rc.halt and /initrd/unslung/rc.reboot from Howto.Raid1DiversionScripts? into /initrd/unslung. Here is a listing of my /initrd/unslung directory (file sizes may be slightly different from yours; mine have extra debugging lines):
to:
  • Copy /initrd/unslung/rc.halt and /initrd/unslung/rc.reboot from HowTo.Raid1DiversionScripts into /initrd/unslung. Here is a listing of my /initrd/unslung directory (file sizes may be slightly different from yours; mine have extra debugging lines):
August 12, 2005, at 07:59 AM by nsc --
Changed lines 89-90 from:
# cp linuxrc linuxrc.orig
  • Using a text editor, replace "/bin/mount -rt ext3 /dev/$prefroot /mnt" with these 7 lines:
to:
# cp /initrd/linuxrc /initrd/linuxrc.orig
  • Using a text editor on /initrd/linuxrc, replace "/bin/mount -rt ext3 /dev/$prefroot /mnt" with these 7 lines:
Changed lines 191-192 from:
  • If you still can't find anything wrong and you want to go back to your original configuration and stop the raid arrays.
to:
  • If you still can't find anything wrong and you want to go back to your original configuration then continue to the next section
Added lines 208-222:

Slug doesn't reboot with no drives attached: This is probably caused by an error in the linuxrc script. To fix it you will need to flash the 5.5 firmware onto the slug again. No data should have been lost at this stage and you can still return to your original unslung configuration. This section will not work unless you were unslung on 5.x (or maybe 4.x at a pinch).

  • Switch slug off and put into upgrade mode using the reset button
  • Flash 5.5 firmware using Upslug or Sercomm upgrade tool
  • Telnet into the slug with no drives attached
  • Create the boot flag file (touch /.sda1root)
  • At this stage you can return to your original unslung configuration by switching off the slug, plugging in your unslung drive and starting up again.
  • However if you want to continue setting up the raid array don't switch off the slug. Instead edit the linuxrc file as described in the Diversion Scripts section.
  • Then install required software. I have tested this and it doesn't fill the flash memory
# /usr/bin/ipkg-cl update
# /usr/bin/ipkg-cl install mdadm
# /usr/bin/ipkg-cl install kernel-module-md
# /usr/bin/ipkg-cl install kernel-module-raid1
  • Find and copy required mdadm, md.o and raid1.o to /unslung
  • Switch off the slug, leave all drives unattached and switch it back on again. Hopefully it should reboot this time and you can continue with the howto.
August 11, 2005, at 10:11 PM by nsc --
Added lines 1-211:

Summary

This page describes setting up RAID 1 (mirror drives) on the NSLU2. The process starts with a blank drive and an unslung drive and ends up with two mirrored drives containing all the data stored on the originally-unslung drive. This is achieved by creating a raid array with just the blank second drive at first, copying the entire contents of the unslung drive onto it and then hotadding the unslung drive to the raid array.

Diversion scripts have been written to start the raid array during the boot process and stop the array before shutdown.

Initial Setup

  • Back up all your data before beginning.
  • Start with two identical drives, one of them unslung. It probably doesn't matter but I marked the usb cable of the unslung drive so that I always plugged it into the DISK 1 slot.
  • Copy any data you want to keep onto the unslung drive
  • Format the second drive if not already formatted by the NSLU2. Note: due to a bug in the Linksys firmware the second drive appears to say 0MB capacity on the NSLU2 webpage, but if it is listed as 'Ready' then all is well.
  • Telnet/SSH into the box and download required software packages. You need mdadm, the updated version of busybox (sometimes busybox requires help from --force-depends) and two kernel modules:
# ipkg update
# ipkg install busybox
# ipkg install mdadm
# ipkg install kernel-module-md
# ipkg install kernel-module-raid1
  • Install modules so kernel can use them.
# /sbin/insmod md.o
# /sbin/insmod raid1.o

Change Partition Types

The standard NSLU2-formatted disk has three partitions: A 50MB swap partition, a 100MB config partition (mounted as /share/hdd/conf) and the rest of the disk as a data partition (mounted as /share/hdd/data). This howto mirrors all three partitions but it is possible that mirroring the swap partition is a mistake and will have a performance cost when writing to disk. The files in the conf partition are used by Samba and the passwd utility so the conf partition must be mirrored if mirroring the data partition.

In order for the RAID arrays to work on a reboot we need to change the partition types from 83 (Linux) and 82 (Linux swap) to fd (raid autodetect). We use the busybox version of fdisk to accomplish this (the standard version of fdisk has been heavily modified by Linksys) as follows:

 # /opt/bin/busybox fdisk /dev/sdb

Be careful to use '/dev/sdb'. You only want to change the partitions on the empty disk at this stage.

Then press 'p' to see the partition table. You should see screen output something like:

 Disk /dev/sdb: 300.0 GB, 300090728448 bytes
 255 heads, 63 sectors/track, 36483 cylinders
 Units = cylinders of 16065 * 512 = 8225280 bytes

 Device Boot    Start       End    Blocks   Id  System
 /dev/sdb1               1       36461   292872951   83  Linux
 /dev/sdb2           36462       36476      120487+  83  Linux
 /dev/sdb3           36477       36483       56227+  82  Linux swap

Now use option 't' to reset the partition types to 'fd' (these codes are in hex in case you're wondering). You have to do that for each partition: 1,2 and 3. Using 'p' again should show something like:

 Device Boot    Start       End    Blocks   Id  System
 /dev/sdb1               1       36461   292872951   fd  Linux raid autodetect
 /dev/sdb2           36462       36476      120487+  fd  Linux raid autodetect
 /dev/sdb3           36477       36483       56227+  fd  Linux raid autodetect

Now use 'w' to write the new partition table to the disk and exit fdisk.
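Before creating the arrays it may be worth confirming that no partition was missed. The sketch below is an illustration only: the function name non_raid_partitions is made up, and it assumes the partition table is printed in the column layout shown above (Device, optional Boot flag, Start, End, Blocks, Id, System).

```shell
#!/bin/sh
# Print every partition whose type Id is not "fd", reading an fdisk-style
# partition table on standard input.
# non_raid_partitions is a hypothetical helper name.
non_raid_partitions() {
    awk '
        $1 ~ /^\/dev\// {
            # A "*" in the Boot column shifts the remaining columns right by one
            id = ($2 == "*") ? $6 : $5
            if (id != "fd") print $1
        }
    '
}

# Illustrative table in which partition 3 was forgotten
non_raid_partitions <<'EOF'
 Device Boot    Start       End    Blocks   Id  System
 /dev/sdb1               1       36461   292872951   fd  Linux raid autodetect
 /dev/sdb2           36462       36476      120487+  fd  Linux raid autodetect
 /dev/sdb3           36477       36483       56227+  82  Linux swap
EOF
```

On the slug the input would presumably come from something like "/opt/bin/busybox fdisk -l /dev/sdb"; an empty result means every partition is already type fd.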

Create and Mount RAID arrays

Create one array for each partition (three in total). The 'missing' parameter indicates that the array is incomplete and that we will supply the second device later. This is referred to as starting the raid in 'degraded' mode. Descriptions of the option parameters can be found in the mdadm man page.

 # /opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md3 /dev/sdb3 missing
 # /opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md2 /dev/sdb2 missing
 # /opt/sbin/mdadm --create --level=1 --raid-devices=2 /dev/md1 /dev/sdb1 missing

Run cat /proc/mdstat to check on the raid status:

 # cat /proc/mdstat
 Personalities : [raid1]
 read_ahead 1024 sectors
 md1 : active raid1 scsi/host0/bus0/target0/lun0/part1[0]
       292872832 blocks [2/1] [U_]
 md2 : active raid1 scsi/host0/bus0/target0/lun0/part2[0]
       120384 blocks [2/1] [U_]
 md3 : active raid1 scsi/host0/bus0/target0/lun0/part3[0]
       56128 blocks [2/1] [U_]
 unused devices: <none>

There are other monitor functions that you can play around with such as "mdadm --examine /dev/sdb1" or "mdadm --detail /dev/md1".
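The status column in /proc/mdstat is the quickest thing to check: [U_] means one member is missing and [UU] means both are present. Here is a rough sketch of a filter that lists the degraded arrays. The function name degraded_arrays is made up for illustration, and the sample input is an edited copy of the listing above (md2 is shown complete only to exercise both cases).

```shell
#!/bin/sh
# Print the name of every md array whose status field (e.g. [U_]) shows a
# missing member. Reads /proc/mdstat-style text on standard input.
# degraded_arrays is a hypothetical helper name.
degraded_arrays() {
    awk '
        # Remember which array the following "blocks" line belongs to
        /^md[0-9]+ :/ { name = $1 }
        # An underscore inside the [UU]-style field marks a missing member
        /blocks/ && /\[[U_]*_[U_]*\]/ { print name }
    '
}

# Demonstration; on the slug the input would be /proc/mdstat itself
degraded_arrays <<'EOF'
Personalities : [raid1]
read_ahead 1024 sectors
md1 : active raid1 scsi/host0/bus0/target0/lun0/part1[0]
       292872832 blocks [2/1] [U_]
md2 : active raid1 scsi/host0/bus0/target0/lun0/part2[0]
       120384 blocks [2/2] [UU]
unused devices: <none>
EOF
```

At this point in the howto every array is expected to be listed, because all of them were deliberately created with one member missing.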

Now we create the file systems on each of the three partitions, starting with the swap partition:

 # /sbin/mkswap /dev/md3
 # /sbin/swapon /dev/md3

Then the conf and data partitions. On my 300GB harddrive it takes 10mins to create the filesystem on the data partition, even with a TurboSlug.

 # /usr/bin/mke2fs -j /dev/md2
 # /usr/bin/mke2fs -j /dev/md1

Mount the new partitions on the 'flash' directory temporarily. We will remount them to their rightful place (/share/hdd/data) on reboot.

 # mount -t ext3 /dev/md2 /share/flash/conf
 # mount -t ext3 /dev/md1 /share/flash/data

Copy entire file system to RAID partitions

This code was nabbed from the unsling script. I'm sure the authors of that script will be quick to disclaim all responsibility for what you are about to do ;)

 # cd /share/hdd/conf
 # /usr/bin/find ./ -print0 -mount | /usr/bin/cpio -p -0 -d -m -u /share/flash/conf
 # cd /
 # /usr/bin/find / -print0 -mount | /usr/bin/cpio -p -0 -d -m -u /share/flash/data

The slug can manage about 10 MBytes/sec at most so this last command could take a long time if you have a lot of data. There might be a quicker way using the dd command but this way works for me.
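After a copy this long, a rough sanity check is to compare how many files and directories exist on each side. This is only a sketch under assumptions: the helper names count_entries and same_entry_count are made up, and equal counts do not prove the contents are identical. It is most meaningful for the conf partition, where source and destination are both plain mount points.

```shell
#!/bin/sh
# Count files and directories below a directory without crossing into other
# mounted filesystems (the same -mount restriction used for the copy).
# Both function names are hypothetical helpers.
count_entries() {
    ( cd "$1" && find . -mount -print | wc -l | tr -d ' ' )
}

# Succeed if two directory trees contain the same number of entries
same_entry_count() {
    [ "$(count_entries "$1")" = "$(count_entries "$2")" ]
}

# On the slug (not run here):
#   same_entry_count /share/hdd/conf /share/flash/conf && echo "conf copy looks complete"

# Small self-contained demonstration using temporary directories
demo="${TMPDIR:-/tmp}/copycheck.$$"
mkdir -p "$demo/src/sub" "$demo/dst/sub"
: > "$demo/src/file"
: > "$demo/dst/file"
if same_entry_count "$demo/src" "$demo/dst"; then
    echo "entry counts match"
else
    echo "entry counts differ"
fi
rm -rf "$demo"
```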

Diversion Scripts

In Unslung 5.5 the root filesystem is mounted on one of the harddrives on reboot. The script that actually does the mounting is called 'linuxrc' and it can be found in /initrd. We have to modify this script to start the RAID arrays (with only one disk first, then 2 later).

  • Take a copy of the original file
# cp linuxrc linuxrc.orig
  • Using a text editor, replace "/bin/mount -rt ext3 /dev/$prefroot /mnt" with these 7 lines:
/sbin/insmod /unslung/md.o
/sbin/insmod /unslung/raid1.o
/unslung/mdadm -A /dev/md3 -R /dev/sdb3
/unslung/mdadm -A /dev/md2 -R /dev/sdb2
/unslung/mdadm -A /dev/md1 -R /dev/sdb1
/bin/sleep 5
/bin/mount -rt ext3 /dev/md1 /mnt
  • Copy mdadm, md.o and raid1.o into /initrd/unslung. Ensure that the new mdadm file has execute permissions (i.e. chmod 755 /initrd/unslung/mdadm).
  • Copy /initrd/unslung/rc.halt and /initrd/unslung/rc.reboot from Howto.Raid1DiversionScripts? into /initrd/unslung. Here is a listing of my /initrd/unslung directory (file sizes may be slightly different from yours; mine have extra debugging lines):
# ls -l /initrd/unslung
-rw-rw-r-- 1 root root 53392 Jul 19 22:15 md.o
-rwxrwxr-x 1 root root 121368 Jul 19 22:15 mdadm
-rw-rw-r-- 1 root root 20192 Jul 19 22:15 raid1.o
-rw-r--r-- 1 root root 299 Aug 11 08:41 rc.halt
-rw-r--r-- 1 root root 242 Aug 11 08:16 rc.reboot
  • Copy rc.sysinit, rc.1, rc.halt and rc.reboot into /share/flash/data/unslung. Exact permissions of the diversion scripts don't seem to matter. This is a listing of my unslung directory:
# ls -l /share/flash/data/unslung
-rw-r--r-- 1 root root 2025 Aug 7 22:35 rc.1
-rw-r--r-- 1 root root 1258 Aug 11 08:41 rc.halt
-rw-r--r-- 1 root root 166 Aug 7 21:12 rc.local
-rw-r--r-- 1 root root 1395 Aug 11 07:56 rc.reboot
-rw-r--r-- 1 root root 406 Aug 7 21:12 rc.sysinit
  • Switch off slug
  • Unplug DISK 1
  • Switch on slug and wait for it to reboot. It should work identically in every way to your old unslung setup. A full check for me is:
    1. NSLU2 web pages operational
    2. Samba works ok
    3. I have access to the box via openssh/telnet
    4. TwonkyVision media server working
    5. etc., etc.
  • If all is well then continue. If not then skip to the Troubleshooting section
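The linuxrc edit described in the list above can also be scripted instead of done by hand in a text editor. What follows is a sketch, not the method the howto was tested with: the function name patch_linuxrc is made up, and it assumes the stock mount command sits on a line of its own. It writes to a separate output file so the result can be inspected before anything in /initrd is replaced.

```shell
#!/bin/sh
# Replace the stock root-mount line in a linuxrc-style file with the seven
# RAID start-up lines from this howto.
# Usage: patch_linuxrc INPUT OUTPUT   (patch_linuxrc is a hypothetical helper)
patch_linuxrc() {
    awk '
        index($0, "/bin/mount -rt ext3 /dev/$prefroot /mnt") {
            print "/sbin/insmod /unslung/md.o"
            print "/sbin/insmod /unslung/raid1.o"
            print "/unslung/mdadm -A /dev/md3 -R /dev/sdb3"
            print "/unslung/mdadm -A /dev/md2 -R /dev/sdb2"
            print "/unslung/mdadm -A /dev/md1 -R /dev/sdb1"
            print "/bin/sleep 5"
            print "/bin/mount -rt ext3 /dev/md1 /mnt"
            next    # drop the original mount line
        }
        { print }   # every other line passes through unchanged
    ' "$1" > "$2"
}

# Demonstration on a two-line stand-in file, not the real linuxrc
sample="${TMPDIR:-/tmp}/linuxrc.sample.$$"
printf '%s\n' '#!/bin/sh' '/bin/mount -rt ext3 /dev/$prefroot /mnt' > "$sample"
patch_linuxrc "$sample" "$sample.new"
cat "$sample.new"
rm -f "$sample" "$sample.new"
```

Keeping linuxrc.orig untouched, as the first step in the list does, remains the way back if the patched file turns out to be wrong.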

Resyncing the RAID Arrays

At this stage we have a working RAID array containing just a single drive. The second drive is still unslung and you could still return to your original configuration. The next step is to add the unslung drive to the raid array to take it out of degraded mode.

  • Switch off slug
  • Unplug both drives and switch back on
  • Telnet into slug (after using web interface to enable telnet)
  • If there are any instances of usb_detect running (check with ps -ef) then kill them as they will attempt to mount your drives before we can sync them.
  • Plug in both drives and wait 30sec for them to be recognised
  • Use /opt/bin/busybox fdisk /dev/sda to change the partition types of the unslung drive to 'fd'
  • Create the 3 raid arrays in degraded form:
# /unslung/mdadm -A /dev/md3 -R /dev/sdb3
# /unslung/mdadm -A /dev/md2 -R /dev/sdb2
# /unslung/mdadm -A /dev/md1 -R /dev/sdb1
  • Hot-Add the unslung drive:
# /unslung/mdadm -a /dev/md3 /dev/sda3
# /unslung/mdadm -a /dev/md2 /dev/sda2
# /unslung/mdadm -a /dev/md1 /dev/sda1

The order is important here: The /dev/md1 array will take hours to resync so you should do the other two first.

  • Run cat /proc/mdstat to see that the disks are re-syncing
# cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md1 : active raid1 scsi/host1/bus0/target0/lun0/part1[2] sdb1[0]
292872832 blocks [2/1] [U_]
[>....................] recovery = 0.0% (25664/292872832) finish=569.7min speed=8554K/sec
md2 : active raid1 scsi/host1/bus0/target0/lun0/part2[1] sdb2[0]
120384 blocks [2/2] [UU]
md3 : active raid1 scsi/host1/bus0/target0/lun0/part3[1] sdb3[0]
56128 blocks [2/2] [UU]
unused devices: <none>

Yes, it really will take 569 minutes (almost 10 hours) to resync my two 300GB drives. For $70 you get 10MByte/sec throughput and that's it! While the disks are resyncing I don't touch the slug. I have found that if I try to perform any I/O tasks the speed of resyncing falls precipitously and never rises back up, so I just leave it alone (say, overnight) until it's finished. If anyone knows why this is I would appreciate it if they posted it here. This is also the reason why the resyncing is done on the standard Linksys filesystem rather than the unslung filesystem: The slug is constantly writing to /var/log and running other cron jobs and this causes the resyncing to die. If you prematurely switch off the slug while this resync is writing to the disks nasty things can happen. I lost one of my hard drives by switching the slug off this way (I got impatient). Luckily I didn't lose any data as the other drive was ok.
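The 569-minute figure can be sanity-checked with integer arithmetic: the array size in 1K blocks divided by the reported speed in K/sec, then by 60. A minimal sketch follows; the function name resync_minutes is made up for illustration, and the result ignores the blocks already synced and any change in speed, so it will differ slightly from the kernel's own estimate.

```shell
#!/bin/sh
# Estimate resync time in whole minutes.
# $1 = array size in 1K blocks, $2 = resync speed in K/sec
# resync_minutes is a hypothetical helper name.
resync_minutes() {
    echo $(( $1 / $2 / 60 ))
}

# Figures taken from the md1 entry in the /proc/mdstat listing above
resync_minutes 292872832 8554
```

For other drive sizes, substitute the block count and speed from your own /proc/mdstat.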

Once the resyncing has completed we stop the raid arrays, edit linuxrc once again and reboot:

 # /unslung/mdadm -S /dev/md3
 # /unslung/mdadm -S /dev/md2
 # /unslung/mdadm -S /dev/md1

Edit linuxrc as follows: Replace

 /unslung/mdadm -A /dev/md3 -R /dev/sdb3
 /unslung/mdadm -A /dev/md2 -R /dev/sdb2
 /unslung/mdadm -A /dev/md1 -R /dev/sdb1

With

 /unslung/mdadm -A /dev/md3 /dev/sdb3 /dev/sda3
 /unslung/mdadm -A /dev/md2 /dev/sdb2 /dev/sda2
 /unslung/mdadm -A /dev/md1 /dev/sdb1 /dev/sda1

Then switch off the slug and switch it back on again.

Err... that's it! You should have a working raid 1 array.

Troubleshooting:

Reboot Failed: If you were getting cat /proc/mdstat results similar to the examples listed above then the most likely cause of error is in the diversion scripts. To check these follow these steps:

  • Switch off the slug
  • Unplug all disks
  • Switch back on and wait for it to reboot, then telnet in
  • Check your diversion scripts in /unslung. As you have no drives connected you are now on the Linksys-standard root file system so the scripts you see will be the ones that you put in /initrd/unslung.
  • If you can't find any errors then plug in the drive that was blank (and now is part of the degraded raid array) and wait for 30secs.
  • Restart the raid arrays
# /sbin/insmod /unslung/md.o
# /sbin/insmod /unslung/raid1.o
# /unslung/mdadm -A /dev/md3 -R /dev/sdb3
# /unslung/mdadm -A /dev/md2 -R /dev/sdb2
# /unslung/mdadm -A /dev/md1 -R /dev/sdb1
  • Mount the data partition
# /bin/mount -t ext3 /dev/md1 /share/hdd/data
  • Check your diversion scripts in /share/hdd/data/unslung
  • If you still can't find anything wrong and you want to go back to your original configuration and stop the raid arrays.

Return to original unslung configuration:

  • Stop raid arrays
# cd /
# /bin/umount /share/hdd/data
# /unslung/mdadm -S /dev/md3
# /unslung/mdadm -S /dev/md2
# /unslung/mdadm -S /dev/md1
  • Erase raid superblocks:
# /unslung/mdadm --zero-superblock /dev/sdb3
# /unslung/mdadm --zero-superblock /dev/sdb2
# /unslung/mdadm --zero-superblock /dev/sdb1
  • Change the partition types back to 82 and 83 using busybox fdisk
  • Return linuxrc to the original format, perhaps using "mv linuxrc.orig linuxrc"
  • Switch off the slug, plug in your two disks and switch back on. You may need to reformat the second drive.

Failed Drive: Repeat the steps in this howto starting at "Resyncing the RAID arrays".

Power Loss: Raid 1 obviously won't help if you lose both drives simultaneously. Read http://www.tldp.org/HOWTO/Software-RAID-0.4x-HOWTO-4.html and follow steps 1 to 3 of method 2 and then repeat the steps in this howto starting at "Resyncing the RAID arrays". PS The fsck function on the slug has been renamed fsck.ext3.

Last edited by nsc.
Based on work by PatrickSchneider, nsc, Torsten Bitz, and dcordes.
Originally by nsc.
Page last modified on June 03, 2007, at 10:33 AM