NSLU2-Linux


The following are troubleshooting steps you can follow (as root) to identify the problem:

  1. As soon as you determine that you have a problem after a reboot, type:

    dmesg > boot.messages

    dmesg is used to examine or control the kernel ring buffer.

    The program helps users to print out their bootup messages.
    The user can then email the boot.messages file to whoever can debug their problem.
  2. An initial action a user can take is to type:

    cat /proc/bus/usb/devices | grep "I\:"

    There should only be as many usb-storage devices as you expect to have.
    If there are more than you expect, or more than 2, then this is your problem. See this FAQ entry.
  3. Another action a user can take is to type:

    ls -l /proc/*_conn

    Once again there should be no more than 2 listings.

    bash-2.05b# ls -l /proc/*_conn
    -r--r--r-- 1 root root 0 Feb 4 19:09 /proc/hd2_conn
    -r--r--r-- 1 root root 0 Feb 4 19:09 /proc/hd_conn

    You will see hd2_conn for a hard disk on port 2, usb_conn for a flash disk on port 2, hd_conn for a hard disk on port 1 and (for unslung firmware only) hd_conn for a flash disk on port 1 too.
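
The checks in steps 2 and 3 can be wrapped in a small script. This is only a minimal sketch, not part of the firmware: the function names `count_usb_storage` and `count_conn_files` are made up for illustration, and it assumes the usual usbfs listing in which each bound interface line ends in `Driver=usb-storage`.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of Unslung).

# Count interface lines bound to usb-storage in a usbfs device listing.
# $1: listing to read (defaults to the live /proc/bus/usb/devices).
count_usb_storage() {
    grep '^I:' "${1:-/proc/bus/usb/devices}" 2>/dev/null |
        grep -c 'Driver=usb-storage' || true
}

# Count the *_conn entries in a directory.
# $1: directory to scan (defaults to /proc).
count_conn_files() {
    n=0
    for f in "${1:-/proc}"/*_conn; do
        # If nothing matches, the pattern stays literal and -e fails.
        [ -e "$f" ] && n=$((n + 1))
    done
    echo "$n"
}
```

Called without arguments on the slug, a result higher than the number of drives you attached (or higher than 2) points to the problem described in steps 2 and 3.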

After attempting the diversion scripts mentioned below, one thing to note is that the mounting procedure depends on the order in which the devices are connected to your USB hub. Connect the hard drive you want to use to the first port of the USB hub, then everything else. If that doesn't work, try the last port, since it may be mounting drives from last to first instead!

The source of many of these cases has now been found.

When running 3.17 (and maybe also 3.16) the startup script malfunctions if the file /opt/sbin/insmod (which is a symbolic link to /opt/bin/busybox) is present. Just delete this link (if you really want to run this version of BusyBox insmod later, you can do that by writing "busybox insmod"). After deleting the link, just reboot and your drive should again be recognized as formatted and your shares available.
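
Deleting the link can be done a little more defensively by checking first that the path really is a symbolic link. The following is a sketch under that assumption; the function name `remove_insmod_link` is hypothetical and the default path is the one named above.

```shell
#!/bin/sh
# Hypothetical helper: remove the insmod link only if it really is a symlink.
# $1: path to check (defaults to /opt/sbin/insmod).
remove_insmod_link() {
    link="${1:-/opt/sbin/insmod}"
    if [ -L "$link" ]; then
        rm -f "$link" && echo "removed $link"
    else
        echo "$link is not a symbolic link; left untouched"
    fi
}
```

On the slug you would call `remove_insmod_link` with no arguments and then reboot.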

If this does not work for you and your drive was marked as "Not formatted" and not as "Not installed", you can still benefit from the old workarounds below. If these work but not the simple deletion of /opt/sbin/insmod, please notify on IRC.

2005-01-28, IRC

  • TresEquis reported being unable to boot with the HDD plugged in, although it would mount (even automount) after booting.
  • perlguru-work suggested that the correct fix was to move '/opt/bin:/opt/sbin' to the end of the PATH assignment in '/etc/rc.d/rc.sysinit'.
  • TresEquis reported that this fixed the booting problem for him.
  • Vic - I deleted the link and now I can't access my slug anymore. Anyone know how to fix this?
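
perlguru-work's suggestion amounts to reordering the PATH so that the /opt directories are searched last. The sketch below shows that reordering on a PATH string. The function name `move_opt_to_end` is hypothetical, and the real assignment in /etc/rc.d/rc.sysinit still has to be edited by hand.

```shell
#!/bin/sh
# Hypothetical helper: print a PATH with /opt/bin and /opt/sbin moved last.
# Note: the two directories are always appended, even if $1 lacked them.
move_opt_to_end() {
    new_path=""
    old_ifs="$IFS"
    IFS=:
    for dir in $1; do
        case "$dir" in
            /opt/bin|/opt/sbin) ;;   # skip here; re-added at the end
            *) new_path="${new_path:+$new_path:}$dir" ;;
        esac
    done
    IFS="$old_ifs"
    echo "${new_path:+$new_path:}/opt/bin:/opt/sbin"
}
```

For example, `move_opt_to_end "/opt/bin:/opt/sbin:/bin:/sbin"` prints `/bin:/sbin:/opt/bin:/opt/sbin`, so the stock insmod is found before the BusyBox one.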

2005-10-30 unslung 5.5, HD1-120G (recently replaced)

I checked HD1 with 'fsck.ext3 -y -f /dev/sda1' (without -y, it asks too many questions to confirm); after that, HD1 would no longer mount. This first fsck reported tons of problems, which were fixed thanks to the -y switch. The previous fsck was also done manually (shutdown, disconnect HD, boot, log on, connect HD, umount, fsck) about two weeks earlier, and it also fixed plenty of things.

Symptoms

  • HD1 is not mounted when slug boots with HD attached
  • green light 'DISK1' is ON
  • cannot manually mount HD1 when logged in to slug
  • _can_ fsck HD1, though - it produces no errors anymore

After checking the output from the first fsck, I noticed that the journal had been removed (partition downgraded from ext3 to ext2?). I moved the HD to a Linux box and ran

tune2fs -j /dev/sda1

Reconnected HD back to slug, and it boots fine now.
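
Whether the journal is missing can be checked before running tune2fs -j, since `tune2fs -l` lists `has_journal` among the filesystem features of an ext3 partition. A minimal sketch follows; the function name `has_journal` is hypothetical, and it reads the `tune2fs -l` output on stdin so it can be tried without touching a disk.

```shell
#!/bin/sh
# Hypothetical helper: succeed if `tune2fs -l` output on stdin lists has_journal.
has_journal() {
    grep '^Filesystem features:' | grep -q 'has_journal'
}

# Example use on a Linux box (commented out because it touches a real device):
#   tune2fs -l /dev/sda1 | has_journal || tune2fs -j /dev/sda1
```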

I just wonder how many people have problems with failing HDs, corrupted data, not-formatted conditions, etc. Looking at this page and the mailing lists, not much help is provided for such problems. Not being able to fsck HDs that are unslung (through the web interface or a scheduler) does not help either.


No longer have access to your NSLU2? Can't find your NSLU2 hard drive on the network anymore? Then this is the place to be.

1. Shut the NSLU2 down.

2. Switch USB-HDD off or disconnect it.

3. Turn NSLU2 on.

4. Enable Telnet (e.g. via http://192.168.1.77:/Management/telnet.cgi)

5. Telnet into NSLU2 with root/uNSLUng login/password.

6. Type 'sh' (not necessary, just more comfortable)

7. Type 'mount -t ext3 /dev/sda1 /mnt/tmpmnt'.

8. Type 'ls /mnt/tmpmnt'. So now you (hopefully) see your data.

9. Type 'df'. Is sda1 listed? If not, there is still a problem; it is being worked on and a solution will hopefully be posted here soon. At least you can back up your data now.
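
The check in step 9 can be scripted instead of read by eye. This is a sketch only: the function name `device_listed` is hypothetical, and it assumes the device name starts a line of the df output, as it does for short names like /dev/sda1.

```shell
#!/bin/sh
# Hypothetical helper: succeed if df-style output on stdin has a line for $1.
device_listed() {
    grep -q "^${1}[[:space:]]"
}

# On the slug (commented out because it needs the real drive):
#   mount -t ext3 /dev/sda1 /mnt/tmpmnt
#   df | device_listed /dev/sda1 && echo "sda1 is mounted"
```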


Sometimes steps 4-9 aren't needed. Just have the drives disconnected on boot and connect them later. USB_Detect will see them and mount them. Go figure.


I have symptoms similar to those described here. I have an unslung 5.x box which was working fine until, during a heavy data transfer, the box stopped responding (even to pings). The box boots without the drive, but if I either boot it with the drive, or attach the drive after booting, the box becomes unresponsive (to pings and web access).

I can telnet into the box following a diskless boot, and when I then attach the drive, the box remains responsive, but the drive is not automounted. I ran fsck.ext3 on the drive, and it found one small problem, but fixed it. I can ls the drive, but even after multiple successful fscks, the problem remains.

Any ideas? --Pat / zippy@cs.brandeis.edu

Preliminary workaround for this problem

Note: This will only work if the above solution works - you must be able to mount your data partition manually. Also, this solution can in theory be used to mount lots of different file systems on the slug and should be integrated with /etc/fstab, but this is a proof of concept.

The startup mounting and detection system in Unslung 3 is based on the Linksys utility /etc/rc.d/rc.bootbin, for which no source code is available. To make the system boot and mount drives that rc.bootbin does not recognize, one solution is to write a replacement diversion script for rc.1 that does not use rc.bootbin. This means the following:

  • It must detect /dev/sda and /dev/sdb and mount data partitions on those present as well as swap partition on /dev/sda3 (currently there is no special handling for flash). /dev/sda2 is already mounted at this point.
  • The rc.network must be called to run dhcp client.
  • In theory it should start uPnP daemons, but as I see little use in it, I prefer to save memory and leave it out.
  • The Linksys utility USB_Detect will behave badly too, so it has to go. This is done by an empty replacement diversion script for rc.quickset.
  • [optional] For safety, do_umount (another Linksys app) is removed by a replacement diversion script for rc.local commenting out the startup for it.
  • The file /etc/CGI_ds.conf must be edited manually to set valid drives if one is to continue using the web interface (which is probably not a good idea). This is done ONCE by setting validhd=x:y, where x=1 if and only if drive 1 is present (else 0) and y reflects drive 2 in the same way. The invalidhd variable can be set to "0:0". For a one-drive system with the drive on USB connector 1, the settings should be "validhd=1:0" and "invalidhd=0:0".
  • [optional] The firmware upgrade utility is also removed - no need for that anymore since upgrades should be performed without disks mounted. Jacques informed on the IRC channel that this daemon is dangerous in that anyone can reset and/or flash your slug if they are on the same network, so we don't want this thing there.
  • [possibly optional] The program rst_ugs, called from /etc/rc.d/rc.reset_usrgrpshare may overwrite Samba and password files if it finds them unpleasant. rc.reset_usrgrpshare is overridden too.
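
The validhd rule above can be written down as a short script that prints the two settings for /etc/CGI_ds.conf. It is a sketch with assumptions: the function name `cgi_ds_lines` is hypothetical, it takes /proc/hd_conn to mean drive 1 and /proc/hd2_conn to mean drive 2 (a flash disk that shows up as usb_conn is not handled), and it only prints the lines rather than editing the file.

```shell
#!/bin/sh
# Hypothetical helper: print validhd/invalidhd lines from the *_conn files.
# $1: directory holding hd_conn / hd2_conn (defaults to /proc).
# Assumption: hd_conn means drive 1 is present, hd2_conn means drive 2.
cgi_ds_lines() {
    dir="${1:-/proc}"
    d1=0
    d2=0
    [ -f "$dir/hd_conn" ] && d1=1
    [ -f "$dir/hd2_conn" ] && d2=1
    echo "validhd=$d1:$d2"
    echo "invalidhd=0:0"
}
```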

NOTE: Scripts below are testing quality at best. Also, I have no serial console so logging is done to HDD to check for problems. Lots of useless logging statements can be removed.

/unslung/rc.1

 #!/bin/sh
 /bin/echo "In diversion rc.1" > /unslung/rc.1.log
 /bin/ls /proc/hd* >> /unslung/rc.1.log
 /bin/ls /bin/*mount* >> /unslung/rc.1.log

 # First remount config partitions on the right spot
 # Mount data partitions on /dev/sda and /dev/sdb
 # For now, don't mount swap on /dev/sdb, but it could improve performance, so later...
 if ( [ -f /proc/hd_conn ] ) ; then
        /bin/echo "Mounting partitions on /dev/sda" >> /unslung/rc.1.log
        # Mount data partition and swap (conf partition already mounted)
        /bin/mount -t ext3 /dev/sda1 /share/hdd/data
        /bin/echo "Enabling swap" >> /unslung/rc.1.log

        /sbin/swapon /dev/sda3
 fi
 if ( [ -f /proc/hd2_conn ] ) ; then
        # Mount data partition
        /bin/mount -t ext3 /dev/sdb1 /share/flash/data
 fi
 /bin/echo "Done mounting partitions" >> /unslung/rc.1.log

 # Start network
 /bin/echo  "Starting network:" >> /unslung/rc.1.log
 /etc/rc.d/rc.network; check_status

 /bin/echo  "Restore time and timezone:"
 /etc/rc.d/rc.rstimezone; check_status

 /bin/echo  "Restore usrgrpshares:"
 /etc/rc.d/rc.reset_usrgrpshare; check_status

 /bin/echo  "Generating telnet password:"
 /usr/sbin/TelnetPassword; check_status

 /bin/echo  "Starting WEB Server:"
 . /etc/rc.d/rc.thttpd; check_status

 /bin/echo  "Starting Samba:"
 . /etc/rc.d/rc.samba

 # /bin/echo  "Starting download:"; /usr/sbin/download

 /bin/echo  "Starting inetd:"
 . /etc/rc.d/rc.xinetd; check_status

 /bin/echo  "Creating ramfs for /tmp:"
 /bin/mount -t ramfs none /tmp -o maxsize=512

 /bin/echo  "Starting QuickSet daemon:"
 . /etc/rc.d/rc.quickset

 /bin/echo  "Starting cron:"
 . /etc/rc.d/rc.crond

 /bin/echo  "Starting Rest Task:"
 . /etc/rc.d/rc.local

 /bin/echo  "Starting Unslung Packages:"
 . /etc/rc.d/rc.unslung-start

 /usr/bin/Set_Led ready
 /usr/bin/Set_Led beep1

 /bin/echo  "Checking disk status:"
 /usr/sbin/CheckDiskFull 2 >/dev/null

 /bin/echo "Exiting /unslung/rc.1" >> /unslung/rc.1.log
 return 0

/unslung/rc.local (optional)

 #!/bin/sh

 HOSTNAME=`hostname`

 /usr/sbin/CheckResetButton 2>/dev/null
 /usr/sbin/CheckPowerButton 2>/dev/null
 # /usr/sbin/do_umount 2>/dev/null
 /bin/chmod 775 /share 2>/dev/null
 /bin/chown admin.everyone /share/hdd/ 2>/dev/null
 /bin/chown admin.everyone /share 2>/dev/null
 /etc/rc.d/rc.quota &>/dev/null
 /bin/echo "$HOSTNAME: boot complete!"; check_status
 return 0

/unslung/rc.quickset

 #!/bin/sh

 return 0

/unslung/rc.reset_usrgrpshare

 #!/bin/sh

 return 0

Bobtm.


I had a problem where my drive showed up as "Not formatted". I formatted it and got no errors, and then tried to unsling to it. This didn't work, and at this point the drive still appeared as "Not formatted" in the web interface. After first clearing the partition table (the drive was used for other things previously) and reformatting again, the problem went away.

To clear the partition table, do something like this (for Disk 1):

dd if=/dev/zero of=/dev/sda bs=1k count=1
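
Because zeroing the start of the disk is destructive, it can be worth saving the first kilobyte before overwriting it. The sketch below does both steps; the function name `wipe_first_kb` and the backup path in the example are hypothetical, and it assumes a dd that accepts conv=notrunc (only needed when trying it on a regular file rather than a device).

```shell
#!/bin/sh
# Hypothetical helper: back up, then zero, the first 1 KiB of a device or file.
# $1: target (e.g. /dev/sda for Disk 1), $2: where to write the backup copy.
wipe_first_kb() {
    dd if="$1" of="$2" bs=1k count=1 2>/dev/null || return 1
    # conv=notrunc keeps a regular file from being truncated in a dry run.
    dd if=/dev/zero of="$1" bs=1k count=1 conv=notrunc 2>/dev/null
}

# Example (commented out because it destroys the partition table):
#   wipe_first_kb /dev/sda /tmp/sda-first-kb.bak
```

Trying the function on a scratch file first is a cheap way to confirm it behaves as expected before pointing it at a disk.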


Another workaround for this problem

I've found that the order in which power is applied to the drive and the NSLU2 helps me solve problems like this. First I switch on the NSLU2 and, after a few seconds, the drive.


I checked the hard disk 1 with

fsck.ext3 -y -f /dev/sdb1

I ran it three times and hit problems (the process was killed, memory aborts); the problems were fixed after rebooting the NSLU2.


Another workaround

I unslung 6.8 to a Hitachi 250GB Deskstar in port 1. When I later added another Hitachi disk to port 2, it wouldn't format. In the end I attached it to a Windows XP machine and did a quick format, then unplugged it and put it back in port 2. I formatted it again, but df showed only 80GB of space. Bad karma.

I plugged the drive into the XP box again. This showed the 80GB partition, plus 3 other partitions of approximately 150MB each. I deleted all the partitions and then did a FULL FORMAT.

I powered the drive off, plugged the drive back in to port 2 and then switched on the drive. Using the web interface, the drive popped up in the disk pages. I then ran the format which worked :D

Page last modified on March 12, 2007, at 10:58 PM