NSLU2-Linux

How to check your disk for errors (and repair them)

The scandisk function of the Linksys web interface will not work when the disk is unslung as it is then in use and cannot be unmounted. The following steps can be used instead.

  • Boot the slug without disk.
  • Enable telnet (http://IP-address-of-your-slug/Management/telnet.cgi; user admin, password admin).
  • Log in with telnet (user root, password uNSLUng, as your own password resides only on the external disk).
  • Connect the disk. After a short while the partitions should be mounted automatically (note that mounting can take quite some time - perhaps 30 minutes - with disks that contain many files, due to the firmware 'quotacheck' function).
  • Through telnet unmount all drives.
umount /dev/sda1
umount /dev/sda2

(If you get an error that no device is mounted, use "mount" to find out the name of the device; look for devices that are mounted as /share/hdd/data and /share/hdd/conf.)
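
The lookup can be narrowed down with a small filter. This is only a sketch: it assumes the slug's shell provides awk (common in BusyBox builds, but not checked for every firmware), and slug_data_devices is a made-up helper name. It reads "mount" output on standard input and prints the device behind each of the two mount points named above.

```shell
# Hypothetical helper: print the devices mounted on the two data mount points.
# Expects "mount"-style lines on stdin:
#   <device> on <mountpoint> type <fstype> (<options>)
slug_data_devices() {
    awk '$2 == "on" && ($3 == "/share/hdd/data" || $3 == "/share/hdd/conf") { print $1 }'
}

# Usage on the slug (prints one device per line, nothing if the disk is not mounted):
#   mount | slug_data_devices
```

Each device it prints is what you would pass to umount.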

The web interface may still not allow you to scan. The log may say:

"Warning: Out of disk space, scandisk cannot proceed."
  • Instead, you can now issue the following commands in the telnet session:
/sbin/fsck.ext3 -f /dev/sda1
/sbin/fsck.ext3 -f /dev/sda2

This will check the disk for you. The -f flag forces a check even if the file system appears sane at first glance.

If you want, you can also add the -y flag to the command. It answers yes to all questions automatically (which is generally the best choice anyway).
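
When running with -y nobody is watching the prompts, so it can help to look at the exit status afterwards. The sketch below decodes it using the bit values listed in the fsck(8) man page; describe_fsck_status is a made-up helper name, and the wording of the messages is my own paraphrase.

```shell
# Hypothetical helper: translate an fsck exit status into words.
# The status is a sum of the bit values documented in fsck(8).
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    msg=""
    for pair in "1:errors corrected" "2:reboot recommended" \
                "4:errors left uncorrected" "8:operational error" \
                "16:usage or syntax error" "32:cancelled by user" \
                "128:shared library error"; do
        bit=${pair%%:*}    # number before the colon
        text=${pair#*:}    # description after the colon
        if [ $(( status & bit )) -ne 0 ]; then
            msg="${msg:+$msg, }$text"
        fi
    done
    echo "${msg:-unknown status $status}"
}

# Usage (as root, with the partition unmounted):
#   /sbin/fsck.ext3 -f -y /dev/sda1; describe_fsck_status $?
```

A status of 1 means errors were found and fixed, so a second run that returns 0 is a reasonable confirmation.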

WARNING: be sure to reboot your slug after doing this (to avoid accidentally filling up the flash filesystem).

Alternative: you can also remove the USB hard drive or flash drive from the slug, connect it to a Linux box, and run fsck from there. For ext2/ext3 file systems fsck simply calls e2fsck, so you can use either command. This is probably more convenient for users who run the stock Linksys firmware rather than Unslung or another telnet-enabled firmware. Your PC should have USB2; otherwise you would have to remove the drive from the enclosure. If you don't have a Linux box somewhere, you can boot your normal PC or laptop from one of the several live CD distros, for instance:

  • A Fedora live CD. You can download and burn Fedora from http://fedoraproject.org/get-fedora/. Note: You can also use the Live CD to build a live USB flash drive.
  • An Ubuntu CD (can be ordered free from https://shipit.ubuntu.com/ or downloaded). Attach the hard disk via USB to a modern PC. You can also use the CD to create a live USB drive, if your BIOS supports USB boot devices.

  • A Knoppix CD that doesn't need installing. You can download and burn Knoppix from http://www.knoppix.net/.
  • Damn Small Linux didn't work for one author, but most Linux distros should work.
  1. Boot the PC from the CD (ordered, or downloaded and burned), without the USB drive connected.
  2. Insert USB drive. You should see two new icons appear on the desktop when both partitions have been mounted.
  3. Figure out the mount point name of the drive. If this is the unslung drive, you should see two (one per partition). To do this in Ubuntu, open the Places menu and choose the name of your drive. If the file browser that comes up does not show the mount name (like /media/disk1), click on the paper and pen icon to get the location bar with the entire mount name.
  4. Open a terminal window from the Applications -> Accessories menu.
  5. fsck will not work with the mount name. From the terminal window type 'df' <enter>. You should then see the mount name with the device name next to it (like /dev/sda1).
  6. Now type 'sudo umount /dev/sda1' (change the /dev/ location as necessary), and verify that the drive icon disappears from the desktop or Places menu. Running fsck on a mounted drive can ruin the data on it.
  7. Type 'sudo fsck /dev/sda1 -f -y' (change the /dev/ location as necessary). The two options mean: force the check even if the drive looks fine initially, and answer yes to all repair requests. Wait for it to complete. On a large drive it could take a while.
  8. Repeat for the other partitions as necessary (e.g. 'sudo umount /dev/sda2' and 'sudo fsck /dev/sda2 -f -y')
  9. After all partitions have been fsck'ed, unplug the USB drive and try it on the NSLU2 again. You can turn the computer off at any point after fsck is complete, since the USB drive was already unmounted (ejected) before we started the fsck.
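
Steps 5 to 7 can be sketched in the terminal as follows. The helper names device_for_mount and print_check_commands are made up for illustration, and /media/disk1 is just the example mount name from step 3. The sketch assumes "df -P" output (device in the first column, mount point in the sixth) and a mount point without spaces; it only prints the commands so you can check the device name before running anything.

```shell
# Hypothetical helper: look up the device behind a mount point.
# Reads "df -P" output on stdin; $1 is the mount point, e.g. /media/disk1.
device_for_mount() {
    awk -v mp="$1" 'NR > 1 && $6 == mp { print $1 }'
}

# Hypothetical helper: print (not run) the unmount and check commands.
print_check_commands() {
    echo "sudo umount $1"
    echo "sudo fsck $1 -f -y"
}

# Usage:
#   dev=$(df -P | device_for_mount /media/disk1)
#   print_check_commands "$dev"
```

Printing first is a deliberate choice: running fsck against the wrong device, or a mounted one, is the mistake this procedure is trying to avoid.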

WARNING (infoball): Being sort of adventurous, I run an HD case (Argosy HD360U) and disk size (Seagate 400GB) that do not seem to be completely compatible with the slug. Linksys firmware 2.3R29 refuses to finish formatting; I don't know whether 2.3R63 is any better at formatting, but it won't finish scandisk. 5.5-beta works. Sort of. At least it finishes formatting the disk. The above procedure results in "Bus error" (apparently just as it finishes), which is then the immediate response to all subsequent commands, and the slug has to be reset by yanking the power cord. I do not yet know if there are any odd side effects, as it seems to work well afterwards, but be warned. I have also experienced other strange occurrences with this (brand new) disk when used by the slug, so the warning about making sure your hardware works with the stock firmware is very valid. I made a conscious decision to not care in this case. We'll see if it's fixable ;-)

Update: Not having enough time to investigate I got a smaller disk (Hitachi 250GB) and an identical HD case. I also flashed the slug with the 2.3R63 firmware and lo: it worked! Formatting, scandisk and all. I have recently flashed with Unslung 5.5-beta again and after a while I tried the fsck again, which worked, no more "Bus error". Several errors were discovered, and I let them be corrected. I ran fsck several times and seemingly the same, or at least as many, errors were discovered and "fixed". Finally I just connected the disk to a Linux box and did the same thing and it worked right away, no errors left to fix on the second fsck run. As a first impression things also seem more stable (just two "network name not available" messages while transferring several large files to and from the slug in the hours since I connected it again). Therefore I'd like to suggest that the fsck primarily be done on a PC and not from the slug as the fsck.ext3 binary seems unreliable, and also that anyone experiencing samba trouble try this as well!
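
The repeated runs described above could be scripted along these lines. This is only a sketch: run_until_clean is a made-up name, and it treats exit status 0 (the checker found nothing to fix) as "clean". Capping the number of passes makes a check that never converges, as happened on the slug, easy to spot.

```shell
# Hypothetical helper: re-run a checker until it exits 0 or the pass limit is hit.
# $1 = maximum number of passes, remaining arguments = the command to run.
# The checker's own output goes to stderr; the number of passes used is
# printed on stdout. Returns 0 if the last pass was clean, 1 otherwise.
run_until_clean() {
    max=$1
    shift
    pass=0
    while [ "$pass" -lt "$max" ]; do
        pass=$((pass + 1))
        if "$@" >&2; then
            echo "$pass"
            return 0
        fi
    done
    echo "$pass"
    return 1
}

# Usage (as root, with the partition unmounted):
#   run_until_clean 3 fsck /dev/sda1 -f -y
```

If the limit is reached without a clean pass, that is a hint to move the disk to a PC rather than keep retrying.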

Yesterday I had a hard disk crash (I suspect the cause was a mobile phone that rang just as a write operation was being carried out). I ran the above procedure and lots of errors scrolled across the screen... I had to run fsck.ext3 several times until no errors were found. But even after that the NSLU2 refused to mount the drive. From the syslogs I see that at some point fsck decided to erase the ext3 journal, so now my fs is just ext2, and the slug does not like that. I will try to modify the fstab file and mount it manually. In any case my data is gone: I attached the drive to a PC and it's empty (luckily it was a test setup and I only had some programs but no data there)... but beware of mobile phones near hard drives!
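
Whether the journal is still there can be checked from a Linux PC. A small sketch, assuming tune2fs from e2fsprogs is installed; journal_state is a made-up helper name that reads "tune2fs -l" output on standard input and looks for the has_journal feature.

```shell
# Hypothetical helper: report whether "tune2fs -l" output (on stdin)
# lists the has_journal feature.
journal_state() {
    if grep '^Filesystem features:' | grep -q 'has_journal'; then
        echo "journal present"
    else
        echo "no journal"
    fi
}

# Usage on a Linux PC, as root, with the partition unmounted:
#   tune2fs -l /dev/sda1 | journal_state
# If no journal is listed, "tune2fs -j /dev/sda1" adds one again.
```

I have not tried this on a disk in the state described above, so treat it as a starting point rather than a tested fix.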

-- I had a problem with a hd crash too (in fact 3 times, I'm desperate):

  1. I tried both Debian debootstrapped (slugos/le) and, twice, Debian installer rc1.
  2. The only things running: proftpd (no traffic aside from me uploading some stuff once), sshd, screen, rtorrent (the first 2 times), transmission (a lightweight bittorrent client, the third time).
  3. It all goes smoothly (very smoothly) for somewhere between 12 and 24 hours, or maybe a little more, then it crashes (no ssh or ftp access, ping responding).
  4. During this time, RAM has about 4 MB free and swap (128 MB swap partition) usage is between 6 and 16 MB (downloading only a 7 GB torrent).
  5. NO errors in dmesg, the messages log, or any other log that I know of.
  6. No program crash; everything is fine until it crashes.

After the crash, I take my hard drive and plug it into an Ubuntu machine. fsck finds lots of errors. The first time, after fsck, I ended up with all my data in the lost+found folder. The second time, after some fixes, it worked. This morning I fsck-ed it again after the crash and again got lots of errors (bad allocation, I think, and "file has filetype set"). After running with -y for 10 minutes (and listing almost every single file on that hard drive) I still have some system data, but the directory where the torrents were is now a file, not a directory, /var/log/messages is missing, and who knows what other strange things happened.

UPDATE:

I found the cause, which is the @#%!@## USB-> ATA Adapter (some El Cheapo from china). Here it is:

http://www.qbik.ch/usb/devices/showdev.php?id=3751

So it seems my usb -> ide adapter works just fine under Windows XP, but under Linux 2.6, after a variable number of hours, it trashes the harddrive, file system, anything.

I am now testing it with unslung (2.4 kernel - works all right so far - 20 hours) and then I will test another adapter (another vendor) with 2.6 and I will try to post here the results.

UPDATE 2: kernel 2.4 seems to be working ok with this adapter, so I will stick with it:

# uptime
 17:21:24 up 3 days, 18:20,  2 users,  load average: 1.48, 1.73, 1.67

3-5 torrents all this time (transmissioncli), amuled, many ftp transfers and local copy from dir to dir (big files for testing purposes)

UPDATE 3: it crashed after more than 7 days, but it crashed nevertheless. So don't use this controller. --

Last edited by Bill C Riemers.
Based on work by colin gebhart, RobHam, Andreas, egorkobylkincom, fcarolo, seniorsimon, adixor, adi, manutremo, infoball, TimBishop, kaste, and tman.
Originally by tman.
Page last modified on February 10, 2009, at 02:26 PM