NSLU2 performance

A professional report on web server performance from a researcher at the Free University in Amsterdam can be found here.

Overview: What to expect

Performance will of course depend on your LAN, disk and USB enclosure. On a 10Mb LAN, large sequential reads or writes from the NSLU2 will utilize the link fully, so the figures here are only relevant for 100Mb and faster networks. Note that Unslung firmware does not seem to alter the network performance significantly.

The results below were obtained with Unslung 3.17 using a USB 2.0 enclosure (Sweex - http://www.sweex.nl - using the ALI chip) with a Samsung 160GB 7.2krpm disk with 8 MB cache. Please update if your performance numbers differ significantly from mine.

I have compared the NSLU2 to similar products. One of the closest, the Synology DS-101, has slightly better performance, probably due to more RAM and use of IDE disks rather than USB. Until it has been "Unsyned", the actual cause of the higher speed is uncertain. The performance comparison of the NSLU2 and DS-101 (and the Kuro-box) using Samba and FTP tests was done with a single client only. Superficial testing with two clients indicates slightly higher joint throughput. Pure disk and network tests should not differ significantly when running several instances of the test program.

We should probably test using standard storage benchmark programs too - a couple that spring to mind are IOMeter and Bonnie (would prefer Spec SFS 3, but it is complex and not free to set up).

Question: Does the overclock mod alter the speeds below for Samba?

Samba
Samba on Debian

Tests on a de-underclocked NSLU2 running Debian Etch, kernel 2.6.18.dfsg.1-17. Samba ver: 3.0.24-6etch9
Additional SAMBA measures on Debian

I just tweaked the following on my NSLU2:
- Use of the writeback journal (available in ext3fs) and the noatime mount option
- In smb.conf, use of: socket options = TCP_NODELAY SO_KEEPALIVE IPTOS_LOWDELAY SO_RCVBUF=16384 SO_SNDBUF=16384

Then I got the following results, with the Slug 100% dedicated to the test (i.e. no other consuming processes running):
- From a Linux server, with a 100 MB file, flushing all cache between tests:
  - WRITE : 5.2 MB/s
  - READ : 4.3 MB/s
- From a Vista laptop, with a 100 MB file, flushing all cache between tests:
  - WRITE : 5.0 MB/s
  - READ : 4.7 MB/s
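
For anyone wanting to repeat this kind of measurement, here is a minimal sketch of timing a large file write and read on a mounted share. It is not the script used for the numbers above; the function names are made up for illustration, and it assumes 1 MB = 10^6 bytes (the page does not say which definition was used). Caches still have to be flushed between runs by hand, otherwise the read figure mostly measures the client's RAM.

```python
import os
import time

MB = 10**6  # assumption: decimal megabytes


def write_throughput_mb_s(path, size_mb=100, chunk_mb=1):
    """Write about size_mb of random data to path and return MB/s."""
    chunk = os.urandom(chunk_mb * MB)  # random data, so nothing can compress it
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written < size_mb * MB:
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())  # push buffered data out before stopping the clock
    elapsed = time.perf_counter() - start
    return written / MB / elapsed


def read_throughput_mb_s(path, chunk_mb=1):
    """Read the whole file at path and return MB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_mb * MB)
            if not data:
                break
            total += len(data)
    elapsed = time.perf_counter() - start
    return total / MB / elapsed
```

Typical use would be something like `write_throughput_mb_s("/mnt/slug/test.bin")` followed by `read_throughput_mb_s("/mnt/slug/test.bin")`, where `/mnt/slug` is a hypothetical mount point for the Samba share.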
Note that for the laptop, the results are horrible when using a wireless connection (i.e. near 2.0 MB/s), but that might be due to the bad routing performance of my xDSL/WLAN gateway.

NFS

I don't use NFS, but posts from others seem to indicate about the same, or slightly higher, speeds as with Samba, but with more CPU load. For speed measurements for one of the NFS packages, see Unslung.Nfs-utils.

The rsize and wsize options for NFS have a huge impact; try setting them to 32768 or 65536. I'm getting 4.75/3.5 MB/s (read/write), see my profile - Profiles.Zhyla

FTP

Measured with vsftpd.
Disk speed

Write speed measured using
Network speed

Measured using netio (slug binary version downloadable from http://folk.uio.no/ingeba/netio.arm and x86 linux version http://folk.uio.no/ingeba/netio.x86 . Source can be fetched from http://www.netfuse.de/techarea/netio/netio114.zip ).
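
If netio is not at hand, the basic idea of a raw TCP throughput test (push fixed-size blocks through a socket and time them) can be sketched in a few lines of Python. This is only a rough stand-in and not netio itself: the function names, block size and byte count are all assumptions for illustration, and the numbers will not be directly comparable to netio's.

```python
import socket
import threading
import time


def sink_once(server_sock):
    """Accept one connection and discard everything it sends."""
    conn, _ = server_sock.accept()
    with conn:
        while conn.recv(65536):
            pass


def tcp_send_throughput_mb_s(host, port, total_bytes=10 * 10**6, block=4096):
    """Send at least total_bytes to host:port and return MB/s (1 MB = 10**6 bytes)."""
    payload = b"\0" * block
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as s:
        while sent < total_bytes:
            s.sendall(payload)
            sent += len(payload)
        s.shutdown(socket.SHUT_WR)  # signal end of data to the receiver
        s.recv(1)                   # returns b"" once the receiver has read it all and closed
    elapsed = time.perf_counter() - start
    return sent / 10**6 / elapsed
```

One machine would run `sink_once` on a listening socket and the other would call `tcp_send_throughput_mb_s` against it; swapping the roles gives the other direction.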
Troubleshooting

There are a number of possible causes for bad performance. Here are some things to look at:
Details - hard numbers

Lmbench results for Stock Slug with 2.4.22-Linksys kernel.

Hardware:
with CSR loaded:
Results going to ../results/armv5b-linux-gnu/LKG0FB07F.
Using config in CONFIG.LKG0FB07F
Tue Sep 21 00:34:20 MDT 2004
Latency measurements
Tue Sep 21 00:39:56 MDT 2004
Calculating file system latency
Tue Sep 21 00:40:18 MDT 2004
Local networking
Tue Sep 21 00:40:54 MDT 2004
Bandwidth measurements
Tue Sep 21 01:53:08 MDT 2004
Calculating context switch overhead
Tue Sep 21 02:07:41 MDT 2004
Calculating memory load latency
Tue Sep 21 02:13:30 MDT 2004
make[1]: Leaving directory `/home/packages/lmbench/lmbench-2.0.4/src'
real 102m23.363s
user 80m48.670s
sys 20m6.560s

without CSR loaded:
Results going to ../results/armv5b-linux-gnu/LKG000000.0
Using config in CONFIG.LKG000000
Tue Sep 21 02:32:23 MDT 2004
Latency measurements
Tue Sep 21 02:33:04 MDT 2004
Calculating file system latency
Tue Sep 21 02:33:27 MDT 2004
Local networking
Tue Sep 21 02:33:54 MDT 2004
Bandwidth measurements
Tue Sep 21 02:41:49 MDT 2004
Calculating context switch overhead
Tue Sep 21 02:43:41 MDT 2004
Calculating memory load latency
Tue Sep 21 02:49:05 MDT 2004
make[1]: Leaving directory `/home/packages/lmbench/lmbench-2.0.4/src'
real 17m46.969s
user 14m12.840s
sys 2m48.130s

with CSR loaded:
sh-2.05b# ./hdparm -Tt /dev/sda
/dev/sda:
Timing cached reads: 148 MB in 2.00 seconds = 74.00 MB/sec
Timing buffered disk reads: 20 MB in 3.03 seconds = 6.60 MB/sec
/dev/sda:
Timing cached reads: 148 MB in 2.02 seconds = 73.27 MB/sec
Timing buffered disk reads: 24 MB in 3.21 seconds = 7.48 MB/sec
/dev/sda:
Timing cached reads: 148 MB in 2.00 seconds = 74.00 MB/sec
Timing buffered disk reads: 24 MB in 3.06 seconds = 7.84 MB/sec
/dev/sda:
Timing cached reads: 148 MB in 2.02 seconds = 73.27 MB/sec
Timing buffered disk reads: 24 MB in 3.09 seconds = 7.77 MB/sec
/dev/sda:
Timing cached reads: 148 MB in 2.01 seconds = 73.63 MB/sec
Timing buffered disk reads: 24 MB in 3.07 seconds = 7.82 MB/sec

without CSR loaded:
sh-2.05b# ./hdparm -Tt /dev/sda
/dev/sda:
Timing cached reads: 164 MB in 2.01 seconds = 81.59 MB/sec
Timing buffered disk reads: 22 MB in 3.02 seconds = 7.28 MB/sec
/dev/sda:
Timing cached reads: 164 MB in 2.02 seconds = 81.19 MB/sec
Timing buffered disk reads: 24 MB in 3.17 seconds = 7.57 MB/sec
/dev/sda:
Timing cached reads: 164 MB in 2.02 seconds = 81.19 MB/sec
Timing buffered disk reads: 24 MB in 3.07 seconds = 7.82 MB/sec
/dev/sda:
Timing cached reads: 164 MB in 2.02 seconds = 81.19 MB/sec
Timing buffered disk reads: 26 MB in 3.17 seconds = 8.20 MB/sec
/dev/sda:
Timing cached reads: 164 MB in 2.02 seconds = 81.19 MB/sec
Timing buffered disk reads: 26 MB in 3.16 seconds = 8.23 MB/sec

without CSR loaded: make on perl-5.8.3:
real 80m43.326s
user 76m39.070s
sys 3m12.650s

with CSR loaded: make on perl-5.8.3:
real 90m48.799s
user 81m40.610s
sys 8m1.640s
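
To put the with/without CSR figures side by side, the small sketch below converts the `time` output of the perl-5.8.3 build to seconds and averages the hdparm buffered-read runs. All numbers are copied from the logs above; the helper names are invented for this example.

```python
import re
from statistics import mean


def to_seconds(stamp):
    """Convert a shell `time` value such as '80m43.326s' to seconds."""
    m = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if m is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


# make on perl-5.8.3, values copied from the logs above
without_csr = {"real": "80m43.326s", "user": "76m39.070s", "sys": "3m12.650s"}
with_csr = {"real": "90m48.799s", "user": "81m40.610s", "sys": "8m1.640s"}


def slowdown_percent(key):
    """Percentage increase of a timing when CSR is loaded."""
    base = to_seconds(without_csr[key])
    return (to_seconds(with_csr[key]) - base) / base * 100.0


# hdparm "buffered disk reads" in MB/sec, five runs each
buffered_with_csr = [6.60, 7.48, 7.84, 7.77, 7.82]
buffered_without_csr = [7.28, 7.57, 7.82, 8.20, 8.23]

for key in ("real", "user", "sys"):
    print(f"{key}: {slowdown_percent(key):+.1f}% with CSR loaded")
print(f"buffered reads with CSR:    {mean(buffered_with_csr):.2f} MB/sec")
print(f"buffered reads without CSR: {mean(buffered_without_csr):.2f} MB/sec")
```

By this arithmetic the wall-clock build time is roughly 12.5% longer with CSR loaded, with most of the relative increase showing up in system time rather than user time.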
L M B E N C H 2 . 0 S U M M A R Y
------------------------------------
Basic system parameters
----------------------------------------------------
Host      OS            Description             Mhz
--------- ------------- ----------------------- ----
LKG000000 Linux 2.4.22- armv5b-linux-gnu         266
LKG0FB07F Linux 2.4.22- armv5b-linux-gnu         266
familiar  Linux 2.4.19- armv5tel-linux-gnu       400
Processor, Processes - times in microseconds - smaller is better
----------------------------------------------------------------
Host      OS             Mhz null null      open selct  sig  sig fork exec   sh
                             call  I/O stat clos   TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ----- ---- ---- ---- ---- ----
LKG000000 Linux 2.4.22-  266 1.23 3.10 17.5 23.6 211.6 9.19 14.0 3200 11.K 41.K
LKG0FB07F Linux 2.4.22-  266 1.28 3.23 18.3 24.7 221.5 9.58 14.6 3500 12.K 44.K
familiar  Linux 2.4.19-  400 0.37 1.03 59.9 61.6  70.4 2.85 4.62 1864 5434 15.K
Context switching - times in microseconds - smaller is better
-------------------------------------------------------------
Host      OS            2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                        ctxsw  ctxsw  ctxsw  ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ----- ------ ------ ------ ------ ------- -------
LKG000000 Linux 2.4.22- 151.2  300.4  695.6  338.8  708.8   339.0   733.6
LKG0FB07F Linux 2.4.22- 178.7  343.5  770.3  378.6  783.2   385.0   784.2
familiar  Linux 2.4.19- 109.0  293.3  800.5  294.5  824.6   308.0   823.9
*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
Host OS 2p/0K Pipe AF UDP RPC/ TCP RPC/ TCP
ctxsw UNIX UDP TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
LKG000000 Linux 2.4.22- 151.2 322.4 482. 755.4 1415
LKG0FB07F Linux 2.4.22- 178.7 375.1 967. 855.0
familiar Linux 2.4.19- 109.0 217.3 342. 544.7 729.0 684.4 1011. 1567
File & VM system latencies in microseconds - smaller is better
--------------------------------------------------------------
Host      OS               0K File      10K File       Mmap  Prot  Page
                        Create Delete Create Delete Latency Fault Fault
--------- ------------- ------ ------ ------ ------ ------- ----- -----
LKG000000 Linux 2.4.22-  747.4  188.6 1851.9  444.6  1734.0 5.026  30.0
LKG0FB07F Linux 2.4.22-  789.9  205.6 1972.4  793.7  1866.0 5.211  31.0
familiar  Linux 2.4.19-   14.4   11.5   97.0   21.9  2141.0 2.060  13.0
*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------
Host      OS            Pipe   AF  TCP   File   Mmap  Bcopy  Bcopy  Mem   Mem
                             UNIX      reread reread (libc) (hand) read write
--------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
LKG000000 Linux 2.4.22- 9.02 18.8 16.1   25.6   64.3   43.6   43.3 64.3  84.6
LKG0FB07F Linux 2.4.22- 7.71 17.1 15.0   24.3   60.4   39.2   38.9 60.5  78.2
familiar  Linux 2.4.19- 18.0 38.4 21.8   43.5   79.6  118.7   49.8 78.8 334.6
Memory latencies in nanoseconds - smaller is better
(WARNING - may not be correct, check graphs)
---------------------------------------------------
Host      OS             Mhz  L1 $   L2 $ Main mem Guesses
--------- ------------- ---- ----- ------ -------- -------
LKG000000 Linux 2.4.22-  266  16.1  236.3    246.2 No L2 cache?
LKG0FB07F Linux 2.4.22-  266  17.0  266.2    266.2 No L2 cache?
familiar  Linux 2.4.19-  400 7.540  302.7    322.5 No L2 cache?
--jacques

Some tests were done by copying a 200MB file between an NSLU2 with 2.12-beta firmware and a Linux 2.8 (ubuntu) machine. The test was performed with the filesystem "mounted", followed by a simple read/write of the 200MB file via a python script.

        read      write
nfs:    5.7MB/s   2.6MB/s
cifs:   3.5MB/s   1.9MB/s
samba:  2.2MB/s   1.85MB/s

nfs-server Version: 2.2beta47-2

--titoo

Dhrystone

The Dhrystone benchmark mostly runs from cache and says something about CPU performance but relatively little about overall system performance. The system is an unmodified NSLU2 (except for serial port addition) running Unslung 3.18 beta, gcc 3.3.5 with stock libs and compiler flags "-O3 -mcpu=xscale".

Dhrystone 2.1:
Microseconds for one run through Dhrystone: 6.4
Dhrystones per Second: 155440.4
VAX MIPS rating = 88.469

Dhrystone 1.1:
Dhrystone(1.1) time for 3000000 passes = 16.4
Register option selected? NO
This machine benchmarks at 182815.4 dhrystones/second
VAX MIPS rating = 104.050

These results look low to me for a 266 MHz xscale, so I checked the rate at which the performance monitor register CCNT counts and saw 133 MHz. Maybe the core on the NSLU2 is running only at 133 MHz. Some other xscales have a frequency change procedure that kernels or bootloaders can get wrong, but no such procedure appears to be documented for the IXP420.

For comparison, my 233 MHz Pentium2 MMX running NetBSD 1.6.2 yields 189 and 213 VAX MIPS from Dhrystone 2.1 and 1.1 respectively.

-- yahpn

Additional tests on SlugOS/BE

Environment:
--adriansi
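
As a footnote to the Dhrystone section earlier on this page: the VAX MIPS ratings quoted there follow from dividing Dhrystones per second by 1757, the conventional score of the VAX 11/780 reference machine. The sketch below redoes that arithmetic and also expresses the result per MHz for both the nominal 266 MHz clock and the 133 MHz that the CCNT register suggested. The function names are made up for this example, and the per-MHz figures are just a convenient way of looking at the same numbers, not an additional measurement.

```python
VAX_11_780_DHRYSTONES_PER_SEC = 1757.0  # conventional 1-MIPS reference score


def vax_mips(dhrystones_per_sec):
    """Convert a Dhrystones/second score into a VAX MIPS rating."""
    return dhrystones_per_sec / VAX_11_780_DHRYSTONES_PER_SEC


def mips_per_mhz(dhrystones_per_sec, clock_mhz):
    """VAX MIPS rating normalised by an assumed core clock."""
    return vax_mips(dhrystones_per_sec) / clock_mhz


# Scores copied from the Dhrystone section of this page
scores = {"Dhrystone 2.1": 155440.4, "Dhrystone 1.1": 182815.4}

for name, dps in scores.items():
    print(f"{name}: {vax_mips(dps):.3f} VAX MIPS, "
          f"{mips_per_mhz(dps, 266):.3f}/MHz at 266 MHz, "
          f"{mips_per_mhz(dps, 133):.3f}/MHz at 133 MHz")
```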
Last edited by adriansi.
Based on work by adriansi, kerry, Biboobox, Reedy Boy, Kassidy Clark, rwhitby, marceln, Blastur, Nate S, quandary, Xnaron, Zhyla, yahpn, bobtm, Jason O'Rourke, and uSURPER. Originally by bobtm. Page last modified on November 10, 2008, at 09:58 PM