SATA Performance issues

James Barker

Jan 16, 2019, 2:08:35 PM
to GnuBee
I posted this issue (which is probably the wrong place!) on Neil Brown's gnubee-tools repo on GitHub, but this might be a better place for the discussion, so I'll copy it here (https://github.com/neilbrown/gnubee-tools/issues/5).

jcbdev commented 14 hours ago  
edited 

This is probably more of a question, as you seem to have more experience with hardware than I do (I am a programmer). I have been messing around with all the various RAID combinations on the device (RAID 0, 1 and 10), and no matter which combination I try I get the exact same write speed (about 72 MB/s), which I believe is the maximum write speed of the disks I am using when used individually.

So I tried a dd if=/dev/zero of=test bs=1M count=1000 on two threads to two of the drives and got the same combined write speed of 72 MB/s (when you add the speeds together).
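
For reference, the two-drive run was along these lines (the mount points here are placeholders, one per drive, not my exact paths):

dd if=/dev/zero of=/mnt/diskA/test bs=1M count=1000 &
dd if=/dev/zero of=/mnt/diskB/test bs=1M count=1000 &
wait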

This got me scratching my head a bit. Is there only one 6 Gbps lane that all the ports are multiplexed through, or is there something strange going on? I should at least have gotten above the write speed of a single drive.

I have tried old kernels too (like the original v3 kernel) and different Debian releases (jessie, stretch and buster), and I think I get the same results.

Is there a hardware limitation I am missing or something?

On a side note, thanks for all your hard work! It has been inspiring to play with your tools and build the kernels, etc.


jcbdev commented 13 hours ago

I just tried with all six disks; here is the output. Notice the 19 MB/s per drive. Are all the SATA ports multiplexed together onto one lane or something?

root@gnubee-n1:~# dd if=/dev/zero of=/data/brick1/test bs=1M count=1000 & dd if=/dev/zero of=/data/brick2/test bs=1M count=1000 & dd if=/dev/zero of=/data/brick3/test bs=1M count=1000 & dd if=/dev/zero of=/data/brick4/test bs=1M count=1000 & dd if=/dev/zero of=/data/brick5/test bs=1M count=1000 & dd if=/dev/zero of=/data/brick6/test bs=1M count=1000
[1] 3451
[2] 3452
[3] 3453
[4] 3454
[5] 3455
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 53.403 s, 19.6 MB/s
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 53.3504 s, 19.7 MB/s
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 54.4687 s, 19.3 MB/s
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 54.5439 s, 19.2 MB/s
[2]   Done                    dd if=/dev/zero of=/data/brick2/test bs=1M count=1000
[3]   Done                    dd if=/dev/zero of=/data/brick3/test bs=1M count=1000
[5]+  Done                    dd if=/dev/zero of=/data/brick5/test bs=1M count=1000
root@gnubee-n1:~# 1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 55.0374 s, 19.1 MB/s
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 54.3202 s, 19.3 MB/s
root@gnubee-n1:~#

jcbdev commented 12 hours ago

According to the MediaTek specs the chip has 3 PCI Express lanes, and looking at the specs for the ASM1061 (and the commodity 1061 cards you can buy on eBay) they claim you can run two full SATA III 6 Gbps ports on one PCIe lane. Looking at the bottom of the board (GnuBee PC2) I can see three ASM1061 chips, all seemingly connected directly to the MediaTek chip on separate lanes.

So from a theoretical hardware perspective that all adds up to the prospect of six full-speed SATA ports! But as you can see above, the maximum throughput I can push through all the buses at the same time is less than one saturated SATA III port. Far, far less! I'm only getting around 100 MB/s, when even one SATA II port should theoretically be able to hit 300 MB/s. Even the original PCIe v1.0a has a theoretical throughput of 250 MB/s per lane, so even if these were three PCIe v1.0a lanes we should see more than 100 MB/s across all devices.

This suggests there's a kernel or driver issue somewhere that needs to be addressed, unless I'm misinterpreting the specs (or maybe there's a hardware bottleneck elsewhere?). I'd love to help work on this if you think it's an issue. (Sorry for the spam, this is my first time hacking at a kernel for an SBC and I'm loving it.) I just need some direction on where to focus my efforts!

http://www.asmedia.com.tw/eng/e_show_products.php?item=118
https://wikidevi.com/wiki/MediaTek_MT7621

James Barker

Jan 16, 2019, 7:34:13 PM
to GnuBee
I think it might be CPU power that's the bottleneck.

If I run this in one terminal to get throughput stats:
iostat -xd 5 -m

and a benchmark dd command against any number of devices (sda, or sda and sdb, or all of them, etc.), then the far-right %util column always hits 100.

According to the man page for iostat, 100% util means the system is maxed out (CPU).

So it's possible it's not the devices but the CPU power?
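
For anyone who wants to reproduce the check, it was basically this in two terminals (the mount point is a placeholder):

# terminal 1: per-device stats every 5 seconds, in MB
iostat -xd 5 -m

# terminal 2: generate the write load
dd if=/dev/zero of=/mnt/diskA/test bs=1M count=1000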

James Barker

Jan 17, 2019, 3:28:39 PM
to GnuBee
Did some further investigating.

It's not a CPU bottleneck, because I used an ODROID XU4, exposed all the devices over NBD (network block device) (I also tried iSCSI), mounted them on the ODROID, and ran the same benchmarks from there (so the ODROID's CPU is now managing the filesystem etc. and takes most of the CPU load on the client side).

I get the same spread of shared bandwidth between all the SATA devices, adding up to a maximum of approximately 74 MB/s, but the CPU load on the GnuBee is barely stretched at all (about 30%), because the ODROID CPU on the NBD client side is doing most of the work (filesystem etc.).
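
The NBD side was set up roughly like this (a sketch; the export name, IP address and device paths are placeholders, not my exact config):

# on the GnuBee: /etc/nbd-server/config
[generic]
[disk1]
    exportname = /dev/sda1

# on the ODROID: attach the export and run the same dd benchmark there
modprobe nbd
nbd-client 192.168.0.10 -N disk1 /dev/nbd0   # 192.168.0.10 = GnuBee IP (placeholder)
mount /dev/nbd0 /mnt/gnubee-disk1
dd if=/dev/zero of=/mnt/gnubee-disk1/test bs=1M count=1000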

I'm starting to think the MediaTek SoC itself has split one PCIe lane into three with a built-in multiplexer? (Assuming it's not a kernel issue.)

The only thing that makes me think it could be a kernel issue is that the data sheet for the MT7621A says it's PCIe v1.1, which has a per-lane bandwidth of 250 MB/s, so theoretically it should be able to go higher. I'm going to test it with some SSDs today to see what happens.



James Barker

Jan 17, 2019, 4:15:39 PM
to GnuBee
So just to confirm:

I plugged a Samsung EVO SSD into one of the SATA ports. On my PC I get approximately 420 MB/s write with the exact same command on the same Debian release.

On the GnuBee I get results identical to the HDD ones (72 MB/s):

root@gnubee-n1:~# dd if=/dev/zero of=/data/brick1/test bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 14.5713 s, 72.0 MB/s
root@gnubee-n1:~#

So there's certainly a bottleneck of around 70 MB/s shared between all devices. Whether it's a hardware or software bottleneck I have no idea, but ho-hum. Without anyone else to help shed more light on it, I'm going to assume MediaTek has cheated on the claim of "3x PCIe" by using a multiplexer internally.

Still an awesome device though. Good as a backup/rsync box, I reckon, but probably not quite up to, say, media streaming through a connected Plex server (in a household with a few kids who all want to watch different things, I think it will struggle!).

Anyone else in the group have thoughts on this? (Could it be a driver issue? Or do you get totally different results on your device?)

James Barker

Jan 23, 2019, 5:46:19 PM
to GnuBee
SOOOO, kind of good news I think (if anyone's still reading this forum?). If you are late to the party like me and looking to squeeze some better performance out of this device, this might help you out!

I've been tinkering and I think it's a Debian/driver issue, not a hardware issue.

Before, no matter how many parallel writes I tried on the SATA ports, they would only add up to 72 MB/s (the speed of one drive), just divided by the number of drives.

For example, with three drives, /dev/sda1, /dev/sdb2 and /dev/sdb3 (two xfs, one ext4):

On all of the Debian installs, based on any of the boot kernels (Neil Brown's mainline kernel, the LEDE builds hanging around in this group, and the original openwrt 14.01 firmware that comes with the device), plus any Debian install on an external SSD (via usb4), I would get the exact same result: 72 MB/s divided between the three drives (about 25 MB/s per drive). This means the maximum performance of this as a NAS, even in RAID 0 over a bunch of disks, would always be 72 MB/s. It also takes a massive overhead hit when mounting over the network via any protocol such as NFS/CIFS/iSCSI/NBD (the last two to offload the CPU drain of managing RAID or the filesystem to the client device by exporting a raw block device across the network) and always maxes out at around 30-40 MB/s. I've heard that's common "real world" performance for 1 Gbps network adapters anyway, so that might be the network-mount bottleneck (the next problem to solve, with bonding or whatever).

But, on to the good news. After reading this fantastic blog -> https://stumbles.id.au/getting-started-with-the-gnubee-personal-cloud-2-or-1.html it got me wondering whether OpenWrt is the key to the best performance. I installed the initramfs kernel first, but when I realised it doesn't save settings on reboot I flashed the squashfs sysupgrade bin via the flash page, which now lets me save changes across reboots. As he mentions on that page, getting it into DHCP mode is a total pain because of the 30-second rollback feature. The best way I found to make it work and persist was:
1) switch the protocol to DHCP
2) on the host, get everything ready to run an 'arp-scan 192.168.0.1/24' command in a terminal
3) hit Save & Apply in the UI
4) swap the direct cable in the black port for a network cable connected to the normal network
5) run the arp scan
6) log in to the new IP that appears in the list, in the browser, AS QUICKLY AS POSSIBLE (30 seconds isn't a long time to do all this)

If you do this it will save, and now you have the NAS on your normal network and you can SSH in and download packages from the internet.
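
If you have serial or SSH access there's a less frantic way to make the same switch from the command line (a sketch, assuming the LAN interface is called 'lan' as on a default OpenWrt install):

uci set network.lan.proto='dhcp'
uci commit network
/etc/init.d/network restart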

I installed the e2fsprogs packages and all the xfs packages so I could mount my six drives.
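
The installs were along these lines (exact package names may vary slightly between OpenWrt releases):

opkg update
opkg install e2fsprogs kmod-fs-ext4 xfsprogs kmod-fs-xfs
opkg install mdadm    # only if you want to assemble RAID arrays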

I then ran the following parallel test on three of the disks:
root@OpenWrt:/mnt/disk1# time dd if=/dev/zero of=/mnt/disk1/test bs=1M count=1000 & time dd if=/dev/zero of=/mnt/disk2/test bs=1M count=1000 & time dd if=/dev/zero of=/mnt/disk2/test bs=1M count=1000

1000+0 records in
1000+0 records out
real    0m 21.55s
user    0m 0.09s
sys     0m 21.80s

1000+0 records in
1000+0 records out
real    0m 31.13s
user    0m 0.06s
sys     0m 19.52s

1000+0 records in
1000+0 records out
real    0m 31.23s
user    0m 0.01s
sys     0m 18.55s

(dd on OpenWrt doesn't show you MB/s like it does on Debian.)

If you do the math, though, I am writing 1000 MB to the disk in approximately 31 seconds on most of the devices. When you add up all the rates, the total throughput is around 111 MB/s.
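
Working it through, with 1000 MB per dd and the wall-clock times above:

1000 MB / 21.55 s ≈ 46 MB/s
1000 MB / 31.13 s ≈ 32 MB/s
1000 MB / 31.23 s ≈ 32 MB/s
total             ≈ 110 MB/s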

This is significantly better than any of the results I got on Debian. I also tried my SSD and got 150 MB/s over USB.

I think this (a) shows it's a driver issue, not a hardware bottleneck, and (b) shows we can actually set up a decent RAID on this.

I will update with further info even if no one is listening hahaha.

Let me know if anyone is interested in knowing more.

Todd Adam

Feb 28, 2019, 3:19:14 PM
to GnuBee
I’m interested in hearing more. Any new information?

Razmkhah Rémi

Mar 17, 2019, 6:32:44 PM
to GnuBee
I would be interested if you have any new findings.

jjakob

Jan 7, 2020, 11:39:40 AM
to GnuBee
Results from a 5.4 kernel compiled from https://github.com/neilbrown/gnubee-tools, with an ext4 filesystem on an SSD partition.

root@gnubee:~# uname -a
Linux gnubee.gnubee 5.4.6+ #3 SMP Mon Jan 6 20:31:09 CET 2020 mips GNU/Linux

In the oflag=direct case, cpu usage is
5.5 us, 17.5 sy,  0.0 ni, 64.0 id, 12.6 wa,  0.0 hi,  0.3 si,  0.0 st
with dd using 47% of 1 cpu.
dd if=/dev/zero of=/root/test-1gb-zero bs=1M count=1000 oflag=direct                                                                                                    
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 11.1887 s, 93.7 MB/s



In the default case without oflag=direct, cpu usage is
6.8 us, 36.6 sy,  0.0 ni, 55.6 id,  0.0 wa,  0.0 hi,  1.0 si,  0.0 st
and dd consumes 99% of 1 cpu.
kworker/u8:0-flush-8:0 also comes to the 10% range and kswapd comes up to 22% periodically. This leads me to believe it's a memory, cache or scheduling issue.
dd if=/dev/zero of=/root/test-1gb-zero bs=1M count=1000                                                                                                                
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 21.7277 s, 48.3 MB/s
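
If the buffered write-back path is the suspect, one experiment worth trying (just a guess on my part; the values below are arbitrary starting points) is to shrink the dirty-page thresholds so writeback starts earlier, then rerun the default-case dd:

sysctl -w vm.dirty_background_ratio=2
sysctl -w vm.dirty_ratio=5
dd if=/dev/zero of=/root/test-1gb-zero bs=1M count=1000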

Interrupts before and after a dd (default) run:
# cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
  7:          0          0          0          0      MIPS   7  timer
  8:    5488291    5471178    5683959    5474474  MIPS GIC Local   1  timer
  9:   13882051          0          0          0  MIPS GIC  63  IPI call
 10:          0    7854029          0          0  MIPS GIC  64  IPI call
 11:          0          0   14188299          0  MIPS GIC  65  IPI call
 12:          0          0          0    7681272  MIPS GIC  66  IPI call
 13:    1527765          0          0          0  MIPS GIC  67  IPI resched
 14:          0    1580563          0          0  MIPS GIC  68  IPI resched
 15:          0          0    1669087          0  MIPS GIC  69  IPI resched
 16:          0          0          0    1585764  MIPS GIC  70  IPI resched
 17:         37          0          0          0  MIPS GIC  19  1e000600.gpio-bank0, 1e000600.gpio-bank1, 1e000600.gpio-bank2
 18:       1415          0          0          0  MIPS GIC  33  ttyS0
 19:          0          0          0          0  MIPS GIC  27  1e130000.sdhci
 20:         29          0          0          0  MIPS GIC  29  xhci-hcd:usb1
 21:     467043          0          0          0  MIPS GIC  10  1e100000.ethernet
 23:     397374          0          0          0  MIPS GIC  11  ahci[0000:01:00.0]
 24:          0          0          0          0  MIPS GIC  31  ahci[0000:02:00.0]
 25:        397          0          0          0  MIPS GIC  32  ahci[0000:03:00.0]
 26:         37          0          0          0  1e000600.gpio  18  reset
ERR:        439

root@gnubee:~# cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
  7:          0          0          0          0      MIPS   7  timer
  8:    5494033    5476867    5689943    5480236  MIPS GIC Local   1  timer
  9:   13892559          0          0          0  MIPS GIC  63  IPI call
 10:          0    7857592          0          0  MIPS GIC  64  IPI call
 11:          0          0   14197088          0  MIPS GIC  65  IPI call
 12:          0          0          0    7685115  MIPS GIC  66  IPI call
 13:    1529080          0          0          0  MIPS GIC  67  IPI resched
 14:          0    1581723          0          0  MIPS GIC  68  IPI resched
 15:          0          0    1670498          0  MIPS GIC  69  IPI resched
 16:          0          0          0    1587295  MIPS GIC  70  IPI resched
 17:         37          0          0          0  MIPS GIC  19  1e000600.gpio-bank0, 1e000600.gpio-bank1, 1e000600.gpio-bank2
 18:       1415          0          0          0  MIPS GIC  33  ttyS0
 19:          0          0          0          0  MIPS GIC  27  1e130000.sdhci
 20:         29          0          0          0  MIPS GIC  29  xhci-hcd:usb1
 21:     467257          0          0          0  MIPS GIC  10  1e100000.ethernet
 23:     398796          0          0          0  MIPS GIC  11  ahci[0000:01:00.0]
 24:          0          0          0          0  MIPS GIC  31  ahci[0000:02:00.0]
 25:        397          0          0          0  MIPS GIC  32  ahci[0000:03:00.0]
 26:         37          0          0          0  1e000600.gpio  18  reset
ERR:        440

The Ethernet and all ahci (the PCIe SATA controller) interrupts land on CPU0. While this is not the issue in my test case of writing to only one disk locally, it could be a separate issue when accessing multiple disks over the network, as all of them use CPU0 interrupts.



Eric Culp

Jan 8, 2020, 10:34:09 AM
to jjakob, GnuBee
You can play around with the IRQ smp_affinity bitmask. On my x86 systems the IRQs seem to balance on their own. I think that on this MIPS board, Linux cannot figure out how to assign an IRQ to multiple cores and just assigns them all to the first core of the bitmask.

However, if you set the smp_affinity to a specific core, Linux will use the first CPU of your affinity mask. E.g. you could reassign the three ahci IRQs to core 1, core 2 and core 3 with this:

echo 2 > /proc/irq/23/smp_affinity
echo 4 > /proc/irq/24/smp_affinity
echo 8 > /proc/irq/25/smp_affinity
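
Note the affinity resets on reboot and the IRQ numbers can differ between kernels, so check /proc/interrupts first. If it helps, one way to make it stick (assuming your image runs /etc/rc.local at boot) is to put the same lines in there:

# /etc/rc.local (sketch; adjust the IRQ numbers to your /proc/interrupts)
echo 2 > /proc/irq/23/smp_affinity
echo 4 > /proc/irq/24/smp_affinity
echo 8 > /proc/irq/25/smp_affinity
exit 0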


jjakob

Jan 12, 2020, 8:51:32 AM
to GnuBee
I tried setting the smp_affinity for the three ahci controllers. There is no difference in simultaneous or single read speed.

I can read at 80 MB/s from ext4 on an SSD on port 1 while at the same time copying at 35 MB/s from ext4 on a RAID1 array on ports 5 and 6 to a USB3 drive. What's interesting is that even when reading from both drives on ports 5 and 6 at the same time, the total speed stays at ~35 MB/s. These drives share one PCIe link and one ASM1061 PCIe-to-SATA bridge. The ASM1061 is definitely capable of SATA at the PCIe-rated throughput, as some benchmarks show.

sda is an SSD on port 1, controller 1.
sdb, sdc are spinning drives on ports 3, 4 on controller 2.
sde, sdf are spinning drives on ports 5, 6 on controller 3.

01:46:44 PM       DEV       tps     rkB/s     wkB/s   areq-sz    aqu-sz     await     svctm     %util
01:46:46 PM       sda    120.00  80128.00      0.00    667.73      0.00      5.47      8.21     98.50
01:46:46 PM       sdb      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
01:46:46 PM       sdc      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
01:46:46 PM       sdd     36.50      0.00  30720.00    841.64      0.12      9.16      5.62     20.50
01:46:46 PM       sde    106.00  27264.00      0.00    257.21      0.00      2.09      9.34     99.00
01:46:46 PM       sdf      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
01:46:46 PM       md1    107.00  27320.25      0.00    255.33      0.00      0.00      0.00      0.00

sde and sdf are part of md1 on the same PCI-e link, sda is the SSD, sdd is the USB3 drive.

This is reading from 3 drives simultaneously with dd:
Average:          DEV       tps     rkB/s     wkB/s   areq-sz    aqu-sz     await     svctm     %util                                                      
Average:          sda    107.88  61493.50      9.50    570.13      0.00      5.03      8.93     96.37
Average:          sdb      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sdc      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:          sdd      0.38      0.00    129.50    345.33      0.01     27.00     13.33      0.50
Average:          sde     86.25  54272.00      0.00    629.24      0.25      7.94     10.86     93.63
Average:          sdf     86.75  54656.00      0.00    630.04      0.27      8.10     10.79     93.63
Average:          md1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

This is reading from 5 drives simultaneously with dd with smp_affinity set to 2, 4, 8:

Average:          DEV       tps     rkB/s     wkB/s   areq-sz    aqu-sz     await     svctm     %util                                                      
Average:          sda     65.17  40277.33      0.00    618.07      0.00      5.42     11.48     74.83
Average:          sdb     56.00  34133.33      0.00    609.52      0.10      6.87     12.98     72.67
Average:          sdc     55.83  34816.00      0.00    623.57      0.07      6.56     12.90     72.00
Average:          sdd      0.33      0.00      2.00      6.00      0.08    249.50     10.00      0.33
Average:          sde     56.83  35072.00      0.00    617.10      0.11      7.14     12.76     72.50
Average:          sdf     55.17  33536.00      0.00    607.90      0.10      7.08     12.75     70.33
Average:          md1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

This is reading from 5 drives simultaneously with dd with smp_affinity set to f (default, all on cpu0):

Average:          DEV       tps     rkB/s     wkB/s   areq-sz    aqu-sz     await     svctm     %util                                                      
Average:          sda     61.50  35496.00      0.00    577.17      0.00      5.34     11.59     71.25
Average:          sdb     58.25  33408.00      0.00    573.53      0.11      6.80     11.93     69.50
Average:          sdc     60.50  34688.00      0.00    573.36      0.07      6.62     11.94     72.25
Average:          sdd      0.25      0.00    256.00   1024.00      0.00     10.00     20.00      0.50
Average:          sde     61.00  35584.00      0.00    583.34      0.12      7.23     11.89     72.50
Average:          sdf     62.00  36992.00      0.00    596.65      0.07      6.43     12.18     75.50
Average:          md1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

So the IRQ smp_affinity does make a small difference, but it's not the issue.

a) direct reads on one drive are limited to ~80-90MB/s
b) direct reads on two drives on one controller are limited to ~35MB/s each, 70MB/s total
c) ext4 reads on a drive of a md raid1 are limited to 35MB/s
d) the total throughput of reading 5 drives at once can reach ~176MB/s.
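
For anyone who wants to repeat the comparison, the parallel direct-read test was along these lines (the device names are examples from my layout; adjust to yours):

for d in sda sdb sdc sde sdf; do
    dd if=/dev/$d of=/dev/null bs=1M count=1000 iflag=direct &
done
wait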


jjakob

Jan 13, 2020, 5:36:43 PM
to GnuBee
RAM bandwidth on the GnuBee is seriously lacking. I only get ~90 MB/s writing to a tmpfs.

root@gnubee:~# dd if=/dev/zero of=/dev/null bs=1M count=1000

1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 3.64753 s, 287 MB/s

root@gnubee:~# mount -t tmpfs none /mnt/tmp
root@gnubee:~# dd if=/dev/zero of=/mnt/tmp/test_$$ bs=1M conv=fdatasync
dd: error writing '/mnt/tmp/test_1978': No space left on device
250+0 records in
249+0 records out
261382144 bytes (261 MB, 249 MiB) copied, 2.90455 s, 90.0 MB/s
root@gnubee:~# rm /mnt/tmp/test_1978
root@gnubee:~# umount /mnt/tmp
root@gnubee:~# mount -t tmpfs none /mnt/tmp
root@gnubee:~# dd if=/dev/zero of=/mnt/tmp/test_$$ bs=1M oflag=dsync
dd: error writing '/mnt/tmp/test_1978': No space left on device
250+0 records in
249+0 records out
261382144 bytes (261 MB, 249 MiB) copied, 2.87896 s, 90.8 MB/s
root@gnubee:~# rm /mnt/tmp/test_1978
root@gnubee:~# umount /mnt/tmp

Compared to a Core 2 Duo with dual channel DDR3:

:~$ dd if=/dev/zero of=/dev/null bs=1M count=1000

1000+0 records in
1000+0 records out
1048576000 bytes (1,0 GB, 1000 MiB) copied, 0,189893 s, 5,5 GB/s

:~$ sudo mount -t tmpfs none /mnt/tmp
:~$ dd if=/dev/zero of=/mnt/tmp/test_$$ bs=1M count=1000 conv=fdatasync
1000+0 records in
1000+0 records out
1048576000 bytes (1,0 GB, 1000 MiB) copied, 0,952923 s, 1,1 GB/s
:~$ sudo umount /mnt/tmp
:~$ sudo mount -t tmpfs none /mnt/tmp
:~$ dd if=/dev/zero of=/mnt/tmp/test_$$ bs=1M count=1000 oflag=dsync
1000+0 records in
1000+0 records out
1048576000 bytes (1,0 GB, 1000 MiB) copied, 1,01765 s, 1,0 GB/s
:~$ sudo umount /mnt/tmp

This is on a Ubiquiti EdgeRouter Lite (ER-3), which is a similarly rated device (4-core 1 GHz, 512 MB RAM) running 4.9.79-UBNT mips64:
ubnt@ubnt:~$ time dd if=/dev/zero of=/dev/null bs=1M count=1000

1000+0 records in
1000+0 records out


real    0m0.570s
user    0m0.000s
sys     0m0.560s

1 GB / 0.570 s = 1.75 GB/s

It's not the same SoC (it's a Cavium Octeon II). It seems to have four separate memory chips. The datasheet for the CN6130 says "DDR3 up to 1066 MHz, 1 x 72-bit".

The MT7621 only has a 16-bit memory data bus; 72 bits is 4.5x 16 bits. The difference in measured dd bandwidth between the MT and the CN is ~6.2x. So that could be the issue, in addition to a slower CPU and an unknown (possibly lower) DDR frequency on the GnuBee.
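
Spelling that comparison out:

1 GB / 0.570 s        ≈ 1.75 GB/s   (CN6130, dd /dev/zero -> /dev/null)
1.75 GB/s / 287 MB/s  ≈ 6x          (measured bandwidth ratio)
72 bit / 16 bit       = 4.5x        (memory bus width ratio)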

IMO the limitation is the MT7621 SoC itself. It's simply not made to process/shuffle data this fast. More likely it's designed to be a small managed switch/firewall with the processor doing only minor work.


jjakob

Jan 13, 2020, 6:02:29 PM
to GnuBee
If anyone has a different device with the same MT7621 SoC, it would be interesting to see if they have the same dd numbers. That way we could be sure everything's configured correctly in the SoC registers re: clocks and timings.

Some devices with the same SoC:

Alex Davies

Apr 22, 2020, 12:23:11 PM
to GnuBee
Man, I hope this gets solved. I bought the GnuBee because I need to power the RAID server off a 24V battery bank, but with this kind of performance I'm probably going to need to scrap that plan. A very disappointing experience with the GnuBee so far; thank you to the community who has made it as viable as it has been.

Jernej Jakob

Apr 22, 2020, 12:36:20 PM
to gnu...@googlegroups.com
I wouldn't hold my breath for a massive increase. I think it's due to the MediaTek SoC having just one narrow RAM channel, so it's memory-access limited, and not SATA or PCIe or interrupts or anything like that. I posted my testing results here a while ago (and comparing it to a similar SoC with about twice as much RAM bandwidth that had about twice as much disk speed, I'd say that's pretty definitive). That was with the 5.4 kernel though; I haven't upgraded to 5.6 yet.

Alex Davies

Apr 22, 2020, 12:41:17 PM
to GnuBee
>When you add up all the rates, the total throughput is around 111 MB/s

Am I misreading that? You did that test with OpenWrt on a GnuBee, right? That is much better than what I'm seeing right now, and is closer to the kind of write speeds you'd expect. Certainly it's "good enough" to do things like serve media files.