CuBox-i boot times

Rudi

Feb 6, 2014, 3:29:22 PM
to openbric...@googlegroups.com
Hi folks,

While running a first test with a CuBox-i4pro, I noticed that, using the same
SD card, it boots significantly slower than the Hummingboard (solo).

Any ideas what's wrong here?


BTW, I also learned that it doesn't seem to like this type of SD card:

http://www.amazon.de/gp/product/B00CBPVOZM/ref=ox_ya_os_product




--

Ruediger "Rudi" Ihle


Rabeeh Khoury

Feb 7, 2014, 5:38:10 AM
to openbric...@googlegroups.com

Hi Rudi.
I'm assuming that you are running with SPL u-boot. Can u try that SD card with older non SPL u-boot?

--
You received this message because you are subscribed to the Google Groups "OpenBricks Development List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openbricks-devel+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Rudi

Feb 8, 2014, 9:37:46 AM
to openbric...@googlegroups.com
Hello Rabeeh,

> I'm assuming that you are running with SPL u-boot.

Yes.


> Can u try that SD card with older non SPL u-boot?

Tried it but it didn't change anything.

However, it looks like it's related to the SD card subsystem and/or the cache.
I tested the speed by dd'ing the xbmc executable (14MB) to /dev/null. On
the Hummingboard I get about 135MB/s, but on the i4pro only about 90MB/s.
For writes I get 32MB/s vs. 24MB/s. Note that I just swapped the same card
(Transcend 8GB class 10) between the devices. There is a factor of roughly
1.5 between the rates. Could it be that some clock is not set up correctly?
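
To put numbers on that factor, here is a small sketch that computes the ratios from the rates quoted above. The helper name `ratio` is made up for illustration; only the four rates come from this post.

```shell
#!/bin/sh
# Ratio of two throughput figures, printed with two decimals.
# "ratio" is a hypothetical helper, not part of any tool mentioned in the thread.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

ratio 135 90   # cached reads, Hummingboard vs. i4pro  -> 1.50
ratio 32 24    # writes, Hummingboard vs. i4pro        -> 1.33
```

So the gap is about 1.5 for the (cached) reads and about 1.33 for the writes.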


> BTW, I also learned that it doesn't seem to like this type of SD card:
>
> http://www.amazon.de/gp/product/B00CBPVOZM/ref=ox_ya_os_product

Just in case I was not clear enough: This is a different problem (UHS-1).


--

Ruediger "Rudi" Ihle


Stéphan Rafin

Feb 8, 2014, 10:19:01 AM
to openbric...@googlegroups.com
Hi Rudi,

The read bandwidth you report (135MB/s) seems really very good...
Could you issue
    echo 3 > /proc/sys/vm/drop_caches
before running your benchmark, to be sure you don't read from the IO cache?

(For write measurements you should of course also use the "conv=fdatasync" option if you benchmark with dd...)
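
A minimal sketch of such a cold-cache read test, assuming root access and a POSIX-style shell such as BusyBox ash. The function names and the timing with `date +%s` are illustrative assumptions, not something from this thread. Since not every dd build prints a throughput line, the rate is computed by hand.

```shell
#!/bin/sh
# Sketch: time a cold-cache read of one file with dd.
# All function names here are hypothetical helpers.

# Flush dirty pages, then drop page cache, dentries and inodes (needs root).
drop_caches() {
    sync
    if ! { echo 3 > /proc/sys/vm/drop_caches; } 2>/dev/null; then
        echo "warning: could not drop caches (not root?), read may be cached" >&2
    fi
}

# Print throughput for <bytes> transferred in <seconds>; 1 MB is taken as 2^20 bytes.
# Elapsed times below one second are clamped to one second.
mb_per_s() {
    awk -v b="$1" -v s="$2" \
        'BEGIN { if (s < 1) s = 1; printf "%.1f MB/s\n", b / s / 1048576 }'
}

# Read the whole file to /dev/null and report the average rate.
read_bench() {
    file=$1
    bytes=$(wc -c < "$file")
    drop_caches
    start=$(date +%s)
    dd if="$file" of=/dev/null bs=1M 2>/dev/null
    end=$(date +%s)
    mb_per_s "$bytes" "$((end - start))"
}

# Run only when a file name is given on the command line.
if [ -n "${1:-}" ]; then
    read_bench "$1"
fi
```

The timer only has one-second resolution, so the file should be large enough that the read takes many seconds.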

Best regards
Stephan

Rudi

Feb 8, 2014, 11:35:20 AM
to openbric...@googlegroups.com
Hi Stéphan,

You are right that the high rates come from caching. Clearing
the cache and using a larger file gives more realistic speed results. Now
I get 14..18MB/s for reads on both devices. However, with caching in effect
the numbers are as described. This means that the cache works much more
efficiently on the Hummingboard than on the i4pro. So we need to look at the
memory setup rather than at the SD card code. Or do these four cores cause
thrashing?

BTW, BusyBox' dd doesn't support "conv=fdatasync" but does "conv=sync"


Cheers !


--

Ruediger "Rudi" Ihle


Stéphan Rafin

Feb 8, 2014, 11:58:34 AM
to openbric...@googlegroups.com
Hi Rudi,

> You are right that the high rates come from caching. Clearing
> the cache and using a larger file gives more realistic speed results. Now
> I get 14..18MB/s for reads on both devices. However, with caching in effect
> the numbers are as described. This means that the cache works much more
> efficiently on the Hummingboard than on the i4pro. So we need to look at the
> memory setup rather than at the SD card code. Or do these four cores cause
> thrashing?

I have to admit that, apart from pointing out the overly optimistic
measurement, I don't have a good explanation for the performance difference.
The IO cache lives in DDR (and possibly, to a small but hard-to-quantify
extent, in the L1/L2 caches), so if anything the i4's bandwidth should be
clearly better (higher frequency and a 64-bit vs. 32-bit memory bus).
If you suspect a multicore issue, you can try passing maxcpus=1 as a
kernel option to use only one core on the CuBox-i4...
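
How the option gets onto the kernel command line depends on the boot setup, which is not described here. Once booted, a quick check that it took effect could look like the sketch below. The function name is hypothetical; `/proc/cmdline` and `/sys/devices/system/cpu/online` are standard Linux interfaces.

```shell
#!/bin/sh
# Sketch: verify that maxcpus=N reached the kernel and see which CPUs are online.
# maxcpus_from_cmdline is a hypothetical helper.

# Print the value of maxcpus= from a kernel command line string (nothing if absent).
# Runs in a subshell with globbing disabled so the string is split on whitespace only.
maxcpus_from_cmdline() (
    set -f
    for arg in $1; do
        case $arg in
            maxcpus=*) echo "${arg#maxcpus=}" ;;
        esac
    done
)

if [ -r /proc/cmdline ]; then
    echo "maxcpus on command line: $(maxcpus_from_cmdline "$(cat /proc/cmdline)")"
fi
if [ -r /sys/devices/system/cpu/online ]; then
    echo "online CPUs: $(cat /sys/devices/system/cpu/online)"
fi
```

With maxcpus=1 in effect, the online list should show only CPU 0.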

>
> BTW, BusyBox' dd doesn't support "conv=fdatasync" but does "conv=sync"
For BusyBox, I am not sure conv=sync is really the same thing...
At least according to the documentation:
(http://www.busybox.net/downloads/BusyBox.html)

conv=sync Pad blocks with zeros
conv=fsync Physically write data out before finishing

So, it would rather be "conv=fsync" don't you think ?

Stephan

Rudi

Feb 8, 2014, 12:10:43 PM
to openbric...@googlegroups.com
On 08.02.2014 17:58, Stéphan Rafin wrote:

> I have to admit that, apart from pointing out the overly optimistic
> measurement, I don't have a good explanation for the performance difference.

It's even visible when watching the flow of systemd's messages
on the console :-(.


> If you suspect a multicore issue, you can try passing maxcpus=1 as a
> kernel option to use only one core on the CuBox-i4...

I'll try that. Tomorrow...


> So, it would rather be "conv=fsync" don't you think ?

Yep. Stupid me. But "conv=fsync" gives an error message as well.
But for reading it's probably not so important.
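
For write tests on a dd build that accepts neither "conv=fdatasync" nor "conv=fsync", one possible workaround is to include an explicit `sync` in the timed region. This is only a sketch under that assumption; the function name, the default of 64 MiB and the use of /dev/zero as the data source are illustrative choices, not from this thread.

```shell
#!/bin/sh
# Sketch: write benchmark that does not rely on dd's conv=fsync/fdatasync.
# write_bench is a hypothetical helper.

# Write <mib> MiB of zeros to <target>, flush to the medium, report the rate.
# The target file is removed afterwards.
write_bench() {
    target=$1
    mib=$2
    start=$(date +%s)
    dd if=/dev/zero of="$target" bs=1M count="$mib" 2>/dev/null
    sync    # flushes all filesystems, so run this on an otherwise idle system
    end=$(date +%s)
    rm -f "$target"
    # Elapsed times below one second are clamped to one second.
    awk -v mib="$mib" -v s="$((end - start))" \
        'BEGIN { if (s < 1) s = 1; printf "%.1f MB/s\n", mib / s }'
}

# Run only when a target file name is given; size in MiB is the optional 2nd argument.
if [ -n "${1:-}" ]; then
    write_bench "$1" "${2:-64}"
fi
```

As the timer has one-second resolution, the amount written should be large enough to take many seconds on the card.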




--

Ruediger "Rudi" Ihle


Nikolay Nikolaev

Feb 10, 2014, 3:42:04 PM
to openbric...@googlegroups.com
Hello,


An interesting insight from Russell King on iMX6 vs SD-card issues:

 



regards,
Nikolay Nikolaev

Rudi

Feb 20, 2014, 4:10:40 AM
to openbric...@googlegroups.com
On 08.02.2014 17:58, Stéphan Rafin wrote:

> I have to admit that, apart from pointing out the overly optimistic
> measurement, I don't have a good explanation for the performance difference.


I guess nobody really has. I brought that up on IRC and we did some benchmarking. I don't
really understand the results. Jon suggested using the "STREAM" part of the lmbench
(http://www.bitmover.com/lmbench/) suite. His I4P scored like this:

http://fpaste.org/77668/39256052/

I quickly added the benchmark tools to openbricks and got these results:

http://fpaste.org/77686/


As expected, my HB is the "slowest". Jon's I4P is somewhat faster. But not as much as
the higher DRAM clock rate and memory bus width would suggest. And then there is my I4P,
which scores worse than the HB ! This matches my observation about boot times.


Now Rabeeh compiled another benchmark program, which can be found here:

http://dl.dropboxusercontent.com/u/72661517/stream

It's probably a different version of the same test, linked statically. He got these results:

http://pastebin.com/L0q8u4RX

This program scored pretty much the same on my I4P, while it crashed on my HB.


I don't really know how to interpret the results...

Another thing: Remember that we reduced the frame buffer color depth to 16bpp in order to
avoid these IPU errors? As I understood Stephan, this should only be necessary on the
"low end" hardware versions. However, my I4P shows the same problems as the HB when
switching to 32bpp. So it looks like the overall memory bandwidth is not better.
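
As a rough back-of-the-envelope illustration of why the color depth matters for memory bandwidth: the display controller has to read the whole frame buffer on every refresh. The 1920x1080 at 60 Hz mode below is an assumption (the actual video mode is not stated here), and the function name is made up.

```shell
#!/bin/sh
# Sketch: frame buffer scanout bandwidth in MB/s (10^6 bytes), rounded down.
# scanout_mb_per_s is a hypothetical helper; arguments: width height bpp refresh_hz
scanout_mb_per_s() {
    echo $(( $1 * $2 * $3 / 8 * $4 / 1000000 ))
}

scanout_mb_per_s 1920 1080 16 60   # -> 248
scanout_mb_per_s 1920 1080 32 60   # -> 497
```

Under these assumptions, going from 16bpp to 32bpp doubles the constant read load from about 248 MB/s to about 497 MB/s. This counts scanout only, not the rendering traffic into the frame buffer.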

I'm pretty sure something is wrong here. I just have no idea what...



--

Ruediger "Rudi" Ihle

