Re: [linux-sunxi] Help: How do I get Mali drivers working for Fedora 18 (Mele A2000)

Roman Mamedov

Mar 11, 2013, 2:06:48 AM
to linux...@googlegroups.com, graem...@gmail.com
On Sun, 10 Mar 2013 16:14:54 -0700 (PDT)
Graeme Russ <graem...@gmail.com> wrote:

> I'm planning on using this box as a silent PC for my wife to do some basic
> web browsing, email, document editing etc. At the moment, it appears that
> there is no accelerated features of the GPU used in X so it really
> is unusable slow (connected via HDMI to an LCD monitor)

Really, is it? I've used A10-based devices as a light desktop and for web
browsing with Firefox, and it did not seem that slow at all.

"Unusably slow" for me means "you can see how stuff redraws on the screen from
upside down", but here even scrolling of web pages in the browser is pretty
smooth. How can you call that slow?

Which DE and other software do you use?

> I've tried to follow the instructions at
> http://linux-sunxi.org/Binary_drivers, but 'modprobe mali' fails and
> running the script results in '/dev/mali not found'.
>
> Does anyone have any pointers for how to get accelerated X11 going on the
> Mele A2000?

Even if you get those to work, they do not accelerate 2D at the moment, and
afaik unless you use the new "sunxifb" driver, 2D performance using "mali"
will be even worse than using "fbdev".

--
With respect,
Roman

Graeme Russ

Mar 11, 2013, 6:17:08 PM
to Roman Mamedov, linux...@googlegroups.com
Hi Roman,

On Mon, Mar 11, 2013 at 5:06 PM, Roman Mamedov <r...@romanrm.ru> wrote:
> On Sun, 10 Mar 2013 16:14:54 -0700 (PDT)
> Graeme Russ <graem...@gmail.com> wrote:
>
>> I'm planning on using this box as a silent PC for my wife to do some basic
>> web browsing, email, document editing etc. At the moment, it appears that
>> there is no accelerated features of the GPU used in X so it really
>> is unusable slow (connected via HDMI to an LCD monitor)
>
> Really, is it? I've used an A10 based devices as a light desktop and for web
> browsing with Firefox, it did not seem that slow at all.
>
> "Unusably slow" for me means "you can see how stuff redraws on the screen from
> upside down", but here even scrolling of web pages in the browser is pretty
> smooth, how can you call that slow.

It's very laggy - if you drag a window, for example, the window will
lag behind the mouse by a good fraction of a second. And if I have
more than a few windows open (say a couple of terminal windows, a
browser, and a file explorer) the background windows can take ages
(many seconds) to redraw after being exposed by moving a foreground
window.

> Which DE and other software do you use?

xfce

>> I've tried to follow the instructions at
>> http://linux-sunxi.org/Binary_drivers, but 'modprobe mali' fails and
>> running the script results in '/dev/mali not found'.
>>
>> Does anyone have any pointers for how to get accelerated X11 going on the
>> Mele A2000?
>
> Even if you get those to work, they do not accelerate 2D at the moment, and
> afaik unless you use the new "sunxifb" driver, 2D performance using "mali"
> will be even worse than using "fbdev".

Oh - that's good to know. So I guess I need to start investigating
what is causing it to be so slow (or maybe it's just my perception).

I currently have it installed on a 4GB SD card - I have a faster 16GB
SD card that I could try.

Regards,

Graeme

Siarhei Siamashka

Mar 15, 2013, 10:17:33 PM
to linux...@googlegroups.com, graem...@gmail.com
I know that we have already discussed this on IRC, but here is some
information which might be interesting for the mailing list subscribers:

1. RAM clock speed does matter a lot. Mele A2000 has it configured
as 360MHz, while the hardware might be perfectly fine with 480MHz
(just like cubieboard). The device is using DDR3-1333 memory chips
as stated on http://linux-sunxi.org/Mele_A2000#Specifications
(512MB in 2x Hynix H5TQ2G63BFR H9C 143AK)
http://www.hynix.com/datasheet/eng/computing/details/computing_19_H5TQ2G63BFR.jsp

I'm running my Mele A2000 with memory clock frequency changed
to 480MHz just fine. The easiest way to do this is to recompile
u-boot after changing 360->480 here:
https://github.com/linux-sunxi/u-boot-sunxi/blob/sunxi/board/allwinner/mele_a1000/dram.c

2. CPU clock speed does matter. With the default frequency scaling
governor, the CPU is working at 408MHz and takes a bit of time
to go up to 1GHz under load. This results in laggy response time.
By the way, AFAIK mobile devices typically have some special
hooks in the input handling so that the clock frequency can be
instantly increased to maximum if the user starts tapping on
touchscreen or something like this.

In any case, just changing the frequency scaling governor
to "performance" should fix the problem:

echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

3. Screen resolution and refresh rate do matter a lot.

First, that's obviously more pixels to handle. For example, a 1920x1080
screen has 2.25x as many pixels as 1280x720, and scrolling
a page in a fullscreen browser window would also be at least twice
as slow at the higher resolution.

Second, the constant framebuffer scanout sending pixel data to the
monitor over HDMI (or VGA or any other video output interface) is
also stealing some of the memory bandwidth and makes less of it
available to the CPU or GPU. Below is some benchmark data from
https://github.com/ssvb/xf86-video-sunxifb/blob/master/test/sunxi_g2d_bench.c

==== 640x480-32@60Hz ===

Running G2D benchmarks for framebuffer (typically writecombine mapped)
G2D blit performance: 178.97 MPix/s (715.86 MB/s)
G2D fill performance: 227.60 MPix/s (910.40 MB/s)

Running pixman benchmarks for framebuffer (typically writecombine mapped)
pixman blit performance: 105.92 MPix/s (423.67 MB/s)
pixman fill performance: 359.15 MPix/s (1436.61 MB/s)

Running pixman benchmarks for normal RAM (typically mapped as WB cached)
pixman blit performance: 189.38 MPix/s (757.51 MB/s)
pixman fill performance: 358.40 MPix/s (1433.60 MB/s)

==== 1280x720-32@60Hz ===

Running G2D benchmarks for framebuffer (typically writecombine mapped)
G2D blit performance: 168.23 MPix/s (672.92 MB/s)
G2D fill performance: 228.85 MPix/s (915.38 MB/s)

Running pixman benchmarks for framebuffer (typically writecombine mapped)
pixman blit performance: 100.56 MPix/s (402.26 MB/s)
pixman fill performance: 342.16 MPix/s (1368.62 MB/s)

Running pixman benchmarks for normal RAM (typically mapped as WB cached)
pixman blit performance: 169.54 MPix/s (678.17 MB/s)
pixman fill performance: 341.84 MPix/s (1367.37 MB/s)

==== 1920x1080-32@60Hz ===

Running G2D benchmarks for framebuffer (typically writecombine mapped)
G2D blit performance: 131.11 MPix/s (524.45 MB/s)
G2D fill performance: 220.21 MPix/s (880.82 MB/s)

Running pixman benchmarks for framebuffer (typically writecombine mapped)
pixman blit performance: 93.00 MPix/s (372.00 MB/s)
pixman fill performance: 137.94 MPix/s (551.78 MB/s)

Running pixman benchmarks for normal RAM (typically mapped as WB cached)
pixman blit performance: 131.42 MPix/s (525.68 MB/s)
pixman fill performance: 138.01 MPix/s (552.03 MB/s)

==== 1920x1080-32@60Hz VGA and 1920x1080-32@60Hz HDMI ===

Running G2D benchmarks for framebuffer (typically writecombine mapped)
G2D blit performance: 91.38 MPix/s (365.52 MB/s)
G2D fill performance: 171.75 MPix/s (687.00 MB/s)

Running pixman benchmarks for framebuffer (typically writecombine mapped)
pixman blit performance: 56.04 MPix/s (224.15 MB/s)
pixman fill performance: 128.72 MPix/s (514.88 MB/s)

Running pixman benchmarks for normal RAM (typically mapped as WB cached)
pixman blit performance: 86.26 MPix/s (345.03 MB/s)
pixman fill performance: 128.66 MPix/s (514.62 MB/s)

=========================================================

As can be seen, driving two 1080p monitors at the same time can
dramatically degrade memory performance. But even the difference
between a single 1080p monitor and a 720p monitor is not small.
More pixels to handle at higher screen resolutions, combined
with slower memory, can make the user experience much worse.
Reducing the screen refresh rate (to 50Hz from 60Hz) may also save
some of the memory bandwidth.

4. Reducing desktop color depth from 32bpp to 16bpp graphics is good
for memory bandwidth and improves performance.

5. The video driver and/or its configuration does matter. In the case
of the "fbdev" driver, one important configuration option is "ShadowFB",
which is enabled by default. With this option in effect, a copy of
the framebuffer is kept in normal cached memory in case we need to
read it back (reading from the uncached framebuffer is
slow). Right now moving windows is somewhat faster with the shadow
framebuffer enabled, but it is not free and adds some overhead. Also,
the shadow framebuffer may skip some screen updates during quick
animation, which is not very nice. Some use cases may be
faster with "ShadowFB" disabled. But an improved driver can address
the problem in a better way.
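
As a rough illustration of points 3 and 4 above, the bandwidth taken by
scanout alone can be estimated as width x height x bytes per pixel x
refresh rate. The sketch below assumes exactly one read of the framebuffer
per refresh and ignores blanking and padding, so treat the numbers as an
estimate rather than a measurement. The function name scanout_mb_per_s is
made up for this example, and MB means 10^6 bytes.

```shell
#!/bin/sh
# Estimate the memory bandwidth consumed by framebuffer scanout alone.
# Assumption: one full frame is read per refresh, with no blanking or
# alignment overhead. MB here means 10^6 bytes.
scanout_mb_per_s() {
    # $1 = width, $2 = height, $3 = bits per pixel, $4 = refresh rate in Hz
    LC_ALL=C awk -v w="$1" -v h="$2" -v bpp="$3" -v hz="$4" \
        'BEGIN { printf "%.1f\n", w * h * (bpp / 8) * hz / 1000000 }'
}

# Print the estimate for a few modes, including 50Hz and 16bpp variants.
for mode in "640 480 32 60" "1280 720 32 60" "1920 1080 32 60" \
            "1920 1080 32 50" "1920 1080 16 60"; do
    set -- $mode
    printf '%sx%s-%s@%sHz: %s MB/s\n' "$1" "$2" "$3" "$4" \
        "$(scanout_mb_per_s "$1" "$2" "$3" "$4")"
done
```

Under these assumptions 1920x1080-32@60Hz needs about 498 MB/s per output,
against about 221 MB/s for 1280x720-32@60Hz. Dropping the 1080p mode to
50Hz or to 16bpp brings the estimate down to about 415 MB/s or 249 MB/s.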

And finally, as Roman already mentioned, the proprietary binary Mali GPU
drivers are not going to be useful for 2D graphics in Fedora. They
can't improve basic web browsing, email, document editing etc.

--
Best regards,
Siarhei Siamashka

Roman Mamedov

Mar 16, 2013, 4:39:32 AM
to linux...@googlegroups.com, siarhei....@gmail.com, graem...@gmail.com
On Sat, 16 Mar 2013 04:17:33 +0200
Siarhei Siamashka <siarhei....@gmail.com> wrote:

>
> I'm running my Mele A2000 with memory clock frequency changed
> to 480MHz just fine. The easiest way to do this is to recompile
> u-boot after changing 360->480 here:
> https://github.com/linux-sunxi/u-boot-sunxi/blob/sunxi/board/allwinner/mele_a1000/dram.c

Can't the RAM frequency be changed simply by using a modified script.bin?
http://linux-sunxi.org/Fex_Guide#.5Bdram_para.5D
dram_para -> dram_clk

>
> 2. CPU clock speed does matter. With the default frequency scaling
> governor, the CPU is working at 408MHz and takes a bit of time
> to go up to 1GHz under load. This results in laggy response time.
> By the way, AFAIK mobile devices typically have some special
> hooks in the input handling so that the clock frequency can be
> instantly increased to maximum if the user starts tapping on
> touchscreen or something like this.
>
> In any case, just changing the frequency scaling governor
> to "performance" should fix the problem:
>
> echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

Switching to performance will certainly help, but it's a sledgehammer
solution, and I found that tuning 'ondemand' to make it switch to 1GHz faster,
and to hold off longer before falling back to lower frequencies, works just about as well:
http://linux-sunxi.org/Cpufreq
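
A minimal sketch of what such tuning can look like. The file names
(scaling_governor, up_threshold, sampling_down_factor) are standard
ondemand sysfs tunables, but the values 25 and 10 are illustrative
assumptions and are not quoted from this thread. The function name
tune_ondemand and the CPUFREQ_ROOT override are hypothetical, added only so
the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Sketch: make "ondemand" ramp up sooner and back off later.
# Values are illustrative assumptions, not settings from this thread.
# CPUFREQ_ROOT is a hypothetical override for testing on a scratch tree.
tune_ondemand() {
    root="${CPUFREQ_ROOT:-/sys/devices/system/cpu}"
    # Select the governor first; its tunables only exist once it is active.
    echo ondemand > "$root/cpu0/cpufreq/scaling_governor" || return 1
    # Jump to the maximum frequency once load exceeds 25% (assumed value).
    echo 25 > "$root/cpufreq/ondemand/up_threshold" || return 1
    # While at the top frequency, re-evaluate load 10 times less often,
    # so the clock stays high for longer (assumed value).
    echo 10 > "$root/cpufreq/ondemand/sampling_down_factor" || return 1
}

# On the device, as root:
#   tune_ondemand
```

The tunables directory path used here matches kernels that expose one
global ondemand directory; if it is missing on a given kernel, check where
that kernel places the ondemand tunables before relying on this.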

--
With respect,
Roman

Alejandro Mery

Mar 16, 2013, 5:14:46 AM
to linux...@googlegroups.com, Roman Mamedov, siarhei....@gmail.com, graem...@gmail.com
On 16/03/13 09:39, Roman Mamedov wrote:
> On Sat, 16 Mar 2013 04:17:33 +0200
> Siarhei Siamashka <siarhei....@gmail.com> wrote:
>
>>
>> I'm running my Mele A2000 with memory clock frequency changed
>> to 480MHz just fine. The easiest way to do this is to recompile
>> u-boot after changing 360->480 here:
>> https://github.com/linux-sunxi/u-boot-sunxi/blob/sunxi/board/allwinner/mele_a1000/dram.c
>
> Can't the RAM frequency be changed simply by using a modified script.bin?
> http://linux-sunxi.org/Fex_Guide#.5Bdram_para.5D
> dram_para -> dram_clk

unfortunately the DRAM gets wiped when reconfigured, which means only
the SPL (or boot0) can effectively set this. the [dram_para] section of
script.bin is not used by the kernel, but by livesuit to inject the data
into boot0 when flashing. on the sunxi side the [dram_para] section is
kept there as a reference for populating the board.c file in u-boot, and
so it is hardcoded in the SPL when compiling for the chosen board.

one solution is to make this data a 3rd `dd`-able block instead of part
of the spl itself, and to add a helper to sunxi-tools to
encode/compile/convert the [dram_para] into a proper bin struct for the
purpose.

hno?

>> 2. CPU clock speed does matter. With the default frequency scaling
>> governor, the CPU is working at 408MHz and takes a bit of time
>> to go up to 1GHz under load. This results in laggy response time.
>> By the way, AFAIK mobile devices typically have some special
>> hooks in the input handling so that the clock frequency can be
>> instantly increased to maximum if the user starts tapping on
>> touchscreen or something like this.
>>
>> In any case, just changing the frequency scaling governor
>> to "performance" should fix the problem:
>>
>> echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
>
> Switching to performance will certainly help, but it's a sledge-hammer
> solution, and I found that tuning 'ondemand' to make it switch to 1GHz faster,
> and not fallback to lower frequencies longer, works about just as well:
> http://linux-sunxi.org/Cpufreq

properly configured ondemand consumes less power and emits less heat
than performance and should feel the same. but yes, even with a
minimum of 408 instead of 60 it acts sloppy unless configured as
<http://linux-sunxi.org/Cpufreq> explains.

cheers,
Alejandro

wills

Mar 16, 2013, 5:59:05 AM
to linux-sunxi
Hi, Siarhei,

Did you try increasing the DRAM clock up to 667MHz?
Does it help memory bandwidth and user experience?

On Mar 15, 10:17 pm, Siarhei Siamashka <siarhei.siamas...@gmail.com>
wrote:
>    as stated on http://linux-sunxi.org/Mele_A2000#Specifications
>    (512MB in 2x Hynix H5TQ2G63BFR H9C 143AK)
>    http://www.hynix.com/datasheet/eng/computing/details/computing_19_H5T...
>
>    I'm running my Mele A2000 with memory clock frequency changed
>    to 480MHz just fine. The easiest way to do this is to recompile
>    u-boot after changing 360->480 here:
>    https://github.com/linux-sunxi/u-boot-sunxi/blob/sunxi/board/allwinne...
>
> 2. CPU clock speed does matter. With the default frequency scaling
>    governor, the CPU is working at 408MHz and takes a bit of time
>    to go up to 1GHz under load. This results in laggy response time.
>    By the way, AFAIK mobile devices typically have some special
>    hooks in the input handling so that the clock frequency can be
>    instantly increased to maximum if the user starts tapping on
>    touchscreen or something like this.
>
>    In any case, just changing the frequency scaling governor
>    to "performance" should fix the problem:
>
>    echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
>
> 3. Screen resolution and refresh rate does matter a lot.
>
>    First, that's obviously more pixels to handle. For example, 1920x1080
>    screen resolution has 2.25x more pixels than 1280x720, and scrolling
>    a page in a fullscreen browser window would be also at least twice
>    slower for higher resolution.
>
>    Second, the constant framebuffer scanout sending pixel data to the
>    monitor over HDMI (or VGA or any other video output interface) is
>    also stealing some of the memory bandwidth and makes less of it
>    available to the CPU or GPU. Below is some benchmark data from
>    https://github.com/ssvb/xf86-video-sunxifb/blob/master/test/sunxi_g2d...

Siarhei Siamashka

Mar 16, 2013, 8:37:50 AM
to linux...@googlegroups.com, wills.w...@gmail.com
On Sat, 16 Mar 2013 02:59:05 -0700 (PDT)
wills <wills.w...@gmail.com> wrote:

> On Mar 15, 10:17 pm, Siarhei Siamashka <siarhei.siamas...@gmail.com>
> wrote:
> > 1. RAM clock speed does matter a lot. Mele A2000 has it configured
> >    as 360MHz, while the hardware might be perfectly fine with 480MHz
> >    (just like cubieboard). The device is using DDR3-1333 memory chips
> >    as stated on http://linux-sunxi.org/Mele_A2000#Specifications
> >    (512MB in 2x Hynix H5TQ2G63BFR H9C 143AK)
> >    http://www.hynix.com/datasheet/eng/computing/details/computing_19_H5T...
> >
> >    I'm running my Mele A2000 with memory clock frequency changed
> >    to 480MHz just fine. The easiest way to do this is to recompile
> >    u-boot after changing 360->480 here:
> >    https://github.com/linux-sunxi/u-boot-sunxi/blob/sunxi/board/allwinne...
>
> Do you try increasing DRAM clock up to 667MHz?
> Is there helpful for memory bandwidth and user experience?

Trying to increase the memory clock frequency beyond 480MHz makes the
system unbootable. The stable maximum memory clock frequency is not only
limited by the DRAM chips alone, but also by the Allwinner A10 SoC and
PCB layout.

The SoC can apparently handle 480MHz (as it does in cubieboard), and the
memory chips should also have more than enough headroom. There is
surely not enough data to assume that all Mele A1000/A2000
devices can handle a 480MHz DRAM clock frequency, but it would be
interesting to hear from anyone having problems with it on Mele.

The DRAM clock frequency is also somewhat important for 2D graphics
because the G2D accelerator (Mixer Processor) is currently clocked at
half of the memory clock speed:

https://github.com/linux-sunxi/linux-sunxi/blob/sunxi-v3.4.29-r1/drivers/char/sun4i_g2d/g2d.c#L48

And G2D running at 240MHz can't process more than 240 million
pixels per second.
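
The same arithmetic for the stock and the raised DRAM clock, as a small
sketch. It assumes, as described above, that G2D runs at half the DRAM
clock and handles at most one pixel per G2D clock cycle; the function name
g2d_ceiling_mpix is made up for this example.

```shell
#!/bin/sh
# Upper bound on G2D throughput, assuming the G2D clock is half the DRAM
# clock and at most one pixel is processed per G2D clock cycle.
g2d_ceiling_mpix() {
    # $1 = DRAM clock in MHz; prints the ceiling in MPix/s
    echo $(( $1 / 2 ))
}

for dram_mhz in 360 480; do
    echo "DRAM ${dram_mhz}MHz -> G2D ceiling $(g2d_ceiling_mpix "$dram_mhz") MPix/s"
done
```

This gives a ceiling of 180 MPix/s at the stock 360MHz and 240 MPix/s at
480MHz. The G2D fill figures of roughly 220 to 229 MPix/s in the single
monitor benchmarks posted earlier in the thread sit just under the 240
MPix/s bound, which would be consistent with a 480MHz DRAM clock, although
the clock used for those runs is not stated.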

Siarhei Siamashka

Mar 16, 2013, 9:27:27 AM
to Alejandro Mery, linux...@googlegroups.com
Can we tweak the kernel to configure the ondemand governor so that it
has reasonable response time by default? Also what about the other
governors like "interactive" or even the allwinner "fantasy" thing?
Has anybody evaluated them?

In any case, the "performance" governor is guaranteed to fix this
particular problem. And when profiling and benchmarking the other
performance issues, a "should feel the same" promise might not be
enough. Completely removing any possible interference from
CPU frequency scaling just feels more reliable.
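
For benchmarking runs, one possible way to pin the governor and confirm
that the setting took effect is sketched below. The function name
set_governor and the CPU_ROOT override are hypothetical, added so the
function can be tried on a scratch directory. On the single-core A10 the
loop only ever touches cpu0.

```shell
#!/bin/sh
# Sketch: pin every cpufreq policy to one governor before benchmarking,
# then read the value back to confirm it took effect.
# CPU_ROOT is a hypothetical override for trying this on a scratch tree.
set_governor() {
    gov="$1"
    root="${CPU_ROOT:-/sys/devices/system/cpu}"
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -e "$f" ] || continue          # skip CPUs without a cpufreq node
        echo "$gov" > "$f" || return 1
        [ "$(cat "$f")" = "$gov" ] || { echo "failed: $f" >&2; return 1; }
    done
}

# On the device, as root:
#   set_governor performance
```

Reading the file back guards against a silently rejected write, for
example when the requested governor is not built into the kernel.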