I know that we have already discussed this on IRC, but here is some
information which might be interesting for the mailing list subscribers:
1. RAM clock speed matters a lot. The Mele A2000 has it configured
as 360MHz, while the hardware might be perfectly fine with 480MHz
(just like the Cubieboard). The device uses DDR3-1333 memory chips,
as stated on
http://linux-sunxi.org/Mele_A2000#Specifications
(512MB in 2x Hynix H5TQ2G63BFR H9C 143AK)
http://www.hynix.com/datasheet/eng/computing/details/computing_19_H5TQ2G63BFR.jsp
I'm running my Mele A2000 with memory clock frequency changed
to 480MHz just fine. The easiest way to do this is to recompile
u-boot after changing 360->480 here:
https://github.com/linux-sunxi/u-boot-sunxi/blob/sunxi/board/allwinner/mele_a1000/dram.c
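For a rough sense of what the clock change could buy, here is a
back-of-the-envelope estimate of the theoretical peak DRAM bandwidth.
It assumes a 32-bit data bus (two x16 chips in parallel), which is an
assumption and not something stated above, and peak_bandwidth_mb_s is
just an illustrative helper name. Real-world throughput is well below
these figures.

```python
def peak_bandwidth_mb_s(clock_mhz, bus_width_bits=32):
    """Theoretical peak DDR bandwidth in MB/s (1 MB = 10**6 bytes).

    DDR memory transfers data on both clock edges, hence the factor of 2.
    bus_width_bits=32 is an assumption (two x16 chips in parallel).
    """
    transfers_per_second = clock_mhz * 1_000_000 * 2
    bytes_per_transfer = bus_width_bits // 8
    return transfers_per_second * bytes_per_transfer / 1_000_000


for mhz in (360, 480):
    print(f"{mhz} MHz: {peak_bandwidth_mb_s(mhz):.0f} MB/s theoretical peak")
```

Under that assumption the peak goes from 2880 MB/s to 3840 MB/s, which
is one third more headroom to share between the CPU, the GPU and the
display scanout.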
2. CPU clock speed matters. With the default frequency scaling
governor, the CPU runs at 408MHz when idle and takes a bit of time
to go up to 1GHz under load. This results in laggy response.
By the way, AFAIK mobile devices typically have some special
hooks in the input handling so that the clock frequency can be
raised to the maximum instantly as soon as the user starts tapping
on the touchscreen or doing something similar.
In any case, just changing the frequency scaling governor
to "performance" should fix the problem:
echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
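The same change can be scripted so that it is easy to apply at boot
and to verify afterwards. This is only a sketch: the function names are
made up for illustration, and unlike the one-liner above it writes to
every cpuN directory it finds, which is a generalization for multi-core
boards. It needs root on a real system.

```python
from pathlib import Path

CPU_SYSFS = Path("/sys/devices/system/cpu")
GOVERNOR_GLOB = "cpu[0-9]*/cpufreq/scaling_governor"


def set_governor(governor="performance", sysfs_root=CPU_SYSFS):
    """Write `governor` to every cpuN/cpufreq/scaling_governor file found.

    Returns the list of files that were updated.
    """
    updated = []
    for gov_file in sorted(Path(sysfs_root).glob(GOVERNOR_GLOB)):
        gov_file.write_text(governor + "\n")
        updated.append(gov_file)
    return updated


def current_governors(sysfs_root=CPU_SYSFS):
    """Map each CPU directory name (e.g. 'cpu0') to its active governor."""
    return {
        gov_file.parent.parent.name: gov_file.read_text().strip()
        for gov_file in sorted(Path(sysfs_root).glob(GOVERNOR_GLOB))
    }


# Example (run as root):
#   set_governor("performance")
#   print(current_governors())
```

The sysfs_root parameter exists only so that the functions can be
tried against a scratch directory instead of the real /sys tree.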
3. Screen resolution and refresh rate matter a lot.
First, there are obviously more pixels to handle. For example, a
1920x1080 screen has 2.25x as many pixels as a 1280x720 one, so
scrolling a page in a fullscreen browser window would also be at
least twice as slow at the higher resolution.
Second, the constant framebuffer scanout sending pixel data to the
monitor over HDMI (or VGA or any other video output interface)
also steals some of the memory bandwidth and leaves less of it
available to the CPU or GPU. Below is some benchmark data from
https://github.com/ssvb/xf86-video-sunxifb/blob/master/test/sunxi_g2d_bench.c
==== 640x480-32@60Hz ===
Running G2D benchmarks for framebuffer (typically writecombine mapped)
G2D blit performance: 178.97 MPix/s (715.86 MB/s)
G2D fill performance: 227.60 MPix/s (910.40 MB/s)
Running pixman benchmarks for framebuffer (typically writecombine mapped)
pixman blit performance: 105.92 MPix/s (423.67 MB/s)
pixman fill performance: 359.15 MPix/s (1436.61 MB/s)
Running pixman benchmarks for normal RAM (typically mapped as WB cached)
pixman blit performance: 189.38 MPix/s (757.51 MB/s)
pixman fill performance: 358.40 MPix/s (1433.60 MB/s)
==== 1280x720-32@60Hz ===
Running G2D benchmarks for framebuffer (typically writecombine mapped)
G2D blit performance: 168.23 MPix/s (672.92 MB/s)
G2D fill performance: 228.85 MPix/s (915.38 MB/s)
Running pixman benchmarks for framebuffer (typically writecombine mapped)
pixman blit performance: 100.56 MPix/s (402.26 MB/s)
pixman fill performance: 342.16 MPix/s (1368.62 MB/s)
Running pixman benchmarks for normal RAM (typically mapped as WB cached)
pixman blit performance: 169.54 MPix/s (678.17 MB/s)
pixman fill performance: 341.84 MPix/s (1367.37 MB/s)
==== 1920x1080-32@60Hz ===
Running G2D benchmarks for framebuffer (typically writecombine mapped)
G2D blit performance: 131.11 MPix/s (524.45 MB/s)
G2D fill performance: 220.21 MPix/s (880.82 MB/s)
Running pixman benchmarks for framebuffer (typically writecombine mapped)
pixman blit performance: 93.00 MPix/s (372.00 MB/s)
pixman fill performance: 137.94 MPix/s (551.78 MB/s)
Running pixman benchmarks for normal RAM (typically mapped as WB cached)
pixman blit performance: 131.42 MPix/s (525.68 MB/s)
pixman fill performance: 138.01 MPix/s (552.03 MB/s)
==== 1920x1080-32@60Hz VGA and 1920x1080-32@60Hz HDMI ===
Running G2D benchmarks for framebuffer (typically writecombine mapped)
G2D blit performance: 91.38 MPix/s (365.52 MB/s)
G2D fill performance: 171.75 MPix/s (687.00 MB/s)
Running pixman benchmarks for framebuffer (typically writecombine mapped)
pixman blit performance: 56.04 MPix/s (224.15 MB/s)
pixman fill performance: 128.72 MPix/s (514.88 MB/s)
Running pixman benchmarks for normal RAM (typically mapped as WB cached)
pixman blit performance: 86.26 MPix/s (345.03 MB/s)
pixman fill performance: 128.66 MPix/s (514.62 MB/s)
=========================================================
As can be seen, driving two 1080p monitors at the same time can
dramatically degrade memory performance. But even the difference
between a single 1080p monitor and a 720p monitor is far from
negligible. More pixels to handle at higher screen resolutions,
combined with slower memory, can make the user experience much worse.
Reducing the screen refresh rate (to 50Hz from 60Hz) may also save
some of the memory bandwidth.
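To make the comparison easier to read, the drop relative to 720p can
be computed directly from the numbers above. The figures below are
copied from the benchmark output (only a subset of the rows is
included); the dictionary layout and the slowdown_percent helper are
just for illustration.

```python
# MPix/s figures copied from the benchmark output above.
RESULTS = {
    "1280x720": {
        "G2D blit": 168.23, "G2D fill": 228.85,
        "pixman blit (RAM)": 169.54, "pixman fill (RAM)": 341.84,
    },
    "1920x1080": {
        "G2D blit": 131.11, "G2D fill": 220.21,
        "pixman blit (RAM)": 131.42, "pixman fill (RAM)": 138.01,
    },
    "2x 1920x1080": {
        "G2D blit": 91.38, "G2D fill": 171.75,
        "pixman blit (RAM)": 86.26, "pixman fill (RAM)": 128.66,
    },
}


def slowdown_percent(baseline, value):
    """How much slower `value` is than `baseline`, in percent."""
    return (1 - value / baseline) * 100


baseline = RESULTS["1280x720"]
for mode in ("1920x1080", "2x 1920x1080"):
    for test, base in baseline.items():
        drop = slowdown_percent(base, RESULTS[mode][test])
        print(f"{mode:>13}  {test:<18} {drop:5.1f}% slower than 720p")
```

For example, the G2D blit loses about 22% going from 720p to a single
1080p monitor and about 46% with two 1080p monitors, while the pixman
fill to normal RAM loses about 60% already with a single 1080p monitor.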
4. Reducing the desktop color depth from 32bpp to 16bpp halves the
size of each pixel, which saves memory bandwidth and improves
performance.
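To put rough numbers on points 3 and 4, the memory bandwidth consumed
by the scanout alone can be estimated as width x height x bytes per
pixel x refresh rate. This is a simplified estimate that counts only
the visible pixels of a single framebuffer layer and ignores any
overhead in the display engine; scanout_mb_s is an illustrative name.

```python
def scanout_mb_s(width, height, bits_per_pixel=32, refresh_hz=60):
    """Estimated scanout read bandwidth in MB/s (1 MB = 10**6 bytes).

    Counts visible pixels only; a simplification, not a measurement.
    """
    bytes_per_pixel = bits_per_pixel // 8
    return width * height * bytes_per_pixel * refresh_hz / 1_000_000


modes = [
    (640, 480, 32, 60),
    (1280, 720, 32, 60),
    (1920, 1080, 32, 60),
    (1920, 1080, 32, 50),
    (1920, 1080, 16, 60),
]
for width, height, bpp, hz in modes:
    rate = scanout_mb_s(width, height, bpp, hz)
    print(f"{width}x{height}-{bpp}@{hz}Hz: {rate:.1f} MB/s")
```

By this estimate 1080p at 32bpp and 60Hz needs roughly 498 MB/s just
for scanout, versus roughly 221 MB/s for 720p. Dropping 1080p to 50Hz
brings it down to about 415 MB/s, and dropping it to 16bpp to about
249 MB/s.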
5. The video driver and/or its configuration matters. In the case
of the "fbdev" driver, one important configuration option is
"ShadowFB", which is enabled by default. With this option in effect,
a copy of the framebuffer is kept in normal cached memory in case we
need to read pixels back (reading from the uncached framebuffer is
slow). Right now moving windows is somewhat faster with the shadow
framebuffer enabled, but it is not free and adds some overhead. Also,
the shadow framebuffer may skip some of the screen updates during
fast animation, which is not very nice. Some use cases may be faster
with "ShadowFB" disabled. But an improved driver can address the
problem in a better way.
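For anyone who wants to experiment, a minimal xorg.conf fragment for
turning the option off might look like the following. The Identifier
string is arbitrary, and the file location depends on the distribution
(for example a file under /etc/X11/xorg.conf.d/, which is an
assumption about the setup).

```
Section "Device"
    Identifier "Framebuffer video device"
    Driver     "fbdev"
    Option     "ShadowFB" "off"
EndSection
```

Setting the option back to "on", or removing the line, restores the
default behaviour.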
And finally, as Roman already mentioned, the proprietary binary Mali GPU
drivers are not going to be useful for 2D graphics in Fedora. They
can't improve basic web browsing, email, document editing etc.
--
Best regards,
Siarhei Siamashka