Taking full advantage of MIPSEL / loongson2f CPU


Daniel Clark

Jun 17, 2010, 4:28:01 PM
to loongs...@googlegroups.com, Kip Warner
On Sat, Jun 12, 2010 at 5:45 PM, Kip Warner <k...@thevertigo.com> wrote:
> Hey Daniel. One other thing. Can you recommend a good resource for
> learning how to take advantage of the MIPS architecture, e.g. SIMD-type
> stuff? I know there are a lot of books out there, but I figured
> you probably have something in particular in mind that you use for
> people like us who already know i686 / amd64.

Anyone have suggestions along these lines?

As I recall there were also loongson2f-specific instructions that
could be helpful with making games faster - I think low-level info on
these may be at http://groups.google.com/group/loongson-dev/files in
"Loongson2FUserGuide.pdf".

--
\|/ Daniel JB Clark | Activist; Owner
FREEDOM -+-> INCLUDED ~ http://freedomincluded.com
/|\ Free Software respecting hardware

zhangfx

Jun 17, 2010, 9:08:38 PM
to loongs...@googlegroups.com
Generally, very low-level features are hard for non-assembly programmers to exploit directly. The toolchain is a good starting point. Some people at ICT have been running various experiments on it, looking for the best options for this architecture and developing improvements such as the scheduler. The results are encouraging: I don't have the exact numbers, but something like >10% on average is what I remember. If we can use the N32 ABI, even more is possible (20-30%).
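To make the toolchain point concrete, here is a minimal sketch in C of how a program can report which ABI and architecture it was actually built for. It relies on macros that GCC predefines on MIPS targets (`_MIPS_SIM`, `_ABIO32`, `_ABIN32`, `_ABI64`, and `_MIPS_ARCH_LOONGSON2F` when building with `-march=loongson2f`). The function names are made up for illustration, and the sketch says nothing about the speedups mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Returns the MIPS ABI the compiler is targeting. The _MIPS_SIM family of
 * macros is predefined by GCC on MIPS; on any other target this function
 * falls through to "not MIPS". */
const char *target_abi(void)
{
#if defined(__mips__) && defined(_MIPS_SIM)
# if _MIPS_SIM == _ABIO32
    return "o32";
# elif _MIPS_SIM == _ABIN32
    return "n32";
# elif _MIPS_SIM == _ABI64
    return "n64";
# else
    return "unknown MIPS ABI";
# endif
#else
    return "not MIPS";
#endif
}

/* Returns 1 when the translation unit was compiled with
 * -march=loongson2f, otherwise 0. */
int built_for_loongson2f(void)
{
#if defined(_MIPS_ARCH_LOONGSON2F)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: formats a one-line summary into buf and returns
 * the value snprintf returned. */
int describe_target(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "abi=%s loongson2f=%d",
                    target_abi(), built_for_loongson2f());
}
```

Building the same file once with the distribution's default flags and once with something like `-march=loongson2f -mabi=n32` (assuming an N32-capable toolchain and libraries are installed) should show the difference in the summary line.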

As for SIMD, it is mainly suitable for multimedia-related things. People have already done some work on ffmpeg.

ffmpeg 0.5.2 patch:
http://bjlx.org.cn/loongson2f/squeeze/ffmpeg/loongson2mmi_ffmpeg_0.5.2.patch

ffmpeg 0.6 patch:
http://bjlx.org.cn/loongson2f/squeeze/ffmpeg/loongson2mmi_ffmpeg0.6_20100505.patch


http://www.bjlx.org.cn/node/769

Kip Warner

Jun 17, 2010, 5:03:34 PM
to loongs...@googlegroups.com
On Thu, 2010-06-17 at 16:28 -0400, Daniel Clark wrote:
> Anyone have suggestions along these lines?
>
> As I recall there were also loongson2f-specific instructions that
> could be helpful with making games faster - I think low-level info on
> these may be at http://groups.google.com/group/loongson-dev/files in
> "Loongson2FUserGuide.pdf".

Excellent.

Something I am wondering about is what the build target should be if
I use Loongson2f-specific instructions. If I put packages built that way
in the repository for mipsel, they won't run on other, non-loongson2f
mipsel hardware. I want to avoid hardware-specific accelerated routines
that are selected by a branch at runtime, because that is slower than
having them unconditionally in the execution path, so I'm not sure how to
work around this. It's much the same issue as SIMD on i686: apparently
not all i686-class machines support SIMD instructions, so checking at
runtime and then using whichever routines are supported is slower,
because you have to branch every time and you also can't inline them.
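One common middle ground for the trade-off described above is to probe the CPU once and store the chosen routine in a function pointer, so the feature check is not repeated on every call. The sketch below assumes a hypothetical probe `cpu_is_loongson2f()` (stubbed to return 0 here) and a placeholder `add16_loongson2f()` that stands in for a hand-tuned routine; neither comes from this thread. The call is still indirect and cannot be inlined, so this removes the repeated check but not the inlining cost.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*add16_fn)(int16_t *dst, const int16_t *a,
                         const int16_t *b, size_t n);

/* Portable reference routine: element-wise 16-bit add. */
static void add16_generic(int16_t *dst, const int16_t *a,
                          const int16_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (int16_t)(a[i] + b[i]);
}

/* Placeholder for a hand-tuned Loongson 2F routine. In this sketch it
 * just reuses the generic loop so the example builds on any machine. */
static void add16_loongson2f(int16_t *dst, const int16_t *a,
                             const int16_t *b, size_t n)
{
    add16_generic(dst, a, b, n);
}

/* Hypothetical CPU probe. A real one might inspect /proc/cpuinfo or a
 * processor ID register; this stub always answers "no". */
static int cpu_is_loongson2f(void)
{
    return 0;
}

/* Defaults to the portable routine, so add16() is safe to call even if
 * add16_init() was never run. */
static add16_fn add16_impl = add16_generic;

/* Run once at startup: pick the implementation a single time. */
void add16_init(void)
{
    add16_impl = cpu_is_loongson2f() ? add16_loongson2f : add16_generic;
}

/* Public entry point: one indirect call, no per-call feature test. */
void add16(int16_t *dst, const int16_t *a, const int16_t *b, size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);
    add16_impl(dst, a, b, n);
}
```

The alternative that keeps inlining is to decide at compile time and ship a separate loongson2f build of the package, at the cost of maintaining two sets of binaries.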

--
Kip Warner -- Software Engineer
OpenPGP encrypted/signed mail preferred
http://www.thevertigo.com


Zhang Le

Jun 21, 2010, 1:31:35 AM
to loongs...@googlegroups.com
On 09:08 Fri 18 Jun , zhangfx wrote:
> Generally, very low-level features are hard for non-assembly
> programmers to exploit directly. The toolchain is a good starting point.
> Some people at ICT have been running various experiments on it, looking
> for the best options for this architecture and developing improvements
> such as the scheduler. The results are encouraging: I don't have the
> exact numbers, but something like >10% on average is what I remember.
> If we can use the N32 ABI, even more is possible (20-30%).

Really good news!

Looking forward to that, :-D

--
Zhang, Le
Gentoo/Loongson Developer
http://zhangle.is-a-geek.org
0260 C902 B8F8 6506 6586 2B90 BC51 C808 1E4E 2973

ToobMug

Jul 15, 2010, 4:36:33 AM
to loongson-dev
On Jun 18, 2:08 am, zhangfx <zhan...@lemote.com> wrote:
> Generally, very low-level features are hard for non-assembly
> programmers to exploit directly. The toolchain is a good starting point.

Is the vectoriser active in mainline gcc using -march=loongson2f?
I've never had much luck getting it to do anything useful on x86 but I
like to think that it's there, ready to handle that one specific
memset loop that it actually understands.
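For anyone who wants to test this, a loop of roughly the following shape is about the simplest candidate an auto-vectoriser can be given: fixed stride, no aliasing, no data-dependent control flow. The function name is made up for illustration, and whether GCC actually vectorises it for loongson2f is exactly the open question, so the sketch makes no claim either way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* memset-style fill of 16-bit elements. The restrict qualifier tells the
 * compiler that dst does not alias anything else, which is one of the
 * usual preconditions for vectorisation. */
void fill_u16(uint16_t *restrict dst, uint16_t value, size_t n)
{
    assert(dst != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        dst[i] = value;
}
```

Compiling with something like `gcc -std=c99 -O2 -ftree-vectorize -march=loongson2f -S` and reading the generated assembly should show whether any packed stores were emitted.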


> As for SIMD, it is mainly suitable for multimedia-related things.
> People have already done some work on ffmpeg.

There's a heck of a lot of x86 MMX assembly already written in most
projects that matter, and I'm guessing that it's not hard to
translate. Is it easy enough that a script could do it? I realise
that without 3-op and a decent regfile any existing MMX code is going
to be quite inefficient. I also realise that there are complications
to rewriting both assembly files and C files (assembly files have
whole ABIs that need to be rewritten and will use lots of scalar
instructions and registers, while C files with inline assembly have to
have their assembly rewritten in place). But it seems like it should
be possible to automate most of the tedious rewriting work.

Unfortunately I don't know any useful scripting languages, so I'm not
much use. I'll probably end up using my command history in vim to
repeat various complex search and replace operations.
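The purely mechanical part of such a translation, renaming mnemonics, could start from a lookup table like the sketch below. The pairs listed reflect the naming difference that x86 calls a 16-bit element a "word" while MIPS calls it a "halfword"; they are written from memory and should be checked against Loongson2FUserGuide.pdf before being relied on. The table is deliberately tiny, the function name is hypothetical, and it does not attempt the harder parts mentioned above (2-operand to 3-operand rewriting, register allocation, ABI changes).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct mnemonic_pair {
    const char *mmx;  /* x86 MMX mnemonic */
    const char *mmi;  /* assumed Loongson equivalent, to be verified */
};

/* Illustrative subset only. x86 b/w/d suffixes mean 8/16/32-bit lanes;
 * the Loongson names use b/h/w for the same widths. */
static const struct mnemonic_pair mnemonic_table[] = {
    { "paddb",  "paddb"  },
    { "paddw",  "paddh"  },
    { "paddd",  "paddw"  },
    { "psubw",  "psubh"  },
    { "pmullw", "pmullh" },
};

/* Returns the assumed Loongson mnemonic for an MMX mnemonic, or NULL when
 * the table has no entry and the instruction needs manual attention. */
const char *mmx_to_mmi(const char *mmx)
{
    assert(mmx != NULL);
    size_t count = sizeof mnemonic_table / sizeof mnemonic_table[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(mnemonic_table[i].mmx, mmx) == 0)
            return mnemonic_table[i].mmi;
    }
    return NULL;
}
```

Returning NULL for unknown mnemonics lets a conversion tool flag lines it cannot handle instead of silently passing x86 instructions through.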