Idris backend to Clean/ABC

361 views
Skip to first unread message

Tim Steenvoorden

unread,
Jan 9, 2017, 10:15:41 AM1/9/17
to idris...@googlegroups.com
Dear Idris lovers,

I’m experimenting with a new backend for Idris. It compiles to the Clean programming language (http://clean.cs.ru.nl/Clean). Preliminary benchmarks show it’s about 3 times faster than Idris’ default C backend! You can find the code on https://github.com/timjs/idris-clean. Don’t hesitate to test it out and open some issues.

Clean is a general purpose, pure and lazy functional programming language, similar to Haskell. It’s developed at the Radboud University Nijmegen [1,2]. Clean compiles to abstract ABC bytecode instructions. The ABC stack machine is specially developed to support lazy functional languages and has good support for fast primitive types (Int, Char, Real, Bool, String, boxed and unboxed arrays) and strictness optimisations [3]. The code generator shipped in the Clean distribution is able to generate code for Intel (32 and 64 bit) and ARM (currently only 32 bit) on three different operating systems (Linux, macOS and Windows) [4,5]. Both the code generator and the runtime are written in C and some assembly.

For now, the main purpose of this backend is to see if the Clean programming language, but especially the ABC machine are a good fit for Idris. To be able to generate even faster code, we need to unbox primitive values. Therefore we need some type information in the IR of (for example) the form:

    data Ty = TyInt | TyDouble | TyChar | … | TyFunc Ty Ty | TyAny

There are two possibilities I can think of to approach this:

1. Take IR_defunc or IR_simpl and infer primitive types from the usage of primitive operations.
2. Hook into the TT-to-IR-generation and use Idris’ own knowledge to annotate function and constructor declarations in the IR.

Can anybody tell me if the second one is possible and, if so, where this kind of type information is stored in the TT tree?

Cheers,
Tim


[1]: Brus, T. H., van Eekelen, M. C., Van Leer, M. O., & Plasmeijer, M. J. (1987, September). Clean—a language for functional graph rewriting. In Conference on Functional Programming Languages and Computer Architecture (pp. 364-384). Springer Berlin Heidelberg.
[2]: Plasmeijer, R., & van Eekelen, M. Concurrent CLEAN Language Report (version 2.2), December 2011. http://clean.cs.ru.nl/download/doc/CleanLangRep.2.2.pdf
[3]: Plasmeijer, R. & Van Eekelen, M. (1993). Functional programming and parallel graph rewriting (Vol. 857). Reading: Addison-wesley.
[4]: Smetsers, S., Nöcker, E., Van Groningen, J., & Plasmeijer, R. (1991, August). Generating efficient code for lazy functional languages. In Conference on Functional Programming Languages and Computer Architecture (pp. 592-617). Springer Berlin Heidelberg.
[5]: van Groningen, J. (1990). Implementing the ABC-machine on MC680X0 based architectures. Master's thesis, University of Nijmegen.


----
T.J. Steenvoorden, MSc

PhD Candidate | Radboud University Nijmegen
Faculty of Science | Department of Software Science
Mercator 1 Building | Room 01.08
Toernooiveld 212 | 6525 EC | Nijmegen | The Netherlands
+31 24 365 22 91 | t.steen...@cs.ru.nl

Markus Pfeiffer

unread,
Jan 9, 2017, 11:22:50 AM1/9/17
to idris...@googlegroups.com

Out of interest (without having looked at anything in particular): I am
a bit surprised that with opposite evaluation strategies (clean being
lazy, and idris being strict), you are getting this much of a speed
increase. Can you explain this? Also what are the benchmarks you're
running, the Pythag by itself is not very comprehensive?

Are you aware of the malfunction backend?

Cheers,
Markus

Tim Steenvoorden writes:

> Dear Idris lovers,
>
> I’m experimenting with a new backend for Idris. It compiles to the Clean programming language (http://clean.cs.ru.nl/Clean). Preliminary benchmarks show it’s about 3 times faster than Idris’ default C backend! You can find the code on https://github.com/timjs/idris-clean. Don’t hesitate to test it out and open some issues.
>
> Clean is a general purpose, pure and lazy functional programming language, similar to Haskell. It’s developed at the Radboud University Nijmegen [1,2]. Clean compiles to abstract ABC bytecode instructions. The ABC stack machine is specially developed to support lazy functional languages and has good support for fast primitive types (Int, Char, Real, Bool, String, boxed and unboxed arrays) and strictness optimisations [3]. The code generator shipped in the Clean distribution is able to generate code for Intel (32 and 64 bit) and ARM (currently only 32 bit) on three different operating systems (Linux, macOS and Windows) [4,5]. Both the code generator and the runtime are written in C and some assembly.

>
> For now, the main purpose of this backend is to see if the Clean programming language, but especially the ABC machine are a good fit for Idris. To be able to generate even faster code, we need to unbox primitive values. Therefore we need some type information in the IR of (for example) the form:
>
> data Ty = TyInt | TyDouble | TyChar | … | TyFunc Ty Ty | TyAny
>
> There are two possibilities I can think of to approach this:
>
> 1. Take IR_defunc or IR_simpl and infer primitive types from the usage of primitive operations.
> 2. Hook into the TT-to-IR-generation and use Idris’ own knowledge to annotate function and constructor declarations in the IR.
>
> Can anybody tell me if the second one is possible and, if so, where this kind of type information is stored in the TT tree?
>
> Cheers,
> Tim
>
>
> [1]:Brus, T. H., van Eekelen, M. C., Van Leer, M. O., & Plasmeijer, M. J. (1987, September). Clean—a language for functional graph rewriting. In Conference on Functional Programming Languages and Computer Architecture (pp. 364-384). Springer Berlin Heidelberg.
> [2]:Plasmeijer, R., & van Eekelen, M. Concurrent CLEAN Language Report (version 2.2), December 2011.http://clean.cs.ru.nl/download/doc/CleanLangRep.2.2.pdf
signature.asc

Demi Obenour

unread,
Jan 9, 2017, 3:50:18 PM1/9/17
to Idris Programming Language, markus....@morphism.de
Agreed.  The malfunction backend is even faster (I believe a factor of 15 or so vs. the C backend).
This is possible because it targets OCaml's internal IR, which ocamlopt can optimize heavily.

If we needed to pick something that was more stable, we could target OCaml source, using Obj.magic
to get around the type system.

Markus Pfeiffer

unread,
Jan 9, 2017, 5:09:55 PM1/9/17
to idris...@googlegroups.com

Demi Obenour writes:

> Agreed. The malfunction backend is even faster (I believe a factor of 15
> or so vs. the C backend).

Yet again, factor 15 *on the pythag "benchmark"* (which is run about once
one one example).

> This is possible because it targets OCaml's internal IR, which ocamlopt can
> optimize heavily.
>
> If we needed to pick something that was more stable, we could target OCaml
> source, using Obj.magic
> to get around the type system.

I don't know who "we" is in this case, i was just asking why Clean (and
one conceivable answer can be "because my supervisor said so");
It has been discussed before that it might be a bit unfortunate
compiling strict Idris to any lazy language.

In particular I find this here quite convincing:

>> > For now, the main purpose of this backend is to see if the Clean
>> programming language, but especially the ABC machine are a good fit for
>> Idris. To be able to generate even faster code, we need to unbox primitive
>> values.

(I am not familiar with what OCaml/Malfunction does for this, or whether
the current backend could even handle it).

Cheers,
Markus

signature.asc

Jan de Muijnck-Hughes

unread,
Jan 10, 2017, 4:31:10 AM1/10/17
to idris...@googlegroups.com
First, Kudos to Tim for looking at providing a Clean/ABC backend for idris. The other day, I forgot to ask the clean people if this would be a possibility/what would the benefits be. Nice to see someone actually working on this.

I think a cool thing about Idris is that we *can* retarget the compiler for different codgens. It promotes, and facilitates, research in how to make better runtimes/VMs for dependently typed languages. Not to mention getting idris in places you daren't think possible. For example, IIRC ABC can target ARM as well as native code.

To those talking about efficiency of the C-Runtime. The C-Runtime is intended to be a reference implementation that demonstrates a fully working backend. IIRC the efficiency of this background comes second to functional correctness.  There *is* work out there on targeting Idris to OCaml and not through malfunction, but I don't know the state of the project.

Sadly, Idris doesn't have a proper benchmarking suite, nor do I know of one in the making. Without such a suite it is silly to say that one backend is better than another through the Pythagorus example. One test is not sufficient to demonstrate the pro's and con's of a codegen. If anyone is wiling to work on such a suite, please do, it would be most appreciated. Although, there might be some interesting research questions about what problems dependently typed programs should be benchmarked against. Might even be something to take to the Agda lot as well, in case they have some interesting ideas.

Finally, Tim, I am not to well read up on creating backends for Idris, however, I think ziman on IRC would be the best person to ask aside from Edwin.

Good luck with the backend, keep us updated!

Jan

PS Also, if you feel the documentation in `docs/` could be improved then feel free to contribute improvements, or at least tell us what you think could be improved.

--
You received this message because you are subscribed to the Google Groups "Idris Programming Language" group.
To unsubscribe from this group and stop receiving emails from it, send an email to idris-lang+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Tim Steenvoorden

unread,
Jan 10, 2017, 8:21:02 AM1/10/17
to idris...@googlegroups.com
Dear all,

First, to answer Markus question: Clean has extensive support for strictness annotations and uses an aggressive strictness analyser. When you take a look at the Clean code generated by this backend, you can see almost anything is annotated as being strict (exclamation marks in front of type annotations).

As Jan said, Idris’ sadly doesn’t have an extensive benchmark suite. For now I've only found four benchmarks in the Idris source tree (https://github.com/idris-lang/Idris-dev/tree/master/benchmarks): fasta, pidigits, quasigroups and sortvec. `pidigits` doesn't work with the Clean backend because of a missing BigInt implementation (I’m working on that now) and `quasigroups` seems out of date. I added `pythagoras` to this suite (https://github.com/timjs/idris-clean/tree/master/Benchmarks). @ziman created some small benchmarks (https://github.com/ziman/idris-benchmarks) which I’ll try to incorporate. If anyone has bigger, real world examples to use for benchmarking, let me know!

I agree it’s a cool feature to be able to use different backends. ABC can indeed target ARM as well as Intel native code. I was only aware of the OCaml backend by @ziman (https://github.com/ziman/idris-ocaml), but the master branch doesn't seem to generate compilable OCaml code. The Malfunction backend seems pretty fast indeed. I'll take a better look at it.

Thanks very much for the tips Jan! I also didn’t had the chance to talk to you last Friday (enjoyed your talk btw). I hope Edwin or @ziman can point me in the right direction with regards to the type information.

Cheers,
Tim


----
T.J. Steenvoorden, MSc

PhD Candidate | Radboud University Nijmegen
Faculty of Science | Department of Software Science
Mercator 1 Building | Room 01.08
Toernooiveld 212 | 6525 EC | Nijmegen | The Netherlands
+31 24 365 22 91 | t.steen...@cs.ru.nl

Tim Steenvoorden

unread,
Jan 16, 2017, 6:36:57 AM1/16/17
to idris...@googlegroups.com
Hi all (especially Edwin I think),

What is the reason why `simplDecl` only export functions (`SFun`) and leaves constructors out? I’ll probably need the simplified IR for an ABC backend, but I also need constructor information.

Cheers,
Tim


----
T.J. Steenvoorden, MSc

PhD Candidate | Radboud University Nijmegen
Faculty of Science | Department of Software Science
Mercator 1 Building | Room 01.08
Toernooiveld 212 | 6525 EC | Nijmegen | The Netherlands
+31 24 365 22 91 | t.steen...@cs.ru.nl

Joe M

unread,
Jan 20, 2017, 2:39:39 PM1/20/17
to idris...@googlegroups.com
Hello Tim,

> What is the reason why `simplDecl` only export functions (`SFun`) and leaves constructors out? I’ll probably need the simplified IR for an ABC backend, but I also need constructor information.

This pull request adds the TT declarations to CodegenInfo. They might
be more useful than the IR as the TT declarations still have the type
information.

https://github.com/idris-lang/Idris-dev/pull/3627

Thanks

Yves Cloutier

unread,
Apr 15, 2017, 10:59:43 AM4/15/17
to Idris Programming Language
I haven't been programming with functional languages for that long, so I do apologize realize it may be a naive question which is not so easily answered.

But what are the major differences between Clean and Idris?

As well, what goes goes into a decision for choosing a backend, or decision to have multiple backends?

If the end result is that a program can be executed natively regardless of which backend is used, why would one choose to use the Ocaml, Idris, Clean [or something else] backend?

I'm just curious.  Or is it more an exercise in experimenting with the different possibilities of implementation?

Regards,

yc

Tim Steenvoorden

unread,
Apr 18, 2017, 10:12:53 AM4/18/17
to Idris Programming Language
Dear Yves,

You can go for a specific backend if you want interoperability with that language, using the foreign function interface. Off course this should be implemented, like in the official C and JS backends. 

The purpose of this particular backend is (at the moment) pure experimentation. Our hypothesis is that the ABC machine, which is Clean’s backend, could be a good candidate for Idris as well in terms of performance. There is no much work going on in this direction at the moment. Maybe I’ll have more time this summer.

Cheers,
Tim

----
T.J. Steenvoorden, MSc

PhD Candidate | Radboud University Nijmegen
Faculty of Science | Department of Software Science
Mercator 1 Building | Room 01.08
Toernooiveld 212 | 6525 EC | Nijmegen | The Netherlands
+31 24 365 22 91 | t.steen...@cs.ru.nl
--

Yves Cloutier

unread,
Apr 19, 2017, 9:50:27 PM4/19/17
to Idris Programming Language
I see Tim, I understand.  Thanks for clarifying!

Tim Steenvoorden

unread,
Apr 20, 2017, 2:37:30 AM4/20/17
to Idris Programming Language
I'm glad I could answer your questions Yves!

Cheers,
Tim

----
T.J. Steenvoorden, MSc

PhD Candidate | Radboud University Nijmegen
Faculty of Science | Department of Software Science
Mercator 1 Building | Room 01.08
Toernooiveld 212 | 6525 EC | Nijmegen | The Netherlands
+31 24 365 22 91 | t.steen...@cs.ru.nl
Reply all
Reply to author
Forward
0 new messages