Which is supposed to be the latest maintained version? Hackage 3.4.1.4 or GitHub v4.1.0.1?

Compl Yue

Mar 31, 2020, 7:59:55 PM
to Haskell Repa
Hi folks here,

I'm implementing a graph+array database on top of the GHC runtime. The graph part is almost done, so I'm starting the array part. As GPU is not my current focus, I see Repa as the ideal interface library to bridge the DML of my db and the number-crunching apps that are to be written in Haskell.

But I'm confused about how Repa is maintained today. I found 3.4.1.4 on Hackage but v4.1.0.1 released on GitHub, so which one am I supposed to start with?

Also, I'd like to know how Repa is used by your users. My team is shifting from NumPy/Pandas to the Haskell ecosystem; there's a lot to learn, and we'd better stay idiomatic.

Best regards,
Compl

Ben Lippmeier

Apr 5, 2020, 1:51:43 AM
to Compl Yue, Haskell Repa


> On 31 Mar 2020, at 11:58 pm, Compl Yue <comp...@gmail.com> wrote:
>
> I'm implementing a graph+array database on top of the GHC runtime. The graph part is almost done, so I'm starting the array part. As GPU is not my current focus, I see Repa as the ideal interface library to bridge the DML of my db and the number-crunching apps that are to be written in Haskell.
>
> But I'm confused about how Repa is maintained today. I found 3.4.1.4 on Hackage but v4.1.0.1 released on GitHub, so which one am I supposed to start with?

I started reworking the array API a few years ago after we wrote the “Polarized Data Parallel Data Flow” paper, but I had too many problems trying to eliminate all the boxing/unboxing overhead that is typical in numeric Haskell code, and lost interest. If the 3.4.1 version stops building then I’m happy to fix it, but the v4.1 array API on GitHub should be considered abandoned.
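
For orientation, starting on the 3.4.1 line might look roughly like the following. This is only a minimal sketch, assuming the `repa` 3.4.x package from Hackage is installed; the names `sample` and `scale` are made up for illustration and are not part of Repa.

```haskell
{-# LANGUAGE TypeOperators #-}
module Main where

import Data.Array.Repa (Array, D, DIM2, U, Z (..), (:.) (..))
import qualified Data.Array.Repa as R

-- Illustrative data: a 3x3 manifest (unboxed) array holding 1..9 in row-major order.
sample :: Array U DIM2 Double
sample = R.fromListUnboxed (Z :. 3 :. 3) [1 .. 9]

-- Element-wise scaling. R.map returns a delayed array (type index D),
-- so nothing is computed until the result is forced.
scale :: Double -> Array U DIM2 Double -> Array D DIM2 Double
scale k = R.map (* k)

main :: IO ()
main = do
  -- computeP forces the delayed array into an unboxed one, in parallel.
  doubled <- R.computeP (scale 2 sample) :: IO (Array U DIM2 Double)
  print (R.toList doubled)
  -- Sequential sum over all elements.
  print (R.sumAllS doubled)
```

Coming from NumPy, the main difference to get used to is the split between delayed (`D`) and manifest (`U`) arrays: operations such as `R.map` fuse until `computeP` or `computeS` is called. To actually run `computeP` on several cores, the program would typically be built with `-threaded` and run with `+RTS -N`.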

> Also, I'd like to know how Repa is used by your users. My team is shifting from NumPy/Pandas to the Haskell ecosystem; there's a lot to learn, and we'd better stay idiomatic.

You’d be better off using something like Accelerate or hmatrix for the backend. The Accelerate library is an EDSL that can compile via CUDA code and is actively maintained. hmatrix is a binding to the standard BLAS operators. A key problem with writing numeric code in plain Haskell is that the GHC core language was never really intended for it, and the need to preserve the existing language semantics prevents GHC from doing some critical optimisations that would be necessary to get efficient code. GHC also doesn’t have an automatic vectoriser. Without hand-vectorisation, standard numerical kernels are going to be ~4x slower than BLAS, and if you’re going to hand-vectorise code you might as well just use C intrinsics and call the C kernels using the Haskell FFI. I still think the techniques used in Repa are valid as a scientific approach, but to push this any further we’d need a compiler IR and simplifier that is more directly intended for high performance numeric code.
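
To give a flavour of the hmatrix route, here is a minimal sketch, assuming the `hmatrix` package from Hackage and a system BLAS/LAPACK are installed. The values `a` and `v` are illustrative only, and which native routine each operator ends up calling is not something this sketch verifies.

```haskell
module Main where

import Numeric.LinearAlgebra

-- Illustrative 2x2 matrix, filled row by row: [[1, 2], [3, 4]].
a :: Matrix Double
a = (2 >< 2) [1, 2, 3, 4]

-- Illustrative vector of ones.
v :: Vector Double
v = vector [1, 1]

main :: IO ()
main = do
  -- Matrix-vector product; the heavy lifting is delegated to native code.
  print (a #> v)
  -- Dot product of two vectors.
  print (v <.> v)
```

The Haskell side here only describes the data and the operations, which is close to how NumPy code is usually written, while the numeric kernels themselves stay in the native library.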

Cheers,
Ben.




