Array map with start and end


Francisco Ramos

Nov 17, 2017, 3:25:22 AM
to Elm Discuss
Hi there,

I was wondering how I can map over an array between a start and an end index. I know I could slice the array and then map, but performance is a concern: slicing is O(N) where N = end - start, and the actual mapping is another O(N).

Maybe there is another way where I just loop once over the array?

Thanks a lot,
Fran

Rupert Smith

Nov 17, 2017, 5:34:12 AM
to Elm Discuss
You could write your own using Array.get. But I can see that will be a PITA, as it returns a Maybe, so you might also end up having to supply a default value for the case where get gives you a Nothing, but in practice that default will never be used. Or have your custom map function also return a Maybe, and output Nothing if the indexes are out of bounds and Array.get returns Nothing?
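
A minimal sketch of the second option (the custom map returns a Maybe), assuming the range is half-open, i.e. it covers indexes start..end-1. `mapRange` is a hypothetical name, not something in elm-lang/core, and I have not benchmarked it:

```elm
module RangeMap exposing (mapRange)

import Array exposing (Array)


-- Map `f` over the elements at indexes start..end-1 in one pass.
-- Gives Nothing as soon as Array.get fails for an index in the range.
-- (Hypothetical helper, not part of elm-lang/core.)
mapRange : Int -> Int -> (a -> b) -> Array a -> Maybe (Array b)
mapRange start end f array =
    let
        -- Self-recursive in tail position, so no intermediate list of indexes is needed.
        loop i out =
            if i >= end then
                Just out

            else
                case Array.get i array of
                    Just x ->
                        loop (i + 1) (Array.push (f x) out)

                    Nothing ->
                        Nothing
    in
    loop start Array.empty
```

For example, `mapRange 1 3 ((*) 2) (Array.fromList [ 1, 2, 3, 4 ])` should give `Just` an array holding 4 and 6, while a range that reaches past the end gives `Nothing`. No default value is needed anywhere.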

Robin Heggelund Hansen

Nov 17, 2017, 7:20:20 AM
to Elm Discuss
Slicing isn't O(N).

In the current implementation in core, slicing is O(log32 n), I believe. In the next version of Elm, slicing is O(log32 n) when start = 0; I'm uncertain what the big-O complexity is once start > 0, though.
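
To put O(log32 n) in perspective, here is a rough sketch that counts the levels of a tree with branching factor 32. `levels` is a hypothetical helper, and the model is simplified (every node is assumed to hold up to 32 children, which ignores details of the real implementation), so treat the result as an order-of-magnitude estimate only:

```elm
module TreeDepth exposing (levels)


-- Number of levels in a tree with branching factor 32 whose leaves
-- hold `n` elements in total. Simplified model, not the core code.
levels : Int -> Int
levels n =
    if n <= 32 then
        1

    else
        -- One more level on top of ceiling (n / 32) child nodes.
        1 + levels ((n + 31) // 32)
```

Under these assumptions `levels 75000000` evaluates to 6, so an operation that touches about one node per level visits only a handful of nodes, even for tens of millions of elements.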

Francisco Ramos

Nov 17, 2017, 7:34:08 AM
to elm-d...@googlegroups.com
That was a good observation, Rupert. Well, it doesn't return Nothing if the indexes are out of bounds; instead, if start < 0 then start = 0, and if end >= length then end = length - 1... I could actually use Array.get and implement my own map like you mention.

Thanks Robin for that correction. I thought Array.slice was using Array.prototype.slice under the hood, which, as far as I know (in the C++ implementation), is O(N). If there is a new implementation with such complexity, then happy days.

I was just curious to know what ideas there are out there about this problem. I'm aware of the fact that 2 * O(N) is still O(N), but my arrays might be dealing with millions of entries. Imagine a 5000px by 5000px image with 3 color channels. That's 75 million entries. So performance is very important. That's why I'm asking.

Thanks guys 


Matthieu Pizenberg

Nov 20, 2017, 2:01:34 AM
to Elm Discuss
Hi Francisco, just a few words about arrays and image manipulation.

I've been doing some work along these lines and encountered multiple issues. One of them was with slicing. If I'm not wrong, Robin's work will be merged in 0.19, but meanwhile you should be aware that there are a few issues with the current arrays, especially large ones [1]. You can already use his library as a drop-in replacement, though.

I was also about to mention the numelm project, but I think you know about it ^^. Regarding those kinds of operations (slicing, transposing, ...), I think a generic image (/tensor) type would benefit a lot from having lazy "views" [2] and expressions of matrices, as explained in the xtensor [3] project I mentioned in your numelm post.

Cheers and good luck for this amazing project!

[1] elm 0.18 array issues : https://github.com/elm-lang/core/issues/649
[2] lazy views at 12:30 : https://youtu.be/mwIQUgigjbE?t=12m30s
[3] https://github.com/QuantStack/xtensor

Francisco Ramos

Nov 20, 2017, 3:47:43 AM
to elm-d...@googlegroups.com
Hi Matthieu,

Thanks for those links! Those lazy views look like what I'm trying to achieve. I'm actually working on a multidimensional container of items, elm-ndarray, https://github.com/jscriptcoder/elm-ndarray/blob/master/src/NdArray.elm. There is still some work to do: I need to rewrite map and fold, since they're not correct, and implement step. This is based on the work of Mikola Lysenko, https://github.com/scijs/ndarray, which in turn is based on the Python ndarray, https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.ndarray.html, where operations such as slicing, indexing, transposing, reshaping, etc. are all O(1). What I'm missing, though, is the possibility of using TypedArrays.
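
To illustrate why such operations can be O(1) in the number of elements, here is a minimal sketch of a strided view. The `View` record and the `matrix`, `transpose` and `get` helpers are hypothetical names made up for this example; they are not the actual elm-ndarray or scijs/ndarray API:

```elm
module StridedView exposing (View, get, matrix, transpose)

import Array exposing (Array)


-- A flat buffer plus the metadata that says how to read it.
-- (Hypothetical record, for illustration only.)
type alias View a =
    { buffer : Array a
    , shape : List Int
    , strides : List Int
    , offset : Int
    }


-- Wrap a flat, row-major list as a rows x cols matrix.
matrix : Int -> Int -> List a -> View a
matrix rows cols items =
    { buffer = Array.fromList items
    , shape = [ rows, cols ]
    , strides = [ cols, 1 ]
    , offset = 0
    }


-- Reverse the axes. Only the metadata changes; the buffer is shared.
transpose : View a -> View a
transpose view =
    { view | shape = List.reverse view.shape, strides = List.reverse view.strides }


-- Look up an element with one index per axis.
get : List Int -> View a -> Maybe a
get indexes view =
    let
        inBounds =
            (List.length indexes == List.length view.shape)
                && List.all identity (List.map2 (\i dim -> i >= 0 && i < dim) indexes view.shape)
    in
    if inBounds then
        Array.get (view.offset + List.sum (List.map2 (*) indexes view.strides)) view.buffer

    else
        Nothing
```

With `m = matrix 3 3 (List.range 1 9)`, `get [ 0, 1 ] m` is `Just 2` and `get [ 0, 1 ] (transpose m)` is `Just 4`. The transpose never touches the 9 elements; its cost depends only on the number of dimensions.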

Fran


Robin Heggelund Hansen

Nov 20, 2017, 7:21:47 AM
to Elm Discuss
It is using Array.prototype.slice under the hood, but Arrays in Elm are implemented as trees. I suggest you watch my talk from Elm Europe, where I explain how the different data structures work in detail =)

Matthieu Pizenberg

Nov 20, 2017, 10:09:10 AM
to Elm Discuss
Hi again,

So out of curiosity, I just spent a couple of hours looking for variations of:
"(immutable/persistent) (tensor/multidimensional array/multidimensional data structure) implementation"
and my conclusion is that I did not easily find examples of implementations of data structures tailored for such specific needs as tensor manipulation. It must not be the right way to search for this.

What I found, however, were many references to Okasaki's work on immutable data structures. This question [1] with its answer provides good starting points, in my opinion. Okasaki's book seems to focus on how to design/implement functional data structures, so it could give good insights for the raw data structure at the base of your ndarray type.

Maybe the first thing to do would be to clearly define all the operations you want to have for your ndarray in some document. Then design a data structure considering the trade-offs for all the supported operations. Apparently, there is a paper for the numpy arrays listed on the scipy website [2]. These are not immutable, however, so I don't know if it is useful.

In hope that it may help,
Cheers,
Matthieu

[1] interesting question: https://cs.stackexchange.com/a/25953/34063
[2] scipy citations: https://www.scipy.org/citing.html

Francisco Ramos

Nov 20, 2017, 12:29:25 PM
to elm-d...@googlegroups.com

Hi guys,

Thanks for your answers. Robin, that was a great talk. I actually was in that very same room when you gave the presentation :-). Very interesting and educative. Hope to see you again in the next Elm Europe.

Matthieu, thanks for the info. I didn't know about Okasaki's work on immutable data structures. Have to admit I didn't google much about the subject. Got some references and I'll go through them. I already have a good idea about the API I'd like to implement for the ndarray. Once I get it done (time is not something I have plenty of) I'll write some benchmarks.

Ultimately, I'd like to rewrite NumElm using the elm-ndarray. Not sure how I'm gonna do this without writing kernel code. Linear algebra operations such as Inverse, Pseudo-inverse, Singular value decomposition, Eigenvalues and eigenvectors, etc... I simply have no idea how I'm gonna implement this. Need to have a look at solutions in Haskell for inspiration.

Cheers,

Fran




Rupert Smith

Nov 21, 2017, 5:24:09 AM
to Elm Discuss
On Monday, November 20, 2017 at 5:29:25 PM UTC, Francisco Ramos wrote:

> Ultimately, I'd like to rewrite NumElm using the elm-ndarray. Not sure how I'm gonna do this without writing kernel code. Linear algebra operations such as Inverse, Pseudo-inverse, Singular value decomposition, Eigenvalues and eigenvectors, etc... I simply have no idea how I'm gonna implement this. Need to have a look at solutions in Haskell for inspiration.


I suspect you are up against a tough impedance mismatch between immutable arrays for functional languages, and fast flat arrays for pure number crunching.

The tree structured arrays for functional languages are designed to allow a new version to be created from an existing array, without copying the entire array. Well, a balance between copying the least amount whilst keeping the tree fairly shallow for fast access.
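
As a rough illustration of that sharing, here is a toy sketch with a binary tree instead of the 32-way nodes a real implementation would use. `Tree`, `get` and `set` are made up for this example and are not the core Array code:

```elm
module PathCopy exposing (Tree(..), get, set)


-- A node stores the number of elements in its left subtree.
type Tree a
    = Leaf a
    | Node Int (Tree a) (Tree a)


get : Int -> Tree a -> Maybe a
get i tree =
    case tree of
        Leaf x ->
            if i == 0 then
                Just x

            else
                Nothing

        Node leftSize left right ->
            if i < leftSize then
                get i left

            else
                get (i - leftSize) right


-- Only the nodes on the path to index `i` are rebuilt;
-- every other subtree is reused by the new version as-is.
set : Int -> a -> Tree a -> Tree a
set i x tree =
    case tree of
        Leaf _ ->
            if i == 0 then
                Leaf x

            else
                tree

        Node leftSize left right ->
            if i < leftSize then
                Node leftSize (set i x left) right

            else
                Node leftSize left (set (i - leftSize) x right)
```

After `set 2 30 tree` on a four-element tree, the old tree still answers `get 2` with the old value, and the untouched left half is shared between both versions.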

Arrays of floats for number crunching ideally just want to be stored flat in RAM, so you can point an optimized for-loop at them or your GPU.

You could also look at Java nio.Buffer for some inspiration? These allow off-heap 'direct' buffers to be created, but have an interface on the Java language side to manipulate them. You can, for example, take a 'slice' of such a buffer, and it gives you a so-called flyweight object as the result, that is, a start offset and length into the original buffer, but sharing the same data. 'slice' therefore is a very efficient operation.

This scheme won't translate into immutable functional data structures without modification. For example, modifying such a buffer in an immutable way would mean copying the entire thing. I just mention it as a possible source of inspiration to help you think about your design.

Perhaps this is already what you have in mind for ndarray? A structure that is more efficient for your use case, but that is wrapped in an immutable functional API to make it play nicely with the host language.
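
A minimal sketch of what such a flyweight could look like on the Elm side, assuming a plain Array as the backing buffer. `ArrayView` and its functions are hypothetical names for this example, and unlike Array.slice this `slice` does not interpret negative indexes as counting from the end:

```elm
module ArrayView exposing (ArrayView, foldl, fromArray, get, slice)

import Array exposing (Array)


-- A start offset and a length into a shared buffer.
type alias ArrayView a =
    { buffer : Array a
    , start : Int
    , length : Int
    }


fromArray : Array a -> ArrayView a
fromArray array =
    { buffer = array, start = 0, length = Array.length array }


-- Narrow the view to [from, to). No element is copied.
slice : Int -> Int -> ArrayView a -> ArrayView a
slice from to v =
    let
        lo =
            clamp 0 v.length from

        hi =
            clamp lo v.length to
    in
    { v | start = v.start + lo, length = hi - lo }


get : Int -> ArrayView a -> Maybe a
get i v =
    if i >= 0 && i < v.length then
        Array.get (v.start + i) v.buffer

    else
        Nothing


-- Visit only the elements inside the view, in a single pass.
foldl : (a -> b -> b) -> b -> ArrayView a -> b
foldl f acc v =
    let
        loop i result =
            if i >= v.length then
                result

            else
                case Array.get (v.start + i) v.buffer of
                    Just x ->
                        loop (i + 1) (f x result)

                    Nothing ->
                        result
    in
    loop 0 acc
```

For instance, `fromArray (Array.fromList [ 10, 20, 30, 40, 50 ]) |> slice 1 4 |> foldl (+) 0` gives 90. One trade-off of this design is that a small view keeps the whole backing buffer alive for as long as the view is referenced.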

Robin Heggelund Hansen

Nov 21, 2017, 5:47:02 AM
to Elm Discuss
Something like https://github.com/Skinney/core/blob/master/src/Elm/JsArray.elm ? It's what is used as the basis for Arrays in 0.19. It is not planned to be opened for use outside of elm-lang/core, but if it fits your use case better, I'm sure Evan would be interested in hearing about it.

(JsArray is a thin wrapper over JavaScript arrays. Any operation that modifies the underlying structure causes a complete copy, but get and folds are very fast. Slicing when start === 0 is still going to be faster using Elm Arrays, as it is a tree structure. On the other hand, it should be fairly easy to create a "view" instead of slicing, but that might give you problems with space leaks.)

Rupert Smith

Nov 22, 2017, 5:42:12 PM
to Elm Discuss

On Tuesday, November 21, 2017 at 10:47:02 AM UTC, Robin Heggelund Hansen wrote:
> Something like https://github.com/Skinney/core/blob/master/src/Elm/JsArray.elm ? It's what is used as the basis for Arrays in 0.19. It is not planned to be opened for use outside of elm-lang/core, but if it fits your use case better, I'm sure Evan would be interested in hearing about it.
>
> (JsArray is a thin wrapper over JavaScript arrays. Any operation that modifies the underlying structure causes a complete copy, but get and folds are very fast. Slicing when start === 0 is still going to be faster using Elm Arrays, as it is a tree structure. On the other hand, it should be fairly easy to create a "view" instead of slicing, but that might give you problems with space leaks.)

Yes.

Something I forgot to mention about Java nio.Buffers is that they are byte array buffers. There is a mechanism by which int and float (and short, char, long and double) are overlaid as views onto byte buffers.

The reason I mention this is that as yet Elm does not have any support for binary buffers, and it might also be worth thinking about that issue at the same time.

Could JsArray.elm be made to work with JavaScript typed arrays? https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays

Matthieu Pizenberg

Nov 22, 2017, 9:28:50 PM
to Elm Discuss

> Could JsArray.elm be made to work with JavaScript typed arrays? https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays

That was exactly what I was wondering. I peeked at the Elm and kernel code and wanted to try to hack something to wrap ArrayBuffer. I got stopped at the step of compiling the Elm platform at master ^^, my system GHC being 8.2, and I don't have enough haskell-fu yet to manage multiple Haskell versions :)

As a side note, I just went to a NixOS [1] meetup yesterday evening, really cool stuff. I think this nix package manager [2] will help me deal with those dependency issues!

[1] NixOS: https://nixos.org/
[2] Nix package manager: https://nixos.org/nix/

Rupert Smith

Nov 23, 2017, 5:26:26 AM
to Elm Discuss

On Thursday, November 23, 2017 at 2:28:50 AM UTC, Matthieu Pizenberg wrote:

>> Could JsArray.elm be made to work with JavaScript typed arrays? https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays
>
> That was exactly what I was wondering. I peeked at the Elm and kernel code and wanted to try to hack something to wrap ArrayBuffer. I got stopped at the step of compiling the Elm platform at master ^^, my system GHC being 8.2, and I don't have enough haskell-fu yet to manage multiple Haskell versions :)

Do you need to rebuild the compiler for this?

I may be wrong, but kernel hacking can be done by using elm-github-install, by putting a substitution for elm-lang/core in your elm-package.json like this:

    "dependency-sources": {
        "elm-lang/core": "../my-hacked-core"
    }


Then you can try out kernel modifications without needing to rebuild the whole Elm distribution?
 
> As a side note, I just went to a NixOS [1] meetup yesterday evening, really cool stuff. I think this nix package manager [2] will help me deal with those dependency issues!

I really want to try NixOS, but it will have to wait until I have a good amount of time on my hands to play around with it.

Matthieu Pizenberg

Nov 23, 2017, 5:48:46 AM
to Elm Discuss
> Do you need to rebuild the compiler for this?

I'm not familiar with so-called "native" Elm 0.18 code. So I wanted to use the example given by Robin with `Elm/JsArray.elm` and `Elm/Kernel/JsArray.js` from the Elm master branch to try the same thing with `JsArrayBuffer.[elm/js]`. Since this code is in the master branch only, which uses 0.19 syntax (`elm.json`, "kernel" js code, ...), I cannot compile it with elm-make from the 0.18 branch.

But it's OK, there's no rush. I'll try when I have a little more time.

Francisco Ramos

Nov 23, 2017, 5:54:28 AM
to elm-d...@googlegroups.com
I can confirm that. You just need to use elm-github-install, which allows you to install Elm packages from the official repository and from GitHub. That's exactly what I'm doing in NumElm to be able to write Kernel code, https://github.com/jscriptcoder/numelm...

I'm really running into lots of walls on this subject. I'd be super happy if we had TypedArray views in Kernel code.

Currently I'm trying to figure out how to walk a strided Array. Imagine we have a buffer like this: [1, 2, 3, 4, 5, 6, 7, 8, 9], with shape [3, 3], a square matrix. The strides for this are [3, 1] --> [1, 2, 3 | 4, 5, 6 | 7, 8, 9]. If I step by [2, 2] (which gives shape [2, 2] and strides [6, 2]), then I get [1, 3 | 7, 9] --> a 2x2 matrix. I need to implement map and fold functions, and to be able to do that I need to walk the Array: 1 -> 3 -> 7 -> 9... My functional programming skills are being challenged here ;-)
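
One possible way to do the walk is a fold that recurses over the axes, sketched below. It assumes strides are counted in elements, so taking every second row and column of the 3x3 matrix corresponds to shape [2, 2] and strides [6, 2]. `foldlStrided` is a hypothetical helper, not code from elm-ndarray:

```elm
module StridedFold exposing (foldlStrided)

import Array exposing (Array)


-- Fold over a strided view of `buffer` in row-major order.
-- `shape` and `strides` hold one entry per axis; `offset` is the
-- position of the first element. (Hypothetical helper.)
foldlStrided : (a -> b -> b) -> b -> List Int -> List Int -> Int -> Array a -> b
foldlStrided f acc shape strides offset buffer =
    case ( shape, strides ) of
        ( dim :: restShape, stride :: restStrides ) ->
            -- Step along the current axis and recurse into the remaining axes.
            List.foldl
                (\i inner -> foldlStrided f inner restShape restStrides (offset + i * stride) buffer)
                acc
                (List.range 0 (dim - 1))

        _ ->
            -- No axes left: `offset` now points at a single element.
            case Array.get offset buffer of
                Just x ->
                    f x acc

                Nothing ->
                    acc
```

With the buffer above, `foldlStrided (::) [] [ 2, 2 ] [ 6, 2 ] 0 buffer |> List.reverse` visits 1 -> 3 -> 7 -> 9, and folding with shape [3, 3] and strides [3, 1] walks all nine entries. A map could be built the same way by pushing `f x` into a new Array inside the fold. The `List.range` per axis allocates a small list, which may matter for very large images.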

Fran




Robin Heggelund Hansen

Nov 23, 2017, 8:12:00 AM
to Elm Discuss
Using native code in Elm isn't particularly hard (though, you cannot publish such code as an elm package).

My original, and still working, array implementation uses native code (it's a "blessed" library). It's better to use that as a template for any experimentation you might want to do: https://github.com/Skinney/elm-array-exploration

Matthieu Pizenberg

Nov 27, 2017, 5:55:10 AM
to Elm Discuss
Following your advice, I came back to elm 0.18 and just started a repository to try to wrap JS typed arrays in elm. I've moved the discussion to a new post [1] since we were drifting from the original post here.
Cheers

[1] JS typed array implementation for elm: https://groups.google.com/d/topic/elm-discuss/ZfdV85yq9jU/discussion