Support using brackets to access an index of a list

152 views
Skip to first unread message

Zach Daniel

unread,
Sep 21, 2023, 12:23:44 PM9/21/23
to elixir-lang-core
Right now if you use access on a list, it expects it to be a keyword and requires the value to be an atom.

I've tried to think of a reason why we couldn't support this in the language, but I haven't been able to. If the value being accessed is an `integer` it could get the item at that index, and if its an `atom` it could treat it as a keyword and get the value for that key.

`[1, 2, 3][0] -> 1`
`[foo: :bar][0] -> {:foo, :bar}`
`[foo: :bar][:foo] -> :bar`
`[1, 2, 3][:foo] -> not a keyword list error`

It was pointed out that perhaps we don't do this to express that indexing a list is not fast in Elixir like it is in other languages, but I'm not sure if that is sufficient reason IMO to leave out a typically very standard feature of lists.

Thoughts?

José Valim

unread,
Sep 21, 2023, 12:29:15 PM9/21/23
to elixir-l...@googlegroups.com
Index-based access on a list is very frequently an anti-pattern. Allowing list[i] means it will be easier to write non-efficient versions of algorithms without a second thought for those coming from imperative languages, instead of using the functions in Enum. It means someone can write this:

for i <- 0..length(list)-1 do
  list[i] * 2
end

We should not make it easy to write non-idiomatic code.

If we want to allow this syntax then I would like to see a comprehensive list of examples where accessing a list by index is important  and doing so via the bracket syntax is important enough to justify it.

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-co...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/dc7b3c13-08c7-48e8-a338-fe409ca4202an%40googlegroups.com.

Justin Wood

unread,
Sep 21, 2023, 12:39:48 PM9/21/23
to elixir-l...@googlegroups.com

It was pointed out that perhaps we don't do this to express that indexing a list is not fast in Elixir like it is in other languages, but I'm not sure if that is sufficient reason IMO to leave out a typically very standard feature of lists.

Thoughts?

Can you give an example of a language that supports this? In my experience, languages do not provide this.

In Haskell, you need to the non-standard !! operator.

myList !! 5

OCaml does not provide a way to do this at all and you need to manually iterate through the list to find the element you want (maybe other stdlibs like Core provide something, but I have not used them)

Racket has the list-ref function in order to accomplish this

(list-ref my-list 3)

Common Lisp has the nth function for this

(nth 5 my-list)

In comparison, Elixir has provided the Enum.at/2 function.

While I believe most languages do provide some facility to retrieve the value at some index of a list, most do not provide it as square bracket indexing you would see in other languages where arrays are more common such as Ruby or Javascript.

Justin

Zach Daniel

unread,
Sep 21, 2023, 1:14:24 PM9/21/23
to elixir-l...@googlegroups.com
Languages that support it via square brackets: Rust, Ruby, Javascript, Python, C, Julia.

I see your point, ultimately for me what is frustrating is when I have predefined data structures and we have a tool for "get the value at some point (i.e map keys)" so I can do `map[:key][:key]` but if I want to write `map[:key][0][:key][0]` I have to reorient the code entirely to use `get_in` or piped functions like `|> Map.get(:key)`. Getting a key out of a keyword list would be equally as slow as getting a list at an index, so it didn't seem to me like the purpose of the `[]` was to only do fast operations.

There are plenty of cases where I need to get things positionally out of lists, the two that come to mind are handling various data structures that I got externally, like from an API (I can use bracket access until I get to a list then I can't) and testing. Often I'm testing that something at a very specific point in a produced data structure is a specific value. I.e "the tenth item at the path `foo[:bar]` is X". I can do that with pattern matching, but its only feasible to a point (like pattern matching the 10th item from a list does not look good). We can use `get_in` but we could also make that argument for everything and not have bracket access at all. It seems like a natural extension of that choice to give lists (and probably tuples) sensible behavior with bracket syntax.

I'm not necessarily convinced that not having `list[i]` will ultimately convince people not to write those bad algorithms, because they can just as easily say `List.at(list, 1)`. But wether or not I'm convinced of that is not relevant 😆



--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-core+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-core/dc142043-83dc-48e6-8c3f-7a39c8376d0e%40app.fastmail.com.

Zach Daniel

unread,
Sep 21, 2023, 1:26:03 PM9/21/23
to elixir-l...@googlegroups.com
I mean `Enum.at/2` 😆


Zach Daniel

unread,
Sep 21, 2023, 1:38:39 PM9/21/23
to elixir-l...@googlegroups.com
Something else I would say: if we are doing this to protect people from writing non-idiomatic code, can we improve the error message for when you provide an integer to bracket access a list? Something like:

If you are trying to get an element from a list by index, use `Enum.at` instead. However, please read the warnings about working with lists from X guide…

Or we could just include the explanation in-line. `Enum.at` should also potentially have a more clear warning for beginners directly in its docs.

I would rather support the syntax, but if we're doing it to help people write idiomatic code (which will probably be mostly beginners), we should point them more clearly to this then IMO.



On Thu, Sep 21, 2023 at 1:25 PM, Zach Daniel <zachary....@gmail.com> wrote:
I mean `Enum.at/2` 😆


Justin Wood

unread,
Sep 21, 2023, 2:19:57 PM9/21/23
to elixir-l...@googlegroups.com

Languages that support it via square brackets: Rust, Ruby, Javascript, Python, C, Julia.

All of these languages (other than maybe Julia? I have not used it at all.) are actually using arrays and not lists. It is fairly natural to have easy index based lookup for arrays. After all, it is meant to be a contiguous block of memory with each element having a known size. At least for Rust and C, other languages may be more loose in terms of implementation.

In fact, when you use std::collections::LinkedList in rust, you do not have this kind of syntax. If you want to find the element at index i you would need to iterate over the list just like ocaml. There is no provided function to my knowledge to even access a random element.

When you choose a random element from an array, you just need to do some math in order to find the offset in memory for where the element you are looking for resides. This means that random access is constant time or O(1) if you prefer. Having a convenient syntax for this makes sense since the time complexity is so low.

Linked lists, which Elixir and Erlang use, are not necessarily stored as a single contiguous block of memory. It is essentially a value and a pointer to the next element in the list. This means you actually need to traverse the list to the point that you want to access. This means that random access is considered to be linear time or O(n). Having a convenient syntax for this makes less sense since the time complexity is significantly higher. This is the tradeoff that most functional languages have taken, make it harder to do the expensive thing and get people to think about the data structures being used.

Depending on what exactly you are doing, and the size of your lists, it may be worth taking a look at the :array module provided by Erlang. It will give you logarithmic access times, or O(log n). But it still has the downside of not having [] type access.

Justin

Zach Daniel

unread,
Sep 21, 2023, 2:23:02 PM9/21/23
to elixir-l...@googlegroups.com
Yeah, I'm familiar with the differences on that front, but at the end of the day there are plenty of times where its perfectly normal to get a list/tuple element by index. It seems like a natural syntax extension to me (the syntax is basically already there, just intentionally blocked off in a way) for when you have to do that. In the same way that keyword lists aren't typically long, plenty of lists are not too long for it to be perfectly fine to access an item by index.


--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-core+unsubscribe@googlegroups.com.

Marten Wijnja

unread,
Sep 21, 2023, 2:41:32 PM9/21/23
to elixir-l...@googlegroups.com


On 21/09/2023 20:19, 'Justin Wood' via elixir-lang-core wrote:

Depending on what exactly you are doing, and the size of your lists, it may be worth taking a look at the :array module provided by Erlang. It will give you logarithmic access times, or O(log n). But it still has the downside of not having [] type access.

To add on to that, there are quite a number of excellent libraries on Hex.pm which provide arrays including support for the Access behaviour (which gives you `[]`-indexing and a few other capabilities), such as:

- Aja, which provides a persistent vector based on RRB trees. In some benchmarking I did a year ago, Aja turned out to be the fastest general-purpose array implementation in Elixir.
- Arrays (written by me), which supports a unified interface to a bunch of different array implementations (such as map-based arrays and Erlang's `:array` module).
- Nx, which provides natively-implemented (one- and multi-dimensional) arrays for numbers and can potentially run computations with them on your GPU.
- Explorer, which provides natively-implemented one-dimensional 'series' and two-dimensional 'dataframes'.

Those last two do not allow you to store 'all' kinds of  Elixir datastructures inside them, but they are often more efficient than a library written in pure Elixir when you have a lot of data or do a lot of calculations.


~Qqwy / Marten

José Valim

unread,
Sep 21, 2023, 3:36:50 PM9/21/23
to elixir-l...@googlegroups.com
For what is worth, the error message on main says:

---

iex(1)> [1, 2, 3][0]
** (ArgumentError) the Access module does not support accessing lists by index, got: 0

Accessing a list by index is typically discouraged in Elixir, instead we prefer to use the Enum module to manipulate lists as a whole. If you really must access a list element by index, you can Enum.at/1 or the functions in the List module
---


Zach Daniel

unread,
Sep 21, 2023, 3:42:38 PM9/21/23
to elixir-l...@googlegroups.com
Nice 😆. That is a definite improvement. Ultimately, I think it would be a nice improvement to the language, but I don't disagree with the idea that we need to take great care for the on ramp and to help people avoid footguns. I don't think its very likely that I'll convince the team to add it, and its not important enough to use more of your time on, you all have better things to do 🙇‍♂️ Consider my proposal withdrawn.


--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-core+unsubscribe@googlegroups.com.

Paul Clegg

unread,
Sep 22, 2023, 4:28:47 AM9/22/23
to elixir-l...@googlegroups.com
On Thu, Sep 21, 2023 at 7:19 PM 'Justin Wood' via elixir-lang-core <elixir-l...@googlegroups.com> wrote:
Languages that support it via square brackets: Rust, Ruby, Javascript, Python, C, Julia.
All of these languages (other than maybe Julia? I have not used it at all.) are actually using arrays and not lists. It is fairly natural to have easy index based lookup for arrays. After all, it is meant to be a contiguous block of memory with each element having a known size. At least for Rust and C, other languages may be more loose in terms of implementation.

That's the problem.  Ruby, Python, Javascript are where a lot of Elixir devs are coming from (or languages that Elixir devs are often "context switching" with on a daily basis) and these all implement lists that are not actually arrays in the conventional sense.  I'm pretty sure, in all three cases, that a list is actually an object that maintains the list elements -- that might be an actual array of pointers, or it might not.  It's definitely not a low-level single allocated block of memory, because these are weakly typed languages an all three support mixed type values (an array in C requires every entry be a fixed size; indexed access is simply multiplying the index by the size of each element and using that as an offset to the start of the memory block -- one of the reasons you can easily segfault by doing that).

Lists in Elixir are linked lists, but we don't regularly refer to them that way.  We don't explicitly reinforce this implementation by requiring things like "LinkedList.new(1, 2, foo)"; we use a "conventional array notation" like "mylist = [1, 2, foo]" -- just like you'd see in Python, Javascript, Ruby.  So it's really not that much of a surprise, coming from those languages, that you start off with that kind of a syntax, and you then expect the language to implement square-bracket-indexing, just like those languages do.  We have Enum.at() to do this, so it's not like it's an impossibility to actually make this work.  The question then is why the syntax doesn't just wrap that function and 'make it work'.  I'm not sure I buy Jose's argument about preventing "non-idiomatic code".  If someone is going to forget that Enum.map exists, they're going to write that loop, and they'll just use Enum.at if they can't get the square brackets to work.

This difference from how those languages have wrapped array-like functions into their syntax has popped up recently with my team, we've noticed people forgetting that the length of a List in Elixir isn't "free information" like it is in Ruby (where the length of a list is actually an attribute of the object representing the list), so we've had to remind people not to do things like "assert length(foo) > 0" (fortunately, a lot of this inefficiency has been limited to our unit tests!) -- because they're coming from Ruby where the length of a list is always "known", so it doesn't even occur to them (and I admit, I've been guilty of forgetting this occasionally as well) that Enum.count() and length() both actively loop over an argument to compute the length, and aren't just retrieving some privately cached data.

As an aside, the one oddity around Access' square-bracket-index implementation that still throws me every once in a while is why you can't use square brackets to index a field on a struct.  If you have a Map, you can do foo[key] but on a struct, foo[key] throws an exception.  Where it gets annoying is that the way to make this work on a struct is to actually use Map.get -- a Map function!  If a struct is a Map, why can't Access just treat structs like Maps in this situation?  But that's a completely different discussion, and I digress.  ;)

Personally, I don't see the harm in supporting it.  If someone's going to abuse it, they'll abuse Enum.at() to make it work anyway, just like I end up abusing Map.get() when I need to get a field from a struct based on a variable key name.  But it's definitely going to be an ongoing point of confusion because "one of these things is not like the others"

...Paul


Ben Wilson

unread,
Sep 22, 2023, 6:54:46 AM9/22/23
to elixir-lang-core
> Personally, I don't see the harm in supporting it.  If someone's going to abuse it, they'll abuse Enum.at()

The harm isn't for people who doing it intentionally, the harm is for people who are doing it unintentionally. Index based array access is so common in certain languages that it's one of the first thing newbies from those languages will try in Elixir and if it works then they'll just proceed writing loops like:

for i <- 0..length(users) do
  IO.puts users[i].name
end

In doing so they'll have reinforced a non idiomatic pattern, and failed to learn something crucial about the language. Of course they can always hit the docs and find Enum.at and rewrite it, but if you google: "Elixir access list by index" you get stack overflow and forum posts that help you realize that you probably don't want to do that. If [] just works they won't google till much later.

- Ben

Paul Clegg

unread,
Sep 22, 2023, 7:07:40 AM9/22/23
to elixir-l...@googlegroups.com
On Fri, Sep 22, 2023 at 11:54 AM Ben Wilson <benwil...@gmail.com> wrote:
The harm isn't for people who doing it intentionally, the harm is for people who are doing it unintentionally. Index based array access is so common in certain languages that it's one of the first thing newbies from those languages will try in Elixir and if it works then they'll just proceed writing loops like:

Fair enough.  :)

...Paul

 

Wojtek Mach

unread,
Sep 22, 2023, 7:24:55 AM9/22/23
to elixir-lang-core
I believe the best outcome would be for OTP to get a native vector type and then Elixir can support it, including the Access behaviour. So I would wait. it may never happen because the barrier to entry is so high but hey, OTP didn't use to have native maps either!
Reply all
Reply to author
Forward
Message has been deleted
0 new messages