Zombie protocol: Sequential (Indexable)

christ...@gmail.com

Jul 22, 2017, 2:53:53 AM
to elixir-lang-core
Hey all! 

In anticipation of v2.0, I'd like to propose a big honking API change, namely, a protocol that has been discussed here for years but never implemented: Sequential.

Context

We have several modules dealing with 'collections', each capturing a unique property of being a collection:

Types:

- Tuple: a collection of values stored contiguously in memory
- List: a collection of values stored in a linked list
- Binary: a collection of bytes
- String: a collection of UTF-8-encoded Unicode characters stored in a binary
- Range: a collection of integer values explicitly denoted by discrete start and end integers

Utilities:

- Enum: stratagems for exhaustively traversing a collection
- Stream: stratagems for lazily traversing a collection

Protocols:

- Enumerable: logic for traversing a collection
- Collectable: logic for mapping a collection of one type to another

So what would the Sequential protocol bring to the table? Well, it would specify the contract for countable collections, be they infinite or not. Much like Enum complements Enumerable, we would implement a Sequence module to interact with those collections that can be accessed via an index, stealing several functions from existing modules.

Protocol

The Sequential protocol would act like the Enumerable protocol, with two callbacks:

- seek(Sequence.t, index): Scans through the sequence until it reaches the value at the given non_neg_integer index position, and returns it.
- terminable?(Sequence.t): Returns whether or not the provided sequence can be exhaustively iterated.
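
A minimal sketch of the shape this could take, assuming seek/2 returns {:ok, value} | :error when the sequence ends before the index is reached (the return shapes are assumptions, not part of the proposal above):

defprotocol Sequential do
  @moduledoc "Sketch only: the two callbacks described above."

  @doc "Scans to the value at the given non-negative index."
  @spec seek(t, non_neg_integer) :: {:ok, term} | :error
  def seek(sequence, index)

  @doc "Returns whether the sequence can be exhaustively iterated."
  @spec terminable?(t) :: boolean
  def terminable?(sequence)
end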

Any collection function that hinges on indexability would be unified and moved into the Sequence module. Any collection function that hinges on exhaustive countability would be, too, so we could check whether the Sequential is terminable?, raise a Sequence.InterminableError if it isn't, and remove this caveat from the Enum module:

Since the majority of the functions in Enum enumerate the whole enumerable and return a list as result, infinite streams need to be carefully used with such functions, as they can potentially run forever.

Module

The Sequence module would utilize Sequential.seek/2 and Sequential.terminable?/1 and take on functions from the Enum, Stream, Tuple, List, and String modules, encapsulating utilities for constructing and manipulating countable collections (a sketch follows the list below):

Manipulation:

- Sequence.at(Sequence.t, index): uses seek to find the value at the index; raises if the index is negative and the sequence isn't terminable.
- Sequence.first(Sequence.t): seeks the first value of the sequence.
- Sequence.last(Sequence.t): seeks the last value of the sequence; raises if the sequence isn't terminable.
- Sequence.random(Sequence.t): seeks a random value of the sequence; raises if the sequence isn't terminable.
- Sequence.to_list(Sequence.t): converts the sequence to a list; raises if the sequence isn't terminable.
- Sequence.replace_at(Sequence.t, index, value): replaces the value at the index with the new value.
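
As a rough illustration, Sequence.at/3 and Sequence.to_list/1 could sit on top of the two callbacks like this (a sketch only, reusing the seek/2 return shape assumed above; the exact error handling is a guess):

defmodule Sequence.InterminableError do
  defexception message: "the sequence cannot be exhaustively iterated"
end

defmodule Sequence do
  # Sketch only: assumes Sequential.seek/2 returns {:ok, value} | :error.

  def at(sequence, index, default \\ nil)

  def at(sequence, index, default) when index >= 0 do
    case Sequential.seek(sequence, index) do
      {:ok, value} -> value
      :error -> default
    end
  end

  def at(sequence, index, default) when index < 0 do
    # Negative indexes only make sense for terminable sequences.
    unless Sequential.terminable?(sequence) do
      raise Sequence.InterminableError
    end

    sequence |> to_list() |> Enum.at(index, default)
  end

  def to_list(sequence) do
    unless Sequential.terminable?(sequence) do
      raise Sequence.InterminableError
    end

    # Walk the sequence from index 0 until seek/2 signals the end.
    Stream.unfold(0, fn index ->
      case Sequential.seek(sequence, index) do
        {:ok, value} -> {value, index + 1}
        :error -> nil
      end
    end)
    |> Enum.to_list()
  end
end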

christ...@gmail.com

Jul 22, 2017, 2:54:56 AM
to elixir-lang-core
Oops, hit <Tab+Enter> before this was fully composed.

christ...@gmail.com

Jul 22, 2017, 3:03:55 AM
to elixir-lang-core
TL;DR: this has several virtues:

- Unifies Tuple and List access and replacement
- Sequential.seek/2 inherently implies that previous Enum functions will iterate fully over the whole enumerable
- Countable infinities have an opportunity to raise on negative index access
- Paves the way for a cacheable enumerable Stream.sequence/1 function that takes an enumerable and allows the user to flag an otherwise destructive-on-read stream as a countable infinity where values might be re-accessed, allowing it to enumerate the stream and cache values on access
- Paves the way for a set of Sequence.xxx/0 functions that procure useful Stream.sequences, like Sequence.primes/0 and Sequence.fibonacci/0 (see the sketch after this list)
- Paves the way for Sequence.product/{1,2} functions that take the cartesian product of possibly infinite, but countable streams
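
For example, Sequence.fibonacci/0 could be little more than an unfolding stream flagged as a sequence. In the sketch below, Stream.sequence/1 is the hypothetical wrapper from the list above (it does not exist today), and Sequence.at/2 is the proposed access function:

# Sketch only: Stream.sequence/1 is hypothetical.
fibs =
  {0, 1}
  |> Stream.unfold(fn {a, b} -> {a, {b, a + b}} end)
  |> Stream.sequence()

Sequence.at(fibs, 10)
# => 55 (assuming zero-based indexing)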

christ...@gmail.com

Jul 22, 2017, 3:08:12 AM
to elixir-lang-core
This also gives us a feasible way to implement Access for non_neg_integer values on Sequences.
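
For what it's worth, the function-based side of Access can already carry this; a hypothetical accessor built on the proposed Sequential.seek/2 (the module and function names below are made up, and the seek/2 return shape is an assumption) might look like:

defmodule SequenceAccess do
  # Hypothetical sketch: a function-based accessor usable with get_in/2,
  # assuming Sequential.seek/2 returns {:ok, value} | :error.
  def at(index) when is_integer(index) and index >= 0 do
    fn
      :get, sequence, next ->
        case Sequential.seek(sequence, index) do
          {:ok, value} -> next.(value)
          :error -> next.(nil)
        end

      :get_and_update, _sequence, _next ->
        raise ArgumentError, "updates are not covered by this sketch"
    end
  end
end

# Usage: get_in(some_sequence, [SequenceAccess.at(3)])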

christ...@gmail.com

Jul 22, 2017, 3:20:11 AM
to elixir-lang-core
Streams created by the Stream module would not be inherently Sequential, but streams created via the Stream.sequence/1 function would be assumed to be interminable, but sequential, Sequences.

José Valim

Jul 22, 2017, 3:47:17 AM
to elixir-l...@googlegroups.com
Thanks Chris.

This doesn't feel like an API change, but rather the addition of a new protocol. The only API change I can foresee is if we want to deprecate Enum.at in favor of Sequence.

My concern with the proposal above is that I don't believe List should implement such a protocol, exactly because access is not constant time; if the best we can guarantee is linear time, then Enum can address it.

Some of the other benefits can also be achieved in Enumerable today. For example, Sequence.product could be made part of Enum. If the concern is lack of guarantees regarding termination, we could add a new function to the Enumerable protocol, such as aggregate, with the same API as reduce, except it should only be implemented if termination is guaranteed.

Also note that implementing a function such as Sequence.replace would require new additions to the protocol if you want to keep the shape of the data type. And keep in mind that you also can't have different implementations for Binary and String - they are the same data type. Luckily both functionalities can be handled by Access - since accessors are not polymorphic on the data type but on the accessor function.

If we consider those concerns and limitations, I see two benefits:

1. Guarantee of termination
2. Guarantee of constant-time index access

I can really see the benefits of the former but am a bit skeptical about the benefits of the latter. The two primary constant-time-access data structures in Elixir, tuples and binaries, are better handled with pattern matching than with functions.

With those concerns in mind, how would you change your proposal?





christ...@gmail.com

Jul 23, 2017, 8:49:51 PM
to elixir-lang-core, jose....@plataformatec.com.br
Hey José!

My thoughts aren't to make guarantees of constant-time access, just linear-time access, which seems reasonable as the main index-accessible data structure in Elixir, lists, makes the same promise. As such I am inclined to agree with you--implementing Sequential for tuples and binaries could distract from the superior pattern-matching method of negotiating them. Just because we could doesn't mean we should.

I propose a new protocol mostly because I feel as if Enum is doing its job really well––allowing you to yield values from collections. The only awkwardness I find in its API are the cases where it has functions that make assumptions about countability and termination, at the risk of falling into an infinite loop. If we moved these to a dedicated protocol we could be more assertive about the type incompatibility and failure cases of these functions.

Taken to its logical conclusion, this module would eat Enum.at/2, and perhaps even Enumerable.count/1. This would be a major API change, but would allow us to be explicit about when we're handling countable collections, and countable infinities.

Things like Sequence.product/{1,2} could be added to Enum instead, but a large part of the appeal of this proposal is, to me, slimming down the Enum module by moving these index-oriented assumptions into a devoted protocol. The question is if this is desirable enough to make the breaking change in Enum.

As such, consider my proposal to hinge on the merits of:

1. Guarantees of linear-time index access
2. Enforcing explicit handling of countable collections
3. Consistent failure modes around negative index access of interminable sequences
4. Slimming down Enum and Enumerable a little
5. Adding a possible avenue to introducing an Access-style formalism for non-atom (i.e. integer index) values

Some cases where these distinctions seem useful to me:

- Most streams are logically Enumerable and Sequential, but not necessarily terminable
- Database tables and key-value stores are logically Enumerable, but not necessarily Sequential
- Monotonically increasing random number generators might be meaningfully Enumerable but more usefully Sequential
- Ranges are meaningfully Sequential; MapSets less so
- Providing a non-destructive Stream.sequence/1 that converts a stream into an index-accessible struct that caches produced values for later access (useful in the implementation of Sequence.product/{1, 2}, being the example that prompted me to start thinking about such a protocol)

The functions in Sequence would handle explicitly addressing individual elements by index, so I'm not sure if worrying about keeping the same shape is a concern. If we opt to include things like Sequence.slice/2 in there they could make the same list-oriented guarantees as Enum and require you to reach into Collectable if you need a particular resulting type––that is what that module is for after all.

With these clarifications and motivations, do you think the API change would be worth the effort? I'd rather see this not go anywhere than further widen the Enum module's API, so I think the addition of a new protocol and probably deprecation of other functions is a key part of the proposal.

José Valim

Jul 24, 2017, 3:32:19 AM
to christ...@gmail.com, elixir-lang-core
Thanks for the reply!

Let me comment on your list of merits and then some last remarks:
  
1. Guarantees of linear-time index access

Already provided by Enum.
 
2. Enforcing explicit handling of countable collections

Yes, that would be great.
 
3. Consistent failure modes around negative index access of interminable sequences

This would be great as well.
 
4. Slimming down Enum and Enumerable a little

Could be a positive or negative change. Having another module to look up functions in, beyond Enum and List, can actually cause more confusion than it helps.
 
5. Adding a possible avenue to introducing an Access-style formalism for non-atom (ie integer index) values

As mentioned in my previous e-mail, this one is provided with accessor functions. Remember this functionality needs to be polymorphic on the function instead of the data type if we want to support things such as strings and binaries, which share the same data type.

In order for us to do such a large change, we need to have very clear benefits across the board. That's why I would rather extend the Enumerable protocol, without changing the Enum API, than do things like segregating the API, which is a risky change. We could debate for weeks whether splitting Enum apart is beneficial - and that's exactly the point - at this point, if we are unclear on such a large change, it is probably not the way to go.

It is also worth expressing the sentiment of the Elixir team that v2.0 will not have breaking changes besides the removal of what has been previously deprecated in 1.x. I don't think this affects this proposal, but it is worth making explicit.

Generally speaking, enumerable gives very few guarantees:

1. no guarantees of termination
2. no guarantees of ordering
3. no guarantees of constant-time access

I think we could improve the termination one in many cases though. If you feel it is worth discussing, maybe we could start a topic around that - first by identifying the functions that need to terminate - such as the access and aggregation ones - and then coming up with a possible extension of Enumerable.

christ...@gmail.com

Jul 24, 2017, 4:16:23 AM
to elixir-lang-core, christ...@gmail.com, jose....@plataformatec.com.br
Thanks for the feedback!

One thing I do like about my Sequential proposal: the terminology of the callback function seek/2 makes it very clear that only linear access is promised, and the necessity of fully seeking a Sequence before negative access becomes an emergent implication.

The parts you seem to favor are:

2. Enforcing explicit handling of countable collections.

The only way I can think of to do this within the existing Enum framework is to add a new Enumerable.seek callback.

3. Consistent failure modes around negative index access of interminable sequences.

The easiest way to handle this within the existing Enum framework would be by adding some sort of terminable flag to Enumerable and having dependent functions check this first.

Both changes necessitate some ultimate breakage in the existing behaviour of Enum: either the functions that need to know this must eventually be moved to a new protocol, or the Enumerable protocol must change, at least to introduce new failure modes previously not emitted.

Currently, both Enumerable.count/1 and Enumerable.member?/2 can return {:error, __MODULE__} to fall back on a naive linear implementation that takes neither countability nor exhaustibility into account––factors that Enumerable.reduce/3 doesn't entirely rely on. If we want to group these changes into the Enumerable module, this seems to me to be the place of leverage for breaking API changes.

How does this proposal sound? I can reformulate this as a new proposal to the mailing list in a second pass with your tentative blessing:


Iteration 1: two new functions to Enumerable

- Enumerable.seek/2

  Falls back to reduce/3 (via a counter) if not implemented; re-implements Enum.at/{2, 3} and its dependents; returns {:error, __MODULE__} to fall back on reduce/3 (a sketch of this dispatch follows the list).

- Enumerable.exhaustible?/1

  A purely opt-in way to declare whether negative-index access is meaningful: relevant functions raise if it returns {:ok, false}, proceed if it returns {:ok, true}, and fall back to the existing infinite-loop behaviour on the default of {:error, __MODULE__}. Existing core types use the default.
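
To make the fallback concrete, here is a sketch of how Enum.at/3 could dispatch once such a callback exists (the {:ok, value} | :error | {:error, module} return shape is an assumption mirroring count/1 and member?/2; this is not the actual Enum implementation):

defmodule EnumSeekSketch do
  # Sketch only.
  def at(enumerable, index, default \\ nil) when is_integer(index) and index >= 0 do
    case Enumerable.seek(enumerable, index) do
      {:ok, value} ->
        value

      :error ->
        # The enumerable ended before reaching the index.
        default

      {:error, _module} ->
        # No seek/2 implementation: fall back to the existing reduce/3-based
        # linear walk (shown here via drop/2 for brevity).
        case Enum.drop(enumerable, index) do
          [value | _] -> value
          [] -> default
        end
    end
  end
end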


Iteration 2: re-implement Enumerable.count/1 and Enumerable.member?/2 in terms of the above

Again maintaining existing behaviour, but responses of {:error, __MODULE__} warn about future deprecation.


Iteration 3: Fully re-define the Enumerable interface (a consolidated sketch follows this list)

- Enumerable.reduce/3

Fully maintains existing behaviour as continuable Enumerables can halt at any time to abort seeking or interminable behaviour when they reach their bounds.

- Enumerable.seek/2

Fully takes on all responsibilities of integer access within the Enum module.

- Enumerable.exhaustible?/1

Appears within any seek/2 dependent implementation to error early on negative integer access.

- Enumerable.member?/2

Fully optional in favor of dependent functions erroring if not terminable, and falling back on linear access if not implemented. Useful to override for Set-type data structures.

- Enumerable.count/1

Fully deprecated in favor of dependent functions erroring if not terminable. Still seeks by default.
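
Pulled together, the redefined interface from Iteration 3 might read roughly as follows (a sketch only; the return shapes are assumptions kept consistent with the existing Enumerable typespecs):

defprotocol EnumerableSketch do
  # Sketch of the fully redefined interface; not the real Enumerable protocol.

  @spec reduce(t, Enumerable.acc(), Enumerable.reducer()) :: Enumerable.result()
  def reduce(enumerable, acc, fun)

  @spec seek(t, non_neg_integer) :: {:ok, term} | :error | {:error, module}
  def seek(enumerable, index)

  @spec exhaustible?(t) :: {:ok, boolean} | {:error, module}
  def exhaustible?(enumerable)

  # Optional override, useful for Set-like data structures.
  @spec member?(t, term) :: {:ok, boolean} | {:error, module}
  def member?(enumerable, element)

  # Slated for deprecation per the list above; still seeks by default.
  @spec count(t) :: {:ok, non_neg_integer} | {:error, module}
  def count(enumerable)
end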


Thanks again for your time! Thoughts?

José Valim

Jul 24, 2017, 5:28:00 AM
to christ...@gmail.com, elixir-lang-core
Currently, both Enumerable.count/1 and Enumerable.member?/2 can return {:error, __MODULE__} to fall back on a naive linear implementation that takes neither countability nor exhaustibility into account––factors that Enumerable.reduce/3 doesn't entirely rely on. If we want to group these changes into the Enumerable module, this seems to me to be the place of leverage for breaking API changes.

The reason count/1 and member?/2 exist is to provide better-than-linear-time implementations. So I don't believe we need to change them, and we definitely can't remove them, since replacing them with any other construct would remove the constant-time benefits. For those functions, if they return {:error, __MODULE__}, then they could fall back to an implementation that guarantees terminability.

- Enumerable.exhaustible?/1

  A purely opt-in way to declare whether negative-index access is meaningful: relevant functions raise if it returns {:ok, false}, proceed if it returns {:ok, true}, and fall back to the existing infinite-loop behaviour on the default of {:error, __MODULE__}. Existing core types use the default.

The access functions (fetch, at, etc.) are not the only ones that suffer from this issue; any aggregation function, such as sum, max, or min, will have termination issues, as will drop/take with negative values.

We should also avoid double dispatching to functions: for example calling Enumerable.exhaustible? and then Enumerable.reduce because of the double dispatch cost. Protocol consolidation is available on production apps but still not available during scripts, compilation time, etc. This is not an Enumerable specific limitation. I would have added this exact same kind of concerns if the Sequence implementation moved forward.

The other limitation here is that for some functions it is actually quite hard to guarantee that they terminate. For example, functions such as flat_map feel like they terminate, but the mapper can return an infinite collection to be flattened, which then means they won't terminate. What should be the default implementation for those? We will either have false positives or false negatives.
 
- Enumerable.seek/2

  Falls back to reduce/3 (via a counter) if not implemented; re-implements Enum.at/{2, 3} and its dependents; returns {:error, __MODULE__} to fall back on reduce/3.

That sounds great to me. What if we call it Enumerable.fetch to mirror Enum.fetch? It may return {:ok, found} | :error | {:error, __MODULE__} - where the last one means to fall back to a linear-time implementation.
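
In spec form that shape would be roughly the following, living inside defprotocol Enumerable (a sketch of the signature being discussed, nothing more):

# {:error, module} means "fall back to a linear-time reduce/3-based walk".
@spec fetch(t, non_neg_integer) :: {:ok, term} | :error | {:error, module}
def fetch(enumerable, index)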

I think at this point we can clearly treat them as two separate proposals. We should guarantee termination regardless of whether we can optimize functions such as fetch/2 and at/3, and vice-versa.

christ...@gmail.com

Jul 24, 2017, 4:35:39 PM
to elixir-lang-core, christ...@gmail.com, jose....@plataformatec.com.br
Sounds good. I will work on two new proposals and present them alongside some of the open-ended questions I encounter experimenting with their initial implementation sometime this week.

Thanks again!

Burdock

Jul 24, 2017, 5:12:52 PM
to elixir-lang-core, christ...@gmail.com, jose....@plataformatec.com.br
Since I am currently working on proposing product/2 functions for Enum/Stream, I thought I would pipe in here.

When talking about Enumerables and Streams, there are a couple of mathematical terms worth using, specifically the term for the number of elements of a set: its cardinality.

Finite cardinality: These are your normal sequences. For example, Lists, Trees, Maps, and Functions that halt. Anything with finite cardinality has an Integer Length. 
Aleph-0 cardinality: These are your basic Countable infinite sequences. For example, Natural numbers, Stream.with_index, and so on. Anything with Aleph-0 cardinality doesn't have an Integer Length. 
Aleph-1 or more: These are your Uncountable infinite sequences. For example, all Real Numbers. Anything with Aleph-1 cardinality is very hard to reason with in a programming context.

The proposed Enumerable.exhaustible?/1 is checking if the sequence is Finite or Aleph-0. Writing a function that returns true or false would be very hard. We would have to solve the halting problem to do so.

But from what I understand, the goal is to improve the *correctness* of Streams. The only thing I can think of that would be close to this is checking if something is a Stream or not.
Data enumerables will always halt. Stream enumerables may or may not halt. 

Something we can do is check if a Stream is Aleph-0 or Aleph-1. For example, concatenating two Aleph-0 Streams is problematic, because the elements of the second Stream can never be reached by enumeration. Unfortunately, as already established, you can't determine whether a Stream is or isn't countable without solving the halting problem.
   
So we are stuck with only being able to determine whether an operation is valid, or maybe valid.

christ...@gmail.com

Jul 24, 2017, 6:07:46 PM
to elixir-lang-core, christ...@gmail.com, jose....@plataformatec.com.br
Right––there won't be any sense of actually trying to ascertain whether or not the Enumerable is capable of these operations at runtime.

By encoding their behaviour at the protocol level, implementations can instead declare the class of operations they support for that type.

This will allow us to make streams not support negative integer access by default. If you have a stream you want that for, you would wrap it in a new struct and implement Enumerable for it, instructing the protocol callback on how to meaningfully handle that case.

José Valim

Jul 24, 2017, 6:23:56 PM
to christ...@gmail.com, elixir-lang-core
This will allow us to make streams not support negative integer access by default. If you have a stream you want that for, you would wrap it in a new struct and implement Enumerable for it, instructing the protocol callback on how to meaningfully handle that case.

The issue with this is that you break a bunch of valid use cases. For example, the example below is actually finite:

[1, 2, 3] |> Stream.filter(fn x -> x >= 2 end)

But if you say that all streams no longer support negative integer access, you just broke it.

This example is also finite:

[1, 2, 3] |> Stream.flat_map(fn x -> [x] end)

But this next example is not necessarily finite, it depends on the implementation of some_fun:

[1, 2, 3] |> Stream.flat_map(some_fun)

Other streams, such as those built with Stream.unfold/2, would require us to solve the halting problem.

That's why I said we need to choose between either allowing maybe-infinite streams in some cases or refusing finite streams in some cases. The former reduces the quality of the guarantees we are able to offer. And the latter is backwards incompatible, so that's a no-go. I think it would be highly detrimental to not be able to perform an index access or an aggregation on Stream.filter or even Stream.flat_map.

christ...@gmail.com

Jul 24, 2017, 6:37:41 PM
to elixir-lang-core, christ...@gmail.com, jose....@plataformatec.com.br
This is a good point. I was looking for an example of how these protocol changes could allow us to make some enumerables not allow negative access. No existing thing in the language is a good example, though, because we would break current behaviour.

It does present the interesting option of adding an `infinite` boolean field, defaulting to false, to the Stream struct and having the exhaustibility protocol function check that. Then stream constructors that are known to generate infinities, like Stream.cycle, could opt in to erroring on negative access. Stream modifiers known to 'curb' the infinities (such as Stream.take/2) could reset it to false. Ambiguous operations would leave it unmodified.
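
A sketch of how the protocol function could consult that hypothetical field (neither the :infinite field nor exhaustible?/1 exist today):

defmodule ExhaustibilitySketch do
  # Sketch only: assumes a hypothetical :infinite field on %Stream{}, set to
  # true by constructors such as Stream.cycle/1.

  def exhaustible?(%{__struct__: Stream, infinite: true}) do
    # Known countable infinity: dependent functions would raise on negative index access.
    {:ok, false}
  end

  def exhaustible?(%{__struct__: Stream}) do
    # Unknown: preserve today's behaviour (a possible infinite loop).
    {:error, Enumerable.Stream}
  end
end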

My exhaustible impl attempt will come well after my take on Enumerable.fetch, so I'm not really thinking very hard about this yet––just spitballing. But definitely the intention is not to have exhaustible? do any work, merely allow Enumerable implementers outside of core to opt-out of negative integer access if they wish.

OvermindDL1

Jul 25, 2017, 12:23:31 PM
to elixir-lang-core, christ...@gmail.com, jose....@plataformatec.com.br
Just for more data points (not proposing anything), perhaps look at how C++ handles iterables. I'll only be referencing the immutable iterator types, or the types that can be immutable based on their contained types. I'll also be using the C++ names for them rather than the more traditional OCaml'y/Haskell'y names, as C++'s are more expressive due to having more 'types':

* Iterator:  This defines that something is an iterator; however, it has no way to actually iterate it. It is only a marker trait that says that something is some type of iterator.

* InputIterator:  This is something that you can 'get' something out of, then 'next' to the next element until it 'ends' (or does not end, it could be infinite).  You cannot go back over an iteration, it is consume-once thus it is read-once, and you cannot write an element, even of the slot that you are in.  The only real Elixir/Erlang 'InputIterator' that I can think of off the top of my mind is the process mailbox.

* OutputIterator:  This is something that you can 'put' something in to, then next to the next 'slot' to put something in to it as well.  You cannot read from an element even of the slot that you are currently in nor can you write to an element multiple times, only once.  The only real Elixir/Erlang 'OutputIterator' that I can think of is sending a message to a process.

* ForwardIterator:  Every ForwardIterator is also an InputIterator and/or an OutputIterator (depending on its contained types), you can get/put and next, however unlike InputIterator/OutputIterator you can read/write from/to a slot multiple times and you can also hold a cursor to a given slot to read/write from/to it again even after you have 'next'/iterated past it.  You cannot iterate backwards from a given slot, only forwards.  This would be like iterating through a list in Elixir/Erlang.

* BidirectionalIterator:  Every BidirectionalIterator is also a ForwardIterator, except you can also go backwards from a slot.  In Elixir/Erlang this would be like a zipper structure.

* RandomAccessIterator:  Every RandomAccessIterator is also a BidirectionalIterator except that you can jump to any slot in *constant* time via an index.  An Elixir equivalent would be like accessing a tuple.



C++ also has traits that define if something is a type of container, these are:

* Container:  The base Container trait (do note, these are not C++ classes, they are 'traits'; if something implements the required functions for its type then it is considered to fulfill a given 'trait'; there is no inheritance or anything of the sort).  This defines something that 'manages' something else.  It has a way to acquire a ForwardIterator (or anything else that implements a ForwardIterator, such as a BidirectionalIterator).  This is like an Elixir list or a map or really anything that holds something.

* ReversibleContainer: A ReversibleContainer must also be a Container.  It adds the constraint that acquiring an iterator must return an iterator that fulfills a BidirectionalIterator in capability.  It also adds the constraint that it must be possible to get an iterator that iterates over the elements in reverse order.  This is like an Elixir zipper.

* AllocatorAwareContainer:  An AllocatorAwareContainer must also be a Container.  It is a container that creates its own elements via an (either passed-in or internal) allocator function instead of the user supplying the element.  This is like an Elixir Stream.

* SequenceContainer:  A SequenceContainer must also be a Container.  It adds the constraint that the returned iterators must be in a linear arrangement, such that getting different iterators on the same container must always result in the same order and in constant iteration time (thus 'next' is constant time).  This is like iterating over an Elixir List but not a map.

* ContiguousContainer:  A ContiguousContainer must also be a Container.  It adds the constraint that the returned iterator must be of type RandomAccessIterator, thus allowing for constant time element access.  This is like an Elixir tuple.


A given iterator can fulfill multiple even disjoint iterator traits, like being both an input and an output iterator (in Elixir this would be like a File access interface).

A given container can fulfill multiple container traits, like being both a SequenceContainer and a ContiguousContainer, which in Elixir would be a tuple since it fulfills both requirements.


Just putting it out there that there are a lot of different kinds of Sequences, and even in C++ you cannot know if an iterator (or even a container) is not 'infinite'.  Like take stdin/stdout in C++, those are InputIterator and OutputIterator respectively, yet you do not know if either 'ends' until you next() on them and test (and 'next()ing' on the stdin InputIterator may 'wait' in time until more input comes in for example, and writing to an OutputIterator of stdout may 'wait' in time if the buffer is full and the receiving pipe has not processed the buffered data yet, which can also happen when sending/receiving messages on the BEAM as well).

OvermindDL1

Jul 25, 2017, 12:34:10 PM
to elixir-lang-core, christ...@gmail.com, jose....@plataformatec.com.br
Oh I forgot newer container types in C++:

* AssociativeContainer: An AssociativeContainer must also be a Container.  It allows for lookup based on some arbitrary 'key' (type is defined by the container) instead of an index.  The elements must be ordered so iterators to the same identical container must always result in the same order.  This has no constraints on lookup speed so it could be constant, or it could be linear or worse. This is like an Elixir Keyword list.

* UnorderedAssociativeContainer:  An UnorderedAssociativeContainer must also be a Container.  It allows for lookup based on some arbitrary 'key' (type is defined by the container) instead of an index.  The elements are not ordered, so iterators to the same identical container may result in different orders each time.  This has the constraint that lookup speed must be at worst linear, but it is preferred that the average case is either constant or logarithmic.  This is like an Elixir Map.

Burdock

Jul 25, 2017, 2:36:46 PM
to elixir-lang-core, christ...@gmail.com, jose....@plataformatec.com.br
In my humble opinion, changing the underlying spec too much is unnecessary. 

Enumerable.count already encompasses most of the functionality we are talking about here. 
Currently, if it returns {:ok, number} the Enumerable is finite. If it returns {:error, __MODULE__} the Enumerable may be finite, countably infinite, or uncountable.

Currently, Streams always return {:error, __MODULE__}. Something that could be done to improve Streams is to keep track of the length of their inputs.

For example: 
[1,2,3]
|> Stream.map(&(&1)) #Stream<[enum: [1, 2, 3], funs: [#Function<123/1 in Stream.map/2>], count: 3]>
|> Stream.concat([4,5,6]) #Stream<[funs: [#Function<456/2 in Stream.transform/3>], count: 6]>
|> Stream.concat( Stream.unfold(...) ) #Stream<[funs: [#Function<456/2 in Stream.transform/3>]]> (or with count = nil)

Enumerable.count would return the count on the Stream struct, or {:error, Enumerable.Stream}  
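
In protocol terms that would look roughly like the sketch below (the :count field is the hypothetical addition described above; %Stream{} does not carry it today):

defmodule StreamCountSketch do
  # Sketch only: return the tracked count when known, otherwise defer to the
  # existing linear, reduce/3-based counting.
  def count(%{__struct__: Stream, count: count}) when is_integer(count), do: {:ok, count}
  def count(%{__struct__: Stream}), do: {:error, Enumerable.Stream}
end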

Not only is this mostly backwards compatible, but it also makes Streams more informative to the users. 
Doing infinite/finite checks behind the scenes will make a lot more sense to users if that information is exposed and clearly articulated.

If this seems like a good solution, I will take a harder look at what changes would be needed to implement this.