Arrays as streams / consuming data with take et al


andrew cooke

Nov 8, 2015, 8:11:09 PM
to julia-users
I'd like to be able to use take() and all the other iterator tools with a stream of data backed by an array (or string).

By that I mean I'd like to be able to do something like:

> stream = XXX([1,2,3,4,5])
> collect(take(stream, 3))
[1,2,3]
> collect(take(stream, 2))
[4,5]

Is this possible?  I can find heavyweight looking streams for IO, and I can find lightweight iterables without state.  But I can't seem to find the particular mix described above.

(I think I can see how to write it myself; I'm asking if it already exists - seems like it should, but I can't find the right words to search for).

Thanks,
Andrew

Yichao Yu

Nov 8, 2015, 8:40:53 PM
to Julia Users
On Sun, Nov 8, 2015 at 8:11 PM, andrew cooke <and...@acooke.org> wrote:
> I'd like to be able to use take() and all the other iterator tools with a
> stream of data backed by an array (or string).
>
> By that I mean I'd like to be able to do something like:
>
>> stream = XXX([1,2,3,4,5])
>> collect(take(stream, 3))
> [1,2,3]
>> collect(take(stream, 2))
> [4,5]
>
> Is this possible? I can find heavyweight looking streams for IO, and I can
> find lightweight iterables without state. But I can't seem to find the
> particular mix described above.

Jeff's conclusion at JuliaCon was that it currently seems impossible to
implement this (a stateful iterator) in a generic and performant way, so
I doubt you will find it in a generic iterator library (one that works
on more than arrays). A version that works only on Arrays should be
simple enough to implement, and doesn't sound useful enough to be in an
exported API, so I guess you should probably just implement your own.

Ref https://groups.google.com/forum/?fromgroups=#!searchin/julia-users/iterator/julia-users/t4ZieI2_iwI/3NTw1k406qkJ
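
For readers on current Julia (1.0 or later): Base now ships `Iterators.Stateful`, which did not exist when this thread was written and appears to cover the behavior asked for above. A minimal sketch, assuming Julia 1.x:

```julia
# Wrap the array in a stateful iterator; consumption is remembered
# between calls instead of restarting from the first element.
stream = Iterators.Stateful([1, 2, 3, 4, 5])

first_three = collect(Iterators.take(stream, 3))   # consumes 1, 2, 3
next_two    = collect(Iterators.take(stream, 2))   # continues with 4, 5
```

The same wrapper works on strings and other iterables, not only arrays.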

andrew cooke

Nov 9, 2015, 4:24:14 AM
to julia-users
thanks!

andrew cooke

Nov 9, 2015, 8:04:02 AM
to julia-users

Yichao Yu

Nov 9, 2015, 9:20:47 AM
to Julia Users
On Mon, Nov 9, 2015 at 8:04 AM, andrew cooke <and...@acooke.org> wrote:
>
> https://github.com/andrewcooke/StatefulIterators.jl

FYI, one way to make this more efficient is to parametrize the
iterator. You could easily do this for Arrays. In the more general
case, you need type inference to get the type right for a
non-type-stable iterator (an iterator with a type-unstable index...),
but it's generally a bad idea to write code that calls type inference
directly.
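
To make "parametrize the iterator" concrete for the array-only case, here is a rough sketch. `ArrayStream` is a hypothetical name, not the type used in StatefulIterators.jl, and the code assumes the `iterate` protocol of current Julia (1.x) rather than the `start`/`next`/`done` functions used elsewhere in this thread:

```julia
# Hypothetical array-backed stream. The type parameter T records the
# element type, so the compiler knows what each iteration step yields.
mutable struct ArrayStream{T}
    data::Vector{T}
    pos::Int          # index of the next unread element
end

ArrayStream(data::Vector{T}) where {T} = ArrayStream{T}(data, 1)

# The position lives inside the stream itself, so the protocol-level
# state argument is ignored.
function Base.iterate(s::ArrayStream, _ = nothing)
    s.pos > length(s.data) && return nothing
    x = s.data[s.pos]
    s.pos += 1
    return (x, nothing)
end

Base.eltype(::Type{ArrayStream{T}}) where {T} = T
# The number of remaining elements changes as the stream is consumed,
# so no fixed length is advertised.
Base.IteratorSize(::Type{<:ArrayStream}) = Base.SizeUnknown()

stream = ArrayStream([1, 2, 3, 4, 5])
first_three = collect(Iterators.take(stream, 3))
next_two    = collect(Iterators.take(stream, 2))
```

Because `eltype` is derived from `T`, both collected results are arrays of `Int` rather than `Any`.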

andrew cooke

Nov 9, 2015, 12:11:55 PM
to julia-users

yes, i'm about to do it for arrays (i don't care about performance right now, but i want to implement read with type conversion and so need the types).

andrew cooke

Nov 9, 2015, 12:47:51 PM
to julia-users

hmmm.  maybe i'm doing it wrong as that only gives a factor of 2 speedup.

anyway, it's all i need for now, i may return to this later.

thanks again,
andrew

Dan

Nov 9, 2015, 3:07:52 PM
to julia-users
XXX in your question = chain.
Or more clearly:
julia> stream = chain([1,2,3,4,5])
Iterators.Chain(Any[[1,2,3,4,5]])

julia> collect(take(stream, 3))
3-element Array{Any,1}:
 1
 2
 3

Dan

Nov 9, 2015, 3:36:13 PM
to julia-users
ouch... my suggestion takes care of the first output, but the second output repeats the start of the sequence. but `chain` is a useful method to convert an array to an iterable. anyway, I've concocted a method to generate the desired behavior:

julia> function pull(itr,n)
       state = start(itr)
       for i=1:n state = next(itr,state)[2] ; end
       (take(itr,n),rest(itr,state))
       end
pull (generic function with 1 method)

julia> stream = 1:5
1:5

julia> head, tail = pull(stream,3)
(Base.Take{UnitRange{Int64}}(1:5,3),Base.Rest{UnitRange{Int64},Int64}(1:5,4))

julia> collect(head)
3-element Array{Int64,1}:
 1
 2
 3

julia> collect(tail)
2-element Array{Any,1}:
 4
 5

the idea is to use the defined `pull` function to generate the head and tail iterators. this must be so, since the state of the iterators after the first few elements must be remembered somewhere.

andrew cooke

Nov 9, 2015, 3:39:48 PM
to julia-users

oh that's interesting.  this is from https://github.com/JuliaLang/Iterators.jl i guess.

it doesn't support read though (which i didn't realise i needed when i first asked).

i'll add a warning to StatefulIterators pointing people to this.

thanks,
andrew

andrew cooke

Nov 9, 2015, 3:44:13 PM
to julia-users

oh, ok :o(

Dan

Nov 9, 2015, 3:45:12 PM
to julia-users
the earlier `pull` example traverses the beginning of the iterator twice... what one probably wants is:

julia> function pull(itr,n::Int)
       state = start(itr)
       head = eltype(itr)[]
       while n>0 && !done(itr,state)
           val,state = next(itr,state)
           push!(head,val)
           n-=1
       end
       (head,rest(itr,state))
       end
pull (generic function with 2 methods)

julia> head,tail = pull([1,2,3,4,5],3)
([1,2,3],Base.Rest{Array{Int64,1},Int64}([1,2,3,4,5],4))

julia> collect(tail)
2-element Array{Any,1}:
 4
 5


note the first call already pulls the first 3 elements and collects them into an array (one can't get to the next elements without first reading the head).
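
For anyone trying this on current Julia (1.x), where `start`/`next`/`done` were replaced by `iterate`, a rough equivalent of the single-pass `pull` might look like the sketch below. It assumes `Iterators.rest` from Base; returning an empty vector as the tail when the source runs out is an arbitrary choice made for this sketch:

```julia
# Single-pass `pull` for the `iterate` protocol: returns the first `n`
# elements as a vector plus a lazy iterator over whatever remains.
function pull(itr, n::Integer)
    head = eltype(itr)[]
    n <= 0 && return (head, itr)
    next = iterate(itr)
    while next !== nothing
        val, state = next
        push!(head, val)
        # Stop once the head is full; `state` marks where the tail begins.
        length(head) == n && return (head, Iterators.rest(itr, state))
        next = iterate(itr, state)
    end
    return (head, eltype(itr)[])   # source ran out before `n` elements
end

head, tail = pull([1, 2, 3, 4, 5], 3)
rest_of_stream = collect(tail)
```

As in the version above, the head is materialized eagerly while the tail stays lazy until it is collected.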

Dan

Nov 9, 2015, 3:58:39 PM
to julia-users
Hmmm... maybe there is an issue with the following:
  | | |_| | | | (_| |  |  Version 0.5.0-dev+1137 (2015-11-04 03:36 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit 95b7080 (5 days old master)
|__/                   |  x86_64-linux-gnu

julia> collect(1:3)
3-element Array{Int64,1}:
 1
 2
 3

julia> collect(rest(1:3,start(1:3)))
3-element Array{Any,1}:
 1
 2
 3


Shouldn't the type of both arrays be the same? (The latter still yields an Any array even when defined in a non-global context.)
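
One way a difference like this can arise, shown with two hypothetical wrapper types under current Julia (1.x): `collect` picks the element type of its result from `eltype`, and a wrapper that never defines `eltype` falls back to `Any`. This is only an illustration of the mechanism, not a diagnosis of the 0.5-dev build above:

```julia
# Two hypothetical wrappers that forward iteration to an inner iterator.
struct Untyped{I}
    itr::I
end

struct Typed{I}
    itr::I
end

Base.iterate(w::Untyped, state...) = iterate(w.itr, state...)
Base.iterate(w::Typed, state...) = iterate(w.itr, state...)
Base.IteratorSize(::Type{<:Untyped}) = Base.SizeUnknown()
Base.IteratorSize(::Type{<:Typed}) = Base.SizeUnknown()

# Only `Typed` declares what it yields; `Untyped` keeps the default `Any`.
Base.eltype(::Type{Typed{I}}) where {I} = eltype(I)

untyped = collect(Untyped(1:3))
typed   = collect(Typed(1:3))
```

Both results hold the same values, but only the second has a concrete element type.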

andrew cooke

Nov 9, 2015, 7:06:02 PM
to julia-users

yeah, that's the problem with types and iters.

this is why i had to add read() to StatefulIterators.jl

it seems to me that the problem is related to the lack of a typed generic container type.  but i guess it must be more complex than that.

andrew