On 03/29/2016 09:48 PM, Alex Miller wrote:
> Rather than starting with a solution, can we start with a problem that
> needs to be solved and consider options? Can you sketch the use case
> more fully? For the cases mentioned in the first post (web and db apis),
> those sound like cases where I would maybe look at loop/recur or a
> custom lazy seq to control iteration. 
With elasticsearch, the scroll api (like a search) returns a token that
is used to retrieve each page of results, and each page of results
returns a new token which will give you the next page. With the github
api, when paging through results, each page has a header that points to
the next page of results. Just yesterday in the slack channel I spoke
with someone who was pulling effectively a page at a time of db rows
back from a database, each page being the rows with ids between the
previous page's last id and that id plus whatever the page size is. If I
recall, the s3 (azure cloud storage too) listing api is very similar to
the db row fetching: you get back a page of results, and then in the
next api request you ask for the page beginning after the last item in
the previous page.
These all have a common structure of iterated api requests
(where each api request depends on the result of the previous one), with
some halting condition.
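
As a rough sketch of that common structure (not the code from the
gist): `unfold` below is a hypothetical lazy-seq helper, and
`fake-pages` / `fetch-page` are made-up stand-ins for a real paged api
where each response carries the token for the next request.

```clojure
(defn unfold
  "Hypothetical helper, not part of clojure.core. Calls (step state)
   repeatedly; step returns nil to halt, or an [item next-state] pair."
  [step state]
  (lazy-seq
    (when-let [[item next-state] (step state)]
      (cons item (unfold step next-state)))))

;; In-memory stand-in for a paged api (assumption, for illustration only).
(def fake-pages
  {:start {:items [1 2 3] :next "t1"}
   "t1"   {:items [4 5 6] :next "t2"}
   "t2"   {:items [7]     :next nil}})

(defn fetch-page [token] (get fake-pages token))

(defn all-pages
  "One element per page; halts when the previous page had no next token."
  []
  (unfold (fn [token]
            (when token
              (let [page (fetch-page token)]
                [(:items page) (:next page)])))
          :start))

(mapcat identity (all-pages)) ;=> (1 2 3 4 5 6 7)
```

The step function is the only part that knows about tokens and the
halting condition; a real client would swap `fetch-page` for an http or
db call.
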
This is certainly a solvable problem and has been solved over and over
again, just like every time someone needs to map a function over a
collection they could write a custom lazy-seq or use loop/recur. I've
implemented custom lazy-seqs for this stuff, I've implemented reducible
collections, and I've written both lazy-seq and reducible versions of
unfold in various projects to avoid repeating this stuff. Since it seems
(at least to me) to be something I see over and over again in apis, it
would be nice to encapsulate that in some way that didn't involve a
single-function library.
Right now, if I were tasked with, for example, writing an elasticsearch
clojure api (which I was several months ago, and I used unfold) or s3
bucket listing code, I wouldn't even start without writing unfold
first, and I would use that instead of reifying some reducible
interface / protocol or using lazy-seq directly. Which is kind of
annoying, because now, if someone asks "how would you write this code?"
my current answer involves a function that exists somewhere in my
muscle memory, which is not super helpful to them.
> Re side effects and iterate, it is intended to be a pure generator -
> there is no guarantee made on chunking (might work ahead) or even
> necessarily on whether f might be invoked multiple times (so stateful/io
> would be bad). The current implementation of iterate is *both* a lazy
> seq and reducible. Reducibles are processed eagerly (without caching)
> and separately from seqs so using it in both capacities may cause f to
> be invoked separately for each use. This was implemented in CLJ-1603 for
> Clojure 1.7 and there is a lot of history and work there.
While I certainly prefer unfold, I think it could be replaced with some
combination of iterate and take-while. But the restrictions on iterate
make it unsuitable for these kinds of iterated api calls. Maybe some
other iterate that doesn't require a pure function could solve that.
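
For comparison, a sketch of the iterate + take-while shape over an
in-memory stand-in (`demo-pages`, `lookup-page` and `next-page` are
made up for illustration). It is only reasonable here because
`lookup-page` is a pure map lookup; with a real api call in its place
it would run into the side-effect restriction described above.

```clojure
;; In-memory stand-in for a paged api (assumption, for illustration only).
(def demo-pages
  {:start {:items [:a :b] :next "t1"}
   "t1"   {:items [:c :d] :next "t2"}
   "t2"   {:items [:e]    :next nil}})

(defn lookup-page [token] (get demo-pages token))

(defn next-page
  "Returns the page after `page`, or nil when there is no next token."
  [page]
  (some-> (:next page) lookup-page))

(defn all-items []
  (->> (iterate next-page (lookup-page :start))
       (take-while some?)   ; the halting condition lives outside iterate
       (mapcat :items)))

(all-items) ;=> (:a :b :c :d :e)
```
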
> Re implementation, it is preferable to implement IReduceInit directly if
> you control the implementation rather than to plug into CollReduce. We
> might also want this to be seqable which gets into some of the same
> territory as CLJ-1603 but with some twists if you expect the function to
> potentially have side effects. For once-only sources, maybe you could
> skip the seq impl though. We've seen the question of once-only traversal
> of external apis come up several times and I think there is something
> potentially to add here, whether it's unfold or something else.
> 
My experience, which is very limited (and I am sure others see things
differently), is that I have never wanted the caching behavior of
lazy-seqs that were used to represent external resources like this.
In fact the behavior has been a significant source of pain (tracking
down issues with macros inadvertently holding on to the head of a seq),
because it was never reasonable to expect the elements to fit in memory
at once. So, as I use unfold for external resources, having it be
non-caching is ideal.
It also seems like caching behavior is recoverable (into [] ...) from
non-caching, but removing caching when caching is built in is trickier.
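
A minimal sketch of what a non-caching version could look like,
assuming it reifies clojure.lang.IReduceInit directly as suggested
above. `reducible-unfold` and `countdown-step` are hypothetical names,
and this is not the gist's implementation.

```clojure
(defn reducible-unfold
  "Hypothetical non-caching unfold. (step state) returns nil to halt or
   an [item next-state] pair. Every reduction re-runs step from the
   initial state; nothing is retained between reductions."
  [step state]
  (reify clojure.lang.IReduceInit
    (reduce [_ rf init]
      (loop [acc init, s state]
        (if-let [[item next-s] (step s)]
          (let [ret (rf acc item)]
            (if (reduced? ret)
              @ret                 ; honor early termination (e.g. take)
              (recur ret next-s)))
          acc)))))

(defn countdown-step [n]
  (when (pos? n) [n (dec n)]))

(def countdown (reducible-unfold countdown-step 3))

(reduce + 0 countdown)       ;=> 6
(into [] countdown)          ;=> [3 2 1], an in-memory (cached) copy
(into [] (take 2) countdown) ;=> [3 2], stops after two items
```

Each of the three calls walks the source again from the start, which is
the non-caching behavior; the `into []` result is the opt-in cache.
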
On one hand, as I said, for this use case I think a non-caching
reducible is a win technically; on the other hand there may be an
ergonomic argument to be made for supporting seqs. Getting code that
uses transducers, reducers, or other non-seq things that are processed
with reduce through code review can be challenging, and I've had at
least one job interview go south at least in part because I used
transducers in their coding challenge. It seems like people are much
more comfortable with seqs.
> On Tuesday, March 29, 2016 at 5:59:50 PM UTC-5, Kevin Downey wrote:
> 
>     I would say it is definitely similar. I have found `unfold` in various
>     incarnations to be nicer to use than take-while + iterate, and of
>     course `iterate`s docstring says 'f' must be free of side effects. I
>     am not 100% sure why iterate specifies that, if I had to guess it is
>     because some people are uncomfortable with mixing lazy seqs and
>     io[1]. I think a reduce based unfold side steps that (but my read on
>     that could be all wrong).
> 
>     1. https://stuartsierra.com/2015/08/25/clojure-donts-lazy-effects
> 
>     On 03/29/2016 03:46 PM, Howard Lewis Ship wrote:
>     > Sounds (just?) like clojure.core/iterate.
>     >
>     > On Tue, Mar 29, 2016 at 3:40 PM, Kevin Downey <red...@gmail.com>
>     >     <https://hackage.haskell.org/package/base-4.8.2.0/docs/Data-List.html#v:unfoldr>),
>     >     <https://github.com/amalloy/useful/blob/develop/src/flatland/useful/seq.clj#L128-L147>).
> 
>     >
>     >
>     >     I have found unfold to be very useful dealing with web apis and
>     >     database apis. unfold provides a nice way to turn any api that
>     >     requires a series of api calls into a series of api call results.
>     >
>     >     I have written an implementation of `unfold` using CollReduce
>     >     (https://gist.github.com/hiredman/4d8bf007ba7897f11594) but it
>     >     would likely be better to implement the new Reduce interfaces in
>     >     1.8. Alternatively, or maybe alongside that, a lazy-seq based
>     >     unfold might be useful.
>     >
>     >     Does this seem like something useful? Do you think a patch for
>     >     this would be well received? Should I open a jira issue?
>     >
>     >     --
>     >     And what is good, Phaedrus,
>     >     And what is not good—
>     >     Need we ask anyone to tell us these things?
>     >
>     >     --
>     >     You received this message because you are subscribed to the
>     >     Google Groups "Clojure Dev" group.
>     >     To unsubscribe from this group and stop receiving emails from it,
>     >     send an email to clojure-dev...@googlegroups.com.
>     >     To post to this group, send email to cloju...@googlegroups.com.
>     >     <https://groups.google.com/group/clojure-dev>.
>     >     <https://groups.google.com/d/optout>.
>     >
>     >
>     >
>     >
>     > --
>     > Howard M. Lewis Ship
>     >
>     > Senior Mobile Developer at Walmart Labs
>     >
>     > Creator of Apache Tapestry
>     >
>     > (971) 678-5210
>     > http://howardlewisship.com
>     > @hlship
>     >
> 
> 