Practically everything in the collection API should return streams. :-)
--
AFAIK, (c1, c2).zipped performs a lazy zip of c1, c2.
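For reference, a small sketch of the difference (Scala 2.12 syntax; in 2.13 `.zipped` was deprecated in favor of `lazyZip`): `zip` builds an intermediate collection of pairs, while `.zipped` fuses the pairing with the subsequent `map`.

```scala
object ZippedSketch extends App {
  val c1 = List(1, 2, 3)
  val c2 = List(10, 20, 30)

  // Strict: first allocates List((1,10), (2,20), (3,30)), then maps over it.
  val viaZip = c1.zip(c2).map { case (a, b) => a + b }

  // Fused: applies the function pairwise, with no intermediate list of pairs.
  val viaZipped = (c1, c2).zipped.map(_ + _)

  assert(viaZip == List(11, 22, 33))
  assert(viaZipped == viaZip)
}
```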
--
Not evaluating everything eagerly doesn't necessarily mean laziness in the sense you are using it here.
What I think we need is a clearly defined boundary between _no_ evaluation and evaluation, just like many other libraries out there.
--
Consider groupBy, for example. In the general case there is no way to implement it without traversing, and rebuilding, the entire collection. Likewise with sorted and partition.
So, if they're not to be minefields that trap the unwary coder, they need to force evaluation. But then you commit yourself to a large split between eager and non-eager methods, which can be tricky to keep straight.
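A minimal sketch of the problem (Scala 2.13 views; the side-effect counter is mine, for illustration only): even when groupBy is called on a lazy view, it must evaluate every element before it can return anything, because the last element might belong to any group.

```scala
object GroupByForcesEvaluation extends App {
  var evaluated = 0
  val view = (1 to 10).view.map { n => evaluated += 1; n }

  // Nothing has run yet: the view is lazy.
  assert(evaluated == 0)

  // groupBy cannot hand back any group until it has seen every
  // element, so it traverses the whole source immediately.
  val groups = view.groupBy(_ % 2)

  assert(evaluated == 10)
  assert(groups.keySet == Set(0, 1))
}
```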
--
If you're going to go all the way to Akka Streams where you build the processing graph in advance, then I agree.
--
If you have followed the various alterations of the Akka Streams syntax, and the existence of two layers of specification, and so on, I am surprised you say "it's not such a big deal". They're trying, and it's getting pretty good, but it's _hard_.
I think it's a great thing to have, but it looks like a big deal to me. And I think it's an open question whether the cognitive overhead is larger than that of eager computation with at-will assignment of intermediate results to vals.
Programmers need to know whether an operation is strict or not. It will help if the collection libraries provide some simple way to know, rather than deciding it differently for each operation on a case-by-case basis.
This may be surprising, but many good programmers choose strict as a default, and only use lazy operations if there is a specific reason for it. The reason, at least for me, is that I want to make changes to software and be able to predict the performance impact. For people who like that style, it is nicer if everything is strict unless the collection has been explicitly converted into a lazy view of some kind.
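That style can be sketched like this (standard Scala collections; the value names are mine): everything is strict unless you explicitly opt in with .view, and you opt back out with an explicit force.

```scala
object StrictByDefault extends App {
  val xs = List(1, 2, 3, 4, 5)

  // Strict by default: each stage builds a concrete List, so the
  // cost of each line is local and predictable.
  val strict = xs.map(_ * 2).filter(_ > 4)

  // Laziness only by explicit conversion: nothing runs until forced.
  val pipeline = xs.view.map(_ * 2).filter(_ > 4)
  val forced = pipeline.toList

  assert(strict == List(6, 8, 10))
  assert(forced == strict)
}
```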
I've written a lot of SQL and Datalog code, and I've worked a fair amount on Datalog engines. In my experience, heavy optimization across a whole query, or worse, across the whole query library, makes performance very mysterious. You make a small change, and performance suddenly tanks. To recover it, you either change things around randomly, or you use tools provided by the system to understand why it made the optimization decisions it did. Either way, programmers end up spending a lot of time studying what the optimizer did.
If you decide as a tool provider to use non-local optimizations, or even just laziness, it is a huge help to developers if you also provide a debugging tool for understanding the optimization decisions. In the Google Web Toolkit, we sensed a big increase in user happiness when we added a "Story of Your Compile" for debugging bad optimization of two subsystems: the code splitter, and the serialization system.
For lazy zip, I'm not sure what such a tool would look like; perhaps a dynamic trace of some kind. For staged queries, a branched-off subtopic of this thread, it really helps to have a way to print out the query plan.
Good luck, those working on it.
Lex Spoon