Future.sequence equivalent in Akka Streams

osa...@gmail.com

Nov 8, 2015, 1:26:56 PM
to Akka User List
Hello,

I've been working on a way to turn a Seq[Source[T]] into Source[Seq[T]] using Akka Streams 2.0. I believe I've succeeded, but I'd like people to critique my code. Thanks in advance.

import akka.stream.{Attributes, Inlet, Outlet, SourceShape, UniformFanInShape}
import akka.stream.scaladsl.{FlowGraph, Source}
import akka.stream.stage.{GraphStage, GraphStageLogic, OutHandler}
import scala.collection.{immutable, mutable}

object StreamUtils {

  def sequencedSource[T](sources: immutable.Seq[Source[T, Unit]]): Source[immutable.Seq[T], Unit] =
    Source.fromGraph(FlowGraph.create() { implicit builder =>
      import FlowGraph.Implicits._

      val sequenceGraphStage = builder.add(new SequenceGraphStage[T](sources.size))
      for ((source, index) <- sources.zipWithIndex) source ~> sequenceGraphStage.in(index)

      SourceShape(sequenceGraphStage.out)
    })

}

class SequenceGraphStage[T](numInputs: Int) extends GraphStage[UniformFanInShape[T, immutable.Seq[T]]] {

  require(numInputs > 0, "1 or more inputs required")

  val inputs = immutable.IndexedSeq.tabulate(numInputs)(num => Inlet[T]("Sequence.in" + num))
  val output = Outlet[immutable.Seq[T]]("Sequence.out")

  override val shape = UniformFanInShape(output, inputs: _*)

  override def createLogic(inheritedAttributes: Attributes): GraphStageLogic = new GraphStageLogic(shape) {
    // Elements collected so far for the sequence currently being assembled.
    val elements = mutable.Buffer.empty[T]

    setHandler(output, new OutHandler {
      // On downstream demand, request one element from every input.
      override def onPull(): Unit =
        inputs.foreach { input =>
          read(input) { element =>
            // Appended in arrival order, which may differ from input order.
            elements += element

            // Emit once every input has contributed an element.
            if (elements.size == inputs.size) {
              push(output, elements.to[immutable.Seq])
              elements.clear()
            }
          }
        }
    })

    // Complete the stage as soon as any input completes.
    inputs.foreach(setHandler(_, eagerTerminateInput))
  }

  override val toString = "Sequence"

}

BTW, I did something similar in Streams 1.0 using FlexiMerge; the API and boilerplate necessary were somewhat off-putting. I greatly appreciate the cleanup in 2.0 so far.

Giovanni Alberto Caporaletti

Nov 8, 2015, 2:31:28 PM
to Akka User List
Why not just something like the following?

val seqOfSources: Seq[Source[Int, Unit]] = Seq(List(1,2), List(3,4), List(5,6)).map(Source(_))

val sourceOfSeqs: Source[Seq[Int], Unit] = Source(seqOfSources.toList).flatMapConcat(identity).grouped(100)
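
For readers comparing the two approaches: a rough plain-collections analogue of the pipeline above, with no Akka involved, shows what shape of result it produces. This is only a sketch; `ConcatThenGroup` and `concatThenGroup` are hypothetical names, and `List` stands in for `Source`.

```scala
object ConcatThenGroup {
  // Stand-in for flatMapConcat(identity).grouped(n):
  // concatenate the inputs end to end, then cut the result into chunks of at most n.
  def concatThenGroup[T](inputs: List[List[T]], n: Int): List[List[T]] =
    inputs.flatten.grouped(n).toList

  def main(args: Array[String]): Unit = {
    // With the three two-element inputs from the post and a group size of 100,
    // everything lands in a single group.
    println(concatThenGroup(List(List(1, 2), List(3, 4), List(5, 6)), 100))
    // prints List(List(1, 2, 3, 4, 5, 6))
  }
}
```

The inputs are consumed one after another, so elements at the same position in different inputs are not paired up.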

osa...@gmail.com

Nov 8, 2015, 3:24:46 PM
to Akka User List
I wasn't very clear in my original requirements. In the resulting Source, I want its 1st element (i.e. its 1st sequence) to contain all the 1st elements of the input sources; its 2nd sequence to contain all the 2nd elements of the input sources; and so on until the shortest input source ends.

I can't recall whether this functionality has an analogue in the standard library, or whether I'm mislabeling it.

I hope this clarifies what I'm talking about.
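
The requirement described above is essentially an N-way zip. A minimal sketch of those semantics on plain Scala iterators follows, assuming `Iterator` as a stand-in for `Source`; `ZipAll` and `zipN` are hypothetical names chosen for illustration. As far as I know, newer Akka Streams releases expose the same shape as `Source.zipN`, but the sketch below does not depend on Akka.

```scala
object ZipAll {
  // Emits one List per "row": the n-th List holds the n-th element of every input.
  // Stops as soon as any input is exhausted, so the shortest input decides the length.
  def zipN[T](inputs: List[Iterator[T]]): Iterator[List[T]] =
    new Iterator[List[T]] {
      def hasNext: Boolean = inputs.nonEmpty && inputs.forall(_.hasNext)
      def next(): List[T] = inputs.map(_.next())
    }

  def main(args: Array[String]): Unit = {
    val rows = zipN(List(Iterator(1, 2), Iterator(3, 4), Iterator(5, 6, 7)))
    println(rows.toList)
    // prints List(List(1, 3, 5), List(2, 4, 6))
  }
}
```

Only one row is held at a time, and the extra `7` in the third input is never emitted.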

Viktor Klang

Nov 8, 2015, 3:27:35 PM
to Akka User List
That sounds like a feature of questionable value; it's very easy to arrive at OOMEs with such a method.

--
Cheers,

osa...@gmail.com

Nov 8, 2015, 4:57:18 PM
to Akka User List
> That sounds like a feature of questionable value
I have multiple sources of time series data. Each source covers the same time range, and I'd like to aggregate them into one time series. So I use this method, then map each sequence of the resulting source to the sum or average of its points.

I'm genuinely curious if there's a better way to do it.


> it's very easy to arrive at OOMEs with such a method.
How so?

Viktor Klang

Nov 8, 2015, 5:15:04 PM
to Akka User List
On Sun, Nov 8, 2015 at 10:57 PM, <osa...@gmail.com> wrote:
> > That sounds like a feature of questionable value
> I have multiple sources of time series data. Each source covers the same time range, and I'd like to aggregate them into one time series. So I use this method, then map each sequence of the resulting source to the sum or average of its points.
>
> I'm genuinely curious if there's a better way to do it.

Why would you need to turn them into Seqs for that?

> > it's very easy to arrive at OOMEs with such a method.
> How so?

Because Sources are not bounded in nature.




osa...@gmail.com

Nov 8, 2015, 5:38:03 PM
to Akka User List
> Why would you need to turn them into Seqs for that?
Because... that's just the way I did it, and it seems to be working. I find it useful because it meets my business requirement. I'm not saying it's the answer; I'm just seeking a code critique and advice from any of you much smarter people who have a little time to spare :)

This is basically what I do:

type TimeSeriesPoint = (DateTime, Int)

val timeSeriesSources: List[Source[TimeSeriesPoint, Unit]] = ???

def total(points: immutable.Seq[TimeSeriesPoint]): TimeSeriesPoint =
  points.reduce((point1, point2) => (point1._1, point1._2 + point2._2))

StreamUtils.sequencedSource(timeSeriesSources).map(total)
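
To make the intended result concrete, here is a small worked sketch of the same aggregation on plain lists. It rests on assumptions: a `Long` timestamp stands in for `DateTime`, `List` stands in for `Source`, and `TimeSeriesTotal` and `totals` are hypothetical names; `total` mirrors the function above.

```scala
object TimeSeriesTotal {
  // Assumption: a Long timestamp stands in for the DateTime used in the post.
  type Point = (Long, Int)

  // Same reduce as `total` above: keep the first timestamp, add up the values.
  def total(points: List[Point]): Point =
    points.reduce((p1, p2) => (p1._1, p1._2 + p2._2))

  // Row-wise totals over series that are already aligned by position;
  // the result is as long as the shortest series.
  def totals(series: List[List[Point]]): List[Point] =
    if (series.isEmpty) Nil
    else {
      val rows = series.map(_.length).min
      (0 until rows).toList.map(i => total(series.map(_(i))))
    }

  def main(args: Array[String]): Unit = {
    val a = List((1L, 10), (2L, 20))
    val b = List((1L, 1), (2L, 2), (3L, 3))
    println(totals(List(a, b)))
    // prints List((1,11), (2,22))
  }
}
```

Note that this pairs points by position only; it does not check that the timestamps in a row actually match.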

> Because Sources are not bounded in nature.
Ah, yes. Good point. I only use this on Sources of time series data we've queried from the database. So we're only using it on bounded sources by convention; is there a type that represents a bounded source? That might be useful.

Viktor Klang

Nov 8, 2015, 5:45:05 PM
to Akka User List
On Sun, Nov 8, 2015 at 11:38 PM, <osa...@gmail.com> wrote:
> > Why would you need to turn them into Seqs for that?
> Because... that's just the way I did it, and it seems to be working. I find it useful because it meets my business requirement. I'm not saying it's the answer; I'm just seeking a code critique and advice from any of you much smarter people who have a little time to spare :)

:-)

> This is basically what I do:
>
> type TimeSeriesPoint = (DateTime, Int)
>
> val timeSeriesSources: List[Source[TimeSeriesPoint, Unit]] = ???
>
> def total(points: immutable.Seq[TimeSeriesPoint]): TimeSeriesPoint =
>   points.reduce((point1, point2) => (point1._1, point1._2 + point2._2))
>
> StreamUtils.sequencedSource(timeSeriesSources).map(total)

Source.reduce?

> > Because Sources are not bounded in nature.
> Ah, yes. Good point. I only use this on Sources of time series data we've queried from the database. So we're only using it on bounded sources by convention; is there a type that represents a bounded source? That might be useful.

I'm not convinced it would be feasible to encode; just imagine all the permutations of all transformations. Is a bounded source a subtype of an unbounded one, or vice versa?




osa...@gmail.com

Nov 9, 2015, 10:02:03 PM
to Akka User List
> Source.reduce?
I'm not sure I follow.

I want to output a stream in the end. I want the 1st element of that output stream to be the sum of the 1st elements of the input streams. I want the Nth element of the output stream to be the sum of the Nth elements of the input streams. I want the output stream to end when the shortest of the input streams is out of elements.

That's why I subsequently map a stream of (n-length) sequences into a stream of the sums of those sequences. Specifically, I'm creating a total time series stream from a bunch of time series streams over the same time range, just adding all the points at a specific time together. I was wondering if my usage of GraphStage was the best way to generate the stream of sequences that I subsequently call 'map(_.reduce)' on.

I'm just learning the Streams 2.0 API and I'm not sure how "Source.reduce" would work in this situation. Sorry if I'm not being clear enough.
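
One way to get the element-wise sum without building intermediate sequences, which may or may not be what the `Source.reduce?` hint was pointing at, is to reduce over the collection of inputs and zip them two at a time. The sketch below shows the idea on plain iterators. It is an assumption-laden illustration: `Iterator` stands in for `Source`, a `Long` timestamp stands in for `DateTime`, and `PairwiseSum`, `add`, and `sumAll` are hypothetical names. In Akka Streams the pairing step would presumably be `zipWith` on `Source`.

```scala
object PairwiseSum {
  // Assumption: a Long timestamp stands in for the DateTime used in the thread.
  type Point = (Long, Int)

  // Combine two points taken from the same position: keep the left timestamp, add the values.
  def add(a: Point, b: Point): Point = (a._1, a._2 + b._2)

  // Reduce over the collection of inputs (not over the elements of one input),
  // zipping two inputs at a time. Iterator.zip stops at the shorter side,
  // so the result ends with the shortest input. Requires at least one input.
  def sumAll(inputs: List[Iterator[Point]]): Iterator[Point] =
    inputs.reduce { (left: Iterator[Point], right: Iterator[Point]) =>
      left.zip(right).map { case (a, b) => add(a, b) }
    }

  def main(args: Array[String]): Unit = {
    val summed = sumAll(List(
      Iterator((1L, 1), (2L, 2)),
      Iterator((1L, 10), (2L, 20), (3L, 30)),
      Iterator((1L, 100), (2L, 200))
    ))
    println(summed.toList)
    // prints List((1,111), (2,222))
  }
}
```

The trade-off is that this only fits combinations that can be applied pairwise, such as a sum; an average would need to carry a count alongside the running total.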