Why Future.map requires an execution context


Ezequiel Surijon

Apr 26, 2016, 5:19:55 PM
to scala-user
Futures are used, for example, to avoid blocking the current thread by letting the blocking code run on a different thread. When using an async framework like Play Framework, you are supposed to execute any blocking code inside a Future.

When you create a future like this:

val sayHello = Future {
  Thread.sleep(1000)
  "hello"
}

An implicit ExecutionContext is passed to:

Future.apply[T](body: ⇒ T)(implicit executor: ExecutionContext): Future[T]

And it makes sense, since you have to specify where your code block will be executed.

But I can't understand why mapping a future requires an execution context.

Future.map[S](f: (T) ⇒ S)(implicit executor: ExecutionContext): Future[S]

As far as I understand, the mapping function f will be executed once the future is resolved, i.e. after the blocking code has run. I think it should simply be executed at that point, and f is also supposed to be a non-blocking code block; if it isn't, you are better off using flatMap.

So my questions are

Why does Future.map require an execution context?

Is there a penalty for mapping a future several times? Is there a difference in performance or thread consumption in the code below?

val f : Future = ???
((f map f1) map f2) map f3

f map (f1 ∘ f2 ∘ f3)



Daniel Armak

Apr 26, 2016, 5:39:30 PM
to Ezequiel Surijon, scala-user
  • fut.map(g) requires an ExecutionContext for the same reason Future.apply does: to specify where g will run once fut completes.

Just because you have an existing Future, doesn’t mean you have the ExecutionContext where it executed; there may not even have been one to begin with. For instance, Future.successful(1) creates a completed Future with the value 1. (Advice: look at the definition of that method.) And a Promise creates a Future without executing any code.

When you map on such a future, where and when will your function run? You have to provide an ExecutionContext to answer that question.

  • Ideally you’re not supposed to ever run blocking code, in a Future or otherwise; you should use asynchronous code all the way down to IO. Running blocking code in a Future risks exhausting the ExecutionContext’s thread pool (or deadlocking, depending on why it’s blocking).

However, if you do have a blocking piece of code and you’re going to run it in a Future, then map isn’t any different from Future.apply.

  • There is indeed a difference in performance between (f map f1) map f2 and f map (f1*f2). How large it is depends on what you care about. It’s a very small cost in throughput, but it might be a large cost in latency, depending on how many Futures are waiting to be executed.

If f1 and f2 are both synchronous functions (i.e. they’re not returning a Future), there’s no reason to call map twice.
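
To make the first point concrete, here is a minimal sketch, assuming the standard scala.concurrent API. The object name NoContextDemo and the single-thread pool are illustrative choices, not something from this thread. Neither Future.successful nor Promise takes an ExecutionContext, so the resulting futures carry no information about where a mapped function should run:

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._

object NoContextDemo {
  def run(): (Int, Int) = {
    // Neither of these needs an ExecutionContext: no code is scheduled anywhere.
    val alreadyDone: Future[Int] = Future.successful(1)
    val promise: Promise[Int]    = Promise[Int]()
    val notYetDone: Future[Int]  = promise.future

    // Only when we attach functions with map do we have to say where they run.
    val pool = Executors.newSingleThreadExecutor()
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    try {
      val a = alreadyDone.map(_ + 1)
      val b = notYetDone.map(_ * 10)
      promise.success(4) // completed by whichever thread happens to call success
      (Await.result(a, 2.seconds), Await.result(b, 2.seconds))
    } finally pool.shutdown()
  }

  def main(args: Array[String]): Unit = println(run())
}
```

Here the only ExecutionContext in the program is the one handed to map; the two source futures were created without any.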

-- 
Daniel Armak


Viktor Klang

Apr 27, 2016, 6:03:44 AM
to Ezequiel Surijon, scala-user

Hi Ezequiel,

On Apr 26, 2016 23:19, "Ezequiel Surijon" <sur...@gmail.com> wrote:
> Why does Future.map require an execution context?

Because there is what's called a "race condition" between the future being completed and the mapping function being registered as a callback.

This means that there are 2 outcomes:

  • The thread which completes the future also is responsible for executing all existing callbacks.

  • The thread which adds the callback has to execute the callback since the future has already been completed.

This means that it is no longer possible to reason about what executes where and when. My dear, and former, colleague Havoc Pennington did an excellent writeup on this very problem here: http://blog.ometer.com/2011/07/24/callbacks-synchronous-and-asynchronous/
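
A small sketch of how the explicit ExecutionContext removes that ambiguity, assuming the standard library's default Future implementation. The thread name "map-pool" and the object name CallbackThreadDemo are made up for the illustration. Both orderings of the race are exercised, and in both the mapping function runs on the executor's thread rather than on the completing or registering thread:

```scala
import java.util.concurrent.{Executors, ThreadFactory}
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._

object CallbackThreadDemo {
  // Returns the names of the threads that ran each mapping function.
  def run(): (String, String) = {
    val factory = new ThreadFactory {
      def newThread(r: Runnable): Thread = {
        val t = new Thread(r, "map-pool") // hypothetical name, used to identify the pool
        t.setDaemon(true)
        t
      }
    }
    val pool = Executors.newSingleThreadExecutor(factory)
    val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    try {
      // Case 1: the future is already completed when map registers its callback.
      val completed = Future.successful(1)
      val first = completed.map(_ => Thread.currentThread().getName)(ec)

      // Case 2: the callback is registered first; the calling thread completes the future later.
      val promise = Promise[Int]()
      val second = promise.future.map(_ => Thread.currentThread().getName)(ec)
      promise.success(2)

      (Await.result(first, 2.seconds), Await.result(second, 2.seconds))
    } finally pool.shutdown()
  }

  def main(args: Array[String]): Unit = println(run())
}
```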


As an aside, Future.apply is only syntactic bacon over Future.successful(()).map(_ => block)
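
The equivalence can be sketched as follows. The helper name futureVia is hypothetical and only mirrors the expression above; it is not the library's source. It behaves like Future.apply in that the block runs on the ExecutionContext and a thrown exception ends up as a failed Future:

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.util.Try

object ApplySugarDemo {
  // Hypothetical helper with the same shape as Future.apply.
  def futureVia[T](block: => T)(implicit ec: ExecutionContext): Future[T] =
    Future.successful(()).map(_ => block)

  def run(): (String, Boolean) = {
    import ExecutionContext.Implicits.global
    val ok = futureVia { "hello" }
    val failed = futureVia[Int] { throw new IllegalStateException("boom") }

    val okValue = Await.result(ok, 2.seconds)
    // The exception is captured by map and surfaces when the result is awaited.
    val failedAsExpected = Try(Await.result(failed, 2.seconds)).isFailure
    (okValue, failedAsExpected)
  }

  def main(args: Array[String]): Unit = println(run())
}
```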


Thanks for raising these questions!


> Is there a penalty for mapping a future several times?

Yes. But it is a classic tradeoff in fairness and throughput.

> Is there a difference in performance or thread consumption in the code below?
>
> val f : Future = ???
> ((f map f1) map f2) map f3
>
> f map (f1 ∘ f2 ∘ f3)


Kevin Wright

Apr 27, 2016, 6:26:18 AM
to Viktor Klang, Ezequiel Surijon, scala-user
I just read “syntactic bacon”… and somehow the rest of the post didn’t seem important any more :)

Viktor Klang

Apr 27, 2016, 6:56:04 AM
to Kevin Wright, Ezequiel Surijon, scala-user
Future[Bacon]
--
Cheers,

Daniel Armak

Apr 27, 2016, 7:48:04 AM
to Viktor Klang, Ezequiel Surijon, scala-user
Pennington's post is absolutely right.

As an aside, Scala's Future.map (really Future.onComplete) doesn't specify whether the continuation may be executed synchronously if the first future is already completed when onComplete is called. (At least, its docs say both are allowed.) The (default) implementation always acts asynchronously, which is good, since lots of code incidentally relies on this!

On the other hand, in scala-async, await is guaranteed to be synchronous if the awaited future is already completed. (This is new behavior since some version about a year ago.) Or, at least, the implementation is; I don't know if the docs mention it. 

This new behavior of async/await is great for performance if you e.g. have long-lived Futures that are completed once and then awaited many times. But it also means a method returning a Future which is implemented with async/await may in fact run synchronously and end by returning a completed Future, which may not be what the caller wanted.

Daniel Armak

Ezequiel Surijon

Apr 27, 2016, 11:32:19 AM
to Daniel Armak, scala-user
Thanks Daniel, now it's much clearer to me.

Anyway, going forward: coming from the Java and MVC world, it's common practice to divide an app into layers, so you have specialized classes for each purpose: DAOs to hide storage complexity, services to encapsulate business logic, and controllers to handle and dispatch UI actions.

So I tried to bring the same design principles into a Play/Scala app. In contrast to Java, a Play app is asynchronous all the way down to IO (as you mentioned), so DAOs, services and controllers wrap their return types in a Future. This is how I ended up with code like this:

val f : Future = ???
((f map f1) map f2) map f3

Do you think this is good practice in Scala?
Is there any approach I can follow to keep the application layered and still run all the mapping functions together?



Regards and thanks,
Ezequiel Surijon.

Daniel Armak

Apr 27, 2016, 11:56:22 AM
to Ezequiel Surijon, scala-user

Using map many times has only two possible shortcomings: performance, and code readability.

Performance penalties, like always, should be demonstrated before being optimized: premature optimization is the root of all evil.

To improve readability, consider using scala-async. Some people also use for comprehensions (i.e. for (result <- future) yield ...).
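
As a rough sketch of the for-comprehension style in a layered app: all names here (User, UserDao, UserService) are hypothetical, and the DAO just fabricates a value where a real one would make a non-blocking storage call. The layers still return Futures, while the synchronous steps stay ordinary functions applied inside a single yield:

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object LayeredDemo {
  // All names below are hypothetical, for illustration only.
  final case class User(id: Int, name: String)

  class UserDao()(implicit ec: ExecutionContext) {
    // Stands in for a non-blocking storage call.
    def find(id: Int): Future[User] = Future(User(id, "ezequiel"))
  }

  class UserService(dao: UserDao)(implicit ec: ExecutionContext) {
    // Plain synchronous steps stay plain functions...
    private def capitalize(u: User): User = u.copy(name = u.name.capitalize)
    private def render(u: User): String = s"Hello, ${u.name}!"

    // ...and are applied in one yield, which desugars to a single map.
    def greeting(id: Int): Future[String] =
      for (user <- dao.find(id)) yield render(capitalize(user))
  }

  def run(): String = {
    import ExecutionContext.Implicits.global
    Await.result(new UserService(new UserDao()).greeting(7), 2.seconds)
  }

  def main(args: Array[String]): Unit = println(run())
}
```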

Of course, when composing synchronous functions (that don’t return Future), you should always compose them directly - it’s easier, faster, and more readable.
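
A minimal sketch of composing directly, with made-up functions standing in for f1, f2 and f3. Note that f1 andThen f2 applies f1 first, matching the order of the chained maps. Both forms produce the same value; the composed form calls map once instead of three times:

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ComposeDemo {
  // Hypothetical synchronous steps standing in for f1, f2 and f3.
  val f1: Int => Int = _ + 1
  val f2: Int => Int = _ * 2
  val f3: Int => String = n => s"result=$n"

  def run(): (String, String) = {
    import ExecutionContext.Implicits.global
    val f: Future[Int] = Future.successful(20)

    // Three maps: each one creates a new Future and hands a separate task to the ExecutionContext.
    val chained = f.map(f1).map(f2).map(f3)

    // One map over the composed function: a single task.
    val composed = f.map(f1 andThen f2 andThen f3)

    (Await.result(chained, 2.seconds), Await.result(composed, 2.seconds))
  }

  def main(args: Array[String]): Unit = println(run())
}
```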

Daniel Armak

Viktor Klang

Apr 27, 2016, 12:45:21 PM
to Daniel Armak, Ezequiel Surijon, scala-user
On Wed, Apr 27, 2016 at 5:55 PM, Daniel Armak <dana...@gmail.com> wrote:

> Using map many times has only two possible shortcomings: performance, and code readability.

Interestingly, going async is not about gaining performance, it is about gaining scalability.
(Parallelization is about gaining performance)



--
Cheers,

Kevin Wright

Apr 27, 2016, 12:46:48 PM
to Viktor Klang, Daniel Armak, Ezequiel Surijon, scala-user
Scalability and, I would argue, responsiveness.

Viktor Klang

Apr 27, 2016, 1:02:57 PM
to Kevin Wright, Daniel Armak, Ezequiel Surijon, scala-user
I guess that depends on how you look at it. I think non-blocking is more about responsiveness from a liveness PoV.
But from a responsiveness PoV *under load*, async helps a lot (because of the scalability PoV), although it is not a panacea due to queueing theory constraints.
--
Cheers,

Daniel Armak

Apr 27, 2016, 2:20:52 PM
to Viktor Klang, Ezequiel Surijon, scala-user
Performance is a nebulous term, and different systems have different tradeoffs. I'd say, however, that all systems benefit from making all IO (network, files, and ultimately other system signals and callbacks) asynchronous. In most systems, there's enough IO to 'infect' anything stateful with asynchronous types.



Daniel Armak

Bardur Arantsson

Apr 27, 2016, 2:40:56 PM
to scala...@googlegroups.com
On 04/27/2016 08:20 PM, Daniel Armak wrote:
> Performance is a nebulous term, and different systems have different
> tradeoffs. I'd say, however, that all systems benefit from making all IO
> (network, files, and ultimately other system signals and callbacks)
> asynchronous.

That's not true in general. Throughput can suffer from asynchronous I/O.
(Obviously, we're talking at the *extremes* here, it's not something
you're going to notice under light load.)

Regards,
