Dos and don'ts regarding context and thread safety

Steve Hummingbird

Feb 24, 2018, 12:30:32 PM
to vert.x
Quite by accident I came across the vertx-completable-future library, which raised concerns for the first time, since we are using Java's plain CompletableFuture to avoid callback hell, as is recommended for Vert.x in several places on the web (for instance). I have read almost everything I could find on the subject; the documentation in particular, however, does not say much about what is safe to do and what is not.
At that point things would be simple, as we would only need to replace all CompletableFutures with VertxCompletableFutures. However, during our research we found this post, which suggests that there are potential issues with VertxCompletableFuture that could cause the event loop to block, which we are trying to avoid at all costs. (I am not sure, though, whether I have fully understood that concern.)

Our situation is as follows:
- We are using services which return a CompletableFuture; these are then chained via `thenAccept()`. None of the *async methods are used.
- We also retrieve data from MongoDB using the MongoDB async driver. We essentially wrap its callback in a method that returns a CompletableFuture (still no *async methods used).
- As I currently understand it, the Mongo driver uses a thread from its connection pool to invoke the callback, which then calls the `complete` method of the CompletableFuture.
- After that call to the database we usually don't modify any verticle state; in the end we usually just put a message on the event bus.
- As Mongo's connection pool is used to invoke the callback, we will most likely end up on a different thread and should probably wrap everything we do in `runOnContext`.

Currently we have not noticed any issues regarding what we are doing, however we are afraid that this might just be a matter of time. When trying to figure out what to do we came across a few questions:

- is using CompletableFuture without using the *async methods and wrapping callbacks in `runOnContext` always safe?
- does vertx provide a context aware way to chain multiple methods? Is the vertx Future context aware (I currently don't think so, so still `runOnContext` should be needed)? 
- are there any recommendations or preferences regarding vert.x when it comes to reactive stream support (rxjava2 vs reactive stream)?

Julien Viet

Feb 25, 2018, 12:43:06 PM
to ve...@googlegroups.com

On 24 Feb 2018, at 18:30, Steve Hummingbird <Steve.Hu...@yandex.com> wrote:

- as mongo's connection pool is used to call the callback, we most likely will end up on a different thread and probably should wrap everything we do in `runOnContext`

I remember that was discussed on the github issues (for 3.5.1) and I recommended you should do that, because that's what you expect from a Vert.x-y API.


Currently we have not noticed any issues regarding what we are doing, however we are afraid that this might just be a matter of time. When trying to figure out what to do we came across a few questions:

- is using CompletableFuture without using the *async methods and wrapping callbacks in `runOnContext` always safe?

can you give an example ?

- does vertx provide a context aware way to chain multiple methods? Is the vertx Future context aware (I currently don't think so, so still `runOnContext` should be needed)? 

The Vert.x Future does not perform trampolining, so your handler simply runs wherever the future happens to be completed, which is fine. If you need trampolining, you need to use runOnContext.
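To make the "no trampolining" point concrete, here is a minimal sketch that uses only the JDK's CompletableFuture, not the Vert.x Future. It shows that a non-async stage such as thenAccept runs on the calling thread if the future is already complete, and otherwise on whichever thread completes it. The class name DependentThreadSketch and the thread name driver-thread are made up for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

public class DependentThreadSketch {

    // Case 1: the future is already complete when thenAccept is called,
    // so the dependent action runs immediately on the calling thread.
    static String alreadyCompleted() {
        AtomicReference<String> ranOn = new AtomicReference<>();
        CompletableFuture<String> done = CompletableFuture.completedFuture("doc");
        done.thenAccept(doc -> ranOn.set(Thread.currentThread().getName()));
        return ranOn.get();
    }

    // Case 2: the dependent action is attached first and another thread
    // completes the future later, so the action runs on that other thread.
    static String completedLater() {
        AtomicReference<String> ranOn = new AtomicReference<>();
        CompletableFuture<String> pending = new CompletableFuture<>();
        pending.thenAccept(doc -> ranOn.set(Thread.currentThread().getName()));

        // "driver-thread" is a made-up name standing in for a driver pool thread.
        Thread completer = new Thread(() -> pending.complete("doc"), "driver-thread");
        completer.start();
        try {
            completer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ranOn.get();
    }

    public static void main(String[] args) {
        System.out.println("already completed: ran on " + alreadyCompleted());
        System.out.println("completed later:   ran on " + completedLater());
    }
}
```

Nothing in either case moves the work back to the thread that created the future; that hop has to be requested explicitly.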

- are there any recommendations or preferences regarding vert.x when it comes to reactive stream support (rxjava2 vs reactive stream)?

I recommend using rxjava2 because of the nice API.

side note: we are planning to bring CompletionStage to Vert.x in 2018 (we still have to figure out the details) and we will take care of properly defining the semantics of the type.



--
You received this message because you are subscribed to the Google Groups "vert.x" group.
To unsubscribe from this group and stop receiving emails from it, send an email to vertx+un...@googlegroups.com.
Visit this group at https://groups.google.com/group/vertx.
To view this discussion on the web, visit https://groups.google.com/d/msgid/vertx/7f1382c2-870e-43f3-8400-6d37b4ba2921%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Steve Hummingbird

Feb 25, 2018, 1:28:00 PM
to vert.x


On Sunday, February 25, 2018 at 6:43:06 PM UTC+1, Julien Viet wrote:
On 24 Feb 2018, at 18:30, Steve Hummingbird <Steve.Hu...@yandex.com> wrote:

- as mongo's connection pool is used to call the callback, we most likely will end up on a different thread and probably should wrap everything we do in `runOnContext`

I remember that was discussed on the github issues (for 3.5.1) and I recommended you should do that, because that's what you expect from a Vert.x-y API.

 
Thanks for the reply. Yes, we were talking briefly about that regarding the vertx-mongo-streams library. But that is not the actual issue, as this is a quick thing to do. However, I am currently figuring out what to do with a rather large codebase, which makes use of hundreds of CompletableFutures (which were all assumed to be safe), so migrating that code might not be something that is done that quickly.
 
Currently we have not noticed any issues regarding what we are doing, however we are afraid that this might just be a matter of time. When trying to figure out what to do we came across a few questions:

- is using CompletableFuture without using the *async methods and wrapping callbacks in `runOnContext` always safe?

can you give an example ?


This is a method we use to access the db using the async mongo driver:

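// Wraps the async MongoDB driver's callback API in a CompletableFuture.
// The future is completed from inside the driver's callback, i.e. on
// whichever thread the driver uses to invoke it.
public CompletableFuture<T> findById(ObjectId id) {
    CompletableFuture<T> completableFuture = new CompletableFuture<>();

    collection.find(new Document("_id", id)).first((T document, Throwable t) -> {
        if (t != null) {
            // Propagate driver errors to whoever chained on the future.
            completableFuture.completeExceptionally(t);
        } else {
            completableFuture.complete(document);
        }
    });

    return completableFuture;
}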


Services are usually written in Groovy and we access that code the following way (things of course are more complex as we chain services using CompletableFuture in the same way and there are more than two layers). But it all is done in exactly the same manner.

repository.findById(id).thenAccept({ User user -> 
   eb.send(socket.writeHandlerID(), new Response(user));
})
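The hop back onto the event loop discussed in this thread could be sketched roughly as follows. This is a JDK-only simulation under loud assumptions: two single-threaded executors stand in for the Vert.x event loop and the driver's thread pool, findById here fakes the database call, and the latch exists only to make the demo deterministic. In a real verticle the hop would go through context.runOnContext rather than an Executor.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ContextHopSketch {

    // Stand-in for an event loop or a driver pool: one thread with a fixed name.
    static ExecutorService namedSingleThread(String name) {
        return Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, name);
            t.setDaemon(true);
            return t;
        });
    }

    // Fake async lookup whose "callback" fires on the driver thread.
    // If hopTo is non-null, the future is completed on that executor instead.
    static CompletableFuture<String> findById(String id, Executor driver,
                                              Executor hopTo, CountDownLatch registered) {
        CompletableFuture<String> result = new CompletableFuture<>();
        driver.execute(() -> {
            try {
                registered.await(); // demo only: wait until thenAccept is attached
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                result.completeExceptionally(e);
                return;
            }
            String document = "user-" + id;
            if (hopTo == null) {
                result.complete(document);                      // completes on the driver thread
            } else {
                hopTo.execute(() -> result.complete(document)); // completes on the "event loop"
            }
        });
        return result;
    }

    // Returns the name of the thread on which the thenAccept callback ran.
    static String callbackThread(boolean hop) {
        ExecutorService eventLoop = namedSingleThread("event-loop");
        ExecutorService driver = namedSingleThread("driver-pool");
        try {
            CountDownLatch registered = new CountDownLatch(1);
            CompletableFuture<String> seenOn = new CompletableFuture<>();
            findById("42", driver, hop ? eventLoop : null, registered)
                .thenAccept(doc -> seenOn.complete(Thread.currentThread().getName()));
            registered.countDown();
            return seenOn.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            eventLoop.shutdown();
            driver.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("without hop: " + callbackThread(false)); // driver-pool
        System.out.println("with hop:    " + callbackThread(true));  // event-loop
    }
}
```

Doing the hop once, inside the repository method, means the chained thenAccept stages in the services would not each need their own wrapping.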



- does vertx provide a context aware way to chain multiple methods? Is the vertx Future context aware (I currently don't think so, so still `runOnContext` should be needed)? 

the Vert.x future does not perform trampolining, so you are left with what your callback has which is fine, if you need trampolining you need to use runOnContext

Currently we do not make use of trampolining, so we should be fine if we just replace the CompletableFutures with the Vert.x Future?

- are there any recommendations or preferences regarding vert.x when it comes to reactive stream support (rxjava2 vs reactive stream)?

I recommend using rxjava2 because of the nice API.

side note: we are planning to bring CompletionStage to Vert.x in 2018 (we still have to figure out the details) and we will take care of properly defining the semantics of the type.

Nice. Feel free to ping me if there is something I can help with.

Steve Hummingbird

Mar 17, 2018, 11:38:43 AM
to vert.x
Maybe I should clarify the important questions a little, as the documentation seems quite sparse on that topic.

Most importantly, in which cases does it (potentially) lead to issues when code is executed on a different Vert.x context and/or thread? Is it safe to run code on a different thread or context when no Vert.x functions are called? For instance, a message is received via WebSocket and only multiple requests to the db are made (which might lead to callbacks being executed on different threads or contexts), but no response is sent and no Vert.x functions are called. Or is it only problematic when code is run on the same thread but on a different context and data from the context is accessed, e.g. via get or put?

Vert.x provides two different methods to run code on a Vert.x context:

Vertx.runOnContext: "Puts the handler on the event queue for the current context so it will be run asynchronously ASAP after all preceding events have been handled."
Unless the current context is also the original context, this will most likely run the code on a different context.

context.runOnContext: "Run the specified action asynchronously on the same context, some time after the current execution has completed."
This ensures that the code is run on exactly the provided context.

The first option seems a little difficult to reason about if running on a different context has severe implications.

In the end I would like to put together an overview of the different ways to orchestrate async code and their implications with Vert.x. However, that does not seem too useful as long as the implications of using different threads and contexts remain this vague.



Julien Viet

Mar 17, 2018, 1:10:43 PM
to ve...@googlegroups.com

On 17 Mar 2018, at 16:38, Steve Hummingbird <Steve.Hu...@yandex.com> wrote:

Maybe I should clarify the important questions a little, as the documentation seems quite sparse on that topic

Most importantly, in which cases does it (potentially) lead to issues when code is executed on a different vertx context and/or thread? Is it save to run code on a different thread or context, when none of the vertx functions are executed?

It is not clear what you mean. If no Vert.x function is executed, then there is no problem, so I must be missing something.

For instance a message is received via websocket and there are only multiple requests to the db made (which might lead to callbacks being executed on a different threads / or contexts) - but no response is sent or any vertx functions called. Or is it only problematic when code is run on the same thread but on a different context and data from the context is accessed e.g. via get or put?

Vert.x provides two different methods to run code on a vert.x context:
Vertx.runOnContext Puts the handler on the event queue for the current context so it will be run asynchronously ASAP after all preceeding events have been handled
which will (unless the current context also is the original context) most likely run the code on a different context.

context.runOnContext Run the specified action asynchronously on the same context, some time after the current execution has completed.
which will ensure that code is run exactly the provided context.

The first option seems a little difficult to grasp in case running on different contexts would have severe implications.

Vertx#runOnContext should be avoided when you have access to the context in which the code is executing.
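One reading of this advice is: grab the context while you are still on it and call runOnContext on that reference later, instead of asking for "the current context" from a thread Vert.x does not manage. The sketch below simulates that with plain JDK classes. FakeContext and its CURRENT thread-local are hypothetical stand-ins invented for this example, not Vert.x API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class CaptureContextSketch {

    // Hypothetical marker for "the context this thread is currently running on".
    static final ThreadLocal<FakeContext> CURRENT = new ThreadLocal<>();

    // Hypothetical stand-in for a Vert.x context: a name plus one thread.
    static final class FakeContext {
        final String name;
        final ExecutorService loop;

        FakeContext(String name) {
            this.name = name;
            this.loop = Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r, name);
                t.setDaemon(true);
                return t;
            });
        }

        // Runs the action on this context's thread with CURRENT pointing at it.
        void runOnContext(Runnable action) {
            loop.execute(() -> {
                CURRENT.set(this);
                action.run();
            });
        }
    }

    // Returns { context seen on the driver thread, context seen after hopping back }.
    static String[] demo() {
        FakeContext verticle = new FakeContext("verticle-context");
        ExecutorService driver = Executors.newSingleThreadExecutor();
        CompletableFuture<String> onDriver = new CompletableFuture<>();
        CompletableFuture<String> afterHop = new CompletableFuture<>();
        try {
            verticle.runOnContext(() -> {
                // Capture the context while still running on it.
                FakeContext captured = CURRENT.get();
                driver.execute(() -> {
                    // On the driver thread there is no current context to look up.
                    FakeContext here = CURRENT.get();
                    onDriver.complete(here == null ? "none" : here.name);
                    // Use the captured reference to get back to where we started.
                    captured.runOnContext(() -> afterHop.complete(CURRENT.get().name));
                });
            });
            return new String[] {
                onDriver.get(5, TimeUnit.SECONDS),
                afterHop.get(5, TimeUnit.SECONDS)
            };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            verticle.loop.shutdown();
            driver.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] seen = demo();
        System.out.println("on driver thread: " + seen[0]); // none
        System.out.println("after hop:        " + seen[1]); // verticle-context
    }
}
```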



Steve Hummingbird

Mar 17, 2018, 3:02:51 PM
to vert.x


On Saturday, March 17, 2018 at 6:10:43 PM UTC+1, Julien Viet wrote:

It is not clear what you mean. If no Vert.x function is executed, then there is no problem, so I must be missing something.
 
I don't think you missed something. It's just that I have read everything I could find regarding that topic, and a lot of the answers have been quite vague, very generalised or sometimes even contradictory, which may have led to some "stupid" questions.

So, unless I missed something, the following should be the case:
- Vertx expects everything to be run on the same thread
- Therefore, if there is another thread pool (like the connection pool of a db driver, or the *async stuff in CompletableFuture), one needs to ensure that the callback is scheduled on the event queue (at least via Vertx.runOnContext). This is very similar to other frameworks that make use of an event loop. If the context is not specifically accessed in that case, using Vertx.runOnContext should be fine.
- In case the context is accessed (this should probably only affect the methods for accessing and modifying data on the context), one must make sure that the context is the correct one (if not, context.runOnContext must be used). Note that things like completableFuture.thenAccept or thenRun only preserve the thread but not the context, as described here.

Could you please confirm that these observations are correct?

So, for our case: we should be fine if a request comes in. This is orchestrated via a bunch of CompletableFuture.thenAccept calls, as shown earlier. If there is no db access, we should be able to safely call eventBus.send() without further scheduling via runOnContext. If a request to the db is made, which will most likely invoke its callback on a different thread, we must at least use Vertx.runOnContext in the callback (most likely as soon as possible). Then again we should be fine to send something over the event bus. And finally, if we want to access data from the context, context.runOnContext should be used, as the CompletableFutures do not preserve the context.

TLDR: if you stay on the same thread, you can do anything you want unless you want to access data from the context (in that case you should also stay on the same context).

Yes, that was helpful with some aspects; however, it still left open which case is actually OK in which scenario.

Julien Viet

Mar 19, 2018, 4:25:42 AM
to ve...@googlegroups.com
Here is the actual rule for a Vert.x API:

1/ The API should be used from the same context it was meant for (e.g. using the HTTP client inside a verticle), for best performance.
2/ The API also works when used from another thread, but this may lead to degraded performance, e.g. when publishing a WebSocket message or sending an HTTP response.


On 17 Mar 2018, at 20:02, Steve Hummingbird <Steve.Hu...@yandex.com> wrote:



So, unless I missed something, the following should be the case:
- Vertx expects everything to be run on the same thread

yes for best performance

- Therefore if there is another threadpool (like the connection pool of a db driver, or the *async stuff in CompletableFuture), one needs to assure that the callback is scheduled on the event queue (at least via Vertx.runOnContext). This is very similar to other frameworks that make use of an eventloop. If the context is not specifically accessed in that case, using Vertx.runOnContext should be fine.

it is the recommended behaviour but it's not required

- In case the context is accessed (this should probably only affect the methods for accessing and modifying data on the context), one must make sure that the context is the correct one (if not context.runOnContext must be used). To be noted is that things like completableFuture.thenAccept or thenRun etc do only preserve the thread but not the context as it is described here

the context can be used from other threads (i.e. it is thread-safe)



Steve Hummingbird

Mar 19, 2018, 10:29:07 AM
to vert.x
Thank you very much. I think that clears up a lot of things. We got the impression that calling code from the right context/thread is critical (in the sense that it could cause severe issues, possibly even corrupt data) from statements like this one on the Vert.x blog:

While getting back onto the correct context may not be critical if you have remained on the event loop thread throughout, it is critical if you are going to invoke subsequent vert.x handlers, update verticle state or anything similar, so it’s a sensible general approach.

We became quite alarmed when we found out that people had modified CompletableFuture, as it seemed to have caused quite a few issues regarding the context within their apps. But if this only has consequences for performance, that is a totally different story. I was actually wondering why there would be so little documentation on the topic if it were as critical as some people claim. But that makes sense now.