anyone using capped collections with tailable and awaitData?


br...@interpayments.com

unread,
Oct 8, 2020, 5:47:53 PM10/8/20
to ReactiveMongo - http://reactivemongo.org
I have implemented some server-sent-event/EventSource asynchronous messages (to my browser-based React app) using an endpoint that ultimately queries a capped collection and returns an akka Source of that collection's documents.

Things *appear* to work, however my mongod starts spewing out thousands and thousands of log messages like those below, and it seems to be directly related to my ReactiveMongo tailable and awaitData use of the collection.

.
.
.
2020-10-08T15:33:52.101-0600 I QUERY [clientcursormon] Cursor id 1925582963770810175 timed out, idle since 2020-10-08T15:23:50.359-0600
2020-10-08T15:33:52.101-0600 I QUERY [clientcursormon] Cursor id 1915815380401102783 timed out, idle since 2020-10-08T15:23:48.755-0600
2020-10-08T15:33:52.101-0600 I QUERY [clientcursormon] Cursor id 3752763394119358863 timed out, idle since 2020-10-08T15:23:49.012-0600
2020-10-08T15:33:52.101-0600 I QUERY [clientcursormon] Cursor id 3582174053519504015 timed out, idle since 2020-10-08T15:23:48.479-0600
.
.
.

Below is some pseudo code (with some implementation omitted that I hope doesn't skew the reading of it), and I'm hoping someone has some insight or tips on how they have implemented this.

def watchNotifications(id: String)(implicit m: Materializer): Source[Event, NotUsed] = {
  Source.future(bcollection.map(collection =>
    collection
      .find($doc(
        "channelId" -> id,
        "createdAt" -> $gte(OffsetDateTime.now())
      ), Option.empty)
      .tailable   // keep the cursor open on the capped collection
      .awaitData  // let the server wait for new data instead of immediately returning empty batches
      .cursor[Notification]()
      .documentSource()
  ))
  .flatMapConcat(identity)
  .map(n => Event(Json.asciiStringify(Json.toJson(n)), Some(DbUtil.next), n.messageType))
}


In my controller, I return an Ok.chunked response with a 20-second keep-alive tick.

def sse = AnyUserAction { implicit request =>
  val stream = Source.single(Comment)
    .concat(notificatonRepo.watchNotifications(request.caller.id))
    .flatMapConcat(e => Source(List(e, Comment)))
    .keepAlive(20.seconds, () => Comment)

  Ok.chunked(stream.via(StreamUtil.chunkEvents())).as(ContentTypes.EVENT_STREAM)
}

Cédric Chantepie

unread,
Oct 9, 2020, 6:27:12 PM10/9/20
to ReactiveMongo - http://reactivemongo.org
Where are the logs coming from?
Do the logs correspond to some error?
What versions are used?

Brad Rust

unread,
Oct 12, 2020, 10:37:49 AM10/12/20
to reacti...@googlegroups.com

I run mongo 4.2.9 in the foreground with no change to the logging settings. They come out as INFO-level logs from mongod. I usually run with (mongod --dbpath /data/db --wiredTigerCacheSizeGB 2)

 

I don’t think I have seen any errors in the log, however I will do some investigating just in case.

 

ReactiveMongo 1.0.0 – finally no more JSONCollections for me.

Mongod (Ubuntu 18.04 WSL2 instance) – I’ll try a Windows-native version too, just in case.


Brad Rust

unread,
Oct 13, 2020, 11:44:00 AM10/13/20
to reacti...@googlegroups.com

Exactly the same thing happens when running a Windows native mongo 4.4.x.

 

I turned on ReactiveMongo DEBUG logging, and when my .tailable.awaitData cursor is being used I get a high rate of requests/responses on the ReactiveMongo/database side, as suggested by these Scala-side logs…

 

If there is a particular logger I could turn to DEBUG or TRACE, please suggest and I’ll give it a try.

 

2020-10-13 09:35:30,344 [debug] r.c.a.MongoDBSystem - [Supervisor-3/prod-copy] Received a request expecting a response (160440): reactivemongo.core.actors.ExpectingResponse@46e765b7
2020-10-13 09:35:30,345 [debug] r.c.a.MongoDBSystem - [Supervisor-3/prod-copy] Received a request expecting a response (160441): reactivemongo.core.actors.ExpectingResponse@46e765b7
2020-10-13 09:35:30,346 [debug] r.c.a.MongoDBSystem - [Supervisor-3/prod-copy] Received a request expecting a response (160442): reactivemongo.core.actors.ExpectingResponse@46e765b7
2020-10-13 09:35:30,347 [debug] r.c.a.MongoDBSystem - [Supervisor-3/prod-copy] Received a request expecting a response (160443): reactivemongo.core.actors.ExpectingResponse@46e765b7
2020-10-13 09:35:30,348 [debug] r.c.a.MongoDBSystem - [Supervisor-3/prod-copy] Received a request expecting a response (160444): reactivemongo.core.actors.ExpectingResponse@46e765b7
2020-10-13 09:35:30,349 [debug] r.c.a.MongoDBSystem - [Supervisor-3/prod-copy] Received a request expecting a response (160445): reactivemongo.core.actors.ExpectingResponse@46e765b7
2020-10-13 09:35:30,350 [debug] r.c.a.MongoDBSystem - [Supervisor-3/prod-copy] Received a request expecting a response (160446): reactivemongo.core.actors.ExpectingResponse@46e765b7
2020-10-13 09:35:30,350 [debug] r.c.a.MongoDBSystem - [Supervisor-3/prod-copy] Received a request expecting a response (160447): reactivemongo.core.actors.ExpectingResponse@46e765b7
2020-10-13 09:35:30,351 [debug] r.c.a.MongoDBSystem - [Supervisor-3/prod-copy] Received a request expecting a response (160448): reactivemongo.core.actors.ExpectingResponse@46e765b7
2020-10-13 09:35:30,352 [debug] r.c.a.MongoDBSystem - [Supervisor-3/prod-copy] Received a request expecting a response (160449): reactivemongo.core.actors.ExpectingResponse@46e765b7
2020-10-13 09:35:30,353 [debug] r.c.a.MongoDBSystem - [Supervisor-3/prod-copy] Received a request expecting a response (160450): reactivemongo.core.actors.ExpectingResponse@46e765b7

Cédric Chantepie

unread,
Oct 14, 2020, 7:53:47 PM10/14/20
to reacti...@googlegroups.com
If there is no error on the application side, it just means that cursors are refreshed automatically after expiration.

Generally, investigating debug/trace/info messages from mongod when there is no actual issue is not useful (and very time consuming).


Brad Rust

unread,
Oct 15, 2020, 1:22:45 PM10/15/20
to reacti...@googlegroups.com

These logging statements on the mongodb side and the JVM side are coming in at a thousand per second, which suggests to me that a thousand cursors per second are being created. Surely something is not working as planned. Additionally, my WSL2 instance's load average goes up to 250, which is crazy.

 

Would a sandbox replication of this be required to investigate more? 

Cédric Chantepie

unread,
Oct 15, 2020, 7:00:53 PM10/15/20
to reacti...@googlegroups.com
WSL2 is known to have network issues, so first try on a more stable environment.

Brad Rust

unread,
Oct 16, 2020, 9:26:49 AM10/16/20
to reacti...@googlegroups.com

The same thing happens on a Windows-native Mongo 4.4 instance. It is not just happening on WSL2.

Cédric Chantepie

unread,
Oct 17, 2020, 11:31:38 AM10/17/20
to reacti...@googlegroups.com
You can try to share a reproducer (code as minimal as possible, and data and procedure to test).

Cédric Chantepie

unread,
Nov 8, 2020, 11:22:30 AM11/8/20
to ReactiveMongo - http://reactivemongo.org
I cannot reproduce locally with a simple use case.

import scala.concurrent.Await
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

import reactivemongo.api._

val drv = AsyncDriver()
val con = Await.result(drv.connect(List("localhost:27017")), 5.seconds)

val db = Await.result(con.database("foo"), 5.seconds)

import reactivemongo.api.bson._
import reactivemongo.akkastream.cursorProducer

val sys = akka.actor.ActorSystem()
implicit val mat = akka.stream.ActorMaterializer.create(sys)

db.collection("log")
  .find(BSONDocument.empty)
  .tailable
  .awaitData
  .cursor[BSONDocument]()
  .documentSource()
  .runWith(akka.stream.scaladsl.Sink.foreach { doc =>
    println(s"- ${BSONDocument pretty doc}")
  })

Brad Rust

unread,
Nov 9, 2020, 4:26:49 PM11/9/20
to reacti...@googlegroups.com

Thank you. I have a test case where I can *sometimes* reproduce it. So I haven’t gotten back to this particular “please help me” request.

Cédric Chantepie

unread,
Nov 9, 2020, 6:33:42 PM11/9/20
to reacti...@googlegroups.com
I have let it run for some time. No error.


Brad Rust

unread,
Nov 19, 2020, 12:55:29 PM11/19/20
to reacti...@googlegroups.com

Cédric,

 

As I am trying to reproduce this, I have come across what might be the “problem” in my situation.  Thanks again for replying.

 

  • In Production, running behind a (Google) load balancer (which, in this case, has a timeout of 2 minutes)
    • The browser-side EventSource connection gets timed out and re-opens, which creates a new tailable, awaitData, capped-collection cursor
    • The previous cursor never completes or times out (even with a maxTimeMs value of 1 minute)
    • Question: is there a pattern/approach to having a tailable, awaitData cursor with a timeout, to make sure the cursor itself is gracefully(-ish) destroyed, so that the termination propagates downstream and the next “poll” of my capped collection doesn’t exhaust resources?
    • Question: is there any way to shut down your example program (from your earlier message) without a kill/ctrl-c? Maybe a poison-pill message as part of the documentSource stream? (See the sketch just after this list for the kind of hook I mean.)
  • In Development, I am constantly making code changes with Play recompiles and Play app restarts.
    • Is it possible that the tailable, awaitData cursor is “stuck” or never gets destroyed in this case?
    • If it only fails in development, I can live with that.  😊
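
To make the second question concrete, the kind of hook I have in mind is sketched below (just an illustration using a standard Akka Streams kill switch, not code I actually have running; runWatched and tail are made-up names):

import scala.concurrent.Future
import akka.Done
import akka.stream.{ KillSwitches, Materializer, UniqueKillSwitch }
import akka.stream.scaladsl.{ Keep, Sink, Source }

// Sketch: wrap the tailable stream in a kill switch so it can be completed from
// the outside (e.g. when the SSE connection drops), instead of the cursor living
// on until a process kill.
def runWatched[T](tail: Source[T, _])(implicit m: Materializer): (UniqueKillSwitch, Future[Done]) =
  tail
    .viaMat(KillSwitches.single)(Keep.right) // external handle to stop the stream
    .toMat(Sink.ignore)(Keep.both)
    .run()

// hypothetical usage:
//   val (ks, done) = runWatched(watchNotifications(id))
//   ks.shutdown() // completes the stream; the driver can then release its cursor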

 


Cédric Chantepie

unread,
Nov 19, 2020, 4:06:46 PM11/19/20
to ReactiveMongo - http://reactivemongo.org
On Thursday, 19 November 2020 at 18:55:29 UTC+1 br...@interpayments.com wrote:

    • The previous cursor never completes or times out (even with a maxTimeMs value of 1 minute)

The maxTimeMs is enforced by the server, not the driver.

    • Question: is there a pattern/approach to having a tailable, awaitData cursor with a timeout, so that the cursor is gracefully(-ish) destroyed and the next “poll” of my capped collection doesn’t exhaust resources?

That's contradictory ... either it's tailable, or there is a timeout, but not both.

    • Is it possible that the tailable, awaitData cursor is “stuck” or never gets destroyed when Play recompiles and restarts the app?

If the reload doesn't properly unload/close the driver ... of course.
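
For instance, if the driver is instantiated by hand (rather than through the Play module), a stop hook along these lines can be registered so that a reload closes it (just a sketch; the class and names are illustrative):

import javax.inject.{ Inject, Singleton }
import scala.concurrent.duration._
import play.api.inject.ApplicationLifecycle
import reactivemongo.api.AsyncDriver

// Sketch: close a hand-managed AsyncDriver when Play stops (or dev-mode reloads)
// the application, so connections and any remaining cursors are released.
@Singleton
final class MongoComponents @Inject() (lifecycle: ApplicationLifecycle) {

  val driver = AsyncDriver()

  lifecycle.addStopHook { () =>
    driver.close(10.seconds) // Future[Unit], completed once the driver is closed
  }
}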

Brad Rust

unread,
Nov 22, 2020, 2:19:40 PM11/22/20
to reacti...@googlegroups.com
I ended up reproducing this consistently, so I created an Issue

The first test case in PostSpec.scala is relatively simple and shows resource-intensive cursor creation. The second one shows the same resource-intensive cursor creation *until* a document is inserted, at which point the cursor logic seems to operate "correctly", or as expected.

I am hoping the GitHub project is acceptable for you or someone else to run and reproduce with. If there is some area of code you want me to try to fix, just point me in the right direction.

Thanks in advance.


Cédric Chantepie

unread,
Nov 22, 2020, 4:08:23 PM11/22/20
to ReactiveMongo - http://reactivemongo.org
As indicated, the GitHub tracker is only for confirmed issues.
Thanks for the reproducer, but it doesn't look minimal (with Play and browser-based tests).

Brad Rust

unread,
Nov 23, 2020, 10:59:27 AM11/23/20
to reacti...@googlegroups.com

This is mostly your example, and it shows the issue (to me at least):

 

import scala.concurrent.Await
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

import reactivemongo.api._

val drv = AsyncDriver()
val con = Await.result(drv.connect(List("localhost:27017")), 5.seconds)

val db = Await.result(con.database("foo"), 5.seconds)

import reactivemongo.api.bson._
import reactivemongo.akkastream.cursorProducer

val sys = akka.actor.ActorSystem()
// implicit val mat: Materializer = app.injector.instanceOf[Materializer]
// used the above to access to materializer in scalatest Spec.
// one below would not compile for me because of deprecations
implicit val mat = akka.stream.ActorMaterializer.create(sys)

val cName = "notifications"

val drop = db.collection(cName).drop()
Await.result(drop, 5.seconds)

val capped = db.collection(cName).createCapped(5 * 1024 * 1024, Option(5000))
Await.result(capped, 5.seconds)

val x = db.collection(cName)
  .find(BSONDocument.empty)
  .tailable
  .awaitData
  .cursor[BSONDocument]()
  .documentSource()
  .runWith(akka.stream.scaladsl.Sink.foreach { doc =>
    println(s"- ${BSONDocument pretty doc}")
  })
Await.result(x, 10.seconds)

 

Running the above (which I started from the example you had sent to this group, though granted I ran it inside a ScalaTest block), in the 10 seconds that the test runs for, 3500 cursors are produced according to a grep of the output for “r.a.c.GenericQueryBuilder”.

Cédric Chantepie

unread,
Nov 23, 2020, 3:02:50 PM11/23/20
to ReactiveMongo - http://reactivemongo.org
On Monday, 23 November 2020 at 16:59:27 UTC+1 br...@interpayments.com wrote:

Running the above (which I started from the example you had sent to this group, though granted I ran it inside a ScalaTest block), in the 10 seconds that the test runs for, 3500 cursors are produced according to a grep of the output for “r.a.c.GenericQueryBuilder”.


Logging like "[trace] r.a.c.GenericQueryBuilder$ - command: { ..." doesn't mean that a client (or server) cursor is created.
And in the server log, no entry indicates a cursor issue.

Brad Rust

unread,
Nov 23, 2020, 3:24:41 PM11/23/20
to reacti...@googlegroups.com

Fine, perhaps it doesn’t mean that a cursor is created – assumption on my part.

 

However, the current behavior makes using a tailable cursor simply unusable (unless you have inserted documents and you never want the Actor to complete except with a process kill). It pegs the CPU at 100% because of the tight looping of whatever it is doing. This behaviour caused an Atlas JVM to reboot because of the load or resource contention. It has taken down our production servers because of the GBs of logging generated. Sure, I can turn logging off, but please please please consider this a defect. It simply is NOT usable in its current state.

 

I am not sure how else to describe this.

 


Cédric Chantepie

unread,
Nov 23, 2020, 5:21:30 PM11/23/20
to reacti...@googlegroups.com
Having a load is expected when there is I/O due to logging enabled at trace level (that is to say, a much lower level than even debug) ...

Brad Rust

unread,
Nov 23, 2020, 5:32:26 PM11/23/20
to reacti...@googlegroups.com

It isn’t the logging that is causing the load. I turned off logging and it still happens. The CPU gets pegged immediately.

 

How can I convince you that something is wrong?


Cédric Chantepie

unread,
Nov 25, 2020, 6:45:14 PM11/25/20
to reacti...@googlegroups.com


On Mon, 23 Nov 2020 at 23:32, Brad Rust <br...@interpayments.com> wrote:

It isn’t the logging that is causing the load. I turned off logging and it still happens. The CPU gets pegged immediately.

How can I convince you that something is wrong?

Are you sure that you are not trying to tail an empty capped collection?
At least the Scala reproducer is, and that's a known corner case where the MongoDB server doesn't honor the tailable/await.

Brad Rust

unread,
Nov 30, 2020, 5:43:37 PM11/30/20
to reacti...@googlegroups.com

 

Certainly the tests that I wrote started with a clean collection and therefore hit the corner case you are describing. However, the alternative still fails and is unusable: a selector on the find which yields no results (which would always be the case when starting with a clean, dropped-and-then-created, collection).

 

To get around this, on each call to my long-lived “connection” I insert a record that will immediately be returned by the find(<date gte>).tailable.awaitData cursor. So for each iteration of my browser/client making a server-sent-event call, it always immediately returns an Event, which is a filler/bogus/useless event that I have to insert just to avoid catastrophic resource creation.
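
For reference, the shape of that workaround, as a sketch only (it reuses the bcollection from my first message; the field names are illustrative, not my real model):

import scala.concurrent.ExecutionContext
import akka.NotUsed
import akka.stream.Materializer
import akka.stream.scaladsl.Source
import reactivemongo.api.bson.{ BSONDateTime, BSONDocument }
import reactivemongo.akkastream.cursorProducer

// Sketch of the "seed document" workaround: insert a filler record before opening
// the tailable cursor, so the initial find() never runs against an empty result
// (the corner case discussed above).
def watchWithSeed(channelId: String)(
    implicit m: Materializer, ec: ExecutionContext): Source[BSONDocument, NotUsed] =
  Source
    .future(bcollection.flatMap { coll =>
      val seed = BSONDocument(
        "channelId" -> channelId,
        "createdAt" -> BSONDateTime(System.currentTimeMillis()),
        "seed" -> true) // filler/bogus event, ignored downstream

      coll.insert.one(seed).map { _ =>
        coll
          .find(BSONDocument("channelId" -> channelId))
          .tailable
          .awaitData
          .cursor[BSONDocument]()
          .documentSource()
      }
    })
    .flatMapConcat(identity)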


Cédric Chantepie

unread,
Nov 30, 2020, 7:16:42 PM11/30/20
to reacti...@googlegroups.com
On Mon, 30 Nov 2020 at 23:43, Brad Rust <br...@interpayments.com> wrote:

Certainly the tests that I wrote started with a clean collection and therefore hit the corner case you are describing.

That's why it's important to provide a clear reproducer.

However, the alternative still fails and is unusable: a selector on the find which yields no results (which would always be the case when starting with a clean, dropped-and-then-created, collection).

To get around this, on each call to my long-lived “connection” I insert a record that will immediately be returned by the find(<date gte>).tailable.awaitData cursor. So for each iteration of my browser/client making a server-sent-event call, it always immediately returns an Event, which is a filler/bogus/useless event that I have to insert just to avoid catastrophic resource creation.

Which confirms the cause (and it would have been useful to know it was already set up as described).

Some could consider that the MongoDB server not honoring tailable/await on an empty capped collection is wrong... But that's how it is, so let's keep it constructive.

If the empty collection is not a corner case you can live with (I honestly have never encountered it except as an edge case), then there are some workarounds (including the one you mentioned).
With 1.0.2-SNAPSHOT, the workaround "maxAwaitTimeMs" is also available (cap.find(..).tailable.awaitData.maxAwaitTimeMs(duration to wait in the corner case before retrying)).
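
For example, a sketch of the call (assuming, from the name, that the argument is a number of milliseconds; cappedCollection stands for the capped BSONCollection, and an implicit Materializer is in scope as in the earlier examples):

import reactivemongo.api.bson.BSONDocument
import reactivemongo.akkastream.cursorProducer

// Sketch against 1.0.2-SNAPSHOT: bound how long the server waits for new data on
// each awaitData round, so tailing an empty capped collection retries at a modest
// rate instead of in a tight loop. The 30000 ms value is arbitrary.
val tailed = cappedCollection
  .find(BSONDocument.empty)
  .tailable
  .awaitData
  .maxAwaitTimeMs(30000) // wait up to 30s server-side before returning an empty batch
  .cursor[BSONDocument]()
  .documentSource()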
