MongoDB change stream example for PHP?


Guglielmo Fanini

Nov 3, 2019, 4:19:47 PM
to mongodb-user

Where can I find a MongoDB change stream example for PHP?

I don't mean a tailable cursor on a "capped collection"; I need to "watch" a modifiable collection from PHP.

The environment is WAMP with PHP 7.2 on Windows.



Jeremy Mikola

Nov 3, 2019, 8:56:24 PM
to mongod...@googlegroups.com
On Sun, Nov 3, 2019 at 4:19 PM Guglielmo Fanini <g.fa...@gmail.com> wrote:

where can I find any mongodb change stream example for php ?

not tailable cursor on a "capped collection", I'd need to "watch" a modifiable collection from php

The watch() methods for Collection, Database, and Client objects are covered in the PHP library docs (https://docs.mongodb.com/php-library/current/reference/method/MongoDBCollection-watch/), and those pages include an example demonstrating iteration.

https://docs.mongodb.com/manual/changeStreams/ in the MongoDB manual also has examples for PHP (via the language tab above each code example).
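
As a rough sketch of the iteration pattern those documentation pages demonstrate, the function below polls a change stream and hands each event to a callback. The names watchCollection() and $maxEvents, and the connection string, database, and collection in the usage comment, are hypothetical. It assumes the mongodb extension and the mongodb/mongodb library are installed and that the server is a replica set, which change streams require.

```php
<?php
// Sketch only: watchCollection() and $maxEvents are illustrative names,
// not part of the MongoDB PHP library.

/**
 * Iterate a change stream and pass each event to $onEvent.
 *
 * @param object   $collection a MongoDB\Collection (anything with a watch() method)
 * @param callable $onEvent    receives each change event document
 * @param int      $maxEvents  stop after this many events
 * @return int number of events handled
 */
function watchCollection($collection, callable $onEvent, int $maxEvents = 10): int
{
    $seen = 0;
    $changeStream = $collection->watch();

    // rewind() positions the stream; next() polls the server for more events.
    $changeStream->rewind();
    while ($seen < $maxEvents) {
        if ($changeStream->valid()) {
            $onEvent($changeStream->current());
            if (++$seen >= $maxEvents) {
                break;
            }
        }
        $changeStream->next(); // valid() stays false until an event arrives
    }

    return $seen;
}

// Usage (assumes Composer's autoloader and a replica set on localhost):
// require 'vendor/autoload.php';
// $collection = (new MongoDB\Client('mongodb://127.0.0.1/'))->test->inventory;
// watchCollection($collection, function ($event) {
//     printf("%s %s\n", $event['operationType'], json_encode($event['documentKey']));
// });
```

Note that this loop keeps polling until $maxEvents events have arrived, so a real script would probably also want a time limit or poll counter.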

Guglielmo Fanini

Nov 4, 2019, 6:27:44 AM
to mongodb-user
This is how I got it to work for WAMP on Windows, in case anyone else is wondering:

Verify which PHP interpreter is on your path with php --ini

After that, just run:

C:\wamp64\www>composer require mongodb/mongodb
Using version ^1.5 for mongodb/mongodb
./composer.json has been created
Loading composer repositories with package information
Updating dependencies (including require-dev)
Package operations: 1 install, 0 updates, 0 removals
  - Installing mongodb/mongodb (1.5.1): Downloading (100%)
Writing lock file
Generating autoload files

Then, in the PHP script, which is executed from the path where the MongoDB library (and not just the extension) was downloaded and where you now have a "vendor" subfolder:

// This path should point to Composer's autoloader
require 'vendor/autoload.php';

I verified that this also enables monitoring of deleted MongoDB documents. Thank you for pointing me in the right direction.



Guglielmo Fanini

Nov 5, 2019, 9:10:31 AM
to mongodb-user
Further to that, are there any examples of using a regex to "match" a subset of documents by _id when watching?

$regex = new MongoDB\BSON\Regex('^' . $substring); // _id starting with $substring
$query = new MongoDB\Driver\Query(array('_id' => $regex));
$cursor = $manager->executeQuery($collection, $query); // query the collection for a subset of _ids

Perhaps it should be specified as a "pipeline" passed to the (blocking) watch() method?
Thank you for any clues.

Jeremy Mikola

Nov 5, 2019, 11:32:35 AM
to mongod...@googlegroups.com


On Tue, Nov 5, 2019 at 9:10 AM Guglielmo Fanini <g.fa...@gmail.com> wrote:
further to that, where could there be any examples of regex "match" of subset of _id docs to be watched ?

$regex = new MongoDB\BSON\Regex ( '^'.$substring);// _id starting with "substring"
$query = new MongoDB\Driver\Query( array('_id' => $regex) );
$cursor = $manager->executeQuery($collection, $query);  // Query the collection for a subset of _id s

perhaps it should be specified as a "pipeline" to the watch (blocking) method ?
thank you for any clues.

The structure of documents returned by iterating a change stream (i.e. cursor-like object returned by watch() methods) is described on the Change Events page in the MongoDB manual. The PHP library docs also link to this in the "See Also" section in each page for the various watch() methods.

Note that the top-level _id field in a change event document is a resume token, which is used internally to continue iteration after an error. Modify Change Stream Output in the MongoDB manual advises users not to modify/remove that field via a pipeline stage (e.g. $project). Recent versions of the server will raise an error if they detect a pipeline stage attempting to modify the field.

The actual _id of the changed document can be found under the change event's "documentKey._id" field. That is probably what you want to match on, and you can do so using the $match pipeline stage.

Details about regex matches are discussed in the $regex operator docs in the MongoDB manual. The PHP library also has examples in its CRUD: Regular Expressions tutorial page. The MongoDB manual notes differences between using the $regex query operator and a BSON regex object (although the manual refers to that as /pattern/ syntax, since it's written from the POV of the MongoDB shell). An important thing to note is that MongoDB's query engine only supports matching regular expressions against string types. Assuming your document _id values are ObjectIds, I don't think you'll be able to match them directly with a regex pattern.

That said, if you're using the aggregation framework you can insert a $project stage in the pipeline (accepted by both aggregate and watch methods) that converts ObjectIds values to a string, and then do a $match between the newly converted string value and a regex pattern. See https://stackoverflow.com/a/51231012/162228 for an existing discussion on that subject (pertaining just to ObjectId to string conversion within a pipeline).
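
A minimal sketch of that idea, under a few assumptions: it uses $addFields rather than $project so that the rest of the change event (including the resume token) is left untouched, and it relies on the $toString operator, which needs MongoDB 4.0 or later. The helper name objectIdPrefixPipeline() and the field name "idAsString" are made up for illustration.

```php
<?php
// Sketch only: objectIdPrefixPipeline() and the "idAsString" field name are
// illustrative; $toString requires MongoDB 4.0 or later.

function objectIdPrefixPipeline(string $hexPrefix): array
{
    return [
        // Add a new field rather than touching the top-level _id (the resume token).
        ['$addFields' => ['idAsString' => ['$toString' => '$documentKey._id']]],
        // Regexes only match strings, so match against the converted value.
        ['$match' => ['idAsString' => ['$regex' => '^' . preg_quote($hexPrefix)]]],
    ];
}

// Usage: $changeStream = $collection->watch(objectIdPrefixPipeline('5dc1'));
```

The function only builds the pipeline array, so it can be passed to either aggregate() or watch().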

Guglielmo Fanini

Nov 5, 2019, 12:07:14 PM
to mongodb-user
Due to my specific requirements, I'm not using MongoDB-supplied ObjectIds as _id; I'm using my own strings as the (indexed) key. That said, could you make an educated guess as to whether something like this would be feasible for filtering to a subset of documents, without modifying the _id of course (i.e. a read-only filter):

$pipeline =array(
    array('$match' => array('_id' => '^0FFC'))
);

$changeStream = $collection->watch($pipeline);

Here I am aiming to restrict the stream to only those _id values starting with the substring '0FFC', say. Thank you for any clues.



Jeremy Mikola

Nov 5, 2019, 2:23:05 PM
to mongod...@googlegroups.com
On Tue, Nov 5, 2019 at 12:07 PM Guglielmo Fanini <g.fa...@gmail.com> wrote:
due to my specific requirements I'm not using mongo supplied object ids as _id but am using my own strings as (indexed) key, that said, could you make any educated guess whether anything like this may be feasible in order to filter to a subset of docs, without modifying the _id clearly, i.e. read only filter :

$pipeline =array(
    array('$match' => array('_id' => '^0FFC'))
);

$changeStream = $collection->watch($pipeline);

where I am aiming to restrict to only _id starting with substring '0FFC' say, thank you for any clue.

If the document _id values are string types, then you can certainly use regex matches; however, the PHP syntax you have above is incorrect and, as-written, would perform an equality match where _id equals "^0FFC", which is certainly not what you want.

Please refer to the links in my last response. The PHP library's CRUD tutorial has examples of using regular expressions in queries. Those examples were written with basic queries (i.e. find commands) in mind, but the same syntax in those examples would also apply to the $match pipeline stage.

Lastly, also note my previous comments about the structure of documents accessible using the iterator returned by watch(). The code you have above would attempt to match the change stream's resume token, instead of the _id field of the document in your collection. The correct field path for the document's _id would be "documentKey._id". For example:

$regex = new MongoDB\BSON\Regex('^0FFC');
$pipeline = [ ['$match' => ['documentKey._id' => $regex]]];

You can either use a Regex object or $regex operator (i.e. ['$regex' => [ ... ]]), as given in the MongoDB manual's $regex doc examples.
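
To illustrate the $regex operator form, here is a small sketch that builds the pipeline for an arbitrary string prefix. The helper name idPrefixPipeline() is hypothetical, and escaping the prefix with preg_quote() is an added precaution that the thread does not discuss.

```php
<?php
// Sketch only: idPrefixPipeline() is an illustrative helper name.

function idPrefixPipeline(string $prefix): array
{
    // preg_quote() escapes regex metacharacters so the prefix is matched literally.
    $pattern = '^' . preg_quote($prefix);

    // Passing new MongoDB\BSON\Regex($pattern) as the value should match the
    // same documents; the operator form avoids needing the extension here.
    return [
        ['$match' => ['documentKey._id' => ['$regex' => $pattern]]],
    ];
}

// Usage: $changeStream = $collection->watch(idPrefixPipeline('0FFC'));
```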
 


--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.
 
For other MongoDB technical support options, see: https://docs.mongodb.com/manual/support/
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/128aa18e-53ea-40d9-ae83-aab19a6ce8ee%40googlegroups.com.

Guglielmo Fanini

Nov 5, 2019, 2:50:40 PM
to mongodb-user
Thanks for that, it works. I was getting really weird out-of-memory errors when attempting to "match" on just _id. Perhaps you might wish to put an example like this on the official documentation website of the PHP library?



Jeremy Mikola

Nov 5, 2019, 4:43:12 PM
to mongod...@googlegroups.com
On Tue, Nov 5, 2019 at 2:50 PM Guglielmo Fanini <g.fa...@gmail.com> wrote:
thanks for that, it works, because I was getting really weird out of memory errors attempting to "match" just _id,

If the OOM errors are reproducible, that may be worth reporting as a bug in either the PHPC or PHPLIB projects in JIRA (or on GitHub), depending on where the error originated. We'd be happy to look into that further to diagnose a root cause in case it is a bug.

perhaps you might wish to put any such example on the official documentation website of php lib ?

The driver documentation errs on the side of not duplicating content already specified in the MongoDB manual, as then it would be more likely for content to get outdated. This is why we refer to the Change Events page in the MongoDB manual from our watch() documentation.

There is definitely a trade-off in how many examples to include in the documentation, and ensuring that those we do include are general enough to be useful to a wide audience. In your case, using strings instead of an ObjectId and performing a regex match on the _id field are both fairly unique scenarios. The CRUD tutorial does seem the most logical place to document examples of regex matching, if it's to be included at all in the PHP docs, as it would otherwise need to be duplicated across any API docs that take query documents (find(), update(), aggregate(), and so on).
 
That said, would it have been more helpful if https://docs.mongodb.com/php-library/current/reference/method/MongoDBCollection-watch/ included an example of supplying an aggregation pipeline instead of simply calling watch() with no arguments? If so, that's something we can certainly do (and duplicate across the Client and Database watch() pages).
 

Guglielmo Fanini

Nov 11, 2019, 9:22:12 PM
to mongodb-user
Of course, more explanations would be helpful if possible.

Another question: can the watch() cursor be kept open indefinitely, or must it be reopened when the oplog rolls over?
After a while I get EXCEPTION: cannot resume stream; the resume token was not found. {_data: "825DC99861000001C32B022C0100296E5A1004DFE455B4382D414D9CB41F8710A12E0C463C5F6964003C3032393332303431000004"}
Thank you for any clues.




Jeremy Mikola

Nov 12, 2019, 10:35:30 AM
to mongod...@googlegroups.com
On Mon, Nov 11, 2019 at 9:22 PM Guglielmo Fanini <g.fa...@gmail.com> wrote:
Of course, more explanations would be helpful if possible.

I've opened https://jira.mongodb.org/browse/PHPLIB-503 to suggest adding pipelines to existing examples on the watch() API docs.
 

Another question : can the watch() cursor be kept open indefinitely ? or must it be reopened when the op log rolls over ?
After a while I get EXCEPTION: cannot resume stream; the resume token was not found. {_data: "825DC99861000001C32B022C0100296E5A1004DFE455B4382D414D9CB41F8710A12E0C463C5F6964003C3032393332303431000004"}

I managed to trace this error message to the following bit of server code: https://github.com/mongodb/mongo/blob/r4.2.1/src/mongo/db/pipeline/document_source_check_resume_token.cpp#L245

It appears that this error is raised when the resume token no longer points to a position in the oplog (which feeds the change stream). In this case, it's not possible to resume and the application likely needs to construct a new change stream. If you have control over your MongoDB deployment, you might be able to increase the oplog size to make it less likely for your application to hit this error; however, the watch() cursor certainly cannot be kept open indefinitely unless you are actively iterating and keeping up with the oplog.

Guglielmo Fanini

Nov 20, 2019, 5:11:04 PM
to mongodb-user
Any clue what this may mean when watching a constantly updating (not removed) collection:

IterateWatchCursor EXCEPTION:MongoDB\Driver\Exception\CommandException: CollectionScan died due to position in capped collection being deleted. Last seen record id: RecordId(6761473611439013897) - 136

I am unsure where this exception is thrown:


function IterateWatchCursor(&$user)
{
    $pollcounter = 0;

    try {
        // Limit each client to a few polls.
        for (; ($user->changeStream != null) && ($pollcounter < 10); $pollcounter++) {
            if ($user->changeStream == null) { // <-- could the exception be thrown here?
                return;
            }

            $user->changeStream->next(); // <-- could the exception be thrown here?

            if (!$user->changeStream->valid()) { // no data to iterate; <-- or here?
                continue;
            }

            $event = $user->changeStream->current(); // <-- or here?

            // I rule this branch out, since nothing was logged from it.
            if ($event['operationType'] === 'invalidate') {
                mylog($user->remoteaddress, "IterateWatchCursor stream invalidated");
                break;
            }
            $id = $event['documentKey']['_id'];
        }
    } catch (MongoDB\Driver\Exception\Exception $e) {
        $errmsg = "IterateWatchCursor EXCEPTION:" . get_class($e) . ": " . $e->getMessage() . " - " . $e->getCode();
        mylog($user->remoteaddress, $errmsg);
    }
}

What should I do when I eventually get this exception after about 8 hours, say? Close and reopen the watch?



Guglielmo Fanini

Nov 21, 2019, 4:28:41 PM
to mongodb-user
The exception seems to be thrown in the next() call. Would it be possible to recover by calling resume()?

    /**
     * Recreates the ChangeStreamIterator after a resumable server error.
     *
     * @return void
     */
    private function resume()
    {
        $this->iterator = call_user_func($this->resumeCallable, $this->getResumeToken(), $this->hasAdvanced);
        $this->iterator->rewind();

        $this->onIteration($this->hasAdvanced);
    }

Guglielmo Fanini

Nov 21, 2019, 4:36:08 PM
to mongodb-user
next() seems to try to resume()

 /**
     * @see http://php.net/iterator.next
     * @return void
     * @throws ResumeTokenException
     */
    public function next()
    {
        try {
            $this->iterator->next();
            $this->onIteration($this->hasAdvanced);
        } catch (RuntimeException $e) {
            $this->resumeOrThrow($e);
        }
    }

so it's not resumable ? I need to recreate the watch cursor ?
thank you

Jeremy Mikola

Nov 25, 2019, 7:22:10 PM
to mongod...@googlegroups.com
On Wed, Nov 20, 2019 at 11:11 PM Guglielmo Fanini <g.fa...@gmail.com> wrote:
any clue what this may mean when watching a constantly updating (not removed) collection :

IterateWatchCursor EXCEPTION:MongoDB\Driver\Exception\CommandException: CollectionScan died due to position in capped collection being deleted. Last seen record id: RecordId(6761473611439013897) - 136

I am unsure where this exception is thrown :

This error message originates in the CollectionScan::doRestoreStateRequiresCollection() function in the server. Based on some references to the message in the server's JIRA project, it looks like this is an expected error if the change stream iteration does not keep up with the oplog. We previously discussed an issue with failing to keep up with the oplog earlier in this thread ("the resume token was not found" error message), so I believe the error here is just a different manifestation of the same issue. In this case, you're seeing the error attempting to issue a getMore command whilst iterating the change stream cursor.

Your later replies suggest that you identified next() as the source of this exception. In the future, the getTrace() or getTraceAsString() methods on the Exception class should be useful for determining the origin of any PHP exception.
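
For instance, a logging helper along these lines would record where the exception was thrown along with its stack trace. The name describeException() is hypothetical; getFile(), getLine(), and getTraceAsString() are standard methods on PHP's Throwable.

```php
<?php
// Sketch only: describeException() is an illustrative helper name.

function describeException(\Throwable $e): string
{
    return sprintf(
        "%s: %s - %d\nthrown at %s:%d\n%s",
        get_class($e),
        $e->getMessage(),
        $e->getCode(),
        $e->getFile(),
        $e->getLine(),
        $e->getTraceAsString() // one line per stack frame, e.g. the next() call
    );
}

// Usage inside the catch block of IterateWatchCursor():
// mylog($user->remoteaddress, describeException($e));
```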

On Thu, Nov 21, 2019 at 10:36 PM Guglielmo Fanini <g.fa...@gmail.com> wrote:
next() seems to try to resume()

so it's not resumable ? I need to recreate the watch cursor ?
thank you

ChangeStream does include logic to resume once automatically after certain errors encountered during iteration. This is discussed in more detail in the Change Stream spec, but it's not something to discuss in our driver documentation since the specifics are quite verbose.

Per "certain errors" linked above, you'll see that CappedPositionLost (code 136) is explicitly called out as a non-resumable error code. This agrees with the fact that you were unable to use the resume token earlier in the thread, for the same reason that iteration likely did not keep up with the oplog.

If you are not able to address this issue by either iterating faster or increasing the size of your oplog (discussed in an earlier reply), I believe your only solution will be to create a new ChangeStream by calling watch() again and accept that you may have missed some event documents in doing so.
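
One possible shape for that fallback is sketched below. The function consumeWithReopen() and its parameters are hypothetical, and catching \RuntimeException is an assumption made here because the driver's CommandException and the library's ResumeTokenException both extend it; a real application may want to inspect the error code (such as 136) before deciding to reopen.

```php
<?php
// Sketch only: consumeWithReopen() and its parameters are illustrative names.

/**
 * @param callable $openStream returns a fresh iterator, e.g.
 *                             function () use ($collection) { return $collection->watch(); }
 * @param callable $onEvent    receives each change event
 * @param int      $maxPolls   total number of polls before returning
 * @param int      $maxReopens give up after this many re-creations
 * @return int number of times the stream had to be re-created
 */
function consumeWithReopen(callable $openStream, callable $onEvent, int $maxPolls, int $maxReopens = 3): int
{
    $reopens = 0;
    $stream = $openStream();
    $needsRewind = true;

    for ($polls = 0; $polls < $maxPolls; $polls++) {
        try {
            if ($needsRewind) {
                $stream->rewind(); // first poll on a new stream
                $needsRewind = false;
            } else {
                $stream->next();
            }
            if ($stream->valid()) {
                $onEvent($stream->current());
            }
        } catch (\RuntimeException $e) {
            // Resuming was not possible; start over with a new stream and
            // accept that some events may have been missed in between.
            if (++$reopens > $maxReopens) {
                throw $e;
            }
            $stream = $openStream();
            $needsRewind = true;
        }
    }

    return $reopens;
}
```

Capping the number of reopen attempts keeps a persistent failure (for example, a server that is down) from turning into an endless retry loop.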

Guglielmo Fanini

Nov 26, 2019, 6:06:17 AM
to mongodb-user
So that CappedPositionLost (code 136) is not resumable, and I'd need to reopen the watch on the collection ?



Jeremy Mikola

Nov 26, 2019, 11:22:43 AM
to mongod...@googlegroups.com
On Tue, Nov 26, 2019 at 6:06 AM Guglielmo Fanini <g.fa...@gmail.com> wrote:
So that CappedPositionLost (code 136) is not resumable, and I'd need to reopen the watch on the collection ?

That is correct. The last two paragraphs of my previous reply make this point.
