Update arrays of arrays in PHP

143 views
Skip to first unread message

Ángel Manuel García Carmona

unread,
Oct 10, 2018, 8:36:38 PM10/10/18
to mongodb-user
Hello,

I have a data collection that follows this structure:

Captura.JPG

As you may see, _unidades is an array of arrays. All those arrays follow the same structure (ID + Name (String) + Array). Inside that latter array (element 2 of every unidades_ array) there are many arrays. 

So I would like to insert a new array to the arrays of arrays at 2nd position of every unidades_ array. But I cannot find any way to track those positions in PHP -besides could use BulkWrite.

Thanks in advance for your help,

Regards,
Ángel Manuel.

Neil Lunn

unread,
Oct 11, 2018, 12:37:31 AM10/11/18
to mongod...@googlegroups.com
Hi  Ángel ,

In the simplest way of addressing your question, whilst there is indeed the facility to do the sort of update you are asking for in modern MongoDB releases in an "atomic" way and even "non-atomic" in older legacy releases, the real point for consideration here should be that if you are seeing problems and a lack of examples to follow, then you likely don't have the optimal data structure idea for what you really want to do.

To address this by your own opening statement, we can indeed "see" that you have a structure like "(ID + Name (String) + Array)" even though you are using plain indexed array positions to represent these instead of named properties. The most clear position here would be to use "named properties" instead, as that is exactly what they are for.

The other case of course is that even though MongoDB allows you to store data in just about any form or structure, you still really need to consider that the primary usage case here should be for a "database". So whilst it's generally acceptable to have a master document detail and a "single" array of child component information that relates to that master detail, complicated nesting is simply "complicated", and "over-complicated" for practical query and update of the data even if the modern server API supports doing so.

That said, the methods are there depending on the overall structure. For example with the following:

{
  doc: 1,
  unidates: [
    [1,"one",["A","B"]],
    [2,"two",["C","D"]]
  ]
},
{
  doc: 2,
  unidates: [
    [1,"one",["A","B"]]
  ]  
},
{
  doc: 3,
  unidates: [
    [1,"one",["A","B"]],
    [2,"two",["C","D"]]
  ]
}

Then updating the nested inner array would be like:

db.unimatrix.updateMany(
  { unidates: { $elemMatch: { 0: 2 } } },
  {
    $push: {
      "unidates.$.2": "E"
    }
  }
)

Which results in:

{
  doc: 1,
  unidates: [
    [1,"one",["A","B"]],
    [2,"two",["C","D","E"]]   // this changed
  ]
},
{
  doc: 2,
  unidates: [
    [1,"one",["A","B"]]
  ]  
},
{
  doc: 3,
  unidates: [
    [1,"one",["A","B"]],
    [2,"two",["C","D","E"]]  // this changed
  ]
}


Noting here that with the absence of "named properties" we are using the hard index values of 0 within the first inner level of the array to match the entry within the $elemMatch as well as the 2 within that matched element to indicate the "inner nested" array position in order to $push the new element.

NOTE: As mentioned before, NOT using named properties is NOT A GOOD IDEA, and most importantly positional filtered updates for multiple matches within an array using an "indexed notation" has "questionable" value in application of "positional filtered" updates with modern release API's in order to update more than the "first" matched condition of the legacy operator:

db.unimatrix.updateMany(
  { unidates: { $elemMatch: { 0: 2 } } },
  {
    $push: {
      "unidates.$[el].2": "F"
    }
  },
  {
    arrayFilters: [
      { "el.0": 2 }
    ]
  }
)

As said earlier, whilst this can be updated within this form and it also can be "queried" within this form it's just simply not ideal, and I've only touched on "one" of the many limitations. The far more practical and scalable form of recording such a structure is more like:

{
  doc: 1,
  unidates: [
    { id: 1, name: "one", title: "A", description: "B" },
    { id: 2, name: "two", title: "C", description: "D" }
  ]
},
{
  doc: 2,
  unidates: [
    { id: 1, name: "one", title: "A", description: "B" }
  ]  
},
{
  doc: 3,
  unidates: [
    { id: 1, name: "one", title: "A", description: "B" },
    { id: 2, name: "two", title: "C", description: "D" }
  ]
}

Where most notably everything is a "named property" as opposed to an indexed array position and is considerably flatter. If all the "inner" array elements really represent are different and "static" named properties then this is most logical to update with simple statements like:

db.unibetter.updateMany(
  { "unidates.id": 2 },
  {
    "$push": {
      "unidates": {
        id: 3, name: "three", title: "E", description: "F"
      }
    }
  }
)

Which of course results in the simple appending of another array member with specified properties in the matching documents:

{
  doc: 1,
  unidates: [
    { id: 1, name: "one", title: "A", description: "B" },
    { id: 2, name: "two", title: "C", description: "D" },
    { id: 3, name: "three", title: "E", description: "F" }    // added here
  ]
},
{
  doc: 2,
  unidates: [
    { id: 1, name: "one", title: "A", description: "B" }
  ]  
},
{
  doc: 3,
  unidates: [
    { id: 1, name: "one", title: "A", description: "B" },
    { id: 2, name: "two", title: "C", description: "D" },
    { id: 3, name: "three", title: "E", description: "F" }   // added here
  ]
}

And even where you actually have a structure you envisage having "nested array children", it's still far better for both update and query purposes to keep the "flatter" design approach and simply denote the common and differing properties on the children:

{
  doc: 1,
  unidates: [
    { id: 1, name: "one", type: 1, title: "A thing", description: "Place for A's" },
    { id: 1, name: "one", type: 2, title: "B thing", description: "Place for B's" },
    { id: 2, name: "two", type: 1, title: "C thing", description: "Place for C's" },
    { id: 2, name: "two", type: 2, title: "D thing", description: "Place for D's" },
  ]
},
{
  doc: 2,
  unidates: [
    { id: 1, name: "one", type: 1, title: "A thing", description: "Place for A's" },
    { id: 1, name: "one", type: 2, title: "B thing", description: "Place for B's" }
  ]
},
{
  doc: 3,
  unidates: [
    { id: 1, name: "one", type: 1, title: "A thing", description: "Place for A's" },
    { id: 1, name: "one", type: 2, title: "B thing", description: "Place for B's" },
    { id: 2, name: "two", type: 1, title: "C thing", description: "Place for C's" },
    { id: 2, name: "two", type: 2, title: "D thing", description: "Place for D's" },
  ]
}

This may appear at first somewhat "denormalized" and "duplicative" of details, but it is however the most efficient and widely supported form for query and update with simple statements like:

db.uniawesome.updateMany(
  { "unidates.id": 2 },
  {
    "$push": {
      "unidates": {
        "$each": [
          { id: 3, name: "three", type: 1, title: "E Thing", description: "Place for E's" },
          { id: 3, name: "three", type: 2, title: "F Thing", description: "Place for F's" }
        ]
      }
    }
  }
)

Which is of course opposed to the "still better than plain arrays" but actual "nested" approach like:

{
  doc: 1,
  unidates: [
    {
      id: 1,
      name: "one",
      stuff:[
       { title: "A thing", description: "Place for A's" },
       { title: "B thing", description: "Place for B's" }
      ]
    },
    {
      id: 2,
      name: "two",
      stuff:[
       { title: "C thing", description: "Place for C's" },
       { title: "D thing", description: "Place for D's" }
      ]
    }
  ]
},
{
  doc: 2,
  unidates: [
    {
      id: 1,
      name: "one",
      stuff:[
       { title: "A thing", description: "Place for A's" },
       { title: "B thing", description: "Place for B's" }
      ]
    }
  ]
},
{
  doc: 3,
  unidates: [
    {
      id: 1,
      name: "one",
      stuff:[
       { title: "A thing", description: "Place for A's" },
       { title: "B thing", description: "Place for B's" }
      ]
    },
    {
      id: 2,
      name: "two",
      stuff:[
       { title: "C thing", description: "Place for C's" },
       { title: "D thing", description: "Place for D's" }
      ]
    }
  ]
}

And these can be updated even with "positional filtered" statements in modern supporting MongoDB releases:

db.uninested.updateMany(
  { "unidates.id": 2 },
  {
    "$push": {
      "unidates.$[el].stuff": {
        "$each": [
          { title: "E Thing", description: "Place for E's" },
          { title: "F Thing", description: "Place for F's" }
        ]
      }
    }
  },
  {
    "arrayFilters": [
      { "el.id": 2 }
    ]
  }
)


But as stated all along, it's still just NOT IDEAL since whilst it may "look" logical and cleaner to the eye, there is the additional overhead of both "update" and "query" complexity, not to mention the general constraints on "multikey indexes" and overall efficiency.

So as a rule of thumb, every time your brain "thinks nested", you actually generally always means much "flatter" in terms of database scalabiltiy and performance. There's a bit more about these things written elsewhere online, but the general advice is to actually use the named properties where it really is appropriate to do so ( it's not like omitting the names saves any real amount of storage ) and to keep the "flatter" structre with "one" level of array content at most for ease of query and update.


Neil

N.B You might also note that posting "images" representative of code or data structure's really isn't a great idea in itself for presenting concepts to your peers. "Pictures" simply can not be "cut & paste" as the code/data they actually represent, and be used by your peers in working through and applicable example. It's always far better ( particularly with "data" and MongoDB ) to stick to a common textual and JSON representation ( as well as the JavaScript ) which can be applied by everyone in your target audience to the common tools available ( i.e MongoDB shell ). The same "data structures" which comprise query and update operations also therefore apply to every language implementation with just a little application to working it in to that language's own specific way of representing BSON documents as functional argments.

Keep it simple and clear. And "make it up" if you find yourself redacting every line of data, as something is a lot more indicative than nothing.


--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.
 
For other MongoDB technical support options, see: https://docs.mongodb.com/manual/support/
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user...@googlegroups.com.
To post to this group, send email to mongod...@googlegroups.com.
Visit this group at https://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/4ed54ba5-30f6-4072-b835-4719d2185f6f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Ángel Manuel García Carmona

unread,
Oct 11, 2018, 5:39:59 AM10/11/18
to mongodb-user
Thanks a lot!

Finally I used named properties and fine ;-)

Neil Lunn

unread,
Oct 11, 2018, 6:50:17 AM10/11/18
to mongod...@googlegroups.com
Hi  Ángel ,

Though the main point here was of course that you really should be changing the structure ( as your later response seems to indicate you have ) there are a couple of things to note in your reply.

1. Don't interchange array() and [] notation with PHP. They mean the same thing, and PHP only differentiates that the => specifies keys of an "associative array" and without that it's just a "regular array" with each entry by index. This means things you did like this:


[
  '_unidades' =>  [
    array( '$elemMatch' => ['0' => $id_unidad] )
  ]
]  

Are actually just really this:

[
  '_unidades' =>  [  '$elemMatch' => [ '0' => $id_unidad ] ]
]  

So if it does not have the arrow it's "square" and if it does have the arrow it's "curly" in the translation back to JSON terms. You probably also should get in the habit of using json_encode() within your code listings and logging your composed data structures for queries and the like to console on execution so you can actually see if what you coded was the same as what you "think" you coded from the perspective of following an existing JSON notation example.

2. It looks like you're using the wrong Driver. This is actually quite painfully common ( Bump to J. Mikola ) due to the actual intended driver for usage being published in two parts. You appear to be using the "core" driver directly, and you're not really meant to be doing that unless you are writing a "higher level driver" yourself ( which I would not advise ). The actual driver for your "application usage" is the PHPLIB Driver here.

The big giveaway to this is code using the MongoDB\Driver namespace should be pretty much absent from this sort of code and the bulk of your "collection" methods should be from MongoDB\Collection, which basically hosts the same consistent methods that all other language drivers also have. Most notably are the methods like updateMany() just as demonstrated earlier and just like every driver does it. If you really must then there is actually even bulkWrite(), but I suspect your usage in the shown code is actually because it seemed the only method you understood from the MongoDB\Driver namespace.

Also FYI, ALL MongoDB API calls ( from the common API as pointed out ) actually already use the "Bulk API" anyway, The different methods are simply for "code clarify", therefore updateOne() when that is what you mean, updateMany() for a single statement which applies to "many" documents, and the actual bulkWrite() means exactly to execute a "bulk batch" of different "write operations", be those "update" or "insert" or "delete".


On Thu, Oct 11, 2018 at 7:44 PM Ángel Manuel García Carmona <am...@live.com> wrote:

Dear Neil,

Thanks for your very instructive response.

However, it does not work in PHP.

That is my BSON file:

{

"_id": {"$oid":"5bbb20941c9d440000073055"},

"_id_curso":{"$numberInt":"3"},

"_nombre_curso":"Introducción a Python",

"_unidades": [

[

{"$numberInt":"1"},

"Primeros pasos",

[["¿Es un lenguaje orientado a objetos?","Sí, de hecho es multiplataforma"],["¿Qué extensiones de archivo admite?",".py, .pyc, .pyd, .pyo, .pyw"]]

],

[

{"$numberInt":"2"},

"Control de flujo",

[["¿Es indiferente el uso de while en vez de for?","No, a no ser que antes inicialices la variable índice y, en la última línea interna al bucle, establezcas la operación aritmética correspondiente."],["¿Equivale do-while a la sentencia while?","Lo que ocurre es que en este primer caso, la sentencia se ejecutará al menos una vez."]]

]

]

}

Meanwhile here you have my PHP implementation:

        ......

        $bulk = new MongoDB\Driver\BulkWrite();
        $bulk->update(
                   [
                       '_id_curso' => $id_curso,
                       '_unidades' =>
                            [
                                array('$elemMatch' => ['0' => $id_unidad])
                            ]
                   ],
                   [
                       '$push' => [
                                 '_unidades.$.2' => array(array('0' => $enunciado, '1' => $respuesta))
                           ]
                   ]
                );       
       
        $writeConcern = new MongoDB\Driver\WriteConcern(MongoDB\Driver\WriteConcern::MAJORITY, 100);
       
        try {
            $result = $manager->executeBulkWrite('moodle.faq', $bulk, $writeConcern);
        } catch (MongoDB\Driver\Exception\BulkWriteException $e) {
            $result = $e->getWriteResult();

            // Check if the write concern could not be fulfilled
            if ($writeConcernError = $result->getWriteConcernError()) {
                printf("%s (%d): %s\n",
                    $writeConcernError->getMessage(),
                    $writeConcernError->getCode(),
                    var_export($writeConcernError->getInfo(), true)
                );
            }

            // Check if any write operations did not complete at all
            foreach ($result->getWriteErrors() as $writeError) {
                printf("Operation#%d: %s (%d)\n",
                    $writeError->getIndex(),
                    $writeError->getMessage(),
                    $writeError->getCode()
                );
            }
        } catch (MongoDB\Driver\Exception\Exception $e) {
            printf("Other error: %s\n", $e->getMessage());
            exit;
        }

        printf("Inserted %d document(s)\n", $result->getInsertedCount());
        printf("Updated  %d document(s)\n", $result->getModifiedCount());

I got no errors. I cannot see any update in my document.

Could you assess me?

Regards,

Ángel Manuel.


For more options, visit https://groups.google.com/d/optout.
-- 
Ángel Manuel García Carmona
B. Sc. Computer Engineering student 
(UDIMA)

Avenida de la Constitución 38º
06450 Quintana de la Serena 
(Badajoz, Spain)
+34634760756

Jeremy Mikola

unread,
Oct 11, 2018, 3:08:25 PM10/11/18
to mongod...@googlegroups.com
On Thu, Oct 11, 2018 at 3:50 AM Neil Lunn <neill...@gmail.com> wrote:

2. It looks like you're using the wrong Driver. This is actually quite painfully common ( Bump to J. Mikola ) due to the actual intended driver for usage being published in two parts. You appear to be using the "core" driver directly, and you're not really meant to be doing that unless you are writing a "higher level driver" yourself ( which I would not advise ). The actual driver for your "application usage" is the PHPLIB Driver here.

Points noted. There's language in http://php.net/mongodb and https://docs.mongodb.com/ecosystem/drivers/php/ that encourages users to employ PHPLIB for application use. Short of slapping a banner on every documentation page within the driver's PHP.net documentation (and we'd really rather not pollute PHP.net like that), I don't have a better solution for this issue. It does happen from time to time, but the majority of users do end up finding their way to PHPLIB, especially if they engage with the GitHub repositories, where issues in the driver and library frequently link to each other. That said, I'm open to any suggestions as to how we can improve this if you'd like to follow up with me out-of-thread.

Ángel Manuel García Carmona

unread,
Oct 12, 2018, 3:35:00 AM10/12/18
to mongodb-user
Thanks for your reply.

As far as I know, PHPLIB is deprecated: http://php.net/manual/en/book.mongo.php.

Ángel Manuel García Carmona

unread,
Oct 12, 2018, 3:36:40 AM10/12/18
to mongodb-user
Besides I got it but it gives me problems:
https://github.com/mongodb/mongo-php-library/issues/584

Jeremy Mikola

unread,
Oct 12, 2018, 10:50:12 AM10/12/18
to mongod...@googlegroups.com
On Fri, Oct 12, 2018 at 12:35 AM Ángel Manuel García Carmona <am...@live.com> wrote:
Thanks for your reply.

As far as I know, PHPLIB is deprecated: http://php.net/manual/en/book.mongo.php.

That is a link to the legacy "mongo" extension's documentation. The newer "mongodb" extension and userland library are discussed (and linked) in the introductory section on: https://docs.mongodb.com/ecosystem/drivers/php

Ángel Manuel García Carmona

unread,
Oct 12, 2018, 11:06:37 AM10/12/18
to 'Jeremy Mikola' via mongodb-user

Ah, ok. I'll take a look...

Thank you.

--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.
 
For other MongoDB technical support options, see: https://docs.mongodb.com/manual/support/
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user...@googlegroups.com.
To post to this group, send email to mongod...@googlegroups.com.
Visit this group at https://groups.google.com/group/mongodb-user.

For more options, visit https://groups.google.com/d/optout.
Reply all
Reply to author
Forward
0 new messages