check bsoncxx::oid is null

511 views
Skip to first unread message

li ning

unread,
Jun 15, 2016, 11:56:20 PM6/15/16
to mongodb-user
bsoncxx::oid id;
if(!id)
   std::cout<<"OK";

save this id to db
and read it out:
bsoncxx::oid id = bson["id"].get_oid().value;
if(!id)
    std::cout<<"OK";
else
    std::cout<<"Not user want";

It print Not user want
This is odd, the id value is not change, but the operator bool return different value if save to db then read it from db.
the oid operator bool use a bool _is_valid, but the meaning represented by this data member has some problem,
it used to represent:
1, oid take an invalid input and not use the input, the array value is 0. _is_valid == false
2, make an oid object but not use any input data to fill the array. _is_valid == false

However, this 2 situation totally different. The first one describe a exception, the second one is a legal usage (to represent the oid value is null)
The oid class mix those 2 case together so there is problem in the beginning, from user view, operator bool return false should means: "all char in array is 0"

And I am afraid use "_is_valid == false" to represent an construct exception is bad

Andrew Morrow

unread,
Jun 16, 2016, 1:30:35 PM6/16/16
to mongod...@googlegroups.com

Hi -

Could you please try to re-write this question? I am not able to understand either the behavior you expect, nor the behavior you are actually observing. I think the best way would for you to write some code that can actually be compiled and executed that demonstrates the problem you are encountering.

Thanks,
Andrew


--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.
 
For other MongoDB technical support options, see: https://docs.mongodb.org/manual/support/
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user...@googlegroups.com.
To post to this group, send email to mongod...@googlegroups.com.
Visit this group at https://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/63cfa7de-4455-4d07-a43b-d182ce551040%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

li ning

unread,
Jun 17, 2016, 12:01:15 AM6/17/16
to mongodb-user
There is a use case:

class User {
    User():_id(bsoncxx::oid::init_tag){}
    bool has_friend() {return friend_id;}
    bosncxx::oid _id;
    bosncxx::oid friend_id;
};

The User can have or have not a friend, if it has a friend, then its friend_id will be the '_id' of another User.
The member function has_friend check friend_id is null or not, now create a User object.

User u1;
assert(!u1.has_friend()); // OK friend_id operator bool will return false

// now convert u1 to a bson object and save it to mongo db, the bson object like this: {"_id":ObjectId("xxx"), "friend_id":ObjectId("yyy")}
// then read this bson object from mongodb and use it to create User object u1 by User constructor
User::User(const bsoncxx::document::view& bson)
{
_id = bson["_id"].get_oid().value;
friend_id = bson["friend_id"].get_oid().value;
}

User u1(bson_object);
assert(!u1.has_friend()); // fail

I have not change the value of friend_id, but the value of operator bool return is inconsistent.


在 2016年6月17日星期五 UTC+8上午1:30:35,acm写道:

Andrew Morrow

unread,
Jun 17, 2016, 3:57:20 PM6/17/16
to mongod...@googlegroups.com

Hi -

It should be easy enough to write a complete example of this that can be run locally and debugged. Posting an SSCCE will mean that I can locally compile the code and explore the behavior.

Please write up such an example, that defines the entire User class, constructs an instance of it, converts it to BSON, stores it to the database, queries the database, constructs a User object from the class, and shows that the friend_id field has not round-tripped correctly.

Thanks,
Andrew


li ning

unread,
Jun 18, 2016, 12:17:13 AM6/18/16
to mongodb-user
mongocxx::instance mongo_driver;
mongocxx::uri uri("mongodb://127.0.0.1:8820");
mongocxx::pool pool(uri);
bsoncxx::oid id;
if(id)
std::cout << "1" << std::endl;
auto conn = pool.acquire();
conn->database("test")["test"].insert_one(bsoncxx::builder::stream::document{} <<
"_id" << id <<
bsoncxx::builder::stream::finalize);
bsoncxx::oid id1;
for(auto&& c : conn->database("test")["test"].find({}))
{
id1 = c["_id"].get_oid().value;
}
if(id1)
std::cout << "2" << std::endl;


It print 2, this is inconsist, the id has not been changed, but operator bool return different value.

在 2016年6月18日星期六 UTC+8上午3:57:20,acm写道:

Andrew Morrow

unread,
Jun 19, 2016, 11:23:24 AM6/19/16
to mongod...@googlegroups.com

Thanks for providing the code. It isn't quite an SSCCE though, as it doesn't have the necessary headers, and the code isn't wrapped in a main function. I've done that, and here is a complete repro of the behavior you are observing:

#include <cstdlib>
#include <iostream>

#include <bsoncxx/builder/stream/document.hpp>
#include <bsoncxx/json.hpp>

#include <mongocxx/client.hpp>
#include <mongocxx/instance.hpp>
#include <mongocxx/pool.hpp>
#include <mongocxx/uri.hpp>

int main(int argc, char* argv[]) {

    using bsoncxx::builder::stream::document;

    mongocxx::instance mongo_driver{};

    const auto uri = mongocxx::uri{(argc >= 2) ? argv[1] : mongocxx::uri::k_default_uri};
    mongocxx::pool pool{uri};

    auto conn = pool.acquire();
    auto coll = (*conn)["test"]["test"];
    coll.drop();

    bsoncxx::oid id;

    if (id)
        std::cout << "id object is true-ish\n";
    auto doc = bsoncxx::builder::stream::document{} << "_id" << id << bsoncxx::builder::stream::finalize;
    std::cout << "doc before insert: " << bsoncxx::to_json(doc) << "\n";

    coll.insert_one(doc.view());

    for (auto&& c : coll.find({})) {
        std::cout << "doc after insert: " << bsoncxx::to_json(c) << "\n";
        bsoncxx::oid id1 = c["_id"].get_oid().value;
        if (id1)
            std::cout << "id field is true-ish\n";
    }

    return EXIT_SUCCESS;
}

When I run this program, I observe the following output:

doc before insert: {
    "_id" : {
        "$oid" : "d93a5eff7f000076f2c066ff"
    }
}
doc after insert: {
    "_id" : {
        "$oid" : "d93a5eff7f000076f2c066ff"
    }
}
id field is true-ish

The reason you observe this behavior is that you are invoking the default constructor of bsoncxx::oid, which very surprisingly does not initialize the byte array within the bsoncxx::oid object. This is why your first conditional, where you check if(id) does not take the true branch and print anything - the bsoncxx::oid object knows that it is uninitialized, and the bytes it contains are effectively random, so its operator bool returns false.

However, you then actually add those random bytes into your document for the _id field and store it into the database. As a result, when you query for the document, the object you get back contains a field called _id of the correct type to be extracted as a bsoncxx::types::b_oid type, containing the same random bytes that you added when you built the document. That object contains a bsoncxx::oid object which has been initialized, which is why the second conditional takes the true branch and results in a print.

The immediate workaround for your problem is to change your code as follows:

    coll.drop();

    bsoncxx::oid id{bsoncxx::oid::init_tag};

    if (id)


Which will properly initialize the bsoncxx::oid object with meaningful values.

I did some digging in the git history for the project, and this behavior has been present in the bsoncxx::oid class since the beginning. I suspect that it was written this way so that writing:

bsoncxx::oid obj_id;
obj_id = some_document["_id"].get_oid().value;

didn't result in first doing the work to construct a valid OID representation for obj_id, and then immediately discard that result by assigning in the result of obtaining the OID from the document.

I don't think that is a very good argument though, since the behavior is so surprising. Most likely, the sense of the initialization should be reversed, and the default constructor should properly initialize the bytes. Users who want or need to avoid the overhead of constructing proper OID internals should explicitly request that behavior:

bsoncxx::oid obj_id{bsoncxx::oid::no_init_tag};
obj_id = some_doc["_id"].get_oid().value;

I've filed https://jira.mongodb.org/browse/CXX-940 to track this issue. Thanks for posting the example that revealed this problem with the driver API. In the future, if you encounter issues like this, I strongly recommend posting a fully compilable example code demonstrating the problem. It will allow us to understand the issue you have encountered much more quickly than trying to describe it in prose.

Thanks,
Andrew


li ning

unread,
Jun 19, 2016, 9:54:23 PM6/19/16
to mongodb-user
Thanks. As you said the default constructor is dangerous, it might be "random" to another legal id. However, whatever user use init_tag or no_init_tag, the std::array<char, 12> should init to {0}.
And operator bool need check the array equal to {0,0...0} or not. Use an flag to represent this will require the flag value persistent to db, this is conflict with BSON definition(I am not very sure about this)

Also, if construct a oid object with illegal parameter, it is better to throw an exception rather set the "_valid" flag. If someone worry the performance for example construct oid in a big size and heavy job loop, supply an constructor with an error code like:
oid(const char* data, error_code& error)


在 2016年6月19日星期日 UTC+8下午11:23:24,acm写道:

Andrew Morrow

unread,
Jun 20, 2016, 9:36:58 AM6/20/16
to mongod...@googlegroups.com

I think the simpler answer is that bsoncxx::oid should always initialize its contents as a proper OID. In the event that you don't want to pay for OID construction, you should probably write:

bsoncxx::stdx::optional<bsoncxx::oid> maybe_holds_an_oid;
if (...) {
    maybe_holds_an_oid = bsoncxx::oid();
}

if (maybe_holds_and_oid) {
   // ...
}

I suspect that bsoncxx::oid may have predated the availability of optional in the codebase, hence the odd design. I do however need to double check that there isn't an important performance reason we made this choice.

Please watch https://jira.mongodb.org/browse/CXX-940 for updates.

Thanks,
Andrew

li ning

unread,
Jun 20, 2016, 11:26:17 PM6/20/16
to mongodb-user
Can't understand your "a proper OID", you mean init_tag constructed OID? But sometimes we can not know what the id is when it gets init.

Use optional to wrap oid is not good, at least in coding style. If user can not know what the id value is when it gets init, user must use optional<oid> rather than simple oid. And there is no way to protect this usage(like a compile error), user may forget to do so.
I think the OID should act like an primitive type ( it just an big number, isn't it? ), as few situation needs optional<int>:

int i = foo();
if(i) // use value 0 as an flag
{
// foo indeed return a meaningful result, the result range is (0, int max]
}

But few people write as:
option<int> i = foo();
This is logical correct but odd. In most case the result range will not convert all [int min, int max]

For bsoncxx::oid, mark 0 as a special flag should be accepted. 

Further, use optional to wrap bsoncxx::oid still can not perfectly solve the problem if it need to persist to db
class User {
oid _id;
optional<oid> _friend_id;
};
If no friend, the bson is {"_id":xxx}
If has friend, the bson is {"_id":xxx,"friend":yyy}
The bson is different, this might not user wanted.

The int version:
class User {
int _id;
int _friend_id;
};
If no friend, the bson is {"_id":xxx,"friend":-1}
If has friend, the bson is {"_id":xxx,"friend":yyy}

For the oid init overhead, I really don't think a memset is a big deal. It almost cost nothing

在 2016年6月20日星期一 UTC+8下午9:36:58,acm写道:

Andrew Morrow

unread,
Jun 21, 2016, 8:43:16 AM6/21/16
to mongod...@googlegroups.com

I think we actually agree on most points. What I am proposing is to:

- Remove the bsoncxx::oid::init_tag entirely, along with the constructor that takes it
- Give bsoncxx::oid a default constructor, which will initialize it according to https://docs.mongodb.com/manual/reference/bson-types/#objectid

This will give bsoncxx::oid the "primitive type" semantics you are looking for. The cost will be that it will no longer be free to declare a bsoncxx::oid object on the stack, as the implementation will need to set the timestamp, machine id, process id, and counter.

My proposed mitigation for this change is that users who require the ability to produce a bsoncxx::oid without the cost of initialization can do so if and when needed by using optional or some other method to sidestep the more costly initialization.

You seem to want to use the "all zero" oid as a sentinel value, which I think is the only point of disagreement. I think I would prefer not to do that. There are enough ways to signal the lack of value built in to C++ that I don't think the bsoncxx::oid type needs to support one intrinsically.

Thanks,
Andrew




li ning

unread,
Jun 21, 2016, 10:17:56 PM6/21/16
to mongodb-user
"all zero" has an advantage: user can do an logical check only on oid value:

if(!friend_id) // operator bool return fasle if all zero
// no friend yet, or the relationship was released by user
else
auto iter = friend_map.find(friend_id)
if(iter == friend_map.end())
// the friend don't exist anymore (deleted or something). 

or your solution:
// the operator bool is meaningless, it always return true
auto iter = friend_map.find(friend_id)
if(iter == friend_map.end())
// there are 2 cases:1, no friend yet. 2, the friend don't exist anymore (deleted or something). 
// need more job to do what the case it is if user need to know

Of curse current version oid can do this too, the oid(const char* data, int len) can do this, but the usage is trouble

oid id; // all zero
id = bsoncxx::oid(init_tag);
id.reset() // all zero

or

const static bsoncxx::oid all_zero; // self define an all zero oid
if(id == all_zero)
//...
id = all_zero

The first one is more concise, and it is very very small chance that an legal oid be all zero, even current local time is 1970.1.1 00:00:00
I guess most users need this so why not do it in oid class? Or else every one need write something to make an global zero oid and compare with it

在 2016年6月21日星期二 UTC+8下午8:43:16,acm写道:

Andrew Morrow

unread,
Jun 22, 2016, 11:35:07 AM6/22/16
to mongod...@googlegroups.com

Hi -

I just don't find the argument compelling. I think there are other ways to represent the situation you describe. In particular, I don't like the idea that the bsoncxx::oid class will have boolean semantics. I view that as a mistake made early in driver development that I would like to correct.

Thanks,
Andrew

Reply all
Reply to author
Forward
0 new messages