> from doc in docs
> select new { doc.User.Name }
> What happens if the document doesn't have User.Name?
IMHO, if Name is null this should be indexed as a NULL value, so that I
can query for users without a Name. If User is null, then the doc
shouldn't be in the index at all. When there's another field like:
select new { Title= Title, doc.User.Name }
...then if User=null it should probably be indexed as (Title,
MISSING_VALUE)-> doc.
For the above index I would e.g. like to be able to do:
Query<Doc>(doc => doc.User.Name == "Bob")
// -> docs with User=null are just ignored
Query<Doc>(doc => doc.User.Name == null)
// -> docs with User=null are just ignored
Query<Doc>(doc => doc.Title == "something")
// -> Includes docs with User == null
Query<Doc>(doc => doc.Title == "something" && doc.User.Name == "Bob")
// -> docs with User=null are just ignored
Tobias
Typo, should be: select new { doc.Title, doc.User.Name }
> What is the meaning of MISSING_VALUE vs. Not indexing it in the first place?
Just what I explained in my sample queries. With the following index:
select new { doc.Title, doc.User.Name }
...when a document has a Title != null and a User == null, I still want
to be able to query this index for documents by Title.
Query<Doc>(doc => doc.Title == "something")
When doc.User is null then doc.User.Name is missing. But because
there's also doc.Title in the index, the document must be indexed.
In order to do that, the doc.User.Name field in the index probably
needs a special "MISSING_VALUE" which can be distinguished from null or
any real value. (I might want to query for doc.User.Name == null, which
must not include docs where doc.User == null)
A document may be completely ignored only if ALL fields in the index
are missing.
Tobias
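To make the distinction above concrete, here is a minimal sketch of an
index that tracks three states per field: missing, null, or a real
value. Every type and name in it (FieldValue, IndexEntry, IndexSketch)
is hypothetical and only models the proposed semantics in memory; it
is not Raven's API or how Raven stores index entries.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// All types here are hypothetical and for illustration; this is not Raven's API.
public sealed class User { public string Name; }
public sealed class Doc { public string Title; public User User; }

// An indexed value: missing (the path could not be resolved), null, or a string.
public readonly struct FieldValue
{
    public readonly bool IsMissing;
    public readonly string Value;

    private FieldValue(bool isMissing, string value)
    {
        IsMissing = isMissing;
        Value = value;
    }

    public static readonly FieldValue Missing = new FieldValue(true, null);
    public static FieldValue Of(string value) => new FieldValue(false, value);

    // A missing value never matches, not even a query for null.
    public bool Matches(string wanted) => !IsMissing && Value == wanted;
}

public sealed class IndexEntry
{
    public Dictionary<string, FieldValue> Fields = new Dictionary<string, FieldValue>();
    public Doc Source;
}

public static class IndexSketch
{
    // Field extractors for: select new { doc.Title, doc.User.Name }
    public static readonly Dictionary<string, Func<Doc, FieldValue>> TitleAndUserName =
        new Dictionary<string, Func<Doc, FieldValue>>
        {
            ["Title"] = doc => FieldValue.Of(doc.Title),
            ["User.Name"] = doc => doc.User == null
                ? FieldValue.Missing
                : FieldValue.Of(doc.User.Name),
        };

    public static List<IndexEntry> Build(
        IEnumerable<Doc> docs,
        Dictionary<string, Func<Doc, FieldValue>> fields)
    {
        var entries = new List<IndexEntry>();
        foreach (var doc in docs)
        {
            var entry = new IndexEntry { Source = doc };
            foreach (var field in fields)
                entry.Fields[field.Key] = field.Value(doc);

            // Drop the document only when every indexed field is missing.
            if (entry.Fields.Values.All(v => v.IsMissing))
                continue;
            entries.Add(entry);
        }
        return entries;
    }

    // Equality query on one field, returning the matching source documents.
    public static List<Doc> Query(List<IndexEntry> index, string field, string wanted) =>
        index.Where(e => e.Fields[field].Matches(wanted)).Select(e => e.Source).ToList();
}
```

With this model, IndexSketch.Query(index, "User.Name", null) returns
only docs whose User exists but has no Name, while a query on "Title"
still sees docs where User is null.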
I'm a fan of MISSING_VALUE; indexes shouldn't stop working entirely
because a document is incorrect. The only question I have is what
implications this has for the client API. Perhaps the client API
should somehow report that this error case occurred?
Perhaps the real reason this is brought up is that error collection
for indexes is lacking? Because the indexing error collection is just
about completely worthless as it stands now.
However, it would help if the error results were more along the lines of:
Index: UserByID, document: user/1 (link to document), select new
{ doc.User.Name } results in NullReference Exception.
Backtracking to my earlier statement, perhaps it would make sense to
return all of this information to the client when you query an index,
so that you get something like .Advanced.IndexErrors, or have it come
back with the statistics object or something.
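As a rough sketch of what that could look like on the client side,
assuming a per-error record that travels with the query result: the
IndexError and QueryResult<T> types below are hypothetical and only
mirror the fields named in the example message above. They are not an
existing Raven client API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shapes for illustration; not an existing Raven client API.
public sealed class IndexError
{
    public string IndexName;      // e.g. "UserByID"
    public string DocumentId;     // e.g. "user/1"
    public string Expression;     // the index expression that failed
    public string ExceptionType;  // e.g. "NullReferenceException"

    public override string ToString() =>
        $"Index: {IndexName}, document: {DocumentId}, {Expression} results in {ExceptionType}.";
}

public sealed class QueryResult<T>
{
    public List<T> Results = new List<T>();

    // Errors recorded while building the index that served this query.
    public List<IndexError> IndexErrors = new List<IndexError>();
}
```

A caller could then check result.IndexErrors.Count after a query and
log or surface each entry, instead of finding out later that some
documents silently dropped out of the index.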
Another thing to consider: perhaps this shouldn't be a single fixed
convention. Maybe it should be defined as a default convention for
Raven and then exposed on indexes as an option that lets you choose
what happens on indexing errors: raise error, ignore entire document,
or ignore field / return null.
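The three choices could be sketched as a per-index option like this.
The enum and the FieldEvaluator helper are made-up names for
illustration, and the sketch only covers the NullReferenceException
case discussed in this thread; it is not how Raven evaluates index
definitions.

```csharp
using System;

// Hypothetical option names taken from the three choices listed above.
public enum IndexingErrorPolicy { RaiseError, IgnoreDocument, IgnoreField }

public static class FieldEvaluator
{
    // Evaluates one field expression for one document under the given policy.
    // Returns false when the whole document should be skipped; otherwise
    // 'value' holds the result, or null if the field was ignored.
    public static bool TryEvaluate<TDoc>(
        TDoc doc, Func<TDoc, string> field, IndexingErrorPolicy policy, out string value)
    {
        try
        {
            value = field(doc);
            return true;
        }
        catch (NullReferenceException) when (policy != IndexingErrorPolicy.RaiseError)
        {
            // With RaiseError the filter is false and the exception propagates.
            value = null;
            return policy == IndexingErrorPolicy.IgnoreField;
        }
    }
}
```

Note that IgnoreField here returns a plain null, so it cannot be told
apart from a Name that really is null; combining it with a distinct
MISSING_VALUE marker would keep that distinction.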