Raven4 - Auto Index sometimes gives different results than Static index

78 views
Skip to first unread message

Jens Pettersson

unread,
Nov 30, 2017, 2:58:04 AM11/30/17
to RavenDB - 2nd generation document database
Hello.

Version: RavenDB 4 RC2

We have an Auto Index that's behaving a bit weird. Look at the following screenshots:


The first is an Auto Index generated by the following queries:

await session.Query<PersistedGrant>()
                    .Where(x => x.SubjectId == subjectId)
                    .ToListAsync()
                    .ConfigureAwait(false);

and:

await session.Query<PersistedGrant>()
                    .Where(x => x.Expiration < DateTimeOffset.UtcNow)
                    .ToListAsync()
                    .ConfigureAwait(false);

The second is a manually created index:

from grant in docs.PersistedGrants
select new {
    grant.SubjectId,
    grant.Expiration
}

The manually created index gives the correct number of results (2 entries in this case) but the Auto index gives 12 results. 

If I look at the results of the Auto Index I can see that each entry is duplicated a couple of times:


In earlier versions of Raven you could look at how the Auto Index was created but I can't find a way to see it in Raven 4 (at least not in the Studio).


Not sure there's an error in how the index was generated though, because if I reset the auto index from the Studio, it shows the correct number of entries again.


This is a collection that gets written to quite frequently and is cleaned every hour (by the query that queries on ExpirationDate), but that shouldn't really affect how an index behaves, right?


I haven't got a way to reproduce this yet, but I wanted to post here to see if this is something you've encountered before. As soon as I can reproduce, I'll let you know.


//J




Oren Eini (Ayende Rahien)

unread,
Nov 30, 2017, 3:37:56 AM11/30/17
to ravendb
You can inspect the details of the index here:
Inline image 1

Can you show what is in there?

Hibernating Rhinos Ltd  

Oren Eini l CEO Mobile: + 972-52-548-6969

Office: +972-4-622-7811 l Fax: +972-153-4-622-7811

 


--
You received this message because you are subscribed to the Google Groups "RavenDB - 2nd generation document database" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ravendb+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Jens Pettersson

unread,
Nov 30, 2017, 4:05:46 AM11/30/17
to RavenDB - 2nd generation document database


At the moment of writing this, both my static and the auto index is showing the same results. I resetted the Auto index after posting the first message. I will keep track of this during the day.

//J
To unsubscribe from this group and stop receiving emails from it, send an email to ravendb+u...@googlegroups.com.

Oren Eini (Ayende Rahien)

unread,
Nov 30, 2017, 4:07:05 AM11/30/17
to ravendb
SubjectId and Expiration are both simple properties?
To unsubscribe from this group and stop receiving emails from it, send an email to ravendb+unsubscribe@googlegroups.com.

Jens Pettersson

unread,
Nov 30, 2017, 4:10:34 AM11/30/17
to RavenDB - 2nd generation document database
SubjectId is a string and Expiration is a nullable DateTime. Not sure why we made it nullable tbh, but that's how it is atm...

Oren Eini (Ayende Rahien)

unread,
Nov 30, 2017, 4:14:51 AM11/30/17
to ravendb
If you have this again, can you look at the raw index entries in the query page?
Inline image 1
To unsubscribe from this group and stop receiving emails from it, send an email to ravendb+unsubscribe@googlegroups.com.

Jens Pettersson

unread,
Dec 1, 2017, 1:53:55 AM12/1/17
to RavenDB - 2nd generation document database
Good morning. Everything was working fine during the day after I resetted the index and when I first looked in the Studio this morning there where 17 grants documents in the db and each index showed 17 entries.

Then the application that uses this database woke up (was idle on the IIS) and started its "TokenCleanup" (delete all expired grants once every minute) but then things got weird again. All expired grants was removed (only two new left in the db) but BOTH my indexes (auto and static) now shows 9 entries. I will add some images that shows the current state:

These are the actual grants documents (2) left in the database after the Token Cleanup did run:


When I query the index, this is the results I get:


Again, the two grants documents repeated...


This is when I query using the raw entries as Oren suggested:


Here are all results unique and the last two entries are the two grants documents that exists in the db, the rest are grants that got removed.



This time however, the Static index I created yesterday is behaving exactly the same way, so it doesn't seem to be related to it being an Auto index or not.

Also, but probably totaly coincidal, the TOTAL number of documents in this db is now (at the time of writing) 9, the same number of entries that those indexes are showing. Probably not related though...

//J

Oren Eini (Ayende Rahien)

unread,
Dec 1, 2017, 3:42:16 PM12/1/17
to ravendb
Can you send us the db in question?

It is possible that is related to the idle mode of auto index, but also that it isn't filtering removed docs
To unsubscribe from this group and stop receiving emails from it, send an email to ravendb+unsubscribe@googlegroups.com.

Jens Pettersson

unread,
Dec 1, 2017, 4:31:49 PM12/1/17
to RavenDB - 2nd generation document database
There's sensitive data in this db so I'm not comfortable sending it as it is right now. Also, exporting it and importing it fixes the indexes...

I will however try to reproduce in a separate db. Not sure if I'll manage this weekend, but I'll look into it.

My guess is that it's something to do with not filtering removed docs correctly but only after the indexes has been idle.

Is there a way to set the idle timeout on an index in order to force and index to be idle as it helps with troubleshooting?

Oren Eini (Ayende Rahien)

unread,
Dec 1, 2017, 4:47:27 PM12/1/17
to ravendb
To unsubscribe from this group and stop receiving emails from it, send an email to ravendb+unsubscribe@googlegroups.com.

Jens Pettersson

unread,
Dec 2, 2017, 10:13:31 AM12/2/17
to RavenDB - 2nd generation document database
Not really following how to set that property. Just updating the settings.json to either:

"Indexing.TimeToWaitBeforeMarkingAutoIndexAsIdleInMin": 1,

or

"Indexing": {
"TimeToWaitBeforeMarkingAutoIndexAsIdleInMin": 1
},

Didn't do anything for my "Auto Map" index.

I don't know how to access the configuration via the regular RavenDB.Client nuget package either, but maybe I'm just missing something...

//J

Jens Pettersson

unread,
Dec 2, 2017, 10:38:03 AM12/2/17
to RavenDB - 2nd generation document database
Haven't reproduced it "locally" yet, but when testing on our dev server that's been idle for a couple of hours now, those indexes only tracks ADDED documents (it increments the # entries right away) but doesn't update when I remove a document.

What's even more weird is that another Auto Index (in the same database, over a document collection that's rarely added to and never deleted from by our app) seems to work when I manually add and delete documents. The difference between that auto index and the one having problems is that the documents in that collection gets deleted on a regular basis.

Still no real success in finding a reliable way to reproduce, but I wanted to share my findings, however limited.

//J

Oren Eini (Ayende Rahien)

unread,
Dec 2, 2017, 1:55:21 PM12/2/17
to ravendb
Both would work.
To unsubscribe from this group and stop receiving emails from it, send an email to ravendb+unsubscribe@googlegroups.com.

Jens Pettersson

unread,
Dec 7, 2017, 6:52:07 AM12/7/17
to RavenDB - 2nd generation document database
I've finally managed to reproduce this issue, both locally and on one of our dev/test servers. It's a pain to reproduce as you have to wait for quite some time to let the server do whatever it does when it gets idle (IF that's related to the issue)...

In my repo of mixed Raven things I've got a WebAPI that can be used to reproduce: https://github.com/jenspettersson/RavenPerfs/tree/master/Bugs/Raven4.Indexing.Debug - How to reproduce is written in the readme.md file, but I'll write it here as well:

------------------------------------------------------------------------------------------------------------------------
RavenDB 4 - 4.0.0-rc-40023 - Community License

settings.json

{
 
"ServerUrl": "http://localhost:4040",
 
"DataDir": "APPDRIVE:/Raven",
 
"Indexing": {
   
"TimeToWaitBeforeMarkingAutoIndexAsIdleInMin": 1
 
},
 
"RunInMemory": false
}

To reproduce: Create a database called: `Raven.Index.Debug` Start the WebAPI (not sure it actually need to be a Web Application, but that's what our application that have this issue is so). It will run `PopulateData(...)` on startup to populate two document collections (TestDocument that uses an IdConvention and AnotherTestDocument that doesn't) with 20 documents each. One auto index for each Document Collection will also be created. 1. Open Raven Studio and look at the index list, both indexes should show "20 entries". 2. Remove a couple of documents manually from each collection, and look at the index list again. It will show the new number of entries correctly. 3. Stop the Web Application (could also remove the PopulateData call inside the Startup to prevent it from repopulating again) and also, close the Raven Studio (not sure this is needed) 4. Wait... I had to wait between 20-30 minutes for the issue to actually happen. 5. Open Raven Studio and look at the index list again, they will still show "X entries" (depending on how many documents you removed in step 2). 6. Manually remove a couple of documents from each collection again and look at the index list. This time it doesn't always decrement the # of entries. If it does, you probably didn't wait long enough in step 4... 7. Remove ALL documents from the collections and look at the index list again. Here it might differ, but for my latest attempt, the # of entries updated to "7 entries" and "1 entry" Now, you can start the Web Application again (be sure to remove the call to PopulateData(...)). You can do the following requests: GET: http://localhost:8802/api/anotherTestDocuments My result: ``` { "numberOfActualResults": 0, "stats": { "isStale": false, "durationInMs": 0, "totalResults": 7, "skippedResults": 7, "timestamp": "2017-12-07T11:18:58.3316286", "indexName": "Auto/AnotherTestDocuments/ByCreated", "indexTimestamp": "2017-12-07T11:18:58.3316286", "lastQueryTime": "2017-12-07T11:20:19.8947196", "timingsInMs": {}, "resultEtag": 4441579847025080415, "resultSize": 0, "scoreExplanations": {} } } ``` The numberOfActualResults is the Count of the actual query result, and the stats is what the .Statistics(...) is producing. You can also do a GET: http://localhost:8802/api/testDocuments and get a similar result.

------------------------------------------------------------------------------------------------------------------------

Ignore the IdConvention stuff, it was just a thought that it could have something to do with it, but it doesn't seem to be...

Hope this will help as I can't figure out what's going on...

//J

Tal Weiss

unread,
Dec 7, 2017, 9:12:02 AM12/7/17
to RavenDB - 2nd generation document database
Hi Jens,
I have ran through the process and waited about 35 minutes before starting to delete documents (step 6)
Deleting a couple more documents and looking at the amount of entries for the index yielded the expect result of 16 (i have deleted another 2 documents in step 2)
When i have deleted all the documents going to the index page yields 0 results but there seems to be two entries per index.
I'm not sure yet what are those but they are skipped and don't show when querying the index.
You can see in your app endpoint "skippedResults": 7,
i'll investigate exactly what are those but i wanted to hear if this is the behavior you're seeing too?
Because you were talking about actually getting duplicated entries before and i can't reproduce that issue...

To unsubscribe from this group and stop receiving emails from it, send an email to ravendb+unsubscribe@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.



--

Hibernating Rhinos Ltd  cid:image001.png@01CF95E2.8ED1B7D0

Tal Weiss l Core Team Developer Mobile:+972-54-802-4849

Office: +972-4-622-7811 l Fax: +972-153-4-622-7811l Skype: talweiss1982

RavenDB paving the way to "Data Made Simplehttp://ravendb.net/ 

Jens Pettersson

unread,
Dec 7, 2017, 9:28:10 AM12/7/17
to RavenDB - 2nd generation document database
First: The title of this thread is very misleading as that was the first initial thought... :)

The duplicated results I mentioned earlier only seems to be shown in the Studio. But in order to see those, then the indexes inside of the studio must start to show incorrect number of entries, and as it didn't happen to you I understand it must be hard to investigate further... But in short: There are no duplicate entries when doing a regular query from code.

Regarding the skipped documents when querying: That's what I see  (in the Statisticts(...)) as soon as my indexes inside the studio is starting to show different # of entries than actual documents. Say I have 10 documents in the db, my index might show "15 entries" (after waiting and deleting etc...). When querying that index using .Statistics(...) and only do a .ToList() on the query to get all items, I get 10 results back (i.e. all documents in that collection) but my Stats shows "totalResults: 15" and "skippedResults: 5". It might cause problems if trying to do some paging...

//J

Hibernating Rhinos Ltd  cid:image001.png@01CF95E2.8ED1B7D0

Oren Eini (Ayende Rahien)

unread,
Dec 7, 2017, 9:38:25 AM12/7/17
to ravendb
Initial thoughts is that this somehow relates to the idle indexing mode for auto indexing, investigating further
To unsubscribe from this group and stop receiving emails from it, send an email to ravendb+unsubscribe@googlegroups.com.

ar...@ayende.com

unread,
Dec 8, 2017, 9:50:52 AM12/8/17
to RavenDB - 2nd generation document database
Hi Jens,

We were able to reproduce it locally and fix it. Thank you for providing us with the repro. Details: http://issues.hibernatingrhinos.com/issue/RavenDB-9680

Thanks,
Arek

Jens Pettersson

unread,
Dec 8, 2017, 12:29:24 PM12/8/17
to RavenDB - 2nd generation document database
Thanks for the information and good to hear you’ve got a fix allready!

//J
Reply all
Reply to author
Forward
0 new messages