Newbie question about Id generation

Justin A

May 11, 2011, 9:18:09 AM
to ravendb
Hi folks,

I'm trying to get my head around how Id's are auto-generated with
RavenDB. The rough equivalent would be the IDENTITY type in an RDBMS.

Currently, the POCO needs to have a property called Id which is of
type string

Quote (http://ravendb.net/tutorials/hello-world)
"The only requirement is that a root entity string Id property
(configurable)"

Kewl. When I first save 10 documents, I get the following Id's :-

Id1, Id2, Id3, Id4... Id10

Then, when i try and store one more item later on in my integration
test, it gets the Id of gamefiles/1

I'm trying to store the exact same POCO though - just a new instance.

here's the pseudo poco code ....

// Populate this document database.
foreach (var gameFile in GameFileMockData.CreateMockGameFileData())
{
GameFileRepository.Save(gameFile);
}
Session.SaveChanges();

.. and then later on ...

var count = GameFileRepository.Find().Take(int.MaxValue).Count();

// Act.
GameFileRepository.Save(newGameFile);
Session.SaveChanges();

var countAfterSave =
GameFileRepository.Find().Take(int.MaxValue).Count();

// NOTE: GameFileRepository.Find().Last() is not implemented.
var lastGameFile =
GameFileRepository.Find().Skip(count).Take(1).SingleOrDefault();

// Asserts....

and the first time I do this, the Id is gamefiles/1
If i do this a few mins later, this document is now over-ridden and
the Id is now gamefiles/1025. (and it's gamefiles/2049 for the 3rd
time...)

I'm guessing Raven/Hilo/gamefiles is the way it calculates the
Identity. After the first add/store, it's ServerHi: 2. The second
store, ServerHi: 3

And the poco's are nothing fancy...

namespace Whatever.Foo
{
    public class Entity
    {
        public string Id { get; set; }
    }
}

namespace Whatever.Foo
{
    public class GameFile : Entity
    {
        // ... whatever properties go in here ...
    }
}

Can someone help out a newbie with this simple question, please?

Ayende Rahien

May 11, 2011, 9:20:54 AM
to rav...@googlegroups.com
We use hi lo to generate ids by default. When you create a new document store, you force regeneration of hilo, forcing us to start from the next batch.


Just to note, this will NOT work:  var count = GameFileRepository.Find().Take(int.MaxValue).Count();
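
For readers new to hi/lo, here is a minimal, self-contained sketch of the idea. It is not RavenDB's actual implementation: the names HiLoServer and HiLoClient are hypothetical, and the range size of 1024 is an assumption inferred from the ids reported in this thread (gamefiles/1, gamefiles/1025, gamefiles/2049). The "server" plays the role of the Raven/Hilo/gamefiles document and hands out a fresh range to every client that asks; a client that is thrown away takes its unused ids with it.

```csharp
using System;

// Hypothetical stand-in for the server-side hi/lo document:
// it only remembers which range ("hi") to hand out next.
public class HiLoServer
{
    private long nextHi = 1;
    private readonly object gate = new object();

    public long ReserveNextHi()
    {
        lock (gate) { return nextHi++; }
    }
}

// Hypothetical stand-in for the id generator owned by one document store
// instance. Not thread-safe; kept simple on purpose.
public class HiLoClient
{
    public const long Capacity = 1024; // assumed range size

    private readonly HiLoServer server;
    private long currentHi;
    private long currentLo = Capacity; // forces a reservation on first use

    public HiLoClient(HiLoServer server) { this.server = server; }

    public string NextId(string collection)
    {
        if (currentLo >= Capacity)
        {
            // Range exhausted (or never reserved): one round trip to the server.
            currentHi = server.ReserveNextHi();
            currentLo = 0;
        }
        currentLo++;
        return collection + "/" + ((currentHi - 1) * Capacity + currentLo);
    }
}
```

With one HiLoServer, the first HiLoClient yields gamefiles/1, gamefiles/2 and so on; a second HiLoClient created against the same server starts at gamefiles/1025, and a third at gamefiles/2049, even if the earlier clients used only a handful of ids.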

Justin A

May 11, 2011, 12:09:01 PM
to ravendb
I figured out my problem. My auto-generated collections (I use
NBuilder) were auto-setting the strings to name + increment. Hence the
Initialize getting Id1, Id2, etc.

But that doesn't explain why the second time i run the test ... and it
tries to put 10 more auto-gen'd docs in, they start at id 1025.

first run
auto-add 1, 2, 3, 4, -> 10.
then manually add 1 .. so 11 now exists. test ends.

run test again.

auto-add 10 docs .. but next number is 1025, 1026, 1027-> 1034.
manually add 1 .. 1035

??

> Just to note, this will *NOT* work: var count =
> GameFileRepository.Find().Take(int.MaxValue).Count();

So how can/should we get a count?

Itamar Syn-Hershko

May 11, 2011, 12:13:51 PM
to rav...@googlegroups.com
As Oren said, you were probably creating a new doc store, hence creating new HiLo values.

> So how can/should we get a count?

Count() will return the correct value for every query. You don't need to fetch all the results for that:

session.Query<Game>().Where(something).Count();

See:

Louis Haußknecht

May 11, 2011, 1:51:20 PM
to rav...@googlegroups.com

That's because of the HiLo algorithm Ayende mentioned.

On 11.05.2011 at 18:23, "Justin A" <justin...@stargategroup.com.au> wrote:

Justin A

May 11, 2011, 8:47:36 PM
to ravendb
On May 12, 2:13 am, Itamar Syn-Hershko <ita...@ayende.com> wrote:
> As Oren said, you were probably creating a new doc store, hence creating new
> HiLo values.

a new Doc Store? Hmm. i thought i was using the existing one...

var store = new DocumentStore {Url = "http://localhost:8080"};
store.Initialize();

wouldn't that code just re-use the existing store if one exists?
Because this is an MSTest this runs at the start of each TestMethod.
But I thought that wouldn't be an issue?

Ayende Rahien

May 11, 2011, 11:38:46 PM
to rav...@googlegroups.com
We are talking about DocumentStore instance, and you are creating a new one every time you call new DocumentStore.
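
One common way to end up with a single instance per process is a lazily initialized static holder. The sketch below uses a hypothetical placeholder class, FakeStore, in place of the real DocumentStore so that it compiles without the RavenDB client; StoreHolder is likewise an illustrative name. In real code the factory would construct and initialize the actual store. This only guarantees one instance per process: a new process still creates a new instance.

```csharp
using System;
using System.Threading;

// Hypothetical placeholder for an expensive, application-wide object
// such as a document store; it counts how many times it is constructed.
public class FakeStore
{
    private static int created;
    public static int CreatedCount { get { return created; } }

    public FakeStore() { Interlocked.Increment(ref created); }
}

public static class StoreHolder
{
    // Lazy<T> runs the factory at most once, even when several threads
    // ask for the value at the same time (its default thread-safety mode).
    private static readonly Lazy<FakeStore> store =
        new Lazy<FakeStore>(() => new FakeStore());

    public static FakeStore Store { get { return store.Value; } }
}
```

Every caller that reads StoreHolder.Store gets the same object, so the constructor (and anything it triggers) runs once for the lifetime of the process.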

Justin A

May 12, 2011, 3:55:07 AM
to ravendb
Correct - I've also been talking about the DocumentStore instance.
Which I thought is what we need to do, once per integration-test
harness. Eg, each time when CI kicks in or when we run 1-or-more
tests.

Or do I need to use some different syntax?

Even though this is a unit/integration test, I'm assuming that each
time the test harness is kicked in, the store is created once (ie. if
this is nunit, then in the [TestFixtureSetUp]). The sessions are
created with each [Test] (i'm still trying to see how I can do this).
Gotta love TDD :) Hate to be doing this in the WebApp first (even
though all the docs show how to do this really easily with
global.asax :P

Ayende Rahien

May 12, 2011, 9:00:18 AM
to rav...@googlegroups.com
> once per integration-test harness

What are you talking about?

This is specific for the actual unit testing framework that you are talking about.

If you are talking about xUnit, the ctor is called _per each test_.

Justin A

May 12, 2011, 10:08:46 AM
to ravendb
I'm using NUnit (via nuget).

And I meant that the DocStore code goes in the
[TestFixtureSetUp], which is run once per class (i believe) before the
[Test]s in the fixture are run. To me, that was a close
analogy to a website's application (test fixture setup) and request
(each [Test]).

(not the ctor which is called _per each test_ as u stated).

Ayende Rahien

May 12, 2011, 10:10:26 AM
to rav...@googlegroups.com
That should create it properly. How are you getting this behavior: running the tests one at a time, or running them all in debug mode and breaking in?

Justin A

May 16, 2011, 1:49:52 AM
to ravendb
I've made a quick sample .NET solution which is trying to show what
i'm doing.

I've also made a sample video screen-cast (sincere apologies -> i
really really suck at that!) to try and explain what i'm doing by
going through the steps.

Video on YouTube: http://www.youtube.com/watch?v=C51oADZfAmU
Sample Solution: http://www.easy-share.com/1915433790/LearningRavenDb.7z

Disclaimer: I didn't know where to stick the compressed solution - i
grabbed the first site on the net. I can move it somewhere else, if
someone prefers that.

Would be lovely if anyone here could spend a few mins looking at the vid?
It's in English, so once more - apologies to those not speaking
English.

Justin A

May 17, 2011, 4:03:50 AM
to ravendb
Drats!! i just noticed that the video has no sound :( I'll find the
proper copy OR create a new one and upload it.

oops *blush*

(and here i have been .. waiting on a knife-edge for some help for the
last 24 hours) ...

-J-

Justin A

May 17, 2011, 4:33:27 AM
to ravendb
sigh. i've put up a new video: http://www.youtube.com/watch?v=k15DI5T0w7Q

but it also doesn't have sound (youtube doesn't like converting .mov
to yt flash vids) .. hmmm time to convert!

Justin A

May 17, 2011, 8:09:48 AM
to ravendb
Ok - take 3 ...

Video Question: http://www.youtube.com/watch?v=YeNGOFl2qlQ

/me hopes someone might have a quick look/listen to it, to help out.

-J-

Itamar Syn-Hershko

May 17, 2011, 8:19:13 AM
to rav...@googlegroups.com
Justin, every time you run the test you create a new document store. When creating a new DocumentStore object which connects to a remote RavenDB server, new HiLo keys are generated and exchanged. This is why you see the gap in ids: the previous session which you ended had the unused range reserved for it. To understand it better, think about running the two tests in parallel and not sequentially - the only way they could add entities safely is when each has an ID range reserved for it.

In your application, as long as you don't close the docstore object, you're safe. You can use sequential IDs by setting them yourself, but then you run the risk of collisions.
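
The parallel case can be simulated in a few lines. This is a sketch under assumptions, not RavenDB code: ParallelHiLoDemo and its members are hypothetical names, the range size of 1024 is inferred from the ids seen earlier in the thread, and a shared counter stands in for the server. Each simulated test run reserves one range up front and then numbers its documents locally.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelHiLoDemo
{
    private const long Capacity = 1024; // assumed range size
    private const int DocsPerRun = 11;  // 10 seeded documents plus 1 extra
    private static long lastHi;         // shared "server" counter

    // Each simulated test run reserves its own range once, then numbers
    // its documents locally without touching the shared counter again.
    private static long[] RunTest(int documentCount)
    {
        if (documentCount > Capacity)
            throw new ArgumentOutOfRangeException("documentCount");

        long hi = Interlocked.Increment(ref lastHi);
        var ids = new long[documentCount];
        for (int i = 0; i < documentCount; i++)
            ids[i] = (hi - 1) * Capacity + i + 1;
        return ids;
    }

    // Runs two simulated test runs at the same time and reports whether
    // every generated id was unique.
    public static bool TwoRunsNeverCollide()
    {
        var seen = new ConcurrentDictionary<long, bool>();
        Parallel.Invoke(
            () => { foreach (var id in RunTest(DocsPerRun)) seen.TryAdd(id, true); },
            () => { foreach (var id in RunTest(DocsPerRun)) seen.TryAdd(id, true); });
        return seen.Count == 2 * DocsPerRun;
    }
}
```

Because the only shared step is the atomic range reservation, the two runs cannot hand out the same id, at the cost of leaving most of each range unused.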

Itamar Syn-Hershko

May 17, 2011, 8:31:00 AM
to rav...@googlegroups.com
To clarify a bit further: this happens in your RegisterRepositories, where you call new DocumentStore, and that runs whenever you click Run Tests.

Justin A

May 18, 2011, 5:44:35 AM
to ravendb
Hi Itamar.

I really really really appreciate that you've spent some of your time
having a look at my newbie question. Sincere regards!

Comments inline

>>every time you run the test you create a new document store.
Correct .. but isn't this the correct way to do it? To me, this is the
same analogy as if I restart the AppPool of my IIS website. So would
this mean that if I have a website, and I add 10 documents to the
collection when it first 'started up', the Id's would be 1-10. Then,
if the app-pool is restarted for that website, then the next document
to be added will have an id of .. i donno .. something else BUT most
likely not 11?

> When creating a new DocumentStore object which connects to a remote RavenDB
> server, new HiLo keys are generated and exchanged.

Hmm. hold on. I really don't understand something now, A DocumentStore
<-- isn't that the database i'm connecting to inside RavenDb.
Because i'm still stuck in RDBMS world, the poor analogy here would
be :-
* SqlServer / RavenDb == The Service/Servers.
* AdventureWorks / Default Database == the databases.


> In your application, as long as you don't close the docstore object, you're
> safe.

Hmm. so how would a database stay sequential? The service/console is
still running, but the unit test or more importantly, the web site
(app-pool) is constantly starting / stopping.
Do you not initialise? That's only a once off call? how would u know
NOT to call that?

>You can use sequential IDs by setting them yourself, but then you run
> into the risk of collisions.

Agreed - that sounds scary.

I guess I really don't understand this - so apologies as I come to
grips here. I really want to start using RavenDb :)

-J-

Itamar Syn-Hershko

May 18, 2011, 5:53:38 AM
to rav...@googlegroups.com
A great opportunity to test our new docs: https://github.com/ravendb/docs/blob/master/docs/consumer/basic-concepts.markdown. Let me know if you have any questions remaining after reading this.

And just one correction: the document store is NOT your app pool. The RavenDB server is the database, and the DocumentStore is what enables you to connect to it from a client application - it's the session factory if you wish. The communication channel to the server. EmbeddableDocumentStore is about the same thing, but slightly different: it doesn't make HTTP calls to the server, since it can call it directly.

Ayende Rahien

May 18, 2011, 8:36:36 AM
to rav...@googlegroups.com
inline

On Wed, May 18, 2011 at 12:44 PM, Justin A <justin...@stargategroup.com.au> wrote:
> Hi Itamar.
>
> I really really really appreciate that you've spent some of your time
> having a look at my newbie question. Sincere regards!
>
> Comments inline
>
>> every time you run the test you create a new document store.
> Correct .. but isn't this the correct way to do it? To me, this is the
> same analogy as if I restart the AppPool of my IIS website. So would
> this mean that if I have a website, and I add 10 documents to the
> collection when it first 'started up', the Id's would be 1-10. Then,
> if the app-pool is restarted for that website, then the next document
> to be added will have an id of .. i donno .. something else BUT most
> likely not 11?


Yes, that is the behavior, and that is by design.
Put simply, we have no way of knowing that we stopped at 10, and that no one else might generate an 11, so we have to start at the next range.
 
>> When creating a new DocumentStore object which connects to a remote RavenDB
>> server, new HiLo keys are generated and exchanged.
>
> Hmm. hold on. I really don't understand something now, A DocumentStore
> <-- isn't that the database i'm connecting to inside RavenDb.

No, that is the state of ALL of your connections to a specific RavenDB database.

 
> Because i'm still stuck in RDBMS world, the poor analogy here would
> be :-
> * SqlServer / RavenDb == The Service/Servers.
> * AdventureWorks / Default Database == the databases.


This is easier to make sense of if you are familiar with NHibernate: the DocumentStore == Session Factory.
 

>> In your application, as long as you don't close the docstore object, you're
>> safe.
>
> Hmm. so how would a database stay sequential? The service/console is
> still running, but the unit test or more importantly, the web site
> (app-pool) is constantly starting / stopping.

It wouldn't stay sequential, but it would always be incrementing and you'll never have conflicts.

Justin A

May 18, 2011, 8:54:57 AM
to ravendb
Comments Inline.

> A great opportunity to test our new docs:https://github.com/ravendb/docs/blob/master/docs/consumer/basic-conce....
> Let me know if you have any questions remaining after reading this.
Sure.
/me runs off to read.

Ok. back. Yep, so far so good. I liked the "Client API design
guidelines" ..
"...API design intentionally mimics the widely familiar NHibernate
API.
IDocumentStore ...
IDocumentSession ...
IDocumentQuery ... "

That surely helped me. But it still left me with a few basic
questions .. (which i'll talk about below).

> And just one correction: the document store is NOT your app pool. The
> RavenDB server is the database, and the DocumentStore is what enables you to
> connect to it from a client application - its the session factory if you
> wish.
Kewl :) This was also explained in those docs - which makes sense to
me, now.

Now, what I still felt was missing was any mention about document
identity ~generation~. There is a major paragraph titled "Document
Identifiers". This does explain what a document identity is and (which
i didn't know), the identity is unique in-such-a-way that regardless
of the instance 'type', the most recent document will persist over any
previous ones (of any type).
" There is absolutely nothing that would prevent you from saving a
Post with the document id of "users/1", and that would overwrite any
existing document with the id "users/1", regardless of which
collection it belongs to."

Bingo! (And WOW moment) never knew that! That's very important -
especially for all of us first timers to noSql. I always assumed that
the type AND the id defines a document as unique. Which is why you
don't have multiple Id's of '1' or '2', etc.. Especially how things
are visualized into collections. Dangerous trap, that (for us first
timers). Ok. kewl!
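
The overwrite behaviour quoted from the docs can be pictured with a plain dictionary keyed by the full string id. This is only an analogy in ordinary C#, with hypothetical User and Post classes and a hypothetical KeySpaceDemo wrapper; it is not RavenDB code.

```csharp
using System.Collections.Generic;

public static class KeySpaceDemo
{
    public class User { public string Name { get; set; } }
    public class Post { public string Title { get; set; } }

    // Stores a User and then a Post under the same id and returns
    // the type name of whatever is left under that id.
    public static string StoreBothUnderSameId()
    {
        // One key space for all documents: the id alone identifies a document.
        var documents = new Dictionary<string, object>();
        documents["users/1"] = new User { Name = "Justin" };
        documents["users/1"] = new Post { Title = "Hello" }; // replaces the User
        return documents["users/1"].GetType().Name;
    }
}
```

StoreBothUnderSameId() returns "Post": the second write wins because the type plays no part in the key.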

But there is nothing about how these id's are autogenerated. There are
some conventions that take place (and can be manually defined) in
another page (connecting-to-a-ravendb-datastore.markdown). But no
mention about how they are created and what we must care about.

In the G-Group forum here, people mention the HiLo algorithm .. sure ..
but do I need to care about this? Should I just want to know that the
id's will just be sequential? Earlier on, you said "When
creating a new DocumentStore object which connects to a remote RavenDB
server, new HiLo keys are generated and exchanged.". To me (remember:
i'm still trying to get my head around this), a DocumentStore object
is an expensive ~application-wide~ object. Gotcha. It's not the DB,
but the client-side object that will communicate with it.
"DocumentStore is what enables you to connect to it from a client
application". So why is this Store in charge (to some degree) with
generating these HiLo keys? That's what i don't get. Why can't it just
ask the DB for these keys so it can continue sequentially making
numbers. If my website app-pool recycles, then now my numbers will be
different, right? A new Application_OnStart is fired .. a new
DocumentStore instance is created .. and therefore new HiLo's are
generated.

Maybe the real question is: do I really care for incremental numbers?
That's sooo 1970's with RDBMS's... ?? I feel it's important but i'm
just new to this noSql thing.

I really want to learn this :)

So apologies if this feels like u're holding my hand and walking me
through a mine field. I suppose, if i can 'get it', then joe-lame-ass
will and there will be lots of RavenDB love and the world will be a
better place.

-J-

Oh - tiny typo:
doc: https://github.com/ravendb/docs/blob/master/docs/consumer/connecting-to-a-ravendb-datastore.markdown
search for: Database=Northwind - connect to a remove RavenDB
error: remove should be remote.

Itamar Syn-Hershko

May 18, 2011, 1:12:22 PM
to rav...@googlegroups.com
inline

On Wed, May 18, 2011 at 3:54 PM, Justin A <justin...@stargategroup.com.au> wrote:
> Now, what I still felt was missing was any mention about document
> identity ~generation~.

We will add that too.
 
> I always assumed that
> the type AND the id defines a document as unique. Which is why you
> don't have multiple Id's of '1' or '2', etc.. Especially how things
> are visualized into collections. Dangerous trap, that (for us first
> timers). Ok. kewl!

Note: the id we are discussing there is a string id which contains the collection name (entity type)!!! so an entity User with int id set to 1 will have a Raven ID of "users/1" (unless you override some things, which is out of scope atm).
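
As a rough illustration of how a type and a number combine into one string id, here is a simplified sketch. KeyConvention and its methods are hypothetical names, and the "lower-case the type name and append s" rule is an assumption made for illustration; the real convention is configurable and its pluralization is more involved.

```csharp
using System;

public class User { }
public class GameFile { }

public static class KeyConvention
{
    // Simplified, hypothetical convention: lower-cased type name plus "s".
    // Real pluralization rules are more involved than appending "s".
    public static string CollectionPrefix(Type entityType)
    {
        return entityType.Name.ToLowerInvariant() + "s";
    }

    // Full document key: collection prefix, a slash, then the number.
    public static string DocumentKey(Type entityType, long number)
    {
        return CollectionPrefix(entityType) + "/" + number;
    }
}
```

Under this simplified rule, DocumentKey(typeof(User), 1) gives "users/1" and DocumentKey(typeof(GameFile), 1025) gives "gamefiles/1025".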
 
> But there is nothing about how these id's are autogenerated. There are
> some conventions that take place (and can be manually defined) in
> another page (connecting-to-a-ravendb-datastore.markdown). But no
> mention about how they are created and what we must care about.

It will be discussed there too.
 
> In the G-Group forum here, people mention the HiLo algorithm .. sure ..
> but do I need to care about this? Should I just want to know that the
> id's will just be sequential? Earlier on, you said "When
> creating a new DocumentStore object which connects to a remote RavenDB
> server, new HiLo keys are generated and exchanged.". To me (remember:
> i'm still trying to get my head around this), a DocumentStore object
> is an expensive ~application-wide~ object. Gotcha. It's not the DB,
> but the client-side object that will communicate with it.
> "DocumentStore is what enables you to connect to it from a client
> application". So why is this Store in charge (to some degree) with
> generating these HiLo keys? That's what i don't get. Why can't it just
> ask the DB for these keys so it can continue sequentially making
> numbers.

For performance reasons (to save on round trips to the server), and to support the Unit of Work pattern efficiently. You want the client to be as independent of the server as it can be, and unless it reserves value ranges it cannot assign ids and be 100% sure there will be no collisions.


DocumentStore should be created once and in an application-wide scope, yes. Note that if you insert several thousand entities it will communicate with the server to get more HiLo values to use, so even then you're not guaranteed sequential ids.

> If my website app-pool recycles, then now my numbers will be
> different, right? A new Application_OnStart is fired .. a new
> DocumentStore instance is created .. and therefore new HiLo's are
> generated.

HiLo values just mean NEW entities saved with that doc store will get ids between Lo and Hi. It doesn't affect saved entities, nor updates.
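
The bounds of a range are simple arithmetic. The helper below is a hypothetical sketch that assumes a range size of 1024 and hi values starting at 1, both inferred from the ids reported in this thread rather than taken from RavenDB's source.

```csharp
public static class HiLoRange
{
    public const long Capacity = 1024; // assumed range size

    // First id of the range identified by the given hi value (hi starts at 1).
    public static long First(long hi) { return (hi - 1) * Capacity + 1; }

    // Last id of that range.
    public static long Last(long hi) { return hi * Capacity; }

    // Which range a given numeric id came from.
    public static long HiOf(long id) { return (id - 1) / Capacity + 1; }
}
```

Under these assumptions the ranges are 1 to 1024, 1025 to 2048, and 2049 to 3072, so ids 1 through 11 from the first test run fall in range 1 and ids 1025 through 1035 from the second run fall in range 2.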
 
> Maybe the real question is: do I really care for incremental numbers?
> That's sooo 1970's with RDBMS's... ?? I feel it's important but i'm
> just new to this noSql thing.

You probably don't.

Justin A

May 20, 2011, 9:20:34 AM
to ravendb
Thanks for the reply everyone :)

I just downloaded rob ashton's gallery and changed his ravendb from
embedded to a documentstore. ran the app, registered. added 4 photos
(imagedocuments/1 -> 4) killed the website (left raven running) and
started it up again. added a new photo and blamo - next ID is 1025.

So this confirms exactly what ayende and itamar are saying (i never
questioned or doubted that, though). I did this to confirm that my
code (read: newbie with no idea) and rob ashtons code (read: dude who
really knows what he's doing) are doing the same thing.

I suppose i really need to get my head around these document numbers
not being incremental. a decade and a half of sql server has mooshed
my brain to some extent.

cheers Hibernating Rhino team - thanks for helping me in this thread.
Much appreciated.

-J-