How can I describe the structure of the index tables?


kaz

Jan 12, 2010, 6:11:45 AM
to Google App Engine
Hi,

I'm writing some technical articles for App Engine engineers in Japan,
and I'm trying to draw a diagram that explains the structure of the
index tables (for kind, single-property, and composite indexes).

In the article "How Entities and Indexes are Stored", it says that the
EntitiesByProperty ASC/DESC table would contain the following data:

- App ID
- Kind
- Property name
- Property value
- Entity key

Can I assume that the actual Bigtable would be organized like this:

- Key: "App ID/Kind/Prop name/Prop value"
- Value: Entity key

That way, the Datastore can convert a single-property query like
"name = foo" into a Bigtable range scan for the key
"/MyApp/MyEntity/name/foo". Is this correct?
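
To make the proposed layout concrete, here is a minimal Python sketch of it. This is only a toy model of the idea in this post, not actual Datastore or Bigtable code: the sorted list stands in for a Bigtable, and `INDEX`, `scan_range`, the sample rows, and the "/"-separated string keys are all assumptions made for illustration.

```python
import bisect

# Toy stand-in for the EntitiesByProperty ASC table: (row key, entity key)
# pairs kept sorted by row key, as proposed above. All rows are made up.
INDEX = sorted([
    ("/MyApp/MyEntity/name/bar", "key1"),
    ("/MyApp/MyEntity/name/foo", "key2"),
    ("/MyApp/MyEntity/name/foo", "key3"),
    ("/MyApp/MyEntity/name/qux", "key4"),
    ("/MyApp/Other/name/foo", "key5"),
])

def scan_range(start_key, end_key):
    """Return entity keys for rows with start_key <= row key < end_key."""
    lo = bisect.bisect_left(INDEX, (start_key,))
    hi = bisect.bisect_left(INDEX, (end_key,))
    return [entity_key for _, entity_key in INDEX[lo:hi]]

prefix = "/MyApp/MyEntity/name/"

# Equality query "name = foo": scan the rows whose key is exactly ".../foo".
equal_foo = scan_range(prefix + "foo", prefix + "foo\x00")

# Range query "name >= c AND name < g": one contiguous scan as well.
between_c_and_g = scan_range(prefix + "c", prefix + "g")
```

In this model both the equality query and the inequality query become a single contiguous scan over sorted row keys, which is the property the question is about.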

Thanks,

Kaz

kaz

Jan 12, 2010, 7:13:40 PM
to Google App Engine
In addition,

- If it is correct, are numerical property values like 123 converted
to padded strings like "0000123", so that a numerical range query can
be converted to a lexical range scan?
- The article "How Entities and Indexes are Stored" says that App
Engine uses 6 Bigtables in total, while the article itself shows 7
Bigtables. Which one is correct?

Thanks,

Kaz

Nick Johnson (Google)

Jan 18, 2010, 4:54:09 AM
to google-a...@googlegroups.com
Hi,

Your original assessment is correct. For more details, see this video from Google I/O 2008: http://www.youtube.com/watch?v=tx5gdoNpcZM

On Wed, Jan 13, 2010 at 12:13 AM, kaz <kazun...@gmail.com> wrote:
> In addition,
>
> - If it is correct, are numerical property values like 123 converted
> to padded strings like "0000123", so that a numerical range query can
> be converted to a lexical range scan?

No, the keys are represented in a binary form that provides the correct ordering properties.
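
As an illustration of what "a binary form that provides the correct ordering properties" can mean, here is a small Python sketch. The encoding below (shift to unsigned, then pack big-endian) is a common order-preserving trick chosen as an assumption for this example; the format the Datastore actually uses is not described in this thread, and `encode_int` is a hypothetical helper.

```python
import struct

def encode_int(n):
    """Encode a signed 64-bit integer so byte order matches numeric order.

    Adding 2**63 shifts the signed range onto the unsigned range, and
    big-endian packing puts the most significant byte first, so comparing
    the resulting byte strings compares the numbers. Illustrative only.
    """
    return struct.pack(">Q", n + (1 << 63))

values = [123, -5, 45, 0, 7000]

# Unpadded decimal strings sort incorrectly: "123" comes before "45".
by_string = sorted(values, key=str)

# The fixed-width binary encoding sorts the same way as the numbers.
by_bytes = sorted(values, key=encode_int)
```

With such an encoding a numerical range query can still be served by a plain byte-wise range scan, with no string padding involved.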
 
> - The article "How Entities and Indexes are Stored" says that App
> Engine uses 6 Bigtables in total, while the article itself shows 7
> Bigtables. Which one is correct?

Can you provide a specific reference?

-Nick Johnson
 

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to google-a...@googlegroups.com.
To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.






--
Nick Johnson, Developer Programs Engineer, App Engine
Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration Number: 368047

kaz

Jan 18, 2010, 10:32:40 PM
to Google App Engine
Hi Nick,

> Your original assessment is correct. For more details, see this video from
> Google I/O 2008: http://www.youtube.com/watch?v=tx5gdoNpcZM

I watched it several months ago and watched it again today :) Yes,
this time I could hear Ryan saying: "I'm omitting key from (the
diagram of) the end of the index rows, but just remember that there's
a key here, so every time we scan we gotta go fetch entities"
(at 0:22).

Now I have another question: is the entity key always embedded in the
Bigtable row key of the index tables? That is, do the rows have no
columns other than the row key, because everything you need (the index
data) resides in the row key?
(Sorry for the trivial question, but I just want to make my diagram
more accurate.)

> No, the keys are represented in a binary form that provides the correct
> ordering properties.

I understand. No padding, but the values are encoded in a binary form.
Is it some proprietary format or a standard one like IEEE 754?

> Can you provide a specific reference?

Here are the 7 index tables mentioned in the doc:

1. Entities table
2. EntitiesByKind table
3. EntitiesByProperty ASC table
4. EntitiesByProperty DESC table
5. EntitiesByCompositeProperty table
6. Custom index table
7. Id sequences table

Is there any duplication or something?

Thanks,

Kaz



Nick Johnson (Google)

Jan 19, 2010, 10:38:58 AM
to google-a...@googlegroups.com
Hi,

On Tue, Jan 19, 2010 at 3:32 AM, kaz <kazun...@gmail.com> wrote:
> Now I have another question: is the entity key always embedded in the
> Bigtable row key of the index tables? That is, do the rows have no
> columns other than the row key, because everything you need (the index
> data) resides in the row key?

That's correct.
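
A minimal Python sketch of that arrangement, for anyone drawing the diagram: each index row is nothing but a row key whose last component is the entity key, and a query scans the matching rows and then fetches the entities in a second step. The tuple-shaped keys, the names `ENTITIES`, `INDEX_ROWS`, and `query_equal`, and the sample data are all hypothetical; this models the idea only and is not how the Datastore is implemented.

```python
import bisect

# Entities table: entity key -> entity. Sample data is invented.
ENTITIES = {
    "key1": {"name": "bar"},
    "key2": {"name": "foo"},
    "key3": {"name": "foo"},
}

# Index rows are bare row keys: (app, kind, property, value, entity key).
# There is no separate value column; the entity key is the last component.
INDEX_ROWS = sorted(
    ("MyApp", "MyEntity", prop, value, entity_key)
    for entity_key, entity in ENTITIES.items()
    for prop, value in entity.items()
)

def query_equal(app, kind, prop, value):
    """Scan the rows sharing the prefix, then fetch each entity by key."""
    prefix = (app, kind, prop, value)
    start = bisect.bisect_left(INDEX_ROWS, prefix)
    results = []
    for row in INDEX_ROWS[start:]:
        if row[:4] != prefix:
            break  # past the end of the matching range
        results.append(ENTITIES[row[4]])  # second lookup, by entity key
    return results

foo_entities = query_equal("MyApp", "MyEntity", "name", "foo")
```

Because the entity key is part of the row key, two entities with the same property value still produce distinct, adjacent rows.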
 

> > No, the keys are represented in a binary form that provides the correct
> > ordering properties.
>
> I understand. No padding, but the values are encoded in a binary form.
> Is it some proprietary format or a standard one like IEEE 754?

I'm not 100% certain - but in any case the internal representation is irrelevant. :)
 

> > Can you provide a specific reference?
>
> Here are the 7 index tables mentioned in the doc:
>
> 1. Entities table
> 2. EntitiesByKind table
> 3. EntitiesByProperty ASC table
> 4. EntitiesByProperty DESC table
> 5. EntitiesByCompositeProperty table
> 6. Custom index table
> 7. Id sequences table
>
> Is there any duplication or something?

I believe the discrepancy is the explicit mention of the "id sequences table".

-Nick Johnson
 



kaz

Jan 19, 2010, 7:22:06 PM
to Google App Engine
Hi Nick,

> I believe the discrepancy is the explicit mention of the "id sequences
> table".

I see! Thanks so much for the explanation!

Kaz
