Stephen,
My thinking on this is:
Say you have a Month column, and it has the default lexically sorted
ascending and descending indexes on it.
My assumption is, when you insert a new row with a Month value
defined, the entire index will have to be updated (no matter how many
shards or tablets it's broken up into).
This is supported by this quote from the Index Building document you
linked to:
"When an entity is created or deleted, all of the index rows for that
entity must be updated."
So when you roll around to June 2010 and start inserting rows for it,
if you have a generic Month column, all the indexes for every month of
every year will have to be updated.
Now, if you had them partitioned out in the way I described,
presumably, you would just need to rebuild the indexes for entries
with a June column.
Granted, this isn't optimal either. So why not go all out and use
"June2009", "June2010" columns (and still keep the "y2009", "y2010"
columns for quickly grabbing yearly data)? That way, once a given
month has passed, its indexes would never need to be rebuilt again.
And since you'd be using the Expando model, you wouldn't need
separately defined Models for each MonthYear combo.
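
To make the naming scheme concrete, here is a rough sketch of how the
column names could be derived from a date. The helper names
(period_property_names, tag_row) are hypothetical, and a plain dict
stands in for the entity so the snippet runs on its own. With an
Expando entity I'd expect the same loop to use setattr(entity, name,
True) instead.

```python
import datetime

# Fixed English month names, so the column names don't depend on locale.
MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def period_property_names(day):
    """Return the month-year and year column names for a date.

    Hypothetical helper: e.g. 15 June 2010 gives "June2010" and "y2010".
    """
    month_year = "%s%d" % (MONTH_NAMES[day.month - 1], day.year)
    year = "y%d" % day.year
    return [month_year, year]


def tag_row(row, day):
    """Set a True flag on the row for each period column.

    `row` is a plain dict standing in for an entity with dynamic
    properties (an assumption made so the sketch is self-contained).
    """
    for name in period_property_names(day):
        row[name] = True
    return row


row = tag_row({"visits": 3}, datetime.date(2010, 6, 15))
```

A row tagged this way only ever carries the columns for its own month
and year, which is the partitioning I'm hoping for.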
Mainly, I see this method as a way of helping BigTable out in
understanding how to partition out my data...
Does this make sense?
I think your ListProperty idea sounds efficient to implement, but I
think it would run into that index-updating issue once you got into
the hundreds of millions or billions of rows: every new insert would
require the indexes on the dimension and date columns to be rebuilt.
Now, these are my assumptions, which are fraught with peril, so I'm
posting them here to see if anyone else out there is of a mind to
think it through with me.
Thanks for any input.
On Nov 4, 9:10 am, Stephen <sdea...@gmail.com> wrote:
> On Nov 3, 11:35 pm, Eli <eli.jo...@gmail.com> wrote:
>
> > (This is just the first usage example that comes to mind. This row
> > naming method could be used for all sorts of set intersection stuff,
> > and would cut down on insert times due to the fact that it should
> > partition out the indexes when dealing with humongous datasets).
>
> I don't think what you're proposing is a physical optimisation because
> indexes are not discrete objects as they are in a traditional
> relational database:
>
> http://code.google.com/appengine/articles/index_building.html#Index%2...