[erlang-questions] Getting attributes

Steve Davis

unread,

Oct 13, 2012, 7:23:54 PM10/13/12

to Erlang-Questions Questions

Hi all,

Suppose I have a 2 dimensional plane {x, y} of 64k by 64k coordinates.

Each coordinate has a mutable attribute {{x, y}, attribute}

If I need to query: What are the attributes of positions "next to" my position, i.e: x +- 1, y +- 1?

What is the correct solution to this without storing everything in "Oracle" and using relational queries?

Can this be sensibly maintained in a KV store?

/s
_______________________________________________
erlang-questions mailing list
erlang-q...@erlang.org
http://erlang.org/mailman/listinfo/erlang-questions

Steve Davis

unread,

Oct 13, 2012, 7:38:32 PM10/13/12

to Erlang-Questions Questions

I understated the issue:

The position of the entity can also change:
{id, {x,y}, attribute}

So, in fact, the id is the only constant and the pos and attribute are independently mutable.

Mike Oxford

unread,

Oct 13, 2012, 9:52:17 PM10/13/12

to Steve Davis, Erlang-Questions Questions

Sure; you have two ways of doing this.

1) Key = {x,y} and just make queries for each surrounding node to get the attributes. This requires more read calls.

2) Key = {x,y} and you COPY all of the data from the surrounding nodes into it. This requires only 1 read call, but much more work on the write-side as things move around.

On a heavy-write system, #2 will spend a lot of time blocked as everything is updating everything around it when it moves (from cluster + to cluster.)

#1 takes more reads which are relatively cheap.

It depends on your read/write profile balance. On a very heavy "read" system where thing do not move often, #2 will be best. On a system with lots of moving around, #1 is better. Where that boundary line lies is specific to your own application's patterns and this figuring out this pattern will dictate which method is best (or even if you really want to go KV for this.) Profile it, even if your first-pass profile is just on paper/pencil.

-mox

Jim

unread,

Oct 13, 2012, 10:09:16 PM10/13/12

to Mike Oxford, Steve Davis, Erlang-Questions Questions

How about use two index columns consisting if the bit representation of the x and y coordinates as in {key, attribute, 000000000001, 000000000010}. Use mnesia and just do the math on the bits to calculate the next select

Depending on your problem, It's possible for the key to be {bitstring1,bitstring2}

Sent from my iPhone

Dmitry Kolesnikov

unread,

Oct 14, 2012, 2:55:28 AM10/14/12

to Steve Davis, Erlang-Questions Questions

Yes,

You can use z-cuve to handle it within key/value store.
http://en.wikipedia.org/wiki/Z-order_curve

alternative approach, is R-tree but 64k-by-64k might be too much to keep it in-memory. Therefore, MySQL + RTRee index should help you.

- Dmitry

Steve Davis

unread,

Oct 14, 2012, 11:29:52 AM10/14/12

to Mike Oxford, Jim, Dmitry Kolesnikov, Erlang-Questions Questions

Thanks to you all for the excellent practical and educational advice. It's got me moving forward again :-)

Best regards,

Toby Thain

unread,

Oct 14, 2012, 1:31:20 PM10/14/12

to Dmitry Kolesnikov, Steve Davis, Erlang-Questions Questions

On 14/10/12 2:55 AM, Dmitry Kolesnikov wrote:
> Yes,
>
> You can use z-cuve to handle it within key/value store.
> http://en.wikipedia.org/wiki/Z-order_curve
>
> alternative approach, is R-tree but 64k-by-64k might be too much to keep it in-memory. Therefore, MySQL + RTRee index should help you.
>

If using a relational database, spatial indexing is not required. Simple
tuples are quite enough, just put a composite index on (x,y) since all
fetches are (constant,constant). And you can get all neighbours in a
single statement.

--Toby

Dmitry Kolesnikov

unread,

Oct 14, 2012, 3:38:13 PM10/14/12

to Toby Thain, Steve Davis, Erlang-Questions Questions

Toby,

Unfortunately, composite index on (x,y) does not give you an optimal performance due to B-Tree index nature. Any (x,y) proximity requests are converted into random disk I/O because B-Tree clusters data first component and then by second. This is a reason why z-order curve were invented. It allows to fit spacial data into B-Tree index and provide you sequential disk I/O for proximity queries.

- Dmitry

Toby Thain

unread,

Oct 14, 2012, 4:43:39 PM10/14/12

to Dmitry Kolesnikov, Steve Davis, Erlang-Questions Questions

On 14/10/12 3:38 PM, Dmitry Kolesnikov wrote:
> Toby,
>
> Unfortunately, composite index on (x,y) does not give you an optimal performance due to B-Tree index nature. Any (x,y) proximity requests are converted into random disk I/O because B-Tree clusters data first component and then by second. This is a reason why z-order curve were invented. It allows to fit spacial data into B-Tree index and provide you sequential disk I/O for proximity queries.
>
> - Dmitry
>
>

A good point about locality. Does the OP require this level of
performance, though? There is probably low hanging fruit before one gets
to the point of tuning RDBMS index types.

--Toby

Anthony Molinaro

unread,

Oct 15, 2012, 3:15:37 PM10/15/12

to Steve Davis, Erlang-Questions Questions

On Sat, Oct 13, 2012 at 06:38:32PM -0500, Steve Davis wrote:
> I understated the issue:
>
> The position of the entity can also change:
> {id, {x,y}, attribute}
>
> So, in fact, the id is the only constant and the pos and attribute are independently mutable.
>
>
>
> On Oct 13, 2012, at 6:23 PM, Steve Davis <steven.cha...@gmail.com> wrote:
>
> > Hi all,
> >
> > Suppose I have a 2 dimensional plane {x, y} of 64k by 64k coordinates.
> >
> > Each coordinate has a mutable attribute {{x, y}, attribute}
> >
> > If I need to query: What are the attributes of positions "next to" my position, i.e: x +- 1, y +- 1?
> >
> > What is the correct solution to this without storing everything in "Oracle" and using relational queries?
> >
> > Can this be sensibly maintained in a KV store?

This problem reminds me of one I had to do to map latitude and longitude
to location, then look up the closest location based on the latitude and
longitude of an input.

I used an ordered_set ets table inserting records like

{Latitude, Longitude, Record}

Then when looking things up given an input latitude and longitude, I would
add/subtract a small number of degrees to the input to get a range of
latitudes and longitudes.

Then query ets first using ets:prev/2 with the input latitude and walking
down by repeatedly calling ets:prev/2 until I either hit the start of the
table or the lower bound on the latitude.

You then do the same with ets:next/2 to the upper bound and
finally ets:lookup/2 in case its an exact match.

I would pass the upper and lower bound of the longitude along so I could
filter out things out of range as I was walking through the latitudes.

I've got about 60000 entries in the ets table and queries take about
0.085 milliseconds on average. I found that using ets:prev/2 and ets:next/2
was much faster than say ets:match with a matchspec to do the range
checking. I've also got another ordered_set table with 11 million entries
which I don't do range checks but just an ets:prev/2 and an ets:lookup/2
and that runs even faster (average about 0.010 millis per pair of calls).

Unfortunately the code is not shareable, but its relatively straightforward
so shouldn't be hard to recreate.

Good Luck,

-Anthony

--
------------------------------------------------------------------------
Anthony Molinaro <anth...@alumni.caltech.edu>

Reply all

Reply to author

Forward