Reverse index mapping

67 views
Skip to first unread message

shashankkr

unread,
Aug 3, 2013, 3:32:16 PM8/3/13
to uzay...@googlegroups.com
Hi Daniel,
Thanks for updating the library. I am however stuck at a point.
I generated a HBase table having 10 million points as per your instruction in howto wiki with hilbert index of the points (x,y) as the row key. I am now writing a mapreduce mapper method which executes a range query.
What I have to do is that, I need to convert the index back to coordinates. For example, String index = 1010 needs to be converted as x = 10 , y = 10 using inverse Hilbert mapping so that I can further filter out results. 
In the wiki you mentioned :

To extract the coordinates back from an index, use SpaceFillingCurve.index(index, point) with a pre-allocated coordinate array.

but I am not able to understand how this works. Can you please help me.

Take care,
Shashank 

Daniel

unread,
Aug 3, 2013, 5:16:20 PM8/3/13
to uzay...@googlegroups.com
Hi Shashank,

Please create a small self-contained example that shows your problem. To make things easier you can use the included mock HTable implementation.

Thanks,
Daniel

shashankkr

unread,
Aug 4, 2013, 3:06:36 AM8/4/13
to uzay...@googlegroups.com
Hi Daniel,
Thanks for the prompt reply. Here is the sample code,

I generated 10 million points using mapreduce like this 

public void map (LongWritable k1, Text v1, Context context) throws InterruptedException, IOException {
long limit = 0, max = 10000000;  // generates uniform dataset of coordinates from 0 to 10,000,000 e.g (0,0) (1,1) ...... (999999,999999)
try {
// construct key 
long i,j=limit;
for(i=limit; i<max; i++) {
BitVector bv = HilbertUtils.HilbertConvertor(i, j);  // returns BitVector of hilbert mapping of i and j as per the wiki
String rowKey = bv.toString();
long value = Math.abs(System.nanoTime()); // sample value
String svalue = Long.toString(value);
Put put = new Put(Bytes.toBytes(rowKey)); 
put.add(Bytes.toBytes("location"), Bytes.toBytes("longlat"), Bytes.toBytes(svalue));
context.write(new ImmutableBytesWritable(Bytes.toBytes(rowKey)), put); // put key value into HTable
context.setStatus("Inserting "+rowKey);
j++;

for example :
<key>                                                                                          <family:qualifier>     <value>
100000001010001010001000100000100000000010100010                longitude:latitude      567777432169954 

Now I have to write another mapper function which will execute a range query like this :

take the key 
calculate inverse hilbert mapping to get original coordinates 
check if its in range (min<x<max) && (min<y<max)
write key-value in another table 

public void map (ImmutableBytesWritable rowkey, Result result, Context context) throws InterruptedException, IOException {
// sample Query values
int xmin = 711845 ; // 255000
int ymin = 711845 ;
int xmax = 1711845;
int ymax = 1711845;
// extract key and check for range 
String composite = Bytes.toString(rowkey.get());

{
// code to calculate inverse h mapping of String Composite to get x and y 
if ((xmin <= x) && ( xmax >= x) && ( ymin <= y) && (ymax >= y)) {
try {
Put put = new Put(rowkey.get());
for (KeyValue kv : result.raw()) {
put.add(kv);
}
context.write(rowkey, put);
} catch (InterruptedException e) {
throw new IOException(e);
}
}
}
 
--------------------------------------------------------------------------------------------------
I am stuck at this point as to how to modify this code to calculate inverse hilbert mapping :
CompactHilbertCurve chc = new CompactHilbertCurve(new int[] {2, 2});
List<Integer> bitsPerDimension = chc.getSpec().getBitsPerDimension();
BitVector[] p = new BitVector[bitsPerDimension.size()];
for (int i = p.length; --i >= 0;) {
        p
[i] = BitVectorFactories.OPTIMAL.apply(bitsPerDimension.get(i));
}
p
[0].copyFrom(0b10);
p
[1].copyFrom(0b11);
BitVector chi = BitVectorFactories.OPTIMAL.apply(chc.getSpec().sumBitsPerDimension());
chc
.index(p, 0, chi);
System.out.println(String.format(Locale.ROOT, "index([0b%s, 0b%s])=0b%s", p[0], p[1], chi));
// Next line overwrites whatever is already written in p.
chc
.indexInverse(chi, p);
System.out.println(String.format(
       
Locale.ROOT, "indexInverse(0b%s)=[0b%s, 0b%s]", chi, p[0], p[1]));

kindly let me know if its still not clear. I really appreciate your concern in helping me out in this problem.

Thanks a lot.

take care 
-shashank 

Daniel

unread,
Aug 4, 2013, 4:01:25 AM8/4/13
to uzay...@googlegroups.com
Hi Shashank,

There is no method to parse the result of calling BitVector.toString(), because it's not intended to be used that way. It is simply a human-friendly representation of a bit vector useful for debugging purposes, and it's format is subject to change.

Rather than putting the result of Bytes.toBytes(BitVector.toString()) as a row key, you should use instead BitVector.toBigEndianByteArray(). See HBaseQueryTest lines 344-346. So it will look something like:

BitVector index = BitVectorFactories.OPTIMAL.apply(spec.sumBitsPerDimension());
for (BitVector[] point : points) {
    sfc.index(point, 0, index);
    byte[] row = index.toBigEndianByteArray();
    Put put = new Put(row);
    // (batch-)insert the "put"
}

For the reverse transformation you can have a look at HBaseQueryTest, lines 242-246, 253-257 and 273-279. Here are the main steps:

SpaceFillingCurve sfc = new CompactHilbertCurve(spec);
BitVector[] point = new BitVector[spec.getBitsPerDimension().size()];
BitVector index = BitVectorFactories.OPTIMAL.apply(spec.sumBitsPerDimension());
for (int j = 0; j < spec.getBitsPerDimension().size(); ++j) {
  point[j] = BitVectorFactories.OPTIMAL.apply(spec.getBitsPerDimension().get(j));
}
for (Result result : scanner) {
    byte[] row = result.getRow();
    index.copyFromBigEndian(row);
    sfc.indexInverse(index, point);
    // Assuming you have at most 63 bits in x and at most 63 bits in y.
    long x = point[0].toExactLong();
    long y = point[1].toExactLong();
    // Perform other checks with x and y.
}

Daniel

Il giorno sabato 3 agosto 2013 21:32:16 UTC+2, shashankkr ha scritto:

shashankkr

unread,
Aug 4, 2013, 2:29:17 PM8/4/13
to uzay...@googlegroups.com
Hi Daniel,
thanks for pointing that out to me. However, I am still getting little bit confused (sorry, new in Java ) 

 int x = 0, y =0;
  for( x=0; x<10; x++) {   // to generate uniform data (x,y) = { (0,0) , (1,1) and so on
String bx  = Integer.toBinaryString(x);
String by = Integer.toBinaryString(y);
 MultiDimensionalSpec spec = new MultiDimensionalSpec(Ints.asList(bx.length(),by.length())); // assume this sets the number of bits in each dimension
                                                                                                                                                  // I have less than 64 bits in both x and y dimension
          SpaceFillingCurve sfc = new CompactHilbertCurve(spec);
 BitVector index = BitVectorFactories.OPTIMAL.apply(spec.sumBitsPerDimension());
 BitVector[] points = new BitVector[spec.getBitsPerDimension().size()];  // how to save x and y values to compute the H value ? 
  
          for (BitVector[] point : points) {
         sfc.index(point, 0, index);
         byte[] row = index.toBigEndianByteArray();
         Put put = new Put(row);
         // (batch-)insert the "put"
      }


Thanks a lot for all your help.
take care
- Shashank


On Saturday, August 3, 2013 2:32:16 PM UTC-5, shashankkr wrote:

shashankkr

unread,
Aug 4, 2013, 8:33:34 PM8/4/13
to uzay...@googlegroups.com
Hi Daniel,
I think I was trying to do too much at once.
In my original code, instead of BitVector.toString() , I am now putting
byte[] rowKey = BitVector.toBigEndianByteArray();  

However, When I try to query for a point :

      {
            HTable htable = new  HTable(conf, tableName); 

       long x = 76224;  // sample query point
   long y = 76224;
   BitVector bv = HilbertUtils.HilbertConvertor(x, y);
byte[] rowKey = bv.toBigEndianByteArray();
Get get = new Get(rowKey);
Result rs = htable.get(get);
for (KeyValue kv : rs.raw()) {
                 System.out.println("key:value "+new String(kv.getRow())+":"+new String(kv.getValue()));
}

For few points it does not return any value and new String(kv.getRow()) prints weird symbols but it is returning value if I change x and y. So am I inserting the rowkey in correct way? I am working on the inverse mapping .. will let you know about my findings. 

Thanks for all your help.
-Shashank   

On Saturday, August 3, 2013 2:32:16 PM UTC-5, shashankkr wrote:
Reply all
Reply to author
Forward
0 new messages