Thanks, Alex Baranau, for your quick response.
What is the access pattern of the data in HBase?
We maintain the row key solely to achieve uniqueness.
But at the same time the data has to be kept in sequence, so the keys
must be in incremental order.
I guess you process data with MapReduce job (to create index in
Solr?)
No, I have custom logic that generates one map file per user. Solr
then reads that file for authentication purposes.
Or you use it solely to achieve uniqueness?
Yes. Let's say there are 10 documents (in production I have 10 billion)
and 10 users.
USER1[0,0,1,0,1,0,0,0,0,1] == User 1 has access to doc 3, doc 5, and doc
10. At the Lucene level I decide whether the user has access to a given
row key (the unique key in Solr).
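The per-user bitmap described above could be sketched roughly as follows. This is only an illustration, not the actual map file format: the helper names (make_bitmap, grant, has_access), the 1-based document numbering, and the bit order within each byte are all assumptions.

```python
def make_bitmap(num_docs):
    """Allocate a zeroed bitmap with one bit per document."""
    return bytearray((num_docs + 7) // 8)


def grant(bitmap, doc_id):
    """Set the bit for a 1-based doc_id (assumed to equal the row key)."""
    index = doc_id - 1
    bitmap[index // 8] |= 1 << (index % 8)


def has_access(bitmap, doc_id):
    """Return True if the bit for the 1-based doc_id is set."""
    index = doc_id - 1
    return bool(bitmap[index // 8] & (1 << (index % 8)))


# User 1 from the example: access to doc 3, doc 5, and doc 10.
user1 = make_bitmap(10)
for doc in (3, 5, 10):
    grant(user1, doc)
```

A lookup is then a single bit test per (user, document) pair, which only works because the row key doubles as the bit position.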
Perhaps, to better understand your use-case, I should ask: have you
considered using random UUID as a row key? If yes, why it is not
appropriate to use it in your case?
No. In our logic each row key is the bit position in my custom map
file, so the keys have to be sequential numbers and a random UUID would
not work. With 8 million documents (one bit each), the custom map file
is 1 MB.
My MapReduce program generates this custom map file.
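The size figure follows from one bit per document. A quick check of the arithmetic, assuming an uncompressed bitmap and decimal megabytes (the function name bitmap_size_bytes is made up for illustration):

```python
def bitmap_size_bytes(num_docs):
    """Bytes needed for an uncompressed bitmap with one bit per document."""
    return (num_docs + 7) // 8


# 8 million documents, as in the example above.
small = bitmap_size_bytes(8_000_000)

# 10 billion documents, the stated real-world volume.
large = bitmap_size_bytes(10_000_000_000)
```

Here small comes to 1,000,000 bytes (1 MB), and large to 1,250,000,000 bytes, so roughly 1.25 GB per user at the full 10 billion documents if the file is stored uncompressed.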
Do you write from different machines in HBase?
No, writes will come from one machine only.
On Jul 20, 7:23 pm, Alex Baranau <alex.barano...@gmail.com> wrote:
> Hi Syed,
>
> Before I can judge whether HBaseWD is the right thing to use for you, let
> me ask you several Qs:
>
> What is the access pattern of the data in HBase? I guess you process data
> with MapReduce job (to create index in Solr?). Is this the primary access
> pattern?
>
> I guess apart from uniqueness of row keys, the reason why you use
> continuously incrementing row key is to be able to fetch "newly arrived
> (non-processed) delta" (e.g. to feed into MR job). Is that so? Or you use it
> solely to achieve uniqueness? Do you write from different machines in HBase?
>
> Perhaps, to better understand your use-case, I should ask: have you
> considered using random UUID as a row key? If yes, why it is not
> appropriate to use it in your case?
>
> Alex Baranau
> ------
> Sematext :: http://blog.sematext.com/ :: Hadoop - HBase - ElasticSearch - Solr
>
> P.S. Thanx for the interest in HBaseWD project.
>