I see two table-scanning APIs in Google Cloud's Bigtable sample code:
1) using the bigtable client from the google.cloud module directly:

from google.cloud import bigtable

client = bigtable.Client(project=project_id, admin=True)
instance = client.instance(instance_id)
table = instance.table(table_id)
partial_rows = table.read_rows(...)
partial_rows.consume_all()
for row_key, row in partial_rows.rows.items():
    print(row_key, row)  # process each row here
2) using the bigtable and happybase clients from the google.cloud module together (the HBase-compatible layer):

from google.cloud import bigtable
from google.cloud import happybase

client = bigtable.Client(project=project_id, admin=True)
instance = client.instance(instance_id)
connection = happybase.Connection(instance=instance)
table = connection.table(table_name)
for key, row in table.scan():
    print(key, row)  # process each row here
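To compare them myself I wrapped each scan in a small timing harness; the helper `time_scan` below is my own, not part of either API, and the dummy iterable just stands in for `table.scan()` or the `read_rows(...)` results:

```python
import time
from typing import Callable, Iterable, Tuple

def time_scan(scan: Callable[[], Iterable[Tuple[bytes, dict]]]) -> Tuple[int, float]:
    """Drain a (key, row) iterable and return (row_count, elapsed_seconds)."""
    start = time.perf_counter()
    count = 0
    for _key, _row in scan():
        count += 1
    return count, time.perf_counter() - start

# Dummy rows standing in for a real Bigtable scan:
fake_rows = [(b"r1", {}), (b"r2", {})]
n, elapsed = time_scan(lambda: iter(fake_rows))
print(n)  # 2
```

Either snippet can be passed in as `lambda: table.scan()` (approach 2) or a generator over `partial_rows.rows.items()` (approach 1), so the comparison is apples to apples.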
What are the differences between these two approaches in terms of performance and suitability for Spark jobs?