Kylin + Spark for a Customer Experience Management solution


Gaspare Maria

Jun 4, 2015, 11:42:47 AM
to kylin...@googlegroups.com
Hi,

I need to build an OLAP layer on top of HBase to provide a Customer Experience Management solution.

Currently, we collect and load ~200 GB of data (CDRs, logs from probes, etc.) per day into HBase with one year of retention (~2 PB). The data are organized as time series in HBase. Row-key scans over a time range are very fast, and so are lookups on secondary indexes (we developed coprocessors to handle the indexes).
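To make the time-range scan concrete, here is a minimal sketch in plain Python. The key layout (an 8-byte big-endian timestamp followed by a subscriber id) and the names `row_key` and `scan_range` are assumptions for illustration only, not our actual schema, and the sorted list merely stands in for an HBase table.

```python
import struct
from bisect import bisect_left


def row_key(ts_epoch_s: int, subscriber_id: str) -> bytes:
    """Build a hypothetical row key: big-endian timestamp, then subscriber id.

    HBase sorts row keys bytewise, so a big-endian timestamp prefix keeps
    rows in chronological order.
    """
    return struct.pack(">Q", ts_epoch_s) + subscriber_id.encode("utf-8")


def scan_range(sorted_keys, start_ts: int, stop_ts: int):
    """Emulate a scan with an inclusive start row and an exclusive stop row."""
    lo = bisect_left(sorted_keys, struct.pack(">Q", start_ts))
    hi = bisect_left(sorted_keys, struct.pack(">Q", stop_ts))
    return sorted_keys[lo:hi]


# A sorted in-memory list stands in for the table.
keys = sorted(row_key(ts, "sub-1") for ts in (100, 200, 300, 400))
hits = scan_range(keys, 200, 400)
```

Because the keys are sorted, the scan only touches the contiguous slice between the two boundaries, which is why such scans stay fast regardless of table size.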

Now the customer wants:

1. Dynamic creation of cubes on fact tables that already exist in HBase. This looks like the "Kylin Cube Creation", but from the tutorial it is not clear to me whether I can select fact tables from HBase (in other words, tables not created with Kylin).

2. Fast response on ROLAP queries. I see that Kylin routes unsupported queries to Hive, which is not good for us. JIRA issue "KYLIN-742" requests "Route unsupported queries to SparkSQL". Is there a planned date for integrating Spark with Kylin?

Finally, I would like to use Spark as the ETL layer to populate the MOLAP in real time according to the cube metadata (using the Kylin Metadata API). The idea is:

- raw data are loaded as CSV files into HDFS;
- Spark processes the CSV files in HDFS to load the MOLAP created with "Kylin Cube Creation", and also provides "Realtime Analytics" during data loading.
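
To illustrate what I mean by loading the MOLAP from CSV, here is a minimal sketch of the pre-aggregation step. It is plain Python standing in for the Spark job and does not call any Kylin API; the column names (`day`, `cell`, `bytes`) and the function `build_cuboids` are hypothetical, and a real job would take the dimensions and measures from the cube metadata.

```python
import csv
import io
from collections import defaultdict
from itertools import combinations


def build_cuboids(csv_text: str, dimensions, measure: str):
    """Sum `measure` for every combination of `dimensions` (one cuboid each)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    cuboids = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(float)
            for row in rows:
                # Group key: the row's values for this cuboid's dimensions.
                totals[tuple(row[d] for d in dims)] += float(row[measure])
            cuboids[dims] = dict(totals)
    return cuboids


# Hypothetical sample standing in for a CSV file in HDFS.
SAMPLE = (
    "day,cell,bytes\n"
    "2015-06-01,A,10\n"
    "2015-06-01,B,5\n"
    "2015-06-02,A,7\n"
)
cuboids = build_cuboids(SAMPLE, ["day", "cell"], "bytes")
```

With two dimensions this produces four cuboids: the grand total, one per single dimension, and one for the full (`day`, `cell`) combination. The number of cuboids doubles with each added dimension, which is why I would like the metadata to drive which ones get built.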


Is it feasible?

Regards,

gas.