Description
Impala Technical Deep Dive
Mark Grover, Hive contributor and author of "Programming Hive"
Join us for this technical deep dive into Cloudera Impala, the project that makes scalable parallel database technology available to the Hadoop community for the first time. Impala is an open-source code base that lets users issue low-latency queries against data stored in HDFS and Apache HBase using familiar SQL operators. We will begin with an overview of Impala from the user's perspective, continue with an overview of Impala's architecture and implementation, and conclude with a comparison of Impala with Apache Hive, commercial MapReduce alternatives, and traditional data warehouse infrastructure.