Apache Drill is a next-generation SQL
engine for Hadoop and NoSQL. Its unique schema-free approach enables
self-service data exploration with the agility that organizations need
in this new era of rapidly growing and evolving data.
In this demonstration-based talk,
you will learn the key features and architecture of Apache Drill.
You will also see how to get started with Drill and use SQL to query
various data sources such as HBase, Hive, Parquet, and Avro, as well as
more complex data structures stored in JSON documents.
– Introduction to Apache Spark
Spark is a programming model for doing
large-scale data analysis in parallel, without focusing on the details
of distributed computing; the same program you write for one computer
will also work across many computers.
Spark builds on the MapReduce
model by providing an interactive environment with a more
general set of functions for manipulating data efficiently in memory.
The result is a highly scalable way of quickly exploring large data sets
interactively.
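
As a rough illustration of the idea that the same program works on one computer or many, here is a minimal sketch in plain Python. It does not use Spark or its API; the function names (map_partition, reduce_by_key, word_count) and the sample data are hypothetical, and the "partitions" are just lists standing in for chunks of data held on different machines.

```python
from collections import defaultdict


def map_partition(lines):
    """Map step: emit a (word, 1) pair for every word in one partition."""
    return [(word.lower(), 1) for line in lines for word in line.split()]


def reduce_by_key(pairs):
    """Reduce step: sum the counts emitted for each distinct word."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


def word_count(partitions):
    """Apply the same map and reduce logic to any number of partitions."""
    mapped = [pair for part in partitions for pair in map_partition(part)]
    return reduce_by_key(mapped)


# Hypothetical sample data, used only for illustration.
lines = ["spark builds on mapreduce", "spark keeps data in memory"]

# One partition stands in for a single computer ...
one_machine = word_count([lines])
# ... and two partitions stand in for data spread across two computers.
many_machines = word_count([[lines[0]], [lines[1]]])
```

The program logic is identical in both calls; only the way the data is split changes. In a real cluster the map step would run on each machine in parallel, which this single-process sketch does not attempt to show.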