Apache Drill and Apache Spark at JUGUA (December 16th)

Andrii Rodionov

Dec 7, 2015, 4:56:34 PM

to JUG UA (Java user group of KPI)

Dear friends, we are happy to announce our next JUG UA meeting with a great speaker from MapR (Apache Hadoop distribution): Tugdual Grall.

Registration: http://jug.ua/2015/12/apache_drill_and_apache_spark/

Bio

Tugdual Grall is a Technical Evangelist at MapR, an open source advocate and a passionate developer. He currently works with the European developer communities to ease MapR, Hadoop and NoSQL adoption. Before joining MapR, Tug was a Technical Evangelist at MongoDB and Couchbase. Tug has also worked as CTO at eXo Platform, and as JavaEE product manager and software engineer at Oracle. Tugdual is a co-founder of the Nantes JUG (Java User Group), which has held monthly meetings about the Java ecosystem since 2008. Tugdual also writes a blog, available at http://tgrall.github.io

Date: December 16th, 19:30 – 22:30
Venue: To be announced later (Kyiv)

Agenda

– Drilling into Data with Apache Drill

Apache Drill is a next-generation SQL engine for Hadoop and NoSQL. Its unique schema-free approach enables self-service data exploration with the agility that organizations need in this new era of rapidly growing and evolving data.

In this talk, based on demonstrations, you will learn the key features and architecture of Apache Drill. You will also see how to get started with Drill and use SQL to query various data sources such as HBase, Hive, Parquet, and Avro, as well as more complex data structures stored in JSON documents.

– Introduction to Apache Spark

Spark is a programming model for doing large-scale data analysis in parallel, without focusing on the details of distributed computing; the same program you write for one computer will also work across many computers.

Spark builds on the MapReduce framework by providing an interactive environment that has a more general set of functions for manipulating data efficiently in-memory. The result is a highly scalable way of quickly exploring large data sets interactively.
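
To give a feel for the "same program, one computer or many" idea before the talk, here is a minimal sketch in plain Java. It does not use Spark at all: it uses java.util.stream as a stand-in, and the class name WordCountSketch, the method countWords and the sample lines are made up for this illustration. The point is only that the map/group/count pipeline stays the same whether it runs sequentially or in parallel.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration: plain Java streams, NOT the Spark API.
public class WordCountSketch {

    // Splits lines into words, then groups and counts them.
    // The pipeline is identical for the sequential and parallel case;
    // only the way the stream is created differs.
    static Map<String, Long> countWords(List<String> lines, boolean parallel) {
        Stream<String> stream = parallel ? lines.parallelStream() : lines.stream();
        return stream
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.groupingBy(word -> word, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Sample input invented for this sketch.
        List<String> lines = Arrays.asList("drill and spark", "spark at jug ua");
        System.out.println(countWords(lines, false)); // sequential run
        System.out.println(countWords(lines, true));  // parallel run, same logic
    }
}
```

Spark applies a similar style of pipeline to data spread across a cluster rather than across the cores of one machine; how that works in practice is what the talk covers.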
