MapReduce Design Patterns


David G. Boney

Jan 16, 2013, 2:15:50 PM
to Austin-ACM-...@meetup.com, austin...@yahoogroups.com, aust...@googlegroups.com, semantic...@googlegroups.com
Austin ACM SIGKDD is starting a weekly series on MapReduce design patterns, based on the book "MapReduce Design Patterns" by Donald Miner and Adam Shook.
We will meet either on Wednesday nights at 7:00 or on Saturdays at 1:00, depending on the availability of the meeting room at Northwest Recreation Center.
Textbook
MapReduce Design Patterns: Building Effective Algorithms and Analytics for Hadoop and Other Systems, by Donald Miner and Adam Shook
http://www.amazon.com/MapReduce-Design-Patterns-Effective-Algorithms/dp/1449327176/ref=sr_1_4?ie=UTF8&qid=1355520026&sr=8-4&keywords=hadoop
Syllabus
Wednesday, 1/23/2013, 7:00, Northwest Recreation Center, Week  1 - Overview of Hadoop - from Hadoop: The Definitive Guide by Tom White
Wednesday, 1/30/2013, 7:00, Northwest Recreation Center, Week  2 - Hadoop API, Readers and Writers -  from Hadoop: The Definitive Guide by Tom White
Wednesday, 2/06/2013, 7:00, Northwest Recreation Center, Week  3 - Numerical Summarizations Pattern
Saturday, 2/16/2013, 1:00, Northwest Recreation Center, Week  4 - Inverted Index Summarizations Pattern
Saturday, 2/16/2013, 1:00, Northwest Recreation Center, Week  5 - Counting with Counters Pattern
Saturday, 2/16/2013, 1:00, Northwest Recreation Center, Week  6 - Filtering Pattern
Week  7 - Bloom Filtering Pattern
Week  8 - Top Ten Pattern
Week  9 - Distinct Pattern
Week 10 - Structured to Hierarchical Pattern
Week 11 - Partitioning Pattern
Week 12 - Binning Pattern
Week 13 - Total Order Sorting Pattern
Week 14 - Shuffling Pattern
Week 15 - Reduce Side Join Pattern
Week 16 - Replicated Join Pattern
Week 17 - Composite Join Pattern
Week 18 - Cartesian Product Pattern
Week 19 - Job Chaining Pattern
Week 20 - Chain Folding Pattern
Week 21 - Job Merging Pattern
Week 22 - Custom Input and Output in Hadoop Pattern
Week 23 - Generating Data Pattern
Week 24 - External Source Output Pattern
Week 25 - External Source Input Pattern
Week 26 - Partition Pruning Pattern
-----------------
Weekly brown bag lunch for Kaggle Competitions

Austin ACM SIGKDD meets on Mondays for a brown bag lunch at Austin Tech Ranch to discuss Kaggle competitions. The lunch is sponsored by Elastic Knowledge. We have been having a good turnout for the meetings, between 8 and 10 people per meeting. We have been discussing the new GE competitions on Kaggle.

People have been having problems finding Austin Tech Ranch on Old Jollyville Road. Please check the map before coming. If you know the building that the plane hit a couple of years ago, Tech Ranch is in the southernmost building of that group on Old Jollyville Road.

Please bring your laptop and a power cord.
--------------------------------------------------------------

I have started an open source project to build a data cube on Hadoop and HBase. Data cubes are used to build multidimensional data models for data warehouses. The project is called Hadoop Cube. One goal of the project is to provide the MDX query language for the Hadoop ecosystem. Another goal is to provide linear statistical models and machine learning techniques for Hadoop-scale multidimensional data models. Part of the project is related to research I am conducting at Texas A&M on implementing linear statistical models and linear algebra in Hadoop. Contact me if you are interested in participating.

https://github.com/HadoopCube/hadoopcube
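
For readers new to data cubes, here is a minimal in-memory sketch of the idea in Python. It is not code from the Hadoop Cube project and does not use Hadoop or HBase. The function name `build_cube`, the `"*"` roll-up marker, and the sales rows are all made up for illustration. The sketch sums one measure over every combination of kept and rolled-up dimensions, which is roughly what a cube precomputes so that queries at any level of detail become lookups.

```python
from collections import defaultdict
from itertools import combinations

ALL = "*"  # hypothetical marker for a dimension that has been rolled up


def build_cube(rows, dimensions, measure):
    """Sum `measure` for every combination of kept and rolled-up dimensions."""
    cube = defaultdict(float)
    for row in rows:
        # Each row contributes to 2^n cells, one per subset of kept dimensions.
        for k in range(len(dimensions) + 1):
            for kept in combinations(dimensions, k):
                cell = tuple(row[d] if d in kept else ALL for d in dimensions)
                cube[cell] += row[measure]
    return dict(cube)


# Hypothetical sales facts, invented for this example.
sales = [
    {"region": "TX", "product": "book", "year": 2012, "amount": 10.0},
    {"region": "TX", "product": "pen", "year": 2013, "amount": 5.0},
    {"region": "CA", "product": "book", "year": 2013, "amount": 7.0},
]

cube = build_cube(sales, ["region", "product", "year"], "amount")
print(cube[("TX", ALL, ALL)])    # total for TX across all products and years
print(cube[(ALL, "book", ALL)])  # total for books across all regions and years
print(cube[(ALL, ALL, ALL)])     # grand total
```

The number of cells grows as 2^n per row for n dimensions, which is one reason a distributed platform is attractive for building cubes over large fact tables.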

-----------------

David G. Boney

Jan 17, 2013, 2:26:18 AM
to austin...@yahoogroups.com, Austin-ACM-...@meetup.com, aust...@googlegroups.com, semantic...@googlegroups.com
The format of the series is to cover one design pattern a week from the book "MapReduce Design Patterns: Building Effective Algorithms and Analytics for Hadoop and Other Systems" by Donald Miner and Adam Shook. We will be running the code from the examples in the book. Bring your laptop.

Here is a link to the source code used in the book.

Northwest Recreation Center
2913 Northland Drive
Austin, TX 78757