Introduction to Data Science, a free online course on Coursera, started on May 1st.
URL: https://www.coursera.org/course/datasci
The following is taken from the Coursera website.
About the Course
Commerce and research are being transformed by data-driven discovery and
prediction. Skills required for data analytics at massive levels –
scalable data management on and off the cloud, parallel algorithms,
statistical modeling, and proficiency with a complex ecosystem of tools
and platforms – span a variety of disciplines and are not easy to obtain
through conventional curricula. Tour the basic techniques of data
science, including both SQL and NoSQL solutions for massive data
management (e.g., MapReduce and contemporaries), algorithms for data
mining (e.g., clustering and association rule mining), and basic
statistical modeling (e.g., linear and non-linear regression).
Course Syllabus
Part 0: Introduction
Examples, data science articulated, history and context, technology landscape
Part 1: Data Manipulation, at Scale
Databases and the relational algebra
Parallel databases, parallel query processing, in-database analytics
MapReduce, Hadoop, relationship to databases, algorithms, extensions, languages
Key-value stores and NoSQL; tradeoffs of SQL and NoSQL
Entity resolution, record linkage, data cleaning
Part 2: Analytics
Basic statistical modeling, experiment design, introduction to machine learning, overfitting
Supervised learning: overview, simple nearest neighbor, decision trees/forests, regression
Unsupervised learning: k-means, multi-dimensional scaling
Graph Analytics: PageRank, community detection, recursive queries, iterative processing
Text Analytics: latent semantic analysis
Collaborative Filtering: slope-one
Part 3: Communicating Results
Visualization, data products, visual data analytics
Provenance, privacy, ethics, governance
Part 4: Guest Lectures
AMPLab, Datameer, SciDB, and more