CMPSC 497C/IST 497C

-- This page is best viewed from the PSU or CSE VPN --


Working in the lab or at home


Course Logistics

Instructor: Daniel Kifer
Office hours: Tuesday/Thursday 2:30-3:30pm in 360F IST
 
Semester: Spring 2016
Meeting time: Tuesday/Thursday 1:00-2:15pm
Location: Computer Lab at 134 Cedar Building (see map below)

Lectures

Date Topic Slides/Notes/Links
Jan 12 Discussion of syllabus, cloud computing, and big data  
Jan 14 The MapReduce model and its implementation in Hadoop / Lab 1
Jan 19 Modifying the shuffle and sort / Lab 1  
Jan 21 Combining Strategies / Labs 1,2  
Jan 26 Combining Strategies  
Jan 28 MapReduce algorithms, tf-idf
Feb 2 Graph processing with MapReduce Reading: Data-Intensive Text Processing with MapReduce, ch 1,2 (note Section 2.6 is out of date)
Feb 4
Feb 9
Feb 11
Feb 16 Reservoir Sampling
Feb 18 Pig
Feb 23 Pig
Feb 25 Pig Pig commands we used
March 1  
March 3  
March 8 Spring Break  
March 10 Spring Break  
March 15
March 17  
March 22  
March 24
March 29  
March 31
April 5
April 7
April 12 Spark Spark Commands from Class
April 14  
April 19
April 21
April 26 Min Hashing notes Reading: Chapter 3 of MMDS book
April 28  
May Final exam
 

Guides

Troubleshooting

Code

Data

Assignments