Map Reduce Intro CS4961-L22
Map Reduce
November 23, 2010
Map Reduce
• What is MapReduce?
• Example computing environment
• How it works
• Fault Tolerance
• Debugging
• Performance
• Google's implementation is called MapReduce; Hadoop is an open-source implementation
What is MapReduce?
Example: Counting Words
• Map()
- Input: <filename, file text>
- Parses the file and emits <word, count> pairs
- e.g. <"hello", 1>
• Reduce()
- Sums values for the same key and emits <word, TotalCount>
- e.g. <"hello", (3 5 2 7)> => <"hello", 17>
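The word-count example above can be sketched in a few lines of Python. This is a minimal single-process illustration, not how Google's MapReduce or Hadoop is implemented; the names map_fn, reduce_fn, and group_by_key and the sample file contents are hypothetical, chosen only for this sketch.

```python
from collections import defaultdict


def map_fn(filename, text):
    """Map(): parse the file text and emit a <word, 1> pair per word."""
    for word in text.lower().split():
        yield word, 1


def reduce_fn(word, counts):
    """Reduce(): sum the values for one key and emit <word, TotalCount>."""
    return word, sum(counts)


def group_by_key(pairs):
    """Stand-in for the framework's grouping step: collect values per key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


# Hypothetical input: <filename, file text> pairs.
files = {"a.txt": "hello world hello", "b.txt": "hello map reduce"}

# Run every map call, group the emitted pairs by word, then reduce each group.
pairs = [pair for name, text in files.items() for pair in map_fn(name, text)]
totals = dict(reduce_fn(word, counts)
              for word, counts in group_by_key(pairs).items())

print(totals)
print(reduce_fn("hello", [3, 5, 2, 7]))  # the reduce example from the slide
```

In a real system the grouping step is done by the framework across machines; the user only writes the two functions.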
Example Use of MapReduce
• Counting words in a large set of documents
• User to-do list:
- Specify:
  - Input/output files
  - M: number of map tasks
  - R: number of reduce tasks
  - W: number of machines
- Write map and reduce functions
- Submit the job
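To make the roles of M, R, and W concrete, here is a small sketch of a job driver, assuming a toy in-memory setting. The function run_job and its signature are hypothetical and do not correspond to the Google or Hadoop API; threads stand in for the W machines, the input is split round-robin into M map tasks, and a CRC32 hash of the key assigns each word to one of R reduce tasks.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_fn(filename, text):
    """User-written map: emit <word, 1> for each word."""
    return [(word, 1) for word in text.split()]


def reduce_fn(word, counts):
    """User-written reduce: sum the counts for one word."""
    return word, sum(counts)


def run_job(inputs, map_fn, reduce_fn, M, R, W):
    """Hypothetical driver: M map tasks, R reduce tasks, W workers (threads)."""
    items = sorted(inputs.items())
    # Split the input files into M map tasks (round-robin).
    map_tasks = [items[i::M] for i in range(M)]

    def do_map(task):
        # Each map task partitions its output into R buckets by key hash.
        buckets = [defaultdict(list) for _ in range(R)]
        for filename, text in task:
            for key, value in map_fn(filename, text):
                r = zlib.crc32(key.encode()) % R
                buckets[r][key].append(value)
        return buckets

    def do_reduce(r, map_outputs):
        # Reduce task r merges bucket r from every map task, then reduces.
        merged = defaultdict(list)
        for buckets in map_outputs:
            for key, values in buckets[r].items():
                merged[key].extend(values)
        return dict(reduce_fn(k, v) for k, v in sorted(merged.items()))

    with ThreadPoolExecutor(max_workers=W) as pool:
        map_outputs = list(pool.map(do_map, map_tasks))
        # One output partition per reduce task.
        return list(pool.map(lambda r: do_reduce(r, map_outputs), range(R)))


# "Submit the job": hypothetical inputs and parameter choices.
inputs = {"a.txt": "hello world", "b.txt": "hello map", "c.txt": "reduce hello"}
outputs = run_job(inputs, map_fn, reduce_fn, M=2, R=3, W=4)
totals = {word: n for part in outputs for word, n in part.items()}
print(totals["hello"])
```

The job produces R output partitions, one per reduce task, and each word lands in exactly one of them because the partition is chosen from the key alone.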
[Figure: map tasks over Data store 1 ... Data store n]
• Related
- Sawzall
- Pig
- Hadoop