// Package mapreduce provides a simple mapreduce library with a sequential
// implementation. Applications should normally call Distributed() [located in
// master.go] to start a job, but may instead call Sequential() [also in
// master.go] to get a sequential execution for debugging purposes.
//
// The flow of the mapreduce implementation is as follows:
//
//   1. The application provides a number of input files, a map function, a
//      reduce function, and the number of reduce tasks (nReduce).
//   2. A master is created with this knowledge. It spins up an RPC server (see
//      master_rpc.go), and waits for workers to register (using the RPC call
//      Register() [defined in master.go]). As tasks become available (in steps
//      3 and 4), schedule() [schedule.go] decides how to assign those tasks to
//      workers, and how to handle worker failures.
//   3. The master treats each input file as one map task, and makes a call to
//      doMap() [common_map.go] at least once for each task. It does so either
//      directly (when using Sequential()) or by issuing the DoJob RPC on a
//      worker [worker.go]. Each call to doMap() reads the appropriate file,
//      calls the map function on that file's contents, and produces nReduce
//      files for each map file. Thus, there will be #files x nReduce files
//      after all map tasks are done:
//
//          f0-0, ..., f0-<nReduce-1>, ...,
//          f<#files-1>-0, ... f<#files-1>-<nReduce-1>.
//
//   4. The master next makes a call to doReduce() [common_reduce.go] at least
//      once for each reduce task. As for doMap(), it does so either directly or
//      through a worker. For reduce task <r>, doReduce() collects the
//      matching intermediate file from every map task (f0-<r>, ...,
//      f<#files-1>-<r>), and runs the reduce function on those files. This
//      produces nReduce result files, one per reduce task.
//   5. The master calls mr.merge() [master_splitmerge.go], which merges all
//      the nReduce files produced by the previous step into a single output.
//   6. The master sends a Shutdown RPC to each of its workers, and then shuts
//      down its own RPC server.
//
// TODO:
// You will have to write/modify doMap, doReduce, and schedule yourself. These
// are located in common_map.go, common_reduce.go, and schedule.go
// respectively. You will also have to write the map and reduce functions in
// ../main/wc.go.
//
// You should not need to modify any other files, but reading them might be
// useful in order to understand how the other methods fit into the overall
// architecture of the system.
package mapreduce
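
// The map, reduce, and merge steps described above can be sketched as a small,
// self-contained word-count program. This is only an in-memory illustration,
// not the library's implementation: the names KeyValue, partition,
// intermediateName, wcMap, wcReduce, and runSequential are hypothetical, the
// FNV hash used to pick a reduce task is an assumption, and a Go map keyed by
// the f<map>-<reduce> names stands in for the intermediate files on disk.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
	"strconv"
	"strings"
)

// KeyValue is a hypothetical pair type used only in this sketch.
type KeyValue struct {
	Key, Value string
}

// partition picks the reduce task for a key.
// Assumption: FNV-1a hash of the key, modulo nReduce.
func partition(key string, nReduce int) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % uint32(nReduce))
}

// intermediateName mirrors the f<map>-<reduce> naming from step 3.
func intermediateName(mapTask, reduceTask int) string {
	return fmt.Sprintf("f%d-%d", mapTask, reduceTask)
}

// wcMap emits (word, "1") for every whitespace-separated word.
func wcMap(contents string) []KeyValue {
	var out []KeyValue
	for _, w := range strings.Fields(contents) {
		out = append(out, KeyValue{w, "1"})
	}
	return out
}

// wcReduce returns the number of values collected for a key.
func wcReduce(key string, values []string) string {
	return strconv.Itoa(len(values))
}

// runSequential is an in-memory analogue of steps 3 to 5.
func runSequential(inputs []string, nReduce int) []KeyValue {
	// Step 3: each input is one map task; its output is split into
	// nReduce buckets, one per reduce task.
	buckets := make(map[string][]KeyValue)
	for m, contents := range inputs {
		for _, kv := range wcMap(contents) {
			name := intermediateName(m, partition(kv.Key, nReduce))
			buckets[name] = append(buckets[name], kv)
		}
	}

	// Step 4: reduce task r gathers bucket f<m>-<r> from every map task,
	// groups values by key, and applies the reduce function.
	var merged []KeyValue
	for r := 0; r < nReduce; r++ {
		grouped := make(map[string][]string)
		for m := range inputs {
			for _, kv := range buckets[intermediateName(m, r)] {
				grouped[kv.Key] = append(grouped[kv.Key], kv.Value)
			}
		}
		for k, vs := range grouped {
			merged = append(merged, KeyValue{k, wcReduce(k, vs)})
		}
	}

	// Step 5: merge all reduce outputs into a single result, sorted by key.
	sort.Slice(merged, func(i, j int) bool { return merged[i].Key < merged[j].Key })
	return merged
}

func main() {
	// Two "input files" and three reduce tasks.
	for _, kv := range runSequential([]string{"a b a", "b c"}, 3) {
		fmt.Printf("%s: %s\n", kv.Key, kv.Value)
	}
	// Prints:
	// a: 2
	// b: 2
	// c: 1
}
```

// Because every occurrence of a key hashes to the same reduce task, each key
// appears in exactly one reduce output, so the final merge only has to
// concatenate and sort.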