MapReduce Framework

Distributed MapReduce framework in Python with manager-worker architecture, socket-based networking, fault-tolerant execution, and automatic failure recovery.

Tech Stack

Python, Socket Programming, Threading, Multiprocessing

Overview

Implemented a distributed MapReduce framework in Python to execute parallel jobs across multiple workers. Designed a manager-worker architecture that uses socket-based networking to coordinate task distribution and status tracking in a cluster environment. Used OS-level concurrency (threads and processes) to parallelize tasks and improve throughput. Built fault-tolerant execution with automatic worker failure detection and task reassignment.
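As an illustration of the socket-based coordination described above, here is a minimal sketch of a worker sending one JSON message to a manager over TCP. The message fields (message_type, worker_port), the helper names, and the one-message-per-connection convention are assumptions made for this example, not details taken from the project. The worker runs in a thread here only to keep the example self-contained.

```python
import json
import socket
import threading


def send_message(host, port, message):
    """Open a TCP connection, send one JSON message, then close."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(json.dumps(message).encode("utf-8"))


def receive_message(server_sock):
    """Accept one connection and read until the sender closes it."""
    conn, _addr = server_sock.accept()
    with conn:
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # empty read means the peer closed the connection
                break
            chunks.append(data)
    return json.loads(b"".join(chunks).decode("utf-8"))


def demo():
    """Manager listens on localhost; a stand-in worker registers with it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen()
        server.settimeout(5)  # do not block forever if no worker connects
        host, port = server.getsockname()

        # Hypothetical registration message; field names are assumptions.
        register = {"message_type": "register", "worker_port": 6001}
        worker = threading.Thread(
            target=send_message, args=(host, port, register)
        )
        worker.start()
        received = receive_message(server)
        worker.join()
    return received
```

Calling demo() returns the dictionary the worker sent. Closing the connection after each message is a simple way to mark message boundaries; a length-prefixed protocol would be an alternative if connections were kept open.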

Architecture

The system uses a distributed manager-worker architecture with socket-based networking for task coordination. Threads and processes provide OS-level concurrency. Execution is fault tolerant, with automatic failure detection and recovery. Horizontal scalability was tested in a simulated multi-node environment.
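The failure detection and recovery mentioned above could be sketched roughly as follows. This example assumes a heartbeat scheme: each worker reports in periodically, and the manager treats a worker as dead once it has been silent longer than a timeout. The source does not say how failures are detected, so the heartbeat approach, the Manager class, its method names, and the 10-second timeout are all hypothetical. Time is passed in explicitly so the logic can be exercised without real clocks or sockets.

```python
import collections


class Manager:
    """Tracks worker heartbeats and requeues tasks held by dead workers."""

    def __init__(self, tasks, timeout=10.0):
        self.pending = collections.deque(tasks)  # tasks not yet assigned
        self.assigned = {}   # worker_id -> task currently in progress
        self.last_seen = {}  # worker_id -> time of the last heartbeat
        self.timeout = timeout

    def heartbeat(self, worker_id, now):
        """Record that a worker was alive at time `now`."""
        self.last_seen[worker_id] = now

    def assign(self, worker_id):
        """Give an idle worker the next pending task, or None."""
        if worker_id in self.assigned or not self.pending:
            return None
        task = self.pending.popleft()
        self.assigned[worker_id] = task
        return task

    def complete(self, worker_id):
        """Mark a worker's task as finished and return it."""
        return self.assigned.pop(worker_id, None)

    def reap_dead_workers(self, now):
        """Drop workers silent for longer than the timeout.

        Any task a dead worker held goes back to the front of the
        pending queue so another worker can pick it up.
        """
        dead = [
            worker_id
            for worker_id, seen in self.last_seen.items()
            if now - seen > self.timeout
        ]
        for worker_id in dead:
            del self.last_seen[worker_id]
            task = self.assigned.pop(worker_id, None)
            if task is not None:
                self.pending.appendleft(task)
        return dead
```

In use, a manager thread would call reap_dead_workers on a timer and then call assign for each idle worker. Requeueing at the front of the queue is one possible policy; it favors finishing interrupted tasks before starting new ones.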