This project is a simple implementation of a Distributed File System (DFS) in Go. It uses the Raft consensus algorithm to replicate file metadata (creations, deletions, etc.) across a cluster of nodes, ensuring consistency and fault tolerance.
Unlike basic metadata-only systems, this implementation also replicates file content from the leader to all follower nodes, improving availability and durability.
The Raft layer is based on a modified version of Phil Eaton's goraft.
- **Raft-based Metadata Replication:** File operations (create, delete) are committed to a distributed log via Raft, ensuring metadata is strongly consistent across the cluster.
- **Leader Election:** Nodes elect a leader automatically. All write operations go through the leader, while reads can be served from any replica.
- **File Content Replication:** When a file is uploaded, the leader replicates its content to all followers. This ensures that file data is not lost if the leader crashes.
- **Basic Fault Tolerance:** The system tolerates up to `(N-1)/2` failures in a cluster of size `N`. Both metadata and file content remain available as long as a quorum survives.
- **Automatic Recovery Manager:** Detects and recovers missing files by fetching them from healthy nodes (runs every 30 seconds).
- **Health Monitoring:** Continuously monitors the health of all cluster nodes with configurable failure thresholds (checks every 5 seconds).
- **Data Repair & Integrity:** Uses SHA-256 checksums to detect corrupted files and automatically repairs them from healthy replicas (verifies every 60 seconds).
- **Performance Metrics:** Tracks operation latency, success rates, and replication overhead, and provides trade-off analysis between reliability and performance.
- **NTP-Based Time Synchronization:** The system was upgraded to use the Network Time Protocol (NTP) for time synchronization.
  - Previous method (removed): Follower nodes would ask the Raft leader for the current time. This approach was simple but had a major drawback: if the leader's clock was inaccurate, the entire cluster's time would be skewed.
  - Current method: Each node now independently contacts an external NTP server (`pool.ntp.org`) to synchronize its clock. This removes the dependency on the leader, making timestamps significantly more accurate and robust across the entire cluster.
Interact with the DFS using REST-like endpoints for upload, download, delete, list, and status checks.
```mermaid
graph TD
    A[Client] --> B(Leader Node);
    B --> C(Follower Node 1);
    B --> D(Follower Node 2);
    subgraph Cluster
        B((Leader Node<br/>Raft + Files))
        C((Follower Node 1<br/>Raft + Files))
        D((Follower Node 2<br/>Raft + Files))
    end
    style B fill:#f9f,stroke:#333,stroke-width:2px;
    style C fill:#ccf,stroke:#333;
    style D fill:#ccf,stroke:#333;
    B -- "Metadata: Raft Log" --> C;
    B -- "File Content: Replication" --> C;
    B -- "Metadata: Raft Log" --> D;
    B -- "File Content: Replication" --> D;
```
- **Metadata:** Replicated via the Raft log
- **File Content:** Replicated from the leader to followers
- **Reads:** Can be served from any node
- Go: Version 1.20+
- PowerShell: To run the cluster startup script on Windows
- curl: For quick testing of the HTTP API
```
go build -o dfsapi.exe .
.\start-cluster.ps1
```

This launches 3 nodes:

- Node 1: `http://localhost:8081` (Raft RPC on `:3030`)
- Node 2: `http://localhost:8082` (Raft RPC on `:3031`)
- Node 3: `http://localhost:8083` (Raft RPC on `:3032`)
```
curl http://localhost:8081/status
```

Look for `"is_leader": true`.
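A client can also locate the leader programmatically by polling each node's `/status` endpoint and checking the `is_leader` field. This is a hedged sketch: the `is_leader` field name comes from the output above, but the helper names and the assumption that the rest of the JSON can be ignored are illustrative:

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// nodeStatus models only the field we need from GET /status; the JSON
// decoder silently ignores any other fields in the real response.
type nodeStatus struct {
	IsLeader bool `json:"is_leader"`
}

// isLeader reports whether a raw /status body belongs to the leader.
func isLeader(body []byte) (bool, error) {
	var s nodeStatus
	if err := json.Unmarshal(body, &s); err != nil {
		return false, err
	}
	return s.IsLeader, nil
}

// findLeader polls each node URL and returns the first one that
// reports is_leader: true.
func findLeader(nodes []string) (string, error) {
	for _, n := range nodes {
		resp, err := http.Get(n + "/status")
		if err != nil {
			continue // node may be down; try the next one
		}
		body, _ := io.ReadAll(resp.Body)
		resp.Body.Close()
		if ok, _ := isLeader(body); ok {
			return n, nil
		}
	}
	return "", fmt.Errorf("no leader found")
}

func main() {
	nodes := []string{"http://localhost:8081", "http://localhost:8082", "http://localhost:8083"}
	if leader, err := findLeader(nodes); err == nil {
		fmt.Println("leader:", leader)
	}
}
```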
```
echo "hello distributed world" > my-test-file.txt
curl -X POST --data-binary @my-test-file.txt http://localhost:8081/upload/my-first-file.txt
```

List files from any node:

```
curl http://localhost:8082/files
```

Download the file from another node:

```
curl http://localhost:8083/upload/my-first-file.txt
```

Delete the file:

```
curl -X DELETE http://localhost:8081/upload/my-first-file.txt
```

Check recovery status:

```
curl http://localhost:8081/recovery/status
```

Shows total files, local files, and missing files that need recovery.
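The same upload/download workflow can be driven from Go's standard `net/http` client. This is a sketch against the endpoints shown above; the file name and the minimal error handling are illustrative:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

// uploadURL builds the /upload/<name> endpoint for a file on a node.
func uploadURL(node, name string) string {
	return node + "/upload/" + name
}

func main() {
	leader := "http://localhost:8081"
	content := []byte("hello distributed world\n")

	// POST the raw bytes, mirroring `curl --data-binary`.
	resp, err := http.Post(uploadURL(leader, "my-first-file.txt"),
		"application/octet-stream", bytes.NewReader(content))
	if err != nil {
		fmt.Println("upload failed:", err)
		return
	}
	resp.Body.Close()

	// Download from a follower; reads are served by any node.
	resp, err = http.Get(uploadURL("http://localhost:8083", "my-first-file.txt"))
	if err != nil {
		fmt.Println("download failed:", err)
		return
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Printf("%s", body)
}
```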
```
curl http://localhost:8081/health/status
```

Displays health status of all cluster nodes with response times and failure counts.

```
curl http://localhost:8081/repair/status
```

Shows verified files, repairs performed, and corruptions detected.

```
curl http://localhost:8081/metrics
```

Displays operation counts, average durations, success rates, and replication statistics.

```
curl http://localhost:8081/metrics/tradeoffs
```

Provides analysis of reliability vs. performance trade-offs in the current configuration.
Manually verify a specific file:

```
curl -X POST -H "Content-Type: application/json" \
  -d '{"file_path":"my-first-file.txt"}' \
  http://localhost:8081/repair/verify
```

Trigger a manual recovery sync:

```
curl -X POST http://localhost:8081/recovery/sync
```

- `POST /upload/{filename}` - Upload a file (leader only)
- `GET /upload/{filename}` - Download a file (any node)
- `DELETE /upload/{filename}` - Delete a file (leader only)
- `GET /files` - List all files (any node)
- `GET /status` - Node status and leader information
- `GET /recovery/status` - Recovery manager status
- `POST /recovery/sync` - Trigger manual recovery sync
- `GET /health/status` - Cluster health monitoring
- `GET /repair/status` - Data repair and integrity status
- `POST /repair/verify` - Manually verify a specific file
- `GET /metrics` - Performance metrics
- `GET /metrics/tradeoffs` - Trade-off analysis

Internal endpoints (used between nodes):

- `POST /replicate/{filename}` - Receive replicated file content
- `POST /delete-content/{filename}` - Delete local file content
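On the follower side, the internal `/replicate/` endpoint implies a handler shaped roughly like the sketch below. The handler and helper names, the `data/` directory, and the demo port are all illustrative assumptions, not the project's actual code:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
	"strings"
)

// replicatedName extracts the file name from a /replicate/<name> path
// and rejects anything that could escape the data directory.
func replicatedName(urlPath string) (string, bool) {
	name := strings.TrimPrefix(urlPath, "/replicate/")
	if name == "" || name != filepath.Base(name) {
		return "", false
	}
	return name, true
}

// handleReplicate stores file content pushed by the leader to disk.
func handleReplicate(w http.ResponseWriter, r *http.Request) {
	name, ok := replicatedName(r.URL.Path)
	if !ok {
		http.Error(w, "bad file name", http.StatusBadRequest)
		return
	}
	f, err := os.Create(filepath.Join("data", name))
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	defer f.Close()
	if _, err := io.Copy(f, r.Body); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
	}
}

func main() {
	// Exercise the handler against an in-process test server; the real
	// node would instead register it on its HTTP mux.
	os.MkdirAll("data", 0o755)
	srv := httptest.NewServer(http.HandlerFunc(handleReplicate))
	defer srv.Close()
	resp, err := http.Post(srv.URL+"/replicate/my-first-file.txt",
		"application/octet-stream", strings.NewReader("hello"))
	if err == nil {
		fmt.Println(resp.Status)
		resp.Body.Close()
	}
}
```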
- Runs every 30 seconds
- Compares local files with cluster metadata
- Automatically fetches missing files from healthy nodes
- Ensures eventual consistency of file content
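The core of each recovery cycle is a set difference between the cluster's metadata and the local directory. A sketch with illustrative names (the real manager would run this inside a 30-second ticker loop and then fetch each missing file from a healthy node):

```go
package main

import "fmt"

// missingFiles returns the names present in the cluster metadata but
// absent locally; these are the files the recovery manager must fetch.
func missingFiles(cluster []string, local map[string]bool) []string {
	var missing []string
	for _, name := range cluster {
		if !local[name] {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// One iteration of the cycle; the real loop repeats every 30s.
	local := map[string]bool{"a.txt": true}
	cluster := []string{"a.txt", "b.txt"}
	fmt.Println(missingFiles(cluster, local)) // [b.txt]
}
```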
- Performs health checks every 5 seconds
- Marks nodes unhealthy after 3 consecutive failures
- Provides real-time cluster health visibility
- Helps identify network partitions or node failures
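The "unhealthy after 3 consecutive failures" rule reduces to a small counter that any success resets. A sketch (type and method names are illustrative, not the project's):

```go
package main

import "fmt"

// healthTracker marks a node unhealthy after `threshold` consecutive
// failed checks, mirroring the 3-failure rule described above.
type healthTracker struct {
	failures  int
	threshold int
}

// observe records one health-check result and reports whether the
// node is still considered healthy.
func (h *healthTracker) observe(ok bool) bool {
	if ok {
		h.failures = 0 // any success resets the streak
	} else {
		h.failures++
	}
	return h.failures < h.threshold
}

func main() {
	h := &healthTracker{threshold: 3}
	// A transient blip (two failures, then a success) never marks the
	// node unhealthy; only a sustained outage does.
	for _, ok := range []bool{false, false, true, false, false, false} {
		fmt.Println(h.observe(ok))
	}
}
```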
- Calculates SHA-256 checksums for all files
- Verifies file integrity every 60 seconds
- Detects silent data corruption
- Automatically repairs corrupted files from healthy replicas
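The checksum comparison at the heart of the repair loop looks roughly like this. Only the use of SHA-256 comes from the feature list above; the `checksum` helper and the idea of comparing against a stored digest are an illustrative sketch:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// checksum returns the hex SHA-256 digest the repair manager would
// compare against the value recorded in cluster metadata.
func checksum(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

func main() {
	stored := checksum([]byte("hello distributed world\n"))
	onDisk := []byte("hello distrjbuted world\n") // a single flipped byte
	if checksum(onDisk) != stored {
		fmt.Println("corruption detected; refetch from a healthy replica")
	}
}
```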
- Tracks all file operations (create, read, delete)
- Measures latency, throughput, and success rates
- Analyzes replication overhead
- Provides insights on reliability vs. performance trade-offs
- Logs summary every 5 minutes
Building this distributed file system involved overcoming several common but complex engineering challenges:
- **Concurrency and State Management:**
  - Challenge: Ensuring that shared data (like the file list or performance metrics) could be safely accessed by concurrent HTTP requests, background Raft processes, and timed fault-tolerance checks without causing race conditions.
  - Example: The initial implementation for tracking the success/failure of a file upload required a complex wrapper around Go's standard `http.ResponseWriter` just to capture the status code from a concurrent handler. This highlights the difficulty of managing state across asynchronous operations.
- **Correctly Implementing Distributed Concepts:**
  - Challenge: The theoretical concepts of distributed systems have subtle but critical details. The initial implementation of time synchronization is a prime example.
  - Example: The system first used a simple, leader-based time sync where followers would ask the leader for the time. The difficulty was realizing this approach was flawed: if the leader's clock was wrong, the entire cluster's time would be skewed. This led to its replacement with a more robust NTP-based solution, where each node consults an external, authoritative time source.
- **Robust Configuration and Startup:**
  - Challenge: Parsing command-line arguments manually is error-prone. The initial `getConfig` function had to carefully iterate through `os.Args` to find flags and their values.
  - Example: This manual approach is brittle; it can easily break if arguments are provided in an unexpected order. While functional, it represents the common difficulty of building robust configuration handling without relying on standard libraries like Go's `flag` package.
```
.
├── consensus/            # Raft consensus implementation
│   ├── raft.go           # Core Raft logic
│   ├── raft_test.go      # Persistence tests
│   └── util.go           # Utility functions
├── replication/          # File replication logic
│   └── replication.go    # State machine and handlers
├── faulttolerance/       # Fault tolerance features
│   ├── recovery.go       # Recovery manager
│   ├── health.go         # Health monitoring
│   ├── datarepair.go     # Data integrity verification
│   ├── metrics.go        # Performance metrics
│   └── faulttolerance.go # Delete operations
├── timesync/             # Time synchronization
│   └── timesync.go       # Clock synchronization
├── main.go               # Entry point and HTTP server
├── start-cluster.ps1     # Cluster startup script
└── README.md             # This file
```