ZMESH Development Progress

Project: ZMESH - Decentralized ZPU Mesh Network
Started: March 1, 2026
Stack: FastAPI (Python), PostgreSQL, Redis


✅ Completed

Phase 1.1: Infrastructure Setup (March 1, 2026)

  • Project structure created
  • PostgreSQL database (zmesh) created
  • Redis available
  • Environment configuration (.env)

Phase 1.2: Database Models

  • Provider model (orchestrator/models/provider.py)

    • GPU vendor (NVIDIA, AMD, Apple, Intel)
    • VRAM tracking (total, available, allocated)
    • Status (pending, online, busy, offline)
    • Reputation score, earnings
    • WireGuard keys
  • Job model (orchestrator/models/job.py)

    • VRAM slicing allocation (JSON field)
    • SSH access credentials
    • Status tracking
    • Billing (duration, cost)
  • User model (orchestrator/models/user.py)

    • API key authentication
    • Wallet (balance, spent)
    • Usage stats
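
The Provider fields listed above can be illustrated with a plain dataclass. This is only a sketch: the real model in orchestrator/models/provider.py is not shown here, so the class name `ProviderSketch`, the field names, and the use of megabytes as the unit are all assumptions. It also derives available VRAM from total and allocated instead of storing it, which is a design choice of the sketch, not necessarily of the project.

```python
from dataclasses import dataclass
from enum import Enum


class ProviderStatus(str, Enum):
    """The four statuses named in the progress notes."""
    PENDING = "pending"
    ONLINE = "online"
    BUSY = "busy"
    OFFLINE = "offline"


@dataclass
class ProviderSketch:
    """Hypothetical stand-in for the Provider model; not the real ORM class."""
    gpu_vendor: str                  # e.g. "NVIDIA", "AMD", "Apple", "Intel"
    vram_total_mb: int               # assumed unit: megabytes
    vram_allocated_mb: int = 0
    status: ProviderStatus = ProviderStatus.PENDING

    @property
    def vram_available_mb(self) -> int:
        # Derived value, so total/allocated/available cannot drift apart.
        return self.vram_total_mb - self.vram_allocated_mb

    def allocate(self, mb: int) -> None:
        """Reserve `mb` of VRAM, refusing requests that cannot be met."""
        if mb <= 0 or mb > self.vram_available_mb:
            raise ValueError("requested VRAM is not available")
        self.vram_allocated_mb += mb
```

Deriving the available figure avoids having to update three counters consistently on every allocation; a stored column may still be preferable if the value is queried or indexed often.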

Phase 1.3: Orchestrator API

  • Provider endpoints (orchestrator/api/providers.py)

    • POST /providers/register - Provider joins mesh
    • POST /providers/heartbeat - 30s status ping
    • GET /providers/online - List online providers
    • GET /providers/{id} - Provider details
  • Job endpoints (orchestrator/api/jobs.py)

    • POST /jobs/submit - Request ZPU (VRAM slicing)
    • GET /jobs/{id} - Job status + SSH access
    • GET /jobs/{id}/allocation - VRAM slice breakdown
    • POST /jobs/{id}/cancel - Cancel job
  • User endpoints (orchestrator/api/users.py)

    • POST /users/register - User signup
    • GET /users/me - Profile
    • POST /users/balance/add - Add credits
  • VRAM Slicing Algorithm implemented

    • Pools VRAM from multiple providers
    • Sorts by available VRAM
    • Allocates optimally
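
One plausible reading of the three slicing steps above is a greedy, largest-first allocation. The sketch below assumes that reading; the function name `slice_vram`, the megabyte unit, and the descending sort order are illustrative assumptions, and the project's actual algorithm may differ.

```python
def slice_vram(available_mb: dict[str, int], requested_mb: int) -> dict[str, int]:
    """Greedy sketch: pool VRAM, take from the largest providers first.

    available_mb maps a provider id to its free VRAM in MB (assumed unit).
    Returns a mapping of provider id to the MB taken from that provider.
    """
    if requested_mb <= 0:
        raise ValueError("requested_mb must be positive")
    if sum(available_mb.values()) < requested_mb:
        raise ValueError("not enough pooled VRAM for this request")

    allocation: dict[str, int] = {}
    remaining = requested_mb
    # Sort by available VRAM, largest first (assumed order).
    for provider_id, free in sorted(
        available_mb.items(), key=lambda item: item[1], reverse=True
    ):
        if remaining == 0:
            break
        take = min(free, remaining)
        if take > 0:
            allocation[provider_id] = take
            remaining -= take
    return allocation


if __name__ == "__main__":
    pool = {"a": 8192, "b": 24576, "c": 4096}
    print(slice_vram(pool, 30000))
```

Taking from the largest providers first keeps the number of providers involved in a single job as small as possible, which should matter once jobs span several machines.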

Phase 1.4: Connector Agent

  • Hardware Detection (provider-app/core/hardware.py)

    • NVIDIA detection (nvidia-smi)
    • AMD detection (rocm-smi, clinfo)
    • Apple Silicon detection (system_profiler)
    • Mock mode for development
  • Agent Core (provider-app/agent.py)

    • API key authentication
    • Heartbeat every 30 seconds
    • Auto-reconnect with backoff
    • Command handling from server
  • System Tray (provider-app/core/tray.py)

    • Status icon (green/yellow/red)
    • Right-click menu
    • Open dashboard link
    • Pause/Resume/Quit
  • Auto-Startup (provider-app/core/startup.py)

    • Windows (Startup folder)
    • macOS (LaunchAgents)
    • Linux (autostart desktop file)
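
The agent's 30-second heartbeat and auto-reconnect with backoff could follow a capped exponential schedule. The notes do not say which backoff strategy the agent uses, so the sketch below is an assumption throughout: the function names, the 1-second base, the 60-second cap, and the optional jitter are all hypothetical.

```python
import random

HEARTBEAT_INTERVAL_SECONDS = 30  # from the notes: heartbeat every 30 seconds


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before reconnect attempt `attempt` (0-based).

    Doubles on every attempt and never exceeds `cap` (assumed values).
    """
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    return min(cap, base * (2 ** attempt))


def jittered_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Pick a random wait between 0 and the capped exponential delay.

    Randomising the wait spreads out reconnects when many agents
    lose the server at the same moment.
    """
    return random.uniform(0.0, backoff_delay(attempt, base, cap))


if __name__ == "__main__":
    for attempt in range(8):
        print(attempt, backoff_delay(attempt))
```

On a successful reconnect the attempt counter would reset to zero and the agent would return to the regular heartbeat interval.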

Phase 1.5: Server Running

  • Orchestrator running at http://localhost:8000
  • API docs at http://localhost:8000/docs
  • Health endpoint working
  • Stats endpoint working

Phase 1.6: Authentication System (March 1, 2026)

  • Auth Models (orchestrator/models/auth.py)

    • AuthUser (name, email, mobile, hashed_password)
    • IPBan (ip_address, reason, expires_at)
    • LoginAttempt (tracking failed logins)
    • RegisterAttempt (tracking spam registrations)
  • Auth API (orchestrator/api/auth.py)

    • POST /auth/register - Register with name, email, mobile
    • POST /auth/login - Login with email/password
    • POST /auth/refresh - Refresh JWT tokens
    • GET /auth/me - Get current user
    • POST /auth/logout - Logout
  • Security Features

    • JWT access + refresh tokens
    • Password hashing (bcrypt)
    • Email/mobile duplicate detection
    • IP-based rate limiting
    • Account locking (5 failed attempts = 30-minute lock)
    • IP ban on unusual activity
  • JWT Module (orchestrator/core/jwt.py)

    • Access token (30 min expiry)
    • Refresh token (7 days expiry)
    • Token validation
  • Rate Limiter (orchestrator/services/rate_limiter.py)

    • Login rate limiting
    • Register rate limiting
    • IP ban management
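
The account-locking rule above (5 failed attempts, 30-minute lock) can be sketched with an in-memory tracker. This is an illustration only: the real implementation presumably persists attempts through the LoginAttempt model or Redis, and the class name `LockoutTracker` and its methods are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MAX_FAILED_ATTEMPTS = 5                # from the notes
LOCK_DURATION = timedelta(minutes=30)  # from the notes


class LockoutTracker:
    """In-memory sketch of account locking; state is lost on restart."""

    def __init__(self) -> None:
        self._failures: dict[str, int] = {}
        self._locked_until: dict[str, datetime] = {}

    def is_locked(self, email: str, now: datetime) -> bool:
        until = self._locked_until.get(email)
        if until is None:
            return False
        if now >= until:
            # Lock has expired: clear it and start counting afresh.
            del self._locked_until[email]
            self._failures.pop(email, None)
            return False
        return True

    def record_failure(self, email: str, now: datetime) -> None:
        count = self._failures.get(email, 0) + 1
        self._failures[email] = count
        if count >= MAX_FAILED_ATTEMPTS:
            self._locked_until[email] = now + LOCK_DURATION

    def record_success(self, email: str) -> None:
        # A successful login resets the failure counter.
        self._failures.pop(email, None)
        self._locked_until.pop(email, None)


if __name__ == "__main__":
    tracker = LockoutTracker()
    start = datetime(2026, 3, 1, tzinfo=timezone.utc)
    for _ in range(MAX_FAILED_ATTEMPTS):
        tracker.record_failure("user@example.com", start)
    print(tracker.is_locked("user@example.com", start))
```

Passing `now` in explicitly keeps the logic easy to test without patching the clock.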

Phase 1.7: Web Dashboard (March 1, 2026)

  • Dashboard UI (orchestrator/static/)

    • Dark theme with accent colors
    • Responsive sidebar navigation
    • Stats cards (earnings, hours, jobs, VRAM)
  • Auth Pages

    • Login form with email/password
    • Register form with name/email/mobile/password
    • Form validation and error display
    • Token storage in localStorage
  • Provider Dashboard

    • Overview page with quick actions
    • Hardware detection display
    • VRAM slider (0-100% allocation)
    • ON/OFF provider toggle
    • Earnings summary (today/week/month/total)
    • Settings page (profile, API key)
    • Connection status indicators
  • Static File Serving

    • CSS at /static/css/style.css
    • JS at /static/js/app.js
    • Dashboard at /dashboard

🔄 In Progress

Phase 1.8: Integration Testing

  • End-to-end flow test
  • Provider connects via agent
  • User submits job
  • VRAM allocation works
  • SSH access returned

Phase 1.9: Production Deployment

  • DigitalOcean Mumbai droplet
  • Domain: mesh.zyoralabs.com
  • SSL certificate
  • WireGuard VPN server

Phase 2: Provider Experience

  • Provider prescreening
  • Reputation system
  • Fault tolerance / failover
  • Desktop app packaging (.exe, .dmg)

Phase 3: VRAM Pooling Engine

  • Multi-provider job execution
  • Model sharding
  • Container orchestration

📁 Project Structure

/Users/redfoxhotels/zmesh/
├── .env                           # Configuration
├── pyproject.toml                 # Dependencies
├── progress.md                    # This file
├── roadmap.md                     # Full roadmap
│
├── orchestrator/                  # Backend API
│   ├── main.py                    # FastAPI app
│   ├── api/
│   │   ├── auth.py                # Auth endpoints
│   │   ├── providers.py           # Provider endpoints
│   │   ├── jobs.py                # Job endpoints
│   │   └── users.py               # User endpoints
│   ├── core/
│   │   ├── config.py              # Settings
│   │   ├── database.py            # PostgreSQL
│   │   └── jwt.py                 # JWT utilities
│   ├── models/
│   │   ├── auth.py                # Auth models
│   │   ├── provider.py            # Provider model
│   │   ├── job.py                 # Job model
│   │   └── user.py                # User model
│   ├── services/
│   │   └── rate_limiter.py        # IP ban & rate limiting
│   └── static/                    # Web Dashboard
│       ├── index.html             # Main HTML
│       ├── css/
│       │   └── style.css          # Styles
│       └── js/
│           └── app.js             # Frontend logic
│
└── provider-app/                  # Connector Agent
    ├── main.py                    # Entry point
    ├── agent.py                   # Agent logic
    ├── pyproject.toml             # Dependencies
    └── core/
        ├── hardware.py            # GPU detection
        ├── tray.py                # System tray
        └── startup.py             # Auto-start

🛠 Tech Stack

| Component | Technology |
| --- | --- |
| Backend API | FastAPI (Python 3.11) |
| Database | PostgreSQL 14 |
| Queue | Redis 8.6 |
| Connector Agent | Python + pystray |
| VPN | WireGuard (planned) |
| Frontend | TBD (Next.js or plain HTML) |

📊 API Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| / | GET | Server info |
| /health | GET | Health check |
| /stats | GET | Network stats |
| /auth/register | POST | Register (name, email, mobile) |
| /auth/login | POST | Login (JWT tokens) |
| /auth/refresh | POST | Refresh access token |
| /auth/me | GET | Current user (authenticated) |
| /auth/logout | POST | Logout |
| /dashboard | GET | Provider web dashboard |
| /providers/register | POST | Provider signup |
| /providers/heartbeat | POST | Provider ping |
| /providers/online | GET | Online providers |
| /providers/{id} | GET | Provider details |
| /jobs/submit | POST | Request ZPU |
| /jobs/{id} | GET | Job status |
| /jobs/{id}/allocation | GET | VRAM breakdown |
| /jobs/{id}/cancel | POST | Cancel job |
| /users/register | POST | User signup |
| /users/me | GET | User profile |
| /users/balance/add | POST | Add credits |

🎯 Next Action

Integration Testing - Verify end-to-end flow:

  1. Register provider via web dashboard
  2. Connect provider agent
  3. Submit job from user
  4. Verify VRAM allocation works
  5. Test provider earnings tracking