Final Year B.Tech Project — Department of Computer Science & Engineering
Hooghly Engineering & Technology College, Hooghly · Department of Computer Science & Engineering · Affiliated to Maulana Abul Kalam Azad University of Technology, West Bengal · 2025–26
- About DTrain
- The Problem It Solves
- Key Features
- System Architecture
- How It Works
- Performance & Cost Analysis
- Tech Stack
- Project Structure
- Prerequisites
- Environment Variables
- Setup & Installation
- Running the Project
- API Reference
- Full Walkthrough
- Team
- License
DTrain is a full-stack peer-to-peer distributed AI model training platform. It connects ML researchers and developers who need compute power with GPU owners who have idle machines — and handles everything in between: job analysis, pricing, payment escrow, execution, log streaming, and automatic payout.
Think of it as Airbnb for GPU compute. A user uploads their Python training code as a ZIP file, and DTrain's two-layer AI pricing engine (rule-based scorer + Groq LLM) estimates job complexity and sets a flat price. The user reviews and pays. A worker's Electron desktop agent picks up the job, runs it inside an isolated Docker container, streams logs back live via Socket.IO, and uploads the trained model to Supabase when done. Payment settles automatically: the worker earns 80% and the platform takes 20%.
No cloud accounts. No infrastructure setup. No per-hour billing surprises.
| Pain Point | Reality Today |
|---|---|
| Cloud GPU costs | AWS p3.2xlarge costs ~₹280/hr. A BERT fine-tune can run 3–4 hours = ₹840–₹1120 for one job |
| Queue times | Colab Pro disconnects. Kaggle limits. University HPC has multi-day queues |
| Infrastructure complexity | Setting up CUDA drivers, Docker, cloud IAM, storage buckets takes hours |
| No live feedback | Cloud batch jobs give you nothing until they're done — or crashed |
| Wasted idle GPUs | Millions of consumer GPUs (gaming PCs) sit at 0% utilisation every night |
| No micropayment layer | Sharing compute ad-hoc has no billing, escrow, or trust mechanism |
DTrain solves all six simultaneously.
- AI-powered job pricing: Groq LLM + rule-based static analyser reads your `requirements.txt` and training script to estimate complexity and assign a fair flat price from a ₹10–₹500 tier ladder. No surprises.
- Draft → Publish flow: jobs are created as drafts with a price preview before any payment is taken. Users can review and cancel.
- Stripe escrow: funds are reserved (not charged) on publish. Charge only happens on successful completion. Full refund on failure.
- Real-time log streaming: every `stdout`/`stderr` line from the Docker container is pushed via Socket.IO to the user's browser as it happens.
- Docker isolation: every training job runs in a fresh, sandboxed Docker container. No cross-job interference, no host machine access.
- Electron desktop agent: a one-click Windows installer turns any gaming PC into a worker node. No terminal, no config files.
- Automatic worker payout: when a job completes, the worker's in-platform wallet is credited instantly. Payout requests go through Stripe Connect.
- Worker hardware detection: the Electron agent auto-detects OS, CPU, RAM, and GPU on registration.
- 80/20 revenue split: workers keep 80% of every job's tier price. Platform takes 20%.
The platform has four distinct application layers that communicate via REST APIs and WebSockets:
| Component | Tech | Purpose |
|---|---|---|
| `frontend/` | React 18, TypeScript, Tailwind, Vite | User dashboard: job submission, live log view, wallet top-up, model download |
| `backend/` | Express.js v5, Socket.IO, Mongoose | Central API: auth, job lifecycle, pricing, payment webhooks, log relay |
| `worker-ui/` | React, Tailwind, Vite | Browser UI for workers: view earnings, request payouts, set pricing preferences |
| `electron-worker/` | Electron, Node.js, Docker | Desktop agent: polls jobs, spins Docker containers, streams logs, uploads output |
- Register: Worker opens the Electron app, which auto-detects device specs (OS, CPU, RAM, GPU via `os` module + Docker stats) and calls `POST /api/worker/register`.
- Poll: Every ~5 seconds, the agent calls `GET /api/worker/available-jobs?deviceId=...`. This also acts as a heartbeat: if no poll arrives in 60s, the backend marks the worker offline.
- Accept: Agent accepts the first available job. Backend atomically assigns it (prevents two workers grabbing the same job).
- Execute: Agent downloads the ZIP from Supabase, extracts it to a temp folder, and runs:

  ```bash
  docker run --rm -v /tmp/job_xyz:/workspace python:3.10-slim bash -c "pip install -r requirements.txt && python train.py"
  ```
- Stream: Every line of `stdout`/`stderr` is POSTed to the backend, which relays it to the user via Socket.IO.
- Complete: Output files are zipped and uploaded to Supabase. The agent calls `POST /api/worker/complete-job`. Backend settles payment and credits the worker wallet.
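The heartbeat and atomic-accept steps above can be sketched as follows. This is a minimal in-memory illustration, not the backend's code: the `Job` shape, the `"running"` status name, and the helpers `isWorkerOnline` and `claimJob` are assumptions introduced here, and the real backend keeps this state in MongoDB.

```typescript
// Hypothetical in-memory sketch of the heartbeat and atomic-accept rules.
interface Job {
  id: string;
  status: "pending" | "running"; // status names are assumed
  workerId: string | null;
}

const HEARTBEAT_TIMEOUT_MS = 60000; // "no poll arrives in 60s"

// A worker counts as online only while its last poll is recent enough.
function isWorkerOnline(lastPollAtMs: number, nowMs: number): boolean {
  return nowMs - lastPollAtMs < HEARTBEAT_TIMEOUT_MS;
}

// Claim a job only if it is still pending. The check and the update run in
// one synchronous step, so a second caller sees "running" and is refused.
function claimJob(jobs: Record<string, Job>, jobId: string, workerId: string): boolean {
  const job = jobs[jobId];
  if (!job || job.status !== "pending") {
    return false;
  }
  job.status = "running";
  job.workerId = workerId;
  return true;
}

const jobs: Record<string, Job> = {
  job_xyz: { id: "job_xyz", status: "pending", workerId: null },
};
const firstClaim = claimJob(jobs, "job_xyz", "device-A");
const secondClaim = claimJob(jobs, "job_xyz", "device-B");
console.log(firstClaim, secondClaim);
```

With a database, the same check-and-set is usually a single conditional update, for example a Mongoose `findOneAndUpdate` whose filter includes the pending status. Whether DTrain does it this way is not stated here.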
Job pricing uses a two-layer scoring system to prevent both under-pricing heavy jobs and over-pricing trivial ones:
Layer 1 — Rule-based static analyser (paymentHelpers.js)
Scans requirements.txt (comment lines stripped) and the main Python file for known signals:
| Signal | Score Added |
|---|---|
| `torch` / `pytorch` in requirements | +40 pts |
| `tensorflow` / `keras` | +40 pts |
| `transformers` / `diffusers` (Hugging Face) | +45 pts |
| `xgboost` / `lightgbm` / `catboost` | +20 pts |
| `scikit-learn` | +10 pts |
| `numpy` / `pandas` | +2 pts |
| `n_samples` ≥ 100,000 in code | +30 pts |
| `n_samples` ≥ 50,000 | +20 pts |
| Epoch count ≥ 100 | +15 pts |
| Multi-GPU / distributed training patterns | +20 pts |
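A rough sketch of how a scorer built from the table above might look. It is not the contents of `paymentHelpers.js`: the regular expressions, the choice to add each row at most once, treating the two `n_samples` rows as mutually exclusive, and the multi-GPU patterns are all assumptions made for illustration.

```typescript
// Illustrative rule-based scorer; patterns and combination rules are assumed.
interface Signal {
  pattern: RegExp;
  points: number;
}

// One entry per requirements.txt row of the table; each adds at most once.
const REQUIREMENT_SIGNALS: Signal[] = [
  { pattern: /^(torch|pytorch)\b/im, points: 40 },
  { pattern: /^(tensorflow|keras)\b/im, points: 40 },
  { pattern: /^(transformers|diffusers)\b/im, points: 45 },
  { pattern: /^(xgboost|lightgbm|catboost)\b/im, points: 20 },
  { pattern: /^scikit-learn\b/im, points: 10 },
  { pattern: /^(numpy|pandas)\b/im, points: 2 },
];

// Assumed markers of distributed training in PyTorch / TensorFlow scripts.
const MULTI_GPU = /DistributedDataParallel|DataParallel|torch\.distributed|MirroredStrategy/;

// Drop blank lines and lines that are entirely comments.
function stripCommentLines(requirements: string): string {
  return requirements
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && line.charAt(0) !== "#")
    .join("\n");
}

// Largest integer captured by group 1 of a global (/g) pattern, or 0.
function largestInt(code: string, pattern: RegExp): number {
  let largest = 0;
  let match = pattern.exec(code);
  while (match !== null) {
    const value = parseInt((match[1] || "0").replace(/_/g, ""), 10);
    if (value > largest) largest = value;
    match = pattern.exec(code);
  }
  return largest;
}

function ruleScore(requirementsTxt: string, mainScript: string): number {
  const requirements = stripCommentLines(requirementsTxt);
  let score = 0;
  for (let i = 0; i < REQUIREMENT_SIGNALS.length; i++) {
    if (REQUIREMENT_SIGNALS[i].pattern.test(requirements)) {
      score += REQUIREMENT_SIGNALS[i].points;
    }
  }
  const samples = largestInt(mainScript, /n_samples\s*=\s*([\d_]+)/g);
  if (samples >= 100000) score += 30;
  else if (samples >= 50000) score += 20;
  if (largestInt(mainScript, /epochs?\s*=\s*(\d+)/gi) >= 100) score += 15;
  if (MULTI_GPU.test(mainScript)) score += 20;
  return score;
}

// torch (+40) + numpy (+2) + n_samples >= 100,000 (+30) = 72
const exampleScore = ruleScore("# deps\ntorch==2.1.0\nnumpy\n", "n_samples = 120_000\nepochs = 10\n");
console.log(exampleScore);
```

Anchoring each pattern to the start of a line keeps `torchvision` from counting as `torch`, which is one reason to match package names rather than substrings.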
Layer 2 — Groq LLM reads the full script and returns its own complexity score 0–100.
The final score = max(rule_score, groq_score). This is then mapped to the tier ladder:
```
Score 0–10   → ₹10     Score 11–20  → ₹20     Score 21–30   → ₹30
Score 31–40  → ₹40     Score 41–50  → ₹50     Score 51–60   → ₹75
Score 61–70  → ₹100    Score 71–80  → ₹150    Score 81–88   → ₹200
Score 89–93  → ₹300    Score 94–97  → ₹400    Score 98–100  → ₹500
```

Revenue split:

```
Worker receives = tier_price × 0.80
Platform fee    = tier_price × 0.20
```
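As a worked example of the ladder and the split, here is a small sketch. The band limits and prices are copied from the ladder above; the function names and the clamp to 0–100 are assumptions (the rule table can add up to more than 100, and the source does not say how that case is handled).

```typescript
// [upper bound of score band, flat price in rupees], copied from the ladder.
const TIERS: Array<[number, number]> = [
  [10, 10], [20, 20], [30, 30], [40, 40], [50, 50], [60, 75],
  [70, 100], [80, 150], [88, 200], [93, 300], [97, 400], [100, 500],
];

// final score = max(rule_score, groq_score); the clamp is an assumption.
function finalScore(ruleScore: number, groqScore: number): number {
  return Math.min(100, Math.max(0, Math.max(ruleScore, groqScore)));
}

// First band whose upper bound is not exceeded decides the price.
function tierPrice(score: number): number {
  for (let i = 0; i < TIERS.length; i++) {
    if (score <= TIERS[i][0]) return TIERS[i][1];
  }
  return 500; // not reached for clamped scores
}

// 80/20 split, computed as price * 80 / 100 to stay in whole rupees.
function splitRevenue(price: number): { worker: number; platform: number } {
  const worker = (price * 80) / 100;
  return { worker: worker, platform: price - worker };
}

const price = tierPrice(finalScore(72, 55)); // score 72 falls in the 71–80 band
const shares = splitRevenue(price);
console.log(price, shares.worker, shares.platform); // 150 120 30
```

Taking the maximum of the two scores means a job is priced by whichever layer rates it as heavier, which matches the stated goal of not under-pricing heavy jobs.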
DTrain uses a reserve → charge model powered by Stripe to protect both parties:
```
User tops up wallet    →  Stripe payment intent
        ↓
User publishes job     →  Funds RESERVED (not charged)
        ↓
Worker accepts job     →  Job locked to worker
        ↓
┌── Job SUCCESS ─────────────────────────────────────────┐
│  Funds CHARGED · Worker wallet +80% · Platform +20%    │
└────────────────────────────────────────────────────────┘
┌── Job FAILED / TIMEOUT ────────────────────────────────┐
│  Reservation RELEASED · Full refund to user wallet     │
└────────────────────────────────────────────────────────┘
```
Workers can request a payout from their in-platform wallet to their bank account via Stripe Connect at any time.
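The reserve → charge flow can be illustrated with plain wallet arithmetic. This sketch leaves Stripe out entirely and only tracks balances in memory; the `Wallet` shape and the function names are hypothetical, not DTrain's schema.

```typescript
// Hypothetical wallet model: `reserved` is the part of `balance` held for jobs.
interface Wallet {
  balance: number;
  reserved: number;
}

// Publish: hold the job price if enough unreserved funds exist.
function reserveFunds(user: Wallet, amount: number): boolean {
  if (user.balance - user.reserved < amount) return false;
  user.reserved += amount;
  return true;
}

// Success: the hold becomes a charge, split 80/20 between worker and platform.
function settleSuccess(user: Wallet, worker: Wallet, platform: Wallet, amount: number): void {
  user.reserved -= amount;
  user.balance -= amount;
  const workerShare = (amount * 80) / 100;
  worker.balance += workerShare;
  platform.balance += amount - workerShare;
}

// Failure or timeout: release the hold; nothing was charged.
function settleFailure(user: Wallet, amount: number): void {
  user.reserved -= amount;
}

const user: Wallet = { balance: 200, reserved: 0 };
const worker: Wallet = { balance: 0, reserved: 0 };
const platform: Wallet = { balance: 0, reserved: 0 };

const reservedOk = reserveFunds(user, 100);        // job priced at ₹100
settleSuccess(user, worker, platform, 100);
console.log(user.balance, worker.balance, platform.balance); // 100 80 20
```

Keeping the hold separate from the balance is what makes the failure path a simple release rather than a refund of money that has already moved.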
The following benchmarks compare estimated training time and cost across four compute setups for common ML workloads. All values are derived mathematically from published GPU TFLOPS specifications and real AWS spot pricing.
Methodology:
- Time = Total FLOPs ÷ (GPU TFLOPS × utilisation factor).
- GPU assumptions: GTX 1060 ≈ 4 TFLOPS @ 70% util · RTX 4090 ≈ 82.6 TFLOPS @ 85% util · AWS V100 ≈ 14 TFLOPS @ 90% util.
- DTrain assumes 3–5 mid-range workers (GTX 1080 Ti / RTX 3070, ~12 TFLOPS avg, 10% coordination overhead).
- AWS cost = time × ₹280/hr (p3.2xlarge spot).
- Local PC cost = electricity at ₹8/kWh.
- DTrain cost = flat tier price from the ₹10–₹500 platform ladder.
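The two formulas in the methodology can be written out as a short sketch. The TFLOPS and utilisation figures and the ₹280/hr rate come from the text above; the function names, the `Setup` type, and the example FLOP budget are assumptions chosen only to make the arithmetic concrete.

```typescript
interface Setup {
  name: string;
  tflops: number;       // published peak TFLOPS
  utilisation: number;  // fraction of peak assumed to be sustained
}

const SETUPS: Setup[] = [
  { name: "GTX 1060", tflops: 4, utilisation: 0.7 },
  { name: "RTX 4090", tflops: 82.6, utilisation: 0.85 },
  { name: "AWS V100", tflops: 14, utilisation: 0.9 },
];

// Time = Total FLOPs / (GPU TFLOPS x utilisation), converted to hours.
function trainingHours(totalFlops: number, setup: Setup): number {
  const flopsPerSecond = setup.tflops * 1e12 * setup.utilisation;
  return totalFlops / flopsPerSecond / 3600;
}

// AWS cost = time x ₹280/hr, rounded to whole rupees.
function awsCostRupees(hours: number): number {
  return Math.round(hours * 280);
}

// Hypothetical workload sized to take one hour on the GTX 1060 setup.
const exampleFlops = 2.8e12 * 3600;
const gtxHours = trainingHours(exampleFlops, SETUPS[0]);
console.log(gtxHours.toFixed(2), awsCostRupees(1.8));
```

For a fixed FLOP budget, the ratio of training times between two setups is the inverse of the ratio of their effective throughput (TFLOPS × utilisation).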
| Model / Task | Low-end PC | ✅ DTrain | High-end PC | AWS Cloud |
|---|---|---|---|---|
| Scikit-learn — time | 1.4h | 0.35h | 0.25h | 0.18h |
| Scikit-learn — cost | ₹9 | ₹20 | ₹4 | ₹50 |
| XGBoost — time | 5.2h | 1.1h | 0.8h | 0.55h |
| XGBoost — cost | ₹33 | ₹40 | ₹12 | ₹154 |
| PyTorch CNN — time | 18.4h | 4.1h | 2.7h | 1.8h |
| PyTorch CNN — cost | ₹118 | ₹100 | ₹43 | ₹504 |
| BERT fine-tune — time | 38.0h | 6.8h | 5.2h | 3.4h |
| BERT fine-tune — cost | ₹243 | ₹200 | ₹83 | ₹952 |
| Platform | Cost (CNN job) | Setup Time | Live Logs | Auto Billing | Job Isolation |
|---|---|---|---|---|---|
| AWS EC2 p3.2xlarge | ₹504 | 30–60 min | ❌ | ❌ | Full VM |
| Google Colab Pro | ₹900/month flat | 5–10 min | ❌ | ❌ | Shared runtime |
| Paperspace Gradient | ₹400–₹800 | 15 min | Partial | ❌ | Container |
| Local machine only | ~₹118 (electricity) | 0 min | ❌ | N/A | None |
| ✅ DTrain | ₹100 flat | ~2 min | ✅ Real-time | ✅ Stripe | Docker |
DTrain's sweet spot is medium-to-heavy jobs (PyTorch CNNs, Transformers). For the CNN and BERT rows in the table above, it is about 4.5–5.6× faster than a low-end PC and roughly 79–80% cheaper than AWS, with no cloud account setup required.
| Technology | Version | Role |
|---|---|---|
| React / TypeScript | 18 / 5.5 | UI framework |
| Vite / Tailwind CSS | 5.4 / 3.4 | Build + styling |
| Socket.IO | 4.8 | Live job status updates |
```
dtrain/
│
├── backend                          # Express.js REST API + Socket.IO server
│   ├── middlewares
│   │   └── authMiddleware.js        # JWT verify → attaches req.user to every protected route
│   ├── routes
│   │   ├── JobRoutes.js             # /create /publish /cancel /list /:id
│   │   ├── PaymentRoutes.js         # Stripe checkout, webhook handler, wallet top-up & payout
│   │   ├── UserRoutes.js            # /signup /signin /profile /update /delete
│   │   └── WorkerRoutes.js          # /register /available-jobs /accept /push-log /complete /fail /delete
│   ├── schemas                      # Mongoose data models (MongoDB)
│   │   ├── BillingSchema.js         # Per-job billing record: links user ↔ job ↔ worker
│   │   ├── JobMetricsSchema.js      # Training duration, resource usage snapshots
│   │   ├── JobSchema.js             # Core model: status enum, pricing, logs[], config{}
│   │   ├── TransactionSchema.js     # Wallet credit / debit ledger
│   │   ├── UserSchema.js            # email, passwordHash, walletBalance
│   │   └── WorkerSchema.js          # deviceId, gpu, currentStatus, walletBalance
│   ├── utils
│   │   ├── jwt.js                   # signToken / verifyToken helpers
│   │   ├── paymentHelpers.js ⭐     # Tier scoring engine + Groq LLM call + 80/20 split logic
│   │   ├── redis.js                 # Redis publisher instance (job queue pub-sub)
│   │   ├── stripeClient.js          # Stripe SDK singleton
│   │   └── supabaseClient.js        # Supabase client (service role: ZIP upload & model fetch)
│   ├── .gitignore
│   ├── index.js ⭐                  # Entry point: Express app, Socket.IO setup, DB connect
│   ├── package-lock.json
│   └── package.json
│
├── electron-worker                  # Desktop worker agent (Windows .exe)
│   ├── assets
│   │   ├── icon.ico
│   │   └── icon.png
│   ├── release                      # Build output: do not edit manually
│   │   ├── win-unpacked
│   │   │   ├── locales              # Electron i18n locale packs (auto-generated)
│   │   │   │   └── *.pak (55 languages)
│   │   │   ├── resources
│   │   │   │   ├── app.asar         # Packaged Electron app bundle
│   │   │   │   └── elevate.exe
│   │   │   ├── DTrain Worker.exe    # Unpacked executable (for testing)
│   │   │   └── *.dll / *.pak / *.bin  # Chromium runtime dependencies
│   │   ├── DTrain Worker Setup 1.0.0.exe ⭐  # Distributable installer (electron-builder)
│   │   ├── DTrain Worker Setup 1.0.0.exe.blockmap
│   │   ├── builder-debug.yml
│   │   └── builder-effective-config.yaml
│   ├── .gitignore
│   ├── main.js ⭐                   # Core agent: job polling, Docker spawn, log streaming, upload
│   ├── package-lock.json
│   ├── package.json
│   └── preload.js                   # Electron contextBridge: IPC between main ↔ renderer
│
├── frontend                         # User-facing React app (port 5173)
│   ├── public
│   │   ├── Favicon.ico
│   │   ├── logo.png
│   │   └── logo1.png
│   ├── src
│   │   ├── components
│   │   │   ├── ActiveWorkers.tsx    # Live map / list of online worker nodes
│   │   │   ├── Dashboard.tsx ⭐     # Job list, status filters, stats overview
│   │   │   ├── Documentation.tsx    # In-app how-to guide
│   │   │   ├── HeroSection.tsx      # Landing page: hero, CTA, feature highlights
│   │   │   ├── JobDetail.tsx ⭐     # Live log stream, status timeline, model download
│   │   │   ├── JobSubmission.tsx ⭐ # Multi-step: upload → AI pricing preview → publish & pay
│   │   │   ├── PendingJobs.tsx      # Queue view of jobs awaiting a worker
│   │   │   ├── ProfileDropdown.tsx  # User menu: profile info, sign out
│   │   │   ├── RunningJobs.tsx      # Live view of currently processing jobs
│   │   │   ├── SettingsModal.tsx ⭐ # Update name & email · permanent account deletion
│   │   │   ├── SignIn.tsx           # Login form with JWT storage
│   │   │   ├── SignUp.tsx           # Registration form
│   │   │   └── Wallet.tsx ⭐        # Balance card, Stripe top-up, transaction history
│   │   ├── types
│   │   │   └── index.ts             # Shared TypeScript interfaces: Job, Worker, User, Billing...
│   │   ├── App.tsx                  # Root component: router + auth guard
│   │   ├── index.css                # Tailwind base styles
│   │   ├── main.tsx                 # React DOM entry point
│   │   └── vite-env.d.ts            # Vite env type declarations
│   ├── .gitignore
│   ├── eslint.config.js
│   ├── index.html                   # HTML shell: Vite injects bundle here
│   ├── package-lock.json
│   ├── package.json
│   ├── postcss.config.js
│   ├── tailwind.config.js
│   ├── tsconfig.app.json
│   ├── tsconfig.json
│   ├── tsconfig.node.json
│   └── vite.config.ts
│
└── worker-ui                        # Worker browser dashboard (port 3000)
    ├── public
    │   ├── Favicon.ico
    │   ├── logo.png
    │   └── logo1.png
    ├── src
    │   ├── components
    │   │   ├── Documentation.tsx    # In-app guide for workers
    │   │   ├── HeroSection.tsx      # Worker landing / onboarding page
    │   │   ├── JobDetail.tsx        # Per-job detail with log preview
    │   │   ├── PayoutRequest.tsx ⭐ # Payout form → POST /api/payment/payout-request
    │   │   ├── PricingSettings.tsx  # Worker sets minimum accepted job price
    │   │   ├── ProfileDropdown.tsx  # Worker menu: profile info, sign out
    │   │   ├── RunningJobs.tsx      # Active job monitor with live log preview
    │   │   ├── SettingsModal.tsx ⭐ # Update name & email · delete user account + linked worker record
    │   │   ├── SignIn.tsx           # Worker login
    │   │   ├── SignUp.tsx           # Worker registration
    │   │   ├── WalletCard.tsx       # Earnings balance + payout trigger
    │   │   ├── WorkerDashboard.tsx ⭐     # Earnings summary, recent jobs, wallet overview
    │   │   └── WorkerRegistration.tsx ⭐  # First-time device registration: sends hardware specs
    │   ├── types
    │   │   └── index.ts             # Shared TypeScript interfaces
    │   ├── App.tsx                  # Router + auth guard
    │   ├── index.css
    │   ├── main.tsx
    │   └── vite-env.d.ts
    ├── .gitignore
    ├── README.md
    ├── eslint.config.js
    ├── index.html
    ├── package-lock.json
    ├── package.json
    ├── postcss.config.js
    ├── tailwind.config.js
    ├── tsconfig.app.json
    ├── tsconfig.json
    ├── tsconfig.node.json
    └── vite.config.ts
```
Ensure the following are installed and configured before running DTrain:
Runtime & Tools
- Node.js v20 or later + npm v10+
- Git
- Docker Desktop: required on any machine running the Electron worker agent

Cloud Services (all have free tiers)
- MongoDB Atlas: free M0 cluster is sufficient for development
- Supabase: create a project and a storage bucket named `jobs`
- Stripe: test mode keys work for the full flow
- Groq: free API key for `llama3-70b-8192` or compatible model
- Redis: run locally via Docker (instructions below)
Create a file named .env inside the backend/ directory. Never commit this file — it is already in .gitignore.
```env
# ── Database ─────────────────────────────────────────────────────────────────
MONGO_URI=mongodb+srv://<username>:<password>@<cluster>.mongodb.net/?appName=DTrain

# ── Supabase (object storage for ZIPs and model outputs) ─────────────────────
SUPABASE_URL=https://<project-id>.supabase.co
SUPABASE_SERVICE_ROLE=eyJ...        # Service role key (NOT the anon key)

# ── Redis (job queue) ─────────────────────────────────────────────────────────
REDIS_URL=redis://127.0.0.1:6379

# ── Authentication ────────────────────────────────────────────────────────────
JWT_SECRET=<minimum-32-char-random-string>

# ── Stripe (payments) ─────────────────────────────────────────────────────────
STRIPE_SECRET_KEY=sk_test_<your-stripe-secret-key>
STRIPE_WEBHOOK_SECRET=whsec_<your-stripe-webhook-secret>

# ── Groq AI (job complexity pricing) ─────────────────────────────────────────
GROQ_API_KEY=gsk_<your-groq-api-key>

# ── Server ────────────────────────────────────────────────────────────────────
PORT=5000
FRONTEND_URL=http://localhost:5173
```

| Variable | Where to get it |
|---|---|
| `MONGO_URI` | MongoDB Atlas → Clusters → Connect → Drivers |
| `SUPABASE_URL` | Supabase dashboard → Project Settings → API |
| `SUPABASE_SERVICE_ROLE` | Supabase dashboard → Project Settings → API → service_role key |
| `JWT_SECRET` | Generate with: `node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"` |
| `STRIPE_SECRET_KEY` | Stripe Dashboard → Developers → API keys |
| `STRIPE_WEBHOOK_SECRET` | Stripe Dashboard → Developers → Webhooks → signing secret |
| `GROQ_API_KEY` | console.groq.com → API Keys |
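A startup check along these lines can catch a missing or empty variable before the server boots. It is a sketch only: `missingEnv` and `isJwtSecretLongEnough` are hypothetical helpers, and treating `PORT` and `FRONTEND_URL` as optional is an assumption.

```typescript
// Variables from the .env template that the sketch treats as mandatory.
const REQUIRED_ENV: string[] = [
  "MONGO_URI",
  "SUPABASE_URL",
  "SUPABASE_SERVICE_ROLE",
  "REDIS_URL",
  "JWT_SECRET",
  "STRIPE_SECRET_KEY",
  "STRIPE_WEBHOOK_SECRET",
  "GROQ_API_KEY",
];

// Names of required variables that are absent or blank.
function missingEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// The template asks for a JWT secret of at least 32 characters.
function isJwtSecretLongEnough(secret: string | undefined): boolean {
  return secret !== undefined && secret.length >= 32;
}

// Example with a deliberately incomplete environment.
const sampleEnv: Record<string, string | undefined> = {
  MONGO_URI: "mongodb+srv://example",
  REDIS_URL: "redis://127.0.0.1:6379",
  JWT_SECRET: "short",
  GROQ_API_KEY: "   ",
};
const missing = missingEnv(sampleEnv);
console.log(missing.join(", "));
```

In the backend this would be called with `process.env` at the top of the entry point, exiting with a clear message if the returned list is not empty.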
```bash
git clone https://github.com/<your-org>/dtrain.git
cd dtrain
```

DTrain uses Redis as a job queue and pub-sub channel. Start a Redis container:

```bash
docker run --name dtrain-redis -p 6379:6379 -d redis
```

Confirm it's running:

```bash
docker ps
# Should show dtrain-redis with status "Up"
```

To stop and remove later:

```bash
docker stop dtrain-redis && docker rm dtrain-redis
```

```bash
cd backend
cp .env.example .env   # if .env.example exists, otherwise create from scratch
# Fill in all values as described in the Environment Variables section above
```

Run from the project root:

```bash
# Backend
cd backend && npm install && cd ..

# User frontend
cd frontend && npm install && cd ..

# Worker browser UI
cd worker-ui && npm install && cd ..

# Electron desktop agent
cd electron-worker && npm install && cd ..
```

In your Supabase project, create a storage bucket named `jobs` and set it to public (so the signed URLs the backend generates are accessible to workers and users).
Open four separate terminal windows from the project root directory:
```bash
cd backend
npm run dev
```

Expected output:

```
Server running on port 5000
MongoDB connected ✅
Redis connected ✅
```

```bash
cd frontend
npm run dev
```

Open → http://localhost:5173

```bash
cd worker-ui
npm run dev
```

Open → http://localhost:3000

```bash
cd electron-worker
npm start
```
⚠️ Docker Desktop must be running before you launch the Electron agent. The agent spawns Docker containers to execute jobs. Without Docker running, job execution will fail immediately.
Docker Desktop → Redis container → Backend → Frontend / Worker UI → Electron Agent
| Method | Endpoint | Auth | Description |
|---|---|---|---|
| POST | `/api/user/signup` | None | Register new user |
| POST | `/api/user/signin` | None | Login, returns JWT |
| GET | `/api/user/profile` | JWT | Get current user |

| Method | Endpoint | Auth | Description |
|---|---|---|---|
| GET | `/api/jobs/` | JWT | List user's jobs (all statuses) |
| POST | `/api/jobs/create` | JWT | Upload ZIP, analyse, save as draft |
| POST | `/api/jobs/publish/:id` | JWT | Pay & publish draft to workers |
| GET | `/api/jobs/:id` | JWT | Get single job with logs |
| POST | `/api/jobs/cancel/:id` | JWT | Cancel a draft or pending job |
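
The job endpoints imply a small lifecycle: created as a draft, published to pending, picked up by a worker, then completed or failed, with cancellation allowed only for draft or pending jobs. The sketch below encodes that as a transition table. The status names other than "draft" and "pending" are assumptions, since the actual enum in `JobSchema.js` is not shown here.

```typescript
// Assumed status names; only "draft" and "pending" appear in the text above.
type JobStatus = "draft" | "pending" | "running" | "completed" | "failed" | "cancelled";

// Allowed next states for each status.
const NEXT_STATES: Record<JobStatus, JobStatus[]> = {
  draft: ["pending", "cancelled"],     // publish or cancel
  pending: ["running", "cancelled"],   // worker accepts, or user cancels
  running: ["completed", "failed"],    // complete-job or fail-job
  completed: [],
  failed: [],
  cancelled: [],
};

function canTransition(from: JobStatus, to: JobStatus): boolean {
  return NEXT_STATES[from].indexOf(to) !== -1;
}

console.log(canTransition("draft", "pending"));     // true
console.log(canTransition("running", "cancelled")); // false
```

A table like this gives each route handler one place to reject an out-of-order request, such as a cancel that arrives after a worker has already accepted the job.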

| Method | Endpoint | Auth | Description |
|---|---|---|---|
| POST | `/api/worker/register` | JWT | Register device + hardware specs |
| GET | `/api/worker/available-jobs` | None* | Poll for pending jobs (heartbeat) |
| POST | `/api/worker/accept-job` | None* | Claim a job |
| POST | `/api/worker/push-log` | None* | Stream a log line |
| POST | `/api/worker/complete-job` | None* | Mark job done + trigger payout |
| POST | `/api/worker/fail-job` | None* | Report failure + trigger refund |

*Worker endpoints use deviceId for identification rather than JWT.

| Method | Endpoint | Auth | Description |
|---|---|---|---|
| POST | `/api/payment/checkout` | JWT | Create Stripe payment intent |
| POST | `/api/payment/webhook` | Stripe sig | Handle Stripe events |
| GET | `/api/payment/wallet` | JWT | Get wallet balance + history |
| POST | `/api/payment/topup` | JWT | Add funds to wallet |
| POST | `/api/payment/payout-request` | JWT | Worker requests bank payout |

Manual walkthrough to test the full flow:
As a User:
- Open `http://localhost:5173` → Sign up
- Go to Wallet → Top Up → use Stripe test card `4242 4242 4242 4242` (any future expiry, any CVC)
- Go to Dashboard → Submit Job
- Upload a ZIP containing your Python training script + `requirements.txt`
- Enter your main filename (e.g. `train.py`)
- Click Analyse and wait for the AI pricing engine to return a price
- Review the price and click Publish to pay from your wallet
- Navigate to the job detail page and wait for a worker to pick it up
As a Worker:
- Open Docker Desktop and make sure it's running
- Launch the Electron worker app (`npm start` in `electron-worker/`)
- Open `http://localhost:3000` → Sign up as a worker → Register device
- The agent will automatically poll and pick up the published job
- Watch logs stream in real time on both the worker agent and the user's job detail page
- When done, the model is uploaded and your wallet is credited
Potential improvements and features for future versions:
- GPU-accelerated Docker containers (NVIDIA Container Toolkit)
- Multi-worker job parallelism (data-parallel training)
- Worker reputation and rating system
- macOS and Linux Electron agent builds
- Job template library (common training scripts)
- Webhook notifications (email/Slack on job completion)
- Admin dashboard for platform analytics
- Worker GPU benchmarking before job assignment
This project was submitted as an academic final year B.Tech capstone project. All rights reserved by the authors and institution. Not licensed for commercial use or redistribution without explicit written permission from the team.






