About the System
This site is backed by a live distributed inference cluster: four HP EliteDesk 800 G3 desktops that split the model's weights across their combined memory and communicate over gigabit Ethernet. The cluster serves an OpenAI-compatible API, exposed to the public internet through a WireGuard tunnel.
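Because the endpoint is OpenAI-compatible, any OpenAI-style client can talk to it. A minimal sketch of building such a request with only the standard library is below; the hostname `example.com` and the model identifier are illustrative assumptions, not the cluster's real values.

```python
import json
import urllib.request

# Hypothetical base URL; substitute the real public hostname.
# Any OpenAI-compatible server exposes /v1/chat/completions.
API_BASE = "https://example.com/v1"

def build_chat_request(prompt: str,
                       model: str = "deepseek-r1-distill-8b") -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = json.dumps({
        "model": model,  # model name is an assumption; servers often ignore it
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }).encode()
    return urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("Hello!")
print(req.full_url)  # https://example.com/v1/chat/completions
```

Sending the request is then one call to `urllib.request.urlopen(req)`; the response body follows the usual chat-completion JSON shape.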
Nodes: 4× HP EliteDesk 800 G3 SFF
CPU per node: Intel i5-6600 (4C/4T)
Total memory: 32 GB RAM + 32 GB swap
Current model: DeepSeek R1 Distill 8B
Inference engine: Distributed Llama
Context length: 16,384 tokens
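The 32 GB of combined RAM is comfortably enough for an 8B-parameter model once quantized. The quantization actually used by the cluster isn't stated above, so the bit-widths below are illustrative; the arithmetic also ignores KV cache and activation memory.

```python
def model_footprint_gb(params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9

PARAMS = 8e9  # 8B-parameter model, per the specs above
NODES = 4     # weights split roughly evenly across the cluster
for bits, name in [(16, "F16"), (8, "Q8"), (4, "Q4")]:
    total = model_footprint_gb(PARAMS, bits)
    print(f"{name}: ~{total:.0f} GB total, ~{total / NODES:.0f} GB per node")
```

At 4-bit quantization each node holds only about 1 GB of weights, which is why four 8 GB desktops can serve the model without touching swap for the weights themselves.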
Internet
  |
DigitalOcean VPS — Caddy reverse proxy, SSL
  |  WireGuard tunnel
Gateway — NAT + port forwarding
  |  Gigabit ethernet
  ├── Node 1 — Root + API server
  ├── Node 2 — Worker
  ├── Node 3 — Worker
  └── Node 4 — Worker
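On the VPS side, the reverse-proxy step above can be as small as a two-line Caddyfile. This is a sketch under assumptions: the public hostname and the WireGuard peer address of Node 1 (`10.0.0.2`) and its API port (`8080`) are placeholders, not the real values.

```
# Caddyfile on the DigitalOcean VPS (hostname, peer address, and
# port are illustrative placeholders)
api.example.com {
    # Caddy provisions the TLS certificate automatically; requests
    # are forwarded over the WireGuard interface to the root node.
    reverse_proxy 10.0.0.2:8080
}
```

Caddy's automatic HTTPS is what provides the "SSL" in the diagram; no separate certificate tooling is needed on the cluster itself.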