76 changes: 9 additions & 67 deletions .dockerignore
@@ -1,76 +1,18 @@
# Version control
.git
.gitignore
.gitattributes

# Python
__pycache__
__pycache__/
*.pyc
*.pyo
*.pyd
.Python
*.so
*.egg
*.egg-info
dist
build

# Virtual Environment
venv/
env/
ENV/
.venv/

# IDE
.vscode/
.idea/
*.swp
*.swo
*~
.DS_Store

# Documentation
README.md
LICENSE
*.egg-info/
dist/
build/
.git/
.gitignore
*.md
!README.md
docs/

# Docker
Dockerfile
docker-compose.yml
docker-compose.override.yml
.dockerignore

# CI/CD
.github/
.gitlab-ci.yml
.travis.yml
Jenkinsfile

# Testing
tests/
test/
*.test.py
.pytest_cache/
.coverage
htmlcov/
.tox/

# Logs
*.log
logs/

# Temporary files
tmp/
temp/
*.tmp
*.temp

# OS files
*.png
LICENSE
.DS_Store
Thumbs.db

# Environment files
.env
.env.local
.env.*.local
185 changes: 45 additions & 140 deletions README.md
@@ -1,9 +1,8 @@
<div align="center">

# GPU Hot
### **Real-time NVIDIA GPU Monitoring Dashboard**

Monitor NVIDIA GPUs from any browser. No SSH, no configuration – just start and view in real-time.
Real-time NVIDIA GPU monitoring dashboard. Web-based, no SSH required.

[![Python](https://img.shields.io/badge/Python-3.8+-3776AB?style=flat-square&logo=python&logoColor=white)](https://www.python.org/)
[![Docker](https://img.shields.io/badge/Docker-Ready-2496ED?style=flat-square&logo=docker&logoColor=white)](https://www.docker.com/)
@@ -14,108 +13,69 @@ Monitor NVIDIA GPUs from any browser. No SSH, no configuration – just start and view in real-time.

</div>

---

## Quick Start
## Usage

### Docker (recommended)
Monitor a single machine or an entire cluster with the same Docker image.

**Single machine:**
```bash
docker run -d --name gpu-hot --gpus all -p 1312:1312 ghcr.io/psalias2006/gpu-hot:latest
docker run -d --gpus all -p 1312:1312 ghcr.io/psalias2006/gpu-hot:latest
```

**Force nvidia-smi mode (for older GPUs):**
**Multiple machines:**
```bash
docker run -d --name gpu-hot --gpus all -p 1312:1312 -e NVIDIA_SMI=true ghcr.io/psalias2006/gpu-hot:latest
# On each GPU server
docker run -d --gpus all -p 1312:1312 -e NODE_NAME=$(hostname) ghcr.io/psalias2006/gpu-hot:latest

# On a hub machine (no GPU required)
docker run -d -p 1312:1312 -e GPU_HOT_MODE=hub -e NODE_URLS=http://server1:1312,http://server2:1312,http://server3:1312 ghcr.io/psalias2006/gpu-hot:latest
```

Open `http://localhost:1312`

### From source
**Older GPUs:** Add `-e NVIDIA_SMI=true` if metrics don't appear.

**From source:**
```bash
git clone https://github.com/psalias2006/gpu-hot
cd gpu-hot
docker-compose up --build
```

### Local dev

```bash
pip install -r requirements.txt
python app.py
```

**Requirements:** Docker + NVIDIA Container Toolkit ([install guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html))
**Requirements:** Docker + [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)

---

## Features

**Sub-Second Updates:**
- **Lightning-fast refresh rates**
- Historical data tracking
- WebSocket real-time streaming

**Charts:**
- Utilization, Temperature, Memory, Power
- Fan Speed, Clock Speeds, Power Efficiency

**Monitoring:**
- Multi-GPU detection
- Process tracking (PID, memory usage)
- System CPU/RAM
- WebSocket real-time updates

**Metrics:**
- GPU & Memory Utilization (%)
- Temperature (GPU core, memory)
- Memory (used/free/total)
- Power draw & limits
- Fan Speed (%)
- Clock Speeds (graphics, SM, memory, video)
- PCIe Gen & width
- Performance State (P-State)
- Compute Mode
- Encoder/Decoder sessions
- Throttle status
- Real-time metrics (sub-second)
- Automatic multi-GPU detection
- Process monitoring (PID, memory usage)
- Historical charts (utilization, temperature, power, clocks)
- System metrics (CPU, RAM)
- Scale from 1 to 100+ GPUs

**Metrics:** Utilization, temperature, memory, power draw, fan speed, clock speeds, PCIe info, P-State, throttle status, encoder/decoder sessions

---

## Configuration

Optional. Edit `core/config.py`:

```python
UPDATE_INTERVAL = 0.5 # NVML polling interval (fast)
NVIDIA_SMI_INTERVAL = 2.0 # nvidia-smi polling interval (slower to reduce overhead)
PORT = 1312 # Web server port
DEBUG = False
```

Environment variables:
**Environment variables:**
```bash
NVIDIA_VISIBLE_DEVICES=0,1 # Specific GPUs (default: all)
NVIDIA_SMI=true # Force nvidia-smi mode for all GPUs
```

**nvidia-smi Fallback:**
- Automatically detects GPUs that don't support NVML utilization metrics
- Falls back to nvidia-smi for those GPUs
- Compatible with older GPUs (Quadro P1000, Tesla, etc.)

**Force nvidia-smi for all GPUs:**
- Docker: `docker run -e NVIDIA_SMI=true ...`
- Config: Set `NVIDIA_SMI = True` in `core/config.py`

Frontend tuning in `static/js/socket-handlers.js`:
```javascript
DOM_UPDATE_INTERVAL = 1000 // Text updates frequency (ms)
SCROLL_PAUSE_DURATION = 100 // Scroll optimization (ms)
NVIDIA_SMI=true # Force nvidia-smi mode for older GPUs
GPU_HOT_MODE=hub # Set to 'hub' for multi-node aggregation (default: single node)
NODE_NAME=gpu-server-1 # Node display name (default: hostname)
NODE_URLS=http://host:1312... # Comma-separated node URLs (required for hub mode)
```

Chart history in `static/js/charts.js`:
```javascript
if (data.labels.length > 120) // Data points to keep
**Backend (`core/config.py`):**
```python
UPDATE_INTERVAL = 0.5 # Polling interval
PORT = 1312 # Server port
```

---
@@ -131,41 +91,10 @@ GET /api/gpu-data # JSON metrics
### WebSocket
```javascript
socket.on('gpu_data', (data) => {
// Updates every 0.5s
// data.gpus, data.processes, data.system
// Updates every 0.5s (configurable)
// Contains: data.gpus, data.processes, data.system
});
```
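
The same data is also available over plain HTTP for scripting. A minimal polling sketch in Python — the endpoint is the `/api/gpu-data` route above; the `requests` dependency and the exact response fields are assumptions, so inspect the payload before relying on specific keys:

```python
# Minimal polling sketch. The response shape is assumed to mirror the
# WebSocket payload described above (gpus / processes / system).
import time
import requests  # third-party: pip install requests

URL = "http://localhost:1312/api/gpu-data"  # adjust host/port for your setup

while True:
    data = requests.get(URL, timeout=5).json()
    print(sorted(data.keys()))   # inspect the actual top-level fields
    print(data.get("gpus"))      # per-GPU metrics, if present
    time.sleep(2)                # poll less often than the server's 0.5s interval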

---

## Extending

Add new metrics:

**Backend (`core/metrics/collector.py`):**
```python
# Add NVML query
value = pynvml.nvmlDeviceGetYourMetric(handle)
gpu_data['your_metric'] = value
```
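
As a concrete illustration (not part of the current collector), PCIe throughput could be wired in the same way. The metric key `pcie_tx_kbps` and the placement are hypothetical; `nvmlDeviceGetPcieThroughput` is a standard pynvml call:

```python
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # collector holds one handle per GPU
gpu_data = {}

# Hypothetical extra metric: PCIe TX throughput (KB/s).
# The key name is illustrative; choose whatever the frontend expects.
gpu_data['pcie_tx_kbps'] = pynvml.nvmlDeviceGetPcieThroughput(
    handle, pynvml.NVML_PCIE_UTIL_TX_BYTES
)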

**Frontend (`static/js/gpu-cards.js`):**
```javascript
// Add to card template
<div class="metric-value" id="your-metric-${gpuId}">
${gpuInfo.your_metric}
</div>

// Add to update function
if (yourMetricEl) yourMetricEl.textContent = gpuInfo.your_metric;
```

**Chart (optional):**
```javascript
// static/js/charts.js
chartConfigs.yourMetric = { type: 'line', ... };
```

---

## Project Structure
@@ -196,53 +125,29 @@ gpu-hot/

---

## Performance

Frontend uses `requestAnimationFrame` batching to minimize reflows. Scroll detection pauses DOM updates during scrolling.

For heavy workloads or many GPUs, increase update intervals in `core/config.py`.

---

## Troubleshooting

**GPU not detected:**
**No GPUs detected:**
```bash
# Verify drivers
nvidia-smi

# Test Docker GPU access
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi

# Restart Docker
sudo systemctl restart docker
nvidia-smi # Verify drivers work
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi # Test Docker GPU access
```

**Performance issues:**
- Increase `UPDATE_INTERVAL` in `core/config.py`
- Reduce chart history in `static/js/charts.js`
- Check browser console for errors

**Debug mode:**
```python
# core/config.py
DEBUG = True
**Hub can't connect to nodes:**
```bash
curl http://node-ip:1312/api/gpu-data # Test connectivity
sudo ufw allow 1312/tcp # Check firewall
```
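
To check every node a hub is configured for in one pass, a small Python sketch along these lines can help — the node URLs below are placeholders (use the same list you pass in `NODE_URLS`), and it simply hits the same `/api/gpu-data` endpoint as the curl test above:

```python
# Quick reachability check for each node URL a hub would aggregate.
import requests  # third-party: pip install requests

nodes = ["http://server1:1312", "http://server2:1312"]  # placeholders

for node in nodes:
    try:
        r = requests.get(f"{node}/api/gpu-data", timeout=5)
        print(f"{node}: HTTP {r.status_code}")
    except requests.RequestException as exc:
        print(f"{node}: unreachable ({exc})")
```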

**Performance issues:** Increase `UPDATE_INTERVAL` in `core/config.py`

---

## Contributing

PRs welcome. For major changes, open an issue first.
PRs welcome. Open an issue for major changes.

## License

MIT - see [LICENSE](LICENSE)

---

<div align="center">

[Report Bug](https://github.com/psalias2006/gpu-hot/issues) • [Request Feature](https://github.com/psalias2006/gpu-hot/issues)

</div>
