Deployment¶
Deployment is Docker-based. The development image is for tests and benchmarks; demo images are for serving.
Local CPU demo¶
docker build -f docker/Dockerfile.demo.cpu -t feature-elm-demo-cpu .
docker run --rm -p 8888:8888 feature-elm-demo-cpu
Local GPU demo¶
docker build -f docker/Dockerfile.demo.gpu -t feature-elm-demo-gpu .
docker run --rm --gpus all -p 8888:8888 feature-elm-demo-gpu
The GPU image uses the same HTTP surface and enables on-demand benchmark behavior when a CUDA device is visible.
Ports and environment¶
| Setting | Default | Notes |
|---|---|---|
| HTTP port | 8888 | Publish with -p 8888:8888 |
| DEMO_USE_GPU | unset | Enables GPU demo path when set |
| NVIDIA_VISIBLE_DEVICES | all | Compose dev service default |
| NVIDIA_DRIVER_CAPABILITIES | compute,utility | Compose dev service default |
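The settings above can be combined on the command line. The following is a minimal sketch, not the project's own tooling: HOST_PORT and demo_run_cmd are hypothetical names introduced here, and it assumes that any non-empty value such as 1 counts as "set" for DEMO_USE_GPU. The function only prints the command so it can be inspected before it is run.

```shell
# Hypothetical convenience variable: the host port to publish on.
# The container port stays 8888, as listed in the table.
HOST_PORT="${HOST_PORT:-8888}"

# Print the docker run command for the chosen demo flavor ("cpu" or "gpu").
demo_run_cmd() {
  case "$1" in
    gpu)
      # Assumption: DEMO_USE_GPU=1 is treated as "set" by the demo.
      echo "docker run --rm --gpus all -e DEMO_USE_GPU=1 -p ${HOST_PORT}:8888 feature-elm-demo-gpu"
      ;;
    *)
      echo "docker run --rm -p ${HOST_PORT}:8888 feature-elm-demo-cpu"
      ;;
  esac
}

demo_run_cmd cpu
```

Passing -e DEMO_USE_GPU=1 explicitly may be redundant if the GPU image already sets the variable; it is shown only to illustrate where the setting would go.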
Free-tier CPU hosting guide¶
For public repositories, the CPU demo can be hosted on free container tiers that support Docker images.
Hugging Face Spaces (Docker)¶
- Create a new Space with Docker runtime.
- Set the Dockerfile path to docker/Dockerfile.demo.cpu.
- The HTTP port 8888 is automatically mapped.
- Benchmark snapshots are pre-bundled in the image.
- Note: Spaces GPU support is available but requires a paid subscription.
Render¶
- Create a new Web Service.
- Select Docker as the runtime.
- Set the image source to ghcr.io/<owner>/feature-elm-demo-cpu:latest or use the Dockerfile.
- Map port 8888 in the service configuration.
- Render's free tier supports one web service with 750 hours/month.
Fly.io¶
- Install flyctl and run fly launch.
- Select the CPU demo image: ghcr.io/<owner>/feature-elm-demo-cpu:latest.
- The app listens on port 8888.
- Fly's free tier provides 3 shared-cpu apps with 160GB hours/month combined.
A fallback (plan-B) registry mirror is useful when a host cannot pull from GHCR. GitHub Actions may impose egress limits with a 30-day notice period; if that happens, mirror the images to Docker Hub.
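Mirroring to Docker Hub comes down to a pull, a retag, and a push. The sketch below is illustrative only: OWNER, DOCKERHUB_USER, and TAG are placeholders rather than project values, and mirror_cmds is a hypothetical helper that prints the steps instead of running them.

```shell
# Assumptions: these defaults are placeholders, not real accounts or releases.
OWNER="${OWNER:-example-owner}"
DOCKERHUB_USER="${DOCKERHUB_USER:-example-user}"
TAG="${TAG:-latest}"

SRC="ghcr.io/${OWNER}/feature-elm-demo-cpu:${TAG}"
DST="docker.io/${DOCKERHUB_USER}/feature-elm-demo-cpu:${TAG}"

# Print the three mirror steps in order: pull, retag, push.
mirror_cmds() {
  echo "docker pull ${SRC}"
  echo "docker tag ${SRC} ${DST}"
  echo "docker push ${DST}"
}

mirror_cmds
```

To execute the steps rather than print them, the output can be piped to sh after running docker login for the Docker Hub account.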
Image tags¶
Use semantic version tags rather than latest for stable deployments.
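As a rough sketch of what pinning can look like, the snippet below refuses floating tags before printing a pull command. The version 1.2.3 is a hypothetical release, OWNER is a placeholder, and is_semver is a helper introduced here; it accepts only plain MAJOR.MINOR.PATCH tags, which is narrower than the full semantic versioning grammar.

```shell
# Assumptions: OWNER is a placeholder and 1.2.3 is a hypothetical release tag.
OWNER="${OWNER:-example-owner}"
VERSION="${VERSION:-1.2.3}"

# Succeed only for plain MAJOR.MINOR.PATCH tags; "latest" and similar fail.
is_semver() {
  echo "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}

if is_semver "$VERSION"; then
  # Print the pinned pull command for inspection.
  echo "docker pull ghcr.io/${OWNER}/feature-elm-demo-cpu:${VERSION}"
else
  echo "refusing floating tag: ${VERSION}" >&2
fi
```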
Production notes¶
- Use CPU tests as the correctness gate on hosted CI, because free-tier CI does not normally include GPU runners.
- Build the CUDA image in CI to prove compilation, but do not run GPU tests there.
- Pin demo image tags to semantic versions for reproducible deployments.