
Technician Cheat Sheet: Local LLM Deployment


Version: 1.01.26
Audience: Technician / Engineer


Use this as a fast mapping from client requirements to the right SOP:

  • SOP #1 — LLM Container + Goose UI

    • When: UI needed, Docker allowed, client cares about isolation.
    • Notes: Docker/LLM_Inference, Goose installed on host.
  • SOP #2 — Terminal-Only LLM Container

    • When: Max privacy, technical user, CLI OK.
    • Notes: Docker/LLM_Inference, curl/Invoke-RestMethod.
  • SOP #3 — LM Studio Local Runner

    • When: Easiest local ChatGPT-style app, single user.
    • Notes: No Docker, Windows/Linux app.
  • SOP #4 — Goose + n8n + LLM+Agent Containers

    • When: Automation / scheduling needed (cron-style workflows).
    • Notes: Docker/LLM_Agent_Stack, Docker/Automations_n8n.
  • SOP #5 — Goose Standalone (Windows)

    • When: Non-technical Windows user, wants one app and no Docker.
    • Notes: Windows-only, 7–8B model suggested for 8 GB VRAM.
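
The mapping above can be sketched as a small decision helper. This is a minimal sketch only: the `pick_sop` function name and the yes/no flags are illustrative, not part of any SOP, and it deliberately ignores OS-specific cases like SOP #5's Windows-only constraint.

```sh
# pick_sop UI_NEEDED DOCKER_OK AUTOMATION_NEEDED  (each "yes" or "no")
# Illustrative only -- mirrors the SOP mapping above, simplified.
pick_sop() {
  ui="$1"; docker_ok="$2"; automation="$3"
  if [ "$automation" = yes ]; then
    echo "SOP #4 (Goose + n8n + agent containers)"
  elif [ "$docker_ok" = no ]; then
    # No Docker allowed: LM Studio, or Goose Standalone on Windows
    echo "SOP #3 or SOP #5 (no-Docker options)"
  elif [ "$ui" = yes ]; then
    echo "SOP #1 (LLM container + Goose UI)"
  else
    echo "SOP #2 (terminal-only container)"
  fi
}

pick_sop yes yes no   # prints "SOP #1 (LLM container + Goose UI)"
```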

Hardware sizing by GPU VRAM:

  • < 8 GB VRAM

    • Use 3B–7B Q4 models.
    • Prefer LM Studio or Goose Standalone; keep context small.
  • 8–12 GB VRAM

    • 7B–8B Q4 models comfortable, 14B possible with trade-offs.
    • All SOPs possible; choose by UX and complexity.
  • 12–24 GB VRAM

    • 8B–14B Q4 models are fine.
    • Container-based solutions work well (SOP #1/#2/#4).
  • > 24 GB VRAM

    • High-end or professional cards.
    • Any SOP, heavy workloads, long contexts.
  • AMD GPU

    • Assume CPU fallback unless explicitly validated.
    • Do not promise GPU acceleration.
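
The VRAM tiers above can be encoded as a quick helper for intake notes. A minimal sketch: `suggest_models` is an illustrative name, and the `nvidia-smi` query in the comment assumes an NVIDIA host (per the AMD caveat above, do not assume GPU support elsewhere).

```sh
# Map available VRAM (in GB) to the model tiers above.
# suggest_models is an illustrative helper, not part of any SOP.
suggest_models() {
  vram_gb="$1"
  if   [ "$vram_gb" -lt 8 ];  then echo "3B-7B Q4"
  elif [ "$vram_gb" -le 12 ]; then echo "7B-8B Q4 (14B with trade-offs)"
  elif [ "$vram_gb" -le 24 ]; then echo "8B-14B Q4"
  else                             echo "any tier, long contexts OK"
  fi
}

# On an NVIDIA host you could feed it real numbers, e.g.:
#   vram_mb=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits)
#   suggest_models $((vram_mb / 1024))
suggest_models 8   # prints "7B-8B Q4 (14B with trade-offs)"
```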

Start or stop the base inference stack:

```sh
cd Docker/LLM_Inference
docker compose up -d   # start
docker compose down    # stop
```
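
Containers report "up" before the model server is actually ready, so it helps to poll before handing over. A sketch, assuming the endpoint shown later in this sheet; `wait_for` is an illustrative helper, and the tries/sleep values are arbitrary.

```sh
# Poll until a probe command succeeds, or give up after N tries.
# wait_for is an illustrative helper; tune tries and sleep to taste.
wait_for() {
  tries="$1"; shift
  i=0
  while [ "$i" -lt "$tries" ]; do
    "$@" >/dev/null 2>&1 && return 0
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# After `docker compose up -d`, wait for the API to answer:
#   wait_for 30 curl -sf http://localhost:8000/v1/models
```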

For the n8n + agent stack (SOP #4):

```sh
cd Docker/LLM_Agent_Stack
docker compose up -d
cd ../Automations_n8n
docker compose up -d
```

Verify the API is responding:

```sh
curl http://localhost:8000/v1/models
```

Windows alternative (PowerShell):

```powershell
Invoke-RestMethod -Uri "http://localhost:8000/v1/models" -Method Get
```
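
Assuming the server speaks the OpenAI-compatible `/v1/models` format (`{"data":[{"id":...}]}`), you can list just the model ids without extra tooling such as jq. A sketch; `list_model_ids` is an illustrative helper and relies on plain `grep`/`cut`.

```sh
# Extract "id" values from an OpenAI-style /v1/models response.
# Uses grep/cut so it works without jq; assumes "id":"..." string keys.
list_model_ids() {
  grep -o '"id"[[:space:]]*:[[:space:]]*"[^"]*"' | cut -d'"' -f4
}

# Usage (assumes the endpoint from above):
#   curl -s http://localhost:8000/v1/models | list_model_ids
echo '{"data":[{"id":"llama-3-8b-q4"},{"id":"mistral-7b-q4"}]}' | list_model_ids
# prints:
#   llama-3-8b-q4
#   mistral-7b-q4
```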

If the data is doctor–patient, attorney–client privileged, PHI, or Secret-class:

  • Prefer SOP #2 (Terminal-Only) or SOP #3 (LM Studio).
  • If Goose is ever used, firewall it completely from the internet and document the exception.

Common pitfalls:

  • Promising AMD GPU acceleration (always caveat it).
  • Forgetting to mount Models/ directory.
  • Using synced folders (OneDrive/Dropbox) for model storage.
  • Enabling cloud providers in LM Studio or Goose without explicit client sign-off.
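
The synced-folder pitfall is easy to catch at install time with a path check. A minimal sketch: `is_synced_path` is an illustrative name, and the patterns cover only the common sync clients named above plus Google Drive (an assumption).

```sh
# Warn if a model directory lives inside a sync client's folder.
# is_synced_path is illustrative; extend the patterns for other sync tools.
is_synced_path() {
  case "$1" in
    *OneDrive*|*Dropbox*|*"Google Drive"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   is_synced_path "$HOME/OneDrive/Models" \
#     && echo "WARNING: move Models/ out of the synced folder"
```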

Versioning:

  • Ensure the SOP version in use is 1.01.26.
  • If making local changes, bump the version per the scheme (e.g., 1.012.26 for the second revision in the same month/year).