SOP: Goose Standalone Local Model (Windows 11, No Docker)¶
Document Type: Standard Operating Procedure (SOP)
Version: 1.01.26
Status: Approved for Use
Audience: Technician + Client
Confidentiality: Internal / Client Delivery
Platform: Windows 11 Only
1. Purpose¶
To deploy and operate Goose as the sole application layer running a local LLM on Windows 11 without Docker containers, enabling a user-friendly desktop AI assistant capable of interacting with local files and workflows.
2. Scope¶
This SOP applies to:
- Users who want a single-application setup
- Windows 11 environments
- Clients who do not require containerization, automation stacks, or terminal workflows

Not included:
- Docker-based deployments
- Linux deployments
- Scheduled automation (requires n8n/container integrations)
- Air-gapped configurations
3. Responsibilities¶
Technician Responsibilities
- Install Goose on the host
- Download and configure local LLM models
- Confirm basic file operations if enabled
- Communicate hardware limitations and privacy constraints

Client Responsibilities
- Provide hardware and approve use cases
- Understand performance trade-offs
- Confirm sensitivity level of data used with Goose

(Optional) IT/Compliance Responsibilities
- Validate privacy and offline use policy for sensitive files
4. Requirements¶
4.1 Minimum Hardware (Windows-Only Use)¶
- CPU: 8 cores
- RAM: 16 GB
- Disk: 20 GB free
- GPU: Optional (CPU-only acceptable)
4.2 Recommended Hardware¶
- CPU: 12+ cores
- RAM: 32 GB
- GPU: 8–24 GB VRAM (NVIDIA preferred)
4.3 GPU Practical Notes¶
- NVIDIA strongly preferred for consumer-grade local inference
- AMD support may be limited or CPU fallback may activate
- CPU-only acceptable for small reasoning workloads
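The minimums above can be checked quickly before installation. A minimal PowerShell sketch (the CIM cmdlets are standard on Windows 11; the optional VRAM line assumes an NVIDIA driver with `nvidia-smi` on PATH, and the drive letter C is an assumption):

```shell
# Hedged sketch: compare the host against the SOP minimums (Windows PowerShell).
$cores  = (Get-CimInstance Win32_Processor | Measure-Object NumberOfCores -Sum).Sum
$ramGB  = [math]::Round((Get-CimInstance Win32_ComputerSystem).TotalPhysicalMemory / 1GB)
$freeGB = [math]::Round((Get-PSDrive C).Free / 1GB)
"Cores: $cores (need 8+)  RAM: ${ramGB} GB (need 16+)  Free on C: ${freeGB} GB (need 20+)"
# Optional, NVIDIA only: nvidia-smi --query-gpu=memory.total --format=csv
```

Record the output in the delivery notes so hardware-related performance discussions (Section 12) have a baseline.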
4.4 Supported OS¶
- Windows 11 (only)
5. Model Selection Note¶
Example reference model for Goose standalone:
Llama-3.1-8B-Instruct-Q4_K_M — chosen for compatibility with 8–12 GB VRAM laptops/desktops and smooth UI operation.
(Heavier models like 14B may run on CPU or larger VRAM GPUs but are not required for this SOP.)
6. Installation Procedure — Windows 11¶
6.1 Download Goose¶
Option A — Winget:
winget install block.goose
Option B — Direct Installer (recommended for clients):
Download .exe from: https://block.github.io/goose
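For technicians who prefer Option A, the winget route can be scripted and verified in one step. A minimal sketch using the package ID shown above (the `--accept-*` flags simply suppress interactive prompts for unattended installs):

```shell
# Hedged sketch: install Goose via winget and confirm it registered (Windows PowerShell).
winget install --id block.goose --accept-source-agreements --accept-package-agreements
winget list --id block.goose   # should list the installed Goose package
```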
6.2 First Launch¶
- Launch Goose from the Start menu.
- Complete any onboarding screens.
6.3 Download Local Model¶
Within Goose:
1. Open Models panel.
2. Search: Llama-3.1-8B-Instruct-Q4 (or similar 7–8B Q4 model).
3. Download and load the model.
4. Confirm successful initialization.
6.4 Configure Resource Settings¶
In Goose settings:
- Enable GPU if NVIDIA hardware is present.
- Set output/token rate to balanced mode.
- Reduce the context window if memory is constrained.
7. Usage¶
7.1 Chat Interface¶
Goose provides:
- Chat responses
- Task decomposition
- Context handling
7.2 Optional File Tools¶
Technician may enable:
- File read
- File write
- Directory listing
- Basic workflow assistance
Client must explicitly approve file access within Goose.
8. Who Chooses This Setup (Decision Context)¶
Ideal when client wants:
- Lowest complexity possible.
- Local ChatGPT-style usage.
- Minimal infrastructure and no Docker.
- No need for automation or scheduling.

Not ideal when:
- Scheduled tasks or complex workflows are required (→ SOP #4).
- Compliance isolation or strong security boundaries are required (→ SOP #1/#2).
- Linux environments are primary targets.
9. Validation / Verification¶
Technician verifies:
- Model loads and responds coherently.
- File access tools function as expected (if enabled).

Client verifies:
- Response quality meets expectations.
- Performance is acceptable for intended tasks.
10. Optional Lockdown (Higher Privacy)¶
For environments using sensitive files (but not highly classified data):
- Restrict outbound traffic via firewall rules.
- Disable automatic updates where permissible.
- Store models in non-synced folders (avoid OneDrive/Dropbox).
- Require user approval before file operations.
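The firewall restriction above can be applied with a built-in cmdlet. A minimal sketch (run in an elevated PowerShell session; the install path below is an assumption, so confirm the actual Goose executable location on the host first):

```shell
# Hedged sketch: block outbound traffic for the Goose executable (run as Administrator).
# The program path is an assumed default install location -- verify it on the host.
New-NetFirewallRule -DisplayName "Block Goose Outbound" `
    -Direction Outbound `
    -Program "$env:LOCALAPPDATA\Programs\Goose\Goose.exe" `
    -Action Block
```

Remove the rule with `Remove-NetFirewallRule -DisplayName "Block Goose Outbound"` if the lockdown is later lifted.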
11. Maintenance¶
- Update Goose manually when approved.
- Update model versions offline as needed.
- Re-test performance and responsiveness after updates.
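Where Goose was installed via winget (Section 6.1, Option A), the manual-update step can be sketched as a check-then-apply pair, keeping the update decision with the approver:

```shell
# Hedged sketch: manual, approved-only update flow via winget.
winget upgrade --id block.goose    # shows whether a newer version is available; apply nothing yet
# Once approved:
# winget upgrade --id block.goose --accept-package-agreements
```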
12. Notes / Warnings¶
- 8B models recommended for 8 GB VRAM machines; 14B models may be too heavy.
- AMD support not guaranteed; performance varies widely.
- Not intended for multi-user, automated, or compliance-heavy workflows.
13. Revision Control¶
- Version: 1.01.26
- Editor: Elijah B
- Next Review: Within 90 Days