SOP: Goose Standalone Local Model (Windows 11, No Docker)¶
Document Type: Standard Operating Procedure (SOP)
Version: 1.01.26
Status: Approved for Use
Audience: Technician + Client
Confidentiality: Internal / Client Delivery
Platform: Windows 11 Only
1. Purpose¶
To deploy and operate Goose as the sole application layer running a local LLM on Windows 11 without Docker containers, enabling a user-friendly desktop AI assistant capable of interacting with local files and workflows.
2. Scope¶
This SOP applies to:
- Users who want a single-application setup
- Windows 11 environments
- Clients who do not require containerization, automation stacks, or terminal workflows

Not included:
- Docker-based deployments
- Linux deployments
- Scheduled automation (requires n8n/container integrations)
- Air-gapped configurations
3. Responsibilities¶
Technician Responsibilities
- Install Goose on the host
- Download and configure local LLM models
- Confirm basic file operations if enabled
- Communicate hardware limitations and privacy constraints

Client Responsibilities
- Provide hardware and approve use cases
- Understand performance trade-offs
- Confirm sensitivity level of data used with Goose

(Optional) IT/Compliance Responsibilities
- Validate privacy and offline use policy for sensitive files
4. Requirements¶
4.1 Minimum Hardware (Windows-Only Use)¶
- CPU: 8 cores
- RAM: 16 GB
- Disk: 20 GB free
- GPU: Optional (CPU-only acceptable)
4.2 Recommended Hardware¶
- CPU: 12+ cores
- RAM: 32 GB
- GPU: 8–24 GB VRAM (NVIDIA preferred)
4.3 GPU Practical Notes¶
- NVIDIA strongly preferred for consumer-grade local inference
- AMD support may be limited or CPU fallback may activate
- CPU-only acceptable for small reasoning workloads
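The minimums above can be checked quickly before installation. A minimal PowerShell sketch (the CIM cmdlets are standard on Windows 11; the optional VRAM line assumes an NVIDIA driver with `nvidia-smi` on PATH, and the drive letter C is an assumption):

```shell
# Hedged sketch: compare the host against the SOP minimums (Windows PowerShell).
$cores  = (Get-CimInstance Win32_Processor | Measure-Object NumberOfCores -Sum).Sum
$ramGB  = [math]::Round((Get-CimInstance Win32_ComputerSystem).TotalPhysicalMemory / 1GB)
$freeGB = [math]::Round((Get-PSDrive C).Free / 1GB)
"Cores: $cores (need 8+)  RAM: ${ramGB} GB (need 16+)  Free on C: ${freeGB} GB (need 20+)"
# Optional, NVIDIA only: nvidia-smi --query-gpu=memory.total --format=csv
```

Record the output in the delivery notes so hardware-related performance discussions (Section 12) have a baseline.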
4.4 Supported OS¶
- Windows 11 (only)
5. Model Selection Note¶
Example reference model for Goose standalone:
Llama-3.1-8B-Instruct-Q4_K_M — chosen for compatibility with 8–12 GB VRAM laptops/desktops and smooth UI operation.
(Heavier models like 14B may run on CPU or larger VRAM GPUs but are not required for this SOP.)
6. Installation Procedure — Windows 11¶
6.1 Download Goose¶
Option A — Winget:
winget install block.goose
Option B — Direct Installer (recommended for clients):
Download .exe from: https://block.github.io/goose
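For technicians who prefer Option A, the winget route can be scripted and verified in one step. A minimal sketch using the package ID shown above (the `--accept-*` flags simply suppress interactive prompts for unattended installs):

```shell
# Hedged sketch: install Goose via winget and confirm it registered (Windows PowerShell).
winget install --id block.goose --accept-source-agreements --accept-package-agreements
winget list --id block.goose   # should list the installed Goose package
```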
6.2 First Launch¶
- Launch Goose from the Start menu.
- Complete any onboarding screens.
6.3 Download Local Model¶
Within Goose:
1. Open Models panel.
2. Search: Llama-3.1-8B-Instruct-Q4 (or similar 7–8B Q4 model).
3. Download and load the model.
4. Confirm successful initialization.
6.4 Configure Resource Settings¶
In Goose settings:
- Enable GPU if NVIDIA hardware is present.
- Set output/token rate to balanced mode.
- Reduce the context window if memory is constrained.
7. Usage¶
7.1 Chat Interface¶
Goose provides:
- Chat responses
- Task decomposition
- Context handling
7.2 Optional File Tools¶
Technician may enable:
- File read
- File write
- Directory listing
- Basic workflow assistance
Client must explicitly approve file access within Goose.
8. Who Chooses This Setup (Decision Context)¶
Ideal when client wants:
- Lowest complexity possible.
- Local ChatGPT-style usage.
- Minimal infrastructure and no Docker.
- No need for automation or scheduling.

Not ideal when:
- Scheduled tasks or complex workflows are required (→ SOP #4).
- Compliance isolation or strong security boundaries are required (→ SOP #1/#2).
- Linux environments are primary targets.
9. Validation / Verification¶
Technician verifies:
- Model loads and responds coherently.
- File access tools function as expected (if enabled).

Client verifies:
- Response quality meets expectations.
- Performance is acceptable for intended tasks.
10. Optional Lockdown (Higher Privacy)¶
For environments using sensitive files (but not highly classified data):
- Restrict outbound traffic via firewall rules.
- Disable automatic updates where permissible.
- Store models in non-synced folders (avoid OneDrive/Dropbox).
- Require user approval before file operations.
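The firewall restriction above can be applied with a built-in cmdlet. A minimal sketch (run in an elevated PowerShell session; the install path below is an assumption, so confirm the actual Goose executable location on the host first):

```shell
# Hedged sketch: block outbound traffic for the Goose executable (run as Administrator).
# The program path is an assumed default install location -- verify it on the host.
New-NetFirewallRule -DisplayName "Block Goose Outbound" `
    -Direction Outbound `
    -Program "$env:LOCALAPPDATA\Programs\Goose\Goose.exe" `
    -Action Block
```

Remove the rule with `Remove-NetFirewallRule -DisplayName "Block Goose Outbound"` if the lockdown is later lifted.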
11. Maintenance¶
- Update Goose manually when approved.
- Update model versions offline as needed.
- Re-test performance and responsiveness after updates.
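Where Goose was installed via winget (Section 6.1, Option A), the manual-update step can be sketched as a check-then-apply pair, keeping the update decision with the approver:

```shell
# Hedged sketch: manual, approved-only update flow via winget.
winget upgrade --id block.goose    # shows whether a newer version is available; apply nothing yet
# Once approved:
# winget upgrade --id block.goose --accept-package-agreements
```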
12. Notes / Warnings¶
- 8B models recommended for 8 GB VRAM machines; 14B models may be too heavy.
- AMD support not guaranteed; performance varies widely.
- Not intended for multi-user, automated, or compliance-heavy workflows.
13. Revision Control¶
- Version: 1.01.26
- Editor: Elijah B
- Next Review: Within 90 Days