SOP: Goose Standalone Local Model (Windows 11, No Docker)

Document Type: Standard Operating Procedure (SOP)
Version: 1.01.26
Status: Approved for Use
Audience: Technician + Client
Confidentiality: Internal / Client Delivery
Platform: Windows 11 Only


1. Purpose

To deploy and operate Goose as the sole application layer running a local LLM on Windows 11 without Docker containers, enabling a user-friendly desktop AI assistant capable of interacting with local files and workflows.


2. Scope

This SOP applies to:

  • Users who want a single-application setup
  • Windows 11 environments
  • Clients who do not require containerization, automation stacks, or terminal workflows

Not included:

  • Docker-based deployments
  • Linux deployments
  • Scheduled automation (requires n8n/container integrations)
  • Air-gapped configurations


3. Responsibilities

Technician Responsibilities

  • Install Goose on the host
  • Download and configure local LLM models
  • Confirm basic file operations if enabled
  • Communicate hardware limitations and privacy constraints

Client Responsibilities

  • Provide hardware and approve use cases
  • Understand performance trade-offs
  • Confirm sensitivity level of data used with Goose

(Optional) IT/Compliance Responsibilities

  • Validate privacy and offline-use policy for sensitive files


4. Requirements

4.1 Minimum Hardware (Windows-Only Use)

  • CPU: 8 cores
  • RAM: 16 GB
  • Disk: 20 GB free
  • GPU: Optional (CPU-only acceptable)

4.2 Recommended Hardware

  • CPU: 12+ cores
  • RAM: 32 GB
  • GPU: 8–24 GB VRAM (NVIDIA preferred)
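A technician can sanity-check the section 4.1 minimums before installing. The sketch below uses only the Python standard library; `meets_minimum` is a hypothetical helper (not part of Goose), thresholds are taken from the table above, and RAM is omitted because the standard library has no portable way to read total memory.

```python
import os
import shutil

# Thresholds from section 4.1 (minimum hardware).
MIN_CPU_CORES = 8
MIN_FREE_DISK_GB = 20

def meets_minimum(cpu_cores: int, free_disk_gb: float) -> bool:
    """Return True if the host meets the section 4.1 minimums."""
    return cpu_cores >= MIN_CPU_CORES and free_disk_gb >= MIN_FREE_DISK_GB

if __name__ == "__main__":
    cores = os.cpu_count() or 0
    free_gb = shutil.disk_usage(os.path.expanduser("~")).free / 1024**3
    print(f"CPU cores: {cores}, free disk: {free_gb:.1f} GB")
    print("Minimum met" if meets_minimum(cores, free_gb) else "Below minimum")
```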

4.3 GPU Practical Notes

  • NVIDIA strongly preferred for consumer-grade local inference
  • AMD support may be limited or CPU fallback may activate
  • CPU-only acceptable for small reasoning workloads

4.4 Supported OS

  • Windows 11 (only)

5. Model Selection Note

Example reference model for Goose standalone:

Llama-3.1-8B-Instruct-Q4_K_M — chosen for compatibility with 8–12 GB VRAM laptops/desktops and smooth UI operation.

(Heavier models like 14B may run on CPU or larger VRAM GPUs but are not required for this SOP.)
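The sizing guidance above can be checked with quick arithmetic. Assuming Q4_K_M averages roughly 4.5 bits per weight (a common rule of thumb, not an exact figure), quantized weights alone come to about 4.2 GB for an 8B model versus about 7.3 GB for 14B, before any KV cache or runtime overhead, which is why 8B fits comfortably on 8 GB VRAM cards and 14B does not:

```python
def quantized_weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of quantized model weights in GiB."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1024**3

# Q4_K_M averages roughly 4.5 bits per weight (rule of thumb, not exact).
print(f"8B:  ~{quantized_weight_gb(8, 4.5):.1f} GB")   # weights only
print(f"14B: ~{quantized_weight_gb(14, 4.5):.1f} GB")  # before KV cache/overhead
```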


6. Installation Procedure — Windows 11

6.1 Download Goose

Option A — Winget:

winget install block.goose

Option B — Direct Installer (recommended for clients):

Download the .exe installer from https://block.github.io/goose

6.2 First Launch

  • Launch Goose from the Start menu.
  • Complete any onboarding screens.

6.3 Download Local Model

Within Goose:

1. Open the Models panel.
2. Search: Llama-3.1-8B-Instruct-Q4 (or similar 7–8B Q4 model).
3. Download and load the model.
4. Confirm successful initialization.

6.4 Configure Resource Settings

In Goose settings:

  • Enable GPU if NVIDIA hardware is present.
  • Set output/token rate to balanced mode.
  • Reduce context window if memory is constrained.
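Reducing the context window saves memory because the KV cache grows linearly with context length. As a rough sketch, using Llama-3.1-8B's published architecture (32 layers, 8 KV heads via grouped-query attention, head dimension 128) at fp16, each token costs about 128 KB of KV cache, so an 8K context costs roughly 1 GB; halving the context halves that. Actual runtimes may quantize the KV cache and use less:

```python
def kv_cache_gb(context_len: int, layers: int = 32, kv_heads: int = 8,
                head_dim: int = 128, bytes_per_elem: int = 2) -> float:
    """Approximate fp16 KV-cache size in GiB for a given context length.
    Defaults follow Llama-3.1-8B's published architecture (GQA: 8 KV heads)."""
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_elem  # K and V
    return context_len * per_token / 1024**3

print(f"8K context: ~{kv_cache_gb(8192):.2f} GB")
print(f"4K context: ~{kv_cache_gb(4096):.2f} GB")
```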


7. Usage

7.1 Chat Interface

Goose provides:

  • Chat responses
  • Task decomposition
  • Context handling

7.2 Optional File Tools

Technician may enable:

  • File read
  • File write
  • Directory listing
  • Basic workflow assistance

Client must explicitly approve file access within Goose.


8. Who Chooses This Setup (Decision Context)

Ideal when the client wants:

  • Lowest complexity possible
  • Local ChatGPT-style usage
  • Minimal infrastructure and no Docker
  • No need for automation or scheduling

Not ideal when:

  • Scheduled tasks or complex workflows are required (→ SOP #4)
  • Compliance isolation or strong security boundaries are required (→ SOP #1/#2)
  • Linux environments are the primary target


9. Validation / Verification

Technician verifies:

  • Model loads and responds coherently.
  • File access tools function as expected (if enabled).

Client verifies:

  • Response quality meets expectations.
  • Performance is acceptable for intended tasks.


10. Optional Lockdown (Higher Privacy)

For environments handling sensitive files (but not classified or regulated data):

  • Restrict outbound traffic via firewall rules.
  • Disable automatic updates where permissible.
  • Store models in non-synced folders (avoid OneDrive/Dropbox).
  • Require user approval before file operations.
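The "non-synced folders" check above can be scripted during validation. This is a minimal sketch, assuming a hand-maintained list of common sync-client folder names (extend it per environment); `is_in_synced_folder` is a hypothetical helper, not a Goose feature:

```python
from pathlib import PureWindowsPath

# Common sync-client folder names (assumed list; extend per environment).
SYNCED_FOLDER_NAMES = {"onedrive", "dropbox", "google drive", "icloud drive"}

def is_in_synced_folder(path: str) -> bool:
    """Return True if any path component looks like a sync-client root."""
    return any(part.lower() in SYNCED_FOLDER_NAMES
               for part in PureWindowsPath(path).parts)

print(is_in_synced_folder(r"C:\Users\alice\OneDrive\models"))  # True
print(is_in_synced_folder(r"C:\LocalModels\llama-3.1-8b"))     # False
```

`PureWindowsPath` is used so the check parses Windows paths correctly regardless of where the script runs.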


11. Maintenance

  • Update Goose manually when approved.
  • Update model versions offline as needed.
  • Re-test performance and responsiveness after updates.

12. Notes / Warnings

  • 8B models recommended for 8 GB VRAM machines; 14B models may be too heavy.
  • AMD support not guaranteed; performance varies widely.
  • Not intended for multi-user, automated, or compliance-heavy workflows.

13. Revision Control

  • Version: 1.01.26
  • Editor: Elijah B
  • Next Review: Within 90 Days