SOP: Goose Standalone Local Model (Windows 11, No Docker)

Document Type: Standard Operating Procedure (SOP)
Version: 1.01.26
Status: Approved for Use
Audience: Technician + Client
Confidentiality: Internal / Client Delivery
Platform: Windows 11 Only


Purpose

To deploy and operate Goose as the sole application layer running a local LLM on Windows 11 without Docker containers, enabling a user-friendly desktop AI assistant capable of interacting with local files and workflows.


Scope

This SOP applies to:

  • Users who want a single-application setup
  • Windows 11 environments
  • Clients who do not require containerization, automation stacks, or terminal workflows

Not included:

  • Docker-based deployments
  • Linux deployments
  • Scheduled automation (requires n8n/container integrations)
  • Air-gapped configurations

Technician Responsibilities

  • Install Goose on the host
  • Download and configure local LLM models
  • Confirm basic file operations if enabled
  • Communicate hardware limitations and privacy constraints

Client Responsibilities

  • Provide hardware and approve use cases
  • Understand performance trade-offs
  • Confirm sensitivity level of data used with Goose

(Optional) IT/Compliance Responsibilities

  • Validate privacy and offline use policy for sensitive files

Minimum hardware:

  • CPU: 8 cores
  • RAM: 16 GB
  • Disk: 20 GB free
  • GPU: Optional (CPU-only acceptable)

Recommended hardware:

  • CPU: 12+ cores
  • RAM: 32 GB
  • GPU: 8–24 GB VRAM (NVIDIA preferred)

GPU notes:

  • NVIDIA strongly preferred for consumer-grade local inference
  • AMD support may be limited, or CPU fallback may activate
  • CPU-only is acceptable for small reasoning workloads

Operating system:

  • Windows 11 (only)
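Before installing, the technician can sanity-check cores, RAM, and GPU against the tiers above from PowerShell. This is a sketch using the standard CIM classes built into Windows 11; output fields vary by machine:

```powershell
# CPU model and physical core count
Get-CimInstance Win32_Processor | Select-Object Name, NumberOfCores

# Installed RAM in GB
Get-CimInstance Win32_ComputerSystem |
  ForEach-Object { "{0:N1} GB RAM" -f ($_.TotalPhysicalMemory / 1GB) }

# GPU name (look for an NVIDIA adapter)
Get-CimInstance Win32_VideoController | Select-Object Name
```

If the reported values fall below the minimum tier, flag this to the client before proceeding.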

Example reference model for Goose standalone:

Llama-3.1-8B-Instruct-Q4_K_M — chosen for compatibility with 8–12 GB VRAM laptops/desktops and smooth UI operation.

(Heavier models like 14B may run on CPU or larger VRAM GPUs but are not required for this SOP.)


Option A — Winget:

```
winget install block.goose
```

Option B — Direct Installer (recommended for clients): download the .exe from https://block.github.io/goose

  • Launch Goose from the Start menu.
  • Complete any onboarding screens.
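If Option A was used, winget can confirm the install registered correctly (assuming the `block.goose` package id shown above):

```powershell
# Verify Goose appears in winget's installed-package list
winget list --id block.goose
```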

Within Goose:

  1. Open Models panel.
  2. Search: Llama-3.1-8B-Instruct-Q4 (or similar 7–8B Q4 model).
  3. Download and load the model.
  4. Confirm successful initialization.

In Goose settings:

  • Enable GPU if NVIDIA hardware is present.
  • Set output/token rate to balanced mode.
  • Reduce context window if memory constrained.
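If GPU acceleration was enabled on NVIDIA hardware, `nvidia-smi` (installed with the NVIDIA driver) can confirm the model is actually offloaded; VRAM usage should rise noticeably after the model loads:

```powershell
# Check GPU memory and utilization while Goose has the model loaded
nvidia-smi
```

If VRAM usage does not change after loading, the runtime has likely fallen back to CPU inference.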

Goose provides:

  • Chat responses
  • Task decomposition
  • Context handling

Technician may enable:

  • File read
  • File write
  • Directory listing
  • Basic workflow assistance

Client must explicitly approve file access within Goose.


8. Who Chooses This Setup (Decision Context)


Ideal when client wants:

  • Lowest complexity possible.
  • Local ChatGPT-style usage.
  • Minimal infrastructure and no Docker.
  • No need for automation or scheduling.

Not ideal when:

  • Scheduled tasks or complex workflows are required (→ SOP #4).
  • Compliance isolation or strong security boundaries are required (→ SOP #1/#2).
  • Linux environments are primary targets.

Technician verifies:

  • Model loads and responds coherently.
  • File access tools function as expected (if enabled).

Client verifies:

  • Response quality meets expectations.
  • Performance acceptable for intended tasks.

For environments handling sensitive files (but not highly classified data):

  • Restrict outbound traffic via firewall rules.
  • Disable automatic updates where permissible.
  • Store models in non-synced folders (avoid OneDrive/Dropbox).
  • Require user approval before file operations.
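One way to restrict outbound traffic is a per-program Windows Firewall rule. This is a sketch only: the executable path below is an assumption, so confirm where the installer actually placed Goose before creating the rule:

```powershell
# Block all outbound traffic from the Goose executable
# NOTE: the -Program path is hypothetical; verify the real install location first
New-NetFirewallRule -DisplayName "Block Goose Outbound" `
  -Direction Outbound `
  -Program "$env:LOCALAPPDATA\Programs\Goose\Goose.exe" `
  -Action Block
```

Test that local chat still works after applying the rule, since some features may expect network access.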

Maintenance:

  • Update Goose manually when approved.
  • Update model versions offline as needed.
  • Re-test performance and responsiveness after updates.
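When an update is approved and the winget route was used for install, the manual upgrade can reuse the same package id (assuming `block.goose`):

```powershell
# Manually upgrade Goose after client/IT approval
winget upgrade block.goose
```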

Known limitations:

  • 8B models recommended for 8 GB VRAM machines; 14B models may be too heavy.
  • AMD support not guaranteed; performance varies widely.
  • Not intended for multi-user, automated, or compliance-heavy workflows.

Document control:

  • Version: 1.01.26
  • Editor: Elijah B
  • Next Review: Within 90 Days