SOP: Goose on Host + n8n Automation + LLM+Agent Container (Llama-3.1-14B)

Document Type: Standard Operating Procedure (SOP)
Version: 1.01.26
Status: Approved for Use
Audience: Technician + Client
Confidentiality: Internal / Client Delivery
Platforms Supported: Windows 11 + Linux


1. Purpose

To deploy a local AI automation stack with:

  • Goose as the desktop UI on the host.
  • n8n in one container for scheduling and workflows.
  • A combined LLM + Agent (OpenInterpreter) in a second container for reasoning and file operations, using Llama-3.1-14B (Q4).


2. Scope

This SOP applies when:

  • Clients want automated, scheduled, or event-driven AI tasks (e.g., “every 4 hours organize notes”).
  • A local, mostly offline solution is preferred.
  • A desktop UI (Goose) is desired for direct interaction.

Not included:

  • Building custom agent images (a pre-built LLM+Agent image is assumed to exist).
  • Air-gapped installations (see Optional Lockdown).
  • Multi-tenant or multi-user remote deployments.


3. Responsibilities

Technician Responsibilities

  • Install Docker and (optionally) Portainer.
  • Deploy the n8n and LLM+Agent containers with Docker Compose.
  • Connect Goose to the LLM+Agent API.
  • Configure basic n8n workflows that call the AI endpoint.

Client Responsibilities

  • Provide the required hardware and OS.
  • Approve automation use cases and schedule/frequency.
  • Understand performance and privacy limitations based on hardware.

(Optional) IT/Compliance Responsibilities

  • Approve local AI automation policies.
  • Validate that the data processed is allowed in this environment.


4. Requirements

4.1 Minimum Hardware

  • CPU: 8 cores
  • RAM: 16 GB (n8n + LLM+Agent may be memory-constrained at this level)
  • Disk: 40 GB free
  • GPU: Optional (CPU-only inference is possible but slower)

4.2 Recommended Hardware

  • CPU: 12+ cores
  • RAM: 32–64 GB
  • GPU: NVIDIA RTX 3090 or better
  • Storage: NVMe/SSD

4.3 GPU Practical Notes (NVIDIA vs AMD)

  • NVIDIA strongly preferred; CUDA stack works well with Llama-3.1-14B (Q4) via llama.cpp/llama-cpp-python.
  • AMD may not work at all for this use-case; ROCm/HIP/Vulkan support is inconsistent and may result in CPU fallback or failure.
  • Plan for CPU inference if AMD hardware is present, or explicitly specify NVIDIA in requirements (see the quick check below).
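
Before committing to GPU inference, a quick visibility check can confirm that containers can actually see the GPU (a minimal sketch; assumes the NVIDIA driver and NVIDIA Container Toolkit are already installed):

# Host-level check: the driver sees the GPU
nvidia-smi

# Container-level check: the toolkit mounts nvidia-smi into the container
docker run --rm --gpus all ubuntu nvidia-smi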

4.4 Supported OS

Component             Windows 11          Linux
---------             ----------          -----
Docker                Docker Desktop      Docker Engine
Compose               docker compose      docker compose / Portainer
n8n Container         Supported           Supported
LLM+Agent Container   Supported           Supported
Goose UI              Supported           Supported

5. Model Selection Note

Example model in use:

Llama-3.1-14B-Instruct-Q4_K_M, chosen for strong general reasoning relative to its size while still fitting on common consumer hardware in quantized form.

Referred to hereafter as Llama-3.1-14B (Q4).

The LLM+Agent container image is assumed to:

  • Contain Llama-3.1-14B (Q4) served via an OpenAI-compatible HTTP endpoint (e.g., http://0.0.0.0:8000/v1).
  • Run OpenInterpreter configured to use that endpoint.
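
A minimal way to exercise this assumed contract from the host once the container is running (a sketch only; the model string is a placeholder and depends on how the image registers the model):

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer local-key" \
  -d '{
        "model": "llama-3.1-14b-q4",
        "messages": [{"role": "user", "content": "Reply with OK if you are running."}]
      }'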


6. Directory Structure (Standard)

Recommended layout:

Docker/
  Portainer_Management/        # (Optional) Portainer stack
  LLM_Agent_Stack/             # LLM+Agent container compose
  Automations_n8n/             # n8n automation/workflows compose
Models/                        # Offline GGUF models (mounted into LLM+Agent)

  • The Models/ directory stores .gguf files offline on the host.
  • Compose mounts the host Models/ directory into the LLM+Agent container.

7. Procedure — Windows 11

7.1 Install Docker Desktop

  • Download from: https://www.docker.com/products/docker-desktop/
  • Install and ensure WSL2 backend is enabled.
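
A quick post-install check from PowerShell (versions and output vary by machine):

wsl --status
docker version
docker compose version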

7.2 Prepare Host Model Storage

mkdir C:\Models

Place the .gguf model file (e.g., llama-3-14b-instruct-q4_k_m.gguf) into C:\Models.

7.3 LLM+Agent Compose (Docker/LLM_Agent_Stack/docker-compose.yml)

Note: This assumes a prebuilt image (e.g., local/llm-agent:latest) that runs Llama-3.1-14B (Q4) + OpenInterpreter bound to port 8000.

services:
  llm_agent:
    image: local/llm-agent:latest            # pre-built LLM+Agent image (see Section 5)
    volumes:
      - C:\Models:/models                    # host model directory mounted into the container
    environment:
      - MODEL_PATH=/models/llama-3-14b-instruct-q4_k_m.gguf
      - OPENAI_API_BASE=http://0.0.0.0:8000/v1
      - OPENAI_API_KEY=local-key
    ports:
      - "8000:8000"                          # OpenAI-compatible API exposed to the host
    restart: unless-stopped

Deploy:

cd Docker\LLM_Agent_Stack
docker compose up -d
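
To confirm the container started and the model loaded (exact log wording depends on the image):

docker compose ps
docker compose logs -f llm_agent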

7.4 n8n Compose (Docker/Automations_n8n/docker-compose.yml)

services:
  n8n:
    image: n8nio/n8n:latest
    ports:
      - "5678:5678"                          # n8n web UI
    volumes:
      - n8n_data:/home/node/.n8n             # persists workflows and credentials
    environment:
      - N8N_BASIC_AUTH_ACTIVE=true           # note: n8n v1+ uses built-in user management; basic-auth vars apply to older versions
      - N8N_BASIC_AUTH_USER=admin
      - N8N_BASIC_AUTH_PASSWORD=change_me    # set a unique password before handover
volumes:
  n8n_data:

Deploy:

cd Docker\Automations_n8n
docker compose up -d

7.5 Install Goose on Host

Option A — Winget:

winget install block.goose

Option B — Direct installer: Download .exe from: https://block.github.io/goose

Configure Goose to use the LLM+Agent endpoint:

  • Endpoint: http://localhost:8000/v1
  • API key: local-key (matching OPENAI_API_KEY in the LLM+Agent compose file)


8. Procedure — Linux

8.1 Install Docker Engine + Compose

sudo apt update
sudo apt install -y docker.io docker-compose-plugin

(If docker-compose-plugin is not available in the distribution's default repositories, install Docker Engine and the Compose plugin from Docker's official apt repository instead.)
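
Optional follow-up steps (the group change takes effect after logging out and back in):

# Allow the current user to run docker without sudo (optional)
sudo usermod -aG docker $USER

# Verify the installation
docker --version
docker compose version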

8.2 (Optional) Portainer Setup (Docker/Portainer_Management/docker-compose.yml)

services:
  portainer:
    image: portainer/portainer-ce
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - portainer_data:/data
    ports:
      - "9443:9443"
volumes:
  portainer_data:

Deploy:

cd Docker/Portainer_Management
docker compose up -d

8.3 Prepare Model Storage

sudo mkdir -p /opt/Models

Download the .gguf model file (e.g., llama-3-14b-instruct-q4_k_m.gguf) into /opt/Models.

8.4 LLM+Agent Compose (Docker/LLM_Agent_Stack/docker-compose.yml)

services:
  llm_agent:
    image: local/llm-agent:latest
    volumes:
      - /opt/Models:/models
    environment:
      - MODEL_PATH=/models/llama-3-14b-instruct-q4_k_m.gguf
      - OPENAI_API_BASE=http://0.0.0.0:8000/v1
      - OPENAI_API_KEY=local-key
    ports:
      - "8000:8000"
    restart: unless-stopped

Deploy:

cd Docker/LLM_Agent_Stack
docker compose up -d

8.5 n8n Compose (Docker/Automations_n8n/docker-compose.yml)

Same YAML as Windows; adjust paths only if needed.

Deploy:

cd Docker/Automations_n8n
docker compose up -d

8.6 Goose on Linux Host

Install Goose per vendor instructions, then configure the same endpoint as on Windows:

  • Endpoint: http://localhost:8000/v1


9. n8n → AI Integration (High Level)

Within n8n (browser at http://localhost:5678):

  1. Log in (recent n8n versions create an owner account on first launch; older versions use the basic-auth credentials configured above).
  2. Create a new workflow.
  3. Add a Cron or Schedule Trigger node (e.g., every 4 hours).
  4. Add an HTTP Request node:
     • Method: POST
     • URL: http://llm_agent:8000/v1/chat/completions when both containers share a Docker network, or http://host.docker.internal:8000/v1/chat/completions when calling through the host.
     • Body: JSON with model + messages (OpenAI-compatible); see the example after this list.
  5. Optionally add nodes to read/write files (e.g., via local SMB shares or webhooks that call the agent).
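
A minimal example body for the HTTP Request node, assuming the OpenAI-compatible endpoint from Section 5 (the model string is a placeholder; use whatever identifier the LLM+Agent image actually exposes):

{
  "model": "llama-3.1-14b-q4",
  "messages": [
    { "role": "system", "content": "You are a local automation assistant." },
    { "role": "user", "content": "Organize the notes provided by the previous node into a short summary." }
  ]
}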

The technician should confirm the exact hostname for the LLM+Agent service: because the two stacks are separate Compose projects, the llm_agent service name resolves from n8n only if both containers are attached to a shared user-defined Docker network; otherwise use host.docker.internal (Docker Desktop) or the host's IP address.


10. Who and Why This Setup Fits

Use this architecture when:

  • The client wants scheduled automation, not just ad-hoc chat.
  • There is a need to structure workflows (e.g., parse files, call AI, save results).
  • UI (Goose) and automation (n8n) must both be available.

Not ideal when:

  • A very strict air-gap/zero-network posture is required (then n8n may be overkill).
  • The client does not need automation at all (LM Studio or SOP #1 may suffice).


11. Validation / Verification

Technician verifies:

  • llm_agent is running:

    docker ps | grep llm_agent

  • n8n is running:

    docker ps | grep n8n

    (On Windows, pipe to findstr instead of grep.)

  • A test prompt sent through n8n (or directly via HTTP) produces an LLM response; see the quick checks below.
  • Goose successfully interacts with the LLM+Agent endpoint.
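
Quick checks from the host (assumes the OpenAI-compatible API described in Section 5; exact output depends on the serving stack inside the image):

# Confirm the LLM+Agent API responds and reports a model
curl http://localhost:8000/v1/models

# Confirm the n8n UI answers on its port
curl -I http://localhost:5678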

Client verifies:

  • Workflows run on schedule (e.g., every X hours).
  • Goose interaction is responsive and local.


12. Optional Lockdown (High Privacy)

For clients with increased privacy needs:

  • Restrict outbound network access for the Docker daemon and n8n via firewall rules.
  • Avoid configuring any external SaaS connectors in n8n.
  • Use a user-defined Docker network for llm_agent and n8n, and block external internet access (see the sketch below).
  • Store .gguf models only in the local Models/ directory (no synced cloud folders).
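
One way to approximate the network restriction with Docker's built-in tooling (a sketch; the container names below assume default Compose naming and should be confirmed with docker ps):

# Create a network with no outbound internet access
docker network create --internal ai_internal

# Attach both containers to it (adjust names to match docker ps output)
docker network connect ai_internal llm_agent_stack-llm_agent-1
docker network connect ai_internal automations_n8n-n8n-1

# Note: each container also remains on its original Compose network unless
# explicitly disconnected with: docker network disconnect <network> <container>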


13. Maintenance

  • Update local/llm-agent:latest image using internal or trusted build processes.
  • Update n8n periodically and re-test workflows.
  • Back up the n8n_data volume (workflows) and Models/ (versioned models) if required; a backup sketch follows.
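
A minimal backup sketch for the n8n volume on a Linux host (the real volume name carries the Compose project prefix, e.g., automations_n8n_n8n_data; confirm with docker volume ls):

# Archive the n8n data volume into the current directory
docker run --rm \
  -v automations_n8n_n8n_data:/data:ro \
  -v "$(pwd)":/backup \
  alpine tar czf /backup/n8n_data_backup.tgz -C /data .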

14. Notes / Warnings

  • AMD GPU acceleration may not function and should not be assumed.
  • CPU fallback for Llama-3.1-14B (Q4) can be significantly slower; communicate expectations clearly.
  • Careless n8n workflow design can accidentally cause large workloads; design with rate and scope in mind.

15. Revision Control

  • Version: 1.01.26
  • Editor: Elijah B
  • Next Review: Within 90 Days