
SOP: Goose on Host + n8n Automation + LLM+Agent Container (Llama-3.1-14B)

Document Type: Standard Operating Procedure (SOP)
Version: 1.01.26
Status: Approved for Use
Audience: Technician + Client
Confidentiality: Internal / Client Delivery
Platforms Supported: Windows 11 + Linux


Purpose

To deploy a local AI automation stack with:

  • Goose as the desktop UI on the host.
  • n8n in one container for scheduling and workflows.
  • A combined LLM + Agent (OpenInterpreter) in a second container for reasoning and file operations, using Llama-3.1-14B (Q4).

This SOP applies when:

  • Clients want automated, scheduled, or event-driven AI tasks (e.g., “every 4 hours organize notes”).
  • A local, mostly offline solution is preferred.
  • A desktop UI (Goose) is desired for direct interaction.

Not included:

  • Building custom agent images (assumes a pre-built LLM+Agent image exists).
  • Air-gapped installations (see Optional Lockdown).
  • Multi-tenant or multi-user remote deployments.

Technician Responsibilities

  • Install Docker and (optionally) Portainer.
  • Deploy n8n and the LLM+Agent containers with Docker Compose.
  • Connect Goose to the LLM+Agent API.
  • Configure basic n8n workflows that call the AI endpoint.

Client Responsibilities

  • Provide required hardware and OS.
  • Approve automation use cases and schedule/frequency.
  • Understand performance and privacy limitations based on hardware.

(Optional) IT/Compliance Responsibilities

  • Approve local AI automation policies.
  • Validate that data processed is allowed in this environment.

Minimum hardware:

  • CPU: 8 cores
  • RAM: 16 GB (n8n + LLM+Agent may be memory constrained)
  • Disk: 40 GB free
  • GPU: Optional (CPU-only inference works but is slower)

Recommended hardware:

  • CPU: 12+ cores
  • RAM: 32–64 GB
  • GPU: NVIDIA RTX 3090 or better
  • NVMe/SSD storage

GPU notes:

  • NVIDIA is strongly preferred; the CUDA stack works well with Llama-3.1-14B (Q4) via llama.cpp/llama-cpp-python.
  • AMD may not work at all for this use case; ROCm/HIP/Vulkan support is inconsistent and may result in CPU fallback or outright failure.
  • Plan for CPU inference if AMD hardware is present, or explicitly specify NVIDIA in the requirements.
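Before committing to GPU inference, it is worth confirming that Docker can actually see an NVIDIA GPU. A minimal check, assuming the NVIDIA Container Toolkit is installed (the CUDA image tag is illustrative):

```shell
# Check the host driver first, then GPU passthrough into a container.
# Assumes the NVIDIA Container Toolkit; the CUDA image tag is illustrative.
nvidia-smi || echo "No NVIDIA driver on host; plan for CPU inference"
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi \
  || echo "GPU not reachable from Docker; plan for CPU inference"
```

If either check falls through to the fallback message, quote CPU-only performance expectations to the client.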
Component             Windows 11        Linux
Docker                Docker Desktop    Docker Engine
Compose               docker compose    docker compose / Portainer
n8n Container         Supported         Supported
LLM+Agent Container   Supported         Supported
Goose UI              Supported         Supported

Example model in use:

Llama-3.1-14B-Instruct-Q4_K_M, which provides strong general reasoning while still fitting on common consumer hardware in quantized form.

Referred to hereafter as Llama-3.1-14B (Q4).

The LLM+Agent container image is assumed to:

  • Contain Llama-3.1-14B (Q4) served via an OpenAI-compatible HTTP endpoint bound to 0.0.0.0:8000 inside the container (reachable from the host at http://localhost:8000/v1).
  • Run OpenInterpreter configured to use that endpoint.
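These assumptions can be smoke-tested from the host once the container is running. A sketch (the model name is an assumption and must match whatever the image actually serves):

```shell
# Minimal OpenAI-compatible chat request against the local endpoint.
# The model name is an assumption; adjust it to your llm-agent image.
curl -s http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer local-key" \
  -d '{"model": "llama-3-14b-instruct-q4_k_m",
       "messages": [{"role": "user", "content": "Reply with one short sentence."}]}' \
  || echo "endpoint not reachable yet"
```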

Recommended layout:

Docker/
  Portainer_Management/   # (Optional) Portainer stack
  LLM_Agent_Stack/        # LLM+Agent container compose
  Automations_n8n/        # n8n automation/workflows compose
  Models/                 # Offline GGUF models (mounted into LLM+Agent)

  • The Models/ directory stores .gguf files offline on the host.
  • Compose mounts the host Models/ directory into the LLM+Agent container.

Terminal window
mkdir C:\Models

Place the .gguf model file (e.g., llama-3-14b-instruct-q4_k_m.gguf) into C:\Models.

7.3 LLM+Agent Compose (Docker/LLM_Agent_Stack/docker-compose.yml)

Note: This assumes a prebuilt image (e.g., local/llm-agent:latest) that runs Llama-3.1-14B (Q4) + OpenInterpreter bound to port 8000.

services:
  llm_agent:
    image: local/llm-agent:latest
    volumes:
      - C:\Models:/models
    environment:
      - MODEL_PATH=/models/llama-3-14b-instruct-q4_k_m.gguf
      - OPENAI_API_BASE=http://localhost:8000/v1
      - OPENAI_API_KEY=local-key
    ports:
      - "8000:8000"
    restart: unless-stopped

Deploy:

Terminal window
cd Docker\LLM_Agent_Stack
docker compose up -d
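A quick post-deploy check, run from the same directory, confirms the service came up (both probes degrade gracefully if the stack is still starting):

```shell
# Show compose service status, then confirm the API answers on port 8000.
docker compose ps 2>/dev/null || echo "docker not available in this shell"
curl -s http://localhost:8000/v1/models || echo "model endpoint not reachable yet"
```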

7.4 n8n Compose (Docker/Automations_n8n/docker-compose.yml)

services:
  n8n:
    image: n8nio/n8n:latest
    ports:
      - "5678:5678"
    volumes:
      - n8n_data:/home/node/.n8n
    environment:
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=admin
      - N8N_BASIC_AUTH_PASSWORD=change_me

volumes:
  n8n_data:

Deploy:

Terminal window
cd Docker\Automations_n8n
docker compose up -d
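Recent n8n versions expose a health endpoint, which gives a quicker check than opening the browser:

```shell
# Probe n8n's health endpoint on the published port.
curl -s http://localhost:5678/healthz || echo "n8n not reachable yet"
```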

Option A — Winget:

Terminal window
winget install block.goose

Option B — Direct installer: Download .exe from: https://block.github.io/goose

Configure Goose to use the LLM+Agent endpoint:

  • Endpoint: http://localhost:8000/v1

Terminal window
sudo apt update
sudo apt install -y docker.io docker-compose-plugin

If docker-compose-plugin is not in your distribution's repositories, install the Compose plugin from Docker's official apt repository instead.
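Confirming both the engine and the Compose plugin before deploying stacks avoids confusing failures later:

```shell
# Verify the engine and Compose plugin are installed and on PATH.
docker --version || echo "docker engine missing"
docker compose version || echo "compose plugin missing"
```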

8.2 (Optional) Portainer Setup (Docker/Portainer_Management/docker-compose.yml)

services:
  portainer:
    image: portainer/portainer-ce
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - portainer_data:/data
    ports:
      - "9443:9443"

volumes:
  portainer_data:

Deploy:

Terminal window
cd Docker/Portainer_Management
docker compose up -d

8.3 Model Directory (Linux)

Terminal window
sudo mkdir -p /opt/Models

Download the .gguf model file into /opt/Models.

8.4 LLM+Agent Compose (Docker/LLM_Agent_Stack/docker-compose.yml)

services:
  llm_agent:
    image: local/llm-agent:latest
    volumes:
      - /opt/Models:/models
    environment:
      - MODEL_PATH=/models/llama-3-14b-instruct-q4_k_m.gguf
      - OPENAI_API_BASE=http://localhost:8000/v1
      - OPENAI_API_KEY=local-key
    ports:
      - "8000:8000"
    restart: unless-stopped

Deploy:

Terminal window
cd Docker/LLM_Agent_Stack
docker compose up -d

8.5 n8n Compose (Docker/Automations_n8n/docker-compose.yml)

Same YAML as Windows; adjust paths only if needed.

Deploy:

Terminal window
cd Docker/Automations_n8n
docker compose up -d

Install Goose per vendor instructions; configure endpoint:

  • http://localhost:8000/v1

Within n8n (browser at http://localhost:5678):

  1. Log in with basic auth.
  2. Create a new workflow.
  3. Add a Cron or Schedule node (e.g., every 4 hours).
  4. Add an HTTP Request node:
    • Method: POST
    • URL: http://llm_agent:8000/v1/chat/completions when n8n and llm_agent share a user-defined Docker network, or http://host.docker.internal:8000/v1/chat/completions when reaching the host-published port (Docker Desktop)
    • Body: JSON with model + messages (OpenAI-compatible).
  5. Optionally add nodes to read/write files (e.g., via local SMB shares or webhooks that call the agent).

Because n8n and the LLM+Agent run from separate compose files, they join different default networks; the technician should either attach both services to a shared user-defined network or target the host-published port, and confirm the exact host name for the chosen setup.
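The body for the HTTP Request node follows the standard OpenAI chat-completions shape. A sketch, piped through python3 -m json.tool here only to validate the syntax (the model name and the notes path are illustrative assumptions):

```shell
# Example OpenAI-compatible request body for the n8n HTTP Request node.
# Model name and the notes path are illustrative assumptions.
cat <<'EOF' | python3 -m json.tool
{
  "model": "llama-3-14b-instruct-q4_k_m",
  "messages": [
    {"role": "system", "content": "You are a local file-organizing assistant."},
    {"role": "user", "content": "Organize the notes in /data/notes by topic and summarize the result."}
  ]
}
EOF
```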


Use this architecture when:

  • The client wants scheduled automation, not just ad-hoc chat.
  • There is a need to structure workflows (e.g., parse files, call AI, save results).
  • UI (Goose) + automation (n8n) must both be available.

Not ideal when:

  • Very strict air-gap/zero-network is required (then n8n may be overkill).
  • The client does not need automation at all (LM Studio or SOP #1 may suffice).

Technician verifies:

  • llm_agent is running:
    Terminal window
    docker ps --filter name=llm_agent
  • n8n is running:
    Terminal window
    docker ps --filter name=n8n
  • A test prompt sent through n8n (or directly via HTTP) produces an LLM response.
  • Goose successfully interacts with the LLM+Agent endpoint.

Client verifies:

  • Workflows run on schedule (e.g., every X hours).
  • Goose interaction is responsive and local.

Optional Lockdown

For clients with increased privacy needs:

  • Restrict outbound network for Docker daemon and n8n via firewall rules.
  • Avoid configuring any external SaaS connectors in n8n.
  • Use Docker user-defined network for llm_agent and n8n, but block external internet access.
  • Store .gguf models only in local Models/ directory (no synced cloud folders).
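The user-defined network mentioned above can be created as an internal network, which Docker keeps off the outside world (the network name here is an assumption):

```shell
# Create a Docker network with no external routing; containers attached to
# it can talk to each other but cannot reach the internet.
docker network create --internal ai_internal 2>/dev/null \
  || echo "network already exists or docker not available"
```

Each compose file would then declare ai_internal as an external network and attach its service to it.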

Maintenance

  • Update the local/llm-agent:latest image using internal or trusted build processes.
  • Update n8n periodically and re-test workflows.
  • Backup n8n_data volume (for workflows) and Models/ (for versioned models) if required.
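The n8n_data volume can be backed up with a throwaway container; the tarball name is arbitrary:

```shell
# Archive the n8n_data volume into the current directory using a temporary
# Alpine container; prints a status line either way.
docker run --rm -v n8n_data:/data -v "$(pwd)":/backup alpine \
  tar czf /backup/n8n_data_backup.tgz -C /data . \
  && echo "backup written to n8n_data_backup.tgz" \
  || echo "backup skipped: docker not available or volume missing"
```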

Risks and Limitations

  • AMD GPU acceleration may not function and should not be assumed.
  • CPU fallback for Llama-3.1-14B (Q4) can be significantly slower; communicate expectations clearly.
  • Careless n8n workflow design can accidentally cause large workloads; design with rate and scope in mind.

  • Version: 1.01.26
  • Editor: Elijah B
  • Next Review: Within 90 Days