GaugeWright Workbench documentation

GaugeWright lets an expert's method (an AI agent — its instructions, skills, and tools) run against a client's private context (their data) inside an enforced boundary, so that neither the method nor the data leaks to the other party or to the runtime.

These docs cover everyone who works with GaugeWright: experts who build and deploy agents, clients who receive them, IT admins who govern a deployment, and end-users who use an embedded agent.

Status legend

Every capability claim on every page carries one of four badges. They mean exactly this — and the single source of truth is the status table. When a page and the table disagree, the table wins.

  • Available: shipped in the product you can download today
  • Built: implemented and tested in code, not yet operationally deployed
  • Planned: committed and designed, not yet built
  • Not implemented: absent today
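
As an illustration of the precedence rule, the four badges and the "table wins" behaviour could be modelled roughly as below. This is a minimal sketch, not GaugeWright code: the `Status` enum, the `effective_status` helper, and the capability name are hypothetical, and falling back to the page badge when the table has no entry is an assumption made only for this example.

```python
from enum import Enum


class Status(Enum):
    AVAILABLE = "Available"              # shipped in the downloadable product
    BUILT = "Built"                      # implemented and tested, not operationally deployed
    PLANNED = "Planned"                  # committed and designed, not yet built
    NOT_IMPLEMENTED = "Not implemented"  # absent today


def effective_status(capability, page_badge, status_table):
    """Return the status to trust for a capability.

    The status table wins whenever it has an entry. Falling back to the
    page badge for a missing entry is an assumption of this sketch.
    """
    return status_table.get(capability, page_badge)


# Hypothetical table contents, for illustration only.
table = {"example-capability": Status.BUILT}

# The page says Available, the table says Built: the table wins.
print(effective_status("example-capability", Status.AVAILABLE, table).value)
```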

The one truth that surprises people: local orchestration is not local inference

GaugeWright orchestrates on your machine, but the agent's reasoning is performed by the third-party LLM provider you configure (e.g. OpenAI, Anthropic, Azure OpenAI). Your prompts and the in-scope context are sent to that provider over the network. The provider is inside the trust boundary today: it sees plaintext. That arrangement is what is Available; removing the provider from the trust boundary (confidential inference) is Planned. Read where your data goes before you put anything sensitive through a run.

Is this for me?

  • You want to build an AI agent, run it against your own files, and review every change before keeping it — all on your own desktop. Available
  • You're a consultant who wants to collaborate with a counterpart across two machines, where neither side's relay can read the payload. Built (code-complete and CI-tested, not yet operationally deployed)
  • You need an append-only audit trail of what every run read and produced. Available
  • You're comfortable sending prompts + in-scope context to an LLM provider you contract with yourself.
  • You need a hosted or embedded agent on a public website — that's Planned.
  • You need enterprise SSO/SCIM/RBAC live for a customer org — the code is Built but not operationally deployed.
  • You need attested confidential-VM compute in production — the verifier is Built, live hosting is Planned.
  • Your data may not leave for any third-party model — wait for confidential inference (Planned) or use a provider you've contractually bound.

What works today

The product you can download is the local desktop workbench. Here is the honest cut of what is usable right now versus what exists only in code.

| What | Status | Notes |
| --- | --- | --- |
| Build an agent (archetype) in an edit chat | Available | Authoring on your machine |
| Run it against your context, isolated per run | Available | Reasoning goes to your LLM provider |
| Review each run's diff and keep or discard it | Available | Reviewing diffs locally on your machine. Releasing outputs to stakeholders is Built, not yet live |
| Multi-stakeholder output release lifecycle (release crossing parties) | Built | The cross-party release gate (SOUND_RELEASE) is implemented and tested; local diff review above is the live part |
| Federate across two machines (consultant ↔ client) | Built | Code-complete and CI-tested (loopback + NAT-isolated harness), not yet operationally deployed |
| Append-only audit log + SIEM export | Available | Tamper-evident at the app log |
| Kernel-enforced method isolation | Available | Linux/macOS only; Windows Planned |
| Cross-party packaging & deployment | Built (live: Planned) | Implemented and tested, not live |
| Attested confidential-VM compute verifier | Built (live: Planned) | Verifier in code; no live hosting |
| Enterprise identity (OIDC / SAML / SCIM / RBAC) | Built (live: Planned) | Not operationally deployed |
| Hosted / public / embedded agents | Planned | Not buildable locally |
| Confidential inference (provider out of trust boundary) | Planned | Today the provider sees plaintext |

Rule of thumb: anything single-party, on your own machine is Available today; anything cross-party, hosted, or attested in production is Built or Planned. For the full table — including known gaps (no SBOM / dependency scanning, no production monitoring, unsigned builds, no third-party audit) — see Roadmap & status.

Where an agent can run today

The same protection model applies in every deployment mode; governance is added, not re-architected. Only one mode is usable end-to-end today. The Deployment modes page covers each in full.

| Mode | What it is | Status |
| --- | --- | --- |
| Local desktop | Orchestration + storage on your machine; inference calls your configured LLM provider | Available |
| Multi-authority federation | Expert ↔ client collaborate across machines; relay routes opaque bytes only | Built: code-complete, CI-tested (loopback + NAT-isolated), not operationally deployed |
| Hosted multi-tenant | Cloud-hosted relay + compute for consultants' deployments | Planned (needs infra) |
| Attested compute | Confidential VM; both parties verify the measurement | Verifier Built; live Planned |
| Public hosting / embed | Browser-embeddable agent for end-users | Core Built; live Planned |

Two kinds of guarantee — keep them separate

  • Structural guarantees are built into how the system works and are machine-checked against formal invariants with adversarial tests (a reference is not access; a run has no ambient authority; history is append-only; a work chat cannot rewrite its own method). These hold by construction. See How GaugeWright protects your work.
  • Policy / operational guarantees depend on configuration and deployment (which LLM provider you trust, whether enterprise identity is wired up, code signing). These are only as strong as how you run it.
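
To make "history is append-only" concrete, one common way to make an append-only log tamper-evident is a hash chain, in which each entry commits to the digest of the entry before it. The sketch below illustrates that general technique only; it is not GaugeWright's implementation, and the `AppendOnlyLog` class and the event fields are hypothetical.

```python
import hashlib
import json


class AppendOnlyLog:
    """Toy append-only log; each entry commits to the previous entry's digest."""

    GENESIS = "0" * 64  # placeholder digest used before the first entry

    def __init__(self):
        self._entries = []  # list of (payload_json, digest) tuples

    def append(self, event):
        """Serialize the event, chain it to the previous digest, and store it."""
        prev = self._entries[-1][1] if self._entries else self.GENESIS
        payload = json.dumps(event, sort_keys=True)
        digest = hashlib.sha256((prev + payload).encode("utf-8")).hexdigest()
        self._entries.append((payload, digest))
        return digest

    def verify(self):
        """Recompute the chain; False means an entry was altered or reordered."""
        prev = self.GENESIS
        for payload, digest in self._entries:
            expected = hashlib.sha256((prev + payload).encode("utf-8")).hexdigest()
            if expected != digest:
                return False
            prev = digest
        return True


log = AppendOnlyLog()
log.append({"run": 1, "action": "read", "path": "notes.txt"})
log.append({"run": 1, "action": "write", "path": "summary.txt"})
print(log.verify())
```

A chain like this detects edits to earlier entries but not removal of the most recent ones, so a real system would also need to anchor the latest digest somewhere the writer cannot rewrite.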

One structural guarantee is per-OS. Kernel-enforced method isolation runs on Linux and macOS today; the Windows method-isolation sandbox is Planned. Egress gating (deny-by-default network, fail-closed admission at the boundary) is Available, so Windows users are not left unprotected: they get the boundary's egress gates and fail-closed admission. What they lack is the kernel sandbox that stops a work chat from rewriting its own method.
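
The two egress properties named above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the product's gate: `admit_egress`, the allowlist, and the host names are hypothetical. "Deny by default" means anything not explicitly allowed is refused; "fail closed" means an error while evaluating the policy also results in refusal.

```python
def admit_egress(host, allowlist):
    """Return True only if host is explicitly on the allowlist.

    Deny by default: hosts that are not listed are refused.
    Fail closed: any error while evaluating the policy also refuses.
    """
    try:
        return host.strip().lower() in allowlist
    except Exception:
        # Malformed input or a broken policy object must never open the gate.
        return False


# Hypothetical allowlist containing a single provider host.
ALLOWED = frozenset({"api.example-provider.test"})

print(admit_egress("api.example-provider.test", ALLOWED))  # listed host
print(admit_egress("unknown.example.test", ALLOWED))       # unlisted host
print(admit_egress(None, ALLOWED))                         # malformed input
```

The design point is that the permissive outcome is reachable through exactly one path, the explicit allowlist match; every other path, including exceptions, ends in refusal.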

Start here

  1. Getting started — download, install, configure a provider, and complete your first reviewed run. Available
  2. Concepts — the mental model (project, archetype, placement, chat) and the full glossary. Collaboration runs along a workstream: a shared line of work that a set of chats auto-sync into. An output's taint is engagement-scoped (read: chat-scoped).
  3. How GaugeWright protects your work — the boundary and where your data goes, in plain language.

Find your role

| You are… | Start here | What you'll do |
| --- | --- | --- |
| An expert / consultant | For experts | Build an agent, run & review, package & deploy |
| A client receiving an agent | For clients | Provide context, review & release outputs |
| An admin / IT governing a deployment | For admins | SSO, provisioning, audit, policy (Built; live Planned) |
| An embedded end-user | For embedded users | Use an agent embedded in a site (Planned) |

Trust & reference

For deep readers, the user-facing terms above link to the glossary; the formal invariants (INV-n) and decision records (ADRs) live in the product specification and are referenced from the protection and security pages — optional, never required reading.


This documentation is the single source of truth for using GaugeWright. It lives in the product repository alongside the code it describes, so guidance and behavior change together. Capability status defers to reference/status.md.