
Audit logging

The TypeScript runtime emits structured audit events for the sensitive operations listed below. Events are pino log lines at info level with a stable discriminator field audit: true, so deployments can route them to a dedicated sink (file, syslog, SIEM) by filter.

```jsonc
{
  "level": 30,
  "time": 1735603200000,
  "audit": true,
  "action": "api_key.create",
  "outcome": "success",
  "requestId": "01JFE5...",
  "subject": {
    "type": "oidc",            // "apiKey" | "oidc" | "bootstrap" | "anonymous"
    "id": "sub-123",
    "label": "alice@example.com"
  },
  "workspaceId": "ws-1",
  "details": { "keyId": "...", "label": "ci-deployer" },
  "msg": "audit api_key.create success"
}
```
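As a rough illustration of routing by the discriminator, the sketch below picks audit lines out of newline-delimited JSON log output. The names `AuditLine`, `parseAuditLine`, and `selectAuditLines` are hypothetical and are not part of the runtime; the only thing taken from the text above is the `audit: true` field.

```typescript
// Hypothetical helper types and functions; only the `audit: true`
// discriminator comes from the documentation above.
type AuditLine = { audit: true; [field: string]: unknown };

// Parse one log line. Returns the parsed object only when it is a JSON
// object whose `audit` field is the boolean true; anything else yields null.
function parseAuditLine(line: string): AuditLine | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    return null; // not JSON (blank line, stack trace, banner text, ...)
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const record = parsed as Record<string, unknown>;
  return record.audit === true ? (record as AuditLine) : null;
}

// Split a chunk of log output into lines and keep only the audit lines.
function selectAuditLines(ndjson: string): AuditLine[] {
  return ndjson
    .split("\n")
    .map((line) => parseAuditLine(line.trim()))
    .filter((line): line is AuditLine => line !== null);
}
```

The check is strict on purpose: a line with `"audit": "true"` (a string) is not treated as an audit event, so a stray application log cannot slip into the audit sink by accident.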

What gets logged

| Action | Trigger | Notes |
| --- | --- | --- |
| api_key.create | POST /api/v1/workspaces/{w}/api-keys | Plaintext is never logged. Only keyId + caller-supplied label. |
| api_key.revoke | DELETE /api/v1/workspaces/{w}/api-keys/{keyId} | Soft revoke; emitted on the first revoke only. |
| workspace.create | POST /api/v1/workspaces | Includes the workspace label (the human-friendly name). |
| workspace.delete | DELETE /api/v1/workspaces/{w} | Emitted after the cascade completes. |
| kb.create | POST /api/v1/workspaces/{w}/knowledge-bases | Provisions an Astra collection (destructive on rollback). Includes knowledgeBaseId + label. |
| kb.delete | DELETE /api/v1/workspaces/{w}/knowledge-bases/{kb} | Cascades the underlying collection drop and all rag-document rows. Emitted after the cascade. |
| document.delete | DELETE /api/v1/workspaces/{w}/knowledge-bases/{kb}/documents/{d} | Cascades chunk wipe before the row drop. Includes knowledgeBaseId + documentId. |
| agent.create | POST /api/v1/workspaces/{w}/agents or POST /api/v1/workspaces/{w}/agents/from-template | Includes agentId + label. The from-template variant additionally includes templateId (the catalog slug). |
| agent.delete | DELETE /api/v1/workspaces/{w}/agents/{a} | Cascades conversations + chat messages owned by the agent. Emitted after the cascade. |
| job.claim | Cross-replica orphan reclaim in jobs/sweeper.ts | Emitted when a replica successfully CAS-claims an orphaned job. Includes jobId + jobKind. Subject is the replica id (synthetic), not a user. |
| mcp.invoke | Any tool call into /api/v1/workspaces/{w}/mcp | Includes the toolName. Argument payloads are not logged. |
| tool.invoke | Any agent tool call in the chat tool-call loop (POST /api/v1/workspaces/{w}/agents/{a}/conversations/{c}/messages[/stream]) | One row per tool call: built-in, native, Astra, or remote-MCP. Includes the toolName, the source (builtin/native/astra/mcp), and the mcpServerId for mcp-source calls. outcome: "denied" (with reason) when the tool is not on the agent's allow-list or an mcp:-source call's caller lacks the tools:invoke scope (reason: "missing tools:invoke scope"); outcome: "failure" (with reason) when the tool errors or times out; outcome: "success" otherwise. Argument payloads are not logged (secrets could be in args). |
| auth.api_denied | 401/403 auth decisions on /api/v1/* | outcome: "denied" with reason; unauthenticated 401s have subject: null, while scoped 403s include the resolved subject when available. A 403 from a scope gate (assertScope/requireScope) also carries a structured requiredScope (e.g. write:ingest) so denials aggregate by scope; workspace-membership 403s and 401s omit it. |
| auth.bootstrap_use | Any request authenticated with the bootstrap operator token | Includes scheme: "bootstrap". The plaintext bootstrap token is never logged. |
| auth.csrf_rejected | A state-changing request to a cookie-protected route was rejected by the Origin/Referer check | outcome: "failure" with reason one of { "no allowed origin available", "missing Origin and Referer on state-changing request", "origin mismatch (got <claimed>)" }. The HTTP response is 403 forbidden_origin. Bearer-token requests bypass the check and never emit this event. |
| auth.login | OIDC /auth/callback | outcome: "success" once the access token passes the runtime's own verifier; outcome: "failure" with reason on token-validation errors. |
| auth.refresh | OIDC /auth/refresh | outcome: "success" on a clean rotate. outcome: "failure" with reason one of { "no_refresh_token", "idp_rejected", "token_validation_failed" } covers the three failure paths (missing cookie, IdP refused the refresh_token, freshly-issued access token failed self-verification). |
| auth.logout | OIDC /auth/logout | Emitted on every cookie clear, even when no session was present. |
| auth.device.authorize | POST /auth/device/authorize | RFC 8628 device-flow proxy for the aiw CLI. outcome: "success" once the IdP returns a device_code; the audit row includes the short user_code (the human-typed code) but never the device_code itself. outcome: "failure" with reason when the IdP rejects the authorize request. |
| auth.device.token | POST /auth/device/token | RFC 8628 device-flow proxy for the aiw CLI; one row per terminal poll. outcome: "success" when the IdP issues an access token; outcome: "failure" with reason for IdP-rejected polls (expired_token, access_denied, etc.). Pending polls (authorization_pending / slow_down) are NOT audited; only the final outcome lands a row, so a long-lived device grant doesn't flood the audit log. |
| principal.create | POST /api/v1/workspaces/{w}/principals | RLAC prototype. Includes the principalId. |
| principal.update | PATCH /api/v1/workspaces/{w}/principals/{principalId} | RLAC prototype. Includes the principalId. |
| principal.delete | DELETE /api/v1/workspaces/{w}/principals/{principalId} | RLAC prototype. Includes the principalId. |

The set is intentionally small. Adding a new event is a one-line call from a route handler — see src/lib/audit.ts.
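To give a feel for what such a one-line call might look like, here is a minimal sketch. The `audit()` signature, the `AuditInput` shape, and the `InfoLogger` interface are assumptions made for illustration; the actual helper in src/lib/audit.ts may take different arguments. The only borrowed convention is pino's object-first `info(fields, msg)` call and the `audit <action> <outcome>` message seen in the sample envelope.

```typescript
// Hypothetical shapes; the real audit() in src/lib/audit.ts may differ.
type AuditOutcome = "success" | "failure" | "denied";

interface AuditInput {
  action: string;
  outcome: AuditOutcome;
  requestId: string;
  subject: { type: string; id: string | null; label: string | null } | null;
  workspaceId?: string;
  details?: Record<string, string>;
}

// Minimal stand-in for a pino logger: an object-first info() method.
interface InfoLogger {
  info(fields: Record<string, unknown>, msg: string): void;
}

// Emit one audit line. Any error thrown by the logger is swallowed so the
// calling route handler is never affected by a logging failure.
function audit(logger: InfoLogger, input: AuditInput): void {
  try {
    logger.info({ audit: true, ...input }, `audit ${input.action} ${input.outcome}`);
  } catch {
    // Best-effort by design: drop the event rather than fail the request.
  }
}

// What the call site in a hypothetical workspace.create handler could look like.
function onWorkspaceCreated(logger: InfoLogger, requestId: string, workspaceId: string, label: string): void {
  audit(logger, {
    action: "workspace.create",
    outcome: "success",
    requestId,
    subject: { type: "anonymous", id: null, label: null },
    workspaceId,
    details: { label },
  });
}
```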

Sample envelopes

Concrete payloads from a live runtime, lightly redacted. After pino's own level and time fields, the order is audit, action, outcome, requestId, subject, workspaceId, details, msg. pino emits fields in declaration order, which makes the envelope stable enough to grep with cut/jq.

workspace.create — anonymous in dev mode

```jsonc
{
  "level": 30,
  "time": 1735603195123,
  "audit": true,
  "action": "workspace.create",
  "outcome": "success",
  "requestId": "01KQG3MCDGC3VWP07BNQWX7NPB",
  "subject": {
    "type": "anonymous",
    "id": null,
    "label": null
  },
  "workspaceId": "ab907991-dba4-4d9d-81f0-4756ec5ccf43",
  "details": { "label": "support-docs" },
  "msg": "audit workspace.create success"
}
```

subject.type: "anonymous" is normal in development (default auth.mode: disabled). In production, subject.type will be "apiKey" or "oidc" — the auth deployment guard refuses to start with anonymous access on a non-memory control plane.

auth.login — failed JWT validation

```jsonc
{
  "level": 30,
  "time": 1735603612877,
  "audit": true,
  "action": "auth.login",
  "outcome": "failure",
  "requestId": "01KQG3PV9ZH7T82R4KAE8WBN3X",
  "subject": {
    "type": "anonymous",
    "id": null,
    "label": null
  },
  "details": { "scheme": "oidc", "reason": "audience_mismatch" },
  "msg": "audit auth.login failure"
}
```

workspaceId is absent because the request never resolves a workspace before the auth middleware rejects it. details.reason is one of the verifier's terminal error codes (audience_mismatch, signature_invalid, token_expired, issuer_mismatch, malformed); see src/auth/oidc/verifier.ts.
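Because details.reason draws from a small set of codes, failed logins aggregate cleanly. The sketch below tallies auth.login failures per reason. `LOGIN_FAILURE_REASONS` simply copies the codes listed above, while `LoginEvent` and `countLoginFailures` are hypothetical names; codes outside the list are bucketed as "other" rather than assumed impossible.

```typescript
// Codes copied from the paragraph above; treated as a convenience list,
// not as an exhaustive contract with the verifier.
const LOGIN_FAILURE_REASONS = [
  "audience_mismatch",
  "signature_invalid",
  "token_expired",
  "issuer_mismatch",
  "malformed",
] as const;

// Hypothetical minimal view of an audit envelope for this tally.
interface LoginEvent {
  action: string;
  outcome: string;
  details?: { reason?: string };
}

// Count auth.login failures per reason code. Missing or unrecognized
// reasons land in the "other" bucket.
function countLoginFailures(events: LoginEvent[]): Map<string, number> {
  const known: readonly string[] = LOGIN_FAILURE_REASONS;
  const counts = new Map<string, number>();
  for (const event of events) {
    if (event.action !== "auth.login" || event.outcome !== "failure") continue;
    const reason = event.details?.reason;
    const bucket = reason !== undefined && known.includes(reason) ? reason : "other";
    counts.set(bucket, (counts.get(bucket) ?? 0) + 1);
  }
  return counts;
}
```

A sudden rise in audience_mismatch or issuer_mismatch usually points at configuration drift, whereas token_expired tends to be background noise, so splitting by reason keeps alerts meaningful.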

api_key.create — authenticated by an OIDC subject

```jsonc
{
  "level": 30,
  "time": 1735603889104,
  "audit": true,
  "action": "api_key.create",
  "outcome": "success",
  "requestId": "01KQG3QRVF20YGD6MTFB8KKCN5",
  "subject": {
    "type": "oidc",
    "id": "auth0|7c2d4f12",
    "label": "alice@example.com"
  },
  "workspaceId": "ab907991-dba4-4d9d-81f0-4756ec5ccf43",
  "details": { "keyId": "3a4977c8-3e01-4fd0-9b02-2e082950bd40", "label": "ci-deployer" },
  "msg": "audit api_key.create success"
}
```

The plaintext token (wb_live_…) is only in the HTTP response body, never the audit log. details.keyId is the row id; label is the operator-supplied tag.

Seed-failure events (non-route)

Workspace creation tries to seed default agents, LLM services, chunking services, and embedding services. Per-row failures emit audit: true error lines (not routed through audit() because they don't cleanly fit <resource>.<verb>):

```jsonc
{
  "level": 50,
  "time": 1735603195310,
  "audit": true,
  "workspaceId": "ab907991-dba4-4d9d-81f0-4756ec5ccf43",
  "serviceName": "openai-text-embedding-3-small",
  "err": { "type": "ControlPlaneConflictError", "message": "..." },
  "msg": "failed to seed default embedding service"
}
```

When every seed of a kind fails (systemic — DB outage, broken config), an aggregate line follows with expected: <count> so monitoring can alert on "workspace shipped with no embedders" rather than counting individual failures.
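Since seed-failure lines carry audit: true but no action or outcome, a consumer that expects the full envelope needs to tell the two shapes apart. The following is one possible classifier, written under the assumption that seed failures are exactly the audit lines at pino's error level (50) without an action field. `AuditLineKind` and `classifyAuditLine` are hypothetical names.

```typescript
type AuditLineKind = "envelope" | "seed_failure" | "other";

// Classify a parsed log line.
// - "envelope": audit line with both action and outcome (emitted via audit()).
// - "seed_failure": audit line at error level (pino level 50) with no action.
//   This rule is an assumption based on the sample above.
// - "other": anything else, including non-audit lines.
function classifyAuditLine(line: Record<string, unknown>): AuditLineKind {
  if (line.audit !== true) return "other";
  if (typeof line.action === "string" && typeof line.outcome === "string") {
    return "envelope";
  }
  if (line.action === undefined && line.level === 50) {
    return "seed_failure";
  }
  return "other";
}
```

Routing the "seed_failure" kind to an alerting rule and the "envelope" kind to the SIEM keeps downstream parsers from choking on lines that lack action and outcome.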

Design rules

The audit module enforces a few rules so events stay safe to ship to external systems:

  • No secret material. The details field is typed and only accepts a known set of identifier fields (such as keyId, knowledgeBaseId, scheme, reason, and label). Plaintext tokens, refresh tokens, hashes, OAuth codes, and PII are not part of the contract and have no path into the envelope.
  • Stable action names. <resource>.<verb> in snake_case. We never rename in place — adding a new action and keeping the old one for a release is the migration path.
  • Outcome is always set. success | failure | denied so SIEM rules can alert on bursts of denied without parsing status codes.
  • Best-effort. Audit logging must never break the request path. Logger errors are swallowed inside audit().
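The "alert on bursts of denied" idea from the outcome rule can be sketched as a sliding-window check over event timestamps. This is an illustration of the kind of rule a SIEM might run, not something the runtime ships; `TimedOutcome`, `hasDeniedBurst`, and the window and threshold parameters are all hypothetical.

```typescript
// Minimal view of an audit envelope: pino's epoch-millisecond `time`
// plus the `outcome` field.
interface TimedOutcome {
  time: number;
  outcome: string;
}

// Return true when at least `threshold` denied events fall inside some
// window of `windowMs` milliseconds (inclusive at both ends).
function hasDeniedBurst(events: TimedOutcome[], windowMs: number, threshold: number): boolean {
  const times = events
    .filter((event) => event.outcome === "denied")
    .map((event) => event.time)
    .sort((a, b) => a - b);

  let start = 0;
  for (let end = 0; end < times.length; end++) {
    // Shrink the window from the left until it spans at most windowMs.
    while (times[end] - times[start] > windowMs) start++;
    if (end - start + 1 >= threshold) return true;
  }
  return false;
}
```

Because the check reads only outcome and time, it needs no knowledge of HTTP status codes or of which action produced the denial.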

Operating it

  • Single-replica. The default pino transport writes to stdout. Pipe the container's stdout into your log pipeline and filter on audit: true.
  • Multi-replica. Each replica writes its own events; correlate by requestId (already echoed in every audit envelope) and by the Strict-Transport-Security / replicaId markers documented in production.md.
  • Retention. The runtime does not retain audit events itself. Choose a retention period that satisfies your compliance posture and configure it on the sink.
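For the multi-replica case, correlation by requestId amounts to merging every replica's audit lines and grouping them. A minimal sketch, assuming events have already been parsed into objects; `groupByRequestId` is a hypothetical helper, and lines without a requestId are skipped.

```typescript
// Group merged audit events from any number of replicas by requestId.
// Each group is sorted by pino's epoch-millisecond `time` so the sequence
// of events for one request reads in order.
function groupByRequestId<T extends { requestId?: string; time: number }>(
  events: T[],
): Map<string, T[]> {
  const groups = new Map<string, T[]>();
  for (const event of events) {
    if (!event.requestId) continue; // nothing to correlate on
    const bucket = groups.get(event.requestId);
    if (bucket) {
      bucket.push(event);
    } else {
      groups.set(event.requestId, [event]);
    }
  }
  groups.forEach((bucket) => bucket.sort((a, b) => a.time - b.time));
  return groups;
}
```

Clock skew between replicas can reorder events that are only milliseconds apart, so the per-group ordering should be read as approximate.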

What's not yet logged

These are tracked as gaps:

  • Rate-limit denials. They are visible from the limiter's existing log lines but are not audit events yet.
  • Document and chunk mutation. Volume-sensitive; needs a sampling / batching strategy first.

When a new event lands, add it to the What gets logged table and the AuditAction union in src/lib/audit.ts — the audit-doc-drift test will fail otherwise.
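The drift check itself is not shown here, but its core comparison might resemble the sketch below. It assumes the doc source stores the table as Markdown rows whose first cell is the action name (optionally in backticks) and that the code side can supply the list of actions as strings. `actionsInDoc` and `findDrift` are hypothetical names, not the real test's API.

```typescript
// Collect action names from the first cell of Markdown table rows such as
// "| `api_key.create` | POST ... | ... |". Header and separator rows do not
// match because they contain no dotted lowercase identifier.
function actionsInDoc(markdown: string): Set<string> {
  const actions = new Set<string>();
  for (const line of markdown.split("\n")) {
    const match = /^\|\s*`?([a-z_]+(?:\.[a-z_]+)+)`?\s*\|/.exec(line);
    if (match) actions.add(match[1]);
  }
  return actions;
}

// Report actions present on one side but not the other.
function findDrift(
  docActions: Set<string>,
  codeActions: readonly string[],
): { missingFromDoc: string[]; missingFromCode: string[] } {
  const codeSet = new Set(codeActions);
  return {
    missingFromDoc: codeActions.filter((action) => !docActions.has(action)),
    missingFromCode: Array.from(docActions).filter((action) => !codeSet.has(action)),
  };
}
```

A test built on this would fail whenever either list in the result is non-empty, which catches both an undocumented new action and a documented action that no longer exists in code.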

Resolved findings

The following audit gaps have been addressed in the commits listed:

| Finding | Status | Resolution |
| --- | --- | --- |
| Unbounded agent systemPrompt / userPrompt fields | Resolved in #147 (8dbf741) | Capped at 128 KB; name at 200 chars, description at 2 KB. |
| Monolithic body-size cap (50 MB) on all /api/v1/workspaces/* | Resolved in #147 (8dbf741) | Split: 10 MB default, 50 MB only on explicit .../knowledge-bases/*/ingest. |
| Sequential Promise.all on multi-KB agent tools | Resolved in #147 (8dbf741) | Parallelized with Promise.allSettled in chat/tools/registry.ts. |
| SSRF surface on service endpoint URLs (RFC1918, loopback, cloud metadata) | Resolved in #147 (8dbf741) | Layered validation in openapi/schemas.ts + root.ts; runtime.blockPrivateNetworkEndpoints config flag. |

PRs welcome.

Released under the MIT license.