
User-defined agents

Agents are the unit of chat in a workspace. Create one or more per workspace; each one carries its own persona, RAG defaults, optional LLM service, and conversation history. The runtime's send + streaming pipeline runs against any agent — there is no built-in chat surface above this layer.

Historical note. Earlier drafts of the runtime auto-provisioned a singleton "Bobbie" agent and exposed a parallel /chats route as a thin alias. The singleton was retired and replaced with the template catalog (ADR 0003). Today's workspaces are seeded with the catalog's defaultOnNewWorkspace templates (currently Bobby + Maven); the rest of the catalog is opt-in via the UI gallery or POST /agents/from-template.

Concepts

| Term | What it is |
| --- | --- |
| Agent | A row in wb_agentic_agents_by_workspace. Carries name, system / user prompts, RAG defaults (ragEnabled, ragMaxResults, ragMinScore, knowledgeBaseIds), reranker overrides, and an optional llmServiceId pointing at the LLM executor it uses. |
| Conversation | A row in wb_agentic_conversations_by_agent. One conversation belongs to exactly one (workspace, agent) pair. Carries title and a per-conversation knowledgeBaseIds filter that overrides the agent's default at retrieval time. |
| Message | A row in wb_agentic_messages_by_conversation. Same shape across all agents: role ∈ {user, agent, system, tool}; metadata carries RAG provenance / model id / finish reason. |
| Template | A static catalog entry the UI can offer as a one-click agent. Identified by a stable lowercase-kebab templateId slug. Not a record: runtime data shipped with the binary. See Template catalog. |

Fresh workspaces are seeded with the catalog's defaultOnNewWorkspace templates (Bobby + Maven today). When you delete an agent the cascade goes agent → its conversations → their messages. Workspace delete cascades workspace → agents → conversations → messages.

Template catalog

The catalog (agent-templates.ts) is a static list of personas the UI can offer as one-click agent creation. The catalog ships with four entries:

| templateId | Name | Default-on | Use case |
| --- | --- | --- | --- |
| bobby | Bobby | yes | Direct, terse data analyst |
| maven | Maven | yes | Multi-source research synthesis |
| quill | Quill | no | Concise, code-forward technical writer |
| sage | Sage | no | Strict-grounding Q&A; declines confidently |

Two HTTP routes are exposed:

  • GET /api/v1/workspaces/{w}/agent-templates: returns the full catalog. Workspace-scoped for authz, but the body is workspace-independent.
  • POST /api/v1/workspaces/{w}/agents/from-template with body { "templateId": "..." } — instantiates the template as a new agent in the workspace. The new agent's name, description, and systemPrompt are copied from the template; other fields default to the same values as POST /agents.
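As a rough sketch of how a client might prepare the second call, the helper below builds (but does not send) the from-template request. Only the path and the { templateId } body come from the route description above; the helper name, the base URL, and the bearer-token header are assumptions for illustration.

```ts
// Hypothetical helper: prepares the POST .../agents/from-template request.
interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildFromTemplateRequest(
  baseUrl: string, // assumption: wherever the runtime is served from
  workspaceId: string,
  templateId: string, // e.g. "quill" or "sage" from the catalog
  token?: string, // assumption: bearer auth; the real auth scheme may differ
): PreparedRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (token) headers["Authorization"] = `Bearer ${token}`;
  return {
    url: `${baseUrl}/api/v1/workspaces/${encodeURIComponent(workspaceId)}/agents/from-template`,
    init: { method: "POST", headers, body: JSON.stringify({ templateId }) },
  };
}
```

The returned url and init can be passed straight to fetch(url, init); the response is expected to be the newly created agent.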

The seed step inside workspace POST uses the same catalog, filtered to defaultOnNewWorkspace === true. With today's catalog, that seeds Bobby and Maven into the new workspace's agent list.
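The filter can be pictured with a minimal sketch. The AgentTemplate shape below is an assumption built from the fields this page mentions (templateId, name, description, systemPrompt, defaultOnNewWorkspace); the real type in agent-templates.ts may carry more, and the prompts here are placeholders.

```ts
// Assumed shape of a catalog entry; not copied from agent-templates.ts.
interface AgentTemplate {
  templateId: string; // stable lowercase-kebab slug
  name: string;
  description: string;
  systemPrompt: string;
  defaultOnNewWorkspace: boolean;
}

// Illustrative catalog; systemPrompt values are placeholders.
const CATALOG: readonly AgentTemplate[] = [
  { templateId: "bobby", name: "Bobby", description: "Direct, terse data analyst", systemPrompt: "(placeholder)", defaultOnNewWorkspace: true },
  { templateId: "maven", name: "Maven", description: "Multi-source research synthesis", systemPrompt: "(placeholder)", defaultOnNewWorkspace: true },
  { templateId: "quill", name: "Quill", description: "Concise, code-forward technical writer", systemPrompt: "(placeholder)", defaultOnNewWorkspace: false },
  { templateId: "sage", name: "Sage", description: "Strict-grounding Q&A", systemPrompt: "(placeholder)", defaultOnNewWorkspace: false },
];

// Hypothetical seed filter: the templates a fresh workspace starts with.
function seedTemplates(catalog: readonly AgentTemplate[]): AgentTemplate[] {
  return catalog.filter((t) => t.defaultOnNewWorkspace === true);
}
```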

Adding a new template is a one-file change (append to agent-templates.ts and decide if defaultOnNewWorkspace should be true); see ADR 0003 for the design context.

Data model

See runtimes/typescript/src/astra-client/table-definitions.ts for the wire-level types. The store-level shapes are in runtimes/typescript/src/control-plane/types.ts:

```ts
interface AgentRecord {
  workspaceId: string;
  agentId: string;
  name: string;
  description: string | null;
  systemPrompt: string | null;
  userPrompt: string | null;
  toolIds: readonly string[];     // unused in v0; reserved for tool-using agents
  llmServiceId: string | null;    // optional pointer to an LLM executor
  ragEnabled: boolean;
  knowledgeBaseIds: readonly string[];
  ragMaxResults: number | null;
  ragMinScore: number | null;
  rerankEnabled: boolean;
  rerankingServiceId: string | null;
  rerankMaxResults: number | null;
  createdAt: string;
  updatedAt: string;
}

interface ConversationRecord {
  workspaceId: string;
  agentId: string;
  conversationId: string;
  title: string | null;
  knowledgeBaseIds: readonly string[];
  createdAt: string;
}
```

agent.knowledgeBaseIds is the default RAG-grounding set. conversation.knowledgeBaseIds overrides it for the conversation when populated; empty means "fall back to the agent's default, or to all KBs in the workspace if the agent's set is also empty".
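The fallback chain reads naturally as a three-step function. This is a sketch of the rule as described, not the runtime's code; the function name and the idea that the caller passes in the workspace's full KB list are assumptions.

```ts
// Hypothetical resolver for the effective RAG-grounding set of one send.
function resolveKnowledgeBaseIds(
  conversationKbIds: readonly string[],
  agentKbIds: readonly string[],
  workspaceKbIds: readonly string[], // assumption: caller has already listed the workspace's KBs
): readonly string[] {
  // A populated conversation filter overrides the agent default.
  if (conversationKbIds.length > 0) return conversationKbIds;
  // Otherwise fall back to the agent's default set.
  if (agentKbIds.length > 0) return agentKbIds;
  // Both empty: ground against every KB in the workspace.
  return workspaceKbIds;
}
```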

LLM service binding

agent.llmServiceId is mutable and optional. Resolution order at send time:

  1. Per-agent service. If agent.llmServiceId is set, the runtime fetches the matching wb_config_llm_service_by_workspace row and instantiates a chat service from it. Three providers are wired end-to-end today: provider: "openrouter" (hosted default), provider: "openai" (direct/BYOK), and provider: "ollama" (local/offline). All three are OpenAI-compatible and share one adapter, so native function calling — required for the agent tool-call loop — works on every wired provider (subject to the specific model supporting tools). Any other provider returns 422 llm_provider_unsupported until its adapter lands. A bound service without a credentialRef returns 422 llm_credential_missing (Ollama needs no credential).
  2. Workspace fallback. If agent.llmServiceId is unset, the runtime falls back to the global chat: block in workbench.yaml, which defaults to the OpenRouter provider.
  3. Hard stop. If neither is configured, POST .../messages and POST .../messages/stream return 503 chat_disabled. The agent record itself is unaffected; you can still list / patch / delete it without an LLM available.
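The three steps above can be sketched as a pure decision function. The type names, the function name, and the order of the two 422 checks are assumptions; the sketch also does not model the case where llmServiceId points at a row that no longer exists.

```ts
// Assumed minimal view of a wb_config_llm_service_by_workspace row.
interface LlmServiceRow {
  provider: string;
  credentialRef: string | null;
}

type Resolution =
  | { ok: true; source: "agent" | "global"; provider: string }
  | { ok: false; status: 422 | 503; code: string };

const WIRED_PROVIDERS: ReadonlySet<string> = new Set(["openrouter", "openai", "ollama"]);

function resolveChatService(
  agentService: LlmServiceRow | null, // row behind agent.llmServiceId, if set
  globalChat: { provider: string } | null, // the chat: block from workbench.yaml, if configured
): Resolution {
  if (agentService) {
    // 1. Per-agent service.
    if (!WIRED_PROVIDERS.has(agentService.provider)) {
      return { ok: false, status: 422, code: "llm_provider_unsupported" };
    }
    // Ollama needs no credential; the other wired providers do.
    if (agentService.provider !== "ollama" && !agentService.credentialRef) {
      return { ok: false, status: 422, code: "llm_credential_missing" };
    }
    return { ok: true, source: "agent", provider: agentService.provider };
  }
  // 2. Fallback to the global chat: block.
  if (globalChat) return { ok: true, source: "global", provider: globalChat.provider };
  // 3. Hard stop: no executor available.
  return { ok: false, status: 503, code: "chat_disabled" };
}
```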

The system prompt resolves in the same layered way: agent.systemPrompt wins if set, otherwise chatConfig.systemPrompt from workbench.yaml, otherwise DEFAULT_AGENT_SYSTEM_PROMPT from control-plane/defaults.ts. This precedence holds regardless of which chat service provider was selected. The resolved prompt is added as the first turn in the prompt envelope, before any RAG-retrieved chunks.
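A minimal sketch of that precedence, assuming "set" means non-null (how the runtime treats an empty string is not stated here). The constant's value is a placeholder, and the function name is hypothetical.

```ts
// Placeholder value; the real constant lives in control-plane/defaults.ts.
const DEFAULT_AGENT_SYSTEM_PROMPT = "(runtime default prompt)";

// Hypothetical resolver: agent prompt, then workbench.yaml prompt, then the default.
function resolveSystemPrompt(
  agentSystemPrompt: string | null,
  configSystemPrompt: string | null | undefined, // chatConfig.systemPrompt, if any
): string {
  return agentSystemPrompt ?? configSystemPrompt ?? DEFAULT_AGENT_SYSTEM_PROMPT;
}
```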

Tool calling. The agent dispatcher's tool-execution loop (RAG search, list KBs, summarize, etc.) requires the underlying provider to support native function calling. All three wired providers (openrouter, openai, ollama) share the OpenAI tool-call wire format, so tool dispatch works across all of them — subject to the specific model supporting tools (OpenRouter's catalog is filtered to tool-capable models). Models without tool support still answer; they just skip the tool-call lane.

HTTP surface

All routes are workspace-scoped, mounted under /api/v1/workspaces/{w}/agents. Auth is enforced by the shared workspace-route wrapper.

Agents

| Method | Path | Notes |
| --- | --- | --- |
| GET | /agents | List agents in the workspace, oldest-first. Paginated. |
| POST | /agents | Create a new agent. Body: { agentId?, name, description?, systemPrompt?, userPrompt?, llmServiceId?, knowledgeBaseIds?, ragEnabled?, ragMaxResults?, ragMinScore?, rerankEnabled?, rerankingServiceId?, rerankMaxResults? }. 409 on duplicate explicit agentId. |
| GET | /agents/{agentId} | Get one agent. |
| PATCH | /agents/{agentId} | Patch any of the optional fields above (except agentId). llmServiceId accepts null to clear the binding. |
| DELETE | /agents/{agentId} | 204; cascades the agent's conversations and their messages. |

Conversations (per-agent)

| Method | Path | Notes |
| --- | --- | --- |
| GET | /agents/{agentId}/conversations | List the agent's conversations, newest-first. Paginated. |
| POST | /agents/{agentId}/conversations | Start a new conversation. Body: { conversationId?, title?, knowledgeBaseIds? }. 404 if the agent doesn't exist. |
| GET | /agents/{agentId}/conversations/{conversationId} | Get one conversation. |
| PATCH | /agents/{agentId}/conversations/{conversationId} | Update title and / or knowledgeBaseIds. |
| DELETE | /agents/{agentId}/conversations/{conversationId} | 204; cascades messages. |

Messages (per-conversation)

| Method | Path | Notes |
| --- | --- | --- |
| GET | /agents/{agentId}/conversations/{conversationId}/messages | Oldest-first message log. Paginated. |
| POST | /agents/{agentId}/conversations/{conversationId}/messages | Synchronous send. Body: { content }. Persists the user turn, retrieves grounding context, calls the agent's LLM (per the resolution order above), persists the assistant turn, returns { user, assistant }. |
| POST | /agents/{agentId}/conversations/{conversationId}/messages/stream | SSE send. Same body. Emits user-message, then one token event per delta, then a terminal done (or error) carrying the persisted assistant row. |

POST /messages and POST /messages/stream return:

  • 404 when the conversation does not belong to the named agent (or when the workspace, agent, or conversation does not exist).
  • 422 llm_provider_unsupported when agent.llmServiceId points at an LLM service whose provider is not one of openrouter, openai, or ollama.
  • 422 llm_credential_missing when the bound LLM service has no credentialRef.
  • 503 chat_disabled when the runtime has no global chat: block configured and the agent has no llmServiceId — there is no executor available.

The streaming wire format mirrors the now-retired /chats/.../messages/stream route. Browser clients use fetch with Accept: text/event-stream and parse the response body manually (EventSource only supports GET). The runtime helper the web UI uses lives at apps/web/src/lib/chatStream.ts.

The dispatcher emits the following SSE events in order:

| Event | When | Payload |
| --- | --- | --- |
| user-message | Once, after the user turn is persisted | The persisted user ChatMessage |
| token | Per model emission | { delta: string } |
| token-reset | Optional. Fires after each tool-call iteration so clients can clear pre-tool narration from the live preview | {} |
| tool-call | When the model requests a tool invocation. Native function calling works across all wired providers (openrouter, openai, ollama) since they share the OpenAI tool-call wire format, subject to the model supporting tools. | { toolName, args, callId } |
| tool-result | Each tool result fed back into the next iteration | { toolName, callId, result } |
| done | Terminal on success | The persisted assistant ChatMessage (metadata.finish_reason: "stop" / "length") |
| error | Terminal on failure | The persisted assistant ChatMessage with metadata.finish_reason: "error" and a human-readable content |

Each turn ends with exactly one of done or error. The dispatcher caps tool-use iterations at MAX_TOOL_ITERATIONS = 6 per turn. Conversations bound to a model without tool support never emit tool-call / tool-result — the assistant streams a single answer pass instead.
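On the client side, the event sequence lends itself to a small reducer that maintains the live preview. The sketch below is one possible shape, not the web UI's implementation: the state fields and function name are assumptions, and the event payloads are trimmed to the fields the reducer needs.

```ts
type StreamEvent =
  | { event: "user-message"; data: { content: string } }
  | { event: "token"; data: { delta: string } }
  | { event: "token-reset"; data: Record<string, never> }
  | { event: "tool-call"; data: { toolName: string; args: unknown; callId: string } }
  | { event: "tool-result"; data: { toolName: string; callId: string; result: unknown } }
  | { event: "done" | "error"; data: { content: string } };

// Hypothetical UI state for one in-flight turn.
interface PreviewState {
  preview: string; // text shown while the answer streams
  pendingCalls: string[]; // callIds requested but not yet answered
  status: "streaming" | "done" | "error";
}

const initialState: PreviewState = { preview: "", pendingCalls: [], status: "streaming" };

function applyEvent(state: PreviewState, ev: StreamEvent): PreviewState {
  switch (ev.event) {
    case "token":
      return { ...state, preview: state.preview + ev.data.delta };
    case "token-reset":
      // Clear pre-tool narration before the next answer pass.
      return { ...state, preview: "" };
    case "tool-call":
      return { ...state, pendingCalls: [...state.pendingCalls, ev.data.callId] };
    case "tool-result": {
      const callId = ev.data.callId;
      return { ...state, pendingCalls: state.pendingCalls.filter((id) => id !== callId) };
    }
    case "done":
    case "error":
      // The terminal event carries the persisted assistant content.
      return { ...state, preview: ev.data.content, status: ev.event };
    default:
      return state; // user-message does not affect the preview
  }
}
```

Because exactly one terminal event arrives per turn, status leaves "streaming" exactly once.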

```text
event: user-message
data: {"workspaceId":"…","conversationId":"…","role":"user","content":"hi","messageId":"…","messageTs":"…","metadata":{}}

event: token
data: {"delta":"Hello"}

event: token
data: {"delta":" there"}

event: done
data: {"workspaceId":"…","conversationId":"…","role":"agent","content":"Hello there","messageId":"…","messageTs":"…","metadata":{"model":"…","finish_reason":"stop","context_document_ids":"…"}}
```
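Since EventSource cannot issue a POST, a client has to split the response body into events itself. The function below is a minimal sketch of that step and is not the code in apps/web/src/lib/chatStream.ts; its name and return shape are assumptions, and it handles only the event: and data: fields used in the sample above.

```ts
interface SseEvent {
  event: string;
  data: string; // raw JSON text; callers JSON.parse it
}

// Splits a text buffer into complete SSE events plus the unparsed remainder.
function parseSse(buffer: string): { events: SseEvent[]; rest: string } {
  const events: SseEvent[] = [];
  const blocks = buffer.replace(/\r\n/g, "\n").split("\n\n");
  // The last piece is either empty or an incomplete event; keep it for the next read.
  const rest = blocks.pop() ?? "";
  for (const block of blocks) {
    let event = "message"; // SSE default when no event: field is present
    const data: string[] = [];
    for (const line of block.split("\n")) {
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) data.push(line.slice(5).replace(/^ /, ""));
    }
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return { events, rest };
}
```

In a browser, the buffer would typically be fed from response.body.getReader() with a TextDecoder, prepending the previous rest to each newly decoded chunk before calling parseSse again.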

Cascade rules

  • Workspace delete → agents → conversations → messages.
  • Agent delete → that agent's conversations → their messages. Other agents in the workspace are untouched.
  • Conversation delete → its messages.
  • KB delete → strips the kb id from every conversation's knowledgeBaseIds set in the workspace. The agent-level knowledgeBaseIds is not stripped today; if this becomes a problem we'll extend the cascade.
  • LLM service delete → refused with 409 conflict while any agent still references the service via llmServiceId. Reassign or delete the dependent agents first.
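Two of these rules are easy to picture as in-memory functions. This is only an illustration over plain arrays, with hypothetical names and trimmed record shapes; the runtime applies the rules against its tables.

```ts
interface AgentRef {
  agentId: string;
  llmServiceId: string | null;
}

interface ConversationKbs {
  conversationId: string;
  knowledgeBaseIds: readonly string[];
}

// LLM service delete: a non-empty result corresponds to the 409 conflict case.
function agentsBlockingServiceDelete(agents: readonly AgentRef[], serviceId: string): string[] {
  return agents.filter((a) => a.llmServiceId === serviceId).map((a) => a.agentId);
}

// KB delete: strip the id from every conversation's set.
// Agent-level knowledgeBaseIds are deliberately left alone, as noted above.
function stripKbFromConversations(
  conversations: readonly ConversationKbs[],
  kbId: string,
): ConversationKbs[] {
  return conversations.map((c) => ({
    ...c,
    knowledgeBaseIds: c.knowledgeBaseIds.filter((id) => id !== kbId),
  }));
}
```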

Testing

Released under the MIT license.