Green Boxes — the multi-runtime architecture

Every "green box" is a language-native implementation of the AI Workbench HTTP runtime. They all serve the same /api/v1/* contract and talk to Astra internally through their language's native SDK. The UI is pointed at one of them at deploy time via BACKEND_URL.

Why multiple runtimes?

  • Different organizations prefer different stacks. A team with heavy Python tooling should be able to deploy a Python-native workbench runtime that they can extend idiomatically.
  • The Astra SDK ecosystem is polyglot (astra-db-ts, astrapy, astra-db-java, …). Each runtime uses its native SDK — no wrapper libraries, no "universal" middleware to maintain.
  • The HTTP contract is small enough to replicate; replicating it across languages is easier than mandating one runtime and making everyone port their tooling.

Current runtimes

| Runtime | Location | Status | Astra SDK |
| --- | --- | --- | --- |
| TypeScript (default) | runtimes/typescript/ | Operational through Phase 3 + auth (UI, playground, API keys, OIDC login + silent refresh, knowledge bases with auto-provisioned collections, chunking / embedding / reranking services, vector/text search, hybrid + rerank, sync/async ingest with pipeline resume after orphan reclaim, durable JobStore with cross-replica subscription polling + lease/heartbeat + orphan sweeper, chunks listing, document delete cascade) | @datastax/astra-db-ts |
| Python | runtimes/python/ | FastAPI scaffold; routes return 501 until implemented | astrapy (pending) |
| Java | runtimes/java/ | Spring Boot scaffold; routes return 501 until implemented | astra-db-java (pending) |

The TypeScript runtime is the default ship path: it gets bundled with the UI into one Docker image, so operators deploying the UI get a working backend out of the box. Alternative-language runtimes deploy as separate containers and the UI points at them via BACKEND_URL.

The contract

Every green box serves:

| Path | Purpose |
| --- | --- |
| GET /healthz | Liveness |
| GET /readyz | Readiness (must confirm its control plane is reachable) |
| GET /version | Build metadata; runtime field carries the language tag |
| GET / | Service banner (JSON) when no UI is embedded; the UI shell (HTML) in the bundled TypeScript + UI image |
| GET /docs | OpenAPI reference UI |
| GET /api/v1/openapi.json | Machine-readable OpenAPI 3.1 doc |
| (CRUD) /api/v1/workspaces[/{id}] | Workspace lifecycle |
| (CRUD) /api/v1/workspaces/{w}/{chunking,embedding,reranking}-services[/{id}] | Service-definition lifecycle |
| (CRUD) /api/v1/workspaces/{w}/knowledge-bases[/{id}] | KB lifecycle (POST auto-provisions the underlying vector collection; DELETE drops it) |
| POST /api/v1/workspaces/{w}/knowledge-bases/{kb}/records; DELETE .../records/{rid}; POST .../search | |
| (CRUD) /api/v1/workspaces/{w}/knowledge-bases/{kb}/documents[/{id}] | Document metadata + chunks listing under a KB |
| POST /api/v1/workspaces/{w}/knowledge-bases/{kb}/ingest[?async=true] | |

Full contract details: api-spec.md.

Response shapes for every route are captured as fixtures under conformance/fixtures/. Every runtime's test suite diffs against those fixtures — drift surfaces as a failing test.
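
As a rough illustration of what "diffing against a fixture" can mean, here is a minimal Python sketch of a structural JSON comparison that reports where a response drifts from the expected shape. The function name diff_json and the sample bodies are hypothetical; the real fixtures under conformance/fixtures/ may carry dynamic fields (IDs, timestamps) that a real harness would need to normalize first, which this sketch does not attempt.

```python
from typing import Any


def diff_json(expected: Any, actual: Any, path: str = "$") -> list[str]:
    """Return human-readable differences between two decoded JSON values."""
    if isinstance(expected, dict) and isinstance(actual, dict):
        problems: list[str] = []
        for key in sorted(expected.keys() | actual.keys()):
            child = f"{path}.{key}"
            if key not in actual:
                problems.append(f"{child}: missing from response")
            elif key not in expected:
                problems.append(f"{child}: not in fixture")
            else:
                problems.extend(diff_json(expected[key], actual[key], child))
        return problems
    if isinstance(expected, list) and isinstance(actual, list):
        problems = []
        if len(expected) != len(actual):
            problems.append(
                f"{path}: length {len(actual)} != fixture length {len(expected)}"
            )
        # Compare the overlapping prefix element by element.
        for i, (e, a) in enumerate(zip(expected, actual)):
            problems.extend(diff_json(e, a, f"{path}[{i}]"))
        return problems
    # Scalars, or containers of mismatched type: plain equality.
    if expected != actual:
        return [f"{path}: {actual!r} != fixture {expected!r}"]
    return []


# Hypothetical fixture and response bodies, for illustration only.
fixture = {"id": "w1", "name": "demo", "tags": ["a", "b"]}
response = {"id": "w1", "name": "Demo", "tags": ["a"], "extra": True}

for line in diff_json(fixture, response):
    print(line)
```

An empty result means the response matches the fixture; any non-empty result is what would surface as a failing test.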

What's not part of the contract

  • Internal storage. Each runtime picks its own control-plane backend(s). The TS runtime ships memory / file / astra; the Python runtime can reuse those names or invent its own. What matters is what shows up on the wire.
  • Astra SDK. Each runtime uses its language-native client. The wire traffic they generate is DataStax's problem to keep consistent — not ours.
  • Logging, metrics, tracing shape. Each runtime can emit what's idiomatic for its ecosystem; we'll align on a common telemetry schema later.
  • Language-specific deployment ergonomics. The TS runtime uses Hono, the Python one FastAPI, and the Java one Spring Boot. The internals are whatever each language prefers.

Deployment

The UI reaches its runtime via BACKEND_URL:

                     ┌───────────────────┐
                     │       UI          │
                     │                   │
                     │ BACKEND_URL=…     │
                     └─────────┬─────────┘

      ┌────────────────────────┼────────────────────────┐
      │                        │                        │
 ┌────┴─────┐            ┌─────┴─────┐            ┌─────┴─────┐
 │ TS       │            │ Python    │            │ Java      │
 │ runtime  │            │ runtime   │            │ runtime   │
 │ :8080    │            │ :8080     │            │ :8080     │
 └────┬─────┘            └─────┬─────┘            └─────┬─────┘
      │                        │                        │
      ▼                        ▼                        ▼
      (their language-native Astra SDK to Astra Data API)

The default shipping container embeds the UI with the TS runtime, so BACKEND_URL points at / (same origin). For alternative runtimes, operators deploy the alternative container and set BACKEND_URL=http://my-python-runtime:8080 on the UI container.
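
To make the two cases concrete, the following Python sketch shows one way a client could turn BACKEND_URL into a request URL. It is illustrative only: the helper name api_url and the fallback to / when the variable is unset are assumptions, and the actual UI may resolve the value differently.

```python
import os
from typing import Mapping, Optional


def api_url(path: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Build the URL for a contract path from BACKEND_URL."""
    env = os.environ if env is None else env
    # Assumption: an unset BACKEND_URL behaves like "/" (same origin).
    base = env.get("BACKEND_URL", "/").rstrip("/")
    return f"{base}/{path.lstrip('/')}"


# Bundled image: same-origin, so the result is a relative path.
print(api_url("/api/v1/workspaces", {"BACKEND_URL": "/"}))

# Alternative runtime deployed as a separate container.
print(api_url("/api/v1/workspaces",
              {"BACKEND_URL": "http://my-python-runtime:8080"}))
```

Stripping the trailing slash from the base and the leading slash from the path keeps the join stable whether or not the operator writes BACKEND_URL with a trailing slash.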

Adding a new language

See runtimes/README.md for the step-by-step. In short:

  1. Create runtimes/<lang>/.
  2. Scaffold an HTTP server exposing /api/v1/*.
  3. Use the language-native DataStax SDK internally.
  4. Write a test harness that runs every scenario in conformance/scenarios.json against your server and diffs responses against the shared fixtures.
  5. Add a row to the current-runtimes table above when you open the PR.
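
The harness in step 4 might be shaped roughly like the Python sketch below. The scenario and fixture fields used here (name, method, path, fixture, status, body) are hypothetical, since the real layout of conformance/scenarios.json is defined in the repository, and the transport is an in-process stand-in rather than a real HTTP client.

```python
from typing import Any, Callable

# A transport takes (method, path) and returns (status, decoded JSON body).
Transport = Callable[[str, str], tuple[int, Any]]


def run_scenarios(
    scenarios: list[dict], fixtures: dict[str, dict], send: Transport
) -> list[str]:
    """Run every scenario and return the names of those that drift."""
    failures: list[str] = []
    for scenario in scenarios:
        status, body = send(scenario["method"], scenario["path"])
        expected = fixtures[scenario["fixture"]]
        if (status, body) != (expected["status"], expected["body"]):
            failures.append(scenario["name"])
    return failures


# Hypothetical scenarios and fixtures, for illustration only.
scenarios = [
    {"name": "healthz", "method": "GET", "path": "/healthz",
     "fixture": "healthz"},
    {"name": "list-workspaces", "method": "GET",
     "path": "/api/v1/workspaces", "fixture": "workspaces-empty"},
]
fixtures = {
    "healthz": {"status": 200, "body": {"status": "ok"}},
    "workspaces-empty": {"status": 200, "body": {"items": []}},
}


def fake_send(method: str, path: str) -> tuple[int, Any]:
    """Stand-in for a scaffold runtime: only /healthz is implemented."""
    if path == "/healthz":
        return 200, {"status": "ok"}
    return 501, {"error": {"code": "not_implemented"}}


print(run_scenarios(scenarios, fixtures, fake_send))
```

In a real harness the scenarios would be loaded from the shared JSON file and send would issue HTTP requests against the runtime under test; keeping the transport pluggable lets the same loop run against any language's server.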

Python runtime specifics

See runtimes/python/README.md for the quickstart, environment variables, and house rules.

Currently every /api/v1/* route is a stub that returns HTTP 501 not_implemented with the canonical error envelope. Operational routes (/healthz, /version, /, /docs) work today.
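
The scaffold behaviour can be pictured with the framework-free Python sketch below. It is not the FastAPI code in runtimes/python/: the handle function is hypothetical, the response bodies are placeholders, and the envelope shape ({"error": {"code", "message"}}) is an assumption. The canonical envelope is defined in api-spec.md.

```python
from typing import Any

# Operational routes that the scaffold already serves.
OPERATIONAL = {"/healthz", "/version", "/", "/docs"}


def handle(method: str, path: str) -> tuple[int, dict[str, Any]]:
    """Route a request the way a scaffold runtime might."""
    if method == "GET" and path in OPERATIONAL:
        # Placeholder body; the real routes return their own payloads.
        return 200, {"path": path}
    if path.startswith("/api/v1/"):
        # Assumed envelope shape, for illustration only.
        return 501, {
            "error": {
                "code": "not_implemented",
                "message": f"{method} {path} is not implemented yet",
            }
        }
    return 404, {"error": {"code": "not_found", "message": path}}


print(handle("GET", "/healthz"))
print(handle("POST", "/api/v1/workspaces"))
```

Returning a well-formed 501 for every contract route lets the conformance harness run end to end from day one and report precisely which routes are still unimplemented.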

Java runtime specifics

See runtimes/java/README.md for the quickstart, environment variables, and house rules.

Spring Boot 3 + Java 21 + Gradle. Same scaffold posture as the Python runtime: operational routes (/healthz, /version, /, /docs) work today; every /api/v1/* route throws NotImplementedApiError → 501 with the canonical envelope. Java records under com.datastax.aiworkbench.model mirror the TS *Record types one-to-one so JSON maps cleanly.

Released under the MIT license.