Architecture Overview
Astra is built around a hard product constraint: hot-path reads must stay within the 10 ms read target (see PRD §25). Cache layers back that goal, and synchronous database reads on latency-sensitive user paths are avoided by design.
Layers
Applications use the Astra SDK, which talks to a microkernel-style core (actors, task graph, scheduler, messaging, state). Below that: Postgres, Redis, Memcached, and object storage.
flowchart TB
subgraph apps [Applications]
SDK[Astra SDK]
end
subgraph kern [Kernel]
AR[Actor runtime]
TG[Task graph]
SCH[Scheduler]
BUS[Messaging]
ST[State / events]
end
subgraph infra [Infrastructure]
PG[(Postgres)]
RD[(Redis)]
MC[(Memcached)]
OBJ[Object storage]
end
SDK --> AR
AR --> BUS
TG --> PG
SCH --> RD
BUS --> RD
ST --> PG
SDK --> TG
Security and data plane
Clients use TLS to the API gateway with JWT auth. Service-to-service traffic uses mTLS. Tool execution is sandboxed; secrets are injected via Vault (or equivalent), not baked into images. See Security and PRD §18.
Kubernetes namespaces (summary)
| Namespace | Role |
|---|---|
| control-plane | Gateway, identity, access control |
| kernel | Scheduler, tasks, agents, goals, planner, memory |
| workers | Execution, browser, tools, LLM routing, prompts, evaluation, worker manager |
| infrastructure | Data stores and supporting services |
| observability | Metrics, logs, traces |
Goal → task → result (flow)
- Client submits a goal for an agent (with optional inline documents).
- The goal path validates the request and assembles agent context (system_prompt + rules + skills + context_docs); the planner then produces a task DAG with that context embedded.
- The task service persists the graph; the scheduler finds ready tasks and dispatches them to shard queues.
- Workers claim work, run tool/sandbox steps, then mark tasks complete or failed. Failed tasks are retried or moved to dead letter.
- Dependent tasks unlock; the cycle continues until the graph finishes.
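
The "find ready tasks" and "dependent tasks unlock" steps above can be sketched as a small readiness check over the DAG. This is a minimal illustration, not Astra's scheduler: the `Task` type, the `Status` values, and the `ReadyTasks` function are hypothetical names, and the graph lives in memory rather than in the task service.

```go
package main

import (
	"fmt"
	"sort"
)

// Status is a hypothetical task state; the real Astra states may differ.
type Status int

const (
	Pending Status = iota
	Done
	Failed
)

// Task is an illustrative DAG node: it becomes ready once every
// dependency has reached Done.
type Task struct {
	ID        string
	DependsOn []string
	Status    Status
}

// ReadyTasks returns the IDs of pending tasks whose dependencies are all
// Done, sorted so the result is deterministic.
func ReadyTasks(graph map[string]*Task) []string {
	var ready []string
	for id, t := range graph {
		if t.Status != Pending {
			continue
		}
		ok := true
		for _, dep := range t.DependsOn {
			d, found := graph[dep]
			if !found || d.Status != Done {
				ok = false
				break
			}
		}
		if ok {
			ready = append(ready, id)
		}
	}
	sort.Strings(ready)
	return ready
}

func main() {
	graph := map[string]*Task{
		"fetch":     {ID: "fetch"},
		"parse":     {ID: "parse", DependsOn: []string{"fetch"}},
		"summarize": {ID: "summarize", DependsOn: []string{"parse"}},
	}
	fmt.Println(ReadyTasks(graph)) // only "fetch" has no unmet dependencies

	graph["fetch"].Status = Done   // a worker reports completion
	fmt.Println(ReadyTasks(graph)) // completing "fetch" unlocks "parse"
}
```

A failed dependency never reaches `Done` in this sketch, so its dependents stay blocked until a retry succeeds; how the real scheduler treats dependents of dead-lettered tasks is not specified here.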
Chat flow (Phase 10)
Users can also interact via WebSocket chat on the api-gateway. Chat sessions connect to chat-capable agents with streaming responses, optional tool invocation, and memory context. The Slack adapter bridges the Slack Events API to the chat/goal pipeline for workspace-based interactions.
Dashboard
The super-admin dashboard at /superadmin/dashboard/ provides platform-wide visibility: service health, agents, goals, workers, approvals, cost, and Slack configuration. It uses a pastel, glass-style design with a light/dark theme toggle.
Sequence diagram for the goal → task → result flow:
sequenceDiagram
participant C as Client
participant GW as API gateway
participant GS as Goal path
participant PL as Planner
participant TS as Task service
participant SCH as Scheduler
participant Q as Work queues
participant W as Workers
C->>GW: Goal
GW->>GS: Forward
GS->>PL: Plan
PL->>TS: Persist graph
SCH->>TS: Ready tasks
SCH->>Q: Dispatch
W->>Q: Claim
W->>TS: Complete / fail
Caching (10 ms read path)
Hot reads use Redis and Memcached with TTLs described in PRD §13. Writes go to durable storage first, then emit events, then refresh the cache, so caches can lag slightly behind the database by design.
Tradeoff
Eventual consistency on cached fields is accepted so the 10ms read target stays achievable at scale.
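
The write ordering and the resulting cache lag can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: `Store`, `Write`, `RefreshCache`, and `HotRead` are hypothetical names, plain maps stand in for Postgres and Redis/Memcached, and TTLs are not modeled.

```go
package main

import "fmt"

// Store is an in-memory stand-in: db plays the durable database, cache
// plays Redis/Memcached, and events holds keys awaiting a cache refresh.
type Store struct {
	db     map[string]string
	cache  map[string]string
	events []string
}

func NewStore() *Store {
	return &Store{db: map[string]string{}, cache: map[string]string{}}
}

// Write persists durably first, then records an event. The cache is not
// touched here, so it may briefly serve the previous value.
func (s *Store) Write(key, value string) {
	s.db[key] = value
	s.events = append(s.events, key)
}

// RefreshCache drains pending events and copies the durable value of
// each changed key into the cache.
func (s *Store) RefreshCache() {
	for _, key := range s.events {
		s.cache[key] = s.db[key]
	}
	s.events = nil
}

// HotRead serves from the cache only; it never falls back to a
// synchronous database read.
func (s *Store) HotRead(key string) (string, bool) {
	v, ok := s.cache[key]
	return v, ok
}

func main() {
	s := NewStore()
	s.Write("agent:1:status", "idle")
	s.RefreshCache()

	s.Write("agent:1:status", "busy")
	v, _ := s.HotRead("agent:1:status")
	fmt.Println(v) // still the older value until the refresh runs

	s.RefreshCache()
	v, _ = s.HotRead("agent:1:status")
	fmt.Println(v) // the cache has caught up with the database
}
```

In a real deployment the refresh would be driven by an event consumer rather than called inline, and what a hot read does on a cache miss is a policy decision this sketch leaves open.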
Hardware targets
| Platform | Acceleration | Notes |
|---|---|---|
| macOS (Apple Silicon / Intel) | Metal, Neural Engine (ANE), CPU | Supported production target (Mac Mini, Mac Studio). ASTRA_USE_METAL=true. |
| Linux | CUDA, CPU | Primary cloud/K8s target. ASTRA_USE_CUDA=true when GPUs are available. |
Both platforms share a single codebase, with graceful CPU fallback when accelerators are unavailable. Platform detection uses runtime.GOOS or build tags. See PRD §20 and Deployment.
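
A minimal sketch of runtime selection with CPU fallback, assuming the two environment flags from the table are compared against the literal string "true". The `SelectBackend` function and the labels "metal", "cuda", and "cpu" are hypothetical and not taken from the Astra codebase; a real implementation would presumably also probe whether the accelerator is actually present.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// SelectBackend picks an acceleration backend from the OS name and the
// environment flags listed in the hardware table. The getenv parameter
// is injected so the logic can be exercised without touching the real
// environment.
func SelectBackend(goos string, getenv func(string) string) string {
	switch {
	case goos == "darwin" && getenv("ASTRA_USE_METAL") == "true":
		return "metal"
	case goos == "linux" && getenv("ASTRA_USE_CUDA") == "true":
		return "cuda"
	default:
		// Fallback when no accelerator is requested for this platform.
		return "cpu"
	}
}

func main() {
	// runtime.GOOS is "darwin" on macOS and "linux" on Linux.
	fmt.Println(SelectBackend(runtime.GOOS, os.Getenv))
}
```

Build tags would move the same decision to compile time, for example by keeping the Metal and CUDA paths in separate files constrained to `darwin` and `linux`.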