SI providers & policies

memQL routes every model call — chat, streaming chat-with-tools, text-to-speech, speech-to-text, and embeddings — through a single, declaratively-configured provider system. Providers are defined as .memql files (vendor + model + params + env-resolved auth), registered into an in-memory ProviderRegistry at engine startup, and selected at call time either by explicit name, by a routing policy (a primary provider plus an ordered fallback chain with latency targets), or by a default. This document covers how providers are declared, the full catalog of configured providers across OpenAI, Anthropic, Google/Gemini, Mistral, Groq, and xAI (plus embeddings, TTS, STT, and a set of declared-but-stubbed modalities), how routing policies resolve to a provider chain with automatic pre-flight fallback, how prompts bind to providers, and the runtime mechanics — including a documented gap in BYOK runtime activation and a subtle prompt/policy naming inconsistency. It is a snapshot of main (VERSION 0.9.14).


1. Concepts: what a provider is

A provider is a single addressable model endpoint: a vendor (which determines the API client and wire format), a specific model id, a parameter block (context window, token caps, pricing), and an auth block. Providers come in two flavors:

  • Base providers (@base) declare vendor-level auth and a @type. They are not registered as callable providers; they exist so concrete providers can @extends(...) them and inherit auth + type. Examples: openai, anthropic, google, groq, mistral, xai.
  • Concrete providers (@extends("<base>")) declare a @model and a params block. These are the entries that land in the registry and that callers select by name.

All providers are authored in one consolidated file, generated by the per-construct DSL restructure:

memql
// Generated by scripts/restructure-by-construct.
@base
@type("Anthropic")
provider anthropic {
auth {
apiKey env("MEMQL_SI_ANTHROPIC_API_KEY")
}
}

Source: dsl/providers/providers.memql

memql
@extends("anthropic")
@model("claude-sonnet-4-6")
provider streamClaudeSonnet {
params {
contextWindow 200000
maxTokens 4096
enablePromptCache "true"
inputCostPerMillion 3.00
outputCostPerMillion 15.00
cachedInputCostPerMillion 0.30
}
}

Source: dsl/providers/providers.memql
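The @extends relationship can be pictured as a merge in which the concrete provider inherits the base's type and auth unless it sets its own. The sketch below is illustrative only: providerDecl, resolveExtends, and the field layout are hypothetical names, not the loader's actual types, and the "concrete value wins" rule for auth entries is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// providerDecl is a hypothetical, simplified view of one parsed provider block.
type providerDecl struct {
	Name    string
	Base    bool              // true for @base declarations
	Extends string            // name given to @extends, if any
	Type    string            // @type; may be empty on a concrete provider
	Model   string            // @model
	Auth    map[string]string // auth block entries
}

// resolveExtends returns a copy of decl with Type and Auth inherited from
// its base. Values set on the concrete provider take precedence (assumption).
func resolveExtends(decl providerDecl, bases map[string]providerDecl) (providerDecl, error) {
	if decl.Extends == "" {
		return decl, nil
	}
	base, ok := bases[decl.Extends]
	if !ok || !base.Base {
		return providerDecl{}, errors.New("unknown base provider: " + decl.Extends)
	}
	out := decl
	if out.Type == "" {
		out.Type = base.Type
	}
	merged := make(map[string]string, len(base.Auth)+len(decl.Auth))
	for k, v := range base.Auth {
		merged[k] = v
	}
	for k, v := range decl.Auth {
		merged[k] = v // concrete override wins
	}
	out.Auth = merged
	return out, nil
}

func main() {
	bases := map[string]providerDecl{
		"anthropic": {
			Name: "anthropic", Base: true, Type: "Anthropic",
			Auth: map[string]string{"apiKey": "${MEMQL_SI_ANTHROPIC_API_KEY}"},
		},
	}
	concrete := providerDecl{Name: "claudeSonnet", Extends: "anthropic", Model: "claude-sonnet-4-6"}
	resolved, err := resolveExtends(concrete, bases)
	fmt.Println(resolved.Type, resolved.Auth["apiKey"], err)
}
```

The base itself never appears in the output of such a merge, which matches the statement that only concrete providers land in the registry.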

1.1 The @type annotation drives the Go client

The provider's @type (inherited from the base unless overridden) is the dispatch key in newSIProvider. The type both selects the concrete Go client and, when no @modality is set, infers the modality:

go
func newSIProvider(cfg ProviderConfig) (SIProvider, error) {
switch strings.ToLower(cfg.Type) {
case "openai", "openaichat":
return newOpenSIProvider(cfg)
case "openaistream":
return newOpenAIStreamProvider(cfg)
// OpenAI-wire-compatible vendors. Each provider .memql sets its own
// auth.baseURL and auth.apiKey env var; the existing OpenAI client
// handles the wire format.
case "google", "googlechat",
"groq", "groqchat",
"xai", "xaichat",
"mistral", "mistralchat":
return newOpenSIProvider(cfg)
case "googlestream", "groqstream", "xaistream", "mistralstream":
return newOpenAIStreamProvider(cfg)
case "openaitts":
return newOpenAITTSProvider(cfg)
case "openaiembedding":
return newOpenAIEmbeddingProvider(cfg)
case "anthropic", "anthropicchat":
return newAnthropicProvider(cfg)
case "anthropicstream":
return newAnthropicStreamProvider(cfg)
...
}
}

Source: component/memql/si_providers.go

A key architectural decision: Google, Groq, xAI, and Mistral all reuse the OpenAI Go client. They are OpenAI-wire-compatible, so the only thing that differs is auth.baseURL and the API-key env var. Adding a new OpenAI-compatible vendor requires only a base .memql with the right baseURL and a case entry — no new Go client. Anthropic is the exception with its own SDK-backed client (it supports extended thinking and prompt caching natively).

1.2 Modality and inference

Modality is set explicitly via @modality(...) or inferred from @type. ResolvedModality() maps types to one of:

go
const (
ModalityText ProviderModality = "text"
ModalityTTS ProviderModality = "tts"
ModalitySTT ProviderModality = "stt"
ModalityEmbedding ProviderModality = "embedding"
// declared so .memql files can register intent; handlers are placeholders today
ModalityRealtime ProviderModality = "realtime"
ModalityAudio ProviderModality = "audio"
ModalityImage ProviderModality = "image"
ModalityVideo ProviderModality = "video"
ModalityComputerUse ProviderModality = "computerUse"
ModalityModeration ProviderModality = "moderation"
ModalitySearch ProviderModality = "search"
ModalityResearch ProviderModality = "research"
)

Source: component/memql/si_providers.go

The text, TTS, and embedding modalities have real Go clients in the registry. STT does not: its dispatch types (openaistt / openaiwhisper) route through newOpenAIPlaceholderProvider, with real transcription living outside the registry in the integrations/stt/ package (see §3.5). The remaining eight modalities (realtime, audio, image, video, computer-use, moderation, search, research) are likewise declared but stubbed — their .memql files register the model + auth so misconfigurations surface at startup, but invocation routes through newOpenAIPlaceholderProvider, which validates the API key at registration and returns an informative error on Call():

go
func (p *openAIPlaceholderProvider) Call(_ context.Context, _ string) (any, error) {
return nil, fmt.Errorf(
"provider %q (%s / model=%s) is declared but the Go client is not wired yet; "+
"add a dispatch case in component/memql/si_providers.go:newSIProvider",
p.name, p.capability, p.model,
)
}

Source: component/memql/si_providers.go

The intent is that the .memql files stay untouched when a real client lands — you only swap the dispatch case.


2. Auth resolution: env placeholders, concept storage, and the BYOK gap

Every base provider declares auth.apiKey env("MEMQL_SI_<VENDOR>_API_KEY"). The env(...) form lowers to a ${MEMQL_SI_..._API_KEY} placeholder, resolved by resolveAuthPlaceholders at registration in a three-layer order:

  1. v1:platform:globalSecret — via the engine-wired systemSecretResolver.
  2. v1:platform:globalVariable — for non-sensitive auth fields like baseURL.
  3. OS env — the bootstrap-window fallback.

go
// Resolution order:
// 1. v1:platform:globalSecret -- via systemSecretResolver ...
// 2. v1:platform:globalVariable -- same, for non-sensitive auth fields like baseURL.
// 3. OS env -- bootstrap-window fallback. Providers
// load eagerly during engine init ...
// If all three layers miss, the provider fails to load with a message
// telling the operator how to seed it.

Source: component/memql/si_providers.go

The resolver also normalizes naming. A provider asking for MEMQL_SI_OPENAI_API_KEY will additionally accept the bare OPENAI_API_KEY form (the dev-manifest convention), trying both names against both concept stores and the env:

go
func authConceptLookupNames(envKey string) []string {
const elidedPrefix = "MEMQL_SI_"
if strings.HasPrefix(envKey, elidedPrefix) {
return []string{envKey, strings.TrimPrefix(envKey, elidedPrefix)}
}
return []string{envKey}
}

Source: component/memql/si_providers.go

When everything misses, the provider fails to load with an actionable message pointing at make secret-set NAME=... VALUE=... SCOPE=global.
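The layered lookup and the name normalization described above combine roughly as follows. This is a sketch under assumptions, not the real resolveAuthPlaceholders: lookupFunc, resolveAuthValue, and the two store callbacks are hypothetical, and trying both names within a layer before moving to the next layer is one plausible reading of the text.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// lookupFunc is a hypothetical stand-in for a concept-store lookup.
type lookupFunc func(name string) (string, bool)

// lookupNames mirrors authConceptLookupNames from the excerpt above.
func lookupNames(envKey string) []string {
	const elidedPrefix = "MEMQL_SI_"
	if strings.HasPrefix(envKey, elidedPrefix) {
		return []string{envKey, strings.TrimPrefix(envKey, elidedPrefix)}
	}
	return []string{envKey}
}

// resolveAuthValue tries each layer in order and, within a layer, each
// accepted name. It also reports which layer produced the value.
func resolveAuthValue(envKey string, secrets, variables lookupFunc) (value, layer string, err error) {
	layers := []struct {
		name string
		get  lookupFunc
	}{
		{"globalSecret", secrets},
		{"globalVariable", variables},
		{"env", os.LookupEnv}, // bootstrap-window fallback
	}
	for _, l := range layers {
		for _, name := range lookupNames(envKey) {
			if v, ok := l.get(name); ok && v != "" {
				return v, l.name, nil
			}
		}
	}
	return "", "", fmt.Errorf("auth value %s is not set in any layer", envKey)
}

func main() {
	seeded := map[string]string{"OPENAI_API_KEY": "sk-from-secret"}
	secrets := func(name string) (string, bool) {
		v, ok := seeded[name]
		return v, ok
	}
	none := func(string) (string, bool) { return "", false }

	v, layer, err := resolveAuthValue("MEMQL_SI_OPENAI_API_KEY", secrets, none)
	fmt.Println(v, layer, err)
}
```

In the usage above the secret is stored under the bare OPENAI_API_KEY name, so the prefixed lookup misses and the normalized name resolves from the first layer.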

2.1 The BYOK runtime-activation gap

memQL has a BYOK (bring-your-own-key) surface: the router integration's setApiKey capability encrypts a plaintext vendor key and writes it as a v1:platform:partitionSecret row keyed by <VENDOR>_API_KEY, so a tenant can override the instance default per partition without plaintext ever persisting:

go
func secretNameForVendor(vendor string) string {
return strings.ToUpper(strings.TrimSpace(vendor)) + "_API_KEY"
}

Source: integrations/router/integration.go

However, providers resolve their auth eagerly, once, at engine startup (resolveAuthPlaceholders runs during registration). The resolver reads v1:platform:globalSecret / globalVariable and OS env; the registered provider client then captures the resolved key in a closure, and that key is not re-resolved per request. Two consequences follow:

  • A BYOK key written via setApiKey after boot (to v1:platform:partitionSecret) does not retroactively re-key already-registered providers. The code comment for the OS-env fallback names the same root cause: "Retiring this requires either lazy/per-request provider auth resolution or a post-seed engine reload; tracked as future work." There is a ReloadSIProviders engine method (component/memql/engine_si.go) that re-runs registration, but auth resolution is not threaded per-partition at call time.
  • setApiKey writes a partition-scoped secret (v1:platform:partitionSecret), but the startup resolver reads the global stores (globalSecret / globalVariable) and OS env. The per-partition BYOK secret is therefore not on the provider auth-resolution path at registration.

Net: env-var (or globalSecret) startup resolution is the live path; partition-scoped BYOK secrets are written and encrypted but their runtime activation against the provider clients is the gap.
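The first consequence is ordinary closure-capture behavior, which a few lines of Go can demonstrate. Everything here is hypothetical (secretStore, newClient, newLazyClient); it is not memQL code, and it ignores the separate global-versus-partition scope mismatch.

```go
package main

import "fmt"

// secretStore is a hypothetical in-memory stand-in for a secret store.
type secretStore map[string]string

// newClient resolves the key once and captures it, the way an eagerly
// registered provider client would. Later store writes are not seen.
func newClient(store secretStore, keyName string) func() string {
	apiKey := store[keyName] // resolved at registration time
	return func() string { return apiKey }
}

// newLazyClient re-reads the store on every call, which corresponds to the
// "lazy/per-request" resolution named as future work.
func newLazyClient(store secretStore, keyName string) func() string {
	return func() string { return store[keyName] }
}

func main() {
	store := secretStore{"OPENAI_API_KEY": "instance-key"}
	eager := newClient(store, "OPENAI_API_KEY")
	lazy := newLazyClient(store, "OPENAI_API_KEY")

	store["OPENAI_API_KEY"] = "tenant-key" // a key written after boot

	fmt.Println(eager()) // instance-key
	fmt.Println(lazy())  // tenant-key
}
```

The eager client keeps answering with the boot-time key until it is rebuilt, which is why a reload or per-request resolution is needed for a post-boot key to take effect.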


3. The full provider catalog

Every concrete provider below is registered at startup (subject to its auth resolving). Pricing is USD per million tokens, read from params (inputCostPerMillion / outputCostPerMillion / cachedInputCostPerMillion). Values left at zero mean "not configured," which callers treat as unknown cost, not free (Pricing.Configured()).

3.1 Text chat — non-streaming (OpenAI client path)

| Provider | Model | Ctx window | Max tokens | In / Out / Cached ($/M) | Notes |
|---|---|---|---|---|---|
| chat53Latest | gpt-5.3-chat-latest | 128000 | 4096 | | Auto-updated alias for dev/preview |
| chat54 | gpt-5.4 | 128000 | 4096 | 2.50 / 10.00 / 1.25 | Flagship standard tier |
| chat54Mini | gpt-5.4-mini | 128000 | 16384 | 0.15 / 0.60 / 0.075 | Balanced; max raised to 16384 for structured seed work |
| chat54Nano | gpt-5.4-nano | 32000 | 2048 | 0.10 / 0.40 / 0.05 | Cheapest high-volume |
| chat54Pro | gpt-5.4 (aliased) | 256000 | 8192 | 5.00 / 20.00 / 2.50 | Aliases to gpt-5.4; gpt-5.4-pro is responses-API-only |

3.2 Text chat — non-streaming (Anthropic client path)

| Provider | Model | Ctx | Max | In / Out / Cached ($/M) |
|---|---|---|---|---|
| claudeHaiku | claude-haiku-4-5-20251001 | 200000 | 4096 | 0.80 / 4.00 / 0.08 |
| claudeSonnet | claude-sonnet-4-6 | 200000 | 4096 | 3.00 / 15.00 / 0.30 |
| claudeOpus | claude-opus-4-6 | 200000 | 4096 | 15.00 / 75.00 / 1.50 |

3.3 Text chat — streaming (the @default lives here)

Streaming providers carry @type("...Stream") and back the agent tool-loop and voice paths. stream54 carries @default, so any caller that does not pick a provider lands there.

| Provider | Model | Ctx | Max | In / Out / Cached ($/M) | Notes |
|---|---|---|---|---|---|
| stream53Latest | gpt-5.3-chat-latest | 128000 | 4096 | | Dev alias |
| stream54 | gpt-5.4 | 128000 | 4096 | 2.50 / 10.00 / 1.25 | @default |
| stream54Mini | gpt-5.4-mini | 128000 | 4096 | 0.15 / 0.60 / 0.075 | |
| stream54Nano | gpt-5.4-nano | 32000 | 2048 | 0.10 / 0.40 / 0.05 | |
| stream54Pro | gpt-5.4 (aliased) | 256000 | 8192 | 5.00 / 20.00 / 2.50 | Aliases to gpt-5.4; used as strongReasoning fallback |
| streamClaudeHaiku | claude-haiku-4-5-20251001 | 200000 | 4096 | 0.80 / 4.00 / 0.08 | |
| streamClaudeSonnet | claude-sonnet-4-6 | 200000 | 4096 | 3.00 / 15.00 / 0.30 | enablePromptCache=true |
| streamClaudeOpus | claude-opus-4-6 | 200000 | 4096 | 15.00 / 75.00 / 1.50 | |
| reasoningClaudeOpus | claude-opus-4-6 | 200000 | 16384 | 15.00 / 75.00 / 1.50 | thinkingBudgetTokens=8192 (extended thinking) |
| streamGeminiFlash | gemini-2.0-flash | 1000000 | 8192 | 0.075 / 0.30 / 0.01875 | Google, OpenAI-compatible |
| streamGeminiPro | gemini-2.0-pro | 2000000 | 8192 | 1.25 / 5.00 / 0.3125 | Google |
| streamGrok2 | grok-2-latest | 131072 | 8192 | 2.00 / 10.00 / 0 | xAI |
| streamGroqLlama70B | llama-3.3-70b-versatile | 128000 | 8192 | 0.59 / 0.79 / 0 | Groq; sub-300ms TTFT |
| streamMistralLarge | mistral-large-latest | 128000 | 8192 | 2.00 / 6.00 / 0 | Mistral |
| streamCodestral | codestral-latest | 256000 | 8192 | 0.20 / 0.60 / 0 | Mistral coding model |
| streamCodex51Max | gpt-5.1-codex-max | 400000 | 16384 | | Heavy long-context coding |
| streamCodex53 | gpt-5.3-codex | 256000 | 8192 | | Coding-specialized |
| streamReasoning4Mini | o4-mini | 200000 | 16384 | | o-series reasoning |

Non-streaming sibling chat/coding providers also exist for the synchronous path: codestral, mistralLarge, geminiFlash, geminiPro, grok2, groqLlama70B.

3.4 Embeddings (real client — pgvector-backed)

memQL's embedding integration writes to the node_vectors and embedding_cache PostgreSQL tables via pgvector and supports cosine-distance similarity search. The default embedding provider is embedding3Small.

| Provider | Model | Dimensions | Notes |
|---|---|---|---|
| embedding3Small | text-embedding-3-small | 1536 | Default for high-volume retrieval |
| embedding3Large | text-embedding-3-large | 3072 | High-fidelity |

go
// Format embedding as pgvector literal: [0.1,0.2,...]
...
providerName = "embedding3Small"

Source: integrations/embedding/embedding.go

The pgvector schema is provisioned by migrations 20260325000001_enable_pgvector.up.sql and 20260325000002_vector_tables.up.sql (Source: component/database/memory-nodes/migrations/). The similarity and embedding integrations expose DSL capabilities (embed, similarity search by cosine distance).

3.5 TTS (real client) and STT (placeholder in the registry)

TTS is a real provider-registry path: openaitts dispatches to newOpenAITTSProvider. STT is not. The whisper1 and transcribeDiarize providers declare their model + auth, but both the openaistt and openaiwhisper dispatch types route through newOpenAIPlaceholderProvider — so a registry-path call returns the "not wired" error like the §3.6 stubs. Real transcription lives in the separate integrations/stt/ package (openai_whisper.go, deepgram.go, polyphon_session.go), which is not driven by the provider registry. MEMQL_WHISPER_MODEL (default whisper-1) is read there, not by the registry provider.

| Provider | Model | Modality | Notes |
|---|---|---|---|
| tts4oMini | gpt-4o-mini-tts | tts | Newest natural-sounding TTS (real client) |
| tts1Hd | tts-1-hd | tts | Reliable HD fallback (real client) |
| whisper1 | whisper-1 | stt | Registry placeholder; real STT in integrations/stt/openai_whisper.go |
| transcribeDiarize | gpt-4o-transcribe-diarize | stt | Registry placeholder; speaker diarization |

3.6 Declared-but-stubbed providers (placeholder clients)

These register their model + auth at startup but return a "not wired" error on invocation:

| Provider | Model | Modality |
|---|---|---|
| realtime15 / realtimeMini | gpt-realtime-1.5 / gpt-realtime-mini | realtime |
| audio15 / audioMini | gpt-audio-1.5 / gpt-audio-mini | audio |
| image15 / image1Mini | gpt-image-1.5 / gpt-image-1-mini | image |
| sora2 / sora2Pro | sora-2 / sora-2-pro | video |
| computerUse | computer-use-preview | computerUse |
| moderationOmni | omni-moderation-latest | moderation |
| search5 | gpt-5-search-api | search |
| research3 / research4Mini | o3-deep-research / o4-mini-deep-research | research |

4. Provider-specific runtime behaviors

4.1 OpenAI project header

For OpenAI providers, the client injects an OpenAI-Project header when a project id is present. It is read from auth.projectId (a .memql override) or MEMQL_SI_OPENAI_PROJECT_ID from env. Service-account keys (sk-svcacct-*) carry no project id, so the header is simply omitted — which is why MEMQL_SI_OPENAI_PROJECT_ID is intentionally not declared as an env() placeholder (an unset placeholder fails registration).

4.2 Anthropic extended thinking (reasoningClaudeOpus)

The reasoningClaudeOpus provider is the same model as streamClaudeOpus plus a thinkingBudgetTokens 8192 param. The Anthropic stream client reads it and injects a Thinking config into every request:

go
func (p *anthropicStreamProvider) anthropicThinking() anthropic.ThinkingConfigParamUnion {
budget, ok := intParam(p.params["thinkingBudgetTokens"])
...
return anthropic.ThinkingConfigParamOfEnabled(int64(budget))
}

Source: component/memql/si_providers.go

When thinking is enabled the client suppresses the temperature param (the Anthropic API requires temperature==1 under extended thinking), so callers do not need to coordinate:

go
if _, ok := p.params["temperature"]; ok && !p.anthropicThinkingEnabled() {
// ... set temperature
}

Source: component/memql/si_providers.go

Two constraints are baked into the provider: maxTokens is raised to 16384 (the budget must be < maxTokens), and the budget of 8192 is a mid-tier value. The provider is reserved for reasoning-heavy workloads (planner, agent factory, training-plan composition) where 5–30s of added latency is acceptable.
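The two rules described above (temperature is dropped when thinking is on, and the budget must stay below maxTokens) can be expressed as a small validation step. requestParams and buildParams are hypothetical names for illustration; the real client builds Anthropic SDK params instead.

```go
package main

import (
	"errors"
	"fmt"
)

// requestParams is a hypothetical, trimmed-down request shape.
type requestParams struct {
	MaxTokens      int
	ThinkingBudget int      // 0 means extended thinking is off
	Temperature    *float64 // nil means "do not send"
}

// buildParams enforces budget < maxTokens and drops temperature whenever
// thinking is enabled.
func buildParams(maxTokens, budget int, temperature *float64) (requestParams, error) {
	if budget > 0 && budget >= maxTokens {
		return requestParams{}, errors.New("thinkingBudgetTokens must be less than maxTokens")
	}
	out := requestParams{MaxTokens: maxTokens, ThinkingBudget: budget}
	if budget == 0 {
		out.Temperature = temperature
	}
	return out, nil
}

func main() {
	temp := 0.2
	p, err := buildParams(16384, 8192, &temp)
	fmt.Println(p.Temperature == nil, err) // true <nil>

	_, err = buildParams(4096, 8192, &temp)
	fmt.Println(err != nil) // true
}
```

The second call shows why reasoningClaudeOpus cannot keep the 4096 cap used by streamClaudeOpus: an 8192-token budget would not fit under it.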

4.3 Anthropic prompt caching (enablePromptCache)

streamClaudeSonnet sets enablePromptCache "true". The Anthropic client marks the prefix (system blocks) with cache_control=ephemeral, enabling Anthropic's prompt caching for the cached-input price tier (cachedInputCostPerMillion 0.30 vs. 3.00 regular input).

4.4 Pricing math

Pricing supports a split between cached and regular input tokens:

go
func (p Pricing) CostFor(inputTokens, outputTokens, cachedInputTokens int) (input, output, cachedInput, total float64) {
regularInput := inputTokens - cachedInputTokens
...
input = p.InputPerMillion * float64(regularInput) / 1_000_000
cachedInput = p.CachedInputPerMillion * float64(cachedInputTokens) / 1_000_000
output = p.OutputPerMillion * float64(outputTokens) / 1_000_000
total = input + output + cachedInput
return
}

Source: component/memql/si_providers.go

This breakdown is what the SI Router writes per call to the v1:router:call usage ledger.
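As a worked example of the split, the sketch below prices one call using the streamClaudeSonnet rates from the catalog. The pricing type and costFor method are a hypothetical re-creation of the excerpt; clamping regularInput at zero is an assumption about the elided lines, and the token counts are made up for illustration.

```go
package main

import "fmt"

// pricing holds USD-per-million-token rates (field names follow the excerpt).
type pricing struct {
	InputPerMillion, OutputPerMillion, CachedInputPerMillion float64
}

// costFor splits input into regular and cached tokens and prices each part.
// Clamping regularInput at zero is an assumption about the elided lines.
func (p pricing) costFor(inputTokens, outputTokens, cachedInputTokens int) (input, output, cachedInput, total float64) {
	regularInput := inputTokens - cachedInputTokens
	if regularInput < 0 {
		regularInput = 0
	}
	input = p.InputPerMillion * float64(regularInput) / 1_000_000
	cachedInput = p.CachedInputPerMillion * float64(cachedInputTokens) / 1_000_000
	output = p.OutputPerMillion * float64(outputTokens) / 1_000_000
	total = input + output + cachedInput
	return
}

func main() {
	sonnet := pricing{InputPerMillion: 3.00, OutputPerMillion: 15.00, CachedInputPerMillion: 0.30}
	// 10,000 input tokens of which 8,000 were cache hits, plus 1,000 output tokens.
	in, out, cached, total := sonnet.costFor(10_000, 1_000, 8_000)
	fmt.Printf("input=$%.4f cached=$%.4f output=$%.4f total=$%.4f\n", in, cached, out, total)
	// input=$0.0060 cached=$0.0024 output=$0.0150 total=$0.0234
}
```

With no cache hits the same call would cost $0.0300 for input plus $0.0150 for output, so the cached tier roughly halves the total in this example.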


5. Routing policies

A routing policy names a primary provider plus an ordered fallback chain and latency targets. Policies are authored in dsl/policies/policies.memql and loaded into a PolicyRegistry. The complete set:

memql
@primary("streamClaudeSonnet")
@fallback("stream54Pro")
@fallback("streamGeminiPro")
@maxLatencyMs(60000)
@preferredRole("assistant")
@preferredRole("specialist")
@description("Default chat policy for non-operator agents. Claude Sonnet 4.6 primary, GPT-5.4 Pro as cross-vendor fallback, Gemini Pro as tertiary safety net. ... the floor for agent replies is now Sonnet-class regardless of role. ...")
policy balancedChat { }
 
@primary("streamGeminiFlash")
@fallback("streamGroqLlama70B")
@fallback("streamCodestral")
@maxLatencyMs(15000)
@description("Cheapest capable -- bulk suggestion + classification work where cost per call matters more than model ceiling. ...")
policy cheapestCapable { }
 
@primary("streamCodestral")
@fallback("streamGroqLlama70B")
@fallback("stream54Mini")
@maxLatencyMs(20000)
@description("Fast coding -- cheap + quick for code generation and refactor assistance. ...")
policy fastCoding { }
 
@primary("streamGroqLlama70B")
@fallback("streamGeminiFlash")
@fallback("stream54Mini")
@maxTimeToFirstTokenMs(800)
@maxLatencyMs(10000)
@description("Low latency voice -- turn-taking in multi-party voice conversations. Groq is the best-in-class for first-token latency ...")
policy lowLatencyVoice { }
 
@primary("streamClaudeSonnet")
@fallback("stream54Pro")
@fallback("streamGeminiPro")
@maxLatencyMs(60000)
@preferredRole("operator")
@description("Strong reasoning -- for operator-enabled agents driving the UI and for complex tool-calling choreography. ...")
policy strongReasoning { }

Source: dsl/policies/policies.memql

5.1 Policy annotations

| Annotation | Meaning |
|---|---|
| @primary("name") | First provider tried. Required; load fails if empty. |
| @fallback("name") | Appended to the chain in declaration order; tried in turn on pre-flight error. |
| @maxLatencyMs(n) | Total-call latency target (ms). |
| @maxTimeToFirstTokenMs(n) | TTFT target (ms); only lowLatencyVoice sets it (800ms). |
| @preferredRole("role") | Registers this policy as the default for an agent role. First @preferredRole claim wins per role. |
| @description(...) | Human-readable rationale; surfaced on the /router/policies admin page. |

The parsed PolicyConfig exposes a ProviderChain() that flattens primary + fallbacks:

go
func (p PolicyConfig) ProviderChain() []string {
chain := make([]string, 0, len(p.Fallbacks)+1)
if strings.TrimSpace(p.Primary) != "" {
chain = append(chain, p.Primary)
}
for _, f := range p.Fallbacks { ... }
...
}

Source: component/memql/policy_parser.go
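To make the flattening concrete, here is a self-contained sketch applied to the strongReasoning policy. policyConfig and providerChain are reduced, hypothetical versions of the real types, and skipping blank fallback entries is an assumption about the elided loop body.

```go
package main

import (
	"fmt"
	"strings"
)

// policyConfig is a reduced, hypothetical version of PolicyConfig.
type policyConfig struct {
	Name      string
	Primary   string
	Fallbacks []string
}

// providerChain returns primary followed by fallbacks in declaration order.
// Skipping blank entries is an assumption about the elided loop body.
func (p policyConfig) providerChain() []string {
	chain := make([]string, 0, len(p.Fallbacks)+1)
	if s := strings.TrimSpace(p.Primary); s != "" {
		chain = append(chain, s)
	}
	for _, f := range p.Fallbacks {
		if s := strings.TrimSpace(f); s != "" {
			chain = append(chain, s)
		}
	}
	return chain
}

func main() {
	strong := policyConfig{
		Name:      "strongReasoning",
		Primary:   "streamClaudeSonnet",
		Fallbacks: []string{"stream54Pro", "streamGeminiPro"},
	}
	fmt.Println(strong.providerChain())
	// [streamClaudeSonnet stream54Pro streamGeminiPro]
}
```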

The fallback chains are deliberately cross-vendor: balancedChat and strongReasoning go Anthropic → OpenAI → Google, so a single vendor being out of credits / rate-limited / down does not take down the agent tool loop. Note the comment on stream54Pro — it aliases to gpt-5.4 precisely because gpt-5.4-pro rejects v1/chat/completions with a 404, which would silently kill the fallback.

5.2 @preferredRole → role-default policy

The registry indexes policies by role (byRole), and DefaultForRole(role) answers "which policy should an agent of this role use when it hasn't pinned one?":

go
func (r *PolicyRegistry) DefaultForRole(role string) string {
...
return r.byRole[key]
}

Source: component/memql/policy_loader.go
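The "first @preferredRole claim wins" rule amounts to a write-once map. The sketch below is hypothetical (roleIndex, register, and the lower-case/trim key normalization are assumptions, since the key derivation is elided above), and the late cheapestCapable claim is invented purely to show the rule.

```go
package main

import (
	"fmt"
	"strings"
)

// roleIndex is a hypothetical reduction of the registry's byRole map.
type roleIndex struct {
	byRole map[string]string // normalized role -> policy name
}

// roleKey normalizes a role name; the exact normalization is an assumption.
func roleKey(role string) string { return strings.ToLower(strings.TrimSpace(role)) }

// register records policy as the default for each role it prefers,
// keeping any earlier claim (first @preferredRole wins per role).
func (r *roleIndex) register(policy string, preferredRoles ...string) {
	if r.byRole == nil {
		r.byRole = make(map[string]string)
	}
	for _, role := range preferredRoles {
		key := roleKey(role)
		if key == "" {
			continue
		}
		if _, claimed := r.byRole[key]; !claimed {
			r.byRole[key] = policy
		}
	}
}

// defaultForRole returns "" when no policy has claimed the role.
func (r *roleIndex) defaultForRole(role string) string {
	return r.byRole[roleKey(role)]
}

func main() {
	var idx roleIndex
	idx.register("balancedChat", "assistant", "specialist")
	idx.register("strongReasoning", "operator")
	idx.register("cheapestCapable", "assistant") // hypothetical late claim, ignored

	fmt.Println(idx.defaultForRole("assistant")) // balancedChat
	fmt.Println(idx.defaultForRole("operator"))  // strongReasoning
}
```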

In practice the agent replier hardcodes the role→policy mapping rather than calling DefaultForRole:

go
policyName := agentPolicyName
if policyName == "" {
switch {
case operatorEnabled:
policyName = "strongReasoning"
case voiceMode:
policyName = "lowLatencyVoice"
default:
policyName = "balancedChat"
}
}

Source: integrations/agent/replier.go


6. The SI Router: resolution + fallback

The Router (component/router) is an in-process component embedded in every node that makes SI calls. It resolves a ResolveRequest to a provider chain, wraps the chosen client with an observer that writes one v1:router:call ledger row per call (tokens, cost, latency, outcome), and hands the wrapped provider back.

6.1 Selection precedence

go
// Precedence:
// 1. ExplicitProvider -- single-entry chain
// 2. PolicyName -- policy's primary + fallbacks
// 3. DefaultProvider -- single-entry chain
// 4. Registry default -- single-entry chain (last resort)
func (r *Router) resolveChain(req ResolveRequest, mod providerModality) ([]string, Resolved, error) {
switch {
case strings.TrimSpace(req.ExplicitProvider) != "":
chain = []string{strings.TrimSpace(req.ExplicitProvider)}
case strings.TrimSpace(req.PolicyName) != "" && r.policies != nil:
if policy, ok := r.policies.Lookup(req.PolicyName); ok {
chain = policy.ProviderChain()
policyName = policy.Name
}
}
if len(chain) == 0 {
if def := strings.TrimSpace(req.DefaultProvider); def != "" {
chain = []string{def}
} else if d := r.providers.Default(); d != "" {
chain = []string{d}
}
}
...
}

Source: component/router/router.go

After building the chain, resolveChain walks it and stamps the first available entry that implements the requested modality interface (ChatStreamWithToolsProvider for streaming-with-tools, ChatSIProvider for non-streaming) as the primary Resolved.

The agent replier composes the precedence as: per-turn hint → agent's stored explicit provider → agent's stored model (if it matches a registry name) → agent's stored policyName → role-based default policy → deploy-time env default (OPERATOR_AGENT_PROVIDER / DEFAULT_AGENT_PROVIDER) → registry global default.

6.2 Fallback semantics

The router returns a wrapper (fallbackStreamWithTools / fallbackChat) that walks the chain. The fallback rule is pre-flight only:

Pre-flight error on a chain entry → record outcome="fallback_used" and advance to the next entry. Mid-stream errors are NOT auto-fallback'd; they're recorded as outcome="error" by the observer, because mid-stream retry would require replaying partial output.

Source: component/router/fallback.go

Each failed pre-flight attempt writes a fallback_used ledger row attributed to the failed provider (with FallbackFromModel set). The ledger write is fire-and-forget on a detached context with a synthetic system:router actor, so observability never blocks or interrupts the SI reply:

go
func (r *Router) recordCall(rec CallRecord) {
go func() {
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
ctx = auth.ContextWithToken(ctx, &auth.TokenInfo{ Subject: "system:router", ... })
...
query := fmt.Sprintf("mutationRecordRouterCall(%s)", string(payload))
if _, err := r.engine.Execute(ctx, query); err != nil { ... }
}()
}

Source: component/router/router.go

Dropped writes are counted (RecordsDropped()), not retried. Note from types.go: Phase-1 token counts are heuristic (char-count / 4) with tokensEstimated=true; real vendor usage is threaded through in a later phase.

6.3 Registry default selection

The registry default is chosen with this precedence at registration:

go
// 1. MEMQL_DEFAULT_CHAT_PROVIDER env var (defaultPinned=true): never overridden.
// 2. A provider marked @default: always wins over a prior first-wins fallback.
// 3. First available registered provider: only used when nothing else applies.

Source: component/memql/si_providers.go

In effect the order is: an environment pin (MEMQL_DEFAULT_PROVIDER, read by loadSIProviders, or MEMQL_DEFAULT_CHAT_PROVIDER), then the @default annotation on stream54, then the first available provider.
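The three-rule precedence can be sketched as a small state machine that sees providers in load order. defaultPicker and its methods are hypothetical, and keeping the first @default when several providers carry the annotation is an assumption the source does not address.

```go
package main

import "fmt"

// defaultPicker is a hypothetical reduction of the registry's default state.
type defaultPicker struct {
	name          string
	pinned        bool // set from the environment; never overridden
	fromAnnotated bool // current default came from an @default provider
}

// pin records the environment-selected default.
func (d *defaultPicker) pin(name string) {
	if name != "" {
		d.name, d.pinned = name, true
	}
}

// register is called once per provider in load order.
func (d *defaultPicker) register(name string, annotatedDefault bool) {
	switch {
	case d.pinned:
		// rule 1: an env pin always stands
	case annotatedDefault && !d.fromAnnotated:
		d.name, d.fromAnnotated = name, true // rule 2: @default beats first-wins
	case d.name == "":
		d.name = name // rule 3: first available
	}
}

func main() {
	var d defaultPicker
	d.register("chat53Latest", false)
	d.register("stream54", true)
	d.register("stream54Mini", false)
	fmt.Println(d.name) // stream54

	var pinned defaultPicker
	pinned.pin("streamClaudeSonnet")
	pinned.register("stream54", true)
	fmt.Println(pinned.name) // streamClaudeSonnet
}
```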


7. How prompts bind to providers

Prompts (struct-form .memql declarations) declare their model with @defaultProvider("name") and their template with @templateFile("..."). Example:

memql
@defaultProvider("chat54Mini")
@templateFile("prompts/cognitionRouting.tmpl")
@description("Decide whether any AI agent should respond to a human utterance ... Sub-500ms fast-model target.")
prompt cognitionRouting {
agents []object @required @description("Available AI agents ...")
utterance string @required @description("The latest human message to route")
...
}

Source: dsl/cognition/prompts.memql

The observed @defaultProvider distribution across all prompts:

| Provider | Count | Representative prompts |
|---|---|---|
| chat54Mini | 13 | cognition routing/reply/triage/dispatch, askSpecialist, seedDomain*, consolidateMemory, decomposeGoal |
| chat54Nano | 1 | docSummary |
| streamClaudeSonnet | 2 | plannerAgent, reactiveConductor |
| streamClaudeOpus | 1 | trainerAgent |
| strongReasoning | 1 | agentFactoryAnalyze |

7.1 The si() path resolves the prompt provider as a registry name

When a prompt is invoked via si() / InvokeSIStructured, the provider name is resolved from the prompt's @defaultProvider (an explicit override wins; then the prompt's default; then the registry default):

go
func (r *siRuntime) resolveProviderName(prompt *PromptTemplate, invocation *SIInvocation) (string, error) {
if invocation != nil && invocation.ProviderOverride != nil {
if trimmed := strings.TrimSpace(*invocation.ProviderOverride); trimmed != "" {
return trimmed, nil
}
}
if prompt != nil && strings.TrimSpace(prompt.DefaultProvider) != "" {
return strings.TrimSpace(prompt.DefaultProvider), nil
}
if defaultName := r.providers.Default(); strings.TrimSpace(defaultName) != "" {
return strings.TrimSpace(defaultName), nil
}
...
}

Source: component/memql/si_runtime.go

The resolved name is then looked up directly in the registry as a provider entry (r.providers.Entry(providerName)) — not as a policy. Structured invocations go through StructuredChatProviderByName(providerName), with graceful degradation to the default structured-capable provider and finally to the default chat provider with in-prompt schema instructions:

go
if structured := e.StructuredChatProviderByName(providerName); structured != nil {
result, err = structured.CallChatStructured(ctx, messages, spec)
} else if structured := e.StructuredChatProvider(); structured != nil {
result, err = structured.CallChatStructured(ctx, messages, spec)
} else {
chat := e.DefaultChatProvider()
... // schema as in-prompt instructions
}

Source: component/memql/engine_si.go

7.2 Inconsistency: agentFactoryAnalyze names a policy where a provider is expected

agentFactoryAnalyze declares @defaultProvider("strongReasoning"), but strongReasoning is a policy, not a provider (grep -c "provider strongReasoning" dsl/providers/providers.memql → 0). The si()/InvokeSIStructured path resolves @defaultProvider as a registry provider name, not a policy. StructuredChatProviderByName("strongReasoning") therefore returns nil, and the call silently falls back to the default structured-capable provider (per the chain in §7.1). So agentFactoryAnalyze does not actually run on the Sonnet→Pro→Gemini chain its name implies — it lands on whatever StructuredChatProvider() picks (preference order chat54, chat54Mini, chat54Nano, chat54Pro, chat53Latest). This is a benign-but-misleading misconfiguration: the prompt path and the router policy path are two different resolution systems, and a policy name leaked into a provider slot. The router's policy chain only applies on the agent-reply path (ResolveStreamWithTools with PolicyName), not on the si() prompt path.
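A load-time check would surface this class of mistake instead of letting it fall through silently. The sketch below is hypothetical: checkPromptProviders is not an existing memQL function, and the maps stand in for whatever the prompt, provider, and policy registries expose.

```go
package main

import (
	"fmt"
	"sort"
)

// checkPromptProviders reports prompts whose @defaultProvider is not a
// registered provider, noting when the value matches a policy name instead.
func checkPromptProviders(prompts map[string]string, providers, policies map[string]bool) []string {
	var problems []string
	for prompt, target := range prompts {
		switch {
		case providers[target]:
			continue
		case policies[target]:
			problems = append(problems, fmt.Sprintf("%s: %q is a policy, not a provider", prompt, target))
		default:
			problems = append(problems, fmt.Sprintf("%s: %q is not a registered provider", prompt, target))
		}
	}
	sort.Strings(problems) // map iteration order is random; sort for stable output
	return problems
}

func main() {
	prompts := map[string]string{
		"cognitionRouting":    "chat54Mini",
		"agentFactoryAnalyze": "strongReasoning",
	}
	providers := map[string]bool{"chat54Mini": true, "chat54": true}
	policies := map[string]bool{"strongReasoning": true, "balancedChat": true}

	for _, p := range checkPromptProviders(prompts, providers, policies) {
		fmt.Println(p)
	}
	// agentFactoryAnalyze: "strongReasoning" is a policy, not a provider
}
```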

7.3 Two SI call styles

  • Structured-output prompts (routing, suggest, classification, factory/intake decisions) use CallChatStructured with a JSON schema — OpenAI's json_schema response format with Strict constrained decoding when available. On empty content the OpenAI client surfaces the real cause (finish_reason=length, content_filter, refusal) instead of bubbling a confusing "unexpected end of JSON input."
  • Prose prompts (agent replies to users) use regular streaming chat through the router's ResolveStreamWithTools path, where policies and fallback chains apply.

8. Admin surface

The router integration exposes three DSL-callable capabilities backing the CoPresent /router/* admin pages (Source: integrations/router/integration.go):

  • setApiKey — encrypt + persist a BYOK vendor key as v1:platform:partitionSecret (see §2.1 for the activation gap).
  • listModels — flatten every registered provider (text, TTS, STT, embedding modalities) with vendor, model, pricing, context window, and availability into v1:router:modelcatalog rows.
  • listPolicies — dump every routing policy with its ProviderChain(), latency targets, and preferred roles into v1:router:policycatalog rows.

9. Summary of resolution paths

| Caller | Resolution mechanism | Policies apply? | Fallback chain? |
|---|---|---|---|
| Agent reply (prose, tools) | Router ResolveStreamWithTools with PolicyName | Yes | Yes (pre-flight) |
| si() / InvokeSIStructured prompt | @defaultProvider resolved as registry provider name | No | Graceful provider-class fallback (not policy chain) |
| Embeddings | EmbeddingProvider(name), default embedding3Small | No | No |
| TTS | TTSProviderByName / modality scan (real registry client) | No | No |
| STT | Not a registry path: integrations/stt/ package (openai_whisper.go etc.); the whisper1 registry provider is a placeholder | No | No |
| Voice turn | Router with lowLatencyVoice policy (via voice_mode hint) | Yes | Yes |

The two systems to keep distinct: provider names (registry entries, what a .memql provider declares and what si() resolves) versus policy names (chains in dsl/policies/policies.memql, consumed only by the router's reply/voice path). The agentFactoryAnalyze case in §7.2 is exactly the bug you get when those two namespaces are crossed.