SI providers & policies

memQL routes every model call — chat, streaming chat-with-tools, text-to-speech, speech-to-text, and embeddings — through a single, declaratively-configured provider system. Providers are defined as .memql files (vendor + model + params + env-resolved auth), registered into an in-memory ProviderRegistry at engine startup, and selected at call time either by explicit name, by a routing policy (a primary provider plus an ordered fallback chain with latency targets), or by a default. This document covers how providers are declared, the full catalog of configured providers across OpenAI, Anthropic, Google/Gemini, Mistral, Groq, and xAI (plus embeddings, TTS, STT, and a set of declared-but-stubbed modalities), how routing policies resolve to a provider chain with automatic pre-flight fallback, how prompts bind to providers, and the runtime mechanics — including a documented gap in BYOK runtime activation and a subtle prompt/policy naming inconsistency. It is a snapshot of main (VERSION 0.9.14).

#1. Concepts: what a provider is

A provider is a single addressable model endpoint: a vendor (which determines the API client and wire format), a specific model id, a parameter block (context window, token caps, pricing), and an auth block. Providers come in two flavors:

Base providers (@base) declare vendor-level auth and a @type. They are not registered as callable providers; they exist so concrete providers can @extends(...) them and inherit auth + type. Examples: openai, anthropic, google, groq, mistral, xai.
Concrete providers (@extends("<base>")) declare a @model and a params block. These are the entries that land in the registry and that callers select by name.

All providers are authored in one consolidated file, generated by the per-construct DSL restructure:

memql

// Generated by scripts/restructure-by-construct.
@base
@type("Anthropic")
provider anthropic {
  auth {
    apiKey  env("MEMQL_SI_ANTHROPIC_API_KEY")
  }
}

Source: dsl/providers/providers.memql

memql

@extends("anthropic")
@model("claude-sonnet-4-6")
provider streamClaudeSonnet {
  params {
    contextWindow              200000
    maxTokens                  4096
    enablePromptCache          "true"
    inputCostPerMillion        3.00
    outputCostPerMillion       15.00
    cachedInputCostPerMillion  0.30
  }
}

Source: dsl/providers/providers.memql

#1.1 The `@type` annotation drives the Go client

The provider's @type (inherited from the base unless overridden) is the dispatch key in newSIProvider. The type both selects the concrete Go client and, when no @modality is set, infers the modality:

func newSIProvider(cfg ProviderConfig) (SIProvider, error) {
	switch strings.ToLower(cfg.Type) {
	case "openai", "openaichat":
		return newOpenSIProvider(cfg)
	case "openaistream":
		return newOpenAIStreamProvider(cfg)
	// OpenAI-wire-compatible vendors. Each provider .memql sets its own
	// auth.baseURL and auth.apiKey env var; the existing OpenAI client
	// handles the wire format.
	case "google", "googlechat",
		"groq", "groqchat",
		"xai", "xaichat",
		"mistral", "mistralchat":
		return newOpenSIProvider(cfg)
	case "googlestream", "groqstream", "xaistream", "mistralstream":
		return newOpenAIStreamProvider(cfg)
	case "openaitts":
		return newOpenAITTSProvider(cfg)
	case "openaiembedding":
		return newOpenAIEmbeddingProvider(cfg)
	case "anthropic", "anthropicchat":
		return newAnthropicProvider(cfg)
	case "anthropicstream":
		return newAnthropicStreamProvider(cfg)
	...
	}
}

Source: component/memql/si_providers.go

A key architectural decision: Google, Groq, xAI, and Mistral all reuse the OpenAI Go client. They are OpenAI-wire-compatible, so the only thing that differs is auth.baseURL and the API-key env var. Adding a new OpenAI-compatible vendor requires only a base .memql with the right baseURL and a case entry — no new Go client. Anthropic is the exception with its own SDK-backed client (it supports extended thinking and prompt caching natively).

#1.2 Modality and inference

Modality is set explicitly via @modality(...) or inferred from @type. ResolvedModality() maps types to one of:

const (
	ModalityText      ProviderModality = "text"
	ModalityTTS       ProviderModality = "tts"
	ModalitySTT       ProviderModality = "stt"
	ModalityEmbedding ProviderModality = "embedding"
	// declared so .memql files can register intent; handlers are placeholders today
	ModalityRealtime    ProviderModality = "realtime"
	ModalityAudio       ProviderModality = "audio"
	ModalityImage       ProviderModality = "image"
	ModalityVideo       ProviderModality = "video"
	ModalityComputerUse ProviderModality = "computerUse"
	ModalityModeration  ProviderModality = "moderation"
	ModalitySearch      ProviderModality = "search"
	ModalityResearch    ProviderModality = "research"
)

Source: component/memql/si_providers.go

The text, TTS, and embedding modalities have real Go clients in the registry. STT does not: its dispatch types (openaistt / openaiwhisper) route through newOpenAIPlaceholderProvider, with real transcription living outside the registry in the integrations/stt/ package (see §3.5). The remaining eight modalities (realtime, audio, image, video, computer-use, moderation, search, research) are likewise declared but stubbed — their .memql files register the model + auth so misconfigurations surface at startup, but invocation routes through newOpenAIPlaceholderProvider, which validates the API key at registration and returns an informative error on Call():

func (p *openAIPlaceholderProvider) Call(_ context.Context, _ string) (any, error) {
	return nil, fmt.Errorf(
		"provider %q (%s / model=%s) is declared but the Go client is not wired yet; "+
			"add a dispatch case in component/memql/si_providers.go:newSIProvider",
		p.name, p.capability, p.model,
	)
}

Source: component/memql/si_providers.go

The intent is that the .memql files stay untouched when a real client lands — you only swap the dispatch case.
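
The inference rule from §1.2 can be sketched as follows. This is a minimal illustration, not the body of ResolvedModality(): the function name inferModality and the suffix-matching strategy are assumptions, chosen only because the dispatch types shown earlier (openaitts, openaistt, openaiwhisper, openaiembedding) end in a modality-like suffix.

```go
package main

import (
	"fmt"
	"strings"
)

// ProviderModality mirrors the string-typed constant set shown above.
type ProviderModality string

// inferModality is a hypothetical stand-in for ResolvedModality(): an explicit
// @modality value wins; otherwise the modality is guessed from the @type suffix.
func inferModality(explicit, typ string) ProviderModality {
	if m := strings.TrimSpace(explicit); m != "" {
		return ProviderModality(m)
	}
	t := strings.ToLower(strings.TrimSpace(typ))
	switch {
	case strings.HasSuffix(t, "tts"):
		return "tts"
	case strings.HasSuffix(t, "stt"), strings.HasSuffix(t, "whisper"):
		return "stt"
	case strings.HasSuffix(t, "embedding"):
		return "embedding"
	default:
		return "text" // chat and stream types fall through to text
	}
}

func main() {
	fmt.Println(inferModality("", "OpenAITTS"))       // inferred from the type
	fmt.Println(inferModality("", "AnthropicStream")) // stream types are text
	fmt.Println(inferModality("image", "OpenAI"))     // explicit @modality wins
}
```

The point of the sketch is the precedence: an explicit annotation always overrides inference, so a stubbed modality such as image can share a base type with chat providers.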

#2. Auth resolution: env placeholders, concept storage, and the BYOK gap

Every base provider declares auth.apiKey env("MEMQL_SI_<VENDOR>_API_KEY"). The env(...) form lowers to a ${MEMQL_SI_..._API_KEY} placeholder, resolved by resolveAuthPlaceholders at registration in a three-layer order:

v1:platform:globalSecret — via the engine-wired systemSecretResolver.
v1:platform:globalVariable — for non-sensitive auth fields like baseURL.
OS env — the bootstrap-window fallback.

// Resolution order:
//  1. v1:platform:globalSecret    -- via systemSecretResolver ...
//  2. v1:platform:globalVariable  -- same, for non-sensitive auth fields like baseURL.
//  3. OS env                -- bootstrap-window fallback. Providers
//     load eagerly during engine init ...
// If all three layers miss, the provider fails to load with a message
// telling the operator how to seed it.

Source: component/memql/si_providers.go

The resolver also normalizes naming. A provider asking for MEMQL_SI_OPENAI_API_KEY will additionally accept the bare OPENAI_API_KEY form (the dev-manifest convention), trying both names against both concept stores and the env:

func authConceptLookupNames(envKey string) []string {
	const elidedPrefix = "MEMQL_SI_"
	if strings.HasPrefix(envKey, elidedPrefix) {
		return []string{envKey, strings.TrimPrefix(envKey, elidedPrefix)}
	}
	return []string{envKey}
}

Source: component/memql/si_providers.go

When everything misses, the provider fails to load with an actionable message pointing at make secret-set NAME=... VALUE=... SCOPE=global.
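
A minimal sketch of the layered lookup, under stated assumptions: the layer type, resolve, and fromMap are hypothetical names, the three stores are modeled as plain maps, and iterating layer by layer (rather than name by name) is a guess at the ordering. Only lookupNames reproduces source logic (authConceptLookupNames).

```go
package main

import (
	"fmt"
	"strings"
)

// lookupNames mirrors authConceptLookupNames from the source.
func lookupNames(envKey string) []string {
	const elidedPrefix = "MEMQL_SI_"
	if strings.HasPrefix(envKey, elidedPrefix) {
		return []string{envKey, strings.TrimPrefix(envKey, elidedPrefix)}
	}
	return []string{envKey}
}

// layer is a hypothetical lookup function; ok=false means the layer missed.
type layer func(name string) (value string, ok bool)

// resolve walks the layers in order and, within each layer, tries every
// accepted name. Layer-major iteration is an assumption of this sketch.
func resolve(envKey string, layers ...layer) (string, error) {
	for _, l := range layers {
		for _, name := range lookupNames(envKey) {
			if v, ok := l(name); ok && v != "" {
				return v, nil
			}
		}
	}
	return "", fmt.Errorf("auth placeholder %s is unresolved in every layer", envKey)
}

// fromMap adapts a map to a layer so the example stays deterministic.
func fromMap(m map[string]string) layer {
	return func(name string) (string, bool) {
		v, ok := m[name]
		return v, ok
	}
}

func main() {
	globalSecret := fromMap(map[string]string{"OPENAI_API_KEY": "sk-from-secret"})
	globalVariable := fromMap(map[string]string{})
	osEnv := fromMap(map[string]string{"MEMQL_SI_OPENAI_API_KEY": "sk-from-env"})

	v, err := resolve("MEMQL_SI_OPENAI_API_KEY", globalSecret, globalVariable, osEnv)
	fmt.Println(v, err) // the secret layer wins, matched under the bare name
}
```

Here the bare OPENAI_API_KEY row in the first layer satisfies a request for MEMQL_SI_OPENAI_API_KEY before the OS env is consulted, which is the normalization behavior described above.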

#2.1 The BYOK runtime-activation gap

memQL has a BYOK (bring-your-own-key) surface: the router integration's setApiKey capability encrypts a plaintext vendor key and writes it as a v1:platform:partitionSecret row keyed by <VENDOR>_API_KEY, so a tenant can override the instance default per partition without plaintext ever persisting:

func secretNameForVendor(vendor string) string {
	return strings.ToUpper(strings.TrimSpace(vendor)) + "_API_KEY"
}

Source: integrations/router/integration.go

However, providers resolve their auth eagerly, once, at engine startup (resolveAuthPlaceholders runs during registration). The resolver reads v1:platform:globalSecret / globalVariable and OS env — but the registered provider client captures the resolved key in a closure and is not re-resolved per request. Two consequences follow:

A BYOK key written via setApiKey after boot (to v1:platform:partitionSecret) does not retroactively re-key already-registered providers. The code comment for the OS-env fallback names the same root cause: "Retiring this requires either lazy/per-request provider auth resolution or a post-seed engine reload; tracked as future work." There is a ReloadSIProviders engine method (component/memql/engine_si.go) that re-runs registration, but auth resolution is not threaded per-partition at call time.
setApiKey writes a partition-scoped secret (v1:platform:partitionSecret), but the startup resolver reads the global stores (globalSecret / globalVariable) and OS env. The per-partition BYOK secret is therefore not on the provider auth-resolution path at registration.

Net: env-var (or globalSecret) startup resolution is the live path; partition-scoped BYOK secrets are written and encrypted but their runtime activation against the provider clients is the gap.
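
The timing half of the gap can be reproduced in a few lines. This is an illustrative sketch only: secretStore, newEagerClient, and newLazyClient are hypothetical names, and the store is a plain map rather than the concept stores. It does not model the second consequence (partition-scoped versus global stores).

```go
package main

import "fmt"

// secretStore is a hypothetical mutable store; it is not memQL's API.
type secretStore map[string]string

// newEagerClient resolves the key once and captures it in a closure,
// which models resolution at registration time.
func newEagerClient(store secretStore, name string) func() string {
	key := store[name] // resolved once, at "startup"
	return func() string { return key }
}

// newLazyClient re-reads the store on every call, which models the
// lazy/per-request resolution the code comment lists as future work.
func newLazyClient(store secretStore, name string) func() string {
	return func() string { return store[name] }
}

func main() {
	store := secretStore{"ANTHROPIC_API_KEY": "instance-key"}
	eager := newEagerClient(store, "ANTHROPIC_API_KEY")
	lazy := newLazyClient(store, "ANTHROPIC_API_KEY")

	store["ANTHROPIC_API_KEY"] = "tenant-key" // a key written after boot

	fmt.Println(eager()) // instance-key
	fmt.Println(lazy())  // tenant-key
}
```

The eager client keeps serving the startup value until something re-runs registration, which is the role ReloadSIProviders plays in the real engine.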

#3. The full provider catalog

Every concrete provider below is registered at startup (subject to its auth resolving). Pricing is USD per million tokens, read from params (inputCostPerMillion / outputCostPerMillion / cachedInputCostPerMillion). Values left at zero mean "not configured," which callers treat as unknown cost, not free (Pricing.Configured()).

#3.1 Text chat — non-streaming (OpenAI client path)

Provider	Model	Ctx window	Max tokens	In / Out / Cached ($/M)	Notes
`chat53Latest`	`gpt-5.3-chat-latest`	128000	4096	—	Auto-updated alias for dev/preview
`chat54`	`gpt-5.4`	128000	4096	2.50 / 10.00 / 1.25	Flagship standard tier
`chat54Mini`	`gpt-5.4-mini`	128000	16384	0.15 / 0.60 / 0.075	Balanced; max raised to 16384 for structured seed work
`chat54Nano`	`gpt-5.4-nano`	32000	2048	0.10 / 0.40 / 0.05	Cheapest high-volume
`chat54Pro`	`gpt-5.4` (aliased)	256000	8192	5.00 / 20.00 / 2.50	Aliases to `gpt-5.4`; `gpt-5.4-pro` is responses-API-only

#3.2 Text chat — non-streaming (Anthropic client path)

Provider	Model	Ctx	Max	In / Out / Cached ($/M)
`claudeHaiku`	`claude-haiku-4-5-20251001`	200000	4096	0.80 / 4.00 / 0.08
`claudeSonnet`	`claude-sonnet-4-6`	200000	4096	3.00 / 15.00 / 0.30
`claudeOpus`	`claude-opus-4-6`	200000	4096	15.00 / 75.00 / 1.50

#3.3 Text chat — streaming (the `@default` lives here)

Streaming providers carry @type("...Stream") and back the agent tool-loop and voice paths. stream54 carries @default, so any caller that does not pick a provider lands there.

Provider	Model	Ctx	Max	In / Out / Cached ($/M)	Notes
`stream53Latest`	`gpt-5.3-chat-latest`	128000	4096	—	Dev alias
`stream54`	`gpt-5.4`	128000	4096	2.50 / 10.00 / 1.25	`@default`
`stream54Mini`	`gpt-5.4-mini`	128000	4096	0.15 / 0.60 / 0.075
`stream54Nano`	`gpt-5.4-nano`	32000	2048	0.10 / 0.40 / 0.05
`stream54Pro`	`gpt-5.4` (aliased)	256000	8192	5.00 / 20.00 / 2.50	Aliases to `gpt-5.4`; used as `strongReasoning` fallback
`streamClaudeHaiku`	`claude-haiku-4-5-20251001`	200000	4096	0.80 / 4.00 / 0.08
`streamClaudeSonnet`	`claude-sonnet-4-6`	200000	4096	3.00 / 15.00 / 0.30	`enablePromptCache=true`
`streamClaudeOpus`	`claude-opus-4-6`	200000	4096	15.00 / 75.00 / 1.50
`reasoningClaudeOpus`	`claude-opus-4-6`	200000	16384	15.00 / 75.00 / 1.50	`thinkingBudgetTokens=8192` (extended thinking)
`streamGeminiFlash`	`gemini-2.0-flash`	1000000	8192	0.075 / 0.30 / 0.01875	Google, OpenAI-compatible
`streamGeminiPro`	`gemini-2.0-pro`	2000000	8192	1.25 / 5.00 / 0.3125	Google
`streamGrok2`	`grok-2-latest`	131072	8192	2.00 / 10.00 / 0	xAI
`streamGroqLlama70B`	`llama-3.3-70b-versatile`	128000	8192	0.59 / 0.79 / 0	Groq; sub-300ms TTFT
`streamMistralLarge`	`mistral-large-latest`	128000	8192	2.00 / 6.00 / 0	Mistral
`streamCodestral`	`codestral-latest`	256000	8192	0.20 / 0.60 / 0	Mistral coding model
`streamCodex51Max`	`gpt-5.1-codex-max`	400000	16384	—	Heavy long-context coding
`streamCodex53`	`gpt-5.3-codex`	256000	8192	—	Coding-specialized
`streamReasoning4Mini`	`o4-mini`	200000	16384	—	o-series reasoning

Non-streaming sibling chat/coding providers also exist for the synchronous path: codestral, mistralLarge, geminiFlash, geminiPro, grok2, groqLlama70B.

#3.4 Embeddings (real client — pgvector-backed)

Embeddings are a real registry path. The embedding integration writes to the node_vectors and embedding_cache PostgreSQL tables via pgvector and serves similarity search by cosine distance. The default embedding provider is embedding3Small.

Provider	Model	Dimensions	Notes
`embedding3Small`	`text-embedding-3-small`	1536	Default for high-volume retrieval
`embedding3Large`	`text-embedding-3-large`	3072	High-fidelity

// Format embedding as pgvector literal: [0.1,0.2,...]
...
providerName = "embedding3Small"

Source: integrations/embedding/embedding.go

The pgvector schema is provisioned by migrations 20260325000001_enable_pgvector.up.sql and 20260325000002_vector_tables.up.sql (Source: component/database/memory-nodes/migrations/). The similarity and embedding integrations expose DSL capabilities (embed, similarity search by cosine distance).

#3.5 TTS (real client) and STT (placeholder in the registry)

TTS is a real provider-registry path: openaitts dispatches to newOpenAITTSProvider. STT is not. The whisper1 and transcribeDiarize providers declare their model + auth, but both the openaistt and openaiwhisper dispatch types route through newOpenAIPlaceholderProvider — so a registry-path call returns the "not wired" error like the §3.6 stubs. Real transcription lives in the separate integrations/stt/ package (openai_whisper.go, deepgram.go, polyphon_session.go), which is not driven by the provider registry. MEMQL_WHISPER_MODEL (default whisper-1) is read there, not by the registry provider.

Provider	Model	Modality	Notes
`tts4oMini`	`gpt-4o-mini-tts`	tts	Newest natural-sounding TTS (real client)
`tts1Hd`	`tts-1-hd`	tts	Reliable HD fallback (real client)
`whisper1`	`whisper-1`	stt	Registry placeholder; real STT in `integrations/stt/openai_whisper.go`
`transcribeDiarize`	`gpt-4o-transcribe-diarize`	stt	Registry placeholder; speaker diarization

#3.6 Declared-but-stubbed providers (placeholder clients)

These register their model + auth at startup but return a "not wired" error on invocation:

Provider	Model	Modality
`realtime15` / `realtimeMini`	`gpt-realtime-1.5` / `gpt-realtime-mini`	realtime
`audio15` / `audioMini`	`gpt-audio-1.5` / `gpt-audio-mini`	audio
`image15` / `image1Mini`	`gpt-image-1.5` / `gpt-image-1-mini`	image
`sora2` / `sora2Pro`	`sora-2` / `sora-2-pro`	video
`computerUse`	`computer-use-preview`	computerUse
`moderationOmni`	`omni-moderation-latest`	moderation
`search5`	`gpt-5-search-api`	search
`research3` / `research4Mini`	`o3-deep-research` / `o4-mini-deep-research`	research

#4. Provider-specific runtime behaviors

#4.1 OpenAI project header

For OpenAI providers, the client injects an OpenAI-Project header when a project id is present. It is read from auth.projectId (a .memql override) or MEMQL_SI_OPENAI_PROJECT_ID from env. Service-account keys (sk-svcacct-*) carry no project id, so the header is simply omitted — which is why MEMQL_SI_OPENAI_PROJECT_ID is intentionally not declared as an env() placeholder (an unset placeholder fails registration).

#4.2 Anthropic extended thinking (`reasoningClaudeOpus`)

The reasoningClaudeOpus provider is the same model as streamClaudeOpus plus a thinkingBudgetTokens 8192 param. The Anthropic stream client reads it and injects a Thinking config into every request:

func (p *anthropicStreamProvider) anthropicThinking() anthropic.ThinkingConfigParamUnion {
	budget, ok := intParam(p.params["thinkingBudgetTokens"])
	...
	return anthropic.ThinkingConfigParamOfEnabled(int64(budget))
}

Source: component/memql/si_providers.go

When thinking is enabled the client suppresses the temperature param (the Anthropic API requires temperature==1 under extended thinking), so callers do not need to coordinate:

if _, ok := p.params["temperature"]; ok && !p.anthropicThinkingEnabled() {
	// ... set temperature
}

Source: component/memql/si_providers.go

Two constraints are baked into the provider: maxTokens is raised to 16384 because the budget must be < maxTokens, and the budget of 8192 is a mid-tier value. The provider is reserved for reasoning-heavy workloads (planner, agent factory, training-plan composition) where 5–30s of added latency is acceptable.
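
The budget constraint is easy to check mechanically. The sketch below is a hypothetical validation helper (validateThinking is not a memQL function) that encodes only the rule stated above, budget < maxTokens, using the values from the catalog.

```go
package main

import "fmt"

// validateThinking is a hypothetical check: it encodes the rule that the
// thinking budget must stay strictly below maxTokens.
func validateThinking(maxTokens, budget int) error {
	if budget <= 0 {
		return nil // thinking disabled, nothing to check
	}
	if budget >= maxTokens {
		return fmt.Errorf("thinkingBudgetTokens %d must be < maxTokens %d", budget, maxTokens)
	}
	return nil
}

func main() {
	fmt.Println(validateThinking(16384, 8192)) // reasoningClaudeOpus values: <nil>
	fmt.Println(validateThinking(4096, 8192))  // same budget under the 4096 cap: error
}
```

The second call shows why the provider could not simply reuse the 4096 cap of streamClaudeOpus: an 8192-token budget would not fit under it.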

#4.3 Anthropic prompt caching (`enablePromptCache`)

streamClaudeSonnet sets enablePromptCache "true". The Anthropic client marks the prefix (system blocks) with cache_control=ephemeral, enabling Anthropic's prompt caching for the cached-input price tier (cachedInputCostPerMillion 0.30 vs. 3.00 regular input).

#4.4 Pricing math

Pricing supports a split between cached and regular input tokens:

func (p Pricing) CostFor(inputTokens, outputTokens, cachedInputTokens int) (input, output, cachedInput, total float64) {
	regularInput := inputTokens - cachedInputTokens
	...
	input = p.InputPerMillion * float64(regularInput) / 1_000_000
	cachedInput = p.CachedInputPerMillion * float64(cachedInputTokens) / 1_000_000
	output = p.OutputPerMillion * float64(outputTokens) / 1_000_000
	total = input + output + cachedInput
	return
}

Source: component/memql/si_providers.go

This breakdown is what the SI Router writes per call to the v1:router:call usage ledger.
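
A worked example of the split, using the streamClaudeSonnet rates from §3.3. The arithmetic follows the excerpt above; the clamp on regularInput is an assumption standing in for the elided lines, and the token counts are made up for illustration.

```go
package main

import "fmt"

// Pricing holds USD-per-million-token rates, as in the params block.
type Pricing struct {
	InputPerMillion       float64
	OutputPerMillion      float64
	CachedInputPerMillion float64
}

// CostFor follows the split shown above. Clamping regularInput at zero is
// an assumption of this sketch, not confirmed source behavior.
func (p Pricing) CostFor(inputTokens, outputTokens, cachedInputTokens int) (input, output, cachedInput, total float64) {
	regularInput := inputTokens - cachedInputTokens
	if regularInput < 0 {
		regularInput = 0
	}
	input = p.InputPerMillion * float64(regularInput) / 1_000_000
	cachedInput = p.CachedInputPerMillion * float64(cachedInputTokens) / 1_000_000
	output = p.OutputPerMillion * float64(outputTokens) / 1_000_000
	total = input + output + cachedInput
	return
}

func main() {
	sonnet := Pricing{InputPerMillion: 3.00, OutputPerMillion: 15.00, CachedInputPerMillion: 0.30}

	// 10,000 input tokens of which 4,000 were cache hits, plus 1,000 output tokens.
	in, out, cached, total := sonnet.CostFor(10000, 1000, 4000)
	fmt.Printf("input=%.4f output=%.4f cached=%.4f total=%.4f\n", in, out, cached, total)
	// input=0.0180 output=0.0150 cached=0.0012 total=0.0342
}
```

Note that inputTokens is treated as including the cached tokens, so the 4,000 cache hits are billed once at the cached rate and not again at the regular rate.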

#5. Routing policies

A routing policy names a primary provider plus an ordered fallback chain and latency targets. Policies are authored in dsl/policies/policies.memql and loaded into a PolicyRegistry. The complete set:

memql

@primary("streamClaudeSonnet")
@fallback("stream54Pro")
@fallback("streamGeminiPro")
@maxLatencyMs(60000)
@preferredRole("assistant")
@preferredRole("specialist")
@description("Default chat policy for non-operator agents. Claude Sonnet 4.6 primary, GPT-5.4 Pro as cross-vendor fallback, Gemini Pro as tertiary safety net. ... the floor for agent replies is now Sonnet-class regardless of role. ...")
policy balancedChat { }
 
@primary("streamGeminiFlash")
@fallback("streamGroqLlama70B")
@fallback("streamCodestral")
@maxLatencyMs(15000)
@description("Cheapest capable -- bulk suggestion + classification work where cost per call matters more than model ceiling. ...")
policy cheapestCapable { }
 
@primary("streamCodestral")
@fallback("streamGroqLlama70B")
@fallback("stream54Mini")
@maxLatencyMs(20000)
@description("Fast coding -- cheap + quick for code generation and refactor assistance. ...")
policy fastCoding { }
 
@primary("streamGroqLlama70B")
@fallback("streamGeminiFlash")
@fallback("stream54Mini")
@maxTimeToFirstTokenMs(800)
@maxLatencyMs(10000)
@description("Low latency voice -- turn-taking in multi-party voice conversations. Groq is the best-in-class for first-token latency ...")
policy lowLatencyVoice { }
 
@primary("streamClaudeSonnet")
@fallback("stream54Pro")
@fallback("streamGeminiPro")
@maxLatencyMs(60000)
@preferredRole("operator")
@description("Strong reasoning -- for operator-enabled agents driving the UI and for complex tool-calling choreography. ...")
policy strongReasoning { }

Source: dsl/policies/policies.memql

#5.1 Policy annotations

Annotation	Meaning
`@primary("name")`	First provider tried. Required — load fails if empty.
`@fallback("name")`	Appended to the chain in declaration order; tried in turn on pre-flight error.
`@maxLatencyMs(n)`	Total-call latency target (ms).
`@maxTimeToFirstTokenMs(n)`	TTFT target (ms); only `lowLatencyVoice` sets it (800ms).
`@preferredRole("role")`	Registers this policy as the default for an agent role. First `@preferredRole` claim wins per role.
`@description(...)`	Human-readable rationale; surfaced on the `/router/policies` admin page.

The parsed PolicyConfig exposes a ProviderChain() that flattens primary + fallbacks:

func (p PolicyConfig) ProviderChain() []string {
	chain := make([]string, 0, len(p.Fallbacks)+1)
	if strings.TrimSpace(p.Primary) != "" {
		chain = append(chain, p.Primary)
	}
	for _, f := range p.Fallbacks { ... }
	...
}

Source: component/memql/policy_parser.go

The fallback chains are deliberately cross-vendor: balancedChat and strongReasoning go Anthropic → OpenAI → Google, so a single vendor being out of credits / rate-limited / down does not take down the agent tool loop. Note the comment on stream54Pro — it aliases to gpt-5.4 precisely because gpt-5.4-pro rejects v1/chat/completions with a 404, which would silently kill the fallback.

#5.2 `@preferredRole` → role-default policy

The registry indexes policies by role (byRole), and DefaultForRole(role) answers "which policy should an agent of this role use when it hasn't pinned one?":

func (r *PolicyRegistry) DefaultForRole(role string) string {
	...
	return r.byRole[key]
}

Source: component/memql/policy_loader.go
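
The "first @preferredRole claim wins" rule from §5.1 can be sketched with a plain map. roleIndex, claim, and defaultForRole are hypothetical names, and lowercasing the key is an assumption about the elided normalization in DefaultForRole.

```go
package main

import (
	"fmt"
	"strings"
)

// roleIndex is a hypothetical reduction of PolicyRegistry to its byRole map.
type roleIndex struct {
	byRole map[string]string
}

// claim registers policy as the default for role unless the role is taken.
// Lowercasing the key is an assumption of this sketch.
func (r *roleIndex) claim(role, policy string) {
	key := strings.ToLower(strings.TrimSpace(role))
	if key == "" {
		return
	}
	if _, taken := r.byRole[key]; !taken {
		r.byRole[key] = policy
	}
}

// defaultForRole returns "" when no policy has claimed the role.
func (r *roleIndex) defaultForRole(role string) string {
	return r.byRole[strings.ToLower(strings.TrimSpace(role))]
}

func main() {
	idx := &roleIndex{byRole: map[string]string{}}
	idx.claim("assistant", "balancedChat")
	idx.claim("specialist", "balancedChat")
	idx.claim("operator", "strongReasoning")
	idx.claim("operator", "balancedChat") // hypothetical later claim, ignored

	fmt.Println(idx.defaultForRole("operator"))      // strongReasoning
	fmt.Println(idx.defaultForRole("unknown") == "") // true
}
```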

In practice the agent replier hardcodes the role→policy mapping rather than calling DefaultForRole:

policyName := agentPolicyName
if policyName == "" {
	switch {
	case operatorEnabled:
		policyName = "strongReasoning"
	case voiceMode:
		policyName = "lowLatencyVoice"
	default:
		policyName = "balancedChat"
	}
}

Source: integrations/agent/replier.go

#6. The SI Router: resolution + fallback

The Router (component/router) is an in-process component embedded in every node that makes SI calls. It resolves a ResolveRequest to a provider chain, wraps the chosen client with an observer that writes one v1:router:call ledger row per call (tokens, cost, latency, outcome), and hands the wrapped provider back.

#6.1 Selection precedence

// Precedence:
//  1. ExplicitProvider  -- single-entry chain
//  2. PolicyName        -- policy's primary + fallbacks
//  3. DefaultProvider   -- single-entry chain
//  4. Registry default  -- single-entry chain (last resort)
func (r *Router) resolveChain(req ResolveRequest, mod providerModality) ([]string, Resolved, error) {
	switch {
	case strings.TrimSpace(req.ExplicitProvider) != "":
		chain = []string{strings.TrimSpace(req.ExplicitProvider)}
	case strings.TrimSpace(req.PolicyName) != "" && r.policies != nil:
		if policy, ok := r.policies.Lookup(req.PolicyName); ok {
			chain = policy.ProviderChain()
			policyName = policy.Name
		}
	}
	if len(chain) == 0 {
		if def := strings.TrimSpace(req.DefaultProvider); def != "" {
			chain = []string{def}
		} else if d := r.providers.Default(); d != "" {
			chain = []string{d}
		}
	}
	...
}

Source: component/router/router.go

After building the chain, resolveChain walks it and stamps the first available entry that implements the requested modality interface (ChatStreamWithToolsProvider for streaming-with-tools, ChatSIProvider for non-streaming) as the primary Resolved.

The agent replier composes the precedence as: per-turn hint → agent's stored explicit provider → agent's stored model (if it matches a registry name) → agent's stored policyName → role-based default policy → deploy-time env default (OPERATOR_AGENT_PROVIDER / DEFAULT_AGENT_PROVIDER) → registry global default.

#6.2 Fallback semantics

The router returns a wrapper (fallbackStreamWithTools / fallbackChat) that walks the chain. The fallback rule is pre-flight only:

Pre-flight error on a chain entry → record outcome="fallback_used" and advance to the next entry. Mid-stream errors are NOT auto-fallback'd — they're recorded as outcome="error" by the observer, because mid-stream retry would require replaying partial output. Source: component/router/fallback.go
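
A minimal sketch of the pre-flight walk, assuming a much simpler shape than fallbackStreamWithTools / fallbackChat: each provider is reduced to a function that either fails before producing output or succeeds. The attempt type, the helper names, and the "ok" outcome label are hypothetical; only "fallback_used" comes from the text above.

```go
package main

import (
	"errors"
	"fmt"
)

// attempt is a hypothetical provider call: a non-nil error means the call
// failed pre-flight, before any output was produced.
type attempt func() error

func failing(msg string) attempt { return func() error { return errors.New(msg) } }
func succeeding() attempt        { return func() error { return nil } }

// callWithFallback walks the chain in order and records one outcome per
// attempt. It stops at the first entry that gets past pre-flight.
func callWithFallback(chain []string, providers map[string]attempt) (string, []string, error) {
	var outcomes []string
	var lastErr error
	for _, name := range chain {
		call, ok := providers[name]
		if !ok {
			lastErr = fmt.Errorf("provider %q is not registered", name)
		} else {
			lastErr = call()
		}
		if lastErr == nil {
			outcomes = append(outcomes, name+":ok")
			return name, outcomes, nil
		}
		outcomes = append(outcomes, name+":fallback_used")
	}
	return "", outcomes, fmt.Errorf("no provider in chain succeeded (last error: %v)", lastErr)
}

func main() {
	providers := map[string]attempt{
		"streamClaudeSonnet": failing("429 rate limited"),
		"stream54Pro":        succeeding(),
		"streamGeminiPro":    succeeding(),
	}
	chain := []string{"streamClaudeSonnet", "stream54Pro", "streamGeminiPro"}

	used, outcomes, err := callWithFallback(chain, providers)
	fmt.Println(used, outcomes, err)
	// stream54Pro [streamClaudeSonnet:fallback_used stream54Pro:ok] <nil>
}
```

Once an attempt returns nil the walk ends, so an error that surfaces later in the stream is outside this loop entirely, which matches the rule that mid-stream errors are recorded but never retried.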

Each failed pre-flight attempt writes a fallback_used ledger row attributed to the failed provider (with FallbackFromModel set). The ledger write is fire-and-forget on a detached context with a synthetic system:router actor, so observability never blocks or interrupts the SI reply:

func (r *Router) recordCall(rec CallRecord) {
	go func() {
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		defer cancel()
		ctx = auth.ContextWithToken(ctx, &auth.TokenInfo{ Subject: "system:router", ... })
		...
		query := fmt.Sprintf("mutationRecordRouterCall(%s)", string(payload))
		if _, err := r.engine.Execute(ctx, query); err != nil { ... }
	}()
}

Source: component/router/router.go

Dropped writes are counted (RecordsDropped()), not retried. Note from types.go: Phase-1 token counts are heuristic (char-count / 4) with tokensEstimated=true; real vendor usage is threaded through in a later phase.
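
For a sense of scale, the Phase-1 heuristic amounts to the following. estimateTokens is a hypothetical name, and counting runes (rather than bytes) and rounding up are assumptions; the source only states "char-count / 4".

```go
package main

import "fmt"

// estimateTokens applies the char-count / 4 heuristic. Counting runes and
// rounding up are assumptions of this sketch.
func estimateTokens(text string) int {
	chars := len([]rune(text))
	return (chars + 3) / 4
}

func main() {
	fmt.Println(estimateTokens(""))      // 0
	fmt.Println(estimateTokens("abcd"))  // 1
	fmt.Println(estimateTokens("abcde")) // 2
}
```

Because the estimate ignores the vendor's tokenizer, ledger rows carrying tokensEstimated=true are best read as approximate cost, not billing-grade figures.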

#6.3 Registry default selection

The registry default is chosen with this precedence at registration:

// 1. MEMQL_DEFAULT_CHAT_PROVIDER env var (defaultPinned=true): never overridden.
// 2. A provider marked @default: always wins over a prior first-wins fallback.
// 3. First available registered provider: only used when nothing else applies.

Source: component/memql/si_providers.go

In short: an env pin comes first (MEMQL_DEFAULT_PROVIDER, read by loadSIProviders, or MEMQL_DEFAULT_CHAT_PROVIDER), then the @default annotation on stream54, then the first available provider.
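
The three rules can be sketched as a small state machine. The defaults type and its methods are hypothetical; the real registry tracks this state differently, and the behavior when two providers both carry @default (last one wins here) is an assumption.

```go
package main

import "fmt"

// defaults is a hypothetical reduction of the registry's default tracking.
type defaults struct {
	name   string
	pinned bool // set from the env var; never overridden afterwards
}

// pin records the env-provided default, if any.
func (d *defaults) pin(name string) {
	if name != "" {
		d.name, d.pinned = name, true
	}
}

// register applies the three rules from the comment above to one provider.
func (d *defaults) register(name string, markedDefault bool) {
	switch {
	case d.pinned:
		return // rule 1: an env pin is never overridden
	case markedDefault:
		d.name = name // rule 2: @default beats a prior first-wins choice
	case d.name == "":
		d.name = name // rule 3: first available provider
	}
}

func main() {
	d := &defaults{}
	d.register("chat53Latest", false) // provisional default
	d.register("stream54", true)      // @default replaces it
	d.register("stream54Mini", false) // no effect
	fmt.Println(d.name)               // stream54

	p := &defaults{}
	p.pin("streamClaudeSonnet") // value taken from the env var
	p.register("stream54", true)
	fmt.Println(p.name) // streamClaudeSonnet
}
```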

#7. How prompts bind to providers

Prompts (struct-form .memql declarations) declare their model with @defaultProvider("name") and their template with @templateFile("..."). Example:

memql

@defaultProvider("chat54Mini")
@templateFile("prompts/cognitionRouting.tmpl")
@description("Decide whether any AI agent should respond to a human utterance ... Sub-500ms fast-model target.")
prompt cognitionRouting {
  agents      []object  @required @description("Available AI agents ...")
  utterance   string    @required @description("The latest human message to route")
  ...
}

Source: dsl/cognition/prompts.memql

The observed @defaultProvider distribution across all prompts:

Provider	Count	Representative prompts
`chat54Mini`	13	cognition routing/reply/triage/dispatch, askSpecialist, seedDomain*, consolidateMemory, decomposeGoal
`chat54Nano`	1	docSummary
`streamClaudeSonnet`	2	plannerAgent, reactiveConductor
`streamClaudeOpus`	1	trainerAgent
`strongReasoning`	1	agentFactoryAnalyze

#7.1 The `si()` path resolves the prompt provider as a registry name

When a prompt is invoked via si() / InvokeSIStructured, the provider name is resolved from the prompt's @defaultProvider (an explicit override wins; then the prompt's default; then the registry default):

func (r *siRuntime) resolveProviderName(prompt *PromptTemplate, invocation *SIInvocation) (string, error) {
	if invocation != nil && invocation.ProviderOverride != nil {
		if trimmed := strings.TrimSpace(*invocation.ProviderOverride); trimmed != "" {
			return trimmed, nil
		}
	}
	if prompt != nil && strings.TrimSpace(prompt.DefaultProvider) != "" {
		return strings.TrimSpace(prompt.DefaultProvider), nil
	}
	if defaultName := r.providers.Default(); strings.TrimSpace(defaultName) != "" {
		return strings.TrimSpace(defaultName), nil
	}
	...
}

Source: component/memql/si_runtime.go

The resolved name is then looked up directly in the registry as a provider entry (r.providers.Entry(providerName)) — not as a policy. Structured invocations go through StructuredChatProviderByName(providerName), with graceful degradation to the default structured-capable provider and finally to the default chat provider with in-prompt schema instructions:

if structured := e.StructuredChatProviderByName(providerName); structured != nil {
	result, err = structured.CallChatStructured(ctx, messages, spec)
} else if structured := e.StructuredChatProvider(); structured != nil {
	result, err = structured.CallChatStructured(ctx, messages, spec)
} else {
	chat := e.DefaultChatProvider()
	... // schema as in-prompt instructions
}

Source: component/memql/engine_si.go

#7.2 Inconsistency: `agentFactoryAnalyze` names a policy where a provider is expected

agentFactoryAnalyze declares @defaultProvider("strongReasoning"), but strongReasoning is a policy, not a provider (grep -c "provider strongReasoning" dsl/providers/providers.memql → 0). The si()/InvokeSIStructured path resolves @defaultProvider as a registry provider name, not a policy. StructuredChatProviderByName("strongReasoning") therefore returns nil, and the call silently falls back to the default structured-capable provider (per the chain in §7.1). So agentFactoryAnalyze does not actually run on the Sonnet→Pro→Gemini chain its name implies — it lands on whatever StructuredChatProvider() picks (preference order chat54, chat54Mini, chat54Nano, chat54Pro, chat53Latest). This is a benign-but-misleading misconfiguration: the prompt path and the router policy path are two different resolution systems, and a policy name leaked into a provider slot. The router's policy chain only applies on the agent-reply path (ResolveStreamWithTools with PolicyName), not on the si() prompt path.
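
A load-time check could catch this class of mistake. The sketch below is a hypothetical lint, not an existing memQL feature: crossedNames takes plain maps in place of the real prompt, provider, and policy registries, and flags any prompt whose @defaultProvider is unknown as a provider but known as a policy.

```go
package main

import (
	"fmt"
	"sort"
)

// crossedNames returns "prompt -> target" entries where the target is not a
// registered provider but does match a policy name. Inputs are plain maps;
// the real registries expose different APIs.
func crossedNames(promptDefaults map[string]string, providers, policies map[string]bool) []string {
	var bad []string
	for prompt, target := range promptDefaults {
		if !providers[target] && policies[target] {
			bad = append(bad, fmt.Sprintf("%s -> %s", prompt, target))
		}
	}
	sort.Strings(bad) // map iteration order is random; sort for stable output
	return bad
}

func main() {
	prompts := map[string]string{
		"cognitionRouting":    "chat54Mini",
		"plannerAgent":        "streamClaudeSonnet",
		"agentFactoryAnalyze": "strongReasoning",
	}
	providers := map[string]bool{"chat54Mini": true, "streamClaudeSonnet": true}
	policies := map[string]bool{"strongReasoning": true, "balancedChat": true}

	fmt.Println(crossedNames(prompts, providers, policies))
	// [agentFactoryAnalyze -> strongReasoning]
}
```

Failing or warning at startup on a non-empty result would turn the silent degradation into a visible configuration error.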

#7.3 Two SI call styles

Structured-output prompts (routing, suggest, classification, factory/intake decisions) use CallChatStructured with a JSON schema — OpenAI's json_schema response format with Strict constrained decoding when available. On empty content the OpenAI client surfaces the real cause (finish_reason=length, content_filter, refusal) instead of bubbling a confusing "unexpected end of JSON input."
Prose prompts (agent replies to users) use regular streaming chat through the router's ResolveStreamWithTools path, where policies and fallback chains apply.

#8. Admin surface

The router integration exposes three DSL-callable capabilities backing the CoPresent /router/* admin pages (Source: integrations/router/integration.go):

setApiKey — encrypt + persist a BYOK vendor key as v1:platform:partitionSecret (see §2.1 for the activation gap).
listModels — flatten every registered provider (text, TTS, STT, embedding modalities) with vendor, model, pricing, context window, and availability into v1:router:modelcatalog rows.
listPolicies — dump every routing policy with its ProviderChain(), latency targets, and preferred roles into v1:router:policycatalog rows.

#9. Summary of resolution paths

Caller	Resolution mechanism	Policies apply?	Fallback chain?
Agent reply (prose, tools)	Router `ResolveStreamWithTools` with `PolicyName`	Yes	Yes (pre-flight)
`si()` / `InvokeSIStructured` prompt	`@defaultProvider` resolved as registry provider name	No	Graceful provider-class fallback (not policy chain)
Embeddings	`EmbeddingProvider(name)`, default `embedding3Small`	No	No
TTS	`TTSProviderByName` / modality scan (real registry client)	No	No
STT	Not a registry path — `integrations/stt/` package (`openai_whisper.go` etc.); the `whisper1` registry provider is a placeholder	No	No
Voice turn	Router with `lowLatencyVoice` policy (via `voice_mode` hint)	Yes	Yes

The two systems to keep distinct: provider names (registry entries, what a .memql provider declares and what si() resolves) versus policy names (chains in dsl/policies/policies.memql, consumed only by the router's reply/voice path). The agentFactoryAnalyze case in §7.2 is exactly the bug you get when those two namespaces are crossed.
