Core component · 01-02

Router & gating

The router scores how well each expert fits a task; the gate turns those scores into a concrete plan: who runs, how many, and with what weight. Together they keep execution sparse instead of broadcasting every task to every agent.

Router design

A router maps a TaskSignature to a list[Score]: one affinity logit per registered expert, plus an eligibility flag from the capability mask. The router only scores; it never decides who actually runs. That separation lets you tune scoring and selection independently.

The default learned-softmax strategy scores each expert by the cosine affinity between the task embedding and the expert's capability vector, plus 0.15 times the expert's rolling success rate. Ineligible experts are forced to -inf before normalization, so they can never receive weight.

router.py

```python
# Router: TaskSignature -> one Score per registered expert.
from dataclasses import dataclass
from typing import Protocol
import math

@dataclass(frozen=True)
class Score:
    expert_id: str
    logit: float          # raw affinity before normalization
    eligible: bool        # passes the capability mask?

class RoutingStrategy(Protocol):
    def score(self, sig: "TaskSignature", experts: list["Expert"]) -> list[Score]:
        ...

@dataclass
class LearnedSoftmaxRouter:
    """Scores experts by cosine affinity and masks the ineligible with -inf.
    The router only scores; softmax normalization happens in the gate."""
    temperature: float = 0.7  # softmax temperature; not applied inside score()

    def score(self, sig, experts) -> list[Score]:
        scores = []
        for e in experts:
            # eligible only if the expert covers every requested capability tag
            eligible = set(sig.capabilities) <= set(e.capabilities)
            affinity = _affinity(sig, e) if eligible else -math.inf
            scores.append(Score(e.id, affinity, eligible))
        return scores

def _affinity(sig, expert) -> float:
    # cosine of task embedding against the expert's capability vector,
    # nudged by the expert's rolling success rate.
    base = _cosine(sig.embedding, expert.vector)
    return base + 0.15 * expert.success_rate

def _cosine(a, b) -> float:
    # plain cosine similarity; returns 0.0 if either vector has zero norm
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.hypot(*a) * math.hypot(*b)
    return dot / norm if norm else 0.0
```

Score

Field	Type	Description
expert_id	str	Identifier of the scored expert.
logit	float	Raw affinity before softmax; -inf if ineligible.
eligible	bool	Whether the expert's capabilities cover the task's requested tags.
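To make the masking and affinity arithmetic concrete, here is a minimal, self-contained sketch. The DemoExpert and DemoSignature stand-ins, the capability tags, and the two-dimensional vectors are illustrative assumptions, not the library's actual types; only the formula (cosine plus 0.15 times success rate, -inf when ineligible) is taken from the router listing.

```python
# Illustrative sketch: stand-in types, not the real TaskSignature/Expert classes.
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class DemoExpert:            # hypothetical stand-in for Expert
    id: str
    capabilities: frozenset
    vector: tuple
    success_rate: float

@dataclass(frozen=True)
class DemoSignature:         # hypothetical stand-in for TaskSignature
    capabilities: frozenset
    embedding: tuple

def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.hypot(*a) * math.hypot(*b)
    return dot / norm if norm else 0.0

def logit(sig, expert) -> float:
    # Capability mask: the expert must cover every requested tag.
    if not sig.capabilities <= expert.capabilities:
        return -math.inf
    return cosine(sig.embedding, expert.vector) + 0.15 * expert.success_rate

sig = DemoSignature(frozenset({"python"}), (1.0, 0.0))
experts = [
    DemoExpert("code.synthesis", frozenset({"python", "tests"}), (1.0, 0.0), 0.8),
    DemoExpert("retrieval.rag", frozenset({"search"}), (1.0, 0.0), 0.9),
]
logits = {e.id: logit(sig, e) for e in experts}
```

Here code.synthesis scores about 1.12 (cosine 1.0 plus 0.15 × 0.8), while retrieval.rag is masked to -inf despite an identical vector, because it lacks the requested tag.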

Gating mechanism

The gate is where sparsity happens. It ranks eligible experts by load-adjusted score, so a busy expert yields to a comparably scored idle peer, takes the top-k under a global capacity cap, and softmax-normalizes the survivors' raw logits into the routing weights that the arbiter later uses.

Figure: routing weights, softmax(logits), top-k = 2. Solid bars = engaged; faint bars = eligible but cut by top-k.

Expert	Weight
code.synthesis	0.52
retrieval.rag	0.31
planning.graph	0.12
review.static	0.05

gate.py

```python
# Gate: scores -> an executable plan (who runs, with what weight).
from dataclasses import dataclass
import math

@dataclass(frozen=True)
class Plan:
    engaged: list[str]          # expert ids that will run
    weights: dict[str, float]   # normalized routing weight per engaged expert
    dropped: list[str]          # eligible but cut by top-k / capacity

@dataclass
class TopKGate:
    top_k: int = 2
    capacity: int = 8           # max in-flight experts across the pool
    load_penalty: float = 0.10  # subtracted per unit of current load

    def select(self, scores, load: dict[str, float]) -> Plan:
        # 1. drop ineligible, then 2. apply a load-balancing penalty so a hot
        #    expert yields to an idle peer of comparable affinity.
        ranked = sorted(
            (s for s in scores if s.eligible),
            key=lambda s: s.logit - self.load_penalty * load.get(s.expert_id, 0),
            reverse=True,
        )
        # 3. cut engaged and dropped at the same index, so no expert between
        #    capacity and top_k falls out of both lists.
        k = max(0, min(self.top_k, self.capacity))
        chosen = ranked[:k]
        # 4. weights come from raw logits; the penalty only affects ranking.
        weights = _softmax({s.expert_id: s.logit for s in chosen})
        return Plan(
            engaged=[s.expert_id for s in chosen],
            weights=weights,
            dropped=[s.expert_id for s in ranked[k:]],
        )

def _softmax(logits: dict[str, float]) -> dict[str, float]:
    if not logits:               # no eligible experts -> empty weights
        return {}
    m = max(logits.values())     # subtract the max for numerical stability
    exp = {k: math.exp(v - m) for k, v in logits.items()}
    z = sum(exp.values())
    return {k: v / z for k, v in exp.items()}
```
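A small worked example shows how the load penalty changes ranking without changing how weights are computed. The logits and load values below are made-up numbers and the helper names are illustrative; the sketch assumes the same rule as the gate listing (rank by logit minus load_penalty times load, then softmax over the raw logits of the chosen experts).

```python
# Illustrative numbers only: rank by load-adjusted score, weight by raw logit.
import math

LOAD_PENALTY = 0.10
TOP_K = 2

logits = {"code.synthesis": 0.80, "planning.graph": 0.75, "review.static": 0.40}
load = {"code.synthesis": 1.0}   # units of current load; missing means idle

def adjusted(expert_id: str) -> float:
    return logits[expert_id] - LOAD_PENALTY * load.get(expert_id, 0.0)

ranked = sorted(logits, key=adjusted, reverse=True)
engaged, dropped = ranked[:TOP_K], ranked[TOP_K:]

# Softmax over the raw logits of the engaged experts (max-subtracted for stability).
m = max(logits[e] for e in engaged)
exp = {e: math.exp(logits[e] - m) for e in engaged}
z = sum(exp.values())
weights = {e: v / z for e, v in exp.items()}
```

The busy code.synthesis expert drops to second place (adjusted 0.70 versus 0.75) and review.static is cut. Because the weights use raw logits, code.synthesis still carries the slightly larger weight, about 0.51 versus 0.49.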

Go data-plane gate

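On the hot path the same selection runs in Go, where the gate is called once per dispatch under the worker pool's lock. The ranking and weighting logic mirrors the Python reference, so both planes engage the same experts with the same weights. The Go Plan carries only the engaged experts and their weights; it has no dropped list.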

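gate.go

```go
// Go data-plane gate: the same selection, built for the hot path.
package gate

import (
    "math"
    "sort"
)

type Score struct {
    ExpertID string
    Logit    float64
    Eligible bool
}

type Plan struct {
    Engaged []string
    Weights map[string]float64
}

type TopK struct {
    K           int
    Capacity    int
    LoadPenalty float64
}

func (g TopK) Select(scores []Score, load map[string]float64) Plan {
    // Filter into a fresh slice: appending into scores[:0] would overwrite
    // and reorder the caller's backing array.
    eligible := make([]Score, 0, len(scores))
    for _, s := range scores {
        if s.Eligible {
            eligible = append(eligible, s)
        }
    }
    sort.SliceStable(eligible, func(i, j int) bool {
        return g.adj(eligible[i], load) > g.adj(eligible[j], load)
    })
    k := min(g.K, g.Capacity) // built-in min requires Go 1.21+
    if k > len(eligible) {
        k = len(eligible)
    }
    if k < 0 {
        k = 0
    }
    chosen := eligible[:k]
    return Plan{Engaged: ids(chosen), Weights: softmax(chosen)}
}

func (g TopK) adj(s Score, load map[string]float64) float64 {
    return s.Logit - g.LoadPenalty*load[s.ExpertID]
}

// ids returns the expert IDs of the chosen scores, in rank order.
func ids(chosen []Score) []string {
    out := make([]string, len(chosen))
    for i, s := range chosen {
        out[i] = s.ExpertID
    }
    return out
}

// softmax normalizes raw logits into weights, subtracting the max for
// numerical stability, as _softmax does in gate.py.
func softmax(chosen []Score) map[string]float64 {
    out := make(map[string]float64, len(chosen))
    if len(chosen) == 0 {
        return out
    }
    m := chosen[0].Logit
    for _, s := range chosen[1:] {
        m = math.Max(m, s.Logit)
    }
    var z float64
    for _, s := range chosen {
        out[s.ExpertID] = math.Exp(s.Logit - m)
        z += out[s.ExpertID]
    }
    for id := range out {
        out[id] /= z
    }
    return out
}
```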

Capacity is global, not per-dispatch

capacity bounds total in-flight experts across all concurrent dispatches. Under load the gate can return fewer than top_k experts; downstream arbitration must tolerate a short plan rather than assume exactly k proposals.
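One way to picture a global cap is to derive the per-dispatch cut from the capacity that is still free, and to have the arbiter combine whatever proposals actually arrive. This is a sketch under assumptions: the in_flight counter, the effective_k and combine helpers, and the use of plain floats as proposals are all hypothetical and do not appear in the listings, which only show min(top_k, capacity).

```python
# Hypothetical helpers: neither effective_k nor combine is part of the documented API.
from typing import Optional

def effective_k(top_k: int, capacity: int, in_flight: int) -> int:
    """Experts this dispatch may engage, given what is already running."""
    free = max(0, capacity - in_flight)
    return min(top_k, free)

def combine(weights: dict[str, float], proposals: dict[str, float]) -> Optional[float]:
    """Weighted average over whichever engaged experts actually proposed.
    Tolerates a short (or empty) plan instead of assuming exactly k entries."""
    present = {e: w for e, w in weights.items() if e in proposals}
    total = sum(present.values())
    if total == 0:
        return None          # nothing engaged: the caller must handle this case
    return sum(w * proposals[e] for e, w in present.items()) / total

k_idle = effective_k(top_k=2, capacity=8, in_flight=0)   # full top_k available
k_busy = effective_k(top_k=2, capacity=8, in_flight=7)   # one slot left
k_full = effective_k(top_k=2, capacity=8, in_flight=8)   # pool saturated
```

With these numbers the dispatch may engage 2, 1, and 0 experts respectively, which is why an arbiter written against exactly k proposals would break under load.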

Custom strategies

Both RoutingStrategy and the gate are plain protocols. Implement score() / select() to ship a hand-tuned router (keyword rules, a cost-aware gate, a sticky-session gate for stateful experts) without touching the rest of the pipeline. The hub validates that every engaged expert is eligible before fan-out, so a buggy gate fails fast instead of running the wrong expert.
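As an illustration of a hand-tuned strategy, the sketch below implements score() with keyword rules and adds a pre-fan-out check in the spirit of the hub's validation. The KeywordRouter class, its rules table, and the validate_plan helper are invented for the example, and score() takes a plain task string and expert ids for brevity; a real strategy would accept a TaskSignature and Expert objects. The hub's actual validation API is not shown in this section.

```python
# Hypothetical custom strategy; every name except Score is invented for illustration.
import math
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Score:
    expert_id: str
    logit: float
    eligible: bool

@dataclass
class KeywordRouter:
    """Scores an expert by how many of its keywords appear in the task text."""
    rules: dict[str, set[str]] = field(default_factory=dict)  # expert_id -> keywords

    def score(self, task_text: str, expert_ids: list[str]) -> list[Score]:
        words = set(task_text.lower().split())
        scores = []
        for expert_id in expert_ids:
            hits = len(words & self.rules.get(expert_id, set()))
            eligible = hits > 0
            logit = float(hits) if eligible else -math.inf
            scores.append(Score(expert_id, logit, eligible))
        return scores

def validate_plan(engaged: list[str], scores: list[Score]) -> None:
    """Fail fast if a gate engaged an expert the router marked ineligible."""
    eligible = {s.expert_id for s in scores if s.eligible}
    bad = [e for e in engaged if e not in eligible]
    if bad:
        raise ValueError(f"gate engaged ineligible experts: {bad}")

router = KeywordRouter(rules={
    "code.synthesis": {"implement", "refactor"},
    "retrieval.rag": {"find", "search"},
})
scores = router.score("Refactor the parser", ["code.synthesis", "retrieval.rag"])
validate_plan(["code.synthesis"], scores)   # passes silently
```

Passing ["retrieval.rag"] to validate_plan instead would raise ValueError, since that expert matched no keyword and was marked ineligible.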
