rfml-moe-hub · v0.3.0

Mixture-of-Experts orchestration for multi-agent workflows

rfml-moe-hub is a control layer that routes a task to the smallest set of capable experts, runs them, and arbitrates their proposals into one committed result. It pairs a Python control plane with a Go data plane and slots into existing Peer-Consult and TaskFlow structures without rewriting them.

Figure: one dispatch of task#4f1c. The router scores four experts (code.synthesis 0.91, retrieval.rag 0.74, planning.graph 0.38, review.static 0.12); the gate (softmax, top-k = 2) engages two of them; the arbiter commits a consensus by weighted quorum over the engaged experts. Routed in 6 ms, 2 / 4 experts engaged, consensus reached.

One dispatch: the router scores experts, the gate selects top-k, the engaged experts produce proposals, and the arbiter commits a single consensus result.

What problem it solves

Multi-agent systems tend to run every agent on every task and reconcile the mess afterward. That burns tokens, multiplies latency, and produces contradictory output that someone has to untangle downstream. rfml-moe-hub treats agents as experts in a Mixture-of-Experts layer: a router decides who is qualified, a gate decides how many actually run, and an arbiter decides what the answer is.

The result is precise capability orchestration - the right experts, engaged sparsely, with one accountable output - rather than a broadcast-and-pray fan-out.

The three pillars

Every dispatch passes through three components, each specified independently so you can swap strategies without touching the others.

Router

Scores every registered expert against the inbound task signature and produces a sparse routing distribution. No expert runs until the router commits.
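
As a rough illustration of the scoring step, the sketch below turns raw per-expert scores into a sparse distribution using a temperature softmax followed by a probability floor. The function `route` and its `temperature` and `floor` parameters are assumptions made for this example, not part of the rfml-moe-hub API, and the actual router strategy may differ.

```python
import math


def route(scores: dict[str, float], temperature: float = 1.0,
          floor: float = 0.05) -> dict[str, float]:
    """Turn raw per-expert scores into a sparse routing distribution.

    Hypothetical helper: softmax with temperature, then drop low-probability
    experts and renormalize so the remaining weights sum to 1.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if not scores:
        return {}
    # Numerically stable softmax: subtract the max before exponentiating.
    top = max(scores.values())
    exps = {name: math.exp((s - top) / temperature) for name, s in scores.items()}
    total = sum(exps.values())
    probs = {name: e / total for name, e in exps.items()}
    # Sparsify: keep only experts at or above the floor.
    kept = {name: p for name, p in probs.items() if p >= floor}
    if not kept:
        # Floor was too strict: fall back to the single best expert.
        best = max(probs, key=probs.get)
        kept = {best: probs[best]}
    norm = sum(kept.values())
    return {name: p / norm for name, p in kept.items()}


scores = {
    "code.synthesis": 0.91,
    "retrieval.rag": 0.74,
    "planning.graph": 0.38,
    "review.static": 0.12,
}
distribution = route(scores, temperature=0.1)
print(sorted(distribution))
```

With the four scores from the dispatch example and `temperature=0.1`, only `code.synthesis` and `retrieval.rag` stay above the floor; a higher temperature flattens the distribution and keeps more experts in play.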

Gating

Turns router scores into an executable plan: top-k selection, capacity limits, capability masking, and a load-balancing penalty that keeps any one expert from starving the rest.
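
The four gate duties named above can be sketched in one small function. Everything here is illustrative: `gate`, its parameters, and the simple subtractive load penalty are assumptions for this example, and the real gate policy may weigh load differently.

```python
def gate(scores: dict[str, float],
         required: set[str],
         capabilities: dict[str, set[str]],
         load: dict[str, int],
         capacity: dict[str, int],
         top_k: int = 2,
         penalty: float = 0.1) -> list[str]:
    """Pick at most top_k experts that are capable and have spare capacity."""
    candidates = []
    for name, score in scores.items():
        # Capability mask: skip experts missing any required capability.
        if not required <= capabilities.get(name, set()):
            continue
        # Capacity limit: skip experts already at their in-flight cap
        # (an expert with no declared capacity is assumed to handle 1 task).
        in_flight = load.get(name, 0)
        if in_flight >= capacity.get(name, 1):
            continue
        # Load-balancing penalty: busier experts lose score, spreading work out.
        candidates.append((score - penalty * in_flight, name))
    # Highest adjusted score first; the name breaks ties deterministically.
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return [name for _, name in candidates[:top_k]]


scores = {"code.synthesis": 0.91, "retrieval.rag": 0.74,
          "planning.graph": 0.38, "review.static": 0.12}
caps = {"code.synthesis": {"python", "refactor"},
        "retrieval.rag": {"search", "cite"}}
print(gate(scores, set(), caps, load={}, capacity={}))
```

With no required capabilities and no load, the plan is simply the two highest-scoring experts. Requiring `"python"` masks out everything except `code.synthesis`, and marking `code.synthesis` as busy lets lower-scoring experts take its place.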

Arbitration

Collects proposals from engaged experts and folds them into a single committed result under a declared consensus protocol, with a deterministic tie-break and an audit trail.
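
One way to picture a weighted-quorum protocol is the sketch below. It is an assumption-laden illustration, not the hub's arbiter: `Proposal` and `arbitrate` are hypothetical names, each proposal's weight is assumed to be its routing weight, and ties are broken lexicographically on the answer rather than by the seeded tie-break the spec describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    expert: str
    answer: str
    weight: float  # assumed here to be the expert's routing weight


def arbitrate(proposals: list[Proposal], quorum: float = 0.66):
    """Return (committed answer or None, audit trail).

    An answer is committed only if its backers hold at least `quorum`
    of the total proposal weight.
    """
    total = sum(p.weight for p in proposals)
    if total <= 0:
        return None, []
    support: dict[str, float] = {}
    for p in proposals:
        support[p.answer] = support.get(p.answer, 0.0) + p.weight
    # Rank by weight; equal weights fall back to the answer string,
    # so the same proposals always produce the same ranking.
    ranked = sorted(support.items(), key=lambda kv: (-kv[1], kv[0]))
    audit = [{"answer": a, "share": w / total} for a, w in ranked]
    winner, weight = ranked[0]
    committed = winner if weight / total >= quorum else None
    return committed, audit


agree = [Proposal("code.synthesis", "patch A", 0.91),
         Proposal("retrieval.rag", "patch A", 0.74)]
split = [Proposal("code.synthesis", "patch A", 0.91),
         Proposal("retrieval.rag", "patch B", 0.74)]
print(arbitrate(agree)[0])
print(arbitrate(split)[0])
```

When both engaged experts back the same answer it is committed; when they split, the leading answer holds only about 55% of the weight, which falls short of a 0.66 quorum, so nothing is committed and the audit trail records both shares.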

Quickstart

Install the control plane, register a few experts, and dispatch. The hub picks the runtime - in-process Python for prototyping, the Go data plane for production throughput.

install (shell)

```shell
# install the library (Python control plane)
pip install rfml-moe-hub

# or pull the Go runtime for the data plane
go get github.com/rfml/moe-hub@v0.3.0
```

hello_hub.py (python)

```python
from moe_hub import Hub, Expert, Router, Arbiter

# Routing strategy and arbitration protocol are declared up front.
hub = Hub(
    router=Router(strategy="learned-softmax", top_k=2),
    arbiter=Arbiter(protocol="weighted-quorum", quorum=0.66),
)

# Register experts with the capabilities they advertise.
hub.register(Expert("code.synthesis", capabilities=["python", "refactor"]))
hub.register(Expert("retrieval.rag", capabilities=["search", "cite"]))

# ctx is the caller-supplied task context; it is not defined in this snippet.
result = hub.dispatch(task="patch the failing auth test", context=ctx)
print(result.consensus, result.engaged_experts)
```

Spec status: draft (v0.3.0)

The protocol surfaces below are stable enough to build against; field names are frozen for the v0.x line. Wire formats may still gain optional fields before v1.

Design principles

Sparse by default. No expert runs unless the gate selects it. Cost scales with k, not with the size of the expert pool.
Declared, not implicit. Routing strategy, gate policy, and arbitration protocol are explicit config - never buried in agent prompts.
Deterministic tie-breaks. Given the same proposals and seed, arbitration always commits the same result. Replayable by construction.
Transport-agnostic. The same protocol runs in-process, over gRPC to the Go data plane, or across a Peer-Consult mesh.
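
To make the "declared, not implicit" and "sparse by default" principles concrete, here is a minimal sketch of a single frozen config object. `HubConfig` is hypothetical and not part of the library; the strategy, protocol, `top_k`, and quorum values are borrowed from the quickstart, while the `seed` and `transport` field names are assumptions.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class HubConfig:
    """Hypothetical explicit config: every policy choice is a named field."""
    routing_strategy: str = "learned-softmax"
    top_k: int = 2
    arbitration_protocol: str = "weighted-quorum"
    quorum: float = 0.66
    seed: int = 0                  # assumed name for the replay seed
    transport: str = "in-process"  # assumed name; e.g. in-process or gRPC

    def __post_init__(self) -> None:
        if self.top_k < 1:
            raise ValueError("top_k must be at least 1")
        if not 0.0 < self.quorum <= 1.0:
            raise ValueError("quorum must be in (0, 1]")

    def to_json(self) -> str:
        # Sorted keys give a stable string that can be stored with a
        # dispatch and compared on replay.
        return json.dumps(asdict(self), sort_keys=True)

    def max_expert_calls(self, pool_size: int) -> int:
        # Sparse by default: at most k experts run, however large the pool.
        return min(self.top_k, pool_size)


cfg = HubConfig()
print(cfg.to_json())
print(cfg.max_expert_calls(4), cfg.max_expert_calls(400))
```

Freezing the object means a dispatch cannot quietly change its own policy mid-run, and the cost bound stays at `top_k` whether the pool holds 4 experts or 400.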

See the Hub wiring in detail on the system architecture page.
