Self-Healing Provider Pipeline
ClaudusBridge talks to the active LLM through a small provider runtime: a Node process that handles auth and routing for Claude / ChatGPT / Gemini-CLI. The runtime listens on a port that changes every time it starts. The editor learns that port at boot via a tiny sentinel file and caches it inside BaseURL.
Before 0.6.1, if the runtime crashed, was killed manually, or got upgraded out of band, the editor kept POSTing to the dead port and reporting "Network error (no response from http://127.0.0.1:<old-port>/v1/messages)" until UE itself was restarted. Releases 0.6.1 through 0.6.3 turned that into a self-healing pipeline. This guide explains the moving parts.
If the provider runtime restarts out of band, for any reason, you don't need to do anything. The next chat request detects the new dispatch.json mtime, silently swaps the URL, and goes to the live port. The retry-after-failure logic stays as a safety net. On 0.6.3+ you should not see libcurl error: 7 in the editor log, except in the rare race described under "What you'll see in the Output Log", where a single connection warning is followed by an automatic retry.
The sentinel — dispatch.json
When the provider runtime starts up, it writes a tiny JSON file at:
<LOCALAPPDATA>/ClaudusBridge/ProviderRuntime/dispatch.json
{
"port": 52523,
"upstreamPort": 52522,
"pid": 29936,
"runtime": "claudus-provider-runtime",
"routeLayer": "claudus-public-model-router-v2",
"startedAt": "2026-05-15T07:14:33.614Z",
"updatedAt": "2026-05-15T07:14:34.260Z",
"installRoot": "C:\\Users\\…\\ClaudusBridge\\ProviderRuntime"
}
The file is rewritten on every fresh runtime startup. The Node process closes the handle immediately after writing, so the editor can read it concurrently without lock contention.
ClaudusBridge's auto-login at editor startup reads this file to discover the runtime port. The same file is the basis of the 0.6.x self-healing pipeline below.
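The discovery step described above can be sketched in standard C++17. This is a minimal illustration under stated assumptions, not the plugin's code: the names ParseDispatchPort and BuildMessagesUrl are hypothetical, the regex stands in for a real JSON parser, and the /v1/messages path is taken from the log lines shown later in this guide.

```cpp
#include <cassert>
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper: extract the "port" value from dispatch.json text.
// A regex keeps the sketch dependency-free; real code should use a JSON parser.
std::optional<int> ParseDispatchPort(const std::string& json) {
    // The key must be exactly "port" (opening quote included), so
    // "upstreamPort" does not match. The lookahead rejects 6+ digit numbers.
    static const std::regex portRe(R"re("port"\s*:\s*(\d{1,5})(?!\d))re");
    std::smatch match;
    if (!std::regex_search(json, match, portRe)) {
        return std::nullopt;
    }
    const int port = std::stoi(match[1].str());
    if (port < 1 || port > 65535) {
        return std::nullopt;  // not a valid TCP port
    }
    return port;
}

// Hypothetical helper: build the loopback URL the editor would cache.
std::string BuildMessagesUrl(int port) {
    assert(port >= 1 && port <= 65535);
    return "http://127.0.0.1:" + std::to_string(port) + "/v1/messages";
}
```

Returning std::optional keeps "file present but unparseable" distinct from a valid port, so a caller can fall back to its previous URL instead of caching garbage.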
1. Proactive refresh (0.6.3 — the primary path)
At the entry of every chat-send path (the synchronous ResolveAndAppendReply used by ask_claudus / submit_chat_sync, and the async ResolveAndAppendReplyAsync used by SubmitHumanMessage from the Output Log), the editor calls:
FCBClaudusAI::EnsureFreshProviderRuntimeURL();
That helper does one cheap stat() of dispatch.json (single-digit microseconds on local SSD). If the mtime advanced past the last reconciled value, it re-parses the file. If the new port differs from the cached BaseURL, it swaps both BaseURL and ProviderRuntimeRootURL in place and updates LastSeenDispatchMtimeTicks.
The request then goes out against the fresh port. libcurl never touches the dead socket. No 2-second timeout, no LogHttp warning block, no visible recovery message.
The only visible signal is a single LogTemp: Display line:
LogTemp: Display: [ClaudusBridge] dispatch.json mtime advanced;
swapping cached provider runtime URL http://127.0.0.1:50662/v1/messages
-> http://127.0.0.1:52523/v1/messages before issuing request.
If the mtime hasn't moved, the helper is a no-op (just the stat() cost). Steady-state overhead is negligible.
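The stat-then-swap logic above might look roughly like the following in standard C++17. This is a sketch, not the body of EnsureFreshProviderRuntimeURL: the RuntimeUrlCache struct, ReadDispatchPort, and EnsureFreshUrl are hypothetical names, std::filesystem stands in for whatever file API the plugin actually uses, and the regex replaces a real JSON parser.

```cpp
#include <cassert>
#include <chrono>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <optional>
#include <regex>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

struct RuntimeUrlCache {
    std::string baseUrl;  // e.g. http://127.0.0.1:50662/v1/messages
    std::string rootUrl;  // e.g. http://127.0.0.1:50662
    // min() rather than {}: the file clock's epoch is implementation-defined,
    // so a zero time point is not guaranteed to precede real mtimes.
    fs::file_time_type lastSeenMtime = fs::file_time_type::min();
};

// Hypothetical stand-in for the real sentinel parser.
std::optional<int> ReadDispatchPort(const fs::path& path) {
    std::ifstream in(path);
    if (!in) return std::nullopt;
    const std::string json((std::istreambuf_iterator<char>(in)),
                           std::istreambuf_iterator<char>());
    static const std::regex portRe(R"re("port"\s*:\s*(\d{1,5})(?!\d))re");
    std::smatch match;
    if (!std::regex_search(json, match, portRe)) return std::nullopt;
    return std::stoi(match[1].str());
}

// Returns true only when the cached URLs were swapped to a new port.
bool EnsureFreshUrl(RuntimeUrlCache& cache, const fs::path& dispatchPath) {
    assert(!dispatchPath.empty());
    std::error_code ec;
    const fs::file_time_type mtime = fs::last_write_time(dispatchPath, ec);  // the one stat()
    if (ec || mtime <= cache.lastSeenMtime) {
        return false;  // sentinel missing or unchanged: no-op
    }
    const std::optional<int> port = ReadDispatchPort(dispatchPath);
    if (!port) {
        return false;  // unreadable; mtime is not recorded, so the next call retries
    }
    cache.lastSeenMtime = mtime;
    const std::string root = "http://127.0.0.1:" + std::to_string(*port);
    const std::string fresh = root + "/v1/messages";
    if (fresh == cache.baseUrl) {
        return false;  // file was rewritten but the port is the same
    }
    cache.rootUrl = root;
    cache.baseUrl = fresh;
    return true;
}
```

In the steady state the function returns after the mtime comparison, which matches the "just the stat() cost" behavior described above.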
2. Retry-after-failure (0.6.1 + 0.6.2 — the safety net)
The proactive refresh covers the common case. The safety net covers the rare race where the runtime swap happens between the stat() and the actual request:
| Path | When safety net fires | What it does |
|---|---|---|
| Dashboard auto-connect (ConnectProviderRuntimeAsync) | Connect to runtime URL fails with bConnectedOk=false (libcurl couldn't connect) | Re-read dispatch.json via TryRediscoverProviderRuntimeURL(); if the port differs, reissue with retries-1 against the new URL. The original OnComplete is handed down so callers see one terminal result. Default retries = 1. |
| Synchronous chat (ResolveAndAppendReply → CallAnthropic) | CallAnthropic returns HttpCode == 0 against the local runtime | Re-read dispatch.json, swap BaseURL, re-call CallAnthropic once. Single fully-resolved answer to the caller. |
| Async chat (IssueDashboardProviderRequest) | Lambda receives bConnectedOk=false while in ClaudusRuntime mode | Reissue against the rediscovered URL using the same "(thinking…)" placeholder. The user sees a slightly longer thinking-spinner instead of an error chat entry. |
When the safety net fires, you see a system chat entry tagged provider-connect-rediscover (auto-connect path) or a LogTemp: Display line (sync/async chat paths) announcing the swap.
The retry counter prevents infinite loops: each path retries at most once.
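The bounded retry shared by the three paths can be sketched as below. It is an illustration only: HttpResult, SendFn, RediscoverFn, and SendWithRediscovery are hypothetical names, the send callback stands in for CallAnthropic or the async request, and the rediscover callback stands in for re-reading dispatch.json. An httpCode of 0 models "no response", as in the synchronous chat row of the table.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>

struct HttpResult {
    int httpCode = 0;  // 0 means the connection never succeeded
    std::string body;
};

using SendFn = std::function<HttpResult(const std::string& url)>;
using RediscoverFn = std::function<std::optional<std::string>()>;

// Sends once, then retries at most `retries` times, and only if rediscovery
// yields a URL different from the one that just failed.
HttpResult SendWithRediscovery(std::string& baseUrl,
                               const SendFn& send,
                               const RediscoverFn& rediscover,
                               int retries = 1) {
    assert(retries >= 0);
    HttpResult result = send(baseUrl);
    while (result.httpCode == 0 && retries-- > 0) {
        const std::optional<std::string> fresh = rediscover();
        if (!fresh || *fresh == baseUrl) {
            break;  // sentinel gone or still pointing at the dead port: bail
        }
        baseUrl = *fresh;  // swap the cached URL in place
        result = send(baseUrl);
    }
    return result;  // the caller sees exactly one terminal result
}
```

The "URL must differ" check is what keeps a stale sentinel from causing a pointless second request against the same dead port.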
3. CBDesktopAuthBridge::InvalidateDispatch()
The DesktopAuth layer (the dashboard auth/login bridge) memoizes the dispatch port across calls, with a TTL check. If you've upgraded the runtime in the middle of a session and want to force the next DiscoverDispatch() to re-read the sentinel from disk regardless of TTL, call:
DesktopAuth->InvalidateDispatch();
This drops every memoization keyed on the old port (bDispatchAvailable, DispatchPort, ControlBaseURL, LastRefreshSeconds, LastDiscoverySeconds). The next read goes back to disk.
The auto-connect retry path (path 1 in the safety net table above) calls this automatically inside the cmd-auto-connect-error callback after ConnectProviderRuntimeAsync has already burned its single retry. So the next user message picks up the fresh port at the DesktopAuth layer too.
For most users this helper is internal plumbing — you don't call it directly. It exists for the recovery sequence and for any external tooling that wants to force a clean port-rediscovery without relying on a failed request.
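To make the TTL-plus-invalidate behavior concrete, here is a small standard C++17 sketch. It is not the CBDesktopAuthBridge implementation: the DispatchCache class, its Reader callback, and the explicit nowSeconds parameter are assumptions made so the example stays self-contained, and only a subset of the memoized fields named above is modeled.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <utility>

// Hypothetical memoizing wrapper around "read the port from the sentinel".
class DispatchCache {
public:
    using Reader = std::function<std::optional<int>()>;

    DispatchCache(Reader reader, double ttlSeconds)
        : reader_(std::move(reader)), ttlSeconds_(ttlSeconds) {
        assert(ttlSeconds >= 0.0);
    }

    // Returns the memoized port while the TTL holds, otherwise re-reads.
    std::optional<int> Discover(double nowSeconds) {
        if (available_ && nowSeconds - lastDiscoverySeconds_ < ttlSeconds_) {
            return port_;
        }
        const std::optional<int> fresh = reader_();
        available_ = fresh.has_value();
        port_ = fresh.value_or(0);
        lastDiscoverySeconds_ = nowSeconds;
        return fresh;
    }

    // Drops every memoized field so the next Discover() goes back to disk.
    void Invalidate() {
        available_ = false;
        port_ = 0;
        lastDiscoverySeconds_ = 0.0;
    }

private:
    Reader reader_;
    double ttlSeconds_;
    bool available_ = false;
    int port_ = 0;
    double lastDiscoverySeconds_ = 0.0;
};
```

Without the Invalidate() call, a runtime upgraded mid-session would keep returning the old port until the TTL expired, which is the situation the helper exists to avoid.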
What you'll see in the Output Log
Happy path (no runtime restart)
Just the normal chat flow. No LogTemp: Display: ... swapping ... lines, no LogHttp: Warning blocks.
Runtime restarted out of band
LogTemp: Display: [ClaudusBridge] dispatch.json mtime advanced;
swapping cached provider runtime URL http://127.0.0.1:50662/v1/messages
-> http://127.0.0.1:52523/v1/messages before issuing request.
One line per restart, on the next chat request. After that, steady-state silence again.
Rare race (proactive refresh raced the runtime restart)
LogHttp: Warning: ... POST http://127.0.0.1:50662/v1/messages completed with reason 'ConnectionError' after 2.02s
LogTemp: Display: [ClaudusBridge] Provider runtime at http://127.0.0.1:50662/v1/messages did not respond;
rediscovered live port and retrying synchronously against http://127.0.0.1:52523/v1/messages.
The libcurl warning is unavoidable when the actual HTTP request times out — that warning is emitted by UE's HTTP module, not by us. But the retry recovers and the user gets their answer.
Genuinely no runtime
LogTemp: Display: [ClaudusBridge] Provider runtime at http://127.0.0.1:50662/v1/messages did not respond;
rediscovered live port and retrying synchronously against http://127.0.0.1:50662/v1/messages.
(Or no rediscovered URL at all if dispatch.json is gone.) The retry path checks that the rediscovered URL differs from the failed URL — if the sentinel still points to a dead port, the retry bails and reports the underlying error. Use agent_login from any MCP client to spawn a fresh runtime in that case.
How to test (or reproduce) the self-heal
# 1. Confirm the current cached port
cb stream # not relevant, but a cheap "still alive?" check
Get-Content "$env:LOCALAPPDATA\ClaudusBridge\ProviderRuntime\dispatch.json"
# 2. Kill the runtime (replace <pid> with the value from dispatch.json's "pid")
Stop-Process -Id <pid> -Force
# 3. Spawn a fresh runtime
$script = "C:\<...your project path...>\Plugins\ClaudusBridge\Resources\ClaudusProviderRuntime\launch-claudus-provider-runtime.ps1"
Start-Process powershell.exe -ArgumentList "-NoProfile","-ExecutionPolicy","Bypass","-File","`"$script`"" -WindowStyle Hidden
# 4. Confirm dispatch.json now shows a new port
Get-Content "$env:LOCALAPPDATA\ClaudusBridge\ProviderRuntime\dispatch.json"
# 5. Send a chat — the editor swaps the URL silently
cb call ask_claudus '{"message":"Reply with the single word PONG. Nothing else."}'
You should see the response come back normally. Open the editor's Output Log; the only swap-related line should be the Display log entry showing the URL transition.
What stays user-owned
The self-healing pipeline only handles port discovery — finding which port the local runtime is listening on. It does not:
- Restart the runtime when it dies (the runtime is a peer; the user or another process decides whether to relaunch).
- Persist provider OAuth tokens (CCAG / gemini-cli own credential storage; the plugin only reads sanitized presence flags via claudus_get_last_auth_state; see Cognition Tier overview).
- Migrate sessions across editor restarts (each UE process is independent).
For runtime auto-launch on demand, the existing agent_login flow already handles bootstrap on a cold start. Self-heal kicks in after a runtime has been running and the editor knows its sentinel.
Where to go next
- Cognition Tier overview: Provider auth visibility (the observability layer on top of the pipeline)
- Connecting Your AI Client: How agent_login spawns the runtime in the first place
- Auto-Observations: If a connect failure ever makes it to the user as an error chat entry, the action_failed observation captures it for the next session's audit