Endara Relay
One endpoint for all your MCP servers. A single Rust binary that aggregates local STDIO servers, remote HTTP/SSE servers, and OAuth servers and serves them at http://localhost:9400/mcp. A separate management API (used by Endara Desktop) is exposed on a local Unix-domain socket / Windows Named Pipe — never on a TCP port.
Overview
Endara Relay is a single Rust binary that sits between your AI client (Claude Desktop, Cursor, ChatGPT, Windsurf, VS Code, Zed, Continue, or any MCP-compatible app) and all the MCP servers you actually use. Point every client at one local endpoint, http://localhost:9400/mcp, and the relay handles the rest: spawning STDIO servers, holding onto SSE / HTTP connections, refreshing OAuth tokens, and merging every server's tool catalog into a single unified list with collision-free names.
It uses one transport-specific adapter per endpoint, namespaces tools with a stable prefix to avoid collisions, and watches config.toml for changes so you can add or remove servers without restarting. STDIO adapters are restarted automatically with exponential backoff if the underlying process crashes.
Optionally, Relay can run in JS execution mode, where instead of advertising hundreds of tool definitions to the model on every turn, it advertises three meta-tools and lets the model run a sandboxed JavaScript program that calls the underlying tools in a single round-trip. See JS execution engine below.
No cloud, no accounts, no telemetry. Everything runs on your machine.
Install
Pick whichever you have set up.
Homebrew (macOS / Linux)
brew install endara-ai/tap/endara-relay
Cargo
cargo install endara-relay
Pre-built binaries
Download the latest release for your platform from github.com/endara-ai/endara-relay/releases.
Or, if you'd rather not run the relay yourself, install Endara Desktop — it bundles the relay, starts and stops it for you, and provides a UI for managing endpoints.
Quick start
Drop the following into ~/.endara/config.toml:
# ~/.endara/config.toml
[relay]
machine_name = "my-mac"
[[endpoints]]
name = "filesystem"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/projects"]
Then start the relay:
endara-relay start
Point any MCP-compatible client at http://localhost:9400/mcp. Claude Desktop only speaks stdio, so use the mcp-remote bridge by adding this to claude_desktop_config.json:
{
"mcpServers": {
"endara": {
"command": "npx",
"args": ["-y", "mcp-remote", "http://localhost:9400/mcp"]
}
}
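}
For Cursor, add the same URL under Settings → MCP → Add new MCP server → HTTP. Restart the client and the filesystem tools should appear in its tool list. They are prefixed with filesystem__ when more than one endpoint is configured; with a single endpoint, prefixes are omitted (see Tool prefixing).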
CLI reference
The relay has a single subcommand, start, which boots the HTTP server, loads the config, and starts watching it for changes.
| Flag | Default | Description |
|---|---|---|
| --data-dir | ~/.endara | Base directory for config, logs, and OAuth tokens. The relay creates it if it doesn't exist and writes a default config.toml on first run. |
| --config | <data-dir>/config.toml | Override the path to the TOML configuration file. |
| --port | 9400 | Port to listen on. The MCP endpoint (/mcp and /mcp/*), /oauth/callback, and /healthz are served on this TCP port. The management API (/api/*) is not served on TCP; it is bound to a Unix-domain socket / Windows Named Pipe (see Management API). |
| --log-format | text | Log output format. Either text or json. |
Note: The RUST_LOG environment variable overrides the log filter when set. The default is info,endara_relay=debug. Logs are written to both stdout and ~/.endara/logs/relay.log.<YYYY-MM-DD>.
Configuration reference
Endara Relay reads a single TOML file. Default location: ~/.endara/config.toml. The file has one [relay] table and any number of [[endpoints]] entries.
[relay] table
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| machine_name | string | yes | system hostname (when default config is generated) | Identifies this relay instance in logs and /api/status. Free-form; pick anything that helps you tell machines apart. |
| local_js_execution | bool | no | false | When true, the advertised tool catalog is replaced with three meta-tools (list_tools, search_tools, execute_tools) and direct tool calls are rejected. See JS execution engine. |
| token_dir | string (path) | no | <data-dir>/tokens | Override the directory used for OAuth token and DCR-credential storage. Useful when you want a non-default location separate from the data dir. |
[[endpoints]] entries
Each [[endpoints]] table describes one MCP server to connect to. Required fields depend on the chosen transport.
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | yes | Unique, non-empty identifier. Used as the default tool prefix (sanitized to lowercase ASCII), and is the path segment in management API URLs (those URLs are served on the management socket; see Management API). |
| description | string | no | Free-form; surfaced in the UI and in logs. |
| tool_prefix | string | no | Override the auto-derived prefix. If omitted, defaults to sanitize_name(name). See Tool prefixing. |
| transport | stdio \| sse \| http \| oauth | yes | Adapter type. Determines which other fields are required. |
| command | string | yes (stdio) | Executable to spawn for STDIO transports. |
| args | array of string | no | Arguments passed to the spawned process. |
| url | string | yes (sse, http, oauth) | Endpoint URL for HTTP-based transports. |
| env | map of string → string | no | Environment variables passed to STDIO subprocesses. Values support $VAR resolution and $$ escaping; see Environment variable resolution. |
| headers | map of string → string | no | Extra HTTP headers for http / sse / oauth transports. Header values support inline $VAR substitution (e.g. Authorization = "Bearer $TOKEN"). |
| disabled | bool | no | Default false. When true, the endpoint is registered but no adapter is started; toggling this does not restart adapters during hot-reload. |
| disabled_tools | array of string | no | Tool names to hide from the advertised catalog without disabling the underlying server. Calls to disabled tools return an MCP error. |
| oauth_server_url | string | yes (oauth) | Authorization server base URL. The relay performs OIDC / metadata discovery against this URL. |
| client_id | string | no | Pre-provisioned OAuth client identifier. If omitted, the relay performs Dynamic Client Registration when needed. |
| scopes | array of string | no | OAuth scopes requested during authorization. |
| token_endpoint | string | no | Override the discovered token endpoint URL. Rarely needed. |
OAuth credentials: OAuth client credentials are not stored in config.toml. They are written via POST /api/endpoints/{name}/oauth/credentials (or the equivalent flow in Endara Desktop) and persisted by the relay's TokenManager under ~/.endara/tokens/ with mode 0600. Dynamic Client Registration (DCR) populates them automatically when the server supports it; otherwise you provide client_id (and client_secret for confidential clients) via the API or desktop UI. Note: /api/* is exposed only on the management socket described in Management API, not on http://localhost:9400. Use curl --unix-socket (or the Desktop UI) to call it.
Transport snippets
STDIO
[[endpoints]]
name = "github"
description = "GitHub MCP server"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
env = { GITHUB_TOKEN = "$GITHUB_TOKEN" }
HTTP
[[endpoints]]
name = "context7"
transport = "http"
url = "https://mcp.context7.com/mcp"
headers = { Authorization = "Bearer $CONTEXT7_KEY" }
SSE
[[endpoints]]
name = "remote-sse"
transport = "sse"
url = "https://example.com/mcp/sse"
OAuth
[[endpoints]]
name = "linear"
transport = "oauth"
url = "https://mcp.linear.app/mcp"
oauth_server_url = "https://mcp.linear.app"
scopes = ["read", "write"]
# client_id / client_secret are persisted via the management API, not TOML
Environment variable resolution
Endpoint env values and headers values are passed through a small resolver before adapters start:
- $VAR: looks up VAR in the relay's process environment and substitutes its value. Header values support $VAR in any position (e.g. Bearer $TOKEN); env values must be a single $VAR reference at the start of the string.
- $$: escapes a literal dollar sign. $$VAR becomes the literal string $VAR; useful when a server really wants the dollar sign character.
- Plain text: kept as-is.
If a referenced variable is not set, the relay records the failure through ConfigError::EnvVarMissing. With graceful validation (the default at startup and during hot-reload), the affected endpoint is registered as a failed adapter with the underlying error visible on its entry in GET /api/endpoints (filter the array by name) — startup itself does not fail.
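The resolution rules above can be sketched in JavaScript as follows. This is a minimal illustration, not the relay's Rust implementation: resolveHeaderValue, resolveEnvValue, and the EnvVarMissing class are hypothetical names (the last one only mirrors ConfigError::EnvVarMissing), and the sketch assumes an env value counts as a reference only when the whole string is a single $VAR.

```javascript
// Hypothetical sketch of the resolution rules; not the relay's actual code.
class EnvVarMissing extends Error {
  constructor(variable) {
    super(`environment variable ${variable} is not set`);
    this.name = "EnvVarMissing";
    this.variable = variable;
  }
}

const hasVar = (env, name) => Object.prototype.hasOwnProperty.call(env, name);

// Header-style resolution: $VAR may appear anywhere; $$ yields a literal "$".
function resolveHeaderValue(value, env) {
  return value.replace(/\$\$|\$([A-Za-z_][A-Za-z0-9_]*)/g, (match, name) => {
    if (match === "$$") return "$";
    if (!hasVar(env, name)) throw new EnvVarMissing(name);
    return env[name];
  });
}

// Env-style resolution: the value is a $$-escaped literal, a single $VAR
// reference (assumed to span the whole string), or plain text.
function resolveEnvValue(value, env) {
  if (value.startsWith("$$")) return value.slice(1);
  const m = /^\$([A-Za-z_][A-Za-z0-9_]*)$/.exec(value);
  if (!m) return value;
  if (!hasVar(env, m[1])) throw new EnvVarMissing(m[1]);
  return env[m[1]];
}
```

A caller that wants the graceful behaviour described above would catch EnvVarMissing per endpoint and mark only that endpoint as failed.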
Validation rules
- The [relay] table is required; missing it is a fatal error.
- Endpoint names must be non-empty and unique within the file.
- stdio endpoints require command; sse, http, and oauth endpoints require url; oauth additionally requires oauth_server_url.
- Per-endpoint validation runs gracefully: an endpoint that fails any of the above is registered as a failed adapter rather than blocking the relay from starting. This means one bad entry can't take the whole relay down.
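As a rough illustration of these rules, the sketch below validates an already-parsed config object. The function name validateConfig, the returned shape, and the error strings are all hypothetical; only the rules themselves come from the list above.

```javascript
// Hypothetical sketch of graceful validation; names and messages are illustrative.
function validateConfig(config) {
  // A missing [relay] table is the one fatal case.
  if (!config.relay) throw new Error("config is missing the [relay] table");

  const seen = new Set();
  return (config.endpoints ?? []).map((ep) => {
    const errors = [];
    if (!ep.name) errors.push("name must be non-empty");
    else if (seen.has(ep.name)) errors.push(`duplicate endpoint name "${ep.name}"`);
    else seen.add(ep.name);

    if (ep.transport === "stdio") {
      if (!ep.command) errors.push("stdio endpoints require command");
    } else if (["sse", "http", "oauth"].includes(ep.transport)) {
      if (!ep.url) errors.push(`${ep.transport} endpoints require url`);
      if (ep.transport === "oauth" && !ep.oauth_server_url) {
        errors.push("oauth endpoints require oauth_server_url");
      }
    } else {
      errors.push(`unknown transport "${ep.transport}"`);
    }
    // A bad entry becomes a failed adapter instead of aborting startup.
    return { name: ep.name, state: errors.length ? "failed" : "ok", errors };
  });
}
```

The per-endpoint result keeps the error list so that a UI or log line can show why an adapter was registered as failed.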
Hot reload
The relay watches config.toml via the notify crate and applies changes without a restart. The file is diffed against the running config, and:
- New endpoints are spun up; removed endpoints are shut down gracefully.
- Endpoints whose meaningful fields changed (transport, command, args, url, env, headers, OAuth fields) are restarted in place.
- Endpoints whose only changes are disabled or disabled_tools are not restarted; the tool-catalog filter is updated in memory.
- Unchanged endpoints keep running with their existing adapter and OAuth tokens.
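One way to picture the diff step is the sketch below, which classifies endpoints by comparing the old and new parsed configs. It is an illustration under assumptions: diffEndpoints and its helpers are hypothetical names, and the text does not say exactly which OAuth fields trigger a restart, so the sketch uses the OAuth fields from the [[endpoints]] table.

```javascript
// Assumption: the OAuth entries here are taken from the [[endpoints]] table;
// the text only says "OAuth fields".
const RESTART_FIELDS = ["transport", "command", "args", "url", "env", "headers",
  "oauth_server_url", "client_id", "scopes", "token_endpoint"];
const FILTER_FIELDS = ["disabled", "disabled_tools"];

// Key-order-insensitive form so { A, B } and { B, A } compare equal.
function canonical(v) {
  if (Array.isArray(v)) return v.map(canonical);
  if (v && typeof v === "object") {
    return Object.keys(v).sort().map((k) => [k, canonical(v[k])]);
  }
  return v;
}
const sameValue = (a, b) => JSON.stringify(canonical(a)) === JSON.stringify(canonical(b));
const anyChanged = (a, b, fields) => fields.some((f) => !sameValue(a[f], b[f]));

function diffEndpoints(oldEps, newEps) {
  const oldByName = new Map(oldEps.map((e) => [e.name, e]));
  const newNames = new Set(newEps.map((e) => e.name));
  const plan = { added: [], removed: [], restarted: [], filterOnly: [], unchanged: [] };
  for (const ep of newEps) {
    const prev = oldByName.get(ep.name);
    if (!prev) plan.added.push(ep.name);
    else if (anyChanged(prev, ep, RESTART_FIELDS)) plan.restarted.push(ep.name);
    else if (anyChanged(prev, ep, FILTER_FIELDS)) plan.filterOnly.push(ep.name);
    else plan.unchanged.push(ep.name);
  }
  for (const ep of oldEps) {
    if (!newNames.has(ep.name)) plan.removed.push(ep.name);
  }
  return plan;
}
```

Checking the restart fields before the filter fields means an endpoint that changed both its url and its disabled_tools is restarted, not just re-filtered.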
Management API
The relay exposes a small JSON API for inspecting state, restarting adapters, completing OAuth flows, and editing the running configuration. Endara Desktop drives this API. Unlike /mcp, the management API does not listen on TCP — it binds to a Unix-domain socket on macOS / Linux ($XDG_RUNTIME_DIR/endara-relay/api.sock, falling back to $TMPDIR/endara-relay-<uid>/api.sock on macOS or <data-dir>/api.sock if no runtime dir is set) and to a per-user Named Pipe on Windows (\\.\pipe\endara-relay-<sid>). To script against it, use curl --unix-socket (macOS / Linux) or a Named Pipe client (Windows). Keeping the management API off TCP rules out drive-by browser attacks against a local HTTP endpoint; see Security for the broader threat model.
| Method | Path | Description |
|---|---|---|
| GET | /api/status | Process uptime, total endpoint count, and healthy count. |
| GET | /api/endpoints | All endpoints with health, tool count, last-activity timestamps, and lifecycle state. |
| GET | /api/catalog | Full merged tool catalog across all endpoints with applied prefixes, source endpoint name, and current availability (reflects per-endpoint and per-tool disable state plus health). |
| GET | /api/endpoints/:name/tools | Tool definitions for a single endpoint, including each tool’s input schema. |
| GET | /api/endpoints/:name/logs | Recent log lines for an endpoint, for debugging stuck or failing adapters. |
| POST | /api/endpoints/:name/restart | Restart an endpoint’s adapter. Returns immediately; the heavy work runs in the background and lifecycle state is surfaced through GET /api/endpoints. |
| POST | /api/endpoints/:name/refresh | Re-list tools from a healthy endpoint without restarting it. |
| POST | /api/endpoints/:name/disable | Shut down the adapter and mark the endpoint disabled. Persisted to the disabled-state file so the endpoint stays disabled across restarts; its tools disappear from the catalog. |
| POST | /api/endpoints/:name/enable | Clear the disabled flag and re-initialize the adapter. Persisted to the disabled-state file. |
| POST | /api/endpoints/:name/tools/:tool_name/disable | Hide a single tool from the merged catalog without disabling the endpoint as a whole. Persisted to the disabled-state file. |
| POST | /api/endpoints/:name/tools/:tool_name/enable | Re-enable a previously disabled tool on an endpoint. Persisted to the disabled-state file. |
| DELETE | /api/endpoints/:name | Remove an endpoint from the running registry and persist the deletion to config.toml. |
| GET | /api/config | Current parsed configuration with env values redacted. |
| POST | /api/config/reload | Force an immediate reload from disk (the file watcher does this automatically; this endpoint is for triggering it manually). |
| POST | /api/test-connection | Try connecting with the supplied transport / command / url / headers without persisting an endpoint. Useful for UIs validating user input before saving. |
| POST | /api/endpoints/:name/oauth/start | Start an OAuth authorization flow for the endpoint and return the authorize URL. |
| POST | /api/endpoints/:name/oauth/credentials | Persist OAuth client credentials (client_id / client_secret) for the endpoint. |
| GET | /api/endpoints/:name/oauth/status | Whether the endpoint has tokens, when they expire, and which scopes were granted. |
| POST | /api/endpoints/:name/oauth/revoke | Revoke and delete the stored OAuth tokens for the endpoint. |
| POST | /api/endpoints/:name/oauth/refresh | Force-refresh an access token using the stored refresh token. |
| GET | /api/endpoints/:name/oauth/metrics | In-process OAuth metric counters for the endpoint (e.g. token refreshes, refresh failures), as JSON. |
| POST | /api/oauth/setup | Create a transient OAuth setup session: discovers OAuth metadata, attempts Dynamic Client Registration, and returns the authorize URL, without writing to config.toml. |
| POST | /api/oauth/setup/:id/credentials | Submit manual client_id / client_secret for a setup session when DCR is unavailable, and receive the authorize URL. |
| GET | /api/oauth/setup/:id/status | Poll the status of a setup session (pending / awaiting credentials / authorized / failed). |
| POST | /api/oauth/setup/:id/commit | Persist a successfully authorized setup session: write the new endpoint into config.toml and register the running adapter. Only succeeds once the session has reached the Authorized state. |
| DELETE | /api/oauth/setup/:id | Cancel a setup session and clean up its in-memory state without writing to config. |
| POST | /api/endpoints/:name/credentials | Persist OAuth client credentials (client_id and optional client_secret) for an existing OAuth endpoint via the TokenManager DCR file. Modern replacement for the legacy client_secret TOML field. To seed credentials during initial setup, before the endpoint exists, use POST /api/oauth/setup/:id/credentials instead. |
| GET | /api/endpoints/:name/credentials | Inspect which credential fields are currently set for an endpoint (values are not returned). |
Scripting against the API
Because /api/* lives on a local socket / pipe, the invocation depends on your platform. Methods, paths, JSON bodies, and status codes are standard HTTP — only the transport is local.
# Linux — Unix-domain socket under $XDG_RUNTIME_DIR
curl --unix-socket "$XDG_RUNTIME_DIR/endara-relay/api.sock" \
http://localhost/api/status
# macOS — Unix-domain socket under $TMPDIR
curl --unix-socket "$TMPDIR/endara-relay-$(id -u)/api.sock" \
http://localhost/api/status
# Windows (PowerShell) — per-user Named Pipe
# curl 8.x supports --unix-socket against \\.\pipe\<name>
curl.exe --unix-socket "\\.\pipe\endara-relay-$([System.Security.Principal.WindowsIdentity]::GetCurrent().User.Value)" `
http://localhost/api/status
On all platforms, the host portion of the URL (http://localhost) is ignored by the relay — only the path and method matter. The socket / pipe is owned by the current user with restrictive permissions, and on Unix the relay verifies the peer's UID before accepting a connection.
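For scripts written in JavaScript rather than shell, the same lookup can be sketched as below. The helper names are hypothetical, and the fallback order is one reading of the Management API section (runtime dir first, then macOS $TMPDIR, then the data dir); check it against your installation. The returned object uses the socketPath option that Node's http.request accepts for Unix-domain sockets and Windows named pipes.

```javascript
// Hypothetical helper; the fallback order is an assumption based on the
// Management API section, not a confirmed description of the relay's logic.
function managementSocketPath({ platform, env, uid, sid, dataDir }) {
  if (platform === "win32") return `\\\\.\\pipe\\endara-relay-${sid}`;
  if (env.XDG_RUNTIME_DIR) return `${env.XDG_RUNTIME_DIR}/endara-relay/api.sock`;
  if (platform === "darwin" && env.TMPDIR) {
    const tmp = env.TMPDIR.replace(/\/+$/, ""); // macOS TMPDIR usually ends with "/"
    return `${tmp}/endara-relay-${uid}/api.sock`;
  }
  return `${dataDir}/api.sock`;
}

// Options object suitable for Node's http.request(); no host is needed
// because the connection goes over the local socket.
function statusRequestOptions(socketPath) {
  return { socketPath, path: "/api/status", method: "GET" };
}
```

Passing the result of statusRequestOptions to http.request from node:http and reading the response body should yield the same JSON as the curl commands above.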
JS execution engine
When [relay] local_js_execution = true, the relay replaces its full advertised tool catalog with three meta-tools — list_tools, search_tools, and execute_tools — and rejects any direct tool call with the message "Direct tool calls are not allowed in JS execution mode. Use execute_tools instead." The model is expected to look up the tools it actually needs through search_tools, then call them inside a single sandboxed JavaScript program.
How it works
execute_tools runs the supplied script in an embedded boa_engine JavaScript sandbox: entirely in-process, with no Node.js, no require / import / fetch, and no filesystem or network access of its own. The script body is wrapped in (async function() { ... })() so top-level await works. Whatever value you pass to return becomes the meta-tool's result. Each call gets a fresh context; no state persists between execute_tools invocations.
Sandbox limits
- 30-second wall-clock timeout per execute_tools call (hardcoded). Slow tool calls inside the script count toward this budget.
- 1,000,000 loop-iteration cap on each loop in the script, to keep while (true) {} from hanging the relay.
- JSON.parse is wrapped with a friendlier error that includes the input kind, length, and a short preview when parsing fails.
Functions and globals exposed to the script
- tools["NAME"](args): global object with one function per available tool. Returns the parsed JSON of the MCP result. The function is synchronous from JavaScript's perspective, but await tools[...] is harmless and is the recommended style for forward compatibility and readability.
- call(name, args?, opts?): an alternative invocation form that auto-unwraps the response. It returns structuredContent when present, otherwise JSON-parses content[0].text when it looks like JSON, otherwise returns the raw text. It throws if the tool returns isError: true. Pass { raw: true } to skip the unwrap and get the full MCP result, or { retry: N } to retry transient failures (HTTP 502/503/504, timeouts, connection resets) up to N times with backoff (200/400/800 ms ± 25% jitter).
- Standard ECMAScript globals (JSON, Math, Array, String, Promise, Date, etc.) per boa_engine.
- Not exposed: console, fetch, require, import, process, globalThis.fs, timers (setTimeout / setInterval).
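To make the auto-unwrap and retry timing of call concrete, here is a standalone sketch of the two rules. It is not the relay's implementation: unwrapResult and retryDelayMs are hypothetical names, the "looks like JSON" check is assumed to be a leading [ or {, and the sketch assumes raw mode returns the result untouched even when isError is set.

```javascript
// Hypothetical re-statement of the documented unwrap order.
function unwrapResult(result, opts = {}) {
  // Assumption: raw mode skips every check and hands back the full MCP result.
  if (opts.raw) return result;
  if (result.isError) {
    throw new Error(result.content?.[0]?.text ?? "tool call failed");
  }
  if (result.structuredContent !== undefined) return result.structuredContent;
  const text = result.content?.[0]?.text;
  if (typeof text === "string" && /^\s*[\[{]/.test(text)) {
    // Assumption: text that looks like JSON but fails to parse is returned raw.
    try { return JSON.parse(text); } catch { return text; }
  }
  return text;
}

// Delay before retry number `attempt` (0-based): 200, 400, 800 ms, each with
// up to 25% jitter either way. `random` is injectable so tests can pin it.
function retryDelayMs(attempt, random = Math.random) {
  const base = 200 * 2 ** attempt;
  const jitter = (random() * 2 - 1) * 0.25;
  return Math.round(base * (1 + jitter));
}
```

With random pinned to 0.5 the jitter term is zero, so the delays come out as exactly 200, 400, and 800 ms.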
Tool naming inside the script
Tool keys on the tools object follow the same prefixing scheme as the underlying catalog. Multi-server mode produces prefix__name with a double underscore between prefix and tool name (e.g. github__list_repos). Single-server mode omits the prefix.
Result shape and the safe-handling pattern
Every tools[...] call returns the standard MCP tool result: { content?: [{ type, text }], structuredContent? }. structuredContent is the server's structured output and is preferred. content[0].text is provider-defined prose and is not guaranteed to be JSON; it may be empty, truncated, or natural language. Use this pattern:
const r = await tools["todoist__get-tasks"]({ limit: 5 });
if (r.structuredContent) return r.structuredContent;
const t = r.content && r.content[0] && r.content[0].text;
return typeof t === "string" && /^\s*[\[{]/.test(t) ? JSON.parse(t) : t;
The three meta-tools
list_tools({ limit?, offset? })
Paginated catalog. limit defaults to 50 and is capped at 200. Returns { tools, total, limit, offset }; each tool entry is { name, description, input_schema, annotations? }. Use this when you want to enumerate every tool the relay knows about.
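The paging contract can be illustrated with a small standalone sketch. Only the default of 50 and the cap of 200 come from the description above; listToolsPage and listAllTools are hypothetical names, and echoing the capped limit back in the response is an assumption.

```javascript
// Hypothetical paging over an in-memory catalog array.
function listToolsPage(catalog, { limit = 50, offset = 0 } = {}) {
  const capped = Math.min(limit, 200); // requests above 200 are capped
  return {
    tools: catalog.slice(offset, offset + capped),
    total: catalog.length,
    limit: capped,
    offset,
  };
}

// Walk every page until `total` entries have been collected.
function listAllTools(catalog, pageSize = 200) {
  const all = [];
  let offset = 0;
  while (true) {
    const page = listToolsPage(catalog, { limit: pageSize, offset });
    all.push(...page.tools);
    // Advance by what was actually returned, in case the limit was capped.
    offset += page.tools.length;
    if (page.tools.length === 0 || offset >= page.total) return all;
  }
}
```

Advancing the offset by the number of tools actually returned keeps the walk correct even when the requested page size exceeds the cap.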
search_tools({ query, limit? })
Fuzzy ranked search across tool name, description, endpoint name, and input-schema property names. limit defaults to 20 and is capped at 200. Search is case-insensitive and typo-tolerant (Levenshtein), and respects camelCase / snake_case / kebab-case word boundaries. Ranking goes exact > prefix > substring > fuzzy; field weights are name > description > endpoint; tools matching more query tokens rank higher. Returns an array of { name, description, input_schema, annotations? }.
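The ranking described above can be approximated with the sketch below. It is a simplified illustration, not the relay's algorithm: the numeric scores and field weights are invented to respect the stated orderings (exact > prefix > substring > fuzzy, name > description > endpoint), fuzzy matching is assumed to mean an edit distance of at most 1 for tokens of four or more characters, and input-schema property names are left out for brevity.

```javascript
// Split on camelCase, snake_case, kebab-case, and other non-alphanumerics.
function splitWords(s) {
  return s.replace(/([a-z0-9])([A-Z])/g, "$1 $2").toLowerCase()
    .split(/[^a-z0-9]+/).filter(Boolean);
}

// Classic single-row Levenshtein edit distance.
function levenshtein(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diag = row[0];
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j];
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diag + cost);
      diag = above;
    }
  }
  return row[b.length];
}

// exact > prefix > substring > fuzzy; the numbers themselves are assumptions.
function matchScore(token, word) {
  if (word === token) return 4;
  if (word.startsWith(token)) return 3;
  if (word.includes(token)) return 2;
  if (token.length >= 4 && levenshtein(token, word) <= 1) return 1;
  return 0;
}

// name > description > endpoint; the weights are assumptions.
const FIELD_WEIGHTS = { name: 3, description: 2, endpoint: 1 };

function scoreTool(tool, query) {
  let total = 0;
  for (const token of splitWords(query)) {
    let best = 0;
    for (const [field, weight] of Object.entries(FIELD_WEIGHTS)) {
      for (const word of splitWords(tool[field] ?? "")) {
        best = Math.max(best, weight * matchScore(token, word));
      }
    }
    total += best; // every matched query token adds to the score
  }
  return total;
}

function searchTools(tools, query, limit = 20) {
  return tools
    .map((tool) => ({ tool, score: scoreTool(tool, query) }))
    .filter((x) => x.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, Math.min(limit, 200))
    .map((x) => x.tool);
}
```

Summing the best score per query token is what lets a tool that matches more tokens outrank one that matches a single token strongly.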
execute_tools({ script })
Runs script under the rules above and returns whatever the script returns. Throws if the script throws or exceeds a sandbox limit; the error message is propagated back to the meta-tool caller.
Why this exists — the token-burn problem
A typical desktop client connects to many MCP servers (filesystem, github, slack, jira, todoist, postgres, …). The combined catalog can easily be hundreds of tools with multi-thousand-character JSON schemas attached to each.
In standard MCP mode, every one of those tool definitions is sent to the model on every request. The catalog alone can cost tens of thousands of input tokens per turn just to advertise capabilities the model probably won't use this turn.
JS mode collapses that advertised surface to three tools. The model uses search_tools to look up the handful of tools it needs for the current task, calls them inside a single execute_tools round-trip, and returns only the distilled answer. Two compounding wins:
- The upfront catalog cost drops by orders of magnitude.
- Intermediate tool results never have to round-trip back through the model — the script can fetch 1,000 records, filter to 5, and return only those 5. Without JS mode the model would see all 1,000 in its context just to pick 5.
Worked examples
Example 1 — discover then call:
// The model doesn't know the exact tool name, so it searches first.
const matches = await tools["search_tools"]({ query: "list github issues", limit: 5 });
const m = matches[0]; // pick the top hit
const r = await tools[m.name]({ repo: "endara-ai/endara-relay", state: "open" });
return r.structuredContent ?? r.content?.[0]?.text;
Example 2 — chain calls in one round-trip:
const projects = await tools["todoist__get-projects"]({});
const proj = (projects.structuredContent ?? []).find(p => p.name === "Inbox");
if (!proj) return { error: "Inbox project not found" }; // avoid reading .id of undefined
const tasks = await tools["todoist__get-tasks"]({ project_id: proj.id });
return { projectId: proj.id, tasks: tasks.structuredContent };
Example 3 — reduce-and-return (the token-burn-reduction pattern):
// Fetch potentially huge data, but only return what the model needs.
const all = await tools["github__list_issues"]({ repo: "endara-ai/endara-relay", state: "open" });
const issues = all.structuredContent ?? [];
// 200 issues -> 5 stale ones with just the fields we care about.
const stale = issues
.filter(i => Date.now() - new Date(i.updated_at).getTime() > 30 * 86400_000)
.sort((a, b) => new Date(a.updated_at) - new Date(b.updated_at))
.slice(0, 5)
.map(i => ({ number: i.number, title: i.title, updated_at: i.updated_at }));
return { staleCount: stale.length, stale };
Limits to remember
Any single execute_tools call is bounded by the 30-second wall-clock timeout and the 1M-iteration loop cap. Scripts cannot persist state between calls — each invocation starts from scratch. If a tool call inside the script throws, the sandbox surfaces the error message back to the meta-tool caller.
Tool prefixing
Two MCP servers can ship tools with the same name (for example, both a filesystem and a sandbox server might call something read_file). To keep names unique, the relay prefixes every tool it advertises with the endpoint's prefix and a double underscore, e.g. github__list_repos.
The prefix is taken from the endpoint's tool_prefix if set; otherwise it's derived from name by sanitizing to lowercase ASCII (non-ASCII characters are stripped). If sanitization yields an empty string, set tool_prefix explicitly. When the relay is connected to only one underlying server, prefixes are omitted and tools keep their original names.
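The naming rule can be sketched as follows. The text only says that sanitizing lowercases the name and strips non-ASCII characters, so that is all this hypothetical sanitizeName does; the real sanitize_name may treat spaces or punctuation differently. advertisedToolName is likewise an illustrative name.

```javascript
// Assumption: lowercasing and stripping non-ASCII is the whole rule.
function sanitizeName(name) {
  return name.toLowerCase().replace(/[^\x00-\x7F]/g, "");
}

function advertisedToolName(endpoint, toolName, endpointCount) {
  // Single-server mode: tools keep their original names.
  if (endpointCount === 1) return toolName;
  const prefix = endpoint.tool_prefix ?? sanitizeName(endpoint.name);
  if (prefix === "") {
    throw new Error(`endpoint "${endpoint.name}" needs an explicit tool_prefix`);
  }
  return `${prefix}__${toolName}`;
}
```

For example, an endpoint named GitHub with no tool_prefix yields github__list_repos when two or more endpoints are configured.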
Crash recovery
STDIO adapters are restarted automatically with exponential backoff when the underlying process exits unexpectedly. Each restart resets the adapter to the initializing lifecycle state and then either back to ready on success or to failed with the most recent error exposed on the endpoint entry returned by GET /api/endpoints. SSE / HTTP / OAuth adapters reconnect on transport errors; OAuth tokens are refreshed automatically when an access token nears expiry.
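The restart schedule can be pictured with the sketch below. The text says only "exponential backoff", so the 500 ms base and 30 s ceiling are made-up values for illustration, and restartDelayMs and nextLifecycleState are hypothetical names; the state names are the ones used above.

```javascript
// Hypothetical parameters: the base delay and cap are not given in the text.
function restartDelayMs(attempt, baseMs = 500, maxMs = 30000) {
  // Doubles on every attempt (0-based) until it reaches the ceiling.
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// After an attempt the adapter leaves "initializing" for "ready" or "failed".
function nextLifecycleState(restartSucceeded, error) {
  return restartSucceeded
    ? { state: "ready" }
    : { state: "failed", lastError: error };
}
```

Capping the delay keeps a persistently crashing server from pushing the retry interval out indefinitely.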
File locations
All paths are under the data directory (default ~/.endara):
- ~/.endara/config.toml: the main configuration file.
- ~/.endara/logs/relay.log.<YYYY-MM-DD>: daily rotated log files. Stdout also receives the same lines.
- ~/.endara/tokens/: OAuth tokens and DCR client credentials. Files are written with mode 0600 on Unix. Override the location via [relay] token_dir.
Troubleshooting
Port 9400 already in use (EADDRINUSE)
Another endara-relay instance is already listening, or you have both Endara Desktop's bundled relay and a separately installed CLI relay running. Stop one of them, or pass --port to use a different port. See Desktop troubleshooting for how Desktop handles the same conflict.
Endpoint stuck in failed state
Inspect GET /api/endpoints/{name}/logs for recent adapter output and ~/.endara/logs/relay.log.<YYYY-MM-DD> for the relay's own log. Common causes: a missing command, a server that needs an env var that wasn't set, or an OAuth flow that hasn't been completed.
Environment variable resolution failure
If a $VAR reference in env or headers isn't set, the affected endpoint is registered as a failed adapter. Set the variable in the relay's process environment (or in your shell profile if you launch the relay from the shell) and the next reload picks it up. Use $$ to emit a literal dollar sign.
Tool name collisions
If two endpoints derive the same prefix from their names (say endpoints named GitHub and github, which both sanitize to github), set an explicit tool_prefix on one of them. Validation already prevents two endpoints from sharing the exact same name.