Endara Relay

One endpoint for all your MCP servers. A single Rust binary that aggregates local STDIO servers, remote HTTP/SSE servers, and OAuth servers and serves them at http://localhost:9400/mcp. A separate management API (used by Endara Desktop) is exposed on a local Unix-domain socket / Windows Named Pipe — never on a TCP port.

Overview

Endara Relay is a single Rust binary that sits between your AI client (Claude Desktop, Cursor, ChatGPT, Windsurf, VS Code, Zed, Continue, or any MCP-compatible app) and all the MCP servers you actually use. Point every client at one local endpoint, http://localhost:9400/mcp, and the relay handles the rest: spawning STDIO servers, holding onto SSE / HTTP connections, refreshing OAuth tokens, and merging every server's tool catalog into a single unified list with collision-free names.

It uses one transport-specific adapter per endpoint, namespaces tools with a stable prefix to avoid collisions, and watches config.toml for changes so you can add or remove servers without restarting. STDIO adapters are restarted automatically with exponential backoff if the underlying process crashes.

Optionally, Relay can run in JS execution mode, where instead of advertising hundreds of tool definitions to the model on every turn, it advertises three meta-tools and lets the model run a sandboxed JavaScript program that calls the underlying tools in a single round-trip. See JS execution engine below.

No cloud, no accounts, no telemetry. Everything runs on your machine.

Install

Pick whichever you have set up.

Homebrew (macOS / Linux)

brew install endara-ai/tap/endara-relay

Cargo

cargo install endara-relay

Pre-built binaries

Download the latest release for your platform from github.com/endara-ai/endara-relay/releases.

Or, if you'd rather not run the relay yourself, install Endara Desktop — it bundles the relay, starts and stops it for you, and provides a UI for managing endpoints.

Quick start

Drop the following into ~/.endara/config.toml:

# ~/.endara/config.toml
[relay]
machine_name = "my-mac"

[[endpoints]]
name = "filesystem"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/projects"]

Then start the relay:

endara-relay start

Point any MCP-compatible client at http://localhost:9400/mcp. Claude Desktop only speaks stdio, so use the mcp-remote bridge — drop this into claude_desktop_config.json:

{
  "mcpServers": {
    "endara": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "http://localhost:9400/mcp"]
    }
  }
}

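For Cursor, add the same URL under Settings → MCP → Add new MCP server → HTTP. Restart the client and the filesystem tools should appear in its tool list. With only this one endpoint configured they keep their original names; once a second endpoint is added they are advertised with the filesystem__ prefix (see Tool prefixing).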

CLI reference

The relay has a single subcommand, start, which boots the HTTP server, loads the config, and starts watching it for changes.

| Flag | Default | Description |
| --- | --- | --- |
| `--data-dir` | `~/.endara` | Base directory for config, logs, and OAuth tokens. The relay creates it if it doesn't exist and writes a default `config.toml` on first run. |
| `--config` | `<data-dir>/config.toml` | Override the path to the TOML configuration file. |
| `--port` | `9400` | Port to listen on. The MCP endpoint (`/mcp` and `/mcp/`), `/oauth/callback`, and `/healthz` are served on this TCP port. The management API (`/api/`) is not served on TCP; it is bound to a Unix-domain socket / Windows Named Pipe (see Management API). |
| `--log-format` | `text` | Log output format. Either `text` or `json`. |

Note: The RUST_LOG environment variable overrides the log filter when set. The default is info,endara_relay=debug. Logs are written to both stdout and ~/.endara/logs/relay.log.<YYYY-MM-DD>.

Configuration reference

Endara Relay reads a single TOML file. Default location: ~/.endara/config.toml. The file has one [relay] table and any number of [[endpoints]] entries.

[relay] table

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `machine_name` | string | yes | system hostname (when default config is generated) | Identifies this relay instance in logs and `/api/status`. Free-form; pick anything that helps you tell machines apart. |
| `local_js_execution` | bool | no | `false` | When `true`, the advertised tool catalog is replaced with three meta-tools (`list_tools`, `search_tools`, `execute_tools`) and direct tool calls are rejected. See JS execution engine. |
| `token_dir` | string (path) | no | `<data-dir>/tokens` | Override the directory used for OAuth token and DCR-credential storage. Useful when you want a non-default location separate from the data dir. |

[[endpoints]] entries

Each [[endpoints]] table describes one MCP server to connect to. Required fields depend on the chosen transport.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `name` | string | yes | Unique, non-empty identifier. Used as the default tool prefix (sanitized to lowercase ASCII), and is the path segment in management API URLs (those URLs are served on the management socket; see Management API). |
| `description` | string | no | Free-form; surfaced in the UI and in logs. |
| `tool_prefix` | string | no | Override the auto-derived prefix. If omitted, defaults to `sanitize_name(name)`. See Tool prefixing. |
| `transport` | `stdio` \| `sse` \| `http` \| `oauth` | yes | Adapter type. Determines which other fields are required. |
| `command` | string | yes (`stdio`) | Executable to spawn for STDIO transports. |
| `args` | array of string | no | Arguments passed to the spawned process. |
| `url` | string | yes (`sse`, `http`, `oauth`) | Endpoint URL for HTTP-based transports. |
| `env` | map of string → string | no | Environment variables passed to STDIO subprocesses. Values support `$VAR` resolution and `$$` escaping; see Environment variable resolution. |
| `headers` | map of string → string | no | Extra HTTP headers for `http` / `sse` / `oauth` transports. Header values support inline `$VAR` substitution (e.g. `Authorization = "Bearer $TOKEN"`). |
| `disabled` | bool | no | Default `false`. When `true`, the endpoint is registered but no adapter is started; toggling this does not restart adapters during hot-reload. |
| `disabled_tools` | array of string | no | Tool names to hide from the advertised catalog without disabling the underlying server. Calls to disabled tools return an MCP error. |
| `oauth_server_url` | string | yes (`oauth`) | Authorization server base URL. The relay performs OIDC / metadata discovery against this URL. |
| `client_id` | string | no | Pre-provisioned OAuth client identifier. If omitted, the relay performs Dynamic Client Registration when needed. |
| `scopes` | array of string | no | OAuth scopes requested during authorization. |
| `token_endpoint` | string | no | Override the discovered token endpoint URL. Rarely needed. |

OAuth credentials: OAuth client credentials are not stored in config.toml. They are written via POST /api/endpoints/{name}/oauth/credentials (or the equivalent flow in Endara Desktop) and persisted by the relay's TokenManager under ~/.endara/tokens/ with mode 0600. Dynamic Client Registration (DCR) populates them automatically when the server supports it; otherwise you provide client_id (and client_secret for confidential clients) via the API or desktop UI. Note: /api/* is exposed only on the management socket described in Management API, not on http://localhost:9400. Use curl --unix-socket (or the Desktop UI) to call it.

Transport snippets

STDIO

[[endpoints]]
name = "github"
description = "GitHub MCP server"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
env = { GITHUB_TOKEN = "$GITHUB_TOKEN" }

HTTP

[[endpoints]]
name = "context7"
transport = "http"
url = "https://mcp.context7.com/mcp"
headers = { Authorization = "Bearer $CONTEXT7_KEY" }

SSE

[[endpoints]]
name = "remote-sse"
transport = "sse"
url = "https://example.com/mcp/sse"

OAuth

[[endpoints]]
name = "linear"
transport = "oauth"
url = "https://mcp.linear.app/mcp"
oauth_server_url = "https://mcp.linear.app"
scopes = ["read", "write"]
# client_id / client_secret are persisted via the management API, not TOML

Environment variable resolution

Endpoint env values and headers values are passed through a small resolver before adapters start:

$VAR: looks up VAR in the relay's process environment and substitutes its value. Header values support $VAR in any position (e.g. Bearer $TOKEN); env values must be a single $VAR reference at the start of the string.
$$: escapes a literal dollar sign. $$VAR becomes the literal string $VAR; useful when a server really wants the dollar sign character.
Plain text: kept as-is.

If a referenced variable is not set, the relay records the failure through ConfigError::EnvVarMissing. With graceful validation (the default at startup and during hot-reload), the affected endpoint is registered as a failed adapter with the underlying error visible on its entry in GET /api/endpoints (filter the array by name) — startup itself does not fail.
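
The rules above can be sketched in a few lines of JavaScript. This is only an illustration of the documented behaviour, not the relay's Rust resolver: the function names, the EnvVarMissing class (borrowed from the ConfigError::EnvVarMissing name), and the accepted variable-name characters are assumptions, and `env` stands in for the relay's process environment.

```javascript
// Illustrative error type; the relay reports ConfigError::EnvVarMissing instead.
class EnvVarMissing extends Error {
  constructor(variable) {
    super(`environment variable ${variable} is not set`);
    this.name = "EnvVarMissing";
    this.variable = variable;
  }
}

function lookup(variable, env) {
  if (!Object.prototype.hasOwnProperty.call(env, variable)) {
    throw new EnvVarMissing(variable);
  }
  return env[variable];
}

// Header values: $VAR may appear anywhere, and $$ yields a literal "$".
// Assumption: variable names match [A-Za-z_][A-Za-z0-9_]*.
function resolveHeaderValue(value, env) {
  return value.replace(/\$\$|\$([A-Za-z_][A-Za-z0-9_]*)/g, (match, variable) =>
    match === "$$" ? "$" : lookup(variable, env)
  );
}

// env values: a single leading $VAR reference, a leading $$ escape, or plain text.
function resolveEnvValue(value, env) {
  if (value.startsWith("$$")) return value.slice(1); // "$$VAR" -> "$VAR"
  if (value.startsWith("$")) return lookup(value.slice(1), env);
  return value;
}
```

Under these assumptions, resolveHeaderValue("Bearer $TOKEN", { TOKEN: "abc" }) gives "Bearer abc", and a missing variable throws instead of silently producing an empty string, which mirrors how the relay marks the endpoint as failed.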

Validation rules

The [relay] table is required; missing it is a fatal error.
Endpoint names must be non-empty and unique within the file.
stdio endpoints require command; sse, http, and oauth endpoints require url; oauth additionally requires oauth_server_url.
Per-endpoint validation runs gracefully: an endpoint that fails any of the above is registered as a failed adapter rather than blocking the relay from starting. This means one bad entry can't take the whole relay down.
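
As a rough sketch of how these rules combine, the following JavaScript validates a parsed config object. It assumes the TOML has already been parsed into `{ relay, endpoints }`; the function names, the "ok" / "failed" status strings, and the rejection of unknown transports are illustrative assumptions, not the relay's actual code.

```javascript
const URL_TRANSPORTS = new Set(["sse", "http", "oauth"]);

// Returns the list of problems for one endpoint; an empty list means it is valid.
function validateEndpoint(endpoint, seenNames) {
  const errors = [];
  if (typeof endpoint.name !== "string" || endpoint.name === "") {
    errors.push("name must be a non-empty string");
  } else if (seenNames.has(endpoint.name)) {
    errors.push(`duplicate endpoint name "${endpoint.name}"`);
  } else {
    seenNames.add(endpoint.name);
  }
  if (endpoint.transport === "stdio") {
    if (!endpoint.command) errors.push("stdio endpoints require command");
  } else if (URL_TRANSPORTS.has(endpoint.transport)) {
    if (!endpoint.url) errors.push(`${endpoint.transport} endpoints require url`);
    if (endpoint.transport === "oauth" && !endpoint.oauth_server_url) {
      errors.push("oauth endpoints require oauth_server_url");
    }
  } else {
    errors.push(`unknown transport "${endpoint.transport}"`);
  }
  return errors;
}

// A missing [relay] table is fatal; endpoint problems are collected per entry
// so that one bad endpoint does not stop the others.
function validateConfig(config) {
  if (!config.relay) throw new Error("missing [relay] table");
  const seenNames = new Set();
  return (config.endpoints ?? []).map((endpoint) => {
    const errors = validateEndpoint(endpoint, seenNames);
    return { name: endpoint.name, status: errors.length ? "failed" : "ok", errors };
  });
}
```

The split between a thrown error and a per-endpoint result list reflects the graceful-validation behaviour described above.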

Hot reload

The relay watches config.toml via the notify crate and applies changes without a restart. The file is diffed against the running config, and:

New endpoints are spun up; removed endpoints are shut down gracefully.
Endpoints whose meaningful fields changed (transport, command, args, url, env, headers, OAuth fields) are restarted in place.
Endpoints whose only changes are disabled or disabled_tools are not restarted — the tool-catalog filter is updated in memory.
Unchanged endpoints keep running with their existing adapter and OAuth tokens.
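
The classification above can be sketched as a diff over two endpoint lists. This is a minimal illustration under stated assumptions: which fields count as "OAuth fields" is a guess based on the configuration reference, fields not listed in either group are ignored, and the comparison is a simple order-sensitive JSON comparison rather than whatever the relay does internally.

```javascript
// Fields whose change forces an adapter restart (the OAuth entries are an assumption).
const RESTART_FIELDS = [
  "transport", "command", "args", "url", "env", "headers",
  "oauth_server_url", "client_id", "scopes", "token_endpoint",
];
// Fields that only change the in-memory tool-catalog filter.
const FILTER_FIELDS = ["disabled", "disabled_tools"];

// Order-sensitive structural comparison; sufficient for a sketch.
function sameValue(a, b) {
  return JSON.stringify(a) === JSON.stringify(b);
}

function diffEndpoints(oldEndpoints, newEndpoints) {
  const oldByName = new Map(oldEndpoints.map((e) => [e.name, e]));
  const newByName = new Map(newEndpoints.map((e) => [e.name, e]));
  const plan = { added: [], removed: [], restarted: [], filterOnly: [], unchanged: [] };

  for (const [name, next] of newByName) {
    const prev = oldByName.get(name);
    if (!prev) {
      plan.added.push(name);
    } else if (RESTART_FIELDS.some((f) => !sameValue(prev[f], next[f]))) {
      plan.restarted.push(name);
    } else if (FILTER_FIELDS.some((f) => !sameValue(prev[f], next[f]))) {
      plan.filterOnly.push(name);
    } else {
      plan.unchanged.push(name);
    }
  }
  for (const name of oldByName.keys()) {
    if (!newByName.has(name)) plan.removed.push(name);
  }
  return plan;
}
```

Checking the restart fields before the filter fields means an endpoint that changes both its url and its disabled_tools is restarted once rather than handled twice.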

Management API

The relay exposes a small JSON API for inspecting state, restarting adapters, completing OAuth flows, and editing the running configuration. Endara Desktop drives this API. Unlike /mcp, the management API does not listen on TCP — it binds to a Unix-domain socket on macOS / Linux ($XDG_RUNTIME_DIR/endara-relay/api.sock, falling back to $TMPDIR/endara-relay-<uid>/api.sock on macOS or <data-dir>/api.sock if no runtime dir is set) and to a per-user Named Pipe on Windows (\\.\pipe\endara-relay-<sid>). To script against it, use curl --unix-socket (macOS / Linux) or a Named Pipe client (Windows). Keeping the management API off TCP rules out drive-by browser attacks against a local HTTP endpoint; see Security for the broader threat model.

| Method | Path | Description |
| --- | --- | --- |
| `GET` | `/api/status` | Process uptime, total endpoint count, and healthy count. |
| `GET` | `/api/endpoints` | All endpoints with health, tool count, last-activity timestamps, and lifecycle state. |
| `GET` | `/api/catalog` | Full merged tool catalog across all endpoints with applied prefixes, source endpoint name, and current availability (reflects per-endpoint and per-tool disable state plus health). |
| `GET` | `/api/endpoints/:name/tools` | Tool definitions for a single endpoint, including each tool’s input schema. |
| `GET` | `/api/endpoints/:name/logs` | Recent log lines for an endpoint, for debugging stuck or failing adapters. |
| `POST` | `/api/endpoints/:name/restart` | Restart an endpoint’s adapter. Returns immediately; the heavy work runs in the background and lifecycle state is surfaced through GET /api/endpoints. |
| `POST` | `/api/endpoints/:name/refresh` | Re-list tools from a healthy endpoint without restarting it. |
| `POST` | `/api/endpoints/:name/disable` | Shut down the adapter and mark the endpoint disabled. Persisted to the disabled-state file so the endpoint stays disabled across restarts; its tools disappear from the catalog. |
| `POST` | `/api/endpoints/:name/enable` | Clear the disabled flag and re-initialize the adapter. Persisted to the disabled-state file. |
| `POST` | `/api/endpoints/:name/tools/:tool_name/disable` | Hide a single tool from the merged catalog without disabling the endpoint as a whole. Persisted to the disabled-state file. |
| `POST` | `/api/endpoints/:name/tools/:tool_name/enable` | Re-enable a previously disabled tool on an endpoint. Persisted to the disabled-state file. |
| `DELETE` | `/api/endpoints/:name` | Remove an endpoint from the running registry and persist the deletion to config.toml. |
| `GET` | `/api/config` | Current parsed configuration with env values redacted. |
| `POST` | `/api/config/reload` | Force an immediate reload from disk (the file watcher does this automatically; this endpoint is for triggering it manually). |
| `POST` | `/api/test-connection` | Try connecting with the supplied transport / command / url / headers without persisting an endpoint. Useful for UIs validating user input before saving. |
| `POST` | `/api/endpoints/:name/oauth/start` | Start an OAuth authorization flow for the endpoint and return the authorize URL. |
| `POST` | `/api/endpoints/:name/oauth/credentials` | Persist OAuth client credentials (client_id / client_secret) for the endpoint. |
| `GET` | `/api/endpoints/:name/oauth/status` | Whether the endpoint has tokens, when they expire, and which scopes were granted. |
| `POST` | `/api/endpoints/:name/oauth/revoke` | Revoke and delete the stored OAuth tokens for the endpoint. |
| `POST` | `/api/endpoints/:name/oauth/refresh` | Force-refresh an access token using the stored refresh token. |
| `GET` | `/api/endpoints/:name/oauth/metrics` | In-process OAuth metric counters for the endpoint (e.g. token refreshes, refresh failures), as JSON. |
| `POST` | `/api/oauth/setup` | Create a transient OAuth setup session: discovers OAuth metadata, attempts Dynamic Client Registration, and returns the authorize URL, without writing to config.toml. |
| `POST` | `/api/oauth/setup/:id/credentials` | Submit manual client_id / client_secret for a setup session when DCR is unavailable, and receive the authorize URL. |
| `GET` | `/api/oauth/setup/:id/status` | Poll the status of a setup session (pending / awaiting credentials / authorized / failed). |
| `POST` | `/api/oauth/setup/:id/commit` | Persist a successfully authorized setup session: write the new endpoint into config.toml and register the running adapter. Only succeeds once the session has reached the Authorized state. |
| `DELETE` | `/api/oauth/setup/:id` | Cancel a setup session and clean up its in-memory state without writing to config. |
| `POST` | `/api/endpoints/:name/credentials` | Persist OAuth client credentials (client_id and optional client_secret) for an existing OAuth endpoint via the TokenManager DCR file. Modern replacement for the legacy client_secret TOML field. To seed credentials during initial setup, before the endpoint exists, use POST /api/oauth/setup/:id/credentials instead. |
| `GET` | `/api/endpoints/:name/credentials` | Inspect which credential fields are currently set for an endpoint (values are not returned). |

Scripting against the API

Because /api/* lives on a local socket / pipe, the invocation depends on your platform. Methods, paths, JSON bodies, and status codes are standard HTTP — only the transport is local.

# Linux — Unix-domain socket under $XDG_RUNTIME_DIR
curl --unix-socket "$XDG_RUNTIME_DIR/endara-relay/api.sock" \
  http://localhost/api/status

# macOS — Unix-domain socket under $TMPDIR
curl --unix-socket "$TMPDIR/endara-relay-$(id -u)/api.sock" \
  http://localhost/api/status

# Windows (PowerShell) — per-user Named Pipe
# curl 8.x supports --unix-socket against \\.\pipe\<name>
curl.exe --unix-socket "\\.\pipe\endara-relay-$([System.Security.Principal.WindowsIdentity]::GetCurrent().User.Value)" `
  http://localhost/api/status

On all platforms, the host portion of the URL (http://localhost) is ignored by the relay — only the path and method matter. The socket / pipe is owned by the current user with restrictive permissions, and on Unix the relay verifies the peer's UID before accepting a connection.
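
For scripting from Node.js instead of curl, a sketch along these lines may work on macOS / Linux. The socket-path fallback order is one reading of the Management API description, and the helper names (managementSocketPath, getStatus) are hypothetical; only Node's standard http.request with its socketPath option is used.

```javascript
// Resolve the management socket path, following the documented fallback order
// as interpreted here: $XDG_RUNTIME_DIR first, then $TMPDIR on macOS, then the data dir.
function managementSocketPath({ platform, env, uid, dataDir }) {
  if (env.XDG_RUNTIME_DIR) {
    return `${env.XDG_RUNTIME_DIR}/endara-relay/api.sock`;
  }
  if (platform === "darwin" && env.TMPDIR) {
    const tmp = env.TMPDIR.replace(/\/$/, ""); // macOS TMPDIR usually ends with "/"
    return `${tmp}/endara-relay-${uid}/api.sock`;
  }
  return `${dataDir}/api.sock`;
}

// GET /api/status over the Unix-domain socket and parse the JSON body.
function getStatus(socketPath) {
  const http = require("node:http"); // loaded lazily so the path helper has no dependencies
  return new Promise((resolve, reject) => {
    const req = http.request({ socketPath, path: "/api/status", method: "GET" }, (res) => {
      let body = "";
      res.setEncoding("utf8");
      res.on("data", (chunk) => { body += chunk; });
      res.on("end", () => {
        try {
          resolve({ status: res.statusCode, body: JSON.parse(body) });
        } catch (err) {
          reject(err);
        }
      });
    });
    req.on("error", reject);
    req.end();
  });
}
```

A caller would typically pass process.platform, process.env, process.getuid(), and the data directory to managementSocketPath, then await getStatus(path).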

JS execution engine

When [relay] local_js_execution = true, the relay replaces its full advertised tool catalog with three meta-tools — list_tools, search_tools, and execute_tools — and rejects any direct tool call with the message "Direct tool calls are not allowed in JS execution mode. Use execute_tools instead." The model is expected to look up the tools it actually needs through search_tools, then call them inside a single sandboxed JavaScript program.

How it works

execute_tools runs the supplied script in an embedded boa_engine JavaScript sandbox: entirely in-process, with no Node.js, no require / import / fetch, and no filesystem or network access of its own. The script body is wrapped in (async function() { ... })() so top-level await works. Whatever value you pass to return becomes the meta-tool's result. Each call gets a fresh context; no state persists between execute_tools invocations.

Sandbox limits

30-second wall-clock timeout per execute_tools call (hardcoded). Slow tool calls inside the script count toward this budget.
1,000,000 loop-iteration cap on each loop in the script, to keep while (true) {} from hanging the relay.
JSON.parse is wrapped with a friendlier error that includes the input kind, length, and a short preview when parsing fails.

Functions and globals exposed to the script

tools["NAME"](args): global object with one function per available tool. Returns the parsed JSON of the MCP result. The function is synchronous from JavaScript's perspective, but await tools[...] is harmless and is the recommended style for forward compatibility and readability.
call(name, args?, opts?): alternative invocation form that auto-unwraps the response. It returns structuredContent when present, otherwise JSON-parses content[0].text when it looks like JSON, otherwise returns the raw text. Throws if the tool returns isError: true. Pass { raw: true } to skip the unwrap and get the full MCP result, or { retry: N } to retry transient failures (HTTP 502/503/504, timeouts, connection resets) up to N times with backoff (200/400/800 ms ± 25% jitter).
Standard ECMAScript globals (JSON, Math, Array, String, Promise, Date, etc.) per boa_engine.
Not exposed: console, fetch, require, import, process, globalThis.fs, timers (setTimeout / setInterval).
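
To make the auto-unwrap and retry timing of call() concrete, here is a sketch in plain JavaScript. It is not the relay's implementation: the "looks like JSON" test (a leading `[` or `{`), the choice to throw on isError before honouring raw, and the cap at 800 ms for later attempts are all assumptions made for illustration.

```javascript
// Assumption: "looks like JSON" means the text starts with "[" or "{".
function looksLikeJson(text) {
  return /^\s*[\[{]/.test(text);
}

// Mirror of the documented unwrap order: structuredContent, then parsed JSON text, then raw text.
function unwrapResult(result, opts = {}) {
  if (result.isError) {
    throw new Error(result.content?.[0]?.text ?? "tool returned isError: true");
  }
  if (opts.raw) return result; // full MCP result, no unwrapping
  if (result.structuredContent !== undefined) return result.structuredContent;
  const text = result.content?.[0]?.text;
  if (typeof text === "string" && looksLikeJson(text)) {
    try {
      return JSON.parse(text);
    } catch {
      return text; // looked like JSON but was not; fall back to the raw text
    }
  }
  return text;
}

// Delay before retry number `attempt` (0-based): 200, 400, 800 ms with ±25% jitter.
// `random` is injectable so the jitter can be pinned in tests.
function retryDelayMs(attempt, random = Math.random) {
  const base = 200 * 2 ** Math.min(attempt, 2);
  return Math.round(base * (0.75 + 0.5 * random()));
}
```

With random fixed at 0.5 the jitter factor is exactly 1, so the delays come out as the nominal 200, 400, and 800 ms.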

Tool naming inside the script

Tool keys on the tools object follow the same prefixing scheme as the underlying catalog. Multi-server mode produces prefix__name with a double underscore between prefix and tool name (e.g. github__list_repos). Single-server mode omits the prefix.

Result shape and the safe-handling pattern

Every tools[...] call returns the standard MCP tool result: { content?: [{ type, text }], structuredContent? }. structuredContent is the server's structured output and is preferred. content[0].text is provider-defined prose and is not guaranteed to be JSON; it may be empty, truncated, or natural language. Use this pattern:

const r = await tools["todoist__get-tasks"]({ limit: 5 });
if (r.structuredContent) return r.structuredContent;
const t = r.content && r.content[0] && r.content[0].text;
return typeof t === "string" && /^\s*[\[{]/.test(t) ? JSON.parse(t) : t;

The three meta-tools

`list_tools({ limit?, offset? })`

Paginated catalog. limit defaults to 50 and is capped at 200. Returns { tools, total, limit, offset }; each tool entry is { name, description, input_schema, annotations? }. Use this when you want to enumerate every tool the relay knows about.
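
The paging semantics can be modelled with a small function over an in-memory catalog. This is a sketch of the documented defaults and cap only; the function names are hypothetical, and echoing the clamped limit back in the response is an assumption.

```javascript
const DEFAULT_LIMIT = 50;
const MAX_LIMIT = 200;

// Model of one list_tools page over an in-memory catalog array.
function listToolsPage(catalog, { limit = DEFAULT_LIMIT, offset = 0 } = {}) {
  const effectiveLimit = Math.min(Math.max(limit, 0), MAX_LIMIT);
  return {
    tools: catalog.slice(offset, offset + effectiveLimit),
    total: catalog.length,
    limit: effectiveLimit,
    offset,
  };
}

// Walk every page until the reported total has been consumed.
function collectAllTools(catalog, pageSize = MAX_LIMIT) {
  const all = [];
  let offset = 0;
  for (;;) {
    const page = listToolsPage(catalog, { limit: pageSize, offset });
    all.push(...page.tools);
    offset += page.tools.length;
    if (page.tools.length === 0 || offset >= page.total) return all;
  }
}
```

A client enumerating a large catalog would follow the same loop, advancing offset by the number of tools actually returned rather than by the limit it asked for.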

`search_tools({ query, limit? })`

Fuzzy ranked search across tool name, description, endpoint name, and input-schema property names. limit defaults to 20 and is capped at 200. Search is case-insensitive and typo-tolerant (Levenshtein), and respects camelCase / snake_case / kebab-case word boundaries. Ranking goes exact > prefix > substring > fuzzy; field weights are name > description > endpoint; tools matching more query tokens rank higher. Returns an array of { name, description, input_schema, annotations? }.
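
One way to picture the ranking is the toy scorer below. It is a deliberately simplified sketch, not the relay's search: the numeric tier values, the field weights, the edit-distance threshold of 1, and the omission of input-schema property names are all assumptions chosen to keep the example short.

```javascript
// Split on camelCase, snake_case and kebab-case boundaries, then lowercase.
function tokenize(text) {
  return text
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2")
    .split(/[^A-Za-z0-9]+/)
    .filter(Boolean)
    .map((t) => t.toLowerCase());
}

// Classic two-row Levenshtein edit distance.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Tier for one query token against one field: exact > prefix > substring > fuzzy.
function matchTier(queryToken, fieldText) {
  const words = tokenize(fieldText);
  if (words.includes(queryToken)) return 4;
  if (words.some((w) => w.startsWith(queryToken))) return 3;
  if (words.some((w) => w.includes(queryToken))) return 2;
  if (words.some((w) => levenshtein(w, queryToken) <= 1)) return 1;
  return 0;
}

// Hypothetical weights expressing name > description > endpoint.
const FIELD_WEIGHTS = { name: 3, description: 2, endpoint: 1 };

// Summing per-token scores makes tools that match more query tokens rank higher.
function scoreTool(query, tool) {
  let score = 0;
  for (const token of tokenize(query)) {
    let best = 0;
    for (const [field, weight] of Object.entries(FIELD_WEIGHTS)) {
      best = Math.max(best, matchTier(token, tool[field] ?? "") * weight);
    }
    score += best;
  }
  return score;
}

function searchTools(query, tools, limit = 20) {
  return tools
    .map((tool) => ({ tool, score: scoreTool(query, tool) }))
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, Math.min(limit, 200))
    .map((entry) => entry.tool);
}
```

In this sketch a misspelled query such as "list github isues" still finds github__list_issues, because "isues" is one edit away from "issues".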

`execute_tools({ script })`

Runs script under the rules above and returns whatever the script returns. Throws if the script throws or exceeds a sandbox limit; the error message is propagated back to the meta-tool caller.

Why this exists — the token-burn problem

A typical desktop client connects to many MCP servers (filesystem, github, slack, jira, todoist, postgres, …). The combined catalog can easily be hundreds of tools with multi-thousand-character JSON schemas attached to each.

In standard MCP mode, every one of those tool definitions is sent to the model on every request. The catalog alone can cost tens of thousands of input tokens per turn just to advertise capabilities the model probably won't use this turn.

JS mode collapses that advertised surface to three tools. The model uses search_tools to look up the handful of tools it needs for the current task, calls them inside a single execute_tools round-trip, and returns only the distilled answer. Two compounding wins:

The upfront catalog cost drops by orders of magnitude.
Intermediate tool results never have to round-trip back through the model — the script can fetch 1,000 records, filter to 5, and return only those 5. Without JS mode the model would see all 1,000 in its context just to pick 5.

Worked examples

Example 1 — discover then call:

// The model doesn't know the exact tool name, so it searches first.
const matches = await tools["search_tools"]({ query: "list github issues", limit: 5 });
const m = matches[0];                        // pick the top hit
const r = await tools[m.name]({ repo: "endara-ai/endara-relay", state: "open" });
return r.structuredContent ?? r.content?.[0]?.text;

Example 2 — chain calls in one round-trip:

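const projects = await tools["todoist__get-projects"]({});
const proj = (projects.structuredContent ?? []).find(p => p.name === "Inbox");
// Guard: return early instead of hitting a TypeError on proj.id when no project matches.
if (!proj) return { error: "No project named Inbox" };
const tasks = await tools["todoist__get-tasks"]({ project_id: proj.id });
return { projectId: proj.id, tasks: tasks.structuredContent };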

Example 3 — reduce-and-return (the token-burn-reduction pattern):

// Fetch potentially huge data, but only return what the model needs.
const all = await tools["github__list_issues"]({ repo: "endara-ai/endara-relay", state: "open" });
const issues = all.structuredContent ?? [];
// 200 issues -> 5 stale ones with just the fields we care about.
const stale = issues
  .filter(i => Date.now() - new Date(i.updated_at).getTime() > 30 * 86400_000)
  .sort((a, b) => new Date(a.updated_at) - new Date(b.updated_at))
  .slice(0, 5)
  .map(i => ({ number: i.number, title: i.title, updated_at: i.updated_at }));
return { staleCount: stale.length, stale };

Limits to remember

Any single execute_tools call is bounded by the 30-second wall-clock timeout and the 1M-iteration loop cap. Scripts cannot persist state between calls — each invocation starts from scratch. If a tool call inside the script throws, the sandbox surfaces the error message back to the meta-tool caller.

Tool prefixing

Two MCP servers can ship tools with the same name (for example, both a filesystem and a sandbox server might call something read_file). To keep names unique, the relay prefixes every tool it advertises with the endpoint's prefix and a double underscore, e.g. github__list_repos.

The prefix is taken from the endpoint's tool_prefix if set; otherwise it's derived from name by sanitizing to lowercase ASCII (non-ASCII characters are stripped). If sanitization yields an empty string, set tool_prefix explicitly. When the relay is connected to only one underlying server, prefixes are omitted and tools keep their original names.
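
The naming rule can be sketched as follows. Treat it as an approximation: sanitize_name is modelled here as "drop non-ASCII characters, then lowercase", and how the relay handles spaces or punctuation is not stated in this document, so that part is left untouched. The helper names are illustrative.

```javascript
// Assumption: sanitizing means stripping non-ASCII characters and lowercasing the rest.
function sanitizeName(name) {
  return name.replace(/[^\x00-\x7F]/g, "").toLowerCase();
}

// Explicit tool_prefix wins; otherwise derive the prefix from the endpoint name.
function toolPrefix(endpoint) {
  const prefix = endpoint.tool_prefix ?? sanitizeName(endpoint.name);
  if (prefix === "") {
    throw new Error(`endpoint "${endpoint.name}" needs an explicit tool_prefix`);
  }
  return prefix;
}

// With a single underlying server the original tool name is kept;
// otherwise the name becomes prefix__tool with a double underscore.
function advertisedName(endpoint, toolName, endpointCount) {
  return endpointCount === 1 ? toolName : `${toolPrefix(endpoint)}__${toolName}`;
}
```

For example, advertisedName({ name: "GitHub" }, "list_repos", 2) yields "github__list_repos", while the same call with an endpoint count of 1 yields "list_repos".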

Crash recovery

STDIO adapters are restarted automatically with exponential backoff when the underlying process exits unexpectedly. Each restart puts the adapter back in the initializing lifecycle state; from there it moves to ready on success, or to failed with the most recent error exposed on the endpoint entry returned by GET /api/endpoints. SSE / HTTP / OAuth adapters reconnect on transport errors; OAuth tokens are refreshed automatically when an access token nears expiry.
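
The restart behaviour can be pictured as a delay schedule plus a small state machine. The sketch below is illustrative only: this document does not give the relay's actual backoff parameters, so the 1-second base and 60-second cap are placeholders, and the event names (crashed, started, error, retry) are hypothetical labels for the transitions described above.

```javascript
// Exponential backoff: the delay doubles with each attempt, up to a cap.
// baseMs and capMs are placeholder values, not the relay's real settings.
function restartDelayMs(attempt, baseMs = 1000, capMs = 60000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Lifecycle states named in the text, with hypothetical event names.
const TRANSITIONS = {
  ready: { crashed: "initializing" },
  initializing: { started: "ready", error: "failed" },
  failed: { retry: "initializing" },
};

function nextState(state, event) {
  const next = TRANSITIONS[state]?.[event];
  if (!next) throw new Error(`no transition from ${state} on ${event}`);
  return next;
}
```

With these placeholder values the first four restart attempts would wait 1, 2, 4, and 8 seconds.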

File locations

All paths are under the data directory (default ~/.endara):

~/.endara/config.toml — the main configuration file.
~/.endara/logs/relay.log.<YYYY-MM-DD> — daily rotated log files. Stdout also receives the same lines.
~/.endara/tokens/ — OAuth tokens and DCR client credentials. Files are written with mode 0600 on Unix. Override the location via [relay] token_dir.

Troubleshooting

Port 9400 already in use (EADDRINUSE)

Another endara-relay instance is already listening, or you have both Endara Desktop's bundled relay and a separately installed CLI relay running. Stop one of them, or pass --port to use a different port. See Desktop troubleshooting for how Desktop handles the same conflict.

Endpoint stuck in failed state

Inspect GET /api/endpoints/{name}/logs for recent adapter output and ~/.endara/logs/relay.log.<YYYY-MM-DD> for the relay's own log. Common causes: a missing command, a server that needs an env var that wasn't set, or an OAuth flow that hasn't been completed.

Environment variable resolution failure

If a $VAR reference in env or headers isn't set, the affected endpoint is registered as a failed adapter. Set the variable in the relay's process environment (or in your shell profile if you launch the relay from the shell) and the next reload picks it up. Use $$ to emit a literal dollar sign.

Tool name collisions

If two endpoints derive the same prefix from their names (for example GitHub and github, which both sanitize to github), set an explicit tool_prefix on one of them. Validation already rejects two endpoints that share the exact same name.