Endara Relay

One endpoint for all your MCP servers. A single Rust binary that aggregates local STDIO servers, remote HTTP/SSE servers, and OAuth servers and serves them at http://localhost:9400/mcp. A separate management API (used by Endara Desktop) is exposed on a local Unix-domain socket / Windows Named Pipe — never on a TCP port.

Overview

Endara Relay is a single Rust binary that sits between your AI client (Claude Desktop, Cursor, ChatGPT, Windsurf, VS Code, Zed, Continue, or any MCP-compatible app) and all the MCP servers you actually use. Point every client at one local endpoint, http://localhost:9400/mcp, and the relay handles the rest: spawning STDIO servers, holding onto SSE / HTTP connections, refreshing OAuth tokens, and merging every server's tool catalog into a single unified list with collision-free names.

It uses one transport-specific adapter per endpoint, namespaces tools with a stable prefix to avoid collisions, and watches config.toml for changes so you can add or remove servers without restarting. STDIO adapters are restarted automatically with exponential backoff if the underlying process crashes.

Optionally, Relay can run in JS execution mode, where instead of advertising hundreds of tool definitions to the model on every turn, it advertises three meta-tools and lets the model run a sandboxed JavaScript program that calls the underlying tools in a single round-trip. See JS execution engine below.

No cloud, no accounts, no telemetry. Everything runs on your machine.

Install

Pick whichever you have set up.

Homebrew (macOS / Linux)

brew install endara-ai/tap/endara-relay

Cargo

cargo install endara-relay

Pre-built binaries

Download the latest release for your platform from github.com/endara-ai/endara-relay/releases.

Or, if you'd rather not run the relay yourself, install Endara Desktop — it bundles the relay, starts and stops it for you, and provides a UI for managing endpoints.

Quick start

Drop the following into ~/.endara/config.toml:

# ~/.endara/config.toml
[relay]
machine_name = "my-mac"

[[endpoints]]
name = "filesystem"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/projects"]

Then start the relay:

endara-relay start

Point any MCP-compatible client at http://localhost:9400/mcp. Claude Desktop only speaks stdio, so use the mcp-remote bridge — drop this into claude_desktop_config.json:

{
  "mcpServers": {
    "endara": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "http://localhost:9400/mcp"]
    }
  }
}

For Cursor, add the same URL under Settings → MCP → Add new MCP server → HTTP. Restart the client and the filesystem tools should appear in its tool list, prefixed with filesystem__.

CLI reference

The relay has a single subcommand, start, which boots the HTTP server, loads the config, and starts watching it for changes.

| Flag | Default | Description |
| --- | --- | --- |
| --data-dir | ~/.endara | Base directory for config, logs, and OAuth tokens. The relay creates it if it doesn't exist and writes a default config.toml on first run. |
| --config | <data-dir>/config.toml | Override the path to the TOML configuration file. |
| --port | 9400 | Port to listen on. The MCP endpoint (/mcp and /mcp/*), /oauth/callback, and /healthz are served on this TCP port. The management API (/api/*) is not served on TCP; it is bound to a Unix-domain socket / Windows Named Pipe (see Management API). |
| --log-format | text | Log output format. Either text or json. |

Note: The RUST_LOG environment variable overrides the log filter when set. The default is info,endara_relay=debug. Logs are written to both stdout and ~/.endara/logs/relay.log.<YYYY-MM-DD>.

Configuration reference

Endara Relay reads a single TOML file. Default location: ~/.endara/config.toml. The file has one [relay] table and any number of [[endpoints]] entries.

[relay] table

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| machine_name | string | yes | system hostname (when default config is generated) | Identifies this relay instance in logs and /api/status. Free-form; pick anything that helps you tell machines apart. |
| local_js_execution | bool | no | false | When true, the advertised tool catalog is replaced with three meta-tools (list_tools, search_tools, execute_tools) and direct tool calls are rejected. See JS execution engine. |
| token_dir | string (path) | no | <data-dir>/tokens | Override the directory used for OAuth token and DCR-credential storage. Useful when you want a non-default location separate from the data dir. |

[[endpoints]] entries

Each [[endpoints]] table describes one MCP server to connect to. Required fields depend on the chosen transport.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | yes | Unique, non-empty identifier. Used as the default tool prefix (sanitized to lowercase ASCII), and is the path segment in management API URLs (those URLs are served on the management socket; see Management API). |
| description | string | no | Free-form; surfaced in the UI and in logs. |
| tool_prefix | string | no | Override the auto-derived prefix. If omitted, defaults to sanitize_name(name). See Tool prefixing. |
| transport | stdio, sse, http, or oauth | yes | Adapter type. Determines which other fields are required. |
| command | string | yes (stdio) | Executable to spawn for STDIO transports. |
| args | array of string | no | Arguments passed to the spawned process. |
| url | string | yes (sse, http, oauth) | Endpoint URL for HTTP-based transports. |
| env | map of string → string | no | Environment variables passed to STDIO subprocesses. Values support $VAR resolution and $$ escaping. See Environment variable resolution. |
| headers | map of string → string | no | Extra HTTP headers for http / sse / oauth transports. Header values support inline $VAR substitution (e.g. Authorization = "Bearer $TOKEN"). |
| disabled | bool | no | Default false. When true, the endpoint is registered but no adapter is started; toggling this does not restart adapters during hot-reload. |
| disabled_tools | array of string | no | Tool names to hide from the advertised catalog without disabling the underlying server. Calls to disabled tools return an MCP error. |
| oauth_server_url | string | yes (oauth) | Authorization server base URL. The relay performs OIDC / metadata discovery against this URL. |
| client_id | string | no | Pre-provisioned OAuth client identifier. If omitted, the relay performs Dynamic Client Registration when needed. |
| scopes | array of string | no | OAuth scopes requested during authorization. |
| token_endpoint | string | no | Override the discovered token endpoint URL. Rarely needed. |

OAuth credentials: OAuth client credentials are not stored in config.toml. They are written via POST /api/endpoints/{name}/oauth/credentials (or the equivalent flow in Endara Desktop) and persisted by the relay's TokenManager under ~/.endara/tokens/ with mode 0600. Dynamic Client Registration (DCR) populates them automatically when the server supports it; otherwise you provide client_id (and client_secret for confidential clients) via the API or desktop UI. Note: /api/* is exposed only on the management socket described in Management API, not on http://localhost:9400. Use curl --unix-socket (or the Desktop UI) to call it.

Transport snippets

STDIO

[[endpoints]]
name = "github"
description = "GitHub MCP server"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
env = { GITHUB_TOKEN = "$GITHUB_TOKEN" }

HTTP

[[endpoints]]
name = "context7"
transport = "http"
url = "https://mcp.context7.com/mcp"
headers = { Authorization = "Bearer $CONTEXT7_KEY" }

SSE

[[endpoints]]
name = "remote-sse"
transport = "sse"
url = "https://example.com/mcp/sse"

OAuth

[[endpoints]]
name = "linear"
transport = "oauth"
url = "https://mcp.linear.app/mcp"
oauth_server_url = "https://mcp.linear.app"
scopes = ["read", "write"]
# client_id / client_secret are persisted via the management API, not TOML

Environment variable resolution

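Endpoint env values and headers values are passed through a small resolver before adapters start: a $VAR reference is replaced with the value of that variable from the relay's process environment, and $$ emits a literal dollar sign.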

If a referenced variable is not set, the relay records the failure through ConfigError::EnvVarMissing. With graceful validation (the default at startup and during hot-reload), the affected endpoint is registered as a failed adapter with the underlying error visible on its entry in GET /api/endpoints (filter the array by name) — startup itself does not fail.
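The resolution rules can be sketched in a few lines of JavaScript. This is an illustration only, not the relay's Rust implementation: the function name resolveValue, the variable-name grammar, and the error text are assumptions made for the example.

```javascript
// Minimal sketch of $VAR / $$ resolution. The variable-name grammar
// ([A-Za-z_][A-Za-z0-9_]*) is an assumption, not taken from the relay source.
function resolveValue(value, env) {
  const missing = [];
  const resolved = value.replace(/\$(\$|[A-Za-z_][A-Za-z0-9_]*)/g, (match, name) => {
    if (name === "$") return "$"; // "$$" escapes to a literal dollar sign
    if (Object.prototype.hasOwnProperty.call(env, name)) return env[name];
    missing.push(name); // remember every unset variable
    return match; // leave the reference untouched
  });
  if (missing.length > 0) {
    // Hypothetical stand-in for the relay's ConfigError::EnvVarMissing.
    throw new Error(`EnvVarMissing: ${missing.join(", ")}`);
  }
  return resolved;
}

console.log(resolveValue("Bearer $TOKEN", { TOKEN: "abc123" })); // Bearer abc123
console.log(resolveValue("costs $$5", {})); // costs $5
```

In the relay itself a missing variable does not abort startup; the sketch throws only to make the failure visible.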

Validation rules

Hot reload

The relay watches config.toml via the notify crate and applies changes without a restart. The file is diffed against the running config, and:

Management API

The relay exposes a small JSON API for inspecting state, restarting adapters, completing OAuth flows, and editing the running configuration. Endara Desktop drives this API. Unlike /mcp, the management API does not listen on TCP — it binds to a Unix-domain socket on macOS / Linux ($XDG_RUNTIME_DIR/endara-relay/api.sock, falling back to $TMPDIR/endara-relay-<uid>/api.sock on macOS or <data-dir>/api.sock if no runtime dir is set) and to a per-user Named Pipe on Windows (\\.\pipe\endara-relay-<sid>). To script against it, use curl --unix-socket (macOS / Linux) or a Named Pipe client (Windows). Keeping the management API off TCP rules out drive-by browser attacks against a local HTTP endpoint; see Security for the broader threat model.

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/status | Process uptime, total endpoint count, and healthy count. |
| GET | /api/endpoints | All endpoints with health, tool count, last-activity timestamps, and lifecycle state. |
| GET | /api/catalog | Full merged tool catalog across all endpoints with applied prefixes, source endpoint name, and current availability (reflects per-endpoint and per-tool disable state plus health). |
| GET | /api/endpoints/:name/tools | Tool definitions for a single endpoint, including each tool's input schema. |
| GET | /api/endpoints/:name/logs | Recent log lines for an endpoint, for debugging stuck or failing adapters. |
| POST | /api/endpoints/:name/restart | Restart an endpoint's adapter. Returns immediately; the heavy work runs in the background and lifecycle state is surfaced through GET /api/endpoints. |
| POST | /api/endpoints/:name/refresh | Re-list tools from a healthy endpoint without restarting it. |
| POST | /api/endpoints/:name/disable | Shut down the adapter and mark the endpoint disabled. Persisted to the disabled-state file so the endpoint stays disabled across restarts; its tools disappear from the catalog. |
| POST | /api/endpoints/:name/enable | Clear the disabled flag and re-initialize the adapter. Persisted to the disabled-state file. |
| POST | /api/endpoints/:name/tools/:tool_name/disable | Hide a single tool from the merged catalog without disabling the endpoint as a whole. Persisted to the disabled-state file. |
| POST | /api/endpoints/:name/tools/:tool_name/enable | Re-enable a previously disabled tool on an endpoint. Persisted to the disabled-state file. |
| DELETE | /api/endpoints/:name | Remove an endpoint from the running registry and persist the deletion to config.toml. |
| GET | /api/config | Current parsed configuration with env values redacted. |
| POST | /api/config/reload | Force an immediate reload from disk (the file watcher does this automatically; this endpoint is for triggering it manually). |
| POST | /api/test-connection | Try connecting with the supplied transport / command / url / headers without persisting an endpoint. Useful for UIs validating user input before saving. |
| POST | /api/endpoints/:name/oauth/start | Start an OAuth authorization flow for the endpoint and return the authorize URL. |
| POST | /api/endpoints/:name/oauth/credentials | Persist OAuth client credentials (client_id / client_secret) for the endpoint. |
| GET | /api/endpoints/:name/oauth/status | Whether the endpoint has tokens, when they expire, and which scopes were granted. |
| POST | /api/endpoints/:name/oauth/revoke | Revoke and delete the stored OAuth tokens for the endpoint. |
| POST | /api/endpoints/:name/oauth/refresh | Force-refresh an access token using the stored refresh token. |
| GET | /api/endpoints/:name/oauth/metrics | In-process OAuth metric counters for the endpoint (e.g. token refreshes, refresh failures), as JSON. |
| POST | /api/oauth/setup | Create a transient OAuth setup session: discovers OAuth metadata, attempts Dynamic Client Registration, and returns the authorize URL, without writing to config.toml. |
| POST | /api/oauth/setup/:id/credentials | Submit manual client_id / client_secret for a setup session when DCR is unavailable, and receive the authorize URL. |
| GET | /api/oauth/setup/:id/status | Poll the status of a setup session (pending / awaiting credentials / authorized / failed). |
| POST | /api/oauth/setup/:id/commit | Persist a successfully authorized setup session: write the new endpoint into config.toml and register the running adapter. Only succeeds once the session has reached the Authorized state. |
| DELETE | /api/oauth/setup/:id | Cancel a setup session and clean up its in-memory state without writing to config. |
| POST | /api/endpoints/:name/credentials | Persist OAuth client credentials (client_id and optional client_secret) for an existing OAuth endpoint via the TokenManager DCR file. Modern replacement for the legacy client_secret TOML field. To seed credentials during initial setup, before the endpoint exists, use POST /api/oauth/setup/:id/credentials instead. |
| GET | /api/endpoints/:name/credentials | Inspect which credential fields are currently set for an endpoint (values are not returned). |

Scripting against the API

Because /api/* lives on a local socket / pipe, the invocation depends on your platform. Methods, paths, JSON bodies, and status codes are standard HTTP — only the transport is local.

# Linux — Unix-domain socket under $XDG_RUNTIME_DIR
curl --unix-socket "$XDG_RUNTIME_DIR/endara-relay/api.sock" \
  http://localhost/api/status
# macOS — Unix-domain socket under $TMPDIR
curl --unix-socket "$TMPDIR/endara-relay-$(id -u)/api.sock" \
  http://localhost/api/status
# Windows (PowerShell) — per-user Named Pipe
# curl 8.x supports --unix-socket against \\.\pipe\<name>
curl.exe --unix-socket "\\.\pipe\endara-relay-$([System.Security.Principal.WindowsIdentity]::GetCurrent().User.Value)" `
  http://localhost/api/status

On all platforms, the host portion of the URL (http://localhost) is ignored by the relay — only the path and method matter. The socket / pipe is owned by the current user with restrictive permissions, and on Unix the relay verifies the peer's UID before accepting a connection.
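If you would rather script against the socket from Node.js than from curl, a sketch along these lines may help. It covers only the Unix-domain socket case (the Windows Named Pipe is not modeled), and the helper names managementSocketPath and get are made up for this example. The socketPath option of http.request is standard Node.js.

```javascript
const http = require("node:http");
const path = require("node:path");

// Picks the socket path using the fallback order described in the
// Management API section. The parameter names are illustrative.
function managementSocketPath({ env, platform, uid, dataDir }) {
  if (env.XDG_RUNTIME_DIR) {
    return path.posix.join(env.XDG_RUNTIME_DIR, "endara-relay", "api.sock");
  }
  if (platform === "darwin" && env.TMPDIR) {
    return path.posix.join(env.TMPDIR, `endara-relay-${uid}`, "api.sock");
  }
  return path.posix.join(dataDir, "api.sock");
}

// Sends a GET over the socket and resolves with the status code and raw body.
// The response is left unparsed because its exact shape is not shown here.
function get(socketPath, requestPath) {
  return new Promise((resolve, reject) => {
    const req = http.request({ socketPath, path: requestPath, method: "GET" }, (res) => {
      let body = "";
      res.setEncoding("utf8");
      res.on("data", (chunk) => { body += chunk; });
      res.on("end", () => resolve({ status: res.statusCode, body }));
    });
    req.on("error", reject);
    req.end();
  });
}

// Usage, with the relay running:
//   const sock = managementSocketPath({
//     env: process.env, platform: process.platform,
//     uid: process.getuid(), dataDir: `${process.env.HOME}/.endara`,
//   });
//   get(sock, "/api/status").then(console.log, console.error);
```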

JS execution engine

When [relay] local_js_execution = true, the relay replaces its full advertised tool catalog with three meta-tools (list_tools, search_tools, and execute_tools) and rejects any direct tool call with the message "Direct tool calls are not allowed in JS execution mode. Use execute_tools instead." The model is expected to look up the tools it actually needs through search_tools, then call them inside a single sandboxed JavaScript program.

How it works

execute_tools runs the supplied script in an embedded boa_engine JavaScript sandbox: entirely in-process, with no Node.js, no require / import / fetch, and no filesystem or network access of its own. The script body is wrapped in (async function() { ... })() so top-level await works. Whatever value you pass to return becomes the meta-tool's result. Each call gets a fresh context; no state persists between execute_tools invocations.
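To see why the async wrapper makes top-level await and return work, here is a rough stand-in that runs under Node.js. It uses the host engine's Function constructor purely for illustration. The relay embeds boa_engine and enforces sandbox limits that this sketch does not model, and wrapScript, runScript, and the demo__echo tool are hypothetical names.

```javascript
// Wraps a script body inside an immediately invoked async function, so
// `await` and `return` are legal at the top level of the body.
function wrapScript(body) {
  return `(async function() {\n${body}\n})()`;
}

// Evaluates the wrapped script with a `tools` object in scope and returns
// a Promise for whatever the script returns. Illustration only: no sandbox.
function runScript(body, tools) {
  const run = new Function("tools", `return ${wrapScript(body)};`);
  return run(tools);
}

// Hypothetical stand-in for a real endpoint tool.
const fakeTools = {
  "demo__echo": async (args) => ({ structuredContent: { echoed: args } }),
};

runScript(
  'const r = await tools["demo__echo"]({ n: 1 }); return r.structuredContent;',
  fakeTools
).then((result) => console.log(JSON.stringify(result))); // {"echoed":{"n":1}}
```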

Sandbox limits

Functions and globals exposed to the script

Tool naming inside the script

Tool keys on the tools object follow the same prefixing scheme as the underlying catalog. Multi-server mode produces prefix__name with a double underscore between prefix and tool name (e.g. github__list_repos). Single-server mode omits the prefix.

Result shape and the safe-handling pattern

Every tools[...] call returns the standard MCP tool result: { content?: [{ type, text }], structuredContent? }. structuredContent is the server's structured output and is preferred. content[0].text is provider-defined prose and is not guaranteed to be JSON; it may be empty, truncated, or natural language. Use this pattern:

const r = await tools["todoist__get-tasks"]({ limit: 5 });
if (r.structuredContent) return r.structuredContent;
const t = r.content && r.content[0] && r.content[0].text;
return typeof t === "string" && /^\s*[\[{]/.test(t) ? JSON.parse(t) : t;

The three meta-tools

list_tools({ limit?, offset? })

Paginated catalog. limit defaults to 50 and is capped at 200. Returns { tools, total, limit, offset }; each tool entry is { name, description, input_schema, annotations? }. Use this when you want to enumerate every tool the relay knows about.
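Walking the whole catalog means following offset until total is reached. The sketch below shows one way to do that. The listTools function is injected so the example runs outside the relay, and the in-memory catalog is a made-up stand-in that mimics the documented response shape and the 200-entry cap.

```javascript
// Collects every catalog entry by requesting pages until `total` is reached.
async function listAllTools(listTools, pageSize = 200) {
  const all = [];
  let offset = 0;
  while (true) {
    const page = await listTools({ limit: pageSize, offset });
    all.push(...page.tools);
    offset += page.tools.length;
    // Stop when everything is collected, or if a page comes back empty.
    if (offset >= page.total || page.tools.length === 0) break;
  }
  return all;
}

// Hypothetical catalog of 450 entries served in pages of at most 200.
const catalog = Array.from({ length: 450 }, (_, i) => ({ name: `demo__tool_${i}` }));
async function fakeListTools({ limit = 50, offset = 0 }) {
  const capped = Math.min(limit, 200);
  return {
    tools: catalog.slice(offset, offset + capped),
    total: catalog.length,
    limit: capped,
    offset,
  };
}

listAllTools(fakeListTools).then((tools) => console.log(tools.length)); // 450
```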

search_tools({ query, limit? })

Fuzzy ranked search across tool name, description, endpoint name, and input-schema property names. limit defaults to 20 and is capped at 200. Search is case-insensitive and typo-tolerant (Levenshtein), and respects camelCase / snake_case / kebab-case word boundaries. Ranking goes exact > prefix > substring > fuzzy; field weights are name > description > endpoint; tools matching more query tokens rank higher. Returns an array of { name, description, input_schema, annotations? }.
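The ranking order can be approximated with a small scoring function. Treat this as a sketch of the idea, not the relay's algorithm: the numeric tier values, the field weights, and the "edit distance of at most 1" fuzzy rule are all assumptions, and input-schema property names are left out for brevity.

```javascript
// Classic dynamic-programming Levenshtein edit distance.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Splits on camelCase, snake_case, and kebab-case boundaries, lowercased.
function words(text) {
  return text
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2")
    .split(/[^A-Za-z0-9]+/)
    .filter(Boolean)
    .map((w) => w.toLowerCase());
}

// Tier for one query token against one field: exact > prefix > substring > fuzzy.
function matchTier(token, field) {
  const t = token.toLowerCase();
  const ws = words(field);
  if (ws.includes(t)) return 4; // exact word match
  if (ws.some((w) => w.startsWith(t))) return 3; // word prefix
  if (field.toLowerCase().includes(t)) return 2; // substring anywhere
  if (ws.some((w) => levenshtein(w, t) <= 1)) return 1; // typo-tolerant
  return 0;
}

// Assumed weights that encode name > description > endpoint.
const WEIGHTS = { name: 3, description: 2, endpoint: 1 };

// Sums the best weighted tier per query token, so tools matching more
// tokens score higher.
function score(query, tool) {
  let total = 0;
  for (const token of words(query)) {
    total += Math.max(
      ...Object.keys(WEIGHTS).map((f) => WEIGHTS[f] * matchTier(token, tool[f] || ""))
    );
  }
  return total;
}

const demoTools = [
  { name: "todoist__get-tasks", description: "Get tasks", endpoint: "todoist" },
  { name: "github__list_issues", description: "List issues in a repository", endpoint: "github" },
];
const ranked = [...demoTools].sort((a, b) => score("list isues", b) - score("list isues", a));
console.log(ranked[0].name); // github__list_issues
```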

execute_tools({ script })

Runs script under the rules above and returns whatever the script returns. Throws if the script throws or exceeds a sandbox limit; the error message is propagated back to the meta-tool caller.

Why this exists — the token-burn problem

A typical desktop client connects to many MCP servers (filesystem, github, slack, jira, todoist, postgres, …). The combined catalog can easily be hundreds of tools with multi-thousand-character JSON schemas attached to each.

In standard MCP mode, every one of those tool definitions is sent to the model on every request; the catalog alone can cost tens of thousands of input tokens per turn just to advertise capabilities the model probably won't use this turn.

JS mode collapses that advertised surface to three tools. The model uses search_tools to look up the handful of tools it needs for the current task, calls them inside a single execute_tools round-trip, and returns only the distilled answer. Two compounding wins:

Worked examples

Example 1 — discover then call:

// The model doesn't know the exact tool name, so it searches first.
const matches = await tools["search_tools"]({ query: "list github issues", limit: 5 });
const m = matches[0];                        // pick the top hit
const r = await tools[m.name]({ repo: "endara-ai/endara-relay", state: "open" });
return r.structuredContent ?? r.content?.[0]?.text;

Example 2 — chain calls in one round-trip:

const projects = await tools["todoist__get-projects"]({});
const proj = (projects.structuredContent ?? []).find(p => p.name === "Inbox");
if (!proj) return { error: "No project named Inbox" };  // avoid reading .id of undefined
const tasks = await tools["todoist__get-tasks"]({ project_id: proj.id });
return { projectId: proj.id, tasks: tasks.structuredContent };

Example 3 — reduce-and-return (the token-burn-reduction pattern):

// Fetch potentially huge data, but only return what the model needs.
const all = await tools["github__list_issues"]({ repo: "endara-ai/endara-relay", state: "open" });
const issues = all.structuredContent ?? [];
// 200 issues -> 5 stale ones with just the fields we care about.
const stale = issues
  .filter(i => Date.now() - new Date(i.updated_at).getTime() > 30 * 86400_000)
  .sort((a, b) => new Date(a.updated_at) - new Date(b.updated_at))
  .slice(0, 5)
  .map(i => ({ number: i.number, title: i.title, updated_at: i.updated_at }));
return { staleCount: stale.length, stale };

Limits to remember

Any single execute_tools call is bounded by the 30-second wall-clock timeout and the 1M-iteration loop cap. Scripts cannot persist state between calls — each invocation starts from scratch. If a tool call inside the script throws, the sandbox surfaces the error message back to the meta-tool caller.
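Because an uncaught tool error ends the whole script, it can be worth catching failures per call and returning partial results. The sketch below assumes a failing tool call rejects like an ordinary promise, which this document does not spell out, and the tools object here is a made-up stand-in so the example runs on its own.

```javascript
// Hypothetical stand-ins for two endpoints; the second one always fails.
const tools = {
  "alpha__ping": async () => ({ structuredContent: { ok: true } }),
  "beta__ping": async () => { throw new Error("beta is unreachable"); },
};

// Calls each tool in turn and records failures instead of aborting.
async function pingAll(toolNames) {
  const results = {};
  for (const name of toolNames) {
    try {
      const r = await tools[name]({});
      results[name] = r.structuredContent ?? r.content?.[0]?.text ?? null;
    } catch (err) {
      results[name] = { error: err instanceof Error ? err.message : String(err) };
    }
  }
  return results;
}

pingAll(["alpha__ping", "beta__ping"]).then((r) => console.log(JSON.stringify(r)));
// {"alpha__ping":{"ok":true},"beta__ping":{"error":"beta is unreachable"}}
```

Inside a real execute_tools script the body of pingAll would sit at the top level, ending in a return.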

Tool prefixing

Two MCP servers can ship tools with the same name (for example, both a filesystem and a sandbox server might call something read_file). To keep names unique, the relay prefixes every tool it advertises with the endpoint's prefix and a double underscore, e.g. github__list_repos.

The prefix is taken from the endpoint's tool_prefix if set; otherwise it's derived from name by sanitizing to lowercase ASCII (non-ASCII characters are stripped). If sanitization yields an empty string, set tool_prefix explicitly. When the relay is connected to only one underlying server, prefixes are omitted and tools keep their original names.
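The naming rules can be modeled as follows. This is a sketch under stated assumptions: the document only says that sanitization lowercases and strips non-ASCII characters, so spaces and punctuation are left untouched here, and the function names are invented for the example.

```javascript
// Strips non-ASCII characters, then lowercases. How the relay treats spaces
// or punctuation is not specified in this document, so they pass through.
function sanitizeName(name) {
  return name.replace(/[^\x00-\x7F]/g, "").toLowerCase();
}

// An explicit tool_prefix wins; otherwise the prefix is derived from name.
function effectivePrefix(endpoint) {
  return endpoint.tool_prefix ?? sanitizeName(endpoint.name);
}

// "prefix__tool" in multi-server mode, the bare tool name with one server.
function advertisedName(endpoint, toolName, endpointCount) {
  return endpointCount > 1 ? `${effectivePrefix(endpoint)}__${toolName}` : toolName;
}

// Reports every prefix that more than one endpoint would end up using.
function prefixCollisions(endpoints) {
  const byPrefix = new Map();
  for (const ep of endpoints) {
    const p = effectivePrefix(ep);
    byPrefix.set(p, [...(byPrefix.get(p) ?? []), ep.name]);
  }
  return [...byPrefix].filter(([, names]) => names.length > 1);
}

const endpoints = [{ name: "GitHub" }, { name: "github" }, { name: "Café", tool_prefix: "cafe" }];
console.log(advertisedName(endpoints[0], "list_repos", endpoints.length)); // github__list_repos
console.log(JSON.stringify(prefixCollisions(endpoints))); // [["github",["GitHub","github"]]]
```

Setting tool_prefix on one of the two colliding endpoints makes prefixCollisions return an empty list.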

Crash recovery

STDIO adapters are restarted automatically with exponential backoff when the underlying process exits unexpectedly. Each restart resets the adapter to the initializing lifecycle state and then either back to ready on success or to failed with the most recent error exposed on the endpoint entry returned by GET /api/endpoints. SSE / HTTP / OAuth adapters reconnect on transport errors; OAuth tokens are refreshed automatically when an access token nears expiry.
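For intuition, exponential backoff doubles the wait between restart attempts up to a ceiling. The base delay and cap below are placeholder values chosen for the example. The relay's actual settings are not stated in this document.

```javascript
// Delay before restart attempt `attempt` (0-based): base * 2^attempt, capped.
// The 500 ms base and 30 s cap are placeholders, not the relay's settings.
function backoffDelayMs(attempt, baseMs = 500, capMs = 30_000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const schedule = Array.from({ length: 8 }, (_, i) => backoffDelayMs(i));
console.log(schedule.join(", "));
// 500, 1000, 2000, 4000, 8000, 16000, 30000, 30000
```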

File locations

All paths are under the data directory (default ~/.endara):

Troubleshooting

Port 9400 already in use (EADDRINUSE)

Another endara-relay instance is already listening, or you have both Endara Desktop's bundled relay and a separately installed CLI relay running. Stop one of them, or pass --port to use a different port. See Desktop troubleshooting for how Desktop handles the same conflict.

Endpoint stuck in failed state

Inspect GET /api/endpoints/{name}/logs for recent adapter output and ~/.endara/logs/relay.log.<YYYY-MM-DD> for the relay's own log. Common causes: a missing command, a server that needs an env var that wasn't set, or an OAuth flow that hasn't been completed.

Environment variable resolution failure

If a $VAR reference in env or headers isn't set, the affected endpoint is registered as a failed adapter. Set the variable in the relay's process environment (or in your shell profile if you launch the relay from the shell) and the next reload picks it up. Use $$ to emit a literal dollar sign.

Tool name collisions

If two endpoints derive the same prefix from their name (say, endpoints named GitHub and github, which both sanitize to github), set an explicit tool_prefix on one of them. Two endpoints can never share the exact same name, because validation rejects duplicate names.