I have been running MCP servers in production for over a year. Eight plugins. Thirty-four tools. Thousands of AI agent sessions per week at systemprompt.io. And I can tell you with absolute certainty that the hardest problems in MCP development have nothing to do with the code you see in tutorials.

The tutorials show you how to define a tool. How to return a result. How to wire up transport. That part is genuinely easy. The hard part is everything they leave out. The stdout corruption that only happens under load. The error messages that send AI models into infinite retry loops. The memory behaviour differences between TypeScript and Rust that do not matter until they suddenly, catastrophically do.

This is that article. The one I wish someone had written before I spent three weeks debugging phantom JSON-RPC failures.

The stdout Problem That Will Ruin Your Week

MCP uses stdio transport. Your server reads JSON-RPC requests from stdin and writes JSON-RPC responses to stdout. Simple enough. Every tutorial covers this.

What they do not cover: stdout is exclusively the protocol channel. Not mostly. Not primarily. Exclusively. Every single byte that hits stdout must be valid JSON-RPC framing. One rogue println!() buried in a dependency three layers deep, and your protocol stream is corrupted.

Here is what that looks like in practice. Claude Code sends a tool call. Your server processes it correctly, generates the right response. But somewhere in the call stack, a library logs a deprecation warning to stdout. Your JSON-RPC response now has a plaintext line prepended to it. Claude Code tries to parse it as JSON. It fails. The tool call returns an error. Claude retries. Same thing happens. After three retries, the agent gives up and tells the user the tool is broken.

You will not catch this in development. It only triggers when a specific code path runs that activates the noisy dependency. Your tests pass. Your manual testing works. Production breaks intermittently.

The fix in Rust is straightforward but requires discipline:

// NEVER this in an MCP server
println!("Processing request for {}", tool_name);

// ALWAYS this
eprintln!("Processing request for {}", tool_name);

// Or better, use tracing with a stderr writer
tracing_subscriber::fmt()
    .with_writer(std::io::stderr)
    .with_env_filter("info")
    .init();

In TypeScript, the same problem exists but is harder to control. console.log() goes to stdout. Every npm package you depend on might call console.log(). You can redirect it, but you are fighting the ecosystem's default behaviour. In Rust, println!() is explicit, greppable, and you can ban it with a clippy lint. Your dependencies almost never write to stdout unless you ask them to.
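Mechanically banning stdout writes is two crate-level attributes. This sketch uses Clippy's real, allow-by-default `print_stdout` and `dbg_macro` lints; the file layout is an assumption:

```rust
// src/main.rs — crate root (file layout assumed)
// Promote Clippy's allow-by-default stdout lints to hard errors, so a
// stray println!/print!/dbg! fails `cargo clippy` instead of silently
// corrupting the JSON-RPC stream in production.
#![deny(clippy::print_stdout)]
#![deny(clippy::dbg_macro)]
```

Run `cargo clippy` in CI and the build fails the moment anyone types `println!` anywhere in your own code. Dependencies are not covered by your crate's lints, which is why the stdio round-trip test later in this article still matters.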

This is the first reason Rust wins for MCP servers. Not performance. Not type safety. Stdout hygiene.

Error Messages Are AI User Experience

This is the insight that changed how I build MCP tools. When your tool returns an error, it is not going into a log file. It is not showing up in a terminal for a human developer. It is going directly to an AI model that will read it, interpret it, and decide what to do next.

Your error messages are UX copy. For an AI reader.

I learned this the hard way. Our file analysis tool returned errors like "ENOENT: no such file or directory". Perfectly reasonable error. A human developer would understand it immediately. But Claude would see that error and retry with the exact same path. Three times. Then apologise to the user and say the tool was not working.

We changed the error to:

Err(McpError::internal(format!(
    "File not found: '{}'. The workspace root is '{}'. \
     This tool expects paths relative to the workspace root. \
     Available top-level entries: {}",
    params.path,
    self.root_dir.display(),
    self.list_top_level_entries().join(", ")
)))

After that change, Claude would read the error, see the available entries, construct the correct path, and succeed on the second try. Every time.

The difference between "tool error, human figures it out" and "tool error, AI recovers automatically" is entirely in how you write your error messages. Include: the exact value that failed, the context the tool is operating in (such as the workspace root), the format the tool expects, and the valid alternatives the model can choose from on its next attempt.

This is not a Rust-specific insight, but Rust's type system makes it easier to enforce. You can create custom error types that require context fields. The compiler refuses to let you create an error without providing the information the AI needs. Try enforcing that in a dynamically typed language.
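A minimal sketch of that pattern. The type and field names here are hypothetical, and the conversion into `McpError` is left out; the point is that every field the AI needs is required, so the compiler rejects any attempt to construct a vague error:

```rust
// Hypothetical error type: no Default, no optional fields. You cannot
// build this error without supplying the recovery context.
#[derive(Debug)]
pub struct PathNotFound {
    pub requested: String,
    pub workspace_root: String,
    pub available: Vec<String>,
}

impl std::fmt::Display for PathNotFound {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        // This is the text an AI model actually reads and acts on.
        write!(
            f,
            "File not found: '{}'. The workspace root is '{}'. \
             This tool expects paths relative to the workspace root. \
             Available top-level entries: {}",
            self.requested,
            self.workspace_root,
            self.available.join(", ")
        )
    }
}
```

Omit `workspace_root` at a call site and you get a compile error, not a production incident.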

For more on building tools that AI agents actually use well, see our comparison of MCP servers and traditional CLI tools.

Memory Behaviour: Where TypeScript and Rust Diverge

Here are two numbers. A TypeScript MCP server using the official SDK, idle, waiting for requests: 45MB RSS. The equivalent Rust server using rmcp: 12MB RSS.

You might say: who cares about 33MB? Memory is cheap.

Fair point if you are running one server. But MCP servers are not like web services. They do not run as a single long-lived process serving many clients. Claude Code spawns a fresh instance of your server for each session. If you have 50 developers on your team, each running Claude Code, each with two or three active sessions, you have 100-150 instances of your server running simultaneously.

At 45MB each, that is 6.75GB just for idle MCP servers. At 12MB each, it is 1.8GB. On a shared development machine or a CI runner with constrained memory, that difference matters.

But the real difference is not steady-state memory. It is the allocation pattern under load.

TypeScript MCP servers allocate through V8's garbage collector. Memory usage grows as tools process data, and it comes back down eventually. The "eventually" part is the problem. V8's GC is generational and lazy. Under sustained tool-call load, memory can spike to 3-5x the baseline before the GC reclaims it. For a server processing large files or directory trees, that spike can be significant.

Rust does not have this problem. Memory allocation is deterministic. When a function returns and its local variables go out of scope, that memory is freed. Immediately. Not eventually. Not when the GC gets around to it. The RSS trace of a Rust MCP server under load looks like a flat line with small bumps. The TypeScript equivalent looks like a sawtooth wave.

The rmcp crate at 4.7 million downloads has proven this at scale. It is the official Rust SDK for MCP, and it does not just wrap the protocol. It gives you a zero-overhead abstraction layer that compiles down to exactly the code you would write by hand.

The Macro API That Actually Works

I am generally suspicious of macro-heavy Rust APIs. They obscure what is happening, make errors inscrutable, and fight the language. The rmcp macros are the exception.

#[derive(Debug, Clone)]
struct CodeAnalyser {
    root: PathBuf,
}

#[tool(tool_box)]
impl CodeAnalyser {
    #[tool(description = "Count lines of code in a directory tree")]
    async fn count_lines(
        &self,
        #[tool(aggr)] params: CountParams,
    ) -> Result<CallToolResult, McpError> {
        let count = self.walk_and_count(&params.path)?;
        Ok(CallToolResult::success(vec![
            Content::text(format!("{count} lines"))
        ]))
    }

    #[tool(description = "Find the largest files by size")]
    async fn find_largest(
        &self,
        #[tool(aggr)] params: LargestParams,
    ) -> Result<CallToolResult, McpError> {
        let files = self.sorted_by_size(params.limit.unwrap_or(10))?;
        Ok(CallToolResult::success(vec![
            Content::text(self.format_file_list(&files))
        ]))
    }

    #[tool(description = "Analyse language distribution across the project")]
    async fn language_breakdown(
        &self,
        #[tool(aggr)] params: BreakdownParams,
    ) -> Result<CallToolResult, McpError> {
        let stats = self.analyse_languages()?;
        Ok(CallToolResult::success(vec![
            Content::text(self.format_breakdown(&stats))
        ]))
    }
}

Three attributes do all the heavy lifting. #[tool(tool_box)] registers the impl block as a tool provider. #[tool(description)] attaches the description that Claude sees when it discovers your tools. #[tool(aggr)] tells the macro to aggregate parameters into the struct rather than treating them individually.

The JsonSchema derive on your parameter structs generates the JSON Schema that tells Claude what arguments each tool accepts, what types they are, and which are optional. Doc comments on struct fields become parameter descriptions. The schema generation is automatic, correct, and stays in sync with your code because it is your code.

No separate schema file. No OpenAPI spec to maintain. No risk of the schema drifting from the implementation.
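A parameter struct for the `count_lines` tool above might look like this. `CountParams` and its fields are illustrative assumptions, but the `schemars::JsonSchema` derive and the doc-comment-to-description behaviour are how the schema generation works:

```rust
use schemars::JsonSchema;
use serde::Deserialize;

/// Parameters for count_lines (field names are illustrative).
#[derive(Debug, Deserialize, JsonSchema)]
pub struct CountParams {
    /// Path relative to the workspace root, e.g. "src".
    pub path: String,
    /// Comma-separated file extensions to include, e.g. "rs,toml".
    /// Omit to count every file.
    pub extensions: Option<String>,
}
```

From this, Claude sees `path` as a required string and `extensions` as optional, each with its doc comment as the description. Rename a field and the schema changes with it.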

The Cold Start Tax

Claude Code starts your MCP server when it needs a tool and stops it when the session ends. This is not like a web server that boots once and runs for days. Your server might start and stop dozens of times per day per developer.

TypeScript cold start: parse the JavaScript, JIT compile the hot paths, initialise the runtime, load your dependencies, set up the transport. Around 200ms on a fast machine. Longer on CI runners or constrained VMs.

Rust cold start: load the binary into memory, run main(), set up the transport. Under 50ms. Usually under 30ms.

150ms does not sound like much. But it compounds. Each session begins with "is the server running? No? Start it", and sessions open and close all day. If your server takes 200ms to start and is launched ten times across a day of coding, that is 2 seconds of pure startup overhead. Not catastrophic. Not invisible either.

The Rust binary is ready before the AI model has finished generating the tool-call request. There is no perceptible delay. The tool just works.

Deployment Is a Binary Copy

I want to talk about deployment because this is where the Rust advantage becomes most visible to people who are not the developer.

A TypeScript MCP server requires: Node.js runtime (correct version), npm install (with all transitive dependencies), and whatever build step your project uses. Your .mcp.json entry looks like:

{
  "mcpServers": {
    "my-tools": {
      "command": "node",
      "args": ["./path/to/dist/index.js"]
    }
  }
}

A Rust MCP server requires: the binary. Your .mcp.json entry looks like:

{
  "mcpServers": {
    "my-tools": {
      "command": "./target/release/my-tools"
    }
  }
}

No runtime. No package manager. No node_modules. One file, under 4MB, that runs on any machine with the same OS and architecture. You can commit it to your repo, distribute it through your company's artifact store, or just copy it to a shared drive. It works.

For teams, this eliminates an entire category of "it works on my machine" problems. The binary is the deployment artefact. If it compiles, it deploys. For production deployment strategies, our MCP server deployment guide covers the operational side.

What Goes Wrong in Production

After running 34+ MCP tools across 8 plugins at systemprompt.io, here is my honest list of what breaks:

Schema ambiguity. If your tool accepts a path parameter, Claude does not know if that means an absolute path, a relative path, a path relative to the workspace root, or a glob pattern. Be explicit in your parameter descriptions. "Absolute file path" or "Path relative to workspace root, e.g. src/main.rs". The more specific your schema, the fewer failed tool calls you see.

Timeout mismatches. Claude Code has a timeout for tool calls. If your tool does something slow (network requests, large file scans), it can hit that timeout. The server process gets killed mid-operation. No graceful shutdown. No partial result. Just a dead process and a confused AI. Keep tool operations fast or stream progress.

Permission errors that look like bugs. Your MCP server runs with whatever permissions the spawning process has. If Claude Code runs as a user who cannot read /etc/ssl/certs, your HTTPS-calling tool fails with a TLS error, not a permission error. The AI sees "TLS handshake failed" and has no idea that the fix is a permission change. Wrap system-level operations with permission checks and return actionable errors.

Tool overload. We started with 12 tools in one server. Claude would sometimes pick the wrong tool because two tools had similar descriptions. We split into focused servers. Three tools each. Disambiguation went to near zero. Keep your tool count per server low and your descriptions distinct. For the architecture behind this, see our Claude Code extensions guide.

The Testing Gap

Most MCP server tests verify that the tool produces the correct output for a given input. That is necessary but not sufficient. You also need to test:

What the AI sees. Serialise your tool's response to JSON and read it. Is the information structured in a way that an AI can parse? Are numbers formatted consistently? Are file paths absolute or relative? The AI consumes your tool's output as text. If that text is ambiguous to a language model, your tool will be misused.

Error recovery paths. Call your tool with invalid inputs and read the error messages. Not just "does it return an error" but "does the error contain enough information for an AI to fix the problem and retry successfully?" This is a functional requirement, not a nice-to-have.

The full stdio round-trip. Spawn your binary, send a JSON-RPC request to stdin, read the response from stdout. This catches serialisation issues, stdout pollution from dependencies, and transport-level bugs that unit tests never see.

use std::io::{BufRead, BufReader, Write};
use std::process::{Command, Stdio};

#[tokio::test]
async fn test_stdio_round_trip() {
    let mut child = Command::new("./target/debug/my-tools")
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()
        .unwrap();

    let stdin = child.stdin.as_mut().unwrap();
    let request = r#"{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"count_lines","arguments":{"path":"test.rs"}}}"#;

    stdin.write_all(request.as_bytes()).unwrap();
    stdin.write_all(b"\n").unwrap();

    // Read and validate the response is clean JSON-RPC:
    // no stray output, no corrupted framing. (A strict server may answer
    // with a JSON-RPC error before initialize — that is still valid framing.)
    let mut line = String::new();
    BufReader::new(child.stdout.take().unwrap())
        .read_line(&mut line)
        .unwrap();
    let response: serde_json::Value =
        serde_json::from_str(&line).expect("stdout carried non-JSON bytes");
    assert_eq!(response["jsonrpc"], "2.0");

    child.kill().ok();
}

Why I Stopped Building TypeScript MCP Servers

I built my first six MCP servers in TypeScript. They worked. They shipped. Users were happy. Then I rebuilt them in Rust and the difference was not subtle.

Cold starts went from noticeable to invisible. Memory usage dropped by 75%. The stdout corruption bugs that took days to diagnose simply could not happen because println!() is not the default output mechanism in Rust libraries. The compiler caught three tool-parameter type mismatches that had been silently wrong in the TypeScript version for weeks.

The build time is the tradeoff. First compilation of an rmcp project takes a few minutes. Incremental builds are fast, but that initial compile is not. If you are prototyping and iterating rapidly, TypeScript's instant feedback loop is genuinely better. But once you know what you are building, Rust gives you a server you can ship and forget about.

For production MCP infrastructure, including authentication and security, the confidence that comes from the Rust type system is not academic. It is the difference between deploying on Friday afternoon and deploying on Friday afternoon without anxiety.

The Honest Recommendation

If you are building your first MCP server, start with TypeScript. The iteration speed matters more than performance when you are still figuring out what tools to build and how to structure them.

If you are building your second MCP server, or if your first one is going into production, switch to Rust. The rmcp crate makes the tool-definition experience nearly as simple as the TypeScript SDK. The deployment story is dramatically simpler. The production behaviour is predictable in ways that matter.

If you are building MCP infrastructure for a team or an organisation, there is no contest. Rust. Single binaries. Deterministic memory. No runtime dependencies. Your ops team will thank you.

The Model Context Protocol has 97 million SDK downloads and backing from the Linux Foundation. It is not going away. The tools you build today will run in production for years. Build them in a language that makes production easy.