Building Better MCP Servers
MCP servers are increasingly everywhere: GitHub, Atlassian - heck, even BGG (shameless plug) - now have official and unofficial MCP implementations. However, many MCP servers are still clumsily built, essentially serving as glorified API wrappers (much like the maligned ChatGPT wrappers).
What distinguishes the good from the bad is understanding the context in which these tools operate. Having built several MCP servers, including the BGG MCP I plugged earlier, I'd like to share some thoughts and lessons.
The Ecosystem Problem
There's a rush to ship MCP servers right now. Companies see competitors launching them and feel pressure to have one too - any MCP server is better than none, right? Wrong. I've seen MCPs that mirror entire REST APIs wholesale into MCP tools, creating monsters that return 15,000 tokens for simple queries. The AI chokes and users get frustrated.
This isn't just about poor implementation though - it's about misunderstanding the paradigm shift. Traditional APIs serve data. MCP servers enable outcomes. If we keep building them like the former, we'll erode user trust in what is a transformative technology.
1. Context Window as a Finite Resource
The first consideration is recognising and treating the AI's context window as the finite resource it is. Unlike a regular API call, where response size matters far less, every bit of an MCP response gets tokenised, filling the context window with potentially huge amounts of irrelevant information.
This doesn't just waste precious context space; it actively degrades the AI's performance and response quality, as it must now parse all this cruft as part of every subsequent interaction.
Take a real failure I encountered: an MCP server for a CRM system that returned complete user objects (50+ fields, including internal timestamps and, worst of all, base64-encoded avatars) when asked for users' email addresses. The AI spent more tokens parsing irrelevant data than answering the actual query. Worse, this bloat persisted throughout the entire conversation.
Instead, MCP servers should return only the data necessary to perform the primary function of the tool, at least by default. You can always add parameters — or additional tools — if users need more information.
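As a rough sketch of what that trimming looks like in Go (the types and field names below are hypothetical, not lifted from any particular CRM), map the upstream object onto a slim view before writing the tool response:

// Hypothetical upstream object: dozens of fields, most irrelevant to the query.
type upstreamUser struct {
    ID        string
    Name      string
    Email     string
    AvatarB64 string // large base64 blob we never want in the context window
    CreatedAt string
    UpdatedAt string
    // ... 40+ more internal fields
}

// Slim view the tool actually returns: only what the "get user emails" outcome needs.
type userEmail struct {
    Name  string `json:"name"`
    Email string `json:"email"`
}

func toToolResponse(users []upstreamUser) []userEmail {
    out := make([]userEmail, 0, len(users))
    for _, u := range users {
        out = append(out, userEmail{Name: u.Name, Email: u.Email})
    }
    return out
}

Same upstream call, a fraction of the tokens landing in the context window.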
This is also where GraphQL APIs shine in the new AI age: they're already well adapted for MCP out of the box, since they allow the client (our MCP server) to request only the data it needs.
2. Build Tools Around Outcomes, Not APIs
It's tempting to look at an API and assume each endpoint needs its own tool. Whilst this might suffice for basic data fetching, we need to consider the outcomes users actually expect from the MCP server.
If users just wanted a raw JSON response, they could in theory ask the AI to fetch it from the API directly. Instead, tools should be oriented around outcomes. In my BGG MCP implementation, there are tools for finding game details, searching, and fetching forum data. Whilst each alone provides something useful, together they become more than the sum of their parts. The BGG rules tool, in particular, combines several APIs and enhances the response to let users find answers to common rules queries directly from the forums.
This is a common and tedious task gamers face, and my tool is adapted for that specific use case. Whilst in theory you could achieve the same result with separate tools and enough awkward prompting, the integrated tool makes it frictionless and avoids polluting the context window with several tool calls to boot. The goal is to get users to their intended result in as few tool calls as possible.
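To make that concrete, here's a sketch of an outcome-oriented handler. The types and helper functions are stand-ins rather than the real BGG MCP internals; the point is that one tool call covers the game search, the forum lookup, and the shaping of the final response:

// Stand-in types and helpers; the real server calls the BGG XML API here.
type Game struct{ ID int; Name string }
type Thread struct{ ID, Replies int; Title string }

func findGame(name string) (Game, error)                    { return Game{}, nil } // search call
func rulesThreads(gameID int) ([]Thread, error)              { return nil, nil }    // forum call
func buildRulesPrompt(g Game, q string, ts []Thread) string  { return "" }          // instructions + data (see next section)

// One tool, one outcome: "answer a rules question about this game".
func handleRulesQuery(gameName, question string) (string, error) {
    game, err := findGame(gameName)
    if err != nil {
        return "", err
    }
    threads, err := rulesThreads(game.ID)
    if err != nil {
        return "", err
    }
    return buildRulesPrompt(game, question, threads), nil
}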
If you're building your first MCP server, start by identifying the top 3 user journeys and optimise for those. Everything else is secondary.
3. Response Structuring
Beyond keeping responses minimal and relevant, you can go further still. Anthropic introduced the concept of using XML tags to structure prompts and responses, and XML is better suited than JSON for AI responses because the tags let you contextualise the values.
response.WriteString("<instructions>\n")
response.WriteString("Your goal is to help the user resolve their rules question or understand game mechanics.\n")
response.WriteString("IMPORTANT: First verify the game found matches what the user is asking about. If it seems wrong, mention it.\n")
response.WriteString("1. Identify threads that directly address the user's specific rules query based on their titles\n")
response.WriteString("2. Look for threads with high reply counts (indicating thorough discussions) or official-sounding titles\n")
response.WriteString("3. Present the 1-4 most relevant threads with brief descriptions of what the titles suggest they discuss\n")
response.WriteString("4. For the most promising thread(s), proactively use bgg-thread-details to fetch the actual content\n")
response.WriteString("5. After reading the thread content, provide a clear answer to the user's rules question\n")
response.WriteString("Remember: You're seeing thread titles only. Use bgg-thread-details to get actual answers.\n")
response.WriteString("</instructions>\n\n")
Consider the snippet from the BGG Rules tool above. Rather than returning raw data as the MCP response, it transforms the data into a prompt that gives the AI context about what to do with the response and how to interpret it.
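The instructions are only half of the response; the data itself can follow in the same style, with tags labelling each value. The tag names and loop below are illustrative rather than the exact output of the BGG MCP:

// Continuing the same response: label the data so the model knows what it's looking at.
// Tag names are illustrative; assumes "fmt" is imported and threads is a []Thread slice.
response.WriteString("<forum_threads>\n")
for _, t := range threads {
    response.WriteString(fmt.Sprintf("  <thread id=\"%d\" replies=\"%d\">%s</thread>\n",
        t.ID, t.Replies, t.Title))
}
response.WriteString("</forum_threads>\n")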
This isn't just useful for enhancing subsequent AI responses; it can also chain requests by instructing the AI to invoke other tools.
4. Error Handling for AI Consumption
Here's something often overlooked: errors are responses like any other in the MCP context. When your underlying API fails, don't just pass through a generic "500 Internal Server Error". The AI needs to understand what happened to inform the user properly.
Instead of:
{"error": "Connection timeout"}
Return:
<error>
<type>temporary_failure</type>
<message>The game database is currently unavailable</message>
<suggestion>This is usually resolved within a few minutes. You could try searching by game name instead, or I can retry in a moment.</suggestion>
</error>
The AI can now give the user meaningful options rather than just reporting a cryptic failure. Remember, you're designing for AI consumption, not debugging.
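In Go, that translation can live in one small helper. This is a sketch - the error categories and tag names are illustrative, and only the standard library is assumed:

// Map upstream failures onto something the model can act on.
// Categories and tags are illustrative; assumes "context", "errors", "fmt" and "strings" are imported.
func toToolError(err error) string {
    var b strings.Builder
    b.WriteString("<error>\n")
    if errors.Is(err, context.DeadlineExceeded) {
        b.WriteString("  <type>temporary_failure</type>\n")
        b.WriteString("  <message>The game database is currently unavailable</message>\n")
        b.WriteString("  <suggestion>Retry in a moment, or search by game name instead.</suggestion>\n")
    } else {
        b.WriteString("  <type>unexpected_failure</type>\n")
        b.WriteString(fmt.Sprintf("  <message>%s</message>\n", err))
        b.WriteString("  <suggestion>Tell the user the lookup failed and offer a different query.</suggestion>\n")
    }
    b.WriteString("</error>\n")
    return b.String()
}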
5. Sampling, Roots & Elicitation
Three recent additions to the MCP protocol that aren't yet widely implemented are sampling, roots, and elicitation.
Sampling is essentially a way for the MCP server to pass something back to the AI for processing before continuing with its tool call and the final response. This is brilliant because traditionally, if an MCP tool needed AI capabilities as part of its response, it would need its own API key to make external calls. Now, the MCP server can offload that processing directly to the existing AI client without any external dependencies.
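The exact plumbing depends on your SDK, but the request a server hands back to the client roughly follows the spec's sampling/createMessage shape. A sketch of that payload in Go, with the transport details omitted:

// Shape of a sampling request the server hands back to the connected client.
// Field names follow my reading of the spec's sampling/createMessage params;
// the JSON-RPC plumbing an SDK would normally provide is omitted.
type samplingContent struct {
    Type string `json:"type"`
    Text string `json:"text"`
}

type samplingMessage struct {
    Role    string          `json:"role"`
    Content samplingContent `json:"content"`
}

type createMessageParams struct {
    Messages     []samplingMessage `json:"messages"`
    SystemPrompt string            `json:"systemPrompt,omitempty"`
    MaxTokens    int               `json:"maxTokens"`
}

// Ask the client's model to summarise a thread before we build the final tool response.
var summariseRequest = createMessageParams{
    Messages: []samplingMessage{{
        Role:    "user",
        Content: samplingContent{Type: "text", Text: "Summarise this forum thread in two sentences: ..."},
    }},
    MaxTokens: 200,
}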
Roots are more straightforward but equally important - they define the "working directories" a client grants a server. This means AI clients can trust that an MCP server won't accidentally delete their entire filesystem.
Lastly, elicitation allows servers to request structured user input during execution. Consider a deployment tool that discovers multiple candidate environments - rather than guessing or erroring out, it can request clarification: "Which environment: production or staging?". The protocol even distinguishes between different user responses: accepting with data, explicitly declining, or simply cancelling.
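Here's a sketch of the server's side of that exchange, with field names as I read them in the current spec revision; the actual request/response plumbing again belongs to your SDK:

// Elicitation: pause mid-execution and ask the user a structured question.
// Field names follow the spec's elicitation/create request and result; assumes "fmt" is imported.
type elicitParams struct {
    Message         string         `json:"message"`
    RequestedSchema map[string]any `json:"requestedSchema"`
}

type elicitResult struct {
    Action  string         `json:"action"` // "accept", "decline" or "cancel"
    Content map[string]any `json:"content,omitempty"`
}

var whichEnvironment = elicitParams{
    Message: "Multiple environments found. Which one should I deploy to?",
    RequestedSchema: map[string]any{
        "type": "object",
        "properties": map[string]any{
            "environment": map[string]any{"type": "string", "enum": []string{"production", "staging"}},
        },
        "required": []string{"environment"},
    },
}

// The three possible outcomes the protocol distinguishes.
func describeOutcome(res elicitResult) string {
    switch res.Action {
    case "accept":
        return fmt.Sprintf("deploying to %v", res.Content["environment"])
    case "decline":
        return "user declined; stopping the deployment"
    default: // "cancel"
        return "user cancelled; nothing was deployed"
    }
}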
These additions increasingly distinguish proper tools that enable fully agentic workflows from mere wrappers, and show the protocol maturing beyond basic tool calling into a framework for genuine human-AI collaboration.
Looking Ahead
As you can see, MCP is a rich ecosystem. With the release of the official registry and increased OAuth support, it's clear the ecosystem is rapidly maturing. However, the MCP servers that prove valuable to people's workflows won't be rushed API wrappers — they'll be those that enable users to extract real value from their connected tools.