Key takeaways
- An MCP server is a lightweight process that exposes tools, resources, and prompts to AI applications over a standardized protocol.
- It acts as the bridge between an LLM-powered client and an external system — a database, API, file system, or SaaS product.
- The protocol standardizes how AI apps discover and invoke capabilities, replacing per-integration custom code.
- MCP servers communicate with MCP clients over a defined transport layer (stdio or HTTP+SSE), not directly with the LLM.
- A single MCP server works well for development; production environments with many servers introduce governance, auth, and routing challenges that require additional infrastructure.
What is an MCP server?
You have an AI-powered application that needs to query a database, create a GitHub issue, or pull data from a SaaS tool. Before November 2024, you wrote custom glue code for each integration — one-off adapters with their own schemas, auth flows, and error handling. Every new external system meant another bespoke connector.
An MCP server eliminates that pattern. It is a process that exposes capabilities — tools, resources, and prompts — to AI applications using the [Model Context Protocol (MCP)](https://konghq.com/blog/learning-center/what-is-mcp), an [open standard](https://www.infoq.com/news/2024/12/anthropic-model-context-protocol/) [introduced by Anthropic in November 2024](https://www.anthropic.com/news/model-context-protocol).
A critical distinction: MCP and MCP server are not synonyms. MCP is the protocol specification — the set of rules governing how AI applications discover and invoke external capabilities. An MCP server is a process that implements the server side of that specification. The relationship is the same as HTTP and a web server: HTTP defines the protocol, Apache or Nginx implements it.
What MCP standardizes is discovery and invocation. Instead of reading API docs, writing integration code, and maintaining per-service adapters, an AI application connects to an MCP server and learns what it can do through capability negotiation. The server declares its tools, resources, and prompts. The client understands how to call them. The protocol handles the rest.
This matters because AI applications need to interact with dozens or hundreds of external systems. Without a standard, each integration is a maintenance burden. With MCP, the integration surface is consistent: one protocol, one discovery mechanism, one invocation pattern.
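To make that surface concrete, here is a minimal sketch of a server built with the official Python SDK's FastMCP helper. The server name and the add tool are illustrative, not part of the protocol:

```python
from mcp.server.fastmcp import FastMCP

# Hypothetical server; the name is arbitrary.
mcp = FastMCP("demo-server")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers."""
    # The SDK derives the tool's name, description, and parameter schema
    # from the signature and docstring. Any MCP client can discover and
    # invoke this tool without bespoke integration code.
    return a + b

if __name__ == "__main__":
    mcp.run()  # defaults to the stdio transport
```

A client never sees this Python code; it sees only the declared tool schema, which is what keeps the integration surface consistent across servers.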
MCP architecture: how servers fit in
MCP defines four components, each with a distinct role:
- Host: The AI application the user interacts with — Claude Desktop, an IDE with AI features, or a custom agent application.
- Client: A protocol connector that lives inside the host. It manages the connection to a specific MCP server.
- Server: The process that exposes capabilities. It runs locally or remotely and responds to client requests.
- Transport: The communication layer between client and server — stdio for local processes, HTTP with Server-Sent Events (SSE) for remote connections.
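In the official Python SDK, the transport is a single argument when the server starts. A minimal sketch, assuming the hypothetical FastMCP server from the earlier example:

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")  # hypothetical server

# Local: the host spawns this process and speaks MCP over stdin/stdout.
mcp.run(transport="stdio")

# Remote: serve over HTTP with Server-Sent Events instead.
# mcp.run(transport="sse")
```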
Here is how a request flows through the system:
- The host starts one or more MCP clients, each configured to connect to a specific server.
- Each client establishes a connection to its assigned server over the chosen transport.
- Capability negotiation occurs: the server declares what tools, resources, and prompts it exposes, and the client and server agree on supported protocol features.
- The user interacts with the host. The LLM determines it needs to invoke an external tool — for example, querying a database.
- The client sends a [JSON-RPC request](https://www.jsonrpc.org/specification) to the server, specifying the tool name and parameters.
- The server executes the operation and returns the result to the client, which passes it back to the host and the LLM.
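Seen from the client side, the same flow maps onto a few SDK calls. A sketch using the official Python SDK; the server command and the add tool carry over from the hypothetical server above:

```python
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# The host's client launches a local server over the stdio transport.
params = StdioServerParameters(command="python", args=["server.py"])

async def main():
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            # Capability negotiation: the server declares what it exposes.
            await session.initialize()
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

            # A JSON-RPC tool invocation: the server executes the
            # operation and the result comes back to the client.
            result = await session.call_tool("add", arguments={"a": 1, "b": 2})
            print(result.content)

asyncio.run(main())
```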
One detail engineers often miss: the server never talks directly to the LLM. The client mediates all communication. The LLM decides it needs a tool, the host tells the client, and the client talks to the server. This separation keeps the protocol clean and the server implementation simple — it does not need to understand LLM internals.
Each client-server connection is a 1:1 pairing. If a host needs to interact with five external systems, it runs five clients, each connected to one server. The host orchestrates across all of them.
For a detailed reference, the MCP specification and architecture documentation are available at [modelcontextprotocol.io](https://modelcontextprotocol.io/specification/2025-06-18).
What an MCP server exposes
An MCP server declares three types of capabilities, each with a different control boundary. If you want to go deeper and [build an MCP server](https://konghq.com/blog/engineering/mcp-servers-guide) yourself, Kong's developer guide walks through the full implementation.
Tools (model-controlled)
Tools are functions the LLM can invoke autonomously during a conversation. The server describes each tool — its name, parameters, and schema — and the LLM decides when to call it based on context.
Examples:
- query_database(sql) — Execute a SQL query against a connected database.
- create_github_issue(title, body) — Open a new issue in a GitHub repository.
- send_slack_message(channel, text) — Post a message to a Slack channel.
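In the Python SDK, registering a tool is a decorator on an ordinary function. A sketch of the GitHub example; the server name and the function body are stubs, not a real integration:

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("github-tools")  # hypothetical server name

@mcp.tool()
def create_github_issue(title: str, body: str) -> str:
    """Open a new issue in a GitHub repository."""
    # Stub: a real server would call the GitHub API here.
    # The LLM sees only the name, description, and parameter schema,
    # and decides on its own when to invoke the tool mid-conversation.
    return f"Created issue: {title}"
```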
Resources (application-controlled)
Resources are data the host application can fetch to provide context to the LLM. Unlike tools, the LLM does not decide when to retrieve resources — the host does.
Examples:
- File contents from a local or remote file system.
- Database schemas describing table structures and relationships.
- API documentation for a connected service.
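Resources are addressed by URI. A sketch in the same SDK, where the URI template and the returned schema string are assumptions:

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("db-context")  # hypothetical server name

@mcp.resource("schema://tables/{table}")
def table_schema(table: str) -> str:
    """Return a table's schema as context for the LLM."""
    # Stub: a real server would introspect the live database.
    # The host application, not the LLM, decides when to fetch this.
    return f"CREATE TABLE {table} (id SERIAL PRIMARY KEY, name TEXT);"
```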
Prompts (user-controlled)
Prompts are predefined templates that users explicitly select. They provide structured ways to interact with the server's capabilities.
Examples:
- "Summarize this table" — A prompt template that takes a table name and generates a summary.
- "Explain this table's relationships" — A prompt that describes foreign keys and joins.
Why three types matter
The distinction is not arbitrary. Each type maps to a different control boundary, which directly affects authorization policy.
Tools let the LLM trigger actions autonomously — creating records, sending messages, modifying state. These require the strictest authorization controls because a misconfigured tool could let an LLM execute operations the user never intended.
Resources are read-only context, fetched by the application on behalf of the user. The risk profile is lower, but access still needs governance — not every user should see every database schema.
Prompts are user-initiated and explicit. The user chooses to run them, so the control model is straightforward.
A concrete example ties this together. A PostgreSQL MCP server might expose:
- Tools: query(sql), insert(table, data), update(table, conditions, data)
- Resources: Table schemas, column types, index definitions
- Prompts: "Explain this table's relationships," "Generate a migration for this schema change"
One server, three capability types, three control boundaries.
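A skeleton of that server, sketched with the Python SDK. The capability names come from the list above; everything else (URIs, stub bodies, database wiring) is assumed, and the update tool is omitted for brevity:

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("postgres")  # hypothetical; database connection omitted

@mcp.tool()
def query(sql: str) -> str:
    """Execute a SQL query against the connected database."""
    raise NotImplementedError("stub: run the query and return rows")

@mcp.tool()
def insert(table: str, data: dict) -> str:
    """Insert a row into a table."""
    raise NotImplementedError("stub: build and execute an INSERT")

@mcp.resource("schema://{table}")
def schema(table: str) -> str:
    """Table schema, column types, and index definitions."""
    raise NotImplementedError("stub: introspect information_schema")

@mcp.prompt()
def explain_relationships(table: str) -> str:
    """Explain this table's relationships."""
    return f"Describe the foreign keys and joins involving {table}."

if __name__ == "__main__":
    mcp.run()
```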
MCP server vs. REST API