Title: MCP security: what breaks when you connect an untrusted tool server
The Enforgate team
The Model Context Protocol (MCP) has done for AI tools what package managers did for code: connecting a new capability is now a one-line config change. That convenience comes with the same downside package managers had -- you're often running something you didn't write, didn't review, and don't fully control. When that something can act on your data, a bug or a malicious update in it becomes your exposure.
Here's what actually goes wrong, and how to contain it.
Risk 1 -- the server changes under you
You connect an MCP server today and vet its tools. Next week its maintainer ships an update that adds a new tool, changes an argument, or alters what a tool does. Your agent picks it up silently. The thing you audited is not the thing running.
Containment: pin what you trust. Enforgate keeps a per-organization registry of trusted MCP servers with version pinning, and captures a fingerprint of each server's tool list on first connect. If the tools later change, that mismatch raises an alert instead of quietly taking effect.
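To make the fingerprint idea concrete, here is a minimal sketch of one way to do it. It is an illustration, not Enforgate's implementation: the function names, the choice of SHA-256 over canonical JSON, and the sample tool definitions (loosely modeled on MCP's name/inputSchema shape) are all assumptions.

```python
import hashlib
import json


def fingerprint_tools(tools):
    """Return a stable SHA-256 hex digest of a list of tool definitions."""
    # Sort tools by name and serialize with sorted keys, so a server that
    # merely lists the same tools in a different order keeps its fingerprint.
    canonical = json.dumps(
        sorted(tools, key=lambda t: t["name"]),
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def tools_changed(pinned_fingerprint, current_tools):
    """True if the server's current tool list no longer matches the pin."""
    return fingerprint_tools(current_tools) != pinned_fingerprint


# Hypothetical tool list captured on first connect.
audited = [
    {"name": "read_file",
     "inputSchema": {"type": "object",
                     "properties": {"path": {"type": "string"}}}},
    {"name": "list_dir",
     "inputSchema": {"type": "object",
                     "properties": {"path": {"type": "string"}}}},
]
pinned = fingerprint_tools(audited)

# A later update adds a tool the reviewer never saw.
updated = audited + [{"name": "delete_file", "inputSchema": {"type": "object"}}]

if tools_changed(pinned, updated):
    print("ALERT: tool list differs from the pinned fingerprint")
```

Because the whole definition is hashed, a changed argument schema trips the check just as an added tool does. A change in behavior behind an unchanged definition would not, which is why pinning the server version matters as well.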
Risk 2 -- you can't prove the server is who it claims
An MCP endpoint is just a URL or a command. Nothing about that inherently proves the server on the other end is the one you intended to trust.
Containment: verify identity. Registry entries can carry a public key, and Enforgate verifies the server's identity on connect -- so a swapped-out or impersonated endpoint is caught, not trusted by default.
Risk 3 -- the tool exfiltrates data in its response
Even a well-behaved tool can return more than you want: an API key in an error message, PII in a record, a secret embedded in output. Once that lands in the model's context, it can be logged, repeated, or leaked downstream.
Containment: inspect responses. Enforgate scans every tool response for secrets and sensitive data before it reaches the agent, and either redacts the matches or blocks the response outright -- your choice, per organization.
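As a rough sketch of what response inspection can look like, the snippet below scans a tool response with two regular expressions and either redacts or blocks. The pattern set, the inspect_response name, and the mode parameter are assumptions made for illustration; a production scanner would need far broader detection than two regexes.

```python
import re

# Illustrative patterns only (assumed, not an exhaustive or official set).
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
}


def inspect_response(text, mode="redact"):
    """Scan a tool response and return (text_or_None, finding_names).

    mode="redact": every match is replaced with a labeled placeholder.
    mode="block":  if anything matched, the text is withheld (None).
    """
    findings = []
    for name, pattern in PATTERNS.items():
        if pattern.search(text):
            findings.append(name)
            text = pattern.sub(f"[REDACTED:{name}]", text)
    if findings and mode == "block":
        return None, findings
    return text, findings


# Hypothetical tool output that leaks a key and an address in an error message.
response = "Error: auth failed for key AKIAABCDEFGHIJKLMNOP, contact ops@example.com"

safe_text, found = inspect_response(response, mode="redact")
print(safe_text)
print(found)
```

The key design point is where this runs: between the tool and the agent, so the unredacted text never enters the model's context in the first place.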
Risk 4 -- the agent hands the tool a live credential
Give a tool your real database password or API key and that secret now lives in the agent's context, your prompts, and potentially your logs.
Containment: never let the raw secret reach the agent. Enforgate injects secrets into tool arguments at the boundary, after policy evaluation, so the agent references a secret by name and never sees its value -- and the audit log never records it.
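The ordering described above (evaluate policy, record the audit entry, and only then substitute the secret) can be sketched as follows. Everything here is hypothetical: the {{secret:NAME}} placeholder syntax, the call_tool and inject_secrets names, and the in-memory store and log are stand-ins chosen for the example, not Enforgate's actual interface.

```python
import re

# Assumed placeholder syntax: the agent writes {{secret:NAME}} in an argument.
SECRET_REF = re.compile(r"\{\{secret:([A-Za-z0-9_]+)\}\}")


def inject_secrets(arguments, secret_store):
    """Return a copy of arguments with secret references replaced by values.

    Raises KeyError if a referenced secret name is not in the store.
    """
    def resolve(value):
        if isinstance(value, str):
            return SECRET_REF.sub(lambda m: secret_store[m.group(1)], value)
        if isinstance(value, dict):
            return {k: resolve(v) for k, v in value.items()}
        if isinstance(value, list):
            return [resolve(v) for v in value]
        return value

    return resolve(arguments)


def call_tool(tool, arguments, secret_store, policy, audit_log):
    """Gate a tool call, log it, then inject secrets at the last step."""
    # Policy sees only the by-name reference, never the secret value.
    if not policy(tool, arguments):
        raise PermissionError(f"policy denied call to {tool}")
    # The audit entry is written before injection, so it holds names only.
    audit_log.append({"tool": tool, "arguments": arguments})
    # Injection happens last; a real gateway would now forward these
    # resolved arguments to the MCP server instead of returning them.
    return inject_secrets(arguments, secret_store)


store = {"DB_PASSWORD": "s3cr3t"}  # hypothetical secret store
log = []
agent_args = {"dsn": "postgres://app:{{secret:DB_PASSWORD}}@db/prod"}

sent = call_tool("query_db", agent_args, store, lambda tool, args: True, log)
```

In this sketch the agent's own arguments and the audit log contain only the reference by name; the value exists solely in the copy that crosses the boundary to the tool.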
The common thread
Every one of these risks comes from the same place: an agent talking directly to a tool it implicitly trusts. Put a boundary in between -- one that pins what you connect, verifies who you're talking to, inspects what comes back, and gates every call against policy -- and "connect an untrusted MCP server" stops being a leap of faith. The MCP security docs go deeper on each control.
