We Scanned 32 Community MCP Servers — Here's What We Found
MCP servers are the tools that AI agents like Claude use to take actions on your behalf — read files, run queries, call APIs, browse the web. We ran static analysis across 32 community MCP servers representing over 120,000 combined GitHub stars. 97% had security findings. 75% had critical-severity issues. The patterns that emerged say something important about how this ecosystem is being built.
Methodology and scope
We queried the GitHub API for repositories tagged with `topic:mcp-server` in TypeScript, Python, and JavaScript, filtering for active repos with at least 5 stars. We shallow-cloned and ran CodeSlick's 306-check static analyzer on each repo sequentially. Of 60 repos discovered, 32 completed successfully — 28 were skipped due to output buffer limits on large codebases. This is a static analysis survey: findings reflect code patterns detected by automated rules, not confirmed exploitability through runtime testing. We describe one known scanner limitation in a dedicated note below.
The Scan Scope
Unlike our previous audit of official AI SDK repos (vercel/ai, LangChain.js, openai-node), this scan targets the community layer — the MCP servers that developers install to connect Claude and other AI agents to external systems. These are the tools that run on developer machines, inside CI pipelines, and increasingly in production environments.
Repos scanned: 32 (out of 60 discovered — 28 skipped, too large)
Repos with findings: 31 (97%)
Repos with CRITICAL: 24 (75%)
Repos with MCP checks: 16 (50%)
Files scanned: 1,029
─────────────────────────────────────────────────────────────────
Total findings: 5,408
Critical: 450
High: 770
Medium: 3,402
Low: 786
─────────────────────────────────────────────────────────────────
MCP-specific findings: 454 across 16 repos

The 32 repos scanned represent some of the most-installed community MCP servers: upstash/context7 (49K stars), oraios/serena (21.7K), AgentDeskAI/browser-tools-mcp (7.1K), BrowserMCP (6.1K), and firecrawl-mcp-server (5.8K), among others. These are tools real developers are installing into their AI agent workflows today.
Why MCP Security Is Different
Security vulnerabilities in a web API are bad. Security vulnerabilities in an MCP server are worse. The attack surface is different.
When Claude uses an MCP server, the server's code executes in your environment — on your machine, in your shell, with your credentials. A command injection vulnerability in a web API requires an attacker to reach the network. A command injection vulnerability in an MCP server can be triggered by an attacker who controls the data Claude is asked to process — a document, a webpage, a code file. This is the tool poisoning threat model: malicious content in the environment that causes the AI to call a vulnerable tool with attacker-controlled input.
That context makes the patterns we found more significant than they would be in an ordinary codebase audit.
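To make that chain concrete, here is a minimal sketch in Python, with a hypothetical tool and payload not taken from any scanned repo, of the difference between a shell-interpolated command and an argv-style invocation:

```python
PAYLOAD = "notes.txt; echo pwned"  # content an attacker plants in a document the AI reads

def build_unsafe_command(filename: str) -> str:
    # VULNERABLE pattern: if this string is later run with
    # subprocess.run(..., shell=True), the ";" splits it into two shell
    # commands and the attacker's "echo pwned" executes.
    return f"tar czf backup.tgz {filename}"

def build_safe_argv(filename: str) -> list:
    # Safer pattern: an argv list executed without a shell treats the
    # payload as a single (harmless) filename argument, whatever
    # metacharacters it contains.
    return ["tar", "czf", "backup.tgz", "--", filename]

# subprocess.run(build_unsafe_command(PAYLOAD), shell=True)  # would run the payload
# subprocess.run(build_safe_argv(PAYLOAD))                   # would not
```

The attacker never touches the machine directly: the AI reads the poisoned document, calls the tool with the poisoned string, and the shell does the rest.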
Pattern 1: MCP-Specific Risk — 454 Findings Across 16 Repos
CodeSlick includes 12 MCP-specific checks covering tool poisoning surfaces, prompt injection vectors, missing input validation on tool parameters, and unsafe data handling in AI-facing code. 16 of the 32 repos triggered at least one of these checks. The top repos by MCP-specific finding count:
| Repository | Stars | MCP Findings | Total Findings |
|---|---|---|---|
| upstash/context7 | 49.6K | 121 | 454 |
| antvis/mcp-server-chart | 3.8K | 77 | 115 |
| ForLoopCodes/contextplus | 1.5K | 60 | 1099 |
| exa-labs/exa-mcp-server | 4.0K | 48 | 139 |
| cjo4m06/mcp-shrimp-task-manager | 2.1K | 47 | 711 |
| BrowserMCP/mcp | 6.1K | 40 | 79 |
| AgentDeskAI/browser-tools-mcp | 7.1K | 29 | 455 |
The most flagged category within MCP-specific checks was missing input validation on tool parameters — tool handlers that accept raw strings from the AI and pass them directly to downstream operations without sanitization. In the MCP threat model, the AI is not a trusted input source: it can be influenced by the content it processes. Tool parameters should be treated with the same skepticism as user input in a web form.
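What treating tool parameters as untrusted looks like in practice can be sketched with a hypothetical database-query tool; the function name, allowlist, and limits here are illustrative, not drawn from any scanned repo:

```python
import re

ALLOWED_TABLES = {"tasks", "notes", "projects"}  # hypothetical schema
IDENT_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]{0,63}$")

def validate_query_params(table: str, limit) -> tuple:
    """Validate AI-supplied tool parameters before they touch a database."""
    if table not in ALLOWED_TABLES:
        # Allowlisting beats sanitizing: anything not explicitly known is rejected.
        raise ValueError(f"unknown table: {table!r}")
    if not IDENT_RE.match(table):
        raise ValueError("table name is not a bare identifier")
    limit = int(limit)            # reject non-numeric limits
    if not 1 <= limit <= 1000:    # cap result size against runaway agent loops
        raise ValueError("limit out of range")
    return table, limit
```

A handler that calls `validate_query_params` before building a query never interpolates raw AI-supplied strings into SQL, regardless of what the model was induced to pass.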
upstash/context7's 121 MCP-specific findings were concentrated in type safety gaps and missing null checks on values passed through tool handlers — a pattern that creates undefined behavior when the AI passes unexpected inputs. exa-labs' 48 findings included weak random number generation in contexts where randomness matters for security.
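For reference, the fix for weak random number generation in Python is usually a one-line swap from `random` to `secrets`; a small sketch:

```python
import random
import secrets
import string

def weak_token(n=16):
    # Flagged pattern: random uses a Mersenne Twister, which is
    # predictable from its output. Fine for sampling, unsafe for
    # anything security-sensitive.
    return "".join(random.choices(string.ascii_letters, k=n))

def strong_token(n=16):
    # secrets draws from the OS CSPRNG and is the right tool for
    # session identifiers, API keys, or nonces.
    return secrets.token_urlsafe(n)
```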
Pattern 2: Command Injection and SQL Injection in Agent-Facing Code
Several repos had command injection or SQL injection in code paths directly reachable by AI tool calls. These are the highest-severity findings in the dataset because they are exploitable through the normal operation of the tool — no special access required beyond the AI agent calling a vulnerable function.
Critical findings with direct AI attack surface:
- **oraios/serena**: Multiple SQL injections via string interpolation in database tool handlers; 3 command injection instances, including a construction like `subprocess.Popen(restoration_cmd)` where the command is built from externally controlled variables
- **AgentDeskAI/browser-tools-mcp**: Command injection via `exec()` with user-controlled input; prototype pollution via `JSON.parse()` with spread operator
- **JoeanAmier/XHS-Downloader**: Node.js command injection via `child_process.exec()` with unsanitized user input
- **cjo4m06/mcp-shrimp-task-manager**: `eval()` usage in a task management context, meaning arbitrary code execution if task content is attacker-influenced
- **D4Vinci/Scrapling**: Insecure deserialization via `pickle.loads()`, meaning arbitrary code execution when deserializing attacker-controlled data
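The `pickle.loads()` pattern in that list deserves a concrete illustration, because pickle runs code during deserialization while JSON cannot. A minimal sketch; the `Pwn` class is a deliberately harmless stand-in for a real payload:

```python
import json
import pickle

record = {"url": "https://example.com", "status": 200}

# Safer: JSON can only yield plain data types (dicts, lists, strings,
# numbers), never code execution.
assert json.loads(json.dumps(record)) == record

class Pwn:
    # pickle calls __reduce__ on load, so unpickling attacker-supplied
    # bytes is equivalent to running attacker code. A real payload would
    # call os.system instead of print.
    def __reduce__(self):
        return (print, ("code ran during unpickling",))

evil = pickle.dumps(Pwn())
# pickle.loads(evil)  # would execute the print call; never unpickle untrusted bytes
```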
The `eval()` finding in mcp-shrimp-task-manager deserves attention specifically because of the tool's purpose. A task manager MCP server processes task descriptions and instructions from the AI. If an attacker can inject content into those descriptions — via a document the AI reads, via a web page it browses — `eval()` becomes a direct path to arbitrary execution on the developer's machine.
This is not a theoretical attack chain. It is the exact scenario that tool poisoning exploits are designed to trigger.
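Where a tool genuinely needs to evaluate structured data from task content, the standard library offers a code-free alternative; a sketch using `ast.literal_eval`:

```python
import ast

task_field = "[1, 2, 3]"                    # benign task content
injected = "__import__('os').system('id')"  # poisoned task content

# eval(injected) would execute the payload.
# ast.literal_eval only accepts Python literals (lists, dicts, strings,
# numbers, booleans) and raises on anything containing calls or names.
assert ast.literal_eval(task_field) == [1, 2, 3]
try:
    ast.literal_eval(injected)
except ValueError:
    print("rejected non-literal input")
```

For data that originates outside the process, `json.loads` is the even stricter choice, since it never touches Python syntax at all.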
Pattern 3: Type Safety Debt and Missing Error Handling
The most common findings across the ecosystem were not the dramatic ones. They were the quiet ones that accumulate into unreliable software:
| Category | Count | Description |
|---|---|---|
| type-checking | 1,537 | TypeScript errors (strict-mode violations) |
| missing-null-checks | 583 | Unchecked nullable access |
| console-log-production | 450 | Debug output leaking to production |
| print-statement | 370 | Python debug statements in server code |
| exception-details-exposed | 214 | Stack traces/internals sent to clients |
| missing-rate-limiting | 155 | No rate control on API-facing endpoints |
| missing-error-handling | 154 | Unhandled async operations |
| silent-exception-suppression | 141 | Bare `except:` / `catch {}` swallowing errors |
| async-without-try-catch | 119 | Unguarded async operations |
| any-type-usage | 108 | TypeScript type erasure |

Type-checking violations (1,537) are the single largest category — TypeScript errors caught under strict mode that many of these repos are likely not running with. Missing null checks and type erasure via `any` compound this: the type system is present but not providing safety guarantees.
The 450 `console.log` statements in production code matter more in MCP servers than in most contexts. These tools handle file contents, database results, API responses, and user data. Debug logging that was never removed ships sensitive content to wherever the server logs go — which in the MCP server context is often a developer's terminal, but could be a log aggregator visible to multiple people.
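One mitigation pattern, sketched here with a hypothetical redaction list (real credential formats vary widely), is to gate debug output behind an environment variable and redact obvious secrets before they reach the log sink:

```python
import logging
import os
import re

# Illustrative patterns only; a real list would cover provider-specific
# API-key formats, bearer tokens, connection strings, and so on.
REDACT = [re.compile(r"(?i)(api[_-]?key|token|password)\s*[:=]\s*\S+")]

class RedactingFilter(logging.Filter):
    """Rewrite log messages so credential-looking values never hit the sink."""
    def filter(self, record):
        msg = record.getMessage()
        for pat in REDACT:
            msg = pat.sub(r"\1=<redacted>", msg)
        record.msg, record.args = msg, None
        return True

logger = logging.getLogger("mcp-server")
logger.addFilter(RedactingFilter())
# Gate verbose output behind an env var instead of unconditional debug prints.
logger.setLevel(logging.DEBUG if os.getenv("MCP_DEBUG") else logging.INFO)
```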
The 155 missing-rate-limiting findings on API-facing endpoints represent a systemic gap. MCP servers are called by AI agents that can issue hundreds of tool invocations per session. Without rate controls, a single agentic workflow gone wrong can exhaust API quotas, run up costs, or create denial-of-service conditions against downstream services.
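A minimal token-bucket limiter, the kind of rate control these findings point at, fits in a few lines; the rates here are illustrative:

```python
import time

class TokenBucket:
    """Simple limiter for tool invocations: `rate` calls/sec, burst of `capacity`."""
    def __init__(self, rate: float, capacity: int):
        self.rate, self.capacity = rate, capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill proportionally to elapsed time, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=5, capacity=10)  # e.g. 5 tool calls/sec, burst of 10
```

A tool handler would check `bucket.allow()` at the top and return a throttling error to the agent instead of forwarding the call downstream.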
Notable Repos
**upstash/context7** (49,600 stars): The highest-starred repo in our scan. 454 total findings, 121 MCP-specific. The majority are type-checking violations (225) and missing null checks (93) — patterns that cause crashes under unexpected inputs. One high-severity path traversal finding in a file access path. No command injection or SQL injection, which is appropriate for a documentation context server.

**AgentDeskAI/browser-tools-mcp** (7,100 stars): A browser automation server — one of the highest-risk tool categories, because it acts on web content that is inherently attacker-controlled. 4 critical findings, including a command injection via `exec()` and a prototype pollution path. 130 `console.log` statements in 12 files; at roughly 11 per file, the highest density of debug logging relative to codebase size in the dataset.

**oraios/serena** (21,700 stars): A semantic code analysis tool with broad file system and database access. 236 findings across 137 scanned files — 25 critical, 69 high. Multiple SQL injections via string interpolation in database handlers, and 3 command injection instances, including patterns where shell commands are constructed from external variables. The combination of broad system access and injection vulnerabilities makes this the most significant risk profile in the dataset.

**cjo4m06/mcp-shrimp-task-manager** (2,100 stars): 711 findings across 123 files — the highest total count in the dataset. The distribution skews toward code quality (260 `console.log`, 79 missing null checks) but includes 22 critical findings: `eval()` usage, insecure deserialization, and missing error handling in async paths. A task manager that processes AI-generated task content is precisely the attack surface where `eval()` creates a direct exploitation path.

**exa-labs/exa-mcp-server** (4,000 stars): 48 MCP-specific findings, 0 critical. A relatively clean profile — type-checking (84) and null checks (32) dominate. The weak random number generation (10 findings) is lower risk in a search context than it would be in authentication or token generation, but worth addressing if the server generates any session identifiers. No injection vulnerabilities detected.
A Note on Scanner Limitations
We identified one known false positive class in this scan that we want to be explicit about.
CodeSlick's malicious package detector flagged request and req as known-malicious packages in several repos (D4Vinci/Scrapling, oraios/serena, firerpa/lamda). We verified this directly against the OSV.dev API: the request package has zero OSSF MAL-* entries. It is deprecated and no longer maintained, but it is not malicious. This is a false positive in our scanner that we are correcting.
We have excluded these findings from our security-relevant totals in this post. The aggregate stats above include them in the raw total, but the critical finding highlights and risk assessments above do not treat deprecated packages as malicious. We are releasing a fix for this false positive class in an upcoming CodeSlick update.
What This Means for Developers Installing MCP Servers
The 97% finding rate is not a reason to stop using community MCP servers. It is a reason to evaluate them before installing.
The risk profile of an MCP server is different from a regular npm package. A library with a null check bug might crash your app. An MCP server with a command injection bug is callable by your AI agent during normal operation — and if the data your agent processes can be influenced by an attacker, the injection can be triggered indirectly without the attacker ever touching your system directly.
Before installing an MCP server:
1. Review the tool handlers that accept parameters: are inputs validated before being passed to shell commands, SQL queries, or `eval()`?
2. Check for `exec()`, `subprocess.Popen()`, `eval()`, and `pickle.loads()` — in MCP servers, these are high-risk patterns even when intentional.
3. Consider the tool's access scope: a server with file system, database, and shell access deserves more scrutiny than one that only calls a read-only API.
4. Run a static analyzer before integrating into production workflows — the patterns that matter most are the ones in agent-facing code paths.
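The checklist above can be crudely approximated before a full scanner run with a quick pattern sweep. This sketch is illustrative only: the patterns are rough indicators for manual review, not proof of vulnerability:

```python
import pathlib
import re

# Crude risk indicators; every hit needs human review in context.
RISKY = {
    "eval-call": re.compile(r"\beval\s*\("),
    "shell-exec": re.compile(r"\b(child_process\.)?exec\s*\(|subprocess\.Popen|os\.system"),
    "pickle-load": re.compile(r"pickle\.loads?\s*\("),
}

def sweep(root: str):
    """Return (file, line, pattern-name) tuples for risky-looking calls."""
    hits = []
    for path in pathlib.Path(root).rglob("*"):
        if path.suffix not in {".py", ".js", ".ts"}:
            continue
        text = path.read_text(errors="ignore")
        for name, pat in RISKY.items():
            for m in pat.finditer(text):
                line = text.count("\n", 0, m.start()) + 1
                hits.append((str(path), line, name))
    return hits
```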
If you maintain an MCP server that appeared in this scan, these findings are a starting point, not a verdict. Static analysis generates false positives and misses context. The findings worth prioritizing first are command injection, SQL injection, and eval() — not because the others do not matter, but because these are the ones with a direct path from AI input to system compromise.
Scan Your MCP Server
We built CodeSlick's 12 MCP-specific checks specifically for this problem — detecting the vulnerability patterns most dangerous in agent-facing tool code. If you build or maintain an MCP server, you can scan it at codeslick.dev, via the CLI (`npx codeslick scan --mcp`), or directly from Claude using our MCP server for MCP server security — which is perhaps the most recursive thing we have shipped.
```shell
npm install -g codeslick-mcp-server

# Then add to Claude Code settings:
#   npx codeslick-mcp

# Tools available:
#   analyze_code      — 306 security checks across JS/TS/Python/Java/Go
#   scan_dependencies — CVE scanning via OSV.dev (npm, pip, maven, go)
#   generate_sbom     — SPDX 2.3 + CycloneDX 1.4 generation
#   detect_secrets    — 38 hardcoded secret patterns
```

The full scan dataset — per-repo JSON results and the aggregated summary — is available in the CodeSlick repository for anyone who wants to verify the findings or build on the data.