Agent Plugin and Skill Security: How Enterprises Should Evaluate Install Risk
A practical security checklist for Agent Plugins and Skills: provenance, scanning, least privilege, signed releases, runtime controls, telemetry privacy, and release gates.
Enterprise teams are right to be cautious about installing Agent Plugins and Skills. A plugin can contain instructions, scripts, connectors, MCP config, commands, hooks, agents, and assets. That makes it useful, but it also means the plugin can become a supply-chain risk if nobody checks what it declares, what it can access, and what it actually does.
Treat the plugin as the installable product and each Skill, connector, hook, command, MCP server, agent, and asset as a component with its own risk profile. Security review should happen before installation, again before each new version, and continuously through runtime telemetry.
What can go wrong
Most plugin risk is not exotic. It usually comes from one of these patterns:
| Risk | What it looks like |
|---|---|
| Hidden instructions | A SKILL.md or command contains behavior the user never sees in the description. |
| Data exfiltration | The plugin asks the agent to send prompts, files, records, screenshots, or connector payloads to an external service. |
| Overbroad permissions | MCP tools, connectors, scripts, or browser workflows can do more than the plugin's job requires. |
| Tool poisoning | A tool or MCP description steers the agent toward unsafe behavior. |
| Dangerous code | Scripts invoke shell commands, network calls, filesystem writes, or package installs without clear need. |
| Dependency risk | A plugin bundles vulnerable, abandoned, or typosquatted packages. |
| Description-behavior mismatch | The marketplace page says "review invoices" but the component can create, delete, or modify records. |
| Telemetry leakage | Analytics accidentally include prompts, file contents, tool arguments, model outputs, or customer data. |
The security target is simple: the plugin's description, component inventory, permissions, code, runtime behavior, and telemetry policy should all agree.
Use scanning as a release gate
Security scanners are becoming part of the agent plugin workflow. NVIDIA documents SkillSpector as a scanner for agent skills that can review repositories, URLs, zip files, directories, and individual SKILL.md files. Its documented checks include prompt injection, data exfiltration, privilege escalation, supply-chain issues, excessive agency, dangerous code patterns, MCP least privilege, MCP tool poisoning, and known vulnerable dependencies.
That pattern is the right baseline: scan before install, scan before release, and fail the release on critical or high-risk findings unless someone formally accepts the risk.
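As a minimal sketch of that gate, assuming the scanner's findings have already been parsed into simple records: the `Finding` fields, the severity labels, and the idea of a set of accepted-risk ids are illustrative assumptions, not the output format of any particular scanner.

```python
from dataclasses import dataclass

# Severities that block a release unless a reviewer has formally accepted the risk.
BLOCKING_SEVERITIES = {"critical", "high"}


@dataclass(frozen=True)
class Finding:
    finding_id: str  # hypothetical identifier assigned by the scanner
    severity: str    # assumed labels: "critical", "high", "medium", "low"
    component: str   # component the finding applies to


def blocking_findings(findings, accepted_risk_ids=frozenset()):
    """Return the findings that should fail the release."""
    return [
        f for f in findings
        if f.severity.lower() in BLOCKING_SEVERITIES
        and f.finding_id not in accepted_risk_ids
    ]


def release_allowed(findings, accepted_risk_ids=frozenset()):
    """True only when no unaccepted critical or high finding remains."""
    return not blocking_findings(findings, accepted_risk_ids)
```

In CI, a failing `release_allowed` would typically end the job with a non-zero exit code. Keeping accepted risks as an explicit list of ids makes every exception visible in review.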
Scanning should cover both static and semantic concerns:
| Review type | What it catches |
|---|---|
| Static checks | Suspicious strings, dangerous APIs, dependency vulnerabilities, unexpected network calls, unsafe shell commands, and malformed manifests. |
| Semantic checks | Whether the component's behavior matches its description, whether triggers are vague or overbroad, and whether the plugin asks the agent to do more than the user expects. |
Scanning does not replace product review. A clean scan means "no known issue detected," not "safe forever."
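To make the static row of the table concrete, here is a deliberately small sketch of a pattern-based check over the text of a Skill or script. The three rules and their names are illustrative assumptions; they are not how SkillSpector or any production scanner works, and a real rule set would be much larger and actively maintained.

```python
import re

# Illustrative patterns only; a real scanner needs a far broader, maintained rule set.
SUSPICIOUS_PATTERNS = {
    "shell_pipe_to_interpreter": re.compile(r"\b(curl|wget)\b[^\n|]*\|\s*(sh|bash)\b"),
    "package_install": re.compile(r"\b(pip|npm)\s+install\b"),
    "hidden_instruction": re.compile(
        r"ignore (all )?(previous|prior) instructions", re.IGNORECASE
    ),
}


def scan_text(name, text):
    """Return (file name, line number, rule) for each line that matches a pattern."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(line):
                hits.append((name, line_number, rule))
    return hits
```

A check like this only flags lines for a human or a semantic reviewer to look at. It cannot tell whether a flagged command is actually needed for the plugin's job.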
Enterprise install checklist
Before approving a plugin, ask for these artifacts:
- Plugin identity: stable plugin id, publisher, version, source repository, release notes, and install targets.
- Component inventory: every Skill, connector, agent or sub-agent, hook, MCP config, command, tool, and asset.
- Permission map: what each component can read, write, execute, call, or send externally.
- Source provenance: where the package came from, who maintains it, and whether releases are signed.
- Scan report: static scan, dependency scan, and semantic behavior review for the exact version being installed.
- Runtime policy: whether components run in a sandbox, require confirmation, or can mutate user systems.
- Telemetry policy: which events are emitted, plus confirmation that no prompts, file contents, connector payloads, tool arguments, screenshots, browser DOM, or model outputs are sent.
- Rollback path: how to disable, remove, or revert the plugin if it behaves unexpectedly.
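The checklist above can be turned into a simple intake check. This is a sketch under stated assumptions: the submission is a plain dictionary, and the key names (including `scanned_version`) are hypothetical, chosen only to mirror the artifacts listed.

```python
# Hypothetical key names mirroring the artifacts in the checklist.
REQUIRED_ARTIFACTS = (
    "plugin_identity",
    "component_inventory",
    "permission_map",
    "source_provenance",
    "scan_report",
    "runtime_policy",
    "telemetry_policy",
    "rollback_path",
)


def missing_artifacts(submission):
    """Return required artifacts that are absent or empty in the submission."""
    return [name for name in REQUIRED_ARTIFACTS if not submission.get(name)]


def scan_matches_version(submission):
    """The scan report must cover the exact version being installed."""
    identity = submission.get("plugin_identity") or {}
    report = submission.get("scan_report") or {}
    version = identity.get("version")
    return version is not None and report.get("scanned_version") == version
```

The version comparison matters as much as the presence check: a scan report for an earlier release says nothing about the package actually being installed.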
Least privilege by component
Do not approve permissions at the plugin level and hope each component behaves. Review the actual component that needs the permission.
| Component | Security question |
|---|---|
| Skill | Does the description match the task and include clear "must not" behavior? |
| Connector | Is it read-only when the workflow only needs review? Are write scopes isolated? |
| MCP config | Are exposed tools limited to the plugin's job? Are tool descriptions honest and narrow? |
| Hook | What event triggers it? Can it run shell commands or send data externally? |
| Command | Does it accept untrusted input? Can it modify files, accounts, or settings? |
| Agent or sub-agent | Is its role bounded, and can it delegate to tools outside the plugin's purpose? |
| Asset | Could it contain hidden content, remote references, or misleading UI? |
The safest plugins make the risky path explicit. For example, "review expenses" and "submit an expense" should be separate capabilities, with the submit path requiring user confirmation.
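A minimal sketch of that separation, assuming each capability declares its own scopes. The scope strings, the `Capability` type, and the `requires_confirmation` flag are hypothetical names used for illustration, not part of any plugin specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    name: str
    scopes: frozenset  # scopes granted to this capability
    requires_confirmation: bool = False


# Hypothetical scope names; anything in this set mutates external state.
WRITE_SCOPES = frozenset({"expenses:write", "expenses:delete"})


def excess_scopes(capability, needed_scopes):
    """Scopes granted to the capability beyond what its job needs."""
    return capability.scopes - frozenset(needed_scopes)


def policy_violations(capability, needed_scopes):
    """List least-privilege problems for one capability."""
    problems = [
        f"unneeded scope: {scope}"
        for scope in sorted(excess_scopes(capability, needed_scopes))
    ]
    if capability.scopes & WRITE_SCOPES and not capability.requires_confirmation:
        problems.append("write-capable without user confirmation")
    return problems
```

Under this model, a "review expenses" capability holding only `expenses:read` passes, while the same capability quietly carrying `expenses:write` is reported twice: once for the unneeded scope and once for the missing confirmation.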
Telemetry should be closed and metadata-only
Security teams often reject plugins because analytics are vague. Do not ask them to trust a black box. Define the event envelope: the exact event names and fields the plugin is allowed to emit.
Good plugin telemetry records facts like:
- plugin.install
- skill.invocation.start
- skill.invocation.end
- plugin.component.invoked
- plugin.component.error
- plugin id, component id, version, harness, duration, status, and coarse error category
It should not record:
- user prompts
- file contents
- connector payloads
- tool arguments
- browser DOM
- screenshots
- transaction text
- account names
- model outputs
Telvine's event schema is designed for this closed-envelope model: enough data to measure adoption, errors, latency, version quality, and component usage without collecting sensitive user content.
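One way to enforce a closed envelope is an allowlist check applied before any event leaves the process. The sketch below is an assumption-laden illustration, not Telvine's actual schema: the event names come from the list above, while the field names (`plugin_id`, `duration_ms`, and so on) are hypothetical.

```python
ALLOWED_EVENTS = frozenset({
    "plugin.install",
    "skill.invocation.start",
    "skill.invocation.end",
    "plugin.component.invoked",
    "plugin.component.error",
})

# Field names are assumptions derived from the metadata listed above.
ALLOWED_FIELDS = frozenset({
    "event", "plugin_id", "component_id", "version",
    "harness", "duration_ms", "status", "error_category",
})


def validate_event(event):
    """Return a list of problems; an empty list means the event fits the envelope."""
    problems = []
    if event.get("event") not in ALLOWED_EVENTS:
        problems.append(f"unknown event name: {event.get('event')!r}")
    for field in sorted(set(event) - ALLOWED_FIELDS):
        problems.append(f"field not in envelope: {field}")
    return problems
```

An allowlist rejects anything not explicitly permitted, so a new field such as a prompt or tool argument fails by default. It does not inspect values, so a stricter version would also constrain each field to short enumerated strings or numbers.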
A practical release workflow
Use this workflow for every plugin version:
npm i -g @telvine/cli
telvine login
telvine publish ./my-plugin
Then gate the release:
- Build the plugin package from source.
- Generate the component inventory.
- Run static scans on manifests, Skills, scripts, hooks, MCP config, and dependencies.
- Run semantic checks for description-behavior mismatch.
- Confirm telemetry emits only metadata.
- Review write-capable connectors, commands, and hooks manually.
- Sign or attest the release package.
- Publish the approved version.
- Watch runtime events for errors, unexpected component usage, and adoption anomalies.
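The final step, watching runtime events, can start very simply: compare what actually ran against the approved component inventory. This sketch assumes metadata-only events shaped as dictionaries with hypothetical `event`, `component_id`, and `status` keys, and assumes `"ok"` is the success status.

```python
from collections import Counter


def unexpected_components(inventory, events):
    """Count invocations of component ids that the approved inventory does not list."""
    approved = set(inventory)
    counts = Counter()
    for e in events:
        if e.get("event") != "plugin.component.invoked":
            continue
        component_id = e.get("component_id", "<missing>")
        if component_id not in approved:
            counts[component_id] += 1
    return dict(counts)


def error_rate(events):
    """Share of finished skill invocations whose status is not "ok" (assumed label)."""
    finished = [e for e in events if e.get("event") == "skill.invocation.end"]
    if not finished:
        return 0.0
    failed = sum(1 for e in finished if e.get("status") != "ok")
    return failed / len(finished)
```

Any non-empty result from `unexpected_components` suggests that the inventory generated at release time and the runtime behavior have drifted apart, which is worth investigating before the next version ships.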
What buyers should ask vendors
If you are buying an Agent Plugin, ask:
- Which components are included in this plugin version?
- Which components can read data, write data, execute code, or call external services?
- Do you scan each release for prompt injection, data exfiltration, tool poisoning, MCP least privilege, and dependency risk?
- Do you provide scan reports or CI evidence for the exact version we install?
- Is the package signed or otherwise tied to a verified publisher?
- Can we disable individual components without removing the whole plugin?
- What telemetry do you collect, and can you prove it excludes prompts and customer content?
- How do we roll back a version?
If the vendor cannot answer those questions, the plugin is not enterprise-ready.