API Evangelist

Agent Skills

Packaged capability units that AI agents load to execute API operations

Agent skills are the latest expression of a question I’ve been asking the API community since at least 2016: what does your API actually let someone do? Not what database table it exposes, not what HTTP endpoints it has, but what meaningful, business-relevant capability it delivers. The vocabulary has changed several times over the years — skills, then capabilities, now agent skills — but the underlying problem has been remarkably consistent. We have spent two decades getting good at exposing technical resources and remarkably bad at packaging those resources into self-describing units of value that something other than the original developer can discover, understand, and put to use.

The first time I wrote seriously about skills was in 2016, and the trigger was Amazon Alexa. Alexa’s developer model was built around “skills” — discrete, named capabilities that the platform could invoke in response to a voice request. I wasn’t convinced that voice and bots were the future, and I said so at the time, but I was completely convinced that the skills framing was important regardless of whether you cared about voice at all. Thinking in terms of the skills your API enables — rather than the resources it exposes — better reflected the actual journey developers and end users were on. People didn’t want a database endpoint. They wanted to accomplish something. The skill was the unit that described the something.

In 2018 I framed Alexa Skills as the poster child for enterprise API efforts, and the reason had nothing to do with voice. I sat in an enterprise IT architecture meeting where an executive showed a five-step Alexa conversation alongside a complex diagram of the organization’s backend systems. Each step in that simple conversation reached into a different system, weaving a complicated web of connections to answer a basic question. That slide captured the whole problem. The skill, meaning the conversational capability the business wanted, was simple to describe and enormously complicated to actually deliver, because the underlying systems had never been organized around delivering capabilities. They had been organized around storing data. I wrote at the time that the daunting challenge wasn’t the conversational interface; it was getting at the right data to provide a relevant answer. I also wrote that the biggest obstacles would be human and political, not technical. That observation has held up completely as we’ve moved from voice agents to AI agents.

The skills concept matured for me into the broader idea of capabilities, which I’ve been working on intensively. A capability, in the framing I’ve developed alongside people like Daniel Kocot, Christian Posta, Mike Amundsen, and Kevin Swiber, is a business-aligned, self-describing, semantically rich declaration of what a system can do. Not an API capability specifically — a capability, full stop. Something the business needs to be capable of, expressed in a way that both the business and engineering sides can understand, that exists within a clear domain with a well-defined vocabulary, that has clear boundaries, that is composable and discoverable and machine-readable and governed. The skill and the capability are the same instinct expressed at different moments: package the value, not the plumbing.
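To make that list of properties concrete, here is a minimal sketch of a capability declaration in Python. Every field name and the `missing_business_context` helper are assumptions made for illustration, not part of any published capability format. The sketch keeps the business side (domain, outcome, vocabulary, owner) next to the technical operations so that a gap on either side is visible.

```python
from dataclasses import dataclass, field


@dataclass
class Capability:
    """Hypothetical capability declaration; all field names are illustrative."""
    name: str                                                 # what the business calls it
    domain: str = ""                                          # the bounded domain it lives in
    business_outcome: str = ""                                # what it lets someone accomplish
    vocabulary: dict[str, str] = field(default_factory=dict)  # term -> agreed meaning
    operations: list[str] = field(default_factory=list)       # the technical plumbing
    owner: str = ""                                           # who governs it

    def missing_business_context(self) -> list[str]:
        """Return the business-side fields that were left empty."""
        checks = {
            "domain": self.domain,
            "business_outcome": self.business_outcome,
            "vocabulary": self.vocabulary,
            "owner": self.owner,
        }
        return [name for name, value in checks.items() if not value]


# A declaration with only the technical half filled in (made-up example).
refund = Capability(
    name="Issue refund",
    operations=["POST /payments/{id}/refunds"],
)
print(refund.missing_business_context())
# ['domain', 'business_outcome', 'vocabulary', 'owner']
```

The design choice worth noting is that the technical operations are just one field among several. A declaration that fills in only `operations` is valid Python but reports everything else as missing.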

What I keep finding is that everyone is willing to do the technical work of defining a capability, and almost no one is willing to do the business work. This isn’t a nefarious thing — it’s just that engineering culture has never prioritized it. The engineers I sit down with to nail the technical details aren’t always equipped or willing to work through the business details, and the business people aren’t always equipped or willing to work through the technical ones. The rare beast is the person who can do both. The result is that capabilities — and now agent skills — get the lion’s share of technical specification and almost no business specification, and the two drift apart over time. That drift is exactly how you get the friction and pain we associate with legacy systems. Agent skills don’t escape this. A skill that codifies technical operations without codifying the business context around them is just a faster path to the same disconnect.

When the current wave of AI agents arrived, my position was contrarian and I’ve held it: I care deeply about the knowledge expressed in agent skills, and almost not at all about the agents themselves. The agent hustle is someone else’s. What interests me is that the AI evolution finally gave people the vocabulary and the motivation to codify the value that exists inside an enterprise — to write down what matters and express it in a useful, structured, discoverable way. I had been trying to get people to do this for years through the SDK property in APIs.json and got very little uptake. It took the agentic moment to make people care. That’s worth being honest about. The technology I’m skeptical of created the conditions for work I’ve wanted done all along.

The example that crystallized this for me was Speakeasy publishing their agent skills in early 2026. I dug into it not because I cared about agents but because the knowledge packed into those skills was beautiful — a clear, structured list of exactly what matters at the SDK and code-generation layer of API operations. Start a new SDK project. Diagnose generation failures. Write OpenAPI specs. Manage overlays. Generate an MCP server. Set up SDK testing. Each one a discrete, named, business-relevant capability with the condensed, battle-tested knowledge needed to actually do it. As a systems thinker I needed to deconstruct what they had built so I could apply it across the rest of the API lifecycle, from both producer and consumer vantage points.

That deconstruction surfaced the file-level conventions that now define how agent skills are packaged. There is a whole emerging family of markdown files that tell an AI how to behave in your project, and they are all converging on the same idea from different vendors: CLAUDE.md from Anthropic for Claude Code, RULES.md from Cursor, GEMINI.md from Google, and copilot-instructions.md from GitHub. These live at the project-configuration level as persistent, project-scoped system prompts. Then there is SKILL.md, the actual unit of an agent skill: condensed best-practice instructions for accomplishing a specific kind of task, which the agent reads before acting. And there is AGENTS.md for orchestrating multi-agent workflows. The distribution layer matters too: Speakeasy used .claude-plugin/plugin.json as a per-plugin manifest and marketplace.json as a catalog, alongside the long-standing README.md and LICENSE.md that predate all of this. The format and location differ by tool, but the pattern is now universal: a markdown file in your repo that codifies knowledge for non-human consumers.
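The layering described above can be sketched as a small inventory script. The file names come straight from the paragraph; the layer labels (`project-config`, `skill`, `orchestration`, `distribution`, `long-standing`), the function names, and the example paths are my own shorthand and should be read as assumptions, not as any tool's official taxonomy.

```python
from pathlib import PurePosixPath

# File names as listed in the text; layer labels are illustrative shorthand.
LAYERS = {
    "CLAUDE.md": "project-config",
    "RULES.md": "project-config",
    "GEMINI.md": "project-config",
    "copilot-instructions.md": "project-config",
    "SKILL.md": "skill",
    "AGENTS.md": "orchestration",
    "plugin.json": "distribution",
    "marketplace.json": "distribution",
    "README.md": "long-standing",
    "LICENSE.md": "long-standing",
}


def classify(path: str) -> str:
    """Map a repo-relative path to a layer based on its file name alone."""
    return LAYERS.get(PurePosixPath(path).name, "other")


def inventory(paths: list[str]) -> dict[str, list[str]]:
    """Group the recognized files by layer, ignoring everything else."""
    found: dict[str, list[str]] = {}
    for path in paths:
        layer = classify(path)
        if layer != "other":
            found.setdefault(layer, []).append(path)
    return found


# Hypothetical repository listing, invented for this example.
repo = [
    "README.md",
    "CLAUDE.md",
    ".claude-plugin/plugin.json",
    "skills/start-new-sdk-project/SKILL.md",
    "src/main.py",
]
for layer, files in inventory(repo).items():
    print(layer, files)
```

Matching on the file name alone is a deliberate simplification. Since format and location differ by tool, a real inventory would probably need per-tool path rules as well.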

Where agent skills connect to my long-running skepticism of MCP is in what they reveal about layering. I’ve been critical of MCP since 2025 — not because the use case is invalid, but because it tends to arrive as a tactical shortcut that skips the design, product, and business work, and because its early disregard for authentication and semantics was a warning, not a detail. Agent skills are interesting precisely because, done well like Speakeasy did, they carry the knowledge and the business context. The skill isn’t just “call this endpoint.” It’s “here is what you’re trying to accomplish, here is how to do it well, here are the pitfalls.” That’s the layer that MCP-as-shortcut skips. Capabilities, and the best agent skills, are everything you need to integrate with AI — but only if you do the business work alongside the technical work, which is the part everyone wants to wave their hands about.
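A rough sketch of that difference, assuming a SKILL.md that opens with a small `name` and `description` front matter block followed by a markdown body. The section headings, the `render_skill` function, and the refund example are all hypothetical and are not taken from Speakeasy's skills. The point is only that the goal, the how, and the pitfalls are required inputs, so a skill cannot be reduced to a bare endpoint call.

```python
def render_skill(name: str, description: str, goal: str,
                 steps: list[str], pitfalls: list[str]) -> str:
    """Render a SKILL.md-style document; the layout is an illustrative assumption."""
    if not (goal and steps and pitfalls):
        raise ValueError("a skill needs a goal, steps, and pitfalls, not just an endpoint")
    lines = [
        "---",
        f"name: {name}",
        f"description: {description}",
        "---",
        "",
        "## Goal",
        goal,
        "",
        "## How to do it well",
    ]
    lines += [f"{number}. {step}" for number, step in enumerate(steps, start=1)]
    lines += ["", "## Pitfalls"]
    lines += [f"- {pitfall}" for pitfall in pitfalls]
    return "\n".join(lines) + "\n"


# Made-up content for illustration only.
skill_text = render_skill(
    name="issue-refund",
    description="Return money to a customer for a settled charge.",
    goal="Refund the customer without creating a duplicate refund.",
    steps=[
        "Look up the original charge and confirm it has settled.",
        "Send the refund request with an idempotency key.",
    ],
    pitfalls=["Retrying without an idempotency key can refund twice."],
)
print(skill_text)
```

Calling `render_skill` with an empty `pitfalls` list raises a `ValueError`, which is the code-level version of refusing to ship a skill that only says which endpoint to call.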

The throughline from Alexa skills in 2016 to agent skills in 2026 is that we keep rediscovering the same truth from new directions. The valuable unit is not the resource, the endpoint, or the database. It is the packaged, self-describing, business-aligned capability — the thing that says “here is what I let you accomplish, and here is what you need to know to accomplish it.” Voice agents needed it. Conversational interfaces needed it. AI agents need it now. Each wave makes people care about codifying enterprise knowledge in a way the previous wave couldn’t. The skills are the gold. The agents are just the current reason people are finally willing to dig for it.
