API Evangelist API Evangelist
Guidance
API Learnings
APIs
API Governance
API Solutions
API Discovery
API Building Blocks
API Evangelist LLC

Discovery

Finding and cataloging APIs across an organization or the web

API discovery is the problem I’ve spent more time on than almost any other, partly because it’s genuinely hard and partly because I tried to solve it myself and learned exactly how hard it is. Discovery is the deceptively simple question of how you find the API you need — whether you’re a developer searching the open web for a capability, an enterprise trying to inventory the hundreds of APIs scattered across its own teams, or now an AI agent trying to locate and invoke the right tool at runtime. It sounds like it should be solved by now. It is not. After fifteen years, more directories, more search engines, more registries, and more standards than I can count, API discovery remains one of the most stubbornly unsolved problems in the entire API space, and I’ve come to believe that’s because it’s not really a technical problem at all.

ProgrammableWeb was the original answer, and its arc tells you most of what you need to know about the discovery problem. John Musser built ProgrammableWeb starting in 2005, and for over a decade it was the directory of public APIs — the place you went to find out what existed, a hand-curated catalog that grew from a few hundred APIs into thousands. I wrote its history because it deserved one; it was foundational infrastructure for the entire API economy. But the directory model had a fundamental flaw: it depended on manual curation, and manual curation cannot scale to a world of hundreds of thousands of APIs. When ProgrammableWeb shut down in 2022, it was the end of an era, and it was also a confirmation of something I’d been saying for years — that the centralized, manually-maintained directory was never going to be the answer to discovery at web scale. The directory could tell you what a human editor had time to catalog. It couldn’t keep pace with reality.

My own attempt to solve this was APIs.json, and I’m still proud of it even though it never achieved the adoption I hoped for. I launched APIs.json in 2014 after months of work on what I called the five-month journey toward a stable discovery format, and the core idea was simple: instead of a centralized directory that someone has to maintain, what if every API provider published a small, machine-readable index file — APIs.json — at a well-known location on their domain, the way sitemap.xml works for web pages? I’d been chasing the sitemap-for-APIs analogy for a while, and APIs.json was the realization of it. The file would point to the API’s documentation, its OpenAPI definition, its terms of service, its pricing, its support channels — all the operational metadata a consumer or a crawler needs. APIs.io was the search engine built on top of it, the GET /apis vision: a distributed, decentralized discovery layer where the index emerges from what providers publish rather than from what a central authority catalogs. I described APIs.json as a distributed transport layer for the API economy, and that was the ambition — discovery that scales because it’s decentralized, the way the web itself scales.

The deep lesson of APIs.json, and the reason I keep returning to it, is why it didn’t take over the world despite being, I still think, basically the right design. The technology worked. The problem was incentives and adoption. Providers didn’t publish APIs.json files because there was no immediate, obvious benefit to them — discovery helps consumers and the ecosystem more than it helps the individual provider, and that’s a classic collective-action problem. The same dynamic that killed UDDI in the SOAP era was at work: discovery is a commons, and commons are chronically under-invested in because the benefit is diffuse while the cost is concentrated. I learned that one definition to rule them all, as I’d hoped for in 2013, was never going to happen through technical elegance alone, because the obstacle was never technical elegance. It was that nobody whose participation was required had a strong enough reason to participate.

Internal discovery turned out to be where the problem is most acute and most solvable, and where my thinking matured most. The enterprise version of the discovery problem is brutal: most large organizations genuinely do not know how many APIs they have, where they are, who owns them, or what they do. I’ve written some version of “it’s Monday morning, do you know where all your APIs are?” more times than I can count, because the answer is almost always no. You cannot govern, secure, or rationalize an API portfolio you can’t see, and I’ve argued repeatedly that you have to know where all your APIs are before you can deliver on API governance — discovery is the precondition for everything else. The internal discovery problem is more solvable than the open-web one precisely because the incentives are clearer: the organization that pays for the redundancy, the security exposure, and the maintenance burden of undiscovered APIs is the same organization that benefits from finding them. When I write that lack of investment in API discovery drives redundancy and inefficiency, the internal enterprise context is where that cost is most concretely felt.

The reframe that mattered most came when I stopped thinking about discovery as finding APIs and started thinking about it as finding capabilities. I wrote in 2017 that API discovery would eventually be about finding companies that do what you need, with the API simply assumed — because consumers don’t actually want an API, they want a capability, and the API is just how they access it. That instinct has only deepened. By 2026 I was writing that a list of a thousand APIs doesn’t do me much good when I just need to be capable of doing something specific. The directory-era framing — here are all the APIs, go find yours — is exactly backwards. What a consumer needs is to express an intent and be matched to the capability that fulfills it. This is a semantic problem, a discovery-meets-meaning problem, and it’s why I’ve insisted that discovery is bound up with schema, semantics, hypermedia, and workflows rather than being a simple matter of listing endpoints.

Trust is the dimension of discovery I underweighted for years and now think is central. I wrote in 2025 that trust is the secret ingredient missing in API discovery, and the point is this: finding an API is not the same as being able to trust it. In a world of hundreds of thousands of APIs, the hard problem isn’t locating candidates — it’s knowing which ones are reliable, well-maintained, secure, honestly documented, and safe to depend on. A discovery system that returns a thousand results with no basis for trust has just moved the problem, not solved it. This is why APIs.io evolved to include a Spectral-powered ratings system — discovery has to carry quality and trust signals, not just existence signals. The operational fingerprint of an API, the evidence of how it’s actually run, becomes part of what discovery has to surface.

The AI agent era has made all of this suddenly, urgently relevant, and it’s vindicated a lot of the discovery work that looked like a dead end. When everyone got excited about AI agents calling APIs, I wrote that you see AI agents, but API Evangelist just sees API discovery, semantics, hypermedia, and workflows — because an agent trying to find and invoke the right API at runtime is the discovery problem in its purest form. The agent needs machine-readable indexes, semantic descriptions, capability matching, trust signals, and runtime workflows — exactly the things APIs.json and the broader discovery research were reaching toward a decade early. By 2026 I was making the API Evangelist network itself agent-readable and measuring how agent-ready it was, because the discovery infrastructure that agents need is the discovery infrastructure I’ve been building and advocating for all along. And my newest thinking is that agent-era discovery will need to be contextual and ephemeral — not a static directory you consult once, but a dynamic, just-in-time matching of intent to capability that exists for the moment it’s needed and then dissolves. The directory was a noun. Discovery is becoming a verb.

The throughline across fifteen years is that discovery is a commons problem wearing a technical costume. Every technical approach — directories, APIs.json, search engines, registries, GitHub-as-index, semantic schema, agent-readable metadata — addresses a real piece of it, and none of them solves it alone, because the binding constraint has always been collective investment in shared infrastructure that benefits everyone diffusely and no one concentratedly. API discovery is hard, as I titled a post in 2023, and it’s hard in a specific way: the technology is mostly tractable, but getting an entire ecosystem to publish, maintain, and trust shared discovery infrastructure runs against the grain of how individual providers behave. I still believe in the decentralized, publish-your-own-index vision, and I think the AI agents that now desperately need discovery might finally supply the incentive that human developers never quite did. If agents make discovery economically necessary in a way it never was when humans were doing the finding, then the infrastructure I’ve been building toward for a decade might finally get the investment the commons always needed. That would be a fitting resolution to the problem I’ve chased the longest.

References