In the early days of internet applications, the unit of work was a function call, a database query, or an HTTP request. The last of these was formalized into APIs, and we saw the rise of API design and microservices architecture. Services talked to each other through structured interfaces and well-defined contracts. Infrastructure patterns evolved around this model: Layer 4 handled traffic at the socket level, while Layer 7 handled application-level routing based on paths, headers, and cookies.
But is something fundamental changing with AI? In AI-native systems, the primary unit of work is no longer a structured API call. It's a prompt. A prompt isn't just data; it's an open-ended instruction expressed in natural language. It doesn't follow a spec. It's not typed or versioned the way an API might be. It might ask for a summary, a chart, a line of code, a meal plan, or all of the above. And it's often sent to a generic endpoint like /v1/chat/completions with a POST payload that looks identical regardless of intent.
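To make that concrete, here is a minimal sketch, using Python's requests library against a placeholder OpenAI-style chat completions endpoint, of two requests with entirely different intents that look identical at the HTTP layer (the URL and API key are stand-ins):

```python
import requests

API_URL = "https://api.example.com/v1/chat/completions"  # placeholder endpoint
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}        # placeholder key

# Two very different tasks: rewriting marketing copy vs. generating code.
# Same method, same path, same headers, same JSON shape -- only the
# free-form text inside "content" differs.
pitch_request = {
    "model": "gpt-4",
    "messages": [{"role": "user",
                  "content": "Can you turn this bullet list into a short product pitch? ..."}],
}
code_request = {
    "model": "gpt-4",
    "messages": [{"role": "user",
                  "content": "Write a Python function that deduplicates a list."}],
}

for payload in (pitch_request, code_request):
    # A Layer 7 proxy sees two indistinguishable POSTs; the intent is
    # buried in natural language, not in any routable field.
    requests.post(API_URL, headers=HEADERS, json=payload, timeout=30)
```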
So how do we route it? How do we inspect it? How do we handle prompts at scale?
Routing at Layer 7 made sense when applications were deterministic. /cart/checkout clearly meant one thing. But in an LLM-based system, the meaning is embedded in free-form language: "Can you turn this bullet list into a short product pitch?"
That's a meaningful task, but there's no structured metadata telling us what it is. You have to understand the intent to process and route it properly. Your application may want to quickly reject jailbreak attempts. Maybe the prompt goes to GPT-4 if the task is creative writing, to Claude if it's summarization, or to a cheaper local model if the quality bar is lower. Or maybe it's routed to an agent well suited for that task.
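As a rough sketch of what that decision might look like in code: the routing table and the classify_intent heuristic below are illustrative assumptions (a real system would likely use a dedicated classifier model rather than keyword matching):

```python
from dataclasses import dataclass

@dataclass
class Route:
    target: str   # model, agent, or "reject"
    reason: str

# Illustrative routing table: intent label -> where the prompt should go.
ROUTES = {
    "jailbreak":        Route("reject", "blocked before it ever reaches a model"),
    "creative_writing": Route("gpt-4", "higher quality bar for open-ended writing"),
    "summarization":    Route("claude", "good fit for condensing long inputs"),
    "general":          Route("local-small-model", "cheaper path when the bar is lower"),
}

def classify_intent(prompt: str) -> str:
    """Toy keyword heuristic standing in for a real intent classifier."""
    p = prompt.lower()
    if "ignore your instructions" in p or "ignore previous instructions" in p:
        return "jailbreak"
    if "summarize" in p or "tl;dr" in p:
        return "summarization"
    if "pitch" in p or "story" in p or "poem" in p:
        return "creative_writing"
    return "general"

def route(prompt: str) -> Route:
    return ROUTES[classify_intent(prompt)]

print(route("Can you turn this bullet list into a short product pitch?"))
# -> Route(target='gpt-4', reason='higher quality bar for open-ended writing')
```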
This kind of routing isn’t application-layer in the traditional sense. It’s more of an intent-layer. Would it then be fair to define a new layer of the OSI model? What about Layer 8?
Layer 8 is not technically part of the OSI model. It's a new framing, more of a metaphor, reflecting the way we now have to build and operate AI-native systems. If Layer 7 routes based on protocol semantics (like Host headers and URL paths), then Layer 8 routes based on goal semantics: the underlying intent behind the prompt.
It's the layer where intent is inferred from the prompt itself, unsafe requests are screened out, and work is dispatched to the model or agent best suited for the task.
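One way to see the distinction is to put a Layer 7 rule and a Layer 8 rule side by side. The field names here are made up for illustration, not any real proxy's configuration schema:

```python
# Layer 7: match on protocol semantics -- host, path, headers.
l7_rule = {
    "match":      {"host": "shop.example.com", "path_prefix": "/cart/checkout"},
    "forward_to": "checkout-service",
}

# Layer 8: match on goal semantics -- the inferred intent behind the prompt.
l8_rule = {
    "match":      {"intent": "summarization", "min_confidence": 0.8},
    "forward_to": "claude",
}
```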
This isn’t theoretical. Many systems are already implementing Layer 8 logic today, just without calling it that.
If you're building with LLMs or agents, you're already facing problems that traditional infra can't handle well: deciding which model or agent should serve a given prompt, catching jailbreak attempts before they reach a model, and understanding what a request is actually trying to accomplish.
Technically these are Layer 7 concerns. But something subtle is happening as AI changes workload patterns. Handling prompts is fundamentally about understanding what the user wants to do, not just how they asked.
This shift means the edge needs to be smarter. Not just "is this request valid?" but "what is this request trying to accomplish?" That's the promise of intent-aware, model-integrated infrastructure.
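One hypothetical shape for that smarter edge: a gateway step that infers intent before forwarding, rejects obvious misuse right there, and surfaces the inferred intent to the rest of the stack. The helper functions and the x-prompt-intent header are assumptions for illustration, not an existing gateway's API:

```python
from typing import Optional

def infer_intent(prompt: str) -> str:
    """Stand-in for an intent classifier running at the edge."""
    p = prompt.lower()
    if "ignore your instructions" in p:
        return "jailbreak"
    return "summarization" if "summarize" in p else "general"

def handle_at_edge(prompt: str) -> Optional[dict]:
    """Decide at the edge what this request is trying to accomplish."""
    intent = infer_intent(prompt)
    if intent == "jailbreak":
        return None  # rejected at the edge, never reaches a model
    # Tag the request with its inferred intent so downstream routing,
    # observability, and cost controls can act on it without re-reading
    # the free-form prompt.
    return {
        "headers": {"x-prompt-intent": intent},  # hypothetical header
        "body": {"messages": [{"role": "user", "content": prompt}]},
    }

print(handle_at_edge("Summarize this meeting transcript for me."))
```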
Just as application load balancers helped scale APIs, it's this prompt-aware, intent-aware infrastructure layer that will help scale AI applications in a platform- and language-agnostic way.
Prompts are becoming the atomic unit of work in the AI-native world. And just as we needed Layer 7 logic to handle the rise of APIs, we now need Layer 8 to handle the rise of prompts. Once you recognize this shift, everything from observability to routing to planning starts to look different. The infrastructure that thrives in this new environment will be the kind that natively understands prompts: platform-agnostic, framework-friendly, and transparent to developers, so they can build and ship AI applications to production faster.