The AI industry has a prompt engineering obsession. LinkedIn is flooded with prompt engineering courses. Companies are hiring Chief Prompt Officers. Conferences run multi-day tracks on the art of crafting the perfect instruction. There are certifications now (actual certifications) for writing sentences that tell a language model what to do.

All of this misses the point entirely.

The Claim

Prompt engineering is treating the symptom, not the disease. The reason your prompts need to be elaborate, carefully tested, and constantly maintained is not that prompts are inherently hard. It’s that your AI has no context about your business.

Consider what happens when you give an AI agent a task without context. You ask it to approve a purchase order. It doesn’t know your approval thresholds. It doesn’t know that orders over $10K require VP sign-off, that vendor XYZ is on a preferred list with different terms, or that Q4 has a budget freeze. So you write a prompt that includes all of this information. Every rule, every exception, every conditional. The prompt becomes a page long. Then someone changes the approval threshold, and you have to find and update every prompt that references it.

Now consider the alternative. You structure your business context: approval rules live in a schema, vendor relationships are defined entities, budget policies are encoded as skills. When the agent gets “approve this PO,” it already knows everything it needs to know. The prompt is five words. The result is correct.
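To make the contrast concrete, here is a minimal sketch of the purchase-order example with the rules living as structured data an agent can query, rather than prose baked into a prompt. All thresholds, vendor names, and approval levels are hypothetical, not a real client's policy:

```python
# Hypothetical sketch: approval rules as structured context, not prompt prose.
APPROVAL_POLICY = {
    "default_threshold": 10_000,                          # VP sign-off above this
    "preferred_vendors": {"XYZ": {"threshold": 25_000}},  # negotiated terms
    "budget_freeze_quarters": ["Q4"],                     # spending freeze periods
}

def required_approval(amount: float, vendor: str, quarter: str) -> str:
    """Resolve the approval level for a purchase order from the policy data."""
    if quarter in APPROVAL_POLICY["budget_freeze_quarters"]:
        return "cfo"  # a freeze escalates everything (illustrative rule)
    vendor_terms = APPROVAL_POLICY["preferred_vendors"].get(vendor)
    threshold = (vendor_terms["threshold"] if vendor_terms
                 else APPROVAL_POLICY["default_threshold"])
    return "vp" if amount > threshold else "manager"
```

When the threshold changes, only `APPROVAL_POLICY` changes, and every agent that reads it picks up the new rule. That single point of change is what the page-long prompt cannot offer.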

This is the difference between prompt engineering and context engineering. Prompt engineering is giving someone detailed driving directions to a city they’ve never visited: turn left at the third light, watch for the one-way street, parking is behind the building on the east side. Context engineering is giving them a map and GPS. With the map, any reasonable instruction works. “Go to the restaurant” is sufficient. Without the map, even perfect directions fail the moment something unexpected happens: a construction detour, a closed road, a renamed street.

The entire prompt engineering industry exists because organizations are trying to compensate for missing context by overloading the instruction. It’s an expensive, fragile workaround for a structural problem. And the structural problem has a structural solution.

Context engineering is the practice of structuring your organization’s knowledge (processes, rules, entities, relationships, domain expertise) so any AI agent can operate on it effectively. Business-as-Code is the methodology that makes it concrete: schemas define your entities, skills encode your expertise as documents, and structured context gives agents the background they need. The result is AI that works with simple instructions because the hard work is already done. It’s in the context layer, not in the prompt.

The Evidence

We’ve deployed AI agents across dozens of business functions at this point. Customer service, procurement, compliance review, financial analysis, content operations, technical support. The pattern is the same every single time.

The two paths

Organizations fall into one of two camps when they start with AI.

Camp A: Prompt-first. They pick a use case (say, customer service) and start writing prompts. The first prompt is simple: “Answer the customer’s question.” Output is generic and wrong half the time. So they add instructions: “Use a professional tone. Reference our return policy. Don’t make promises about timelines.” Better, but still inconsistent. So they add more: exception handling, escalation rules, product-specific instructions, tone variations by customer tier. The prompt grows to 2,000 words. It works 70% of the time. They spend weeks testing edge cases, rewriting sections, A/B testing phrasing. Eventually they get to 80% reliability. Then the model updates, and half their carefully tuned prompts drift. Back to testing.

Camp B: Context-first. They spend 2-3 weeks on Business-as-Code. They define their customer entity schema: tiers, history, preferences, account status. They write skills for their domain: return policy logic, escalation decision trees, product knowledge, tone guidelines by context. They build the context layer that connects it all. Then they give the agent a simple instruction: “Handle this customer inquiry.” The agent pulls the customer’s schema, loads the relevant skills, and resolves the inquiry. First-pass reliability: 85%. After a week of skill refinement: 92%. When the model updates, the context layer doesn’t care. It’s structured data, not prompt-dependent phrasing.
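A rough illustration of Camp B's flow, with hypothetical field names and skills (a real implementation would version these as standalone JSON Schema and markdown files, not inline dictionaries):

```python
# Hypothetical customer entity schema and skills for the context-first flow.
CUSTOMER_SCHEMA = {
    "type": "object",
    "properties": {
        "tier": {"enum": ["standard", "premium", "enterprise"]},
        "account_status": {"enum": ["active", "past_due", "closed"]},
    },
    "required": ["tier", "account_status"],
}

SKILLS = {
    "returns": "Accept returns within 30 days of delivery; 60 for premium.",
    "escalation": "Route past_due enterprise accounts to a human agent.",
}

def build_context(customer: dict, topic: str) -> dict:
    """Assemble what the agent sees alongside 'Handle this customer inquiry.'"""
    return {
        "schema": CUSTOMER_SCHEMA,       # defines the entity
        "customer": customer,            # the instance at hand
        "skill": SKILLS.get(topic, ""),  # the relevant expertise
    }
```

The instruction stays five words long; the context object carries everything the 2,000-word prompt was trying to smuggle in.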

This is not theoretical. One client’s customer service operation went from 40% task completion with elaborate prompts to 92% with simple prompts plus full Business-as-Code context. The prompts they had spent two months crafting were replaced by one-sentence instructions. The context did the work the prompts were trying to do.

The math that should end the debate

Here’s the calculation that convinced our most skeptical client.

Their organization had identified 100 AI use cases across five departments. Under the prompt engineering approach, each use case required a custom prompt: crafted, tested, iterated, and documented. Average time per prompt: 4-6 hours of a senior person’s time for initial development, plus ongoing maintenance. That’s 400-600 hours just to get the prompts written. Then factor in maintenance: model updates, business rule changes, edge case discoveries. Call it 20% annual maintenance, roughly 80-120 hours per year, every year, forever.

The Business-as-Code approach: 80-120 hours for the initial implementation. Define the core entity schemas, write the foundational skills, build the context layer. That covers not just the first 100 use cases but every future use case that touches those same entities and rules. Maintenance is targeted: update a schema when the business entity changes, update a skill when the policy changes. One change propagates to every agent that uses it. Annual maintenance: 20-30 hours.

Year one: prompt engineering costs 500+ hours. Context engineering costs 140 hours.

Year two: prompt engineering costs another 100+ hours in maintenance alone, plus new prompts for new use cases. Context engineering costs 30 hours in maintenance, and new use cases take minutes because the context layer already exists.
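The year-one and year-two figures follow directly from the ranges quoted above; this back-of-envelope check uses midpoints of those ranges purely for illustration:

```python
# Reproduce the cost comparison from the midpoints of the quoted ranges.
use_cases = 100
prompt_hours_each = 5                          # midpoint of 4-6 hours per prompt
prompt_build = use_cases * prompt_hours_each   # 500 hours of initial development
prompt_maintenance = int(0.20 * prompt_build)  # ~100 hours, every year after

context_build = 110                            # midpoint of 80-120 hours
context_maintenance = 25                       # midpoint of 20-30 hours per year

year_one = {"prompt": prompt_build,
            "context": context_build + context_maintenance}  # ~135-140 hours
year_two = {"prompt": prompt_maintenance,      # plus new prompts for new use cases
            "context": context_maintenance}
```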

The compounding is the key. Every schema you define, every skill you encode, every piece of context you structure makes every subsequent agent smarter. Prompt engineering is linear. Each new use case is a new prompt. Context engineering is exponential. Each new piece of context improves every agent in the system.

What context engineering looks like in practice

At NimbleBrain, we eat our own cooking. Our entire operation runs on Business-as-Code artifacts. Our CLAUDE.md files are literal context engineering: structured documents that give AI agents the full picture of our codebase, conventions, and decision rules. We have 21+ MCP servers, each with its own context layer. Our Upjack framework defines applications as JSON schemas plus natural language skills. Our schemas are hosted at schemas.nimblebrain.ai.

When we onboard a new AI agent to our own systems, it doesn’t need elaborate prompts. It reads the context. It understands the entities, the relationships, the rules. Simple instructions produce correct output because the context does the heavy lifting.

This is what we mean by The Recursive Loop. We BUILD the context, OPERATE agents on it, LEARN from the gaps, and BUILD deeper context. Each cycle makes the system smarter. The prompts stay simple. The context gets richer.

Our client engagements follow the same pattern. In a typical 4-week engagement, we spend the first 2-3 weeks on Business-as-Code implementation, the knowledge audit, the schema definitions, the skill encoding. The last 1-2 weeks are deploying agents on top of that context. Clients are often surprised that the “AI part” is the shortest phase. But that’s the whole point. When the context is right, the AI part is straightforward.

The fragility problem nobody talks about

Prompt engineering has a dirty secret: prompts are coupled to model behavior in ways that are invisible until they break.

A prompt tuned for GPT-4 may not work the same on GPT-4o. A prompt that works on Claude 3.5 Sonnet may need adjustment on Claude 4. The phrasing, the order of instructions, the emphasis. These interact with model internals in ways that nobody fully understands, including the model providers. This is why “prompt engineering” requires constant testing after every model update. You’re building on shifting sand.

Context engineering sidesteps this entirely. A JSON schema that defines your customer entity doesn’t change when the model updates. A skill document that describes your return policy logic reads the same to every model. The context layer is model-agnostic by design. It’s structured knowledge, not model-specific instructions. When you upgrade models (and you will, repeatedly) the context layer comes along unchanged. Only the prompts need checking, and if you’ve kept them simple (as context engineering allows), there’s barely anything to check.

We tracked this across three model transitions last year. Organizations using prompt-heavy approaches spent an average of 40-60 hours per major model update recalibrating their prompts. Organizations with Business-as-Code implementations spent 2-4 hours, mostly verifying that everything still worked, which it did.

The Counterarguments

Any honest argument addresses the best objections. Here are four.

“But prompt engineering is faster to start”

True. Writing a clever prompt takes ten minutes. Structuring your business context takes weeks. If you need a one-off demo or a quick prototype, prompt engineering is the rational choice.

But organizations don’t build one AI agent. They build dozens. And every prompt you write without context is a prompt that embeds business logic in an instruction string instead of a structured, reusable artifact. That logic is invisible to other agents, invisible to version control, impossible to audit, and fragile to model updates. It’s technical debt from day one.

The “faster to start” advantage evaporates by the third use case and becomes a liability by the tenth. We’ve seen organizations with 200+ custom prompts that no single person fully understands. Updating a business rule means hunting through dozens of prompts to find every place it’s referenced. Context engineering makes the rule a single artifact that every agent references. Change it once, and every agent is updated.

“But LLMs are getting smarter: won’t they need less context?”

This is the most common objection, and it gets the relationship exactly backwards. Smarter models don’t need less context. They use context better.

GPT-5, Claude, Gemini. Each generation gets better at reasoning over provided context. They understand more nuance, handle more complexity, follow more sophisticated instructions. But none of that matters if the context isn’t there. A brilliant model with no knowledge of your approval thresholds still can’t approve your purchase order correctly. A model with a million-token context window is useless if the million tokens are unstructured noise.

The trend in AI is actually making context engineering more valuable, not less. Larger context windows mean agents can consume more structured context. Better reasoning means agents can handle more complex business rules. Multimodal capabilities mean agents can work with more types of business artifacts. Every advance in model capability is an advance in the value of well-structured context.

Smarter models are the engine. Context is the fuel. A better engine without fuel still goes nowhere.

“But context engineering is too expensive”

Compared to what?

A Business-as-Code implementation for a mid-sized operation runs 80-120 hours, roughly $40K-$60K at market rates for the kind of senior people who should be doing it. That’s a one-time investment that compounds across every AI use case the organization deploys.

The alternative is not “free.” The alternative is spending 4-6 hours per prompt, per use case, forever. It’s model update cycles that break existing prompts. It’s the opportunity cost of your senior people writing and testing instructions instead of building systems. It’s the inconsistency cost of agents that work 70% of the time instead of 92%.

Context engineering is expensive the way a foundation is expensive. You can skip it and build faster initially. But everything you build on top will be unstable, and you’ll eventually tear it down and start over. We’ve seen that movie three times in the past year alone: organizations that spent 6+ months on prompt engineering, hit a wall, and came to us for a Business-as-Code implementation that replaced the whole thing in weeks.

“But I need prompt engineering AND context engineering”

You do. And the right ratio is 90% context, 10% prompts.

Prompt engineering has a legitimate role. Specifying output format, setting tone and style, defining the specific task at hand. These are prompt-level concerns. Context engineering handles everything else: business rules, entity definitions, domain expertise, organizational knowledge, decision frameworks.

The industry has this ratio inverted. Most organizations spend 90% of their AI effort on prompts and 10% (or zero) on context. They’re polishing the instruction while ignoring the foundation. It’s like spending your entire house budget on window treatments while the walls have no insulation.

Get the context right, and simple prompts work. Ignore the context, and even perfect prompts fail. Allocate your effort accordingly.

The Conclusion

The prompt engineering era is a transitional phase. It exists because organizations haven’t structured their context yet, and someone needs to compensate for the gap. Prompt engineers are human translators between disorganized business knowledge and AI agents that need structure. They’re a stopgap.

The organizations that figure this out first will have a durable advantage. Not because they have better AI models. Everyone has access to the same models. Not because they have better prompts. Prompts are infinitely copyable. Because they have structured context that took real work to build and improves with every interaction. That context is a moat.

Skills-as-Documents means your domain expertise lives as structured markdown, not locked in prompt strings or people’s heads. Schemas mean your business entities are defined once and referenced everywhere. The Recursive Loop means your system gets smarter every cycle: BUILD the context, OPERATE agents on it, LEARN from the gaps, BUILD deeper.

Stop hiring prompt engineers. Start structuring your business context. Stop investing in instruction-crafting workshops. Start investing in knowledge architecture.

The real skill gap in enterprise AI is not the ability to write a good prompt. Any reasonably articulate person can write a good prompt. The real skill gap is the ability to look at an organization’s operations and structure them in a way that machines can act on. That’s context engineering. Business-as-Code is the methodology. And the organizations that adopt it will have AI that actually works (reliably, at scale, across every function) while their competitors are still tweaking prompts.


Frequently Asked Questions

What is context engineering?

Context engineering is the practice of structuring your organization's knowledge (processes, rules, entities, relationships, and domain expertise) so any AI agent can operate on it effectively, regardless of the specific prompt used. It involves defining business entities as schemas, encoding expertise as skills, and building a persistent context layer. Think of it as creating the operating manual that every AI agent in your organization reads before doing anything.

Is prompt engineering useless?

Not useless, but massively overvalued. Prompt engineering has a role in shaping tone, format, and specific output requirements. What it cannot do is substitute for missing context. A perfectly crafted prompt sent to an agent with no knowledge of your business rules, approval thresholds, or customer segments will still produce generic output. The industry has the ratio backwards. It should be 90% context engineering, 10% prompt refinement.

How does Business-as-Code relate to context engineering?

Business-as-Code is the implementation methodology for context engineering. Context engineering is the discipline, the recognition that structured context is what makes AI work. Business-as-Code is how you actually do it: schemas define your business entities, skills encode your domain expertise as structured documents, and the context layer ties it all together. It takes the abstract idea of 'give AI more context' and makes it concrete and repeatable.

How long does it take to implement context engineering?

A focused Business-as-Code implementation takes 2-3 weeks for the core layer: key entity schemas, 10-15 domain skills, and the foundational context structure. That initial investment pays off immediately: agents start producing reliable output with simple prompts. From there, it compounds. Each new skill or schema you add makes every agent in the system smarter. Compare that to prompt engineering, where each new use case requires a fresh round of prompt crafting and testing.

Can I start with prompt engineering and add context later?

You can, but you will regret it. Prompt-first organizations accumulate what we call 'prompt debt': hundreds of fragile, interdependent prompts that break with every model update and resist modification. When you eventually add context engineering, you end up rebuilding most of those prompts anyway because they were compensating for missing context. Starting with context engineering is cheaper, faster to compound, and avoids the rework.

What's the ROI of context engineering vs. prompt engineering?

The math is straightforward. A typical organization needs 50-100 prompts to cover its AI use cases. Each prompt takes 4-6 hours to craft, test, and iterate; call it 400 hours. Those prompts need maintenance every time the model changes or the business evolves. A Business-as-Code implementation runs about 80-120 hours upfront. But the context layer is durable across model updates, reusable across every agent and use case, and compounds as you add to it. Within 6 months, context engineering costs a fraction of ongoing prompt maintenance.

Do I need to hire a context engineer?

You don't need a new job title. You need someone who understands your business operations deeply and can work with structured data formats like JSON schemas and markdown. The best context engineers are often operations leaders or senior domain experts who learn the technical format, not technologists who try to learn the business. NimbleBrain's embed model pairs our technical team with your domain experts for exactly this reason: the knowledge transfer runs both directions.

What tools do I need for context engineering?

Surprisingly few. Context engineering is mostly about structure and methodology, not tooling. You need a way to define schemas (JSON Schema works), a way to write skills (structured markdown), and a way to version and deploy them (git). NimbleBrain uses its own Business-as-Code toolkit (CLAUDE.md files, the Upjack framework for declarative app definitions, and JSON schemas hosted at schemas.nimblebrain.ai), but the principles work with any structured format. The hard part is the knowledge work, not the tools.
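As a sketch of how little tooling that implies, the snippet below writes and reads one schema file and one skill file. The filenames and contents are illustrative only; the temporary directory stands in for a git repository:

```python
import json
import pathlib
import tempfile

root = pathlib.Path(tempfile.mkdtemp())  # stand-in for a versioned repo

# A schema: the entity defined once, referenced by every agent.
(root / "customer.schema.json").write_text(json.dumps({
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "properties": {"tier": {"type": "string"}},
}))

# A skill: domain expertise as a structured markdown document.
(root / "returns.skill.md").write_text(
    "# Return Policy\n\nAccept returns within 30 days of delivery.\n"
)

# Any agent (or any model) can now load the same context unchanged.
schema = json.loads((root / "customer.schema.json").read_text())
skill = (root / "returns.skill.md").read_text()
```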

Ready to put this thesis into practice?

Email directly: hello@nimblebrain.ai