An Illustrated Guide to Context Engineering, Prompt Engineering, and The Future of Both
What I've learned about context engineering as an AI PM. Plus the Skill and the prompt I use to pressure-test it.
TL;DR: Context engineering is a systems design discipline that separates working AI from failing AI. It replaces prompt engineering as a core skill for product and workflow builders in 2026. The four canonical strategies are Write, Select, Compress, and Isolate (LangChain framework). Memory architecture splits into episodic, semantic, and procedural layers. Stanford’s ACE framework enables self-improving agents. For PMs and builders, this means owning context architecture as a product decision, not delegating it to engineering.
Last week, I came across a job post that asked for context engineering skills, but the responsibilities clearly described prompt engineering.
These are not synonyms.
Prompt engineering is deciding what and how to ask the model.
Context engineering is deciding what the model knows when it answers.
One starts a conversation. The other shapes the information ecosystem in which that conversation happens, so AI can complete tasks reliably.
Context engineering is not a better version of prompt engineering. It’s the system around the prompt.
Right now, as everyone offloads work to agents, our job is shifting. We’re learning to do two things extremely well: give the agent the right context, and judge the output. That's the new work.
This post is everything I’ve learned about context engineering as an AI PM and product builder. If someone had shared this level of detail with me when I was starting, I would have saved a lot of hours, credits, and headaches.
Hey, I’m Karo Zieminski 🤗
AI Product Manager and builder.
I write Product with Attitude, an AI newsletter community of 18K subscribers learning to build with AI and developing critical AI literacy through practice. The kind where you sit down on a Saturday morning, follow a guide, and walk away with a working agent, automation, or product. Built by you. Understood by you. Owned by you.
If you’re new here, welcome! Here’s what you might have missed:
→ The Only AI Prompting Guide That Works On Reasoning Models (And Our Cognition)
→ Claude Design Review: 48-Hour Builder’s Test + Hero Prompts
Join 18K readers from around the world and learn with us.
What’s inside
Why prompt engineering is no longer enough
Why bigger context windows do not solve bad context
The 4 strategies every reliable AI system needs
The 6 signs your context window is getting dirty right now
Why PMs should own context architecture as a product decision
How Stanford’s ACE framework points toward self-improving agents
A context window hygiene Skill you can copy
A context architecture prompt you can copy
Part 1
Context Engineering vs Prompt Engineering
The Tweets That Turned Context Engineering Into a Discipline
The term crystallized in June 2025, when Shopify CEO Tobi Lütke posted on X:
I really like the term ‘context engineering’ over prompt engineering. It describes the core skill better: the art of providing all the context for the task to be plausibly solvable by the LLM.
Andrej Karpathy agreed:
[…] the delicate art and science of filling the context window with just the right information for the next step.
The exchange went viral. The mental model they introduced is now canonical.
The LLM is the CPU: the brain.
The context window is RAM: the short-term memory.
We are the operating system: responsible for what gets loaded into that memory.
Our job is to load exactly the right data into working memory for each step. Too little context, and the AI guesses. Too much irrelevant context, and the AI gets confused.
The skill is knowing what to include, what to leave out, and when to update it.
Context Engineering vs. Prompt Engineering
As language models graduated from simple instruction followers to full-blown reasoning engines and agentic coworkers, the way we work with them had to evolve too.
Prompt engineering, the skill everyone obsessed over early on, is now just one part of context engineering. Understanding the distinction determines whether we build temporary demos or durable systems.
Context Engineering vs. Prompt Engineering: A Comparison Table
Context Engineering vs. Prompt Engineering: When to Use Which
Most confusion around prompt vs. context engineering comes down to when to use each.
Prompt engineering is fast, lightweight, and works well for one-off tasks.
Context engineering takes more setup (and maintenance!), but it’s what makes systems reliable over time.
This table shows which one to use based on the situation we’re in.
Weak Context Kills Strong Prompts
Gartner predicts 40% of enterprise applications will use task-specific AI agents by late 2026. Every one of those agents lives or dies by how well its context is engineered.
A well-crafted prompt in a poorly engineered context still fails. A poorly crafted prompt in a well-engineered context often succeeds.
That asymmetry is the argument for treating context as the underlying system.
Context Engineering vs. Context Window
The context window is the model’s attention span.
Context engineering is deciding what goes into that attention span, and when.
The context window can only hold so much at once.
Like me trying to cook dinner while my husband is talking about economics, my son is complaining that he's hungry, and the oven starts beeping.
Technically, all the information is there. Emotionally, everyone is in danger. AI works in a similar way.
If you throw everything into the context window, it gets messy fast.
If you choose well, the model can do something useful with it.
The difference is the right information, at the right time, in a space that can handle it.
The context window is not unlimited storage. It is finite, expensive, and acts as the model’s working memory.
Bigger Context Windows Don’t Fix Bad Context
Frontier model marketing in 2026 loves token counts.
Gemini 3.1 Pro, Claude Opus 4.7, Claude Mythos Preview, and GPT 5.5 all offer 1M-token context windows.
A lot of people assume that it automatically means better AI performance.
It doesn’t.
A bigger context window simply means the LLM can receive more information at once: more working memory, not permanent memory. It doesn't mean the model will understand, prioritize, or use that information well.
Context Rot
The context pipeline is how context travels through the system: collected, ordered, updated, retrieved, compressed, and passed from one step to the next.
Context rot is what happens when that pipeline is poorly managed. It shows up when a session, agent, or workflow becomes overloaded with messy summaries, outdated details, repeated instructions, irrelevant files, conflicting decisions, too many tool options, and buried requirements.
Even if the right information is somewhere in the input, the system becomes less reliable when that information is surrounded by clutter.
Like me looking for the can opener while my husband debates something I didn’t ask about.
The can opener may exist.
That doesn’t mean anyone is getting dinner.
Context Structure
A useful way to understand this comes from Stanford’s “Lost in the Middle” paper.
The researchers found that AI models often pay more attention to information at the beginning and the end of a long input. They struggle more when the important detail sits somewhere in the middle.
The beginning sets the instructions.
The end is closest to the generated answer.
The middle has to fight for attention.
Even if the right answer is there, it’s easier for the model to miss it. This is especially true for long context windows.
Having the right information in the context is not enough.
The structure of that information matters.
10 Reasons Context Breaks in Real Systems
These are the common ways context breaks in real systems.
1. Context pollution
This happens when we include too much information: full documentation libraries, hundreds of chat turns, old notes, unused examples, irrelevant tool descriptions. The system can't focus. Important details get buried, and answers become less relevant or inconsistent.
2. Context poisoning
This happens when bad or malicious info gets written into memory and persists.
3. Context distraction
This happens when irrelevant but salient info pulls attention.
4. Context confusion
This happens when overlapping or ambiguous info muddies reasoning.
5. Context conflict
This happens when two sources contradict.
6. Stale context
This happens when pre-loaded context is outdated.
The system gives answers based on old or incorrect assumptions, even when better information is available.
7. Tool bloat
Tool bloat breaks agents.
This happens when we register every possible tool or action "just in case." Every registered tool consumes tokens and attention, and every option creates more routing work.
A model with too many tools can struggle to choose the right one, call the wrong one, or invent one that does not exist.
8. Mode collapse
This happens when the model gets stuck in a pattern: it keeps doing the same thing even when the situation changes.
The system gives similar answers every time, starts missing exceptions, and fails silently on edge cases.
9. Positional neglect
This happens when critical information is buried in the middle of a long context. The model may have access to it, but access is not the same as reliable use.
10. Memory confusion
This happens when different kinds of memory are mixed together. Our long-term preferences, yesterday’s failed tool call, and a permanent product rule should not live in the same undifferentiated blob.
They age differently. They fail differently. They should be managed differently.
6 Signs Your Context Window Is Messy Right Now
If any of these are happening, stop and clean up the context before you keep building. That’s exactly what the Context Window Hygiene Skill is for.
1. The agent repeats itself.
2. The agent forgets an earlier decision.
3. The agent invents a tool, file, or function.
4. The AI asks what you are working on.
5. Output quality drops mid-session.
6. Each response gets more expensive.
The Four Strategies That Make Context Engineering Work (LangChain Framework)
Don’t worry if you’re not familiar with this framework. I’ll walk you through the parts that matter. I’m also sharing a Skill I built from it, which I use to periodically review my own context engineering setup.
LangChain, a developer toolkit for building applications with LLMs, offers a simple way to understand context management: AI systems work better when they build understanding step by step and keep only the most useful information in the model’s working memory.
Anthropic’s guide reinforces this. Systems should build up understanding gradually, layer by layer, and only keep what is necessary in working memory.
These four strategies are what reliable AI systems are built on.
1. Write: save context outside the window
Write means storing important information somewhere stable, outside the model’s limited working memory.
That could be a file, a note, a database, or a project instruction document like CLAUDE.md or AGENTS.md, which are the standard pattern for project-scoped persistent memory. Think of it as the agent's permanent reference document.
This context stays protected.
It doesn't get summarized, trimmed, or distorted.
If you’re new to rules files, this article is a great place to start:
2. Select: retrieve only what is needed
Select means bringing in only the information the model needs right now.
This is the broader idea behind RAG: the system searches for the most relevant information first.
That search can happen through keyword search, meaning-based search, similarity matching, or a mix of methods.
The best systems don’t stop at finding possible matches. They rerank the results before adding them into context.
The goal is to load only what is useful for the next step, not to load everything that might be useful.
3. Compress: retain only required tokens
Compress means reducing the amount of context while keeping the important parts.
As a task gets longer, the model’s working memory fills up. At some point, the system has to summarize, shorten, or reorganize what came before.
Claude Code does this with "auto-compact", which summarizes the conversation when the context window passes 95% capacity.
For long-running AI agents, this is essential. It’s what makes agents viable for tasks that span hundreds of turns.
No agent, no matter how clever, can carry every detail forever. Try it and you’ll watch a good $200 agent slowly turn into a $2,000 agent that’s worse at its job.
The goal is to preserve what the model still needs to do the job well, not to remember everything.
4. Isolate: split context across agents
Isolate means separating context by role, so each AI agent only sees the information it needs for its specific task.
A customer support agent may need a customer’s account history, past messages, and open tickets.
It doesn’t need product documentation for unrelated competitor products.
When every agent sees everything, irrelevant information can leak into its reasoning and confuse the answer. Role-based context keeps each agent focused.
The goal is to give each specialized agent the right information for its job and keep the rest out: a scoped window tuned to its function.
The Context Window Hygiene Skill
Use it when:
The agent starts repeating itself.
The agent forgets an earlier decision.
The agent invents a tool, file, or function.
The AI suddenly asks what you’re working on.
Output quality drops mid-session.
Or each response gets more expensive.
What comes out:
A handoff note that preserves the decisions.
A persistent CLAUDE.md or memory/ entry for anything that should outlive the session.
A scoped subagent prompt instead of one mega-agent juggling everything.
A live artifact for views you’ll re-open later.
Sometimes a scheduled task for work that can wait.
Copy the Context Window Hygiene Skill.
The Context Spec Generator Prompt
This prompt turns Claude into a context engineering architect that audits the information design of an AI agent before it’s built.
You describe what you're building, and it returns:
A ranked list of the five highest-stakes context items.
A decision about where each one lives: system prompt, episodic memory, semantic store, or runtime retrieval.
Freshness requirements and failure modes.
Observable signals for production, and fallbacks for retrieval failures.
An eviction order for compression.
And one risk you didn't think to ask about.
It treats context as a product with SLAs rather than as prompt-stuffing, and forces specificity by gating on constraints (model, window size, latency, session length) before answering.
Copy the Context Spec Generator Prompt
Part 2
Context Engineering For PMs and Builders
This is the part many PMs don’t realize is theirs to own.
Context engineering is not purely an engineering discipline. It sits at the intersection of product strategy, user understanding, and system design. Engineers can build the retrieval infrastructure. They can’t decide what the model should know about the user, when freshness matters, or what should age out.
That’s product work.
The 5-Factor Context Principles Every AI Builder Should Know
The 5-Factor Agents framework is a way of thinking about how to build agents that work long-term. It's adapted from Heroku's 12-Factor App and provides engineering-grade principles for production AI systems.
These are the context principles that matter most:
1. Curate, Don’t Accumulate
Don’t keep adding everything to the chat forever and don’t assume the AI will automatically know which parts of a long conversation matter most. Just-in-time retrieval outperforms pre-loaded context for most tasks.
Simple prompt:
Here is the current context:
[the goal], [the current state], [the rules], [the decisions], [what to ignore]
Use only this as the source of truth. Ignore earlier conflicting instructions.
2. Compact Errors
When something goes wrong, don't paste the entire mess back into the AI. Long error logs, repeated failures, and messy notes confuse the model. A short summary beats a full dump.
Give the AI a compact failure note:
What failed:
What I expected:
What actually happened:
What we already tried:
What I want next:
3. Pre-Fetch Context
A good brief at the start saves time, tokens, and frustration. Give the AI the important information before it starts working, not halfway through.
If you keep adding missing details later, the agent may have to revise its plan, undo work, or make assumptions.
4. Keep Context Below Capacity
Keep context under 40% capacity. Don’t wait until the session is overloaded. Headroom is not waste. It’s working space.
5. Use Small, Focused Agents
Sub-agents isolate context across roles. One agent per scoped task beats one mega-agent doing everything.
The PM’s Role in Context Engineering
Here’s the role division I work with on cross-functional teams:
PMs define what goes in each context layer. Engineers build the infrastructure to fetch and store it.
PMs design the orchestration logic: what the model sees, and when. Engineers implement the orchestration engine that executes it.
PMs decide what ages out, what persists, and what gets re-fetched. Engineers build the pruning rules.
PMs own context as a product surface. Engineers own it as a system surface.
If the PM isn’t doing this, one of two things happens. Either an engineer makes the product decision by default, or nobody makes it and the agent gets every available signal dumped into the window.
The PM’s core translation task looks like this:
“Users want better suggestions” becomes: the system needs access to past rejections, current workspace state, and team preferences.
“Users want more personalization” becomes: capture the user’s writing style, common corrections, and role-specific patterns.
“The agent should feel like it knows me” becomes: define which signals persist across sessions, which expire, and which the user should be able to correct.
The actionable principle: treat context as a product. Iterate it. Prune it. Measure it.
The Context Pyramid: a PM Framework
This four-level context hierarchy framework is inspired by OpenAI PM Miqdad Jaffer and Productboard.
Use the context pyramid as a sorting tool before you build or prompt an AI system: decide which layer each piece of context belongs to before it enters the window. That separation makes agents easier to debug, easier to update, and less likely to rot under their own context.
Part 3
The Future Of Context Engineering and Prompt Engineering
Stanford ACE: The Self-Improving Context Layer
Most writing about context engineering still treats context as something you design once.
But the research frontier is moving toward dynamic context: context that changes, improves, and learns from experience.
In October 2025, researchers from Stanford University, SambaNova Systems, and UC Berkeley introduced Agentic Context Engineering, or ACE.
The core idea is simple but important: instead of treating context as a fixed prompt, ACE treats it as an evolving playbook the agent can update over time.
Every time an ACE-style agent works on a task, it can notice what helped, what failed, and what should be remembered for next time.
The ACE architecture has three components:
The Generator, which produces reasoning trajectories as the agent works.
The Reflector, which extracts lessons from what succeeded and what failed.
The Curator, which integrates those lessons into the evolving playbook.
The important shift is that ACE doesn’t improve the model by retraining it.
It improves the environment around the model.
Instead of changing the model’s weights, it updates the instructions, strategies, examples, and lessons the model sees before it works.
The paper reports that ACE improved agent performance by 10.6% on agent benchmarks and 8.6% on finance tasks, while also reducing adaptation cost and latency.
What I’m watching for in 2026
Prompt engineering on the system side
The best AI products won't ask users to bring perfect prompts and context. They'll build both layers into the product itself.
You can see this with Perplexity for Finance. Common use cases are already anticipated: stock research, filings, earnings recaps, valuation lookups. Both prompts and context are built into the workflow.
Prompt engineering on the user side
When I built my Palantir dashboard in Perplexity last week, I didn’t prompt at all. I gave it a company name. The agent routed through financial data, filings, transcripts, estimates, market context, and a citation layer.
For common use cases, users won’t write elaborate mega-prompts. Model providers will wrap those tasks in workflows and UX layers.
Context engineering as a product layer
I expect systems that manage their own context: selecting what matters, pruning what expires, carrying forward what should survive the session. For builders, this means we won’t only decide what to feed the model right now. We’ll design how the system learns what to keep next time.
Context becomes an evolving map, not a static brief.
What This Means For You
Context engineering is the systems design discipline that determines whether an AI product ships reliably or collapses under real users.
For product builders, this means owning the context architecture as a product decision. Define what the model needs to know. Decide when it needs to know it. Design how stale information gets pruned.
If you’re a PM and your engineering team is making context decisions without you, that’s not delegation. That’s abdication. Get in the room.
If you’ve been designing context architectures for your AI products and found patterns that work, or patterns that failed spectacularly, drop them in the comments.