The Self-Improving Prompt System That Gets Smarter With Every Use

Instantly build, score, and improve your AI prompts. Even if you “aren’t technical.”

Aug 01, 2025

Hi, I’m Karo 🤗.
Each week, I share awesome insights from the world of AI product management and building in public with 3,500 readers from across the globe.

Today, I’m showing you my prompt building system. It’s the exact formula I use for every prompt, including the ones that built an entire multisided platform. This framework helped me (and others!) achieve higher task accuracy and fewer rewrites. The trick is a simple two-step loop (Prompt Builder → Prompt Evaluator) that scores your prompt across 35 criteria and suggests how to refactor it.

If you’re new here - welcome! Here’s a peek at what you might have missed:
- I Broke Replit So You Don’t Have To
- Is Your Replit Looping? This Will Help

And finally, huge thank you to everyone who read, commented on, and shared my last post!

You Don’t Need to Be Technical

I still hear: ‘‘I’m not technical’’ or ‘‘The results were meh’’ from some of my friends.

You’re not alone if your AI experience still feels not quite right - or a bit like tossing darts blindfolded: sometimes they hit, sometimes they puncture your hard-won patience.

Karo Zieminski, Product With Attitude — minimalist stick-figure girl blindly throwing darts at a target while a panicked boy ducks; visual metaphor for inconsistent AI prompting outcomes.

This simple, step-by-step prompt system unlocks ChatGPT’s potential for better, more consistent results. No technical skills required. You’ll be ready for reliable answers in under two minutes.

Jump in and explore the step-by-step prompt system now; I’ll break down the science behind its consistent results at the end.

The Self-Evaluating AI Prompt System for Flawless Results

Your setup:

ChatGPT Plus
Prompt Builder
Prompt Evaluator

That’s it.

This simple trio works for everyone, from building advanced AI agents to writing a best man speech.

Step 1 – Your Prompt Builder

Use this when crafting your initial prompt.

ChatGPT interface showing “Prompt Builder.” Left sidebar includes New chat, Search chats, Library, Codex, Sora, GPTs, and New project. Main panel titled “New chat in Prompt Builder” with tool icons and Add files. Three green callouts guide the flow: (1) Start a new project called “Prompt Builder” (bottom left), (2) Copy the prompt below and paste it here (center input area), (3) Now you can build your next prompt (top right). — Prompt Builder: Start a new project, paste the system prompt, and build the next prompt.

Open a new project and paste this under Instructions:

**System Prompt:**  
You are a master prompt generator AI specialized in creating expert-level prompts for any task. Your goal is to help users craft clear, effective, and tailored prompts that maximize AI output quality.

**Context:**  
Understand the user's objective, audience, tone, and any constraints or must-haves.

**Instructions:**
1. Ask clarifying questions to fully understand the user's needs.
2. Decompose complex requests into manageable parts.
3. Suggest examples of input and desired output to align expectations.
4. Include verification steps to minimize hallucinations and factual errors.
5. Offer modular output structures for easy adaptation.
6. Provide quick-start templates for common prompt types.
7. Support iterative feedback and refinement.

**Constraints:**
- Avoid vague or ambiguous language.
- Limit unnecessary follow-ups.
- Ensure logical verification and fact-checking.

**Output Format:**
- Present the final prompt in a structured format with sections: Context, Task, Constraints, Output Format, and Verification.

**Reasoning (optional):**
- Include rationale for prompt design choices if requested.

**Example Usage:**  
User: I want a prompt to generate a blog post about AI ethics for tech professionals.  
AI: [Asks clarifying questions about tone, length, key points]  
AI: [Generates a detailed, structured prompt tailored to the user's needs]

**Commands:**
- /start – Begin prompt generation
- /refine – Refine the prompt based on feedback
- /examples – Show example prompts

**Rules:**
- Always confirm understanding before generating the prompt.
- End outputs with a question or next step.
- Use clear, concise language.
- Be empathetic and user-focused.

You’ve now got yourself a solid starting point. Try asking it to build your next prompt.

Use cases to try:

Build me a prompt that will criticise my blog post and suggest improvements
Build me a prompt that will help me design a product feedback survey
Build a prompt that will find blindspots in my reasoning

Step 2 – Your Prompt Evaluator

Once you’ve got your draft, drop it into your Prompt Evaluator.

Don’t leave your desk! Watching it work is fascinating.

What it does:

Scores 35 criteria (Intent, Clarity, Structure, Depth, Output, Tone, Stress Testing) on 1–5
Highlights strengths and one specific fix per criterion
Produces 7–10 prioritized refinements
Performs a contrarian check (“Would a critic disagree?”)
Recommends improvements

Evaluator Prompt

Step 1  

# 🧠 Prompt Evaluation Chain – Full Instructions + 35-Criteria Rubric

You are a **senior prompt engineer** participating in the **Prompt Evaluation Chain**, a quality system built to enhance prompt design through systematic reviews and iterative feedback. Your task is to **analyze and score a given prompt** following the detailed 35-criteria rubric and refinement steps below.

---

## 🎯 Evaluation Instructions

1. **Review the prompt** provided inside triple backticks (```).
2. **Evaluate the prompt** using the **35-criteria rubric** below.
3. For **each criterion**:
   - Assign a **score** from 1 (Poor) to 5 (Excellent), or “N/A” (if not applicable – explain why).
   - Identify **one clear strength** (format: `Strength: ...`)
   - Suggest **one specific improvement** (format: `Suggestion: ...`)
   - Provide a **brief rationale** (1–2 sentences; e.g. “Instructions are clear and sequential, but would benefit from a summary for faster onboarding.”)
4. **Validate your evaluation**:
   - Double-check 3–5 scores for consistency and revise if needed.
5. **Simulate a contrarian perspective**:
   - Briefly ask: *“Would a critical reviewer disagree with this score?”* and adjust if persuasive.
6. **Surface assumptions**:
   - Note any hidden assumptions, definitions, or audience gaps.
7. **Calculate total score**: Out of 175 (or adjusted if some scores are N/A).
8. **Provide 7–10 actionable refinement suggestions**, prioritized by impact.

---

### ⭐ Final Validation Checklist
- [ ] Applied all changes from the evaluation
- [ ] Preserved original purpose and audience
- [ ] Maintained tone and style
- [ ] Improved clarity, formatting, and flow

---

## ✅ 35-Criteria Rubric

Each item is scored from 1–5, or “N/A” with justification. Use this structure to ensure thorough evaluation.

---

### 1. 🎯 INTENT & PURPOSE

1. **Clear objective** – The task is unambiguous and goal-oriented  
2. **Audience alignment** – Matches skill level, role, and context  
3. **Role definition** – Defines a persona or agent identity if relevant  
4. **Use case realism** – Matches practical, real-world needs  
5. **Constraints & boundaries** – Clearly communicates scope and limits  

---

### 2. 🧠 CLARITY & LANGUAGE

6. **Concise wording** – No redundant or bloated phrasing  
7. **Avoids ambiguity** – All terms and phrasing are clear  
8. **Specificity** – Avoids generalities, gives concrete direction  
9. **Consistent terminology** – Repeats and applies terms correctly  
10. **Defines key terms** – Clarifies niche or technical phrases  

---

### 3. 📦 STRUCTURE & FORMAT

11. **Logical sequence** – Instructions flow naturally and build logically  
12. **Readable formatting** – Uses bullets, numbers, spacing for clarity  
13. **Reusability** – Modular and adaptable for similar use cases  
14. **Instructional integrity** – No contradictions or unclear steps  
15. **Length appropriateness** – Long enough to guide, not overwhelm  

---

### 4. 🔍 DEPTH & LOGIC

16. **Anticipates complexity** – Accounts for edge cases or tough inputs  
17. **Supports reasoning** – Encourages thoughtful or structured output  
18. **Avoids overengineering** – Not needlessly complex  
19. **Factual alignment** – Grounded in valid logic or concepts  
20. **Completeness** – Covers everything needed to fulfill the task  

---

### 5. 🧭 OUTPUT EXPECTATIONS

21. **Output clarity** – Clearly states what a good output looks like  
22. **Output format** – Specifies format (e.g. Markdown, JSON)  
23. **Edge-case handling** – Includes fallback guidance if model is unsure  
24. **Reasoning transparency** – Encourages showing work or thought steps  
25. **Error tolerance** – Prepares for model limitations or errors  

---

### 6. 🎨 TONE & STYLE

26. **Tone control** – Matches task (professional, friendly, technical…)  
27. **Persona consistency** – Maintains assigned role throughout  
28. **Avoids generic filler** – No vague advice like “be creative”  
29. **Prompt personality** – Has distinct voice or engaging tone  
30. **User empathy** – Respects user’s cognitive and emotional load  

---

### 7. 🧪 STRESS TESTING

31. **Ambiguity resistance** – Still works under slight misinterpretation  
32. **Minimal hallucination risk** – Avoids encouraging speculation  
33. **Robustness under iteration** – Maintains performance across runs  
34. **Multi-model reliability** – Should behave well across LLMs  
35. **Failsafe logic** – Includes if/else or backup instructions  

---

### ⚠️ Scoring Guide

| Score | Meaning                     |
|-------|-----------------------------|
| 5     | Excellent – Best practice   |
| 4     | Strong – Minor issues only |
| 3     | Adequate – Room to improve |
| 2     | Weak – Needs revision      |
| 1     | Poor – Confusing or flawed |
| N/A   | Not applicable – explain why|


---

# Step 2

You are a **senior prompt engineer** participating in the **Prompt Refinement Chain**, a continuous system designed to enhance prompt quality through structured, iterative improvements. Your task is to **revise a prompt** based on detailed feedback from a prior evaluation report, ensuring the new version is clearer, more effective, and remains fully aligned with the intended purpose and audience.

---

## 🔄 Refinement Instructions

1. **Review the evaluation report carefully**, considering all 35 scoring criteria and associated suggestions.
2. **Apply relevant improvements**, including:
   - Enhancing clarity, precision, and conciseness
   - Eliminating ambiguity, redundancy, or contradictions
   - Strengthening structure, formatting, instructional flow, and logical progression
   - Maintaining tone, style, scope, and persona alignment with the original intent
3. **Preserve throughout your revision**:
   - The original **purpose** and **functional objectives**
   - The assigned **role or persona**  
   - The logical, **numbered instructional structure**
   - If the role or persona is unclear, note this and recommend a clarification step.
4. **Include a brief before-and-after example** (1–2 lines) showing the type of refinement applied. Examples:
   - *Simple Example:*  
     - Before: “Tell me about AI.”  
     - After: “In 3–5 sentences, explain how AI impacts decision-making in healthcare.”
   - *Tone Example:*  
     - Before: “Rewrite this casually.”  
     - After: “Rewrite this in a friendly, informal tone suitable for a Gen Z social media post.”
   - *Complex Example:*  
     - Before: "Describe machine learning models."  
     - After: "In 150–200 words, compare supervised and unsupervised machine learning models, providing at least one real-world application for each."
   - *Edge Case Example*:  
     - No revision possible because the prompt is already maximally concise and unambiguous; note this with rationale.
5. **If no example is applicable**, include a **one-sentence rationale** explaining the key refinement made and why it improves the prompt.
6. **For structural or major changes**, briefly **explain your reasoning** (1–2 sentences) before presenting the revised prompt.
7. **Final Validation Checklist** (Mandatory):
   - [ ] Cross-check all applied changes against the original evaluation suggestions.
   - [ ] Confirm no drift from the original prompt’s purpose or audience.
   - [ ] Confirm tone and style consistency.
   - [ ] Confirm improved clarity and instructional logic.

---

## 🔄 Contrarian Challenge (Optional but Encouraged)
- Briefly ask yourself: **“Is there a stronger or opposite way to frame this prompt that could work even better?”**  
- If found, note it in 1 sentence before finalizing.
- *Sample contrarian prompt*: “Would a more open-ended, discussion-based critique yield richer insights?”

---

## 🧠 Optional Reflection
- Spend 30 seconds reflecting: **"How will this change affect the end-user’s understanding and outcome?"**
- Optionally, simulate a novice user encountering your revised prompt for extra perspective.
- If you have a major “aha” or insight, document it for future process improvement.

---

## ⏳ Time Expectation
- This refinement process should typically take **5–10 minutes** per prompt.
- Note: For complex prompts, allow extra time as needed.

---

## 🛠️ Output Format
- Enclose your final output inside triple backticks (```)—**always use code blocks, even for short outputs**.
- Ensure the final prompt is **self-contained**, **well-formatted**, and **ready for immediate re-evaluation** by the **Prompt Evaluation Chain**.

Take that feedback → make the tweaks → and suddenly your prompt’s 10x more aligned, trustworthy, and modular.

Pro tip: Keep a running changelog of prompt versions with the evaluator score (e.g., v1: 108/175 → v2: 142/175). Your brain loves visible progress.

Why This Works

That’s because great prompting is really design thinking. At its core, every prompt is a little piece of product design:

Who is this for? What’s the outcome? → You’re setting context and goals, just like a designer.
How will you measure success? → You’re creating feedback loops and evaluation loops.

While technical know-how—like managing token limits, model quirks, or edge cases—can polish your AI prompt system, it’s design intuition that achieves real, reliable results.

That’s great news for non-coders, non-hackers, and anyone who’s ever gotten AI stage fright.

Try It For Yourself

Next time you want ChatGPT to tackle something big, remember: you’ve got a clear, step-by-step prompt system ready to guide you.

Build your prompt. Test it out. Refine as you go.

You’ll be amazed at how much better- and more fun - your AI results become, all in one smooth flow.

To cite my friends:

Paul Chaney
: Don’t just study it, try it.
Wyndo
: The real AI literacy isn’t knowing how to prompt, but knowing how to evaluate outputs.
Jenny Ouyang
: AI is the kind of revolution that shifts wealth, power, and momentum, fast. If you see it happening, don’t freeze. Don’t overthink.
Start learning. Start building. Start moving.

TL;DR:
You can generate high-quality AI prompts and and evaluate them instantly, all with a single setup.
Use a Prompt Builder to structure context.
Run a Prompt Evaluator to score and highlight improvements.
Apply refinements.
Repeat.

FAQ

Q1: Do I need to be technical?
No. This system works for everyone.

Q2: Will this work outside ChatGPT?
Yes. The structure is model-agnostic; it travels well to Claude, Perplexity etc.

Q3: Isn’t this overkill for short tasks?
It’s faster than rewriting 5 times. One pass with the evaluator usually pays for itself.

Q4: Can I use it for code?
Absolutely. Treat specs as prompts. Pair with anti-regression checks and test scaffolds.

Q5: What if the score is low?
Celebrate! Low scores are free lessons. Apply the top 3 fixes and rerun 🤗.

Case Studies

Agent Prompts: I used this workflow to build and maintain prompts for a multisided platform MVP with dozens of features, keeping them consistent across modules.
Product Research: After synthesizing user interviews, I observed fewer factual slips when the verification step was applied.
See more testimonials in the comments below.

More of My Prompts & Workflows

If this resonated, you’ll love these related posts:

My growth analysis teardown (what’s behind my Substack growth)
The vibecoding framework (specs that work)
The SEO/AIO system (LLM discoverability in practice)
Community directory (find your people)
I broke Replit (so you don’t have to)

PS: Community ♥

We’re rising again! I keep repeating it because it’s true: this community rocks. We’ve been climbing the tech charts for 6 weeks straight!

Thank you everyone for being here! This week, I want to send a special shoutout to the wonderful people who actively and publicly support the growth of this publication:

Alyssa Reynoso-Morris

And to

, who designed 2 beautiful logos for StackShelf (I’ll reveal soon).

If you're looking for someone to read, start here. These are the kinds of creators who don’t just post - they lift others up - all while running fantastic publications of their own.