What makes a good prompt? The four ingredients
Most professionals assume that prompting AI is either trivially easy or some arcane skill reserved for engineers. Neither is true. The reality sits in a more interesting middle ground — one where a handful of structural decisions determine whether ChatGPT or Claude gives you something publishable or something embarrassing. This lesson dismantles three beliefs that are holding back smart people right now, replaces them with sharper mental models, and then gives you the actual framework that separates mediocre prompts from ones that produce real work. By the time you finish, you'll understand why a 400-word prompt sometimes outperforms a 10-word one — and why the reverse is also true.
Three Things Most People Get Wrong
Myth 1: Longer Prompts Always Get Better Results
The first instinct when AI output disappoints is to add more words to the prompt. More context, more caveats, more explanation of what you don't want. This feels logical — you're being thorough. But length without structure is just noise. GPT-4 processes your prompt as tokens (roughly 0.75 words each), and it weighs every part of that input. When you bury your actual request under three paragraphs of preamble, the model distributes attention across all of it. Your key instruction gets diluted. A 600-token prompt where the core ask appears in sentence 14 will consistently underperform a 150-token prompt where the core ask is sentence one.
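You can check the word-to-token ratio yourself with OpenAI's open-source tiktoken library, which counts tokens the same way GPT-4 does. The sketch below compares the vague and precise prompts used later in this lesson; exact counts depend on the tokenizer, so treat the numbers as illustrative.

```python
# pip install tiktoken -- counts tokens with the same encoding GPT-4 uses
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")

vague = "Write a summary of our product for new customers."
precise = (
    "Write a 3-sentence product summary for first-time visitors to our SaaS "
    "pricing page. Emphasize setup in under 10 minutes, Jira and Slack "
    "integrations, and the $29/seat/month price. Tone: confident, no jargon."
)

for label, text in [("vague", vague), ("precise", precise)]:
    n_words = len(text.split())
    n_tokens = len(enc.encode(text))
    print(f"{label}: {n_words} words, {n_tokens} tokens, "
          f"{n_words / n_tokens:.2f} words per token")
```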
This doesn't mean brevity always wins. Context matters enormously — but only the right context. A marketing manager asking Claude to write a product description gets better output by specifying the target customer, the product's single biggest benefit, and the desired tone. That's three targeted pieces of information. What doesn't help is a paragraph explaining the company's founding story, the manager's job title, or a list of 11 things the description should avoid. Irrelevant context creates a signal-to-noise problem that even the best models struggle with. Anthropic's own documentation on Claude explicitly notes that clear, focused prompts outperform verbose ones when the task is well-defined.
The corrected mental model is this: prompt quality is about density, not length. Every sentence in your prompt should be doing a specific job — defining the task, constraining the output, establishing tone, or providing necessary context. If you can remove a sentence without changing what the model produces, that sentence was deadweight. Experienced prompt writers often draft long prompts and then cut them by 30-40% before sending, the same way a good editor treats a first draft. The goal is the minimum effective dose of information — enough to fully specify what you want, nothing more.
More Words ≠ Better Output
Myth 2: AI Understands What You Mean, Not Just What You Say
This is the most expensive misconception in practice. Professionals with strong communication skills assume that AI models read intent the way experienced colleagues do — inferring what you meant from context, filling gaps with common sense, asking for clarification when something is ambiguous. This assumption leads to prompts like "write me a report on our Q3 performance" sent to ChatGPT with zero additional information. The model does produce a report. It's just completely fabricated, because it has no access to your Q3 data and will confidently invent plausible-sounding numbers rather than say it can't help. That's not a bug — it's how language models work by design.
Large language models are next-token predictors. At a mechanical level, they are always asking: given everything before this point, what word comes next? They don't have goals or intentions. They don't know what you were trying to accomplish when you typed your prompt. What they're extraordinarily good at is pattern-matching your words to billions of training examples and generating statistically coherent continuations. When your prompt is ambiguous, the model doesn't pause and reason about your intent — it picks the most probable interpretation and runs with it. That interpretation is often wrong in ways that aren't immediately obvious, especially in professional contexts where the details matter.
The fix is to treat AI models as very capable, very literal-minded junior colleagues who have never worked at your company, never met your customers, and have no access to your files unless you paste them in. They need explicit instruction on what you want, in what format, for what purpose, and from what perspective. Perplexity AI's search-augmented model is a partial exception — it can pull live information — but even there, the quality of interpretation depends entirely on how precisely you've specified what you're after. The professional skill being tested here isn't writing ability. It's the ability to make your intentions unambiguous.
Prompt
VAGUE: Write a summary of our product for new customers.

PRECISE: Write a 3-sentence product summary for first-time visitors to our SaaS pricing page. The product is a project management tool for remote engineering teams. Emphasize speed of setup (under 10 minutes), integrations with Jira and Slack, and a $29/seat/month price point. Tone: confident and direct, no jargon.
AI Response
The vague prompt produces a generic paragraph that could describe any software product. The precise prompt produces something a marketing team could actually use — because every ambiguity has been resolved before the model starts generating. The difference in output quality is dramatic, and the precise prompt took about 45 seconds longer to write.
Myth 3: Prompting Is a Natural Language Skill You Already Have
Because prompts are written in plain English (or any language the model supports), it's natural to assume that good writers are automatically good prompters. They're not — at least not without adjustment. Strong human writers are trained to imply, suggest, and trust their readers to infer. Academic writing builds arguments gradually. Business writing often buries the ask in politeness. Literary writing rewards ambiguity. All of these instincts work against you when prompting AI. The skills that make someone a good writer for human audiences actively interfere with prompt effectiveness, because the model has no social intelligence to bridge the gap between what you wrote and what you meant.
Effective prompting is a distinct skill that borrows from technical writing, instructional design, and systems thinking. You're not composing — you're specifying. The closest analogy is writing a good creative brief or a detailed project specification: you define the output, the constraints, the audience, and the success criteria before anyone starts work. GitHub Copilot users discovered this early — developers who wrote precise, structured comments above their code got dramatically better autocomplete suggestions than those who wrote casual or vague comments. The same principle applies across every AI tool. Prompting rewards clarity, structure, and explicitness over elegance and implication.
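To make the Copilot point concrete, here is a hypothetical pair of comments in Python. The structured one reads like a mini-spec; the function underneath is the kind of completion a comment that precise tends to produce. This is an illustration of the principle, not a guaranteed output.

```python
import csv

# Vague comment -- gives the model almost nothing to pattern-match against:
# process the sales data

# Structured comment -- a mini-spec that constrains the completion:
# Read a CSV of monthly sales (columns: month, region, revenue), skip rows
# with a missing revenue value, and return total revenue per region as a
# dict sorted from highest to lowest total.
def revenue_by_region(csv_path: str) -> dict[str, float]:
    totals: dict[str, float] = {}
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            if row.get("revenue"):
                totals[row["region"]] = totals.get(row["region"], 0.0) + float(row["revenue"])
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))
```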
Common Belief vs. Reality
| Common Belief | What's Actually True |
|---|---|
| Longer prompts produce better AI output | Denser, more focused prompts outperform verbose ones; length only helps when every added word reduces ambiguity |
| AI infers your intent from context | Models are literal next-token predictors; they pick the most probable interpretation of your words, not the most correct one |
| Good writers are naturally good prompters | Prompting rewards explicitness and structure; literary and business writing instincts often work against clarity |
| You only need one prompt to get a good result | Iterative refinement is standard practice; even expert prompters treat the first output as a draft to improve upon |
| Politeness and hedging improve AI responses | Phrases like 'if possible' or 'please try to' introduce ambiguity; direct imperatives produce more reliable outputs |
What Actually Works: The Four Ingredients
Strip away the myths and a clear pattern emerges from the prompts that consistently produce professional-grade output. Every effective prompt contains some combination of four ingredients: a defined role or persona, a specific task, the necessary context, and a constrained output format. Not every prompt needs all four at full strength — a simple factual question to Perplexity needs almost none of them. But when you're asking ChatGPT, Claude, or Gemini to produce something you'll actually use — a document, an analysis, a piece of communication — the absence of any one ingredient is usually what causes the output to fall short. Think of these four as dials, not checkboxes.
The role ingredient is often the most underused. Telling Claude to respond as "a senior financial analyst reviewing a startup's pitch deck" does something important: it activates a specific slice of the model's training data and applies a consistent perspective to everything that follows. The model has been trained on enormous amounts of text written by and about financial analysts — their vocabulary, their concerns, their skepticism thresholds, their typical recommendations. Specifying a role narrows the probability distribution of responses toward exactly that domain. This is why "act as a copywriter with 10 years of direct response experience" produces fundamentally different output than "write me some ad copy," even when the task description is identical.
The format ingredient is what most professionals forget entirely. When you don't specify a format, the model defaults to whatever format appears most frequently in its training data for that type of request — which is usually flowing prose paragraphs. That default is wrong for most professional use cases. An analyst needs a table. A manager needs bullet points. A developer needs code blocks. A communications team needs a structured press release. Specifying format explicitly — "respond in a three-column table," "use H2 headings and bullet points," "limit your response to 200 words" — constrains the output in ways that make it immediately usable rather than requiring reformatting. Notion AI and similar tools build format constraints directly into their templates for exactly this reason.
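If you work with the API rather than the chat interface, the four ingredients map directly onto the prompt string you send. Below is a minimal sketch using the OpenAI Python SDK (v1+); the model name and the pitch-deck details are placeholders, not recommendations.

```python
# pip install openai -- requires OPENAI_API_KEY in your environment
from openai import OpenAI

client = OpenAI()

role = "You are a senior financial analyst reviewing a startup's pitch deck."
context = (  # illustrative context; swap in the real details
    "The deck is from a seed-stage B2B SaaS company claiming fast revenue "
    "growth and low churn, with no cohort data shown."
)
task = "List the three claims an investor should verify before a first meeting."
fmt = "Respond as a numbered list, one sentence per item, no preamble."

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; use whichever model you have access to
    messages=[{"role": "user", "content": f"{role}\n\n{context}\n\n{task}\n\n{fmt}"}],
)
print(response.choices[0].message.content)
```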
The Four-Ingredient Check
Goal: Experience firsthand how adding role, context, and format constraints transforms AI output from generic to usable — and develop the habit of auditing your own prompts before sending.
1. Open ChatGPT, Claude, or Gemini — whichever you have access to.
2. Send this exact prompt as-is and save the output: "Write me an email to a client who missed a deadline."
3. Note three specific ways the output fails to match what you'd actually send — tone, specificity, length, or assumptions it made.
4. Now write a new version of the prompt that includes: a defined role (who you are professionally), the specific task, three pieces of context (who the client is, what the deadline was for, what the consequence of missing it is), and a format constraint (length, tone, structure).
5. Send your revised prompt and compare the two outputs side by side.
6. Identify which of the four ingredients made the biggest difference to the output quality and write one sentence explaining why.
7. Save both prompts and both outputs — you'll use this comparison throughout the course as a reference point for measuring your progress.
Frequently Asked Questions
- Do I need all four ingredients in every prompt? No — a simple factual question ('What is the capital of Portugal?') needs none of them. The four ingredients matter most when you're asking for original content, analysis, or structured output that you plan to actually use.
- Does the order of the ingredients matter? Yes, modestly. Putting the role and task first tends to outperform burying them at the end, because the model begins generating with the most relevant context active from the start.
- Is this framework specific to one AI tool? No. The four-ingredient model applies to ChatGPT (GPT-4 and GPT-4o), Claude 3, Gemini 1.5, and any other large language model you're prompting via chat interface or API.
- What if I don't know what role to assign? Start with the role of the person who would professionally produce the output you want — a lawyer, an editor, a data analyst, a UX researcher. Even a rough role assignment improves output compared to no role at all.
- How specific does the format constraint need to be? As specific as you need the output to be. 'Use bullet points' is fine for simple lists. For complex documents, specify heading levels, word counts per section, and whether you want examples or just assertions.
- Can I save my four-ingredient prompts and reuse them? Absolutely — this is one of the highest-leverage habits in professional AI use. ChatGPT's custom instructions feature, Claude's Projects, and Notion AI's templates all support prompt reuse at different levels of sophistication.
Key Takeaways
- Prompt quality is about density, not length — every sentence should reduce ambiguity or constrain the output, not just add words.
- AI models are literal next-token predictors, not intent-readers. They pick the most probable interpretation of your words, which is often not the most correct one for your specific context.
- Prompting is a distinct skill from writing. It rewards explicitness, structure, and specification over elegance, implication, and politeness.
- The four ingredients of an effective prompt are: role, task, context, and format. Complex professional tasks benefit from all four being explicitly defined.
- Specifying a role activates a relevant slice of the model's training and applies a consistent professional perspective to the output.
- Format constraints are the most commonly skipped ingredient and one of the highest-impact changes you can make to an existing prompt.
Three Things Most Professionals Get Wrong About Prompts
You've seen the four ingredients — role, context, task, and format. Now here's the uncomfortable part: most professionals who learn those ingredients still write mediocre prompts. Not because they forget the framework, but because they're working from faulty assumptions about how AI models actually respond. These assumptions feel logical. They're based on how humans communicate. And they consistently produce worse results than a corrected mental model would. The three myths below are the most common culprits — each one subtle enough to survive even after someone has taken a course on prompting.
Myth 1: Longer, More Detailed Prompts Always Produce Better Results
The logic seems airtight: more information means fewer assumptions, fewer assumptions means better output. So professionals load their prompts with background, caveats, qualifications, and edge cases. A prompt that started as three sentences balloons into twelve. The AI gets everything — the full history of the project, every stakeholder's concern, the three alternative approaches that were rejected last quarter. More detail, better answer. Except that's not how it works. Models like GPT-4 and Claude process your entire prompt as a single context window, but they don't weight all parts equally. Irrelevant detail dilutes the signal. Critical instructions buried in paragraph five get less attention than those in paragraph one.
The real variable isn't length — it's relevance density. A 40-word prompt with four precisely chosen details consistently outperforms a 200-word prompt where those same four details are buried in noise. OpenAI's own prompt engineering documentation acknowledges this: models perform better when instructions are front-loaded and unambiguous, not when they're comprehensive. Claude's behavior is similar — Anthropic's internal testing shows that prompts with contradictory or competing instructions (which often appear in long prompts) produce hedged, wishy-washy outputs as the model tries to satisfy everything simultaneously. The model isn't being lazy. It's doing exactly what you asked — it just can't reconcile your competing demands.
The corrected mental model is surgical, not exhaustive. Ask yourself: what is the one thing the model absolutely must know to do this task well? What context changes the answer if it's missing? What format constraint matters? Strip everything else. A senior consultant writing a prompt for competitive analysis doesn't need to explain what competitive analysis is, describe the history of the industry, or list every possible output format. That's noise. What matters is the specific angle, the intended audience for the output, and the format that makes the insight actionable. Three things. Not thirty.
More Words ≠ More Control
Myth 2: You Should Ask AI to 'Be Creative' or 'Think Outside the Box'
This one is everywhere. Marketing briefs ask AI to "be creative." Strategy documents request that the model "think outside the box." Managers tell ChatGPT to "surprise them." The intent is good — people want outputs that feel fresh, not formulaic. The problem is that "be creative" is not a constraint. It's the absence of a constraint. And AI models, unlike human creatives who push back against briefs they find limiting, interpret vague instructions as permission to produce the most statistically average version of whatever you asked for. The model has seen millions of documents labeled "creative" — it averages them. Average creative work is, by definition, not creative.
Genuine creative output from AI comes from specific, unusual constraints — not from asking for creativity in the abstract. Telling Claude to write a product description "from the perspective of a skeptical customer who just changed their mind" produces something unexpected. Asking ChatGPT to write a strategy memo "using only analogies from sports" forces non-standard framing. Instructing Gemini to generate five positioning statements where "each one must appeal to a completely different fear" creates genuine variety. The constraint is the creative engine. This is actually how human creatives work too — the best advertising campaigns come from tight briefs, not open-ended ones.
The corrected mental model: replace abstract qualities with specific mechanisms. Don't say "be creative" — describe the unusual angle you want. Don't say "think outside the box" — specify which box you want to avoid and what direction to go instead. If you want the model to surprise you, give it a rule that forces deviation from the obvious path. "Do not use any of these five clichés" is more powerful than "be original." Constraints liberate. Vague permission paralyzes.
Prompt
You are a brand strategist known for counterintuitive positioning. Write three taglines for a project management software tool. Rules: (1) None of the taglines can mention productivity, efficiency, or organization. (2) Each tagline must imply a specific emotion — choose from: relief, pride, or control. (3) Maximum 8 words each.
AI Response
1. (Relief) — "The meeting that finally didn't happen." 2. (Pride) — "Your team ran it without you." 3. (Control) — "Every decision, documented before it disappears." Each tagline avoids the category's standard vocabulary while hitting a distinct emotional note. The constraints forced departure from the obvious — which is exactly what "be creative" never achieves on its own.
Myth 3: The AI Understands What You Mean, Even If You Don't Say It
This is the most seductive myth because it's partially true. Modern large language models are remarkably good at inferring intent. Ask ChatGPT to "clean up this email" and it correctly assumes you want better grammar and tone, not a complete rewrite. Ask Claude to "summarize this report" and it knows you want the key points, not a word-for-word repetition. These successful inferences build false confidence. Professionals start assuming the model shares their professional context, knows their industry norms, understands their company's voice, and can read between the lines the way a trusted colleague would. It cannot. Every unstated assumption is a place where the model fills in the gap with a statistical average — which may be miles from your actual need.
The gap between what you meant and what you said becomes most expensive in high-stakes outputs: client-facing documents, strategic recommendations, financial analyses, executive presentations. A management consultant who asks GPT-4 to "draft a findings slide" without specifying that findings should be framed as risks-and-mitigations (not recommendations) will get a structurally correct but strategically wrong slide. Fixing it takes longer than writing the extra sentence in the prompt would have. The model didn't fail — it produced a perfectly reasonable findings slide. It just produced someone else's version of one, not yours.
The Unstated Assumption Test
Myth vs. Reality: The Corrected Mental Models
| Common Belief | What's Actually True | The Fix |
|---|---|---|
| More detail = better output | Relevance density matters more than length; excess detail dilutes key instructions | Front-load the one thing the model must know; cut everything else |
| 'Be creative' unlocks better outputs | Vague permission produces statistical averages; specific constraints force deviation | Replace abstract qualities with mechanical rules (e.g., 'avoid these words', 'use this angle') |
| AI infers your professional context | The model fills gaps with the most common interpretation, not your specific one | State every non-obvious assumption explicitly, especially audience, tone, and success criteria |
| Natural language works best — write like you'd talk | Clear, structured instructions outperform conversational prose for complex tasks | Use numbered steps, explicit headers, or clear separators for multi-part requests |
| One good prompt is enough | Most high-quality outputs come from 2-3 iterations, refining based on first response | Treat the first output as a draft to redirect, not a final answer to accept or reject |
What Actually Works: Applying the Four Ingredients With Precision
With the myths cleared away, the four ingredients from Part 1 — role, context, task, format — become sharper tools. Role isn't just a job title you assign the model; it's a way of pre-loading a specific reasoning style, vocabulary set, and set of priorities. When you tell Claude "you are a CFO reviewing a business case for capital allocation," you're not playing pretend. You're activating a cluster of patterns in the model's training that shift how it evaluates claims, what risks it flags, and what questions it asks. The same business case reviewed by "a growth marketer" versus "a CFO" produces genuinely different, genuinely useful perspectives — because the role instruction changes which patterns dominate the response.
Context does the same precision work when it's specific rather than comprehensive. The most valuable context you can provide is almost always the same three things: who will read or use this output, what decision or action it feeds into, and what already exists that this should be consistent with. Everything else is background noise. A prompt for a client proposal that says "the client is a risk-averse procurement director at a mid-sized manufacturer who has rejected two previous proposals on price" gives the model exactly what it needs to adjust tone, anticipate objections, and frame value differently. That's 25 words of context. It outperforms three paragraphs of company history every time.
Format is where most professionals leave the most value on the table. They specify length ("keep it short") but not structure. They request bullet points but don't specify what each bullet should contain. The difference between "give me bullet points" and "give me five bullet points, each structured as: [insight] — [evidence] — [implication for the reader]" is enormous. The second instruction tells the model what a good bullet point looks like in your context. Notion AI, Perplexity, and ChatGPT all respond dramatically better to format instructions that define the internal logic of the output, not just its surface appearance. Think of format as a template you're asking the model to fill in, not a vague aesthetic preference.
The 'Template' Approach to Format Instructions
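One way to apply this is to keep the format instruction as a literal template you paste onto the end of a prompt, with a single filled-in example showing the internal logic you want. The sketch below is illustrative; the example bullet is invented purely to show the shape, not real data.

```python
# A format instruction as a fill-in template (the example bullet is made up
# solely to demonstrate the structure each bullet should follow).
FORMAT_INSTRUCTION = """
Give me five bullet points, each structured as:
[insight] - [evidence] - [implication for the reader]

Example of the structure I want:
- Most trial users never invite a teammate - few accounts add a second seat
  in week one - the onboarding email should push the invite step first.
"""

prompt = (
    "Summarize the attached customer-churn notes for our VP of Sales."
    + FORMAT_INSTRUCTION
)
```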
Practice: Rewrite a Weak Prompt Using the Corrected Mental Models
Goal: Develop the habit of diagnosing prompt failures against specific mental models, not just rewriting by instinct. By the end of this task you'll have a concrete before/after example from your own work — the most durable kind of learning.
1. Open ChatGPT, Claude, or Gemini — whichever you use most often at work.
2. Think of a task you've used AI for in the past month where the output disappointed you. Write down the original prompt you used (or a close approximation).
3. Run that original prompt now and save the output. This is your baseline.
4. Analyze the prompt against the three myths: Is it too long and detail-heavy? Does it ask for vague qualities like 'creative' or 'professional'? Does it rely on unstated assumptions about your context?
5. Rewrite the prompt using the four-ingredient structure: assign a specific role, provide the three key context pieces (audience, decision it feeds, what must be consistent), state the task with a clear success criterion, and define the format as a template with one example.
6. Run the rewritten prompt in the same tool and save the output.
7. Compare the two outputs side-by-side. Note specifically: where did the second output better match your actual need? Which ingredient made the biggest difference?
8. Write one sentence summarizing the single most important change you made and why it worked.
9. Save both prompts and your summary sentence — you'll use them as a reference template for similar tasks going forward.
Frequently Asked Questions
- Does the order of the four ingredients matter? Yes, mildly. Role and context work best at the start because they frame how the model interprets everything that follows. Task and format can come after, but the task should always appear before the format instruction.
- Should I use the same prompt structure for ChatGPT and Claude? The four-ingredient structure works across both, but Claude tends to follow nuanced role instructions more precisely, while ChatGPT handles explicit numbered task lists slightly better. Test both for your most common use cases.
- How do I know if my context is relevant or just noise? Ask: if this information were missing, would a capable professional make a different decision about how to approach the task? If yes, include it. If the task would be approached the same way regardless, cut it.
- What if I don't know the right role to assign? Use a functional description instead of a job title — 'someone who evaluates this from a financial risk perspective' works as well as 'CFO.' The function matters more than the label.
- Is it worth saving and reusing good prompts? Absolutely. Tools like Notion, a simple document, or ChatGPT's custom instructions feature let you store prompt templates. A library of 10-15 tested prompts for your most common tasks is worth more than any single perfect prompt.
- Does adding 'please' or polite language change outputs? Marginally, in some models — some research suggests polite framing slightly improves response quality in Claude. But the effect is small compared to structural improvements. Focus on the ingredients first, pleasantries last.
Key Takeaways From This Section
- Relevance density beats length — a short, precise prompt with four targeted details outperforms a long prompt where those details are buried in noise.
- Vague creative direction ('be creative,' 'think outside the box') produces average outputs; specific unusual constraints produce genuinely differentiated results.
- AI models fill unstated gaps with the most statistically common interpretation — which is almost never the most contextually correct one for your specific professional situation.
- The most valuable context is almost always the same three things: who reads this, what decision it feeds, and what it must be consistent with.
- Format instructions work best when they define the internal logic of the output — give the model a template with one filled-in example, not just a shape preference.
- The four-ingredient framework (role, context, task, format) becomes dramatically more powerful once you've cleared out the false assumptions that undermine each ingredient.
The Four Ingredients: What Most Professionals Get Wrong
Most professionals believe that longer prompts always produce better results, that AI tools need polite phrasing to perform well, and that adding more context is always beneficial. These beliefs are widespread — and all three are incomplete at best, actively counterproductive at worst. The four ingredients of a great prompt (role, context, task, and format) work through precision and relevance, not volume or courtesy. Understanding where these beliefs come from, and why they fail in practice, gives you a sharper mental model for every prompt you write from this point forward.
Myth 1: Longer Prompts Always Win
The logic seems sound: more information means better output. Professionals who are used to writing detailed briefs naturally apply the same instinct to AI prompts. But length without structure is noise. ChatGPT, Claude, and Gemini all use attention mechanisms that weigh tokens against each other — when you bury the core task inside three paragraphs of background, the model dilutes its focus. The actual instruction gets less weight than it deserves, and the output drifts toward the most statistically prominent content in your prompt rather than the most important.
The real variable isn't length — it's signal density. A 40-word prompt with a clear role, specific task, and defined format will routinely outperform a 200-word prompt that meanders. Research published by Anthropic on prompt engineering confirms that structural clarity (what you want, in what form, for what purpose) matters far more than total word count. Think of it this way: a well-structured 50-word prompt is a sharp knife. A sprawling 300-word prompt is a blunt one, regardless of how much effort went into it.
The practical fix is to front-load your intent. State the task in the first sentence. Then add role and format. Then — only if genuinely necessary — add context. Strip anything that doesn't directly constrain or clarify the output. When you review a prompt before sending it, ask: does every sentence here change what the AI will produce? If not, cut it. This editing discipline is the single fastest way to improve your prompt quality across all four ingredients.
More Words ≠ Better Output
Myth 2: AI Tools Respond Better to Polite Phrasing
A surprising number of professionals write prompts like emails — opening with pleasantries, softening requests, adding 'please' and 'thank you.' The instinct is understandable; we're conditioned to communicate this way with other people. But large language models don't have feelings to manage or relationships to maintain. Politeness isn't processed as a social signal — it's processed as tokens. 'Please could you possibly help me write' uses seven tokens to say what 'Write' says in one. That token overhead is minor, but the deeper problem is that hedged phrasing produces hedged output.
When you write 'Could you maybe suggest some ideas for...', you're signaling uncertainty about what you want. The model reflects that uncertainty back. Direct, declarative prompts — 'Generate five ideas for...' — produce more decisive, usable responses. This isn't about being rude to a machine; it's about writing in the register that produces the best output. The four ingredients framework is inherently direct: role assignments, task specifications, format instructions. That directness is structural, not stylistic, and it works because it removes ambiguity from every layer of the prompt.
Prompt
Hi! I was wondering if you could possibly help me come up with some ideas for a marketing email? I'd really appreciate it if you could suggest maybe 3-5 options. Thank you so much!
AI Response
WEAK OUTPUT: Vague suggestions with low confidence, over-qualified language, generic ideas.

BETTER PROMPT: 'You are a B2B email marketing specialist. Write 5 subject line options for a cold outreach email selling project management software to operations managers. Each subject line should be under 50 characters and imply a specific business benefit.'

STRONG OUTPUT: Five specific, differentiated subject lines with measurable constraints met — immediately usable.
Myth 3: More Context Always Helps
Context is one of the four ingredients — but it's the one most prone to excess. The myth is that dumping everything you know about a situation into a prompt produces the most informed response. In practice, irrelevant context introduces ambiguity. If you're asking Claude to draft a client proposal and you include three paragraphs about your company's founding history, the model has to decide whether that history is relevant to the proposal's tone, scope, or content. It often guesses wrong. Relevant context means information that directly changes what a good output looks like — industry, audience, constraints, prior decisions.
The test for any piece of context is simple: if I removed this, would the output change in a meaningful way? Company history in a proposal prompt? Probably not. The client's industry and their known objections? Absolutely yes. Applying this filter tightens your prompts and forces you to think clearly about what actually matters for the task. That clarity — knowing precisely what context is load-bearing — is a skill that transfers across every AI tool you use, from Perplexity research queries to GitHub Copilot code comments.
| Common Belief | The Reality |
|---|---|
| Longer prompts produce better results | Signal density beats word count — structure matters more than length |
| Polite phrasing improves AI responses | Direct, declarative prompts produce more decisive, usable outputs |
| More context always helps | Only load-bearing context improves output; irrelevant detail introduces ambiguity |
| You only need to prompt once | Iteration is the method — first drafts are starting points, not final answers |
| Any AI tool responds to prompts the same way | ChatGPT, Claude, and Gemini have distinct strengths; prompts often need tuning per tool |
What Actually Works: The Four Ingredients Applied
The four ingredients — role, context, task, format — work because they eliminate the four most common sources of weak output: unclear expertise level, missing background, ambiguous instruction, and unspecified structure. Role tells the model which knowledge domain and voice to operate from. Context supplies only the load-bearing facts. Task states the deliverable with precision. Format defines exactly how the output should be shaped — bullet list, table, 200-word paragraph, three options ranked by risk. When all four are present and tight, the model has no room to guess, and guessing is where output quality degrades.
Iteration is not a workaround for bad prompting — it's the actual method. Even expert prompt writers treat first outputs as drafts. The difference is that experts use structured follow-ups: 'Make option 2 more specific to a regulated industry,' 'Shorten this to 100 words without losing the three key points,' 'Rewrite in the voice of a CFO addressing a skeptical board.' Each follow-up applies one or two of the four ingredients as a correction. This builds toward the output you need in two or three exchanges rather than one perfect prompt that never quite exists.
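In the API, that iteration loop is just a growing message list: append the model's draft, then send one targeted correction per turn. A minimal sketch, again assuming the OpenAI Python SDK and a placeholder model name:

```python
from openai import OpenAI

client = OpenAI()
messages = [{
    "role": "user",
    "content": (
        "You are a B2B email marketing specialist. Draft a 120-word cold "
        "outreach email selling project management software to operations managers."
    ),
}]

draft = client.chat.completions.create(model="gpt-4o", messages=messages)
messages.append({"role": "assistant", "content": draft.choices[0].message.content})

# One correction per follow-up, not a full rewrite of the original prompt.
messages.append({
    "role": "user",
    "content": "Shorten this to 80 words for a reader in a regulated industry; keep the call to action.",
})
revised = client.chat.completions.create(model="gpt-4o", messages=messages)
print(revised.choices[0].message.content)
```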
Across tools, the four ingredients scale consistently. In Notion AI, a tight task-plus-format instruction produces cleaner document summaries than open-ended requests. In GitHub Copilot, specifying the role ('senior Python developer') and format ('with inline comments explaining each step') produces more maintainable code. In Midjourney, the 'format' ingredient maps directly to aspect ratio, style reference, and quality parameters. The vocabulary differs per tool, but the underlying logic is identical: constrain the output space precisely, and the model fills it well.
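The same logic makes a personal prompt library easy to maintain: store each recurring task as a four-ingredient template with placeholders, then fill them in per use. The sketch below shows one way to do it; the key names and client details are placeholders, not a prescribed format.

```python
# A tiny personal prompt library: one reusable four-ingredient template per
# recurring task. All names and details here are illustrative.
PROMPT_LIBRARY = {
    "client_status_email": (
        "You are an account manager writing to {client}.\n"           # role + audience
        "Context: {situation}\n"                                       # load-bearing context
        "Task: write a status update covering {topics}.\n"             # task
        "Format: under 150 words, one short paragraph per topic, "     # format
        "direct tone, no jargon."
    ),
}

prompt = PROMPT_LIBRARY["client_status_email"].format(
    client="a risk-averse procurement director at a mid-sized manufacturer",
    situation="the pilot finished two weeks early and came in under budget",
    topics="timeline, budget, and the recommended next phase",
)
print(prompt)
```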
Build a Prompt Template You Actually Reuse
Goal: Produce two reusable, high-quality prompt templates for real tasks in your role — plus a direct before/after comparison that makes the four-ingredient impact concrete and personal.
1. Pick one recurring task in your job — something you do weekly, like summarizing reports, drafting emails, or preparing talking points.
2. Write a raw, instinctive prompt for that task exactly as you would have before this lesson — no four-ingredient structure.
3. Send that raw prompt to ChatGPT or Claude and save the output.
4. Now rewrite the prompt using all four ingredients: assign a specific role, add only load-bearing context, state the task with measurable precision, and define the exact output format.
5. Send the structured prompt to the same tool and save the output.
6. Compare the two outputs side by side — note at least three specific differences in quality, specificity, or usability.
7. Refine the structured prompt once based on what's still missing in the output.
8. Save the final structured prompt as your 'template' for this task — label it with the role, tool, and date.
9. Repeat this process for one more recurring task this week to begin building a personal prompt library.
Frequently Asked Questions
- Do I need all four ingredients every time? No — simple, low-stakes tasks often need only task and format. Use all four when the output quality materially matters or when you're working with a complex deliverable.
- Which ingredient matters most? Task. Without a precise task, role and context have nothing to anchor to. If you can only improve one thing, make the task instruction more specific.
- Does the order of the four ingredients matter? Slightly. Leading with role primes the model's register before it reads your task. Role → Context → Task → Format is the most reliable sequence, though tools like Claude handle reordering gracefully.
- Should I use the same prompt across ChatGPT, Claude, and Gemini? Start with the same prompt, but expect to tune it. Claude tends to follow nuanced format instructions more literally; ChatGPT responds well to explicit persona framing; Gemini integrates well with factual, search-style context.
- How long should my prompts be? Long enough to include all load-bearing information, short enough to cut everything else. For most professional tasks, 50–120 words is the productive range.
- What if the output is still weak after applying all four ingredients? Iterate with one specific correction at a time — don't rewrite the whole prompt. Targeted follow-ups like 'make this more concise' or 'add a risk column to the table' are faster and more effective than starting over.
Key Takeaways
- Signal density beats length — a tight 50-word prompt outperforms a vague 300-word one every time.
- Direct, declarative prompts produce more decisive outputs than hedged or polite phrasing.
- Context only helps when it's load-bearing — include information that changes the output, cut everything else.
- The four ingredients (role, context, task, format) eliminate the four main sources of weak AI output.
- Iteration is the method, not a fallback — treat first outputs as drafts and refine with targeted follow-ups.
- The four-ingredient framework transfers across every major AI tool, with only surface-level vocabulary changes.
- Building a personal prompt library of reusable templates compounds your productivity gains over time.
