Write Captions That Stop the Scroll
Writing Captions and Copy That Convert
Part 1: Why Most AI-Generated Captions Fall Flat, and How to Fix That
Historical Record
Glossier
In early 2023, Glossier's social media team ran an experiment using ChatGPT to draft a week's worth of Instagram captions with minimal editing.
The experiment demonstrated how AI-generated copy, while technically correct, often fails to capture a brand's distinctive voice and conversational style.
Glossier's brand is built on intimacy. Its captions historically read like a text from a knowledgeable friend: specific, a little irreverent, sometimes incomplete in a way that feels intentional. The AI output, by contrast, was complete. Polished. It hit every beat of a technically correct caption: hook, benefit, call to action. But it lacked the small asymmetries that signal a real human wrote it. The team regrouped and spent two weeks building what their content director called a 'voice brief': a document that captured not just tone adjectives but actual sentence patterns, words they never use, and references their audience would recognize. They fed that brief into every AI prompt. Engagement recovered within three weeks.
The Glossier story isn't a cautionary tale about AI. It's a precise diagnosis of where most professionals go wrong when they first start using AI for social copy. They treat the tool like a vending machine: insert topic, receive caption. What they get back is competent but generic, the average of millions of marketing posts rather than the specific voice of their brand. The fix isn't to stop using AI. The fix is to stop giving it nothing to work with. This lesson is about the inputs that produce outputs your audience actually responds to.
The Core Principle of This Lesson
The Specificity Principle: How Detail Unlocks Better Copy
Marcus runs marketing for a mid-sized regional gym chain in the UK with 14 locations. He was posting three times a week on Instagram and spending roughly 90 minutes per post, writing, deleting, rewriting, getting approvals. He'd tried ChatGPT and been disappointed. The captions came back sounding like every other gym on the platform: motivational quotes, generic 'join us' calls to action, hashtag clusters that felt algorithmic. His audience, which skews 35-55 and is primarily motivated by stress relief and community rather than aesthetics, didn't connect. He was ready to write off AI entirely when a colleague suggested he try a completely different approach to the prompt itself.
Instead of typing 'write an Instagram caption for a gym post about our new yoga classes,' Marcus spent 20 minutes building a detailed brief. He described his audience by age, motivation, and the specific anxieties they'd expressed in member surveys. He listed three phrases his brand never uses (no 'crush it,' no 'beast mode,' no 'no pain no gain'). He included two example captions from his archive that had performed well: posts that got 4x his average comment rate. He specified the platform, the post goal (class sign-ups), and the tone (warm, slightly witty, never preachy). The output from that prompt was usable on the first pass. He now drafts five captions in under 25 minutes. The specificity didn't just improve quality; it compressed his workflow by nearly 70%.
What Marcus discovered is what copywriters have known for decades: the brief is the work. Professional agencies charge significant fees for creative briefs before a single word of copy is written, because the brief determines everything downstream. When you use AI for social copy, you are effectively writing your own brief in real time, inside the prompt. The more complete that brief, the less revision the output needs. This isn't about becoming a 'prompt engineer'; it's about briefing your AI the same way you'd brief a talented freelancer who's never worked with your brand before.
Prompt
You are a social media copywriter for a regional UK gym chain called PeakForm. Our audience is primarily 35-55 year olds who join for stress relief, community, and feeling capable, not aesthetics or extreme fitness. Our tone is warm, grounded, slightly witty, never preachy or aggressive. We never use phrases like 'crush it,' 'beast mode,' or 'no excuses.' Write 3 Instagram caption options for a post announcing our new Tuesday evening yoga classes. The goal is class sign-ups via the link in bio. Each caption should be under 150 words, include one specific detail (Tuesday 6:30pm, £8 drop-in), and end with a gentle call to action. Include 4-6 relevant hashtags per caption.
AI Response
Option 1: Tuesday evenings just got a lot quieter. Our new yoga class starts this week, 6:30pm, candlelit studio, no prior experience needed. Just you, 45 minutes, and permission to actually stop for a bit. Drop-in is £8. Link in bio to grab your spot.
#YogaForBeginners #StressRelief #PeakFormYoga #TuesdayYoga #UKGym #MindfulMovement

Option 2: Here's what our members keep asking for: something that slows things down. Starting Tuesday at 6:30pm, we're running evening yoga, open to all levels, £8 drop-in, no mat required (we've got you). If your week needs a reset, this is a pretty good place to start. Book via the link in bio.
#EveningYoga #GymLife #PeakForm #YogaUK #TuesdayVibes #WellbeingMatters

Option 3: Not every workout needs to leave you breathless. Our new Tuesday yoga class (6:30pm, £8 drop-in) is built for the version of you that needs to decompress, not compete. All levels welcome. Spots are limited, link in bio.
#YogaClass #PeakFormGym #TuesdayYoga #MindBodyBalance #UKFitness #EveningWorkout
A Different Industry, the Same Problem: How a B2B Consultancy Cracked LinkedIn Copy
Shift the context entirely, away from consumer fitness and into B2B professional services. Meridian Advisory, a 40-person management consultancy in Chicago, needed to build thought leadership on LinkedIn. Their managing partners were smart, credible people who wrote in a style that was rigorous but impenetrable. Their LinkedIn posts were dense paragraphs with no white space, no hook, and a habit of burying the insight three sentences deep. Engagement was minimal. They tried having a junior marketing coordinator use Claude to rewrite the posts. First pass: the AI smoothed out the jargon and added structure, but stripped out all the specific expertise that made the posts worth reading. It sounded like a business school textbook.
The breakthrough came when they changed who was responsible for the prompt. Instead of the coordinator prompting Claude to 'rewrite this for LinkedIn,' a senior partner would first write a short voice note, two minutes, stream of consciousness, about the actual insight they wanted to share. The coordinator transcribed it, used that transcription as the core of the prompt, and instructed Claude to preserve every specific example and data point while restructuring it for LinkedIn's format: short opening line that stops the scroll, three to five punchy paragraphs, a question at the end to invite comments. That process now produces first-draft LinkedIn posts that the partners approve with minor edits. Their average post engagement increased by roughly 3x over six months. The voice note was the missing ingredient, raw human expertise that the AI could shape but not invent.
Platform Matters: The Same Message, Completely Different Copy
| Platform | Ideal Caption Length | Tone That Performs | Key Structural Element | Call to Action Style |
|---|---|---|---|---|
| Instagram | 50–150 words | Personal, visual, slightly intimate | Hook in first line before 'more' cut-off | Soft: 'Link in bio,' 'Save this,' 'Tag someone' |
| LinkedIn | 150–300 words | Professional but human, insight-led | Provocative opening statement or question | Engagement invite: 'What's your experience?' or 'Agree?' |
| Facebook | 40–80 words | Conversational, community-oriented | Direct question or relatable scenario | Direct: 'Click below,' 'Comment YES if...' |
| TikTok / Reels | Under 50 words | Casual, fast, trend-aware | Caption amplifies video, doesn't repeat it | Reaction-driven: 'Tell me I'm not alone' |
| X (Twitter) | Under 280 characters | Sharp, opinionated, or useful | The insight IS the post, no fluff | Retweet-worthy: share a stat or take a stance |
The E-commerce Example: When Copy Has to Do Real Selling Work
Priya manages social media for a small Australian homewares e-commerce brand, Linen & Co., with around 28,000 Instagram followers. Unlike the gym or the consultancy, every post she writes is ultimately in service of a sale. She can't afford to be purely inspirational: the business needs clicks, add-to-carts, and conversions. When she started using Canva AI and ChatGPT together (Canva for image copy overlays, ChatGPT for caption drafts), her first mistake was writing captions that were either too salesy ('Shop now, 20% off this weekend only!') or too soft: lifestyle content that got likes but drove zero traffic. Neither extreme converted.
What changed her results was learning to prompt for what copywriters call 'the middle path', copy that leads with a feeling or a problem, introduces the product as the natural answer, and closes with a frictionless action. She now gives ChatGPT a specific template instruction in every product-focused prompt: 'Start with a sensory detail or a relatable moment (one sentence). Introduce the product as the solution in the second sentence. Give one specific product detail (material, size, price). Close with a single low-pressure CTA.' That four-part structure, built into her prompt, produces captions that average a 2.1% click-through rate on Instagram, well above the platform's typical 0.5–1% benchmark for product posts. The structure isn't magic. It's just applied copywriting logic, delivered through a well-briefed AI.
Give AI a Structure, Not Just a Topic
Putting This Into Practice: Your First Structured Caption Prompt
The three examples above, a gym, a consultancy, an e-commerce brand, share a common thread. None of them got results by typing a quick request into an AI tool. Each built a prompt that contained four essential ingredients: audience context (who reads this and what they care about), brand voice (what sounds like us and what doesn't), structural instruction (what the caption should do, in what order), and a specific goal (sign-ups, engagement, clicks). When all four are present, the AI has everything it needs to produce copy that's genuinely useful. When any one is missing, the output is noticeably weaker.
You don't need to build a 500-word prompt every time. Once you've done the thinking work (written down your audience description, your tone guidelines, your structural preferences), you can save that as a reusable 'base prompt' in a notes app, a Google Doc, or directly in ChatGPT's custom instructions feature. Every new caption prompt then becomes a short addition to that base: 'Using the brand brief above, write an Instagram caption for [specific post].' Your first investment of 30–45 minutes building that base prompt pays dividends across every caption you write for the next year.
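For readers who keep their brief in a script rather than a notes app, the base prompt and the short per-post addition can be sketched in Python. This is a minimal illustration, not a required workflow: the `BrandBrief` class, its field names, and the `caption_prompt` helper are hypothetical names invented here, and the example values are borrowed from the PeakForm prompt earlier in the lesson.

```python
from dataclasses import dataclass, field


@dataclass
class BrandBrief:
    """Hypothetical container for the parts of a prompt that stay fixed between posts."""
    audience: str    # who reads this and what they care about
    voice: str       # what sounds like us
    banned_phrases: list[str] = field(default_factory=list)  # what doesn't sound like us
    structure: str = ""  # what the caption should do, in what order

    def base_prompt(self) -> str:
        """Render the reusable brief as plain text to paste at the top of a prompt."""
        lines = [f"Audience: {self.audience}", f"Voice: {self.voice}"]
        if self.banned_phrases:
            quoted = ", ".join(f"'{p}'" for p in self.banned_phrases)
            lines.append(f"Never use: {quoted}")
        if self.structure:
            lines.append(f"Structure: {self.structure}")
        return "\n".join(lines)


def caption_prompt(brief: BrandBrief, platform: str, topic: str, goal: str,
                   options: int = 3) -> str:
    """Append the short, post-specific request to the reusable base prompt."""
    task = (f"Using the brand brief above, write {options} caption options "
            f"for {platform} about {topic}. The goal is {goal}.")
    return brief.base_prompt() + "\n\n" + task


# Example values taken from the PeakForm prompt earlier in this lesson.
peakform = BrandBrief(
    audience="35-55 year olds who join for stress relief, community, and feeling capable",
    voice="warm, grounded, slightly witty, never preachy or aggressive",
    banned_phrases=["crush it", "beast mode", "no excuses"],
    structure="include one specific detail and end with a gentle call to action",
)
prompt = caption_prompt(
    peakform,
    platform="Instagram",
    topic="our new Tuesday evening yoga classes",
    goal="class sign-ups via the link in bio",
)
print(prompt)
```

The design choice is simply to separate what changes per post (platform, topic, goal) from what does not (audience, voice, banned phrases, structure), so the thinking work is done once and reused.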
The practical task below walks you through building exactly that: a reusable brand brief and your first structured caption prompt, tested against a real post you need to write. By the end of it, you'll have a working template you can use on Monday morning. The goal isn't a perfect caption on the first try; it's a process that reliably produces a strong first draft in under five minutes, every time.
Goal: Create a reusable brand voice brief and use it to generate a platform-specific caption that sounds like your brand, not like generic AI output.
1. Open a Google Doc or a notes app and create a section called 'Brand Brief for AI Prompts.' Write 3–5 sentences describing your target audience: their age range, what they care about, and the specific problem or desire that brings them to your brand. Be as specific as possible, use real words your customers use.
2. Add a 'Tone' section to the brief. Write 3 adjectives that describe your brand voice, then list 3–5 phrases or words your brand never uses. Include one example sentence that sounds exactly like your brand.
3. Add a 'Platform' section. Choose the one platform where you post most frequently. Note the typical caption length you aim for and the type of call to action that fits your audience (soft, direct, engagement-focused).
4. Identify a real post you need to write this week, a product launch, an event, a piece of content, a promotion. Write one sentence describing what you want the post to achieve (sign-ups, clicks, saves, comments).
5. Open ChatGPT (free or Plus) or Claude and start a new chat. Paste your full brand brief as the first part of your message, then add: 'Using this brand brief, write 3 caption options for [platform] about [your post topic]. The goal is [your goal]. Each caption should [add your structural instruction: e.g., open with a relatable moment, introduce the product/service, include one specific detail, close with a CTA].'
6. Read the three options. Highlight any line across any of the three that sounds most like your brand. Copy those lines into a new message and type: 'Can you write a new version that combines the best elements of these lines: [paste them]?'
7. Take the revised output and make two manual edits: replace one generic phrase with a specific detail only your brand would know (a location, a product name, a real customer phrase), and adjust the CTA to match exactly how your brand speaks.
8. Save the final caption alongside your brand brief document. Add a note: what worked, what needed changing. This becomes the start of your personal AI copy log, a record that will make every future prompt sharper.
9. Post the caption and note the engagement at 24 hours. Compare it to your last three posts written without a structured brief. You're building evidence, not just copy.
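The comparison in the final step is simple arithmetic, and it can be sketched as follows. The function name `engagement_lift` and the numbers in the example are made up for illustration; 'engagement' here is assumed to be whatever single count you track at 24 hours (comments, saves, or clicks), measured the same way for every post.

```python
from statistics import mean


def engagement_lift(new_post: int, baseline_posts: list[int]) -> float:
    """Return the new post's engagement as a multiple of the baseline average."""
    if not baseline_posts:
        raise ValueError("need at least one baseline post")
    baseline = mean(baseline_posts)
    if baseline == 0:
        raise ValueError("baseline average is zero, so the ratio is undefined")
    return new_post / baseline


# Hypothetical numbers: three earlier posts versus the new briefed post.
lift = engagement_lift(90, [40, 50, 60])
print(f"{lift:.1f}x the baseline average")  # 1.8x the baseline average
```

One post is a small sample, so treat a single ratio as a note for your copy log rather than proof. The pattern across several briefed posts is what counts as evidence.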
Key Lessons From Part 1
- Generic prompts produce generic captions. The quality of your AI output is directly proportional to the quality of the brief you provide, just like briefing a freelance copywriter.
- A brand voice brief is your most valuable AI asset. Spending 30–45 minutes building one pays off across every caption, email, and post you create going forward.
- Audience specificity beats topic specificity. Telling AI who reads your content and what they care about matters more than just telling it what you're promoting.
- Platform context is not optional. The same message needs completely different structure, length, and tone on Instagram versus LinkedIn versus Facebook. Always specify the platform.
- Structure instructions inside the prompt produce structured output. When you tell AI the four-part shape of the caption you want, you stop editing for structure and start editing for voice.
- The 'voice note to transcript' technique (as used by Meridian Advisory) is a powerful workaround when the human expert is not a natural writer. Raw spoken insight is better raw material than no insight at all.
- AI copy needs one manual edit minimum: replace a generic detail with something only your brand would know. That single edit is often the difference between 'fine' and 'on-brand.'
The Brand Voice Problem: Why AI Copy Falls Flat (And How to Fix It)
In 2022, Duolingo's social media team made a bet. They handed their brand voice, chaotic, self-aware, slightly unhinged, to a small team that used AI tools to scale content production across platforms. The results were striking. Engagement on TikTok and Instagram jumped significantly, not because the AI was writing funnier jokes, but because the team had done something most brands skip entirely: they built a detailed voice document before touching any AI tool. Every prompt started with a description of who Duolingo's brand 'is' as a character. The AI wasn't guessing. It had a brief. That distinction, briefed AI versus unbriefed AI, is what separates scroll-stopping copy from the beige, forgettable output that floods most brand feeds.
The Duolingo team's approach reveals a principle that applies whether you're marketing a language app, a law firm, or a local bakery: AI produces copy at the level of the instructions you give it. Vague instructions produce vague copy. Specific, character-driven instructions produce copy that sounds like someone wrote it on purpose. The technical term for this is prompt engineering, but think of it as writing a creative brief for a very fast, very literal copywriter who has never met your customers and has no instincts of their own. Your job is to supply the instincts.
Most professionals who try AI for social copy hit a wall around week two. The first few outputs feel exciting, fast, decent, usable. Then the sameness creeps in. Every caption starts to sound like every other caption. Products that should feel distinct start sounding identical. This isn't a flaw in the AI; it's a briefing problem. Duolingo solved it by treating brand voice as an input, not an afterthought. That shift, from 'generate me a caption' to 'here is exactly who we are, now generate a caption', is the core skill of Part 2.
What 'Brand Voice' Means in an AI Prompt
Building a Voice Brief That Actually Works in Practice
Patagonia's digital marketing team faces a challenge that many mission-driven brands recognize: their audience is deeply skeptical of anything that sounds like marketing. A caption that reads as salesy doesn't just underperform; it actively damages trust. The team uses AI tools to draft Instagram and email copy, but every prompt includes what they internally call a 'voice fence': a short paragraph that defines not only what the brand sounds like, but what it refuses to sound like. Phrases like 'buy now' or 'limited offer' are explicitly excluded. The AI is told to write as though the reader is already a community member, not a prospect being sold to. That framing changes everything: the output reads like an invitation, not an ad.
This approach works because it mirrors how a good creative director briefs a human copywriter. You wouldn't hand a new writer a product spec sheet and say 'write something.' You'd tell them about the reader, the relationship, the feeling you want the copy to create, and the landmines to avoid. AI needs the same briefing structure. The difference is that with AI, you're writing the brief directly into the prompt. There's no meeting, no back-and-forth, no reading of facial expressions. Precision in the prompt replaces the nuance of human collaboration.
Here's the practical payoff: once you've written a strong voice brief, you can reuse it across dozens of prompts. Paste it at the top of every ChatGPT or Claude session. Save it in a Notion doc. Make it the first thing any AI tool sees before you ask it to write anything. Teams that do this consistently report that their AI-generated drafts need 30-50% less editing than teams who prompt from scratch each time. The brief is a one-time investment that pays off every single week.
Prompt
You are writing Instagram captions for a sustainable outdoor apparel brand. Our voice is: direct and grounded, never salesy. We speak to experienced outdoor enthusiasts as peers, not customers. We believe in buying less and buying better. We never use urgency language like 'limited time,' 'shop now,' or 'don't miss out.' Our sentences are short. We often start with an observation about nature or the outdoors before connecting it to the product. Write 3 Instagram captions for a new waterproof hiking jacket. The jacket is made from recycled materials, weighs 340 grams, and packs into its own pocket. Include a relevant emoji in each and end with a question to encourage comments.
AI Response
Option 1: The mountain doesn't care about your forecast app. ⛰️ This jacket started as 28 plastic bottles. Now it lives in your pack pocket and weighs less than a water bottle. Built for the days when the weather changes its mind three times before noon. What's your go-to layer when conditions get unpredictable?
---
Option 2: Some gear earns its place. Some just takes up space. 🎒 At 340g, our new waterproof shell is the kind of thing you stop noticing, until you need it. Recycled materials, packable design, no compromises on protection. What's one piece of kit you'd never leave the trailhead without?
---
Option 3: The best jacket is the one you actually brought. 🌧️ We spent two years getting the weight down without cutting corners on waterproofing. Recycled shell. Packs into itself. Disappears into your bag until the sky opens up. Where are you heading this season?
A Different Industry, the Same Principle: Real Estate
Sarah Chen runs a boutique real estate agency in Austin, Texas. Her team of six agents was spending roughly four hours a week each writing property listing copy for Instagram and Facebook, descriptions of homes, neighborhood highlights, open house announcements. That's 24 hours of writing time per week across the team. She started using Claude to draft first versions of every post. Within three weeks, that time dropped to under six hours total. But the more interesting result was consistency. Before AI, each agent's social posts sounded like a different brand. After, every post sounded like it came from the same agency, because every agent was using the same voice brief in their prompts.
Sarah's voice brief is two paragraphs long. It describes the agency as 'a knowledgeable neighbor, not a pushy broker.' It specifies that captions should focus on lifestyle and feeling before features and specs. It bans the phrase 'dream home' (overused to the point of meaninglessness in real estate) and discourages exclamation marks. The brief took her about 45 minutes to write. It now sits in a shared Notion page that every agent accesses before generating content. The AI doesn't know anything about Austin real estate, but with that brief, it doesn't need to. It has the voice. The agents supply the facts.
Comparing Approaches: Generic Prompts vs. Voice-Briefed Prompts
| Prompt Type | Example Input | Typical Output Quality | Editing Time Required | Brand Consistency |
|---|---|---|---|---|
| Generic prompt | 'Write an Instagram caption for a 3-bedroom house in Austin' | Serviceable but bland, reads like every other listing | 45-60 minutes per post | Low, varies by session |
| Tone-only prompt | 'Write a warm, friendly caption for a 3-bedroom house in Austin' | Warmer, but still generic, 'warm' means different things to everyone | 25-35 minutes per post | Moderate, inconsistent |
| Voice-briefed prompt | Full brief + 'Write a caption for a 3-bedroom house in Austin, focus on the back garden and proximity to Barton Springs' | Distinctive, on-brand, minimal filler phrases | 10-15 minutes per post | High, replicable across team |
| Voice brief + examples | Brief + two sample captions you've approved + new property details | Near-publication quality first draft | 5-10 minutes per post | Very high, matches existing content |
The HR Recruiter Who Became a Content Machine
Marcus Webb is a talent acquisition manager at a mid-sized logistics company in the UK. His job isn't social media marketing, but LinkedIn is now a core recruiting channel, and his company's employer brand lives or dies on the quality of posts about company culture, job openings, and employee stories. Marcus had no copywriting background. He was spending hours on posts that felt stiff and corporate, and engagement was minimal. He started using Microsoft Copilot (included in his company's Microsoft 365 subscription) to draft LinkedIn posts, feeding it a simple voice brief: 'Write as a people-first HR professional who is proud of our team but never boastful. Be specific. Avoid corporate jargon. Sound like a real person, not a press release.'
The results changed how Marcus thinks about his role. Posts about warehouse team achievements, flexible working policy changes, and graduate scheme openings started generating comments and shares from people outside his existing network. One post about a driver who had been with the company for 20 years reached 14,000 impressions, not because the story was extraordinary, but because the copy felt human. Marcus hadn't become a better writer. He'd become a better briefer. He now spends about 20 minutes per week on LinkedIn content that previously took him three to four hours. The AI drafts. He edits for accuracy and adds the specific details only he knows.
The 'Samples First' Shortcut for Voice Matching
Platform-Specific Copy: Same Product, Different Language
One of the most practical skills in AI-assisted social copy is using a single content idea to generate platform-specific versions in one session. A product launch, a company announcement, a customer success story, each of these needs to sound different on LinkedIn versus Instagram versus Twitter/X. LinkedIn readers expect professional context and a point of view. Instagram readers want visual language and emotional resonance. Twitter/X demands compression, every word must work hard. These aren't just format differences; they're audience expectation differences. When you prompt AI without specifying the platform, you usually get something that fits nowhere particularly well.
The solution is to build platform context into your prompt alongside your voice brief. Specify the platform, the typical character count or format, and what that platform's audience is expecting from a post in this category. Better still, do all three platforms in one prompt, ask the AI to give you a LinkedIn version, an Instagram version, and a Twitter/X version of the same content. You get three drafts in 30 seconds. You edit each for accuracy and nuance, check that each one fits the platform's rhythm, and schedule. What used to take 90 minutes of writing and rewriting across three platforms now takes 15 minutes of prompting and editing.
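As a rough sketch of that single three-platform prompt, the assembly could look like the Python below. The per-platform specs are copied from the practical task later in this part; the function name `multi_platform_prompt`, the `PLATFORM_SPECS` dictionary, and the sample brief and facts are hypothetical and exist only for illustration.

```python
# Per-platform instructions, taken from the practical task in this part.
PLATFORM_SPECS = {
    "LinkedIn": "150-200 words, professional context, include a clear point of view",
    "Instagram": "80-100 words, visual and emotional language, end with a question",
    "Twitter/X": "under 240 characters, punchy and direct",
}


def multi_platform_prompt(voice_brief: str, facts: list[str],
                          specs: dict[str, str] = PLATFORM_SPECS) -> str:
    """Build one prompt that asks for a version of the same post per platform."""
    fact_lines = "\n".join(f"- {fact}" for fact in facts)
    version_lines = "\n".join(f"- {name}: {spec}" for name, spec in specs.items())
    return (
        f"{voice_brief}\n\n"
        f"Facts:\n{fact_lines}\n\n"
        "Using the voice brief and facts above, write one version of this post "
        f"for each platform:\n{version_lines}"
    )


# Hypothetical brief and facts, for illustration only.
print(multi_platform_prompt(
    "Voice: plain, specific, never boastful.",
    ["New graduate scheme opens in September", "Applications close on 30 June"],
))
```

Keeping the facts as a separate bulleted block makes it easier to check each draft against what you actually supplied.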
There's a discipline here that professionals sometimes resist: never post the AI draft without reading it carefully. AI is less likely to make things up in social copy when you've supplied the facts, but it can still add a plausible-sounding detail you never gave it, and it can miss the specific detail that makes a post feel real. The AI might write 'our team worked hard on this launch' when what actually happened is that your product manager stayed up until 2am fixing a critical bug the night before release. That specific detail, which only you know, is what turns a decent post into a memorable one. AI gives you the structure and the language. You supply the truth.
Goal: Create a reusable voice brief for your brand or team and use it to generate tailored social media copy for three platforms from a single content idea.
1. Open a document (Notion, Word, or Google Docs) and write a voice brief for your brand or role. Include: three personality traits, five words you always use or aspire to use, three phrases or words you never want to appear in your copy, and one sentence describing how you want readers to feel after reading your posts.
2. Identify one real piece of content you need to create this week, a product update, a team achievement, an event announcement, a job opening, or a customer story. Write down the core facts in bullet points (no full sentences needed).
3. Open ChatGPT, Claude, or Microsoft Copilot. Paste your voice brief at the top of the prompt, then paste your bullet-point facts below it.
4. Add this instruction at the end of your prompt: 'Using the voice brief and facts above, write three versions of this post: one for LinkedIn (150-200 words, professional context, include a clear point of view), one for Instagram (80-100 words, visual and emotional language, end with a question), and one for Twitter/X (under 240 characters, punchy and direct).'
5. Read all three outputs carefully. Highlight any phrase that doesn't sound like your brand or contains a factual error.
6. Add at least one specific, true detail to each version that only you would know, a name, a number, a moment, a specific location. This is the step most people skip, and it's the most important one.
7. Copy your revised LinkedIn version and paste it into Grammarly or the grammar check in your word processor. Accept corrections for clarity but reject any that change your voice.
8. Save your voice brief in a shared team document or your personal notes so you can reuse it in every future AI prompt session.
9. Schedule or publish one of the three versions and note the engagement after 48 hours. Compare it to a recent post written without AI assistance.
Key Lessons from Part 2
- AI produces copy at the quality level of the brief you give it: vague prompts produce generic output, regardless of which tool you use.
- A brand voice brief is a one-time investment that makes every future AI prompt faster and more consistent.
- Voice briefs should include personality traits, preferred vocabulary, banned phrases, and the emotional relationship you want with your reader, not just tone.
- The 'samples first' method lets you reverse-engineer your voice brief from your best existing content in under 10 minutes.
- Platform context is as important as voice context. LinkedIn, Instagram, and Twitter/X require different language, rhythm, and structure for the same content.
- Generating all three platform versions in one prompt is a significant time-saver: about 15 minutes of prompting and editing versus 90 minutes of separate writing sessions.
- AI handles structure and language well; you must supply the specific true details that make copy feel real and human.
- Teams that share a common voice brief produce more consistent content than teams where each person prompts from scratch, and that consistency builds brand recognition over time.
- Editing AI output is a skill distinct from writing from scratch. Focus your editing energy on accuracy, specific detail, and anything that doesn't sound like your brand.
From Generic to Genuine: Making AI Copy Sound Like You
In 2022, Duolingo's social media team made a decision that confused a lot of marketing purists. They handed their brand voice, snarky, chaotic, occasionally unhinged, to a small team that used AI drafting tools to maintain that tone at scale across TikTok, Instagram, and Twitter. The tension was obvious: how do you keep something feeling raw and human when a machine is generating first drafts? Their answer was disciplined. They built a detailed voice document, specific phrases Duolingo uses, specific ones it never uses, the emotional register of each platform, and fed that context into every AI prompt. The result wasn't robot copy. It was faster, more consistent Duolingo copy, reviewed and sharpened by humans who knew the brand cold.
The principle Duolingo stumbled onto is the same one that separates professionals who get real results from AI from those who get beige, forgettable output: context is everything. The AI doesn't know your brand, your audience's inside jokes, the competitor you're quietly positioning against, or the campaign theme your team spent three weeks developing. When you give it that context, in writing, in the prompt, the output transforms. Duolingo's team wasn't using AI to replace creativity. They were using it to remove the blank-page problem so their creative people could spend time on judgment, not drafting.
This is the final and most important skill in AI-assisted caption writing: learning to brief the AI the way you'd brief a talented junior copywriter. Not vague instructions. Specific ones. Platform, tone, audience, goal, word count, what to avoid. The more precise your input, the less editing you do on the output. Most professionals who complain that AI copy sounds generic are giving generic prompts.
The Brief Is the Work
Patagonia's digital team operates under strict brand constraints: no hype language, no manufactured urgency, no hollow calls to action. Their captions read like letters from someone who genuinely cares about the environment, because they're written by people who do, then refined with AI tools that have been given explicit instructions about what Patagonia never says. When they launched their 'Don't Buy This Jacket' campaign content years ago, the copy worked because it was counter-intuitive and specific. You can replicate that discipline with AI by including a 'never say' list in every prompt. Tell Claude or ChatGPT: 'Never use words like amazing, revolutionary, or limited time only. Never use exclamation points. Write like a person, not a press release.'
The mechanism is simple but powerful. AI language models are trained to produce statistically likely text, which means they default to the most common marketing phrases unless you explicitly steer them away. Phrases like 'Amazing deal,' 'Don't miss out,' and 'Transform your life' show up constantly in AI drafts because they show up constantly in training data. Your job is to override that default with brand-specific instructions. Think of it like telling a new hire: 'We don't talk like that here. Here's how we actually sound.' The model will generally follow those instructions if they're clear, though it's still worth checking each draft against your list.
A regional coffee chain in the Pacific Northwest used this approach to solve a real problem: their Instagram captions had started sounding identical to every other third-wave coffee brand. Their social manager spent one afternoon building what she called a 'voice card': a 200-word description of their brand personality, three example captions they loved, three they hated, and a list of banned phrases. She pastes that card at the top of every AI prompt. Within two weeks, their engagement rate on Instagram increased by 34%. The copy wasn't louder. It was more specific, more theirs.
Prompt
You are writing Instagram captions for Ridgeline Coffee, a Pacific Northwest coffee roaster. Our voice is: warm but not cutesy, knowledgeable but not snobby, community-focused, slightly dry humor. We NEVER use: 'amazing,' 'perfect,' 'don't miss out,' 'transform,' or exclamation points. We DO use: specific origin details, weather references (we're in Seattle), and occasional self-deprecating humor. Write 3 Instagram caption options for a post featuring our new Ethiopian Yirgacheffe single-origin. The post image shows a ceramic pour-over on a rain-streaked windowsill. Each caption should be under 100 words. Include one question or conversation starter per caption.
AI Response
Option 1: Yirgacheffe arrived this week, and honestly it deserved better weather for its debut. Notes of bergamot, stone fruit, and something we keep describing as 'fancy tea energy.' Brewed slow, best enjoyed while watching the rain do its thing outside. What's your go-to brewing method for a light roast?
Option 2: Ethiopia Yirgacheffe, the one that makes people who claim they 'don't like fruity coffee' quietly ask for a second cup. Washed process, high altitude, grown by the Kochere cooperative. We've been waiting on this one for months. Worth it. What are you brewing this weekend?
Option 3: Some coffees taste like a season. This Yirgacheffe tastes like the specific week in October when Seattle can't decide if it's still fall. Floral, bright, a little unpredictable. Available now in the shop and online, link in bio. Do you prefer single-origins or blends for your morning cup?
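If you reuse the same voice card often, the Ridgeline prompt above can be assembled from saved parts instead of retyped each time. What follows is a minimal Python sketch, not a tool this lesson requires. `VoiceCard` and `build_caption_prompt` are hypothetical names invented for illustration, and the result is plain text that you would still paste into ChatGPT or Claude yourself.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class VoiceCard:
    """Hypothetical container for the parts of a brand voice card."""
    brand: str
    description: str
    voice: str
    never_say: list[str] = field(default_factory=list)
    do_use: list[str] = field(default_factory=list)
    extra_rules: list[str] = field(default_factory=list)


def build_caption_prompt(card: VoiceCard, platform: str, post_brief: str,
                         options: int = 3, max_words: int = 100) -> str:
    """Put the voice card first, then the brief for this specific post."""
    banned = ", ".join(f"'{phrase}'" for phrase in card.never_say)
    lines = [
        f"You are writing {platform} captions for {card.brand}, {card.description}.",
        f"Our voice is: {card.voice}.",
        f"We NEVER use: {banned}.",
        *card.extra_rules,
        f"We DO use: {', '.join(card.do_use)}.",
        f"Write {options} {platform} caption options for {post_brief}.",
        f"Each caption should be under {max_words} words.",
        "Include one question or conversation starter per caption.",
    ]
    return "\n".join(lines)


# Values copied from the Ridgeline Coffee prompt in this lesson.
ridgeline = VoiceCard(
    brand="Ridgeline Coffee",
    description="a Pacific Northwest coffee roaster",
    voice=("warm but not cutesy, knowledgeable but not snobby, "
           "community-focused, slightly dry humor"),
    never_say=["amazing", "perfect", "don't miss out", "transform"],
    do_use=["specific origin details",
            "weather references (we're in Seattle)",
            "occasional self-deprecating humor"],
    extra_rules=["Never use exclamation points."],
)

prompt = build_caption_prompt(
    ridgeline,
    platform="Instagram",
    post_brief=("a post featuring our new Ethiopian Yirgacheffe single-origin. "
                "The post image shows a ceramic pour-over on a rain-streaked windowsill"),
)
print(prompt)
```

The design choice here is only that the brand-level details live in one place and the post-level details are passed in each time, which mirrors how the voice card is pasted above a fresh brief.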
Real-World Example: HR Teams Writing Employer Brand Content
Social media isn't just for selling products. HR and talent acquisition teams at companies like HubSpot and Shopify use LinkedIn and Instagram to attract candidates, and the copy rules are identical. Vague captions ('We're hiring! Great culture!') perform terribly. Specific ones ('Our support team works four-day weeks, handles tickets async, and has a Slack channel dedicated entirely to bread recipes') perform well because they speak to a real person's real priorities. HubSpot's talent team uses AI to draft LinkedIn posts about employee experience, feeding it specific team stories and concrete details before generating copy.
The HR manager at a mid-sized logistics company used Claude to rewrite their generic 'We're a great place to work' LinkedIn posts after noticing their job posts were getting almost no organic engagement. She gave Claude three real employee quotes, the actual benefits package details, and the specific roles they were hiring for. The AI drafted five posts in under ten minutes. She edited two of them lightly and posted. Applications from LinkedIn increased by over 40% the following month. The content wasn't polished corporate speak; it sounded like real people talking about real work.
| Prompt Quality | What You Include | Typical Output Quality | Editing Time |
|---|---|---|---|
| Bare minimum | Product name only | Generic, bland, could be any brand | 20-30 mins |
| Basic | Product + platform + tone | Decent structure, weak specificity | 10-15 mins |
| Professional | Voice card + audience + banned phrases + goal | On-brand, specific, usable | 3-5 mins |
| Expert | All above + real examples + emotional hook + CTA direction | Near-publish-ready draft | 1-2 mins |
The Consultant's Approach: Client-Specific Copy at Scale
Marketing consultants managing multiple clients face a specific challenge: every client has a different voice, audience, and platform mix. Jumping between a law firm's LinkedIn and a yoga studio's Instagram in the same afternoon used to mean significant mental context-switching. Consultants who've solved this problem keep saved 'voice cards' (one per client) in a notes app or Google Doc. Before generating any copy, they paste the relevant voice card into the prompt. The AI shifts register to match. The law firm gets measured, authoritative LinkedIn posts. The yoga studio gets warm, grounded Instagram captions.
One independent consultant managing seven clients reported cutting her monthly caption-writing time from 22 hours to under 6 hours after building this system. She uses ChatGPT Plus for drafting and Canva AI for pairing copy with visuals. Her clients don't know she uses AI. They know their content has gotten more consistent and their engagement has improved. That's the professional standard: AI-assisted doesn't mean AI-obvious. The judgment, the brief, the editing, and the strategy are still entirely human.
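The one-card-per-client habit can be sketched as a simple lookup. This is an illustrative Python example only. The client keys, the card wording, and the `prompt_for` helper are all hypothetical, and a notes app or Google Doc does the same job without any code.

```python
# Hypothetical voice cards, one per client. The wording is placeholder text,
# loosely based on the law firm and yoga studio described in this lesson.
VOICE_CARDS = {
    "law_firm": (
        "Platform: LinkedIn. Voice: measured, authoritative. "
        "Never use hype language or exclamation points."
    ),
    "yoga_studio": (
        "Platform: Instagram. Voice: warm, grounded. "
        "Never use pressure tactics or manufactured urgency."
    ),
}


def prompt_for(client: str, brief: str) -> str:
    """Paste the client's saved voice card above the brief for this post."""
    if client not in VOICE_CARDS:
        # Failing loudly avoids sending a brief with no voice card at all.
        raise ValueError(f"No voice card saved for client: {client}")
    return f"{VOICE_CARDS[client]}\n\n{brief}"


print(prompt_for("yoga_studio", "Write 3 caption options for our new morning class."))
```

Raising an error for an unknown client is a deliberate choice in this sketch: a prompt sent without its voice card is exactly the generic-input problem the lesson warns about.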
Build Your Voice Card Once, Use It Forever
Putting this into practice doesn't require any new tools or subscriptions. The free versions of ChatGPT and Claude handle caption writing well. What it requires is a shift in how you approach the prompt. Stop asking for 'a caption for my product.' Start briefing the AI like a creative director briefing a copywriter: who's the audience, what do they care about, what's the one thing this post needs to make them feel or do, what's off-brand, and what does success look like? That 90-second investment in the brief is where the real work happens.
Once you have a draft, your job is editorial, not creative. Read it aloud. Does it sound like your brand? Is the hook strong enough to stop a scroll? Is the CTA clear without being pushy? AI drafts are starting points. The best caption writers who use AI treat the first output as raw material: they cut, sharpen, and occasionally add something the AI couldn't have known, like a timely cultural reference or a specific customer story. That combination, AI speed plus human judgment, is what produces copy that actually converts.
The professionals getting the best results from AI caption writing aren't the ones using the most sophisticated tools. They're the ones who've built the clearest systems: a voice card per brand, a prompt template per platform, a quick editorial checklist before posting. That structure turns AI from a novelty into a reliable part of the workflow. Start with one client, one platform, one voice card this week. The habit builds fast once you see the time savings.
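Part of the quick editorial checklist is mechanical and can be automated before the human read-through. The Python sketch below is one possible version, assuming the rules from the Ridgeline example (banned phrases, no exclamation points, under 100 words, one question). `check_caption` is a hypothetical helper, and it cannot judge whether a draft sounds like your brand. That part stays with you.

```python
def check_caption(caption: str, banned_phrases: list[str],
                  max_words: int = 100,
                  allow_exclamation: bool = False) -> list[str]:
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    # Normalize curly apostrophes so "don’t" and "don't" match the same rule.
    lowered = caption.lower().replace("\u2019", "'")
    for phrase in banned_phrases:
        # Plain substring match: 'transform' also flags 'transformed'.
        if phrase.lower() in lowered:
            problems.append(f"banned phrase: {phrase}")
    if not allow_exclamation and "!" in caption:
        problems.append("contains an exclamation point")
    word_count = len(caption.split())
    if word_count >= max_words:
        problems.append(f"too long: {word_count} words (limit is under {max_words})")
    if "?" not in caption:
        problems.append("no question or conversation starter")
    return problems


BANNED = ["amazing", "perfect", "don't miss out", "transform"]

draft = (
    "Yirgacheffe arrived this week, and honestly it deserved better weather "
    "for its debut. Notes of bergamot, stone fruit, and something we keep "
    "describing as 'fancy tea energy.' Brewed slow, best enjoyed while watching "
    "the rain do its thing outside. What's your go-to brewing method for a light roast?"
)

for issue in check_caption(draft, BANNED):
    print(issue)
```

A check like this only catches rule violations. Reading the draft aloud is still the test for voice.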
Goal: Produce three AI-drafted Instagram captions for a real brand, edit the strongest one to publish-ready quality, and save a reusable prompt template that will cut your drafting time on every future post.
1. Choose one brand (your own business, your employer, or a client) and open a blank Google Doc or Notion page titled '[Brand Name] Voice Card.'
2. Write 3 adjectives that describe the brand's voice (e.g., 'direct, warm, slightly irreverent') and 2-3 sentences describing the primary audience and what they care about.
3. Find 2-3 existing captions from this brand that you feel represent it well, and paste them into the voice card as 'Examples We Love.'
4. List 5-8 words or phrases this brand never uses (e.g., 'synergy,' 'amazing,' 'don't miss out') under a 'Never Say' section.
5. Open ChatGPT (free at chat.openai.com) or Claude (free at claude.ai) and start a new conversation.
6. Paste your entire voice card at the top of the prompt, then add: 'Using this brand voice, write 3 Instagram caption options for [describe your post image or topic]. Each caption should be under 100 words and end with a question to encourage comments.'
7. Read all three outputs aloud and mark which phrases sound right and which feel off.
8. Edit your favorite draft: cut anything generic, add one specific detail only your brand would know, and confirm the CTA is clear.
9. Save the full prompt (voice card + caption brief) as a template in your doc so you can reuse it next week.
- Generic prompts produce generic copy: the quality of your brief determines the quality of the output, not the AI tool you're using.
- A voice card (tone, examples, banned phrases, audience) is the single most effective investment you can make in your AI caption workflow.
- AI defaults to statistically common marketing language, so you must explicitly override that with brand-specific instructions in every prompt.
- The best use of AI in caption writing is removing the blank-page problem, not replacing the human editorial judgment that follows.
- Specific details (product origins, real customer stories, cultural references, concrete benefits) are what separate converting copy from content that gets scrolled past.
- Platform context belongs in every prompt: Instagram, LinkedIn, and TikTok require different lengths, tones, and CTA styles even for identical products.
- Treat the first AI draft as raw material, not finished copy: read it aloud, cut the generic phrases, and add one thing only a human who knows the brand would include.
Takeaways
- Context transforms AI output: the more specific your brief, the less editing you do.
- Voice cards are reusable assets that pay for themselves in saved time within the first week.
- AI handles drafting; you handle strategy, judgment, and the final edit.
- Banned-phrase lists are as important as tone descriptions: tell the AI what not to say.
- A system (voice card + prompt template + editorial checklist) beats one-off prompting every time.
- Free tools are sufficient. ChatGPT and Claude both produce strong caption drafts with the right input.
- The goal is copy that sounds like your brand wrote it on its best day, produced in a fraction of the usual time.
