
Lesson 1 of 5

What AI Actually Does (And Doesn't)

~22 min read | Last reviewed May 2026

This lesson counts toward: Lead Through AI: The Executive Playbook; Teach Smarter, Learn Faster; Lead AI Adoption, Drive Results; Run Smarter: AI for Operations Leaders; Grow Faster: AI for Small Teams

How AI Actually Works

Part 1: What's Really Happening When You Type a Prompt

In 2022, the marketing team at Heinz ran an experiment. They asked several AI image generators to produce 'ketchup.' Every single tool, without being told the brand name, generated an image of a red bottle that looked unmistakably like Heinz. The team was delighted, but also a little unsettled. How did the AI know? Nobody programmed it with the rule 'ketchup equals Heinz.' The AI had absorbed millions of images and captions from the internet and, through sheer pattern repetition, had learned an association so strong it reproduced it on demand. Heinz turned this into a marketing campaign. But the moment also exposed something most professionals don't fully grasp: AI doesn't think the way we do. It recognizes patterns at a scale no human can match.

The Heinz team's creative director, Cristina Kenz, described the result as 'the ultimate proof point' for brand recognition. But what she was really witnessing was a statistical mirror. The AI had seen enough human-generated content (recipe blogs, grocery store photos, Instagram posts, Wikipedia articles) that it could reconstruct the image most strongly associated with a given word. This is the core mechanic behind every AI tool you will use as a manager: massive pattern recognition trained on human-generated data. It's not magic. It's not consciousness. It's a very sophisticated, very fast system for predicting what comes next based on what it has seen before.

That distinction matters enormously in practice. When you understand that AI tools like ChatGPT, Claude, or Microsoft Copilot are pattern-matching engines rather than reasoning minds, you start making better decisions about when to trust them, when to push back, and when to add your own judgment. The managers who get the most out of AI aren't the ones who treat it like a search engine or a magic oracle. They're the ones who understand the basic mechanics well enough to work with the grain of the tool rather than against it. This lesson gives you that foundation, no technical background required.

The Core Principle This Lesson Is Built On

AI tools don't understand your question the way a colleague does. They predict the most statistically likely useful response based on patterns in their training data. Knowing this explains why AI is brilliant at some tasks (drafting, summarizing, reformatting) and unreliable at others (verifying facts, doing live research, making judgment calls). Your job as a manager is to know which is which.

The Pattern Machine: Where AI's Power Actually Comes From

Picture a new hire who has read every email, report, proposal, legal brief, news article, and novel ever published, in multiple languages, before their first day on the job. They haven't lived any of it. They haven't made decisions, felt consequences, or built relationships. But they've seen so many examples of how professionals communicate that they can produce a passable first draft of almost anything you ask for. That's roughly the situation with a large language model (LLM), the technology behind ChatGPT, Claude, Gemini, and Copilot. The term 'large language model' just means a system trained on enormous amounts of text to predict what words should follow other words. GPT-4, the model behind ChatGPT Plus, was trained on an estimated 1 trillion words of text, though OpenAI has not published an official figure.

Here's where it gets practically useful. Because these models have seen so many examples of professional writing, they've absorbed implicit rules about structure, tone, and format that most people never consciously articulate. Ask ChatGPT to write a performance review for a mid-level employee who missed targets but showed growth, and it will produce something that follows the unwritten conventions of HR writing: balanced language, specific-sounding examples, a constructive close. It learned those conventions by reading thousands of real performance reviews. It doesn't know your employee. But it knows the genre. That's a genuine and significant capability you can put to work.

The limitation is the flip side of the same coin. Because AI is predicting based on patterns rather than reasoning from facts, it can produce outputs that sound authoritative but are wrong. This is called 'hallucination' in the industry, a slightly dramatic word for what is essentially confident confabulation. The AI fills in gaps with plausible-sounding content. A legal team at a New York law firm learned this the hard way in 2023 when an attorney used ChatGPT to research case precedents. The AI invented several convincing but entirely fictional court cases. The attorney submitted them to a federal judge. The resulting sanctions made international news. Pattern recognition without verification is a liability.

HR Manager: Writing a Performance Review Draft

Prompt

I'm an HR manager. Write a first draft of a performance review for a mid-level marketing coordinator named Jordan. Jordan consistently meets deadlines and produces high-quality written content, but struggles with proactive communication: their manager often has to follow up to get status updates. Jordan has been in the role for 18 months. Tone should be professional and constructive. Include a section on strengths, a section on development areas, and suggested goals for the next review period.

AI Response

**Performance Review: Jordan [Last Name], Marketing Coordinator**
**Review Period:** [Insert dates] | **Manager:** [Insert name]

**Overall Summary**
Jordan has demonstrated consistent reliability and strong written output over the past 18 months. Their contributions to the marketing team's content pipeline have been dependable, and the quality of their deliverables reflects a solid understanding of the team's standards and brand voice.

**Strengths**
Jordan's most notable strength is their ability to produce polished, deadline-conscious work. Projects assigned to Jordan are consistently completed on time and require minimal revision. Their written materials, including campaign copy, internal briefs, and stakeholder summaries, are clear, well-structured, and aligned with team objectives. Colleagues have noted that Jordan's drafts often serve as useful templates for others.

**Development Areas**
The primary area for growth is proactive communication. During this review period, Jordan's manager frequently needed to initiate check-ins to receive status updates on in-progress work. While the work itself met expectations, the lack of unprompted updates created uncertainty in project planning and occasionally delayed dependent tasks. Building a habit of regular, self-initiated progress communication will be important as Jordan takes on more complex projects.

**Goals for Next Review Period**
1. Send a brief weekly status update to the manager every Friday by 4 p.m. for all active projects.
2. Flag any blockers or delays within 24 hours of identifying them, rather than waiting for a check-in.
3. Take ownership of one cross-functional project, practicing stakeholder communication from brief to delivery.

**Manager Notes**
[Space for additional context, examples, or personalization before finalizing.]

A Hospital System Learns to Work With the Grain

In 2023, UC San Diego Health piloted an AI tool, built on GPT-4, to help physicians draft responses to patient messages in their MyChart portal. Before the pilot, doctors were spending an average of 24 minutes per day just on inbox management. Patient messages asking about medication interactions, test results, and appointment logistics were piling up. The AI draft tool didn't answer patients directly. It generated a suggested response that the physician could review, edit, and send. The key design insight: the AI handled the genre work (professional, empathetic, medically cautious language) while the physician provided the factual accuracy and clinical judgment.

The results were striking. Physician satisfaction with inbox management improved significantly, and patients reported that response quality felt more thorough, likely because doctors had more time to focus on the substance rather than the mechanics of writing. This example illustrates a principle worth internalizing: AI performs best as a first-draft engine, not a final-answer machine. The physicians at UC San Diego weren't replacing their expertise. They were offloading the cognitive labor of formatting and phrasing so they could apply their expertise more precisely. That's the model that works across industries.

What Different AI Tools Are Actually Built For

Tool	Best For	Underlying Strength	Watch Out For
ChatGPT Plus (OpenAI)	Drafting, brainstorming, summarizing long documents, writing emails and reports	Broad training data; strong at diverse professional writing styles	Can hallucinate facts and citations; knowledge cutoff means no live data
Claude Pro (Anthropic)	Analyzing long documents, nuanced writing, careful reasoning tasks	Handles very long inputs (up to 200,000 tokens, roughly 150,000 words); strong at following detailed instructions	Occasionally over-cautious; less strong on creative tasks than GPT-4
Microsoft Copilot (M365)	Working inside Word, Excel, PowerPoint, Outlook, Teams	Directly integrated with your files and calendar; can draft emails from meeting notes	Requires M365 subscription; quality depends on how organized your files are
Google Gemini (Workspace)	Working inside Gmail, Docs, Sheets, Slides	Native integration with Google apps; good at summarizing email threads	Still maturing; less consistent than Copilot for complex document tasks
Notion AI	Summarizing meeting notes, drafting project briefs, organizing information in Notion	Deeply integrated with Notion databases and pages	Only useful if your team already uses Notion; limited outside that ecosystem
Grammarly AI	Improving tone, clarity, and professionalism of written communication	Strong at real-time editing and tone adjustment within emails and docs	Not a drafting tool; best used to refine rather than generate content

The major AI tools available to non-technical professionals in 2024, and what they're actually optimized for.

A Sales Director Discovers the Difference Between Tools

Rachel Garrett, a sales director at a mid-sized B2B software company in Austin, started using AI tools in early 2023. Her first instinct was to use ChatGPT for everything: drafting proposals, summarizing competitor research, writing follow-up emails. It worked well for drafts. But she noticed that when she asked ChatGPT to summarize a 40-page RFP (Request for Proposal) that a client had sent over, she had to paste the document in chunks, losing context between sections. When her company moved to Microsoft Copilot through their existing M365 subscription, the workflow changed. She could open the RFP in Word, ask Copilot to summarize the key requirements, and get a structured output in under two minutes, without leaving the document.

Rachel's experience points to something the comparison table above captures: the right tool depends on where your work lives. If your team runs on Microsoft 365 (Outlook, Word, Teams, SharePoint), Copilot is almost certainly the highest-leverage tool because it works inside your existing environment. If you're a Google Workspace shop, the same argument applies to Gemini. ChatGPT and Claude are more powerful for complex reasoning and long-form drafting, but they require you to bring the content to them. No tool is universally better. The best tool is the one that fits your workflow with the least friction.

Monday Morning Move: Try the Tool That Fits Your Stack

Before experimenting with new AI tools, check what you already have access to. If your company uses Microsoft 365, you may already have Copilot available; check with your IT team or look for the Copilot icon in Word or Outlook. If you're on Google Workspace, Gemini is built into Gmail and Docs. Starting with a tool that's already in your environment means zero setup and immediate application to real work.

Why 'Talking to AI' Is a Skill Worth Developing

Most professionals who feel disappointed by AI tools have one thing in common: they're treating them like a search engine. They type a short, vague query and expect a precise, useful result. 'Write me a proposal' returns something generic. 'Summarize this' returns something too broad. The frustration is real, but the cause is usually the instruction, not the tool. AI tools respond to context, specificity, and structure. The more clearly you describe who you are, what you need, who the output is for, and what format you want, the more useful the response becomes. This isn't a technical skill. It's a communication skill, one that most managers already have in other contexts.

Think about how you brief a junior colleague on a task. You don't just say 'write a report.' You say: 'Write a two-page summary of last quarter's sales performance for the CFO. Focus on the three regions that missed targets. Use plain language; she doesn't want jargon. Include a bullet-point summary at the top.' That same specificity, applied to an AI prompt, produces dramatically better results. The industry term for this is 'prompt engineering,' but that phrase makes it sound more technical than it is. It's really just clear briefing. You already know how to do this. You just need to apply it to a new kind of collaborator.

The practical implication is that your value as a manager doesn't decrease because AI can draft documents quickly. It increases, because the quality of what AI produces is directly proportional to the quality of the direction you give it. A vague brief returns a vague draft. A sharp, specific brief, with context about the audience, the purpose, the tone, and the constraints, returns something you can actually use. The professionals who will get the most out of these tools are the ones who are good at clarifying what they actually want. That's a management skill, not a technical one.

First Contact: Test an AI Tool on a Real Work Task

Goal: Experience the difference between a vague prompt and a well-briefed prompt using a real task from your current work.

1. Choose a task from your current workload that involves writing: an email, a summary, a short report, a meeting agenda, or a proposal section. Keep it real; don't invent something.
2. Open ChatGPT (chat.openai.com, free account works), Claude (claude.ai, free account works), or Microsoft Copilot (copilot.microsoft.com, free) in your browser.
3. Type a short, vague version of your request, the kind you might type into a search engine. For example: 'Write a follow-up email after a sales meeting.' Hit send and read the result.
4. Now write a second, fully briefed version of the same request. Include: your role, the specific situation, who the output is for, the desired tone, any constraints (length, format, things to avoid), and one or two key points the output must include.
5. Send the second prompt and read the result. Note the differences in specificity, relevance, and usability.
6. Copy the better output into a document. Edit it: add specific names, real numbers, and actual context that only you know. Time how long this takes compared to writing from scratch.
7. Save both prompts and both outputs. You'll use these in Part 2 to refine your prompting technique further.
8. Write two sentences in your own words describing what the AI did well and what it got wrong or missed entirely.
9. Based on this test, identify one recurring writing task in your week that you could consistently use AI to draft first.

Key Takeaways from Part 1

AI tools like ChatGPT, Claude, and Copilot are pattern-recognition systems trained on vast amounts of human-generated text. They predict useful responses; they don't reason the way humans do.
This makes AI highly capable at genre-based tasks: drafting professional documents, reformatting content, summarizing long text, and generating structured outputs in a specific style.
The same mechanic creates a real risk: AI can produce confident, plausible-sounding content that is factually wrong. Always verify facts, statistics, citations, and names before using AI output professionally.
Different tools are optimized for different environments. The best tool is usually the one already integrated into where your work lives. M365 users should explore Copilot; Google Workspace users should explore Gemini.
Prompt quality determines output quality. Specific, context-rich instructions, describing your role, the audience, the format, and the purpose, consistently produce more useful results than short, vague queries.
AI works best as a first-draft engine. Your job is to provide direction, add facts and context the AI can't know, and apply judgment the AI can't exercise. The combination of AI speed and human judgment is where the real productivity gain lives.

When AI Gets It Wrong, and Why That Matters for Your Team

Historical Record: New York law firm partner, 2023

In 2023, a partner at a New York law firm submitted a legal brief citing six court cases that were entirely fabricated. The attorney had used ChatGPT to help research the brief and the model generated fictional case names, judges, dockets, and rulings that sounded authoritative.

This incident illustrates a critical failure mode of AI language models: they generate fluent, confident-sounding text based on patterns rather than verified facts, creating a predictable risk for professional work.

What happened here cuts to the heart of how AI language models actually work. The model didn't lie in the human sense; it didn't know the cases were fake. It generated text that looked statistically like what a real legal citation should look like, because that's exactly what it was trained to do. It had seen thousands of legal briefs and knew the pattern: case name, court, year, ruling summary. So it produced that pattern, fluently and confidently, even when the underlying facts didn't exist. This is called a hallucination, and it's one of the most important concepts any professional using AI tools needs to understand.

The principle extracted from this story isn't 'don't use AI for legal work.' It's more nuanced and more useful: AI is extraordinarily good at producing text that looks correct, but it has no internal alarm that fires when something is factually wrong. Confidence in AI output is stylistic, not epistemic. The model writes with the same assured tone whether it's summarizing a real report or inventing one. Your job as a professional isn't to be suspicious of everything AI produces; it's to know which outputs require verification and which are safe to use as-is.

What 'Hallucination' Actually Means

Hallucination is the industry term for when an AI generates information that sounds plausible but is factually wrong or completely made up. It happens because AI models predict what text should come next based on patterns; they don't look things up in real time (unless specifically connected to a search tool). Think of it like a very well-read colleague who occasionally fills gaps in their memory with confident-sounding guesses. Smart, useful, but needs a fact-check on anything specific.
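For readers who want to see the 'predict what comes next' idea in miniature, here is an optional toy sketch in Python. It is not how ChatGPT or Claude are built (real models are vastly larger and more sophisticated); it is a simple word-pair counter, and the three-sentence `corpus` and the function names are invented for illustration. The point is that the program only knows which words have followed which, so it can stitch together fluent sentences that never appeared in its training text.

```python
import random
from collections import defaultdict

def train_bigrams(text):
    """Record, for each word, the words that followed it in the training text."""
    words = text.lower().split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers

def generate(followers, start, length=8, seed=0):
    """Extend `start` by repeatedly picking a word that followed the last one."""
    rng = random.Random(seed)
    output = [start]
    for _ in range(length - 1):
        options = followers.get(output[-1])
        if not options:
            break  # no learned pattern to continue from
        output.append(rng.choice(options))
    return " ".join(output)

# Hypothetical training text, made up for this example.
corpus = (
    "the court ruled in favor of the airline . "
    "the court ruled against the plaintiff . "
    "the judge ruled in favor of the plaintiff ."
)
model = train_bigrams(corpus)
print(generate(model, "the"))
```

Every word pair in the output was seen during training, yet the sentence as a whole may describe a ruling that exists nowhere in the corpus. Nothing in the program checks whether the result is true, which is the same gap that produces hallucinations at much larger scale.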

The Pattern Machine: How AI Learns to Sound Like an Expert

To understand why hallucinations happen, you need a working mental model of how these systems learn. Imagine you're training a new hire by giving them access to every business book, every industry report, every email thread, every news article, and every academic paper ever published, and then asking them to read all of it before their first day. They'd emerge with an extraordinary grasp of language, concepts, tone, and structure. They'd know what a good executive summary sounds like. They'd know how marketing plans are typically structured. They'd be able to draft a performance review that hits all the right notes. But they wouldn't have lived any of it. They'd be pattern-matching at enormous scale.

That's essentially what large language models do during training. They process billions of words and learn to predict which words follow which other words across an almost incomprehensible range of contexts. The result is a system that has absorbed the shape of human knowledge without experiencing the world directly. It knows that capital cities are usually mentioned alongside country names. It knows that financial reports contain phrases like 'year-over-year growth' and 'EBITDA margin.' It knows that apology emails tend to start with acknowledgment before moving to remedy. This structural knowledge is genuinely powerful, and it's why AI tools can accelerate so much professional writing and thinking work.

But here's the catch that professionals need to keep front of mind: the model's knowledge is frozen at a point in time. GPT-4's training data has a cutoff. Claude's does too. Unless the tool is explicitly connected to a live search engine (for example, Copilot in Microsoft 365, which can pull from the web, or ChatGPT with browsing enabled), the AI is working from memory, not from today's reality. Ask it about a law that changed last quarter, a competitor's new pricing, or your company's current headcount, and it will either tell you it doesn't know (if it's well-designed) or fill in the gap with something plausible-sounding (if it's not being careful). Knowing this shapes how you use these tools wisely.

HR Manager: Drafting a Job Description with Guardrails

Prompt

I'm an HR manager at a 200-person logistics company. Write a job description for a Senior Operations Analyst role. The position is hybrid (3 days in office, Chicago), reports to the VP of Operations, and requires 5+ years of experience in supply chain or logistics analytics. We use SAP and Tableau. Salary range is $95,000–$115,000. Focus on the responsibilities and requirements sections. Use plain, direct language, not corporate jargon.

AI Response

**Senior Operations Analyst: Job Description**

**About the Role**
We're looking for a Senior Operations Analyst to join our Chicago-based operations team. You'll work directly with the VP of Operations to turn data into decisions: identifying inefficiencies, tracking performance, and supporting strategic planning across our logistics network. This is a hybrid role: three days in our Chicago office, two days remote.

**What You'll Do**
- Analyze supply chain and logistics data to identify cost-saving and efficiency opportunities
- Build and maintain dashboards in Tableau that give leadership real-time visibility into operations
- Work with cross-functional teams (warehouse, transport, procurement) to diagnose operational bottlenecks
- Prepare weekly and monthly performance reports for senior leadership
- Support annual planning by modeling scenarios and forecasting operational capacity
- Manage and query data in SAP; flag data quality issues and recommend fixes
- Present findings clearly to non-technical stakeholders

**What We're Looking For**
- 5+ years of experience in supply chain, logistics, or operations analytics
- Hands-on experience with SAP and Tableau (required)
- Strong ability to translate complex data into clear business recommendations
- Comfortable working independently and managing competing priorities
- Bachelor's degree in a relevant field; advanced degree a plus

**Compensation**
$95,000–$115,000 base salary, depending on experience. Full benefits package included.

A Hospital System That Got AI Adoption Right

In 2022, a regional hospital network in the American Midwest began piloting AI writing tools for their administrative and communications teams, not clinical staff. The goal was simple: reduce the time nurses and unit coordinators spent drafting internal memos, patient communication templates, shift handover summaries, and staff scheduling updates. Within six months, the communications team reported saving an average of four hours per week per person on drafting tasks. The key to their success wasn't just the technology; it was the clear boundaries they set around what AI could and couldn't touch.

The hospital's approach offers a template worth stealing. They categorized tasks into three buckets: AI-first (drafts AI produces that get light human review before sending), AI-assisted (AI provides a starting point, but a human substantially rewrites), and AI-free (anything involving patient records, clinical recommendations, legal documents, or personalized care communications). That taxonomy didn't require anyone to understand machine learning. It just required leadership to think clearly about where accuracy was non-negotiable versus where speed mattered more than perfection. Every manager on your team can apply that same logic starting this week.
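For teams that like to write their rules down explicitly, the three buckets can be sketched as a small decision function. This is only an illustration under stated assumptions: the keyword list `AI_FREE_TOPICS` and the `needs_heavy_rewrite` flag are hypothetical stand-ins for a judgment a manager would actually make, and simple keyword matching is far cruder than a real policy review.

```python
# Hypothetical keywords drawn from the hospital's "AI-free" examples above.
AI_FREE_TOPICS = ("patient record", "clinical", "legal", "personalized care")

def triage(task, needs_heavy_rewrite=False):
    """Return 'AI-free', 'AI-assisted', or 'AI-first' for a task description."""
    lowered = task.lower()
    # Accuracy is non-negotiable here, so the AI-free check always comes first.
    if any(topic in lowered for topic in AI_FREE_TOPICS):
        return "AI-free"
    # A human will substantially rewrite the draft.
    if needs_heavy_rewrite:
        return "AI-assisted"
    # Light human review before sending.
    return "AI-first"

for task, heavy in [
    ("Staff scheduling update", False),
    ("Memo announcing a new scheduling policy", True),
    ("Clinical recommendation letter", False),
]:
    print(f"{task}: {triage(task, heavy)}")
```

The ordering of the checks carries the design choice: anything touching a sensitive topic is excluded before speed or convenience is even considered.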

Comparing the Major AI Tools: What Each One Is Actually Good At

Tool	Best For	Unique Strength	Watch Out For	Typical User
ChatGPT Plus (OpenAI)	Long-form writing, brainstorming, research summaries	Versatile across almost any task; strong reasoning with GPT-4o	Hallucinations on specific facts; knowledge cutoff unless browsing is on	Managers, marketers, consultants
Claude Pro (Anthropic)	Analyzing long documents, nuanced writing, policy drafts	Handles very long documents (up to ~150k words); careful, measured tone	Less integrated into existing business tools	HR, legal-adjacent roles, executives
Microsoft Copilot	Working inside Word, Excel, Outlook, Teams, PowerPoint	Lives inside tools you already use; can reference your actual files	Requires Microsoft 365 subscription; quality varies by app	Office-based professionals across all roles
Google Gemini	Gmail, Google Docs, Sheets, Meet integration	Deep integration with Google Workspace; strong at summarizing emails	Still maturing; uneven performance on complex tasks	Teams running on Google Workspace
Notion AI	Meeting notes, project docs, internal wikis	Summarizes and rewrites within your existing Notion workspace	Limited outside Notion; not a general-purpose tool	Ops teams, project managers, startups
Grammarly AI	Editing tone, fixing grammar, improving clarity	Inline suggestions as you write; excellent for email and reports	Not a content generator; it refines rather than creating from scratch	Anyone who writes professionally

Quick reference: matching AI tools to professional needs. Most teams benefit from two tools: one general-purpose (ChatGPT or Claude) and one integrated into existing workflows (Copilot or Gemini).

A Sales Director Who Stopped Writing Proposals From Scratch

Marcus runs a seven-person sales team at a mid-sized B2B software company. Before AI tools, his team spent three to five hours crafting each client proposal, pulling together company research, tailoring messaging, structuring the document, writing the executive summary. It was high-effort, repetitive work that pulled senior reps away from actual selling. Marcus started using Claude Pro to draft initial proposals. He'd paste in the prospect's LinkedIn profile, their company's recent press releases, and his team's notes from the discovery call. Then he'd ask Claude to produce a first draft structured around the client's stated priorities.

The result wasn't a finished proposal. It was a strong 70% draft that his team could refine in under an hour instead of building from zero. More importantly, the AI-generated drafts were often better structured than what his team produced under time pressure, because the model had absorbed the pattern of thousands of effective proposals. Marcus still reviews every proposal personally before it goes out. He checks the facts, adjusts the tone, and adds the specific relationship context that only a human who knows the client can provide. His team's proposal output increased by roughly 40% in the first quarter. The AI didn't replace his salespeople; it removed the part of the job they liked least.

The 70% Rule for AI Drafts

Don't expect AI to produce a finished, send-ready document on the first try. You don't need it to. Treat AI output as a strong first draft that gets you 70% of the way there. Your job is to supply the remaining 30%: your specific context, your relationship knowledge, your professional judgment, and your fact-checking. This mindset shift makes AI dramatically more useful because you stop being disappointed by imperfection and start being impressed by the time saved.

Putting AI to Work in Your Actual Workflow

The professionals who get the most out of AI tools aren't the ones who use them most; they're the ones who use them most deliberately. They've identified three or four specific tasks in their week that AI handles well and built a small habit around those tasks. A marketing manager might use ChatGPT to draft social media copy every Monday morning. An HR director might use Claude to summarize candidate feedback after every interview round. A teacher might use Copilot to generate differentiated quiz questions from a lesson plan. These aren't dramatic workflow overhauls. They're precise insertions of AI into moments where it saves real time.

The quality of what you get back from AI tools depends heavily on what you put in. This is what practitioners mean when they talk about 'prompt engineering', but that phrase makes it sound more technical than it is. A better frame: think of it like briefing a smart contractor. The more specific context you give (your role, your audience, the purpose of the document, the tone you want, any constraints), the more useful the output. A vague request gets a generic response. A specific request, loaded with real context, gets something you can actually use. The difference between 'write me a project update email' and 'write a project update email to my CFO explaining a two-week delay due to vendor issues, keeping it under 150 words and avoiding blame language' is enormous.
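If you reuse the same kind of brief often, the elements of a good prompt described above can be kept as a fill-in-the-blanks template. The sketch below is one possible way to do that in Python; the function name `build_prompt`, its field names, and the sample role and tone are assumptions made for illustration, not features of any AI tool. The output is plain text that you would paste into ChatGPT, Claude, or Copilot yourself.

```python
def build_prompt(role, task, audience, tone, constraints):
    """Assemble a fully briefed prompt from the elements a good brief covers."""
    lines = [
        f"I'm {role}.",
        f"Task: {task}",
        f"Audience: {audience}",
        f"Tone: {tone}",
        "Constraints:",
    ]
    # One bullet per constraint keeps each limit easy to spot and edit.
    lines += [f"- {c}" for c in constraints]
    return "\n".join(lines)

# Sample values: the task and constraints come from the CFO example above;
# the role and tone are hypothetical.
prompt = build_prompt(
    role="a project manager",
    task="Write a project update email explaining a two-week delay due to vendor issues.",
    audience="my CFO",
    tone="direct and professional",
    constraints=["Keep it under 150 words", "Avoid blame language"],
)
print(prompt)
```

A template like this does not make the prompt smarter. It simply makes it hard to forget the audience, tone, or constraints when you are in a hurry.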

There's also a compounding benefit that most people discover after a few weeks of regular AI use: you get better at knowing what AI can do. You develop intuition for which requests will land well and which will produce something you need to heavily edit. You learn to spot hallucinations faster. You start writing prompts that are more precise because you've seen what precision produces. This isn't a technical skill. It's professional judgment applied to a new tool, the same way you learned over time how to write an effective email subject line or run a meeting that doesn't waste everyone's time.

Map Your Work Week for AI Opportunities

Goal: Identify three to five tasks in your current role where AI tools could realistically save you time or improve quality, and test one of them before the end of the week.

1. Open a blank document or notebook and write down every recurring task you do in a typical work week: emails, reports, meeting prep, research, feedback, scheduling communications, presentations.
2. Next to each task, mark whether it involves specific factual data that must be accurate (like financial figures or legal terms) or whether it's primarily writing and structure work.
3. Circle the tasks that are primarily writing and structure. These are your best AI candidates.
4. Pick the one that takes you the most time and produces the most generic output (e.g., weekly status emails, meeting agenda drafts, job posting updates).
5. Open ChatGPT, Claude, or whichever AI tool you have access to, and write a prompt for that task. Include your role, the audience, the purpose, the desired length, and any specific constraints or tone requirements.
6. Review the AI's output. Note what's usable immediately, what needs editing, and what's factually wrong or missing context only you have.
7. Edit the output to add your specific context, correct any errors, and adjust the tone to match your voice.
8. Save both the original prompt and the final edited version. This becomes your template for the next time you do this task.
9. Repeat for a second task next week, building a small personal library of prompts that work for your specific role.

Key Principles from Part 2

AI hallucinations happen because models generate plausible-sounding text, not verified facts. Confidence in tone does not mean accuracy in content.
AI knowledge has a training cutoff. For current events, recent data, or live information, use tools with web access enabled (Copilot, ChatGPT with browsing) or verify manually.
Different AI tools have different strengths. Copilot and Gemini integrate into existing software; ChatGPT and Claude are more powerful for general tasks; Grammarly refines rather than creates.
Categorize your tasks: AI-first, AI-assisted, or AI-free. The category depends on how much accuracy risk is involved, not how complex the task is.
Treat AI output as a strong 70% draft. Your value-add is the context, judgment, relationships, and verification that AI cannot supply.
Better prompts produce better outputs. Give AI your role, audience, purpose, tone, and constraints, the same way you'd brief a capable colleague.
Regular use builds intuition. The more you use AI tools deliberately, the faster you develop judgment about when to trust, when to edit, and when to verify.

When AI Gets It Wrong, and What That Means for You

In 2023, a partner at a New York law firm submitted a legal brief citing six court cases to support his client's argument. The cases looked authoritative: proper citations, plausible summaries, convincing legal language. Every single one was fabricated. ChatGPT had generated fictional case names, fictional rulings, and fictional quotes with complete confidence. The partner had not verified them. The judge was not amused. The firm faced sanctions and the story became front-page news. This wasn't a story about a rogue AI. It was a story about a professional who misunderstood what AI actually does, and paid the price for it.

The lawyers involved weren't careless people. They were experienced professionals who assumed that because the AI's output looked authoritative, it was accurate. That assumption is the central trap. AI language models generate text by predicting what words should come next based on patterns learned from training data. They don't look things up. They don't check facts in real time. They produce fluent, confident-sounding language even when the underlying content is wrong. Understanding this one fact changes how every manager, marketer, and consultant should use these tools.

The principle here is not that AI is untrustworthy. It's that AI has a specific, predictable failure mode you can learn to manage. A confident tone is not a signal of accuracy. Length is not a signal of quality. Formatting is not a signal of correctness. Once you internalize this, you stop treating AI output as a final product and start treating it as a talented first draft from someone who occasionally makes things up without realizing it. That shift in mental model makes you a dramatically better user of these tools.

Why AI 'Hallucinations' Happen

AI models like ChatGPT and Claude are trained to produce fluent, coherent text, not to retrieve verified facts. When asked about something outside their training data, or something obscure, they sometimes generate plausible-sounding but incorrect information. This is called a hallucination. It's not a bug being fixed next week; it's a structural feature of how these models work. Your job is to verify anything factual before using it professionally.
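
If you'd like to see the mechanic in miniature, here is an optional toy sketch in Python. It is not how ChatGPT or Claude are actually built, since real models are vastly larger and more sophisticated. It only assumes a made-up three-sentence training text and a simple rule: always pick the word that most often came next. The function names are hypothetical, chosen for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words followed it in the training text."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def continue_text(follows, start, max_words=5):
    """Extend `start` by repeatedly picking the most frequent next word."""
    out = [start]
    for _ in range(max_words):
        options = follows.get(out[-1])
        if not options:
            break  # no pattern learned for this word, so stop
        out.append(options.most_common(1)[0][0])
    return " ".join(out)

# Made-up training text (an assumption for this sketch, not real case law).
corpus = (
    "the court ruled in favor of the airline . "
    "the court ruled in favor of the plaintiff . "
    "the court ruled in favor of the airline ."
)

model = train_bigrams(corpus)
result = continue_text(model, "court", max_words=6)
print(result)            # court ruled in favor of the court
print(result in corpus)  # False: this sentence was never in the training text
```

The output reads smoothly, yet it states something the training text never said. Nothing in the code checks whether the sentence is true; it only follows the most common pattern. That is the same gap, at a much smaller scale, that produces hallucinations.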

The Confidence Problem Across Industries

A marketing director at a mid-sized retail brand used Claude to research competitor pricing strategies for a board presentation. The output was detailed and well-structured: specific percentages, named campaigns, attributed quotes from executives. It looked like solid competitive intelligence. When her team fact-checked it ahead of the presentation, roughly 40 percent of the specific claims couldn't be verified, and two of the executive quotes didn't exist. She caught it in time. But she described the experience as 'deeply unsettling; it read like a real analyst report.'

This pattern shows up across every professional context. An HR manager using AI to summarize employment law changes needs to verify those summaries against actual legal sources. A teacher using Gemini to generate historical facts for a lesson plan needs to cross-check dates and attributions. A sales consultant using Copilot to draft a proposal that references industry statistics needs to confirm those numbers came from a real study. The workflow is the same: use AI for speed and structure, then apply your professional judgment to verify what matters.

The good news is that AI's failure modes are predictable. Models are most reliable when generating structure, tone, and language: writing emails, reformatting content, brainstorming options, summarizing documents you provide. They are least reliable when asked to recall specific facts, recent events, statistics, or anything that requires external verification. Knowing which category your task falls into tells you exactly how much checking you need to do.

Asking AI to summarize a document you provide: a reliable use case

Prompt

Here is a client feedback survey I conducted last month. Please summarize the top 5 themes from these responses and flag any concerns that appear more than once.
[Survey responses pasted below]
'The onboarding process took too long, three weeks is excessive.'
'Really impressed with the account manager's responsiveness.'
'Invoicing is confusing, we got billed for services we didn't use.'
'Love the product but the setup documentation is outdated.'
'The onboarding took longer than expected.'
'Billing errors caused a delay in our payment approval process.'
'Account manager was excellent, very proactive.'
'Documentation needs updating badly.'
'We were charged for something we canceled.'

AI Response

Here are the top themes from your client feedback survey:
**1. Onboarding Duration (mentioned 2x, flagged)** Multiple clients noted the onboarding process felt too long. One client specifically cited three weeks as excessive. This suggests a gap between client expectations and your current onboarding timeline.
**2. Billing and Invoicing Errors (mentioned 3x, flagged)** This is your most frequently raised concern. Three separate clients referenced billing problems: charges for unused or canceled services, and at least one case where billing errors delayed internal payment approval. This warrants immediate review of your invoicing process.
**3. Account Manager Performance (mentioned 2x, positive)** Two clients specifically praised their account manager's responsiveness and proactivity. This is a strength worth recognizing and potentially replicating across the team.
**4. Outdated Documentation (mentioned 2x, flagged)** Two clients flagged that setup documentation is not current. This may be contributing to onboarding friction.
**5. Overall Product Satisfaction** Despite operational frustrations, at least one client explicitly separated product quality from process issues, a useful signal that retention risk is process-driven, not product-driven.
**Priority recommendation:** Address billing errors first. They appear three times and have a direct financial impact on client relationships.

A Teacher Who Found the Right Balance

A high school history teacher in Chicago started using ChatGPT to build lesson plans and discussion questions. Early on, she made the same mistake as the lawyers: she used AI-generated historical details without checking them. A student caught an incorrect date in a worksheet. After that, she developed a simple personal rule: AI writes the structure and the questions, she verifies every specific fact. Her lesson planning time dropped from four hours a week to ninety minutes, and her materials actually improved because she had more time to add nuance.

Her workflow became a model for her department. Use AI to generate the skeleton: learning objectives, discussion prompts, activity formats, summary language. Then spend your expert time on what AI can't reliably do: verify accuracy, add context, apply judgment about what your specific students need. This division of labor is exactly how professionals in every field should think about AI. You bring the expertise. AI brings the speed.

Task Type	AI Reliability	Your Role	Example
Writing & reformatting	High	Edit for tone and accuracy	Drafting a client email from bullet points
Summarizing content you provide	High	Check emphasis and completeness	Summarizing meeting notes you paste in
Brainstorming options	High	Filter and evaluate	Generating 10 subject line options for a campaign
Recalling specific facts/statistics	Low	Always verify independently	Asking for industry market size figures
Citing sources or cases	Low	Never use without verification	Asking for legal precedents or research citations
Recent events (post-training)	Very Low	Use a live search tool instead	Asking about news from the last 6 months

AI reliability varies by task type. Matching the right tool to the right job reduces risk significantly.

Putting This Into Practice From Monday

The most effective professional AI users develop a two-step habit. First, they identify whether their task is structure-based or fact-based. Writing an email? Drafting a proposal structure? Brainstorming campaign names? AI is excellent here, so use it freely. Researching a statistic, checking a regulation, citing a study? Treat AI output as a starting point and verify it through a credible source before it goes anywhere official.

Second, they give AI better inputs. The lawyers gave ChatGPT a vague instruction and accepted whatever came back. The marketing director did the same. The teacher eventually learned to provide context (the curriculum standard, the student age group, the specific topic) and got dramatically better, more usable output. AI performs better when you treat it like a capable new colleague who needs a proper brief, not a search engine you can query with two words.
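
For readers comfortable with a little Python, the idea of a proper brief can be sketched as a reusable template. This is an optional illustration, not a required tool: the function name `build_brief` and every sample value below are hypothetical, and you can get the same result by keeping the same headings in a plain text file.

```python
def build_brief(task, role, audience, purpose, tone, constraints):
    """Assemble a prompt that briefs the AI the way you would brief a colleague."""
    lines = [
        f"I am a {role}.",
        f"Audience: {audience}",
        f"Purpose: {purpose}",
        f"Tone: {tone}",
        "Constraints:",
    ]
    # One bullet per constraint keeps each requirement easy to spot.
    lines += [f"- {item}" for item in constraints]
    lines.append(f"Task: {task}")
    return "\n".join(lines)

# Hypothetical sample values, for illustration only.
prompt = build_brief(
    task="Draft this week's status email from the bullet points I paste below.",
    role="operations manager",
    audience="senior leadership who skim on their phones",
    purpose="report progress and flag one risk that needs a decision",
    tone="direct and plain, no jargon",
    constraints=["under 150 words", "lead with the decision needed"],
)
print(prompt)
```

The value is in the checklist rather than the code: if any field is hard to fill in, the brief is probably still too vague to hand over.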

None of this requires technical skill. It requires a mental model shift: AI is a powerful drafting and thinking partner that has a specific weakness around factual recall. Build your workflows around that reality. Use AI for the parts it does well, apply your professional judgment to the parts it doesn't, and you get the speed benefits without the professional risk. That's the entire game.

The 30-Second Verification Habit

Any time AI gives you a specific number, a named study, a legal reference, a quote, or a historical fact, spend 30 seconds verifying it before it leaves your hands. Paste the claim into Google, check the original source, or ask a colleague with domain expertise. This one habit prevents most AI-related professional embarrassments. It sounds obvious. Most people skip it anyway.
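
If you want a mechanical nudge for this habit, the rough Python sketch below pulls out sentences from a draft that contain a number or a quoted phrase, so you have a short checklist to verify. It is a deliberately crude heuristic under stated assumptions: the function name `claims_to_verify` is hypothetical, the sample draft is made up, and it will miss named studies, legal references, and facts written without digits. It does not check anything itself; the verifying is still yours to do.

```python
import re

def claims_to_verify(text):
    """Return sentences that contain a digit or a double-quoted phrase."""
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if re.search(r'\d|"', s)]

# Made-up AI draft, for illustration only.
draft = (
    "Thanks for joining the call. "
    "Our churn fell 18 percent last quarter. "
    'The CEO said "we are doubling down" in March. '
    "Let me know if you have questions."
)

flagged = claims_to_verify(draft)
for claim in flagged:
    print("VERIFY:", claim)
```

In this made-up draft, the two middle sentences are flagged and the greeting and sign-off are not.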

Build Your AI Reliability Filter

Goal: Develop a personal, practical framework for knowing when to trust AI output and when to verify it, grounded in a real task from your own job.

1. Open ChatGPT (free version is fine) or Claude at claude.ai.
2. Think of a real work task you did last week that involved writing, research, or summarizing information.
3. Identify whether that task was primarily structure-based (writing, formatting, brainstorming) or fact-based (statistics, citations, regulations).
4. If it was structure-based: type the task into ChatGPT now and generate a draft. Note how long it takes versus doing it manually.
5. If it was fact-based: ask ChatGPT to complete the task, then take one specific fact or figure from the response and verify it using a Google search or official source.
6. Record what you found: did the AI get it right, partially right, or wrong?
7. Write two sentences describing how you would adjust your process for this type of task going forward.
8. Share your finding with one colleague and ask them to try the same exercise with a task from their own role.
9. Save your two sentences as a personal 'AI use rule' you can refer back to when deciding how much to trust AI output in your work.

Key Takeaways

AI language models predict plausible text; they do not retrieve verified facts. Fluent output does not mean accurate output.
Hallucinations are a structural feature of how these models work, not a bug that will disappear. Build verification into your workflow.
AI is most reliable for structure-based tasks: drafting, reformatting, summarizing content you provide, and brainstorming.
AI is least reliable for specific facts, statistics, citations, legal references, and anything requiring current information.
Better inputs produce better outputs. Give AI context, constraints, and a clear brief, just like you would a capable human colleague.
The professional skill is not using AI. It's knowing which parts of your work to hand to AI and which parts require your expertise and judgment.
A 30-second verification habit before any AI-generated fact leaves your hands eliminates most professional risk.

