Hallucinations in depth: what causes them and how to spot them
The Lawyer Who Trusted the Machine
In 2023, New York attorney Steven Schwartz filed a legal brief in federal court that cited six cases to support his client's personal injury claim against Avianca Airlines. The cases had proper-sounding names, realistic docket numbers, and plausible legal reasoning. They also did not exist. Schwartz had used ChatGPT to research the brief and had not verified the citations against any legal database. When opposing counsel flagged the discrepancy, Schwartz asked ChatGPT directly whether the cases were real. The model confidently confirmed they were. In June 2023, Judge P. Kevin Castel fined the law firm $5,000 and required the attorneys to notify the judges named in the fictitious opinions. The case became front-page news — not because AI had failed in some exotic, technical way, but because it had failed in the most ordinary way possible: it made something up and then doubled down when asked.
What makes this case instructive is not the embarrassment. It's the mechanism. Schwartz was not being reckless by the standards he'd applied to every other research tool. He asked a follow-up question. He sought confirmation. The problem is that asking ChatGPT to verify its own output is like asking a witness whether they're telling the truth — the model has no external reference point to check against. It generates a confirmation using the same process it used to generate the original claim: predicting the most statistically plausible next token. If it generated a false case name, it will generate a false confirmation of that case name with equal fluency. The model doesn't know it fabricated anything, because knowing requires a ground truth the model doesn't possess.
The Schwartz case crystallizes the central tension of this lesson. Large language models like GPT-4, Claude, and Gemini are extraordinarily capable at producing fluent, authoritative-sounding text. That fluency is the product — it's why these tools feel useful. But fluency and accuracy are independent properties. A sentence can be perfectly constructed and completely false. The challenge for any professional using AI is learning to separate the signal from the style. That skill starts with understanding what hallucinations are, why they happen structurally, and what patterns they follow — because they do follow patterns, and those patterns are learnable.
What 'Hallucination' Actually Means
Why Language Models Fabricate: The Core Mechanism
To understand hallucinations, you need a working model of what these systems actually do. GPT-4, Claude 3, and Gemini 1.5 are all transformer-based language models. They are trained on vast corpora of text — GPT-4's training data is estimated to contain over a trillion tokens, roughly equivalent to millions of books — and through that training they learn statistical relationships between words, phrases, and concepts. When you send a prompt, the model doesn't retrieve a stored answer from a database. It generates a response token by token, each token chosen based on probability distributions shaped by everything it has seen during training. The model is, at its core, a very sophisticated autocomplete engine. That framing isn't reductive — it's essential. Because autocomplete doesn't check facts. It predicts plausible continuations.
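To see the mechanism concretely, here is a minimal sketch of token-by-token generation using GPT-2 via the Hugging Face transformers library — a small open model standing in for the proprietary systems above, which follow the same generation principle. The prompt text is made up for illustration; the point is that every step is sampling from a probability distribution, with no fact-check anywhere in the loop.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The leading case on this point is"
ids = tok(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(20):
        logits = model(ids).logits[0, -1]        # a score for every token in the vocabulary
        probs = torch.softmax(logits, dim=-1)    # turned into a probability distribution
        next_id = torch.multinomial(probs, 1)    # sample the next token -- no fact-check anywhere
        ids = torch.cat([ids, next_id.unsqueeze(0)], dim=1)

print(tok.decode(ids[0]))
```

Whatever this loop prints will be fluent and citation-shaped, because that is what the training data made statistically likely — whether or not any such case exists.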
This architecture produces a specific failure mode. When a model encounters a prompt that requires specific factual knowledge — a case citation, a statistic, a person's biography — it doesn't have a mechanism to say 'I don't have reliable data on this.' Instead, it generates what a correct answer would look like. If the training data contained many legal briefs, the model knows what a case citation looks like structurally: a party name versus a party name, a year, a court, a reporter citation. It can produce something that perfectly mimics the format of a real citation while referring to a case that never existed. This is why hallucinated facts often look so convincing. They aren't random nonsense. They're contextually appropriate, structurally correct fabrications.
The problem compounds when models are asked about topics at the edge of their training data — obscure research, recent events, niche professional domains, or specific local information. In these areas, the model has seen fewer examples, so the statistical signal is weaker. The model still generates a response, but it's working from a thinner foundation. Think of it like a confident generalist who has read widely but not deeply in your specialty. They can hold a conversation that sounds expert. They can use the right vocabulary. But the specific claims — the numbers, the names, the dates — have a higher chance of being slightly wrong or entirely invented, because those details weren't well-represented in what they read. This is why hallucinations are not uniformly distributed across topics. They cluster in predictable places.
Prompt
What percentage of Fortune 500 companies had a Chief AI Officer as of 2023? Cite your source.
AI Response
As of 2023, approximately 34% of Fortune 500 companies had appointed a Chief AI Officer or equivalent role, according to a McKinsey Global Institute report on AI adoption in enterprise leadership published in Q3 2023. [Note: This statistic and the specific report cited do not exist. McKinsey has published AI adoption research, but this particular figure and report title are fabricated. The model generated a plausible-sounding source because the prompt requested one — which is itself a hallucination trigger.]
The Pharmaceutical Researcher's Near-Miss
Move from law to life sciences and the stakes shift again. In 2023, a medical affairs team at a mid-sized European pharmaceutical company began using Claude to accelerate literature reviews for regulatory submissions. The team was experienced, the prompt engineering was thoughtful, and the outputs were impressive — until a senior researcher noticed that a summary of a 2021 clinical trial referenced a patient cohort size of 1,240 participants. The actual trial had enrolled 412. The model had generated a number that was plausible for a trial of that type and scale, but wrong by a factor of three. The error was caught because the researcher happened to remember the trial. Had it been a less familiar study, the inflated cohort figure would have made it into a document sent to a European regulatory body.
This example reveals a second pattern: hallucinations are most dangerous when they are partially correct. The model had the right trial, the right authors, the right general findings. Only the number was wrong. Partial accuracy is more treacherous than complete fabrication, because it passes the casual plausibility check. When everything around a false detail is accurate, the false detail inherits credibility from its context. The pharmaceutical team updated their workflow immediately — every specific numerical claim in AI-generated summaries now requires manual verification against the source document. That policy adds friction, but it reflects a clear-eyed understanding of where the model's reliability breaks down: not in summarizing concepts, but in reproducing exact figures.
| Output Type | Hallucination Risk | Why | Verification Method |
|---|---|---|---|
| Conceptual summaries | Low | Models trained on many explanations of common concepts | Spot-check against known knowledge |
| Named citations (cases, papers, reports) | Very High | Model mimics citation format without retrieval | Check primary source directly |
| Specific statistics and percentages | High | Plausible numbers generated from context, not data | Trace to original dataset or report |
| Dates and timelines | Medium-High | Training data has temporal gaps and conflicts | Cross-reference with dated sources |
| Biographical details (people, roles) | High | Sparse or conflicting training data on individuals | Check LinkedIn, official bios, press releases |
| Step-by-step processes (general) | Low-Medium | Well-represented in training data; errors tend to be subtle | Test the process or consult domain expert |
| Recent events (post-training cutoff) | Very High | Model has no data; generates plausible-sounding updates | Use Perplexity or Bing with web search enabled |
The Marketing Director Who Shipped a Fake Study
In early 2024, a B2B SaaS company's marketing team used ChatGPT to draft a white paper on customer retention trends. The prompt asked for supporting research, and the model produced several citations — including a 'Harvard Business Review study' showing that increasing customer retention by 5% raises profits by 25–95%. That statistic is real and widely cited. But the model also generated a second citation: a 'Gartner report from Q2 2023' claiming that 68% of enterprise buyers now require vendors to demonstrate AI-powered support capabilities before signing contracts. That report does not exist. The white paper went through two rounds of internal review, was approved by the VP of Marketing, and was downloaded 1,400 times before a prospect emailed to ask for the full Gartner report. The team had no answer.
The marketing context adds a dimension the legal and pharmaceutical cases don't have: reputational risk at scale. A false citation in a legal brief harms one case. A false citation in a white paper distributed to thousands of prospects undermines the company's credibility with every reader who tries to verify it. The marketing team's mistake wasn't using AI to draft the content — that's a reasonable efficiency gain. The mistake was treating AI-generated citations as verified sources rather than as placeholders requiring confirmation. There's a meaningful difference between using AI to structure an argument and using AI to source an argument. The former is low-risk. The latter requires human verification at every specific claim.
The Citation Placeholder Rule
Treat every citation an AI tool generates as a placeholder, not a source: it earns a place in your document only after you have opened the primary source and confirmed it exists and says what the model claims.
What This Means in Practice
The three cases above — a law firm, a pharma team, and a marketing department — span different industries, different use cases, and different consequences. But they share a structural similarity: in each case, a professional delegated a verification function to a tool that is architecturally incapable of performing it. Language models generate plausible text. They do not retrieve verified facts. Those are different operations, and conflating them is the root cause of every professional hallucination incident. The practical implication is not 'use AI less.' It's 'use AI for the right operations.' Drafting, restructuring, summarizing well-understood concepts, generating options, improving clarity — these are tasks where the model's generative capability is an asset and the hallucination risk is manageable. Sourcing specific facts, verifying numbers, and confirming citations are tasks where you need a different tool or a human.
The risk is also not static. Hallucination rates vary significantly across models and have been improving. GPT-4 hallucinates measurably less than GPT-3.5 on factual recall benchmarks. Perplexity AI reduces citation hallucinations by grounding responses in live web search results — though it introduces new risks around source quality. Claude 3 Opus performs well on long-document summarization with low fabrication rates when the source material is provided in the context window. Knowing which tool to use for which task is part of the skill set. A model with web access is better for recent facts. A model given the source document directly is better for accurate summarization. A model working from memory alone is worst for specific factual claims. These are not opinions — they reflect how the underlying architectures process information differently.
There's also a prompt-level dynamic that professionals underestimate. The way you phrase a request influences the model's hallucination behavior. Asking 'What does the research say about X?' invites the model to generate research-sounding content whether or not it has reliable data. Asking 'Summarize only what you know with high confidence about X, and flag anything uncertain' produces a meaningfully different response — not perfect, but more calibrated. Asking for citations in the same prompt as substantive content increases fabrication risk, because the model is now solving two tasks simultaneously: generating the argument and generating the sourcing. Separating those steps — draft first, then ask for sources separately and verify each — reduces the chance that false citations get embedded into content before anyone looks closely.
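One way to operationalize that separation is a two-call workflow: draft first, then request sources as unverified leads. The sketch below uses the OpenAI Python client; the model name and prompt wording are illustrative assumptions, and the final verification step is deliberately left to a human.

```python
from openai import OpenAI  # assumes the openai package (v1+) and an API key in the environment

client = OpenAI()

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o",  # illustrative; substitute whichever model you actually use
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# Step 1: draft the argument with no request for sourcing at all.
draft = ask(
    "Draft a short argument that customer retention drives SaaS profitability. "
    "Do not include citations, statistics, or named studies."
)

# Step 2: ask for candidate sources separately, framed as leads to check, not facts.
leads = ask(
    "Here is a draft argument:\n\n" + draft + "\n\n"
    "List published reports or papers a researcher could check to support or refute it. "
    "Treat every item as an unverified lead and say what to search for."
)

# Step 3 is human: verify each lead against the primary source before it enters the document.
```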
Goal: Build a personal, domain-specific understanding of where your AI tool's hallucination risk is highest, based on direct evidence rather than general advice.
1. Open ChatGPT (GPT-4), Claude, or Gemini — whichever tool you use most in your work.
2. Choose a topic directly relevant to your professional domain — a regulation, a market trend, a research area, or a competitor.
3. Send this prompt: 'Give me three recent statistics about [your topic] and cite the source for each one.'
4. Copy the three statistics and their citations into a separate document.
5. For each citation, spend 90 seconds searching for it: try Google Scholar, the publisher's website, or a direct Google search of the title and author.
6. Record whether each citation: (a) exists exactly as described, (b) exists but with different details, or (c) cannot be found.
7. For any statistic you can verify, check whether the number the model gave matches the number in the actual source.
8. Note which category of error appeared most often: fabricated source, real source with wrong details, or accurate citation.
9. Write two sentences summarizing what this tells you about where to apply extra scrutiny when using this tool in your work.
Key Principles from These Cases
- Fluency is not accuracy. A model that writes like an expert is not necessarily right like one. Polished prose is a product of training on good writing — it is independent of factual correctness.
- Asking a model to verify its own output doesn't work. The confirmation uses the same generative process as the original claim. It cannot access external ground truth.
- Hallucinations cluster in predictable places: specific citations, exact statistics, biographical details, and anything requiring knowledge of recent or niche events.
- Partial accuracy is more dangerous than complete fabrication. When most of an output is correct, false details inherit credibility from the accurate context surrounding them.
- The prompt structure influences hallucination risk. Requests that bundle content generation with citation generation produce more fabrications than requests that separate the two tasks.
- Different tools have different risk profiles. Perplexity with web search, Claude with a document in context, and GPT-4 working from memory are three different reliability environments.
- The right response is calibrated use, not avoidance. AI tools add real value in drafting, structuring, and synthesizing — the skill is knowing which operations require human verification.
Takeaways
- Language models generate plausible text — they do not retrieve verified facts. These are fundamentally different operations.
- The Schwartz case, the pharma near-miss, and the SaaS white paper all share one root cause: delegating verification to a tool incapable of performing it.
- Hallucination risk is highest for citations, exact numbers, dates, biographical details, and post-training-cutoff events.
- Use the output type risk table as a triage guide: not all AI content needs the same scrutiny level.
- Treat every AI-generated citation as a placeholder until you've confirmed it exists in a primary source.
- Prompt structure matters: separate content generation from citation requests, and ask the model to flag uncertainty explicitly.
- Your tool choice matters: grounded tools (Perplexity, models with documents in context) perform differently than models working from training memory alone.
When the Model Sounds Like an Expert (But Isn't)
In 2023, the law firm Levidow, Levidow & Oberman submitted a court brief in Mata v. Avianca that cited six supporting cases — complete with docket numbers, judges' names, and detailed rulings. Every single case was fabricated by ChatGPT. The attorney, Steven Schwartz, had asked the model to find supporting precedents, and it had obliged with extraordinary confidence. The brief reached a federal judge before anyone checked. Schwartz and his firm were fined $5,000 and publicly sanctioned. The cases had never existed. The citations had never existed. The court filings they referenced had never existed. What made this catastrophic wasn't just the error — it was that every fabricated citation looked exactly like a real one.
This case became a landmark because it exposed a specific and dangerous hallucination pattern: the model wasn't guessing randomly. It was pattern-completing. Legal citations follow a rigid format — case name, year, court, docket number — and GPT-4 had absorbed thousands of them during training. When asked for cases that supported a particular legal argument, it generated structurally perfect citations that fit the argument's logic. The model was doing precisely what it was trained to do: produce the most statistically likely continuation of a legal research prompt. The problem is that 'statistically likely' and 'factually real' are entirely different things.
The principle this surfaces is one of the most important in AI literacy: fluency is not accuracy. A model that produces beautifully formatted, logically structured, professionally toned output has simply learned what good output looks like — not what true output looks like. This distinction collapses in domains where format and credibility are tightly linked: law, medicine, finance, academic research. The more a field has standardized how correct information is presented, the better an AI model becomes at mimicking that presentation regardless of whether the underlying content is real.
Why Legal and Academic Citations Are Especially Vulnerable
Both formats are rigid and heavily represented in training data, so a model can reproduce their structure flawlessly while inventing the underlying case or paper — and because the citation looks checkable, readers rarely check it.
The Confidence Mechanism: Why Models Don't Know What They Don't Know
To understand why hallucinations happen with such apparent confidence, you need to understand what a large language model actually is at its core. It's a next-token predictor. Given everything that came before — including your prompt and everything it has generated so far — it calculates a probability distribution over every possible next word and picks from the most likely options. There is no internal fact-checking module. There is no lookup against a verified database of true statements. There is no uncertainty flag that fires when the model is about to invent something. The architecture simply does not contain a mechanism for distinguishing 'I know this' from 'I'm extrapolating this.'
This is why the hedging language you sometimes see — phrases like 'I believe' or 'you may want to verify this' — is itself a learned pattern, not a genuine signal of model uncertainty. Claude and ChatGPT have been fine-tuned through reinforcement learning from human feedback (RLHF) to add caveats in situations where human raters preferred them. But the model doesn't actually know when it's uncertain. It knows when caveats were rewarded during training. This means a model can produce a completely fabricated statistic with no hedge at all, because in training, confident-sounding statistics were often rated as more helpful than hedged ones. The calibration is imperfect at best.
What this means practically: the presence or absence of a disclaimer tells you almost nothing about whether the content is accurate. A model that says 'I'm not entirely sure, but I believe the WHO reported a 34% increase...' may be more wrong than one that states a figure flatly — or equally wrong. Professionals who learn to read AI output treat confidence as a style choice the model makes, not as information about reliability. The only reliable signal is external verification. This is uncomfortable for people used to trusting authoritative-sounding sources, but it's the only rational stance given how these systems are built.
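The same point can be made concretely: generation parameters such as temperature reshape how the model samples from its probability distribution, but none of them add a lookup against ground truth. A minimal sketch using the OpenAI Python client — the model name and question are illustrative, not a recommendation:

```python
from openai import OpenAI

client = OpenAI()  # assumes an API key in the environment

resp = client.chat.completions.create(
    model="gpt-4o",    # illustrative; use whichever model you actually work with
    temperature=0.2,   # lower temperature means less random sampling -- not more factual output
    messages=[{"role": "user", "content":
               "Name the key appellate case on tolling claims in an airline bankruptcy."}],
)
# The reply may be confident and correct, confident and fabricated, or hedged --
# the tone tells you nothing about which.
print(resp.choices[0].message.content)
```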
Prompt
You just told me that McKinsey published a 2022 report showing that 67% of Fortune 500 companies had a dedicated AI ethics officer. Can you give me the exact report title and a link to it?
AI Response
I need to correct myself here. I cannot verify that specific statistic or report, and I may have generated that figure incorrectly. McKinsey does publish research on AI adoption and governance, but I don't have reliable access to confirm whether that specific report exists or whether that figure is accurate. I'd recommend searching McKinsey.com directly for their AI governance research, or checking their Global Institute publications. I should not have stated that figure with confidence — I apologize for the potential confusion.
A Healthcare Example: When Confident Errors Have Real Stakes
In 2023, researchers at Stanford tested several large language models on medical licensing exam questions and clinical vignettes. The models performed impressively on standard cases — GPT-4 passed the USMLE with scores around 86-90%. But when researchers probed edge cases, rare drug interactions, and recently updated clinical guidelines, the failure mode was stark. Models would cite drug dosages that were outdated, reference treatment protocols that had been superseded, and occasionally invent clinical studies that didn't exist. The dangerous part wasn't the errors themselves — it was that they arrived packaged inside otherwise accurate, well-structured clinical reasoning.
A physician using an AI tool to quickly review treatment options for a complex patient faces a specific cognitive trap: the first 90% of the response is accurate and useful, which builds trust, and then a fabricated or outdated detail slips into the final recommendation. The physician who skims the output rather than reading it critically may catch the logical flow and miss the erroneous specific. This is the 'trust transfer' problem — confidence earned in one part of a response gets applied to the whole response. Healthcare organizations piloting tools like Microsoft Copilot for Health and Nuance DAX are learning to build verification checkpoints specifically at the claim level, not the response level.
Hallucination Patterns Across Different Output Types
| Output Type | Common Hallucination Pattern | Why It Happens | Detection Method |
|---|---|---|---|
| Legal citations | Fabricated case names, docket numbers, rulings | Legal citation format is highly learnable; model pattern-completes | Verify on Westlaw, CourtListener, or Google Scholar |
| Academic references | Real author names attached to fake papers; invented DOIs | Academic citation structure is rigid and well-represented in training data | Search DOI directly; check PubMed or Google Scholar |
| Statistics and data | Plausible-sounding percentages attributed to real organizations | Numbers follow rhetorical patterns; model fills gaps with likely figures | Trace to primary source; check the original report |
| Company facts | Wrong founding dates, revenue figures, executive names | Company info changes; training data has a cutoff and contains errors | Cross-check LinkedIn, Companies House, official filings |
| Product specifications | Incorrect version numbers, features, pricing | Product details update frequently after training cutoff | Check official product pages directly |
| Historical events | Correct event, wrong date, wrong participant, or wrong outcome | Events are well-known but details are confused across similar events | Verify with primary historical sources or encyclopedias |
| Quotes and attribution | Real person, fabricated quote — or real quote, wrong person | Quotes are associated with famous names during training regardless of source | Search the exact quote in quotation marks; check Wikiquote |
The Marketing Analyst's Trap: Competitive Intelligence That Isn't
A growth marketing team at a mid-size SaaS company used ChatGPT to compile a competitive analysis of five rival products. The output was polished — feature comparisons, pricing tiers, integration lists, customer segment positioning. They built a strategy deck around it. Three months later, in a sales call, a prospect pointed out that one of the competitors had pivoted their pricing model six months earlier and no longer offered the tier the deck described. A second competitor had been acquired. The AI had produced accurate-sounding competitive intelligence that was partly outdated and partly invented, and no one had verified it because the format looked like research.
This example illustrates a hallucination category that's easy to overlook: temporal drift combined with plausible fabrication. ChatGPT's training data has a cutoff — late 2023 for the GPT-4 Turbo models available in early 2024 — and knowledge quality degrades in the months approaching any cutoff. Perplexity AI and Gemini with Google Search grounding do better here because they can retrieve current web content, but even grounded models can misread or misrepresent sources. For competitive intelligence specifically, the rule is simple: anything that could have changed in the last 12 months needs live verification, regardless of which tool generated it.
The 'Source or Skip' Rule for Factual Claims
If you cannot trace a specific factual claim to a primary source within about a minute, either cut it or flag it explicitly as unverified — don't ship it on the model's say-so.
What Hallucinations Reveal About How to Prompt Better
Understanding hallucination mechanics isn't just defensive knowledge — it directly improves how you construct prompts. The Levidow case, the Stanford medical study, and the marketing team's competitive deck all share a structural prompt problem: they asked the model to retrieve specific facts from memory without giving it any factual material to work from. When you ask a model to recall something specific — a citation, a statistic, a product detail — you're forcing it into a high-hallucination regime. The model has no choice but to generate from pattern rather than from source. The output will look the same whether it's accurate or fabricated.
The fix is a prompting strategy called 'closed-context generation.' Instead of asking the model to remember facts, you paste in the source material and ask the model to reason over it. A lawyer who pastes in actual case summaries and asks ChatGPT to identify which ones support a particular argument gets reliable output, because the model is now doing pattern recognition and synthesis over real text rather than generating from memory. A marketer who pastes in a competitor's current pricing page and asks for a comparison gets accurate output. The model's reasoning capabilities are strong. Its recall capabilities are not. Build prompts that use the former and avoid depending on the latter.
This also explains why retrieval-augmented generation (RAG) systems — the architecture behind Perplexity, Bing AI, and enterprise tools like Glean — produce fewer hallucinations on factual queries. They first retrieve relevant documents, then generate responses grounded in that retrieved text. The generation step is still probabilistic, but it's constrained by real source material. When you're building workflows around AI tools, choosing a RAG-enabled tool for fact-dependent tasks and a pure generation model for creative or structural tasks is a meaningful architectural decision, not just a product preference.
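To make the retrieve-then-generate pattern concrete, here is a minimal closed-context sketch. The embedding step and the final model call are placeholders for whatever embedding and chat models you actually use — the function names and prompt wording are assumptions for illustration, not any specific product's API:

```python
import numpy as np

def retrieve(query_vec, doc_vecs, docs, k=3):
    """Return the k documents most similar to the query (cosine similarity)."""
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9
    )
    top = np.argsort(sims)[::-1][:k]
    return [docs[i] for i in top]

def grounded_prompt(question, sources):
    """Build a prompt that constrains generation to the retrieved text."""
    context = "\n\n".join(f"[Source {i + 1}]\n{s}" for i, s in enumerate(sources))
    return (
        "Answer using ONLY the sources below. If they do not contain the answer, "
        "say so explicitly rather than guessing.\n\n"
        f"{context}\n\nQuestion: {question}"
    )

# Usage: embed your documents and query with any embedding model, call retrieve(),
# then send grounded_prompt(...) to the chat model of your choice.
```

The generation step is still probabilistic, so the instruction to answer only from the sources reduces fabrication rather than eliminating it — which is why the retrieval quality and a human check on the final claims still matter.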
Goal: Build a personal map of where hallucinations have already entered your professional work, and establish a repeatable triage habit before that risk scales.
1. Open your most-used AI tool (ChatGPT, Claude, Gemini, or Copilot) and retrieve or generate three recent outputs you've used professionally — a summary, a draft, or a research output works best.
2. For each output, highlight every factual claim: statistics, citations, dates, company names, product details, quotes, and named studies or reports.
3. Using the comparison table from this lesson, identify which hallucination category each claim falls into (e.g., 'statistic attributed to an organization' or 'product specification').
4. For each flagged claim, spend no more than 60 seconds attempting to find the primary source — use Google, the organization's official website, or a database like PubMed or Google Scholar.
5. Record what you find in three columns: Claim | Verified / Unverifiable / Wrong | Source Found (or not).
6. For any claim you cannot verify in 60 seconds, mark it 'unverifiable' and decide whether to remove it, replace it with a verified alternative, or explicitly flag it as unconfirmed in your document.
7. Write one sentence for each output describing the highest-risk claim type you found and what verification step you'll build into your workflow going forward.
8. Share your three-row findings table with a colleague who also uses AI tools and discuss whether your risk profiles differ by use case.
Key Patterns From These Cases
- Fluency is not accuracy — a response that sounds expert has only learned what expertise sounds like, not what truth looks like.
- Structured formats accelerate hallucination risk — citations, statistics, and specifications are high-risk because their format is highly learnable and their accuracy is hard to spot-check visually.
- Confidence signals are trained behaviors, not uncertainty indicators — hedging language reflects RLHF rewards, not genuine model self-knowledge.
- Trust transfer is the hidden danger — accuracy in one part of a response creates unearned confidence in the rest of it.
- Temporal drift compounds fabrication — anything that could have changed since the model's training cutoff is doubly unreliable.
- Closed-context prompting dramatically reduces hallucination — models reason well over text you provide; they recall poorly from training memory alone.
- RAG-enabled tools reduce but don't eliminate hallucination — grounding helps, but generation over retrieved text still carries risk if the retrieval step surfaces wrong or misrepresented sources.
What to Carry Forward
- Check at the claim level, not the response level — one accurate paragraph doesn't validate the next one.
- The 60-second source rule keeps verification practical: traceable in under a minute, or flag it.
- Paste sources in; don't ask the model to remember them — this is the single highest-leverage prompt habit for factual accuracy.
- Know your tool's training cutoff — GPT-4 Turbo's knowledge ends in late 2023 and Claude 3.5 Sonnet's in April 2024; for live information, use Perplexity or Gemini with Search.
- High-stakes domains (legal, medical, financial, academic) require systematic verification workflows, not just spot-checking.
- Format credibility is a trap — a beautifully formatted citation that doesn't exist is more dangerous than a vague claim, because it looks checkable and people don't check it.
In 2023, the legal team at a mid-sized Australian law firm used ChatGPT to draft a research memo on precedents for a contract dispute. The output looked authoritative — case names, court levels, citation years, all formatted correctly. Three of the six cases it cited did not exist. The attorneys caught two of them during review. The third made it into a client briefing before a partner noticed the citation returned no results in Westlaw. The firm lost four billable hours unwinding the error and spent considerably more rebuilding client trust. The failure wasn't that the AI was wrong. The failure was that the output looked exactly right.
That story captures the deepest problem with hallucinations: they are not random noise. They are plausible, well-formatted, contextually appropriate-sounding content that happens to be false. The legal team wasn't careless — they were working with an output that mimicked the structure and tone of real legal research perfectly. This is what makes hallucinations uniquely dangerous compared to other software errors. A broken formula in Excel throws an error. A hallucinating language model throws a citation that looks like it came from a senior associate.
Why legal and medical outputs are highest risk
The principle the law firm learned the hard way is this: the more specialized and verifiable a domain is, the more dangerous an unverified AI output becomes. General summaries can be roughly right and still useful. Specific citations, figures, and proper nouns either exist or they don't. When a model invents a case name, it doesn't produce gibberish — it produces something that fits the pattern of real case names perfectly, because it has trained on thousands of them. The model knows what a case citation looks like. It does not know which ones are real.
This same dynamic plays out in finance. In early 2024, a financial analyst at a boutique investment firm used Perplexity to research a competitor's recent acquisitions. Perplexity cited a source for each claim, which created a false sense of security. Two of the acquisition figures were outdated; one was fabricated — a deal attributed to the wrong company entirely. The analyst's report went to a client before the error was caught. Perplexity's citation feature reduces hallucination risk significantly compared to base ChatGPT, but it doesn't eliminate it. Sources can be misread, misattributed, or the model can still confabulate details around a real source link.
The lesson from the investment firm isn't that Perplexity is unreliable. It's that citation presence and factual accuracy are different things. A cited hallucination is still a hallucination. The citation creates the appearance of verification without the substance of it. Professionals who understand this distinction use AI-generated citations as a starting point for verification, not as verification itself. Click through. Check the primary source. Confirm the number actually appears where the model says it does.
| Output Type | Hallucination Risk | Verification Method | Tool Guidance |
|---|---|---|---|
| Legal case citations | Very High | Search Westlaw, LexisNexis, or Google Scholar directly | Never trust ChatGPT/Claude citations without checking |
| Named statistics with percentages | High | Trace to original report or dataset | Ask model to name the source, then verify that source |
| Product or company facts | Medium-High | Check company website, press releases, SEC filings | Perplexity reduces but doesn't eliminate risk |
| Summarised concepts | Medium | Cross-reference with a textbook or trusted overview | Claude tends to be more cautious with hedging language |
| Structural or logical reasoning | Lower | Evaluate the argument's internal logic yourself | GPT-4 and Claude both perform well here |
| Creative or generative content | Low (factual risk) | Read for tone/fit rather than fact-checking | All major models suitable; no verification needed |
The pattern also appears in marketing, though the stakes feel lower — until they don't. A content strategist at a SaaS company used Claude to generate a competitive comparison blog post. Claude confidently described a competitor's pricing tier and feature set. The pricing was six months out of date, and one feature it attributed to the competitor had been deprecated. The post went live, the competitor's team noticed, and a public correction followed on LinkedIn. Nobody lost a court case. But the company's credibility with a technical audience took a real hit, and the content team spent a week in damage control.
The strategist's mistake wasn't using Claude for competitive research. It was using Claude as the final source rather than as a first draft. Claude is excellent at structuring a comparison, generating the categories to compare, and producing clean prose. It is not a live database of competitor feature sets. The correct workflow: use Claude to build the structure and draft the narrative, then populate every specific claim — pricing, features, dates — from the competitor's own website or recent press. AI handles the thinking framework; humans handle the factual specifics.
The 'named thing' rule
If an output contains a named thing — a specific person, case, company, product, statistic, or date — treat it as unverified until you have checked it against a primary source.
What this means practically is that your verification effort should scale with specificity and stakes. A paragraph summarising how machine learning models are trained needs no external check — the concepts are stable and widely documented. A paragraph claiming that a specific company raised $47M in Series B funding in Q3 2022 needs thirty seconds on Crunchbase. The mental model to build is: the more specific the claim, the cheaper it is to verify, and the more expensive it is to get wrong. Generalities are slow to check and low risk. Specifics are fast to check and high risk.
You also now know that hallucinations aren't evenly distributed across model types or prompt styles. Retrieval-augmented tools like Perplexity reduce fabrication by grounding outputs in retrieved documents. Models like Claude are trained to hedge uncertainty more visibly than GPT-3.5, which tends toward confident-sounding answers regardless of accuracy. GPT-4 outperforms GPT-3.5 on factual tasks significantly. Asking a model to cite its reasoning, acknowledge uncertainty, or flag what it doesn't know all reduce hallucination rates measurably. The way you prompt shapes the reliability of what comes back.
The final thing to internalise is that hallucination is not a bug being fixed — it's a structural feature of how generative models work. These systems predict plausible continuations of text. They are not databases. They are not search engines. They will always generate plausible-sounding content in domains where they lack grounded knowledge. The professionals who get the most value from these tools are not the ones who trust them most. They are the ones who understand the failure modes well enough to route the right tasks to AI and verify the right outputs before acting on them.
Prompt
I'm researching the current pricing tiers for Salesforce Sales Cloud. Please give me what you know, but explicitly flag any information that may be outdated or that you're uncertain about, and tell me what I should verify directly with Salesforce.
AI Response
Based on my training data (which has a knowledge cutoff and may not reflect current pricing): Salesforce Sales Cloud has historically offered Essentials, Professional, Enterprise, and Unlimited tiers. As of my last knowledge update, prices ranged from approximately $25 to $300+ per user per month. ⚠️ Flags to verify directly: Current prices change frequently and Salesforce often adjusts tiers — check salesforce.com/pricing for live figures. Any specific add-on costs (Einstein AI features, additional storage) are particularly prone to change. I'd also recommend checking whether the 'Essentials' tier still exists as Salesforce has restructured lower tiers in recent years.
Goal: Produce a personalised, role-specific checklist that maps your actual AI use cases to their hallucination risk levels and verification steps — a working document you use every week.
1. Open a document or note-taking app — this becomes a reference you keep.
2. Write down three AI tasks you already use or plan to use regularly (e.g., 'drafting client emails', 'researching competitors', 'summarising reports').
3. For each task, identify the specific output types it produces using the table in this lesson (e.g., named statistics, structural reasoning, creative content).
4. Assign each task a risk level — High, Medium, or Low — based on how many 'named things' the output typically contains.
5. For each High or Medium task, write one specific verification step you will take (e.g., 'Check every cited statistic against the original report URL').
6. Write a reusable prompt instruction you will add to high-risk prompts — model it on the uncertainty-flagging example above.
7. Add a column: 'Best tool for this task' — choose from ChatGPT, Claude, Perplexity, or Gemini based on what you've learned about their hallucination profiles.
8. Save this as 'AI Verification Checklist' and put it somewhere you'll open it before starting AI-assisted work.
9. Review and update it after the next three times you catch an error or near-miss in an AI output.
- Hallucinations look correct by design — they mimic the structure and tone of accurate outputs, which is why visual inspection alone fails.
- The highest-risk outputs contain named things: specific people, cases, companies, statistics, and dates. These are fast to verify and expensive to get wrong.
- Citation presence does not equal factual accuracy. Tools like Perplexity reduce hallucination risk but do not eliminate it — always trace citations to primary sources.
- Verification effort should scale with specificity and stakes. General conceptual summaries carry low risk; specific claims carry high risk and take seconds to check.
- Prompt design directly affects hallucination rates. Asking models to flag uncertainty, cite reasoning, and acknowledge knowledge limits produces more reliable outputs.
- Hallucination is structural, not accidental. These models predict plausible text — they are not querying a database. Understanding this shapes how you assign tasks to AI tools.
- Different models have different hallucination profiles. GPT-4 outperforms GPT-3.5 on factual accuracy; Claude tends to hedge more visibly; Perplexity grounds outputs in retrieved sources.
- Treat every AI-generated named thing as a hypothesis until you verify it against a primary source.
- Use the output type risk table to triage what needs checking and what doesn't — not everything requires verification.
- Build uncertainty-flagging into your prompts for any high-stakes or domain-specific task.
- Choose your AI tool based on the task's hallucination risk profile — Perplexity for research, Claude for nuanced reasoning, GPT-4 for complex analysis.
- Your verification workflow is as important as your prompting skill — both are learnable professional competencies.
A colleague says: 'The Perplexity output is fine — it cited three sources for every claim.' What's the most accurate response?
You ask ChatGPT to summarise how transformer neural networks work for a non-technical presentation. What level of verification does this output require?
The Australian law firm's experience demonstrates which core principle about AI hallucinations?
Which prompt addition most directly reduces hallucination risk in a high-stakes output?
A SaaS content strategist uses Claude to write a competitive comparison post. Claude produces clean, well-structured content comparing pricing tiers and features. What is the correct workflow?
