Chapter 08 · Module 01 · Beginner–Intermediate · 26–30 min

Chapter 8: Why LLMs Hallucinate — The PM Version

Why LLMs sound confident when they are wrong — and how product teams design around hallucination.

[Figure: Generate, Guess, Ground, Verify. Caption: Plausible language is not proof; product layers make answers accountable.]

Introduction

In Chapter 2, we saw how transformers became Large Language Models.

In Chapter 3, we covered tokens and context windows — the working memory of an AI system.

In Chapter 4, we introduced AI safety, RLHF, and Constitutional AI — why alignment matters and how human feedback shapes behavior.

In Chapter 5, we went deeper into InstructGPT and the RLHF pipeline that turned base models into instruction-following assistants.

In Chapter 6, we compared prompting, fine-tuning, RAG, and tools — the layers that shape what the model knows and how it behaves.

In Chapter 7, we covered temperature, top-p, and sampling — why the same prompt can produce different answers and how randomness affects reliability.

Now we address one of the biggest concerns in AI products:

Hallucination.

Hallucination is when an AI model generates content that sounds valid, confident, and well-written, but is not actually grounded in the provided context or verified world knowledge.

For product managers, hallucination is not just a model defect. It is a product risk. It affects user trust, decision quality, compliance, workflow safety, customer experience, auditability, and adoption.

A chatbot hallucinating a fun fact is annoying. A claims assistant hallucinating a policy clause is dangerous. A legal assistant hallucinating a case reference is risky. A medical assistant hallucinating a diagnosis is unacceptable.

This chapter explains why hallucinations happen and how product teams should design around them — not by telling the model to “be accurate,” but by building architecture that makes truthful behavior easier and unsupported claims harder.

The simple PM version

The model generates plausible language, not guaranteed truth.
Ground answers in sources. Verify claims. Allow uncertainty.
Treat hallucination as a product discipline, not a one-time model fix.

1. What Is Hallucination?

Hallucination usually means the model produces content that is fabricated, unsupported, inconsistent, unfaithful to the provided context, or presented as fact without evidence.

A hallucination is not just any mistake. A spelling error or a formatting issue is not necessarily a hallucination.

What defines a hallucination is that the model states something as if it were true, but the claim is not grounded in the available evidence.

Examples

User request | Hallucinated output
“Summarize this policy document.” | Adds a coverage benefit not in the document
“List missing claim documents.” | Mentions a document not required for this case
“Cite the source.” | Invents a source or clause number
“Summarize this legal notice.” | Adds an allegation not present in the notice
“Write a biography.” | Invents awards, dates, or affiliations
“Explain this API error.” | Makes up a nonexistent function or config

The dangerous part is not only that the output is wrong. The dangerous part is that it may sound polished and confident.

2. Two Important Types of Hallucination

A useful distinction separates hallucinations by where the failure happens — in the provided context or in world knowledge.

Type | Simple meaning
In-context hallucination | Output conflicts with or adds unsupported information beyond the provided context
Extrinsic hallucination | Output is not grounded in external world knowledge or verified sources

In-context hallucination

This happens when the model is given source material but produces something not supported by that source.

Example: you give the model a hospital discharge summary. The model says “The patient had diabetes.” But the discharge summary does not mention diabetes. That is an in-context hallucination — the model failed to stay faithful to the provided document.

Extrinsic hallucination

This happens when the model makes a claim about the world that is not grounded in verified external knowledge.

Example: the user asks “Who founded this company?” The model confidently gives a name, but the name is wrong. That is an extrinsic hallucination.

Hallucination type | Product fix
In-context | Better document grounding, citation, extraction validation
Extrinsic | RAG, search, trusted knowledge base, refusal when unknown
Both | Evaluation, human review, source attribution, confidence controls

PM takeaway

You cannot fix all hallucinations with one generic prompt. The fix depends on the type.

3. Why Hallucination Happens: The Model Is Predicting Text, Not Verifying Truth

LLMs are trained to generate likely text. Their core training objective is predicting text, not verifying claims against reality. This is the root problem.

The model asks: “What response is likely given this prompt?” It does not automatically ask: “Can I prove this claim from trusted evidence?”

A fluent answer can be statistically likely and factually wrong.

PM analogy

Imagine a very well-read employee who is excellent at writing but does not always check the source before answering. If they are unsure, they may still produce something that sounds plausible. The issue is not lack of language skill — it is lack of grounding.

For factual products, the model should not be treated as the final source of truth. It should be connected to trusted documents, databases, search, policy systems, rule engines, APIs, or human reviewers.

PM takeaway

The model should generate from evidence, not from vibes.

4. Cause 1: Pre-Training Data Can Be Missing, Wrong, or Outdated

LLMs are trained on massive data collected from books, websites, code, articles, forums, documentation, and other sources. That data is not perfect.

Data issue | Product impact
Missing information | Model may guess
Outdated information | Model may give old answer
Incorrect information | Model may learn wrong pattern
Contradictory information | Model may blend conflicting claims
Biased information | Model may reproduce bias
Low-quality content | Model may learn unreliable patterns

If the model learned an old version of a policy, law, product description, or company detail, it may continue to answer from that stale pattern — and may not know the information is stale.

Need | Better design
Latest policy | Retrieve latest approved policy
Current customer data | Query system of record
Latest regulation | Search trusted source
Updated pricing | API lookup
Internal SOP | Knowledge base retrieval
Claim status | Workflow system integration

PM takeaway

For any product relying on current or company-specific information, do not depend on model memory. Model memory is not a reliable knowledge system.

5. Cause 2: The Model May Not Know What It Does Not Know

A major hallucination driver is poor uncertainty handling. The model may be asked a question where the correct answer is “I don’t know.” But instead, it generates a plausible answer.

Language models are often optimized to be helpful and to answer the question asked, so their training may not reward admitting uncertainty.

Example

User asks: “What was the exact internal approval reason for this claim?”

If the model does not have access to the claim system, the correct answer is: “I do not have access to that information.” A hallucinating model may instead infer a reason from context and present it as fact.

Situation | Desired AI behavior
Source missing | Say source is missing
Evidence weak | State uncertainty
Data unavailable | Ask for access or upload
Conflicting sources | Show conflict
Not authorized | Refuse or escalate
High-risk decision | Ask for human review

PM takeaway

A model that always answers is not always helpful. Sometimes the safest and most accurate answer is: “I cannot determine this from the available information.”

6. Cause 3: Fine-Tuning New Knowledge Can Create Risk

Fine-tuning is useful for teaching behavior. But using fine-tuning to insert new factual knowledge can be risky. Fine-tuning usually uses much less compute and much smaller datasets than pre-training. The model may not reliably integrate new facts in a stable way.

Problem | Risk
Some rules are outdated | Model learns stale behavior
Rules are incomplete | Model fills gaps
Examples conflict | Model becomes inconsistent
Rules change later | Fine-tuned model becomes outdated
Training data is small | Model overfits
Model blends old and new knowledge | Hallucination risk increases

Problem | Better first solution
Need latest policy | RAG
Need live claim data | API / tool
Need consistent summary format | Fine-tuning may help
Need brand tone | Fine-tuning may help
Need current regulation | Search / retrieval
Need decision rule | Rule engine / workflow
Need factual answer | Retrieval + citation

This connects directly to Chapter 6. Fine-tuning is not a knowledge management system.

7. Cause 4: Sampling and Randomness Can Increase Factual Drift

In Chapter 7, we covered temperature, top-p, and sampling. Those settings matter for hallucination.

At higher temperature or top-p, the model is more likely to pick less probable tokens. That can be useful for brainstorming, but in factual workflows the added randomness can increase the risk of unsupported claims.

Example

Prompt: “Summarize this policy clause.” With higher randomness, the model may paraphrase too freely, add interpretations, overgeneralize, or make the summary sound more complete than the clause allows.

Use case | Sampling direction
Data extraction | Very low randomness
Policy summary | Low randomness
Claims decision support | Low randomness
Medical / legal / finance | Low randomness
Marketing copy | Medium to high randomness
Brainstorming | Higher randomness acceptable

PM takeaway

Sampling settings will not eliminate hallucination. But poor sampling settings can make hallucination worse.
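
As a rough illustration of the table above, a product could keep sampling presets per use case instead of one global setting. The preset names and every numeric value below are illustrative assumptions, not recommendations from any model provider. Suitable values depend on the model and should be confirmed through evaluation.

```python
# Hypothetical sampling presets keyed by use case. The numbers are
# placeholder assumptions for illustration, not vendor recommendations.
SAMPLING_PRESETS = {
    "data_extraction": {"temperature": 0.0, "top_p": 1.0},
    "policy_summary": {"temperature": 0.2, "top_p": 0.9},
    "claims_decision_support": {"temperature": 0.2, "top_p": 0.9},
    "marketing_copy": {"temperature": 0.8, "top_p": 0.95},
    "brainstorming": {"temperature": 1.0, "top_p": 0.95},
}

# Unknown use cases fall back to the most conservative preset.
DEFAULT_PRESET = SAMPLING_PRESETS["data_extraction"]


def sampling_for(use_case: str) -> dict:
    """Return a copy of the sampling settings for a use case."""
    return dict(SAMPLING_PRESETS.get(use_case, DEFAULT_PRESET))


print(sampling_for("policy_summary"))
print(sampling_for("unknown_feature"))
```

Defaulting unknown features to the lowest-randomness preset is a design choice: a new feature has to opt in to creativity rather than inherit it by accident.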

8. Cause 5: Long Outputs Can Drift Away from Facts

Hallucination risk can increase as a generated response becomes longer. Every sentence creates more opportunities to add unsupported claims.

Example: user asks for a detailed founder biography. The model may correctly state the name and company in the first paragraph, then invent awards, education, previous roles, funding milestones, or media mentions. More claims mean more verification burden.

PM takeaway

For factual outputs, prefer concise answers, atomic facts, cited claims, structured fields, and source-backed summaries. Long is not automatically better — in factual AI products, long can be dangerous.
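
A back-of-the-envelope calculation shows why length matters. The sketch below assumes each claim is correct independently with the same probability, which is a simplification (real errors tend to cluster), and the 98% figure is a made-up illustration rather than a measured accuracy.

```python
def chance_all_claims_supported(per_claim_accuracy: float, num_claims: int) -> float:
    """Probability that every claim is correct, assuming independent claims."""
    return per_claim_accuracy ** num_claims


# Illustrative assumption: each individual claim is right 98% of the time.
for n in (1, 5, 20, 50):
    p = chance_all_claims_supported(0.98, n)
    print(f"{n:>2} claims -> {p:.2f} chance of zero errors")
```

Under these assumptions, an answer with 20 claims is fully correct only about two times in three, even though each individual claim is usually right.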

9. Cause 6: RAG Can Reduce Hallucination, But Bad RAG Can Still Fail

RAG (Retrieval-Augmented Generation) reduces hallucination by giving the model relevant external context before it answers. But RAG is not magic. Bad retrieval can still produce bad answers.

Failure | What happens
Wrong document retrieved | Model answers from irrelevant source
Old version retrieved | Model gives outdated answer
Important chunk missing | Model fills the gap
Chunk too small | Context is incomplete
Chunk too large | Model loses focus
Conflicting sources retrieved | Model blends contradictions
No citation requirement | User cannot verify answer
Poor ranking | Most relevant source not used

Capability | Why it matters
Source versioning | Avoid stale answers
Relevance ranking | Retrieve the right document
Chunk strategy | Preserve enough context
Citation display | Build trust
Source priority rules | Resolve conflicts
Fallback behavior | Avoid guessing
Retrieval evaluation | Measure whether RAG works
Human review | Handle high-risk cases

PM takeaway

Do not say “We added RAG, so hallucination is solved.” RAG reduces risk only if the retrieval system is good.
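
Two of the capabilities above, source versioning and fallback behavior, can be sketched in a few lines. This is a minimal illustration, not a real retrieval stack: the `Chunk` type, the score range, and the `MIN_SCORE` threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Chunk:
    doc_id: str
    version: int
    score: float  # relevance score from a retriever, assumed to be in [0, 1]
    text: str


MIN_SCORE = 0.6  # hypothetical threshold; tune it with retrieval evaluation


def select_context(chunks: list[Chunk]) -> Optional[Chunk]:
    """Keep only the latest version of each document, then return the
    best-scoring chunk, or None if nothing is relevant enough."""
    latest: dict[str, int] = {}
    for c in chunks:
        latest[c.doc_id] = max(latest.get(c.doc_id, c.version), c.version)
    current = [c for c in chunks if c.version == latest[c.doc_id]]
    relevant = [c for c in current if c.score >= MIN_SCORE]
    return max(relevant, key=lambda c: c.score) if relevant else None


def answer_or_fallback(chunks: list[Chunk]) -> str:
    """Answer from a current source, or say plainly that none was found."""
    chunk = select_context(chunks)
    if chunk is None:
        return "I could not find a relevant, current source for this question."
    return f"Based on {chunk.doc_id} (v{chunk.version}): {chunk.text}"


retrieved = [
    Chunk("maternity-policy", 1, 0.9, "Old waiting-period wording."),
    Chunk("maternity-policy", 2, 0.7, "Current waiting-period wording."),
]
print(answer_or_fallback(retrieved))
```

In the usage example the stale version scores higher but is dropped anyway, so the answer comes from version 2. That ordering (version filter before ranking) is one possible source priority rule, not the only one.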

10. Cause 7: The Model May Confuse Similar Entities

Hallucination often happens when entities are rare, similar, or ambiguous — two companies with similar names, similar policy clauses, similar hospital names, similar claim IDs, or similar legal sections.

Example: a user asks about “Aster Whitefield Health Priority Pass.” The model may mix it with another Aster service or hospital package unless the source context is clear.

Risk | Product control
Similar names | Ask clarifying question
Similar policies | Show matched source
Similar claim IDs | Validate against database
Similar users | Use unique identifiers
Similar hospitals | Use provider master data
Similar products | Use product catalog lookup

PM takeaway

Do not rely on model memory for entity resolution. Use system-of-record data.
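
The "ask a clarifying question" and "use provider master data" controls can be combined in a small lookup step that runs before the model answers. The provider names and IDs below are fictional, and the substring match is a deliberately naive stand-in for whatever matching a real system of record supports.

```python
# Hypothetical provider master data; in a real product this would be a
# lookup against the system of record, not a hard-coded dictionary.
PROVIDERS = {
    "H001": "Northside General Hospital",
    "H002": "Northside Children's Hospital",
    "H003": "Lakeside Clinic",
}


def resolve_provider(query: str) -> dict:
    """Match a free-text name against master data.

    Returns a resolved ID only when exactly one record matches; otherwise
    asks for clarification instead of guessing.
    """
    q = query.strip().lower()
    matches = {pid: name for pid, name in PROVIDERS.items() if q in name.lower()}
    if len(matches) == 1:
        pid, name = next(iter(matches.items()))
        return {"status": "resolved", "id": pid, "name": name}
    if not matches:
        return {"status": "not_found", "message": f"No provider matches '{query}'."}
    options = ", ".join(sorted(matches.values()))
    return {"status": "ambiguous", "message": f"Which provider do you mean: {options}?"}


print(resolve_provider("Northside"))
print(resolve_provider("lakeside"))
```

The model only ever sees a resolved unique identifier or a clarifying question, never an ambiguous name it has to guess about.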

11. Cause 8: The Model May Be Optimized for Helpfulness Over Factuality

RLHF and instruction tuning can make models more helpful. But helpfulness and factuality are not always the same. Human raters may prefer a detailed, confident answer, so training on their preferences can reward answers that sound convincing. A detailed answer is not necessarily more factual.

Sometimes the most factual answer is short and cautious: “The provided documents do not contain enough information to answer this.” That may feel less satisfying, but it is more honest.

Quality | Why
Correctness | Truth matters
Grounding | Evidence matters
Appropriate uncertainty | Avoids fake confidence
Source citation | Supports audit
Conciseness | Reduces unsupported claims
Escalation | Handles unknowns safely

PM takeaway

A helpful hallucination is still a hallucination. Be careful what your product rewards.

12. Hallucination Detection: Retrieval-Based Checking

One way to detect hallucination is to break the answer into factual claims and check each claim against a trusted source — similar to how a careful reviewer works.

Step | Action
1 | Generate answer
2 | Extract factual claims
3 | Retrieve evidence for each claim
4 | Check whether evidence supports the claim
5 | Mark unsupported claims
6 | Revise or reject answer

PM example

AI answer: “The claim is eligible because maternity coverage starts after 9 months.”

Claim | Evidence needed
Claim is eligible | Policy rule + member data
Maternity coverage applies | Diagnosis / procedure
Waiting period is 9 months | Policy clause
Waiting period completed | Policy start date + admission date

PM takeaway

For high-risk workflows, do not only generate. Verify.
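
Some claims in the PM example can be checked with plain code rather than another model call. The sketch below checks the "waiting period completed" claim from the policy start date and admission date. The function names are hypothetical, and counting whole calendar months is an assumption: a real policy defines how the waiting period is measured.

```python
from datetime import date
from typing import Optional


def months_between(start: date, end: date) -> int:
    """Whole calendar months elapsed from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not yet complete
    return months


def check_waiting_period(
    policy_start: Optional[date],
    admission: Optional[date],
    waiting_months: Optional[int],
) -> dict:
    """Return a verdict plus the evidence used, or flag missing evidence."""
    if policy_start is None or admission is None or waiting_months is None:
        return {"supported": None, "reason": "Missing evidence; cannot verify."}
    elapsed = months_between(policy_start, admission)
    return {
        "supported": elapsed >= waiting_months,
        "reason": f"{elapsed} months elapsed; {waiting_months} required.",
    }


# Illustrative dates; the 9-month value must come from the policy clause.
print(check_waiting_period(date(2024, 1, 15), date(2024, 11, 1), 9))
print(check_waiting_period(date(2024, 1, 15), None, 9))
```

When evidence is missing, the check returns `None` rather than `False`, so "could not verify" and "verified as unsupported" stay separate outcomes.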

13. Hallucination Detection: Multiple-Sample Consistency

Another approach is to ask the model multiple times and check whether the answers are consistent. If the model gives different factual answers across runs, that is a warning sign.

Question: “Who authored this paper?”
Run 1: “John Smith and Maria Lee.”
Run 2: “David Chen and Priya Rao.”
Run 3: “The authors are not available.”

That inconsistency signals uncertainty.

Use case | Why
Reference checking | Fake references may vary
Biography generation | Invented facts may be inconsistent
Open-ended factual questions | Answers may drift
Unknown questions | Model may guess differently
High-risk answers | Repeated variation is a warning

PM takeaway

This is not a complete solution. But it is useful as a risk signal.
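
A minimal sketch of the repeated-run check, using the three runs from the example above. It assumes the answers have already been collected from separate model calls. The exact-match comparison after light normalization is a simplification, and the agreement threshold is a hypothetical value.

```python
from collections import Counter


def normalize(answer: str) -> str:
    """Crude normalization so trivial formatting differences do not count."""
    return " ".join(answer.lower().strip().rstrip(".").split())


def consistency(samples: list[str]) -> tuple[str, float]:
    """Return the most common answer and the share of runs that agree with it."""
    if not samples:
        raise ValueError("need at least one sample")
    counts = Counter(normalize(s) for s in samples)
    top_answer, top_count = counts.most_common(1)[0]
    return top_answer, top_count / len(samples)


runs = [
    "John Smith and Maria Lee.",
    "David Chen and Priya Rao.",
    "The authors are not available.",
]
answer, agreement = consistency(runs)

AGREEMENT_THRESHOLD = 0.67  # hypothetical cutoff for "stable enough"
if agreement < AGREEMENT_THRESHOLD:
    print("Low agreement across runs: flag for verification.")
```

High agreement is not proof of correctness, since a model can repeat the same wrong answer consistently. Low agreement is the useful signal here.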

14. Hallucination Detection: Indirect Queries

Sometimes asking directly is not enough. A model may say a fake reference exists if asked directly. Indirect queries can help.

Instead of “Is this paper real?” ask: “Who are the authors?” “Which journal published it?” “What year was it published?” “What is the DOI?” If the model cannot consistently answer supporting details, the reference may be hallucinated.

Generated claim | Indirect check
Paper title | Authors, venue, year
Legal case | Court, date, citation number
Policy clause | Clause text, page number, section
Medical code | Code description, classification source
Claim rule | Rule ID, policy source, effective date

PM takeaway

Citations should be verified, not decorative.

15. Hallucination Detection: Calibration and Confidence

A well-calibrated model should express uncertainty when it is uncertain. Poor calibration means the model sounds confident even when it is likely wrong. Users tend to trust confident answers.

Bad behavior

“Yes, this claim is eligible.” — when the evidence is incomplete.

Better behavior

“Eligibility cannot be confirmed from the available documents. The policy clause and admission date are required.”

Confidence situation | Product behavior
Strong evidence | Answer with source
Partial evidence | State uncertainty
Conflicting evidence | Show conflict
No evidence | Do not guess
High-risk task | Escalate
Low-risk task | Provide caveat

PM takeaway

A confidence score alone is not enough. The product must define what happens at each confidence level.
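
One way to make the table above explicit is a small policy function that the product owns, separate from the model. The evidence states, action names, and the precedence between rules below are assumptions for illustration. In particular, letting a high-risk task answer directly when evidence is strong is a product decision that some teams would not make.

```python
from enum import Enum


class Evidence(Enum):
    STRONG = "strong"
    PARTIAL = "partial"
    CONFLICTING = "conflicting"
    NONE = "none"


def product_behavior(evidence: Evidence, high_risk: bool) -> str:
    """Map an evidence state and task risk to a product action."""
    if evidence is Evidence.NONE:
        return "do_not_guess"
    if evidence is Evidence.CONFLICTING:
        return "show_conflict"
    if high_risk and evidence is not Evidence.STRONG:
        return "escalate_to_human"
    if evidence is Evidence.PARTIAL:
        return "answer_with_caveat"
    return "answer_with_source"


for risk in (False, True):
    for state in Evidence:
        print(f"high_risk={risk}, evidence={state.value}: {product_behavior(state, risk)}")
```

Writing the mapping as code makes it reviewable: a PM can read every branch and argue with it.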

16. Mitigation 1: Use RAG with Attribution

RAG helps reduce hallucination by grounding the answer in retrieved sources. But the answer also needs attribution — showing which source supports which claim.

Weak vs better answer

Weak: “This claim requires a shortfall.”

Better: “This claim may require a shortfall because the discharge summary is missing. Source: Document checklist, Section 3.2.”

Output element | Source needed
Claim recommendation | Rule / policy / source document
Missing document | Checklist / SOP
Deduction | Policy clause or billing rule
Medical summary | Discharge summary / source line
Customer advice | Approved FAQ / policy
Compliance statement | Legal / regulatory source

PM takeaway

No source, no confident claim. That should be the rule.

17. Mitigation 2: Verification Before Final Answer

A good AI system should not always answer immediately. For factual tasks, it may need a verification step.

Step | Action
1 | Draft answer
2 | Extract claims
3 | Check claims against sources
4 | Remove unsupported claims
5 | Add citations
6 | State uncertainty
7 | Produce final answer

Use case | Verification need
Legal summary | High
Medical summary | High
Claim decision support | High
Financial advice | High
Internal policy Q&A | Medium to high
Marketing copy | Low
Brainstorming | Low

PM takeaway

Not every AI feature needs heavy verification. But high-risk workflows do — even if verification adds latency.
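
Steps 3 to 7 of the verification flow can be sketched as a filter over already-extracted claims. This is a toy under stated assumptions: claim extraction (step 2) is assumed to happen upstream, `find_source` is a hypothetical stand-in for retrieval plus a support check, and the dictionary lookup replaces a real evidence store.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Claim:
    text: str
    source: Optional[str] = None  # filled in when evidence is found


def verify_draft(
    claims: list[Claim],
    find_source: Callable[[str], Optional[str]],
) -> dict:
    """Keep claims with a supporting source, drop the rest, and report both."""
    supported, removed = [], []
    for claim in claims:
        source = find_source(claim.text)
        if source is None:
            removed.append(claim.text)
        else:
            supported.append(Claim(claim.text, source))
    # Attach a citation to each surviving claim.
    final = " ".join(f"{c.text} [{c.source}]" for c in supported)
    # State uncertainty when something had to be removed.
    if removed:
        final += f" ({len(removed)} unsupported claim(s) removed.)"
    return {"final_answer": final.strip(), "removed": removed}


# Toy evidence lookup standing in for retrieval plus a support check.
EVIDENCE = {"Discharge summary is missing.": "Document checklist, Section 3.2"}

result = verify_draft(
    [Claim("Discharge summary is missing."), Claim("Final bill is overdue.")],
    EVIDENCE.get,
)
print(result["final_answer"])
```

Returning the removed claims alongside the final answer keeps an audit trail of what the model tried to say but could not support.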

18. Mitigation 3: Design Refusal and “I Don’t Know” Behavior

A good AI product must know when not to answer. This does not mean refusing everything. It means refusing unsupported certainty.

Situation | Good response
Source missing | “I cannot determine this without the policy document.”
Data unavailable | “I do not have access to the claim record.”
Ambiguous entity | “Which hospital do you mean?”
Conflicting sources | “The documents conflict; human review is needed.”
High-risk uncertainty | “This should be escalated for review.”

Design fallback responses for missing context, weak evidence, source conflict, unknown answer, restricted data, unsafe requests, and high-risk decisions. Do not leave them to chance.

19. Mitigation 4: Keep Outputs Atomic and Checkable

Long paragraphs hide hallucinations. Atomic outputs expose them. Instead of asking for a detailed explanation, ask for facts, source, confidence, next action, and missing evidence.

Field | Output
Finding | Discharge summary is missing
Source | Document checklist
Evidence | No discharge summary found in uploaded documents
Confidence | High
Recommended action | Raise shortfall

PM takeaway

For serious workflows, prefer structured outputs: tables, checklists, field-level extraction, source mapping, and confidence labels. Do not ask the model to write a beautiful essay when the product needs a verifiable decision aid.
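
The atomic record shown in the table above maps naturally onto a typed structure that the product can validate before display. The class name, field names, and allowed confidence labels below are assumptions chosen to mirror that table, not a fixed schema.

```python
from dataclasses import dataclass, asdict

ALLOWED_CONFIDENCE = {"High", "Medium", "Low"}  # hypothetical label set


@dataclass
class Finding:
    finding: str
    source: str
    evidence: str
    confidence: str
    recommended_action: str

    def problems(self) -> list[str]:
        """List reasons this record should not be shown as a confident finding."""
        issues = []
        for name, value in asdict(self).items():
            if not value.strip():
                issues.append(f"{name} is empty")
        if self.confidence not in ALLOWED_CONFIDENCE:
            issues.append("confidence must be High, Medium, or Low")
        return issues


record = Finding(
    finding="Discharge summary is missing",
    source="Document checklist",
    evidence="No discharge summary found in uploaded documents",
    confidence="High",
    recommended_action="Raise shortfall",
)
print(record.problems())
```

A record with an empty `source` or `evidence` field fails validation, which gives the product a concrete point at which to block or downgrade an unsupported finding.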

20. Mitigation 5: Use Human Review Where Failure Cost Is High

No hallucination control is perfect. For high-risk workflows, humans remain necessary.

Scenario | Why
Claim approval / rejection | Financial and customer impact
Medical interpretation | Clinical risk
Legal drafting | Legal risk
Compliance decisions | Regulatory risk
Payment triggers | Financial risk
Customer-facing sensitive reply | Reputation risk
Unclear evidence | Judgment required

PM takeaway

Human-in-the-loop is not a weakness — it is a safety layer. Make review efficient by showing AI recommendation, supporting evidence, missing evidence, confidence, source links, audit trail, and suggested next action.

21. Hallucination Metrics PMs Should Track

If hallucination matters to your product, measure it. Do not rely on anecdotal feedback.

Metric | Meaning
Unsupported claim rate | Percentage of claims without source support
Citation accuracy | Whether cited source actually supports the claim
Factual error rate | Human-reviewed factual mistakes
Refusal correctness | Whether model refused when it should
Over-refusal rate | Whether model refused legitimate requests
Escalation rate | How often human review is triggered
Override rate | How often humans change AI output
Source retrieval accuracy | Whether correct document was retrieved
Format validity | Whether structured output is usable
Repeated-run consistency | Whether same input gives stable facts

PM takeaway

A hallucination strategy without metrics is wishful thinking. Build factuality evaluation into the product lifecycle.
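
Three of the metrics above can be computed directly from human review records. The record fields below (`claims`, `unsupported_claims`, `citations`, `correct_citations`, `overridden`) are a hypothetical logging schema, and the sample numbers are invented to show the arithmetic.

```python
def rate(numerator: int, denominator: int) -> float:
    """Safe ratio that returns 0.0 when there is nothing to measure."""
    return numerator / denominator if denominator else 0.0


def factuality_metrics(reviews: list[dict]) -> dict:
    """Compute a few factuality metrics from human-reviewed answers."""
    total_claims = sum(r["claims"] for r in reviews)
    total_citations = sum(r["citations"] for r in reviews)
    return {
        "unsupported_claim_rate": rate(
            sum(r["unsupported_claims"] for r in reviews), total_claims
        ),
        "citation_accuracy": rate(
            sum(r["correct_citations"] for r in reviews), total_citations
        ),
        "override_rate": rate(sum(r["overridden"] for r in reviews), len(reviews)),
    }


# Invented sample data: two reviewed answers.
sample_reviews = [
    {"claims": 4, "unsupported_claims": 1, "citations": 3,
     "correct_citations": 3, "overridden": False},
    {"claims": 6, "unsupported_claims": 1, "citations": 5,
     "correct_citations": 4, "overridden": True},
]
print(factuality_metrics(sample_reviews))
```

Rates are pooled across all claims rather than averaged per answer, so long answers weigh more. Whether that is the right aggregation depends on what the product team wants to hold itself accountable for.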

22. Practical Example: Claims Shortfall Assistant

User asks: “Should we raise a shortfall for this claim?”

Weak system: “Yes, raise a shortfall because the final bill and discharge summary are missing.” The answer gives no indication of how the system knows this.

Better system output:

Field | Output
Recommendation | Raise shortfall
Missing documents | Discharge summary, final bill
Evidence | Not found in uploaded document set
Source checked | Claim document checklist, uploaded files
Confidence | High
Decision boundary | Processor approval required
Next action | Send shortfall template for missing documents

PM takeaway

The product should force the model to show its evidence. If it cannot show evidence, it should not give a confident recommendation.

23. Practical Example: Policy Q&A Assistant

User asks: “Does this policy cover maternity after 9 months?”

Weak answer: “Yes, maternity is covered after 9 months.”

Better answer: “I found a maternity waiting-period clause in the policy document. It states that maternity coverage applies after 9 months, subject to the listed exclusions and sub-limits. Please review Clause 4.3 before using this for a customer-facing decision.”

Best answer if source is missing: “I cannot confirm this because the policy document is not available in the current context.”

PM takeaway

The system should not answer policy questions from memory when the policy document is available or required. For enterprise AI, source-grounded answers should be the default.

24. Practical Example: Blog Writing Assistant

Hallucination is not only a problem in high-risk domains. Content products can invent awards, investor names, revenue, universities, media mentions, or acquisition history — damaging credibility.

Content type | Control
Opinion | Creative freedom allowed
Biography | Source required
Company history | Source required
Technical explanation | Source or caveat required
Case study | Verified data required
Marketing copy | Claims must be approved

PM takeaway

Creativity is acceptable. Fake facts are not.

25. What PMs Should Not Do

Mistake | Why it is bad
Assuming better model means no hallucination | Even strong models can hallucinate
Using fine-tuning for changing knowledge | Can go stale or increase risk
Trusting citations blindly | Citations can be fabricated or weak
Adding RAG without evaluating retrieval | Bad retrieval creates bad answers
Rewarding long answers | More claims mean more risk
Hiding uncertainty | Users over-trust confident answers
Skipping human review in high-risk workflows | Unsafe decisions may slip
Measuring only user satisfaction | Users may like wrong answers
Treating hallucination as only engineering issue | Product design controls the risk
Thinking low temperature guarantees truth | It only reduces randomness

Hallucination management is not a single feature. It is a product discipline.

26. The PM Mental Model

Hallucination happens because LLMs generate plausible language, not guaranteed truth.
Generate → Guess → Ground → Verify.
The solution is not one trick — it is a system.

Layer | Purpose
RAG | Bring trusted knowledge into context
Attribution | Show evidence
Verification | Check claims
Low randomness | Reduce factual drift
Structured output | Make claims checkable
Confidence behavior | Handle uncertainty
Refusal design | Avoid unsupported answers
Human review | Control high-risk decisions
Metrics | Measure factuality
Monitoring | Catch production failures

Do not ask the model to be magically truthful. Build a product system that makes truthful behavior easier and unsupported claims harder.

Chapter Summary

Concept | PM understanding
Hallucination | Fabricated or unsupported content presented as fact
In-context hallucination | Output not faithful to provided source context
Extrinsic hallucination | Output not grounded in external world knowledge
Pre-training data issues | Training data may be missing, wrong, stale, or contradictory
Unknown knowledge | Model may answer even when it should say it does not know
Fine-tuning risk | Fine-tuning new knowledge can increase hallucination risk
Sampling risk | More randomness can increase factual drift
Long output risk | More generated claims create more chances for errors
RAG | Helps ground answers, but must be evaluated
Attribution | Links claims to supporting sources
Verification | Checks factual claims before final output
Calibration | Model should express uncertainty appropriately
Human review | Required where failure cost is high
PM role | Design the system that prevents, detects, and manages hallucination

Closing Thought

Hallucination is one of the most misunderstood problems in AI product development. It is tempting to think “A better model will solve it.” Sometimes better models help — but hallucination is not only a model-size problem.

It is a grounding problem, a context problem, a retrieval problem, a feedback problem, a sampling problem, a workflow problem, and a product design problem.

For product managers, the most important shift is this: do not ask the model to be magically truthful. Build a product system that makes truthful behavior easier and unsupported claims harder.

That means giving the model the right sources, forcing evidence, checking claims, allowing uncertainty, controlling randomness, and keeping humans in the loop where the stakes are high.

A reliable AI product is not one where the model never hallucinates. It is one where hallucinations are anticipated, reduced, detected, and prevented from causing harm.

The next chapter in this module compares pre-training, fine-tuning, and RLHF — how each training stage shapes what a model knows and how it behaves before generation and grounding layers apply.

The real PM lesson

Anticipate hallucination. Design around it. Measure it. Do not pretend it will disappear.
