Hallucination · ailiteracy.nepal

A first-year political science student in Pulchowk asks ChatGPT for “five quotes from Bhanubhakta Acharya about Nepali unity.” ChatGPT produces five eloquent, well-attributed quotes. The student pastes them into an essay. The TA, glancing at the citations, recognises the third quote as nothing Bhanubhakta ever wrote. None of the five are real.

This is hallucination: the most-discussed failure mode of generative AI, and the one that quietly causes the most damage. The model didn’t lie, and it didn’t malfunction. It did exactly what it was trained to do: produce text that sounds like the kind of text the prompt asked for. The quotes are syntactically correct, thematically appropriate, and fully invented.

Why hallucination happens

Recall from Chapter 1: a language model is a next-word predictor. It does not have a truth signal — there is no internal flag that says “this is a known fact” versus “this is a guess.” Every output is, mechanically, a guess about what comes next.

When the training data on a topic is rich and consistent (the basics of Newton’s laws, common Python syntax, well-documented historical events), the most likely next word is usually the correct next word. The model produces accurate text not because it knows it is accurate, but because the patterns of accurate text are what it has learned.

When training data on a topic is thin, contradictory, or simply absent, the model still produces a confident next word — but the patterns it draws on are weaker. It interpolates. It generalises from related topics. It produces text that sounds right in the genre even when no underlying fact supports it.

This is hallucination. It is not a bug. It is the predictable consequence of training a system to produce fluent text without giving it a way to flag uncertainty.
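
A toy illustration may help here. The sketch below is not how a real language model works internally (those are neural networks over tokens, not word counts). It is a deliberately tiny word-pair counter, with a made-up corpus and the hypothetical names `train` and `predict_next`, meant only to show the one property this section describes: the predictor always returns a next word, and nothing in its answer separates a well-supported guess from a thinly supported one.

```python
from collections import Counter, defaultdict

# Made-up corpus: one pattern seen three times, another seen only once.
CORPUS = [
    "force equals mass times acceleration",
    "force equals mass times acceleration",
    "force equals mass times acceleration",
    "the poet wrote about unity",
]

def train(corpus):
    """Count which word follows which, plus overall word frequencies."""
    following = defaultdict(Counter)
    overall = Counter()
    for sentence in corpus:
        words = sentence.lower().split()
        overall.update(words)
        for prev, nxt in zip(words, words[1:]):
            following[prev][nxt] += 1
    return following, overall

def predict_next(model, word):
    """Return (next_word, share, support). There is no 'I don't know' path."""
    following, overall = model
    options = following.get(word.lower())
    # support = how many training examples back this prediction (0 if none).
    support = sum(options.values()) if options else 0
    if not options:
        # Context never seen in training: fall back to overall frequencies
        # and answer anyway.
        options = overall
    best, count = options.most_common(1)[0]
    return best, count / sum(options.values()), support

model = train(CORPUS)
print(predict_next(model, "equals"))     # ('mass', 1.0, 3)
print(predict_next(model, "poet"))       # ('wrote', 1.0, 1)
print(predict_next(model, "kathmandu"))  # still returns a word, support 0
```

The first two calls both report a share of 1.0. Only the support count, which a chat interface never shows you, reveals that one answer rests on three examples and the other on a single one.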

Where hallucination shows up most

Some patterns to expect:

Specific citations. Quotes attributed to authors, paper titles attributed to journals, page numbers in books, exact statistics. Models confidently invent all of these. The format is plausible (author, year, journal name, volume); the existence of the cited work is often not.

Niche facts. Anything outside the dominant training distribution: specific Nepali villages, individual professionals, historical events that don’t appear in widely indexed English sources. The model will produce something, and that something will sound right.

Numbers in the middle of generated text. Statistics, dates, prices. The model is often exactly right and often close but wrong, and the output alone gives you no way to tell which.

Legal and medical specifics. The exact section of the Muluki Civil Code that applies, the dosage of a Nepali brand-name medication, the precise rules governing some bureaucratic procedure. Confidence is high; accuracy is variable.

Code that uses unfamiliar libraries. A model will generate code that calls a function that doesn’t exist, or uses a parameter that was deprecated, in a way that looks completely plausible.

Real names and plausible biographies. Asking a model about a specific Nepali professional you know slightly will often produce a mix of true and invented facts, presented seamlessly.
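
For the generated-code pattern in particular, one cheap check is to ask Python itself whether a name exists before you run anything. The helper below (`attribute_exists` is a hypothetical name, not part of any library) uses only the standard `importlib` module. Treat it as a sketch: it confirms that a function or attribute is present in the version installed on your machine, and it says nothing about deprecated parameters or about whether the generated code is otherwise correct.

```python
import importlib

def attribute_exists(module_name, dotted_attr):
    """Return True if module_name can be imported and exposes dotted_attr.

    Note: importing a module runs its top-level code, so only use this
    on packages you already trust and have installed.
    """
    try:
        obj = importlib.import_module(module_name)
    except ImportError:
        # The module itself is missing (or misnamed by the model).
        return False
    # Walk nested names such as "path.join" one step at a time.
    for part in dotted_attr.split("."):
        if not hasattr(obj, part):
            return False
        obj = getattr(obj, part)
    return True

# json.loads is real; json.parse is the kind of plausible name a model
# might borrow from JavaScript.
print(attribute_exists("json", "loads"))
print(attribute_exists("json", "parse"))
print(attribute_exists("os", "path.join"))
```

A `False` here is a strong hint that the model invented the call. A `True` only means the name exists, so the official documentation is still the place to confirm what it does.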

How to catch hallucination

A working playbook.

Rule 1 — Treat all specific facts as unverified until you check them. Quotes, citations, statistics, dates, names: assume each is a guess that requires verification.

Rule 2 — Ground the model in your source material. If you want a summary of a report, paste in the report. If you want a quote, paste in the source. The model is much more accurate when it can refer to text in front of it than when it relies on training-data memory.

Rule 3 — Ask the model to flag its own uncertainty. A simple prompt addition: “After your answer, list any specific facts (numbers, dates, names, quotes) that you are not highly confident about, and explain why.” Models are often, though not reliably, right about which of their own claims are shaky, so treat the list as a starting point for checking rather than a substitute for it.

Rule 4 — Use models with retrieval where stakes are high. Modern tools (ChatGPT with browsing, Perplexity, Claude with documents) can search and cite actual sources. For factual research, prefer these over plain LLMs.

Rule 5 — Cross-check with a second source. Even when a model gives a confident citation, run the title through Google Scholar or the author through Wikipedia. The five seconds of verification costs less than the embarrassment.
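
Rules 2 and 3 can be combined into a reusable prompt template. The sketch below only builds the prompt text; it does not call any model or API. The function name `build_grounded_prompt`, the source markers, and the wording of the opening instruction are all assumptions made for illustration, and the uncertainty request is the one quoted in Rule 3.

```python
# The prompt addition suggested in Rule 3, kept in one place for reuse.
UNCERTAINTY_REQUEST = (
    "After your answer, list any specific facts (numbers, dates, names, "
    "quotes) that you are not highly confident about, and explain why."
)

def build_grounded_prompt(question, source_text):
    """Wrap a question with pasted source material (Rule 2) and a request
    to flag uncertain specifics (Rule 3)."""
    if not source_text.strip():
        # Without a source, the model falls back on training-data memory,
        # which is exactly the situation Rule 2 tries to avoid.
        raise ValueError("No source text supplied.")
    return (
        "Answer using only the source text between the markers below. "
        "If the source does not contain the answer, say so.\n\n"
        "=== SOURCE START ===\n"
        f"{source_text.strip()}\n"
        "=== SOURCE END ===\n\n"
        f"Question: {question.strip()}\n\n"
        f"{UNCERTAINTY_REQUEST}"
    )

# Placeholder text stands in for a report you have pasted in yourself.
prompt = build_grounded_prompt(
    "What does the report recommend?",
    "(paste the full text of the report here)",
)
print(prompt)
```

The resulting text can be pasted into any chat tool. Rules 1 and 5 still apply to whatever comes back.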

A worked example: catching the made-up quote

Go back to our student. Suppose that, instead of trusting the model’s quotes, they had asked:

Give me five quotes from Bhanubhakta Acharya about Nepali unity. For each: the exact source (text and page), and your confidence (1–10) that this quote is real. If you cannot find a real source, say so explicitly.

A well-aligned model in 2026 is likely to respond more honestly: maybe one or two confident quotes (well-attested), three or four flagged as “thematically representative of Bhanubhakta’s work but I cannot confirm the exact wording,” and a recommendation to consult a specific anthology. Same model, same task, much better output, because the prompt made uncertainty a first-class concern.

This is the deepest lesson of this chapter: most hallucination is the model giving you what you asked for, plus confidence you didn’t actually need. Asking for calibrated output is the highest-leverage change.
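
Once the student has an actual source in hand (say, the text of an anthology pasted into a file), the final check can be mechanical. The sketch below tests whether each quote appears word for word in a source text you supply. The names `normalise` and `check_quotes` are hypothetical, and the sample strings are placeholders, not real quotations from any author.

```python
import re

def normalise(text):
    """Lowercase and collapse whitespace so line breaks don't hide a match."""
    return re.sub(r"\s+", " ", text).strip().lower()

def check_quotes(quotes, source_text):
    """Map each quote to True if it appears verbatim in source_text."""
    haystack = normalise(source_text)
    results = {}
    for quote in quotes:
        needle = normalise(quote)
        # An empty quote would trivially match, so count it as not found.
        results[quote] = bool(needle) and needle in haystack
    return results

# Placeholder source and quotes, invented purely for this example.
SOURCE = "The river runs past the old bridge.\nEvery village keeps its own songs."
QUOTES = [
    "the river runs past the old bridge",
    "Every village sings the same song",
]

for quote, found in check_quotes(QUOTES, SOURCE).items():
    print("FOUND    " if found else "NOT FOUND", quote)
```

A miss does not prove a quote is fake, since it may come from a different edition or translation. It does tell you the quote is not in the text you have, which is enough to keep it out of the essay until you find where it comes from.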

When hallucination matters most

Three categories where hallucination is catastrophic and requires extreme caution:

Anything you will publish or attribute. Op-eds, academic work, journalism, official communication. A hallucinated fact in print is your reputation, not the model’s.
Anything that drives a high-stakes decision. Medical, legal, financial. A wrong fact here costs money, health, or freedom.
Anything that involves real, named people. Hallucinating biographies, quotes, or actions of real people creates defamation risk and personal harm.

For these categories, the rule is: do not trust generative AI output as the source of fact. Use it for drafting, structuring, and summarising material you have verified. Do not use it to invent the underlying facts.

For lower-stakes uses — brainstorming, internal notes, conversation, casual writing — looser verification is fine. The model’s occasional wrong fact in your brainstorm doesn’t hurt you. The same wrong fact in a court filing does.

A note on improvement

Hallucination declined significantly between 2022 and 2026, but it has not been solved. Frontier models hallucinate less, and on a narrower range of topics, than earlier ones. Retrieval-augmented systems (those that search the web or your documents) hallucinate much less on factual queries. But the underlying mechanism, text generation without an internal truth signal, has not changed.

Expect this to continue improving slowly. Expect it to never disappear. Expect that the most reliable habit is your own verification, not the tool’s accuracy.

Check your understanding

Quick check

A student asks ChatGPT for “five real quotes from Bhanubhakta Acharya about Nepali unity” and receives five eloquent, well-attributed quotes. What is the most accurate description of what is happening?

The model has retrieved five genuine quotes from a database
The model has produced text that fits the request, quotes that sound like Bhanubhakta’s style on this theme, but several or all may be invented (hallucination)
The model is deliberately lying to mislead the student
The model has access to all literature ever written and the quotes are guaranteed real

Quick check

The single highest-leverage habit for reducing the damage of hallucination is:

Use only the most expensive model
Treat all specific facts (citations, statistics, names, quotes) as unverified until you check them — and ground the model in your own source material wherever possible
Never use generative AI for any task
Ask the same question to three different models and average the answers

What comes next

We’ve covered the most discussed failure mode. The next section is about a quieter one: bias and cultural blind spots, where a model produces technically correct output that nonetheless gets Nepal wrong because the training data was thin on Nepal-specific patterns.