AI legal research in India without the hallucination risk

TL;DR: General-purpose AI tools invent plausible-looking case citations because they predict text rather than retrieve law. Indian courts have already imposed costs over AI-fabricated citations, and the Supreme Court has warned that relying on non-existent judgments is misconduct. A five-step verification workflow (confirm the case exists, read the judgment, check it is still good law, match the proposition to the actual holding, confirm jurisdiction binds your forum) keeps you safe. A retrieval-grounded tool narrows the risk but does not remove your duty to verify.

Why language models fabricate case law
Why Indian legal research is especially exposed
What has already gone wrong in Indian courts
The US case that put the legal world on notice
Chatbot versus grounded legal AI: what is actually different
A five-step verification workflow
Prompt hygiene: how you ask shapes what you get
Source hygiene: where the answer came from matters
Your professional duties do not change
A verification checklist you can use today
Frequently asked questions
How Niyam approaches this

Why language models fabricate case law

To understand the risk, you need to understand the mechanism. A large language model (LLM) is trained to predict the most plausible next token in a sequence. It has processed an enormous volume of text (including legal writing, judgment databases, law reviews, and textbooks) and from that training it has learned the statistical shape of legal language.

Ask a general-purpose LLM a legal question and it will generate text that looks like a credible legal answer. It knows what a Supreme Court citation looks like. It knows the cadence of a headnote. It knows how a judge opens a paragraph distinguishing an authority. When the most statistically plausible continuation of your question involves a citation, the model produces one, whether or not that citation corresponds to a case that was ever decided.

This is the hallucination problem. The model is not lying; it does not have beliefs or intentions. It is doing exactly what it was trained to do: generating text that matches the shape of text it has seen. The result is a fabricated citation that looks as authentic as a real one.

What makes this acutely dangerous in law is that legal language is highly formulaic. A fabricated citation can have plausible party names, a valid reporter abbreviation, a year in a plausible range, and correctly formatted volume and page numbers. The only thing missing is the actual case. And that is the one thing you cannot tell from the surface.

The fluency of an AI answer carries no information about its accuracy. A wrong answer and a right answer can be equally fluent. A fabricated citation and a real citation look identical until you go and verify.

The difference between retrieval and pure generation

Not all AI legal tools work the same way. A general-purpose chatbot relies entirely on its training data. A retrieval-augmented generation (RAG) system works differently: when you ask a question, the system first searches a corpus of actual judgments and statutes, retrieves the relevant passages, and then asks the model to compose an answer grounded in those retrieved documents, with citations pointing back to the source material.

This is a meaningful improvement. The model is working from real documents rather than from statistical inference alone. A well-built tool will show you the source judgments so you can open them. A well-designed system is also built to tell you when it cannot find adequate grounding for a reliable answer, rather than generating something anyway.
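The retrieve-then-answer pattern described above can be sketched in a few lines. This is a toy illustration, not how any particular product works: the `Passage` type, the `CORPUS` entries, the word-overlap scoring, and the `min_overlap` threshold are all assumptions made for the example, and the corpus text is invented placeholder material rather than real judgments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    doc_id: str  # hypothetical identifier of an indexed judgment
    text: str    # passage text as stored in the index


# Toy in-memory corpus; the entries are invented placeholders, not real judgments.
CORPUS = [
    Passage("DOC-1", "limitation period for adverse possession runs from the date of dispossession"),
    Passage("DOC-2", "costs may be imposed where submissions rely on unverified material"),
]


def retrieve(question, corpus, min_overlap=2):
    """Return passages sharing at least `min_overlap` words with the question."""
    q_words = set(question.lower().split())
    hits = []
    for passage in corpus:
        overlap = q_words & set(passage.text.lower().split())
        if len(overlap) >= min_overlap:
            hits.append(passage)
    return hits


def answer(question, corpus):
    """Retrieve first; abstain when nothing in the corpus grounds an answer."""
    hits = retrieve(question, corpus)
    if not hits:
        return {"status": "no_grounding", "sources": []}
    # A real system would hand `hits` to a language model at this point.
    # This sketch only returns the sources the reader should open and verify.
    return {"status": "grounded", "sources": [p.doc_id for p in hits]}
```

The design point is the early return: when retrieval finds nothing, the function reports that instead of composing an answer anyway. Naive word overlap is used here only to keep the sketch short; it would match on common words in realistic queries.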

But retrieval narrows the risk; it does not eliminate it. Three gaps remain even in the best retrieval-grounded system, and a careful researcher keeps them in mind:

The corpus is not the whole law. No indexed corpus contains every judgment, tribunal order, notification, and amendment. If a relevant authority is absent from the index, it will not appear; and the absence of a citation is not proof that no authority exists.
Grounding is not interpretation. A real case can be retrieved and still be characterised incorrectly. The citation is genuine; the summary of the holding is subtly wrong.
Good-law status is time-sensitive. A judgment that was settled authority when the corpus was last updated may have been distinguished, overruled, or superseded by legislation since then.

The rule that follows from this: a retrieval-grounded tool gives you real leads, fast. You verify the leads. The tool changes how you find the starting point, not whether you confirm it.

Why Indian legal research is especially exposed

The hallucination problem exists everywhere AI touches legal research, but several features of Indian legal practice make it more likely that a fabricated citation will escape detection.

Citation format complexity

India has no single citation system. A Supreme Court judgment decided in 1985 might appear legitimately in any of these forms:

(1985) 3 SCC 545: Supreme Court Cases, the most widely preferred today
AIR 1985 SC 1011: All India Reporter, widely used in older material
1985 SCR (2) 450: Supreme Court Reports
(1985) 1 SCALE 312: Scale Law Reporter
MANU/SC/0012/1985: a commercial database’s proprietary identifier format
2025 INSC 345: neutral citation under the new system (a 2025 example, shown for the format)

A single decision can carry several of these citations in parallel. An AI model trained on text that contains all these formats can generate a citation that has the correct form for any one of them while pointing to a case that does not exist at that location. The parallel-citation system means there is no single place to go and instantly confirm a citation’s validity.
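To make the point concrete, here is a minimal sketch that recognises the six formats listed above by shape alone. The regular expressions are assumptions written to fit the example strings in this article; real reporters have more variants (supplementary volumes, High Court series, and so on). The function name `classify_citation` is hypothetical.

```python
import re

# Illustrative patterns for the six example formats. They check shape only.
CITATION_PATTERNS = {
    "SCC": re.compile(r"^\((\d{4})\) (\d+) SCC (\d+)$"),
    "AIR": re.compile(r"^AIR (\d{4}) SC (\d+)$"),
    "SCR": re.compile(r"^(\d{4}) SCR \((\d+)\) (\d+)$"),
    "SCALE": re.compile(r"^\((\d{4})\) (\d+) SCALE (\d+)$"),
    "MANU": re.compile(r"^MANU/SC/(\d{4})/(\d{4})$"),
    "INSC": re.compile(r"^(\d{4}) INSC (\d+)$"),
}


def classify_citation(citation):
    """Return the name of the matching format, or None if no pattern matches."""
    text = citation.strip()
    for name, pattern in CITATION_PATTERNS.items():
        if pattern.match(text):
            return name
    return None
```

A made-up string such as "(2031) 99 SCC 99999" is classified as SCC just as readily as a genuine citation. Passing a format check says nothing about whether the case exists, which is exactly why a well-formed citation cannot be trusted on sight.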

Volume and overruling complexity

India has produced an enormous body of case law across 25 High Courts, the Supreme Court, hundreds of tribunals, and statutory authorities at the central and state level. The sheer volume means that a fabricated citation in a less-trafficked reporter or tribunal series is less likely to be caught by memory alone.

There is also the question of overruled and distinguished cases. Indian courts cite and distinguish a great deal of older authority. A case decided in 1975 may have been distinguished in 2008 and effectively overruled in 2019. But if you find the 1975 citation, it looks real and it looks on-point. AI tools that do not track overruling status can deliver you a citation that exists and is genuinely on-point for your proposition but is no longer good law.

Speed of statutory change

India has seen significant statutory reform in the past few years: with effect from 1 July 2024, the Bharatiya Nyaya Sanhita, Bharatiya Nagarik Suraksha Sanhita, and Bharatiya Sakshya Adhiniyam replaced the Indian Penal Code, Code of Criminal Procedure, and Indian Evidence Act respectively. Judgments citing old provisions are still abundant and are still good law in many contexts, but the mapping between old and new provisions is not always straightforward. An AI tool working from a corpus with a cutoff before these changes took full effect can produce research that conflates old and new law.

Low verification culture around AI tools

Because AI tools produce output that looks researched and organised, there is a natural tendency to read them more trustingly than a first-year associate’s note. That trust is the real vulnerability. The output looks authoritative; the citations look real; the answer is structured in a way that signals confidence. A good research workflow treats all of this as a first draft, not a final answer.

What has already gone wrong in Indian courts

The risks described above are not theoretical. Indian courts have encountered AI-fabricated citations in actual proceedings, and the consequences for the parties and advocates involved have been real.

The Bombay High Court imposes costs: January 2026

In Deepak Shivkumar Bahry v. Heart & Soul Entertainment, Justice MM Sathaye of the Bombay High Court imposed costs of Rs 50,000 on a litigant on 15 January 2026 after his written submissions relied on a case that did not exist.

The submissions filed by Mohammed Yasin, director of the respondent company, bore what Justice Sathaye called “give-away” features of AI generation: repetitive phrasing, bullet-point formatting, and distinctive tick marks. A central case cited in those submissions (Jyoti w/o Dinesh Tulsiani vs Elegant Associates) could not be located by the court or its law clerks.

Justice Sathaye’s observation is worth reading carefully: “If an AI tool is used in aid of research, it is welcome; however, there is great responsibility upon the party, even an advocate using such tools, to cross verify the references and make sure that the material generated by the machine/computer is really relevant, genuine and in existence.”

The costs were directed to the High Court Employees Medical Fund, payable within two weeks. The message: filing AI-generated content without verifying it is not a technical oversight; it is conduct that attracts a financial penalty.

The Delhi High Court sees a petition withdrawn: September 2025

In September 2025, the Delhi High Court encountered a petition filed by the Greenopolis Welfare Association in which AI-generated content had replaced real authority. Among the fabricated material was a citation to paragraph 73 of Raj Narain v. Indira Nehru Gandhi (1972) 3 SCC 850: a real and well-known case. The problem: that judgment contains only 27 paragraphs. Paragraphs 73 and 74 do not exist. The quotes attributed to those paragraphs were invented.

A separate citation to Chitra Narain v. DDA, 2008 (87) DLT 276 was also included, with passages attributed to it that did not appear in the actual ruling.

The respondents filed an eight-page note documenting the fabrications. Before Justice Girish Kathpalia, Senior Advocate Rakesh Tiku (appearing for the petitioner) sought permission to withdraw the petition. The court allowed the withdrawal but made a pointed note of the circumstances in its order.

The detail that stands out is the paragraph count. Raj Narain v. Indira Nehru Gandhi is a landmark judgment from the Supreme Court’s constitutional bench; it is a real, known, heavily-cited case. The AI did not invent the case name; it fabricated the content attributed to it. That is a more sophisticated failure mode than inventing a party name, and it is harder to catch on a quick read.

The Supreme Court acts suo motu: March 2026

The most significant Indian development came in March 2026. In the matter of Gummadi Usha Rani & Anr. v. Sure Mallikarjuna Rao & Anr. [SLP (C) No. 7575 of 2026], a bench of Justice PS Narasimha and Justice Alok Aradhe took suo motu cognizance of an Andhra Pradesh trial court order that had relied on fabricated AI-generated judgments in a property dispute.

The trial court had dismissed the petitioners’ objections to a Commissioner’s report by citing past decisions that turned out not to exist.

The Supreme Court’s order of 4 March 2026 used language that has no precedent in Indian AI jurisprudence: “a decision based on such non-existent and fake alleged judgments is not an error in the decision making. It would be a misconduct and legal consequence shall follow.”

The court issued notice to the Attorney General for India, the Solicitor General of India, and the Bar Council of India, and appointed Senior Advocate Shyam Divan as amicus curiae. The case was heard further on 10 March 2026.

The significance of this language is hard to overstate. The court has drawn a clear line: citing a fabricated judgment is not a mistake that falls within the range of ordinary error. It is misconduct. For a trial court judge, that carries the prospect of disciplinary action. For an advocate, it engages Section 35 of the Advocates Act, 1961, under which professional misconduct can lead to suspension or removal from the rolls.

The ITAT recalls an order: December 2024

Earlier, in December 2024, the Bengaluru bench of the Income Tax Appellate Tribunal issued an order in Buckeye Trust v. PCIT (ITA No. 1051/Bang/2024) citing four judicial precedents. Three of the four were problematic: K. Rukmani Ammal v. K. Balakrishnan (1973) 91 ITR 631 does not exist; S. Gurunarayana v. S. Narasinhulu (2004) 7 SCC 472 does not exist; and Sudhir Gopi v. Usha Gopi (2018) 14 SCC 452 was found to refer to an entirely different case: K. Subba Rao v. State of Telangana. One citation, 57 ITR 232 (SC), was real but unrelated to the matter before the tribunal.

The order was recalled within a week under Section 254(2) of the Income Tax Act, and a fresh hearing was scheduled.

The pattern in this case is instructive: the AI-generated citations were not all invented. Some had the real-sounding shape of actual citations but pointed nowhere. One pointed to a real case but the wrong one entirely. The mix of fabricated and misattributed authority made the error harder to catch in review.

The US case that put the legal world on notice

Before Indian courts had their own documented incidents, a US case made the problem globally visible. In Mata v. Avianca, Inc. (No. 1:22-cv-01461, S.D.N.Y. 2023), attorneys Peter LoDuca and Steven A. Schwartz of Levidow, Levidow and Oberman used ChatGPT to prepare a motion in a personal injury case. The motion cited cases that did not exist: fabricated party names, fabricated holdings, fabricated internal quotations.

When Avianca’s lawyers said they could not locate several of the cited cases, the plaintiff’s attorneys provided copies of documents purportedly containing the opinions. They had gone back to ChatGPT to ask whether the cases were real. ChatGPT confirmed that they were real and that they could be found in reputable legal databases. They could not be.

Judge P. Kevin Castel issued sanctions of USD 5,000 against the attorneys, describing one of the fabricated legal analyses as “gibberish.” His published opinion, reported at 678 F.Supp.3d 443 (2023), is the first major judicial response to LLM hallucinations in legal practice, and it has been cited in legal ethics commentary worldwide.

The Mata facts also expose what made the error so damaging: the attorneys did not verify by opening a real database. They asked the same AI to confirm its own output. An AI will do this confidently. Asking a language model whether its citation is real is not verification; it is asking the system that generated the error whether the error is correct.

Chatbot versus grounded legal AI: what is actually different

The distinction between a general-purpose chatbot and a retrieval-grounded legal AI is not a marketing distinction. It is a technical one with real implications for risk.

Dimension	General-purpose chatbot	Retrieval-grounded legal AI
How it finds law	Relies on training-time statistical patterns	Searches a real indexed corpus before answering
Citation source	Generated from training; no source document	Pulled from an actual retrieved judgment
Can you open the source?	No; there is no source to open	Yes; a well-built tool links to the judgment
Hallucination rate for citations	High; model invents plausible-sounding citations	Lower; citations are drawn from indexed material
Can it still characterise a case wrong?	Yes	Yes; retrieval does not prevent mischaracterisation
Good-law checking	None	Depends on tool; check whether this is supported
Corpus completeness	Training data cutoff; no index	Limited to the corpus; recent decisions may be missing
What to do when it is wrong	You have no trail to verify against	You can open the cited judgment and check

The bottom row of that table is the operative one for your workflow. With a chatbot, a wrong citation leads to a dead end. With a retrieval-grounded tool, a wrong characterisation leads you to the actual judgment where you can check the holding yourself. The verification is faster because the tool gives you a real starting point.

Neither tool removes the duty to verify. They change what verification looks like.

See also our guide to checking whether Indian case law is still good and our guide to reading and briefing an Indian judgment for the verification steps once you have a citation in hand.

A five-step verification workflow

The fear that verification will erase the time AI saved is understandable but does not match how verification actually works. The AI has collapsed the “where do I even start” problem: it has given you a short list of specific authorities to confirm. You are verifying a list, not building one from scratch. That is significantly faster than traditional research, as long as you verify the right things in the right order.

Step 1: Confirm the case exists

Open the judgment on a primary case-law source: the Supreme Court or High Court official websites, or the e-SCR portal. Do not rely on the AI’s summary to confirm existence. Search for the exact citation or the party names directly.

If the case does not come up, stop. Do not assume the database has a gap. Search a second database. Search by party name alone. If it still does not come up, the citation is likely fabricated.

The tell in the Delhi High Court case was simple: paragraph 73 of a 27-paragraph judgment. If you open the judgment, that fact is obvious in thirty seconds.
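The two Step 1 checks can be written down as simple rules. The sketch below is illustrative only: both function names are hypothetical, the lookup results are values you record yourself after searching each source by hand, and the paragraph count must come from the judgment text you actually opened.

```python
def existence_verdict(lookups):
    """Summarise manual lookups, given as a dict of source name -> found (bool)."""
    if any(lookups.values()):
        return "found"
    if len(lookups) < 2:
        # One miss is not enough; search a second source before concluding.
        return "search another source"
    return "likely fabricated"


def out_of_range_paragraphs(cited_paragraphs, paragraph_count):
    """Return the cited paragraph numbers that the judgment cannot contain."""
    if paragraph_count < 1:
        raise ValueError("paragraph_count must be at least 1")
    return [p for p in cited_paragraphs if p < 1 or p > paragraph_count]
```

With the figures from the Delhi High Court example, `out_of_range_paragraphs([73, 74], 27)` returns both paragraph numbers, flagging the pinpoint references as impossible even though the case itself is real.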

Step 2: Read the relevant portion of the actual judgment

This is not optional. Confirm that the judgment says what the AI says it says. Read the actual paragraphs cited, not just the headnote. Headnotes are editorial summaries; they do not always capture the nuance of the holding. The proposition you intend to rely on has to appear in the text of the judgment itself.

A well-built AI tool will show you the retrieved passage from the judgment alongside its answer. Even then, read the passage in context; the sentence immediately before and after it can change its meaning significantly.

Step 3: Check that it is still good law

Confirm the authority has not been overruled, reversed, or distinguished into practical irrelevance. For Supreme Court judgments, a citator check through a dedicated commercial citator is the most reliable approach. For High Court judgments, check whether the Supreme Court has subsequently addressed the same point. For anything involving recently amended legislation, confirm the judgment’s statutory basis has not been replaced.

AI tools that flag good-law status against their own corpus give you a first-pass indicator. For anything high-stakes, that first pass is not sufficient on its own. A dedicated citator, which tracks how a judgment has been treated in later cases, can surface subsequent distinctions and overrulings that the tool’s corpus does not contain. See our detailed guide to good-law checking in India for the full citator workflow.

On neutral citations specifically: the Supreme Court now issues citations in the format YYYY INSC NNN, which are stable identifiers not tied to any private publisher. Familiarising yourself with the neutral citation system helps when you need to verify a citation quickly across different databases. We cover that system in more detail in our piece on Supreme Court neutral citations.

Step 4: Confirm the proposition matches the actual holding

This is the most intellectually demanding step and the one AI is most likely to get slightly wrong. A case can exist, can say something close to what the AI claims, and still not actually support the proposition you are relying on.

Ask specifically:

Is this statement from the majority, a concurring opinion, or a dissent?
Is this the ratio or an observation in passing (obiter)?
Does the fact pattern of the case map closely enough to your facts that the holding actually applies?
Was the court addressing the same statute, the same provision, the same procedural context?

An AI that misses the obiter/ratio distinction can hand you an observation from a dissenting judgment framed as settled law. That is a problem in a written filing and a more serious one in oral argument.

Step 5: Confirm the jurisdiction binds your forum

A High Court’s decisions bind the courts within its own territorial jurisdiction, not other High Courts. A Bombay High Court division bench decision on a contract question is not binding on the Delhi High Court, though it may be persuasive. A Supreme Court decision is binding on all courts, but some Supreme Court decisions are specifically tied to statutes that do not apply uniformly across states.

Check that the court that decided the case has authority whose binding effect reaches your forum. If you are relying on a judgment for persuasive value, say so explicitly in your submission rather than citing it as binding authority.
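The five steps are ordered, and an authority that fails an early step should not progress to the later ones. A minimal sketch of that ordering, assuming you record each result by hand after doing the check yourself (the `AuthorityCheck` class and the step names are hypothetical labels for the steps above):

```python
from dataclasses import dataclass, field

# The five steps, in the order they should be performed.
STEPS = (
    "exists",
    "read_judgment",
    "good_law",
    "proposition_matches_holding",
    "binds_forum",
)


@dataclass
class AuthorityCheck:
    citation: str
    results: dict = field(default_factory=dict)  # step name -> True/False

    def record(self, step, passed):
        if step not in STEPS:
            raise ValueError(f"unknown step: {step}")
        self.results[step] = passed

    def status(self):
        """Walk the steps in order and report the first one that blocks reliance."""
        for step in STEPS:
            if step not in self.results:
                return f"pending: {step}"
            if not self.results[step]:
                if step == "binds_forum":
                    # Not binding on your forum; cite it as persuasive, not binding.
                    return "persuasive only"
                return f"failed: {step}"
        return "verified"
```

Because `status()` walks the steps in order, a citation that has not been confirmed to exist reports `pending: exists` no matter what else has been recorded, which mirrors the rule that nothing downstream matters until the case is found.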

Prompt hygiene: how you ask shapes what you get

The verification workflow above is the safety net. Prompt hygiene is the practice that reduces how often the safety net is needed.

Ask for sources, not conclusions. Instead of asking “what does the law say about X,” ask “what cases have the Supreme Court and the relevant High Court decided on X, and what did they hold?” This shifts the output toward citations you can verify, rather than confident conclusions without sources.

Ask the tool to express uncertainty. “If you are not certain about any citation, say so” or “flag any case you are less confident about” can surface the AI’s own uncertainty signals. A well-built tool will have those signals; a general-purpose chatbot will often mask them.

Do not ask the AI to confirm its own citations. This is the Mata error. If you are uncertain about a citation, verify it in a primary database, not by re-querying the same AI. The system that generated the citation will often confirm it, because generation and confirmation use the same statistical process.

Be specific about jurisdiction, court level, and time period. An open-ended question like “what is the law on adverse possession” will produce a mix of Supreme Court authority, High Court decisions from multiple jurisdictions, and academic commentary. A more specific question (“what has the Supreme Court held on the limitation period for adverse possession of agricultural land under the Limitation Act 1963, and have any High Courts recently applied or distinguished those holdings”) gives the tool a much narrower and more answerable task.

Ask for the text to be broken down by source. If a retrieval-grounded tool is returning a synthesised answer with multiple citations, asking it to break the answer into one paragraph per authority (each grounded in one retrieved document) makes the verification work more tractable.
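These habits can be built into a reusable prompt template so they are not forgotten under time pressure. The function below is a sketch under assumptions: `build_research_prompt` and its parameters are hypothetical, the wording is only one way to phrase the request, and no prompt guarantees accurate output.

```python
def build_research_prompt(issue, court, statute=None, since_year=None):
    """Compose a narrow, source-seeking research prompt from explicit fields."""
    parts = [f"What has the {court} held on {issue}"]
    if statute:
        parts.append(f" under the {statute}")
    if since_year:
        parts.append(f" in decisions from {since_year} onwards")
    question = "".join(parts) + "?"
    instructions = [
        "List each authority separately, one paragraph per authority.",
        "For each authority, quote the passage you rely on and give its citation.",
        "If you are not certain about any citation, say so.",
    ]
    return question + "\n" + "\n".join(instructions)
```

Calling it with the adverse possession example, `build_research_prompt("the limitation period for adverse possession of agricultural land", "Supreme Court", statute="Limitation Act 1963")`, yields a question scoped by court and statute, followed by the three standing instructions. The output still goes through the verification workflow.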

Source hygiene: where the answer came from matters

Where the AI sourced its answer (whether from an indexed corpus of Indian judgments or from a general training dataset) makes a material difference to how much work your verification has to do.

Prefer tools that are built on primary Indian legal material. A tool indexed against the Supreme Court’s official website, High Court judgment databases, and statutory repositories is working from primary sources. A tool that has processed legal commentary, law review articles, and secondary analysis is a step removed from primary authority, and errors compound with distance from the source.

Understand the corpus cutoff. Every indexed corpus has a date beyond which new material has not been added. For ongoing matters, recent amendments, and fast-moving areas of law, the corpus may not reflect the current state. Ask whether the tool discloses its corpus update date.
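If a tool does disclose its corpus update date, comparing that date against the legal changes relevant to your question is a simple check. A minimal sketch, in which `events_after_cutoff` is a hypothetical helper and the corpus date in the usage note is invented for illustration:

```python
from datetime import date


def events_after_cutoff(corpus_updated, events):
    """Return names of events dated after the corpus update, oldest first.

    `events` maps a description to the date it took effect.
    """
    later = [(when, name) for name, when in events.items() if when > corpus_updated]
    return [name for when, name in sorted(later)]
```

For example, with a corpus last updated on 31 March 2024 (an invented date) and an event list containing the commencement of the new criminal codes on 1 July 2024, the function returns that event, signalling that anything the tool says about criminal law needs checking against current statutory text.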

Distinguish orientation from authority-finding. AI is reasonably reliable at summarising what the law has generally said about a category of question over many decisions. It is less reliable at supplying precise authority for a specific proposition in a specific jurisdiction at a specific time. Adjust your trust accordingly.

Cross-verify across databases for high-stakes matters. For anything that will be filed, advised to a client, or relied on in a significant argument, check the key authorities across at least two independent databases. Different commercial databases have different indexing and update cycles; a citation that one database cannot locate should be checked in another before being discarded.

Your professional duties do not change

Every obligation that applies to your legal work applies equally when the starting point was an AI tool. That is not a criticism of AI tools; it is simply what the professional duty framework says.

The duty of candour requires that everything you represent to a court as authority is actually authority. Filing a fabricated citation is not just an error; as the Supreme Court’s March 2026 order makes clear, it is misconduct. The fact that the fabrication came from an AI tool does not change the characterisation.

The duty of competence requires that you understand the tools you use. A lawyer who uses AI for research without understanding that the output requires verification is not using the tool competently, regardless of how good the output looks.

Section 35 of the Advocates Act, 1961 empowers the State Bar Council’s disciplinary committee to act on professional or other misconduct. Where an advocate knowingly or recklessly files fabricated authority (including AI-generated fabrications), the bar council and the court both have the authority to act. The Bombay High Court’s cost order and the Supreme Court’s misconduct declaration are early data points in what will likely become a body of disciplinary guidance.

Disclosure. Some courts are beginning to address whether AI assistance in preparing submissions must be disclosed. There is not yet a uniform national standard in India, but that position is evolving. Where a court’s practice directions or an applicable professional guideline asks for AI-use disclosure, that disclosure is your responsibility to make. Staying current with the guidance that applies to your forum is part of using these tools responsibly.

The mental model that keeps you safe. AI is the instrument; you are the advocate. The tool can do the mechanical work of finding and organising relevant material. The judgment about what the law is, whether an authority supports your position, what goes into a filing, and what you represent to your client is yours. And that is exactly the part a court holds you accountable for.

This does not mean AI is not useful. It means AI is useful in the same way that a law library is useful: it gives you access to material you need to work through yourself. A library does not vouch for the propositions you extract from it. Neither does an AI tool.

For a broader discussion of how AI fits into contract drafting and document review (adjacent domains where similar verification duties apply), see our AI contract drafting and review workflow piece.

A verification checklist you can use today

Use this for any AI-generated research output before it informs a filing, a significant client advice, or a submission.

Check	Done
Confirmed the case exists in at least one primary Indian database
Opened the actual judgment text (not just a headnote or summary)
Read the specific paragraphs the AI cited
Confirmed the cited statement is in the majority judgment, not a concurrence or dissent
Distinguished ratio from obiter for the proposition relied on
Checked good-law status through a citator (official portal or commercial database)
Confirmed the judgment has not been overruled or distinguished on the relevant point
Checked that the statutory basis of the judgment has not been repealed or amended
Confirmed the court’s jurisdiction binds your forum, or noted the basis for relying on it as persuasive
Verified any statutory provision cited against the current text on India Code or the Gazette
Identified whether the corpus date of the tool covers the period of your research question
Flagged any citation the AI itself indicated uncertainty about for additional checking
Not asked the AI to confirm its own citation as a substitute for opening a database

Thirteen checks sounds like a lot. Most of them take under a minute once you have the judgment open. The ones that take longer (reading the relevant paragraphs, checking good-law status, thinking about the obiter/ratio distinction) are not AI-verification steps. They are the research steps you would have to do with any source. AI just gets you to the source faster.

Frequently asked questions

What exactly is an AI hallucination in legal research?

An AI hallucination is text generated by a language model that reads as factual but has no basis in reality. In legal research, hallucinations most commonly take the form of fabricated citations: cases that sound real (correct party names, plausible reporter abbreviation, year in a sensible range) but were never decided or do not say what they are claimed to say. The model generates them because legal citation formats are predictable patterns, and pattern prediction is what language models do.

Why are general-purpose chatbots particularly dangerous for legal research?

A general-purpose chatbot has no connection to a real database of judgments at query time. It generates an answer from statistical patterns in its training data. If your question is best answered with a citation, it will generate a plausible-sounding one. There is no index, no retrieval, and no primary document behind the citation. The answer can look identical to an answer grounded in real sources, which is exactly what makes it dangerous.

What is retrieval-augmented generation and does it solve the hallucination problem?

Retrieval-augmented generation (RAG) is an architecture in which the AI system first searches a real indexed corpus and retrieves relevant documents, then asks the language model to compose an answer grounded in those documents. This means the citation points to a document that was actually retrieved from the index. It reduces hallucinations significantly because the model is working from real material rather than inference alone. It does not eliminate the risk entirely: the model can still mischaracterise retrieved material, the corpus may not contain all relevant authority, and good-law status may have changed since the corpus was last updated.

Has any Indian court imposed a penalty for AI-fabricated citations?

Yes. In January 2026, the Bombay High Court imposed costs of Rs 50,000 on a litigant (Deepak Shivkumar Bahry v. Heart & Soul Entertainment, Justice MM Sathaye) after written submissions filed on behalf of the respondent relied on a fabricated case citation. The court found the submissions bore characteristic signs of AI generation and that the respondent’s director had signed them without verifying their contents. The court made clear that the same duty to cross-verify applies to an advocate using such tools.

What did the Supreme Court of India say about AI fake citations?

In March 2026, in Gummadi Usha Rani v. Sure Mallikarjuna Rao [SLP (C) No. 7575 of 2026], the Supreme Court took suo motu cognizance of an Andhra Pradesh trial court order that relied on AI-generated, non-existent judgments. A bench of Justice PS Narasimha and Justice Alok Aradhe stated: “a decision based on such non-existent and fake alleged judgments is not an error in the decision making. It would be a misconduct and legal consequence shall follow.” The court issued notice to the Attorney General, Solicitor General, and Bar Council of India.

What happened in the Mata v. Avianca case?

In Mata v. Avianca, Inc. (No. 1:22-cv-01461, S.D.N.Y. 2023), attorneys representing the plaintiff in a personal injury case used ChatGPT to prepare a motion. ChatGPT generated citations to cases that did not exist. When the court and opposing counsel could not locate the cases, the attorneys again asked ChatGPT whether the cases were real, and ChatGPT confirmed that they were. They were not. Judge P. Kevin Castel imposed sanctions of USD 5,000 on the attorneys. The decision is reported at 678 F.Supp.3d 443 (2023).

Why is Indian legal citation format a particular problem for AI?

Indian case law can carry multiple valid citation forms for the same judgment: SCC, AIR, SCR, SCALE, MANU identifiers, and now neutral citations (INSC). A language model trained on material that uses all these formats can generate a citation that has the correct format for one system but points nowhere in any of them. There is no single authoritative index to check, which means a fabricated Indian citation requires more deliberate cross-checking than a fabricated US or UK citation that can be quickly confirmed in a single database.

What is the Buckeye Trust case and what does it illustrate?

The ITAT Bengaluru bench issued an order in Buckeye Trust v. PCIT (ITA No. 1051/Bang/2024) in December 2024 citing four precedents, three of which were fabricated or misattributed. One cited case did not exist at the citation given; one pointed to a real case with a different name; and one real citation was unrelated to the matter. The order was recalled within a week. The case illustrates a failure mode where the AI produces a mix of real and fabricated citations in the same answer, which is harder to catch than all-or-nothing fabrication.

Does asking an AI to verify its own citations work?

No. Asking the same AI whether its citation is correct does not provide independent verification. The system that generated the citation uses the same statistical process to evaluate it. As demonstrated in the Mata case, an AI will confidently confirm citations that do not exist. Verification means opening a primary case-law source (an official court portal or a commercial database) and confirming the citation independently.

What is the obiter/ratio distinction and why does it matter in AI research?

The ratio decidendi is the part of a judgment that is binding: the proposition of law that was necessary to decide the case. Obiter dicta are observations made in passing, not necessary to the decision. Courts are not bound by another court’s obiter, though it may be persuasive. An AI tool summarising a judgment may present an obiter observation as a holding, or may not distinguish between them. Before relying on any proposition from an AI-generated summary, confirm from the judgment text whether it is part of the reasoning that decided the case.

Can I rely on AI for checking whether a case is still good law?

A retrieval-grounded AI tool that explicitly tracks good-law status (flagging whether a cited case has been discussed, distinguished, or overruled in subsequent decisions within its corpus) gives you a useful first-pass indicator. For high-stakes matters, that is not sufficient on its own. The corpus may not extend to the most recent judgments. Good-law checking in Indian law is also harder than it looks, because an older case may have been distinguished in several later decisions without ever being formally overruled. Tracing that treatment is what dedicated commercial citators are built to do.

What is the professional risk of filing AI-generated material without verification?

Under Section 35 of the Advocates Act, 1961, an advocate can face disciplinary proceedings before the State Bar Council for professional or other misconduct. Filing fabricated authority (whether or not it originated from an AI) can constitute a false statement to a court, which is a serious matter. In addition to disciplinary action, the Supreme Court has indicated that filing AI-generated fake precedents engages the contempt jurisdiction of the court. The cost order from the Bombay High Court and the Supreme Court’s misconduct declaration in early 2026 are early markers of how courts intend to treat this.

Does AI make legal research faster, and is the time saving real?

Yes, the time saving is real, and it survives honest verification. AI collapses the “where do I even start” problem: it gives you a short, structured list of likely relevant authorities and organises the key points. Without AI, that starting list takes hours to compile. With AI, you have it in minutes. The verification steps (opening the citations, reading the relevant passages, checking good-law status) are steps you would have to take with any starting point. The AI does not remove those steps; it changes your starting point from a blank page to a specific short list that you then verify.

What should I do if an AI-generated case citation cannot be verified?

If a citation cannot be found in any primary case-law source (official portals or commercial databases), even after searching by party name alone, treat it as fabricated. Do not include it in any filing, advice, or submission. Do not attempt to reconstruct it by asking the AI again. Look for the underlying point of law using a different research route: a manual keyword search in a primary database, a textbook treatise with proper citations, or a reliable secondary source that footnotes primary authority.
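The rule above, combined with the five-step workflow, can be tracked per citation in a small record. This is a minimal sketch in Python; the class name, field names, and status strings are all hypothetical, and each flag is meant to be set by a person who has actually done the check in a primary source, not by an AI tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CitationRecord:
    """Hypothetical tracker for one AI-suggested citation."""
    citation: str
    # None = not yet searched; False = searched (including by party name) and not found.
    exists_in_primary_source: Optional[bool] = None
    judgment_read: bool = False
    still_good_law: bool = False
    proposition_matches_holding: bool = False
    binds_this_forum: bool = False

    def status(self) -> str:
        if self.exists_in_primary_source is None:
            return "unverified: search a primary source first"
        if not self.exists_in_primary_source:
            # Do not ask the AI to reconstruct it; research the point another way.
            return "treat as fabricated: do not file"
        remaining_checks = [
            self.judgment_read,
            self.still_good_law,
            self.proposition_matches_holding,
            self.binds_this_forum,
        ]
        if all(remaining_checks):
            return "verified"
        return "incomplete: finish the remaining checks"
```

A record only reaches "verified" when every step has been confirmed, so a citation that merely exists never passes on existence alone.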

How does the new neutral citation system help with verification?

The Supreme Court’s neutral citation system assigns each judgment a stable identifier in the format YYYY INSC NNN (for example, 2024 INSC 100). This identifier is not tied to any private publisher and does not change across different databases. A neutral citation can be used to locate the judgment on the Supreme Court’s official website or in any database that has adopted the system. This makes cross-checking faster: a single neutral citation can be confirmed in multiple sources quickly. We cover the neutral citation system in detail in our piece on SC neutral citations and free judgments.
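Because the format is fixed, a quick shape check can be automated before you search. The sketch below is illustrative only: the function and type names are hypothetical, and the pattern assumes the "YYYY INSC NNN" shape described above with a serial number of one or more digits. A well-formed string says nothing about whether the judgment exists; it only tells you what to look up.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape: four-digit year, the literal "INSC", then a numeric serial.
NEUTRAL_CITATION = re.compile(r"^\s*(\d{4})\s+INSC\s+(\d+)\s*$")


class NeutralCitation(NamedTuple):
    year: int
    serial: int


def parse_neutral_citation(text: str) -> Optional[NeutralCitation]:
    """Return the year and serial if text looks like a neutral citation, else None."""
    match = NEUTRAL_CITATION.match(text)
    if match is None:
        return None
    return NeutralCitation(year=int(match.group(1)), serial=int(match.group(2)))
```

For example, parse_neutral_citation("2024 INSC 100") yields year 2024 and serial 100, while a reporter-style citation such as "(2017) 10 SCC 1" returns None and has to be checked through that reporter instead.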

What role does prompt design play in reducing hallucination risk?

Prompt design does not eliminate the hallucination risk, but it shapes it. Asking for citations with specific court and time period constraints reduces the space of plausible inventions. Asking the tool to express uncertainty where it exists gives you a signal for where to look more carefully. Asking for source-grounded answers (one claim per source) rather than synthesised summaries makes verification more tractable. None of this substitutes for verification, but it makes the verification task more focused.
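One way to apply these constraints consistently is to build the prompt from a template rather than typing it fresh each time. The following is a rough sketch, not a recommended wording: the function name, parameters, and the exact instruction sentences are assumptions made for illustration.

```python
def build_research_prompt(question: str, court: str, year_from: int, year_to: int) -> str:
    """Assemble a research prompt with court, period, uncertainty, and sourcing constraints."""
    if year_from > year_to:
        raise ValueError("year_from must not be later than year_to")
    lines = [
        f"Question: {question}",
        # Court and time-period constraints narrow the space of plausible inventions.
        f"Only cite judgments of the {court} decided between {year_from} and {year_to}.",
        # One claim per source keeps each proposition individually checkable.
        "State each proposition on its own line, followed by the single source it rests on.",
        # Expressed uncertainty shows where to look more carefully.
        "If you are unsure that a case exists or supports the proposition, say so explicitly.",
        "If you cannot find adequate authority, say so rather than guessing.",
    ]
    return "\n".join(lines)
```

The output of such a prompt is still only a starting list; every citation it returns goes through the same verification as any other.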

Is it appropriate to disclose AI use to courts in India?

There is no uniform national requirement in India as of mid-2026, but practice is evolving. Individual courts may have issued or be developing practice directions on this point. Where such directions exist or where professional guidelines applicable to your forum address AI-tool disclosure, that disclosure obligation is yours to meet. The safer approach in significant filings is to be transparent about the use of AI as a research starting point and clear that all cited authority has been independently verified.

What features should I look for in a legal AI tool for Indian law?

Look for four things: (1) the tool retrieves from a real corpus of Indian judgments rather than relying on training-time inference; (2) it shows you the source document for every citation so you can open it and check; (3) it is transparent about its corpus cut-off date and which sources are indexed; and (4) it tells you when it cannot find adequate grounding rather than generating an answer anyway. Beyond that, apply the verification workflow to every significant answer regardless of which tool you use.

What is the difference between an AI giving wrong legal advice and an AI fabricating a citation?

Wrong legal advice is an interpretation error: the authority is real, the holding is accurate, but the AI has applied it to your facts in a way that is mistaken. That is a judgment error, the same kind of error a junior associate or a secondary source can make. Fabricating a citation is different: the authority does not exist, or the words attributed to it were never written. The first is a reason to check the application; the second is a reason never to file the citation. Both require verification, but they are not the same kind of failure.

Should lawyers stop using AI for legal research given these risks?

No. The Bombay High Court made this explicit: “If an AI tool is used in aid of research, it is welcome.” The risks described throughout this piece are not arguments against AI use; they are arguments for verification. The lawyers most exposed to these risks are not those who use AI but those who use AI without verifying. A practitioner who uses a grounded legal AI tool and applies a disciplined verification workflow gets both the research speed of AI and the reliability their practice and their court require.

How Niyam approaches this

Every answer Niyam provides cites the actual judgment it retrieved to compose that answer. You can open the source, read the judgment, and check. That is how the grounded-AI model is supposed to work: not a system you trust blindly, but a system that gives you a specific, checkable starting point for the professional judgment that has always been yours to make. This is the design principle behind Niyam’s retrieval-grounded legal research, where every proposition links back to the primary source it came from.

Niyam is built on 72,000+ Indian judgments. Your research is private, never sold or used to train public models. For questions about the tool or to discuss how it fits your research practice, write to [email protected] or try it at app.niyam.ai.
