# How to find similar judgments fast with AI for Indian lawyers

**TL;DR:** Finding the precedent that is actually on point is slow because Indian case law runs into millions of documents and keyword search misses cases that use different words for the same idea. Semantic search fixes the vocabulary problem: it converts your facts into a vector and retrieves judgments with a similar meaning, not just matching terms. The strongest tools pair this with retrieval-augmented generation (RAG) so the AI answer is grounded in real, citable judgments you can open and read. That cuts the research time, but it does not remove your duty to verify. AI legal research tools still hallucinate, and Indian courts have already imposed costs on parties who filed fabricated case law. Use AI to find candidate cases fast, then confirm each one exists, read the ratio, and check it is still good law before you cite it.

---

## On this page

- [Why finding on-point precedent is so hard](#why-finding-on-point-precedent-is-so-hard)
- [Keyword search versus semantic search, explained simply](#keyword-search-versus-semantic-search-explained-simply)
- [What "find similar judgments" actually means](#what-find-similar-judgments-actually-means)
- [How AI embeddings and RAG retrieve grounded cases](#how-ai-embeddings-and-rag-retrieve-grounded-cases)
- [The hallucination risk and why grounding matters](#the-hallucination-risk-and-why-grounding-matters)
- [Verifying good law before you rely on a case](#verifying-good-law-before-you-rely-on-a-case)
- [A practical workflow for finding similar judgments](#a-practical-workflow-for-finding-similar-judgments)
- [Pitfalls that catch out careful lawyers](#pitfalls-that-catch-out-careful-lawyers)
- [How an India-law-trained tool helps](#how-an-india-law-trained-tool-helps)
- [Putting it together: a worked example](#putting-it-together-a-worked-example)
- [Frequently asked questions](#frequently-asked-questions)
- [Start finding similar judgments faster](#start-finding-similar-judgments-faster)

---

## Why finding on-point precedent is so hard

Every Indian litigator knows the feeling. You have a set of facts, a client who needs an answer today, and a strong sense that some High Court or the Supreme Court has dealt with something close to this before. The problem is finding it. Not finding *a* case. Finding the *right* case - the one whose facts line up with yours, whose ratio supports the proposition you need, and which is still good law.

The first reason this is hard is sheer volume. Indian Kanoon alone provides access to more than 1.4 million central laws and judgments from the Supreme Court, all 24 High Courts, and numerous tribunals, sitting inside a broader database of over 20 million documents, as described in the [Indian Kanoon impact study](https://peterneis.github.io/files/Bhupatiraju_et-al_2024_Indian_Kanoon.pdf). The Supreme Court's own free e-SCR portal carries roughly 30,000 reported judgments with neutral citations, per [Bar and Bench](https://www.barandbench.com/news/ending-citation-chaos-neutral-citation-simplifies-legal-referencing-in-indian-courts). Add the paid databases and the daily flow of fresh orders, and no human can read even a fraction of what might be relevant. The precedent you need is in there. The question is whether you can surface it before the hearing.

The second reason is harder to see. Even when the right case exists, the words you search for may not be the words the judge used. A judgment about an employer terminating a daily-wage worker without a hearing might never use the phrase "principles of natural justice" in the operative paragraph; it might talk about "an opportunity to be heard" or "audi alteram partem" or simply describe the procedure that was followed. Search for one phrase and you miss the case that decided your exact point using another. This is the brittleness of keyword search, and it is the single biggest reason good precedent stays hidden.

The third reason is time pressure on the people who do most of the searching. Research is a real cost centre. Thomson Reuters has reported that a large share of an attorney's working time goes to activities other than the practice of law, with their solo and small-firm study finding around [40 percent of time spent on non-practice tasks](https://legal.thomsonreuters.com/en/insights/articles/what-would-you-do-with-40-percent-more-time). Junior associates carry the brunt of first-pass research. When the search tool is brittle, that first pass is slower and the partner gets a thinner memo. Better retrieval is not a luxury here. It changes how much of the day goes to thinking versus hunting.

So the core difficulty is not a shortage of law. It is a retrieval problem dressed up as a reading problem. You are trying to match a pattern of facts and a legal proposition against a corpus too large to read, using tools that historically matched words rather than meaning. That is exactly the gap AI-assisted search is built to close.

---

## Keyword search versus semantic search, explained simply

To use AI search well, you need a clear picture of how it differs from the search you already know. The difference is not cosmetic. It changes what the tool can find.

**Keyword search** matches the characters you typed against the characters in the documents. Type "specific performance of agreement to sell" and the engine returns documents containing those words, usually ranked by how often and how prominently the words appear. Boolean operators, proximity searches, and field filters refine this, and skilled researchers get a lot out of them. But the engine has no idea what your words *mean*. If a judgment expresses the same idea differently, the engine cannot see the connection. As the [Free Law Project notes](https://free.law/2025/03/11/semantic-search/), traditional keyword search often falls short when legal concepts appear in varied terminology across cases.

**Semantic search** works on meaning. It converts your query and every document into a numerical representation called a vector embedding, then finds the documents whose vectors sit closest to your query's vector. Closeness in this space corresponds to similarity of meaning. So a search about "patent breach" can surface a case about "intellectual property infringement" even though the words differ, an example [Airbyte uses](https://airbyte.com/data-engineering-resources/semantic-search-vs-vector-search) to explain the gap. For law, where the same principle wears a dozen different verbal costumes, this is a genuine breakthrough.

Here is the practical contrast.

| Dimension | Keyword search | Semantic search |
| --- | --- | --- |
| What it matches | Exact words and phrases | Meaning and concepts |
| Misses cases that... | Use different terminology for the same idea | Are genuinely about a different issue |
| Best for | Known citations, party names, statute numbers, exact terms of art | Fact patterns, legal concepts, "find cases like this" |
| Weakness | Brittle, vocabulary-dependent | Can over-retrieve loosely related cases |
| Your input | Carefully chosen keywords and operators | A natural description of your facts or issue |

The honest position is that these are complements, not rivals. [Redis frames it well](https://redis.io/blog/semantic-search-vs-keyword-search/): keyword search is unbeatable when you know the exact term, citation, or name; semantic search wins when you are exploring a concept and do not know the magic words. A serious legal tool runs both and blends the results, often called hybrid search. You want the precision of keyword matching for "AIR 2017 SC 4161" and the recall of semantic matching for "cases where a flat buyer got possession years late and claimed refund with interest."
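
To make "blends the results" concrete, here is a minimal sketch of one common blending method, reciprocal rank fusion. It is an illustration under stated assumptions, not a description of how any particular legal tool works: the case IDs are hypothetical, and the constant `k=60` is a conventional default rather than a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Blend several ranked lists of case IDs into one list.

    Each case earns 1 / (k + rank) for every list it appears in,
    so a case ranked well by both engines rises to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, case_id in enumerate(ranking, start=1):
            scores[case_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda case_id: (-scores[case_id], case_id))


# Hypothetical case IDs, for illustration only.
keyword_hits = ["case_A", "case_B", "case_C"]   # from the keyword engine
semantic_hits = ["case_B", "case_D", "case_A"]  # from the semantic engine

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
```

Cases that both engines return (`case_A` and `case_B` here) end up ahead of cases only one engine found, which is the behaviour you want from a hybrid search.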

If you want a deeper feel for how Indian search engines differ in practice, our guide on [choosing an Indian case law search engine](/blog/choosing-indian-case-law-search-engine) walks through what to look for.

---

## What "find similar judgments" actually means

"Find me similar cases" sounds like one request. It is really three, and the best tools let you steer between them. Knowing which kind of similarity you want is the difference between a tight set of leads and a pile of noise.

**Fact-pattern similarity.** Here you want judgments whose facts resemble yours. A man injured by a serving cart on a flight; a homebuyer who paid and never got possession; a cheque dishonoured for insufficient funds with a defective demand notice. Fact-pattern matching is what most lawyers mean when they say "a case like this." It is powerful for spotting how courts have treated situations on all fours with your own, and it is exactly where semantic search shines, because facts are described in prose that rarely uses identical words.

**Ratio similarity.** Sometimes the facts do not matter much; you want the *legal principle*. You need cases that decided a particular proposition - that delay in possession entitles the buyer to refund with interest, that bail is the rule and jail the exception, that an order passed without a hearing is void. This is matching on the *ratio decidendi*, the binding reason for the decision. As [Wikipedia summarises](https://en.wikipedia.org/wiki/Ratio_decidendi), the ratio is the point the parties actually fought over and the court actually decided, distinct from passing observations. Ratio matching is more demanding for any tool, because it has to separate the binding rule from *obiter dicta*, and a recent [arXiv survey on generative AI in legal reasoning](https://arxiv.org/pdf/2508.18880) flags exactly this as a known limitation: finding similar cases needs more than keyword overlap, it needs the ratio.

**Headnote and holding similarity.** Reported judgments come with editorial headnotes that compress the case into a few propositions. Matching on headnotes or court-prepared holdings can be a fast filter, because the headnote already states what the case stands for. It is a useful middle layer between raw facts and abstract ratio.

In practice you move between these. You might start with fact-pattern similarity to gather a broad set, then narrow to the ones whose ratio supports your specific proposition, then read the headnotes to triage which to open fully. The table below maps the three to the question each answers.

| Similarity type | The question it answers | When to lead with it |
| --- | --- | --- |
| Fact pattern | "Has a court dealt with a situation like mine?" | Early, to gather candidates from your facts |
| Ratio | "Has a court decided the principle I need?" | When your point of law is fixed and you need authority for it |
| Headnote / holding | "What does this case stand for at a glance?" | Triage, to decide which judgments to read in full |

A tool that only does one of these is doing part of the job. The reason this distinction matters so much is that a case can be factually similar yet legally useless to you, or factually distant yet exactly on your point of law. Knowing what you are matching on keeps you from celebrating the wrong cases. If you are shaky on isolating the ratio from the rest of a judgment, our walkthrough on [how to read a judgment](/blog/how-to-read-a-judgment) is the place to start.

---

## How AI embeddings and RAG retrieve grounded cases

Let us open the box a little, because understanding the mechanism tells you where to trust the tool and where to keep your guard up.

**Step one: embeddings.** Every judgment in the corpus is passed through an embedding model, which turns the text into a vector - a long list of numbers that captures its meaning. Judgments about similar issues end up with vectors that sit near each other. Your query gets the same treatment: your description of the facts becomes a vector too. The system then measures the distance between your query vector and every judgment vector, usually with cosine similarity, and returns the nearest neighbours. That is semantic retrieval in one sentence: meaning becomes geometry, and "similar" becomes "close." A plain-English explainer of this pipeline lives in [this developer guide to embeddings and vector databases](https://dev.to/imsushant12/embeddings-vector-databases-and-semantic-search-a-comprehensive-guide-2j01).
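
The "meaning becomes geometry" idea can be sketched in a few lines. This is a toy illustration only: the three-number vectors below are invented by hand, whereas a real embedding model produces vectors with hundreds or thousands of dimensions, and the labels are hypothetical stand-ins for judgments.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: values near 1.0 mean close in meaning."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented toy "embeddings"; real ones come from an embedding model.
judgments = {
    "termination without hearing": [0.9, 0.1, 0.2],
    "delayed flat possession": [0.1, 0.9, 0.3],
    "cheque dishonour notice": [0.2, 0.2, 0.9],
}

# Toy vector standing in for the query "worker dismissed without being heard".
query = [0.8, 0.2, 0.1]

# Rank judgments from most to least similar to the query.
ranked = sorted(
    judgments,
    key=lambda name: cosine_similarity(query, judgments[name]),
    reverse=True,
)
for name in ranked:
    print(f"{cosine_similarity(query, judgments[name]):.3f}  {name}")
```

The judgment whose vector points in nearly the same direction as the query comes first, even though the query and the label share almost no words.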

Domain matters here. A generic embedding model trained on the open internet does not deeply understand Indian legal language. Legal-domain models do better; the research behind LEGAL-BERT and similar legal embeddings shows domain-adapted models capture legal nuance that general models miss, a point the [Free Law Project makes](https://free.law/2025/03/11/semantic-search/) when describing domain-adapted semantic search. An embedding model that understands the difference between "quashing an FIR" and "discharge" will retrieve better than one that treats them as vaguely related text.

**Step two: retrieval-augmented generation.** Semantic retrieval gives you a ranked list of real judgments. RAG goes one step further and uses those retrieved judgments to write a grounded answer. The architecture has two parts, as [Databricks describes](https://www.databricks.com/glossary/retrieval-augmented-generation-rag): a retriever that scans the corpus and pulls the most relevant documents, and a generator (the language model) that composes an answer using those documents as its source material. The model is told, in effect, "here are the actual cases; answer using these and cite them."
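
The two-part shape can be sketched as follows. Everything here is a simplified assumption for illustration: the retriever scores passages by shared words rather than embeddings, the case names and passages are hypothetical placeholders, and the sketch stops at building the prompt instead of calling any real language model.

```python
def retrieve(query, corpus, top_k=2):
    """Stand-in retriever: rank passages by how many words they share with the query.

    A production system would rank by embedding similarity instead.
    """
    query_words = set(query.lower().split())

    def overlap(doc):
        return len(query_words & set(doc["text"].lower().split()))

    return sorted(corpus, key=overlap, reverse=True)[:top_k]


def build_grounded_prompt(question, passages):
    """Assemble the instruction the generator receives: sources first, then the question."""
    sources = "\n".join(
        f"[{i}] {p['citation']}: {p['text']}"
        for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer using ONLY the sources below and cite them by number. "
        "If the sources do not answer the question, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


# Hypothetical placeholder passages, not real judgments.
corpus = [
    {"citation": "Hypothetical Case A",
     "text": "buyer entitled to refund with interest where possession is delayed"},
    {"citation": "Hypothetical Case B",
     "text": "allottee cannot be made to wait indefinitely for possession and may seek refund"},
    {"citation": "Hypothetical Case C",
     "text": "demand notice for a dishonoured cheque must be in writing"},
]

question = "is a buyer entitled to refund with interest for delayed possession"
prompt = build_grounded_prompt(question, retrieve(question, corpus))
print(prompt)
```

The point of the design is visible in the prompt itself: the model is handed numbered sources and told to stay within them, which is what lets every statement in the answer be traced back to a passage you can open.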

This is the design choice that matters most. A general chatbot answers from its training data and its own statistical instincts. A RAG system answers from documents it just retrieved, with citations pointing back to the source, like footnotes in a paper, which is how [NVIDIA explains](https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/) the value of giving the model sources it can cite. Grounding the answer in retrieved, real content reduces the guesswork that produces fabricated citations.

It helps to picture the difference with a small analogy. Asking a pure chatbot for case law is like asking a brilliant colleague who has read everything to recall a precedent from memory at a dinner party: the answer is fluent, fast, and sometimes confidently wrong about a citation. A RAG tool is like the same colleague with the law reports open in front of them, reading out the passage and pointing to the page. The first answers from memory; the second answers from the book. In legal work, the book is the only thing a court will accept, which is why the architecture, not just the cleverness of the model, decides whether a tool belongs in your workflow.

A second design detail worth knowing is chunking. Judgments are long, often running to dozens of pages, so a RAG system usually splits each one into smaller passages before embedding them. This is why a good tool can point you not just to a case but to the specific paragraph that matters, and why retrieval quality depends on sensible splitting: chop a judgment in the wrong place and the operative reasoning gets separated from the facts that give it meaning. You do not need to engineer this yourself, but it explains why two tools searching the same corpus can return noticeably different results. The corpus is the same; the way it was prepared for retrieval is not.
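
One simple way to reduce the "chopped in the wrong place" problem is to let neighbouring chunks overlap. The sketch below assumes a judgment has already been split into paragraphs and groups them in a sliding window. The window size and overlap are arbitrary illustrative values, and real tools may split on very different boundaries.

```python
def chunk_paragraphs(paragraphs, size=3, overlap=1):
    """Group consecutive paragraphs into chunks of `size`.

    `overlap` paragraphs are repeated between neighbouring chunks so the
    reasoning is not cut off from the facts that immediately precede it.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(paragraphs), step):
        chunks.append(paragraphs[start:start + size])
        if start + size >= len(paragraphs):
            break  # the last paragraph has been covered
    return chunks


# Seven placeholder paragraphs standing in for a judgment.
paragraphs = [f"para {n}" for n in range(1, 8)]

chunks = chunk_paragraphs(paragraphs)
for chunk in chunks:
    print(chunk)
```

Each chunk begins with the paragraph that ended the previous one, so a passage retrieved on its own still carries a little of its surrounding context.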

The honest caveat, which we return to below, is that grounding *reduces* hallucination but does not *eliminate* it. The same body of research that praises RAG also documents its failure modes. Hold that thought.

---

## The hallucination risk and why grounding matters

This is the part no responsible article about AI legal research can soften. AI tools invent things, and in law, inventions get people sanctioned.

The case that woke the profession up was [Mata v. Avianca](https://en.wikipedia.org/wiki/Mata_v._Avianca,_Inc.) in the United States. Two lawyers filed a brief built on six judicial decisions that did not exist; ChatGPT had fabricated them, complete with plausible party names, real-sounding judges, and fake internal quotations. When challenged, counsel doubled down and attached further fabricated excerpts before the truth came out. In June 2023 the court [sanctioned the lawyers and fined them USD 5,000](https://www.seyfarth.com/news-insights/update-on-the-chatgpt-case-counsel-who-submitted-fake-cases-are-sanctioned.html). It became the reference point for what AI hallucination means in practice: confident, fluent, formatted correctly, and completely false.

This is not a foreign problem India can watch from a distance. It is already here. In December 2024, the Bengaluru bench of the Income Tax Appellate Tribunal passed an order citing three Supreme Court judgments and a Madras High Court ruling that did not exist, and recalled the order within a week, as reported by [LiveLaw](https://www.livelaw.in/articles/phantom-precedents-ai-generated-case-law-indian-courts-526665). In October 2025, the Bombay High Court quashed a tax assessment after finding the assessing authority had relied on non-existent decisions. In January 2026, the Bombay High Court imposed a cost of Rs 50,000 on a party for dumping fake case laws into written submissions, also covered by [Analytics India Magazine](https://analyticsindiamag.com/ai-features/indias-new-courtroom-menace-judgments-that-never-existed/). And in 2026 the Supreme Court signalled that citing AI-generated fake case law can amount to professional misconduct, asking the Bar Council of India to form an expert committee, per [MediaNama](https://www.medianama.com/2026/03/223-supreme-court-ai-fake-case-laws-misconduct/).

Now the part lawyers most need to hear: grounded, retrieval-based tools are safer, but they are not safe. The Stanford RegLab study put real numbers on this. Testing the leading commercial legal AI products, it found that purpose-built, RAG-based legal research tools from LexisNexis and Thomson Reuters still hallucinated somewhere between roughly 17 percent and 33 percent of the time, despite vendor claims of being hallucination-free, as documented in the [Stanford RegLab paper](https://reglab.stanford.edu/publications/hallucination-free-assessing-the-reliability-of-leading-ai-legal-research-tools/) and reported by [LawSites](https://www.lawnext.com/2024/05/stanford-will-augment-its-study-finding-that-ai-legal-research-tools-hallucinate-in-17-of-queries-as-some-raise-questions-about-the-results/). These were the good tools. The failure mode RAG introduces is subtle: the system retrieves a *real* case but then *mischaracterises* what it held, or cites a genuine judgment that does not actually support the proposition. The citation is real; the support is fake.

So why does grounding still matter? Because it changes the *kind* of error and the *cost* of checking. With a pure chatbot, you may be handed a citation to a case that never existed, and the only way to catch it is to fail to find it. With a grounded tool, the case is real and *the source is shown to you*, so verification becomes a click: open the judgment the tool retrieved and read whether it says what the tool claims. Grounding does not make verification optional. It makes verification fast and possible. That is the whole game.

| Risk | Pure chatbot | Grounded RAG tool | Your defence |
| --- | --- | --- | --- |
| Citation to a non-existent case | High | Low | Confirm the case exists in an authoritative database |
| Real case, wrong proposition | High | Medium | Open the source and read the actual holding |
| Outdated good-law status | High | Medium | Check subsequent history independently |
| Wrong jurisdiction relied on | Medium | Medium | Confirm the court binds your forum |

We go much deeper on the mechanics of fabrication in [AI legal research in India without the hallucination risk](/blog/ai-legal-research-india), and on the duty side in our note on the [lawyer's duty to verify AI output](/blog/lawyer-duty-verify-ai-output).

---

## Verifying good law before you rely on a case

Finding a similar judgment is only half the work. A case that perfectly matches your facts and states your principle is worthless, or worse, dangerous, if it has been overruled, distinguished into oblivion, or superseded by statute. Citing dead law in court is its own kind of embarrassment, and no AI tool removes this duty from you.

"Good law" means the proposition you are relying on is still the law: not reversed on appeal, not overruled by a larger bench, not displaced by a later coordinate-bench decision, and not rendered academic by an amendment. The check is often called by a name borrowed from US practice: "Shepardizing." In India, the paid databases offer their own citator services. SCC Online and Manupatra flag whether a case has been followed, distinguished, overruled, or referred, with Manupatra's overruled-by-a-higher-court signal noted as a distinguishing feature in this [overview of finding case law](https://lawbhoomi.com/how-to-find-case-laws/). Always confirm a judgment is still good law before you build on it.

A practical good-law check for any similar judgment your AI search surfaces:

1. **Confirm the case exists.** Open it in an authoritative source: e-SCR or the court website for Supreme Court matters, the High Court portal, or a recognised database. If you cannot find it independently of the AI tool, treat it as fabricated until proven otherwise.
2. **Read the actual holding.** Do not trust the AI's summary. Read the operative paragraphs and confirm the case decides the proposition you need, on a fact pattern close enough to bind or persuade.
3. **Check subsequent history.** Has it been appealed, referred to a larger bench, overruled, or distinguished? A citator, the "cited by" links on the case, and a quick search for the case name in later judgments will surface this.
4. **Check the statute.** Even a correctly decided case can be undercut by a later amendment. The new criminal laws replacing the IPC, CrPC and Evidence Act are a live example - a case interpreting an old section may need re-reading against the new provision.
5. **Confirm jurisdiction.** A Supreme Court ruling binds every court in India. A High Court ruling binds courts within its territory and is persuasive elsewhere. Make sure the authority actually carries weight in your forum.
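
If your team tracks research in a script or spreadsheet export, the five checks above can be recorded as a simple gate. This is a hypothetical sketch: the class and field names are invented for illustration, and the booleans are filled in by a person who has actually done each check, not by the tool.

```python
from dataclasses import dataclass, fields


@dataclass
class GoodLawCheck:
    """One record per case; every field starts unchecked."""
    exists_in_authoritative_source: bool = False
    holding_read_and_on_point: bool = False
    subsequent_history_checked: bool = False
    statute_still_current: bool = False
    carries_weight_in_forum: bool = False

    def outstanding(self):
        """Names of the checks that have not been completed yet."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def ready_to_cite(self):
        """True only when all five checks are done."""
        return not self.outstanding()


# A case that has been located and read, but not yet checked further.
check = GoodLawCheck(
    exists_in_authoritative_source=True,
    holding_read_and_on_point=True,
)
print(check.ready_to_cite())
print(check.outstanding())
```

The value of writing it down this way is that a case with any step outstanding cannot quietly slip into a draft.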

Our dedicated guide on [good-law checking](/blog/good-law-checking) expands each of these steps with Indian examples.

---

## A practical workflow for finding similar judgments

Here is a workflow that uses AI for what it is good at - fast, meaning-based retrieval - while keeping the verification that protects you. It moves from broad to narrow, and ends where every research task should end, with subsequent history.

**1. Start broad with your facts.** Describe your matter in plain language, the way you would explain it to a colleague: "Flat buyer paid full consideration in 2019, builder delayed possession by four years, buyer wants refund of principal with interest and compensation." Feed this to semantic search and let it gather a wide set of fact-similar candidates. Do not over-specify yet; you are casting a net.

**2. Skim and triage on headnotes.** For each candidate, read the headnote or the tool's grounded summary to decide whether it is worth opening. You are looking for cases whose holding touches your proposition. Discard the clearly off-point ones now so you do not waste reading time later.

**3. Narrow by ratio.** Take your surviving candidates and ask the sharper question: does this judgment's *ratio* support the exact proposition I need? This is where you switch from "similar facts" to "right principle." Open the judgment, find the operative reasoning, and confirm the binding rule, separating it from passing observations. Keep only the cases whose ratio is genuinely on your point.

**4. Confirm each case independently.** For every case you intend to cite, open it in an authoritative database and confirm it exists and reads as the tool claimed. This is the non-negotiable step that defends against hallucination. The source link a grounded tool gives you makes this fast.

**5. Check subsequent history and good-law status.** Run the good-law check above on your final shortlist. A case is not ready to cite until you know nothing has displaced it.

**6. Build your authority ladder.** Order your final cases from binding to persuasive, most recent to older, most factually similar to more analogous. This is the structure your written submission needs anyway.

The table summarises which engine to lean on at each stage.

| Stage | Goal | Lead with | Verification |
| --- | --- | --- | --- |
| 1. Broad | Gather candidates | Semantic search on facts | None yet |
| 2. Triage | Filter to relevant | Headnotes / grounded summaries | Light |
| 3. Narrow | Match the principle | Ratio reading | Read the operative paragraphs |
| 4. Confirm | Defeat hallucination | Authoritative database | Open and read each case |
| 5. History | Confirm still good | Citator / "cited by" | Independent check |
| 6. Order | Build the argument | Your judgment | Final read-through |

Notice that AI does the heavy lifting in stages one to three and then steps back. The closer you get to citing, the more the work is yours. That is the right division of labour.
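
The ordering in step 6 can be sketched as a three-level sort. Treating the three criteria as strict tie-breakers in that order is an assumption made for illustration, and the case names, years, and similarity scores below are hypothetical. Whether a case binds your forum is something you decide, not something the search tool knows.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    binding: bool      # does the deciding court bind your forum? (your call)
    year: int
    similarity: float  # semantic similarity score from the search tool, 0 to 1


def authority_ladder(cases):
    """Binding before persuasive, then most recent, then most factually similar."""
    return sorted(cases, key=lambda c: (not c.binding, -c.year, -c.similarity))


# Hypothetical candidates, for illustration only.
candidates = [
    Candidate("Case P", binding=False, year=2023, similarity=0.95),
    Candidate("Case Q", binding=True, year=2021, similarity=0.80),
    Candidate("Case R", binding=True, year=2018, similarity=0.70),
]

ladder = [c.name for c in authority_ladder(candidates)]
print(ladder)
```

Note that Case P has the highest similarity score and the latest date, yet lands last because it does not bind the forum. The search ranking is a reading order; the ladder is the order of authority.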

---

## Pitfalls that catch out careful lawyers

Even good researchers stumble on AI-assisted search in predictable ways. Knowing the traps is most of the defence.

**Treating fluency as accuracy.** A well-written AI summary of a case feels authoritative. Fluency is not truth. A confidently worded summary can quietly misstate the holding. Read the source.

**Stopping at fact similarity.** A case can match your facts beautifully and decide a different point of law. Always push through to the ratio. Factual resemblance is a lead, not an authority.

**Assuming retrieval means the corpus is complete.** No index contains every judgment, tribunal order, and notification. If the tool returns nothing on a point, that is not proof no authority exists. Absence of a result is not a finding.

**Ignoring good-law status because the case is recent.** Even a 2024 judgment can be referred to a larger bench or distinguished by 2026. Recency is not the same as settled.

**Over-trusting a single tool.** The Stanford numbers apply to the market leaders. Cross-check important authorities against a second source, and against the official court record.

**Pasting privileged facts into a public chatbot.** Feeding client confidences into a general consumer AI raises confidentiality concerns and gives you no control over where the data goes. Use a tool built for legal work with appropriate data handling. We discuss this in our note on [legal AI data residency in India](/blog/legal-ai-data-residency-india).

**Letting juniors cite what they have not read.** The whole point of faster retrieval is more time to *read*, not less. If an associate hands up a list of cases nobody has opened, the speed has bought you risk, not value. The safest team habit is a simple rule: no case enters a draft until at least one person has opened the judgment and read the paragraphs being relied on. The tool can shorten the search to minutes, but the reading is what stands between you and a cost order.

**Confusing a high similarity score with a strong authority.** Semantic ranking tells you how close two texts are in meaning, not how persuasive a case is for your argument. A factually identical matter decided by a single bench of a distant High Court may rank above a binding Constitution Bench decision that is phrased more abstractly. Read the ranking as a reading order, not a hierarchy of authority. The court cares about which bench decided the point and whether it binds your forum, not about cosine distance.

---

## How an India-law-trained tool helps

A search tool is only as good as the corpus it searches and the language it understands. This is where a tool built specifically for Indian law pulls ahead of a generic AI assistant.

**It searches the right corpus.** A general chatbot has read a slice of the internet and remembers it imperfectly. An India-first tool retrieves from an actual, maintained index of Indian judgments and statutes, so the cases it surfaces are real Indian cases, and the source is there to open. That single design choice moves you from "hope the model remembers correctly" to "read the judgment it just retrieved."

**It understands Indian legal language.** Embeddings trained on Indian judgments grasp that "quashing under Section 482 CrPC" and "quashing under Section 528 BNSS" are close cousins, that "anticipatory bail" and "pre-arrest bail" are the same idea, and that a description of an unheard worker is reaching for natural justice. Domain understanding is what makes semantic retrieval accurate rather than merely fuzzy, the lesson from [domain-adapted legal embeddings](https://free.law/2025/03/11/semantic-search/).

**It is built to show its sources and admit uncertainty.** A tool designed for legal work links every proposition to the judgment it came from, so verification is a click, and a well-designed system tells you when it lacks adequate grounding rather than inventing an answer. That honesty is a feature, not a weakness.

**It keeps Indian data handling in mind.** For privileged client material, where data is processed and stored is a professional concern, not a technicality. A tool built for the Indian profession is built with that in mind. We have written about why this matters in [sovereign AI for India's legal sector](/blog/sovereign-ai-india-legal-tech).

This is the design philosophy behind Niyam's similar-judgments search. You describe your facts or your issue in plain language; it retrieves real Indian judgments by meaning, not just by keyword; it shows you the source so you can read the ratio yourself; and it is built to surface, not to fabricate. The tool finds the leads fast. You stay the lawyer who decides what to cite.

---

## Putting it together: a worked example

Take a concrete matter. Your client is a homebuyer who paid the full price for a flat in 2019. The builder delayed possession by four years and is now offering possession but no compensation. The client wants a refund of the principal with interest, plus compensation for the delay. You need precedent.

A keyword search for "refund of flat with interest delayed possession" returns documents containing those words. Useful, but it misses the judgment that decided your exact point using "deficiency in service" and "the allottee cannot be made to wait indefinitely," because those words are different.

A semantic search on your plain-language description - "homebuyer paid in full, builder delayed possession four years, buyer seeks refund with interest" - retrieves cases that *mean* the same thing regardless of phrasing. You get a broad set of fact-similar candidates across consumer fora and the higher courts.

You triage on headnotes, keeping the ones whose holding touches refund and interest. You narrow by ratio, opening the strongest candidates and confirming each one actually decides that a buyer facing inordinate delay is entitled to refund with interest, not merely that delay is regrettable. You then confirm each surviving case independently in an authoritative database, read the operative paragraphs yourself, and check the citator and "cited by" links for any later decision that distinguished or overruled it. Finally you order your cases from binding Supreme Court authority down to persuasive forum decisions.

The AI compressed hours of brittle keyword hunting into minutes of reading the *right* judgments. It did not write your argument, vouch for the law, or excuse you from reading. That is exactly the line a careful lawyer wants. For the citation mechanics once your cases are settled, see our guide on [how to cite Indian judgments](/blog/how-to-cite-indian-judgments).

---

## Frequently asked questions

**Is semantic search better than keyword search for finding case law?**

For exploring a fact pattern or a legal concept, yes, because it finds cases that mean the same thing even when they use different words. For pinpoint lookups like a known citation, party name, or statute number, keyword search is faster and more precise. The best tools run both and blend the results, so you get keyword precision and semantic recall together rather than choosing between them.

**Can AI find the exact precedent I need on its own?**

It can find strong candidates very fast, but the final judgment of relevance is yours. A case can match your facts and decide a different point of law, or state your principle on facts too distant to bind. AI narrows millions of documents to a readable shortlist; you read the ratio and decide what is actually on point.

**Will a grounded RAG tool stop AI from hallucinating citations?**

It reduces the risk substantially but does not remove it. The Stanford RegLab study found leading RAG-based legal tools still produced incorrect information in roughly 17 to 33 percent of queries, often by citing a real case that does not support the stated proposition. Grounding makes the error cheaper to catch, because the source is shown and you can open it, but it never removes your duty to verify each case before you rely on it.

**Has any Indian court actually penalised lawyers for AI-fabricated case law?**

Yes. The Income Tax Appellate Tribunal in Bengaluru recalled an order in December 2024 after it cited non-existent judgments. The Bombay High Court quashed a tax assessment in October 2025 over fabricated decisions and, in January 2026, imposed a cost of Rs 50,000 on a party for filing fake case laws. In 2026 the Supreme Court indicated that citing AI-generated fake case law can amount to professional misconduct and asked the Bar Council of India to study the issue.

**How do I check whether a judgment is still good law?**

Confirm it exists in an authoritative source, read the actual holding, then check its subsequent history for any reversal, overruling, distinguishing, or larger-bench reference. Citator services on SCC Online and Manupatra flag this, and the "cited by" links on a case plus a search for the case name in later judgments will surface most problems. Also check whether any later statute or amendment has displaced the point.

**What does "find similar judgments" mean in practice?**

It can mean three things: cases with similar facts, cases that decided the same legal principle (the ratio), or cases whose headnote states the proposition you need. They are not the same. You usually start with fact similarity to gather candidates, then narrow by ratio to find genuine authority for your point, using headnotes to triage which judgments to read in full.

**Is it safe to paste my client's facts into a general AI chatbot to search?**

Be careful. Consumer chatbots may use your inputs in ways you do not control, which raises confidentiality concerns for privileged material. For client work, use a tool built for the legal profession with appropriate data handling, and avoid pasting identifying client details into general-purpose consumer AI.

---

## Start finding similar judgments faster

Finding on-point precedent should not eat your day. Niyam's similar-judgments search retrieves real Indian judgments by meaning, not just by keyword, shows you the source so you can read the ratio yourself, and is built to surface authority rather than fabricate it. You get the leads in minutes; you stay the lawyer who decides what to cite.

Try it on a live matter and see how much faster the right case appears.

[Start for ₹100](https://app.niyam.ai/register)