How Neural Nets Are Liberating Legal Search from the Keyword Prison

For all the advances made in the science of legal search, we are – as Pablo Arredondo, cofounder and chief innovation officer of Casetext, likes to say – stuck in the keyword prison.

Virtually all search tools – whether for legal research, e-discovery review, document review, or anything else – are confined to indexes. Every word in every case or contract or email or whatever is fed into an index, and the search tool looks for that word or some combination of indexed words.

No doubt, keyword search is powerful. It has dramatically changed how lawyers and legal professionals work. Over the years, search scientists have improved upon it with Boolean and natural language queries.

But the bottom line remains that if the word you search is not in the index, you get no result.
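
To make that limitation concrete, here is a minimal sketch of an index-based keyword search in Python. It is a toy I wrote for illustration, not how any commercial search tool is built; the function names `build_index` and `keyword_search` and the one-document collection are my own assumptions.

```python
from collections import defaultdict

PUNCTUATION = '.,;:"'

def build_index(docs):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word.strip(PUNCTUATION)].add(doc_id)
    return index

def keyword_search(index, query):
    """Return ids of documents that contain every query word (Boolean AND)."""
    words = [w.strip(PUNCTUATION) for w in query.lower().split()]
    if not words:
        return set()
    # A word that was never indexed contributes an empty set,
    # which empties the whole intersection.
    return set.intersection(*(index.get(w, set()) for w in words))

# Hypothetical one-document collection for illustration.
docs = {
    "exit-search-case": "Employees receive no compensation for the time "
                        "spent waiting for and undergoing exit searches.",
}
index = build_index(docs)

hit = keyword_search(index, "exit searches")
miss = keyword_search(index, "uncompensated inspections")
print(hit)   # the document is found because both words are in the index
print(miss)  # empty set: neither word appears in the index
```

The second query describes the same situation as the document, but because it shares no indexed words with it, the search comes back empty.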

Why does this matter? Because in legal search, we should be able to search concepts and circumstances. We should be able to search for legal holdings or fact patterns even when the words do not square up.

That is the keyword prison in which we remain locked.

Well, your liberator is here, and she comes in the form of the neural network. It is technology that lets a search query find highly relevant results even when those results contain none of the search terms.

Not only do neural nets free search from the constraints of keywords, but they also humanize it, making search operations function more like our brains – thus the “neural” moniker.

ILTACON Panel

At ILTACON last week, the most fascinating panel I attended was one called “What is Natural Language Processing and How Can I Use It?”, at which the aforementioned Arredondo spoke, together with panelists Damien Riehl, vice president, litigation workflow and analytics, at Fastcase; Scott Reents, lead attorney, data analytics and e-discovery, Cravath, Swaine & Moore; and Samantha Seaton, knowledge management counsel, Fisher Phillips.

The set-up of the panel had Riehl making the case that there is no “right” form of NLP for every purpose in legal search – that sometimes rule-based, traditional NLP is more precise, and sometimes neural nets are the better option.

Arredondo then made the case for neural nets, with Reents and Seaton following up to provide examples of innovative ways they have used these search tools in their firms.

In his talk, Arredondo called neural nets “one of the biggest leaps forward in the history of search.”

To be clear, Arredondo has a horse in this race. In June, his company Casetext launched AllSearch, a search tool based on neural net technology.

“I see this as the most important product launch in the history of the company by far,” he told me at the time.

At the ILTACON panel, Arredondo offered some examples of the power of neural net search that – to be honest – I don’t have in my notes.

But in an earlier post I wrote about the Casetext technology, I included some examples Casetext had provided of how an earlier version of its neural net search was able to “understand” words and sentences in context. For example, when this statement was entered as a query:

 “Target’s employees were uncompensated while waiting for loss prevention inspections before leaving work.”

Casetext’s tool returned the following statement from the case Frlekin v. Apple Inc. (9th Cir. 2020):

“Employees receive no compensation for the time spent waiting for and undergoing exit searches, because they must clock out before undergoing a search.”

Thus, the search tool understood that “uncompensated” was the same as “no compensation,” that “loss prevention inspections” were similar to “exit searches,” and that “before leaving work” was similar to “must clock out.”
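
One common way to get this kind of matching is to represent each sentence as a vector of numbers and compare the vectors rather than the words. The sketch below shows only the comparison step. I do not know how Casetext implements its search, so treat this as an assumption-laden toy: the vectors are hand-assigned by me (a real system would get them from a trained neural network), and the four dimensions are hypothetical labels.

```python
import math

def cosine(a, b):
    """Cosine similarity: near 1.0 means same direction, near 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-assigned toy vectors, NOT the output of any real model.
# Hypothetical dimensions: (pay, security screening, end of shift, food)
query_vec = [0.9, 0.8, 0.7, 0.0]      # "uncompensated ... loss prevention inspections"
related_vec = [0.8, 0.9, 0.8, 0.1]    # "no compensation ... exit searches ... clock out"
unrelated_vec = [0.0, 0.0, 0.1, 0.9]  # a made-up sentence about a cafeteria menu

related = cosine(query_vec, related_vec)
unrelated = cosine(query_vec, unrelated_vec)
print(related > unrelated)  # True: the related sentence ranks first
```

The two related sentences score as close even though the comparison never looks at their words, which is the property that lets a query match a passage with different vocabulary.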

Casetext’s AllSearch takes that beyond legal research to all types of legal documents, meaning it can be used for e-discovery collections, contract review, brief banks, litigation records, deposition transcripts, expert reports, or any other collection of documents.

New Ways to Search

At ILTACON, Arredondo said this effectively opens two new and powerful ways to conduct searches.

For one, you can now simply write a complete sentence as your query, without regard to whether the sentence contains specific keywords.

“For the first time, you can write what you want to write, and the search engine will come to you,” Arredondo said.

For another, you can simply drag and drop any document and conduct a search based on its contents.

A key characteristic of neural net technology is that it effectively trains itself. You have probably heard of the game-playing AI programs developed by Google’s DeepMind, starting with the original AlphaGo – which had to be trained on thousands of human games – to AlphaGo Zero – which was given only the basic rules of the game, without human examples, and learned by playing against itself – and then to AlphaZero – which used the same self-play approach, again starting from nothing but the rules, to master three different games: chess, shogi and Go.

Training a neural net for legal search is much the same, Arredondo said – you just let it keep playing the game, by giving it sentences with missing words and letting it learn to fill in the appropriate word given the context.

For humans, that is easy. Arredondo offered the example, “The man went to the store to buy a BLANK of milk.” We all know to say “bottle” or “carton,” but not “beaver.”

If you change it to, “The woman went to the store to buy a BLANK of milk,” the response does not change. But if you say, “The man went to the store to buy a BLANK of beer,” the answer does.
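
The fill-in-the-blank game can be mimicked with simple counting. The sketch below is a toy stand-in of my own, not a neural network and not anything Arredondo showed: it hides the word before “of” in a handful of made-up training sentences, counts which hidden words appear alongside which context words, and then guesses the blank in a new sentence. The corpus and the function names `train` and `fill_blank` are assumptions for illustration.

```python
from collections import Counter, defaultdict

def train(sentences):
    """Count how often each hidden word co-occurs with each context word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        slot = words.index("of") - 1  # hide the word just before "of"
        hidden = words[slot]
        for context_word in set(words[:slot] + words[slot + 1:]):
            counts[context_word][hidden] += 1
    return counts

def fill_blank(counts, sentence):
    """Guess the BLANK by summing the counts contributed by every context word."""
    context = {w for w in sentence.lower().split() if w != "blank"}
    scores = Counter()
    for context_word in context:
        scores.update(counts.get(context_word, {}))
    return scores.most_common(1)[0][0]

# Made-up training sentences for illustration only.
corpus = [
    "the man went to the store to buy a carton of milk",
    "the woman went to the store to buy a carton of milk",
    "the man went to the shop to buy a bottle of milk",
    "the woman went to the store to buy a case of beer",
    "the man went to the store to buy a case of beer",
    "the woman went to the bar to buy a pint of beer",
]
model = train(corpus)

print(fill_blank(model, "The man went to the store to buy a BLANK of milk"))    # carton
print(fill_blank(model, "The woman went to the store to buy a BLANK of milk"))  # carton
print(fill_blank(model, "The man went to the store to buy a BLANK of beer"))    # case
```

Swapping “man” for “woman” leaves the guess alone, while swapping “milk” for “beer” changes it, and “beaver” is never a candidate because it never filled the blank in training. A neural net learns far richer patterns from vastly more text, but the training signal is this same game.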

“A very, very important breakthrough has happened,” Arredondo said.

Examples and Alternatives

In their portions of the panel, both Reents, the Cravath data analytics and e-discovery attorney, and Seaton, the Fisher Phillips knowledge management counsel, offered real-world examples of using Casetext’s AllSearch product to powerful effect – Reents for e-discovery review and Seaton for reviewing collections of collective bargaining agreements.

In Seaton’s example, the firm created a database of CBAs and then offered its clients access to it. For those clients, the advantage of the neural net technology was that they did not need to know how to construct sophisticated searches, or even have prior knowledge of contracts, to search the database effectively.

For his part, Riehl made the case that NLP tools fall along a spectrum, ranging from “traditional” symbolic AI using rule-based approaches to the newer breed of neural nets and deep-learning tools.

Along that spectrum, symbolic AI tools are often better for searching for specific legal terms and concepts, since legal terms are unambiguous, while neural nets may be better for searching for facts that involve more ambiguous concepts such as sentiment or actions, Riehl said.

The right tool often depends on the task at hand, he said. NLP can be used to:

  • Search for law or facts.
  • Tag, classify and structure data such as motions and contracts.
  • Perform analytics on data.
  • Generate first drafts of documents, phrases or citations.

The task, Riehl argued, will drive the choice of tool, with symbolic AI outperforming neural nets on precision and recall in some instances.
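
For readers unfamiliar with those two measures, precision is the share of returned documents that are actually relevant, and recall is the share of relevant documents that were returned. Here is a minimal sketch of the arithmetic; the document ids and both result sets are invented for illustration and do not describe any real tool’s performance.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search result set."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical data: four documents are truly relevant to the query.
relevant = {"doc1", "doc2", "doc3", "doc4"}

# A narrow result set: everything returned is relevant, but half is missed.
narrow_results = {"doc1", "doc2"}

# A broad result set: more relevant documents found, along with some noise.
broad_results = {"doc1", "doc2", "doc3", "doc5", "doc6", "doc7"}

print(precision_recall(narrow_results, relevant))  # (1.0, 0.5)
print(precision_recall(broad_results, relevant))   # (0.5, 0.75)
```

Which trade-off is preferable depends on the task: a reviewer who cannot afford to miss a document cares most about recall, while one who cannot afford to wade through noise cares most about precision.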

He gave the example of how Docket Alarm is using NLP to tag matters and documents for analytics, which I reported on in this post and which Riehl discussed in our recent LawNext podcast interview.

Key Takeaways

The panelists offered two key takeaways pertaining to NLP:

  1. Recent breakthroughs in applying neural nets to language have led to explosive gains, with profound and immediate implications for the legal profession.
  2. Often, rule-based NLP (Symbolic AI) is more effective/accurate.

Both of those may be true, but it seems beyond dispute that neural net technologies stand to be the greater game changer in law, freeing us all from the keyword prison.