Deep Mode: The Engineering Behind Arbitration’s Most Rigorous AI Mode

THE AUTHOR:
Ayushman Dash, Head of Data and AI at Jus Mundi

International arbitration produces some of the most complex legal questions in practice: multi-jurisdictional disputes, interlocking treaty frameworks, and decades of awards across dozens of institutions.

No tool has ever been truly built to handle all of it.

Even the best legal AI research tools hit a ceiling on the queries that actually decide cases. The queries that matter most, the complex, layered, high-stakes ones, have always demanded more than a fast answer. They require sustained reasoning, precise source evaluation, and analysis that goes beyond what any single search can surface. At that level of complexity, even strong tools fall short.

We heard this repeatedly from the lawyers, arbitrators, and researchers in our community. The answers were fast and the sources were relevant, but for the hardest queries something was still missing: genuine depth. The kind that lets you walk into a hearing, a drafting session, or a client meeting with full confidence in your research.

Deep Mode is our answer to that gap.

Why Deep Takes 20 Minutes. And Why That Is the Point.

Most AI research modes operate under time constraints. To deliver a fast answer, the system gathers what it can, reasons through it, and concludes. But remove the time constraint entirely, and something very different becomes possible.

Deep Mode takes the time it needs to conduct thorough research through a continuous planning loop. A planning agent breaks your query into tasks, dispatches specialized sub-agents to carry them out, collects their findings, and then decides: what is still missing? What needs to go deeper? The loop continues, refining and expanding, until the answer is genuinely complete, not just good enough.
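To make the loop concrete, here is a minimal sketch of a plan, dispatch, collect, and re-plan cycle. It is an illustration under stated assumptions, not Jus Mundi's implementation: the names Task, ResearchState, run_planning_loop, and the toy agents are all hypothetical, and the max_rounds cap is a safety guard added for the example rather than something the product is described as having.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Task:
    agent: str  # hypothetical name of the sub-agent that should handle the task
    topic: str  # what that sub-agent should research


@dataclass
class ResearchState:
    query: str
    findings: List[str] = field(default_factory=list)
    rounds: int = 0


def run_planning_loop(
    query: str,
    plan: Callable[[str], List[Task]],
    agents: Dict[str, Callable[[str], List[str]]],
    find_gaps: Callable[[ResearchState], List[Task]],
    max_rounds: int = 5,  # assumption: a guard so the sketch always terminates
) -> ResearchState:
    """Plan tasks, dispatch them to sub-agents, collect findings, then re-plan."""
    state = ResearchState(query=query)
    tasks = plan(query)
    while tasks and state.rounds < max_rounds:
        state.rounds += 1
        for task in tasks:
            handler = agents.get(task.agent)
            if handler is None:
                continue  # no agent registered for this task type
            state.findings.extend(handler(task.topic))
        # Ask what is still missing; an empty list ends the loop.
        tasks = find_gaps(state)
    return state


# Toy stand-ins so the sketch runs end to end (all hypothetical).
def toy_plan(query: str) -> List[Task]:
    return [Task("case_law", query), Task("treaties", query)]


toy_agents = {
    "case_law": lambda topic: [f"case law on {topic}"],
    "treaties": lambda topic: [f"treaty text on {topic}"],
    "landmark": lambda topic: [f"landmark award on {topic}"],
}


def toy_find_gaps(state: ResearchState) -> List[Task]:
    # Request a landmark-case pass if none has been collected yet.
    if not any(f.startswith("landmark") for f in state.findings):
        return [Task("landmark", state.query)]
    return []


result = run_planning_loop(
    "fair and equitable treatment", toy_plan, toy_agents, toy_find_gaps
)
```

In this toy run the first round gathers case law and treaty findings, the gap check notices that no landmark authority has surfaced, and a second round fills that gap before the loop stops.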

The diagram below shows how this works in practice:

Several of those sub-agents were built specifically to address the failure modes we kept seeing in complex arbitration research:

A landmark case agent ensures that foundational precedents are never missed. Previously, time-constrained modes would sometimes move to a conclusion before surfacing the most significant authorities. That no longer happens.
A tribunal reasoning agent focuses on how awards actually work through a legal question: the test applied, the evidence considered, the distinctions drawn, and the limits recognized. This is the most technically demanding part of any arbitration analysis, and it now has a dedicated agent.
A faithfulness correction agent runs continuously throughout the process, not just at the end. Every claim is checked against its source before it makes it into your answer. If the support is not there, the claim is rejected.
Dedicated research agents for case law, treaties, publications, and applicable rules versions work in parallel, each focused on a single source type so nothing gets shortchanged in the broader search.
A reflection agent sits above the whole process, reviewing findings, identifying gaps, and deciding when the research is genuinely complete. Calibrating that agent, knowing when to reflect and when to move, was one of the harder engineering problems we solved: too much reflection and the system overthinks, too little and it misses things. Through continuous testing and refining, we got that balance right.

And this list keeps growing. As we identify new gaps and edge cases, we build new specialized agents to close them.
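As one illustration of the agents above, the faithfulness check can be pictured as a filter that keeps a claim only when its cited source supports it. The sketch below is deliberately naive and rests on assumptions: it uses simple word overlap as a stand-in for whatever support test the real agent applies, and the names Claim, is_supported, filter_faithful, and the 0.6 threshold are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Claim:
    text: str
    source_id: str  # identifier of the source the claim cites


def _tokens(text: str) -> Set[str]:
    """Lowercased alphanumeric word set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def is_supported(claim: Claim, sources: Dict[str, str], threshold: float = 0.6) -> bool:
    """Naive stand-in check: share of claim words that appear in the cited source."""
    source_text = sources.get(claim.source_id)
    if source_text is None:
        return False  # a claim with no retrievable source is never supported
    claim_tokens = _tokens(claim.text)
    if not claim_tokens:
        return False
    overlap = len(claim_tokens & _tokens(source_text)) / len(claim_tokens)
    return overlap >= threshold


def filter_faithful(
    claims: List[Claim], sources: Dict[str, str], threshold: float = 0.6
) -> Tuple[List[Claim], List[Claim]]:
    """Split claims into (kept, rejected) according to the support check."""
    kept: List[Claim] = []
    rejected: List[Claim] = []
    for claim in claims:
        target = kept if is_supported(claim, sources, threshold) else rejected
        target.append(claim)
    return kept, rejected


# Made-up example data, for illustration only.
sources = {
    "award-1": "The tribunal held that the claimant failed to prove a legitimate expectation.",
}
claims = [
    Claim("The tribunal held the claimant failed to prove a legitimate expectation", "award-1"),
    Claim("The tribunal awarded full damages plus compound interest", "award-1"),
    Claim("The respondent breached the treaty", "award-2"),
]
kept, rejected = filter_faithful(claims, sources)
```

Here the first claim is kept, the second is rejected because the cited award does not say it, and the third is rejected because its source cannot be found. A production check would need far more than word overlap, but the keep-or-reject shape is the same.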

Underlying all of this is Tenet, our proprietary language model for search and retrieval, trained specifically on international arbitration. General-purpose models, however powerful, are not built for the domain-specific reasoning that arbitration demands. Tenet is. And when we connected Tenet to Deep Mode’s extended reasoning loop, the improvement was immediate and significant. That combination, a purpose-built model running over our exclusive corpus with no time limit, is what makes Deep Mode’s output different in kind, not just degree.

While queries take around 20 minutes on average, that is not a limitation. It is what genuine depth requires. When your answer is ready, you will find a comprehensive, source-grounded analysis that is ready to inform your work.

The Benchmark Behind Deep: How We Know It Works

We run every update through a rigorous quality-assurance process using Anchor, our homegrown evaluation tool. Anchor supports our two-phase QA framework and flags queries for a second round of human review.

We measure answer quality across five dimensions:

Faithfulness determines whether a cited authority actually supports the conclusion drawn, or whether the model is reasoning beyond what the source says.
Correctness means the law is right, not just plausible.
Completeness captures whether the answer covers the full scope of the question.
Relevance separates the authorities that are truly applicable from those that are merely analogous.
Language & Format determines whether the output is structured clearly enough to actually use.
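A per-dimension comparison of two modes can be sketched as follows. This is an assumption-laden illustration rather than a description of Anchor: the 1 to 5 rating scale, the function names, and every score below are made-up placeholders, not Jus Mundi benchmark results.

```python
from statistics import mean
from typing import Dict, List

DIMENSIONS = (
    "faithfulness",
    "correctness",
    "completeness",
    "relevance",
    "language_format",
)

Ratings = List[Dict[str, int]]  # one dict of dimension -> score per test query


def mean_by_dimension(ratings: Ratings) -> Dict[str, float]:
    """Average the reviewer score for each dimension across all test queries."""
    return {d: mean(r[d] for r in ratings) for d in DIMENSIONS}


def compare_modes(baseline: Ratings, candidate: Ratings) -> Dict[str, float]:
    """Per-dimension difference in mean score (candidate minus baseline)."""
    base = mean_by_dimension(baseline)
    cand = mean_by_dimension(candidate)
    return {d: cand[d] - base[d] for d in DIMENSIONS}


# Placeholder scores on an assumed 1-5 scale; NOT real benchmark data.
baseline_ratings = [
    {"faithfulness": 4, "correctness": 4, "completeness": 2, "relevance": 4, "language_format": 4},
    {"faithfulness": 4, "correctness": 4, "completeness": 4, "relevance": 4, "language_format": 4},
]
candidate_ratings = [
    {"faithfulness": 5, "correctness": 4, "completeness": 5, "relevance": 4, "language_format": 4},
    {"faithfulness": 5, "correctness": 4, "completeness": 5, "relevance": 4, "language_format": 4},
]

deltas = compare_modes(baseline_ratings, candidate_ratings)
largest_gain = max(deltas, key=deltas.get)
```

Scoring each dimension separately, rather than collapsing everything into one number, is what makes it possible to say which aspect of answer quality moved between two modes.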

When we benchmarked Deep Mode against Think Mode across a test set of complex arbitration queries, the improvements were consistent and meaningful across every dimension:

The completeness gain is the most telling. That is exactly the dimension most affected by time constraints in previous modes, and it is the dimension that matters most when stakes are high.

Expert Tested, Practitioner Approved

Before the preview release, we worked closely with practitioners at some of the world’s leading arbitration practices to test Deep Mode on real, high-complexity queries.

The feedback was unambiguous.

“The depth of analysis provided by Jus AI’s Deep Mode is a game changer for international arbitration research. It offers genuinely sophisticated, context-aware insights into complex treaty and contract disputes.”

– Thomas Snider, Partner and Head of International Arbitration, Charles Russell Speechlys.

“Jus Mundi’s Deep Mode is the most impressive AI-powered arbitration research tool that I have used. It understands the key principles of international arbitration and is able to analyse even the most complex legal questions with striking accuracy and meticulous detail. Clearly, it is in a league of its own!”

– Lucia Bizikova, Senior Associate, WilmerHale

What came through consistently across our alpha testers was not just that the answers were longer or more detailed. It was that the output felt qualitatively different, closer to the output of a senior associate who had actually done the research, not a summary of what was retrievable in a given timeframe.

When to Use Deep Mode

Deep Mode is not the default, and that is intentional. Think Mode remains the right tool for the majority of queries: quick lookups, single-jurisdiction questions, early-stage research, and anything where speed matters. Deep Mode is for a different category of question entirely.

Use Deep Mode when your query requires:

Multi-jurisdictional surveys where gaps in coverage have real consequences
Full doctrinal analysis, not just a summary of the field
On-point case discovery where completeness and landmark authorities are essential
Drafting support where citation accuracy cannot be left to chance
Arbitrator profiling that requires recognizing patterns across a large body of awards
Complex questions involving tribunal reasoning where shallow synthesis is not enough

A useful way to think about it: if Think Mode is the sharp, fast legal associate, Deep Mode is what happens when that associate gets a promotion. Not just more information, but better judgment about what matters, where the real authorities sit, and how much confidence the law actually supports.

Each AI Core and AI Max user receives 10 Deep Mode queries during the preview phase. That limit is intentional: Deep Mode consumes significantly more tokens than any of our other modes, resulting in substantially higher costs. The quota is designed to ensure every query is used deliberately, on the questions that genuinely demand this level of analysis.

What Comes Next

Deep Mode is the furthest we have pushed the boundary of what AI-powered arbitration research can deliver. But it is not the ceiling.

We are already working on the next iteration of Tenet, which will feed further gains into Deep Mode’s reasoning pipeline. We are also listening closely to how practitioners are using it during the preview phase, and that feedback will shape how Deep Mode is positioned and refined for the long term.

Arbitration generates the most complex legal questions in practice. Deep Mode is built to answer them.

Explore Deep Mode

And see what you can achieve with arbitration’s most rigorous AI mode.

About Jus Mundi

Founded in 2019 and recognized as a mission-led company, Jus Mundi is a pioneer in the legal technology industry dedicated to powering global justice through artificial intelligence. Headquartered in Paris, with additional offices in New York, London, and Singapore, Jus Mundi serves over 150,000 users from law firms, multinational corporations, governmental bodies, and academic institutions in more than 90 countries. Through its proprietary AI technology, Jus Mundi provides global legal intelligence, data-driven arbitration professional selection, and business development services.

Press Contact: Helene Maïo, Senior Digital Marketing Manager, Jus Mundi

*The views and opinions expressed by authors are theirs and do not necessarily reflect those of their organizations, employers, or Daily Jus, Jus Mundi, or Jus Connect.
