Leni Tops Four Major AI Benchmarks, Outperforming Systems from OpenAI, Anthropic, Google, and Perplexity
PR Newswire
NEW YORK, May 12, 2026 /PRNewswire/ -- Leni, an AI-powered analytics platform for commercial real estate, today announced top-tier results on four independent AI benchmarks. Leni placed first on the DRACO Benchmark for deep research, placed in the top two on SpreadsheetBench Verified, outperformed every public model on BullshitBench, and ranked ahead of Genspark, Manus, and OpenAI Deep Research on GAIA.
“Most teams obsess over models, but the key engineering needed for effective AI adoption, which delivers highly accurate and reliable results for teams, relies on architecture or harness,” said Leni CEO and Co-Founder Arunabh Dastidar. “That’s why the most popular coding tool today is 98 percent harness and 2 percent models. We called it years ago and have produced purpose-built infrastructure that can reliably be used for serious work where accuracy and security are crucial. It shifts the work from babysitting and guessing to trusted, verifiable output, so teams can move faster with confidence.”
DRACO, developed by Perplexity AI and Harvard, measures whether AI can produce in-depth research that a senior analyst would sign off on. Leni scored 71.6 percent, ahead of the deep research products from Perplexity, Google, and OpenAI.
SpreadsheetBench Verified, which grades AI on hundreds of real spreadsheet tasks, ranked Leni in the top two globally, with 365 of 400 tasks (91.25 percent) completed correctly.
On BullshitBench (Version 2), which tests whether AI pushes back on nonsensical questions instead of inventing an answer, Leni caught 98 percent of fabricated premises, ahead of all 142 public AI models on the leaderboard.
GAIA, developed by Meta and Hugging Face, measures whether AI can complete real-world, multi-step tasks in which an early mistake would throw off the final answer. Leni scored 77.0 percent on the validation set, ahead of Genspark, Manus, and OpenAI Deep Research.
In commercial real estate, where the margin for error is zero, these benchmarks measure whether a system can accurately produce the analysis that determines whether a deal closes.
The results matter because the gap between AI promise and AI reliability is costing companies real money, according to Dastidar. A staggering 99 percent of companies reported financial losses tied to AI-related risks, with an average loss of $4.4 million per company and an estimated $4.3 billion across the 975 respondents, according to an EY survey published in October 2025. The pattern is prevalent in commercial real estate, where 92 percent of CRE firms have piloted AI but only 5 percent say they have achieved all of their AI goals, according to JLL’s 2025 Global Real Estate Technology Survey.
“If I had to describe Leni’s impact, it’s simple: faster and easier,” said Scott Jones, Vice President of IT at Ram Realty Advisors. “On the asset management side in particular, teams are no longer stuck doing manual work. The data flows directly from the source, and they can trust it. Leni shifts the focus away from aggregating information and building reports to what actually matters: finding deals, executing them better, and running assets more effectively.”
Leni’s agentic AI platform is designed for investment, asset management, and operations teams across commercial real estate, pulling data from PDFs, spreadsheets, and core systems to execute complex workflows end to end. At the platform’s core is its Universal Data Model (UDM), the industry’s first standardized data framework for multifamily real estate, developed over three years by a team that includes alums from MIT, Greystar, EY, and Geoffrey Hinton’s Vector Institute. The UDM creates a common language for a sector long defined by proprietary formats and data silos, integrating across every major real estate system. The result is secure, model-agnostic automation that delivers decision-ready outputs without requiring in-house AI infrastructure.
“Trust is the most important part of any AI system that a business actually uses,” said Leni’s Head of Industry Strategy, Marcio Sahade, who previously spent 14 years at firms such as Tishman Speyer and Hines. “If a team cannot rely on what comes back, they end up redoing the work themselves, and the AI never delivers on its promise.”
He added, “What these benchmarks measure is exactly that gap: whether a system can be trusted to produce finished work, not just plausible-sounding output. That is the bar we hold ourselves to with every customer.”
About Leni
Leni is a secure, accuracy-driven AI platform purpose-built for serious investment work across the commercial real estate, lending, and investment sectors. Since its public launch in 2023, the company has raised $8.5 million to build best-in-class AI infrastructure for the sector. Leni enables accurate, secure, and context-aware deliverables for investment and asset management teams. The platform today supports a total portfolio of over $40 billion in assets under management. For more information, visit: http://www.leni.co.
View original content to download multimedia: https://www.prnewswire.com/news-releases/leni-tops-four-major-ai-benchmarks-outperforming-systems-from-openai-anthropic-google-and-perplexity-302769724.html
SOURCE Leni


