Top Papers — Mar 15
The highest-scoring arXiv ML papers from Mar 15, ranked by LLM relevance.
Ninghui Li +3 · 2026-03-12
This article, a lightly adapted version of Perplexity's response to NIST/CAISI Request for Information 2025-0035, details our observations and recommendations concerning the security of frontier AI ag…
Guanyu Jiang +4 · 2026-03-12
Multimodal agents can now tackle complex reasoning tasks with diverse tools, yet they still suffer from inefficient tool use and inflexible orchestration in open-ended settings. A central challenge is…
Yuanjie Lyu +4 · 2026-01-30
Small LLMs often struggle to match the agentic capabilities of large, costly models. While reinforcement learning can help, progress has been limited by two structural bottlenecks: existing open-sourc…
Łukasz Borchmann +14 · 2026-03-12
Multimodal agents offer a promising path to automating complex document-intensive workflows. Yet, a critical question remains: do these agents demonstrate genuine strategic reasoning, or merely stocha…
Sanchit Pandey · 2026-03-12
Retrieval-augmented generation (RAG) is widely deployed to improve factual accuracy in language models, yet it remains unclear whether smaller models with 7B parameters or fewer can effectively utilize…
Kunfeng Chen +4 · 2026-03-12
Tool-calling empowers Large Language Models (LLMs) to interact with external environments. However, current methods often struggle to handle massive and noisy candidate tools in long-context tool-call…
Weixun Wang +89 · 2025-12-31
Agentic crafting requires LLMs to operate in real-world environments over multiple turns by taking actions, observing outcomes, and iteratively refining artifacts. Despite its importance, the open-sou…
Yuxiang Zhou +4 · 2025-11-15
Mobile agents show immense potential, yet current state-of-the-art (SoTA) agents exhibit inadequate success rates on real-world, long-horizon, cross-application tasks. We attribute this bottleneck to …
Zhejun Zhao +10 · 2025-08-06
The advent of Large Language Models (LLMs) is transforming search engines into conversational AI search products, primarily using Retrieval-Augmented Generation (RAG) on web corpora. However, this par…
Ziting Wang +4 · 2024-11-01
Large Language Models (LLMs) have demonstrated impressive ability in generation and reasoning tasks but struggle with handling up-to-date knowledge, leading to inaccuracies or hallucinations. Retrieva…