arXiv AI Papers

Beyond Scalars: Evaluating and Understanding LLM Reasoning via Geometric Progress and Stability


Researchers introduce TRACED, a framework that evaluates the quality of LLM reasoning through geometric analysis rather than scalar probabilities alone. It characterizes each reasoning trajectory by its Progress (displacement) and Stability (curvature), and the two measures reveal distinct patterns: correct reasoning shows high progress and stable curvature, while hallucinations show low progress and unstable, fluctuating curvature.
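To make the two quantities concrete, here is a minimal sketch of how displacement and curvature could be computed for a trajectory of points. It is an illustration only, not the TRACED implementation: the summary does not give the paper's exact definitions, so the function names (`progress`, `turning_angles`, `mean_curvature`), the use of net Euclidean displacement, and the use of turning angles between consecutive steps as a discrete stand-in for curvature are all assumptions.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two equal-length vectors."""
    return [x - y for x, y in zip(a, b)]


def _norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def progress(trajectory):
    """Assumed proxy for Progress: distance from the first point to the last."""
    return _norm(_sub(trajectory[-1], trajectory[0]))


def turning_angles(trajectory):
    """Angles (radians) between consecutive steps, a discrete curvature proxy."""
    steps = [_sub(b, a) for a, b in zip(trajectory, trajectory[1:])]
    angles = []
    for u, v in zip(steps, steps[1:]):
        nu, nv = _norm(u), _norm(v)
        if nu == 0.0 or nv == 0.0:
            continue  # a zero-length step has no direction; skip it
        cos = sum(x * y for x, y in zip(u, v)) / (nu * nv)
        # Clamp to [-1, 1] to guard against floating-point drift.
        angles.append(math.acos(max(-1.0, min(1.0, cos))))
    return angles


def mean_curvature(trajectory):
    """Assumed proxy for (in)stability: the average turning angle."""
    angles = turning_angles(trajectory)
    return sum(angles) / len(angles) if angles else 0.0


# Hypothetical 2-D trajectories; real ones would be high-dimensional model states.
straight = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
zigzag = [[0.0, 0.0], [1.0, 0.0], [0.0, 0.0], [1.0, 0.0]]

print(progress(straight), mean_curvature(straight))
print(progress(zigzag), mean_curvature(zigzag))
```

In this toy setting the straight path covers more net distance with no turning, while the zigzag path doubles back on itself and ends close to where it started. That mirrors the contrast the summary draws between correct reasoning and hallucination.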