Research Papers

In-depth research from 42 Robots AI on applied AI, model evaluation, and production ML systems.

June 2, 2026 · v1.0 · CC BY 4.0
Q1 2026 LLM Evaluation Study

Methodology and findings from 30 evaluations and 20,110 tests across 9 production models from OpenAI, Google, and Anthropic. The data falsifies three assumptions about LLM evaluation.
