https://zoom-wiki.win/index.php/The_Uncertainty_Dilemma:_Building_Trust_in_AI_Advisory_Workflows
Benchmarks in 2026 are all over the place, and reported hallucination rates vary wildly from one test to the next. Even with web search enabled, the HalluHard benchmark shows a 30.2% error rate.