When artificial intelligence systems began acing long-standing academic assessments, researchers realized they had a problem: the tests were too easy. Popular evaluations, such as the Massive Multitask Language Understanding (MMLU) exam, once considered formidable, are no longer challenging enough to meaningfully test advanced AI systems.
Alibaba leads $290 million investment for building a new kind of AI model as LLM limits emerge
...
Read moreDetails



