28 January 2025 Medium: Substack
Unpacking Deepseek by Gijs Verheijke
Benchmarking AI is a hard problem. As benchmarks become the goal (Goodhart's law), model makers train their models to do well on them, and hence the benchmarks cease to be good measures.