DeepSeek R1 Distill Qwen 14B is a 14B-parameter model built on Qwen 2.5 and distilled from DeepSeek R1. It is reported to outperform OpenAI's o1-mini, scoring 74% on MMLU and excelling at math tasks (AIME: 69.7%, MATH-500: 93.9%).