DeepSeek R1 Distill Llama 8B is a Llama-3.1-8B model fine-tuned on reasoning data generated by DeepSeek R1. This distillation transfers much of R1's reasoning behavior to a compact 8B-parameter model that keeps Llama's architecture and inference cost.