DeepSeek R1 Training Explained
Once in a while, a release comes along that sets the direction for future research. The DeepSeek reasoning model R1 is one such release. It has not only set benchmarks for upcoming research but has also introduced the use of pure reinforcement learning for training reasoning models. Moreover, by making the models and training procedures public, DeepSeek has enabled both industry and the research community to drive further innovation and make LLMs smarter.