Hacker News (Chinese Edition)



How is DeepSeek actually doing this? Are they just feeding Claude's answers into their own models as training data to improve reasoning? How exactly does one train a model on the output of another? What engineering is involved here?

I'd love a breakdown of how this is executed at scale.

Backstory:

Anthropic recently accused DeepSeek, MiniMax, and Moonshot of using lots of fake accounts to generate exchanges with Claude, using the outputs to train their models, and called it a "distillation attack".

Ask HN: How does training AI with AI actually work?