Ask HN: How does training AI with AI actually work?

2 points by timonpimba about 2 hours ago
How is DeepSeek actually doing this? Are they just feeding Claude's answers into their own models as training data to improve reasoning? How exactly does one train a model on the output of another? What engineering is involved here?

I'd love a breakdown of how this is executed at scale.

Backstory:

Anthropic recently accused DeepSeek, MiniMax, and Moonshot of using lots of fake accounts to generate exchanges with Claude and using the outputs to train their models, and called it a "distillation attack".