Ask HN: How does my use of LLMs train the underlying models?

1 point by emehex 7 months ago
I understand that using tools like Chat, Cursor, and Claude Code for software development is likely providing training data to help these LLMs get better at coding (the irony isn't lost on me that I might be contributing to making myself obsolete...)

But I'm curious about the actual mechanics: How exactly does this feedback loop work? When I accept, reject, or modify the code that these models spit out, is that signal fed directly back into training?

Not necessarily against this, just genuinely curious about how the sausage is made.