Hacker News (Chinese edition)


I'm building a wrapper that queries GPT-4, Claude, and Gemini, then executes their code in a sandbox to catch hallucinations.

Is the latency (30s) worth the certainty? Or do you prefer speed?

I'm running manual tests for people today if anyone wants to try it.
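For anyone trying to picture the "execute the code to catch hallucinations" step, here is a rough sketch of one way it could work. It is not the actual tool. The names `run_candidate` and `verify` and the `model_a`/`model_b` snippets are made up for illustration, and no real model API is called. It also assumes a plain child process with a timeout is good enough for a demo. That is not a real sandbox, and untrusted code would need proper isolation (containers, VMs, or similar).

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_candidate(code: str, timeout_s: float = 30.0) -> dict:
    """Run one model-generated snippet in a separate Python process.

    NOTE: a child process is NOT a security sandbox; this is only a sketch.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "candidate.py"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, "-I", str(script)],  # -I: isolated mode
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return {"ok": False, "stdout": "", "error": "timeout"}

    stderr = proc.stderr.strip()
    return {
        "ok": proc.returncode == 0,
        "stdout": proc.stdout.strip(),
        # Keep only the last stderr line, which names the exception.
        "error": stderr.splitlines()[-1] if stderr else "",
    }


def verify(candidates: dict[str, str], timeout_s: float = 30.0) -> dict[str, dict]:
    """Execute every candidate and collect a pass/fail report per model."""
    return {name: run_candidate(code, timeout_s) for name, code in candidates.items()}


if __name__ == "__main__":
    # Hypothetical responses standing in for real model output.
    candidates = {
        "model_a": "import math\nprint(math.sqrt(16))",
        # math.square_root does not exist: a typical hallucinated call.
        "model_b": "import math\nprint(math.square_root(16))",
    }
    for name, result in verify(candidates).items():
        print(name, result)
```

The hallucinated call in `model_b` only shows up when the code actually runs, which is the argument for paying the latency. Running the candidates in parallel would keep the wait close to the slowest single run instead of the sum of all of them.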

Would you pay for a tool that executes and verifies LLM code?