Would you pay for a tool that executes and verifies LLM code?

1 point by ZOdex 7 days ago
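I'm building a wrapper that queries GPT-4, Claude, and Gemini, then executes their code in a sandbox to catch hallucinations.

Is the latency (30s) worth the certainty? Or do you prefer speed?

I'm running manual tests for people today if anyone wants to try it.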
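
For anyone wondering what the "execute it and see" step might look like, here is a rough sketch in Python. It is not the tool described above, only an illustration under stated assumptions: the model answers are hard-coded strings (`model_a`, `model_b` are made-up names, and no real LLM API is called), the helper `run_candidate` is hypothetical, and a plain subprocess with a timeout stands in for the sandbox. A subprocess is not real isolation, so untrusted code would need a container or VM instead.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_candidate(code: str, timeout_s: float = 30.0) -> dict:
    """Run one snippet in a separate Python process and report the outcome.

    Hypothetical helper: a subprocess is NOT a security sandbox.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "candidate.py"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, "-I", str(script)],  # -I: isolated mode
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return {"ok": False, "stdout": "", "error": "timeout"}
    return {
        "ok": proc.returncode == 0,
        "stdout": proc.stdout,
        "error": proc.stderr.strip(),
    }


# Stand-ins for model answers (assumed, not real model output).
candidates = {
    "model_a": "print(sum(range(5)))",
    # Calls a function that does not exist in the math module.
    "model_b": "import math\nprint(math.fast_sqrt(16))",
}

results = {name: run_candidate(code) for name, code in candidates.items()}

for name, result in results.items():
    status = "ran cleanly" if result["ok"] else "failed"
    print(f"{name}: {status}")
```

Running the code only catches answers that crash or hang, such as a call to a function that does not exist. An answer that runs but returns the wrong result would still need test cases to compare against.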