Show HN: I stopped hoping my LLM would cooperate
2 points • by seanlf • 16 days ago
42 validation errors in one run. Claude apologising instead of writing HTML. OAuth tokens expiring mid-digest.

Then I fixed the constraints. Eight days, zero failures, zero intervention.

The secret wasn't better prompts... it was treating the LLM as a constrained function: schema-validated tool calls that reject malformed output and force retries, two-pass architecture separating editorial judgment from formatting, and boring DevOps (retry logic, rate limiting, structured logging).

The Claude invocation is ~30 lines in a 2000-line system. Most of the work is everything around it.

https://seanfloyd.dev/blog/llm-reliability
https://github.com/SeanLF/claude-rss-news-digest
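
For readers who want a concrete picture of "reject malformed output and force retries", here is a minimal sketch in Python. It is not taken from the linked repository. The field names in `REQUIRED_FIELDS`, the `call_with_validation` helper, and the stub model are all hypothetical, and a plain callable stands in for the real Claude tool call so the example runs without network access.

```python
import json
from typing import Callable

# Hypothetical schema for one digest item; the field names are illustrative only.
REQUIRED_FIELDS = {"title": str, "url": str, "summary": str}


def validate_item(raw: str) -> list[str]:
    """Return a list of validation errors; an empty list means the output is valid."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be an object"]
    errors = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in data:
            errors.append(f"missing field: {field}")
        elif not isinstance(data[field], expected):
            errors.append(f"field {field} must be {expected.__name__}")
    return errors


def call_with_validation(
    model: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> dict:
    """Call the model, reject malformed output, and retry with the errors fed back."""
    feedback = ""
    errors: list[str] = []
    for _ in range(max_attempts):
        raw = model(prompt + feedback)
        errors = validate_item(raw)
        if not errors:
            return json.loads(raw)
        # Feed the validation errors back so the next attempt can correct them.
        feedback = "\n\nYour previous output was rejected: " + "; ".join(errors)
    raise ValueError(f"no valid output after {max_attempts} attempts: {errors}")


def make_stub_model() -> Callable[[str], str]:
    """Stand-in for a real LLM call: apologises first, then returns valid JSON."""
    replies = iter([
        "I'm sorry, I can't produce that.",
        json.dumps(
            {"title": "Example", "url": "https://example.com", "summary": "A stub."}
        ),
    ])
    return lambda prompt: next(replies)


item = call_with_validation(make_stub_model(), "Summarise the feed item as JSON.")
print(item["title"])
```

The point of the sketch is that the caller never sees the apology: the first reply fails validation, the errors are appended to the prompt, and only a reply that passes the schema check is returned. A real system would presumably also add the backoff, rate limiting, and structured logging mentioned in the post around this loop.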