Hacker News (Chinese edition)



42 validation errors in one run. Claude apologising instead of writing HTML. OAuth tokens expiring mid-digest.

Then I fixed the constraints. Eight days, zero failures, zero intervention.

The secret wasn't better prompts... it was treating the LLM as a constrained function: schema-validated tool calls that reject malformed output and force retries, two-pass architecture separating editorial judgment from formatting, and boring DevOps (retry logic, rate limiting, structured logging).

The Claude invocation is ~30 lines in a 2000-line system. Most of the work is everything around it.

<a href="https://seanfloyd.dev/blog/llm-reliability" rel="nofollow">https://seanfloyd.dev/blog/llm-reliability</a> <a href="https://github.com/SeanLF/claude-rss-news-digest" rel="nofollow">https://github.com/SeanLF/claude-rss-news-digest</a>
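For readers who want a feel for the "constrained function" idea, here is a minimal sketch of a validate-and-retry loop. It is not taken from the linked repository. The schema, the function names (`validate`, `call_constrained`, `make_flaky_model`) and the stub model are all hypothetical, and the model is a plain callable standing in for a real tool call.

```python
import json
from typing import Any, Callable

# Hypothetical schema for one digest item: field name -> required Python type.
DIGEST_ITEM_SCHEMA: dict[str, type] = {"title": str, "url": str, "summary": str}


def validate(payload: Any, schema: dict[str, type]) -> list[str]:
    """Return human-readable validation errors (an empty list means valid)."""
    if not isinstance(payload, dict):
        return [f"expected an object, got {type(payload).__name__}"]
    errors = []
    for field, expected in schema.items():
        if field not in payload:
            errors.append(f"missing field '{field}'")
        elif not isinstance(payload[field], expected):
            errors.append(f"field '{field}' must be {expected.__name__}")
    return errors


def call_constrained(
    model: Callable[[str], str],
    prompt: str,
    schema: dict[str, type],
    max_attempts: int = 3,
) -> dict:
    """Call the model until its output parses and validates, or give up."""
    feedback = ""
    errors: list[str] = []
    for _ in range(max_attempts):
        raw = model(prompt + feedback)
        try:
            payload = json.loads(raw)
            errors = validate(payload, schema)
        except json.JSONDecodeError as exc:
            # Prose such as an apology lands here: it is not JSON at all.
            errors = [f"output is not valid JSON: {exc.msg}"]
        if not errors:
            return payload
        # Reject the output and tell the model exactly why before retrying.
        feedback = "\n\nYour previous output was rejected:\n" + "\n".join(
            f"- {e}" for e in errors
        )
    raise ValueError(f"no valid output after {max_attempts} attempts: {errors}")


def make_flaky_model() -> tuple[Callable[[str], str], list[str]]:
    """Stub model (an assumption for this demo): fails twice, then complies."""
    replies = iter([
        "I'm sorry, I can't produce that.",
        '{"title": "A", "url": 42}',
        '{"title": "A", "url": "https://example.com", "summary": "ok"}',
    ])
    calls: list[str] = []

    def model(prompt: str) -> str:
        calls.append(prompt)
        return next(replies)

    return model, calls


if __name__ == "__main__":
    model, calls = make_flaky_model()
    item = call_constrained(model, "Summarise the story.", DIGEST_ITEM_SCHEMA)
    print(item, f"(attempts: {len(calls)})")
```

The point of the sketch is that the caller never sees an unvalidated result. An apology or a wrong type is just another rejected attempt, and the only outcomes are a valid object or an explicit error. A real system would presumably use the provider's tool-calling schema in place of the hand-rolled type check used here.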

Show HN: I stopped hoping my LLM would cooperate