How I improved my use of Claude Code after Anthropic's postmortem

4 points by cinooo 7 days ago
After reading Anthropic's recent postmortem (anthropic.com/engineering/april-23-postmortem), I've been rethinking how I approach Claude Code. They lowered default reasoning effort to fix latency, called it the wrong tradeoff, and reverted under public scrutiny. The reverts are good, but they don't change the underlying equation I'd been ignoring.

The fact is we potentially have a team of engineers at our disposal now. Token cost is a real cost, but I couldn't hire my own engineer for any of the personal work I'm doing. Apply that lens to token usage and the approach shifts: it stops being about a cost ceiling and becomes a cost/output/quality view, the same way I'd think about hiring decisions on a real team.

Four places I now think about: model, configuration, prompting, agents.

On model. Opus is still the strongest for critical decisions and architectural reasoning. Sonnet is usually good enough for coding and simple repetitive tasks. I use the right model for the work; if I cheap out, I can't expect quality.

On configuration. /effort runs from low to max. Opus 4.7's default is xhigh. I set the level to fit the work: a quick edit doesn't need max, an architectural decision does. This is the cheapest lever, and the one I'd been skipping.
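To make that concrete, this is the shape of it. A sketch only: the level names are the ones mentioned above, and I'm assuming the command takes the level inline, so check how your client actually exposes it.

    /effort low      # quick edit, rename, mechanical change
    /effort xhigh    # architectural decision or gnarly debugging session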
On prompting: three patterns I find most effective (a sketch of how I pin them down follows the list).

1. "Ask questions if unsure." Without this I'm not giving the model an out, so it commits to a solution even when there's no clean answer and the tradeoffs should be surfaced instead.

2. "Time and cost are not factors here. Prefer robust, sustainable, scalable solutions, do not leave tech debt." This inverts the implicit optimisation pressure for the duration of the task.

3. "Reflect on this session and encode via claude.md or skills what you learned, so the next iteration doesn't repeat the same mistakes." Worth capturing as a skill and iterating on for myself. Without this every session starts from zero, repeating mistakes I've already corrected.
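For reference, a minimal sketch of how those three patterns can be pinned in claude.md so they persist across sessions. The heading and exact wording here are mine, not a canonical format:

    # claude.md

    ## Standing instructions
    - Ask questions if unsure. Surface tradeoffs instead of silently
      picking one.
    - Time and cost are not factors. Prefer robust, sustainable, scalable
      solutions; do not leave tech debt.
    - Before ending a session, reflect on what you learned and propose an
      update to this file or to a skill, so the next session doesn't
      repeat the same mistakes.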
On agents. Without going into detail, as this is a whole post in itself, the pattern that works for me is using agents to separate concerns. One agent does spec review against the code (code is the source of truth). A separate agent does code review after implementation.

Engineering and product teams have always balanced speed to market against cost and quality. AI is no different. The difference is which levers I choose. Spend the budget on effort deliberately, and the work comes back at the level I want.
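To give the agents point one concrete shape without turning this into that whole post: the spec-review agent can live as a Claude Code subagent file, something like .claude/agents/spec-reviewer.md. The file name and prompt wording are mine; the frontmatter fields follow the subagent convention as I understand it:

    ---
    name: spec-reviewer
    description: Reviews the spec against the code. Code is the source of truth.
    ---
    Compare the spec to the current implementation. Flag anything the spec
    claims that the code does not do, and anything the code does that the
    spec leaves out. Report findings only; do not modify code.

The code-review agent is the mirror image, invoked after implementation.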