Improvements to how I use Claude Code after Anthropic's postmortem
4 points • by cinooo • 7 days ago
After watching Anthropic's recent postmortem (anthropic.com/engineering/april-23-postmortem), I've been thinking about the way I approach Claude Code differently. They lowered default reasoning effort to fix latency, called it the wrong tradeoff, and reverted under public scrutiny. The reverts are good, but they don't change the underlying equation I'd been ignoring.
The fact is we potentially have a team of engineers at our disposal now. Token cost is a real cost, but I couldn't hire my own engineer for any of the personal work I'm doing. Apply that lens to token usage and the approach shifts. It stops being about the cost ceiling and becomes a cost/output/quality view, the same way I'd think about hiring decisions on a real team.
Four places I now think about: model, configuration, prompting, agents.
On model. Opus is still the strongest for critical decisions and architectural reasoning. Sonnet is usually good enough for coding and simple repetitive tasks. I use the right model for the work. If I cheap out, I can't expect quality.
On configuration. /effort runs from low to max. Opus 4.7's default is xhigh. I set the level to fit the work. A quick edit doesn't need max effort; an architectural decision does. This is the cheapest lever, and the one I'd been skipping.
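The model and effort choices above amount to a small routing table: match the tier of model and reasoning effort to the stakes of the task. A minimal sketch, where the task categories and the model/effort labels are my own illustrative assumptions, not an official Claude Code API:

```python
# Sketch of the "right model, right effort for the work" heuristic.
# Task kinds, model names, and effort levels are illustrative assumptions.

ROUTING = {
    # task kind            (model,    effort)
    "architecture":        ("opus",   "xhigh"),
    "critical-decision":   ("opus",   "high"),
    "feature-coding":      ("sonnet", "medium"),
    "quick-edit":          ("sonnet", "low"),
}

def pick(task_kind: str) -> tuple[str, str]:
    """Return (model, effort) for a task, defaulting to the cheap tier."""
    return ROUTING.get(task_kind, ("sonnet", "low"))
```

The point of writing it down is the default: anything not explicitly flagged as high-stakes falls through to the cheap tier, which is the hiring-decision framing applied mechanically.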
On prompting: three patterns I find most effective.
1. "Ask questions if unsure." Without this I'm not giving the model an out, which closes off solutions even when there's no clean answer and tradeoffs need to be surfaced.
2. "Time and cost are not factors here. Prefer robust, sustainable, scalable solutions, do not leave tech debt." This inverts the implicit optimisation pressure for the duration of the task.
3. "Reflect on this session and encode via claude.md or skills what you learned, so the next iteration doesn't repeat the same mistakes." Worth capturing as a skill and iterating on for myself. Without this, every session starts from zero, repeating mistakes I've already corrected.
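The three patterns above can be combined into one reusable prompt template: the first two as a preamble, the reflection instruction as a closing step. A sketch, with `build_prompt` as a hypothetical helper and the wording quoted from the patterns:

```python
# Reusable prompt template combining the three patterns from the post.
# build_prompt is a hypothetical helper, not part of any Claude Code API.

PREAMBLE = (
    "Ask questions if unsure.\n"
    "Time and cost are not factors here. Prefer robust, sustainable, "
    "scalable solutions, do not leave tech debt.\n"
)

CLOSING = (
    "Reflect on this session and encode via claude.md or skills what you "
    "learned, so the next iteration doesn't repeat the same mistakes."
)

def build_prompt(task: str) -> str:
    """Wrap a task description in the standing preamble and closing step."""
    return f"{PREAMBLE}\n{task}\n\n{CLOSING}"
```

Keeping the wrapper in one place means the standing instructions survive across sessions even before the model has encoded anything into claude.md itself.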
On agents. Without going into detail, as this is a whole post in itself, the pattern that works for me is using agents to separate concerns. One agent does spec review against the code (code is the source of truth). A separate agent does code review after implementation.
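The two-agent separation of concerns can be sketched as a small pipeline. `run_agent` here is a hypothetical stand-in for however you actually invoke Claude Code (a headless CLI call, an SDK, etc.); the roles and prompts mirror the split described above:

```python
# Sketch of the two-agent review pipeline: one agent checks the spec
# against the code (code as source of truth), a second reviews the
# implementation afterwards. run_agent is a hypothetical placeholder.

def run_agent(role: str, prompt: str) -> str:
    # Placeholder: wire this to your real Claude Code invocation.
    return f"[{role}] reviewed: {prompt[:40]}"

def review_pipeline(spec: str, diff: str) -> dict[str, str]:
    """Run spec review and code review as separate agents."""
    spec_review = run_agent(
        "spec-reviewer",
        "Check this spec against the current code; the code is the "
        f"source of truth.\n\nSPEC:\n{spec}",
    )
    code_review = run_agent(
        "code-reviewer",
        f"Review this implementation diff.\n\nDIFF:\n{diff}",
    )
    return {"spec": spec_review, "code": code_review}
```

The design choice is that neither agent sees the other's output: each review stays scoped to its own concern, which is the whole point of the separation.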
Engineering and product teams have always balanced speed to market with cost and quality. AI is no different. The difference is which levers I choose. Spend the budget on effort deliberately, and the work comes back at the level I want.