


I’ve been thinking about why dead code detection (and static analysis in general) feels so much less reliable in Python than in other languages. I understand that Python is dynamic by nature.

In theory it should be simple (again, in theory): parse the AST, build a call graph, find symbols with zero references. In practice it breaks down quickly because of things like:

1. dynamic dispatch (getattr, registries, plugin systems)
2. framework entrypoints (Flask/FastAPI routes, Django views, pytest fixtures)
3. decorators and implicit naming conventions
4. code invoked only via tests or runtime configuration

Most tools seem to pick one of two bad tradeoffs:

1. be conservative and miss lots of genuinely dead code, or
2. be aggressive and flag false positives until people stop trusting the results

What’s worked best for me so far is treating each finding as a confidence score rather than a yes/no verdict, and layering in limited runtime info (e.g. what actually executed during tests) instead of relying on 100% static analysis.

Curious how others handle this in real codebases. Do y’all just accept false positives? Or do you ignore dead code detection entirely? Has anyone seen approaches that actually scale? I am aware that SonarQube is very noisy.

I built a library with a VS Code extension, mainly to explore these tradeoffs (link below if relevant), but I’m more interested in how others think about the problem. Also, I hope I’m in the right channel.

Repo for context: https://github.com/duriantaco/skylos
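
To make the idea concrete, here is a rough sketch of the "zero references, then score it" approach using only the standard library `ast` module. It is not how skylos or any other tool works internally. The function name `score_dead_code`, the sample module, and the penalty weights (50 for a decorator, 30 for dynamic lookups) are all made up for illustration.

```python
import ast

# Hypothetical sample module. It is only parsed, never executed.
SAMPLE = '''
def route(path):
    def wrap(fn):
        return fn
    return wrap

def used():
    return 1

def unused():
    return 2

def handler_save():
    return "saved"

@route("/home")
def home():
    return "hi"

def dispatch(name):
    return globals()["handler_" + name]()

print(used(), dispatch("save"))
'''

# Calls that suggest names may be resolved at runtime (an assumed, incomplete list).
DYNAMIC_CALLS = {"getattr", "globals", "eval"}


def score_dead_code(source, executed=frozenset()):
    """Return {function_name: confidence 0-100 that the function is dead}.

    `executed` is a set of function names seen running, e.g. taken from a
    coverage report of the test suite. The weights are arbitrary.
    """
    tree = ast.parse(source)
    nodes = list(ast.walk(tree))

    # Step 1: collect every function definition, including nested ones.
    defs = {
        n.name: n
        for n in nodes
        if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))
    }

    # Step 2: collect every name that is read anywhere in the module.
    refs = set()
    for n in nodes:
        if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load):
            refs.add(n.id)
        elif isinstance(n, ast.Attribute):
            refs.add(n.attr)

    # Step 3: note whether the module does any dynamic lookup at all.
    has_dynamic = any(
        isinstance(n, ast.Call)
        and isinstance(n.func, ast.Name)
        and n.func.id in DYNAMIC_CALLS
        for n in nodes
    )

    # Step 4: score unreferenced functions instead of flagging them outright.
    report = {}
    for name, node in defs.items():
        if name in refs or name in executed:
            continue  # statically referenced, or observed at runtime
        confidence = 100
        if node.decorator_list:
            confidence -= 50  # a decorator may register it (routes, fixtures)
        if has_dynamic:
            confidence -= 30  # it may be reached through getattr/globals
        report[name] = max(confidence, 0)
    return report


if __name__ == "__main__":
    print(score_dead_code(SAMPLE))
    print(score_dead_code(SAMPLE, executed={"handler_save"}))
```

With static information alone, `unused` and `handler_save` both score 70 and the decorated `home` scores 20, so the genuinely dead function is indistinguishable from the one reached through `globals()`. Passing `executed={"handler_save"}` drops the dynamically dispatched handler from the report, which is the part that runtime data buys you.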

Ask HN: Why is dead code detection in Python harder than most tools admit?