


I’m experimenting with a rules-based approach to classify Google SERP snippets (neutral / adverse / authority-regulatory) for compliance and due diligence use cases.

One issue I keep running into is false positives from high-authority sources: a single regulator PDF or an old enforcement action can outweigh dozens of neutral results, even when the context has materially changed.

For those working in OSINT, risk, or search analysis: how do you usually validate false positives vs. true adverse signals at scale? Do you weight authority differently, or apply temporal or contextual decay?
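To make the temporal decay idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration, not something from an existing pipeline: the `AUTHORITY_WEIGHT` values, the three-year `HALF_LIFE_YEARS`, and the `Snippet`, `decayed_weight`, and `adverse_share` names are all hypothetical. It only shows how an exponential half-life could shrink the pull of an old high-authority hit relative to fresh neutral results.

```python
from dataclasses import dataclass

# Hypothetical weights and half-life, chosen only for illustration.
AUTHORITY_WEIGHT = {"neutral": 1.0, "adverse": 3.0, "authority-regulatory": 10.0}
HALF_LIFE_YEARS = 3.0


@dataclass
class Snippet:
    label: str        # one of the three classes named in the post
    age_years: float  # time since the underlying source was published


def decayed_weight(snippet: Snippet) -> float:
    """Authority weight, halved for every HALF_LIFE_YEARS of age."""
    decay = 0.5 ** (snippet.age_years / HALF_LIFE_YEARS)
    return AUTHORITY_WEIGHT[snippet.label] * decay


def adverse_share(snippets: list[Snippet]) -> float:
    """Fraction of total decayed weight carried by non-neutral snippets."""
    total = sum(decayed_weight(s) for s in snippets)
    risky = sum(decayed_weight(s) for s in snippets if s.label != "neutral")
    return risky / total if total else 0.0


# One nine-year-old regulator document against twenty fresh neutral results.
results = [Snippet("authority-regulatory", 9.0)] + [Snippet("neutral", 0.0)] * 20
share = adverse_share(results)
```

With these made-up numbers, the nine-year-old regulatory hit carries 1.25 of 21.25 total weight (about 6%), versus 10 of 30 (about 33%) with no decay. Contextual decay, such as a later dismissal or settlement, would need a separate signal; age alone cannot capture it.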

Ask HN: How do you detect compliance risk in Google search results?