Ask HN: How do you detect compliance risk in Google search results?

1 point by paolocermelli 6 months ago
I'm experimenting with a rules-based approach to classify Google SERP snippets (neutral / adverse / authority-regulatory) for compliance and due diligence use cases.

One issue I keep running into is false positives from high-authority sources: a single regulator PDF or an old enforcement action can outweigh dozens of neutral results, even when the context has materially changed.

For those working in OSINT, risk, or search analysis: how do you usually validate false positives vs. true adverse signals at scale? Do you weight authority differently, or apply temporal or contextual decay?
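
To make the temporal-decay question concrete, here is a minimal sketch of one possible framing: each snippet gets a base weight from its label, and that weight is halved every `half_life_days`. Everything here is an assumption for illustration, not a description of my actual rules: the `Snippet` type, the `BASE_WEIGHT` values, the two-year half-life, and the idea of averaging over all results are hypothetical and would need tuning against labeled cases.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Snippet:
    label: str       # "neutral", "adverse", or "authority-regulatory"
    published: date  # best-known publication date of the source


# Hypothetical base weights; the real values would need calibration.
BASE_WEIGHT = {
    "neutral": 0.0,
    "adverse": 1.0,
    "authority-regulatory": 3.0,
}


def decayed_weight(snippet: Snippet, today: date,
                   half_life_days: float = 730.0) -> float:
    """Base weight for the label, halved every `half_life_days` of age."""
    age_days = max((today - snippet.published).days, 0)
    return BASE_WEIGHT[snippet.label] * 0.5 ** (age_days / half_life_days)


def risk_score(snippets: list[Snippet], today: date,
               half_life_days: float = 730.0) -> float:
    """Mean decayed weight across all results, so neutral hits dilute the score."""
    if not snippets:
        return 0.0
    total = sum(decayed_weight(s, today, half_life_days) for s in snippets)
    return total / len(snippets)


if __name__ == "__main__":
    today = date(2024, 1, 1)
    # One two-year-old regulator document among nine recent neutral results.
    results = [Snippet("authority-regulatory", date(2022, 1, 1))]
    results += [Snippet("neutral", date(2023, 12, 1)) for _ in range(9)]
    print(risk_score(results, today))
```

With this shape, an old enforcement action still contributes but no longer dominates by default. Contextual decay (for example, a later settlement or a lifted sanction) is not covered by a pure age-based half-life and would need a separate signal.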