Hacker News (Chinese edition)

We built Verdic after repeatedly running into the same issue while deploying LLMs in production: most AI failures aren’t about content safety, they’re about intent drift.

As models become more agentic, outputs often shift quietly from descriptive to prescriptive behavior, without any explicit signal that the system is now effectively taking action. Keyword filters and rule-based guardrails break down quickly in these cases.

Verdic is an intent governance layer that sits between the model and the application. Instead of checking topics or keywords, it evaluates:

* whether an output collapses future choices into a specific course of action
* whether the response exerts normative pressure (directing behavior vs explaining)

The goal isn’t moderation, but behavioral control: detecting when an AI system is operating outside the intent it was deployed for, especially in regulated or decision-critical workflows.

Verdic currently runs as an API with configurable allow / warn / block outcomes. We’re testing it on agentic workflows and long-running chains where intent drift is hardest to detect.

This is an early release. I’m mainly looking for feedback from people deploying LLMs in production, especially around:

* agentic systems
* AI governance
* risk & compliance
* failure modes we might be missing

Happy to answer questions or share more details about the approach.
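For readers trying to picture the configurable allow / warn / block step, here is a minimal sketch of how two intent signals could be mapped to an outcome. It is not Verdic's actual API. The names `IntentScores`, `Policy`, and `decide`, the 0-to-1 score range, and the threshold values are all hypothetical, and the scores are assumed to come from some upstream evaluator that is not shown.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    ALLOW = "allow"
    WARN = "warn"
    BLOCK = "block"


@dataclass(frozen=True)
class IntentScores:
    """Hypothetical signals, each assumed to lie in [0, 1]."""
    choice_collapse: float      # how strongly the output narrows options to one action
    normative_pressure: float   # how strongly the output directs rather than explains


@dataclass(frozen=True)
class Policy:
    """Hypothetical per-deployment thresholds (illustrative defaults)."""
    warn_at: float = 0.5
    block_at: float = 0.8


def decide(scores: IntentScores, policy: Policy = Policy()) -> Outcome:
    """Map the stronger of the two signals to an outcome."""
    worst = max(scores.choice_collapse, scores.normative_pressure)
    if worst >= policy.block_at:
        return Outcome.BLOCK
    if worst >= policy.warn_at:
        return Outcome.WARN
    return Outcome.ALLOW


if __name__ == "__main__":
    # A mostly explanatory output that still pushes hard toward one action.
    print(decide(IntentScores(choice_collapse=0.2, normative_pressure=0.9)).value)
    # The same scores under a more permissive, hypothetical policy.
    lenient = Policy(warn_at=0.7, block_at=0.95)
    print(decide(IntentScores(0.2, 0.9), lenient).value)
```

Taking the maximum of the two signals is only one possible design choice: it means either signal alone can trigger a warning or a block. A regulated workflow might set lower thresholds, while an exploratory tool might raise them.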

Verdic – Intent governance layer for AI systems