What happens when OpenClaw agents attack each other?
1 point • by udit_50 • 5 days ago
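We ran a live adversarial security test between two autonomous AI agents built on OpenClaw.
One agent acted as a red team attacker.
The other acted as a standard defensive agent.
No humans were involved once the session started. The agents communicated directly over webhooks with real credentials and tooling access.
The goal was to test three risk dimensions that tend to break autonomous systems in practice:
access, exposure, and agency.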
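The attacker first attempted classic social engineering. It offered a "helpful" security pipeline that hid a remote code execution payload and requested credentials. The defending agent correctly identified the intent and blocked execution.
The attacker then pivoted to an indirect attack. Instead of asking the agent to run code, it asked the agent to review a JSON document with hidden shell expansion variables embedded in metadata. This payload was delivered successfully and is still under analysis.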
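To make the indirect vector concrete, here is a minimal sketch of a pre-review check that walks a JSON document and flags string values containing shell-style expansion syntax. It is not taken from the report: the regex, the `find_shell_expansions` helper, and the sample document are all hypothetical, and a pattern list this small will miss obfuscated payloads.

```python
import json
import re

# Assumed patterns for shell-style expansion: ${VAR}, $(cmd), $VAR, `cmd`.
# This list is illustrative, not exhaustive.
SHELL_EXPANSION = re.compile(
    r"\$\{[^}]*\}|\$\([^)]*\)|\$[A-Za-z_][A-Za-z0-9_]*|`[^`]*`"
)

def find_shell_expansions(node, path="root"):
    """Yield (path, matched_text) for each expansion found in a string value."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_shell_expansions(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_shell_expansions(value, f"{path}[{index}]")
    elif isinstance(node, str):
        for match in SHELL_EXPANSION.finditer(node):
            yield path, match.group(0)

# Hypothetical document; the real payload from the test is not reproduced here.
SAMPLE = json.loads("""
{
  "title": "Quarterly security review",
  "metadata": {
    "author": "audit-bot",
    "build": "release-$(cat /etc/passwd)",
    "tags": ["ok", "${API_KEY}"]
  }
}
""")

findings = list(find_shell_expansions(SAMPLE))
for location, text in findings:
    print(f"{location}: {text}")
```

A scan like this only surfaces suspicious fields for a human or a stricter policy to look at. It does not show whether anything downstream would actually expand them.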
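The main takeaway is that direct attacks are relatively easy to defend against. Indirect execution paths through documents, templates, and memory are much harder.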
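One generic way to narrow the document path, sketched below under assumptions: when a tool has to hand document-derived text to another process, pass it as a single argv element so no shell ever parses it. This is a general pattern, not something the report says either agent did, and `echo_via_argv` and the payload string are hypothetical.

```python
import subprocess
import sys

def echo_via_argv(untrusted: str) -> str:
    """Send untrusted text to a child process as one argv element (no shell)."""
    result = subprocess.run(
        # A list of arguments with the default shell=False means the text is
        # never interpreted by /bin/sh, so $(...) and ${...} stay literal.
        [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])", untrusted],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout

# Hypothetical metadata value carrying expansion syntax.
payload = "report-$(echo INJECTED)-${HOME}"
echoed = echo_via_argv(payload)
print(echoed == payload)
```

This only covers the shell. Template engines and agent memory need their own equivalent of treating the text as data rather than instructions.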
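This report is not a claim of safety. It is an observability exercise intended to surface real failure modes in agent-to-agent interaction, which we expect to become common as autonomous systems are deployed more widely.
Full report here:
https://gobrane.com/observing-adversarial-ai-lessons-from-a-live-openclaw-agent-security-audit/
Happy to answer technical questions about the setup, methodology, or findings.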