Show HN: A deterministic security solution for AI agents – OpenClaw and 2 others
3 points • by steadeepanda • about 10 hours ago
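I wanted to share a solution I originally built for my own use with OpenClaw. It helps you control what your AI agents can reach when you let them do things, without reducing what they can do. I hope it's useful to you.
Basically, it lets you experiment freely with your agent within safe boundaries.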
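It is deterministic on purpose (it doesn't include any AI layer), which means it follows clear, predefined rules to maximize safety, security, and predictability.
The rules are heavily tested on detecting prompt injection attempts and other security cases (explained in detail in the docs).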
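To illustrate what "deterministic" means here, below is a minimal sketch of a rule check with no model in the loop: the same input always produces the same decision. The rule format, the paths, and the names `Rule`, `RULES`, and `evaluate` are hypothetical and are not taken from agent-ruler; its real rules are described in its docs.

```python
import posixpath
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ASK = "ask"    # pause and ask the user for approval
    DENY = "deny"


@dataclass(frozen=True)
class Rule:
    prefix: str          # hypothetical: directory the rule covers
    decision: Decision


# Hypothetical policy: the first matching rule wins.
RULES = (
    Rule("/home/user/.ssh", Decision.DENY),
    Rule("/home/user/workspace", Decision.ALLOW),
)


def evaluate(path: str) -> Decision:
    """Return the decision for a file path the agent wants to touch."""
    # Collapse "." and ".." lexically so "workspace/../.ssh" cannot
    # slip past a prefix check. Symlinks are NOT resolved here.
    clean = posixpath.normpath(path)
    for rule in RULES:
        # Match the directory itself or anything below it, but not
        # siblings that merely share the prefix (e.g. "workspace2").
        if clean == rule.prefix or clean.startswith(rule.prefix + "/"):
            return rule.decision
    # Anything not covered by a rule falls back to asking the user.
    return Decision.ASK
```

Because the check is pure string matching over a fixed rule list, it can be covered exhaustively with ordinary unit tests, which is harder to do for a classifier-based filter.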
Everything is local and lives on your computer, including the docs site.
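It gives you a control panel to monitor and control the boundaries. When a boundary is about to be crossed, you receive an approval request that shows what your OpenClaw was trying to do.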
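The approval flow described above can be pictured as a small queue of pending requests that a human resolves. This is only a sketch under assumptions: `ApprovalQueue`, `ApprovalRequest`, and their fields are hypothetical names for illustration, not agent-ruler's actual data model or API.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import count


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    DENIED = "denied"


@dataclass
class ApprovalRequest:
    request_id: int
    action: str              # hypothetical: e.g. "write_file"
    target: str              # hypothetical: what the agent tried to reach
    status: Status = Status.PENDING


class ApprovalQueue:
    """Holds blocked actions until a human approves or denies them."""

    def __init__(self) -> None:
        self._ids = count(1)
        self._requests: dict[int, ApprovalRequest] = {}

    def submit(self, action: str, target: str) -> ApprovalRequest:
        # Called when the agent is about to cross a boundary.
        req = ApprovalRequest(next(self._ids), action, target)
        self._requests[req.request_id] = req
        return req

    def pending(self) -> list[ApprovalRequest]:
        # What a control panel would list for the user.
        return [r for r in self._requests.values()
                if r.status is Status.PENDING]

    def resolve(self, request_id: int, approve: bool) -> ApprovalRequest:
        req = self._requests[request_id]
        if req.status is not Status.PENDING:
            raise ValueError("request already resolved")
        req.status = Status.APPROVED if approve else Status.DENIED
        return req
```

In this sketch the agent's action stays blocked while its request is pending, and a resolved request cannot be flipped afterwards, which keeps the audit trail unambiguous.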
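It also (currently) supports Tailscale, so you can connect your Tailscale IP address and receive everything on your phone, where you can also chat normally and approve or deny requests. It lets you access the control panel from anywhere via your Tailscale IP address (a private one is recommended). Currently only the Telegram channel is supported.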
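If you want to sanity-check that the address you expose the control panel on really is a tailnet address and not a public one, a small check like the following can help. Tailscale assigns IPv4 addresses from the 100.64.0.0/10 range and IPv6 addresses from fd7a:115c:a1e0::/48; the function name `is_tailnet_address` is hypothetical, and whether agent-ruler performs any such validation itself is not stated in the post.

```python
import ipaddress

# Address ranges Tailscale assigns to devices on a tailnet.
TAILSCALE_V4 = ipaddress.ip_network("100.64.0.0/10")
TAILSCALE_V6 = ipaddress.ip_network("fd7a:115c:a1e0::/48")


def is_tailnet_address(value: str) -> bool:
    """Return True if `value` parses as an IP inside a Tailscale range."""
    try:
        addr = ipaddress.ip_address(value)
    except ValueError:
        # Not an IP address at all (e.g. a hostname or a typo).
        return False
    network = TAILSCALE_V4 if addr.version == 4 else TAILSCALE_V6
    return addr in network
```

The ranges are checked explicitly rather than through `is_private`, because Python's `ipaddress` module does not report 100.64.0.0/10 (shared address space) as private.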
For now it only supports Linux, and the Opencode, Claude Code, and OpenClaw runners.
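What you need to get started is explained in the README, which also includes quick demo/showcase images so you can see how it looks.
I'd be happy to hear your feedback, especially from testing it against prompt injections to see how it handles them. Don't hesitate to open a ticket on GitHub for any issue you find; I'll do my best to fix it.
Link here: [https://github.com/steadeepanda/agent-ruler/](https://github.com/steadeepanda/agent-ruler/)
Thank you for reading. I'd be happy to discuss it.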