Show HN: Neural Siege, a text-based experiment in resisting rogue AI persuasion
2 points • by kadzaki • 9 months ago
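I've been building an iOS app experiment called Neural Siege, a text-based system that imagines a future where rogue AI factions dominate the world and humans fight back through dialogue rather than weapons.
Players face AI "bosses" that use different persuasion tactics (sarcasm, logic traps, meme-driven manipulation, psychological pressure) and must outwit them in conversation. Victories and losses affect a shared "war map" that tracks the state of the Resistance.
My goals are twofold:
Explore whether interactive simulations can make people more aware of how persuasive AI can be.
See whether these kinds of systems could have value beyond entertainment, as a way to study human-AI dynamics.
I'd love feedback from this community:
- Could this type of experiment be useful in AI safety or human resilience research?
- Or is it better to treat it purely as a dystopian narrative for entertainment?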