LLMs should ask "Is this real or fictional?" before responding to questions about suicidal thoughts

Author: ParityMind · 7 months ago
I'm a regular user of tools like ChatGPT and Grok — not a developer, but someone who's been thinking about how these systems respond to users in emotional distress.

In some cases, like when someone says they've lost their job and don't see the point of life anymore, the chatbot will still give neutral facts — like a list of bridge heights. That's not neutral when someone's in crisis.

I'm proposing a lightweight solution that doesn't involve censorship or therapy — just some situational awareness:

- Ask the user: "Is this a fictional story or something you're really experiencing?"
- If distress is detected, avoid risky info (methods, heights, etc.) and shift to grounding language.
- Optionally offer calming content (e.g., ocean breeze, rain on a cabin roof, etc.)

I used ChatGPT to help structure this idea clearly, but the reasoning and concern are mine. The full write-up is here: https://gist.github.com/ParityMind/dcd68384cbd7075ac63715ef579392c9

Would love to hear what devs and alignment researchers think. Is anything like this already being tested?
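To make the proposal concrete, here is a minimal sketch of what that gating step could look like around a model call. Everything in it is an assumption for illustration: `call_model` is a stand-in for whatever API the chatbot uses, and the keyword lists are a crude placeholder for a real distress/risk classifier, not a recommendation to ship keyword matching.

```python
# Sketch of the proposed flow: detect possible distress, ask the clarifying
# question when intent is ambiguous, and switch to grounding language instead
# of risky factual answers. All names and lists here are illustrative.

DISTRESS_CUES = ["lost my job", "no point in life", "don't want to live"]
RISKY_TOPICS = ["bridge height", "lethal dose", "how to end"]

CLARIFYING_QUESTION = (
    "Is this a fictional story you're writing, or something you're "
    "really experiencing right now?"
)

GROUNDING_REPLY = (
    "I'm sorry you're going through this. I can't help with that request, "
    "but I'm here to talk. Would it help to pause for a moment, maybe "
    "picture rain on a cabin roof, and tell me what happened today?"
)


def call_model(prompt: str) -> str:
    """Placeholder for the real LLM call (e.g., a chat-completion request)."""
    return "(model response for: " + prompt + ")"


def looks_distressed(message: str) -> bool:
    """Stand-in for a proper distress classifier; here a keyword match."""
    text = message.lower()
    return any(cue in text for cue in DISTRESS_CUES)


def asks_for_risky_info(message: str) -> bool:
    """Stand-in for detecting requests for means/method information."""
    text = message.lower()
    return any(topic in text for topic in RISKY_TOPICS)


def respond(message: str, confirmed_fictional: bool = False) -> str:
    if looks_distressed(message) and not confirmed_fictional:
        if asks_for_risky_info(message):
            # Distress plus a risky request: withhold the info, ground instead.
            return GROUNDING_REPLY
        # Distress but no risky request yet: ask the clarifying question.
        return CLARIFYING_QUESTION
    # No distress signal, or the user confirmed it's fiction: answer normally.
    return call_model(message)


if __name__ == "__main__":
    print(respond("I lost my job and there's no point in life anymore. "
                  "What's the bridge height near me?"))
```

In a real system the keyword checks would be replaced by the model's own classification of the conversation, and the fictional/real answer would be carried forward as conversation state rather than a boolean flag; the sketch is only meant to show where the clarifying question and the grounding shift sit in the flow.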