StillMe – an open-source "transparent RAG" framework
1 point • by Anh_Nguyen_vn • 22 days ago
I'm building *StillMe*, an open-source "transparent RAG" framework that tries very hard to *not* pretend it knows everything.
Instead of just prompting an LLM and hoping for the best, StillMe:
- Runs every answer through a *multi-layer validator chain* (6 core validators + conditional validators, up to 13 total depending on context)
- Auto-fixes missing citations and hallucinated "experience"
- Logs *system-level steps* (RAG retrieval, validators, timing breakdown)
- Treats "I don't know" as a *first-class, honest state* with explicit epistemic tracking
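To make that last point concrete, here is a minimal sketch of what tracking an explicit epistemic state can look like; names such as `EpistemicStatus` and `AnswerRecord` are simplified for illustration, not the exact internals:
```python
# Minimal sketch: "I don't know" as an explicit answer state, not an error.
# Names are illustrative, not the exact StillMe internals.
from dataclasses import dataclass, field
from enum import Enum


class EpistemicStatus(Enum):
    GROUNDED = "grounded"          # supported by retrieved context
    FOUNDATIONAL = "foundational"  # answered from foundational knowledge only
    UNKNOWN = "unknown"            # an honest "I don't know"


@dataclass
class AnswerRecord:
    text: str
    status: EpistemicStatus
    citations: list[str] = field(default_factory=list)

    def is_honest_unknown(self) -> bool:
        # An unknown is a valid terminal state and gets logged like any other.
        return self.status is EpistemicStatus.UNKNOWN
```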
---
## What StillMe actually does
For each user query, StillMe:
1. Detects intent (philosophical vs. technical vs. factual)
2. Routes and filters RAG context accordingly
3. Builds a safe prompt (token-aware, language-aware)
4. Calls the underlying LLM (local or cloud)
5. Runs a *ValidatorChain* (a simplified sketch follows this list):
- `CitationRequired` → adds `[foundational knowledge]` or real web/RAG citations
- `EvidenceOverlap` → checks the answer against the retrieved context (only when context is available)
- `Ego-Neutrality` → removes anthropomorphic language ("I feel", "my experience", etc.)
- `SourceConsensus` → detects contradictions between multiple sources (only when 2+ sources are available)
- `EthicsAdapter` → avoids unsafe suggestions while staying honest
6. Logs structured timing:
- RAG retrieval latency
- LLM inference latency
- Validation & post-processing
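Here is a simplified sketch of how such a chain can compose, with conditional validators joining only when their preconditions hold; the function names and toy rules below are illustrative, not the actual implementation:
```python
# Illustrative sketch of a conditional validator chain; rules are simplified.
from dataclasses import dataclass
from typing import Callable


@dataclass
class ValidationResult:
    answer: str          # possibly patched answer text
    warnings: list[str]  # human-readable messages for the logs


Validator = Callable[[str, list[str]], ValidationResult]


def citation_required(answer: str, context: list[str]) -> ValidationResult:
    # Toy rule: if the answer carries no citation marker, patch one in.
    if "[" not in answer:
        return ValidationResult(
            answer + " [foundational knowledge]",
            ["Missing citation detected - auto-patched with [foundational knowledge]"],
        )
    return ValidationResult(answer, [])


def source_consensus(answer: str, context: list[str]) -> ValidationResult:
    # Stub: a real check would compare claims across the retrieved sources.
    return ValidationResult(answer, [])


def build_chain(context: list[str]) -> list[Validator]:
    chain: list[Validator] = [citation_required]
    if len(context) >= 2:  # conditional validator: needs 2+ sources
        chain.append(source_consensus)
    return chain


def run_chain(answer: str, context: list[str]) -> ValidationResult:
    warnings: list[str] = []
    for validate in build_chain(context):
        result = validate(answer, context)
        answer = result.answer
        warnings.extend(result.warnings)
    return ValidationResult(answer, warnings)
```
Each validator can patch the draft and emit warnings, which is where the `[WARNING]` lines in the log excerpt below come from.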
---
## A real log excerpt (single philosophical question)
```log
StillMe philosophical query trace (real backend log excerpt)
[INFO] Philosophical question detected - filtering out technical RAG docs
[INFO] Retrieved 3 foundational knowledge documents (RAG cache HIT)
[WARNING] Estimated tokens exceed safe limit - switching to minimal philosophical prompt
[WARNING] Missing citation detected - auto-patched with [foundational knowledge]
[WARNING] Ego-Neutrality Validator removed anthropomorphic term: ['trải nghiệm']
--- LATENCY --- RAG: 3.30s | LLM: 5.41s | Total: 12.04s
```
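The latency line comes from timing each stage separately. The pattern is roughly this simplified sketch, with sleeps standing in for real work:
```python
# Simplified sketch of per-stage timing; sleeps stand in for real work.
import time
from contextlib import contextmanager


@contextmanager
def timed(stage: str, timings: dict[str, float]):
    # Record the wall-clock duration of one pipeline stage under its name.
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start


timings: dict[str, float] = {}

with timed("rag", timings):
    time.sleep(0.01)    # placeholder for RAG retrieval
with timed("llm", timings):
    time.sleep(0.02)    # placeholder for LLM inference
with timed("validation", timings):
    time.sleep(0.005)   # placeholder for the validator chain

print(f"--- LATENCY --- RAG: {timings['rag']:.2f}s | "
      f"LLM: {timings['llm']:.2f}s | Total: {sum(timings.values()):.2f}s")
```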
---
## Why I'm building this
Most LLM systems today:
- Hide their reasoning
- Hand-wave citations
- Overstate confidence
- Treat "I don't know" as a failure
StillMe goes the other way:
- *Transparency-first*: every major decision is logged
- *Epistemic honesty*: it's allowed (and encouraged) to say "I don't know"
- *Model-agnostic*: works with local and cloud LLMs (DeepSeek, OpenAI, Ollama)
- *No fine-tuning required*: all behavior is enforced at the framework layer
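"Model-agnostic" here just means every backend sits behind one narrow interface and the framework-layer checks run on whatever comes back. Roughly, as an illustrative sketch with a toy backend so it runs without API keys:
```python
# Illustrative sketch of a provider-agnostic backend interface.
from typing import Protocol


class LLMBackend(Protocol):
    def complete(self, prompt: str) -> str: ...


class EchoBackend:
    """Toy backend so the sketch runs without any API keys."""

    def complete(self, prompt: str) -> str:
        return f"(echo) {prompt}"


def answer(query: str, backend: LLMBackend) -> str:
    draft = backend.complete(query)
    # Framework-layer enforcement (validator chain, logging) would run here,
    # regardless of which backend produced the draft.
    return draft


print(answer("What is StillMe?", EchoBackend()))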
---
## Looking for feedback & contributors
I'm a solo builder from Vietnam. StillMe is already:
- Running as a backend + dashboard
- Integrated with a real learning pipeline (RSS, arXiv, Wikipedia - updates every 4 hours)
- Using a live RAG system with foundational docs
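Conceptually, the learning pipeline is just a scheduled ingestion loop; heavily simplified (the real pipeline chunks, embeds, and indexes what it fetches):
```python
# Heavily simplified sketch of the periodic ingestion loop.
import time

SOURCES = ["rss", "arxiv", "wikipedia"]
UPDATE_INTERVAL_S = 4 * 60 * 60  # every 4 hours


def ingest(source: str) -> int:
    # Placeholder: fetch new items, chunk, embed, and upsert into the
    # RAG index; return how many documents were added.
    return 0


def run_forever() -> None:
    while True:
        for source in SOURCES:
            added = ingest(source)
            print(f"[INFO] Ingested {added} new documents from {source}")
        time.sleep(UPDATE_INTERVAL_S)
```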
Repo: https://github.com/anhmtk/StillMe-Learning-AI-System-RAG-Foundation
I would love feedback on:
- The validator architecture
- Better ways to structure logs & observability
- Making the project more contributor-friendly
- Any ideas to stress-test the "honesty / transparency" claims
Thanks for reading; happy to answer questions and share more logs, diagrams, or internals if anyone's curious.