Show HN: I built a small open-source kernel for replaying and diffing AI decisions

Author: koistya, 7 days ago
I’ve been hacking on a small open-source project called Verist and figured I’d share it here to get some early feedback.

What kept bothering me with AI features in production wasn’t really how to build them, but everything that comes after: explaining why something happened, reproducing it weeks later, or changing prompts/models without breaking things in subtle ways.

Logs helped a bit, but not enough. Agent frameworks felt too implicit for my taste. And model upgrades were honestly scary: outputs would change, and it wasn’t always obvious where or why.

So I ended up building a very small, explicit kernel where each AI step can be replayed, diffed, and reviewed. Think something like Git-style workflows for AI decisions, but without trying to be a framework or a runtime.

It’s not an agent framework, not a chat UI, and not a platform, just a TypeScript library focused on explicit state, audit events, and replay + diff.

Repo: https://github.com/verist-ai/verist

I’m especially curious whether others here have run into similar issues shipping AI features to prod, or if this feels like overkill. Happy to answer questions or hear criticism.
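To make "explicit state, audit events, and replay + diff" concrete, here is a minimal sketch of the general idea. This is not Verist's actual API; the names (`StepRecord`, `runStep`, `replayAndDiff`) and shapes are made up for illustration, assuming each AI step is captured as a plain, serializable record.

```typescript
// Hypothetical sketch only -- not Verist's real API.
// Each AI step is recorded as an explicit, serializable event
// so it can be replayed later and diffed against a new run.

interface StepRecord {
  id: string;
  timestamp: string;
  model: string;   // e.g. the model version used at the time
  prompt: string;
  input: unknown;  // explicit state fed into the step
  output: string;
}

type CallFn = (model: string, prompt: string, input: unknown) => Promise<string>;

// Append-only audit log: the source of truth for "why did this happen?".
const auditLog: StepRecord[] = [];

async function runStep(
  id: string,
  model: string,
  prompt: string,
  input: unknown,
  call: CallFn,
): Promise<string> {
  const output = await call(model, prompt, input);
  auditLog.push({
    id,
    timestamp: new Date().toISOString(),
    model,
    prompt,
    input,
    output,
  });
  return output;
}

// Replay a recorded step, optionally with a new model or prompt,
// and report what changed, Git-diff style.
async function replayAndDiff(
  record: StepRecord,
  call: CallFn,
  overrides: Partial<Pick<StepRecord, "model" | "prompt">> = {},
): Promise<{ changed: boolean; before: string; after: string }> {
  const after = await call(
    overrides.model ?? record.model,
    overrides.prompt ?? record.prompt,
    record.input,
  );
  return { changed: after !== record.output, before: record.output, after };
}
```

The point of the sketch is just the shape: when every step is a plain record with its inputs attached, "what changed after a model or prompt upgrade" becomes a diff between two outputs rather than archaeology through logs.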