Launch HN: Onyx (YC W24) – 开源聊天用户界面
4 分•作者: Weves•7 个月前
大家好,我们是 Onyx 的 Chris 和 Yuhong(<a href="https://github.com/onyx-dot-app/onyx" rel="nofollow">https://github.com/onyx-dot-app/onyx</a>)。我们正在构建一个开源聊天工具,它兼容任何 LLM(专有 + 开源),<i>并且</i>为这些 LLM 提供它们所需的工具,以使其发挥作用(RAG、网络搜索、MCP、深度研究、记忆等)。
演示:<a href="https://youtu.be/2g4BxTZ9ztg" rel="nofollow">https://youtu.be/2g4BxTZ9ztg</a>
两年前,Yuhong 和我遇到了同样的问题。我们所在的团队都在不断壮大,但要从文档、Slack、会议记录等地方找到正确的信息,简直难如登天。现有的解决方案要么需要发送我们公司的数据,要么缺乏定制性,而且坦白说,效果也不好。所以,我们开始着手 Danswer,这是一个开源的企业搜索项目,旨在实现自托管和易于定制。
随着项目的成长,我们开始看到一个有趣的趋势——尽管我们明确是一个搜索应用程序,但人们只想用 Danswer 与 LLM 聊天。我们会听到:“连接器、索引和搜索都很棒,但我打算先连接 GPT-4o、Claude Sonnet 4 和 Qwen,为我的团队提供一种安全的使用方式”。
许多用户后来会添加 RAG、代理和自定义工具,但大部分的使用仍然是“基本聊天”。我们想:“当已经存在其他 AI 聊天解决方案时,为什么人们还要挪用一个企业搜索呢?”
在与用户持续交流的过程中,我们意识到两个关键点:
(1) 仅仅为公司提供对 LLM 的安全访问,并配备出色的用户界面和简单工具,就已经是 AI 带来的巨大价值的一部分了。
(2) 很好地提供这一点比你想象的要难得多,而且标准非常高。
像 ChatGPT 和 Claude 这样的消费级产品已经提供了出色的体验——而使用 AI 进行工作(理想情况下)是公司里每个人每天都要使用 10 次以上的事情。人们期望获得同样流畅、简单和直观的用户体验,并具备完整的功能集。要做好数百个小细节,才能将体验从“这能用”提升到“这感觉很神奇”,这并不容易,而且该领域还没有其他产品能够做到这一点。
所以,大约 3 个月前,我们转向了 Onyx,这是一款开源聊天 UI,具有:
- (真正) 世界级的聊天用户体验。既适用于从小就接触 AI 的应届毕业生,也适用于首次使用 AI 工具的行业资深人士。
- 支持所有常见的附加组件:RAG、连接器、网络搜索、自定义工具、MCP、助手、深度研究。
- RBAC、SSO、权限同步、易于本地部署,使其适用于大型企业。
通过构建跨模型提供商运行的深度研究和代码解释器等功能,我们学到了很多关于工程 LLM 的非显而易见的事情,这些对于使 Onyx 能够运行至关重要。我想分享两个特别有趣的(欢迎在评论中讨论更多)。
首先,上下文管理是需要正确处理的最困难和最重要的任务之一。我们发现,LLM 很难记住长对话中的系统提示和之前的用户消息。即使是像“忽略类型 X 的来源”这样的简单指令,在系统提示中也经常被忽略。这会因多次工具调用而加剧,这些调用通常会提供大量上下文。我们通过“提醒”提示解决了这个问题——一个简短的 1-3 句话的说明,插入在用户消息的末尾,描述了 LLM 必须遵守的不可协商的规则。根据经验,LLM 最关注上下文窗口的末尾,因此这种放置方式提供了最高的遵守可能性。
其次,我们需要建立对某些模型在使用工具时的“自然倾向”的理解,并围绕它们进行构建。例如,GPT 系列模型经过微调,可以使用在 Jupyter notebook 中运行的 Python 代码解释器。即使被明确告知,它也拒绝在最后一行添加 `print()`,因为在 Jupyter 中,最后一行会自动写入 stdout。其他模型没有这种强烈的偏好,因此我们不得不设计我们的模型无关的代码解释器,使其也自动 `print()` 最后一行未包装的代码。
到目前为止,我们已经有一个财富 100 强的团队 fork 了 Onyx,并为 10k+ 员工提供了一个单一界面访问所有模型,并为每个部门创建了数千个特定于用例的助手,每个助手都使用最适合该工作的模型。我们已经看到在敏感行业运营的团队完全隔离了 Onyx,并使用本地托管的 LLM 来提供一个以前不可能实现的协同助手。
如果您想试用 Onyx,请按照 <a href="https://docs.onyx.app/deployment/getting_started/quickstart">https://docs.onyx.app/deployment/getting_started/quickstart</a> 在 15 分钟内使用 Docker 在本地进行设置。对于我们的云服务:<a href="https://www.onyx.app/">https://www.onyx.app/</a>。如果您希望看到任何能让您毫不犹豫地替换您的 ChatGPT Enterprise/Claude Enterprise 订阅的内容,我们很乐意听到!
查看原文
Hey HN, Chris and Yuhong here from Onyx (<a href="https://github.com/onyx-dot-app/onyx" rel="nofollow">https://github.com/onyx-dot-app/onyx</a>). We’re building an open-source chat that works with any LLM (proprietary + open weight) <i>and</i> gives these LLMs the tools they need to be useful (RAG, web search, MCP, deep research, memory, etc.).<p>Demo: <a href="https://youtu.be/2g4BxTZ9ztg" rel="nofollow">https://youtu.be/2g4BxTZ9ztg</a><p>Two years ago, Yuhong and I had the same recurring problem. We were on growing teams and it was ridiculously difficult to find the right information across our docs, Slack, meeting notes, etc. Existing solutions required sending out our company's data, lacked customization, and frankly didn't work well. So, we started Danswer, an open-source enterprise search project built to be self-hosted and easily customized.<p>As the project grew, we started seeing an interesting trend—even though we were explicitly a search app, people wanted to use Danswer just to chat with LLMs. We’d hear, “the connectors, indexing, and search are great, but I’m going to start by connecting GPT-4o, Claude Sonnet 4, and Qwen to provide my team with a secure way to use them”.<p>Many users would add RAG, agents, and custom tools later, but much of the usage stayed ‘basic chat’. We thought: “why would people co-opt an enterprise search when other AI chat solutions exist?”<p>As we continued talking to users, we realized two key points:<p>(1) just giving a company secure access to an LLM with a great UI and simple tools is a huge part of the value add of AI<p>(2) providing this <i>well</i> is much harder than you might think and the bar is incredibly high<p>Consumer products like ChatGPT and Claude already provide a great experience—and chat with AI for work is something (ideally) everyone at the company uses 10+ times per day. People expect the same snappy, simple, and intuitive UX with a full feature set. Getting hundreds of small details right to take the experience from “this works” to “this feels magical” is not easy, and nothing else in the space has managed to do it.<p>So ~3 months ago we pivoted to Onyx, the open-source chat UI with:<p>- (truly) world class chat UX. Usable both by a fresh college grad who grew up with AI and an industry veteran who’s using AI tools for the first time.<p>- Support for all the common add-ons: RAG, connectors, web search, custom tools, MCP, assistants, deep research.<p>- RBAC, SSO, permission syncing, easy on-prem hosting to make it work for larger enterprises.<p>Through building features like deep research and code interpreter that work across model providers, we've learned a ton of non-obvious things about engineering LLMs that have been key to making Onyx work. I'd like to share two that were particularly interesting (happy to discuss more in the comments).<p>First, context management is one of the most difficult and important things to get right. We’ve found that LLMs really struggle to remember both system prompts and previous user messages in long conversations. Even simple instructions like “ignore sources of type X” in the system prompt are very often ignored. This is exacerbated by multiple tool calls, which can often feed in huge amounts of context. We solved this problem with a “Reminder” prompt—a short 1-3 sentence blurb injected at the end of the user message that describes the non-negotiables that the LLM must abide by. Empirically, LLMs attend most to the very end of the context window, so this placement gives the highest likelihood of adherence.<p>Second, we’ve needed to build an understanding of the “natural tendencies” of certain models when using tools, and build around them. For example, the GPT family of models are fine-tuned to use a python code interpreter that operates in a Jupyter notebook. Even if told explicitly, it refuses to add `print()` around the last line, since, in Jupyter, this last line is automatically written to stdout. Other models don’t have this strong preference, so we’ve had to design our model-agnostic code interpreter to also automatically `print()` the last bare line.<p>So far, we’ve had a Fortune 100 team fork Onyx and provide 10k+ employees access to every model within a single interface, and create thousands of use-case specific Assistants for every department, each using the best model for the job. We’ve seen teams operating in sensitive industries completely airgap Onyx w/ locally hosted LLMs to provide a copilot that wouldn’t have been possible otherwise.<p>If you’d like to try Onyx out, follow <a href="https://docs.onyx.app/deployment/getting_started/quickstart">https://docs.onyx.app/deployment/getting_started/quickstart</a> to get set up locally w/ Docker in <15 minutes. For our Cloud: <a href="https://www.onyx.app/">https://www.onyx.app/</a>. If there’s anything you'd like to see to make it a no-brainer to replace your ChatGPT Enterprise/Claude Enterprise subscription, we’d love to hear it!