Hacker News (Chinese edition)



I’ve been dogfooding a small writing helper I built called Rephrazo, and I thought it might be useful to share some implementation details and mistakes so far.

The idea is simple:

* highlight text where you’re writing
* press a hotkey
* get an AI paraphrase in a small popup
* insert it back with one click

The goal is to remove the “copy - open AI tool - paste - rewrite - paste back” loop for small edits.

This post is about how I wired it up, what worked technically, and what didn’t.

### Constraints I designed for

From the beginning I tried to design under a few constraints:

* One hotkey → one main action
* Stay inside the current app (no browser, no big side panel)
* Minimal UI: single suggestion, one click to insert
* Latency “feels instant” or it doesn’t get used

Whenever I broke these constraints (added extra choices, prompts, etc.), usage dropped in dogfooding.

### High-level architecture

Rough breakdown:

* Desktop client that:
    * listens for a global hotkey
    * grabs the current text selection
    * sends it to an API
    * displays the returned paraphrase in a small overlay near the selection
* Backend API that:
    * accepts the selected text + some minimal context
    * calls an LLM
    * applies a fixed prompt (“make this clearer, keep tone/voice as much as possible”)
    * returns a single suggestion (no multi-choice for now)

No fancy infra yet, just trying to keep the path from “key press” to “returned text” as short as possible.

### Text capture and insertion

The surprisingly tricky part wasn’t the LLM, it was:

* reliably capturing the selected text
* not messing up the user’s clipboard
* inserting the rewritten text back without breaking formatting

The first version literally abused the clipboard:

* save clipboard
* copy selection
* send to backend
* replace selection by pasting the result
* restore clipboard

This worked… until it didn’t:

* some apps ignore simulated keypresses
* sometimes the clipboard got overwritten by other things in between
* it felt fragile and “hacky”

I’m slowly moving toward more app-aware integrations (where possible) while still keeping a generic fallback.

### Latency and UX

Latency matters more than I expected. Rough buckets:

* < 500 ms → feels instant, people are happy
* 1–2 seconds → acceptable if the suggestion is clearly better
* > 3 seconds → people regret pressing the hotkey and use it less

A few tiny UX things helped:

* show a small “loading” state immediately near the selection
* render the popup instantly (skeleton state), then fill it when the response arrives
* on failure, show a short, honest message instead of silently doing nothing

If you’re building AI tools, this won’t surprise you, but it’s different when you watch your own users hesitate after a few slow responses.

### Things that went wrong

* I overbuilt customization early:
    * tone dropdowns
    * multiple modes (“shorter”, “longer”, “more formal”)
    * extra toggles

  People ignored them, or got decision fatigue.
* I underestimated how many edge cases there are with selection/insertion across different apps.
* I didn’t log enough in the first builds, so I had to retrofit telemetry to understand actual usage.

If you’re curious, the current early version is here: [https://rephrazo-ai.app/](https://rephrazo-ai.app/)
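
For anyone who wants to see the clipboard-based fallback from the text capture section in code, here is a minimal sketch. It is not Rephrazo's actual implementation. `Desktop`, `FakeDesktop`, and `rewrite_selection` are hypothetical names, and the sentinel check for apps that ignore the simulated copy is an assumption added for illustration. The in-memory `FakeDesktop` stands in for real platform clipboard and keypress APIs so the flow can run anywhere.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Optional, Protocol


class Desktop(Protocol):
    """Hypothetical OS adapter; a real client would wrap platform APIs."""

    def read_clipboard(self) -> str: ...
    def write_clipboard(self, text: str) -> None: ...
    def send_copy(self) -> None: ...   # simulated Ctrl/Cmd+C
    def send_paste(self) -> None: ...  # simulated Ctrl/Cmd+V


def rewrite_selection(
    desktop: Desktop, paraphrase: Callable[[str], str]
) -> Optional[str]:
    """Save clipboard, copy selection, paraphrase, paste, restore clipboard.

    Returns the pasted suggestion, or None if the copy did not take effect.
    """
    saved = desktop.read_clipboard()
    # Assumption: a unique marker lets us notice that the copy never happened.
    sentinel = f"sentinel-{uuid.uuid4().hex}"
    desktop.write_clipboard(sentinel)
    try:
        desktop.send_copy()
        selection = desktop.read_clipboard()
        if selection == sentinel:
            # The app ignored the simulated keypress, or nothing was selected.
            return None
        suggestion = paraphrase(selection)
        desktop.write_clipboard(suggestion)
        desktop.send_paste()
        return suggestion
    finally:
        # Always put the user's original clipboard content back.
        desktop.write_clipboard(saved)


@dataclass
class FakeDesktop:
    """In-memory stand-in for a real app, used only to exercise the flow."""

    clipboard: str = ""
    selection: str = ""
    honors_keys: bool = True  # False models apps that ignore simulated keys

    def read_clipboard(self) -> str:
        return self.clipboard

    def write_clipboard(self, text: str) -> None:
        self.clipboard = text

    def send_copy(self) -> None:
        if self.honors_keys and self.selection:
            self.clipboard = self.selection

    def send_paste(self) -> None:
        if self.honors_keys:
            self.selection = self.clipboard


if __name__ == "__main__":
    app = FakeDesktop(clipboard="user data", selection="some draft text")
    # str.upper stands in for the backend call that returns a paraphrase.
    print(rewrite_selection(app, str.upper), "|", app.clipboard)
```

This sketch only covers the "app ignored the keypress" failure. It does nothing about another program overwriting the clipboard between steps, and it restores the saved content unconditionally, so the fragility described above still applies.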

I built a one-key inline AI rewriting tool (and the problems it ran into)