Launch HN: Recall.ai (YC W20) – API for meeting recordings and transcripts
13 points • by davidgu • 9 months ago
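Hey HN, we're David and Amanda from Recall.ai (https://www.recall.ai). Today we're launching our Desktop Recording SDK, a way to get meeting data without a bot in the meeting: https://www.recall.ai/product/desktop-recording-sdk. It's our biggest release in quite a while, so we thought we'd finally do our Launch HN :)
Here's a demo that shows it producing a transcript from a meeting, followed by examples in code: https://www.youtube.com/watch?v=4croAGGiKTA. API docs are at https://docs.recall.ai/.
Back in W20, our first product was an API that lets you send a bot participant into a meeting. This gives developers access to audio/video streams and other data in the meeting. Today, this API powers most of the meeting recording products on the market.
Recently, recording meetings through a desktop app instead of a bot has become popular. Many products, such as Notion and ChatGPT, have added desktop recording functionality, and LLMs have made it easier to work with unstructured transcripts. But it's actually hard to record meetings reliably at scale with a desktop app, and most developers who want to add recording functionality don't want to build all this infrastructure.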
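Doing a basic recording with just the microphone and system audio is fairly straightforward, since you can use the system APIs. But it gets a lot harder when you want to capture speaker names, produce a video recording, get real-time data, or run this in production at large scale:
* Capturing speaker names involves using accessibility APIs to screen-scrape the video conference window and monitor who is speaking at what time. When video conferencing platforms change their UI, we must ship a change immediately so this keeps working.
* Producing a clean video recording that doesn't capture the video conferencing platform's UI involves detecting the participant tiles, cropping them out, and compositing them together into a single recording.
* Because the desktop recording code runs on end-user machines, we need to make it as efficient as possible. This means writing highly platform-optimized code, taking advantage of hardware encoders when available, and spending a lot of time on profiling and performance testing.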
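To make the speaker-name bullet concrete: once something is polling the conference window for the currently highlighted speaker, the remaining work is turning those samples into a "who spoke when" timeline. The following is a minimal sketch under stated assumptions, not Recall.ai's implementation. The sample format `(timestamp_seconds, speaker_name_or_None)`, the `Segment` type, the `build_timeline` function, and the `max_gap` parameter are all hypothetical names invented for illustration. The platform-specific accessibility scraping that would produce the samples is out of scope here.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Segment:
    """One continuous stretch of speech attributed to a single speaker."""
    speaker: str
    start: float  # seconds since the recording started
    end: float    # timestamp of the last sample seen for this stretch


def build_timeline(
    samples: Iterable[Tuple[float, Optional[str]]],
    max_gap: float = 1.0,
) -> List[Segment]:
    """Merge (timestamp, speaker) samples into speaker segments.

    Hypothetical helper: `samples` must be ordered by timestamp, and a
    speaker of None means nobody was detected as speaking. A segment is
    extended while the same speaker is observed again within `max_gap`
    seconds; otherwise a new segment starts.
    """
    segments: List[Segment] = []
    for ts, speaker in samples:
        if speaker is None:
            continue  # silence: leave the current segment as it is
        last = segments[-1] if segments else None
        if last is not None and last.speaker == speaker and ts - last.end <= max_gap:
            last.end = ts
        else:
            segments.append(Segment(speaker, ts, ts))
    return segments
```

A timeline in this shape can then be joined against transcript word timestamps to label each utterance. The `max_gap` threshold is a design choice: a larger value tolerates dropped samples but can merge two separate turns by the same speaker.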
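Meeting recording has zero margin for failure: if anything breaks, you lose the data forever. That makes reliability especially important, which dramatically increases the engineering effort required.
Our Desktop Recording SDK takes care of all this and lets developers build meeting recording features into their desktop apps, so they can record both video conferences and in-person meetings without a bot.
We built Recall.ai because we experienced this problem ourselves. At our first startup, we built a tool for product managers that included a meeting recording feature. 70% of our engineering time was taken up by just this feature! We ended up starting Recall.ai to solve this instead. Since then, over 2000 companies have come to use us to power their recording features, e.g. Hubspot for sales call recording and Clickup for their AI note taker. Our users are engineering teams building commercial products for financial services, telehealth, incident management, sales, interviewing, and more. We also power internal tooling for large enterprises.
Running this sort of infrastructure has led to unexpected technical challenges! For example, we had to debug a 1 in 36 million segfault in our audio encoder (https://www.recall.ai/blog/debugging-a-1-in-36-000-000-segfault-in-our-audio-encoder), we encountered a Postgres lock-up that only occurs when you have tens of thousands of concurrent writers (https://news.ycombinator.com/item?id=44490510), and we saved over $1M a year on AWS by optimizing the way we shuffle data around between our processes (https://news.ycombinator.com/item?id=42067275).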
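You can try it here: https://www.recall.ai. It's self-serve with $5 of free credits. Pricing starts at $0.70 per hour of recording, prorated to the second. We offer volume discounts.
All data recorded through Recall.ai is the property of our customers, we support 0-day retention, and we don't train models on customer data.
We would love your feedback!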
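As a worked example of the per-second proration, here is a small sketch that uses only the two figures quoted above ($0.70 per hour, $5 of free credits). The function names are hypothetical. Volume discounts are ignored, and how the actual invoice rounds fractional cents is not stated in the post, so the sketch leaves amounts unrounded.

```python
from decimal import Decimal

HOURLY_RATE_USD = Decimal("0.70")  # starting price quoted in the post
FREE_CREDITS_USD = Decimal("5")    # free credits quoted in the post
SECONDS_PER_HOUR = Decimal(3600)


def recording_cost(seconds: int, hourly_rate: Decimal = HOURLY_RATE_USD) -> Decimal:
    """Unrounded cost in USD of `seconds` of recording, billed per second."""
    return hourly_rate * Decimal(seconds) / SECONDS_PER_HOUR


def seconds_covered(credits: Decimal = FREE_CREDITS_USD,
                    hourly_rate: Decimal = HOURLY_RATE_USD) -> int:
    """Whole seconds of recording that `credits` pays for at `hourly_rate`."""
    return int(credits * SECONDS_PER_HOUR / hourly_rate)


if __name__ == "__main__":
    print(recording_cost(30 * 60))  # a 30-minute meeting costs $0.35
    print(seconds_covered())        # seconds of recording covered by the free credits
```

At the starting rate, the free credits cover a little over seven hours of recording.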