Ask HN: What inference server do you use to host TTS models?

1 point by samagra14 7 months ago
All the examples I have found are highly unoptimized. For example, Modal Labs uses FastAPI: [https://modal.com/docs/examples/chatterbox_tts](https://modal.com/docs/examples/chatterbox_tts)

BentoML also wraps the model in a FastAPI-like service: [https://www.bentoml.com/blog/deploying-a-text-to-speech-application-with-bentoml](https://www.bentoml.com/blog/deploying-a-text-to-speech-application-with-bentoml)

Even Chatterbox TTS itself ships only a very naive example: [https://github.com/resemble-ai/chatterbox](https://github.com/resemble-ai/chatterbox)

The Triton Inference Server docs have no TTS example, but I am 100% certain that a highly optimized variant can be written with TritonServer, utilizing model concurrency and batching; a rough sketch of what I have in mind is below.

If anyone has implemented a TTS service with TritonServer, or has a better inference server alternative to deploy, please help me out here. I don't want to reinvent the wheel.
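For concreteness, here is a minimal sketch of the kind of thing I mean, using Triton's Python backend. The batching and concurrency knobs live in the companion `config.pbtxt` (`max_batch_size`, `dynamic_batching`, `instance_group`); the tensor names (`TEXT`, `AUDIO`) and the `load_tts_model()` / `synthesize_batch()` calls are hypothetical placeholders standing in for a real TTS model, not an actual Chatterbox integration.

```python
# Sketch of a Triton Python-backend model.py for a TTS model.
# Batching/concurrency are declared in the companion config.pbtxt, e.g.:
#   max_batch_size: 8
#   dynamic_batching { max_queue_delay_microseconds: 100000 }
#   instance_group [ { count: 2, kind: KIND_GPU } ]
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # Load the TTS model once per model instance. With
        # instance_group count > 1, Triton runs multiple copies of
        # this model concurrently on the GPU.
        self.model = load_tts_model()  # hypothetical loader

    def execute(self, requests):
        # Triton's dynamic batcher hands us a batch of queued requests;
        # synthesizing them together amortizes the GPU work.
        # Assumes one utterance per request, i.e. TEXT has shape [1, 1].
        texts = [
            pb_utils.get_input_tensor_by_name(r, "TEXT")
            .as_numpy()
            .reshape(-1)[0]
            .decode("utf-8")
            for r in requests
        ]
        waveforms = self.model.synthesize_batch(texts)  # hypothetical API

        responses = []
        for wav in waveforms:
            # Re-add the per-request batch dimension on the output.
            audio = np.asarray(wav, dtype=np.float32)[None, :]
            out = pb_utils.Tensor("AUDIO", audio)
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses
```

The point of the sketch is that the serving framework, not my handler code, decides how requests are grouped: the dynamic batcher trades a little queueing latency (`max_queue_delay_microseconds`) for larger batches, which is exactly what the FastAPI examples above leave on the table.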