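Ask HN: Using AI/LLM APIs makes me want to give up. What am I doing wrong?
1 point • by moomoo11 • 6 months ago
I'm trying to automate a few manual processes we have right now, but I still can't get over this hump. What am I doing wrong?
I am using these AI APIs for actual processing-type work, and I am left defeated and somewhat angry, if I'm being honest. These AI companies sell us some galaxy-brain vision of automation, but actually using their services is a disappointing experience.
1. The results are never consistent. "Please ensure you extract ALL items" -> [Item1, Item2, Item3, "literally a comment // ...remaining items"] WHAT THE F$#K!! Sometimes it gives me a full list of all items, and sometimes it does that BS. I provided a tool, and half of the time it just grabs the first 3 items and maybe the very last one too (ignoring everything in the middle).
2. Because the results are not reliable, I have to do more post-processing. About 60% of the time, even after post-processing, I have to reject the results because they don't meet my confidence threshold.
3. The APIs are poorly supported by the vendors.
- iOS has some insane behavior where file extensions are sometimes .jpg, sometimes .JPG, etc. OpenAI's API, for example, will return Bad Request because the extension was not ".jpg", so now I have to add more code to rename files when users upload them.
- The docs will say the API supports a list of file formats, but then it rejects the request because the file was not .PDF, even though the purpose was "assistants" (which the docs say can handle images). No problem, I'll just convert...
- Dealing with files coming from other sources (G Drive, etc.) where the extension is missing but the MIME type is present... Again, Bad Request.
4. We went from "AGI any day now" in 2024 to "_A_rtificial _S_uper _I_ntelligence any day now" today. Can we just relax? Did I fall for a marketing trap?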
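For anyone hitting the same walls, here is a minimal sketch of the kind of guard code points 1 and 3 call for. Everything in it is an assumption for illustration: the names `normalize_filename`, `looks_truncated`, `ALLOWED_EXTENSIONS`, and `MIME_TO_EXTENSION` are hypothetical, and the list of accepted extensions is a placeholder, not taken from any vendor's docs.

```python
import os
import re

# Hypothetical allow-list. Which extensions a given API actually accepts
# is an assumption here and should be checked against the vendor's docs.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".pdf"}

# Hypothetical fallback for files that arrive with a MIME type but no extension.
MIME_TO_EXTENSION = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "application/pdf": ".pdf",
}

# Text that suggests the model elided items instead of listing them.
# This is a heuristic: a real item containing "..." would also match.
_PLACEHOLDER = re.compile(r"\.\.\.|…|remaining items", re.IGNORECASE)


def normalize_filename(filename, mime_type=None):
    """Lowercase the extension, or infer it from the MIME type if it is missing."""
    stem, ext = os.path.splitext(filename)
    ext = ext.lower()
    if not ext and mime_type:
        # Drop parameters such as "; charset=binary" before the lookup.
        base_type = mime_type.split(";")[0].strip().lower()
        ext = MIME_TO_EXTENSION.get(base_type, "")
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported or unknown file type: {filename!r}")
    return stem + ext


def looks_truncated(items):
    """Return True if any extracted item looks like a placeholder for omitted items."""
    return any(_PLACEHOLDER.search(str(item)) for item in items)


print(normalize_filename("IMG_0001.JPG"))             # IMG_0001.jpg
print(normalize_filename("scan", "application/pdf"))  # scan.pdf
print(looks_truncated(["Item1", "Item2", "// ...remaining items"]))  # True
```

A check like `looks_truncated` only catches the obvious placeholder case, so it would work as a cheap trigger for a retry before any heavier post-processing, not as a replacement for a confidence threshold.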
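I think LLMs are great for applications like Cursor, or for customer support, where they don't need to give "perfect" responses because a human operator will prompt them further. How many times have you had to deal with stupid output from Cursor? (I'm a power user, I deal with this daily.) RAG is a cool application, and there's no real need for correctness or exactness there, IMO. I've fed it hundreds of my notes, which I reference sometimes. I get different answers each time, but I don't need them to be perfect.
:q!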