Ask HN: Using AI/LLM APIs makes me want to give up. What am I doing wrong?

1 point | by moomoo11 | 6 months ago
I'm trying to automate a few manual processes we have right now, but I still can't get over this hump. What am I doing wrong?

I am using these AI APIs for actual processing-type work, and I am left defeated and somewhat angry, if I'm being honest. These AI companies sell us some galaxy-brain vision of automation, but actually using their services is a disappointing experience.

1. The results are never consistent. "Please ensure you extract ALL items" -> [Item1, Item2, Item3, "literally a comment // ...remaining items"] WHAT THE F$#K!! Sometimes it gives me a full list of all items, and sometimes it does that BS. I provided a tool, and half of the time it just grabs the first 3, and maybe the very last one too (ignoring everything in the middle).

2. Because the results are not reliable, I have to do more post-processing. About 60% of the time, even after post-processing, I have to reject the results because they don't meet my confidence threshold.

3. The APIs are poorly supported by the vendors.

- iOS has some insane behavior where file extensions are sometimes .jpg or .JPG, etc. OpenAI's API, for example, will return Bad Request because the extension was not ".jpg", so now I have to add more code to ensure that when the user uploads files, I rename the file.

- The docs will say it supports a list of file formats, but then it rejects the request because the file was not a .PDF, even though the purpose was "assistants" (which the docs say can handle images). No problem, I'll just convert...

- Dealing with files coming from other sources (G Drive, etc.) where the extension is missing but the MIME type is present... Again, Bad Request.

4. We went from "AGI any day now" in 2024 to "_A_rtificial _S_uper _I_ntelligence any day now" today. Can we just relax? Did I fall for a marketing trap?

I think LLMs are great for applications like Cursor, or for customer support, where they don't need to give "perfect" responses because a human operator will prompt them further. How many times have you had to deal with stupid output from Cursor? (I'm a power user; I deal with this daily.) RAG is a cool application, and there's no real need for correctness or exactness there, IMO. I've fed it hundreds of my notes, which I reference sometimes. I get different answers each time, but I don't need them to be perfect.

:q!
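
For the file-handling pain in point 3, a minimal sketch of the renaming workaround might look like the following. It is only an illustration: the function name `normalize_upload_name` and both lookup tables are hypothetical, and the set of extensions a given endpoint accepts is an assumption that should be checked against the vendor's docs. The idea is to lowercase known extensions, and to derive an extension from the MIME type when the name has none (the G Drive case).

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical tables: adjust to whatever the target endpoint really accepts.
MIME_TO_EXT = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "application/pdf": ".pdf",
}
EXT_ALIASES = {
    ".jpg": ".jpg",
    ".jpeg": ".jpg",
    ".png": ".png",
    ".pdf": ".pdf",
}


def normalize_upload_name(filename: str, mime_type: Optional[str] = None) -> str:
    """Return the filename with a lowercase, canonical extension.

    The declared MIME type wins when it is recognized; otherwise the
    existing extension is used. Raises ValueError if neither is known.
    """
    path = PurePosixPath(filename)
    ext = path.suffix.lower()

    canonical = None
    if mime_type:
        # Drop parameters such as "; charset=binary" before the lookup.
        canonical = MIME_TO_EXT.get(mime_type.split(";")[0].strip().lower())
    if canonical is None:
        canonical = EXT_ALIASES.get(ext)
    if canonical is None:
        raise ValueError(f"cannot determine a supported type for {filename!r}")

    # Only strip the suffix when it is a recognized extension, so a name
    # like "scan.2024" keeps its full text.
    stem = path.stem if ext in EXT_ALIASES else path.name
    return stem + canonical


if __name__ == "__main__":
    print(normalize_upload_name("IMG_0001.JPG"))
    print(normalize_upload_name("receipt", "image/png"))
```

This only fixes the name. It does not verify that the bytes match the declared type, so a real pipeline would probably still want a content check before upload.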