Show HN: I built a personal AI news aggregator to filter RSS feeds (n8n and OpenAI)
1 point • by practicalaifg • 6 months ago
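Hi HN,
I found myself wasting too much time doomscrolling through tech news and RSS feeds, scanning hundreds of headlines just to find the 3-4 items that actually mattered to my work.
To fix this, I built a self-hosted automation workflow with n8n that acts as a personal editor.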
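The architecture:
Ingest: Pulls RSS feeds (TechCrunch, Hacker News, etc.) every morning.
Filter (the agent): Passes headlines to GPT-4o-mini with a system prompt to "act as a senior editor." It scores each article from 0 to 10 against specific interests (e.g., "High interest in local LLMs," "Low interest in crypto gossip").
Logic: Discards anything with a score below 7.
Research: Uses the Tavily API to scrape and summarize the full content of the high-scoring articles.
Delivery: Sends a single, clean email digest via SMTP.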
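
The filter, threshold, and delivery steps above can be sketched in plain Python, outside n8n. This is an illustration only, not the workflow from the repository: the `Article` type, the helper names, and the assumed JSON reply shape (`[{"index": 0, "score": 8}, ...]`) are all hypothetical, and the call to the model itself is left out.

```python
import json
from dataclasses import dataclass
from email.message import EmailMessage

SCORE_THRESHOLD = 7  # from the post: anything scoring below 7 is discarded


@dataclass
class Article:
    title: str
    url: str
    score: int = 0
    summary: str = ""


def parse_scores(raw_reply: str, articles: list[Article]) -> list[Article]:
    """Attach scores from a model reply.

    Assumes (hypothetically) that the model was asked to answer with a JSON
    list such as [{"index": 0, "score": 8}]. Malformed or out-of-range
    entries are skipped, so those articles keep the default score of 0.
    """
    try:
        entries = json.loads(raw_reply)
    except json.JSONDecodeError:
        return articles
    if not isinstance(entries, list):
        return articles
    for entry in entries:
        if not isinstance(entry, dict):
            continue
        idx, score = entry.get("index"), entry.get("score")
        if (type(idx) is int and type(score) is int
                and 0 <= idx < len(articles) and 0 <= score <= 10):
            articles[idx].score = score
    return articles


def keep_high_scoring(articles: list[Article],
                      threshold: int = SCORE_THRESHOLD) -> list[Article]:
    """Drop every article whose score is below the threshold."""
    return [a for a in articles if a.score >= threshold]


def build_digest(articles: list[Article], sender: str,
                 recipient: str) -> EmailMessage:
    """Format the surviving articles as one plain-text email."""
    msg = EmailMessage()
    msg["Subject"] = f"Daily digest: {len(articles)} article(s)"
    msg["From"] = sender
    msg["To"] = recipient
    body = "\n\n".join(
        f"[{a.score}/10] {a.title}\n{a.url}\n{a.summary}" for a in articles
    )
    msg.set_content(body or "No articles cleared the threshold today.")
    return msg
```

The resulting message could be handed to `smtplib.SMTP(...).send_message(msg)` from the standard library. Treating unparseable scores as 0 means a bad model reply produces an empty digest rather than an unfiltered one, which seemed the safer default for this sketch.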
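The hardest part (SSE and timeouts): The biggest technical hurdle was handling timeouts. Because the AI research step takes time, the HTTP requests would often drop. I had to configure Server-Sent Events (SSE) and adjust the execution timeout environment variables in Node.js to keep the connection alive during the deep-dive research phase.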
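
The post does not show how SSE was configured, so the following is only a generic sketch of the keep-alive idea, written in Python rather than Node.js: while a slow task runs in the background, the stream emits SSE comment lines (lines starting with `:`) so that proxies and clients see traffic and do not time out. The function name `sse_stream` and the event names `result` and `error` are assumptions made for this example.

```python
import queue
import threading
import time
from typing import Callable, Iterator


def sse_stream(task: Callable[[], str],
               heartbeat_seconds: float = 15.0) -> Iterator[str]:
    """Run `task` in a background thread and yield SSE frames.

    While the task is running, a comment frame is yielded every
    `heartbeat_seconds`. When it finishes, a single `result` (or `error`)
    event carrying the payload is yielded and the stream ends.
    """
    results: queue.Queue = queue.Queue(maxsize=1)

    def worker() -> None:
        try:
            results.put(("result", task()))
        except Exception as exc:  # report the failure instead of hanging
            results.put(("error", str(exc)))

    threading.Thread(target=worker, daemon=True).start()

    while True:
        try:
            event, payload = results.get(timeout=heartbeat_seconds)
        except queue.Empty:
            # SSE comment line: ignored by clients, keeps the connection busy.
            yield ": keep-alive\n\n"
            continue
        # Each line of a multi-line payload needs its own `data:` field.
        lines = payload.splitlines() or [""]
        data = "".join(f"data: {line}\n" for line in lines)
        yield f"event: {event}\n{data}\n"
        return
```

A web framework's streaming response would iterate this generator and write each frame to a `text/event-stream` response. A heartbeat only prevents idle-connection timeouts; a hard execution-time limit on the server still has to be raised separately, as the post describes.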
Resources:
Workflow/source (JSON): [https://github.com/sojojp-hue/NewsSummarizer/tree/main](https://github.com/sojojp-hue/NewsSummarizer/tree/main)
Video walkthrough and demo: [https://youtu.be/mOnbK6DuFhc](https://youtu.be/mOnbK6DuFhc)
I'd love to hear how others are handling information overload, or whether there are better ways to handle long-polling for the AI agents.