RSS Aggregator
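v1.0.0
Monitor, filter, and summarize RSS/Atom feeds on a schedule. Use cases: (1) track industry news or competitor blogs, (2) set keyword alerts across multiple feeds, (3) get daily or periodic digests of new articles, (4) route interesting articles to Discord, email, or a webhook, (5) build a personal news pipeline. Triggers: rss feed, atom, feed monitor, news aggregator, track this blog, keyword alert, feed digest, subscribe to this feed, monitor this site.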
Monitor RSS/Atom feeds on a schedule, filter entries by keyword or date, and route digests to your preferred channel.

Setup

Requires the feedparser Python package:

pip install feedparser

Core script

Save as scripts/fetch_feeds.py:

#!/usr/bin/env python3
"""RSS/Atom feed fetcher with age and keyword filtering."""
import json
import sys
from datetime import datetime, timedelta, timezone

import feedparser


def parse_date(entry):
    """Return the entry's publication date as an aware UTC datetime, or None."""
    for field in ('published_parsed', 'updated_parsed', 'created_parsed'):
        parsed = entry.get(field)
        if parsed:
            # feedparser normalizes *_parsed values to UTC time tuples;
            # unpack the first six fields (year through second).
            return datetime(*parsed[:6], tzinfo=timezone.utc)
    return None


def fetch_feed(url, max_age_days=None, keyword_filter=None):
    """Fetch a feed and return its entries, optionally filtered."""
    feed = feedparser.parse(url)
    entries = feed.entries

    # Filter by age; entries without a usable date are dropped.
    if max_age_days is not None:
        cutoff = datetime.now(timezone.utc) - timedelta(days=max_age_days)
        entries = [e for e in entries
                   if parse_date(e) and parse_date(e) >= cutoff]

    # Filter by keyword: case-insensitive substring match on title + summary.
    if keyword_filter:
        kw_lower = keyword_filter.lower()
        entries = [e for e in entries
                   if kw_lower in (e.get('title', '') + ' ' + e.get('summary', '')).lower()]

    return {
        'title': feed.feed.get('title', url),
        'url': url,
        'entries': [
            {
                'title': e.get('title', 'No title'),
                'link': e.get('link', ''),
                'published': parse_date(e).isoformat() if parse_date(e) else None,
                'summary': e.get('summary', e.get('description', ''))[:500],
            }
            for e in entries
        ],
    }


if __name__ == '__main__':
    url = sys.argv[1] if len(sys.argv) > 1 else ''
    max_age = int(sys.argv[2]) if len(sys.argv) > 2 else None
    keyword = sys.argv[3] if len(sys.argv) > 3 else None

    if not url:
        print(json.dumps({'error': 'URL required'}))
        sys.exit(1)

    result = fetch_feed(url, max_age, keyword)
    print(json.dumps(result, indent=2))

Recipes

Recipe 1: Daily news digest

cron_add(
    name="Tech news digest",
    schedule={"kind": "cron", "expr": "0 8 * * 1-5", "tz": "Africa/Johannesburg"},
    payload={
        "kind": "agentTurn",
        "message": "Run: python scripts/fetch_feeds.py https://news.ycombinator.com/rss 7. Then summarize the top 5 stories as a clean bullet list with titles and links."
    },
    delivery={"mode": "announce"},
    sessionTarget="isolated"
)
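The JSON that fetch_feeds.py prints can be turned into the kind of bullet list Recipe 1 asks for before it reaches the agent. The following is a minimal sketch, not part of the skill itself; `format_digest` and its `limit` parameter are hypothetical names, and the input is assumed to have the `title` / `entries` shape that `fetch_feed` returns.

```python
def format_digest(result, limit=5):
    """Render a fetch_feed()-style result dict as a plain-text bullet list."""
    entries = result.get('entries', [])[:limit]
    # First line is the feed title, followed by one bullet per entry.
    lines = [f"{result.get('title', 'Untitled feed')}:"]
    for entry in entries:
        title = entry.get('title') or 'No title'
        link = entry.get('link') or ''
        # Omit the parentheses when the entry has no link.
        lines.append(f"- {title} ({link})" if link else f"- {title}")
    return "\n".join(lines)
```

One way to use it is to load the script's stdout with `json.loads` and pass the resulting dict to `format_digest`.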
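Recipe 2: Multi-feed monitoring

First, create scripts/multi_fetch.py:

"""Fetch several feeds and print the combined results as JSON."""
import json

# When run as `python scripts/multi_fetch.py`, Python puts scripts/ on
# sys.path, so fetch_feeds is imported as a sibling module.
from fetch_feeds import fetch_feed

feeds = [
    "https://techcrunch.com/feed/",
    "https://www.theverge.com/rss/index.xml",
    "https://feeds.feedburner.com/TechCrunch/"
]

# Keep only entries published within the last day from each feed.
results = [fetch_feed(url, max_age_days=1) for url in feeds]
print(json.dumps(results, indent=2))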
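Two of the feed URLs above point at TechCrunch, so the combined output may list the same article twice, and the script filters by whole days while the schedule works in hours. A rough sketch of a post-processing step follows; `merge_entries`, `max_age_hours`, and the added `source` key are hypothetical, and deduplication only helps if both feeds report the same link URL.

```python
from datetime import datetime, timedelta, timezone


def merge_entries(results, max_age_hours=None, now=None):
    """Flatten several fetch_feed() results, dropping duplicate links."""
    now = now or datetime.now(timezone.utc)
    cutoff = None
    if max_age_hours is not None:
        cutoff = now - timedelta(hours=max_age_hours)

    seen = set()
    merged = []
    for result in results:
        for entry in result.get('entries', []):
            # Use the link (without trailing slash) as identity, else the title.
            key = entry.get('link', '').rstrip('/') or entry.get('title', '')
            if key in seen:
                continue
            if cutoff is not None:
                published = entry.get('published')
                if not published:
                    continue  # cannot place undated entries in the window
                when = datetime.fromisoformat(published)
                if when.tzinfo is None:
                    # Assumption: naive timestamps are already in UTC.
                    when = when.replace(tzinfo=timezone.utc)
                if when < cutoff:
                    continue
            seen.add(key)
            merged.append({**entry, 'source': result.get('title', '')})
    return merged
```

Calling `merge_entries(results, max_age_hours=6)` just before the final `print` would hand the agent a single deduplicated list limited to the last six hours.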
Then schedule it:

cron_add(
    name="Industry pulse",
    schedule={"kind": "cron", "expr": "0 */6 * * *", "tz": "UTC"},
    payload={
        "kind": "agentTurn",
        "message": "Run: python scripts/multi_fetch.py. Filter entries from last 6 hours. Post new articles to #news channel on Discord with title + link."
    },
    delivery={"mode": "announce"},
    sessionTarget="isolated"
)

Recipe 3: Keyword alerts

cron_add(
    name="AI keyword alert",
    schedule={"kind": "cron", "expr": "0 */4 * * *", "tz": "UTC"},
    payload={
        "kind": "agentTurn",
        "message": "Run: python scripts/fetch_feeds.py https://feeds.feedburner.com/venturebeat/Settings 1 \"AI OR machine learning OR LLM\". If results have entries, format as: Alert Article Title. Send to Discord #alerts channel."
    },
    delivery={"mode": "webhook", "to": "https://discord.com/api/webhooks/..."},
    sessionTarget="isolated"
)

Note: fetch_feeds.py compares the keyword argument as one case-insensitive substring, so the OR expression in this recipe only matches entries if the script is extended to split it into separate terms.
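The keyword argument in Recipe 3 is written as an OR expression, whereas a plain substring test treats the whole string as one literal. One possible way to support it is sketched below; `parse_keywords` and `matches_any` are hypothetical helpers, and the uppercase `OR` separator is an assumption taken from the recipe's message.

```python
import re


def parse_keywords(expr):
    """Split 'AI OR machine learning OR LLM' into lowercase search terms."""
    return [t.strip().lower() for t in re.split(r'\s+OR\s+', expr) if t.strip()]


def matches_any(text, terms):
    """Return True if any term occurs in text as a whole word or phrase."""
    lowered = text.lower()
    # Word boundaries keep a short term like 'ai' from matching inside 'said'.
    return any(
        re.search(r'\b' + re.escape(term) + r'\b', lowered)
        for term in terms
    )
```

Inside `fetch_feed`, the keyword filter could then keep an entry when `matches_any(title + ' ' + summary, parse_keywords(keyword_filter))` is true. Word-boundary matching is a design choice here: it reduces false positives for short terms, but it will miss terms that end in punctuation.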
Recipe 4: Feed health check

cron_add(
    name="Feed health check",
    schedule={"kind": "cron", "expr": "0 9 * * *", "tz": "UTC"},
    payload={
        "kind": "agentTurn",
        "message": "Check if these feeds are still live: Hacker News (https://news.ycombinator.com/rss), TechCrunch (https://techcrunch.com/feed/). Run fetch without filters. If any feed returns 0 entries or error, alert via webhook."
    },
    delivery={"mode": "announce"},
    sessionTarget="isolated",
    failureAlert={"after": 3, "mode": "announce", "cooldownMs": 86400000}
)
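The "0 entries or error" condition in Recipe 4 can be made explicit. The sketch below classifies a parsed result using the `status`, `bozo`, `bozo_exception`, and `entries` keys that feedparser sets on its result object; `feed_health` is a hypothetical helper, and `status` is only present when the feed was fetched over HTTP.

```python
def feed_health(parsed):
    """Classify a feedparser-style result mapping as 'ok' or a problem string."""
    status = parsed.get('status')
    if status is not None and status >= 400:
        return f'http error {status}'

    entries = parsed.get('entries') or []
    # bozo is set when the feed was not well-formed; treat it as fatal
    # only when nothing usable could be extracted.
    if parsed.get('bozo') and not entries:
        return f"parse error: {parsed.get('bozo_exception')}"
    if not entries:
        return 'no entries'
    return 'ok'
```

A check could call `feed_health(feedparser.parse(url))` for each URL and alert on any value other than `'ok'`. A malformed feed that still yields entries counts as healthy in this sketch, which may or may not be the behavior you want.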
Recipe 5: Feed to read-later (Notion)

cron_add(
    name="RSS to Notion",
    schedule={"kind": "cron", "expr": "0 7 * * *", "tz": "Africa/Johannesburg"},
    payload={
        "kind": "agentTurn",
        "message": "Run: python scripts/fetch_feeds.py https://example.com/rss 1. Create Notion page for each entry in your Reading List database with title, link, and summary as page content."
    },
    delivery={"mode": "none"},
    sessionTarget="isolated"
)
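For readers who would rather create the Notion pages from code than leave it to the agent, the mapping from a feed entry to a page could look roughly like this. It is a sketch under several assumptions: the body follows the page-creation shape of Notion's public API as of version 2022-06-28, the property names `Name` and `URL` are hypothetical and must match the actual Reading List database, and `notion_page_payload` is an illustrative helper that only builds the request body without sending it.

```python
def notion_page_payload(entry, database_id):
    """Build a page-creation request body from one fetch_feed() entry."""
    payload = {
        'parent': {'database_id': database_id},
        'properties': {
            # Assumed property names; rename to match the target database.
            'Name': {'title': [{'text': {'content': entry.get('title') or 'No title'}}]},
            'URL': {'url': entry.get('link') or None},
        },
    }
    summary = entry.get('summary') or ''
    if summary:
        # Store the summary as a single paragraph block in the page body.
        payload['children'] = [{
            'object': 'block',
            'type': 'paragraph',
            'paragraph': {
                'rich_text': [{'type': 'text', 'text': {'content': summary}}],
            },
        }]
    return payload
```

fetch_feeds.py already truncates summaries to 500 characters, which keeps the paragraph text short.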
Managing feeds

# Test a feed
python scripts/fetch_feeds.py <url> [max-age-days] [keyword-filter]

# Examples
python scripts/fetch_feeds.py https://news.ycombinator.com/rss 7
python scripts/fetch_feeds.py https://techcrunch.com/feed