RSS Aggregator

v1.0.0

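Monitor, filter, and summarize RSS/Atom feeds on a schedule. Use cases: (1) tracking industry news or competitor blogs, (2) setting keyword alerts across multiple feeds, (3) getting daily or periodic digests of new articles, (4) routing interesting articles to Discord, email, or a webhook, (5) building a personal news pipeline. Triggers: rss feed, atom, feed monitor, news aggregator, track this blog, keyword alert, feed digest, subscribe to this feed, monitor this site.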

by @fuzzyb33s (Fuzzyb33s)·MIT-0

Tags: API development, network tools, browser automation, communication tools, email services

License

MIT-0

Free to use, modify, and redistribute; no attribution required.

Runtime dependencies

No special dependencies

Install command

Official: npx clawhub@latest install fuzzy-rss-aggregator

Mirror: npx clawhub@latest install fuzzy-rss-aggregator --registry https://cn.longxiaskill.com (mirror available)

Skill documentation

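RSS Aggregator monitors RSS/Atom feeds on a schedule, filters them by keyword or date, and routes digests to your preferred channel.

Setup requires the feedparser Python package:

pip install feedparser

Save the core script as scripts/fetch_feeds.py: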

#!/usr/bin/env python3
"""RSS/Atom feed fetcher with filtering and summarization."""
import json
import sys
from datetime import datetime, timedelta, timezone

import feedparser


def parse_date(entry):
    """Extract the publication date from an entry as a naive UTC datetime."""
    for field in ('published_parsed', 'updated_parsed', 'created_parsed'):
        parsed = entry.get(field)
        if parsed:
            # feedparser yields a UTC time.struct_time; unpack its first six fields.
            return datetime(*parsed[:6])
    return None


def fetch_feed(url, max_age_days=None, keyword_filter=None):
    """Fetch and filter feed entries."""
    feed = feedparser.parse(url)
    entries = feed.entries

    # Filter by age; entries without a date are dropped when a limit is set.
    if max_age_days:
        # Compare against naive UTC, matching the dates parse_date returns.
        now_utc = datetime.now(timezone.utc).replace(tzinfo=None)
        cutoff = now_utc - timedelta(days=max_age_days)
        entries = [e for e in entries if parse_date(e) and parse_date(e) >= cutoff]

    # Filter by keyword: one case-insensitive substring match on title + summary.
    if keyword_filter:
        kw_lower = keyword_filter.lower()
        entries = [e for e in entries if kw_lower in (e.get('title', '') + e.get('summary', '')).lower()]

    return {
        'title': feed.feed.get('title', url),
        'url': url,
        'entries': [
            {
                'title': e.get('title', 'No title'),
                'link': e.get('link', ''),
                'published': parse_date(e).isoformat() if parse_date(e) else None,
                'summary': e.get('summary', e.get('description', ''))[:500]
            } for e in entries
        ]
    }


if __name__ == '__main__':
    # Usage: fetch_feeds.py <url> [max-age-days] [keyword-filter]
    url = sys.argv[1] if len(sys.argv) > 1 else ''
    max_age = int(sys.argv[2]) if len(sys.argv) > 2 else None
    keyword = sys.argv[3] if len(sys.argv) > 3 else None
    if not url:
        print(json.dumps({'error': 'URL required'}))
        sys.exit(1)
    result = fetch_feed(url, max_age, keyword)
    print(json.dumps(result, indent=2))
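
The age and keyword filters in the script can be tried out without network access. The sketch below reapplies the same two rules to plain dictionaries shaped like the script's JSON output. The `filter_entries` helper, the fixed `now` value, and the sample entries are hypothetical and not part of the skill.

```python
from datetime import datetime, timedelta


def filter_entries(entries, now, max_age_days=None, keyword=None):
    """Apply age and keyword filters to entries whose 'published' is an ISO string or None."""
    kept = []
    for entry in entries:
        if max_age_days:
            published = entry.get('published')
            # Undated entries are dropped once an age limit is set.
            if not published:
                continue
            cutoff = now - timedelta(days=max_age_days)
            if datetime.fromisoformat(published) < cutoff:
                continue
        if keyword:
            # Case-insensitive substring match on title + summary.
            text = (entry.get('title', '') + entry.get('summary', '')).lower()
            if keyword.lower() not in text:
                continue
        kept.append(entry)
    return kept


# Hypothetical sample data, evaluated against a fixed "current" time.
now = datetime(2024, 1, 10, 12, 0, 0)
sample = [
    {'title': 'New LLM released', 'summary': '', 'published': '2024-01-10T08:00:00'},
    {'title': 'Old LLM news', 'summary': '', 'published': '2023-12-01T08:00:00'},
    {'title': 'Gardening tips', 'summary': '', 'published': '2024-01-09T08:00:00'},
    {'title': 'Undated LLM post', 'summary': '', 'published': None},
]
recent_llm = filter_entries(sample, now, max_age_days=7, keyword='llm')
print([e['title'] for e in recent_llm])
```

Only the first sample entry survives both filters: the second is too old, the third lacks the keyword, and the fourth has no date to compare.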

Recipes

Recipe 1: Daily news digest

cron_add(
    name="Tech news digest",
    schedule={"kind": "cron", "expr": "0 8 * * 1-5", "tz": "Africa/Johannesburg"},
    payload={
        "kind": "agentTurn",
        "message": "Run: python scripts/fetch_feeds.py https://news.ycombinator.com/rss 7. Then summarize the top 5 stories as a clean bullet list with titles and links."
    },
    delivery={"mode": "announce"},
    sessionTarget="isolated"
)

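
Recipe 1 leaves the bullet-list formatting to the agent. If a deterministic digest is preferred, something like the following could render the script's JSON result directly. This is a sketch only: `format_digest` is a hypothetical helper, and "top" is assumed to mean the first entries in feed order.

```python
def format_digest(result, limit=5):
    """Render the first `limit` entries of a fetch_feed result as a plain-text bullet list."""
    lines = [result['title']]
    for entry in result['entries'][:limit]:
        lines.append(f"- {entry['title']}: {entry['link']}")
    return "\n".join(lines)


# Hypothetical result shaped like the output of scripts/fetch_feeds.py.
sample = {
    'title': 'Example Feed',
    'url': 'https://example.com/rss',
    'entries': [
        {'title': f'Story {i}', 'link': f'https://example.com/{i}',
         'published': None, 'summary': ''}
        for i in range(1, 8)
    ],
}
digest = format_digest(sample)
print(digest)
```
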
Recipe 2: Multi-feed monitoring

First, create scripts/multi_fetch.py:

import json

# Run as `python scripts/multi_fetch.py`: Python then puts scripts/ on sys.path,
# so the sibling module is imported directly rather than as scripts.fetch_feeds.
from fetch_feeds import fetch_feed

feeds = [
    "https://techcrunch.com/feed/",
    "https://www.theverge.com/rss/index.xml",
    "https://feeds.feedburner.com/TechCrunch/"
]

results = [fetch_feed(url, max_age_days=1) for url in feeds]
print(json.dumps(results, indent=2))

Then schedule it:

cron_add(
    name="Industry pulse",
    schedule={"kind": "cron", "expr": "0 */6 * * *", "tz": "UTC"},
    payload={
        "kind": "agentTurn",
        "message": "Run: python scripts/multi_fetch.py. Filter entries from last 6 hours. Post new articles to #news channel on Discord with title + link."
    },
    delivery={"mode": "announce"},
    sessionTarget="isolated"
)
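
When several feeds cover the same publication, the same article can appear more than once in the combined output. One possible way to flatten the per-feed results into a single list is sketched below. `merge_results` and the sample data are hypothetical, and entries are assumed to count as duplicates only when their links are identical.

```python
def merge_results(results):
    """Flatten per-feed results into one list, dropping repeated links, newest first."""
    seen = set()
    merged = []
    for result in results:
        for entry in result['entries']:
            link = entry['link']
            if link and link in seen:
                continue  # already collected from an earlier feed
            seen.add(link)
            merged.append({**entry, 'source': result['title']})
    # ISO 8601 strings in one format sort chronologically; undated entries go last.
    merged.sort(key=lambda e: e['published'] or '', reverse=True)
    return merged


# Hypothetical results shaped like the output of scripts/multi_fetch.py.
results = [
    {'title': 'Feed A', 'url': 'https://example.com/a', 'entries': [
        {'title': 'Shared story', 'link': 'https://example.com/s',
         'published': '2024-01-10T08:00:00', 'summary': ''},
        {'title': 'Only in A', 'link': 'https://example.com/a1',
         'published': '2024-01-10T09:00:00', 'summary': ''},
    ]},
    {'title': 'Feed B', 'url': 'https://example.com/b', 'entries': [
        {'title': 'Shared story', 'link': 'https://example.com/s',
         'published': '2024-01-10T08:00:00', 'summary': ''},
    ]},
]
merged = merge_results(results)
print([(e['title'], e['source']) for e in merged])
```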

Recipe 3: Keyword alerts

Note that scripts/fetch_feeds.py as written performs a single case-insensitive substring match, so the quoted OR expression below is treated as one literal phrase unless the script is extended to split it.

cron_add(
    name="AI keyword alert",
    schedule={"kind": "cron", "expr": "0 */4 * * *", "tz": "UTC"},
    payload={
        "kind": "agentTurn",
        "message": "Run: python scripts/fetch_feeds.py https://feeds.feedburner.com/venturebeat/Settings 1 \"AI OR machine learning OR LLM\". If results have entries, format as: Alert Article Title. Send to Discord #alerts channel."
    },
    delivery={"mode": "webhook", "to": "https://discord.com/api/webhooks/..."},
    sessionTarget="isolated"
)
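
The keyword filter in scripts/fetch_feeds.py checks for one substring, so an expression such as "AI OR machine learning OR LLM" would only match text containing that entire phrase. A minimal sketch of OR-style matching follows. `matches_any` is a hypothetical helper, and the uppercase ` OR ` separator is an assumption taken from the recipe's message rather than a documented syntax.

```python
def matches_any(text, expression):
    """Return True if any ' OR '-separated term occurs in text, case-insensitively."""
    terms = [term.strip().lower() for term in expression.split(' OR ')]
    haystack = text.lower()
    return any(term in haystack for term in terms if term)


expression = 'AI OR machine learning OR LLM'

# What the current single-substring filter does with the same expression.
literal_hit = expression.lower() in 'New LLM benchmark'.lower()

print(literal_hit)
print(matches_any('New LLM benchmark', expression))
print(matches_any('Gardening tips', expression))
print(matches_any('Email outage', expression))
```

The last call returns True because "ai" occurs inside "Email". Short terms produce false positives under substring matching, so a word-boundary check may be worth adding for terms like "AI".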

Recipe 4: Feed health check

cron_add(
    name="Feed health check",
    schedule={"kind": "cron", "expr": "0 9 * * *", "tz": "UTC"},
    payload={
        "kind": "agentTurn",
        "message": "Check if these feeds are still live: Hacker News (https://news.ycombinator.com/rss), TechCrunch (https://techcrunch.com/feed/). Run fetch without filters. If any feed returns 0 entries or error, alert via webhook."
    },
    delivery={"mode": "announce"},
    sessionTarget="isolated",
    failureAlert={"after": 3, "mode": "announce", "cooldownMs": 86400000}
)
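
The "0 entries or error" rule in this recipe can be expressed as a small check over the script's JSON output. The sketch below is illustrative only: `feed_problem`, `unhealthy_feeds`, and the sample results are hypothetical, and it assumes an error surfaces as an `error` key, as in the script's own "URL required" message.

```python
def feed_problem(result):
    """Describe what is wrong with a fetch result, or return None if it looks healthy."""
    if 'error' in result:
        return f"error: {result['error']}"
    if not result.get('entries'):
        return 'no entries returned'
    return None


def unhealthy_feeds(results_by_name):
    """Map each failing feed name to a short problem description."""
    problems = {}
    for name, result in results_by_name.items():
        problem = feed_problem(result)
        if problem:
            problems[name] = problem
    return problems


# Hypothetical fetch results keyed by a display name.
checks = {
    'Feed A': {'title': 'Feed A', 'url': 'https://example.com/a',
               'entries': [{'title': 'Post', 'link': 'https://example.com/a/1'}]},
    'Feed B': {'title': 'Feed B', 'url': 'https://example.com/b', 'entries': []},
    'Feed C': {'error': 'URL required'},
}
problems = unhealthy_feeds(checks)
print(problems)
```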

Recipe 5: Feed to read-later list (Notion)

cron_add(
    name="RSS to Notion",
    schedule={"kind": "cron", "expr": "0 7 * * *", "tz": "Africa/Johannesburg"},
    payload={
        "kind": "agentTurn",
        "message": "Run: python scripts/fetch_feeds.py https://example.com/rss 1. Create Notion page for each entry in your Reading List database with title, link, and summary as page content."
    },
    delivery={"mode": "none"},
    sessionTarget="isolated"
)

Managing feeds

# Test a feed
python scripts/fetch_feeds.py <url> [max-age-days] [keyword-filter]

# Examples
python scripts/fetch_feeds.py https://news.ycombinator.com/rss 7
python scripts/fetch_feeds.py https://techcrunch.com/feed
