Feature Description
Expand the current XResearch workflow into a unified SourceDiscovery pipeline that collects trending content from multiple platforms (X/Twitter, TikTok, YouTube Shorts, Instagram Reels, Reddit) and consolidates it into a single research feed for downstream workflows.
Motivation
Currently, XResearch only covers X/Twitter: trending topics → AI-analyzed posts → Telegram reports. The Video Sourcing + Retouch workflow (see companion issue) needs broader source discovery — not just text posts but also video content — to find viral clips across platforms, extract key themes, and feed them as source material for retouch/re-upload.
A consolidated research pipeline would:
- Find video content (not just text posts) across TikTok, YouTube Shorts, Instagram Reels
- Cross-reference trends across platforms — see the same trend on X, TikTok, and Reddit
- Deduplicate and rank source material by engagement metrics across all platforms
- Store source metadata (URL, platform, views, engagement, thumbnail, duration) in a structured format that downstream workflows can consume
- Send consolidated research reports to Telegram (like current XResearch)
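The cross-platform dedup/rank step above could look something like the sketch below. The helper names (`engagement_score`, `dedupe_and_rank`) and the scoring weights are illustrative assumptions, not part of the proposal; field names follow the SourceItem schema proposed further down.

```python
def engagement_score(item: dict) -> int:
    # Illustrative weights only: shares weighted highest as a virality
    # signal. The real ranking formula is an open design question.
    e = item.get("engagement", {})
    return (
        item.get("views", 0)
        + 10 * e.get("likes", 0)
        + 25 * e.get("shares", 0)
        + 15 * e.get("comments", 0)
    )


def dedupe_and_rank(items: list[dict]) -> list[dict]:
    # Keep the first occurrence of each URL, then sort by score descending.
    seen: set[str] = set()
    unique: list[dict] = []
    for item in items:
        url = item.get("url", "")
        if url and url not in seen:
            seen.add(url)
            unique.append(item)
    return sorted(unique, key=engagement_score, reverse=True)
```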
Proposed Solution
New class: src/classes/SourceDiscovery.py
SourceDiscovery(fp_profile_path)
├── discover_x_trends() # existing XResearch logic, refactored
├── discover_tiktok_trends() # scrape TikTok trending/creative center
├── discover_youtube_trends() # scrape YouTube Shorts trending
├── discover_reddit_trends() # scrape Reddit popular/rising posts
├── consolidate_sources() # merge + dedupe + rank across platforms
└── run() → List[SourceItem] # unified pipeline
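A minimal skeleton of the proposed class, matching the tree above. Method bodies are placeholders (the real ones would hold Selenium scraping logic); only the `run()` orchestration is sketched.

```python
from typing import List


class SourceDiscovery:
    """Skeleton only: discover_* bodies are stubs for the real scrapers."""

    def __init__(self, fp_profile_path: str):
        self.fp_profile_path = fp_profile_path

    def discover_x_trends(self) -> List[dict]:
        return []  # existing XResearch logic, refactored

    def discover_tiktok_trends(self) -> List[dict]:
        return []  # scrape TikTok trending / Creative Center

    def discover_youtube_trends(self) -> List[dict]:
        return []  # scrape YouTube Shorts trending

    def discover_reddit_trends(self) -> List[dict]:
        return []  # scrape Reddit popular/rising posts

    def consolidate_sources(self, items: List[dict]) -> List[dict]:
        return items  # merge + dedupe + rank placeholder

    def run(self) -> List[dict]:
        # Unified pipeline: gather from every platform, then consolidate.
        items: List[dict] = []
        for discover in (
            self.discover_x_trends,
            self.discover_tiktok_trends,
            self.discover_youtube_trends,
            self.discover_reddit_trends,
        ):
            items.extend(discover())
        return self.consolidate_sources(items)
```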
SourceItem schema (replaces current XResearch post dict)
{
  "id": "src-<platform>-<hash>",
  "platform": "tiktok|x|youtube|reddit|instagram",
  "url": "https://...",
  "title": "...",
  "content": "...",
  "views": 0,
  "engagement": { "likes": 0, "shares": 0, "comments": 0 },
  "has_video": true,
  "video_url": "https://...",
  "thumbnail_url": "https://...",
  "duration_seconds": 30,
  "trend_name": "...",
  "ai_analysis": { "main_topic": "...", "keywords": [], "angle": "...", "summary": "..." },
  "discovered_at": "2026-04-25T..."
}
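A sketch of how a SourceItem could be constructed, including the `src-<platform>-<hash>` id. The helper names (`make_source_id`, `new_source_item`) and the choice of a truncated SHA-1 of the URL are assumptions; any stable hash would do.

```python
import hashlib
from datetime import datetime, timezone


def make_source_id(platform: str, url: str) -> str:
    # A short, deterministic hash of the URL keeps ids stable across
    # runs, which helps dedupe against previously discovered sources.
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()[:12]
    return f"src-{platform}-{digest}"


def new_source_item(platform: str, url: str, title: str = "") -> dict:
    # Minimal SourceItem with every schema field present; the platform
    # scrapers would fill in views/engagement/video details.
    return {
        "id": make_source_id(platform, url),
        "platform": platform,
        "url": url,
        "title": title,
        "content": "",
        "views": 0,
        "engagement": {"likes": 0, "shares": 0, "comments": 0},
        "has_video": False,
        "video_url": None,
        "thumbnail_url": None,
        "duration_seconds": None,
        "trend_name": None,
        "ai_analysis": None,
        "discovered_at": datetime.now(timezone.utc).isoformat(),
    }
```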
New script: scripts/run_source_discovery.py
CLI entry point mirroring scripts/xresearch.py pattern, with flags:
--platforms x,tiktok,youtube,reddit
--min-engagement 1000
--save-json (persist SourceItems to .mp/sources/)
--send-telegram
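The flag parsing for the new script could be sketched with `argparse` as below. Flag names come from the proposal; the defaults are illustrative assumptions.

```python
import argparse


def parse_args(argv=None):
    # Mirrors the flags proposed for scripts/run_source_discovery.py.
    parser = argparse.ArgumentParser(description="Run source discovery")
    parser.add_argument("--platforms", default="x,tiktok,youtube,reddit",
                        help="comma-separated list of platforms to scrape")
    parser.add_argument("--min-engagement", type=int, default=1000,
                        help="drop sources below this engagement score")
    parser.add_argument("--save-json", action="store_true",
                        help="persist SourceItems to .mp/sources/")
    parser.add_argument("--send-telegram", action="store_true",
                        help="send the consolidated report to Telegram")
    args = parser.parse_args(argv)
    # Normalize the platform list once, here, so the pipeline gets a list.
    args.platforms = [p.strip() for p in args.platforms.split(",") if p.strip()]
    return args
```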
Storage: .mp/sources/ directory
Each run saves sources_YYYY-MM-DD.json so downstream workflows can pick up discovered sources.
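The save/load side of the storage contract could look like this sketch (function names are assumptions). The date-stamped filenames sort lexicographically, so "latest" is just the last glob match.

```python
import json
from datetime import date
from pathlib import Path


def save_sources(items: list[dict], base_dir: str = ".mp/sources") -> Path:
    # One file per run, named sources_YYYY-MM-DD.json as proposed.
    out_dir = Path(base_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"sources_{date.today().isoformat()}.json"
    path.write_text(json.dumps(items, indent=2, ensure_ascii=False))
    return path


def load_latest_sources(base_dir: str = ".mp/sources") -> list[dict]:
    # ISO dates sort correctly as strings, so sorted()[-1] is the newest.
    files = sorted(Path(base_dir).glob("sources_*.json"))
    return json.loads(files[-1].read_text()) if files else []
```

Downstream workflows (e.g. Video Sourcing + Retouch) would call `load_latest_sources()` to pick up the most recent discovery run.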
Migration from current XResearch
- XResearch.py stays functional, but its internal methods get refactored into SourceDiscovery
- The xresearch.py script stays as a shortcut for --platforms x
- Full migration path: the old script keeps working; the new script is the supercharged version
Alternatives Considered
- Per-platform separate scripts → rejected (cross-platform dedup/ranking is core value)
- API-based discovery → rejected (most platforms require auth; Selenium matches existing pattern)
- Just extend XResearch → rejected (scope too different; new class keeps things clean)
Dependencies
- Companion issue: Video Sourcing + Retouch workflow
- Requires yt-dlp for video metadata extraction