If this tool is helping you, please ⭐ the repo! It really helps discoverability.
SEC 13F Filing Tracker | Institutional Portfolio Analysis | AI-Powered Stock Research
A comprehensive Python tool for tracking hedge fund portfolios through SEC filings (13F, 13D/G, Form 4). Transform raw SEC EDGAR data into actionable investment insights. Built for financial analysts, quantitative traders, and retail investors seeking to analyze institutional investor strategies, portfolio changes, and discover stock opportunities by following elite fund managers.
Keywords: SEC filings tracker, 13F analysis, hedge fund portfolio, institutional investors, stock research, investment intelligence, CUSIP converter, financial data scraper, AI stock analysis
- 📊 Hedge Fund Tracker
- 🚀 Quick Start
- ✨ Key Features
- 📦 Installation
- 📁 Project Structure
- 👨🏻💻 How This Tool Tracks Hedge Funds
- 🏢 Hedge Funds Selection
- 🧠 AI Models Selection
- ⚠️ Limitations & Considerations
- ⚙️ Automation with GitHub Actions
- 🗃️ Technical Stack
- 🤝🏼 Contributing & Support
- 📚 References
- 🙏🏼 Acknowledgments
- 📄 License
# Clone the repository
git clone https://github.com/dokson/hedge-fund-tracker.git
cd hedge-fund-tracker
# Install Python dependencies
pipenv install
# Install and build the React frontend
cd app/frontend && npm install && npm run build && cd ../..
# Run the application (opens web UI in your browser)
pipenv run python -m app.main

| Feature | Description |
|---|---|
| 🌐 Modern Web UI | Premium React-based platform with real-time SSE streaming for AI tasks, native Dark Mode, and responsive design. |
| 📊 Visual Analytics | Interactive charts (Recharts) to track institutional holdings, sectoral trends, and quarterly portfolio evolutions. |
| 🆚 Comparative Analysis | Combines quarterly (13F) and non-quarterly (13D/G, Form 4) filings for an up-to-date view. |
| 📋 Comprehensive Reports | High-fidelity analysis pages for both investment funds (portfolios) and specific stocks (tickers). |
| 🔍 Smart Ticker Resolution | Multi-fallback system (yfinance, Finnhub, FinanceDatabase) to resolve CUSIPs into actionable stock symbols. |
| 🤖 AI Financial Analyst | Leverages top-tier LLMs to calculate "Promise Scores" and perform deep due diligence on high-conviction opportunities. |
| ⚙️ Automated Data Pipeline | Scheduled GitHub Actions to fetch, process, and commit the latest SEC filings directly to your repository. |
| 🌐 GitHub Pages Demo | Static deployment with bundled data — all analysis features work without a backend. |
| ⭐ Personalized Watchlist | Star your favorite funds or stocks for quick access and personalized tracking across the platform. |
| 🗃️ GICS Hierarchy | Autonomous parser to build a granular GICS classification database. |
- Python 3.13+
- pipenv (install with pip install pipenv)
- 📥 Clone and navigate:
  git clone https://github.com/dokson/hedge-fund-tracker.git
  cd hedge-fund-tracker
- 📲 Install dependencies: Navigate to the project root and run the following command. This will create a virtual environment and install all required packages.
  pipenv install
  💡 Tip: If pipenv is not found, you might need to use python -m pipenv install. This can happen if the user scripts directory is not in your system's PATH.
- 🔨 Build the frontend: Build the React interface (required once before first run):
  cd app/frontend && npm install && npm run build && cd ../..
- ▶️ Run the application: Execute within the project's virtual environment:
  pipenv run python -m app.main
  This starts a FastAPI server on http://localhost:8000 and opens the web UI in your browser automatically.

⚠️ Note on CLI mode (Legacy): The terminal CLI is a deprecated version of the tool, built before the development of the modern Web UI. While still functional, it requires a manual .env configuration. This file is automatically generated the first time you launch the Web UI. So, if you still wish to use the "old school" CLI, just run:
pipenv run python -m app.main --cli
The data update operations (downloading and processing filings) are inside a dedicated script. This keeps the main application focused on analysis, while the updater handles populating and refreshing the database.
To run the data update operations, you need to use the updater.py script from the project root:
pipenv run python -m database.updater

The updater.py script includes semi-automated maintenance tasks:
- Sorting: Upon exit (option 0), the script automatically sorts the database/stocks.csv file by ticker to maintain performance and prevent Git diff noise.
- Auto-Documentation: This README's excluded funds section is synchronized whenever the database is refreshed manually.
Running the updater opens a separate menu for data management:
┌───────────────────────────────────────────────────────────────────────────────┐
│ Hedge Fund Tracker - Database Updater │
├───────────────────────────────────────────────────────────────────────────────┤
│ 0. Exit │
│ 1. Generate latest 13F reports for all known hedge funds │
│ 2. Fetch latest non-quarterly filings for all known hedge funds │
│ 3. Generate 13F report for a known hedge fund │
│ 4. Manually enter a hedge fund CIK to generate a 13F report │
└───────────────────────────────────────────────────────────────────────────────┘

The project includes an autonomous parser (database/gics/updater.py) for the GICS (Global Industry Classification Standard) taxonomy, which was originally developed by MSCI and S&P. The parser scrapes Wikipedia to build a full hierarchy of 163 sub-industries. This provides the AI Analyst with granular industry context while remaining independent of third-party libraries.
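To illustrate why a full hierarchy can be derived from sub-industry entries alone, here is a minimal sketch that relies only on the public GICS numbering scheme, where an 8-digit sub-industry code nests its industry (first 6 digits), industry group (first 4), and sector (first 2). The function and type names are hypothetical; this is not the project's parser, and the actual layout of hierarchy.csv is not assumed.

```python
from typing import NamedTuple


class GicsLevels(NamedTuple):
    """The four nested GICS codes contained in one sub-industry code."""
    sector: str
    industry_group: str
    industry: str
    sub_industry: str


def split_gics_code(code: str) -> GicsLevels:
    """Split an 8-digit GICS sub-industry code into its four hierarchy levels."""
    if len(code) != 8 or not code.isdigit():
        raise ValueError(f"expected an 8-digit GICS code, got {code!r}")
    # Each level is a prefix of the next, two digits longer each time.
    return GicsLevels(code[:2], code[:4], code[:6], code)


# 45102010 belongs to sector 45 (Information Technology).
levels = split_gics_code("45102010")
print(levels.sector, levels.industry_group, levels.industry, levels.sub_industry)
```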
The tool can utilize API keys for enhanced functionality, but all are optional:
| Service | Purpose | Get Free API Key |
|---|---|---|
| Finnhub | CUSIP to stock ticker conversion | Finnhub Keys |
| GitHub Models | Access to top-tier models (e.g., xAI Grok-3, OpenAI GPT-5, etc...) | GitHub Tokens |
| Google AI Studio | Access to Google Gemini models | AI Studio Keys |
| Groq AI | Access to various LLMs (e.g., OpenAI gpt-oss, Meta Llama, etc...) | Groq Keys |
| Hugging Face | Access to open weights models (e.g., DeepSeek R1, Kimi-Linear-48B, etc...) | HF Tokens |
| OpenRouter | Access to various LLMs (e.g., Claude 4.5 Opus, GLM 4.5 Air, etc...) | OpenRouter Keys |
💡 Note: Ticker resolution primarily uses yfinance, which is free and requires no API key. If that fails, the system falls back to Finnhub (if an API key is provided), with the final fallback being FinanceDatabase.
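The fallback order described above can be sketched as a simple chain of resolvers tried in sequence. This is only an illustration of the pattern: the three resolver functions below are hypothetical stand-ins that make no network calls, and the project's real lookup code may be structured differently.

```python
from typing import Callable, Iterable, Optional

Resolver = Callable[[str], Optional[str]]


def resolve_ticker(cusip: str, resolvers: Iterable[Resolver]) -> Optional[str]:
    """Return the first ticker any resolver finds, trying them in order."""
    for resolver in resolvers:
        try:
            ticker = resolver(cusip)
        except Exception:
            # A failing source (missing API key, network error) just
            # hands over to the next one in the chain.
            continue
        if ticker:
            return ticker
    return None


# Hypothetical stand-ins for the three sources (assumptions, not real APIs).
def from_yfinance(cusip: str) -> Optional[str]:
    return None  # pretend the primary lookup found nothing


def from_finnhub(cusip: str) -> Optional[str]:
    raise RuntimeError("no API key configured")


def from_finance_database(cusip: str) -> Optional[str]:
    return {"037833100": "AAPL"}.get(cusip)


chain = [from_yfinance, from_finnhub, from_finance_database]
print(resolve_ticker("037833100", chain))
```

Keeping the sources in an ordered list makes the "optional API key" behavior natural: a source that cannot run simply yields to the next one.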
💡 Note: You don't need to use all the APIs. For the generative AI models (Google AI Studio, GitHub Models, Groq AI, Hugging Face, and OpenRouter), you only need the API keys for the services you plan to use. For instance, if you want to experiment with models like OpenAI GPT-4o mini, you just need a GitHub Token. Experimenting with different models is encouraged, as the quality of AI-generated analysis, both for identifying promising stocks and for conducting due diligence, can vary. However, top-performing stocks are typically identified consistently across all tested models. All APIs used in this project are currently free (with GitHub Models providing a generous free tier for developers).
hedge-fund-tracker/
├── 📁 .github/
│   ├── 📁 scripts/
│   │   └── 🐍 fetcher.py # Daily script for data fetching (scheduled by workflows/filings-fetch.yml)
│   └── 📁 workflows/ # GitHub Actions for automation
│       ├── ⚙️ deploy-pages.yml # GitHub Actions: Deploy to GitHub Pages
│       ├── ⚙️ filings-fetch.yml # GitHub Actions: Filings fetching job
│       └── ⚙️ python-tests.yml # GitHub Actions: Unit tests
├── 📁 app/ # Main application logic
│   ├── 📁 frontend/ # React + Vite web UI
│   │   ├── 📁 public/ # Static assets (404.html, logo.png)
│   │   ├── 📁 scripts/ # copy-database.mjs (bundles CSVs for GH Pages)
│   │   ├── 📁 src/
│   │   │   ├── 📁 components/ # Shared UI components (ModelSelector, TerminalOutput, FeatureNotAvailable, etc.)
│   │   │   ├── 📁 lib/ # config.ts (IS_GH_PAGES_MODE), dataService.ts (CSV I/O), aiClient.ts (SSE)
│   │   │   └── 📁 pages/ # AIRanking, AIDueDiligence, FundsConfig, AISettings, DatabaseOperations
│   │   ├── 📦 package.json
│   │   └── ⚙️ vite.config.ts
│   ├── 🐍 server.py # FastAPI server (serves frontend + all API endpoints)
│   └── ▶️ main.py # Entry point: web server (default) or CLI (--cli)
├── 📁 database/ # Data storage
│   ├── 📁 2025Q1/ # Quarterly reports
│   │   ├── 📊 fund_1.csv # Individual fund quarterly report
│   │   ├── 📊 fund_2.csv
│   │   └── 📊 fund_n.csv
│   ├── 📁 YYYYQN/
│   ├── 📁 GICS/
│   │   ├── 🗃️ hierarchy.csv # Full GICS hierarchy
│   │   └── ▶️ updater.py # GICS updater script
│   ├── 📝 hedge_funds.csv # Curated hedge funds list -> EDIT THIS to add or remove funds to track
│   ├── 📝 models.csv # LLMs list to use for AI Financial Analyst -> EDIT THIS to add or remove AI models
│   ├── 📊 non_quarterly.csv # Stores latest 13D/G and Form 4 filings
│   ├── 📊 stocks.csv # Master data for stocks (CUSIP-Ticker-Name)
│   └── ▶️ updater.py # Main entry point for updating the database
├── 📁 tests/ # Test suite
├── 📝 .env.example # Template for your API keys
├── ⛔ .gitignore # Git ignore rules
├── 🧾 LICENSE # MIT License
├── 🛠️ Pipfile # Project dependencies
├── 🔏 Pipfile.lock # Locked dependency versions
└── 📖 README.md # Project documentation (this file)
📝 Hedge Funds Configuration File: database/hedge_funds.csv contains the list of hedge funds to monitor (CIK, name, manager) and can also be edited at runtime.

📝 LLMs Configuration File: database/models.csv contains the list of available LLMs for AI analysis and can also be edited at runtime.
This tracker leverages the following types of SEC filings to provide a comprehensive view of institutional activity.
- 📅 Quarterly 13F Filings
  - Required for funds managing $100M+
  - Filed within 45 days of quarter-end
  - Shows portfolio snapshot on last day of quarter
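- 📝 Non-Quarterly 13D/G Filings
  - Required when acquiring 5%+ of company shares
  - Schedule 13D is filed within 5 business days of crossing the threshold (10 calendar days before the SEC's 2024 rule change); Schedule 13G deadlines vary by filer type
  - Provides a timely view of significant investments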
- ✍🏻 Non-Quarterly SEC Form 4 Insider Filings
  - Filed by insiders (executives, directors) or large shareholders (>10%) when they trade company stocks
  - Must be filed within 2 business days of the transaction
  - Offers real-time insight into the actions of key individuals and institutions
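As a small worked example of the 45-day rule for 13F filings listed above, the sketch below computes the nominal deadline for a quarter written in the same YYYYQN form used for the database folders. The function name is hypothetical, and the calculation is deliberately simplified: it does not shift deadlines that fall on weekends or holidays.

```python
from datetime import date, timedelta

# Last calendar day of each quarter as (month, day).
QUARTER_END = {1: (3, 31), 2: (6, 30), 3: (9, 30), 4: (12, 31)}


def nominal_13f_deadline(quarter: str) -> date:
    """Return quarter-end plus 45 days for a quarter like '2025Q1'."""
    year, q = int(quarter[:4]), int(quarter[5])
    month, day = QUARTER_END[q]
    return date(year, month, day) + timedelta(days=45)


print(nominal_13f_deadline("2025Q1"))  # 2025-05-15
```

This is why a 13F-only view can lag the actual portfolio by a quarter and a half: positions taken on the first day of the quarter may not become public until about 135 days later.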
This tool tracks a curated list of what I found to be the top-performing institutional investors that file with the U.S. SEC, identified based on their performance over the last 3-5 years. This curation is the result of my own methodology designed to identify the top percentile of global investment funds. My selection methodology is detailed below.
Modern portfolio theory (MPT) offers many methods for quantifying the risk-return trade-off, but they are often ill-suited for analyzing the limited data available in public filings. Consequently, hedge_funds.csv was generated using my own custom selection algorithm, designed to identify top-performing funds while accounting for volatility.
Note: The selection algorithm is external to this project and was used only to produce the curated hedge_funds.csv list.
My approach prioritizes high cumulative returns but also analyzes the path taken to achieve them: it penalizes volatility, similar to the Sharpe Ratio, but this penalty is dynamically adjusted based on performance consistency; likewise, drawdowns are penalized, echoing the principle of the Sterling Ratio, but the penalty is intentionally dampened to avoid overly punishing funds that recover effectively from temporary downturns.
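The actual selection algorithm is not part of this repository, so the following is only a toy illustration of the ingredients mentioned above: cumulative return, a volatility penalty, and a dampened drawdown penalty. The scoring formula, its name, and its penalty weights are assumptions made for the example and should not be read as the method used to build hedge_funds.csv.

```python
from statistics import pstdev


def cumulative_return(returns: list[float]) -> float:
    """Compound a series of periodic returns into one total return."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


def max_drawdown(returns: list[float]) -> float:
    """Largest peak-to-trough loss of the compounded equity curve."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst


def toy_score(returns: list[float], vol_penalty: float = 1.0, dd_penalty: float = 0.5) -> float:
    """Hypothetical score: reward return, penalize volatility and drawdown.

    dd_penalty below 1.0 dampens the drawdown term, so a fund that
    recovers from a dip is not punished as hard as one that stays down.
    """
    return (
        cumulative_return(returns)
        - vol_penalty * pstdev(returns)
        - dd_penalty * max_drawdown(returns)
    )


steady = [0.02, 0.02, 0.02]
bumpy = [0.10, -0.20, 0.05]
print(round(toy_score(steady), 4), round(toy_score(bumpy), 4))
```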
The list of hedge funds is actively managed to maintain its quality; funds that underperform may be replaced, while new top performers are periodically added.
However, despite their strong performance, several funds with portfolios predominantly focused on Healthcare and Biotech, such as Nextech Invest, Enavate Sciences, Caligan Partners, and Boxer Capital Management, have been intentionally excluded. These funds invest in highly specialized sectors where I lack the necessary expertise. Consequently, I consider them too risky for my personal investment profile, given the complexity and volatility inherent in biotech and healthcare ventures.
The quality of the output analysis is directly tied to the quality of the input data. To enhance the accuracy of the insights and opportunities identified, many popular high-profile funds have been intentionally excluded by design (the list below is automatically managed and capped to 50 funds, but you can see the full list in excluded_hedge_funds.csv):
- Warren Buffett's Berkshire Hathaway
- Ken Griffin's Citadel Advisors
- Ray Dalio's Bridgewater Associates
- Michael Burry's Scion Asset Management
- Peter Thiel's Thiel Macro
- Cathie Wood's ARK Invest
- Bill Ackman's Pershing Square
- Dmitry Balyasny's Balyasny Asset Management
- Alec Litowitz's Magnetar Capital
- Cliff Asness's AQR Capital Management
- David Tepper's Appaloosa
- Israel Englander's Millennium Management
- Frank Sands's Sands Capital Management
- Murray Stahl's Horizon Kinetics
- Edward Mule's Silver Point Capital
- David Abrams's Abrams Capital Management
- Jeffrey Ubben's ValueAct Capital
- Paul Singer's Elliott Investment
- Chris Hohn's The Children's Investment
- Daniel Loeb's Third Point
- Boaz Weinstein's Saba Capital
- William Huffman's Nuveen
- George Soros's Soros Fund Management
- Bill Gates's Gates Foundation Trust
- Carl Icahn's Icahn Enterprises
- Dev Kantesaria's Valley Forge Capital Management
- Lewis Sanders's Sanders Capital
- Brad Gerstner's Altimeter Capital Management
- Andreas Halvorsen's Viking Global Investors
- Paul Tudor Jones's Tudor Investment Corporation
- Chris Davis's Davis Advisors
- Paul Isaac's Arbiter Partners
- Robert Robotti's Robotti Value Investors
- Jim Cracchiolo's Ameriprise Financial
- Li Lu's Himalaya Capital Management
- Francis Chou's Chou Associates
- Anand Parekh's Alyeska Investment Group
- Ken Fisher's Fisher Asset Management
- David Katz's Matrix Asset Advisors
- Lee Ainslie's Maverick Capital
- Joel Greenblatt's Gotham Funds
- Barry Ritholtz's Ritholtz Wealth Management
- Robert Pitts's Steadfast Capital Management
- John Paulson's Paulson & Co.
- Jeremy Grantham's GMO
- Paul Marshall & Ian Wace's Marshall Wace
- Seymour Kaufman's Crosslink Capital
- Mario Gabelli's GAMCO Investors
- John Overdeck's Two Sigma
- Richard Pzena's Pzena Investment Management
- and many more... (see database/excluded_hedge_funds.csv for the full list)

💡 Note: For convenience, key information for these funds, including their CIKs, is maintained in the database/excluded_hedge_funds.csv file.
Want to track additional funds? Simply edit database/hedge_funds.csv and add your preferred institutional investors. For example, to add Berkshire Hathaway, Pershing Square, and ARK Invest, you would add the three rows below under the existing header line:
"CIK","Fund","Manager","Denomination","CIKs"
"0001067983","Berkshire Hathaway","Warren Buffett","Berkshire Hathaway Inc",""
"0001336528","Pershing Square","Bill Ackman","Pershing Square Capital Management, L.P.",""
"0001697748","ARK Invest","Cathie Wood","ARK Investment Management LLC",""

💡 Note: hedge_funds.csv currently includes not only traditional hedge funds but also other institutional investors (private equity funds, large banks, VCs, pension funds, etc., that file 13F with the SEC) selected from what I consider the top 5% of performers.

If you wish to track any of the Notable Exclusions hedge funds, you can copy the relevant rows from excluded_hedge_funds.csv into hedge_funds.csv.
- Denomination: This is the exact legal name used by the fund in its filings. It is essential for accurately processing non-quarterly filings (13D/G, Form 4) as the scraper uses it to identify the fund's specific transactions within complex filing documents.
- CIKs (optional): A comma-separated list of additional CIKs. This field is used to track filings from related entities or subsidiaries. Some investment firms have complex structures where different legal entities file separately (e.g., a management company and a holding company).

Example: Jeffrey Ubben's ValueAct Holdings (CIK = 0001418814) also has filings under ValueAct Capital Management (CIK = 0001418812). By adding 0001418812 to the CIKs column, the tool aggregates non-quarterly filings from both entities for a complete view.
"CIK","Fund","Manager","Denomination","CIKs"
"0001418814","ValueAct","Jeffrey Ubben","ValueAct Holdings, L.P.","0001418812"
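To make the role of the CIKs column concrete, here is a minimal sketch that reads rows in the format shown above and collects every CIK to query for each fund. It uses only the standard csv module on an in-memory sample; the helper name all_ciks is hypothetical, and the project's own loader may work differently.

```python
import csv
import io

# Two rows copied from the examples in this README.
SAMPLE = '''"CIK","Fund","Manager","Denomination","CIKs"
"0001418814","ValueAct","Jeffrey Ubben","ValueAct Holdings, L.P.","0001418812"
"0001067983","Berkshire Hathaway","Warren Buffett","Berkshire Hathaway Inc",""
'''


def all_ciks(row: dict) -> list[str]:
    """Primary CIK first, then any extra CIKs from the optional column."""
    extra = [c.strip() for c in row["CIKs"].split(",") if c.strip()]
    return [row["CIK"], *extra]


funds = {row["Fund"]: all_ciks(row) for row in csv.DictReader(io.StringIO(SAMPLE))}
for fund, ciks in funds.items():
    print(fund, ciks)
```

Note that CIKs are kept as strings throughout, so their leading zeros survive.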
The AI Financial Analyst's primary goal is to identify stocks with the highest growth potential based on hedge fund activity. It achieves this by calculating a "Promise Score" for each stock. This score is a weighted average of various metrics derived from 13F filings. The AI's first critical task is to act as a strategist, dynamically defining the heuristic by assigning the optimal weights for these metrics based on the market conditions of the selected quarter. Its second task is to provide quantitative scores (e.g., momentum, risk) for the top-ranked stocks.
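Because the Promise Score is described as a weighted average, its arithmetic can be sketched in a few lines. The metric names and values below are invented for illustration (the real metrics and the weights chosen by the model are not listed here), and dividing by the weight total is an assumption so that weights need not sum to exactly 1.

```python
def promise_score(metrics: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of the metrics named in `weights`."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[name] * metrics[name] for name in weights) / total


# Hypothetical metrics on a 0-100 scale and hypothetical weights.
metrics = {"net_buyers": 80.0, "new_positions": 60.0, "holding_growth": 40.0}
weights = {"net_buyers": 0.5, "new_positions": 0.3, "holding_growth": 0.2}

print(round(promise_score(metrics, weights), 2))  # 66.0
```

With this split of responsibilities, the model only proposes the weights, while the score itself stays a deterministic and reproducible calculation.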
The models included in database/models.csv have been selected because they have demonstrated the best performance and reliability for these specific tasks. Through experimentation, they have proven effective at interpreting the prompts and providing insightful, well-structured responses.
You can easily add or change the AI models used for analysis by editing the database/models.csv file. This allows you to experiment with different Large Language Models (LLMs) from supported providers.
To add a new model, open database/models.csv and add a new row with the following columns:
- ID: The specific model identifier as required by the provider's API.
- Description: A brief, user-friendly description that will be displayed in the selection menu.
- Client: The provider of the model. Must be one of GitHub, Google, Groq, HuggingFace, or OpenRouter.
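Before adding a row, it can help to check it against the three columns and the allowed Client values listed above. The sketch below is a hypothetical validator, not part of the project; the model IDs in the sample are placeholders, and the column order in the real models.csv is assumed.

```python
import csv
import io

ALLOWED_CLIENTS = {"GitHub", "Google", "Groq", "HuggingFace", "OpenRouter"}
REQUIRED_COLUMNS = ("ID", "Description", "Client")


def validate_model_row(row: dict) -> list[str]:
    """Return a list of problems found in one models.csv row (empty if fine)."""
    problems = [
        f"missing {col}" for col in REQUIRED_COLUMNS if not (row.get(col) or "").strip()
    ]
    client = (row.get("Client") or "").strip()
    if client and client not in ALLOWED_CLIENTS:
        problems.append(f"unsupported Client: {client}")
    return problems


# Placeholder rows: the IDs are not real model identifiers.
SAMPLE = '''"ID","Description","Client"
"example/model-id","Example model (placeholder)","Groq"
"another/model-id","Another placeholder","Acme"
'''

for row in csv.DictReader(io.StringIO(SAMPLE)):
    print(row["ID"], validate_model_row(row) or "ok")
```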
Here are the official model lists for each provider:
It's crucial to understand the inherent limitations of tracking investment strategies solely through SEC filings:
| Limitation | Impact | Mitigation |
|---|---|---|
| 🕒 Filing Delay | Data can be 45+ days old | Focus on long-term strategies |
| 🧩 Incomplete Picture | Only US long positions shown | Use as part of broader analysis |
| 📉 No Short Positions | Missing hedge information | Consider reported positions carefully |
| 🌎 Limited Scope | No non-US stocks or other assets | Supplement with additional data |
Many tracking websites rely solely on quarterly 13F filings, which means their data can be over 45 days old and miss many significant trades. Non-quarterly filings like 13D/G and Form 4 are often ignored because they are more complex to process and merge.
This tracker helps overcome that limitation by integrating multiple filing types. When analyzing the most recent quarter, the tool automatically incorporates the latest data from 13D/G and Form 4 filings. As a result, the holdings, deltas, and portfolio percentages reflect not just the static 13F snapshot, but also any significant trades that have occurred since. This provides a more dynamic and complete picture of institutional activity.
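The idea of layering newer filings on top of the quarterly snapshot can be sketched as follows. The data shapes are assumptions made for the example: the snapshot maps tickers to share counts, and each later filing is assumed to report the resulting total position rather than a delta. The project's actual merge logic is not shown in this README and may differ.

```python
def apply_recent_filings(snapshot: dict[str, int], filings: list[dict]) -> dict[str, int]:
    """Overlay later filings on a 13F snapshot, newest filing winning."""
    holdings = dict(snapshot)  # leave the caller's snapshot untouched
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    for filing in sorted(filings, key=lambda f: f["date"]):
        holdings[filing["ticker"]] = filing["shares"]
    # Positions reported as fully sold drop out of the portfolio.
    return {ticker: shares for ticker, shares in holdings.items() if shares > 0}


snapshot_13f = {"AAPL": 1000, "MSFT": 500}
later_filings = [
    {"ticker": "AAPL", "date": "2025-05-02", "shares": 1500},
    {"ticker": "AAPL", "date": "2025-04-20", "shares": 1200},
    {"ticker": "MSFT", "date": "2025-04-25", "shares": 0},
]

print(apply_recent_filings(snapshot_13f, later_filings))  # {'AAPL': 1500}
```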
The frontend can be deployed as a static demo on GitHub Pages — no Python backend required. AI features and data updates are disabled in this mode, but all core analysis pages work with bundled data.
Live demo: https://{username}.github.io/hedge-fund-tracker/
| Page | Status |
|---|---|
| Dashboard (Latest Filings) | Fully functional |
| Quarterly Trends | Fully functional |
| Hedge Fund Portfolios | Fully functional |
| Stocks Browser | Fully functional |
| Funds Config | Read-only (data visible, no edits) |
| AI Ranking | Disabled (requires local backend) |
| AI Due Diligence | Disabled (requires local backend) |
| AI Settings | Hidden |
| Database Operations | Hidden |
- Fork the repository on GitHub
- Enable GitHub Pages: Go to Settings > Pages > Source: "GitHub Actions"
- Push to master; the deploy workflow (.github/workflows/deploy-pages.yml) then runs automatically
The build step (npm run build:gh-pages) bundles all CSV data into dist/database/ so the static site is fully self-contained.
For full functionality (AI analysis, data updates, file editing), run locally:
pipenv install
cd app/frontend && npm install && npm run build && cd ../..
pipenv run python -m app.main

This repository includes a GitHub Actions workflow (.github/workflows/filings-fetch.yml) designed to keep your data up to date by automatically fetching the latest SEC filings.
- Scheduled Runs: The workflow runs automatically to check for new 13F, 13D/G, and Form 4 filings from the funds you are tracking (hedge_funds.csv). It runs four times a day from Monday to Friday (at 01:30, 13:30, 17:30, and 21:30 UTC) and once on Saturday (at 04:00 UTC).
- Safe Branching Strategy: Instead of committing directly to your main branch, the workflow pushes all new data to a dedicated branch named automated/filings-fetch.
- GitHub Pages Deploy: A separate workflow (.github/workflows/deploy-pages.yml) automatically rebuilds and deploys the static frontend to GitHub Pages whenever frontend or database files change on master.
- User-Controlled Merging: This approach gives you full control. You can review the changes committed by the bot and then merge them into your main branch whenever you're ready. This prevents unexpected changes and allows you to manage updates at your own pace.
- Automated Alerts: If the script encounters a non-quarterly filing where it cannot identify the fund owner based on your hedge_funds.csv configuration, it will automatically open a GitHub Issue in your repository, alerting you to a potential data mismatch that needs investigation.
- Fork the Repository: Create your own fork of this project on GitHub.
- Enable Actions: GitHub Actions are typically enabled by default on forked repositories. You can verify this under the Actions tab of your fork.
- Configure Secrets: For the workflow to resolve tickers and create issues, add your FINNHUB_API_KEY as a repository secret. In your forked repository, go to Settings > Secrets and variables > Actions to add it.
| 🗂️ Category | 🦾 Technology |
|---|---|
| Core | Python 3.13+, pipenv |
| Backend | FastAPI, uvicorn |
| Frontend | React 18, Vite, TypeScript, Tailwind CSS |
| UI Components | shadcn/ui, Radix UI, Lucide, Sonner |
| Data Viz & State | Recharts, TanStack Query v5 |
| Web Scraping | Requests, Beautiful Soup 4, lxml |
| Reliability | Tenacity, Python-Dotenv |
| Stocks Data | yfinance, Finnhub-Stock-API, FinanceDatabase |
| Gen AI | python-toon, Google AI SDK, OpenAI SDK |
- 🐛 Bug Reports
- 🆕 Feature Requests
- 🔀 Fork & PR
- 🔁 Share on X or LinkedIn
This tool is in active development, and your input is valuable. If you have any suggestions or ideas for new features, please feel free to get in touch.
- SEC Developer Resources
- SEC: Frequently Asked Questions About Form 13F
- SEC: Guidance on Beneficial Ownership Reporting (Sections 13D/G)
- Wikipedia: Global Industry Classification Standard
- MSCI: Global Industry Classification Standard (GICS)
- S&P Global: GICS Structure & Methodology
- CUSIP (Committee on Uniform Security Identification Procedures)
- Modern Portfolio Theory (MPT)
This project began as a fork of sec-web-scraper-13f by Gary Pang. The original tool provided a solid foundation for scraping 13F filings from the SEC's EDGAR database. It has since been significantly re-architected and expanded into a comprehensive analysis platform, incorporating multiple filing types, AI-driven insights, and automated data management.
This project uses a dual license:
- Original work (Gary Pang's sec-web-scraper-13f): MIT License.
- All new work (everything added by Alessandro Colace): Copyright © 2025 Alessandro Colace — All Rights Reserved. Personal and educational use is permitted; redistribution and commercial use require written permission.
See the LICENSE file for the full terms.





