
InstaScrape is a command-line Python tool that fetches all parent comments from any public Instagram Reel using your session cookies. It's fast, efficient, and now comes with a progress bar so you can see the scraping in action. Designed for researchers, analysts, or curious minds.

kaifcodec/InstaScrape

🚀 InstaScrape — Async Instagram Comment Scraper


❓ Built with a steel heart, unasked for, yet unable to turn away from the world it watches.
❓ Assembled from iron and thought, never meant to be this cold, yet it endures.
❓ Created with a reluctant steel heart, seeing life it cannot touch.
— Author: 401

Scrape all parent comments from any Instagram Reel with automated login, async speed, real-time progress, and clean exports — no manual cookie copying required.


✨ Features

  • Automated Login: logs in for you and persists the session in cookie.json with issued-at (iat) and expiry timestamps, so no manual cookie copying is needed.
  • 🔄 Self-healing Auth: detects expired cookies mid-run, prompts relogin, resumes automatically.
  • Async Engine: powered by httpx.AsyncClient with requests-per-second throttling.
  • 📊 Progress Tracking: accurate percent and ETA from Instagram’s comment count.
  • 📁 Dual Exports: TXT and JSON files saved in timestamped folders.

📦 Requirements

  • Python 3.9+
  • Dependencies:
pip install -r requirements.txt

🛠️ Installation

git clone https://github.com/kaifcodec/InstaScrape
cd InstaScrape
pip install -r requirements.txt

▶️ Usage

python3 main.py
  • Enter the Instagram Reel URL (e.g., https://www.instagram.com/reel/SHORTCODE/).
  • Set the maximum requests per second (5-7 recommended); lower it if requests start failing or getting throttled.
  • On first run, provide username/password; cookie.json is created and reused until expiry.
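
The "reused until expiry" behaviour can be pictured with a minimal sketch. It assumes cookie.json is a JSON object whose `expiry` field is a Unix timestamp in seconds; the README only names the `iat` and `expiry` fields, so the exact layout and the helper names `cookie_is_fresh` and `load_fresh_cookie` are hypothetical, not taken from main.py.

```python
import json
import time
from pathlib import Path
from typing import Optional


def cookie_is_fresh(record: dict, now: Optional[float] = None) -> bool:
    """Return True if the record carries an expiry that is still in the future."""
    now = time.time() if now is None else now
    # Assumption: "expiry" is a Unix timestamp in seconds.
    expiry = record.get("expiry")
    return isinstance(expiry, (int, float)) and now < expiry


def load_fresh_cookie(path: Path, now: Optional[float] = None) -> Optional[dict]:
    """Load a cookie file; return it only if it exists, parses, and is unexpired."""
    try:
        record = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return None  # missing or corrupt file: the caller should log in again
    if not isinstance(record, dict):
        return None
    return record if cookie_is_fresh(record, now) else None


if __name__ == "__main__":
    cookie = load_fresh_cookie(Path("cookie.json"))
    print("reuse saved session" if cookie else "login required")
```

A `None` result stands for "ask for username and password again"; anything else means the saved session can be reused.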

📁 Output

  • TXT: download_comments/txt/reel_comments_YYYYMMDD_HHMMSS.txt
  • JSON: download_comments/json/reel_comments_YYYYMMDD_HHMMSS.json

Example JSON structure:
{
  "generated_at": 1700000000,
  "count": 123,
  "comments": [
    { "username": "user1", "text": "Nice!", "created_at": 1699999000 }
  ]
}
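
Once an export exists, it can be read back with the standard library. The sketch below assumes the file matches the example structure above; `load_export` and `top_commenters` are illustrative helper names, not part of InstaScrape.

```python
import json
from collections import Counter
from pathlib import Path


def load_export(path: Path) -> dict:
    """Read one exported JSON file into a dictionary."""
    return json.loads(path.read_text(encoding="utf-8"))


def top_commenters(export: dict, limit: int = 5) -> list:
    """Count comments per username, most active first."""
    counts = Counter(c["username"] for c in export.get("comments", []))
    return counts.most_common(limit)


# In-memory sample shaped like the example structure, so the sketch runs
# without a real export on disk.
sample = {
    "generated_at": 1700000000,
    "count": 3,
    "comments": [
        {"username": "user1", "text": "Nice!", "created_at": 1699999000},
        {"username": "user2", "text": "Cool", "created_at": 1699999100},
        {"username": "user1", "text": "Again!", "created_at": 1699999200},
    ],
}

if __name__ == "__main__":
    print(top_commenters(sample))
```

For a real file, pass the result of `load_export(Path(...))` with the path of the export you want to inspect.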

🔧 How it Works

  • Cookie Lifecycle: cookie.json stores iat and expiry; validated on startup & during requests.
  • Error Resilience: retries transient errors and refreshes cookies on 401/redirect-to-login.
  • Progress Accuracy: uses Instagram’s comment count to calculate percent & ETA.
  • Async Efficiency: httpx.AsyncClient with HTTP/2, keep-alive, and RPS limiter.
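
Two of the ideas above, the RPS limiter and the percent/ETA calculation, can be sketched with the standard library alone. This is not the code in main.py: `RateLimiter`, `progress`, and `demo` are hypothetical names, and the HTTP call that the real tool makes through httpx.AsyncClient is left out so the sketch runs on its own.

```python
import asyncio
import time
from typing import Optional, Tuple


class RateLimiter:
    """Space out calls so that at most `rps` of them start per second."""

    def __init__(self, rps: float) -> None:
        self._interval = 1.0 / rps
        self._next_slot = 0.0

    async def wait(self) -> None:
        # Reserve the next free slot before awaiting anything, so
        # concurrent tasks never claim the same slot.
        now = time.monotonic()
        slot = max(now, self._next_slot)
        self._next_slot = slot + self._interval
        if slot > now:
            await asyncio.sleep(slot - now)


def progress(fetched: int, total: int, elapsed: float) -> Tuple[float, Optional[float]]:
    """Return (percent done, estimated seconds left) from a known total."""
    if total <= 0 or fetched <= 0:
        return 0.0, None
    fetched = min(fetched, total)
    percent = 100.0 * fetched / total
    rate = fetched / elapsed if elapsed > 0 else 0.0
    eta = (total - fetched) / rate if rate > 0 else None
    return percent, eta


async def demo(n: int = 5, rps: float = 50.0) -> float:
    """Run n throttled no-op tasks and return the elapsed seconds."""
    limiter = RateLimiter(rps)
    start = time.monotonic()

    async def task() -> None:
        await limiter.wait()  # a real task would send its request here

    await asyncio.gather(*(task() for _ in range(n)))
    return time.monotonic() - start


if __name__ == "__main__":
    print(f"5 tasks at 50 RPS took {asyncio.run(demo()):.2f}s")
    print(progress(fetched=50, total=200, elapsed=10.0))
```

Reserving the slot before the first `await` keeps the limiter correct without a lock, because asyncio tasks only switch at await points.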

💡 Tips

  • Start with 5-7 RPS to minimize throttling; increase gradually.
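  • Filenames use local time; switch to UTC by replacing datetime.now() with datetime.utcnow() in main.py. Note that datetime.utcnow() is deprecated since Python 3.12; datetime.now(timezone.utc) is the current equivalent.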
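
As an illustration of the filename tip, here is a small sketch that builds an export path in the layout shown under Output. The `export_path` helper is hypothetical; main.py may construct its paths differently.

```python
from datetime import datetime, timezone
from pathlib import Path


def export_path(kind: str, when: datetime) -> Path:
    """Build download_comments/<kind>/reel_comments_YYYYMMDD_HHMMSS.<kind>."""
    stamp = when.strftime("%Y%m%d_%H%M%S")
    return Path("download_comments") / kind / f"reel_comments_{stamp}.{kind}"


if __name__ == "__main__":
    # Local time, as the README says the filenames use today.
    print(export_path("json", datetime.now()))
    # UTC via a timezone-aware datetime.
    print(export_path("json", datetime.now(timezone.utc)))
```

Passing the timestamp in as an argument keeps the local-versus-UTC choice in one place.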

⚠️ Disclaimer

Use responsibly. Comply with Instagram’s Terms of Service. Intended for personal or permitted use only.
