This repository contains a Node.js scraper for extracting Amazon product reviews. It uses the Crawlbase Crawling API with the Amazon Product Reviews scraper to extract review data and handles pagination automatically.
The amazon_reviews_scraper.js script extracts detailed Amazon product reviews, including:
- Review ID
- Reviewer Name & Profile Link
- Review Title & Text
- Rating (Stars)
- Review Date
- Review Attributes (Size, Color, Product Grade, etc.)
- Helpful Votes Count
- Verified Purchase Status
- Review Comments Count
- Review Link
- Media (Images & Videos)
- Pagination Handling (Automatically Fetches Multiple Pages)
The scraper recursively fetches reviews across multiple pages and saves the extracted data in a JSON file.
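The recursive page-following described above can be sketched roughly as follows. This is a minimal illustration, not the code in amazon_reviews_scraper.js: `fetchPage` is a hypothetical function that resolves to `{ reviews, nextPage }` for a given URL (in the real script that role would be played by a Crawling API request), and the `maxPages` cap is an added safeguard that the original may not have.

```javascript
// Sketch only: fetchPage is a hypothetical (url) => Promise<{ reviews, nextPage }>.
// The field names "reviews" and "nextPage" are assumptions for illustration.
async function collectReviews(fetchPage, url, { maxPages = 10 } = {}, collected = [], page = 1) {
  const { reviews = [], nextPage = null } = await fetchPage(url);
  collected.push(...reviews);

  // Stop when there is no further page or the safety cap is reached.
  if (!nextPage || page >= maxPages) {
    return collected;
  }
  // Otherwise recurse into the next page, carrying the accumulated reviews along.
  return collectReviews(fetchPage, nextPage, { maxPages }, collected, page + 1);
}

// Demo with an in-memory stub instead of real network requests.
const demoPages = {
  '/reviews?page=1': { reviews: [{ id: 'R1' }, { id: 'R2' }], nextPage: '/reviews?page=2' },
  '/reviews?page=2': { reviews: [{ id: 'R3' }], nextPage: null },
};

collectReviews(async (pageUrl) => demoPages[pageUrl], '/reviews?page=1')
  .then((all) => console.log(`Collected ${all.length} reviews`)); // logs: Collected 3 reviews
```

Passing the fetcher in as an argument keeps the pagination logic testable without an API token; swapping the stub for a function that calls the Crawling API is the only change needed to run it against live pages.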
Ensure Node.js is installed on your system. Check the version using:

    node -v

Install the required dependency:

    npm install crawlbase

- Sign up on Crawlbase to get an API token.
- This token is required to access the Crawling API for bypassing Amazon’s anti-bot protection.
Replace "CRAWLBASE_JS_TOKEN" in the script with your Crawlbase Crawling API Token.
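As an alternative to editing the placeholder by hand, the token could be read from the environment. The sketch below is an assumption, not part of the repository's script: the `CRAWLBASE_TOKEN` variable name and the `resolveToken` helper are hypothetical, and the commented-out lines show how the result might be passed to the `crawlbase` package.

```javascript
// Hypothetical helper: read the token from an environment variable
// (CRAWLBASE_TOKEN is an assumed name) and fail early if it is missing
// or still set to the placeholder used in the script.
function resolveToken(env = process.env) {
  const token = (env.CRAWLBASE_TOKEN || '').trim();
  if (!token || token === 'CRAWLBASE_JS_TOKEN') {
    throw new Error('Set CRAWLBASE_TOKEN to your Crawlbase Crawling API token.');
  }
  return token;
}

// Possible wiring, assuming the crawlbase package is installed:
// const { CrawlingAPI } = require('crawlbase');
// const api = new CrawlingAPI({ token: resolveToken() });
```

Failing before the first request makes a missing or unreplaced token obvious, instead of surfacing later as a rejected API call.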
Run the scraper:

    node amazon_reviews_scraper.js

The extracted Amazon product reviews will be saved in a JSON file named amazon_reviews.json.
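For reference, writing and reading back such a file needs only Node's built-in `fs` module. The helpers below are a sketch under the assumption that the output is a plain JSON array of review objects; `saveReviews` and `loadReviews` are illustrative names, not functions from the script.

```javascript
const fs = require('fs');

// Write the reviews as pretty-printed JSON (two-space indentation).
function saveReviews(reviews, filePath = 'amazon_reviews.json') {
  fs.writeFileSync(filePath, JSON.stringify(reviews, null, 2), 'utf8');
  return filePath;
}

// Read the file back and parse it into JavaScript objects.
function loadReviews(filePath = 'amazon_reviews.json') {
  return JSON.parse(fs.readFileSync(filePath, 'utf8'));
}
```

After a run, `loadReviews().length` gives a quick check of how many reviews were collected.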