PlayCast is a flexible Playwright-based framework for extracting structured product data from multiple Russian e-commerce websites.
It provides a simple API for parsing sites like Citilink, DNS, OZON, Avito, and 28bit.
- Site-specific parsers for Citilink, DNS, OZON, Avito, and 28bit
- Uses Playwright and browser automation to fetch pages and collect search results
- Includes shared utilities for human-like typing, scrolling, and selector handling
- Simple import and usage:

      from playcast.Parser import ParseOzon

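The human-like typing utility listed above could work roughly as in the sketch below. This is not the project's actual helper: the function names and delay values are assumptions chosen for illustration. It relies only on Playwright's `page.click` and `page.keyboard.type`, so any object exposing those two async methods will do.

```python
import asyncio
import random


def keystroke_delays(text, rng=None, base=0.08, jitter=0.05):
    """Return one pause (in seconds) per character of `text`.

    Each pause is `base` plus or minus up to `jitter`, never negative.
    The defaults are illustrative assumptions, not tuned values.
    """
    rng = rng or random.Random()
    return [max(0.0, base + rng.uniform(-jitter, jitter)) for _ in text]


async def type_like_human(page, selector, text, rng=None, base=0.08, jitter=0.05):
    """Click `selector`, then type `text` one character at a time with pauses."""
    await page.click(selector)
    for char, pause in zip(text, keystroke_delays(text, rng, base, jitter)):
        await page.keyboard.type(char)
        await asyncio.sleep(pause)
```

Inside a parser this might be called as `await type_like_human(page, "input[name=text]", "RTX 5080")`, where the selector is a made-up placeholder.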
The goal is to make a reusable scraping toolkit instead of a one-off script. In the future, this project can grow into an extensible library with:
- a common parser base class
- plugin-style parser registration
- configuration-driven selectors
- optional LLM assistance for selector discovery and page analysis
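One way the first three ideas could fit together is sketched below. None of this exists in the project yet: `BaseParser`, `register`, `PARSERS`, `create_parser`, `ExampleParser`, and the selector values are all hypothetical names used only to illustrate the shape of such a design.

```python
import asyncio
from abc import ABC, abstractmethod

# Hypothetical registry: maps a site name to its parser class.
PARSERS = {}


def register(name):
    """Class decorator that files a parser class under `name`."""
    def decorator(cls):
        PARSERS[name] = cls
        return cls
    return decorator


class BaseParser(ABC):
    """Assumed shared interface; a real base class may look different."""

    # Configuration-driven selectors: subclasses override data, not logic.
    selectors = {}

    def __init__(self, selectors=None):
        # Merge per-instance overrides on top of the class defaults.
        self.selectors = {**type(self).selectors, **(selectors or {})}

    @abstractmethod
    async def get_cards(self, query):
        """Return a list of product dicts for `query`."""


@register("example")
class ExampleParser(BaseParser):
    selectors = {"card": "div.product", "title": "a.title"}  # made-up values

    async def get_cards(self, query):
        # A real parser would drive Playwright here; this stub only echoes.
        return [{"query": query, "card_selector": self.selectors["card"]}]


def create_parser(name, **kwargs):
    """Look up a registered parser by name and instantiate it."""
    return PARSERS[name](**kwargs)
```

With this layout, `create_parser("example", selectors={"card": "li.item"})` would swap a selector without touching parser code, which is the point of keeping selectors in configuration.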
    python -m pip install -U pip
    python -m pip install playwright beautifulsoup4 fake-useragent httpx lxml requests undetected-playwright playwright-stealth rebrowser-playwright
    python -m playwright install chromium

Import and use a parser:

    import asyncio

    from playcast.Parser import ParseOzon


    async def main():
        # Search OZON for the query and collect the product cards.
        results = await ParseOzon.get_cards_by_placeholder("RTX 5080")
        print(results)


    asyncio.run(main())

This launches Playwright, searches OZON for "RTX 5080", extracts the product data, and prints the results.

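To run several searches in one go, the call can be wrapped in a small helper that limits how many run at once. The sketch below is an assumption-laden illustration: `search_many` and its `limit` parameter are hypothetical, and whether the parsers are safe to run concurrently has not been verified here.

```python
import asyncio


async def search_many(fetch_cards, queries, limit=2):
    """Run `fetch_cards(query)` for every query, at most `limit` at a time.

    `fetch_cards` is any async callable that takes a search string, for
    example ParseOzon.get_cards_by_placeholder.
    Returns a dict mapping each query to its result.
    """
    gate = asyncio.Semaphore(limit)

    async def one(query):
        # The semaphore keeps the number of simultaneous searches bounded.
        async with gate:
            return query, await fetch_cards(query)

    pairs = await asyncio.gather(*(one(q) for q in queries))
    return dict(pairs)
```

A call such as `asyncio.run(search_many(ParseOzon.get_cards_by_placeholder, ["RTX 5080", "RTX 5070"]))` would then return one entry per query. A low limit keeps the number of simultaneous browser sessions small.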
- playcast/Parser.py: Main parser classes with get_cards methods
- parsers/: Individual parser implementations
- common/utils.py: Shared helper functions
- data/config.py: URLs, default search keywords, and constants

A possible future improvement is to use a large language model (LLM) to analyze a page's HTML and automatically suggest selectors or attribute patterns. This could make the parsers more resilient to site changes and reduce the amount of manual CSS/XPath configuration. The main obstacle is model size: large language models are heavy to ship and run alongside a scraping toolkit.
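A rough sketch of the non-model half of that idea follows: shrink the page to a tag skeleton, build a prompt, and validate whatever comes back. Everything here is hypothetical (`html_skeleton`, `build_prompt`, `parse_reply`, the field names, and the 4000-character cap), and the model call itself is deliberately left out because no particular LLM or client library is assumed.

```python
import json
from html.parser import HTMLParser


class _Skeleton(HTMLParser):
    """Keep only tag names plus id/class attributes; drop scripts, styles, text."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            return
        kept = [f'{k}="{v}"' for k, v in attrs if k in ("id", "class") and v]
        self.parts.append("<" + " ".join([tag] + kept) + ">")

    def handle_endtag(self, tag):
        if tag not in self.SKIP:
            self.parts.append(f"</{tag}>")


def html_skeleton(html, max_chars=4000):
    """Reduce a page to its structure; the cut-off may fall mid-tag."""
    parser = _Skeleton()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)[:max_chars]


def build_prompt(html, fields=("title", "price")):
    """Assemble the text that would be sent to a model."""
    return (
        "Suggest one CSS selector per field for the product cards in this page. "
        f"Reply with a JSON object whose keys are exactly: {', '.join(fields)}.\n\n"
        + html_skeleton(html)
    )


def parse_reply(reply, fields=("title", "price")):
    """Validate the model's reply; raise ValueError on anything unexpected."""
    data = json.loads(reply)
    if not isinstance(data, dict) or set(data) != set(fields):
        raise ValueError("reply must be a JSON object with exactly the requested keys")
    if not all(isinstance(v, str) and v.strip() for v in data.values()):
        raise ValueError("every selector must be a non-empty string")
    return data
```

Sending only the skeleton keeps the prompt short, and validating the reply means a malformed suggestion fails loudly instead of silently breaking a parser.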
- Add a new parser under parsers/
- Create a corresponding class in playcast/Parser.py
- Reuse parsers/base.py for shared behaviors
- Keep site-specific selectors and actions isolated
- Add tests for new parser output and utilities
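A test for parser output could start as small as the sketch below. The card shape is an assumption: the field names in `REQUIRED_FIELDS` and the `card_problems` helper are hypothetical and should be adjusted to whatever the parsers actually return.

```python
import unittest

REQUIRED_FIELDS = ("title", "price", "url")  # assumed card shape


def card_problems(cards):
    """Return human-readable problems found in a parser's output list."""
    problems = []
    for i, card in enumerate(cards):
        for field in REQUIRED_FIELDS:
            # A missing key and an empty value are both reported.
            if not card.get(field):
                problems.append(f"card {i}: missing {field}")
    return problems


class CardShapeTest(unittest.TestCase):
    def test_sample_output(self):
        # In a real test this list would come from a saved page or fixture.
        sample = [
            {"title": "RTX 5080", "price": "100000", "url": "https://example.com/p/1"}
        ]
        self.assertEqual(card_problems(sample), [])
```

Running the check against saved pages rather than live sites keeps the tests fast and independent of network conditions.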
This repository is available under the MIT License.