This package includes comprehensive tests to ensure reliability and correctness.
Test Categories:
- ✅ Unit tests (fast, no LinkedIn required)
- ⚠️ Integration tests (require LinkedIn session)
Run fast unit tests that don't require LinkedIn authentication:
pytest tests/ -v -m "not integration"

Expected result: all unit tests pass in about 5 seconds.
What's tested:
- Data model conversions (to_dict, to_json)
- Browser context management
- Session save/load
- Navigation utilities
- Basic functionality without network calls
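
As an illustration of the data model conversion checks listed above, here is a minimal sketch of a network-free unit test. The `Person` class below is a hypothetical stand-in dataclass, not the package's real model; only the method names `to_dict` and `to_json` come from this document.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Person:
    """Hypothetical stand-in for the package's Person model."""

    name: str
    location: Optional[str] = None

    def to_dict(self) -> dict:
        return asdict(self)

    def to_json(self) -> str:
        return json.dumps(self.to_dict())


def test_person_to_dict_and_to_json_agree():
    """to_json() should serialize exactly what to_dict() returns."""
    person = Person(name="John Doe", location="New York")
    as_dict = person.to_dict()
    assert as_dict == {"name": "John Doe", "location": "New York"}
    # Parsing the JSON back must give the same dictionary.
    assert json.loads(person.to_json()) == as_dict
```

Because nothing here touches the network or a browser, a test like this stays in the fast `-m "not integration"` run.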
Some tests require actual LinkedIn scraping and take longer:
# First, create a valid session
# (See README for session setup instructions)
# Run all tests including integration tests
pytest tests/ -v

- Require a valid linkedin_session.json
- Make real network calls to LinkedIn
- Take 2-5 minutes per test
- May hit rate limits if run too frequently
- test_browser.py - Browser management and session handling
- test_person_scraper.py - Person data model tests
- test_company_scraper.py - Company data model tests
- test_job_scraper.py - Job data model tests
- test_auth.py - Authentication utilities (non-network tests)
Integration tests in the same files above test actual LinkedIn scraping when run with a valid session.
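
Since integration tests only make sense with a valid session, one option is to skip them automatically when the session file is missing or looks stale. The following is a sketch of what that could look like in conftest.py, not the project's actual configuration. The helper name `session_problem`, the 3-hour threshold, and the use of the file's modification time as a proxy for session age are all assumptions.

```python
import time
from pathlib import Path

import pytest

SESSION_FILE = Path("linkedin_session.json")
MAX_SESSION_AGE_HOURS = 3.0  # assumption: tune to how long your sessions really last


def session_problem(path, max_age_hours=MAX_SESSION_AGE_HOURS, now=None):
    """Return a reason string if the session file looks unusable, else None."""
    path = Path(path)
    if not path.is_file():
        return f"{path} not found"
    now = time.time() if now is None else now
    # File modification time is only a rough proxy for when the session was created.
    age_hours = (now - path.stat().st_mtime) / 3600
    if age_hours > max_age_hours:
        return f"{path} is {age_hours:.1f} h old and has probably expired"
    return None


def pytest_configure(config):
    # Register the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line("markers", "integration: requires a live LinkedIn session")


def pytest_collection_modifyitems(config, items):
    reason = session_problem(SESSION_FILE)
    if reason is None:
        return  # session looks usable; run everything that was selected
    skip = pytest.mark.skip(reason=f"integration test skipped: {reason}")
    for item in items:
        if item.get_closest_marker("integration") is not None:
            item.add_marker(skip)
```

With hooks like these, a plain `pytest tests/ -v` reports integration tests as skipped, with the reason shown, instead of failing on authentication errors. If the project already registers the `integration` marker elsewhere, the `pytest_configure` hook can be dropped.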
# Run only unit tests (fast)
pytest -m "not integration" -v
# Run specific test file
pytest tests/test_person_scraper.py -v
# Run specific test
pytest tests/test_person_scraper.py::test_person_model_to_dict -v
# Run with coverage
pytest --cov=linkedin_scraper -v
# Run with verbose output and uncaptured print statements
pytest -v -s

When contributing tests:
- Unit tests - Test data models, utilities, and logic without network calls
- Integration tests - Mark with the @pytest.mark.integration decorator
- Documentation - Add docstrings explaining what the test validates
- Assertions - Use clear, descriptive assertion messages
Example:
import pytest

# Import path is an assumption; adjust to wherever these classes live in the package.
from linkedin_scraper import BrowserManager, Person, PersonScraper


def test_person_model():
    """Test Person model serialization"""
    person = Person(name="John Doe", location="New York")
    assert person.name == "John Doe"
    assert person.to_dict()["name"] == "John Doe"


# Async tests need an async-capable pytest plugin (for example pytest-asyncio) to be configured.
@pytest.mark.integration
async def test_person_scraper_real():
    """Test actual LinkedIn profile scraping"""
    # Requires valid session
    async with BrowserManager() as browser:
        await browser.load_session("linkedin_session.json")
        scraper = PersonScraper(browser.page)
        person = await scraper.scrape("https://linkedin.com/in/...")
        assert person.name is not None, "scraped profile should include a name"

Running integration tests multiple times in succession may trigger LinkedIn's rate limiting. If this happens:
- Wait 10-15 minutes before running tests again
- Use pytest -m "not integration" for development
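
To lower the chance of hitting rate limits in the first place, integration tests can space out their requests. The sketch below shows one way to enforce a minimum gap between calls. The `Throttle` class is hypothetical and not part of this package, and the interval you pass in is a guess, since LinkedIn does not publish a safe request rate.

```python
import time


class Throttle:
    """Enforce a minimum gap between successive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock  # injectable so the class can be tested without real delays
        self._sleep = sleep
        self._last = None

    def wait(self):
        """Sleep just long enough to respect min_interval; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
        if delay > 0:
            self._sleep(delay)
        self._last = self._clock()
        return delay
```

A shared instance, for example `Throttle(min_interval=30)`, could be called at the start of each integration test. This only slows the run down; it does not guarantee that LinkedIn will not throttle the session.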
LinkedIn sessions expire after a few hours. If integration tests fail with authentication errors:
# Refresh your session (see README for setup instructions)

LinkedIn actively blocks headless browsers:
- Tests run in headed mode (browser window opens)
- This is expected behavior for LinkedIn scrapers
- Headless mode will fail on real LinkedIn pages
For CI/CD pipelines:
# Example GitHub Actions workflow
- name: Run unit tests
  run: pytest -m "not integration" -v

# Integration tests should be run separately with secrets
- name: Run integration tests
  run: pytest -m integration -v
  env:
    LINKEDIN_SESSION: ${{ secrets.LINKEDIN_SESSION }}

Tests are organized by component:
tests/
├── conftest.py # Shared fixtures
├── test_auth.py # Authentication tests
├── test_browser.py # Browser management tests
├── test_person_scraper.py # Person scraping tests
├── test_company_scraper.py # Company scraping tests
└── test_job_scraper.py # Job scraping tests
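
The CI example passes a LINKEDIN_SESSION secret to the integration run, but this document does not say how the tests consume it. One possibility, assuming the secret holds the raw JSON contents of linkedin_session.json, is a small helper that conftest.py calls before the tests start. The function name `write_session_from_env` and its parameters are hypothetical.

```python
import json
import os
from pathlib import Path


def write_session_from_env(target="linkedin_session.json", var="LINKEDIN_SESSION", environ=None):
    """Write the session JSON held in an environment variable to `target`.

    Returns the written path, or None when the variable is unset or empty.
    Raises ValueError if the value is not valid JSON.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get(var, "").strip()
    if not raw:
        return None
    try:
        json.loads(raw)  # validate only; the original text is written unchanged
    except json.JSONDecodeError as exc:
        raise ValueError(f"{var} does not contain valid JSON") from exc
    path = Path(target)
    path.write_text(raw, encoding="utf-8")
    return path
```

Failing early on malformed JSON gives a clearer CI error than a later authentication failure inside a scraper test. Avoid printing the variable's value in logs, since it contains session credentials.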
- Fast feedback loop: Use unit tests during development
  pytest -m "not integration" --tb=short
- Debug test failures: Use the -s flag to see print statements
  pytest tests/test_person_scraper.py -v -s
- Test one thing: Run specific tests while debugging
  pytest tests/test_person_scraper.py::test_person_model_to_dict -v
When submitting PRs:
- Ensure all existing tests pass
- Add tests for new functionality
- Mark integration tests appropriately
- Update this document if test structure changes
Last Updated: January 2026