Merged
6 changes: 5 additions & 1 deletion .github/workflows/python-package.yml
@@ -23,14 +23,18 @@ jobs:
     - name: Install dependencies
       run: |
         python -m pip install --upgrade pip
-        pip install -r requirements.txt
+        pip install .
     - name: Lint with flake8
       run: |
         pip install flake8
         # stop the build if there are Python syntax errors or undefined names
         flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
         # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
         flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
+    - name: Type check with mypy
+      run: |
+        pip install mypy
+        mypy socid_extractor/
     - name: Test with pytest
       run: |
         pip install pytest==6.2.5 pytest-rerunfailures
4 changes: 2 additions & 2 deletions .github/workflows/python-publish.yml
@@ -22,9 +22,9 @@ jobs:
     - name: Install dependencies
       run: |
         python -m pip install --upgrade pip
-        pip install setuptools wheel
+        pip install build
     - name: Build package
       run: |
-        python setup.py sdist bdist_wheel # Could also be python -m build
+        python -m build
     - name: Publish package distributions to PyPI
       uses: pypa/gh-action-pypi-publish@release/v1
4 changes: 0 additions & 4 deletions MANIFEST.in

This file was deleted.

194 changes: 98 additions & 96 deletions METHODS.md

Large diffs are not rendered by default.

7 changes: 4 additions & 3 deletions docs/testing-and-ci.md
@@ -19,7 +19,7 @@ Cookie-based scenarios may use files under [`tests/`](../tests/) (e.g. `*.cookie

 ## Pytest markers

-Defined in [`pytest.ini`](../pytest.ini):
+Defined in [`pyproject.toml`](../pyproject.toml) (`[tool.pytest.ini_options]`):

 | Marker | Meaning |
 | ------ | ------- |
@@ -52,9 +52,10 @@ Helper script that turns lines of the form `key: value` into `assert info.get("k

 - Python **3.10, 3.11, 3.12**
 - **flake8** — syntax/undefined-name checks; complexity/length as warnings (`setup.cfg` ignores `E501` for line length)
+- **mypy** — type checking with `mypy socid_extractor/` (stub overrides in `pyproject.toml`)
 - **pytest** — `pytest -k 'not cookies' -m 'not github_failed and not rate_limited' --reruns 3 --reruns-delay 30` (pytest-rerunfailures for flaky network tests)

-Publishing to PyPI on release is handled by [`.github/workflows/python-publish.yml`](../.github/workflows/python-publish.yml).
+Publishing to PyPI on release is handled by [`.github/workflows/python-publish.yml`](../.github/workflows/python-publish.yml) using `python -m build`.

 ## `revision.py`

@@ -66,7 +67,7 @@ python revision.py

 It:

-- Reads pytest marker descriptions from `pytest.ini`
+- Reads pytest marker descriptions from `pyproject.toml`
 - Loads tests from `tests/test_e2e.py` and schemes from `socid_extractor/schemes.py`
 - Associates tests with scheme names via docstrings (method name per line) or heuristic name matching
 - **Overwrites [`METHODS.md`](../METHODS.md)** with a table of methods, test links, and notes (markers, skip reasons)
56 changes: 56 additions & 0 deletions pyproject.toml
@@ -0,0 +1,56 @@
+[build-system]
+requires = ["setuptools>=64", "wheel"]
+build-backend = "setuptools.build_meta"
+
+[project]
+name = "socid-extractor"
+version = "0.0.27"
+description = "Extract accounts' identifiers from personal pages on various platforms"
+readme = "README.md"
+license = "GPL-3.0"
+requires-python = ">=3.10"
+authors = [
+    {name = "Soxoj", email = "soxoj@protonmail.com"},
+]
+dependencies = [
+    "requests>=2.24.0",
+    "python-dateutil>=2.8.1",
+    "beautifulsoup4~=4.14.3",
+]
+
+[project.urls]
+Homepage = "https://github.com/soxoj/socid-extractor"
+
+[project.scripts]
+socid_extractor = "socid_extractor.cli:run"
+
+[tool.setuptools.packages.find]
+include = ["socid_extractor*"]
+
+[tool.setuptools.package-data]
+socid_extractor = ["py.typed"]
+
+[tool.mypy]
+python_version = "3.10"
+warn_return_any = true
+warn_unused_configs = true
+disallow_untyped_defs = false
+
+[[tool.mypy.overrides]]
+module = [
+    "requests.*",
+    "dateutil.*",
+    "bs4.*",
+]
+ignore_missing_imports = true
+
+[tool.flake8]
+ignore = ["E501"]
+
+[tool.pytest.ini_options]
+asyncio_mode = "auto"
+markers = [
+    "github_failed: requests from GitHub Actions CI servers are blocked",
+    "rate_limited: anti-bot / captcha / rate limiting from the site",
+    "requires_cookies: cookies are required to get content",
+]
6 changes: 0 additions & 6 deletions pytest.ini

This file was deleted.

20 changes: 6 additions & 14 deletions revision.py
@@ -1,25 +1,17 @@
 #!/usr/bin/env python3
+import tomllib
 from datetime import datetime

 from tests import test_e2e
 from socid_extractor.schemes import schemes

 def collect_pytest_annotations():
     annotations = {}
-    with open('pytest.ini') as f:
-        lines = f.read().splitlines()
-    markers = False
-    for line in lines:
-        if line.startswith('markers ='):
-            markers = True
-            continue
-        if not markers:
-            continue
-        if not line.startswith(' '):
-            break
-        name, descr = line.strip().split(': ')
-        annotations[name] = descr
-
+    with open('pyproject.toml', 'rb') as f:
+        data = tomllib.load(f)
+    for marker in data.get('tool', {}).get('pytest', {}).get('ini_options', {}).get('markers', []):
+        name, _, descr = marker.partition(': ')
+        annotations[name.strip()] = descr.strip()
     return annotations


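Reviewer note on the revision.py change: the new parsing can be exercised in isolation with the sketch below. It is an illustration, not the script itself: `SAMPLE` and `collect_annotations` are hypothetical names, and the second and third sample markers are made up to show edge cases. Two points are worth checking. First, `tomllib` entered the standard library in Python 3.11, while this PR's pyproject.toml declares `requires-python = ">=3.10"`, so the sketch falls back to the third-party `tomli` backport (an assumption that it is installed) on 3.10. Second, `str.partition(': ')` splits only at the first separator, so a description that itself contains `': '` stays intact, whereas the removed `split(': ')` unpacking would raise `ValueError` on such a line.

```python
try:
    import tomllib  # standard library from Python 3.11 onward
except ModuleNotFoundError:
    # Python 3.10: assumes the third-party `tomli` backport is installed.
    import tomli as tomllib

# Hypothetical inline stand-in for the [tool.pytest.ini_options] table.
SAMPLE = '''
[tool.pytest.ini_options]
markers = [
    "github_failed: requests from GitHub Actions CI servers are blocked",
    "note: text with a second colon: still part of the description",
    "bare_marker",
]
'''


def collect_annotations(toml_text):
    """Map marker name -> description using the same partition-based parsing as the diff."""
    data = tomllib.loads(toml_text)
    markers = data.get('tool', {}).get('pytest', {}).get('ini_options', {}).get('markers', [])
    annotations = {}
    for marker in markers:
        # partition() splits at the first ': ' only and never raises.
        name, _, descr = marker.partition(': ')
        annotations[name.strip()] = descr.strip()
    return annotations


annotations = collect_annotations(SAMPLE)
for name, descr in annotations.items():
    print(f'{name!r}: {descr!r}')
```

A marker without any description (`bare_marker` here) maps to an empty string instead of failing.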
4 changes: 0 additions & 4 deletions setup.cfg
@@ -1,6 +1,2 @@
[egg_info]
tag_build =
tag_date = 0

[flake8]
ignore = E501
23 changes: 0 additions & 23 deletions setup.py

This file was deleted.

2 changes: 1 addition & 1 deletion socid_extractor/schemes.py
@@ -1166,7 +1166,7 @@
             'likes_count': lambda x: x.get('likes_count'),
             'photos_count': lambda x: x.get('photos_count'),
             'is_verified': lambda x: x.get('is_verified'),
-            'facebook_uid': lambda x: re.search(r'graph\.facebook\.com/(\d+)/picture', x.get('photo', '')).group(1) if x.get('photo') and 'graph.facebook.com' in x.get('photo', '') else None,
+            'facebook_uid': lambda x: m.group(1) if x.get('photo') and (m := re.search(r'graph\.facebook\.com/(\d+)/picture', x.get('photo', ''))) else None,
         }
     },
     'VC.ru': {
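Reviewer note on the `facebook_uid` change: the two lambdas differ in behavior, not only in style. The old version guarded with a substring check and then called `.group(1)` on the result of `re.search`, which is `None` whenever the URL contains `graph.facebook.com` but not the `/<digits>/picture` path. The new version binds the match with `:=` and dereferences it only when it is truthy. The sketch below restates both as plain functions to make the difference observable; the function names and both sample URLs are hypothetical. The change was presumably prompted by the mypy step added in this PR, since calling `.group` on an optional match is a typical type-checker complaint, but the diff does not say so.

```python
import re

PATTERN = r'graph\.facebook\.com/(\d+)/picture'


def facebook_uid_old(x):
    # Pre-change logic: substring guard, then .group(1) on a possibly-None match.
    return (re.search(PATTERN, x.get('photo', '')).group(1)
            if x.get('photo') and 'graph.facebook.com' in x.get('photo', '')
            else None)


def facebook_uid_new(x):
    # Post-change logic: the walrus binds the match, and the condition is
    # evaluated before m.group(1), so m is always a real match object there.
    return (m.group(1)
            if x.get('photo') and (m := re.search(PATTERN, x.get('photo', '')))
            else None)


# Hypothetical inputs: one matching URL, one on the same host without the id path.
matching = {'photo': 'https://graph.facebook.com/1234567890/picture?type=large'}
non_matching = {'photo': 'https://graph.facebook.com/v2.0/avatar.png'}

print(facebook_uid_new(matching))
print(facebook_uid_new(non_matching))
try:
    facebook_uid_old(non_matching)
except AttributeError as exc:
    print('old version raised:', exc)
```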
16 changes: 7 additions & 9 deletions tests/test_e2e.py
@@ -830,8 +830,6 @@ def test_tiktok_hydration_e2e():
     """
     TikTok
     TikTok (legacy SIGI_STATE)
-    Live check for the current web profile (hydration JSON, not SIGI_STATE).
-    Assertions are structural: ids and CDN avatar URL, not follower counts (those drift).
     """
     info = extract(parse('https://www.tiktok.com/@tiktok', timeout=20)[0])

@@ -1390,7 +1388,7 @@ def test_duolingo_api():


 def test_chess_com_api_e2e():
-    """Chess.com API: e2e test via pub API endpoint."""
+    """Chess.com API"""
     info = extract(parse('https://api.chess.com/pub/player/john')[0])

     assert info.get('chess_user_id') == '95037716'
@@ -1405,7 +1403,7 @@ def test_chess_com_api_e2e():


 def test_chess_com_html_e2e():
-    """Chess.com HTML: e2e test from member page."""
+    """Chess.com HTML"""
     info = extract(parse('https://www.chess.com/member/john')[0])

     assert info.get('username') == 'John'
@@ -1414,7 +1412,7 @@ def test_chess_com_html_e2e():


 def test_roblox_api_e2e():
-    """Roblox API: e2e test via users API."""
+    """Roblox user API"""
     info = extract(parse('https://users.roblox.com/v1/users/2191')[0])

     assert info.get('roblox_user_id') == '2191'
@@ -1426,7 +1424,7 @@ def test_roblox_api_e2e():


 def test_roblox_html_e2e():
-    """Roblox HTML: e2e test from profile page (redirect from user.aspx)."""
+    """Roblox HTML"""
     info = extract(parse('https://www.roblox.com/users/2191/profile')[0])

     assert info.get('username') == 'john'
@@ -1436,7 +1434,7 @@ def test_roblox_html_e2e():

 @pytest.mark.rate_limited
 def test_stack_exchange_api_e2e():
-    """Stack Exchange API: e2e test via /users endpoint."""
+    """Stack Exchange API"""
     info = extract(parse('https://api.stackexchange.com/2.3/users?order=desc&sort=name&inname=soxoj&site=stackoverflow')[0])

     assert info.get('username') == 'Soxoj1'
@@ -1450,12 +1448,12 @@ def test_stack_exchange_api_e2e():

 @pytest.mark.skip(reason='LeetCode GraphQL requires POST request')
 def test_leetcode_graphql_e2e():
-    """LeetCode GraphQL: e2e test (requires POST, skipped by default)."""
+    """LeetCode GraphQL"""
     pass


 def test_boosty_api_e2e():
-    """Boosty API: e2e test via blog endpoint."""
+    """Boosty API"""
     info = extract(parse('https://api.boosty.to/v1/blog/soxoj')[0])

     assert info.get('uid') == '10276482'
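Reviewer note on the docstring edits: docs/testing-and-ci.md says revision.py associates tests with scheme names "via docstrings (method name per line)", which would explain why these docstrings are reduced to bare names. The sketch below shows one way such matching could work. It is not revision.py's implementation: exact whole-line matching is an assumption, and `scheme_names_from_docstring`, `KNOWN`, and the three sample functions are hypothetical.

```python
def scheme_names_from_docstring(func, known_schemes):
    """Return the docstring lines that exactly equal a known scheme name."""
    doc = func.__doc__ or ''
    lines = [line.strip() for line in doc.splitlines()]
    return [line for line in lines if line in known_schemes]


# Hypothetical subset of scheme names, taken from the docstrings in this diff.
KNOWN = {'TikTok', 'TikTok (legacy SIGI_STATE)', 'Chess.com API'}


def multi_scheme():
    """
    TikTok
    TikTok (legacy SIGI_STATE)
    """


def old_style():
    """Chess.com API: e2e test via pub API endpoint."""


def new_style():
    """Chess.com API"""


print(scheme_names_from_docstring(multi_scheme, KNOWN))
print(scheme_names_from_docstring(old_style, KNOWN))
print(scheme_names_from_docstring(new_style, KNOWN))
```

Under this assumption the old descriptive docstring matches nothing, so a test would have to fall back to the heuristic name matching that the docs also mention, while the bare-name form matches directly.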