LinkedIn scraper fixes #84

Open
wants to merge 32 commits into
base: main

HalaHamdi
Collaborator

Made progress in #81

🚀 What this PR does

  • Fix LinkedIn‑scraper “current_position” bugs
    Corrects logic that was mis‑parsing the current_position field on some profiles.

  • Add dynamic GitHub Action for Classes YAMLs
    Loops over all C20xx.yaml and C20xx_Credit.yaml files in public/department/Extras/Classes/, runs the scraper on each file sequentially, commits any changes, and pushes back to the branch. Scheduled to run automatically on the 1st of every month and also triggerable via “Run workflow.”

  • Secret‑only setup
    Requires a repo secret LINKEDIN_COOKIES_JSON containing your exported cookies JSON (via “Export cookie JSON file for Puppeteer” mentioned in ⭐️ Create a Github Action for Automatic LinkedIn Scraping #81). Ensure your LinkedIn session is active when you generate that file to prevent expiration.
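
A minimal Python sketch of the file selection and cookie check described above, not the code in this PR. It assumes "C20xx" means "C" followed by a four-digit year beginning with 20, and that each exported cookie carries a Puppeteer-style numeric `expires` field (Unix seconds, non-positive or missing for session cookies). The names `find_class_files`, `load_cookies`, `run_all`, and the `scrape_file` callback are hypothetical.

```python
import json
import os
import re
import time
from pathlib import Path

# Assumption: "C20xx" is "C" plus a four-digit year, with an optional "_Credit" suffix.
CLASS_FILE_RE = re.compile(r"^C20\d{2}(_Credit)?\.yaml$")


def find_class_files(classes_dir):
    """Return the matching YAML files, sorted so every run uses the same order."""
    return sorted(
        p for p in Path(classes_dir).iterdir()
        if p.is_file() and CLASS_FILE_RE.match(p.name)
    )


def load_cookies(raw_json, now=None):
    """Parse the exported cookies JSON and drop cookies that have expired.

    Assumption: "expires" is Unix seconds; missing or <= 0 means a session cookie.
    """
    now = time.time() if now is None else now
    cookies = json.loads(raw_json)
    if not isinstance(cookies, list):
        raise ValueError("expected a JSON array of cookie objects")
    live = []
    for cookie in cookies:
        expires = cookie.get("expires")
        if expires is None or expires <= 0 or expires > now:
            live.append(cookie)
    return live


def run_all(classes_dir, scrape_file):
    """Run scrape_file(path, cookies) on each class file, one after another."""
    raw = os.environ.get("LINKEDIN_COOKIES_JSON")
    if not raw:
        raise RuntimeError("LINKEDIN_COOKIES_JSON is not set")
    cookies = load_cookies(raw)
    if not cookies:
        raise RuntimeError("all cookies have expired; export a fresh file")
    processed = []
    for path in find_class_files(classes_dir):
        scrape_file(path, cookies)  # hypothetical hook into the scraper
        processed.append(path.name)
    return processed
```

Failing early on a missing or fully expired cookie set makes the scheduled run stop with a clear message instead of scraping every file with a dead session. For the monthly schedule, a cron expression such as `0 0 1 * *` (midnight UTC on the 1st) would be one way to express it, though the workflow in this PR may use a different time.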


⚙️ Remaining work

  • Batch LinkedIn API requests
    Currently implemented but not yet wired into the main flow.

    • Discovery: LinkedIn’s API tokens can expire very quickly; a more robust batch‑request mechanism would improve reliability.
  • Error‑handling for corner cases
    Some profiles still throw unhandled exceptions.

    • Discovery: There are edge‑case profiles that need specialized parsing logic.
  • Code cleanup & refactoring
    Performance optimizations and better module structure.

    • Discovery: Future iterations should add retry logic and better logging around failed scrapes.
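
One way the retry logic and logging mentioned above could look, sketched in Python under stated assumptions. `scrape_profile` stands in for whatever function scrapes a single profile; `scrape_with_retry`, `scrape_all`, and the logger name are hypothetical and not part of this PR.

```python
import logging
import time

logger = logging.getLogger("linkedin_scraper")  # hypothetical logger name


def scrape_with_retry(scrape_profile, url, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call scrape_profile(url), retrying with exponential backoff on any exception.

    Returns the scraped data, or None once every attempt has failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            return scrape_profile(url)
        except Exception as exc:  # broad on purpose: edge-case profiles fail in many ways
            logger.warning(
                "scrape failed for %s (attempt %d/%d): %s", url, attempt, attempts, exc
            )
            if attempt < attempts:
                # Delays grow as base_delay, 2*base_delay, 4*base_delay, ...
                sleep(base_delay * 2 ** (attempt - 1))
    logger.error("giving up on %s after %d attempts", url, attempts)
    return None


def scrape_all(scrape_profile, urls, **retry_kwargs):
    """Scrape every URL; one bad profile is recorded instead of aborting the run."""
    results, failed = {}, []
    for url in urls:
        data = scrape_with_retry(scrape_profile, url, **retry_kwargs)
        if data is None:
            failed.append(url)
        else:
            results[url] = data
    return results, failed
```

Returning the list of failed URLs alongside the results lets the workflow commit what succeeded and report the rest. This sketch treats a `None` return as a failure, so it assumes `scrape_profile` never returns `None` for a successful scrape.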

vercel bot commented Jul 25, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name: cmp-docs
Status: ✅ Ready
Updated (UTC): Aug 3, 2025 1:20pm

netlify bot commented Jul 25, 2025

Deploy Preview for cmp-docs ready!

Name Link
🔨 Latest commit b4d8eab
🔍 Latest deploy log https://app.netlify.com/projects/cmp-docs/deploys/688f620c2ec1c1000803cba3
😎 Deploy Preview https://deploy-preview-84--cmp-docs.netlify.app

@EssamWisam
Owner

image

@EssamWisam
Owner

cat-driving-serious

3 participants