Skip to content

stathzenziltj9jcv/xueqiu-user-posts-scraper

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 

Repository files navigation

Xueqiu User Posts Scraper

A powerful data extraction tool for collecting in-depth user posts and engagement data from Xueqiu, China’s leading financial social network. It helps analysts and researchers turn real investor discussions into structured datasets for sentiment, trend, and market behavior analysis.

Bitbash Banner

Telegram   WhatsApp   Gmail   Website

Created by Bitbash, built to showcase our approach to Scraping and Automation!
If you are looking for xueqiu-user-posts-scraper you've just found your team — Let’s Chat. 👆👆

Introduction

This project extracts detailed post data from Xueqiu user profiles, focusing on investment-related discussions, engagement metrics, and financial context. It solves the challenge of manually collecting large-scale investor sentiment data and is built for analysts, researchers, and quantitative teams working with Chinese market insights.

Financial Social Media Intelligence

  • Targets individual Xueqiu user timelines with configurable limits
  • Captures rich engagement, metadata, and financial references
  • Designed for scalable, repeatable data collection
  • Supports downstream analytics such as sentiment and correlation studies

Features

Feature Description
User Profile Targeting Scrape posts from specific Xueqiu user profiles.
Rich Post Metadata Extracts titles, full text, timestamps, and post types.
Engagement Metrics Collects likes, favorites, comments, and repost counts.
Financial Context Mapping Identifies referenced stocks and symbols.
Scalable Collection Handles multiple profiles with configurable limits.

What Data This Scraper Extracts

Field Name Field Description
id Unique post identifier.
user_id Author identifier for behavioral analysis.
title Post headline or subject.
text Full post content.
created_at Publication timestamp.
like_count Number of likes received.
fav_count Bookmark or favorite count.
retweet_count Repost or share count.
reply_count Number of replies or comments.
stock_list Referenced financial symbols.
meta_keywords Contextual and analytical metadata.

Example Output

[
      {
        "id": 345575898,
        "user_id": 1821992043,
        "title": "股息为盾,成长为矛",
        "created_at": 1754361591000,
        "like_count": 370,
        "fav_count": 190,
        "retweet_count": 30,
        "reply_count": 356,
        "stock_list": [
              { "symbol": "BK2049", "type": "35" },
              { "symbol": "BK2415", "type": "35" }
        ],
        "text": "最近市场上关于红利股和成长股之间的讨论非常激烈..."
      }
    ]

Directory Structure Tree

Xueqiu User Posts Scraper/
├── src/
│   ├── main.py
│   ├── collectors/
│   │   ├── profile_collector.py
│   │   └── post_parser.py
│   ├── utils/
│   │   └── time_utils.py
│   └── config/
│       └── settings.example.json
├── data/
│   ├── inputs.sample.json
│   └── sample_output.json
├── requirements.txt
└── README.md

Use Cases

  • Financial analysts use it to study investor sentiment, so they can anticipate market trends.
  • Quantitative teams use it to enrich trading models, so they can incorporate social signals.
  • Academic researchers use it to analyze behavioral finance patterns, so they can publish data-driven insights.
  • Market intelligence teams use it to track influential investors, so they can monitor opinion leaders.

FAQs

Does this scraper work with private profiles? Only publicly accessible profiles are supported. Private or restricted profiles may return incomplete data.

How many posts can be collected per user? The limit is configurable, allowing control over data volume and collection depth.

Is the extracted data suitable for sentiment analysis? Yes, the full text and engagement metrics are structured for NLP and sentiment pipelines.

Can this handle multiple users in one run? Yes, multiple profile URLs can be processed in a single execution.


Performance Benchmarks and Results

Primary Metric: Average extraction speed of 20–30 posts per minute per profile under standard conditions.

Reliability Metric: Stable collection with a success rate above 98% for public profiles.

Efficiency Metric: Low overhead processing with structured JSON output optimized for analytics workflows.

Quality Metric: High data completeness, capturing text, engagement, and financial references in a single dataset.

Book a Call Watch on YouTube

Review 1

"Bitbash is a top-tier automation partner, innovative, reliable, and dedicated to delivering real results every time."

Nathan Pennington
Marketer
★★★★★

Review 2

"Bitbash delivers outstanding quality, speed, and professionalism, truly a team you can rely on."

Eliza
SEO Affiliate Expert
★★★★★

Review 3

"Exceptional results, clear communication, and flawless delivery.
Bitbash nailed it."

Syed
Digital Strategist
★★★★★

Releases

No releases published

Packages

No packages published