How to automate gathering data from Stack Overflow? #612
How can I automate the process of gathering data from Stack Overflow using Python to track top-voted questions in a specific tag over time?
  
Replies: 1 comment
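Hi, @grahamj89

To automate gathering data from Stack Overflow, you can use the Stack Exchange API, specifically the /questions endpoint, to track top-voted questions for a particular tag. Below is a basic Python script using the requests library to pull data and track the most upvoted questions. Install the requests library if you don't have it already: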
pip install requests

import requests
import time
# Function to fetch top-voted questions for a specific tag
def fetch_top_voted_questions(tag, pages=1):
    # URL of the Stack Exchange API for questions with a specific tag
    url = 'https://api.stackexchange.com/2.3/questions'
    
    questions = []
    
    for page in range(1, pages + 1):
        params = {
            'order': 'desc',
            'sort': 'votes',
            'tagged': tag,
            'site': 'stackoverflow',
            'page': page,
            'pagesize': 10  # Adjust based on how many questions you want per page
        }
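        # A timeout keeps an unattended run from hanging on a stalled connection
        response = requests.get(url, params=params, timeout=10)
        
        if response.status_code == 200:
            data = response.json()
            questions.extend(data.get('items', []))
            # Stop early once the API reports there are no further pages
            if not data.get('has_more'):
                break
        else:
            print(f"Failed to retrieve data: {response.status_code}")
            break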
        
        time.sleep(1)  # Respect Stack Exchange API rate limits
        
    return questions
# Function to display top questions with their votes
def display_top_questions(tag, pages=1):
    questions = fetch_top_voted_questions(tag, pages)
    
    if questions:
        print(f"Top Voted Questions for Tag: {tag}")
        for idx, question in enumerate(questions, 1):
            title = question['title']
            link = question['link']
            votes = question['score']
            print(f"{idx}. {title} (Votes: {votes})\nLink: {link}\n")
    else:
        print("No questions found.")
# Example usage
if __name__ == "__main__":
    tag = 'python'  # Example: track Python-related questions
    pages = 2       # Fetch questions from the first 2 pages (20 questions)
    display_top_questions(tag, pages)

Explanation:

Output: The script prints the top-voted questions with their title, score (votes), and a direct link to each question.

Sample Output:

Scheduling and Automation:
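To track scores over time rather than just print them once, one option is to append each run's results to a CSV file with a timestamp and then run the script on a schedule (for example from cron or Task Scheduler). The sketch below is only an illustration under some assumptions: the helper names `snapshot_rows` and `append_snapshot`, the `FIELDS` column layout, and the CSV format are all hypothetical choices, not part of the Stack Exchange API. It relies on the `question_id`, `score`, `title`, and `link` fields that the API returns for each question, and it unescapes titles because the API HTML-encodes them by default.

```python
import csv
import html
import os
from datetime import datetime, timezone

# Hypothetical column layout for the snapshot file
FIELDS = ["taken_at", "tag", "question_id", "score", "title", "link"]


def snapshot_rows(questions, tag, taken_at=None):
    """Turn API question items into flat rows stamped with the capture time."""
    if taken_at is None:
        taken_at = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return [
        {
            "taken_at": taken_at,
            "tag": tag,
            "question_id": q["question_id"],
            "score": q["score"],
            # Titles arrive HTML-encoded (e.g. &amp;), so decode them for storage
            "title": html.unescape(q["title"]),
            "link": q["link"],
        }
        for q in questions
    ]


def append_snapshot(path, rows):
    """Append rows to a CSV file, writing the header only when the file is new."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerows(rows)
```

A scheduled run could then call something like `append_snapshot("python_top.csv", snapshot_rows(fetch_top_voted_questions("python", 2), "python"))`, where the file name is a placeholder. Because every row carries `taken_at` and `question_id`, the score of a given question can be compared across runs later.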
  