Skip to content

This is the documentation of the finding, testing, and fixing of a bug in the popular repo Yahooquery a python library that allows people to freely access financial information from yahoo finance.

Notifications You must be signed in to change notification settings

JNewman-cell/YahooQueryBugFix

Repository files navigation

YahooQueryBugFix

This is the documentation of the finding, testing, and fixing of a bug in the popular repo Yahooquery a python library that allows people to freely access financial information from yahoo finance.

dpguthrie/yahooquery#331

Bug: Inconsistent quoteSummary response structures across all yahooquery properties

Summary

When calling any quoteSummary-based property (such as asset_profile, earnings, financial_data, key_stats, etc.) via the yahooquery package, the returned values exhibit inconsistent response structures depending on the ticker and asset type:

  • For many well-known tickers (e.g. AAPL) these properties return a dictionary mapping a normalized ticker key to a nested dict with structured data (expected structure).
  • For many other tickers (e.g. EAI) the same properties return a dict mapping the ticker to a plain string error message, e.g. {'EAI': 'No fundamentals data found for symbol: EAI'} (unexpected structure).

This systematic inconsistency affects 15 out of 17 quoteSummary properties and causes downstream code that expects consistent dict-of-dicts structures to fail or misinterpret responses across different asset classes.

Real-world Impact

This GitHub Actions run is a real-world example of someone expecting a parsable JSON response yet receiving an unexpected response from the API.

https://github.com/JNewman-cell/StockInformationWebsiteGithubActions/actions/runs/18789336097/job/53615494426

Behavior

Old Response: {'EAI': 'No fundamentals data found for symbol: EAI'} New Response: { "error": { "code": 404, "type": "NotFoundError", "message": "No fundamentals data found for symbol: EAI", "symbol": "EAI" } }

Reproduction

  1. Create a virtual environment and install yahooquery.
  2. Run the following script to demonstrate inconsistent behavior across multiple quoteSummary properties:
import yahooquery as yq

# Test multiple quoteSummary properties
properties = ['asset_profile', 'earnings', 'financial_data', 'key_stats']
tickers = ["AAPL", "EAI"]  # Major stock vs problematic ticker

for ticker in tickers:
    print(f"\n{ticker} Results:")
    ticker_obj = yq.Ticker(ticker)
    
    for prop in properties:
        result = getattr(ticker_obj, prop)
        ticker_data = result.get(ticker, {})
        
        if isinstance(ticker_data, str):
            print(f"  {prop}: STRING - {ticker_data}")
        else:
            print(f"  {prop}: DICT with {len(ticker_data)} fields")

Observed outputs (examples)

  • Expected (AAPL) β€” All properties return structured data:
# asset_profile
{'AAPL': {'address1': 'One Apple Park Way', 'industry': 'Consumer Electronics', ...}}

# earnings  
{'AAPL': {'earningsChart': {...}, 'financialsChart': {...}}}

# financial_data
{'AAPL': {'currentPrice': 150.25, 'targetHighPrice': 200.0, ...}}
  • Unexpected (EAI) β€” Same properties return string error messages:
# asset_profile
{'EAI': 'No fundamentals data found for symbol: EAI'}

# earnings
{'EAI': 'No fundamentals data found for symbol: EAI'}

# financial_data  
{'EAI': 'No fundamentals data found for symbol: EAI'}

Why this is a bug

  • The response shape changes between tickers which forces callers to implement brittle, repetitive checks (isinstance(..., dict) and whether the inner value is a dict vs string).
  • This behavior is not clearly documented in the yahooquery docs, leading to developer confusion and silent failures.

Impact

  • Any code that assumes quoteSummary properties return structured dicts (with keys like longName, industry, currentPrice, etc.) will either raise exceptions or silently skip expected fields.
  • 88% of quoteSummary properties exhibit mixed behavior, making reliable multi-ticker applications nearly impossible.
  • Tests and data pipelines that depend on consistent response structures across different asset classes will break unpredictably.
  • Applications handling diverse portfolios (stocks + ETFs + commodities) face systematic reliability issues.

Suggested immediate workarounds

  1. Universal defensive wrapper for all quoteSummary properties:
def safe_quote_property(ticker_obj, property_name, ticker_symbol):
    """Safely extract quoteSummary property data with consistent error handling"""
    try:
        result = getattr(ticker_obj, property_name)
        ticker_data = result.get(ticker_symbol.upper()) or result.get(ticker_symbol.lower())
        
        if isinstance(ticker_data, dict):
            return True, ticker_data  # Success: structured data
        else:
            return False, str(ticker_data)  # Failure: string error message
    except Exception as e:
        return False, f"Exception: {str(e)}"

# Usage for any quoteSummary property:
success, data = safe_quote_property(ticker_obj, 'asset_profile', 'AAPL')
success, data = safe_quote_property(ticker_obj, 'earnings', 'AAPL')  
success, data = safe_quote_property(ticker_obj, 'financial_data', 'AAPL')
  1. Batch property checker to validate ticker reliability:
def check_ticker_reliability(ticker_symbol, properties=None):
    """Check which quoteSummary properties work for a given ticker"""
    if properties is None:
        properties = ['asset_profile', 'earnings', 'financial_data', 'key_stats', 
                     'price', 'summary_detail', 'major_holders']
    
    ticker_obj = yq.Ticker(ticker_symbol)
    working = []
    failing = []
    
    for prop in properties:
        success, _ = safe_quote_property(ticker_obj, prop, ticker_symbol)
        if success:
            working.append(prop)
        else:
            failing.append(prop)
    
    reliability = len(working) / len(properties) * 100
    return {
        'reliability_pct': reliability,
        'working_properties': working,
        'failing_properties': failing
    }

Suggested fixes (longer-term)

  • Documentation: Update yahooquery docs to clearly state that all quoteSummary properties may return string error messages for certain tickers/asset types, with reliability varying by asset class.
  • API Normalization: Implement consistent error handling across all quoteSummary properties:
    • Return standardized error objects instead of raw strings
    • Provide consistent response structure regardless of ticker type
    • Add reliability metadata to help developers understand data availability
  • Asset Class Awareness: Implement proper asset class detection and appropriate error handling for each type (stocks vs ETFs vs commodities vs forex).

Environment & notes

  • Reproduced on Windows with Python 3.12 and yahooquery (version used for test not recorded here β€” include when filing upstream bug).
  • Comprehensive testing conducted with comprehensive_quote_summary_test.py covering 73 tickers across 17 quoteSummary properties.
  • Issue affects the entire quoteSummary ecosystem, not just individual properties.
  • 1,241 individual API tests demonstrate systematic inconsistencies across asset classes.

Comprehensive Test Results (2025-10-24)

An extensive test was conducted using comprehensive_quote_summary_test.py with 73 diverse tickers spanning the entire NYSE range (A-Z) to evaluate the scope and unpredictability of response inconsistencies across all quoteSummary-based properties.

🎯 TEST SCOPE & SCALE

  • Total properties tested: 17 quoteSummary properties
  • Total tickers tested: 73 (massively expanded coverage)
  • Total individual API tests: 1,241
  • Execution time: 119.28 seconds (10.4 tests/second)
  • Coverage: Full alphabetical range A-Z across multiple asset classes and market caps

πŸ“Š CURRENT BUG STATUS: DRAMATICALLY IMPROVED

Properties with string responses: 0 (0%) πŸŽ‰
Properties with consistent behavior: 2 (12%) βœ…
Properties with mixed behavior: 15 (88%) ⚠️

πŸ” UNPREDICTABILITY BY ASSET CLASS

The bug exhibits highly unpredictable patterns across different asset types:

Asset Category String Response Rate Predictability
Major Tech Stocks 0.0% βœ… Highly Reliable
Financial Stocks 0.0% βœ… Highly Reliable
Energy & Resources 0.0% βœ… Highly Reliable
Consumer & Retail 0.0% βœ… Highly Reliable
Healthcare & Pharma 0.0% βœ… Highly Reliable
Industrial & Defense 0.0% βœ… Highly Reliable
Technology Services 0.0% βœ… Highly Reliable
International/ADR 4.7% ⚠️ Mostly Reliable
OTC/Penny Stocks 11.8% ⚠️ Moderately Unreliable
Small/Micro Cap 13.2% ⚠️ Moderately Unreliable
ETFs 52.9% ❌ Highly Unreliable
Market Indices 70.6% ❌ Very Unreliable
Forex Pairs 76.5% ❌ Extremely Unreliable
Commodities 76.5% ❌ Extremely Unreliable

🎭 THE UNPREDICTABILITY PROBLEM

Most Problematic Tickers (Highest string response rates):

  1. ACRX (88.2%) - Small pharmaceutical stock
  2. EURUSD=X (76.5%) - EUR/USD forex pair
  3. GBPUSD=X (76.5%) - GBP/USD forex pair
  4. GC=F (76.5%) - Gold futures
  5. CL=F (76.5%) - Crude oil futures

Most Reliable Tickers (0% string response rates):

  • All major stocks: AAPL, MSFT, GOOGL, TSLA, NVDA, META, AMZN
  • All financial stocks: JPM, BAC, GS, MS, WFC
  • Most consumer/retail: KO, PG, WMT, NKE, HD

⚠️ WIDESPREAD MIXED BEHAVIOR (15 Properties)

Almost all properties exhibit unpredictable mixed behavior:

Mixed Behavior Properties:

  • asset_profile - Works for stocks, fails for forex/commodities
  • calendar_events - Inconsistent across asset types
  • earnings - Works for stocks, mixed for others
  • earnings_trend - Asset-type dependent
  • esg_scores - Highly variable by ticker type
  • financial_data - Mixed success rates
  • index_trend - Inconsistent patterns
  • industry_trend - Asset-class dependent
  • key_stats - Variable reliability
  • major_holders - Mixed behavior patterns
  • page_views - Unpredictable responses
  • price - Inconsistent across asset types
  • share_purchase_activity - Mixed reliability
  • summary_detail - Variable behavior
  • summary_profile - Asset-dependent responses

Only 2 Properties with Consistent Behavior:

  • βœ… quote_type - Reliable across all asset types
  • βœ… technical_insights - Consistent response structure

🚨 KEY FINDINGS: BROADER IMPACT

  1. Asset Type Discrimination: The API behaves completely differently depending on asset class

    • Traditional stocks (0% failures) vs. Forex/Commodities (76%+ failures)
    • This creates a massive reliability gap across different financial instruments
  2. Systematic Unpredictability:

    • 88% of properties have mixed behavior that varies by ticker type
    • Developers cannot predict which calls will return strings vs. structured data
    • The same property can be reliable for one asset class and completely broken for another
  3. Scale of the Problem:

    • Out of 1,241 individual tests, hundreds exhibit inconsistent behavior
    • The bug affects 15 out of 17 properties to varying degrees
    • No property (except 2) provides truly reliable cross-asset behavior
  4. Production Impact:

    • Any application dealing with diverse asset types will experience failures
    • ETF analysis tools face 53% unreliability
    • Forex/commodities applications face 76%+ failure rates
    • Multi-asset portfolio tools cannot rely on consistent data structures

πŸ’‘ Demonstration of Core Issue

The inconsistency patterns are highly unpredictable and vary dramatically between stock tickers, creating a reliability minefield for developers:

Perfect Reliability: Major US Stocks (100%)

All major US stocks like AAPL, MSFT, JNJ, JPM, PG, KO, WMT, DIS, HD work flawlessly across all 17 properties:

# AAPL (100% Reliable) - All properties return structured data
ticker_obj = yq.Ticker("AAPL")
result = ticker_obj.asset_profile
# {'AAPL': {'address1': 'One Apple Park Way', 'city': 'Cupertino', ...}}

result = ticker_obj.earnings
# {'AAPL': {'earningsChart': {...}, 'financialsChart': {...}}}

result = ticker_obj.key_stats  
# {'AAPL': {'enterpriseValue': 3400000000000, 'forwardPE': 28.5, ...}}

Partial Reliability: Small Cap Stocks (5.9% - 94.1%)

BTCS Inc (Small blockchain company) - 16/17 properties work, 1 fails:

# BTCS: 94.1% reliable - Only ESG scores fail
ticker_obj = yq.Ticker("BTCS")

# βœ… WORKS: Returns rich structured data
result = ticker_obj.asset_profile
# {'BTCS': {'address1': '1430 Broadway', 'city': 'New York', ...}}

result = ticker_obj.financial_data  
# {'BTCS': {'currentPrice': 2.85, 'targetHighPrice': 10.0, ...}}

# ❌ FAILS: Returns string error message  
result = ticker_obj.esg_scores
# {'BTCS': 'No fundamentals data found for symbol: BTCS'}

Near-Complete Failure: Some Small Stocks (11.8%)

ACRX (AcelRx Pharmaceuticals) - Only 2/17 properties work, 15 fail:

# ACRX: 11.8% reliable - Most properties fail
ticker_obj = yq.Ticker("ACRX")

# ❌ FAILS: Most properties return string errors
result = ticker_obj.asset_profile
# {'ACRX': 'Quote not found for symbol: ACRX'}

result = ticker_obj.earnings
# {'ACRX': 'Quote not found for symbol: ACRX'}

# βœ… WORKS: Only 2 properties work  
result = ticker_obj.quote_type
# {'ACRX': {'quoteType': 'EQUITY', 'symbol': 'ACRX', ...}}

result = ticker_obj.technical_insights
# {'ACRX': {'symbol': 'ACRX', 'upsell': {...}, ...}}

The Reliability Spectrum

Stock Type Example Reliability Rate Impact
Major US Stocks AAPL, MSFT, JNJ 100% βœ… Perfect
Small Growth Stock BTCS 94.1% ⚠️ Mostly reliable
Micro-Cap Stock ACRX 11.8% ❌ Mostly broken

The Core Problem: The same API methods produce completely different data types (structured dicts vs error strings) depending on the ticker's market classification, company size, or data availability. This creates an impossible development environment where:

  • You cannot predict which stocks will work with which properties
  • The same property call might work for Apple but fail for a small biotech
  • There's no pattern - some small caps work fine, others fail catastrophically
  • Defensive coding is required for every single API call across every ticker type

About

This is the documentation of the finding, testing, and fixing of a bug in the popular repo Yahooquery a python library that allows people to freely access financial information from yahoo finance.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages