Indiana Improvements #1493

Open
wants to merge 21 commits into
base: main

Luis-manzur
Contributor

This pull request enhances the ind.py scraper to include additional fields for lower court details and judge names, and updates the corresponding test cases to validate these changes.

@Luis-manzur Luis-manzur requested a review from flooie July 9, 2025 21:18
@Luis-manzur Luis-manzur linked an issue Jul 9, 2025 that may be closed by this pull request
@Luis-manzur Luis-manzur moved this to PRs to Review in Case Law Sprint Jul 9, 2025
@flooie
Contributor

flooie commented Jul 10, 2025

Small issue here on the title:

feat(idaho): add new fields for lower court details and judge names i… #1493

This seems like it's for Indiana, not Idaho.

@flooie flooie changed the title feat(idaho): add new fields for lower court details and judge names i… Indiana Improvements Jul 10, 2025
Contributor

@flooie flooie left a comment

This is not working as expected. Take a look at this example I pulled from the data. Here are the scraper results:

Adding new item:
    case_dates: "2025-01-23"
    case_names: "Estate of Elmer Gordon Waggoner v. Anonymous Health System, Inc."
    download_urls: "https://public.courts.in.gov/Decisions/api/Document/Opinion?Id=6Prq4NVaQ75RijbOC2NrHHYSFd2Uxg3A-INdlkTtRZac_o0my9t2-w3VgZ6EdYor0"
    precedential_statuses: "Published"
    blocked_statuses: "False"
    date_filed_is_approximate: "False"
    dispositions: "Reversed and Remanded"
    docket_numbers: "24A-CT-00469"
    judges: "Paul A, III, John G - SR"
    lower_courts: "Vanderburgh Superior Court 1"
    lower_court_numbers: "82D01-2308-CT-003727"
    case_name_shorts: ""

and the JSON

{'additionalCourtCount': 0,
  'argumentUrl': None,
  'caseNumber': '24A-CT-00469',
  'category': 'Civil     ',
  'courtName': 'Court of Appeals',
  'courts': [{'id': 44201721,
              'name': 'Vanderburgh Superior Court 1',
              'number': '82D01-2308-CT-003727'},
             {'id': 44815078,
              'name': 'Court of Appeals',
              'number': '24A-CT-00469'}],
  'date': '1/23/2025',
  'decision': 'Reversed and Remanded',
  'detailsUrl': 'https://public.courts.in.gov/mycase/#/vw/CaseSummary/eyJ2Ijp7IkNhc2VUb2tlbiI6ImpvSUNnb240MmZWUmhudEMxRjhxdjVYRURZelo1ZUhweDZYaG5YbFpGTk0xIiwiSGlkZVRvb2xiYXJzIjp0cnVlLCJQQUxvZ28iOmZhbHNlLCJTUkNUIjoiMHRXRFpEMVMyQ3kwYTdUVVJFZHZ3NXBZOG5EVmp6U1NUTUxMSE9qVm81TTEifX0=',
  'id': -971371332,
  'isMemorandum': False,
  'opinion': {'dispCompInstanceId': 9416930,
              'perCuriam': False,
              'result': 'Majority Opinion',
              'votes': [{'dispCompInstanceId': 9416930,
                         'judge': 'Felix, Paul A.',
                         'judgeCode': 106858,
                         'seperateOpinion': False,
                         'voteCode': 71040,
                         'voteRank': 2,
                         'voteValue': 'Concur'},
                        {'dispCompInstanceId': 9416930,
                         'judge': 'Pyle, Rudolph R., III',
                         'judgeCode': 70232,
                         'seperateOpinion': False,
                         'voteCode': 71040,
                         'voteRank': 1,
                         'voteValue': 'Concur'},
                        {'dispCompInstanceId': 9416930,
                         'judge': 'Baker, John G. - SR',
                         'judgeCode': 99074,
                         'seperateOpinion': False,
                         'voteCode': 71039,
                         'voteRank': 0,
                         'voteValue': 'Author'}]},
  'opinionText': 'in an opinion by Judge Baker.',
  'opinionUrl': 'api/Document/Opinion?Id=6Prq4NVaQ75RijbOC2NrHHYSFd2Uxg3A-INdlkTtRZac_o0my9t2-w3VgZ6EdYor0',
  'publishedText': '',
  'style': 'Estate of Elmer Gordon Waggoner\n'
           'v.\n'
           'Anonymous Health System, Inc., et al., et al.',
  'voteText': 'Judge Pyle and Judge Felix concur'},

A couple of things jump out. First, we can identify the author, so let's slot that into the author field. We should then be able to identify the judges easily enough, but the judge parsing is grabbing first names and suffixes rather than surnames.

Also, we have an opportunity to grab the opinion type, so let's do that as well so we can identify concurrences, dissents, majority opinions, etc.

Also, let's update our tests so that when a lower court is added, it's also covered in the tests. Here the Court of Appeals wasn't getting any lower court data, I think.
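
For illustration, here is a minimal sketch of how the author and the panel could be pulled from the `votes` list shown in the JSON above. The helper names `last_name` and `split_panel` are hypothetical and are not taken from `ind.py`; the sketch also assumes the surname always precedes the first comma and that the author is the vote whose `voteValue` is `Author`.

```python
def last_name(raw: str) -> str:
    """Return the surname from a string such as 'Pyle, Rudolph R., III'."""
    # Assumption: the surname is everything before the first comma.
    return raw.split(",", 1)[0].strip()


def split_panel(opinion: dict) -> tuple[str, list[str]]:
    """Return (author, judges) built from an opinion's votes."""
    # Sort by voteRank so the order is stable; in the sample the author has rank 0.
    votes = sorted(opinion.get("votes", []), key=lambda v: v["voteRank"])
    judges = [last_name(v["judge"]) for v in votes]

    author = ""
    if not opinion.get("perCuriam"):
        # Assumption: at most one vote carries the value 'Author'.
        author = next(
            (last_name(v["judge"]) for v in votes if v["voteValue"] == "Author"),
            "",
        )
    return author, judges


sample_opinion = {
    "perCuriam": False,
    "votes": [
        {"judge": "Felix, Paul A.", "voteRank": 2, "voteValue": "Concur"},
        {"judge": "Pyle, Rudolph R., III", "voteRank": 1, "voteValue": "Concur"},
        {"judge": "Baker, John G. - SR", "voteRank": 0, "voteValue": "Author"},
    ],
}
author, judges = split_panel(sample_opinion)
```

Taking only the text before the first comma sidesteps the first-name and suffix problem entirely, at the cost of losing suffixes such as "III" or "SR" that might matter for disambiguation.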

@flooie flooie assigned flooie and Luis-manzur and unassigned flooie Jul 10, 2025
@Luis-manzur Luis-manzur assigned flooie and unassigned Luis-manzur Jul 10, 2025
@Luis-manzur
Contributor Author

@flooie back to you

[]
Contributor

this seems bad

@flooie flooie assigned Luis-manzur and unassigned flooie Jul 10, 2025
@flooie
Contributor

flooie commented Jul 10, 2025

Looks like the API query doesn't actually filter for each court, and we just iterate over the last 250 results across the three courts. Let's do better and filter our results. This will make our test files better as well.
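
As a rough sketch of the filtering idea, the results could be narrowed client-side using the `courtName` key that appears in the JSON above. The helper name `filter_by_court` is hypothetical, and the court names other than `Court of Appeals` in the sample data are assumptions. Whether the API itself accepts a court parameter is not known from this thread, so this only filters after the response is received.

```python
def filter_by_court(cases: list[dict], court_name: str) -> list[dict]:
    """Keep only the cases decided by the given court."""
    # strip() guards against padded values; 'category' in the sample
    # response has trailing spaces, so 'courtName' might as well.
    return [c for c in cases if c.get("courtName", "").strip() == court_name]


# Made-up rows for illustration; only 'Court of Appeals' is confirmed above.
sample_cases = [
    {"caseNumber": "24A-CT-00469", "courtName": "Court of Appeals"},
    {"caseNumber": "hypothetical-1", "courtName": "Supreme Court"},
    {"caseNumber": "hypothetical-2", "courtName": "Tax Court"},
]
appeals_only = filter_by_court(sample_cases, "Court of Appeals")
```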

@Luis-manzur Luis-manzur assigned flooie and unassigned Luis-manzur Jul 10, 2025
@Luis-manzur Luis-manzur assigned Luis-manzur and unassigned flooie Jul 14, 2025
@Luis-manzur Luis-manzur assigned flooie and unassigned Luis-manzur Jul 14, 2025
"judge": judge,
"author": self.extract_author(case, is_per_curiam),
"per_curiam": is_per_curiam,
"type": type,
Contributor

Looks like they acknowledge dissents in this JSON, but the document is actually combined. The dissent I looked up is in the same document, so we can remove this for now: it is incorrect, and it will default to combined opinion on CL.


@staticmethod
def clean_judge_name(name: str) -> str:
"""Cleans and formats a judge's name string."""
Contributor

docstrings?

Contributor

@flooie flooie left a comment

Let's update our docstrings and remove type, but otherwise I think we can merge this.

@flooie flooie assigned Luis-manzur and unassigned flooie Jul 14, 2025
@Luis-manzur Luis-manzur assigned flooie and unassigned Luis-manzur Jul 14, 2025

:param name: The name of the judge as a string.
:return: A cleaned and formatted version of the judge's name.

Contributor

The extra whitespace here should be removed.

Comment on lines 62 to 66
"lower_court": ", ".join(c["name"] for c in other_courts),
"lower_court_number": ", ".join(
c["number"] for c in other_courts
),
"judge": judge,
Contributor

The lower court and lower court number fields include too much information.

These fields should only include the court that was directly appealed from. If it's the Supreme Court, that should be the Court of Appeals in most cases, etc.

We shouldn't combine docket numbers and names from the entire appellate chain.
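
One possible way to pick only the court directly appealed from is sketched below. The helper name `direct_lower_court` is hypothetical. The sketch assumes that the `courts` list is ordered from the originating court upward and that the current court's entry shares its `number` with `caseNumber`, as in the Court of Appeals example earlier in this thread. Neither assumption has been verified for Supreme Court records.

```python
def direct_lower_court(case: dict) -> tuple[str, str]:
    """Return (name, docket number) of the court directly below the current one."""
    # Drop the entry for the court that issued this opinion.
    others = [
        c for c in case.get("courts", []) if c["number"] != case["caseNumber"]
    ]
    if not others:
        return "", ""
    # Assumption: entries run from the trial court upward, so the last
    # remaining entry is the court that was appealed from.
    lower = others[-1]
    return lower["name"], lower["number"]


appeals_case = {
    "caseNumber": "24A-CT-00469",
    "courts": [
        {"name": "Vanderburgh Superior Court 1", "number": "82D01-2308-CT-003727"},
        {"name": "Court of Appeals", "number": "24A-CT-00469"},
    ],
}
lower_court, lower_court_number = direct_lower_court(appeals_case)
```

If the ordering assumption turns out not to hold, matching on the court name (for example, preferring a `Court of Appeals` entry when the current court is the Supreme Court) would be a safer rule.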

@flooie flooie assigned Luis-manzur and unassigned flooie Jul 16, 2025
@Luis-manzur Luis-manzur assigned flooie and unassigned Luis-manzur Jul 16, 2025
Labels
None yet
Projects
Status: PRs to Review
Development

Successfully merging this pull request may close these issues.

Improve ind scraper
2 participants