Add script to identify candidate changes to column maps #4608

@krivard

Overview

Quarterly updates are tedious for sources that frequently change the layout, schema, or spelling of their raw files. For example, the 2025Q1 update for EIA 930 dropped 9 columns and added 30.

A script that compares the new source file(s) to the existing maps and summarizes changes would help immensely.

Success Criteria

A good script would output:

  • columns in the map that are not in the new source file (to identify deleted columns)
  • columns in the new source file that are not mapped (to identify added columns)
  • column headers in the new source file paired with their PUDL column names (to identify changes to column order)
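
A minimal sketch of the comparison, to make the three outputs above concrete. It assumes the column map for one source file has already been loaded as a plain dict from raw header to PUDL column name, and that the new file's headers have been read into a list. The names `ColumnMapDiff` and `diff_column_map`, and the example headers, are hypothetical and not existing PUDL code. Matching is exact, so a respelled header shows up as one deleted and one added column.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnMapDiff:
    """Summary of how a new source file differs from an existing column map."""

    deleted: list[str]  # mapped raw headers that are missing from the new file
    added: list[str]  # headers in the new file that have no mapping
    paired: list[tuple[str, Optional[str]]]  # (raw header, PUDL name) in file order


def diff_column_map(column_map: dict[str, str], new_headers: list[str]) -> ColumnMapDiff:
    """Compare raw headers in a new file against a raw-header -> PUDL-name map.

    Hypothetical helper: the real column maps may be stored differently, so
    loading them into ``column_map`` is left to the caller.
    """
    new_set = set(new_headers)
    # Keep the map's own order so the report is stable between runs.
    deleted = [raw for raw in column_map if raw not in new_set]
    # Keep the file's order so added columns are easy to locate in the source.
    added = [raw for raw in new_headers if raw not in column_map]
    # Unmapped headers are paired with None so order changes stay visible.
    paired = [(raw, column_map.get(raw)) for raw in new_headers]
    return ColumnMapDiff(deleted=deleted, added=added, paired=paired)


# Made-up headers and PUDL names, for illustration only.
example_map = {
    "Region": "region",
    "Demand (MW)": "demand_mw",
    "Net Generation (MW)": "net_generation_mw",
}
example_headers = ["Demand (MW)", "Region", "Interchange (MW)"]
example_diff = diff_column_map(example_map, example_headers)

print("Deleted:", example_diff.deleted)
print("Added:", example_diff.added)
for raw, pudl in example_diff.paired:
    print(f"{raw} -> {pudl if pudl is not None else '(unmapped)'}")
```

Returning the paired list in file order, rather than sorting it, is what makes a reordering of columns visible when the output is compared against the previous quarter's.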

Metadata

Assignees

No one assigned

    Labels

    data-update: When fresh data is integrated into PUDL from quarterly or annual updates.
    developer experience: Things that make the developers' lives easier, but don't necessarily directly improve the data.

    Type

    No type

    Projects

    Status

    Icebox

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
