@khoaanguyenn khoaanguyenn commented Dec 26, 2024

Problem

The pipelinewise tap disregards the original stream and refreshes the schema every time the Meltano run command is executed. The newly refreshed schema then overrides the original stream, wiping out the metadata.selected fields that are used to disable particular columns.

Proposed changes

Describe the big picture of your changes here to communicate to the maintainers why we should accept this pull request.
If it fixes a bug or resolves a feature request, be sure to link to that issue.

Types of changes

What types of changes does your code introduce to PipelineWise?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation Update (if none of the other choices apply)

Checklist

  • Description above provides context of the change
  • I have added tests that prove my fix is effective or that my feature works
  • Unit tests for changes (not needed for documentation changes)
  • CI checks pass with my changes
  • Bumping version in setup.py is an individual PR and not mixed with feature or bugfix PRs
  • Commit message/PR title starts with [AP-NNNN] (if applicable. AP-NNNN = JIRA ID)
  • Branch name starts with AP-NNN (if applicable. AP-NNN = JIRA ID)
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions

updated_metadata = metadata.to_list(original_metadata_map)

# 2nd step: now copy all the metadata from the updated new discovery to the original stream
streams[idx]['metadata'] = copy.deepcopy(new_discovery[stream['tap_stream_id']]['metadata'])

@khoaanguyenn khoaanguyenn Jan 8, 2025

  1. The original code doesn't respect the metadata within the catalog provided by Meltano SDK; every time the tap is executed, that metadata is overridden by the newly discovered stream from the Postgres catalog. The fix above retains the original selected field in each stream, which is required to select the corresponding columns as configured in the Meltano YAML file.
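
To illustrate the idea, here is a minimal sketch of carrying the selected flags from the original stream over to freshly discovered metadata. It is not the code in this PR: the helper name merge_selected_flags and the sample streams are hypothetical, and it works on plain Singer-style metadata lists (entries with a breadcrumb and a metadata dict) rather than the singer metadata helpers used in the tap.

```python
import copy


def merge_selected_flags(original_stream, discovered_stream):
    """Return a copy of the discovered metadata with 'selected' flags
    carried over from the original stream (hypothetical helper)."""
    # Index the original 'selected' flags by breadcrumb.
    original_selected = {
        tuple(entry["breadcrumb"]): entry["metadata"]["selected"]
        for entry in original_stream.get("metadata", [])
        if "selected" in entry.get("metadata", {})
    }

    # Start from the fresh discovery so new columns and types are kept.
    merged = copy.deepcopy(discovered_stream.get("metadata", []))
    for entry in merged:
        key = tuple(entry["breadcrumb"])
        if key in original_selected:
            entry.setdefault("metadata", {})["selected"] = original_selected[key]
    return merged


# Made-up sample data: the catalog passed in deselects the 'email' column.
original = {
    "tap_stream_id": "public-users",
    "metadata": [
        {"breadcrumb": [], "metadata": {"selected": True}},
        {"breadcrumb": ["properties", "id"], "metadata": {"selected": True}},
        {"breadcrumb": ["properties", "email"], "metadata": {"selected": False}},
    ],
}

# Made-up fresh discovery: no 'selected' flags, plus a new 'phone' column.
discovered = {
    "tap_stream_id": "public-users",
    "metadata": [
        {"breadcrumb": [], "metadata": {"table-key-properties": ["id"]}},
        {"breadcrumb": ["properties", "id"], "metadata": {"inclusion": "automatic"}},
        {"breadcrumb": ["properties", "email"], "metadata": {"inclusion": "available"}},
        {"breadcrumb": ["properties", "phone"], "metadata": {"inclusion": "available"}},
    ],
}

merged = merge_selected_flags(original, discovered)
```

In this sketch the email entry keeps selected set to False after the merge, while the newly discovered phone column gets no selected flag and is left to the default selection rules.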
