
fix: prevent atom exhaustion in merge_projects mix task #3973

Merged
taylordowns2000 merged 8 commits into main from 3615-fix-atom-exhaustion on Nov 14, 2025

Conversation

@elias-ba
Contributor

@elias-ba elias-ba commented Nov 12, 2025

Description

This PR fixes a security vulnerability in the mix lightning.merge_projects task where malicious JSON input with arbitrary keys could cause atom exhaustion and crash the VM.

The fix uses String.to_existing_atom/1 to convert JSON keys to atoms safely: a key is accepted only if its atom already exists in the running VM, and any other key raises an ArgumentError. This prevents malicious input from creating an unbounded number of atoms while keeping the atom-keyed maps that the merge algorithm expects.
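
For readers less familiar with the failure mode: atoms are never garbage collected and the atom table has a fixed limit, so converting attacker-controlled strings with String.to_atom/1 can eventually crash the VM. The snippet below is a minimal sketch of the difference, not the code from this PR; the `unknown_field_` prefix is only an illustrative way to build a key that has no atom yet.

```elixir
# A key whose atom already exists converts without creating anything new.
# (:name exists here because the literal appears in this file.)
:name = String.to_existing_atom("name")

# Build a key at runtime so that no matching atom exists yet.
unknown_key = "unknown_field_#{System.unique_integer([:positive])}"

# String.to_existing_atom/1 raises ArgumentError instead of growing the atom table.
result =
  try do
    String.to_existing_atom(unknown_key)
  rescue
    ArgumentError -> :rejected
  end

IO.inspect(result)
```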

Closes #3956

Validation steps

  1. Test basic merge functionality:

    mix test test/mix/tasks/merge_projects_test.exs

    All tests should pass.

  2. Test with non-UUID IDs (Joe's requirement):

    • Run merge with projects using simple IDs like "1", "2", "test-source-1"
    • Verify merge works without requiring valid UUIDs
  3. Test security:

    • Try merging a project with unknown JSON keys
    • Should raise clear error about unknown fields
  4. Test offline operation:

    • Run mix task without database connection
    • Should work without any database access

Additional notes for the reviewer

  1. Implementation approach: Uses an atomize_keys/1 function that recursively converts only map keys to atoms (not values). UUIDs and other string values remain as strings.
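
A rough sketch of what a key-only recursive conversion like the one described in item 1 could look like. The module name `AtomizeSketch` and the private helper `to_existing/1` are hypothetical, and the real function in the PR may differ in its clauses and error handling.

```elixir
defmodule AtomizeSketch do
  # Maps: convert each key, then recurse into the value.
  def atomize_keys(data) when is_map(data) do
    Map.new(data, fn {key, value} -> {to_existing(key), atomize_keys(value)} end)
  end

  # Lists: recurse into each element so nested maps are handled too.
  def atomize_keys(data) when is_list(data), do: Enum.map(data, &atomize_keys/1)

  # Anything else (strings, numbers, booleans, nil) is a value and stays as-is.
  def atomize_keys(data), do: data

  # Raises ArgumentError when the key has no existing atom.
  defp to_existing(key) when is_binary(key), do: String.to_existing_atom(key)
  defp to_existing(key) when is_atom(key), do: key
end

input = %{
  "id" => "1",
  "name" => "demo",
  "workflows" => [%{"id" => "test-source-1", "name" => "wf"}]
}

result = AtomizeSketch.atomize_keys(input)
IO.inspect(result)
```

Note that the IDs "1" and "test-source-1" pass through untouched because only keys are converted, which is what lets non-UUID IDs work.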

  2. No Provisioner coupling: Intentionally does not use Provisioner.parse_document/1 to avoid coupling to validation rules and usage limiting checks (per Joe's feedback).

  3. Lets merge fail naturally: If project structure is truly invalid, the merge algorithm itself will fail with appropriate errors rather than blocking upfront.

AI Usage

Please disclose how you've used AI in this work (it's cool, we just want to know!):

  • Code generation (copilot but not intellisense)
  • Learning or fact checking
  • Strategy / design
  • Optimisation / refactoring
  • Translation / spellchecking / doc gen
  • Other
  • I have not used AI

You can read more details in our Responsible AI Policy

Pre-submission checklist

  • I have performed a self-review of my code.
  • I have implemented and tested all related authorization policies. (e.g., :owner, :admin, :editor, :viewer)
  • I have updated the changelog.
  • I have ticked a box in "AI usage" in this PR

@codecov

codecov bot commented Nov 12, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.82%. Comparing base (baeaef7) to head (655c8fc).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3973      +/-   ##
==========================================
+ Coverage   88.80%   88.82%   +0.01%     
==========================================
  Files         422      422              
  Lines       19067    19067              
==========================================
+ Hits        16933    16936       +3     
+ Misses       2134     2131       -3     


@elias-ba
Contributor Author

elias-ba commented Nov 12, 2025

Hey @rorymckinley and @josephjclark 👋

This is a new PR that supersedes #3956. Based on Joe's feedback about not coupling the merge task to Provisioner validation or database requirements, I've implemented a much simpler approach.

What changed from #3956

The new implementation:

  • No Provisioner dependency (avoids coupling to validation rules and usage limits)
  • No database access (works completely offline as Joe needed)
  • Simple String.to_existing_atom/1 for security (prevents atom exhaustion)
  • Supports non-UUID IDs like "1", "test-source-1" (Joe's requirement for testing)
  • Lets the merge algorithm handle validation naturally

How it works

  1. Takes JSON files as input
  2. Converts string keys to existing atoms using String.to_existing_atom/1
  3. Passes atom-keyed maps to MergeProjects.merge_project/2
  4. Returns the merged result

This approach keeps the Mix task decoupled from Lightning's validation constraints while still preventing the atom exhaustion security vulnerability.

@josephjclark - As discussed on Slack, this should satisfy your testing needs. When you have time, please test this branch thoroughly and let me know if you see any issues. I've added you as a reviewer.

@rorymckinley - When you get a chance, would you be able to help review this and get it merged? I believe this approach is cleaner and aligns better with the Mix task's purpose of being a simple, offline utility.

Thanks both! 🙏

Collaborator

@josephjclark josephjclark left a comment

Not going to pretend to 100% understand what's happening here - but this passes against my test suite, so I'm all for it!

Collaborator

@rorymckinley rorymckinley left a comment

@elias-ba I have not had a chance to really step through the test changes, will do that tomorrow, but I am shutting down now and don't want to take the chance that github forgets my current comments - so please consider this to be part 1 :).

I have 'Requested Changes' primarily because of my confusion re: Jason.decode - is it still in the execution path, or are my eyes just tired?

This may indicate incompatible project structures or corrupted data.
Please verify both files are valid Lightning project exports.
""")
defp atomize_keys(data) when is_list(data) do
Collaborator

@elias-ba If I comment out this method, no tests fail - is it still required?

@github-project-automation github-project-automation bot moved this from New Issues to In review in Core Nov 12, 2025
@rorymckinley
Collaborator

@elias-ba Ok, done. If my assessment re: the continued presence of Jason.decode with atoms is correct, that is the only blocker - the rest are nice-to-haves (depending on how you view test coverage :) ).

Address PR feedback by implementing comprehensive atom safety:

- Remove keys: :atoms from Jason.decode to prevent DoS attacks
- Rename atomize_keys to atomize for accuracy (atomizes entire structures)
- Add ensure_schemas_loaded() to load schema modules before atomization
- Use module attribute @required_schemas for maintainable schema list
- Add test coverage for ArgumentError rescue on unknown atoms
- Optimize schema loading placement (after file validation)

This ensures String.to_existing_atom/1 works safely by guaranteeing
all schema field atoms (id, name, workflows, etc.) exist in memory
before JSON key conversion.

Fixes #3615

Consolidates two nearly identical tests that were testing the same
functionality with different ID formats. The remaining test now has
clearer comments explaining what each assertion verifies.

Addresses Rory's feedback about test duplication around lines 9-55
and 704-746 in the test file.
@elias-ba
Contributor Author

elias-ba commented Nov 14, 2025

@rorymckinley Thanks for the thorough review Rory! You're absolutely right about the blocker - that was embarrassing. I could swear I had removed the keys: :atoms from Jason.decode, but it must have come back through merge conflicts or something. Either way, I'm really glad you caught it because that was the whole point of this PR. I've now removed the keys: :atoms option entirely, so we parse with string keys and then explicitly atomize using String.to_existing_atom/1. This actually closes the DoS vulnerability.

For the atom availability concern, great catch. I've added an ensure_schemas_loaded() function that calls Code.ensure_loaded/1 on all the schema modules we need (Project, Workflow, Job, Trigger, Edge). This ensures their field atoms are in memory before we attempt atomization. Used a module attribute @required_schemas at the top of the file so it's easy to add new schemas as the export format evolves.
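
As an illustration of the loading step described above: a module's atoms are only registered once the module is loaded, so String.to_existing_atom/1 can fail for a legitimate field if its schema module has not been touched yet. The sketch below assumes a shape for ensure_schemas_loaded/0 and uses standard-library modules (URI, Version, Date) as stand-ins, since the Lightning schema modules are not available outside the project; the module name `SchemaLoaderSketch` is hypothetical.

```elixir
defmodule SchemaLoaderSketch do
  # Stand-ins only: the PR lists the Project, Workflow, Job, Trigger and Edge
  # schemas here. Standard-library modules keep this sketch runnable on its own.
  @required_schemas [URI, Version, Date]

  # Loads every listed module so that the atoms they define exist
  # before any JSON key is passed to String.to_existing_atom/1.
  def ensure_schemas_loaded do
    Enum.each(@required_schemas, fn mod ->
      case Code.ensure_loaded(mod) do
        {:module, ^mod} ->
          :ok

        {:error, reason} ->
          raise "could not load #{inspect(mod)}: #{inspect(reason)}"
      end
    end)
  end
end

loaded = SchemaLoaderSketch.ensure_schemas_loaded()
IO.inspect(loaded)
```

Keeping the list in a module attribute means a new schema in the export format is a one-line change.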

For the function naming, you're right that atomize_keys was misleading since it recursively atomizes maps and lists. Renamed it to just atomize throughout.

Added test coverage for the ArgumentError rescue block - it creates JSON with a dynamically generated field name that won't exist as an atom, verifying the protection mechanism actually works when unknown fields are present.
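
A test along these lines might look like the sketch below. It is not the test from the PR: the module name `UnknownKeyTest` and the `unknown_field_` prefix are hypothetical, and it exercises String.to_existing_atom/1 directly rather than the mix task.

```elixir
# Start ExUnit without autorun so the suite can be run explicitly below.
ExUnit.start(autorun: false)

defmodule UnknownKeyTest do
  use ExUnit.Case, async: true

  test "a dynamically generated key is rejected instead of becoming an atom" do
    # Built at runtime, so no atom with this name exists yet.
    key = "unknown_field_#{System.unique_integer([:positive])}"

    assert_raise ArgumentError, fn -> String.to_existing_atom(key) end
  end

  test "a key with an existing atom still converts" do
    assert String.to_existing_atom("name") == :name
  end
end

# Returns a map with the total, failure, excluded and skipped counts.
suite_result = ExUnit.run()
```

Generating the field name at runtime matters: writing the unknown key as an atom literal anywhere in the test file would create the atom and make the test pass for the wrong reason.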

For the UUID validation discussion, since this doesn't involve any database queries we're probably running no risk by allowing non-UUID IDs. The flexibility is useful for Joe's testing scenarios without compromising anything.

On test duplication, you were spot on. The "deeply nested structures" test was essentially identical to the basic merge test, just with different ID formats. Removed it and enhanced the remaining test with clearer comments.

I also renamed the "works offline without database access" test to "runs standalone without database fixtures" which better describes what it's actually testing.

Could you please do another round of review and let me know if you see any other issues? Thanks a lot for your help, man!

@elias-ba elias-ba changed the title fix: prevent atom exhaustion in merge_projects mix task fix: prevent atom exhaustion in merge_projects mix task Nov 14, 2025
Collaborator

@rorymckinley rorymckinley left a comment

@elias-ba Jërëjëf Jambaar! Great work - as Joe has tested the mix task itself, I am not going to delay things by trying to get the setup required for that.

Note: Since the last review, it appears that this PR has gained some unrelated TypeScript files?

@taylordowns2000 taylordowns2000 merged commit 9a91771 into main Nov 14, 2025
8 checks passed
@taylordowns2000 taylordowns2000 deleted the 3615-fix-atom-exhaustion branch November 14, 2025 06:36
@github-project-automation github-project-automation bot moved this from In review to Done in Core Nov 14, 2025
@taylordowns2000
Member

yeah @rorymckinley , we were supposed to be done with the quote wars (" to ') but they keep popping up

4 participants