add/debug Lit CI #21002

Open: Borda wants to merge 16 commits into master

Borda (Member) commented Jul 25, 2025

github-actions bot added the labels ci (Continuous Integration), fabric (lightning.fabric.Fabric), and pl (Generic label for PyTorch Lightning package) on Jul 25, 2025
Borda changed the title from "add/debug Lit CI" to "[wip] add/debug Lit CI" on Aug 5, 2025
Borda (Member, Author) commented Aug 8, 2025

@lantiga any idea why this hangs on the L4 machine?

                assert not thread.is_alive()
            elif isinstance(thread, _ChildProcessObserver):
                thread.join(timeout=10)
            elif (
                thread.name == "QueueFeederThread"  # tensorboardX
                or thread.name == "QueueManagerThread"  # torch.compile
                or "(_read_thread)" in thread.name  # torch.compile
            ):
                thread.join(timeout=20)
            elif (
                sys.version_info >= (3, 9)
                and isinstance(thread, _ExecutorManagerThread)
                or "ThreadPoolExecutor-" in thread.name
            ):
                # probably `torch.compile`, can't narrow it down further
                continue
            else:
>               raise AssertionError(f"Test left zombie thread: {thread}")
E               AssertionError: Test left zombie thread: <_DummyThread(Dummy-3, started daemon 140584534672960)>
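
For context on the `_DummyThread` in the failure above: CPython creates such an object when `threading.current_thread()` is called from a thread that was not started through `threading.Thread` (for example, a thread spawned by a C extension). The sketch below reproduces that situation with the low-level `_thread.start_new_thread`. The helper name `observe_foreign_thread` is made up for illustration, and this is only an assumption about the kind of thread involved, not a diagnosis of what spawns `Dummy-3` on the L4 machine.

```python
import _thread
import threading


def observe_foreign_thread(timeout=5.0):
    """Start a thread outside `threading.Thread` and report how `threading` sees it.

    Hypothetical helper for illustration only; it is not part of Lightning.
    """
    result = {}
    done = threading.Event()

    def worker():
        # Calling current_thread() from a thread that `threading` did not create
        # makes CPython register a placeholder `_DummyThread` for it.
        me = threading.current_thread()
        result["type"] = type(me).__name__
        result["name"] = me.name
        # Checked while the thread is still running, so the entry is registered.
        result["listed"] = me in threading.enumerate()
        done.set()

    _thread.start_new_thread(worker, ())
    if not done.wait(timeout):
        raise TimeoutError("foreign thread did not report back in time")
    return result


if __name__ == "__main__":
    print(observe_foreign_thread())
```

A thread registered this way shows up in `threading.enumerate()`, which is presumably how a thread-leak check like the one in the traceback ends up seeing it and falling through to the final `else` branch.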

Borda marked this pull request as ready for review on August 8, 2025, 15:47
github-actions bot (Contributor) commented Aug 8, 2025

⚡ Required checks status: All passing 🟢

Groups summary

🟢 pytorch_lightning: Tests workflow
Check ID Status
pl-cpu-guardian success

These checks are required after the changes to src/lightning/fabric/utilities/testing/_runif.py, src/lightning/pytorch/utilities/testing/_runif.py, tests/tests_pytorch/conftest.py.

🟢 pytorch_lightning: Azure GPU
Check ID Status
pytorch-lightning (GPUs) (testing Lightning | latest) success
pytorch-lightning (GPUs) (testing PyTorch | latest) success

These checks are required after the changes to .azure/gpu-tests-pytorch.yml, src/lightning/pytorch/utilities/testing/_runif.py, tests/tests_pytorch/conftest.py, src/lightning/fabric/utilities/testing/_runif.py.

🟢 pytorch_lightning: Benchmarks
Check ID Status
lightning.Benchmarks success

These checks are required after the changes to .azure/gpu-benchmarks.yml, src/lightning/fabric/utilities/testing/_runif.py, src/lightning/pytorch/utilities/testing/_runif.py.

🟢 fabric: Docs
Check ID Status
docs-make (fabric, doctest) success
docs-make (fabric, html) success

These checks are required after the changes to src/lightning/fabric/utilities/testing/_runif.py.

🟢 pytorch_lightning: Docs
Check ID Status
docs-make (pytorch, doctest) success
docs-make (pytorch, html) success

These checks are required after the changes to src/lightning/pytorch/utilities/testing/_runif.py.

🟢 lightning_fabric: CPU workflow
Check ID Status
fabric-cpu-guardian success

These checks are required after the changes to src/lightning/fabric/utilities/testing/_runif.py, tests/tests_fabric/conftest.py.

🟢 lightning_fabric: Azure GPU
Check ID Status
lightning-fabric (GPUs) (testing Fabric | latest) success
lightning-fabric (GPUs) (testing Lightning | latest) success

These checks are required after the changes to .azure/gpu-tests-fabric.yml, src/lightning/fabric/utilities/testing/_runif.py, tests/tests_fabric/conftest.py.

🟢 mypy
Check ID Status
mypy success

These checks are required after the changes to src/lightning/fabric/utilities/testing/_runif.py, src/lightning/pytorch/utilities/testing/_runif.py.

🟢 install
Check ID Status
install-pkg-guardian success

These checks are required after the changes to src/lightning/fabric/utilities/testing/_runif.py, src/lightning/pytorch/utilities/testing/_runif.py.


Thank you for your contribution! 💜

Note
This comment is automatically generated and updates for 60 minutes every 180 seconds. If you have any other questions, contact carmocca for help.

codecov bot commented Aug 8, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87%. Comparing base (72bb751) to head (f152380).
⚠️ Report is 1 commit behind head on master.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #21002   +/-   ##
=======================================
  Coverage      87%      87%           
=======================================
  Files         268      268           
  Lines       23321    23321           
=======================================
+ Hits        20310    20313    +3     
+ Misses       3011     3008    -3     
