Skip to content

Conversation

@gibfahn
Copy link

@gibfahn gibfahn commented Oct 21, 2025

  • test(rules_py): add E2E test for concurrent venv creation

    Add test to verify that multiple processes can safely use the same
    py_binary target simultaneously without encountering race conditions
    during venv creation.

    The test spawns 5 concurrent processes executing the same binary and
    validates that:

    • All processes complete successfully without errors
    • No import failures occur during concurrent venv initialization
    • No directory creation conflicts arise from simultaneous access

    This validates the file-locking mechanism that prevents race conditions
    when multiple instances try to create or access the same venv.

  • fix(rules_py): implement file-based locking for venv creation

    Prevent race conditions when multiple processes attempt to create or
    access the same Python venv simultaneously. Without locking, concurrent
    executions cause errors like 'cannot import package' and 'cannot create
    directory'.

    Implementation:

    • Add file-lock dependency (v2.1.11) for POSIX advisory locks via fcntl()
    • Introduce VenvLock struct that acquires exclusive locks on venv location
    • Lock blocks concurrent processes until available, auto-released on drop
    • Add is_venv_valid() to verify existing venvs are usable
    • Apply locking in both create_venv() and create_empty_venv()
    • Skip venv recreation if valid venv already exists

    When the second process acquires the lock, it detects the existing valid
    venv and skips recreation, eliminating race conditions entirely.

Add test to verify that multiple processes can safely use the same
py_binary target simultaneously without encountering race conditions
during venv creation.

The test spawns 5 concurrent processes executing the same binary and
validates that:
- All processes complete successfully without errors
- No import failures occur during concurrent venv initialization
- No directory creation conflicts arise from simultaneous access

This validates the file-locking mechanism that prevents race conditions
when multiple instances try to create or access the same venv.
Prevent race conditions when multiple processes attempt to create or
access the same Python venv simultaneously. Without locking, concurrent
executions cause errors like 'cannot import package' and 'cannot create
directory'.

Implementation:
- Add file-lock dependency (v2.1.11) for POSIX advisory locks via fcntl()
- Introduce VenvLock struct that acquires exclusive locks on venv location
- Lock blocks concurrent processes until available, auto-released on drop
- Add is_venv_valid() to verify existing venvs are usable
- Apply locking in both create_venv() and create_empty_venv()
- Skip venv recreation if valid venv already exists

When the second process acquires the lock, it detects the existing valid
venv and skips recreation, eliminating race conditions entirely.
@CLAassistant
Copy link

CLAassistant commented Oct 21, 2025

CLA assistant check
All committers have signed the CLA.

@aspect-workflows
Copy link

aspect-workflows bot commented Oct 21, 2025

test-os:linux-bzl:7 (Test)

33 test targets passed

Targets
//examples/multi_version:py_version_default_test [k8-fastbuild]                           2s
//examples/multi_version:py_version_test [k8-fastbuild-ST-494921797612]                   1s
//examples/pytest:pytest_test [k8-fastbuild]                                              2s
//examples/pytest:sharded/test [k8-fastbuild]                                             2s
//examples/virtual_deps:pytest_test [k8-fastbuild]                                        2s
//py/private/py_venv:test_link [k8-fastbuild]                                             223ms
//py/tests/cc-deps:test_smoke [k8-fastbuild]                                              488ms
//py/tests/external-deps/sibling-package:test_test [k8-fastbuild]                         44ms
//py/tests/external-deps:test_0_test [k8-fastbuild]                                       68ms
//py/tests/external-deps:test_1_test [k8-fastbuild]                                       39ms
//py/tests/external-deps:test_can_import_runfiles_helper [k8-fastbuild]                   605ms
//py/tests/internal-deps:assert [k8-fastbuild]                                            450ms
//py/tests/py-binary:runfiles_from_pip_test [k8-fastbuild]                                772ms
//py/tests/py-binary:test__main_from_name [k8-fastbuild]                                  35ms
//py/tests/py-binary:test__run_py_test_help [k8-fastbuild]                                40ms
//py/tests/py-external-venv:test [k8-fastbuild]                                           190ms
//py/tests/py-internal-venv:test [k8-fastbuild]                                           141ms
//py/tests/py-pex-binary:test__print_modules_pex [k8-fastbuild]                           30ms
//py/tests/py-test:test_env_vars [k8-fastbuild]                                           445ms
//py/tests/py-venv-disable-systemsite:test [k8-fastbuild]                                 127ms
//py/tests/py-venv-disable-usersite:test [k8-fastbuild]                                   111ms
//py/tests/py-venv-enable-site:test [k8-fastbuild]                                        99ms
//py/tests/py-venv-standalone-interpreter:test [k8-fastbuild]                             113ms
//py/tests/py_image_layer:my_app_layers_test_test [k8-fastbuild]                          58ms
//py/tests/py_image_layer:py_image_test [k8-fastbuild]                                    4s
//py/tests/py_venv_conflict:py_venv_conflict [k8-fastbuild]                               30ms
//py/tests/py_venv_conflict:validate_import_roots [k8-fastbuild]                          251ms
//py/tests/py_venv_image_layer:my_app_amd64_layers_test [k8-fastbuild]                    70ms
//py/tests/py_venv_image_layer:my_app_arm64_layers_test [k8-fastbuild]                    52ms
//py/tests/py_venv_image_layer:py_amd64_image_command_test [k8-fastbuild]                 4s
//py/tests/py_venv_image_layer:py_amd64_image_content_test [k8-fastbuild]                 582ms
//py/tests/py_venv_image_layer:py_arm64_image_content_test [k8-fastbuild]                 3s
//py/tests/repo_relative_imports/test:test [k8-fastbuild]                                 448ms

Total test execution time was 26s. 6 tests (15.4%) were fully cached saving 1m.


test-os:linux-bzl:7 (Test)

e2e/cross-repo-610

1 test target passed

Targets
//:test [k8-fastbuild]                                                                    86ms

Total test execution time was 86ms. 1 test (50.0%) was fully cached saving 3s.


test-os:linux-bzl:7 (Test)

e2e/interpreter-version-541

6 test targets passed

Targets
//:test_3_11_classic [k8-fastbuild]                                                       343ms
//:test_3_11_venv [k8-fastbuild]                                                          645ms
//:test_3_12_classic [k8-fastbuild-ST-68426eee3a1d]                                       233ms
//:test_3_12_venv [k8-fastbuild-ST-68426eee3a1d]                                          100ms
//:test_3_13_classic [k8-fastbuild-ST-3f862e60c079]                                       641ms
//:test_3_13_venv [k8-fastbuild-ST-3f862e60c079]                                          404ms

test-os:linux-bzl:7 (Test)

e2e/repository-rule-deps

2 test targets passed

Targets
//:all_direct [k8-fastbuild]                                                              271ms
//:test [k8-fastbuild]                                                                    279ms

test-os:linux-bzl:7 (Test)

e2e/smoke

1 test target passed

Targets
//:bin [k8-fastbuild]                                                                     253ms

test-os:linux-bzl:7 (Test)

e2e/system-interpreter

1 test target passed

Targets
//:bin [k8-fastbuild]                                                                     110ms

test-os:linux-bzl:7 (Test)

examples/uv_pip_compile

All tests were cache hits

1 test (100.0%) was fully cached saving 286ms.

@arrdem
Copy link
Collaborator

arrdem commented Oct 21, 2025

Hey @gibfahn, thanks for the PR.

While you're of course welcome to hack this ruleset to fit your needs, this PR isn't aligned with the project roadmap which is to ditch the current dynamic venv machinery entirely in favor of prebuilt venvs (see the new py_venv_* unstable code). As you've addressed here there are fundamental correctness issues with trying to run the current "binaries" concurrently, also discussed in #634 and #628.

My concern with implementing any solution like this is that -- see https://github.com/aspect-build/rules_py/pull/680/files#diff-71813fcee039f8c5ecbb0ea45a087af012c9e988132ce27075749cc2663f7968R368 -- it creates for the py[_runtime]_binary artifact class the invalidation problem. When does the venv exist but also need to be rebuilt? Do we do a tree hash? Ignore the problem?

Since the only reason this implementation of py_binary builds venvs dynamically is to work around unstable Python interpreter paths which we have a mature solution to in py_venv_binary and the only correct answer to the venv tree invalidation problem seems to be not having it by prebuilding, I don't see a reason to accept a change like this which introduces more potential for incorrectness in a binary type I'm working on migrating this ruleset away from.

Also tactically and with apologies I'm in the middle of ripping out the WORKSPACE file which involves overhauling (removing) many of the e2e tests so mechanically this isn't something I can just accept either.

@gibfahn
Copy link
Author

gibfahn commented Nov 7, 2025

Fair enough, in that case I'll close this. Thank you for the detailed explanation.

@gibfahn gibfahn closed this Nov 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants