loops, `host.loop` cleanup semantics and operation ordering #1521

0x5a17ed · 2026-01-05T17:46:27Z

Hey,

I wanted to use a loop in my pyinfra code. Having some experience with Ansible and knowing that loops and conditionals can be tricky in this context, I searched around in the documentation and codebase.

I found host.loop via a Google result that led to what turned out to be outdated pyinfra 2.x documentation. The page didn't make it immediately obvious that it was for an outdated version but the problem it explained sounds real to me. Perhaps a banner warning about outdated information would help future visitors?

Anyway, I started using host.loop and noticed it was missing type annotations, because my code editor stopped suggesting completions. So I started adding type annotations for host.loop to get auto-completion working in my editor again. While doing that, I noticed a potential issue, but before diving into the details, I have a preliminary question.

Is `host.loop` still a supported feature?

I noticed that host.loop was documented in the pyinfra 2.x documentation but this section was removed in 3.x. The current documentation doesn't mention loops at all. Is host.loop deprecated or being phased out? If so, the rest of this message might be moot.

The problem I found

Looking at the current implementation:

pyinfra/src/pyinfra/api/host.py, lines 140 to 145 in 5291e90:

    def loop(self, iterable):
        self.loop_position.append(0)
        for i, item in enumerate(iterable):
            self.loop_position[-1] = i
            yield item
        self.loop_position.pop()

If a loop exits early via break, return, or an exception, self.loop_position.pop() is never executed. The generator is left suspended at the yield, and when it is later closed or garbage collected, Python raises GeneratorExit at that yield, so the line after the for loop is skipped and the entry stays in loop_position.

Note: wrapping the generator body in try/finally doesn't fully solve this either, unfortunately. The finally block only runs when the generator is closed or collected, and Python's for loop does not call .close() on break. In CPython, collection usually happens immediately due to reference counting, but this is an implementation detail, not guaranteed behavior. This is a known limitation that has a proposed fix, but it was deferred.
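A minimal reproduction of both variants, as a sketch: `LeakyHost` and `FinallyHost` are hypothetical stand-ins for pyinfra's `Host` that keep only the `loop_position` list, and `break_early` is a helper invented for this example. The generator is closed explicitly with `.close()` so the result does not depend on when garbage collection runs.

```python
class LeakyHost:
    """Stand-in for Host with the generator-based loop quoted above."""

    def __init__(self):
        self.loop_position = []

    def loop(self, iterable):
        self.loop_position.append(0)
        for i, item in enumerate(iterable):
            self.loop_position[-1] = i
            yield item
        self.loop_position.pop()  # skipped when the generator is closed early


class FinallyHost(LeakyHost):
    """Same loop, with the pop moved into a finally block."""

    def loop(self, iterable):
        self.loop_position.append(0)
        try:
            for i, item in enumerate(iterable):
                self.loop_position[-1] = i
                yield item
        finally:
            # Runs on exhaustion, on close(), or when the generator is collected
            self.loop_position.pop()


def break_early(host):
    """Break out of a loop while the generator is still referenced, then close it."""
    gen = host.loop(["a", "b", "c"])
    for item in gen:
        if item == "b":
            break
    after_break = list(host.loop_position)
    gen.close()  # what garbage collection would eventually trigger
    after_close = list(host.loop_position)
    return after_break, after_close


print(break_early(LeakyHost()))    # entry is still present after close()
print(break_early(FinallyHost()))  # entry is removed once close() runs
```

In both variants the entry is still present right after the break. Only the try/finally variant removes it, and only once the generator is actually closed.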

Why this might matter

The loop_position feeds into solve_operation_consistency:

pyinfra/src/pyinfra/api/operation.py, lines 410 to 441 in 5291e90:

    def solve_operation_consistency(names, state, host):
        # Operation order is used to tie-break available nodes in the operation DAG, in CLI mode
        # we use stack call order so this matches as defined by the user deploy code.
        if pyinfra.is_cli:
            op_order = get_operation_order_from_stack(state)
        # In API mode we just increase the order for each host
        else:
            op_order = [len(host.op_hash_order)]

        if host.loop_position:
            op_order.extend(host.loop_position)

        # Make a hash from the call stack lines
        op_hash = make_hash(op_order)

        # Avoid adding duplicates! This happens if an operation is called within
        # a loop - such that the filename/lineno/code _are_ the same, but the
        # arguments might be different. We just append an increasing number to
        # the op hash and also handle below with the op order.
        duplicate_op_count = 0
        while op_hash in host.op_hash_order:
            logger.debug("Duplicate hash ({0}) detected!".format(op_hash))
            op_hash = "{0}-{1}".format(op_hash, duplicate_op_count)
            duplicate_op_count += 1

        host.op_hash_order.append(op_hash)

        if duplicate_op_count:
            op_order.append(duplicate_op_count)

        op_order = tuple(op_order)
        logger.debug(f"Adding operation, {names}, opOrder={op_order}, opHash={op_hash}")
        return op_order, op_hash

if host.loop_position:
    op_order.extend(host.loop_position)

A leaked entry means subsequent operations (outside the loop) would still include a stale loop position in their op_order, potentially affecting operation ordering or hashing.
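To make the effect concrete, here is a small model of just those two lines. It is a sketch, not pyinfra code: `build_op_order` is a hypothetical helper, and the base order `[7]` is an arbitrary stand-in for whatever the CLI or API branch would have produced.

```python
def build_op_order(base_order, loop_position):
    """Mirror the two quoted lines: append any loop positions to the base order."""
    op_order = list(base_order)
    if loop_position:
        op_order.extend(loop_position)
    return tuple(op_order)


loop_position = []  # stand-in for host.loop_position

# A loop that is abandoned on its third item and never cleaned up
loop_position.append(0)
loop_position[-1] = 2

# An operation defined after the loop still picks up the stale entry
stale_order = build_op_order([7], loop_position)

# What the same operation would get if the entry had been popped
loop_position.pop()
clean_order = build_op_order([7], loop_position)

print(stale_order, clean_order)
```

The operation outside the loop ends up with an extra trailing element, so both its position in any tuple-based ordering and the hash derived from `op_order` differ from the clean case.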

My attempted solution for the cleanup problem

I refactored Host.loop to return a HostLoop class that:

- Implements both the iterator and context manager protocols
- Tracks its index in loop_position at creation time
- Uses index-based updates (loop_position[self._index] = i) instead of loop_position[-1]
- Uses a weakref finalizer for best-effort cleanup of abandoned loops
- Sets a None sentinel on cleanup instead of popping (to preserve indices)
- Compacts trailing None values lazily

This allows both usage patterns:

# Context manager (deterministic cleanup)
with host.loop(items) as loop:
    for item in loop:
        break

# Legacy (best-effort cleanup via weakref)
for item in host.loop(items):
    break
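For readers who want something concrete to react to, the following is one possible shape for such a class. It is an illustrative sketch written from the list of properties above, not the implementation referred to in this post. The helper `_release`, the `close()` method, and the stripped-down `Host` are all assumptions made for this example.

```python
import weakref


def _release(positions, index):
    """Mark one slot as finished, then drop trailing finished slots."""
    positions[index] = None
    while positions and positions[-1] is None:
        positions.pop()


class HostLoop:
    """Iterator and context manager that owns one slot in loop_position."""

    def __init__(self, positions, iterable):
        self._positions = positions
        self._index = len(positions)  # slot is fixed at creation time
        positions.append(0)
        self._items = enumerate(iterable)
        # The callback must not reference self, or the object could never be collected
        self._finalizer = weakref.finalize(self, _release, positions, self._index)

    def __iter__(self):
        return self

    def __next__(self):
        if not self._finalizer.alive:  # already cleaned up
            raise StopIteration
        try:
            i, item = next(self._items)
        except StopIteration:
            self.close()
            raise
        self._positions[self._index] = i
        return item

    def close(self):
        self._finalizer()  # weakref.finalize runs the callback at most once

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # never swallow exceptions


class Host:
    """Hypothetical stand-in that keeps only the loop bookkeeping."""

    def __init__(self):
        self.loop_position = []

    def loop(self, iterable):
        return HostLoop(self.loop_position, iterable)


host = Host()
with host.loop(["a", "b"]) as loop:
    for item in loop:
        break
print(host.loop_position)  # slot released on leaving the with block
```

Because `weakref.finalize` guarantees the callback runs only once, explicit `close()`, `__exit__`, exhaustion, and garbage collection can all funnel into the same cleanup path without double-releasing a slot.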

The edge cases this revealed

Out-of-order cleanup: a parent loop can be abandoned while a child loop continues:

parent = host.loop(items)
next(parent)  # loop_position = [0]

child = host.loop(other_items)
next(child)  # loop_position = [0, 0]

del parent   # loop_position = [None, 0]
next(child)  # loop_position = [None, 1]

Sentinel values in op_order: with None sentinels, op_order could contain None values. Alternatively, abandoned loops could retain their last reported position, which would mimic current behavior while still allowing eventual cleanup.
Loop identity: currently a loop's identity is just its index in loop_position. If two separate loops happen to be at the same index and position, they'd contribute identical values to op_order. Would sequential loop IDs help distinguish them?
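One practical consequence of the sentinel choice, sketched below under the assumption that `op_order` tuples are compared or sorted somewhere downstream (the quoted comment says they are used to tie-break nodes, but the comparison code itself is not shown here). The three helper functions are hypothetical and exist only for this comparison.

```python
def op_order_with_none(base, positions):
    """Keep None sentinels in the resulting tuple."""
    return tuple(base) + tuple(positions)


def op_order_filtered(base, positions):
    """Drop None sentinels before building the tuple."""
    return tuple(base) + tuple(p for p in positions if p is not None)


def can_sort(orders):
    """Report whether a list of op_order tuples can be sorted at all."""
    try:
        sorted(orders)
    except TypeError:
        return False
    return True


abandoned_parent = [None, 1]  # parent slot released, child still running
live_parent = [0, 1]          # both loops still running

with_none = [
    op_order_with_none([3], abandoned_parent),
    op_order_with_none([3], live_parent),
]
filtered = [
    op_order_filtered([3], abandoned_parent),
    op_order_filtered([3], live_parent),
]

print(can_sort(with_none), can_sort(filtered))
```

Python 3 refuses to order `None` against an `int`, so tuples containing the sentinel cannot be sorted next to tuples containing real positions. Filtering avoids that, but it changes the tuple length, so an operation in a nested loop can look as if it ran in a top-level loop. Retaining the last reported position avoids both problems at the cost of keeping stale values around.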

The questions left in my head

1. So is host.loop still supported, or is it deprecated?
2. If supported: is leaking loop_position entries a real problem that needs fixing?
3. If it needs fixing: is requiring `with host.loop(...):` for deterministic cleanup an acceptable API change?
4. Should abandoned loops contribute to op_order? If so, what value? Should it be None, their last position, or filtered out entirely?
5. Would sequential loop IDs add value, or is position sufficient?

Happy to share my implementation if it would help the discussion. Looking forward to your thoughts and insights on this matter.
