Hey, I really like the idea and I can see how it could improve the quality of certain simulations. I'll answer this with a little bias, since I worked mostly on mesa-llm, and I think that there is a link.
Parallel with mesa-llm
In mesa-llm, we use internal state management and observation, and there are obvious parallels between those and this proposal.
About the open questions
I think that these two questions are quite linked. In my experience, people download a library for a precise usage (especially a library like Mesa that is used mostly for academic/research/scientific purposes). Building a lot of complex, very rich abstractions might discourage the user, especially since what they do can probably be hardcoded as long as there are not too many constraints/actions to take into account. So my take on this would be that providing simple building blocks would be easier for a new user (and by extension most people using this lib) to take in. However, I think that observation and actions intertwine most of the time, so making something that handles both perception/information constraints and actions could be interesting if we manage to keep it simple and understandable enough for the user.
The way that we handled it in mesa-llm (when step is not async - but it's never the case in base Mesa anyway) was a kind of first come, first served: we make each agent collect the info right before its step (so after all other agents' steps), so there are no conflicts, since all the info is up to date (grid, context and everything else).
I don't really get the question. It will probably have to use integration hooks to collect info from the Space classes, but I don't really see how you could integrate it into one, especially since ActionSpace is designed to manage info that is not always linked to space...
PossibilitySpace/FeasibilitySpace (from game theory) -> I like these better than ActionSpace, I think (especially PossibilitySpace); they illustrate better the fact that info, and not just action, is taken into account.
Other considerations
Overall, I think that it's a good idea provided that we find good use cases for it, and I'll put some more thought into it as soon as I have a bit more time. My main concern is about ease of use (cf. About the open questions above), but it looks manageable.
I asked an LLM with web search about related concepts. It found these:
A lot of stuff seems valid. https://gymnasium.farama.org/api/spaces/ indeed has action and observation spaces (didn’t know this, at least not consciously):
So I think we can learn a lot from RL, while making sure it stays more widely applicable to ABM. Affordances are also an interesting concept; they seem to imply that (some estimation of) return on investment is known. Influence–Reaction I should dive into. Tasks (#2526) should also tie in here somehow.
Currently, all Agent behavior is programmed in lines of Python, often as relatively simple if-else statements. With things like Tasks and ContinuousStates we try to define the temporal element of it, but we actually have to think about the other dimensions as well.
Currently, by default, Agents have infinite "power": they can do anything to any agent or environment they like. They can also know or request any information they like. In Mesa we don't have a common way to limit this influence, either in the action (output) or the information (input) dimension.
Defining limits/boundaries on the amount of influence agents have seems an obvious capability to have. A way to do that is to model a "sphere of influence" in which Agents can act, which I will (for now) call the ActionSpace. A similar parallel could be drawn with the information space.
An ActionSpace is a space in the sense that it is a multi-dimensional outcome space whose dimensions can influence each other: if you are at the limit of one dimension, it might influence what's possible in other dimensions.
This space should be able to be (easily) defined in Mesa, and checked by the agent execution engine. This will also facilitate things like learning agents and RL, since they can then optimize within their ActionSpace.
An ActionSpace should also be definable based on conditions: the agent's internal state, the environment, and other agents' states. Potentially, time could also tie into the ActionSpace.
It might also be useful to (sometimes) see actions as incurring a certain cost (in resources or time), which can both limit the ActionSpace and potentially also be tracked somewhere else. ActionSpaces should probably be combinable, and/or it should be possible to define multiple ones per Agent. Maybe Mesa can handle which ActionSpace applies when and combine them if needed.
When an intended action is outside the ActionSpace, it should be possible to scale it down or map it to the nearest point that is inside it. And ideally the ActionSpace should be transparent, so that an Agent can make a decision based on the current exact ActionSpace.
As a simple example (of my favourite model for "real" stuff): a wolf is hunting a sheep. But it can't move infinitely fast or change direction infinitely fast. There are physical limits (a first ActionSpace). However, the practical speed is also limited by the wolf's state, like health, age and condition; by environmental factors, like foliage and elevation; and maybe even by external factors. That would be a second ActionSpace.
So some tension exists between a "hard" ActionSpace with clear limits (laws of physics, game rules, etc.) and "soft" ActionSpaces, which burn up resources (energy, money, time) but do scale. I don't have a clear view of how these relate to each other, but maybe the soft ActionSpace is a specific version of the hard one? Costs can occur everywhere in the ActionSpace, while the maximum costs or hard limits define the actual boundaries. Or one has limits and the other scaling laws?
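As a toy illustration of combinability, two one-dimensional spaces could be intersected, with the tighter limit winning; `SpeedSpace` and its `combine` method below are made up purely for illustration, not Mesa API:

```python
from dataclasses import dataclass


@dataclass
class SpeedSpace:
    """Toy one-dimensional ActionSpace: just a speed limit."""

    max_speed: float

    def combine(self, other: "SpeedSpace") -> "SpeedSpace":
        # Combining by intersection: the tighter limit wins.
        return SpeedSpace(min(self.max_speed, other.max_speed))


physical = SpeedSpace(max_speed=15.0)    # hard limit: laws of physics
practical = SpeedSpace(max_speed=9.0)    # soft limit: health, terrain, ...
effective = physical.combine(practical)  # effective.max_speed == 9.0
```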
Below is an LLM-powered, semi-curated expansion of this idea (let me know if this is useful).
ActionSpace: Defining Agent Capabilities and Influence
Motivation
Currently in Mesa, agents have unlimited power by default - they can do anything to any agent or environment, access any information, with no built-in constraints. All behavior is programmed directly in Python with if-else statements. While we're developing temporal aspects through Tasks and ContinuousStates, we haven't addressed the spatial, resource, or capability dimensions of agent behavior.
Defining limits on agent influence seems like an obvious and necessary capability. A way to model this is through an "ActionSpace" - a multi-dimensional space defining what actions an agent can feasibly take and at what cost. The dimensions can interact: being at the limit in one dimension affects what's possible in others. For example, a wolf hunting a sheep can't move infinitely fast or change direction instantly due to physical limits (hard constraints), but practical speed is also limited by the wolf's health, age, terrain, and energy levels (soft constraints).
This concept would facilitate learning agents and reinforcement learning by providing a clear optimization space, enable more realistic agent behaviors through capability constraints, and provide a standardized way to model resource limitations and physical boundaries.
Core Concept
An ActionSpace is fundamentally about defining two things: what is possible (hard constraints - physics, rules, accessibility) and what is practical (soft constraints - resource costs, difficulty, risk). These relate hierarchically: soft constraints operate within hard boundaries. You can't even consider the cost of an impossible action. This mirrors the ecological distinction between fundamental niche (physiologically possible) and realized niche (actually used given competition and resources).
The ActionSpace should be both enforceable (framework validates actions) and transparent (agents can query their current capabilities). This transparency enables intelligent decision-making: agents can ask "what's my maximum speed right now?" or "can I afford this hunt?" before acting. When agents attempt infeasible actions, the ActionSpace can automatically project them to the nearest valid action, enabling graceful degradation rather than hard failures.
A Paradigm Shift: From Rules to Constraints
ActionSpace enables a fundamental shift in how we think about agent behavior in ABMs. Traditionally, we define exact rules: "if hungry and see food, then eat" or "if energy < 20, then rest." This prescriptive approach specifies precisely what agents should do in each situation. ActionSpace enables a descriptive approach: define what's possible and let agents (or learning algorithms, or optimization) figure out what to actually do within those boundaries.
Consider the difference:
Traditional rule-based approach:
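A hedged sketch of what this could look like in plain Mesa today (assuming Mesa 3-style `Agent(model)` construction; the wolf's attributes and thresholds are made up):

```python
from mesa import Agent


class RuleBasedWolf(Agent):
    """Prescriptive rules: the modeler enumerates every situation."""

    def __init__(self, model):
        super().__init__(model)
        self.energy = 100

    def step(self):
        if self.energy < 20:
            self.energy += 10  # rest
        elif self.sees_prey() and self.energy > 40:
            self.energy -= 15  # chase, at a hard-coded cost
        else:
            self.energy -= 1  # wander

    def sees_prey(self) -> bool:
        return self.random.random() < 0.3  # stand-in for real perception
```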
Constraint-based approach:
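And a sketch of the same wolf under the constraint-based approach (the `max_speed` property stands in for a queryable ActionSpace; nothing here is existing Mesa API):

```python
from mesa import Agent


class ConstraintBasedWolf(Agent):
    """The ActionSpace bounds what is possible; the decision logic just
    picks an action within those bounds."""

    def __init__(self, model):
        super().__init__(model)
        self.energy = 100

    @property
    def max_speed(self) -> float:
        # Descriptive constraint: a hard physical cap, scaled by a soft,
        # energy-dependent factor. No behavioral if-else branches needed.
        hard_limit = 10.0
        return hard_limit * max(0.0, min(1.0, self.energy / 100))

    def step(self):
        # Any decision mechanism (heuristic, RL policy, ...) chooses within
        # the current space; here we simply go as fast as currently possible.
        speed = self.max_speed
        self.energy -= speed  # soft constraint: acting burns energy
```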
This shift has profound implications. It enables learning agents to discover emergent strategies rather than following programmed rules. Models become less brittle - as conditions change, the ActionSpace adapts automatically rather than requiring new if-else branches. It more accurately reflects reality: real agents operate under constraints, not hard-coded rules. A wolf doesn't follow "if X then Y" logic; it tries to catch prey subject to physical limitations, energy availability, and environmental conditions.
Perhaps most importantly, it separates concerns: the modeler defines the boundaries of possibility (often based on well-understood physical or biological constraints), while the mechanism for choosing within those boundaries can vary - from simple heuristics to sophisticated learning algorithms. This makes models more modular, testable, and scientifically grounded. You can validate that constraints match reality independently from whether the decision-making logic is appropriate.
This paradigm shift is particularly powerful when combined with ContinuousStates and Tasks. Instead of manually programming when energy is too low to hunt, the ActionSpace automatically reflects current energy levels. The agent's decision system simply optimizes within whatever space is currently feasible. The result is more realistic, adaptive behavior emerging from simpler, more maintainable code.
Scientific Foundations
This concept has deep roots across multiple disciplines. In game theory and economics, it relates to strategy spaces, feasible sets, and budget constraints. Control theory and robotics use configuration spaces and reachable sets. Reinforcement learning depends on well-defined action spaces - making Mesa models directly compatible with RL libraries like Gymnasium would be a significant benefit. The ecological parallel to fundamental versus realized niches is particularly apt, as it captures both the hard physiological limits and the soft practical constraints that emerge from competition and resource availability.
Design Tensions
Hard vs. Soft Constraints: These appear to be fundamentally different concepts that should be kept separate but composed together. Hard constraints answer "can I do X?" with a boolean, while soft constraints answer "how much does X cost?" with resource amounts. The implementation should make this distinction explicit:
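A minimal sketch of that separation (all class names here are illustrative assumptions, not Mesa API):

```python
from dataclasses import dataclass


class HardConstraint:
    """Answers "can I do X?" with a boolean."""

    def allows(self, action: dict) -> bool:
        raise NotImplementedError


@dataclass
class MaxSpeed(HardConstraint):
    limit: float

    def allows(self, action: dict) -> bool:
        return action.get("speed", 0.0) <= self.limit


class SoftConstraint:
    """Answers "how much does X cost?" with resource amounts."""

    def cost(self, action: dict) -> dict:
        raise NotImplementedError


@dataclass
class EnergyCost(SoftConstraint):
    per_unit_speed: float

    def cost(self, action: dict) -> dict:
        return {"energy": action.get("speed", 0.0) * self.per_unit_speed}
```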
Enforcement vs. Convention: Should the framework actively prevent constraint violations, or provide tools for agents to check themselves? Both have merit. Framework enforcement guarantees correctness and is essential for learning agents that don't know the rules. Convention-based approaches offer maximum flexibility and zero performance overhead when not used. The solution is to support both: default to convention-based with strong helper methods, but allow opt-in enforcement for specific use cases like RL, validation, or teaching scenarios. Agents can choose per-method or per-model whether to enforce constraints strictly.
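One way to offer opt-in, per-method enforcement is a decorator; a hedged sketch, assuming the agent has an `action_space` attribute with `is_feasible` and `project` methods (both assumptions about a future interface):

```python
import functools


def enforced(method):
    """Opt-in enforcement: validate (and, if needed, project) the intended
    action before the decorated method runs."""

    @functools.wraps(method)
    def wrapper(self, action, *args, **kwargs):
        space = self.action_space  # assumed agent attribute
        if not space.is_feasible(action):  # assumed query method
            action = space.project(action)  # graceful degradation
        return method(self, action, *args, **kwargs)

    return wrapper


# Convention-based agents simply omit the decorator:
# class Wolf(Agent):
#     @enforced
#     def move(self, action): ...
```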
Automatic Projection and Scaling
When an intended action lies outside the ActionSpace, it should be mappable to the nearest feasible point. This is based on projection onto constraint sets from optimization theory. Different strategies serve different needs:
- Component-wise clipping independently constrains each dimension - fast and simple for most cases.
- Uniform scaling preserves the action's "character" by reducing intensity proportionally across all dimensions.
- Geometric projection finds the mathematically optimal nearest point but requires more computation.
- Resource-aware scaling specifically handles budget constraints by scaling actions to fit available resources.
The key insight is that hard and soft constraints need different projection strategies. Hard constraints require strict satisfaction (you can't violate physics), while soft constraints benefit from scaling to available resources. A practical implementation handles hard constraints first, then scales for resources:
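A hedged sketch of that two-stage projection (assuming, for the scaling step, that costs grow roughly linearly with action magnitude):

```python
def project(action: dict, hard_limits: dict, available: dict, costs) -> dict:
    """Clip to hard limits first, then scale uniformly to fit the budget."""
    # Stage 1: hard constraints must be strictly satisfied -> clip.
    clipped = {k: min(v, hard_limits.get(k, v)) for k, v in action.items()}

    # Stage 2: soft constraints scale the action to available resources.
    needed = costs(clipped)  # e.g. {"energy": 75.0}
    factors = [available[r] / needed[r] for r in needed if needed[r] > 0]
    scale = min(1.0, min(factors)) if factors else 1.0
    return {k: v * scale for k, v in clipped.items()}


# {"speed": 20} is clipped to 15, then scaled by 30/75 -> {"speed": 6.0}
result = project(
    {"speed": 20.0},
    hard_limits={"speed": 15.0},
    available={"energy": 30.0},
    costs=lambda a: {"energy": a["speed"] * 5.0},
)
```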
Transparent and Queryable ActionSpace
Making the ActionSpace fully introspectable is crucial for intelligent decision-making. Agents should be able to query their current exact capabilities before choosing actions. A comprehensive query interface includes checking if specific actions are feasible, getting exact resource costs, finding boundary limits for action dimensions, enumerating all currently available actions, and sampling random feasible actions for exploration.
For continuous action spaces, agents should be able to ask "what's my maximum speed given current conditions?" or "how far can I move with my remaining energy?" This enables proactive planning and adaptive behavior. For example:
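A sketch of how such queries might drive a decision; every query-method name here (`is_feasible`, `cost`, `max`) is an assumption about a future interface, not existing Mesa API:

```python
def plan_step(wolf, sheep):
    """Pick an action by querying the current ActionSpace first."""
    hunt = {"action": "hunt", "target": sheep}
    if wolf.action_space.is_feasible(hunt):  # "can I do X?"
        cost = wolf.action_space.cost(hunt)  # "what does X cost?"
        if cost["energy"] < 0.5 * wolf.energy:  # keep an energy reserve
            return hunt
    # Otherwise fall back to the fastest currently feasible move.
    return {"action": "move", "speed": wolf.action_space.max("speed")}
```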
The ActionSpace should also support visualization and inspection for debugging and analysis, allowing modelers to see exactly what's constraining their agents at any given time.
Integration with Mesa Features
ActionSpace connects naturally with Mesa's evolving feature set. With Tasks, it defines what tasks are even possible to start given current capabilities. With ContinuousStates, it creates dynamic constraints where depleting energy automatically reduces the ActionSpace. For learning agents, it provides the structured environment that RL algorithms require - agents learn to optimize within their ActionSpace rather than learning constraints through trial and error.
The integration could look like:
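A hedged sketch of that integration; `ContinuousState`, `ActionSpace`, `EnergyCost`, and the task interface below are illustrative stand-ins, not settled Mesa interfaces:

```python
from mesa import Agent


class Wolf(Agent):
    def __init__(self, model):
        super().__init__(model)
        # ContinuousState: energy decays automatically over time.
        self.energy = ContinuousState(initial=100, rate=-1)

    @property
    def action_space(self):
        # Dynamic constraint: depleting energy shrinks the space by itself,
        # without any extra if-else branches in the behavior code.
        return ActionSpace(
            hard={"speed": 10.0},  # physical limit
            soft={"speed": EnergyCost(per_unit_speed=2.0)},
            scale=self.energy.value / 100,  # condition-dependent factor
        )

    def can_start(self, task) -> bool:
        # Tasks check feasibility against the current ActionSpace.
        return self.action_space.is_feasible(task.required_action)
```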
Open Questions
Several important design questions remain. Should ActionSpace also handle perception/information constraints, or remain focused on actions? How do we handle multi-agent conflicts when ActionSpaces overlap - who gets priority when two agents want the same cell? What's the right balance between generality and usability - should we provide rich abstractions or simple building blocks? How does this relate to Mesa's existing Space classes (Grid, Network) - should it integrate or remain independent?
The naming also deserves consideration. "ActionSpace" might be confused with the spatial Space classes. Alternatives include CapabilitySpace (broader), InfluenceSpace (emphasizes the sphere-of-influence idea), ConstraintSpace (emphasizes the limiting aspect), or PossibilitySpace/FeasibilitySpace (from game theory).
Implementation Recommendation
Start with a minimal but complete v1 focused on single-agent scenarios with clear hard/soft constraint separation. Make it convention-based by default with opt-in enforcement. Provide strong query capabilities and basic projection strategies. Show integration with one other Mesa feature (probably ContinuousStates or Tasks). Then expand based on real usage patterns - add multi-agent coordination, more sophisticated projection algorithms, and richer constraint types only as needed.
The goal is to provide a foundational framework that makes realistic agent constraints easy to model while maintaining Mesa's philosophy of flexibility and gradual adoption. By making ActionSpace both transparent and auto-correcting, we enable agents to make intelligent decisions while protecting against errors.
Discussion
This is a significant addition to Mesa's behavioral modeling capabilities. What are your thoughts on the core concept? Does the hard/soft constraint separation make sense? Is the combination of transparency and automatic projection the right approach? What use cases would benefit most from this feature? What concerns or alternative approaches should we consider?