Mctp bridge support #71


Open: faizana-nvidia wants to merge 7 commits into main from mctp-bridge-support

Conversation

faizana-nvidia
Contributor

This PR introduces ALLOCATE_ENDPOINT_ID message support, along with an MCTP bridge endpoint type, into the existing peer structure.

@jk-ozlabs
Member

Thanks for the contribution! I'll get to a proper review shortly.

I have some pending changes that rework a lot of the peer, link and network allocation mechanisms. That shouldn't affect your code too much, but I'll request a rebase once that is merged.

@jk-ozlabs jk-ozlabs self-assigned this Apr 25, 2025
@faizana-nvidia
Contributor Author

Thanks for the contribution! I'll get to a proper review shortly.

I have some pending changes that rework a lot of the peer, link and network allocation mechanisms. That shouldn't affect your code too much, but I'll request a rebase once that is merged.

Sure, no problem.

Member

@jk-ozlabs jk-ozlabs left a comment

So the main design point here is how we're handling the pool allocations. It looks like your particular use-case is around static allocations, which I'll focus on here.

As I mentioned in the dbus changes, we cannot add arguments without further version-compatibility changes. After a bit of chatting with the team, I think a better approach would be to add a new dbus call to explicitly allocate a bridge and a predefined pool (which would include the pool size). Perhaps something like:

AllocateBridgeStatic(addr: ay, pool_start: y, pool_size: y)
  • where the Set Endpoint ID response must match the expected pool size.

(we would also want purely-dynamic pools to be allocated from SetupEndpoint and friends, but that would result in a dynamic pool allocation. This dynamic pool would be defined either by a toml config option, or via a new TMBO dbus interface. However, we can handle those later, I think)

Would that work?
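For illustration, a very rough sd-bus sketch of what such a handler could look like (the handler name and internal plumbing here are hypothetical, not actual mctpd code):

static int method_allocate_bridge_static(sd_bus_message *call, void *data,
                                         sd_bus_error *berr)
{
        const uint8_t *addr;
        size_t addr_len;
        uint8_t pool_start, pool_size;
        int rc;

        /* AllocateBridgeStatic(addr: ay, pool_start: y, pool_size: y) */
        rc = sd_bus_message_read_array(call, 'y', (const void **)&addr,
                                       &addr_len);
        if (rc < 0)
                return rc;

        rc = sd_bus_message_read(call, "yy", &pool_start, &pool_size);
        if (rc < 0)
                return rc;

        /* assign the bridge EID, then check that the pool size requested
         * in the Set Endpoint ID response matches pool_size before
         * issuing Allocate Endpoint IDs for the predefined range */

        return sd_bus_reply_method_return(call, NULL);
}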

Contains one interface (lladdr 0x10, local EID 8), and one endpoint (lladdr 0x1d), that reports support for MCTP control and PLDM.
Contains two interfaces (lladdr 0x10, local EID 8) and (lladdr 0x11, local EID 10), with one endpoint (lladdr 0x1d) and an MCTP bridge (lladdr 0x1e, pool size 2), that reports support for MCTP control and PLDM.
Member

Please don't alter the default sysnet unless it's needed by a significant number of tests (which in this case, it is not). Just set up the test fixtures default as needed for your new test.

Contributor Author

I see, so it's better to update the current interface (lladdr 0x10, local EID 8) with a pool size, and add pool-size-many EIDs to the network to simulate a bridge, rather than creating a new one?

Member

Yep! Just update the interface/endpoints/etc in the individual test case.

@jk-ozlabs
Member

In general, can you add a bit more of an explanation / rationale as part of your commit messages, instead of just log output? There is some good guidance for commit messages up in the "Patch formatting and changelogs" section of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/5.Posting.rst

@jk-ozlabs
Member

We'll also need to consider the routing setup for bridged endpoints. Ideally we would:

  1. create a route for the bridge itself, plus a neighbour entry with the appropriate physical address data
  2. create a range route for the allocated endpoint pool, using the bridge as a gateway for that range (ie, no neighbour entry)

the issue is that there is no kernel support for (2) at present: we need some kernel changes to implement gateway routes. It is possible to create "somewhat-fake" routes for those endpoints, using a neighbour table entry for each (bridged) peer that uses the bridge phys address, but that's a bit suboptimal. I'd prefer not to encode that hack into mctpd if possible.

I do have a todo for the kernel changes necessary for that, sounds like I should get onto it!

@santoshpuranik

We'll also need to consider the routing setup for bridged endpoints. Ideally we would:

1. create a route for the bridge itself, plus a neighbour entry with the appropriate physical address data

2. create a range route for the allocated endpoint pool, using the bridge as a gateway for that range (ie, no neighbour entry)

the issue is that there is no kernel support for (2) at present: we need some kernel changes to implement gateway routes. It is possible to create "somewhat-fake" routes for those endpoints, using a neighbour table entry for each (bridged) peer that uses the bridge phys address, but that's a bit suboptimal. I'd prefer not to encode that hack into mctpd if possible.

I do have a todo for the kernel changes necessary for that, sounds like I should get onto it!

IIUC, 1 is what we can achieve with the tools we have today, right? For ex: add a route to the bridge itself and then mctp route add <downstream eid> via <bridge net if>, essentially adding a neighbour table entry? Would this not continue to work, as from the TMBO point of view all packets go via the bridge route?

When you say sub-optimal, are you referring to the neighbour lookup that happens in net/mctp/route.c? Noob question, how does a gateway impl make that faster?

When is the gateway support in kernel for MCTP nets planned? We can help if you have a design in mind.

@jk-ozlabs
Member

Hi Santosh,

IIUC, 1 is what we can achieve with the tools we have today, right?

Yes, but it requires a lot of workaround to set up.

For ex: add route to the bridge itself and then mctp route add <downstream eid> via <bridge net if>, essentially adding a neighbour table entry?

That isn't adding a neighbour table entry though; just a route. USB is a little different in that there are no neighbour table entries required, because there is no physical addressing.

For a bridge, using this scheme would require:

  1. adding the route to the bridge EID
  2. adding the neighbour entry for the bridge EID
  3. adding individual routes for each EID in the EID pool
  4. adding individual fake neighbour table entries for each EID in the EID pool, which would (incorrectly) represent that the EID has a specific physical address (ie., that of the bridge)

(for USB, we don't need (2) or (4), but that's purely a property of the transport type. We would need those to be supported in mctpd to allow other transport types like i2c).

This would work, but it's messy.

When you say sub-optimal, are you referring to the neighbour lookup that happens in net/mctp/route.c?

No, the neighbour lookups happen in net/mctp/neigh.c.

Noob question, how does a gateway impl make that faster?

Not so much faster, more tidier. With a gateway route, we would require:

  1. adding the route to the bridge EID
  2. adding the neighbour entry for the bridge EID
  3. adding one range route for the entire EID pool, referencing the bridge EID as the gateway

No fake neighbour table entries are required - since the kernel just looks up the gateway physical address from the gateway's neighbour table entry.
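To make that concrete, here is an illustrative view of the resulting state (made-up EIDs and lladdr, not actual tool output), for a bridge at EID 9 with a pool of 10-12:

  route: eid 9      dev mctp0        (bridge itself)
  neigh: eid 9      lladdr 0x1e      (bridge physical address)
  route: eid 10-12  gw 9             (one range route, no neighbour entries)

A transmit to EID 10 would first resolve the gateway (EID 9), then use EID 9's neighbour entry for the physical address.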

When is the gateway support in kernel for MCTP nets planned?

I have it done - will push a development branch shortly.

@jk-ozlabs
Member

I have it done - will push a development branch shortly.

https://github.com/CodeConstruct/linux/tree/dev/forwarding

@santoshpuranik

Hi Jeremy,

Thank you for the detailed response.

Hi Santosh,

IIUC, 1 is what we can achieve with the tools we have today, right?

Yes, but it requires a lot of workaround to set up.

For ex: add route to the bridge itself and then mctp route add <downstream eid> via <bridge net if>, essentially adding a neighbour table entry?

That isn't adding a neighbour table entry though; just a route. USB is a little different in that there are no neighbour table entries required, because there is no physical addressing.

For a bridge, using this scheme would require:

1. adding the route to the bridge EID

2. adding the neighbour entry for the bridge EID

3. adding individual routes for each EID in the EID pool

4. adding individual fake neighbour table entries for each EID in the EID pool, which would (incorrectly) represent that the EID has a specific physical address (ie., that of the bridge)

Ack, I see something like I2C would need a PHY address.

When you say sub-optimal, are you referring to the neighbour lookup that happens in net/mctp/route.c?

No, the neighbour lookups happen in net/mctp/neigh.c.

Ack. I should have said the neigh_lookup call that happens in route.c!

Noob question, how does a gateway impl make that faster?

Not so much faster, more tidier. With a gateway route, we would require:

1. adding the route to the bridge EID

2. adding the neighbour entry for the bridge EID

3. adding one range route for the entire EID pool, referencing the bridge EID as the gateway

No fake neighbour table entries are required - since the kernel just looks up the gateway physical address from the gateway's neighbour table entry.

Thank you, that does seem cleaner.

@jk-ozlabs
Member

And for the userspace changes, my dev/gateway branch here:

https://github.com/CodeConstruct/mctp/tree/dev/gateway

@santoshpuranik

@jk-ozlabs : I think we agree that mctpd has to poll all allocated endpoints with a Get Endpoint ID periodically. I think the first thing we'd need to enable in order to do that is to make MCTP requests and responses asynchronous. Do you have a design in mind to make MCTP requests async (like via a request queue per allocated endpoint)?

@jk-ozlabs
Member

jk-ozlabs commented Jun 3, 2025

I think we agree that mctpd has to poll all allocated endpoints with a Get Endpoint ID periodically

Just as a clarification - not all endpoints, but EIDs within allocated endpoint ranges, which have not yet been enumerated. And this is assuming we expect mctpd to automatically enumerate those bridged devices. I think the latter is reasonable, but we don't have a specific design point around that yet.

With that in mind, yes, we probably want to make that async, as those requests are likely to not have a response, and therefore we're at worst-case waiting time.

In terms of design: we probably don't want a struct peer to be created for those endpoints, as they don't strictly exist as proper peers at that stage. I think a minimal-impact approach may be to keep a set of the allocated (but not-yet-enumerated) ranges, and periodically send the Get Endpoint ID requests.

We don't necessarily need to keep much state for that polling mechanism (ie, between request and response) - receiving a Get Endpoint ID response for anything in that range would trigger the enumeration process.
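As a rough sketch of that minimal-impact shape (all names here are hypothetical, not existing mctpd structures):

struct alloc_range {
        int net;           /* network the pool belongs to */
        mctp_eid_t start;  /* first not-yet-enumerated EID */
        uint8_t count;     /* number of EIDs remaining in the range */
};

/* periodic timer callback: probe each allocated-but-unenumerated EID */
static void poll_alloc_ranges(struct ctx *ctx)
{
        for (size_t i = 0; i < ctx->n_alloc_ranges; i++) {
                struct alloc_range *r = &ctx->alloc_ranges[i];

                /* fire-and-forget Get Endpoint ID; a response for any EID
                 * in the range triggers enumeration, and shrinks or
                 * removes the range */
                for (uint8_t j = 0; j < r->count; j++)
                        send_get_endpoint_id(ctx, r->net, r->start + j);
        }
}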

@santoshpuranik

I think we agree that mctpd has to poll all allocated endpoints with a Get Endpoint ID periodically

Just as a clarification - not all endpoints, but EIDs within allocated endpoint ranges, which have not yet been enumerated.

Wouldn't we also want to poll enumerated endpoints under the bridge to determine when they "went away"?

In terms of design: we probably don't want a struct peer to be created for those endpoints, as they don't strictly exist as proper peers at that stage. I think a minimal-impact approach may be to keep a set of the allocated (but not-yet-enumerated) ranges, and periodically send the Get Endpoint ID requests.

We don't necessarily need to keep much state for that polling mechanism (ie, between request and response) - receiving a Get Endpoint ID response for anything in that range would trigger the enumeration process.

Ack. How periodically do you think we should check? Same as the logic for determining when to set endpoint state as degraded (TReclaim/2)?

@jk-ozlabs
Member

Wouldn't we also want to poll enumerated endpoints under the bridge to determine when they "went away"?

No, and we don't do that with directly-attached endpoints either. The current philosophy is that we don't care if an endpoint disappears, until some application calls Recover. If there's no application using the endpoint, then no need to monitor for its presence.

[I'm okay with revisiting this, or handling bridged endpoints differently, if there's a compelling argument for doing so]

How periodically do you think we should check?

Treclaim/2 seems a bit too often to me, but might be fine as a starting point. I suspect that an ideal approach would be to poll more regularly when a bridge pool is initially allocated, then reduce frequency. However, let's not complicate the initial implementation too much here, and just use a configurable constant.

@jk-ozlabs
Member

jk-ozlabs commented Jun 3, 2025

.. and speaking of Recover, we might need to revisit how we handle that for bridged endpoints, as an MCTP-level recovery operation probably isn't applicable as for a directly-attached device (in the same manner, at least). CC @amboar.

@santoshpuranik

No, and we don't do that with directly-attached endpoints either.

So we have a case where we will have to call allocate endpoint ID on the bridge device when not all of its downstream devices are available. In such a case, how do you think we can determine when those downstream EIDs become available unless we poll?

@jk-ozlabs
Member

how do you think we can determine when those downstream EIDs become available unless we poll

I am suggesting we poll. Just that we then stop polling once we enumerate the endpoint.

@santoshpuranik

how do you think we can determine when those downstream EIDs become available unless we poll

I am suggesting we poll. Just that we then stop polling once we enumerate the endpoint.

Ah, ack, then.

@amboar
Contributor

amboar commented Jun 11, 2025

.. and speaking of Recover, we might need to revisit how we handle that for bridged endpoints, as an MCTP-level recovery operation probably isn't applicable as for a directly-attached device (in the same manner, at least). CC @amboar.

It will need some rework as currently it assumes the peer is a neighbour and uses physical addressing for Get Endpoint ID. We still want a mechanism to establish the loss of a non-neighbour peer though. I think Recover is fine for that. We need to use some message for polling, and despite the absurdity I think Get Endpoint ID is also fine for that, just we can't use physical addressing if the peer is behind a bridge. The observed behaviour of Recover would be the same - if the peer is responsive then the D-Bus object remains exposed, or if it's unresponsive then the object is removed. The difference between a peer being unresponsive as opposed to not having yet been assigned an address cannot be determined across the bridge, so in that case we skip the substance of the recovery operation (EID re-assignment). That's a responsibility of the bridge node anyway.

@jk-ozlabs
Member

Thanks for that, Andrew.

There might be some commonality between the peers undergoing (non-local) recovery, and those EIDs that are behind a bridge, but not yet enumerated. If a Recover of a non-local endpoint fails (ie, the Get Endpoint ID commands involved in the Recover process all time out), then we should return that EID to the "allocated but not yet enumerated" EID set, which means we will continue to send periodic Get Endpoint ID commands (perhaps on a less frequent basis though).

The same should occur for a Remove too.

@amboar
Contributor

amboar commented Jun 11, 2025

Yep, that sounds sensible.

@faizana-nvidia
Contributor Author

Thank you all for taking the time to look into the PR.

I've addressed the review comments on the previous commits and added a new commit for the MCTP bridge design doc; I still need to push the polling mechanism.

@jk-ozlabs
Member

Thanks for the updates! A couple of comments:

  1. We don't really do design proposal docs as files in the repo; it's great to see your recap of the discussion points from this PR, but there's no need for that format to be long-lived in the repo itself. I would suggest turning this into a user-consumable document describing how things work according to your new implementation. Any dbus API changes belong in the mctpd.md document.

  2. Before implementing this new dbus API, we would need some confirmation that non-contiguous pool allocations are permissible. I have raised an issue with the PMCI WG, (#1540, if you have access), and would like at least some indication that the requirement can be relaxed before we commit to the separate pool ranges.

  3. In order to reduce the upfront work, you may want to skip the endpoint polling for the initial PR; the changes will still be useful in that au.com.codeconstruct.MCTP.Network1.LearnEndpoint can be used to enumerate downstream devices manually (once the pool is allocated, and we can route to those endpoints).

  4. You have a couple of cases where you add something in an initial patch, then re-work it in the follow-up patch. This makes review overly complicated.

  5. Super minor, but the formatting of introduced changes is inconsistent. Given there's still some work to do before this series is ready, I will apply the tree-wide reformat shortly, and add a .clang-format.

@faizana-nvidia faizana-nvidia force-pushed the mctp-bridge-support branch 3 times, most recently from bf8f331 to fa59ed7 on June 30, 2025 21:44
@faizana-nvidia
Contributor Author

faizana-nvidia commented Jun 30, 2025

Thanks for the updates! A couple of comments:

  1. We don't really do design proposal docs as files in the repo; it's great to see your recap of the discussion points from this PR, but there's no need for that format to be long-lived in the repo itself. I would suggest turning this into a user-consumable document describing how things work according to your new implementation. Any dbus API changes belong in the mctpd.md document.
  2. Before implementing this new dbus API, we would need some confirmation that non-contiguous pool allocations are permissible. I have raised an issue with the PMCI WG, (#1540, if you have access), and would like at least some indication that the requirement can be relaxed before we commit to the separate pool ranges.
  3. In order to reduce the upfront work, you may want to skip the endpoint polling for the initial PR; the changes will still be useful in that au.com.codeconstruct.MCTP.Network1.LearnEndpoint can be used to enumerate downstream devices manually (once the pool is allocated, and we can route to those endpoints).
  4. You have a couple of cases where you add something in an initial patch, then re-work it in the follow-up patch. This makes review overly complicated.
  5. Super minor, but the formatting of introduced changes is inconsistent. Given there's still some work to do before this series is ready, I will apply the tree-wide reformat shortly, and add a .clang-format.

Hello Jeremy

Thank you for looking over the commits. Based on your comment #1, I have removed the new .md file which captured the MCTP bridge support details in this PR, and updated the existing mctpd.md file with new information about the dbus API AssignBridgeStatic. Regarding the user-consumable document, I'm not sure what this should contain; if you could let me know what this document should be, I can create one and update the PR.

I recently got permission for the PMCI WG and have glanced over what was stated in issue #1540. Basically, the idea is to split the bus owner EID pool and segregate a chunk of EIDs for the bridge's downstream pool at the higher end of the bus owner pool, while keeping the lower end for non-bridge devices. This would be helpful for dynamic EID assignment of downstream pool devices in case multiple bridges are under the same network.

My current implementation involves finding a contiguous EID chunk of min(requested pool size, bridge's pool size capability) from the available bus owner pool, but we begin looking from the requested pool_start (static) or from the EID after the bridge's own EID (dynamic), and we search until we find a right-sized chunk, marking that EID as pool_start. I based this on the same line of the spec for which you raised the issue.

In order to reduce the upfront work, you may want to skip the endpoint polling for the initial PR; the changes will still be useful in that au.com.codeconstruct.MCTP.Network1.LearnEndpoint can be used to enumerate downstream devices manually (once the pool is allocated, and we can route to those endpoints).

I can create a new PR for endpoint polling, if that's what you mean, and skip it for this PR. Also, to add the routes from the downstream endpoints to the bridge, we would need your implementation to be merged for both the Linux kernel and mctpd. Internally I've tested my polling logic with your changes pulled in, but for this PR I haven't picked them up, so discovery of downstream EIDs via LearnEndpoint would probably not be possible with only this PR.

  1. You have a couple of cases where you add something in an initial patch, then re-work it in the follow-up patch. This makes review overly complicated.
  2. Super minor, but the formatting of introduced changes is inconsistent. Given there's still some work to do before this series is ready, I will apply the tree-wide reformat shortly, and add a .clang-format.

I've updated the patch set now for easier review; hope it helps. Let me know if I can do anything else to further ease the review. Thanks for your .clang-format; once that is pushed I will rebase my changes onto it.

Add MCTP control message structures for the ALLOCATE_ENDPOINT_ID command to support bridge endpoint EID pool allocation.

- Add mctp_ctrl_cmd_alloc_eid request/response structure

Signed-off-by: Faizan Ali <[email protected]>
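For reference, the DSP0236 Allocate Endpoint IDs message maps onto something like the following (field names are illustrative, not necessarily the exact structures added in this patch):

struct mctp_ctrl_cmd_alloc_eid_req {
        struct mctp_ctrl_msg_hdr hdr;
        uint8_t alloc_eid_op; /* operation: allocate / force / get info */
        uint8_t pool_size;    /* number of endpoint IDs requested */
        uint8_t start_eid;    /* proposed starting EID for the pool */
} __attribute__((packed));

struct mctp_ctrl_cmd_alloc_eid_resp {
        struct mctp_ctrl_msg_hdr hdr;
        uint8_t completion_code;
        uint8_t status;       /* allocation accepted or rejected */
        uint8_t pool_size;    /* size of the pool actually allocated */
        uint8_t start_eid;    /* first EID of the allocated pool */
} __attribute__((packed));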
@faizana-nvidia faizana-nvidia force-pushed the mctp-bridge-support branch 3 times, most recently from 1b9ed20 to 259cde1 on July 17, 2025 23:10
@faizana-nvidia
Contributor Author

Hello Jeremy,

Thank you for the feedback. I've updated the implementation based on the above discussions, and made a couple of decisions here:

  1. While looking for contiguous EIDs (bridge EID + MAX_POOL_SIZE), if we end up going past the end of the allowed EID space, the AssignEndpoint dbus call returns a failure.

  2. If for some reason the SetEndpointID command responds with a different EID than the one fetched from the dynamic pool at the beginning, then we take the new response EID.
    We try to find the next contiguous EIDs for the downstream endpoints beginning from response EID + 1; if found, we proceed with ALLOCATE_ENDPOINT_ID. If we fail to find a range of pool_size, we can't really do much, so we simply skip the ALLOCATE_ENDPOINT_ID command for the bridge and log the failure to find a contiguous range.

  3. Added a new interface for endpoint objects, au.com.codeconstruct.MCTP.EndpointType1, which shows details such as:
    Type: MCTP Device | MCTP Bridge | Local Endpoint
    Pool Size, Pool Start, Pool End

Please let me know your thoughts on the implementation, open to any suggestions :)

Member

@jk-ozlabs jk-ozlabs left a comment

In general, it looks like we're converging on a good approach here.

A few comments inline, and one overall: I'm not sure what the addition of the reserved_eid_set is for, when we already have the pool information in the peers list. Can we implement the allocation without duplicating this data?

We'll definitely need more tests though, to check cases of allocation failures, and cases where we need to constrain the pool requests from the peer.


resp = (void *)buf;
if (!resp) {
warnx("%s Invalid response Buffer\n", __func__);
Member

no newline is needed for warnx strings.

(here and elsewhere in the changes)

Contributor Author

Ack

Member

it's still here.

Member

Also, I would be inclined to avoid using the function name in log output where possible. Someone consuming logs shouldn't need to know the internal structure of mctpd to figure out context.

If there's some more understandable context to use for these messages, that would be more user-consumable. perhaps things like:

"Allocate Endpoint ID response: invalid response buffer"

(for debug messages, it may be more acceptable, but warnings less so)

I am aware we're probably doing this elsewhere in the code, but would prefer to not add new instances.

@faizana-nvidia
Contributor Author

In general, it looks like we're converging on a good approach here.

A few comments inline, and one overall: I'm not sure what the addition of the reserved_eid_set is for, when we already have the pool information in the peers list. Can we implement the allocation without duplicating this data?

It could be implemented, we'll just have to fetch eids presence in other bridge peers quite a lot of times (it felt less efficient at the moment).

But checking via peer would be the cleaner and less maintainable way, while the bitmap would be more efficient owing to the frequent checks we need to do while looking for contiguous available EIDs per bridge.

I'm okay with removing it. Let me know.

@jk-ozlabs
Member

Not sure I fully understand your response there:

we'll just have to fetch eids presence

what do you mean by "fetch eids presence"? where is this fetching from?

But checking via peer would be the cleaner and less maintainable way

why less maintainable? it would be one fewer data structure to keep in sync.

I would suggest you add the EID allocator as a new function, that:

  • creates a temporary bitmap for the available EIDs
  • walks the peers list for this network, updating the bitmap according to each peer's ->eid and ->pool_start & pool_size
  • finds the required range from the bitmap

there's no real need for the bitmap to be persistent; we're only dealing with a fixed number of peers here, and only need to do this when performing the initial allocation (so once per discovered peer).

This way, all of the allocation logic is in a single function, and we cannot have a state where the map is out of sync with the actual peers list.
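Something like this rough shape (struct and variable names assumed from the discussion, not actual mctpd code):

static int find_free_eid_range(struct net *n, uint8_t pool_size,
                               mctp_eid_t *pool_start)
{
        bool used[256] = { false };
        int run = 0;

        if (!pool_size)
                return -EINVAL;

        /* mark every EID owned by an existing peer, including its pool */
        for (int e = eid_alloc_min; e <= eid_alloc_max; e++) {
                struct peer *p = n->peers[e];
                if (!p)
                        continue;
                used[p->eid] = true;
                for (int i = 0; i < p->pool_size; i++)
                        used[p->pool_start + i] = true;
        }

        /* find the first run of pool_size consecutive free EIDs */
        for (int e = eid_alloc_min; e <= eid_alloc_max; e++) {
                run = used[e] ? 0 : run + 1;
                if (run == (int)pool_size) {
                        *pool_start = e - pool_size + 1;
                        return 0;
                }
        }

        return -EADDRNOTAVAIL;
}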

@faizana-nvidia
Contributor Author

Not sure I fully understand your response there:

we'll just have to fetch eids presence

what do you mean by "fetch eids presence"? where is this fetching from?

By fetching I mean iterating over the peers array to find whether the EID is used or not; we also now need to check if the EID falls within the range of any peer's allocated pool (peer->pool_start to peer->pool_start + peer->pool_size).

But I'll see if we can avoid the bitmap. If not, I'll implement it via the single-function approach.

But checking via peer would be the cleaner and less maintainable way

This was meant for the non-bitmap approach.

why less maintainable? it would be one fewer data structure to keep in sync.

Yes, I agree

I would suggest you add the EID allocator as a new function, that:

  • creates a temporary bitmap for the available EIDs
  • walks the peers list for this network, updating the bitmap according to each peer's ->eid and ->pool_start & pool_size
  • finds the required range from the bitmap

there's no real need for the bitmap to be persistent; we're only dealing with a fixed number of peers here, and only need to do this when performing the initial allocation (so once per discovered peer).

This way, all of the allocation logic is in a single function, and we cannot have a state where the map is out of sync with the actual peers list.

Member

@mkj mkj left a comment

A few style comments

* For an MCTP bridge to have a bridge EID contiguous with its downstream endpoints, we introduce a max_pool_size bus-owner configuration, to be the assumed pool size before we get the actual size via the SET_ENDPOINT_ID command response.

Signed-off-by: Faizan Ali <[email protected]>
@faizana-nvidia
Contributor Author

Removed the bitmap logic, utilizing the contiguous nature of pool allocation for simpler checks.

Add support for MCTP bridge endpoints that can allocate pools of EIDs for downstream endpoints.

We assume each AssignEndpoint d-bus call will be for a bridge. With this, we allocate/reserve a max_pool_size EID range contiguous with the bridge's own EID. Later, this pool size is updated based on the SET_ENDPOINT_ID command response.

- Add pool_size and pool_start fields to struct peer
- Update the AssignEndpoint d-bus call to support MCTP bridge dynamic EID assignment.
- Fetch contiguous EIDs for the bridge and its downstream endpoints.
- is_eid_in_bridge_pool(): for static EID assignment via the AssignEndpointStatic d-bus call, check whether an EID is part of any other bridge's pool range.

Signed-off-by: Faizan Ali <[email protected]>
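(For illustration, a plausible shape for the is_eid_in_bridge_pool() check described above, using the assumed struct peer fields; a sketch, not the actual patch:)

static bool is_eid_in_bridge_pool(struct net *n, mctp_eid_t eid)
{
        for (int e = eid_alloc_min; e <= eid_alloc_max; e++) {
                struct peer *p = n->peers[e];

                if (!p || !p->pool_size)
                        continue;
                /* does eid fall inside this bridge's allocated pool? */
                if (eid >= p->pool_start &&
                    eid < p->pool_start + p->pool_size)
                        return true;
        }
        return false;
}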
Add implementation for the MCTP ALLOCATE_ENDPOINT_ID control command to enable bridges to allocate EID pools for downstream endpoints.

- Add endpoint_send_allocate_endpoint_id() for sending allocation requests
- Update the gateway route for downstream EIDs
- Integrate dynamic EID allocation for a bridge's downstream endpoints with the AssignEndpoint d-bus method

Signed-off-by: Faizan Ali <[email protected]>
* Updated mctpd.md with the new MCTP bridge support for dynamic EID
assignment via the AssignEndpoint d-bus call

Signed-off-by: Faizan Ali <[email protected]>
Add a new test validating the AssignEndpoint D-Bus method, verifying
that a bridge endpoint's EID allocation is contiguous with its
downstream EIDs.

- Add test_assign_dynamic_bridge_eid() for bridge simulation testing
- Simulate a bridge with some downstream endpoints in the test framework
- Test that the EID allocation of the bridge and its downstream endpoints is contiguous

Signed-off-by: Faizan Ali <[email protected]>
* The au.com.codeconstruct.MCTP.Bridge1 interface captures details of a bridge-type endpoint, such as pool start, pool size, and pool end.

Signed-off-by: Faizan Ali <[email protected]>
@jk-ozlabs
Member

Looking pretty good with the changes. Let me know when you're at a stable point and want me to re-review.

@faizana-nvidia
Contributor Author

Looking pretty good with the changes. Let me know when you're at a stable point and want me to re-review.

Hello Jeremy/Matt,
Thank you for your review comments. Please re-review the PR; almost all of the comments have been addressed.

@jk-ozlabs
Member

Neat! I shall do. I've got a couple of other things pending at the moment, so will get on to a proper review shortly, but in the meantime there are still a couple of things from the earlier reviews:

  • the dbus API changes need to be documented in mctpd.md, using the per-interface formats there
  • the tests need to cover further cases around invalid endpoint allocation requests, and allocation failures

@faizana-nvidia
Contributor Author

Neat! I shall do. I've got a couple of other things pending at the moment, so will get on to a proper review shortly, but in the meantime there are still a couple of things from the earlier reviews:

  • the dbus API changes need to be documented in mctpd.md, using the per-interface formats there

Ack, let me get onto it.

  • the tests need to cover further cases around invalid endpoint allocation requests, and allocation failures

Thanks, I can incorporate the test case.

Comment on lines +8 to +11
# bus-owner configuration
[bus-owner]
max_pool_size = 15

Member

This splits the uuid configuration example from the section that it belongs in ([mctp]).

I have fixed this up in a couple of cherry-picks in my dev/config branch (which implements configuration for the dynamic EID range, using your initial addition of the [bus-owner] section). Feel free to grab from there if you like, or to rebase on top.

Contributor Author

Right, I overlooked this part, my bad. Will update.

Member

(I have also added some documentation on the configuration in that branch too)

Contributor Author

Ack, reworked my change based on your reference

Member

@jk-ozlabs jk-ozlabs left a comment

Looking good, a few things inline.

Since github doesn't support review comments on the commit messages, just a few general pointers for those. Can you keep consistency with the existing messages? Mainly wrapping at 80 cols, and typically sentence format rather than dot-points.

No need to explain individual code-level changes, we can read that from the diff. Instead, general context and intent is better.

Comment on lines +1439 to +1440
warnx("%s requested allocation of pool size = %d",
dest_phys_tostr(dest), peer->pool_size);
Member

this isn't really a warning anymore, as it is no longer unimplemented. Just move to a fprintf instead.

Contributor Author

Ack

Comment on lines +1730 to +1756
int next_pool_start = get_next_pool_start(e, n, ctx->max_pool_size);
if (next_pool_start < 0) {
        warnx("Ran out of EIDs from net %d while "
              "allocating bridge downstream endpoint at %s",
              net, dest_phys_tostr(dest));
        is_pool_possible = false;
        /* ran out of pool EIDs: set only the bridge EID, then find the
         * first available bridge EID which is not part of any pool */
        for (e = eid_alloc_min; e <= eid_alloc_max; e++) {
                if (n->peers[e]) {
                        /* used peer may be a bridge, skip its EID range */
                        e += n->peers[e]->pool_size;
                        continue;
                }
                break;
        }
} else if (next_pool_start != e + 1) {
        /* e doesn't have max_pool_size contiguous EIDs available */
        e += next_pool_start;
        continue;
} else {
        /* found contiguous EIDs of max_pool_size from bridge_eid */
        is_pool_possible = true;
}
Member

I would suggest splitting this into a helper function (that calculates the EID and pool range), you're getting pretty deep into the nesting there.

Contributor Author

okay, I'll update this.

src/mctpd.c Outdated
Comment on lines 2252 to 2254
if (peer->pool_size > 0) {
// Call for Allocate EndpointID
}
Member

I'd prefer we don't add the stub code until we need it. Or at least mark this as a TODO, as "Call for Allocate EndpointID" doesn't indicate any intent there.

Contributor Author

I missed updating the TODO here. I'll change the comment.

return -EADDRNOTAVAIL;
}
for (mctp_eid_t e = bridge_eid + 1; e <= bridge_eid + max_pool_size;
e++) {
Member

minor nit: we haven't used the var declaration as part of the loop initialiser style elsewhere at present. I'm not averse to doing so, but in this case it is making your for-loop wrap awkwardly

Contributor Author

hmm I see what you mean. I'll update

assert new
# Assert for assigned bridge endpoint ID
assert path == f'/au/com/codeconstruct/mctp1/networks/1/endpoints/{eid}'
assert new
Member

duplicate assert?

Contributor Author

will remove


assert new
# Assert for assigned bridge endpoint ID
assert path == f'/au/com/codeconstruct/mctp1/networks/1/endpoints/{eid}'
Member

no need to assert this, we have checked the path construction elsewhere

Contributor Author

okay

#check if the downstream endpoint eid is contiguous to the bridge endpoint eid
assert (eid + i + 1) == br_ep.eid
(path, new) = await net.call_learn_endpoint(br_ep.eid)
assert path == f'/au/com/codeconstruct/mctp1/networks/1/endpoints/{br_ep.eid}'
Member

Same here; no need to check the paths, just that LearnEndpoint succeeded.

Contributor Author

Ack

rc = sd_bus_message_append(reply, "y", peer->pool_size);
} else if (strcmp(property, "PoolEnd") == 0) {
uint8_t pool_end =
peer->pool_size ?
Member

we know that pool_size > 0, right? otherwise we wouldn't have this interface?

"y",
bus_bridge_get_prop,
0,
SD_BUS_VTABLE_PROPERTY_CONST),
Member

Do we need both size and end?

Contributor Author

We can keep end and remove size for better readability of the pool EIDs (this would give a range).

@faizana-nvidia
Contributor Author

  • the tests need to cover further cases around invalid endpoint allocation requests, and allocation failures

for this case, we can't really test allocation failure, since the dbus method AssignEndpoint will at least assign the bridge's own EID; even in the case where the allocation was rejected or failed, we still reply with the bridge's EID path.

Also, there are other complications involved in addressing allocation rejection, such as:

  1. the bridge needs to already have some EID, some pool size and downstream EIDs assigned.
  2. we have not added the case where an endpoint that has already been assigned an EID should return that same EID in its SET_ENDPOINT_ID response.
  3. coming back to the same point: even if all this happens and we get a rejection, we can't infer that in our test, as we have no validation factor in the dbus method's response.

For a successful allocation we do have a check: the EIDs assigned to downstream endpoints should be within the bridge's pool range and contiguous.

I can add an invalid-endpoint test though, i.e. where the requested EID for some endpoint falls within another MCTP bridge's pool.

@jk-ozlabs
Member

for this case, we can't really test allocation failure since dbus method AssignEndpoint will at least assign Bridge's own eid

That's fine. Just check the resulting pool range of the Allocate Endpoint IDs command, which you would have stored on the Endpoint.

It's fine to leave the more stateful tests (ie, where the bridge already has some EID/pool assignment) for later, but there are definitely some low-hanging fruit to address:

  • where the bridge pool request (in the Set Endpoint ID response) is larger than mctpd's max
  • where the bridge pool request would conflict with some other EID, so there is no valid pool available (use the new dynamic_eid_range support to help trigger that case, and maybe create a static assignment for a conflicting EID)
  • where the endpoint returns a failure from the Assign Endpoint IDs command (is the pool allocation left in a consistent state?)
  • after a successful bridge pool assignment, where a new non-bridge endpoint is enumerated, that it does not conflict with the new pool

@jk-ozlabs
Member

You may want to rebase to current main - this means you can drop the max_pool_size change, because it has already been merged.
