Contributor

@sphuber sphuber commented Nov 30, 2023

Following #6255, we introduce here pydantic models also to the Transport module for use in configuring computers.

@sphuber sphuber force-pushed the fix/transport-model-pydantic branch 2 times, most recently from 6f4e549 to 0909304 Compare January 11, 2024 09:51
@sphuber sphuber force-pushed the fix/transport-model-pydantic branch 3 times, most recently from c30af1f to d606a83 Compare March 13, 2024 09:40
@agoscinski
Contributor

Is it worth tagging this with a v3.0.0 milestone where we could introduce such breaking changes?

@sphuber
Contributor Author

sphuber commented May 29, 2024

> Is it worth tagging this with a v3.0.0 milestone where we could introduce such breaking changes?

The breaking changes are actually quite minimal. I don't remember by heart, but it may be just the order in which options are prompted for and some other minimal things. Mostly the changes are just in implementation.

And I am not sure if there actually ever will be a v3 😅 There are no concrete plans for now in any case. But definitely worth a discussion at some point perhaps.


codecov bot commented May 30, 2024

Codecov Report

Attention: Patch coverage is 94.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 79.16%. Comparing base (fb723a9) to head (ac9bd11).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/aiida/transports/plugins/ssh.py | 91.90% | 3 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6198      +/-   ##
==========================================
+ Coverage   79.13%   79.16%   +0.04%     
==========================================
  Files         565      565              
  Lines       43391    43428      +37     
==========================================
+ Hits        34331    34376      +45     
+ Misses       9060     9052       -8     

@edan-bainglass edan-bainglass self-assigned this Feb 6, 2025
@khsrali khsrali self-requested a review April 10, 2025 13:15
@edan-bainglass edan-bainglass force-pushed the fix/transport-model-pydantic branch from 0c5f715 to 1d0fab2 Compare May 7, 2025 05:05
@edan-bainglass
Member

edan-bainglass commented May 7, 2025

@agoscinski @khsrali working on this one. Rebased on main. Possibly need to add Models to the new blocking/async Transport classes (or maybe not? I don't see any new fields - @khsrali?). Then tests. Then we should be good. Let's see.

@edan-bainglass
Member

@agoscinski do we want to add model testing for transport similar to that of the orm module?

@edan-bainglass
Member

edan-bainglass commented May 7, 2025

@agoscinski maybe check that last commit regarding the non-interactive option for the localhost config. I guess this wasn't an issue before due to loose code but now is due to Seb's updates. We use --non-interactive when configuring the slurm-ssh computer, so this change seemed consistent.

Update

Adding --non-interactive didn't take. Environment setup still fails due to missing non-interactive argument. Have a look 🙏

@khsrali
Contributor

khsrali commented May 7, 2025

> @agoscinski @khsrali working on this one. Rebased on main. Possibly need to add Models to the new blocking/async Transport classes (or maybe not? I don't see any new fields - @khsrali?). Then tests. Then we should be good. Let's see.

https://github.com/aiidateam/aiida-core/blob/71fc14f3c2a501ff8d704d20df76a297edc8e8bc/src/aiida/transports/plugins/ssh_async.py#L70C1-L109C1

@edan-bainglass edan-bainglass force-pushed the fix/transport-model-pydantic branch from 911b43e to 9bcf7e7 Compare May 9, 2025 07:06
@edan-bainglass
Member

edan-bainglass commented May 9, 2025

@khsrali can you take a look at the new AsyncSshTransport.Model? Note the erroneous use of _DEFAULT_max_io_allowed in the Model. I'm not sure how to reference this class variable from within the nested model. The easy way around is to lift it to a module variable. If you approve, I'll update the PR.
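
For reference, a minimal sketch of the scoping issue, using plain classes as stand-ins for the pydantic models. All names and the value 8 are hypothetical and not taken from aiida-core. A nested class body cannot see attributes of the enclosing class by bare name, which is why lifting the default to a module variable is the easy way around.

```python
# Hypothetical names and values; plain classes stand in for the pydantic models.
_DEFAULT_MAX_IO_ALLOWED = 8  # module-level constant, visible to every class body in the module


def nested_lookup_fails() -> bool:
    """Return True if a nested class body cannot resolve the outer class attribute."""
    try:
        class Outer:
            _default = 8

            class Model:
                # A class body is not an enclosing scope for a nested class body,
                # so the bare name `_default` is looked up in globals and not found.
                max_io_allowed = _default
    except NameError:
        return True
    return False


class Transport:
    class Model:
        # Works: the name is resolved in the module's global scope.
        max_io_allowed: int = _DEFAULT_MAX_IO_ALLOWED
```

With the default at module level, both the transport class and its nested model can refer to the same name without any qualification.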

@edan-bainglass edan-bainglass force-pushed the fix/transport-model-pydantic branch 3 times, most recently from c109b61 to 1965318 Compare May 15, 2025 06:09
@edan-bainglass
Member

@khsrali please review 🙏

@khsrali
Contributor

khsrali commented May 15, 2025

I apologize @edan-bainglass

This was a super busy week. Now I'm away on vacation and don't have my laptop with me.

It's quite painful to review with my phone 🥲

Will be back to work on Tuesday the 20th.
I hope that's not too late. If you urgently need this, maybe another team member can review?

P.S. also Alex is on vacation, so I think this can wait from the release point of view.

Contributor

@khsrali khsrali left a comment

I haven't checked the changes in cmd_computer carefully.
I have to check whether we have a complete test suite for that.

echo.echo_success(f'Computer `{label}` {"and all its associated nodes " if associated_nodes_pk else ""}deleted.')


class LazyConfigureGroup(VerdiCommandGroup):
Contributor

How is this change related?

Member

Part of @sphuber's original work. It now seems more in line with the way the code command is handled, which I think has been using pydantic since the abstract code refactoring.

Member

@edan-bainglass edan-bainglass Jun 25, 2025

I'm not sure why this is the case, but re-instating the LazyConfigureGroup class breaks the TestVerdiComputerConfigure.test_ssh_interactive test. The values passed to the CLI command are somehow out of order and redundant, leading to the following nonsense:

Report: enter ? for help.
Report: enter ! to ignore the default and set no value.
User name [edanb]: 
Port number [22]: 
Look for keys [Y/n]: some_remote_user   <- ?
Error: invalid input
Look for keys [Y/n]: 345
Error: invalid input
Look for keys [Y/n]: no
SSH key file []: 
Connection timeout in s [60]: 
Allow ssh agent [Y/n]: 
SSH proxy jump []: 
SSH proxy command []: 
Compress file transfers [Y/n]: 
GSS auth [False]: 
GSS kex [False]: 
GSS deleg_creds [False]: 
GSS host [localhost]: 
Load system host keys [Y/n]: 
Key policy (RejectPolicy, WarningPolicy, AutoAddPolicy) [RejectPolicy]: 
Use login shell when executing command [Y/n]: 
Connection cooldown time (s) [30.0]: 
Report: Configuring computer test_ssh_interactive for user test@localhost.
Success: test_ssh_interactive successfully configured for test@localhost

instead of the correct

Report: enter ? for help.
Report: enter ! to ignore the default and set no value.
Use login shell when executing command [Y/n]: 
Connection cooldown time (s) [30.0]: 
User name [edanb]: some_remote_user
Port number [22]: 345
Look for keys [Y/n]: no
SSH key file []: 
Connection timeout (s) [60]: 
Allow ssh agent [y/N]: 
SSH proxy jump []: 
SSH proxy command []: 
Compress file transfers [Y/n]: 
GSS auth [y/N]: 
GSS kex [y/N]: 
GSS deleg_creds [y/N]: 
GSS host []: 
Load system host keys [Y/n]: 
Key policy (RejectPolicy, WarningPolicy, AutoAddPolicy) [RejectPolicy]: 
Report: Configuring computer test_ssh_interactive for user test@localhost.
Success: test_ssh_interactive successfully configured for test@localhost

At the moment, it's not clear what broke this part. I suppose @sphuber encountered this and fixed it. Unfortunately (uncharacteristically) not documented 😞

Member

Okay, I got it. Digging into the run_cli_command fixture, I see that the user_input argument (a string providing user input to the tested command) can include newline characters to simulate the interactive prompt, i.e., pressing enter to accept defaults. This was changed in the original PR to support the CLI changes: two initial \n lead to accepting the default user name and port, which in turn assigns the user_input username and port to the wrong fields. Removing these two newlines makes the test pass 👍
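
A simplified sketch of why the two leading newlines shift the answers. It does not use the real run_cli_command fixture: assign_answers and the prompt list are hypothetical and only model that each line of user_input answers the next prompt, with an empty line accepting the default. Re-prompting on invalid input is not modelled.

```python
def assign_answers(prompts, user_input):
    """Pair each prompt with the next line of simulated input (hypothetical helper)."""
    lines = user_input.split('\n')
    answers = {}
    for index, prompt in enumerate(prompts):
        answer = lines[index] if index < len(lines) else ''
        # An empty line corresponds to pressing enter, i.e. accepting the default.
        answers[prompt] = answer or '<default>'
    return answers


PROMPTS = ['User name', 'Port number', 'Look for keys']

# Input that lines up with the prompt order.
correct = assign_answers(PROMPTS, 'some_remote_user\n345\nno\n')

# Two extra leading newlines accept the first two defaults and push
# the user name into the third prompt.
shifted = assign_answers(PROMPTS, '\n\nsome_remote_user\n345\nno\n')
```

In the shifted case the user name ends up as the answer to "Look for keys", which matches the "invalid input" errors in the broken output quoted above.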

),
]

class Model(AsyncTransport.Model):
Contributor

Why is the previous stuff also kept?
See lines 75-110.

Member

As discussed, this is unrelated. The "previous stuff" is used in the CLI. However, we still need to discuss whether it should remain or whether we use the model instead. If possible, it would be good not to mess with the current UX, which informs the user of issues as they occur. Using pydantic models will likely defer notification of issues until the end of the CLI process, which is neither ideal nor friendly.

),
]

class Model(BaseModel):
Contributor

The same question here

Member

See above

  # set up localhost computer
  verdi computer setup --non-interactive --config "${CONFIG}/localhost.yaml"
- verdi computer configure core.local localhost --config "${CONFIG}/localhost-config.yaml"
+ verdi computer configure core.local localhost --non-interactive --config "${CONFIG}/localhost-config.yaml"
Contributor

I wonder why this was not needed before

Contributor

Update:
@edan-bainglass reports he cannot configure a computer even with this change

Contributor

That could also mean that the tests we have in place are not adequate to capture that failure.

@edan-bainglass can you please share the details of the error you get?
Thanks!

Member

In the original PR, the configure_computer function in cmd_computer.py expected a non_interactive positional argument. It is unclear why @sphuber had added this. It is not used in the body of the function. Also, in other similar functions (see cmd_code.py), the non_interactive option is wrapped up in kwargs and is not explicitly defined as a positional argument. Removing the non_interactive argument from configure_computer resolves the issue I was encountering.

As for --non-interactive not being included in the original command referenced in this comment, that is unclear. It was included when configuring the slurm computer.
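
A rough reconstruction of the failure mode, under the assumption that the caller never forwards non_interactive. The function names, the option dictionary, and the return values are hypothetical and are not the aiida-core signatures.

```python
def configure_strict(computer, user, non_interactive, **kwargs):
    """Hypothetical: requires `non_interactive` although the body never uses it."""
    return {'computer': computer, 'user': user, **kwargs}


def configure_lenient(computer, user, **kwargs):
    """Hypothetical: `non_interactive`, if passed at all, is absorbed by kwargs."""
    kwargs.pop('non_interactive', None)
    return {'computer': computer, 'user': user, **kwargs}


# Assumed caller behaviour: only the transport options are forwarded.
options = {'timeout': 60}

try:
    configure_strict('localhost', 'test', **options)
    strict_failed = False
except TypeError:
    # Missing required positional argument `non_interactive`.
    strict_failed = True

result = configure_lenient('localhost', 'test', **options)
```

Leaving the flag inside kwargs keeps the function callable whether or not the flag is supplied.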

@edan-bainglass
Member

@sphuber I pinged you on at least one comment regarding the original work. However, it would be great if you could add a description to this PR, briefly detailing the motivation and summarizing the changes.

@edan-bainglass edan-bainglass force-pushed the fix/transport-model-pydantic branch from 401dee6 to d93c679 Compare June 25, 2025 06:47
@edan-bainglass
Member

edan-bainglass commented Jun 26, 2025

@khsrali @agoscinski we need to revisit the test_ssh_interactive test. It is failing, but in practice, configuring a computer interactively works. Perhaps a bad test. I'm looking into it.

Update

All good. I found the cause of the failure.

However, I also found test_show_limit to be a fragile test, inconsistently passing/failing the following assert:
assert not (str(nodes[0].pk) in result.output and str(nodes[1].pk) in result.output)
Might be an artifact of running the test locally. Not sure.

@agoscinski
Contributor

@khsrali can you look at this? I have the feeling it is a bug related to the one you fixed in #6919. From a quick glance, it looks like the CLI options are now created differently, only by defining a pydantic model:

if not hasattr(cls, 'Model'):

@edan-bainglass
Member

> @khsrali can you look at this? I have the feeling it is a bug related to the one you fixed in #6919. From a quick glance, it looks like the CLI options are now created differently, only by defining a pydantic model:
>
> if not hasattr(cls, 'Model'):

@agoscinski note that in this PR, I no longer use the dynamic class. I'm using the original lazy class, and the use of the dynamic class has been pushed to another PR (soon to be opened).

@edan-bainglass edan-bainglass force-pushed the fix/transport-model-pydantic branch 3 times, most recently from b23417c to 3d8d1fb Compare June 26, 2025 09:44
@edan-bainglass edan-bainglass force-pushed the fix/transport-model-pydantic branch from 3d8d1fb to ac9bd11 Compare June 26, 2025 09:47
@edan-bainglass
Member

@agoscinski @khsrali I think perhaps we were wrong to remove the changes to the CLI. Unlike the ORM pydantic models introduced in #6842, this PR introduces pydantic models to the Transport classes solely for use in the CLI. Not sure if @sphuber expected more of them (e.g., serialization), but in this PR, no such mechanism is introduced. The serialization mechanics introduced in #6842 are limited to Entity subclasses, of which the Transport module is not a part. So there is no straightforward (de)serialization here.

If the above is accurate, I would close #6924 and reintroduce the CLI changes here, then add a test that the model is operational, whatever that means.

Comments?
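
To make the "models solely for use in the CLI" point concrete, here is a rough sketch of deriving CLI options from a model's fields. It uses stdlib dataclasses and argparse as stand-ins for pydantic and click, so it is not how aiida-core does it. The Model fields and build_parser are hypothetical.

```python
import argparse
from dataclasses import dataclass, fields


@dataclass
class Model:
    """Hypothetical connection settings; field names are illustrative only."""

    username: str = 'user'
    port: int = 22
    timeout: float = 60.0


def build_parser(model_cls) -> argparse.ArgumentParser:
    """Create one `--option` per model field, typed and defaulted from the field."""
    parser = argparse.ArgumentParser()
    for field in fields(model_cls):
        option = '--' + field.name.replace('_', '-')
        parser.add_argument(option, type=field.type, default=field.default)
    return parser


# Parse a simulated command line and build the model from the parsed values.
namespace = build_parser(Model).parse_args(['--port', '345'])
config = Model(**vars(namespace))
```

In this sketch the model only describes the options and their defaults. Nothing here serializes or stores the model, which mirrors the limitation described above.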

@edan-bainglass
Member

> @agoscinski @khsrali I think perhaps we were wrong to remove the changes to the CLI. Unlike the ORM pydantic models introduced in #6842, this PR introduces pydantic models to the Transport classes solely for use in the CLI. Not sure if @sphuber expected more of them (e.g., serialization), but in this PR, no such mechanism is introduced. The serialization mechanics introduced in #6842 are limited to Entity subclasses, of which the Transport module is not a part. So there is no straightforward (de)serialization here.
>
> If the above is accurate, I would close #6924 and reintroduce the CLI changes here, then add a test that the model is operational, whatever that means.
>
> Comments?

@khsrali let's discuss this soon, if you have time. I'd like to close this and resolve the pydantic work.


4 participants