-
Notifications
You must be signed in to change notification settings - Fork 3
Migrate potential interface from atomistics to pyiron_lammps #261
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughThis PR introduces a new potential management framework for LAMMPS, adding a comprehensive module with classes and utility functions for loading, querying, and mapping interatomic potentials, while exposing key query functions at the package level. Changes
Sequence DiagramsequenceDiagram
participant User
participant get_potential_by_name
participant get_resource_path_from_conda
participant LammpsPotentialFile
participant CSV as Potential CSV
participant validate_potential_dataframe
User->>get_potential_by_name: request by name
get_potential_by_name->>get_resource_path_from_conda: discover resource path
get_resource_path_from_conda-->>get_potential_by_name: resource path
get_potential_by_name->>LammpsPotentialFile: load potentials
LammpsPotentialFile->>CSV: read potential dataframe
CSV-->>LammpsPotentialFile: dataframe
LammpsPotentialFile->>LammpsPotentialFile: update_potential_paths
LammpsPotentialFile-->>get_potential_by_name: filtered potential
get_potential_by_name->>validate_potential_dataframe: validate result
validate_potential_dataframe-->>get_potential_by_name: validated
get_potential_by_name-->>User: potential record
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
for more information, see https://pre-commit.ci
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #261 +/- ##
===========================================
- Coverage 95.17% 79.74% -15.43%
===========================================
Files 4 5 +1
Lines 497 637 +140
===========================================
+ Hits 473 508 +35
- Misses 24 129 +105 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Nitpick comments (5)
pyiron_lammps/potential.py (5)
49-60: Consider explicitOptionaltype annotation.Line 53 uses
list[str] = Nonewhich implicitly creates an Optional type. PEP 484 recommends explicitOptional[list[str]]for clarity.Apply this diff if desired:
def __init__( self, potential_df: pandas.DataFrame, default_df: pandas.DataFrame = None, - selected_atoms: list[str] = None, + selected_atoms: Optional[list[str]] = None, ):
104-111: Consider spread operator for list construction.Line 106 concatenates lists using
+. The spread operator provides a more explicit alternative.Apply this diff if desired:
- selected_atoms = self._selected_atoms + [item] + selected_atoms = [*self._selected_atoms, item]
223-231: Consider spread operator for list construction.Similar to the parent class, line 225 could use the spread operator for list construction.
Apply this diff if desired:
- selected_atoms = self._selected_atoms + [item] + selected_atoms = [*self._selected_atoms, item]
287-304: Remove redundant backslash replacement.Line 303 calls
.replace("\\", "/")after.as_posix(), butas_posix()already converts all path separators to forward slashes, making the replacement redundant.Apply this diff:
return ( Path(path.strip()) .expanduser() .resolve() .absolute() .as_posix() - .replace("\\", "/") )
358-380: Consider shorter error messages or custom exception classes.The validation logic is comprehensive and correct. However, the error messages on lines 367-370, 372-375, and 377-380 are quite verbose. While they provide helpful context, consider either:
- Creating custom exception classes with descriptive names, or
- Moving detailed messages to exception docstrings
This aligns with best practices for exception handling (TRY003).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
pyiron_lammps/__init__.py(1 hunks)pyiron_lammps/potential.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
pyiron_lammps/__init__.py (1)
pyiron_lammps/potential.py (2)
get_potential_by_name(349-355)get_potential_dataframe(340-346)
🪛 Ruff (0.14.3)
pyiron_lammps/potential.py
53-53: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
78-78: Avoid specifying long messages outside the exception class
(TRY003)
89-89: Avoid specifying long messages outside the exception class
(TRY003)
106-106: Consider [*self._selected_atoms, item] instead of concatenation
Replace with [*self._selected_atoms, item]
(RUF005)
213-213: Avoid specifying long messages outside the exception class
(TRY003)
225-225: Consider [*self._selected_atoms, item] instead of concatenation
Replace with [*self._selected_atoms, item]
(RUF005)
263-267: Avoid specifying long messages outside the exception class
(TRY003)
367-370: Avoid specifying long messages outside the exception class
(TRY003)
372-375: Avoid specifying long messages outside the exception class
(TRY003)
377-380: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: unittest_matrix (windows-latest, 3.14)
| @@ -1,5 +1,6 @@ | |||
| import pyiron_lammps._version | |||
| from pyiron_lammps.output import parse_lammps_output as parse_lammps_output_files | |||
| from pyiron_lammps.potential import get_potential_by_name, get_potential_dataframe | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add new imports to __all__ or reconsider module-level import.
Line 3 imports get_potential_by_name and get_potential_dataframe at the module level, but they're not added to the __all__ list (lines 18-21). This creates ambiguity about whether these functions are part of the public API.
Recommendation: Add both functions to __all__ to explicitly expose them as part of the public API, consistent with the other exports.
Apply this diff:
__all__ = [
+ "get_potential_by_name",
+ "get_potential_dataframe",
"parse_lammps_output_files",
"write_lammps_structure",
]📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| from pyiron_lammps.potential import get_potential_by_name, get_potential_dataframe | |
| __all__ = [ | |
| "get_potential_by_name", | |
| "get_potential_dataframe", | |
| "parse_lammps_output_files", | |
| "write_lammps_structure", | |
| ] |
🤖 Prompt for AI Agents
In pyiron_lammps/__init__.py around line 3, you import get_potential_by_name and
get_potential_dataframe at module level but did not add them to the module's
__all__ (lines ~18-21); update the __all__ list to include
"get_potential_by_name" and "get_potential_dataframe" so they are explicitly
exported as part of the public API, or alternatively remove the top-level import
and require users to import them from pyiron_lammps.potential if you do not
intend them to be public.
| potential_installation = """ | ||
| Potential installation guide: | ||
| 1. Check whether iprpy-data is installed. If not, install it using: | ||
| `conda install -c conda-forge iprpy-data` | ||
| 2. Check whether the resource path is set via: | ||
| ```python | ||
| import os | ||
| print(os.environ["CONDA_PREFIX"]) | ||
| ``` | ||
| 3. If the resource path is set, you can call the potential using: | ||
| ```python | ||
| from atomistics.calculators import get_potential_by_name | ||
| get_potential_by_name( | ||
| potential_name=my_potential, | ||
| resource_path=os.path.join(os.environ["CONDA_PREFIX"], "share", "iprpy"), | ||
| ) | ||
| ``` | ||
| """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update module reference in installation guide.
The installation guide references atomistics.calculators on line 25, but this PR migrates the interface to pyiron_lammps. The import statement should be updated to reflect the correct module name.
Apply this diff:
-from atomistics.calculators import get_potential_by_name
+from pyiron_lammps import get_potential_by_name📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| potential_installation = """ | |
| Potential installation guide: | |
| 1. Check whether iprpy-data is installed. If not, install it using: | |
| `conda install -c conda-forge iprpy-data` | |
| 2. Check whether the resource path is set via: | |
| ```python | |
| import os | |
| print(os.environ["CONDA_PREFIX"]) | |
| ``` | |
| 3. If the resource path is set, you can call the potential using: | |
| ```python | |
| from atomistics.calculators import get_potential_by_name | |
| get_potential_by_name( | |
| potential_name=my_potential, | |
| resource_path=os.path.join(os.environ["CONDA_PREFIX"], "share", "iprpy"), | |
| ) | |
| ``` | |
| """ | |
| potential_installation = """ | |
| Potential installation guide: | |
| 1. Check whether iprpy-data is installed. If not, install it using: | |
| `conda install -c conda-forge iprpy-data` | |
| 2. Check whether the resource path is set via: | |
🤖 Prompt for AI Agents
In pyiron_lammps/potential.py around lines 8 to 34, the installation guide
string references the old module name "atomistics.calculators" on the import
example; update that import to the new module "pyiron_lammps" and adjust the
subsequent usage example to call get_potential_by_name from pyiron_lammps (i.e.,
replace the import line and any fully-qualified calls in the string), keeping
the rest of the textual instructions unchanged.
| def default(self): | ||
| if self._default_df is not None: | ||
| atoms_str = "_".join(sorted(self._selected_atoms)) | ||
| return self._default_df[ | ||
| (self._default_df["Name"] == self._default_df.loc[atoms_str].values[0]) | ||
| ] | ||
| return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add error handling for missing default potential.
Line 190 accesses self._default_df.loc[atoms_str] without checking if atoms_str exists in the index. This will raise a KeyError if the element combination has no default defined.
Apply this diff to add proper error handling:
def default(self):
if self._default_df is not None:
atoms_str = "_".join(sorted(self._selected_atoms))
+ if atoms_str not in self._default_df.index:
+ return None
return self._default_df[
(self._default_df["Name"] == self._default_df.loc[atoms_str].values[0])
]
return None🤖 Prompt for AI Agents
In pyiron_lammps/potential.py around lines 186 to 192, the code directly indexes
self._default_df.loc[atoms_str] which will raise a KeyError if atoms_str is not
present; update the method to first check whether self._default_df is not None
and atoms_str exists in self._default_df.index (or catch KeyError) and if
missing return None (or raise a clear ValueError with context), otherwise
proceed to select rows matching the stored default Name; ensure no unguarded
.loc access remains so the method never throws an uncaught KeyError.
| def find_default( | ||
| self, element: Union[set[str], list[str], str] | ||
| ) -> pandas.DataFrame: | ||
| """ | ||
| Find the potentials | ||
| Args: | ||
| element (set, str): element or set of elements for which you want the possible LAMMPS potentials | ||
| path (bool): choose whether to return the full path to the potential or just the potential name | ||
| Returns: | ||
| list: of possible potentials for the element or the combination of elements | ||
| """ | ||
| if isinstance(element, list): | ||
| element = set(element) | ||
| elif isinstance(element, str): | ||
| element = {element} | ||
| elif not isinstance(element, set): | ||
| raise TypeError("Only, str, list and set supported!") | ||
| element_lst = list(element) | ||
| if self._default_df is not None: | ||
| merged_lst = list(set(self._selected_atoms + element_lst)) | ||
| atoms_str = "_".join(sorted(merged_lst)) | ||
| return self._default_df[ | ||
| (self._default_df["Name"] == self._default_df.loc[atoms_str].values[0]) | ||
| ] | ||
| return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add error handling for missing default potential.
Line 219 has the same issue as the default() method - accessing self._default_df.loc[atoms_str] without checking if the key exists in the index.
Apply this diff:
def find_default(
self, element: Union[set[str], list[str], str]
) -> pandas.DataFrame:
"""
Find the potentials
Args:
element (set, str): element or set of elements for which you want the possible LAMMPS potentials
path (bool): choose whether to return the full path to the potential or just the potential name
Returns:
list: of possible potentials for the element or the combination of elements
"""
if isinstance(element, list):
element = set(element)
elif isinstance(element, str):
element = {element}
elif not isinstance(element, set):
raise TypeError("Only, str, list and set supported!")
element_lst = list(element)
if self._default_df is not None:
merged_lst = list(set(self._selected_atoms + element_lst))
atoms_str = "_".join(sorted(merged_lst))
+ if atoms_str not in self._default_df.index:
+ return None
return self._default_df[
(self._default_df["Name"] == self._default_df.loc[atoms_str].values[0])
]
return None📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def find_default( | |
| self, element: Union[set[str], list[str], str] | |
| ) -> pandas.DataFrame: | |
| """ | |
| Find the potentials | |
| Args: | |
| element (set, str): element or set of elements for which you want the possible LAMMPS potentials | |
| path (bool): choose whether to return the full path to the potential or just the potential name | |
| Returns: | |
| list: of possible potentials for the element or the combination of elements | |
| """ | |
| if isinstance(element, list): | |
| element = set(element) | |
| elif isinstance(element, str): | |
| element = {element} | |
| elif not isinstance(element, set): | |
| raise TypeError("Only, str, list and set supported!") | |
| element_lst = list(element) | |
| if self._default_df is not None: | |
| merged_lst = list(set(self._selected_atoms + element_lst)) | |
| atoms_str = "_".join(sorted(merged_lst)) | |
| return self._default_df[ | |
| (self._default_df["Name"] == self._default_df.loc[atoms_str].values[0]) | |
| ] | |
| return None | |
| def find_default( | |
| self, element: Union[set[str], list[str], str] | |
| ) -> pandas.DataFrame: | |
| """ | |
| Find the potentials | |
| Args: | |
| element (set, str): element or set of elements for which you want the possible LAMMPS potentials | |
| path (bool): choose whether to return the full path to the potential or just the potential name | |
| Returns: | |
| list: of possible potentials for the element or the combination of elements | |
| """ | |
| if isinstance(element, list): | |
| element = set(element) | |
| elif isinstance(element, str): | |
| element = {element} | |
| elif not isinstance(element, set): | |
| raise TypeError("Only, str, list and set supported!") | |
| element_lst = list(element) | |
| if self._default_df is not None: | |
| merged_lst = list(set(self._selected_atoms + element_lst)) | |
| atoms_str = "_".join(sorted(merged_lst)) | |
| if atoms_str not in self._default_df.index: | |
| return None | |
| return self._default_df[ | |
| (self._default_df["Name"] == self._default_df.loc[atoms_str].values[0]) | |
| ] | |
| return None |
🧰 Tools
🪛 Ruff (0.14.3)
213-213: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
In pyiron_lammps/potential.py around lines 194 to 221, the code accesses
self._default_df.loc[atoms_str] without verifying that atoms_str is a valid
index key; update the logic to check whether atoms_str exists in
self._default_df.index (or wrap the access in a try/except KeyError) before
attempting to read .loc; if the key is missing, return None (or raise a clear
KeyError with a descriptive message) instead of letting an unhandled exception
occur, ensuring behavior matches the existing default() method.
| def get_potential_by_name(potential_name: str, resource_path=None): | ||
| if resource_path is None: | ||
| resource_path = get_resource_path_from_conda() | ||
| df = LammpsPotentialFile(resource_path=resource_path).list() | ||
| return update_potential_paths( | ||
| df_pot=df[df.Name == potential_name], resource_path=resource_path | ||
| ).iloc[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add validation before accessing filtered result.
Line 354 filters by potential_name and immediately accesses .iloc[0] without checking if any results were found. If no matching potential exists, this will raise an IndexError instead of a meaningful error message.
Apply this diff:
def get_potential_by_name(potential_name: str, resource_path=None):
if resource_path is None:
resource_path = get_resource_path_from_conda()
df = LammpsPotentialFile(resource_path=resource_path).list()
- return update_potential_paths(
- df_pot=df[df.Name == potential_name], resource_path=resource_path
- ).iloc[0]
+ filtered_df = df[df.Name == potential_name]
+ if len(filtered_df) == 0:
+ raise ValueError(f"Potential '{potential_name}' not found in database.")
+ return update_potential_paths(
+ df_pot=filtered_df, resource_path=resource_path
+ ).iloc[0]🤖 Prompt for AI Agents
pyiron_lammps/potential.py around lines 349 to 355: the function filters df by
potential_name and directly returns .iloc[0], which will raise IndexError if no
match is found; modify the function to check if the filtered dataframe is empty
before accessing iloc[0], and if empty raise a clear ValueError (or custom
exception) stating the requested potential_name was not found and include either
a short list of available potential names or the resource_path used for lookup
so callers can diagnose the issue; if not empty, proceed to return the first row
as before.
Summary by CodeRabbit