Migrate potential interface from atomistics to pyiron_lammps #261

jan-janssen · 2025-11-08T09:14:59Z

Summary by CodeRabbit

New Features
- Added functionality to query and retrieve LAMMPS interatomic potentials by name or atomic structure
- Enhanced potential data management capabilities with improved validation and resource path handling
- Potential lookup features now accessible at the package level

coderabbitai · 2025-11-08T09:15:08Z

Walkthrough

This PR introduces a new potential management framework for LAMMPS, adding a comprehensive module with classes and utility functions for loading, querying, and mapping interatomic potentials, while exposing key query functions at the package level.

Changes

Cohort / File(s)	Summary
Package-level exports `pyiron_lammps/__init__.py`	Added imports for `get_potential_by_name` and `get_potential_dataframe` from potential module to expose them at package level
Potential management framework `pyiron_lammps/potential.py`	New module introducing `PotentialAbstract` base class, `LammpsPotentialFile` subclass, and `PotentialAvailable` wrapper for potential access; utility functions for resource path discovery, dataframe manipulation, potential filtering, and path normalization; functions `get_potential_dataframe`, `get_potential_by_name`, and `validate_potential_dataframe` for querying potentials

Sequence Diagram

sequenceDiagram
    participant User
    participant get_potential_by_name
    participant get_resource_path_from_conda
    participant LammpsPotentialFile
    participant CSV as Potential CSV
    participant validate_potential_dataframe

    User->>get_potential_by_name: request by name
    get_potential_by_name->>get_resource_path_from_conda: discover resource path
    get_resource_path_from_conda-->>get_potential_by_name: resource path
    get_potential_by_name->>LammpsPotentialFile: load potentials
    LammpsPotentialFile->>CSV: read potential dataframe
    CSV-->>LammpsPotentialFile: dataframe
    LammpsPotentialFile->>LammpsPotentialFile: update_potential_paths
    LammpsPotentialFile-->>get_potential_by_name: filtered potential
    get_potential_by_name->>validate_potential_dataframe: validate result
    validate_potential_dataframe-->>get_potential_by_name: validated
    get_potential_by_name-->>User: potential record

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Multiple classes with inheritance patterns: Review PotentialAbstract base class design, inheritance flow in LammpsPotentialFile, and dynamic attribute/item access implementations (__getattr__, __getitem__)
Resource path discovery logic: Verify conda environment variable handling in get_resource_path_from_conda and path resolution in find_potential_file_base across multiple fallback paths
Dataframe transformation pipeline: Examine update_potential_paths and associated converter logic for column transformations (Species, Config, Filename)
Input validation and error handling: Check robustness of type enforcement (str/list/set), error messages in validate_potential_dataframe, and exception handling throughout
Path normalization: Verify POSIX conversion and absolute path resolution in convert_path_to_abs_posix

Poem

🐰 A framework hops forth, organized and neat,
Potentials now dance in a dataframe sweet,
Resources discovered with conda's good grace,
Atoms find their pairs—potential's embrace!
Paths unfold, validations ring true,
The warren's first chemistry glue ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Migrate potential interface from atomistics to pyiron_lammps' accurately describes the main change: introducing a new potential interface module to pyiron_lammps by migrating functionality from the atomistics package.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch potential

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

for more information, see https://pre-commit.ci

codecov · 2025-11-08T09:16:20Z

Codecov Report

❌ Patch coverage is 25.00000% with 105 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.74%. Comparing base (115a677) to head (5f29115).
⚠️ Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
pyiron_lammps/potential.py	24.46%	105 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##             main     #261       +/-   ##
===========================================
- Coverage   95.17%   79.74%   -15.43%     
===========================================
  Files           4        5        +1     
  Lines         497      637      +140     
===========================================
+ Hits          473      508       +35     
- Misses         24      129      +105

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

coderabbitai

Actionable comments posted: 5

🧹 Nitpick comments (5)

pyiron_lammps/potential.py (5)
49-60: Consider explicit Optional type annotation.

Line 53 uses list[str] = None which implicitly creates an Optional type. PEP 484 recommends explicit Optional[list[str]] for clarity.

Apply this diff if desired:
     def __init__(
         self,
         potential_df: pandas.DataFrame,
         default_df: pandas.DataFrame = None,
-        selected_atoms: list[str] = None,
+        selected_atoms: Optional[list[str]] = None,
     ):
104-111: Consider spread operator for list construction.

Line 106 concatenates lists using +. The spread operator provides a more explicit alternative.

Apply this diff if desired:
-        selected_atoms = self._selected_atoms + [item]
+        selected_atoms = [*self._selected_atoms, item]
223-231: Consider spread operator for list construction.

Similar to the parent class, line 225 could use the spread operator for list construction.

Apply this diff if desired:
-        selected_atoms = self._selected_atoms + [item]
+        selected_atoms = [*self._selected_atoms, item]
287-304: Remove redundant backslash replacement.

Line 303 calls .replace("\\", "/") after .as_posix(), but as_posix() already converts all path separators to forward slashes, making the replacement redundant.

Apply this diff:
     return (
         Path(path.strip())
         .expanduser()
         .resolve()
         .absolute()
         .as_posix()
-        .replace("\\", "/")
     )
358-380: Consider shorter error messages or custom exception classes.

The validation logic is comprehensive and correct. However, the error messages on lines 367-370, 372-375, and 377-380 are quite verbose. While they provide helpful context, consider either:

Creating custom exception classes with descriptive names, or

Moving detailed messages to exception docstrings

This aligns with best practices for exception handling (TRY003).

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3d27297 and 5f29115.

📒 Files selected for processing (2)

pyiron_lammps/__init__.py (1 hunks)
pyiron_lammps/potential.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

pyiron_lammps/__init__.py (1)

pyiron_lammps/potential.py (2)

get_potential_by_name (349-355)

get_potential_dataframe (340-346)

🪛 Ruff (0.14.3)

pyiron_lammps/potential.py

53-53: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)

78-78: Avoid specifying long messages outside the exception class

(TRY003)

89-89: Avoid specifying long messages outside the exception class

(TRY003)

106-106: Consider [*self._selected_atoms, item] instead of concatenation

Replace with [*self._selected_atoms, item]

(RUF005)

213-213: Avoid specifying long messages outside the exception class

(TRY003)

225-225: Consider [*self._selected_atoms, item] instead of concatenation

Replace with [*self._selected_atoms, item]

(RUF005)

263-267: Avoid specifying long messages outside the exception class

(TRY003)

367-370: Avoid specifying long messages outside the exception class

(TRY003)

372-375: Avoid specifying long messages outside the exception class

(TRY003)

377-380: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: unittest_matrix (windows-latest, 3.14)

coderabbitai · 2025-11-08T09:20:15Z

pyiron_lammps/__init__.py

@@ -1,5 +1,6 @@
 import pyiron_lammps._version
 from pyiron_lammps.output import parse_lammps_output as parse_lammps_output_files
+from pyiron_lammps.potential import get_potential_by_name, get_potential_dataframe


⚠️ Potential issue | 🟠 Major

Add new imports to __all__ or reconsider module-level import.

Line 3 imports get_potential_by_name and get_potential_dataframe at the module level, but they're not added to the __all__ list (lines 18-21). This creates ambiguity about whether these functions are part of the public API.

Recommendation: Add both functions to __all__ to explicitly expose them as part of the public API, consistent with the other exports.

Apply this diff:

__all__ = [ + "get_potential_by_name", + "get_potential_dataframe", "parse_lammps_output_files", "write_lammps_structure", ]

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

from pyiron_lammps.potential import get_potential_by_name, get_potential_dataframe

__all__ = [

"get_potential_by_name",

"get_potential_dataframe",

"parse_lammps_output_files",

"write_lammps_structure",

]

🤖 Prompt for AI Agents

In pyiron_lammps/__init__.py around line 3, you import get_potential_by_name and get_potential_dataframe at module level but did not add them to the module's __all__ (lines ~18-21); update the __all__ list to include "get_potential_by_name" and "get_potential_dataframe" so they are explicitly exported as part of the public API, or alternatively remove the top-level import and require users to import them from pyiron_lammps.potential if you do not intend them to be public.

coderabbitai · 2025-11-08T09:20:15Z

pyiron_lammps/potential.py

+potential_installation = """
+Potential installation guide:
+
+1. Check whether iprpy-data is installed. If not, install it using:
+
+`conda install -c conda-forge iprpy-data`
+
+2. Check whether the resource path is set via:
+
+```python
+import os
+print(os.environ["CONDA_PREFIX"])
+```
+
+3. If the resource path is set, you can call the potential using:
+
+```python
+from atomistics.calculators import get_potential_by_name
+
+
+get_potential_by_name(
+    potential_name=my_potential,
+    resource_path=os.path.join(os.environ["CONDA_PREFIX"], "share", "iprpy"),
+)
+```
+
+"""


⚠️ Potential issue | 🔴 Critical

Update module reference in installation guide.

The installation guide references atomistics.calculators on line 25, but this PR migrates the interface to pyiron_lammps. The import statement should be updated to reflect the correct module name.

Apply this diff:

-from atomistics.calculators import get_potential_by_name +from pyiron_lammps import get_potential_by_name

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

potential_installation = """

Potential installation guide:

1. Check whether iprpy-data is installed. If not, install it using:

`conda install -c conda-forge iprpy-data`

2. Check whether the resource path is set via:

```python

import os

print(os.environ["CONDA_PREFIX"])

```

3. If the resource path is set, you can call the potential using:

```python

from atomistics.calculators import get_potential_by_name

get_potential_by_name(

potential_name=my_potential,

resource_path=os.path.join(os.environ["CONDA_PREFIX"], "share", "iprpy"),

)

```

"""

potential_installation = """

Potential installation guide:

1. Check whether iprpy-data is installed. If not, install it using:

`conda install -c conda-forge iprpy-data`

2. Check whether the resource path is set via:

🤖 Prompt for AI Agents

In pyiron_lammps/potential.py around lines 8 to 34, the installation guide string references the old module name "atomistics.calculators" on the import example; update that import to the new module "pyiron_lammps" and adjust the subsequent usage example to call get_potential_by_name from pyiron_lammps (i.e., replace the import line and any fully-qualified calls in the string), keeping the rest of the textual instructions unchanged.

coderabbitai · 2025-11-08T09:20:15Z

pyiron_lammps/potential.py

+    def default(self):
+        if self._default_df is not None:
+            atoms_str = "_".join(sorted(self._selected_atoms))
+            return self._default_df[
+                (self._default_df["Name"] == self._default_df.loc[atoms_str].values[0])
+            ]
+        return None


⚠️ Potential issue | 🟠 Major

Add error handling for missing default potential.

Line 190 accesses self._default_df.loc[atoms_str] without checking if atoms_str exists in the index. This will raise a KeyError if the element combination has no default defined.

Apply this diff to add proper error handling:

def default(self): if self._default_df is not None: atoms_str = "_".join(sorted(self._selected_atoms)) + if atoms_str not in self._default_df.index: + return None return self._default_df[ (self._default_df["Name"] == self._default_df.loc[atoms_str].values[0]) ] return None

🤖 Prompt for AI Agents

In pyiron_lammps/potential.py around lines 186 to 192, the code directly indexes self._default_df.loc[atoms_str] which will raise a KeyError if atoms_str is not present; update the method to first check whether self._default_df is not None and atoms_str exists in self._default_df.index (or catch KeyError) and if missing return None (or raise a clear ValueError with context), otherwise proceed to select rows matching the stored default Name; ensure no unguarded .loc access remains so the method never throws an uncaught KeyError.

coderabbitai · 2025-11-08T09:20:15Z

pyiron_lammps/potential.py

+    def find_default(
+        self, element: Union[set[str], list[str], str]
+    ) -> pandas.DataFrame:
+        """
+        Find the potentials
+
+        Args:
+            element (set, str): element or set of elements for which you want the possible LAMMPS potentials
+            path (bool): choose whether to return the full path to the potential or just the potential name
+
+        Returns:
+            list: of possible potentials for the element or the combination of elements
+
+        """
+        if isinstance(element, list):
+            element = set(element)
+        elif isinstance(element, str):
+            element = {element}
+        elif not isinstance(element, set):
+            raise TypeError("Only, str, list and set supported!")
+        element_lst = list(element)
+        if self._default_df is not None:
+            merged_lst = list(set(self._selected_atoms + element_lst))
+            atoms_str = "_".join(sorted(merged_lst))
+            return self._default_df[
+                (self._default_df["Name"] == self._default_df.loc[atoms_str].values[0])
+            ]
+        return None


⚠️ Potential issue | 🟠 Major

Add error handling for missing default potential.

Line 219 has the same issue as the default() method - accessing self._default_df.loc[atoms_str] without checking if the key exists in the index.

Apply this diff:

def find_default( self, element: Union[set[str], list[str], str] ) -> pandas.DataFrame: """ Find the potentials Args: element (set, str): element or set of elements for which you want the possible LAMMPS potentials path (bool): choose whether to return the full path to the potential or just the potential name Returns: list: of possible potentials for the element or the combination of elements """ if isinstance(element, list): element = set(element) elif isinstance(element, str): element = {element} elif not isinstance(element, set): raise TypeError("Only, str, list and set supported!") element_lst = list(element) if self._default_df is not None: merged_lst = list(set(self._selected_atoms + element_lst)) atoms_str = "_".join(sorted(merged_lst)) + if atoms_str not in self._default_df.index: + return None return self._default_df[ (self._default_df["Name"] == self._default_df.loc[atoms_str].values[0]) ] return None

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def find_default(

self, element: Union[set[str], list[str], str]

) -> pandas.DataFrame:

"""

Find the potentials

Args:

element (set, str): element or set of elements for which you want the possible LAMMPS potentials

path (bool): choose whether to return the full path to the potential or just the potential name

Returns:

list: of possible potentials for the element or the combination of elements

"""

if isinstance(element, list):

element = set(element)

elif isinstance(element, str):

element = {element}

elif not isinstance(element, set):

raise TypeError("Only, str, list and set supported!")

element_lst = list(element)

if self._default_df is not None:

merged_lst = list(set(self._selected_atoms + element_lst))

atoms_str = "_".join(sorted(merged_lst))

return self._default_df[

(self._default_df["Name"] == self._default_df.loc[atoms_str].values[0])

]

return None

def find_default(

self, element: Union[set[str], list[str], str]

) -> pandas.DataFrame:

"""

Find the potentials

Args:

element (set, str): element or set of elements for which you want the possible LAMMPS potentials

path (bool): choose whether to return the full path to the potential or just the potential name

Returns:

list: of possible potentials for the element or the combination of elements

"""

if isinstance(element, list):

element = set(element)

elif isinstance(element, str):

element = {element}

elif not isinstance(element, set):

raise TypeError("Only, str, list and set supported!")

element_lst = list(element)

if self._default_df is not None:

merged_lst = list(set(self._selected_atoms + element_lst))

atoms_str = "_".join(sorted(merged_lst))

if atoms_str not in self._default_df.index:

return None

return self._default_df[

(self._default_df["Name"] == self._default_df.loc[atoms_str].values[0])

]

return None

🧰 Tools

🪛 Ruff (0.14.3)

213-213: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents

In pyiron_lammps/potential.py around lines 194 to 221, the code accesses self._default_df.loc[atoms_str] without verifying that atoms_str is a valid index key; update the logic to check whether atoms_str exists in self._default_df.index (or wrap the access in a try/except KeyError) before attempting to read .loc; if the key is missing, return None (or raise a clear KeyError with a descriptive message) instead of letting an unhandled exception occur, ensuring behavior matches the existing default() method.

coderabbitai · 2025-11-08T09:20:15Z

pyiron_lammps/potential.py

+def get_potential_by_name(potential_name: str, resource_path=None):
+    if resource_path is None:
+        resource_path = get_resource_path_from_conda()
+    df = LammpsPotentialFile(resource_path=resource_path).list()
+    return update_potential_paths(
+        df_pot=df[df.Name == potential_name], resource_path=resource_path
+    ).iloc[0]


⚠️ Potential issue | 🟠 Major

Add validation before accessing filtered result.

Line 354 filters by potential_name and immediately accesses .iloc[0] without checking if any results were found. If no matching potential exists, this will raise an IndexError instead of a meaningful error message.

Apply this diff:

def get_potential_by_name(potential_name: str, resource_path=None): if resource_path is None: resource_path = get_resource_path_from_conda() df = LammpsPotentialFile(resource_path=resource_path).list() - return update_potential_paths( - df_pot=df[df.Name == potential_name], resource_path=resource_path - ).iloc[0] + filtered_df = df[df.Name == potential_name] + if len(filtered_df) == 0: + raise ValueError(f"Potential '{potential_name}' not found in database.") + return update_potential_paths( + df_pot=filtered_df, resource_path=resource_path + ).iloc[0]

🤖 Prompt for AI Agents

pyiron_lammps/potential.py around lines 349 to 355: the function filters df by potential_name and directly returns .iloc[0], which will raise IndexError if no match is found; modify the function to check if the filtered dataframe is empty before accessing iloc[0], and if empty raise a clear ValueError (or custom exception) stating the requested potential_name was not found and include either a short list of available potential names or the resource_path used for lookup so callers can diagnose the issue; if not empty, proceed to return the first row as before.

Migrate potential interface from atomistics to pyiron_lammps

a2e7327

[pre-commit.ci] auto fixes from pre-commit.com hooks

5f29115

for more information, see https://pre-commit.ci

coderabbitai bot reviewed Nov 8, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Migrate potential interface from atomistics to pyiron_lammps #261

Migrate potential interface from atomistics to pyiron_lammps #261

Uh oh!

jan-janssen commented Nov 8, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Nov 8, 2025 •

edited

Loading

Uh oh!

codecov bot commented Nov 8, 2025

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Nov 8, 2025

Uh oh!

coderabbitai bot Nov 8, 2025

Uh oh!

coderabbitai bot Nov 8, 2025

Uh oh!

coderabbitai bot Nov 8, 2025

Uh oh!

coderabbitai bot Nov 8, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

-from pyiron_lammps.potential import get_potential_by_name, get_potential_dataframe
+__all__ = [
+    "get_potential_by_name",
+    "get_potential_dataframe",
+    "parse_lammps_output_files",
+    "write_lammps_structure",
+]

Migrate potential interface from atomistics to pyiron_lammps #261

Are you sure you want to change the base?

Migrate potential interface from atomistics to pyiron_lammps #261

Uh oh!

Conversation

jan-janssen commented Nov 8, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Nov 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Poem

Pre-merge checks and finishing touches

Uh oh!

codecov bot commented Nov 8, 2025

Codecov Report

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 8, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 8, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 8, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 8, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 8, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

jan-janssen commented Nov 8, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Nov 8, 2025 •

edited

Loading