Conversation

JarettBakerDunn (Contributor)

This pull request adds the option to use a point density function to delete excess particles from cells. When this option is chosen, the particle manager will construct a point density function of particle positions using kernel density estimation and delete the particle whose position is at the maximum of the point density function.

The point of this is to address issues caused by randomly selecting particles to delete, in particular the uneven clustering of particles that random deletion can leave behind within a cell.

The pull request also adds test cases for the new deletion algorithm, as well as benchmarks that measure how the new algorithm compares to random deletion and that compare the different kernel functions used in the kernel density estimator.

The test cases will probably fail until #6650 and #6649 are merged, because the new test cases use the corrected lower-case spelling of the postprocessors, and that spelling fix is introduced in those two pull requests.
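
To illustrate the idea (this is only a conceptual sketch, not the classes added in this pull request; the function name, the use of a plain triangular kernel, and the 2D position arrays are all hypothetical), the selection step boils down to evaluating a kernel sum at every particle position and picking the particle with the largest value:

#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

// Return the index of the particle sitting in the densest region of the cell,
// estimated by summing a triangular kernel of the pairwise distances.
std::size_t densest_particle(const std::vector<std::array<double,2>> &positions,
                             const double bandwidth)
{
  std::size_t index_of_max = 0;
  double max_density = -1.0;

  for (std::size_t i = 0; i < positions.size(); ++i)
    {
      double density = 0.0;
      for (std::size_t j = 0; j < positions.size(); ++j)
        {
          const double dx = positions[i][0] - positions[j][0];
          const double dy = positions[i][1] - positions[j][1];
          const double distance = std::sqrt(dx*dx + dy*dy);
          density += std::max(0.0, 1.0 - distance / bandwidth);
        }

      if (density > max_density)
        {
          max_density = density;
          index_of_max = i;
        }
    }

  return index_of_max;
}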

@gassmoeller (Member) left a comment:

Looks pretty good already. Please see the comments I left.

I think the one larger question I have is one of naming things. The current load balancing input option talks about removing and adding particles. You decided to name the new algorithm deleting particles, for which the natural counterpart would be creating particles. I have no strong opinion either way, but we should choose a consistent wording for how to name these algorithms (and since adding/removing is already in the code we should only switch if there are good reasons for it). This will have consequences for input parameters, documentation, and type names and variable names in the code. What do you think?

get_standard_deviation() const;

/**
* Returns the index of the particle with whose position has the highest
Member:

Suggested change
* Returns the index of the particle with whose position has the highest
* Returns the index of the particle whose position has the highest

Member:

this documentation and the one below should mention that this will only work if the particle density is defined per particle, instead of on a grid.

get_max_particle() const;

/**
* Returns the index of the particle with whose position has the lowest
Member:

Suggested change
* Returns the index of the particle with whose position has the lowest
* Returns the index of the particle whose position has the lowest

{
max = input_value;
min_particle_index = reference_particle_id;
max_particle_index = reference_particle_id;
Member:

nice find

auto particle_to_remove = particle_handler->particles_in_cell(cell).begin();
std::advance(particle_to_remove, index_to_remove);
particle_handler->remove_particle(particle_to_remove);
}
Member:

please add the following for safety:

Suggested change
}
}
else
AssertThrow(false, ExcNotImplemented());

pdf.fill_from_particle_range(particle_handler->particles_in_cell(cell),current_n_particles_in_cell);
pdf.compute_statistical_values();

const unsigned int index_max = pdf.get_max_particle();
Member:

the type of this index is either unsigned int or std::uint64_t depending on the deal.II configuration.

Suggested change
const unsigned int index_max = pdf.get_max_particle();
const types::particle_index index_max = pdf.get_max_particle();

Comment on lines 33 to 40
subsection Initial composition model
set Model name = function

subsection Function
set Variable names = x,y
set Function expression = 1
end
end
Member:

remove if not necessary

Comment on lines 18 to 19
set Box origin X coordinate = 0
set Box origin Y coordinate = 0
Member:

no need for setting these, right?

set Dimension = 2
set Use years in output instead of seconds = false
set End time = 6
set Nonlinear solver scheme = single Advection, no Stokes
Member:

I don't understand: here you disable the Stokes solver, which means you need to prescribe a Stokes solution. But I do not see a subsection Prescribed Stokes solution? Does this parameter file even run?

Edit: Oh, I see. You never run this file directly, you run it through the run_all_deletion_algorithms script? Please still add one of the setups into this default file. You will still overwrite these lines in the run script; parameters can be set multiple times, and the last parameter setting "wins". I.e. if you modify this file to be the linear downward flow, but then let the run script append the oscillating pattern, the final model will have the oscillating flow.

Making this file execute without error is important, because I would like you to add one more test that executes exactly this model, as is done for example in this test: https://github.com/geodynamics/aspect/blob/main/tests/benchmark_particle_integration_scheme.prm. This will ensure we will notice if this benchmark ever breaks.
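
For illustration, a minimal subsection of the kind being asked for might look like the following; the velocity expression is a placeholder for whichever setup ends up as the default and is not taken from the parameter file under review:

subsection Prescribed Stokes solution
  set Model name = function

  subsection Velocity function
    set Variable names = x,y,t
    set Function expression = 0; -1
  end
end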

@@ -0,0 +1,121 @@
plot "output-cutoff-w1/statistics" using 2:26 title "deal.ii-cutoff-w1", \
Member:

Please also add the figure you generate with this script into the PR, together with a short text that describes what is happening (this can be shortened text from your thesis; you can also reference your thesis in the text). This way we can check in the future if the results are qualitatively the same. See for example this benchmark: https://github.com/geodynamics/aspect/tree/main/benchmarks/particle_integration_scheme/doc

More information on adding a new benchmark to the manual: https://aspect-documentation.readthedocs.io/en/latest/user/extending/write-a-cookbook/manual-section.html

Contributor Author:

The script generates a number of plots; in my next commit to this pull request I am only including the plot of the maximum distribution score, since I think it is the most relevant. Please let me know if I should include all of the plots in the .md file.


#include <random>

#include <aspect/particle/distribution.h>
Member:

please add this further up where the other aspect/particle includes are.

Additionally this now creates a circular include with the distribution.h header (which includes manager.h). I fixed this in #6654, which is probably going to be merged before this PR, so it shouldn't pose a problem.

@JarettBakerDunn (Contributor Author):

Thanks for the feedback, this should be ready for another review.

@gassmoeller (Member):

I discussed this PR offline with @JarettBakerDunn and we will need to iron out some issues before pursuing this further. So until further notice this does not need a review.

@gassmoeller (Member) left a comment:

Two first comments; I will send more tomorrow.

particle whose PDF value is being calculated.
*/
const double cell_diameter = surrounding_cell->diameter();
const auto &reference_coordinates = reference_particle.get_location();
Member:

Please rename this to coordinates, position, or similar; it no longer contains the reference coordinates.

std::vector<typename Particles::ParticleHandler<dim>::particle_iterator_range> particle_ranges_to_sum_over;

std::vector<typename Triangulation<dim>::active_cell_iterator> neighboring_cells;
GridTools::get_active_neighbors<Triangulation<dim>>(cell, neighboring_cells);
Member:

there is a problem here in that GridTools::get_active_neighbors will only return the neighbor cells that share a face with the current cell (in 2D a line), not the cells that only share an edge or a vertex with the current cell (in 2D the cells diagonally across). We will need to add those as well. One point in the code where that is done already can be seen in aspect/source/particle/interpolator/distance_weighted_average.cc:73-82. Take a look, give it a try, and we can discuss in person how it works.
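
A sketch of the vertex-neighbor variant, along the lines of the interpolator code referenced above (assuming `grid_cache` is a GridTools::Cache built from the triangulation and `cell` is the current Triangulation active cell iterator; the exact loop in the PR may differ):

// Collect all active cells that share at least one vertex with `cell`,
// not only the face neighbors returned by GridTools::get_active_neighbors.
// Requires <deal.II/grid/grid_tools_cache.h> and <set>.
std::set<typename Triangulation<dim>::active_cell_iterator> vertex_neighbors;

const auto &vertex_to_cells = grid_cache.get_vertex_to_cell_map();

for (const unsigned int v : cell->vertex_indices())
  for (const auto &neighbor : vertex_to_cells[cell->vertex_index(v)])
    if (neighbor != cell)
      vertex_neighbors.insert(neighbor);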

#include <aspect/postprocess/particle_distribution_score.h>
#include <aspect/particle/manager.h>

#include <deal.II/multigrid/mg_transfer_global_coarsening.templates.h>
Contributor Author:

I am not sure why this needs to be included in order to use Utilities::MPI::compute_set_union(), but after experimenting it seems to be required. I've tried including all the headers from mg_transfer_global_coarsening.templates.h and mg_transfer_global_coarsening.h, but it seems like something defined in mg_transfer_global_coarsening.templates.h is required to use Utilities::MPI::compute_set_union(). If anyone knows more about this, I am curious to know.

Member:

Take a look at #6674. Maybe that fixed it already, or you can use the same solution for your file.

@JarettBakerDunn force-pushed the particle_KDE_deletion branch 3 times, most recently from 963558a to b044776 on September 19, 2025.
@JarettBakerDunn (Contributor Author):

I think that this is ready for another review.

@gassmoeller (Member) left a comment:

Getting close, I just have a few comments about wording and documentation. Don't forget to add a changelog entry about the new particle removal option, and make sure you also address in this PR the relevant comments I left in #6676.


The point density function is generated using kernel density estimation. Kernel density estimation is a method for generating point density functions from sets of data using a kernel function and a bandwidth value. The point density functions generated through kernel density estimation are not defined continuously, instead, they are defined at discrete points from which the kernel function has been summed.

At each point at which the point density function is defined, the distance between the point in question and every point in the dataset serves as the input to the kernel function. The output of the kernel function between the point in question and each datapoint is summed and scaled by the bandwidth. The bandwidth affects how closely the point density function reflects the measured data. It is important that the bandwidth is small enough to result in a meaningful point density function, but not too high so that the function is overfit to the data.
Member:

Did I understand this correctly? If so, it should be phrased more like this:

Suggested change
At each point at which the point density function is defined, the distance between the point in question and every point in the dataset serves as the input to the kernel function. The output of the kernel function between the point in question and each datapoint is summed and scaled by the bandwidth. The bandwidth affects how closely the point density function reflects the measured data. It is important that the bandwidth is small enough to result in a meaningful point density function, but not too high so that the function is overfit to the data.
At each point at which the point density function is defined, the distance between the point in question and every point in the dataset serves as the input to the kernel function. The output of the kernel function between the point in question and each datapoint is summed and scaled by the bandwidth. The bandwidth affects how closely the point density function reflects the measured data. It is important that the bandwidth is small enough to result in a meaningful point density function, but not so small that the function is overfit to the data.
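
For reference, a common way to write the kernel density estimate described in this paragraph is

$$\hat{f}(x) = \frac{1}{n\,h} \sum_{i=1}^{n} K\!\left(\frac{\lVert x - x_i \rVert}{h}\right),$$

where the $x_i$ are the particle positions, $K$ is the kernel function, and $h$ is the bandwidth. The normalization factor may differ from what this PR implements; it does not matter for the deletion algorithm, which only needs the location of the maximum.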

# Particle distribution benchmark
*This section was contributed by Jarett Baker-Dunn and Rene Gassmöller*

This benchmark is designed to compare different particle removal algorithms. In particular, it is designed to test the use of a point density function of particle positions to remove particles from the part of the cell which has the highest density of particles.
Member:

Add a half sentence at the end (or a new sentence) that explains what you are testing against (namely the random deletion of particles in cells of the model that have too many particles).


In ASPECT's particle manager, the kernel function is defined at the location of each particle in the cell, and the particles in the cell serve as the data points on which the kernel function operates. This enables the manager to select the particles with the highest point density values when deleting particles, which works to prevent excessive clustering of particles which can occur with simpler removal algorithms.

The benchmark tests all of the available kernel functions, under constant and oscillating velocity. It also compares the point density removal algorithm to randomly removing excess particles from cells.
Member:

Ah, here you mention the random removal. I think it is worth mentioning it at the beginning already, and maybe mention here as well that the random removal of particles was the only option before version 3.1.0.


The kernel functions tested are:

deal.ii's cutoff-w1 function,
Member:

also mention c1


The value of the triangular kernel function scales linearly with distance, so that the value returned decreases at a constant rate as the distance input increases. The ratio between increasing distance and decreasing return value is 1:1. If the kernel function were to be graphed, the slope of the triangle's edges would be 1.

The cutoff-w1 kernel function is implemented by the deal.ii library. It is similar to the triangular kernel function in that the return value decreases with increasing distance, but it is not linear.
Member:

It is usually written deal.II (not deal.ii). Sometimes it is required to use deal.ii, but not here.

"function are available in the deal.ii documentation.");
prm.declare_entry ("Bandwidth", "0.3",
Patterns::Double (0.3),"The bandwidth value is used to scale the kernel "
"function when generating the point density function of particles.");
Member:

Add something about the "units" of this parameter (i.e. that it is relative to a representative cell extent in one spatial dimension).
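
As an illustration of that wording (an assumed interpretation, not necessarily the exact formula used in the PR), the scaling could be documented along these lines:

// Assumed interpretation: the dimensionless bandwidth is multiplied by a
// representative cell extent (for example the cell diameter) before it
// scales the distance argument that is passed to the kernel function.
double scaled_kernel_argument(const double distance,
                              const double bandwidth,
                              const double cell_diameter)
{
  return distance / (bandwidth * cell_diameter);
}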

Comment on lines 1039 to 1041



Member:

no need for more than one empty line


std::vector<typename Particles::ParticleHandler<dim>::particle_iterator_range> particle_ranges_to_sum_over;

GridTools::Cache<dim>grid_cache(this->get_triangulation(), this->get_mapping());
Member:

see my comment in #6676 about where to create this variable


std::vector<typename Particles::ParticleHandler<dim>::particle_iterator_range> particle_ranges_to_sum_over;

GridTools::Cache<dim>grid_cache(this->get_triangulation(), this->get_mapping());
Member:

as above

@@ -0,0 +1,140 @@
# Adapted from the drop-supg.prm benchmark
Member:

Maybe add a sentence about what this test is for. Is it just a version of particle_KDE_deletion in 3D? If so, say so here.

@JarettBakerDunn (Contributor Author):

Thanks for the comments, this is ready for another review.
