This PR also enables GPU offloading via OpenMP.
Some notes:
It uses relatively recent OpenMP features, such as the `loop` construct.
OpenMP and OpenACC directives share some similarities, so I vertically aligned them wherever it made sense, to make the code easier to read.
OpenMP can be enabled with `GPU_BACKEND=OMP` in `build.conf`. OpenACC remains the default backend.
Some OpenACC features do not translate directly to OpenMP and are therefore not supported (e.g., how asynchrony is handled).
Thread-level parallelization on GPUs has been removed from the code.
This port was originally done entirely by Leon Oostrum (@loostrum) from the Netherlands eScience Center, under the ExaFLOW project. I reworked it because that development branch had diverged substantially.
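To illustrate the aligned-directive layout mentioned in the notes, here is a minimal sketch in C. It is not taken from this PR: the `saxpy` kernel and its arguments are hypothetical, and the actual code base may use a different language and different clauses. The idea is that the OpenACC and OpenMP directives sit on adjacent lines with their corresponding clauses lined up, and a compiler built for one backend ignores the other backend's pragma.

```c
#include <stddef.h>

/* Hypothetical kernel, for illustration only: y = a*x + y.
 * With an OpenACC compiler the "acc" pragma is honored and the "omp" one is
 * ignored; with an OpenMP compiler it is the other way around. With neither
 * enabled, both pragmas are ignored and the loop runs serially on the host. */
void saxpy(int n, float a, const float *x, float *y)
{
    /* Corresponding data clauses are vertically aligned for easier reading. */
    #pragma acc parallel loop     copyin(x[0:n])  copy(y[0:n])
    #pragma omp target teams loop map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Stacking both directives on the same loop assumes that only one of the two programming models is enabled at compile time, which matches the single `GPU_BACKEND` switch described above.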