Skip to content

Conversation

@jysh1214
Copy link

@jysh1214 jysh1214 commented Jun 15, 2025

This PR introduces support for running FFT and IFFT on both CPU and CUDA backends.

  • cpu
  • cuda (implemented using cuFFT API)

@yungyuc yungyuc added array Multi-dimensional array implementation performance Profiling, runtime, and memory consumption labels Jun 17, 2025
Copy link
Member

@yungyuc yungyuc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the patch, @jysh1214 . There are some architectural issues to be addressed and please also find me to discuss on the discord channel #solvcon.

  • #undef FFT_CUDA_IMPL macro after use.
  • Evaluate if FFT_CUDA_IMPL macro can be made as a function template or merged with the function template fft_cuda.
  • Evaluate how much effort to avoid BUILD_CUDA from outside device/cuda/ directory.
  • Use an enum for backend. Do not use a string which can only be checked during runtime.

MODMESH_PROFILE ?= OFF
BUILD_METAL ?= OFF
BUILD_QT ?= ON
BUILD_CUDA ?= OFF
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, let's make it off by default. Perhaps the variable BUILD_CUDA can be moved to be adjacent to BUILD_METAL like:

BUILD_CUDA ?= OFF
BUILD_METAL ?= OFF
BUILD_QT ?= ON

#include <modmesh/buffer/buffer.hpp>
#include <modmesh/device/cuda/cuda_error_handle.hpp>

#define FFT_CUDA_IMPL(CUFFT_DATA_TYPE, CUFFT_EXEC_TYPE) \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jysh1214 could you please evaluate if this can be made as a template function? If it can be done within a couple of hours, please update this macro to use a function template or merged it into the function template fft_cuda.

If you cannot make it quickly or do not know how to make it, please leave a TODO comment to state that a future contributor should evaluate how to turn the macro into a function template.

}
else if constexpr (std::is_same_v<T2, double>)
{
FFT_CUDA_IMPL(cufftDoubleComplex, Z2Z)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#undef FFT_CUDA_IMPL macro after use.

#include <modmesh/math/math.hpp>
#include <modmesh/buffer/buffer.hpp>

#if defined(BUILD_CUDA)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to avoid any BUILD_CUDA macro from outside device/cuda. How much work it takes to make it?


template <template <typename> class T1, typename T2>
static void ifft(SimpleArray<T1<T2>> const & in, SimpleArray<T1<T2>> & out)
static void ifft(SimpleArray<T1<T2>> const & in, SimpleArray<T1<T2>> & out, std::string const && backend)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use an enum for backend. Do not use a string which can only be checked during runtime.

const size_t N = in.size();

if ((N & (N - 1)) == 0)
if (backend == "cpu")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

String comparison can only be done during runtime, but the comparison should be available during compile-time.

@yungyuc
Copy link
Member

yungyuc commented Jun 19, 2025

@jysh1214 Could you please also take the code to the discord #solvcon channel to expedite the discussions?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

array Multi-dimensional array implementation performance Profiling, runtime, and memory consumption

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants