In your paper, the moment loss is the regularization applied to the moment matrix. But in your network, if I use order 1 or 2, the moment matrix is exactly zero, so it seems the momentloss term in the loss never changes. Also, if I have not misunderstood the code, it uses the moment matrix to generate the kernel matrix. Is that correct?