stevengogogo commented Jun 17, 2025

This implementation provides an API for multi-objective learning.

(screenshots of the aggregation update rule omitted)

where $A$ is an aggregator that combines the per-loss gradients.

For $loss = 1.0\,loss_{drm} + 500\,loss_{bc}$:

```shell
python pinn_1d.py --levels 1 --epochs 10000 --lr 1e-4 --activation gelu --sweeps 1 --hidden_dims 256 256 256 --high_freq 3 --loss_type 1 --bc_weight 1 --nx 2000 --aggregator "Constant" 1. 500.
```
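The `Constant` aggregator corresponds to a fixed weighted sum of the per-loss gradients, i.e. ordinary descent on the weighted total loss. A minimal NumPy sketch of that combination rule (the function name and shapes are illustrative, not the script's actual API):

```python
import numpy as np

def constant_aggregate(jacobian: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Combine per-loss gradients (rows of `jacobian`) with fixed weights.

    With weights [1., 500.] this is equivalent to descending on
    1.0 * loss_drm + 500 * loss_bc.
    """
    return weights @ jacobian  # shape: (n_params,)

# Two per-loss gradients over three parameters (made-up numbers).
grads = np.array([[0.2, -0.1, 0.3],      # gradient of loss_drm
                  [0.001, 0.002, 0.0]])  # gradient of loss_bc
update = constant_aggregate(grads, np.array([1.0, 500.0]))
assert np.allclose(update, [0.7, 0.9, 0.3])
```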

For dual-cone optimization (the `UPGrad` aggregator):

```shell
python pinn_1d.py --levels 1 --epochs 10000 --lr 1e-4 --activation gelu --sweeps 1 --hidden_dims 256 256 256 --high_freq 3 --loss_type 0 --bc_weight 1 --nx 2000 --aggregator "UPGrad"
```

This optimizes the PINN loss and the boundary loss simultaneously.
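The dual-cone idea behind `UPGrad` is that the combined update should not increase any individual loss, i.e. it should have a non-negative inner product with every per-loss gradient. The sketch below illustrates that principle with a PCGrad-style pairwise projection in NumPy; it is not TorchJD's exact UPGrad algorithm, which projects onto the dual cone directly:

```python
import numpy as np

def project_nonconflicting(grads: np.ndarray) -> np.ndarray:
    """Average the per-loss gradients after removing pairwise conflicts.

    Each gradient g_i is projected away from any g_j it conflicts with
    (g_i . g_j < 0), so the averaged update does not oppose any loss.
    """
    out = grads.astype(float).copy()
    for i in range(len(grads)):
        for j in range(len(grads)):
            if i != j and out[i] @ grads[j] < 0:
                out[i] -= (out[i] @ grads[j]) / (grads[j] @ grads[j]) * grads[j]
    return out.mean(axis=0)

# Conflicting gradients: g1 . g2 < 0.
g = np.array([[1.0, 0.0], [-0.5, 1.0]])
u = project_nonconflicting(g)
assert (g @ u >= -1e-12).all()  # the update conflicts with neither loss
```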

To monitor the aggregator weights, add `--monitor_aggregator`:

```shell
python pinn_1d.py --levels 1 --epochs 10000 --lr 1e-4 --activation gelu --sweeps 1 --hidden_dims 256 256 256 --high_freq 3 --loss_type 1 --bc_weight 1 --nx 2000 --gamma 0 --aggregator "NashMTL" 2 10. --monitor_aggregator
```

```text
Weights: tensor([2.0694e-02, 3.3308e+01], device='cuda:0')
Cosine similarity: 0.9834
```
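The monitored cosine similarity is a plain cosine between two gradient directions; exactly which pair of vectors pinn_1d.py compares is defined in the script, so the vectors below are illustrative:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two gradient vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

g_mean = np.array([0.6, 0.8])  # e.g. mean of the per-loss gradients
g_agg = np.array([1.2, 1.6])   # e.g. the aggregated update
print(round(cosine_similarity(g_mean, g_agg), 4))  # prints 1.0 (parallel vectors)
```

A value near 1 (as in the log above) means the aggregated direction barely deviates from the reference direction.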

The TorchJD package is used for the Jacobian descent method. It provides many aggregators for combining gradients:

(table of TorchJD aggregators omitted)

stevengogogo commented Jun 17, 2025

```text
python pinn_1d.py --levels 1 --epochs 10000 --lr 1e-3 --activation tanh --sweeps 1 --hidden_dims 128 128 128 128 --high_freq 3 --loss_type 2 --bc_weight 1 --nx 2000 --gamma 0 --aggregator "None" --enforce_bc --plot
Training Level 0
Level 0: TRAIN
Iteration    499/ 10000, PINN+DRM Loss: -4.5650e+01, Err 2-norm:  1.3520e+01, inf-norm: 1.2588e+00
Iteration    999/ 10000, PINN+DRM Loss: -4.5722e+01, Err 2-norm:  1.3518e+01, inf-norm: 1.2586e+00
Iteration   1499/ 10000, PINN+DRM Loss: -4.5728e+01, Err 2-norm:  1.3517e+01, inf-norm: 1.2585e+00
Iteration   1999/ 10000, PINN+DRM Loss: -4.5465e+01, Err 2-norm:  1.3456e+01, inf-norm: 1.2543e+00
Iteration   2499/ 10000, PINN+DRM Loss: -4.5679e+01, Err 2-norm:  1.3579e+01, inf-norm: 1.2660e+00
Iteration   2999/ 10000, PINN+DRM Loss: -4.5720e+01, Err 2-norm:  1.3484e+01, inf-norm: 1.2543e+00
Iteration   3499/ 10000, PINN+DRM Loss: -4.5729e+01, Err 2-norm:  1.3517e+01, inf-norm: 1.2585e+00
Iteration   3999/ 10000, PINN+DRM Loss: -4.5729e+01, Err 2-norm:  1.3513e+01, inf-norm: 1.2583e+00
Iteration   4499/ 10000, PINN+DRM Loss: -4.5729e+01, Err 2-norm:  1.3516e+01, inf-norm: 1.2584e+00
Iteration   4999/ 10000, PINN+DRM Loss: -4.5239e+01, Err 2-norm:  1.3345e+01, inf-norm: 1.2448e+00
Iteration   5499/ 10000, PINN+DRM Loss: -4.5729e+01, Err 2-norm:  1.3519e+01, inf-norm: 1.2589e+00
Iteration   5999/ 10000, PINN+DRM Loss: -4.5709e+01, Err 2-norm:  1.3428e+01, inf-norm: 1.2505e+00
Iteration   6499/ 10000, PINN+DRM Loss: -4.5712e+01, Err 2-norm:  1.3508e+01, inf-norm: 1.2570e+00
Iteration   6999/ 10000, PINN+DRM Loss: -4.5659e+01, Err 2-norm:  1.3449e+01, inf-norm: 1.2502e+00
Iteration   7499/ 10000, PINN+DRM Loss: -4.5430e+01, Err 2-norm:  1.3549e+01, inf-norm: 1.2660e+00
Iteration   7999/ 10000, PINN+DRM Loss: -4.5727e+01, Err 2-norm:  1.3490e+01, inf-norm: 1.2558e+00
Iteration   8499/ 10000, PINN+DRM Loss: -4.5729e+01, Err 2-norm:  1.3515e+01, inf-norm: 1.2583e+00
Iteration   8999/ 10000, PINN+DRM Loss: -4.5722e+01, Err 2-norm:  1.3495e+01, inf-norm: 1.2565e+00
Iteration   9499/ 10000, PINN+DRM Loss: -4.5729e+01, Err 2-norm:  1.3517e+01, inf-norm: 1.2583e+00
Iteration   9999/ 10000, PINN+DRM Loss: -4.5598e+01, Err 2-norm:  1.3552e+01, inf-norm: 1.2594e+00
```


```text
python pinn_1d.py --levels 1 --epochs 10000 --lr 1e-3 --activation tanh --sweeps 1 --hidden_dims 128 128 128 128 --high_freq 3 --loss_type 2 --bc_weight 1 --nx 2000 --gamma 0 --aggregator "Random" --enforce_bc --plot
Iteration    499/ 10000, PINN+DRM Loss: 3.9525e+00, Err 2-norm:  1.2545e+01, inf-norm: 1.2468e+00
Iteration    999/ 10000, PINN+DRM Loss: -3.7693e+01, Err 2-norm:  1.1308e+01, inf-norm: 1.0476e+00
Iteration   1499/ 10000, PINN+DRM Loss: -4.2093e+01, Err 2-norm:  1.2699e+01, inf-norm: 1.2083e+00
Iteration   1999/ 10000, PINN+DRM Loss: -9.0663e+00, Err 2-norm:  1.2384e+01, inf-norm: 1.1649e+00
Iteration   2499/ 10000, PINN+DRM Loss: -3.1103e+01, Err 2-norm:  1.0250e+01, inf-norm: 9.4714e-01
Iteration   2999/ 10000, PINN+DRM Loss: -1.2633e+01, Err 2-norm:  1.3792e+01, inf-norm: 1.3089e+00
Iteration   3499/ 10000, PINN+DRM Loss: -4.0297e+01, Err 2-norm:  1.3636e+01, inf-norm: 1.2782e+00
Iteration   3999/ 10000, PINN+DRM Loss: -4.2600e+01, Err 2-norm:  1.4068e+01, inf-norm: 1.2996e+00
Iteration   4499/ 10000, PINN+DRM Loss: -3.1732e+01, Err 2-norm:  1.2627e+01, inf-norm: 1.1623e+00
Iteration   4999/ 10000, PINN+DRM Loss: -3.5047e+01, Err 2-norm:  1.2027e+01, inf-norm: 1.1159e+00
Iteration   5499/ 10000, PINN+DRM Loss: -4.2588e+01, Err 2-norm:  1.4576e+01, inf-norm: 1.3555e+00
Iteration   5999/ 10000, PINN+DRM Loss: -3.0203e+01, Err 2-norm:  1.3808e+01, inf-norm: 1.3062e+00
Iteration   6499/ 10000, PINN+DRM Loss: -3.2692e+01, Err 2-norm:  1.5385e+01, inf-norm: 1.4591e+00
Iteration   6999/ 10000, PINN+DRM Loss: -3.7795e+01, Err 2-norm:  1.1694e+01, inf-norm: 1.0892e+00
Iteration   7499/ 10000, PINN+DRM Loss: -3.8121e+01, Err 2-norm:  1.3208e+01, inf-norm: 1.2493e+00
Iteration   7999/ 10000, PINN+DRM Loss: -3.5225e+01, Err 2-norm:  1.4057e+01, inf-norm: 1.2916e+00
Iteration   8499/ 10000, PINN+DRM Loss: -3.6439e+01, Err 2-norm:  1.3031e+01, inf-norm: 1.1969e+00
Iteration   8999/ 10000, PINN+DRM Loss: -4.2368e+01, Err 2-norm:  1.2165e+01, inf-norm: 1.1320e+00
Iteration   9499/ 10000, PINN+DRM Loss: -3.7700e+01, Err 2-norm:  1.3979e+01, inf-norm: 1.3268e+00
Iteration   9999/ 10000, PINN+DRM Loss: -4.4185e+01, Err 2-norm:  1.3951e+01, inf-norm: 1.2921e+00
```


stevengogogo commented Jun 17, 2025

Experiment: the dual-cone method on DRM

Use $loss = loss_{drm} + 500\,loss_{bc}$:

```text
python pinn_1d.py --levels 1 --epochs 10000 --lr 1e-4 --activation gelu --sweeps 1 --hidden_dims 256 256 256 --high_freq 3 --loss_type 1 --bc_weight 1 --nx 2000 --gamma 0 --aggregator "Constant" 1. 500. --plot
Iteration    999/ 10000, DRM Loss: -3.5050e+01, Err 2-norm:  8.1668e-01, inf-norm: 1.1278e-01
```


Dual Cone method for PINN + DRM

```text
python pinn_1d.py --levels 1 --epochs 10000 --lr 1e-4 --activation gelu --sweeps 1 --hidden_dims 256 256 256 --high_freq 3 --loss_type 2 --bc_weight 1 --nx 2000 --gamma 0 --aggregator "UPGrad" --plot
Training Level 0
Level 0: TRAIN
Iteration    499/ 10000, PINN+DRM Loss: 7.7932e+00, Err 2-norm:  2.2756e+02, inf-norm: 1.6297e+01
Iteration    999/ 10000, PINN+DRM Loss: -3.0725e-01, Err 2-norm:  2.2581e+02, inf-norm: 1.6224e+01
Iteration   1499/ 10000, PINN+DRM Loss: -3.5438e+00, Err 2-norm:  2.2444e+02, inf-norm: 1.6135e+01
Iteration   1999/ 10000, PINN+DRM Loss: -4.0919e+00, Err 2-norm:  2.2425e+02, inf-norm: 1.6124e+01
Iteration   2499/ 10000, PINN+DRM Loss: -4.4187e+00, Err 2-norm:  2.2413e+02, inf-norm: 1.6117e+01
Iteration   2999/ 10000, PINN+DRM Loss: -4.7988e+00, Err 2-norm:  2.2398e+02, inf-norm: 1.6107e+01
Iteration   3499/ 10000, PINN+DRM Loss: -4.9994e+00, Err 2-norm:  2.2388e+02, inf-norm: 1.6102e+01
Iteration   3999/ 10000, PINN+DRM Loss: -5.0996e+00, Err 2-norm:  2.2383e+02, inf-norm: 1.6099e+01
Iteration   4499/ 10000, PINN+DRM Loss: -5.1486e+00, Err 2-norm:  2.2383e+02, inf-norm: 1.6096e+01
Iteration   4999/ 10000, PINN+DRM Loss: -5.1739e+00, Err 2-norm:  2.2382e+02, inf-norm: 1.6095e+01
Iteration   5499/ 10000, PINN+DRM Loss: -5.1908e+00, Err 2-norm:  2.2381e+02, inf-norm: 1.6095e+01
Iteration   5999/ 10000, PINN+DRM Loss: -5.2032e+00, Err 2-norm:  2.2381e+02, inf-norm: 1.6095e+01
Iteration   6499/ 10000, PINN+DRM Loss: -5.2282e+00, Err 2-norm:  2.2379e+02, inf-norm: 1.6096e+01
Iteration   6999/ 10000, PINN+DRM Loss: -5.2406e+00, Err 2-norm:  2.2378e+02, inf-norm: 1.6096e+01
Iteration   7499/ 10000, PINN+DRM Loss: -5.2289e+00, Err 2-norm:  2.2379e+02, inf-norm: 1.6094e+01
Iteration   7999/ 10000, PINN+DRM Loss: -5.2364e+00, Err 2-norm:  2.2379e+02, inf-norm: 1.6093e+01
Iteration   8499/ 10000, PINN+DRM Loss: -5.2364e+00, Err 2-norm:  2.2379e+02, inf-norm: 1.6093e+01
Iteration   8999/ 10000, PINN+DRM Loss: -5.2717e+00, Err 2-norm:  2.2378e+02, inf-norm: 1.6096e+01
Iteration   9499/ 10000, PINN+DRM Loss: -5.2721e+00, Err 2-norm:  2.2378e+02, inf-norm: 1.6095e+01
Iteration   9999/ 10000, PINN+DRM Loss: -5.2785e+00, Err 2-norm:  2.2378e+02, inf-norm: 1.6095e+01
```


stevengogogo commented Jun 17, 2025

Experiment: verify that dual-cone optimization helps PINNs with a soft BC loss

Dual Cone

```text
python pinn_1d.py --levels 1 --epochs 10000 --lr 1e-4 --activation gelu --sweeps 1 --hidden_dims 256 256 256 --high_freq 3 --loss_type 0 --bc_weight 1 --nx 2000 --gamma 0 --aggregator "UPGrad" --plot
Training Level 0
Level 0: TRAIN
Iteration    499/ 10000, PINN Loss: 1.1153e+01, Err 2-norm:  1.4233e+01, inf-norm: 1.4310e+00
Iteration    999/ 10000, PINN Loss: 1.0686e+00, Err 2-norm:  5.0536e-01, inf-norm: 4.5360e-02
Iteration   1499/ 10000, PINN Loss: 3.5229e-01, Err 2-norm:  2.2734e-01, inf-norm: 2.2083e-02
Iteration   1999/ 10000, PINN Loss: 1.1591e-01, Err 2-norm:  1.0089e-01, inf-norm: 1.1598e-02
Iteration   2499/ 10000, PINN Loss: 5.2839e-02, Err 2-norm:  4.8972e-02, inf-norm: 5.6659e-03
Iteration   2999/ 10000, PINN Loss: 2.9012e-02, Err 2-norm:  3.4169e-02, inf-norm: 3.8587e-03
Iteration   3499/ 10000, PINN Loss: 1.9204e-02, Err 2-norm:  2.0535e-02, inf-norm: 2.3413e-03
Iteration   3999/ 10000, PINN Loss: 1.3088e-02, Err 2-norm:  1.8215e-02, inf-norm: 2.0128e-03
Iteration   4499/ 10000, PINN Loss: 1.4971e-01, Err 2-norm:  5.1648e-02, inf-norm: 7.5520e-03
Iteration   4999/ 10000, PINN Loss: 7.1656e-03, Err 2-norm:  1.2028e-02, inf-norm: 1.9021e-03
Iteration   5499/ 10000, PINN Loss: 4.7366e-03, Err 2-norm:  7.8400e-03, inf-norm: 9.1186e-04
Iteration   5999/ 10000, PINN Loss: 3.6847e-03, Err 2-norm:  6.0205e-03, inf-norm: 7.1356e-04
Iteration   6499/ 10000, PINN Loss: 2.9949e-03, Err 2-norm:  4.2069e-03, inf-norm: 3.9807e-04
Iteration   6999/ 10000, PINN Loss: 2.4976e-03, Err 2-norm:  3.9032e-03, inf-norm: 4.7027e-04
Iteration   7499/ 10000, PINN Loss: 2.1062e-03, Err 2-norm:  3.2904e-03, inf-norm: 3.9752e-04
Iteration   7999/ 10000, PINN Loss: 1.7800e-03, Err 2-norm:  2.6893e-03, inf-norm: 3.2555e-04
Iteration   8499/ 10000, PINN Loss: 1.5204e-03, Err 2-norm:  2.2626e-03, inf-norm: 2.7375e-04
Iteration   8999/ 10000, PINN Loss: 1.3033e-03, Err 2-norm:  1.9027e-03, inf-norm: 2.2980e-04
Iteration   9499/ 10000, PINN Loss: 3.6962e-02, Err 2-norm:  5.8394e-02, inf-norm: 7.6404e-03
Iteration   9999/ 10000, PINN Loss: 9.7147e-04, Err 2-norm:  1.1256e-03, inf-norm: 1.2255e-04
```


Snapshot of the gradient weights:

```text
Weights: tensor([0.5813, 1.0964], device='cuda:0')
Cosine similarity: 0.9370
Weights: tensor([0.5808, 1.0967], device='cuda:0')
Cosine similarity: 0.9372
Weights: tensor([0.5804, 1.0970], device='cuda:0')
Cosine similarity: 0.9374
```

PINN with $loss = loss_{pinn} + loss_{bc}$:

```text
python pinn_1d.py --levels 1 --epochs 10000 --lr 1e-4 --activation gelu --sweeps 1 --hidden_dims 256 256 256 --high_freq 3 --loss_type 0 --bc_weight 1 --nx 2000 --gamma 0 --plot
Iteration    499/ 10000, PINN Loss: 1.7111e+01, Err 2-norm:  3.9158e+00, inf-norm: 4.6693e-01
Iteration    999/ 10000, PINN Loss: 6.1949e-01, Err 2-norm:  1.1616e+00, inf-norm: 1.3581e-01
Iteration   1499/ 10000, PINN Loss: 1.9982e-01, Err 2-norm:  2.5898e-01, inf-norm: 3.0343e-02
Iteration   1999/ 10000, PINN Loss: 1.2159e-01, Err 2-norm:  1.2659e-01, inf-norm: 1.5461e-02
Iteration   2499/ 10000, PINN Loss: 7.9549e-02, Err 2-norm:  8.1205e-02, inf-norm: 9.7390e-03
Iteration   2999/ 10000, PINN Loss: 4.8445e-02, Err 2-norm:  4.6060e-02, inf-norm: 5.8364e-03
Iteration   3499/ 10000, PINN Loss: 3.3523e-02, Err 2-norm:  3.5849e-02, inf-norm: 4.3601e-03
Iteration   3999/ 10000, PINN Loss: 2.5072e-02, Err 2-norm:  2.8865e-02, inf-norm: 3.5170e-03
Iteration   4499/ 10000, PINN Loss: 1.9703e-02, Err 2-norm:  2.3881e-02, inf-norm: 2.9146e-03
Iteration   4999/ 10000, PINN Loss: 1.5842e-02, Err 2-norm:  2.0890e-02, inf-norm: 2.5257e-03
Iteration   5499/ 10000, PINN Loss: 1.3097e-02, Err 2-norm:  1.8221e-02, inf-norm: 2.1839e-03
Iteration   5999/ 10000, PINN Loss: 1.1010e-02, Err 2-norm:  1.6584e-02, inf-norm: 1.8699e-03
Iteration   6499/ 10000, PINN Loss: 9.4804e-03, Err 2-norm:  1.4153e-02, inf-norm: 1.6923e-03
Iteration   6999/ 10000, PINN Loss: 8.2994e-03, Err 2-norm:  1.2402e-02, inf-norm: 1.5738e-03
Iteration   7499/ 10000, PINN Loss: 2.5371e-02, Err 2-norm:  3.7283e-02, inf-norm: 3.5382e-03
Iteration   7999/ 10000, PINN Loss: 9.1092e-03, Err 2-norm:  1.5372e-02, inf-norm: 2.2640e-03
Iteration   8499/ 10000, PINN Loss: 5.9730e-03, Err 2-norm:  8.5018e-03, inf-norm: 1.0841e-03
Iteration   8999/ 10000, PINN Loss: 5.4161e-03, Err 2-norm:  8.3174e-03, inf-norm: 9.8834e-04
Iteration   9499/ 10000, PINN Loss: 4.9422e-03, Err 2-norm:  7.5278e-03, inf-norm: 8.9516e-04
Iteration   9999/ 10000, PINN Loss: 2.7889e-02, Err 2-norm:  4.5591e-02, inf-norm: 4.5103e-03
```


stevengogogo marked this pull request as ready for review June 18, 2025 17:22
stevengogogo requested a review from liruipeng June 18, 2025 17:22
stevengogogo commented Jun 18, 2025

@liruipeng Please take a look; it's ready. The main change is that the loss function now returns two outputs, the total loss and `losses: list`. This makes multi-objective learning possible and allows printing each loss separately.
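The two-output convention can be sketched as follows; the helper name, the residual definitions, and the signature are illustrative, not the actual code in pinn_1d.py:

```python
import numpy as np

def loss_func(u: np.ndarray, u_ex: np.ndarray, bc_weight: float = 1.0):
    """Return the scalar total loss plus the list of individual losses.

    The list lets an aggregator act on each objective's gradient
    separately, and lets the training loop log every loss on its own.
    (Illustrative residuals; the script's actual losses differ.)
    """
    loss_pinn = float(np.mean((u - u_ex) ** 2))                       # interior residual
    loss_bc = float((u[0] - u_ex[0]) ** 2 + (u[-1] - u_ex[-1]) ** 2)  # boundary mismatch
    losses = [loss_pinn, loss_bc]
    total = loss_pinn + bc_weight * loss_bc
    return total, losses

u = np.array([0.0, 0.5, 1.1])
u_ex = np.array([0.0, 0.5, 1.0])
total, losses = loss_func(u, u_ex, bc_weight=500.0)
assert len(losses) == 2 and total >= max(losses)
```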

stevengogogo added the enhancement (New feature or request) label Jun 18, 2025
stevengogogo commented Jun 20, 2025

Fix the CI package and the merge conflict.

```diff
 u = model.get_solution(x)
 loss = loss_func(u, mesh.u_ex)
-return loss
+return loss, [loss,]
```
Inline review comment: major change

stevengogogo added the RFR (ready for review) label Oct 5, 2025
Successfully merging this pull request may close these issues.

PINN loss and DRM loss help getting rid of local minima by gradient selection