v0.8.0 #483
ValerianRey announced in Announcements
🔥 Autogram: a new engine for Jacobian descent 🔥
After months of hard work, we're happy to release autogram: a new engine to compute the Gramian `G = J @ J.T` of the Jacobian `J` of the losses with respect to the model parameters. Have you ever had memory issues while using TorchJD? Try out this new approach!

This Gramian is computed iteratively, while only having parts of `J` in memory at a time, so it is much more memory-efficient than computing the full Jacobian and multiplying it by its transpose.

Why does the Gramian of the Jacobian matter?
Most aggregators simply take a weighted combination of the rows of the Jacobian, with weights that depend only on the Gramian of the Jacobian. So while in standard Jacobian descent you compute the Jacobian `J` and aggregate it into a vector to update the model, in Gramian-based Jacobian descent you directly compute the Gramian of the Jacobian, extract weights from this Gramian, and backward the weighted combination of the losses.

This is equivalent to standard Jacobian descent, but much more memory-efficient, because the Jacobian never has to be fully stored in memory. It's thus also typically much faster, especially for instance-wise risk minimization (IWRM). For more theoretical justifications, please read Section 6 of our paper.
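To make the equivalence concrete, here is a small self-contained PyTorch sketch (plain `torch`, not the TorchJD API): it materializes the Jacobian of a few per-instance losses, extracts weights that depend only on the Gramian (a simple inverse-gradient-norm weighting, chosen purely for illustration; UPGrad and friends are also functions of the Gramian), and checks that backwarding the weighted sum of losses gives the same gradient as aggregating the Jacobian directly.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(10, 1)
x, y = torch.randn(6, 10), torch.randn(6)
losses = (model(x).squeeze(1) - y) ** 2          # one loss per instance

# Standard Jacobian descent: materialize the full Jacobian J (one row per loss).
params = list(model.parameters())
rows = []
for loss in losses:
    grads = torch.autograd.grad(loss, params, retain_graph=True)
    rows.append(torch.cat([g.flatten() for g in grads]))
J = torch.stack(rows)                            # shape (num_losses, num_params)

# Weights that depend only on the Gramian G = J @ J.T.
# (Illustrative inverse-norm weighting; real Weightings are more sophisticated.)
G = J @ J.T
w = 1.0 / G.diagonal().sqrt()

# Gramian-based route: never store J, just backward the weighted loss sum.
model.zero_grad()
(w @ losses).backward()
grad = torch.cat([p.grad.flatten() for p in params])

assert torch.allclose(grad, w @ J, atol=1e-5)    # same update direction
```

The key point is that `w` is computed from `G` alone, so in the Gramian-based route the full `J` is only built here to verify the result; the autogram engine avoids materializing it at all.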
How to make the switch?
Old engine (autojac):
New engine (autogram):
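Going by the class names listed in the changelog below, the switch should look roughly like the following. Both snippets are an unverified sketch: the exact `Engine` constructor signature and the `compute_gramian` method name are assumptions, so check the documentation for the actual interface.

```python
# Old engine (autojac): aggregate the full Jacobian of the losses.
from torchjd import backward
from torchjd.aggregation import UPGrad

backward(losses, aggregator=UPGrad())  # approximate pre-0.8 usage

# New engine (autogram): compute the Gramian iteratively, extract weights
# from it, then backward the weighted combination of the losses.
from torchjd.autogram import Engine
from torchjd.aggregation import UPGradWeighting

engine = Engine(model)                    # assumed constructor
gramian = engine.compute_gramian(losses)  # assumed method name
weights = UPGradWeighting()(gramian)
losses.backward(gradient=weights)         # backward the weighted combination
```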
We're still working on making the engine even faster, but with this release you can already start using it. The interface is likely to change in the future, but adapting to these changes should always be easy!
Please open issues if you run into any problems while using it or if you have suggestions for improvements!
Changelog
Added
- `autogram` package, with the `autogram.Engine`. This is an implementation of Algorithm 3 from Jacobian Descent for Multi-Objective Optimization, optimized for batched computations, as in IWRM. Generalized Gramians can also be obtained by using the autogram engine on a tensor of losses of arbitrary shape.
- Aggregators based on the weighting of the Gramian of the Jacobian made their `Weighting` class public. It can be used directly on a Gramian (computed via the `autogram.Engine`) to extract some weights. The list of new public classes is: `Weighting` (abstract base class), `UPGradWeighting`, `AlignedMTLWeighting`, `CAGradWeighting`, `ConstantWeighting`, `DualProjWeighting`, `IMTLGWeighting`, `KrumWeighting`, `MeanWeighting`, `MGDAWeighting`, `PCGradWeighting`, `RandomWeighting`, `SumWeighting`.
- `GeneralizedWeighting` (base class) and `Flattening` (implementation) to extract tensors of weights from generalized Gramians.
Changed
- Slight performance improvement in `autojac`.
- `backward` and `mtl_backward` importable from `torchjd.autojac` (like it was prior to 0.7.0).

Deprecated

- Importing `backward` and `mtl_backward` from `torchjd` directly.

This discussion was created from the release v0.8.0.