DADVI (paper) pre-samples the Gaussian noise in `init` and is then fully deterministic in `update`, which makes for more stable training. It needs more samples, so it might require gradient accumulation support (Gradient accumulation #52).
Last-layer deterministic VI (paper) provides a handy deterministic objective with linear last layers for regression and classification. Might be worth adding if we can generalise to exponential family losses and/or linearise the model.
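To make the DADVI idea concrete, here is a minimal sketch of the fixed-noise pattern, assuming a 1-D mean-field Gaussian and a toy Gaussian target. The `init`/`update` names follow the description above, but the signatures, the state dict, the hand-written gradients, and the target density are all hypothetical choices for illustration, not this library's API or the paper's reference implementation.

```python
import math
import random

# Hypothetical toy target: log density of N(3, 2^2), standing in for a model's log posterior.
TARGET_MEAN, TARGET_STD = 3.0, 2.0


def grad_log_p(theta):
    """Gradient of the toy target's log density with respect to theta."""
    return -(theta - TARGET_MEAN) / TARGET_STD**2


def init(n_samples, seed=0):
    """Draw the Gaussian noise once; this is the only source of randomness."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    return {"mu": 0.0, "log_sigma": 0.0, "noise": noise}


def update(state, lr=0.1):
    """One deterministic gradient step on the fixed-noise negative ELBO.

    Objective: -(1/N) * sum_i log p(mu + sigma * z_i) - log_sigma,
    where z_i is the stored noise and log_sigma is the Gaussian entropy up to a constant.
    """
    mu, log_sigma, noise = state["mu"], state["log_sigma"], state["noise"]
    sigma = math.exp(log_sigma)
    n = len(noise)
    # Gradient of log p at each reparameterised sample theta_i = mu + sigma * z_i.
    g = [grad_log_p(mu + sigma * z) for z in noise]
    grad_mu = -sum(g) / n
    # Chain rule through sigma = exp(log_sigma), plus the entropy term's -1.
    grad_log_sigma = -sum(gi * z * sigma for gi, z in zip(g, noise)) / n - 1.0
    return {
        "mu": mu - lr * grad_mu,
        "log_sigma": log_sigma - lr * grad_log_sigma,
        "noise": noise,  # reused unchanged on every step
    }


state = init(n_samples=256)
for _ in range(2000):
    state = update(state)
print(state["mu"], math.exp(state["log_sigma"]))
```

Because the noise never changes, calling `update` twice on the same state gives identical results, and the optimum depends only on the draw made in `init`. The sums over `noise` are where gradient accumulation would apply if the sample count is too large to process in a single pass.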