Optimizers

class geoopt.optim.RiemannianAdam(*args, stabilize=None, **kwargs)[source]

Riemannian Adam with the same API as torch.optim.Adam.

Parameters:
  • params (iterable) – iterable of parameters to optimize or dicts defining parameter groups
  • lr (float (optional)) – learning rate (default: 1e-3)
  • betas (Tuple[float, float] (optional)) – coefficients used for computing running averages of gradient and its square (default: (0.9, 0.999))
  • eps (float (optional)) – term added to the denominator to improve numerical stability (default: 1e-8)
  • weight_decay (float (optional)) – weight decay (L2 penalty) (default: 0)
  • amsgrad (bool (optional)) – whether to use the AMSGrad variant of this algorithm from the paper On the Convergence of Adam and Beyond (default: False)
Other Parameters:
  • stabilize (int) – Stabilize parameters every stabilize steps if they drift off the manifold due to numerical error (default: None – no stabilization)
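
Example: a minimal usage sketch. The Sphere manifold, target vector, and hyperparameter values below are illustrative assumptions; only the optimizer API follows the documentation above.

    import torch
    import geoopt

    # Constrain a vector to the unit sphere and rotate it toward a fixed target.
    sphere = geoopt.manifolds.Sphere()
    target = torch.randn(10)
    target = target / target.norm()

    start = torch.randn(10)
    start = start / start.norm()  # a valid point on the sphere
    point = geoopt.ManifoldParameter(start, manifold=sphere)

    optimizer = geoopt.optim.RiemannianAdam([point], lr=1e-2, stabilize=10)

    for _ in range(100):
        optimizer.zero_grad()
        loss = -(point @ target)  # maximize alignment with the target direction
        loss.backward()
        optimizer.step()  # Riemannian update keeps `point` on the sphere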

step(closure=None)[source]

Performs a single optimization step.

Parameters:
  • closure (callable, optional) – A closure that reevaluates the model and returns the loss.
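
Example: step also accepts a closure that recomputes the loss; a sketch continuing the illustrative example above.

    # `step` calls the closure (if given) and returns the resulting loss.
    def closure():
        optimizer.zero_grad()
        loss = -(point @ target)
        loss.backward()
        return loss

    loss = optimizer.step(closure)
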
class geoopt.optim.RiemannianSGD(params, lr, momentum=0, dampening=0, weight_decay=0, nesterov=False, stabilize=None)[source]

Riemannian Stochastic Gradient Descent with the same API as torch.optim.SGD.

Parameters:
  • params (iterable) – iterable of parameters to optimize or dicts defining parameter groups
  • lr (float) – learning rate
  • momentum (float (optional)) – momentum factor (default: 0)
  • weight_decay (float (optional)) – weight decay (L2 penalty) (default: 0)
  • dampening (float (optional)) – dampening for momentum (default: 0)
  • nesterov (bool (optional)) – enables Nesterov momentum (default: False)
Other Parameters:
  • stabilize (int) – Stabilize parameters every stabilize steps if they drift off the manifold due to numerical error (default: None – no stabilization)
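
Example: a minimal usage sketch on the Stiefel manifold. The manifold choice, data, and hyperparameter values are illustrative assumptions; only the optimizer API follows the documentation above.

    import torch
    import geoopt

    # Fit an orthonormal basis (a point on the Stiefel manifold) that best
    # reconstructs the data in a least-squares sense.
    stiefel = geoopt.manifolds.Stiefel()
    x = torch.randn(128, 20)

    q, _ = torch.linalg.qr(torch.randn(20, 5))  # orthonormal columns: a valid start
    w = geoopt.ManifoldParameter(q, manifold=stiefel)

    optimizer = geoopt.optim.RiemannianSGD([w], lr=1e-2, momentum=0.9, stabilize=100)

    for _ in range(200):
        optimizer.zero_grad()
        recon = x @ w @ w.t()             # project onto the subspace and back
        loss = (x - recon).pow(2).mean()
        loss.backward()
        optimizer.step()                  # the update keeps w orthonormal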

step(closure=None)[source]

Performs a single optimization step (parameter update).

Parameters:
  • closure (callable, optional) – A closure that reevaluates the model and returns the loss.