MyAdagrad

class MyAdagrad(params, lr=0.01, lr_decay=0, init_accu_value=0.1, weight_decay=0)[source]

Modification of the Adagrad optimizer that allows specifying an initial accumulator value. This mimics the behavior of the default Adagrad implementation in TensorFlow, whereas the default PyTorch Adagrad uses 0 as the initial accumulator value.

Parameters
  • params (iterable) – iterable of parameters to optimize or dicts defining parameter groups

  • lr (float, optional) – learning rate (default: 1e-2)

  • lr_decay (float, optional) – learning rate decay (default: 0)

  • init_accu_value (float, optional) – initial accumulator value (default: 0.1)

  • weight_decay (float, optional) – weight decay (L2 penalty) (default: 0)
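
The constructor follows the standard torch.optim interface, so the optimizer can be used in an ordinary training loop. The sketch below is a minimal illustration; the import path and the toy model are assumptions, not part of this documentation.

    import torch
    import torch.nn.functional as F
    from mymodule import MyAdagrad  # hypothetical import path for MyAdagrad

    model = torch.nn.Linear(10, 2)
    # init_accu_value matches TensorFlow's default initial accumulator of 0.1
    optimizer = MyAdagrad(model.parameters(), lr=0.01, init_accu_value=0.1)

    x, y = torch.randn(4, 10), torch.randn(4, 2)
    optimizer.zero_grad()
    loss = F.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()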

step(closure=None)[source]

Performs a single optimization step.

Parameters

closure (callable, optional) – A closure that reevaluates the model and returns the loss.
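
As with other torch.optim optimizers, step() can be given a closure that re-evaluates the model and returns the loss. A minimal sketch, again assuming a hypothetical import path and toy model:

    import torch
    import torch.nn.functional as F
    from mymodule import MyAdagrad  # hypothetical import path for MyAdagrad

    model = torch.nn.Linear(10, 2)
    optimizer = MyAdagrad(model.parameters(), lr=0.01)
    x, y = torch.randn(4, 10), torch.randn(4, 2)

    def closure():
        # Re-evaluate the model and return the loss, as step() expects.
        optimizer.zero_grad()
        loss = F.mse_loss(model(x), y)
        loss.backward()
        return loss

    loss = optimizer.step(closure)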