AdagradDecay optimizer - Platform For AI - Alibaba Cloud Documentation Center

This topic describes how to use the AdagradDecay optimizer to train ultra-large-scale models.

Warning

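GPU-accelerated servers are being phased out. You can still submit TensorFlow tasks that run on CPU servers. To use GPU-accelerated instances for model training, go to Deep Learning Containers (DLC) and submit your jobs there. For more information, see "Submit training jobs".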

Background information

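In most cases, more than one billion samples are used to train an ultra-large-scale model, and the number of samples keeps growing. Training can last more than a month. To handle training at this scale, PAI-TensorFlow provides the AdagradDecay optimizer, which periodically decays the Adagrad accumulator so that it does not grow without bound.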

Enable the AdagradDecay optimizer

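To use the AdagradDecay optimizer for ultra-large-scale model training, instantiate tf.train.AdagradDecayOptimizer. You can use it in the same way as the native TensorFlow optimizers. The following code shows the definition of the optimizer and its parameters: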

class AdagradDecayOptimizer(optimizer.Optimizer):
  """Optimizer that implements the Adagrad algorithm with accumulator decay.
  Unlike the original Adagrad algorithm, AdagradDecay decays the accumulator
  at a given step interval with a given rate, so that the accumulator does
  not grow without bound.
  """
  def __init__(self,
               learning_rate,
               global_step,
               initial_accumulator_value=0.1,
               accumulator_decay_step=100000,
               accumulator_decay_rate=0.9,
               use_locking=False,
               name="AdagradDecay"):
    """Construct a new AdagradDecay optimizer.
    Args:
      learning_rate: A `Tensor` or a floating point value.  The learning rate.
      global_step: Global step variable, used for calculating t % T.
      initial_accumulator_value: A floating point value. Starting and baseline
        value for the accumulators. Must be positive. The accumulators never
        fall below this value.
      accumulator_decay_step: Each time global_step reaches a multiple of
        accumulator_decay_step, the accumulator is decayed by
        accumulator_decay_rate: accumulator *= accumulator_decay_rate.
      accumulator_decay_rate: Decay rate, as described above.
      use_locking: If `True` use locks for update operations.
      name: Optional name prefix for the operations created when applying
        gradients.  Defaults to "AdagradDecay".
    Raises:
      ValueError: If the `initial_accumulator_value`, `accumulator_decay_step`
        or `accumulator_decay_rate` is invalid.
    """
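
To make the parameters above concrete, the following NumPy sketch imitates the update rule that the docstring describes. It is not the PAI-TensorFlow implementation: the function names adagrad_decay_step and final_accumulator are hypothetical, and the order of operations (decay first, then clamp to initial_accumulator_value, then accumulate the squared gradient) is an assumption, because the source only states that the accumulator is multiplied by accumulator_decay_rate at multiples of accumulator_decay_step and never falls below its initial value.

```python
import numpy as np


def adagrad_decay_step(var, accum, grad, global_step,
                       learning_rate=0.1,
                       initial_accumulator_value=0.1,
                       accumulator_decay_step=100000,
                       accumulator_decay_rate=0.9):
    """Apply one AdagradDecay-style update and return (new_var, new_accum).

    Illustrative sketch only; the ordering of the steps is an assumption.
    """
    # Decay the accumulator whenever global_step is a multiple of the decay step.
    if global_step > 0 and global_step % accumulator_decay_step == 0:
        accum = accum * accumulator_decay_rate
        # The docstring says the accumulator never falls below its initial value.
        accum = np.maximum(accum, initial_accumulator_value)
    # Standard Adagrad: accumulate squared gradients, then scale the step.
    accum = accum + grad * grad
    var = var - learning_rate * grad / np.sqrt(accum)
    return var, accum


def final_accumulator(decay_rate, steps=1000, decay_step=100):
    """Run `steps` updates with a constant gradient and return the accumulator."""
    var = np.array([1.0])
    accum = np.full_like(var, 0.1)
    grad = np.array([1.0])
    for step in range(1, steps + 1):
        var, accum = adagrad_decay_step(
            var, accum, grad, step,
            accumulator_decay_step=decay_step,
            accumulator_decay_rate=decay_rate)
    return float(accum[0])


# A decay rate of 1.0 disables decay, which reduces to plain Adagrad.
plain = final_accumulator(decay_rate=1.0)
decayed = final_accumulator(decay_rate=0.9)
print("accumulator without decay:", plain)
print("accumulator with decay:   ", decayed)
```

With a constant gradient, the undecayed accumulator grows linearly with the number of steps, so the effective learning rate keeps shrinking. The decayed accumulator stays smaller, which is the behavior the optimizer is meant to provide over very long training runs.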