Summary: this diff adds optimizer into param_info, and the associated implementations for modelhelper and brew to set optimizer for each individual parameter.
Reviewed By: kennyhorror
Differential Revision: D5385432
fbshipit-source-id: 5d682f9d1ab077e04a5d76a24d71470f4e64fc92
Summary:
This diff is introducing abstractions for parameter sharing for all the
parameters, that are created through new create_param syntax.
Possible use-cases of this parameters sharing:
1. Share params within RNN interface.
2. Some complicated models that might share some of the branches.
3. TODO (next diff): Cross-model parameter sharing.
Reviewed By: salexspb
Differential Revision: D5160935
fbshipit-source-id: c6d40a5ed7ead240cd7db0eb69de6dc5f505b05a
Summary:
This diff is creating new type of Initializer - ExternalInitializer. This
initializer is supposed to be used in cases when the parameter blob is already
expected to exist in the workspace.
Reviewed By: dzhulgakov
Differential Revision: D5171322
fbshipit-source-id: d27861f0f80afdea93c235d49f63da19adccc92c
Summary:
This diff is the first step in the effort for refactoring all parameters. As a first step - I'm merging concept of params and computed_params, that is going
to be based on tags instead (in the first version it's still using old data structs to store all the BlobReferences).
Renaming computed_params to non-trainable/non-backprop params should be done is some other diff.
Reviewed By: salexspb
Differential Revision: D5171159
fbshipit-source-id: 68031ca779f053fb266a7c4a2e5b482a3bd9c832
Summary:
This diff is the first step in the effort for refactoring all paramters. As a
first step - I'm merging concept of params and computed_params, that is going
to be based on tags instead (in the first version it's still using old data
structs to store all the BlobReferences).
Renaming computed_params to non-trainable/non-backprop params should be done is
some other diff.
Reviewed By: salexspb
Differential Revision: D5119830
fbshipit-source-id: 2001090a37346eb12abbb234e13e727c288eb8a7
Summary:
Adds support for generating and training pfp16 models. Added SGD optimizer for multi-precision trainers and a new callback to data_parallel_model in order to help multi-precision models keep their different copies of parameters in sync during training.
Closes https://github.com/caffe2/caffe2/pull/697
Differential Revision: D5159712
Pulled By: salexspb
fbshipit-source-id: 60a889494d2e2f4df1d720331e19f638c5eb95cc
Summary:
This is going to unblock Nvidia in their work on adding fp16
support to Caffe2. I discussed this with kennyhorror before to make
sure this fits into his work on parameter sharing.
Reviewed By: kennyhorror
Differential Revision: D5127797
fbshipit-source-id: 4db155d320b1862570c23b77c4252bdacbf2296f