pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-15 21:00:47 +00:00

Taiqing Wang 8cb1f2f9dc implement L2 regularization for Adagrad in caffe2 and dper (#37705 )	2020-05-03 10:42:49 -07:00

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37705
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37372

Posted note: [Regularizing SparseNN Against Over-fitting](https://fb.workplace.com/notes/taiqing-wang/regularizing-sparsenn-against-over-fitting/220306075902708/)

Problem formulation

L(w) = J(w) + lambda/2 * ||w||^2

J(w) is the empirical loss, and ||w||^2 is the squared L2 norm of the parameters, a.k.a. the L2 regularizer.

dL(w)/dw_i = dJ(w)/dw_i + lambda * w_i

dL(w)/dw_i is the gradient of L(w) w.r.t. w_i. To implement the L2 regularizer, lambda * w_i is added to the gradient of J(w) w.r.t. w_i. lambda is called weight decay in this implementation.

Code changes
* In the initialization method of AdagradOptimizer, a new input argument, weight_decay, is added.
* In the _run function of AdagradOptimizer, weight decay is skipped for 1d bias vectors.
* In the parameter update functions of Adagrad, weight_decay * w_i is added to the gradient. The default value for weight_decay is zero.

Test Plan:
`buck build caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test_weight_decay`
`./buck-out/gen/caffe2/caffe2/fb/dper/layer_models/tests/split_1/sparse_nn_test_weight_decay#binary.par`

Reviewed By: jspark1105
Differential Revision: D21258652
fbshipit-source-id: d2366ddcd736a03205a2d16f914703b16d9fce8f
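
The gradient rule in the commit message above can be sketched in plain NumPy. This is only an illustration under stated assumptions, not the caffe2 operator: the names `adagrad_step` and `should_decay` are hypothetical, the epsilon placement follows the common textbook form of Adagrad, and treating "1d bias vectors" as "arrays with a single dimension" is a guess at the intent of the `_run` change.

```python
import numpy as np


def should_decay(param):
    # Hypothetical helper: the commit skips weight decay for 1d bias vectors,
    # so only parameters with more than one dimension are regularized here.
    return param.ndim > 1


def adagrad_step(w, grad, moment, lr=0.01, epsilon=1e-5, weight_decay=0.0):
    """One dense Adagrad update with the L2 term folded into the gradient.

    Assumed textbook form; the real caffe2 kernels may differ in detail.
    """
    if weight_decay != 0.0 and should_decay(w):
        # dL/dw_i = dJ/dw_i + lambda * w_i
        grad = grad + weight_decay * w
    # Accumulate the squared (regularized) gradient per coordinate.
    moment = moment + grad * grad
    # Per-coordinate scaled step.
    w = w - lr * grad / (np.sqrt(moment) + epsilon)
    return w, moment


# Usage: with a zero empirical gradient, only the L2 term moves the weights.
w = np.array([[1.0, -2.0]])
zero_grad = np.zeros_like(w)
new_w, new_moment = adagrad_step(
    w, zero_grad, np.zeros_like(w), lr=0.1, weight_decay=0.5
)

# A 1d "bias" is left untouched by the same call.
b = np.array([1.0, -2.0])
new_b, _ = adagrad_step(b, np.zeros_like(b), np.zeros_like(b), lr=0.1, weight_decay=0.5)
```

With `weight_decay` left at its default of zero, the step reduces to ordinary Adagrad, which matches the default described in the commit.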
..
benchmarks	[caffe2] open source 2/4-bit SLS operators (#34903 )	2020-03-17 22:55:10 -07:00
docs
examples	Fix typos, via a Levenshtein-type corrector (#31523 )	2020-01-17 16:03:19 -08:00
fakelowp	Make FakeLowP tests work (#36525 )	2020-04-13 20:16:33 -07:00
helpers	Fix typos, via a Levenshtein-type corrector (#31523 )	2020-01-17 16:03:19 -08:00
ideep	Remove (most) Python 2 support from Python code (#35615 )	2020-04-22 09:23:14 -07:00
layers	Add LN after specialzied output embeddings and flexible LCE (#35178 )	2020-04-30 15:32:09 -07:00
mint
mkl	Fix typos, via a Levenshtein-type corrector (#31523 )	2020-01-17 16:03:19 -08:00
modeling	Fix typos, via a Levenshtein-type corrector (#31523 )	2020-01-17 16:03:19 -08:00
models	Make caffe2/caffe2/python/models/seq2seq python3 compatible	2020-02-04 10:51:47 -08:00
onnx	Skip c2 ref onnx model tests (#37591 )	2020-04-30 14:32:47 -07:00
operator_test	implement L2 regularization for Adagrad in caffe2 and dper (#37705 )	2020-05-03 10:42:49 -07:00
predictor	[DPER3][Shape inference] Update Shape Information in dper3 backend (#34475 )	2020-03-19 13:49:34 -07:00
rnn
serialized_test	[Caffe2] Fix shape inference for element-wise operators (#33431 )	2020-02-25 09:03:06 -08:00
test
trt	Fix typos, via a Levenshtein-type corrector (#31523 )	2020-01-17 16:03:19 -08:00
__init__.py	Fix dll load logic for Python 3.8 on Windows (#32215 )	2020-01-22 08:33:34 -08:00
_import_c_extension.py	[AMD] Remove num_gpu check for remote execution (#34318 )	2020-03-06 09:53:57 -08:00
allcompare_test.py
attention.py
benchmark_generator.py
binarysize.py
brew.py
brew_test.py
build.py
cached_reader.py
caffe_translator.py
caffe_translator_test.py
checkpoint.py	Fix typos, via a Levenshtein-type corrector (#31523 )	2020-01-17 16:03:19 -08:00
checkpoint_test.py
CMakeLists.txt
cnn.py
compatibility.py
context.py
context_test.py
control.py
control_ops_grad.py
control_ops_grad_test.py
control_ops_util.py
control_test.py
convert.py
convert_test.py
convnet_benchmarks.py
convnet_benchmarks_test.py
core.py	[net_transform] only skip ConstantFill for autogen_grad (#34628 )	2020-03-11 19:09:52 -07:00
core_gradients_test.py
core_test.py
crf.py	Fix typos, via a Levenshtein-type corrector (#31523 )	2020-01-17 16:03:19 -08:00
crf_predict.py
crf_viterbi_test.py
data_parallel_model.py
data_parallel_model_test.py	Skips test_equiv_recurrent (#29255 )	2019-11-06 13:29:23 -08:00
data_workers.py
data_workers_test.py	Disables test_atomic_ops and testInputOrder (#29145 )	2019-11-05 16:53:53 -08:00
dataio.py
dataio_test.py	Fix typos, via a Levenshtein-type corrector (#31523 )	2020-01-17 16:03:19 -08:00
dataset.py
db_file_reader.py
db_test.py
device_checker.py
dlpack.h	Fix typos (#30606 )	2019-12-02 20:17:42 -08:00
dyndep.py
embedding_generation_benchmark.py
experiment_util.py
extension_loader.py
filler_test.py
functional.py	Fix typos, via a Levenshtein-type corrector (#31523 )	2020-01-17 16:03:19 -08:00
functional_test.py
fused_8bit_rowwise_conversion_ops_test.py	[caffe2] make fused rowwise quant/dequant op work for N-dim tensors (#33426 )	2020-02-19 23:29:42 -08:00
gradient_check_test.py
gradient_checker.py	[caffe2] fix type and shape inference for common gradient ops (#35857 )	2020-04-02 11:17:04 -07:00
gru_cell.py
hip_test_util.py
hsm_util.py
hypothesis_test.py	Fix typos (#30606 )	2019-12-02 20:17:42 -08:00
hypothesis_test_util.py	[caffe2] fix type and shape inference for common gradient ops (#35857 )	2020-04-02 11:17:04 -07:00
ideep_test_util.py
layer_model_helper.py	Add transfer_learning_blob_name_mappings into layer_model_helper to support layer model transfer learning	2020-03-18 15:28:00 -07:00
layer_model_instantiator.py
layer_parameter_sharing_test.py
layer_test_util.py
layers_test.py	FCTransposed to FbFCPacked (#29766 )	2019-12-10 10:18:21 -08:00
lengths_reducer_fused_8bit_rowwise_ops_test.py	[caffe2] fix how np.clip is used in lengths_reducer_fused_{4,8}_rowwise_ops_test (#32086 )	2020-01-14 22:53:28 -08:00
lengths_reducer_rowwise_8bit_ops_test.py
lstm_benchmark.py	Fix typos (#30606 )	2019-12-02 20:17:42 -08:00
memonger.py	Fix typos (#30606 )	2019-12-02 20:17:42 -08:00
memonger_test.py
mkl_test_util.py
model_device_test.py
model_helper.py	Fix TensorProtosDBInput AttributeError (#32274 )	2020-01-29 12:05:43 -08:00
model_helper_test.py
modifier_context.py	Fix typos (#30606 )	2019-12-02 20:17:42 -08:00
mpi_python.cc	Replace c10::guts::stuff with std::stuff (#30915 )	2019-12-16 13:57:19 -08:00
muji.py
muji_test.py
net_builder.py
net_builder_test.py
net_drawer.py	Fix typos, via a Levenshtein-type corrector (#31523 )	2020-01-17 16:03:19 -08:00
net_printer.py
net_printer_test.py
nomnigraph.py
nomnigraph_test.py
nomnigraph_transformations.py
nomnigraph_transformations_test.py
normalizer.py	Scale init for batch-norm and layer-norm (#31983 )	2020-01-10 11:55:56 -08:00
normalizer_context.py	Fix typos (#30606 )	2019-12-02 20:17:42 -08:00
normalizer_test.py
numa_benchmark.py
numa_test.py
observer_test.py
operator_fp_exceptions_test.py
optimizer.py	implement L2 regularization for Adagrad in caffe2 and dper (#37705 )	2020-05-03 10:42:49 -07:00
optimizer_context.py	Fix typos (#30606 )	2019-12-02 20:17:42 -08:00
optimizer_test.py	Implementation of STORM optimizer caffe2 python wrapper (#36399 )	2020-04-14 23:05:45 -07:00
optimizer_test_util.py	Fix typos (#30606 )	2019-12-02 20:17:42 -08:00
parallel_workers.py	get rid of deprecated thread.isAlive() to use py2.6 modern form is_alive()	2019-10-22 15:37:31 -07:00
parallel_workers_test.py	ParallelWorkersTest.testParallelWorkersInitFun is flaky (#29045 )	2019-11-01 13:59:02 -07:00
parallelize_bmuf_distributed_test.py
pipeline.py	Fix typos, via a Levenshtein-type corrector (#31523 )	2020-01-17 16:03:19 -08:00
pipeline_test.py
predictor_constants.py
pybind_state.cc	[caffe2] create and register child ws in pybind (#36741 )	2020-04-16 14:53:55 -07:00
pybind_state.h	caffe2: preserve python exception type from PythonOp (#36267 )	2020-04-09 12:43:24 -07:00
pybind_state_dlpack.cc
pybind_state_dlpack.h
pybind_state_gpu.cc
pybind_state_hip.cc	Make caffe2/fb folder compatible with AMD (#29131 )	2019-11-04 16:40:29 -08:00
pybind_state_ideep.cc	Upgrade MKL-DNN to DNNL v1.2 (#32422 )	2020-03-26 22:07:59 -07:00
pybind_state_int8.cc
pybind_state_nomni.cc
pybind_state_registry.cc
pybind_state_registry.h
python_op_test.py	caffe2: preserve python exception type from PythonOp (#36267 )	2020-04-09 12:43:24 -07:00
queue_util.py
record_queue.py
recurrent.py
regularizer.py	Fix typos, via a Levenshtein-type corrector (#31523 )	2020-01-17 16:03:19 -08:00
regularizer_context.py	Fix typos (#30606 )	2019-12-02 20:17:42 -08:00
regularizer_test.py
rnn_cell.py	[caffe2] Remove python2 from operator_test (#33977 )	2020-03-02 08:55:53 -08:00
schema.py	Fix typos, via a Levenshtein-type corrector (#31523 )	2020-01-17 16:03:19 -08:00
schema_test.py
scope.py
scope_test.py
session.py	Fix typos (#30606 )	2019-12-02 20:17:42 -08:00
session_test.py
sparse_to_dense_mask_test.py
sparse_to_dense_test.py
task.py	Fix typos (#30606 )	2019-12-02 20:17:42 -08:00
task_test.py
test_util.py
text_file_reader.py
timeout_guard.py
toy_regression_test.py
transformations.py
transformations_test.py
tt_core.py
tt_core_test.py
utils.py	[C2] Introduce extra_info force CPU tags for auto-generated iteration counter blobs (#32607 )	2020-02-05 23:49:27 -08:00
utils_test.py	[C2] Introduce extra_info force CPU tags for auto-generated iteration counter blobs (#32607 )	2020-02-05 23:49:27 -08:00
visualize.py	Fix typos, via a Levenshtein-type corrector (#31523 )	2020-01-17 16:03:19 -08:00
workspace.py	Fix typos, via a Levenshtein-type corrector (#31523 )	2020-01-17 16:03:19 -08:00
workspace_test.py	Revert "Revert D18171156: Merge Tensor and Variable." (#29299 )	2019-11-08 09:11:20 -08:00