pytorch/caffe2/python
Baichuan Yuan dca97b4394 Weighted decay with frequency (count-based) (#60382)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60382

Instead of applying a single weight_decay `w` uniformly to all ids, each row `i` of the sparse embedding table now uses its own weight decay `w_i = w*freq_i`, where `freq_i = halflife/counter_i \in [\log(2), halflife]`. The counter comes from `rowwise_counter`, which updates it as `counter_i = 1 + \exp(-iter_{\delta}*\rho)*counter_i`.
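
The rule described above can be sketched in plain Python. This is an illustration only, not the Caffe2 operator code: the function names `update_counter` and `effective_weight_decay` are hypothetical, and the decay rate `rho = log(2)/halflife` is an assumption, since the summary does not define `\rho`.

```python
import math


def update_counter(counter, iter_delta, halflife):
    """Apply counter_i = 1 + exp(-iter_delta * rho) * counter_i for one row.

    iter_delta is the number of iterations since the row was last updated.
    ASSUMPTION: rho = log(2) / halflife, so the old count halves every
    `halflife` iterations. The commit summary does not state this.
    """
    rho = math.log(2) / halflife
    return 1.0 + math.exp(-iter_delta * rho) * counter


def effective_weight_decay(w, counter, halflife):
    """Return w_i = w * freq_i with freq_i = halflife / counter_i."""
    return w * halflife / counter


if __name__ == "__main__":
    halflife = 1000.0
    w = 1e-4
    frequent = 0.0  # row touched on every iteration
    rare = 0.0      # row touched once every 5000 iterations
    for step in range(1, 20001):
        frequent = update_counter(frequent, 1, halflife)
        if step % 5000 == 0:
            rare = update_counter(rare, 5000, halflife)
    print("frequent row:", effective_weight_decay(w, frequent, halflife))
    print("rare row:    ", effective_weight_decay(w, rare, halflife))
```

Under these assumptions a frequently updated row accumulates a large counter and receives a weight decay close to `w*\log(2)`, while a rarely updated row keeps a counter near 1 and receives a weight decay close to `w*halflife`.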

Test Plan:
buck test //caffe2/caffe2/python/operator_test:adagrad_test -- test_row_wise_sparse_adagrad

buck test caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test_weight_decay

Reviewed By: 0x10cxR1

Differential Revision: D25581030

fbshipit-source-id: 54b3831b20516c76c559b13d8deb809e2ee3b446
2021-06-21 18:46:35 -07:00
..
benchmarks
docs
examples [codemod] fix tautological imports 2021-03-27 01:15:57 -07:00
fakelowp Lint trailing newlines (#54737) 2021-03-30 13:09:52 -07:00
helpers
ideep
layers [itemwise-dropout][1/x][low-level module] Implement Itemwise Sparse Feature Dropout in Dper3 (#59322) 2021-06-04 19:59:17 -07:00
mint [typing] suppress errors in fbcode/caffe2 - batch 2 2021-03-16 16:45:41 -07:00
mkl
modeling
models
onnx Fix ONNX forward compatibility (#59327) 2021-06-02 12:39:56 -07:00
operator_test Weighted decay with frequency (count-based) (#60382) 2021-06-21 18:46:35 -07:00
predictor [caffe2] Speed up remote net loading 2021-04-20 22:32:40 -07:00
rnn Lint trailing newlines (#54737) 2021-03-30 13:09:52 -07:00
serialized_test Lint trailing newlines (#54737) 2021-03-30 13:09:52 -07:00
test
trt Replace TensorRT's deprecated API in caffe2/python/trt/test_pt_onnx_trt.py (#60236) 2021-06-19 19:56:30 -07:00
__init__.py
_import_c_extension.py
_import_c_extension.pyi [caffe2] expose whether FBGEMM is available to the Python code (#54274) 2021-03-19 12:52:14 -07:00
allcompare_test.py Disallow versionless Python shebangs (#58275) 2021-05-14 08:26:02 -07:00
attention.py
benchmark_generator.py Disallow versionless Python shebangs (#58275) 2021-05-14 08:26:02 -07:00
binarysize.py
brew.py
brew_test.py
build.py
cached_reader.py
caffe_translator.py Remove unused python2 shebang (#58409) 2021-05-17 13:19:32 -07:00
caffe_translator_test.py
checkpoint.py
checkpoint_test.py
CMakeLists.txt Lint trailing newlines (#54737) 2021-03-30 13:09:52 -07:00
cnn.py
context.py
context.pyi
context_test.py
control.py
control_ops_grad.py
control_ops_grad_test.py
control_ops_util.py
control_test.py
convert.py Lint trailing newlines (#54737) 2021-03-30 13:09:52 -07:00
convert_test.py
convnet_benchmarks.py
convnet_benchmarks_test.py
core.py
core_gradients_test.py
core_test.py
crf.py
crf_predict.py
crf_viterbi_test.py
data_parallel_model.py
data_parallel_model_test.py
data_workers.py
data_workers_test.py
dataio.py
dataio_test.py
dataset.py
db_file_reader.py [caffe2] Fix DBFileReader (#53498) 2021-03-08 08:34:39 -08:00
db_test.py
device_checker.py
dlpack.h
dyndep.py
embedding_generation_benchmark.py
experiment_util.py change logging.warn to logging.warning (#51727) 2021-03-29 10:42:30 -07:00
extension_loader.py
fakefp16_transform_lib.py
filler_test.py
functional.py
functional_test.py
fused_8bit_rowwise_conversion_ops_test.py
gradient_check_test.py
gradient_checker.py
gru_cell.py
hip_test_util.py
hsm_util.py
hypothesis_test.py
hypothesis_test_util.py
ideep_test_util.py
layer_model_helper.py
layer_model_instantiator.py
layer_parameter_sharing_test.py
layer_test_util.py
layers_test.py [itemwise-dropout][1/x][low-level module] Implement Itemwise Sparse Feature Dropout in Dper3 (#59322) 2021-06-04 19:59:17 -07:00
lazy.py
lazy_dyndep.py
lazy_dyndep_test.py Disallow versionless Python shebangs (#58275) 2021-05-14 08:26:02 -07:00
lengths_reducer_fused_8bit_rowwise_ops_test.py
lengths_reducer_rowwise_8bit_ops_test.py
lstm_benchmark.py
memonger.py Use nodes instead of node 2021-04-13 10:45:35 -07:00
memonger_test.py
mkl_test_util.py
model_device_test.py
model_helper.py
model_helper_test.py
modifier_context.py
mpi_python.cc
muji.py
muji_test.py
net_builder.py
net_builder_test.py
net_drawer.py
net_printer.py
net_printer_test.py
nomnigraph.py
nomnigraph_test.py
nomnigraph_transformations.py
nomnigraph_transformations_test.py
normalizer.py
normalizer_context.py
normalizer_test.py
numa_benchmark.py
numa_test.py
observer_test.py
operator_fp_exceptions_test.py
optimizer.py Weighted decay with frequency (count-based) (#60382) 2021-06-21 18:46:35 -07:00
optimizer_context.py
optimizer_test.py optimizer exploration - v1 and v2 + fix position_weighted optimizer + decoupled weight decay (#54042) 2021-03-27 23:03:29 -07:00
optimizer_test_util.py
parallel_workers.py
parallel_workers_test.py
parallelize_bmuf_distributed_test.py
pipeline.py
pipeline_test.py
predictor_constants.py
pybind_state.cc Make PyTorch code-base clang-tidy compliant (#56892) 2021-04-28 14:10:25 -07:00
pybind_state.h
pybind_state_dlpack.cc Make PyTorch code-base clang-tidy compliant (#56892) 2021-04-28 14:10:25 -07:00
pybind_state_dlpack.h
pybind_state_gpu.cc
pybind_state_hip.cc
pybind_state_ideep.cc Make PyTorch code-base clang-tidy compliant (#56892) 2021-04-28 14:10:25 -07:00
pybind_state_int8.cc Make PyTorch code-base clang-tidy compliant (#56892) 2021-04-28 14:10:25 -07:00
pybind_state_nomni.cc Make PyTorch code-base clang-tidy compliant (#56892) 2021-04-28 14:10:25 -07:00
pybind_state_registry.cc Make PyTorch code-base clang-tidy compliant (#56892) 2021-04-28 14:10:25 -07:00
pybind_state_registry.h
python_op_test.py
queue_util.py
record_queue.py
recurrent.py
regularizer.py
regularizer_context.py
regularizer_test.py
rnn_cell.py
schema.py
schema_test.py
scope.py
scope_test.py
session.py
session_test.py
sparse_to_dense_mask_test.py
sparse_to_dense_test.py
task.py
task_test.py
test_util.py
text_file_reader.py
timeout_guard.py [torch/debuggability] use log.info() in addition to print() in timeoutguard (#57296) 2021-04-29 15:23:35 -07:00
toy_regression_test.py
transformations.py
transformations_test.py
tt_core.py
tt_core_test.py
utils.py
utils_test.py
visualize.py
workspace.py [caffe2] expose whether FBGEMM is available to the Python code (#54274) 2021-03-19 12:52:14 -07:00
workspace_test.py [typing] suppress errors in fbcode/caffe2 - batch 2 2021-05-04 12:44:27 -07:00