pytorch/caffe2/python/layers
Yangxin Zhong ed788ec780 Linearizable Label: Class Weights, Allow Missing Label, and Average by Batch Size (#29707)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29707

In D17885977, linearizable label (a multi-class classification formulation) was implemented in MTML.

In this diff, we add several items for Linearizable label:

- Assigning different weights to each class through ```model_def.tasks[i].class_weights```.

  - This option is a dictionary whose keys are the class indices and whose values are the weights for each class.

  - For example, if a linearizable-label task has 4 classes and its ```class_weights = {"0": 1, "1": 0.1, "2": 0.1, "3": 0.01}```, then in the loss function of this task we assign weight 1 to its first class, weight 0.1 to its second and third classes, and weight 0.01 to its fourth class. The index/order of the classes follows the logic of linearizable label.

  - Note that when you assign different weights to different classes, you need to correct the calibration by setting an appropriate ```model_def.tasks[i].calibration.linearizable_class_weight```. The class weights in calibration should be the reciprocals of the class weights in the loss function, so ```calibration.linearizable_class_weight = {"0": 1, "1": 10, "2": 10, "3": 100}``` for the example above.

  - Example FBLearner job: f150763093

- We also support ```model_def.allow_missing_label_with_zero_weight``` for linearizable label, which ignores examples whose first label is missing by assigning them zero weight in the loss function.

  - We need to set ```allow_missing_label_with_zero_weight = true``` to enable it.

  - Example FBLearner job: f150763093

- Last but not least, we update the caffe2 operator ```SoftmaxWithLoss``` to support averaging the loss by batch size.

  - We need to set ```model_def.tasks[i].loss.softmaxLoss.average_by_batch_size = true``` to enable it.

  - Previously, the loss was averaged by the weighted sum of the examples in the batch; this remains the default behavior (when ```average_by_batch_size = null``` or ```average_by_batch_size = false```).

  - Without this new feature, the calibration is incorrect when non-equal class weights are applied in training a linearizable task.

  - Example FBLearner job with ```average_by_batch_size = true``` results in a correct calibration: f150763093

  - Example FBLearner job with ```average_by_batch_size = null``` results in an incorrect calibration: f150762990
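The interaction between class weights, missing-label zero weights, the two averaging modes, and the reciprocal calibration weights can be sketched in NumPy. This is a hypothetical illustration of the behavior described above, not the actual ```SoftmaxWithLoss``` implementation; the function and variable names are made up:

```python
import numpy as np

def weighted_softmax_loss(logits, labels, weights, average_by_batch_size=False):
    """Sketch of a class-weighted softmax cross-entropy loss.

    logits:  (N, C) raw scores
    labels:  (N,) integer class indices (in linearizable-label order)
    weights: (N,) per-example weights, e.g. class_weights[label], or 0 for an
             example whose first label is missing
             (allow_missing_label_with_zero_weight).
    """
    # Numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    per_example = -log_probs[np.arange(len(labels)), labels] * weights
    if average_by_batch_size:
        # New mode: normalize by N, independent of the example weights.
        return per_example.sum() / len(labels)
    # Default mode: normalize by the weighted sum of the examples.
    return per_example.sum() / weights.sum()

# Hypothetical 4-class task with class_weights = {"0": 1, "1": 0.1, "2": 0.1, "3": 0.01}.
class_weights = np.array([1.0, 0.1, 0.1, 0.01])
labels = np.array([0, 1, 3])
weights = class_weights[labels]  # per-example weight looked up by label
logits = np.array([[2.0, 0.5, 0.1, 0.0],
                   [0.3, 1.5, 0.2, 0.1],
                   [0.1, 0.2, 0.4, 1.0]])

loss_default = weighted_softmax_loss(logits, labels, weights)
loss_by_batch = weighted_softmax_loss(logits, labels, weights,
                                      average_by_batch_size=True)

# Calibration class weights are the reciprocals of the loss class weights.
calibration_weights = {str(i): 1.0 / w for i, w in enumerate(class_weights)}
# -> {"0": 1.0, "1": 10.0, "2": 10.0, "3": 100.0}
```

The two modes differ only in the normalizer, which is why they agree when all example weights are 1 but diverge (and affect calibration) once non-equal class weights are used.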

Test Plan:
buck test caffe2/caffe2/fb/dper/layer_models/tests:mtml_test_2 -- test_linearizable_label_task_with_class_weights
buck test caffe2/caffe2/fb/dper/layer_models/tests:mtml_test_2 -- test_linearizable_label_task_with_zero_weight
buck test caffe2/caffe2/fb/dper/layer_models/tests:mtml_test_2 -- test_linearizable_label_task_average_by_batch_size

All tests passed.

full canary: https://fburl.com/fblearner/troznfgh

Reviewed By: chenshouyuan

Differential Revision: D18461163

fbshipit-source-id: aaf3df031406ae94f74e2e365b57e47409ef0bfe
2019-11-13 16:52:27 -08:00
__init__.py
adaptive_weight.py
add_bias.py
arc_cosine_feature_map.py
batch_huber_loss.py Add new regression loss function type to FBLearner (#21080) 2019-06-17 17:43:00 -07:00
batch_lr_loss.py Exponential decay of the weight of task loss (#27508) 2019-10-08 09:15:41 -07:00
batch_mse_loss.py Change dper3 loss module to match dper2 (#28265) 2019-10-18 10:08:38 -07:00
batch_normalization.py
batch_sigmoid_cross_entropy_loss.py
batch_softmax_loss.py Linearizable Label: Class Weights, Allow Missing Label, and Average by Batch Size (#29707) 2019-11-13 16:52:27 -08:00
blob_weighted_sum.py
bpr_loss.py Add BPR loss to TTSN (#24439) 2019-08-15 23:20:15 -07:00
bucket_weighted.py add feature name into module and update position weighted to match dper2 2019-10-14 08:06:19 -07:00
build_index.py
concat.py
constant_weight.py
conv.py
dropout.py
fc.py Integrate FC fp16 exporter into Dper2 (#26582) 2019-09-29 10:19:28 -07:00
fc_with_bootstrap.py Creating new layer FCWithBootstrap used in bootstrapping uncertainty approach (#29152) 2019-11-04 21:18:15 -08:00
fc_without_bias.py
feature_sparse_to_dense.py Return list of AccessedFeatures from get_accessed_features (#23983) 2019-08-14 10:50:27 -07:00
functional.py
gather_record.py
homotopy_weight.py
label_smooth.py
last_n_window_collector.py
layer_normalization.py
layers.py Return list of AccessedFeatures from get_accessed_features (#23983) 2019-08-14 10:50:27 -07:00
margin_rank_loss.py
merge_id_lists.py
pairwise_similarity.py
position_weighted.py
random_fourier_features.py
reservoir_sampling.py
sampling_train.py
sampling_trainable_mixin.py
select_record_by_context.py
semi_random_features.py
sparse_dropout_with_replacement.py hook up dropout sparse with replacement operator 2019-07-23 14:34:25 -07:00
sparse_feature_hash.py Refactor and expose metadata of tum_history layer for online prediction 2019-08-15 00:27:11 -07:00
sparse_lookup.py Fix predict net issue with LRU hash eviction 2019-10-14 16:08:14 -07:00
split.py Enable variable size embedding (#25782) 2019-09-09 22:08:32 -07:00
tags.py
uniform_sampling.py