Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29707
In D17885977, the Linearizable label (a multi-class classification) was implemented in MTML.
In this diff, we add several items for the Linearizable label:
- Assigning different weights to each class through ```model_def.tasks[i].class_weights```.
- This option is a dictionary whose keys are class indices and whose values are the weights for each class.
- For example, if a linearizable-label task has 4 classes and its ```class_weights = {"0": 1, "1": 0.1, "2": 0.1, "3": 0.01}```, then in the loss function of this task we assign weight 1 to its first class, weight 0.1 to its second and third classes, and weight 0.01 to its fourth class. The index/order of classes follows the logic of the linearizable label.
- Note that when you assign different weights to different classes, you need to correct the calibration by setting an appropriate ```model_def.tasks[i].calibration.linearizable_class_weight```. The class weights in calibration should be the reciprocals of the class weights in the loss function, so for the example above, ```calibration.linearizable_class_weight = {"0": 1, "1": 10, "2": 10, "3": 100}```.
- Example FBLearner job: f150763093
- We also support ```model_def.allow_missing_label_with_zero_weight``` for the linearizable label, which ignores examples whose first label is missing by assigning them zero weight in the loss function.
- We need to set ```allow_missing_label_with_zero_weight = true``` to enable it.
- Example FBLearner job: f150763093
- Last but not least, we update the caffe2 operator ```SoftmaxWithLoss``` to support averaging the loss by batch size (see the sketch after this list).
- We need to set ```model_def.tasks[i].loss.softmaxLoss.average_by_batch_size = true``` to enable it.
- Previously, the loss was averaged by the weight sum of the examples in the batch, which is still the default behavior (when ```average_by_batch_size = null``` or ```average_by_batch_size = false```).
- Without this new feature, the calibration will be incorrect when applying non-equal-weight training among the classes of a linearizable task.
- Example FBLearner job with ```average_by_batch_size = true``` results in a correct calibration: f150763093
- Example FBLearner job with ```average_by_batch_size = null``` results in an incorrect calibration: f150762990
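To make the class weighting and the two averaging modes concrete, here is a minimal numpy sketch; the helper name and signature are hypothetical, not the actual ```SoftmaxWithLoss``` implementation.
```
import numpy as np

def weighted_softmax_loss(logits, labels, class_weights, average_by_batch_size=False):
    # Hypothetical helper, not the actual SoftmaxWithLoss implementation.
    # Numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    per_example_ce = -log_probs[np.arange(len(labels)), labels]
    weights = class_weights[labels]  # weight of each example's class
    weighted = weights * per_example_ce
    if average_by_batch_size:
        return weighted.sum() / len(labels)  # new: divide by batch size
    return weighted.sum() / weights.sum()    # default: divide by weight sum

# 4 classes with class_weights = {"0": 1, "1": 0.1, "2": 0.1, "3": 0.01}.
logits = np.random.randn(8, 4)
labels = np.random.randint(0, 4, size=8)
cw = np.array([1.0, 0.1, 0.1, 0.01])
print(weighted_softmax_loss(logits, labels, cw, average_by_batch_size=True))
```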
Test Plan:
buck test caffe2/caffe2/fb/dper/layer_models/tests:mtml_test_2 -- test_linearizable_label_task_with_class_weights
buck test caffe2/caffe2/fb/dper/layer_models/tests:mtml_test_2 -- test_linearizable_label_task_with_zero_weight
buck test caffe2/caffe2/fb/dper/layer_models/tests:mtml_test_2 -- test_linearizable_label_task_average_by_batch_size
All tests passed.
full canary: https://fburl.com/fblearner/troznfgh
Reviewed By: chenshouyuan
Differential Revision: D18461163
fbshipit-source-id: aaf3df031406ae94f74e2e365b57e47409ef0bfe
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29152
Bootstrapping uncertainty approach: bootstrap the last layer before the last fully-connected layer. FCWithBootstrap is a new layer that handles the logic of the bootstrapping process (a minimal sketch follows the goal list).
Goal:
- return a struct with the bootstrapped indices and bootstrapped predictions from this layer
- separate the functionality in the train_net and eval_net
- save the bootstrapped FCs in this object so that the eval_net can use them at prediction time
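A minimal numpy sketch of the idea, assuming each bootstrap head resamples the batch with replacement; the function and field names are hypothetical, not the actual FCWithBootstrap layer:
```
import numpy as np

def fc_with_bootstrap(x, weights, biases, rng):
    # One (w, b) pair per bootstrap head; each head sees a bootstrap
    # resample (with replacement) of the batch and returns both the
    # sampled indices and its predictions.
    outputs = []
    for w, b in zip(weights, biases):
        idx = rng.integers(0, x.shape[0], size=x.shape[0])
        outputs.append({"indices": idx, "preds": x[idx] @ w + b})
    return outputs

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 8))
weights = [rng.normal(size=(8, 1)) for _ in range(4)]
biases = [np.zeros(1) for _ in range(4)]
boot = fc_with_bootstrap(x, weights, biases, rng)
print(len(boot), boot[0]["preds"].shape)  # 4 heads, (16, 1) each
```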
Reviewed By: wx1988
Differential Revision: D17822429
fbshipit-source-id: 15dec501503d581aeb69cb9ae9e8c3a3fbc7e7b5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28265
Fix the difference between dper3 and dper2 when regressionLoss is used.
Test Plan:
test using dper2 model id f134632386
Comparison tool output before change:
```
FOUND OP DIFFERENT WITH DPER2!!!
OP is of type ExpandDims
OP inputs ['supervision:label']
OP outputs ['sparse_nn/regression_loss/mean_squared_error_loss/ExpandDims:0']
===============================
Finished all dper3 ops, number of good ops 11, bad ops 1, skipped 26
run_comparison for dper2 / dper3 nets running time: 0.0020143985748291016
result type: <class 'NoneType'> result: None
```
After change:
```
FOUND OP DIFFERENT WITH DPER2!!!
OP is of type ExpandDims
OP inputs ['sparse_nn_2/regression_loss_2/mean_squared_error_loss_8/Squeeze:0_grad']
OP outputs ['sparse_nn_2/over_arch_2/linear_2/FC_grad']
===============================
Finished all dper3 ops, number of good ops 19, bad ops 1, skipped 16
run_comparison for dper2 / dper3 nets running time: 0.0017991065979003906
result type: <class 'NoneType'> result: None
```
dper2 label part of net P111794577
dper3 label part of net after change P116817194
Reviewed By: kennyhorror
Differential Revision: D17795740
fbshipit-source-id: 9faf96f5140f5a1efdf2985820bda3ca400f61fa
Summary: Previously, loss_weight was not used correctly for the self-supervision branch.
Test Plan: buck test mode/dev-nosan //caffe2/caffe2/fb/dper/layer_models/models/experimental/tests:tum_test
Reviewed By: xianjiec
Differential Revision: D17862312
fbshipit-source-id: 554b793a5caa3886946c54333c81a0d8a10230d9
Summary:
We are seeing error "[enforce fail at BlackBoxPredictor.cpp:134] ! !parameter_workspace->HasBlob(out). Net REMOTE of type predict_net writes to blob cat/NGRAM_QRT_VERSIONS_x_EVENT_TYPE_AUTO_FIRST_X/Pool_Option_0/Repeat_0/sparse_lookup/w which exists in the parameter workspace" in online testing for calibration models.
I suspect it's because the op CopyRowsToTensorOp is being used in prediction.
Test Plan:
f143080108: the offline predict net does not contain CopyRowsToTensorNet, which looks right.
Waiting for Olga to test online behavior
dper2 canary:
https://fburl.com/fblearner/sv3o3yj1
Differential Revision: D17741823
fbshipit-source-id: 19721b632b5ea9ebfa1ef9ae0e99d3a10c926287
Summary: Currently, accelerators do not have a concept of fp32; they only understand fp16 and int8 as data input. To fix the issue here, we want to make sure unaries are turned into fp16 when the int8 exporter is turned on (a minimal sketch below).
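A minimal sketch of the rewrite, assuming ops are represented as simple dicts; the op set and attribute names are illustrative, not the real exporter's schema:
```
# Illustrative op set and attribute names; not the real exporter schema.
UNARY_OPS = {"Relu", "Sigmoid", "Tanh"}

def force_fp16_unaries(ops, int8_exporter_enabled):
    if not int8_exporter_enabled:
        return ops
    for op in ops:
        # The accelerator only understands fp16/int8 inputs, so any
        # unary still emitting fp32 is retagged to emit fp16.
        if op["type"] in UNARY_OPS and op.get("dtype") == "fp32":
            op["dtype"] = "fp16"
    return ops

net = [{"type": "Relu", "dtype": "fp32"}, {"type": "FC", "dtype": "fp32"}]
print(force_fp16_unaries(net, int8_exporter_enabled=True))
```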
Reviewed By: kennyhorror
Differential Revision: D17743791
fbshipit-source-id: 7322d23eb12ac3f813b525fc0ddd066f95c8ca85
Test Plan:
The notebook showed no diff for id score list
https://our.intern.facebook.com/intern/anp/view/?id=154764
Reviewed By: alyssawangqq
Differential Revision: D17649974
fbshipit-source-id: 84cb4ae372fc215295c2d0b139d65f4eacafae4a
Summary:
Support attention weights as input to SparseLookup. In attention sum pooling, if the attention weights can be pre-calculated before the embedding lookup, they can be passed to SparseLookup and processed by the SparseLengthsWeightedSum op. One example is id_score attention sum pooling.
Essentially the net is converted from:
LengthsSum(Mul(Gather(keys, w), att_weight))
to:
SparseLengthsWeightedSum(keys, w, att_weight)
It unblocks potential efficiency gains with distributed training (a minimal sketch of the equivalence below).
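A minimal numpy sketch of the equivalence, with hypothetical helper names mirroring the semantics of the two op pipelines:
```
import numpy as np

def lengths_sum(vals, lengths):
    # Sum consecutive segments of `vals` sized by `lengths`.
    out, start = [], 0
    for n in lengths:
        out.append(vals[start:start + n].sum(axis=0))
        start += n
    return np.stack(out)

def sparse_lengths_weighted_sum(table, ids, att, lengths):
    # Fused gather + weight + segment-sum, mirroring the semantics of
    # caffe2's SparseLengthsWeightedSum.
    return lengths_sum(table[ids] * att[:, None], lengths)

w = np.random.randn(10, 4)          # embedding table
keys = np.array([1, 3, 3, 7, 2])    # gathered ids
att = np.array([0.5, 1.0, 0.2, 2.0, 0.1])
lengths = np.array([2, 3])          # two examples: 2 ids and 3 ids

before = lengths_sum(w[keys] * att[:, None], lengths)  # unfused pipeline
after = sparse_lengths_weighted_sum(w, keys, att, lengths)
print(np.allclose(before, after))   # True
```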
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26748
Test Plan: unit test
Reviewed By: chocjy
Differential Revision: D17553345
Pulled By: wheatkit
fbshipit-source-id: 60cc3c4b0bc1eade5459ac598e85286f3849a412
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27508
Implemented a simple exponential decay of the weight of the lr loss function, with a lower bound (a minimal sketch below).
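A minimal sketch of the decay schedule, with hypothetical parameter names:
```
def decayed_task_weight(initial_weight, decay_rate, step, min_weight):
    # Exponential decay of the lr-loss task weight, clamped at a floor.
    return max(initial_weight * decay_rate ** step, min_weight)

# The weight halves every step until it reaches the 0.1 floor.
print([round(decayed_task_weight(1.0, 0.5, s, 0.1), 4) for s in range(6)])
# [1.0, 0.5, 0.25, 0.125, 0.1, 0.1]
```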
Test Plan:
buck test //caffe2/caffe2/fb/dper/layer_models/tests:mtml_test -- test_task_weight_decay
https://our.intern.facebook.com/intern/testinfra/testrun/3377699729136308
canary: f140103452
Reviewed By: chenshouyuan
Differential Revision: D17524101
fbshipit-source-id: 9a653e21a4ecb74dfc4ac949c9e3388f36ef3a20
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25426
Add embedding table 4-bit quantization support.
* Add the conversion from fp32 to int4 (a minimal sketch after this list).
* Use brew to pass the context so that the 4-bit operators are added when generating the predictor net.
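A minimal numpy sketch of rowwise 4-bit quantization, assuming a per-row fp32 scale and offset; the helper names are hypothetical, not the actual conversion op:
```
import numpy as np

def quantize_rowwise_int4(table):
    # Each row is stored as 4-bit codes plus a per-row scale and offset.
    mins = table.min(axis=1, keepdims=True)
    scales = (table.max(axis=1, keepdims=True) - mins) / 15.0  # 16 codes
    scales[scales == 0] = 1.0  # guard against constant rows
    codes = np.clip(np.round((table - mins) / scales), 0, 15).astype(np.uint8)
    return codes, scales, mins

def dequantize_rowwise_int4(codes, scales, mins):
    return codes.astype(np.float32) * scales + mins

table = np.random.randn(4, 8).astype(np.float32)
codes, scales, mins = quantize_rowwise_int4(table)
print(np.abs(dequantize_rowwise_int4(codes, scales, mins) - table).max())
```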
Reviewed By: kennyhorror, chocjy
Differential Revision: D16859892
fbshipit-source-id: a06c3f0b56a7eabf9ca4a2b2cb6c63735030d70b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25782
Enable variable-size embeddings for the dot processor. We split the embedding matrix into multiple towers based on the embedding size, perform the dot product in a loop over the towers, and finally concatenate all the dot product outputs (a minimal sketch below).
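A minimal numpy sketch of the tower trick, with hypothetical shapes and helper names:
```
import numpy as np

def towered_dot_products(towers):
    # Group embeddings into towers by embedding size, compute pairwise
    # dot products within each tower, and concatenate the outputs.
    outs = []
    for emb in towers:  # emb: (batch, num_features_in_tower, dim)
        dots = np.matmul(emb, emb.transpose(0, 2, 1))
        i, j = np.triu_indices(emb.shape[1], k=1)  # distinct pairs only
        outs.append(dots[:, i, j])
    return np.concatenate(outs, axis=1)

tower_16 = np.random.randn(2, 3, 16)  # three features of dim 16
tower_32 = np.random.randn(2, 5, 32)  # five features of dim 32
print(towered_dot_products([tower_16, tower_32]).shape)  # (2, 3 + 10)
```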
Test Plan:
buck test //caffe2/caffe2/fb/dper/layer_models/tests/split_1:
https://our.intern.facebook.com/intern/testinfra/testrun/3659174703037560
Specific unit tests --
buck test //caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test -- test_per_feature_emb_dim
https://our.intern.facebook.com/intern/testinfra/testrun/3377699726358808
Reviewed By: chenshouyuan
Differential Revision: D16690811
fbshipit-source-id: 8f5bce5aa5b272f5f795d4ac32bba814cc55210b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24863
Add the sparse feature name to logging for ease of debugging.
Test Plan:
./buck-out/gen/caffe2/caffe2/fb/dper/layer_models/sparse_nn/pooling_test#binary.par -r test_simple_sum_pooling_named_exception
Another test for id_score_list. The original sparse_key is equivalent to get_key(self.input_record)().
P98343716
./buck-out/gen/caffe2/caffe2/python/layers_test-2.7#binary.par -r test_get_key
Reviewed By: chocjy
Differential Revision: D16901964
fbshipit-source-id: 2523de2e290aca20afd0b909111541d3d152a588
Summary:
[Not in need of review at this time]
Support focal loss in MTML (effectively dper2 in general) as described in https://arxiv.org/pdf/1708.02002.pdf. We adopt an approach similar to Yuchen He's WIP diff D14008545 (a minimal sketch below).
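A minimal numpy sketch of the binary focal loss from the paper; the alpha/gamma defaults and the helper name are illustrative, not the dper implementation:
```
import numpy as np

def lr_focal_loss(logits, labels, gamma=2.0, alpha=0.25):
    # FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), arXiv:1708.02002.
    p = 1.0 / (1.0 + np.exp(-logits))
    p_t = np.where(labels == 1, p, 1.0 - p)
    alpha_t = np.where(labels == 1, alpha, 1.0 - alpha)
    return -(alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)).mean()

print(lr_focal_loss(np.array([2.0, -1.0, 0.5]), np.array([1, 0, 1])))
```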
Test Plan:
Passed the following unit tests
buck test //caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test -- test_lr_loss_based_focal_loss
buck test //caffe2/caffe2/fb/dper/layer_models/tests:mtml_test_2 -- test_mtml_with_lr_loss_based_focal_loss
buck test //caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test -- test_lr_loss_based_focal_loss_with_stop_grad_in_focal_factor
Passed ./fblearner/flow/projects/dper/canary.sh; URL to track workflow runs: https://fburl.com/fblearner/446ix5q6
Model based on V10 of this diff
f133367092
Baseline model
f133297603
Protobuf of train_net_1 https://our.intern.facebook.com/intern/everpaste/?color=0&handle=GEq30QIFW_7HJJoCAAAAAABMgz4Jbr0LAAAz
Reviewed By: hychyc90, ellie-wen
Differential Revision: D16795972
fbshipit-source-id: 7bacae3e2255293d337951c896e9104208235f33
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24439
Much of the literature mentions that BPR is useful for improving recommendation quality. Add a BPR loss so that we can train TTSN with it; we would like to see if it can improve retrieval models (a minimal sketch below).
reference: https://arxiv.org/pdf/1205.2618.pdf
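A minimal numpy sketch of the BPR objective from the reference, with hypothetical input names:
```
import numpy as np

def bpr_loss(pos_scores, neg_scores):
    # Maximize log sigmoid of the positive-minus-negative score margin.
    margin = pos_scores - neg_scores
    return -np.log(1.0 / (1.0 + np.exp(-margin))).mean()

pos = np.array([1.2, 0.3, 2.0])  # scores of interacted items
neg = np.array([0.1, 0.5, 1.0])  # scores of sampled negatives
print(bpr_loss(pos, neg))
```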
Reviewed By: dragonxlwang
Differential Revision: D16812513
fbshipit-source-id: 74488c714a37ccd10e0666d225751a845019eb94
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23983
While testing I realized that model layers can extract different types of features from the same column. For example, MultifeedFeaturesTransform uses float and ID list features from the "features" column.
get_accessed_features returns a map from column to AccessedFeatures, and AccessedFeatures only has the feature IDs for one feature type. This is incompatible with having multiple types of features per column; one type ends up overwriting another in the map.
To fix this, I've modified get_accessed_features to return a map from column to a list of AccessedFeatures objects (a minimal sketch below).
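A minimal sketch of the shape of the fix; the class fields and the aggregation helper are hypothetical, not the actual dper code:
```
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class AccessedFeatures:
    feature_type: str                # e.g. "float" or "id_list"
    feature_ids: List[int] = field(default_factory=list)

def merge_accessed_features(layers) -> Dict[str, List[AccessedFeatures]]:
    # A column read as several feature types now maps to a *list* of
    # AccessedFeatures, so one type no longer overwrites another.
    accessed: Dict[str, List[AccessedFeatures]] = defaultdict(list)
    for layer in layers:
        for column, feats in layer.get_accessed_features().items():
            accessed[column].extend(feats)
    return accessed
```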
Reviewed By: itomatik
Differential Revision: D16693845
fbshipit-source-id: 2099aac8dc3920dd61de6b6ad5cf343c864803bc
Summary:
We need a way to get a complete list of features that are used in training a model. One way to do this is to make it possible to get the list of features used in each Model Layer. Then, once the model is complete, we can go through the layers and aggregate the features.
I've introduced a function to expose that information here, get_accessed_features, and implemented it in the FeatureSparseToDense layer to start with.
I've tried to include the minimum amount of information to make this useful, while making it easy to integrate into the variety of model layers. This is, for example, why AccessedFeatures does not contain feature_names, which are not always present in a model layer. I debated whether or not to include feature_type, but I think it's useful enough, and easy enough to figure out in a model layer, that it's worth including.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23036
Test Plan:
Added a unit test to verify the behavior of get_accessed_features in FeatureSparseToDense.
aml_dper2-fblearner-flow-integration-tests failed due to a known issue D16355865
aml_dper3-fblearner-flow-integration-tests failed due to a known issue T47197113
I verified that no tests in the integration tests failed due to issues other than those known ones.
DPER2 canaries: https://fburl.com/fblearner/1217voga
Reviewed By: volkhin
Differential Revision: D16365380
Pulled By: kevinwilfong
fbshipit-source-id: 2dbb4d832628180336533f29f7d917cbad171950
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22348
This is the last step of LRU hash eviction weight re-init. This diff checks whether there are evicted values in sparse_lookup; if so, it calls the op created in D15709866 to re-init the values for the indices in evicted_values. We also created a gradient op for the operator; the gradient op just passes the output gradient through as the input gradient. A minimal sketch of the idea follows.
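A minimal numpy sketch of the re-init step, assuming uniform re-initialization; the helper name and scale are hypothetical, not the actual operator from D15709866:
```
import numpy as np

def reinit_evicted_rows(embedding, evicted_indices, rng, scale=0.01):
    # Rows evicted by the LRU hash are re-initialized in place. The
    # gradient of such an op is the identity: the output gradient is
    # passed through unchanged as the input gradient.
    embedding[evicted_indices] = rng.uniform(
        -scale, scale, size=(len(evicted_indices), embedding.shape[1]))
    return embedding

rng = np.random.default_rng(0)
emb = np.ones((6, 4))
reinit_evicted_rows(emb, np.array([1, 4]), rng)
print(emb[1], emb[0])  # row 1 re-initialized, row 0 untouched
```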
Reviewed By: itomatik
Differential Revision: D16044736
fbshipit-source-id: 9afb85209b0de1038c5153bcb7dfc5f52e0b2abb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21389
As titled. To do weight re-init on evicted rows in embedding table, we need to pass the info of the evicted hashed values to SparseLookup, which is the layer model responsible for constructing the embedding table and do pooling.
To pass the evicted values, we need to adjust the output record of lru_sparse_hash to include them, and add an optional input to all processors that take in a sparse segment. For SparseLookup to get the evicted values, its input record needs to be adjusted: it can now be an IdList, an IdScoreList, or a struct of feature + evicted values (a minimal sketch below).
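A minimal sketch of the adjusted input record using plain dataclasses; the field names are hypothetical stand-ins for the actual caffe2 schema types:
```
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class IdList:
    lengths: List[int]  # ids per example
    ids: List[int]

@dataclass
class SparseLookupInput:
    # The feature itself (an IdList or IdScoreList) paired with the
    # evicted hashed values produced by lru_sparse_hash, if any.
    feature: IdList
    evicted_values: Optional[List[int]] = None

record = SparseLookupInput(
    feature=IdList(lengths=[2, 1], ids=[5, 9, 3]),
    evicted_values=[9],
)
print(record.evicted_values)
```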
Reviewed By: itomatik
Differential Revision: D15590307
fbshipit-source-id: e493881909830d5ca5806a743a2a713198c100c2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20673
Add an option to bucket-weighted pooling that hashes the bucket, so that scores of any cardinality can be used (a minimal sketch below).
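A minimal sketch of the hashing step, with hypothetical names:
```
def bucket_index(score, num_buckets):
    # Hash the raw score into a fixed number of buckets so the learned
    # weight table stays bounded regardless of score cardinality.
    return hash(score) % num_buckets

print(bucket_index(123456789, num_buckets=100))
```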
Reviewed By: huginhuangfb
Differential Revision: D15003509
fbshipit-source-id: 575a149de395f18fd7759f3edb485619f8aa5363
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21080
Add Huber loss as a new option for regression training (refer to the TensorFlow implementation: https://fburl.com/9va71wwo):
```
import numpy as np

# Huber loss
def huber(true, pred, delta):
    error = np.abs(true - pred)
    loss = (0.5 * np.minimum(error, delta) ** 2
            + delta * np.maximum(error - delta, 0))
    return np.mean(loss)
```
As a combination of MSE loss (`x < delta`) and MAE loss (`x >= delta`), the advantage of Huber loss is that it reduces the training dependence on outliers.
One thing worth noting is that Huber loss is not twice differentiable at `x = delta`. To further address this, one could consider adopting the `log(cosh(x))` loss.
Reviewed By: chintak
Differential Revision: D15524377
fbshipit-source-id: 73acbe2728ce160c075f9acc65a1c21e3eb64e84
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20915
Clean up the unary processor code. Some questions are added in the comments to seek suggestions.
Reviewed By: pjh5
Differential Revision: D15448502
fbshipit-source-id: ef0c45718c1a06187e3fe2e4e59b7f20c641d9c5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18499
If the init op is not fp16 compatible, it should throw.
However, in the special case where the original init op is UniformFill,
we replace it with Float16UniformFill (a minimal sketch below).
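A minimal sketch of the rule; the set of fp16-compatible init ops shown here is an assumption for illustration:
```
# The contents of this set are an assumption for illustration only.
FP16_COMPATIBLE_INIT_OPS = {"Float16UniformFill", "ConstantFill"}

def to_fp16_init_op(op_type):
    if op_type == "UniformFill":
        return "Float16UniformFill"  # special-cased replacement
    if op_type not in FP16_COMPATIBLE_INIT_OPS:
        raise ValueError(f"init op {op_type} is not fp16 compatible")
    return op_type

print(to_fp16_init_op("UniformFill"))  # Float16UniformFill
```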
Reviewed By: kennyhorror
Differential Revision: D14627209
fbshipit-source-id: eb427772874a732ca8b3a25d06670d119ce8ac14
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17549
Currently, Dropout is only enabled in training; we enable the option of having dropout in Eval.
This follows [1]; the functionality will be used for uncertainty estimation in the exploration project (a minimal sketch below).
[1] Gal, Yarin, and Zoubin Ghahramani. "Dropout as a bayesian approximation: Representing model uncertainty in deep learning." international conference on machine learning. 2016.
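A minimal numpy sketch of the idea (Monte Carlo dropout): keep dropout active at eval time and use the spread of repeated stochastic forward passes as an uncertainty estimate. The helper names and the toy linear model are hypothetical:
```
import numpy as np

def dropout(x, p, rng):
    # Inverted dropout: scale at apply time so no rescaling is needed.
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

def mc_dropout_predict(x, forward, p, rng, num_samples=30):
    preds = np.stack([forward(dropout(x, p, rng))
                      for _ in range(num_samples)])
    return preds.mean(axis=0), preds.std(axis=0)  # estimate + uncertainty

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 1))
x = rng.normal(size=(4, 8))
mean, std = mc_dropout_predict(x, lambda h: h @ w, p=0.5, rng=rng)
print(mean.ravel(), std.ravel())
```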
Reviewed By: Wakeupbuddy
Differential Revision: D14216216
fbshipit-source-id: 87c8c9cc522a82df467b685805f0775c86923d8b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16191
logdevice-related modifications for the generic feature type.
We directly convert the generic feature structures to JSON strings, which correspond to the column input in offline and dper.
Reviewed By: itomatik
Differential Revision: D13551909
fbshipit-source-id: 807830c50bee569de202530bc3700374757793a2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13004
Implement the BucketWeighted model layer, which learns a weight for each possible score in an IdScoreList. Here, we assume that the scores in the IdScoreList have already been converted into the appropriate 'buckets'; if this is not done, then essentially each score represents its own bucket.
We assume that the scores/buckets are integers, and if max_score is not set, we assume that the maximum cardinality of the scores is less than or equal to the cardinality of the ids (a minimal sketch below).
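A minimal numpy sketch of the pooling, with hypothetical names; the learned per-bucket weight scales each id's embedding before segment pooling:
```
import numpy as np

def bucket_weighted_pool(table, ids, scores, lengths, bucket_weights):
    # scores are integer buckets indexing a learned weight vector.
    weighted = table[ids] * bucket_weights[scores][:, None]
    out, start = [], 0
    for n in lengths:  # segment sum per example
        out.append(weighted[start:start + n].sum(axis=0))
        start += n
    return np.stack(out)

table = np.random.randn(10, 4)       # id embeddings
bucket_weights = np.random.randn(5)  # learned weight per bucket
ids = np.array([1, 3, 7])
scores = np.array([0, 2, 4])         # already bucketized
print(bucket_weighted_pool(table, ids, scores, [2, 1], bucket_weights).shape)
```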
Reviewed By: chonglinsun
Differential Revision: D10413186
fbshipit-source-id: 743e643a1b36adf124502a8b6b29976158cdb130
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10282
This diff removes the unused/deprecated features from the code base.
Reviewed By: manojkris
Differential Revision: D9169859
fbshipit-source-id: d6447b7916a7c687b44b20da868112e6720ba245
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10197
Support generic features in DPER2.
For now, since we only have one generic type (1), we directly add the parsed feature record to the embedding feature.
For new feature types with specific structures, corresponding code changes are also expected.
Reviewed By: itomatik
Differential Revision: D8788177
fbshipit-source-id: 9aaa6f35ece382acb4072ec5e57061bb0727f184