Summary:
- Integrated RFF into the preprocessing workflow for dense features
- Developed Flow interface to input RFF parameters
- Created unit test for using RFF with sparseNN
Reviewed By: chocjy
Differential Revision: D5367534
fbshipit-source-id: 07307259c501a614d9ee68a731f0cc8ecd17db68
Summary:
- Created the random Fourier features layer
- Added a unit test to check that the random Fourier features layer is built correctly
- Inspired by the paper [[ https://people.eecs.berkeley.edu/~brecht/papers/07.rah.rec.nips.pdf | Random Features for Large-Scale Kernel Machines]]
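
For readers unfamiliar with the paper, the idea behind random Fourier features can be sketched in plain Python. This is only an illustration of the technique for a Gaussian kernel, not the layer added in this diff; the names `make_rff`, `gaussian_kernel`, and `bandwidth` are hypothetical.

```python
import math
import random


def make_rff(input_dim, output_dim, bandwidth, seed=0):
    """Draw the random projection once and return a function x -> features.

    Assumes a Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2)).
    """
    rng = random.Random(seed)
    # Frequencies w ~ N(0, 1 / bandwidth^2), phases b ~ Uniform[0, 2*pi].
    w = [[rng.gauss(0.0, 1.0 / bandwidth) for _ in range(input_dim)]
         for _ in range(output_dim)]
    b = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(output_dim)]
    scale = math.sqrt(2.0 / output_dim)

    def transform(x):
        # z_j(x) = sqrt(2 / D) * cos(w_j . x + b_j)
        return [scale * math.cos(sum(wi * xi for wi, xi in zip(row, x)) + bj)
                for row, bj in zip(w, b)]

    return transform


def gaussian_kernel(x, y, bandwidth):
    sq_dist = sum((a - c) ** 2 for a, c in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


rff = make_rff(input_dim=3, output_dim=4000, bandwidth=1.0)
x, y = [0.1, 0.2, 0.3], [0.4, 0.0, -0.1]
# The inner product of the features approximates the kernel value.
approx = sum(a * c for a, c in zip(rff(x), rff(y)))
exact = gaussian_kernel(x, y, 1.0)
```

The approximation error shrinks roughly as one over the square root of `output_dim`, so the feature width trades accuracy against cost.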
Reviewed By: chocjy
Differential Revision: D5318105
fbshipit-source-id: c3885cb5ad1358853d4fc13c780fec3141609176
Summary:
In some cases we don't want to compute the full FC during eval.
These layers allow us to compute the dot product between
X and W[idx,:], where idx is an input, e.g., the label.
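
A minimal sketch of the computation, assuming idx holds one row index per example (that per-example reading is an assumption, and `select_row_dot` is a hypothetical name, not the layer's API):

```python
def select_row_dot(x, w, idx):
    """For each example i, return dot(x[i], w[idx[i]]).

    x:   list of feature vectors, one per example
    w:   full FC weight matrix, one row per output unit
    idx: one row index per example (e.g., the label)
    """
    return [sum(a * b for a, b in zip(x_i, w[j])) for x_i, j in zip(x, idx)]


x = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
idx = [2, 0]
out = select_row_dot(x, w, idx)  # one scalar per example instead of 3 outputs
```

Only one row of W is touched per example, so the cost no longer scales with the number of output units.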
Reviewed By: kittipatv
Differential Revision: D5305364
fbshipit-source-id: 0b6a1b61cc8fcb26c8def8bcd037a4a35d223078
Summary:
Similar to sparse_nn all GPU, this is our first step towards an offline full-GPU experiment.
**Compare Run**
cat(128, 32)512-512 :
GPU 21138598 https://fburl.com/jpeod1pi
CPU 21138787 https://fburl.com/vma7225l
Reviewed By: dzhulgakov
Differential Revision: D5308789
fbshipit-source-id: 413819bf9c5fff125d6967ed48faa5c7b3d6fa85
Summary:
As described in T19378176 by kittipatv, in this diff, we fix the issue of __getitem__() of schema.List.
For example, given Map(int32, float) (Map is a special List), field_names() will return "lengths", "values:keys", and "values:values". "values:keys" and "values:values" are not accessible via __getitem__(). __getitem__() bypasses the values prefix and directly accesses the fields in the map. Other APIs (e.g., _SchemaNode & dataset_ops) expect "values:keys" and "values:values" as it simplifies traversal logic. Therefore, we should keep field_names() as is and fix __getitem__().
Reviewed By: kittipatv
Differential Revision: D5251657
fbshipit-source-id: 1acfb8d6e53e286eb866cf5ddab01d2dce97e1d2
Summary:
- Incorporated a dropout layer into the sparseNN training and testing pipeline
- Integrated an advanced model options feature on Flow UI for users to specify dropout rate
- Created an end-to-end unit test to build and run a model with dropout
Reviewed By: chocjy
Differential Revision: D5273478
fbshipit-source-id: f7ae7bf4de1172b6e320f5933eaaebca3fd8749e
Summary:
Truncate the id list using the max length computed in compute meta, so that it has a known maximum length,
which is useful for the position-weighted pooling method.
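
A rough sketch of the truncation on a lengths/values representation. It assumes the first `max_length` ids of each list are kept (the diff summary does not say which end is dropped), and `truncate_id_lists` is a hypothetical helper name.

```python
def truncate_id_lists(lengths, values, max_length):
    """Cap every id list at max_length entries.

    lengths: number of ids per example
    values:  all ids, concatenated in example order
    Assumption: the leading ids of each list are the ones kept.
    """
    new_lengths, new_values = [], []
    offset = 0
    for n in lengths:
        keep = min(n, max_length)
        new_values.extend(values[offset:offset + keep])
        new_lengths.append(keep)
        offset += n  # always advance by the original length
    return new_lengths, new_values


lengths, values = truncate_id_lists([3, 1, 4], [1, 2, 3, 4, 5, 6, 7, 8], 2)
```

With every list capped at `max_length`, a position-weighted pooling method needs at most `max_length` position weights.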
Reviewed By: sunwael
Differential Revision: D5233739
fbshipit-source-id: f73deec1bb50144ba14c4f8cfa545e1ced5071ce
Summary:
The SparseToDense layer is essentially calling the SparseToDenseMask op.
This makes it impossible to call the functional layer with the true SparseToDense op.
This diff is to rename the layer.
Please let me know if I missed anything or you have a better name suggestion.
Differential Revision: D5169353
fbshipit-source-id: 724d3c6dba81448a6db054f044176ffc7f708bdb
Summary:
If two SparseToDense layers densify the same IdList feature,
we might export invalid input for the prediction in the input specs.
This diff changes the behavior to use Alias to a new blob instead of
passing things directly.
Reviewed By: dzhulgakov
Differential Revision: D5093754
fbshipit-source-id: ef4fa4ac3722331d6e72716bd0c6363b3a629cf7
Summary: Currently, using two-tower models with cosine distance results in bad calibration. Adding a bias to the output of the cosine term solves the problem.
Reviewed By: xianjiec
Differential Revision: D5132606
fbshipit-source-id: eb4fa75acf908db89954eeee67627b4a00572f61
Summary:
In the Dper utility, add a function `load_parameters_from_model_init_options` to
allow initializing parameters from pretrained models.
Reviewed By: xianjiec
Differential Revision: D4926075
fbshipit-source-id: 5ab563140b5b072c9ed076bbba1aca43e71c6ac5
Summary:
Segment-based Ops require increasing segment ids without gaps. Lengths-based Ops do not
have this requirement.
Other pooling methods, e.g., LogExpMean, do not have Lengths-based Ops available yet.
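
The difference between the two representations can be illustrated in plain Python. This is a sketch of the data layout only, not of the Caffe2 operators; all function names here are hypothetical. An empty list in the middle of a batch is one way a gap in segment ids can arise.

```python
def lengths_to_segment_ids(lengths):
    """Expand per-example lengths into one segment id per value."""
    seg_ids = []
    for i, n in enumerate(lengths):
        seg_ids.extend([i] * n)
    return seg_ids


def has_no_gaps(seg_ids):
    """True if ids are non-decreasing and never skip a segment."""
    return all(b - a in (0, 1) for a, b in zip(seg_ids, seg_ids[1:]))


def lengths_sum(values, lengths):
    """Sum pooling driven by lengths; empty examples pool to 0."""
    out, offset = [], 0
    for n in lengths:
        out.append(sum(values[offset:offset + n]))
        offset += n
    return out


lengths = [2, 0, 1]                         # second example has no values
seg_ids = lengths_to_segment_ids(lengths)   # [0, 0, 2]: id 1 never appears
pooled = lengths_sum([1.0, 2.0, 3.0], lengths)
```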
Differential Revision: D5019165
fbshipit-source-id: ab01a220e10d4ed9fa2162939579d346607f905e
Summary: Layer for LastNWindowCollector op. We need this since it's an in-place operator.
Reviewed By: chocjy
Differential Revision: D4981772
fbshipit-source-id: ec85dbf247d0944db422ad396771fa9308650883
Summary:
Layer to allow a model to follow different paths for each instantiation context and join later. Together with the tagging system cleanup (this is a separate issue), this should reduce the need to write a layer to differentiate between contexts.
Re: tagging system cleanup, we should make exclusion more explicit: EXCLUDE_FROM_<CONTEXT>. This would simplify instantiation code. TRAIN_ONLY should become a set of all EXCLUDE_FROM_*, except EXCLUDE_FROM_TRAIN.
Reviewed By: kennyhorror
Differential Revision: D4964949
fbshipit-source-id: ba6453b0deb92d1989404efb9d86e1ed25297202
Summary: Previously, the code below would go out of bounds.
Reviewed By: xianjiec
Differential Revision: D4968037
fbshipit-source-id: 3760e2cddc919c45d85ac644ac3fabf72dbaf666
Summary: Current eval nets contain loss operators; see example: https://fburl.com/6otbe0n7, which is unnecessary. This diff is to remove them from the eval net.
Differential Revision: D4934589
fbshipit-source-id: 1ba96c20a3a7ef720414acb4124002fb54cabfc7
Summary: A layer that takes raw ids as inputs and outputs the indices which can be used as labels. The mapping will be stored with the model.
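
A toy sketch of such a mapping, assuming indices are assigned in first-seen order and the dictionary is the state that would be saved with the model (both are assumptions; `IdToIndex` is a hypothetical class, not the layer itself):

```python
class IdToIndex:
    """Map arbitrary raw ids to dense indices 0, 1, 2, ..."""

    def __init__(self):
        self.mapping = {}  # raw id -> index; persisted alongside the model

    def __call__(self, raw_ids):
        indices = []
        for raw in raw_ids:
            if raw not in self.mapping:
                # Assumption: unseen ids get the next free index.
                self.mapping[raw] = len(self.mapping)
            indices.append(self.mapping[raw])
        return indices


to_index = IdToIndex()
first = to_index([1001, 42, 1001, 7])
second = to_index([7, 42])  # reuses the stored mapping
```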
Reviewed By: kittipatv
Differential Revision: D4902556
fbshipit-source-id: 647db47b0362142cdba997effa2ef7a5294c84ee
Summary: Added a new context to layers.py
Reviewed By: kennyhorror
Differential Revision: D4817124
fbshipit-source-id: 36f08964b86092e81df24c1b9d4b167293a7ffb8
Summary:
Currently, the functional layer infers the output types and shapes by running the operator once.
But in cases where special input data are needed to run the operator, the inference may fail.
This diff allows the caller to manually specify the output types and shapes if the automatic inference may fail.
Reviewed By: kennyhorror
Differential Revision: D4864003
fbshipit-source-id: ba242586ea384f76d745b29a450497135717bdcc
Summary: Having to pack the input to schema doesn't make much sense since the structure is not recognized by operators anyway.
Differential Revision: D4895686
fbshipit-source-id: df78884ed331f7bd0c69db4f86c682c52829ec76
Summary: Perform gather on the whole record. This will be used for negative random sampling.
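
Conceptually, gathering a whole record applies the same row indices to every field. A minimal sketch, assuming a record is a dict of equally long dense columns (`gather_record` and the field names are hypothetical):

```python
def gather_record(record, indices):
    """Pick the same rows from every field of a record."""
    return {name: [column[i] for i in indices]
            for name, column in record.items()}


record = {"label": [0, 1, 1, 0], "score": [0.1, 0.9, 0.8, 0.3]}
# Indices may repeat, as they would when sampling rows at random.
sampled = gather_record(record, [3, 3, 1])
```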
Reviewed By: kennyhorror
Differential Revision: D4882430
fbshipit-source-id: 19e20f7307064755dc4140afb5ba47a699260289
Summary:
The basic idea of bucket-based calibration:
1. given a model and a calibration data set
2. apply the model to the calibration data set and sort the prediction scores
3. bucketize the prediction scores
4. for the samples in each bucket, compute the proportion of positive samples
5. build a set of piecewise linear functions that map from the bucket range to the proportion
6. append an operator of piecewise linear transform to the prediction net that is supposed to calibrate the raw predictions.
7. to support calibration in realtime training, we create a new type of Net -- bucket calibration net. This needs a new Context to add_calibration_ops(), to export and load the new Net.
This includes a series of diffs.
This diff implements a layer that adds different operators for train/cali/eval for bucket based calibration.
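
Steps 2 to 5 above can be sketched in plain Python. The sketch assumes equal-frequency buckets and uses each bucket's mean score as the knot of the piecewise linear function; both choices are assumptions for illustration, and the function names are hypothetical rather than the ones used in this series of diffs.

```python
def fit_bucket_calibration(scores, labels, num_buckets):
    """Return knots (xs, ys): mean score and positive rate per bucket."""
    pairs = sorted(zip(scores, labels))  # step 2: sort by prediction score
    n = len(pairs)
    xs, ys = [], []
    for b in range(num_buckets):         # step 3: equal-frequency buckets
        chunk = pairs[b * n // num_buckets:(b + 1) * n // num_buckets]
        if not chunk:
            continue
        xs.append(sum(s for s, _ in chunk) / len(chunk))
        ys.append(sum(l for _, l in chunk) / len(chunk))  # step 4
    return xs, ys


def calibrate(score, xs, ys):
    """Step 5: piecewise linear interpolation between knots, clamped."""
    if score <= xs[0]:
        return ys[0]
    if score >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if score <= xs[i]:
            x0, x1 = xs[i - 1], xs[i]
            if x1 == x0:
                return ys[i]
            t = (score - x0) / (x1 - x0)
            return ys[i - 1] + t * (ys[i] - ys[i - 1])
    return ys[-1]


scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
xs, ys = fit_bucket_calibration(scores, labels, num_buckets=2)
calibrated = calibrate(0.5, xs, ys)
```

In the prediction net, the fitted knots would feed the piecewise linear transform operator mentioned in step 6.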
Reviewed By: dragonxlwang
Differential Revision: D4817119
fbshipit-source-id: 44f8fcad2a94f40f7439cc1ad47e7bae5e17397d
Summary: Somehow, feed-non-ranking training data usually have this type of column. Add option to support it.
Reviewed By: xianjiec, kennyhorror
Differential Revision: D4773960
fbshipit-source-id: 5a7ef4618a070e04f3cd8ddfcbf2b7441c00d92d
Summary:
multiple places broken, blocking the push :(
- fix the weighted training for ads and feeds
- fix the publishing if no exporter model is selected
- fix the feeds retrieval evaluation
- added the default config for retrieval workflows; plan to use it for the flow test (in the next diff)
- clean up unused code
- smaller hash size for faster canary test
Reviewed By: chocjy
Differential Revision: D4817829
fbshipit-source-id: e3d407314268b6487c22b1ee91f158532dda8807
Summary:
This diff does the following:
1. Add optimization options to model options in the UI for all workflows.
2. Allow different parameters to use different optimizers (or the same optimizer with different settings, e.g., learning rate).
3. Remove the default values for the `sparseDedupAggregator` field in the thrift file as the default value for that should just be `None` instead of 'sum'.
4. `fb/dper/layer_models/mlp_sparse.py` is deprecated.
5. Add calibration to two tower workflows.
Reviewed By: kittipatv
Differential Revision: D4767004
fbshipit-source-id: de92ea63fb0ff33f8581b1693479b723a68cd2d1
Summary:
Add distributed training to dper2 and keep dper1 working.
* Created a ModelDelegator to wrap ModelHelper and LayerModelHelper to mitigate the difference.
* To get the average length for sparse features, I extracted some information in feature_processor. There should be a better way to do it after we have the new compute_meta.
* Metrics currently run only on the first trainer.
* The model is saved correctly for evaluation. But I'm still not sure how to handle the weights for adagrad.
Reviewed By: kennyhorror
Differential Revision: D4767745
fbshipit-source-id: 0559d264827a7fd9327071e8367d1e84a936bea9
Summary:
Adding support for multilabel in the multiclass workflow. `input_feature_schema` and `trainer_extra_schema` are now functions that take in the preprocessor option and output the schema. This allows dynamic schema definition based on the option.
Changing default value will be in the next diff.
Reviewed By: xianjiec
Differential Revision: D4750064
fbshipit-source-id: 896143f432e963bc1723c0153749efeb39a83bec
Summary: This layer will be used to sample negative labels for sampled softmax.
Differential Revision: D4773444
fbshipit-source-id: 605a979c09d07531293dd9472da9d2fa7439c619