pytorch/caffe2/python/operator_test
James Reed 01c76bf830 Optimize TransposeOp by using strided access pattern, bulk memory transfer, and other profile-guided optimizations
Summary: Work in progress for improving the performance of the TransposeOp on CPU. This is used extensively for inference in several neural MT systems, so optimizing this function is worthwhile and will reduce request latency.

Differential Revision: D4913075

fbshipit-source-id: fa2742829291d91f3eba00fdfe7d6c0dae83e206
2017-04-20 18:31:40 -07:00
activation_ops_test.py Caffe2: CUDA implementation for LeakyReluOp 2017-03-28 08:48:25 -07:00
adagrad_test.py Add tests and GPU impls for sparse optimizers 2017-04-13 11:07:40 -07:00
adam_test.py Add tests and GPU impls for sparse optimizers 2017-04-13 11:07:40 -07:00
atomic_ops_test.py
boolean_unmask_test.py Caffe2 unit test for unmask 2017-04-18 21:06:18 -07:00
checkpoint_test.py snapshot -> checkpoint 2016-12-15 12:01:30 -08:00
conv_test.py Conv-ND NCHW CPU/CUDA implementation 2017-03-20 14:01:07 -07:00
conv_transpose_test.py
copy_ops_test.py make CopyGPUToCPU/CPUToGPU handle sparse gradients 2017-04-13 17:16:26 -07:00
cosine_embedding_criterion_op_test.py
counter_ops_test.py AtomicCounter to return previous value on Reset. 2017-02-02 14:59:30 -08:00
crf_test.py CRF layer in caffe2 2017-03-23 22:02:02 -07:00
cross_entropy_ops_test.py CUDA version of SigmoidCrossEntropyWithLogits 2017-04-14 16:07:33 -07:00
dataset_ops_test.py Revert D4870606: caffe2: datasets pack/unpack 2017-04-18 16:47:05 -07:00
duplicate_operands_test.py
elementwise_linear_op_test.py ElementwiseLinearOp 2017-04-17 14:18:27 -07:00
elementwise_op_broadcast_test.py new SumReduceLike op CPU/GPU implementation and doc 2017-04-13 10:28:46 -07:00
elementwise_ops_test.py Sqr op and gradient 2017-03-07 03:03:07 -08:00
emptysample_ops_test.py
extend_tensor_op_test.py
fc_operator_test.py Test for FC operator + fix for docs 2017-01-27 10:44:24 -08:00
filler_ops_test.py fix unit test 2017-04-19 17:22:00 -07:00
gather_ops_test.py Add GatherOp for GPU, and update its tests. 2017-03-31 13:20:09 -07:00
gather_ranges_op_test.py
given_tensor_fill_op_test.py Fixes for ops without a CUDA backend 2017-03-29 14:36:09 -07:00
group_conv_test.py Make all convolution operators allow optional bias term 2016-12-21 15:14:24 -08:00
hsm_test.py Generate huffman tree 2017-01-19 16:14:23 -08:00
index_ops_test.py Change the schema of IndexLoad & IndexFreeze so that state change is captured by the framework 2017-02-14 10:05:12 -08:00
instance_norm_test.py instance norm test fix 2017-02-25 14:31:42 -08:00
lengths_tile_op_test.py Renaming DuplicateOp to LengthsTileOp 2017-04-12 22:04:20 -07:00
loss_ops_test.py Caffe2: consolidate AveragedLoss with SumElementsOp 2017-04-06 10:35:01 -07:00
margin_ranking_criterion_op_test.py
matmul_op_test.py BatchMatMulOp: use cuBLAS batched strided gemm for CUDA 2017-03-28 11:54:09 -07:00
mkl_conv_op_test.py MKL convolution operator 2017-01-23 09:59:30 -08:00
mkl_packed_fc_op_test.py MKL convolution operator 2017-01-23 09:59:30 -08:00
mkl_speed_test.py MKL convolution operator 2017-01-23 09:59:30 -08:00
momentum_sgd_test.py Add tests and GPU impls for sparse optimizers 2017-04-13 11:07:40 -07:00
mpi_test.py Setup MPI before test start 2016-12-19 15:59:32 -08:00
one_hot_ops_test.py feature processing ops 2017-04-11 07:07:51 -07:00
pack_ops_test.py Registering GPU version of PackSegments using GPUFallbackOp 2017-03-24 16:01:53 -07:00
pad_test.py Support cropping with negative pad sizes in PadImage 2017-04-03 23:47:54 -07:00
partition_ops_test.py
piecewise_linear_transform_test.py PiecewiseLinearTransformOp supports passing params from input blobs. 2017-04-08 11:02:35 -07:00
pooling_test.py Generalize PoolingOp(CUDA) to compute 1D, 2D and 3D pooling. 2017-04-12 09:16:45 -07:00
pow_op_test.py CUDA version of elementwise power + rename to Pow + gradient 2017-03-07 10:20:40 -08:00
python_op_test.py
rank_loss_operator_test.py Normalize rank loss gradient to avoid convergence issues when the number of pairs is really large 2016-12-21 17:29:24 -08:00
record_queue_test.py
recurrent_network_test.py Update IntelComposerXE to 2017.2.274 2017-04-19 10:07:09 -07:00
reduce_ops_test.py ReduceBack{Sum|Mean}Op CPU & GPU implementation 2017-03-13 16:19:58 -07:00
reduction_ops_test.py SumSqrElements 2017-04-10 16:16:52 -07:00
relu_op_test.py Change back the function signature of relu gradient to only use 2017-04-13 22:08:09 -07:00
reshape_ops_test.py Allow test discovery in caffe2/python/ 2017-03-14 18:16:41 -07:00
resize_op_test.py Add ResizeNearest operator 2017-03-16 18:49:01 -07:00
rnn_cell_test.py RNNCell, LSTMCell, LSTMWithAttentionCell 2017-04-18 00:47:20 -07:00
segment_ops_test.py Allow test discovery in caffe2/python/ 2017-03-14 18:16:41 -07:00
sequence_ops_test.py add gpu support for caffe2-seq2seq 2017-03-17 05:19:14 -07:00
shape_inference_test.py create_net: explicitly specify if one wants to overwrite the network. 2017-04-17 21:46:53 -07:00
softmax_ops_test.py memory-saving only_loss argument for SoftmaxWithLoss 2017-04-06 13:04:31 -07:00
sparse_gradient_checker_test.py Fixes for ops without a CUDA backend 2017-03-29 14:36:09 -07:00
sparse_ops_test.py
spatial_bn_op_test.py
square_root_divide_op_test.py
stats_ops_test.py Performance counters 2017-02-21 16:31:24 -08:00
string_ops_test.py
text_file_reader_test.py
tile_op_test.py Revert D4794432: Added tiles and axis as input parameters to Tile Operator 2017-04-19 14:17:25 -07:00
top_k_test.py Implement TopK op in caffe2 2017-03-16 17:32:20 -07:00
unique_uniform_fill_op_test.py UniqueUniformFillOp 2017-02-15 16:00:44 -08:00
utility_ops_test.py Optimize TransposeOp by using strided access pattern, bulk memory transfer, and other profile-guided optimizations 2017-04-20 18:31:40 -07:00