pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-15 21:00:47 +00:00

Author	SHA1	Message	Date
Gu, Jinghui	2ebeb33697	Fallback to CPU concat op to handle TensorCPU inputs (#15263) Summary: Fallback to CPU concat op to handle TensorCPU inputs Pull Request resolved: https://github.com/pytorch/pytorch/pull/15263 Differential Revision: D13587030 Pulled By: yinghai fbshipit-source-id: 010a8579d61c3beb8556eb92493a552b2ab0030c	2019-01-07 11:13:23 -08:00
Cheng,Penghui	1488c5dd03	support 0 size in any of the tensor dimensions in mkldnn (#15295) Summary: support 0 size in any of the tensor dimensions in mkldnn Pull Request resolved: https://github.com/pytorch/pytorch/pull/15295 Differential Revision: D13573747 Pulled By: yinghai fbshipit-source-id: 5bf7a0b9e2567e80f44981a7823be5407fc94e53	2019-01-04 22:33:18 -08:00
Cheng,Penghui	1717ea1da0	Implementation of ChannelShuffle Op for MKLDNN (#15106) Summary: the speed-up of a single operation is up to 3X. Pull Request resolved: https://github.com/pytorch/pytorch/pull/15106 Differential Revision: D13429596 Pulled By: bddppq fbshipit-source-id: f8d987cafeac9bef9c3daf7e43ede8c6a4ee2ce5	2018-12-12 20:25:12 -08:00
Orion Reblitz-Richardson	febc7ff99f	Add __init__.py so files get picked up on install (#14898) Summary: This will let us install tests and other Caffe2 python code as a part of running Caffe2 tests in PyTorch. Broken out of https://github.com/pytorch/pytorch/pull/13733/ cc pjh5 yf225 Pull Request resolved: https://github.com/pytorch/pytorch/pull/14898 Reviewed By: pjh5 Differential Revision: D13381123 Pulled By: orionr fbshipit-source-id: 0ec96629b0570f6cc2abb1d1d6fce084e7464dbe	2018-12-07 13:40:23 -08:00
PenghuiCheng	939877bf4b	Implementation of WeightedSum op for mkl-dnn and fix FC op output shape issue. Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14407 Reviewed By: yinghai Differential Revision: D13364364 Pulled By: wesolwsk fbshipit-source-id: e69bcd1bc52e35b2f0e45e5dc40184f1bd66605d	2018-12-07 12:35:19 -08:00
Gu, Jinghui	60963c2ecb	Add "axis" and "axis_w" arguments in FC to support customized axis to reduce dim. (#12971) Summary: Add "axis" and "axis_w" arguments in FC to support customized axis to reduce dim. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12971 Reviewed By: bddppq Differential Revision: D12850675 Pulled By: yinghai fbshipit-source-id: f1cde163201bd7add53b8475329db1f038a73019	2018-11-21 15:44:50 -08:00
Hui Wu	acd7811e33	Add sigmoid op based on MKL-DNN Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13097 Differential Revision: D13105366 Pulled By: yinghai fbshipit-source-id: d156e8fd519baeecf61c25dcd8fa2c2fa7351ef4	2018-11-19 22:56:35 -08:00
Gu, Jinghui	d01cb70497	build with mkl-dnn by default (#13303) Summary: build with mkl-dnn by default Pull Request resolved: https://github.com/pytorch/pytorch/pull/13303 Reviewed By: yinghai Differential Revision: D12979633 Pulled By: orionr fbshipit-source-id: 00d23fa27c0d13e82f7e5acb3ebd00ed7ba1d5dc	2018-11-08 11:18:27 -08:00
Gu, Jinghui	dbab9b73b6	separate mkl, mklml, and mkldnn (#12170) Summary: 1. Remove avx2 support in mkldnn 2. Separate mkl, mklml, and mkldnn 3. Fix convfusion test case Pull Request resolved: https://github.com/pytorch/pytorch/pull/12170 Reviewed By: yinghai Differential Revision: D10207126 Pulled By: orionr fbshipit-source-id: 1e62eb47943f426a89d57e2d2606439f2b04fd51	2018-10-29 10:52:55 -07:00
Yinghai Lu	a839a67aad	Add IDEEP unit test with zero-dim tensors (#8459) Summary: This test flushes out the issue that IDEEP cannot handle tensor with dims like (0, 2), which is a valid tensor shape. Pull Request resolved: https://github.com/pytorch/pytorch/pull/8459 Differential Revision: D10419328 Pulled By: yinghai fbshipit-source-id: c5efcd152364a544180a8305c47a2a2d126ab070	2018-10-19 23:57:33 -07:00
Cheng,Penghui	6e7e63fda3	Implementation of MomentumSGD/MomentumSGDUpdate operators for mkl-dnn (#11686) Summary: the speed-up of a single operation is up to 6X on BDW. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11686 Reviewed By: yinghai Differential Revision: D9828129 Pulled By: wesolwsk fbshipit-source-id: 7dbacea90609e18438f6fe1229c641937d0696c8	2018-09-27 13:39:59 -07:00
jgong5	c755616e00	Enable Detectron model inference for CPU and MKL-DNN paths (#10157) Summary: 1. Support ops needed for inference of Faster-RCNN/Mask-RCNN needed in Detectron, mostly direct fallbacks. 2. Use CPU device to hold 0-dim tensors and integer tensors in both fallback op and blob feeder, needed by Detectron models. 3. Ignore 0-dim tensor in MKL-DNN concat operator. 4. Generate dynamic library of Detectron module for CPU device. This PR obsoletes #9164. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10157 Differential Revision: D9276837 Pulled By: yinghai fbshipit-source-id: dc364932ae4a2e7fcefdee70b5fce3c0cee91b6f	2018-08-29 15:11:01 -07:00
jgong5	329d901a91	Fold AffineChannel to Conv, the same way as BN (for Detectron models) (#10293) Summary: AffineChannel is being used by public Detectron models, e.g. Mask-RCNN and Faster-RCNN. This PR folds this op into convolution the same way as BN to speed up inference. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10293 Differential Revision: D9276789 Pulled By: yinghai fbshipit-source-id: fbf6dd2c1be05f5713f760752e7245b1320a122b	2018-08-13 22:43:37 -07:00
Yinghai Lu	a9742e1a27	Add fallback to TensorCPU if there are unsupported types for IDEEP Tensor (#9667) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/9667 MKL-DNN doesn't support 64-bit integers (`cfee61bf81/include/mkldnn_types.h (L62-L75)`), so force-converting from `TensorCPU<long>` to an `s32` Ideep tensor will cause memory issues. This diff gives an alternative solution, where we just fall through to TensorCPU. The reasoning is that since MKL-DNN doesn't support 64-bit integer tensors, downstream ops have to be in CPUContext, so there is no reason to force-convert to an ideep tensor and back. Reviewed By: pjh5 Differential Revision: D8943544 fbshipit-source-id: f514903cda27e34b8887271c9df56c8220895116	2018-07-23 13:54:57 -07:00
Gu, Jinghui	e8b8c3895e	Enable Conv fusion optimizations in optimizeForIdeep (#9255) Summary: Enable fusion for IDEEP in optimizeForIdeep including Conv+ReLU, Conv+Sum, Conv+Sum+ReLU, Conv+BN Pull Request resolved: https://github.com/pytorch/pytorch/pull/9255 Reviewed By: bddppq Differential Revision: D8809030 Pulled By: yinghai fbshipit-source-id: af30bad3b96cb965bd26a4dfa810370faec4bb88	2018-07-16 21:28:50 -07:00
Yinghai Lu	cb98c5020a	Normalize IDEEP spatial bn op test (#9276) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/9276 Use `checkDevice` instead of rolling our own. Reviewed By: orionr Differential Revision: D8769401 fbshipit-source-id: bd47ec2b2501552c2da1cee2eb9ad96a215602b4	2018-07-09 11:55:41 -07:00
Yinghai Lu	2ed03898cd	Add depthwise convolution test for IDEEP (#8301)	2018-06-09 08:44:13 -07:00
Viswanath Sivakumar	832c88a766	[ideep] Add IDEEP Squeeze op (#8227) Similar to MKLSqueezeOp at caffe2/mkl/operators/squeeze_op.cc	2018-06-06 21:58:51 -07:00
Yinghai Lu	c446269568	cpu/ideep context converter (#8139)	2018-06-04 21:28:59 -07:00
Sebastian Meßmer	b3e87b1066	Fix fbcode compatibility (#7939)	2018-05-30 13:35:46 -04:00
Sebastian Meßmer	49f8581745	Update from facebook (#7855 ) * [mpscnn] MPSCNNChannelShuffle att * [Easy] Adding tags as an argument to the functional layer Without it "tags" would be added as an argument to the operator. The change here is based on the assumption that there is no operator that takes "tags" as an argument. * Fix locally_connected_op schema check. Fix locally_connected_op schema check. * [C2] Add TypeAndShape inference for few more operators As desc * [c2] Shape inference should support 0 as dimension Tensors can have 0 in their dimension. * Make MockHiveReader loop over and support max_examples Replace DatasetReader with RandomDatasetReader. So that Mock Hive Reader can simulate a large data input using a small sample file as source. * Utility function to wipe cache between benchmark runs Caffe2 benchmark does not wipe out cache between runs, and this potentially creates an unrealistically optimistic picture of performance. This diff adds utility function to wipe out the cache. * Allow caffe2 GlobalInit to be invoked multiple times Allow caffe2 GlobalInit to be invoked multiple times. Will re-parse gflags and update logging levels on successive invocations, but will not re-run init functions or perform other one-time initialization. * Add Caffe2 GlobalInitIsCalledGuard to base net and operator classes Warn if caffe2's GlobalInit function has not been invoked before creating an operator or net object. This is based on discussion here: https://fb.quip.com/kqGIAbmK7vNG * Rethrow current exception on failure Rethrow current exception instead of copy constructing a new one on op failure. * Make `clone()` return subclass of List/Struct `clone()` is not working correctly when we subclass those classes * Wipe the cache before the net run the util function is copied from D7409424 will rebase once D7409424 is landed. 
* [Caffe2] [Mobile] Support utils/cast.h::GetCastDataType with LITE_PROTO builds * Correct includes async_polling include -> async_base include * Prepare execution flags for executor migration Making async_scheduling aware of underlying net type to prepare for executor migration * Add operator level observers into async executor Adding operator level observers into RunAsync operators' calls * Cleanup TEST_Benchmark Remove duplicate code and provide default implementation in NetBase * [C2] Fix type and shape inference for binary comparison ops As desc. * Add GlobalInit to predictor to ensure initialization is always done before prediction FACEBOOK: Redo D7651453 the correct way. Now use a static variable for the arguments passed to GLog * Remove spammy log message This method is currently used in various places inside Caffe itself. * Disable events for operators inside a chain We don't need to use events in operators within a chain because the chain is always scheduled on a single stream, keeping only first and last event for scheduling purposes * Ensure correct finish run order In rare cases we might call finishRun and trigger net's destruction while another worker is still holding shared_ptr to a thread pool, that can cause thread pool destruction from within a worker thread in case no other nets are using the pool. This diff fixes the order of calling finishRun and also changes pool() to return raw pointer to keep pool's ownership within the net * Reduce unnecessary polling Make sure we don't waste CPU by polling operators that we can set an efficient callbacks on * Squash commit of syncing `9506eeb` from github to fbcode Patch xplat buck fix add virtual destructor to OptimizationPass add virtual destructor to OptimizationPass build fixes for sync build fixes for sync * Fix net tracing Fix net tracing from async_scheduling * Fix logging	2018-05-29 11:38:02 -07:00
Yinghai Lu	ed3b12e1ba	[Caffe2] Ideep net optimization passes (#7514) * Transform ideep net * Add conv+relu transformation * Add verification and address comments	2018-05-11 23:50:18 -07:00
Yinghai Lu	2863d935b9	[Caffe2] Fix of the performance issue of IDEEP (#7503) * Sketch fix of the performance issue of IDEEP * Revert CMakefile * Fix tests * format * comments * Print error * review comments	2018-05-11 13:43:41 -07:00
Jinghui	769397eb77	[Caffe2] [feature request] Add gradient operators for IDEEP (#7234) * Add gradient operators for IDEEP Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Add gradient test cases for IDEEP Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Upgrade third_party/ideep Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Refine SumOp for IDEEP Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Share input buffer in fallback op if possible Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Fallback ConvTranspose op for IDEEP Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Fix bug introduced by the patch of sharing input buffer Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Share output buffer in fallback operators Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Remove IDEEP to resolve repo issue Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Reflash IDEEP repo Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Remove redundant lines in IDEEP Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Fallback operators for IDEEP (Flatten, ResizeLike, Transpose, and Reshape) Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>	2018-05-09 08:52:24 -07:00
Yinghai Lu	e3935f7509	[Caffe2] Add conv+relu fusion for MKLDNN ops (IDEEP) (#7385) * Add conv+relu fusion for MKLDNN ops (IDEEP) * comments	2018-05-08 14:44:53 -07:00
Yinghai Lu	e9f6f14555	[Caffe2] Revamp the convnet benchmark code by using models from model zoo (#7351) * Revamp the convnet benchmark code by using models from model zoo * Move ModelDownloader to caffe2/python/models * Remove convnet_benchmarks.py	2018-05-08 08:53:52 -07:00
Yinghai Lu	8b70f7d248	[Caffe2] Clean up ideep integration (#6881) * Clean up ideep integration * . * Remove redundant code in convnet benchmark * MKL ON * Do not add -mavx2 everywhere * . * Comments * rename * .	2018-04-24 18:32:35 -07:00
Jinghui	26ddefbda1	[feature request] [Caffe2] Enable MKLDNN support for inference (#6699) * Add operators based-on IDEEP interfaces Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Enable IDEEP as a caffe2 device Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Add test cases for IDEEP ops Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Add IDEEP as a caffe2 submodule Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Skip test cases if no IDEEP support Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Correct cmake options for IDEEP Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Add dependences on ideep libraries Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Fix issues in IDEEP conv ops and etc. Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Move ideep from caffe2/ideep to caffe2/contrib/ideep Signed-off-by: Gu Jinghui <jinghui.gu@intel.com> * Update IDEEP to fix cmake issue Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Fix cmake issue caused by USE_MKL option Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Correct comments in MKL cmake file Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>	2018-04-22 21:58:14 -07:00

28 commits