pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-15 21:00:47 +00:00

Author	SHA1	Message	Date
Andrey Malevich	46debe7f23	[DPER] Introduce barrier operation to force synchronization of threads in async execution (#49322 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49322 In some cases async execution might lose dependencies (Alias-like ops) or produce suboptimal scheduling when there is a choice of which parts to schedule first. An example of the latter behavior can happen in ModelParallel training, where a copy can get lower priority than the rest of the execution on the given GPU, which will cause other GPUs to starve. This operator addresses these issues by introducing extra explicit dependencies between ops. Test Plan: Unit-test/ E2E testing in the future diffs. Reviewed By: xianjiec Differential Revision: D24933471 fbshipit-source-id: 1668994c7856d73926cde022378a99e1e8db3567	2020-12-15 16:13:42 -08:00
Newsha Ardalani	0fb58d76a1	Support ArgMin in c2_pt_converter Summary: + Add ArgMin support to Caffe2 to PyTorch converter + Using hypothesis to parameterize different conditions for test Test Plan: buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test Reviewed By: houseroad Differential Revision: D25016203 fbshipit-source-id: 94489fcf1ed3183ec96f9796a5b4fb348fbde5bc	2020-12-05 16:35:34 -08:00
Rahul Manghwani	142b21fd44	Add SparseLengthsSum4BitRowwiseSparse in c2_pt_converter (#48240 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/48240 Adds the support for converting the SparseLengthsSum4BitRowwiseSparse operator from caffe2 to pytorch as a part of c2_pt_converter Test Plan: Added a unit tested buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test Tests Passed : https://our.intern.facebook.com/intern/testinfra/testrun/2251799856412296 Reviewed By: houseroad Differential Revision: D25067833 fbshipit-source-id: 45cbc331ca35bee27e083714e65a1e87a2a2d2e0	2020-12-04 14:16:25 -08:00
Tristan Rice	dc7d8a889e	caffe2: refactor context to allow being typed (#48340 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/48340 This changes the context managed classes from using a decorator to define them to using inheritance. Inheritance allows the python static type checking to work correctly. ``` context.define_context() class Bar(object): ... context.define_context(allow_default=True) class Foo(object): ... ``` becomes ``` class Foo(context.Managed): ... class Bar(context.DefaultManaged): ... ``` Behavior differences: * arg_name has been removed since it's not used anywhere * classes need to call `super()` in `__enter__/__exit__` methods if they override (none do) This also defines a context.pyi file to add types for python3. python2 support should not be affected Test Plan: ci buck test //caffe2/caffe2/python:context_test //caffe2/caffe2/python:checkpoint_test Reviewed By: dongyuzheng Differential Revision: D25133469 fbshipit-source-id: 16368bf723eeb6ce3308d6827f5ac5e955b4e29a	2020-11-30 18:31:14 -08:00
Frank Seide	29f0e1e2ce	Fused8BitRowwiseQuantizedToFloat operator support (#48407 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/48407 T79817692: Fused8BitRowwiseQuantizedToFloat operator support for c2_pt_converter. Also refactored some repeated code from the existing test functions. (Initial commit only has refactoring.) Test Plan: buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test Reviewed By: bugra Differential Revision: D25069936 fbshipit-source-id: 72f6a845a1b4639b9542c6b230c8cd74b06bc5a0	2020-11-30 17:11:39 -08:00
Xiaodong Wang	d386d3323f	[dper] suppress excessive msg (#48404 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/48404 On bento this is printing a lot of msgs like (see N408483 if you're an internal user) ``` W1123 120952.322 schema.py:811] Scalar should be considered immutable. Only call Scalar.set() on newly created Scalar with unsafe=True. This will become an error soon. ``` And it's ignoring the log level I set at global level. Removing this line unless this is super important. Test Plan: build a local dper package and verify Differential Revision: D25163808 fbshipit-source-id: 338d01c82b4e67269328bbeafc088987c4cbac75	2020-11-30 14:55:52 -08:00
shubhambhokare1	bdf360f9f2	[ONNX] Update onnx submodule (#47366 ) Summary: Update onnx submodule to 1.8 release Pull Request resolved: https://github.com/pytorch/pytorch/pull/47366 Reviewed By: hl475 Differential Revision: D24968733 Pulled By: houseroad fbshipit-source-id: 2f0a3436ab3c9380ed8ff0887a483743c1209721	2020-11-30 00:05:46 -08:00
Tristan Rice	6eaf1e358c	caffe2/core.Net: is_external_input rebuild lookup tables when necessary Summary: is_external_input doesn't check if the lookup tables are valid. Calling .Proto() should invalidate all lookup tables and have them rebuilt on call to any methods depending on them. This adds this check to is_external_input. Test Plan: internal unit tests Reviewed By: dzhulgakov, esqu1 Differential Revision: D25100464 fbshipit-source-id: d792dec7e5aa9ffeafda88350e05cb757f4c4831	2020-11-20 10:53:24 -08:00
Xiaomeng Yang	2039ff3fbb	[Caffe2] Optimize MishOp on CPU (#48212 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/48212 Optimize MishOp on CPU Test Plan: buck test mode/dev-nosan //caffe2/caffe2/python/operator_test:activation_ops_test -- "mish" Reviewed By: houseroad Differential Revision: D25071304 fbshipit-source-id: fe94bfab512188d60412d66962983eff4f37bc07	2020-11-19 14:17:27 -08:00
Scott Wolchok	4c9eb57914	[PyTorch] Narrow Device to 2 bytes by narrowing DeviceType and DeviceIndex (#47023 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47023 DeviceType pretty clearly only needs 1 byte. DeviceIndex only needs 1 byte given that machines don't have anywhere near 255 GPUs in them as far as I know. ghstack-source-id: 116901430 Test Plan: Existing tests, added assertion to catch if my assumption about DeviceIndex is incorrect Reviewed By: dzhulgakov Differential Revision: D24605460 fbshipit-source-id: 7c9a89027fcf8eebd623b7cdbf6302162c981cd2	2020-11-18 19:39:40 -08:00
Tristan Rice	b10d6c6089	[caffe2] cache NextName indexes for faster name generation (#47768 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47768 This stores the next ID for a given NextName(prefix, output_id) so repeated calls to NextName are significantly faster. This accounts for ~65% of time spent for large models. Test Plan: buck test //caffe2/caffe2/python/... will launch canary job before landing to ensure no regressions + confirm speedup Reviewed By: dzhulgakov Differential Revision: D24876961 fbshipit-source-id: 668d73060d800513bc72d7cd405a47d15c4acc34	2020-11-17 12:24:00 -08:00
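The caching idea described in b10d6c6089 can be sketched roughly as below. This is a minimal illustration and not the Caffe2 code: `NameGenerator`, `reserve`, `next_name`, and the `prefix_N` naming scheme are all hypothetical, and the sketch assumes names are only ever added (never freed) through the generator.

```python
class NameGenerator:
    """Hypothetical sketch: hand out unique names of the form prefix_N."""

    def __init__(self):
        self._used = set()     # every name handed out or reserved so far
        self._next_id = {}     # prefix -> next candidate index (the cache)

    def reserve(self, name):
        """Mark a name as taken without generating it."""
        self._used.add(name)

    def next_name(self, prefix):
        # Resume from the cached index instead of rescanning from 0 on each call.
        i = self._next_id.get(prefix, 0)
        while f"{prefix}_{i}" in self._used:
            i += 1
        name = f"{prefix}_{i}"
        self._used.add(name)
        self._next_id[prefix] = i + 1
        return name


gen = NameGenerator()
gen.reserve("fc_1")  # simulate a name that was taken elsewhere
names = [gen.next_name("fc") for _ in range(3)]
print(names)
```

Without the `_next_id` cache, every call would probe `fc_0`, `fc_1`, ... from the start, which becomes quadratic over many calls with the same prefix.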
Ankur Singla	549ef1d668	[caffe][memonger] Extend operator schema check to dag memonger (#48021 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/48021 Extending operator schema check for simple memonger to dag memonger as well. As part of this a fix is being made to handle inplace ops (having at least one output name same as input blob). Earlier all the output blobs from ops were being treated as shareable but it failed assertion of external input blobs with the same name not allowed to share. Test Plan: Added corresponding unit tests Reviewed By: hlu1 Differential Revision: D24968862 fbshipit-source-id: b6679a388a82b0d68f65ade64b85560354aaa3ef	2020-11-16 19:17:55 -08:00
Ankur Singla	f743b5639a	[caffe2][memonger] Add support for distributed inference predict nets in DAG memonger (#47718 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47718 Distributed Inference splits a predict net into multiple parts, part0 being the main part which contains ops to make remote calls to other parts. part0 predict net may contain AsyncIf ops to optimize rpc call usage. AsyncIf ops have internal nets which may refer to memongered blobs. This change handles AsyncIf ops to update internal nets to refer to memongered blobs. As part of this change, I am also updating dag memonger traversal to always start from root ops, i.e. ops with in-degree 0. The earlier logic started traversing ops based on input head blobs, and if one of the head inputs was used in a non-root op that got visited before its parent, the traversal threw an assertion error here: https://fburl.com/diffusion/ob110s9z . It threw this assertion error for almost all distributed inference part0 nets. Test Plan: Added corresponding tests in memonger_test.py . Could not find unit tests in c++ version of memonger. Reviewed By: hlu1 Differential Revision: D24872010 fbshipit-source-id: 1dc99b2fb52b2bc692fa4fc0aff6b7e4c5e4f5b0	2020-11-13 14:12:07 -08:00
Jonathan Kwok	a3e08e5344	Support ReduceSum in c2_pt_converter (#47889 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47889 Adds support for converting the [caffe2 ReduceSum](https://caffe2.ai/docs/operators-catalogue#reducesum) operator to torch. ghstack-source-id: 116580127 Test Plan: buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test : [results](https://our.intern.facebook.com/intern/testinfra/testrun/6755399466095119) ✓ ListingSuccess: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - main (60.273) ✓ Pass: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - test_sub_op (caffe2.torch.fb.model_transform.c2_convert.c2_pt_converter_test.C2PTConverterTest) (101.119) ✓ Pass: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - test_layer_norm_conversion (caffe2.torch.fb.model_transform.c2_convert.c2_pt_converter_test.C2PTConverterTest) (101.404) ✓ Pass: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - test_local_model_conversion (caffe2.torch.fb.model_transform.c2_convert.c2_pt_converter_test.C2PTConverterTest) (101.966) ✓ Pass: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - test_reduce_sum (caffe2.torch.fb.model_transform.c2_convert.c2_pt_converter_test.C2PTConverterTest) (114.896) Reviewed By: bugra Differential Revision: D24925318 fbshipit-source-id: 3f3b791eff1b03e8f5adee744560fe8bc811c659	2020-11-13 12:02:58 -08:00
Gary Zheng	f1babb00f0	[caffe2] Fix ListWithEvicted _pprint_impl wrongly printing _evicted_values (#47881 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47881 ListWithEvicted's _pprint_impl was accidentally printing _items before this change. Reviewed By: dzhulgakov Differential Revision: D24928521 fbshipit-source-id: 0d7940719b4a27defbaae3b99af104d7fe7b5144	2020-11-13 09:23:10 -08:00
Alberto Alfarano	59e96c55f7	Support MatMul in c2_pt_converter Summary: Added the MatMul operator for caffe2 Test Plan: buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test Reviewed By: bugra Differential Revision: D24920937 fbshipit-source-id: 7ba09ba0439cb9bd15d6a41fd8ff1a86d8d11437	2020-11-12 20:56:58 -08:00
Peiyao Zhou	4078f44668	[TB][embedding supporting] Modify histogram to accept multiple types to skip Castop and avoid OOMing in Castop Summary: To support min/max/mean/std, SummarizeOp needs to skip size checking (similar to the LpNorm error mentioned above) and accept multiple types Test Plan: unit test: `buck test //caffe2/caffe2/fb/tensorboard/tests:tensorboard_accumulate_histogram_op_test` https://our.intern.facebook.com/intern/testinfra/testrun/1407375057859572 `buck test //caffe2/caffe2/fb/tensorboard/tests:tensorboard_accumulate_histogram_op_test --stress-runs 1000` https://our.intern.facebook.com/intern/testinfra/testrun/2533274832166362 Reviewed By: cryptopic Differential Revision: D24605507 fbshipit-source-id: fa08372d7c9970083c38abd432d4c86e84fb10e0	2020-11-11 12:03:54 -08:00
Richard Zou	17c58720fe	Revert D24346771: [caffe2][memonger] Add support for distributed inference predict nets in DAG memonger Test Plan: revert-hammer Differential Revision: D24346771 (`5882f2e540`) Original commit changeset: ad2dd2e63f3e fbshipit-source-id: 90346f08c890eebe71f068748a8e24e4db88c250	2020-11-10 12:11:22 -08:00
Ankur Singla	5882f2e540	[caffe2][memonger] Add support for distributed inference predict nets in DAG memonger Summary: Distributed Inference splits a predict net into multiple parts, part0 being the main part which contains ops to make remote calls to other parts. part0 predict net may contain AsyncIf ops to optimize rpc call usage. AsyncIf ops have internal nets which may refer to memongered blobs. This change handles AsyncIf ops to update internal nets to refer to memongered blobs. Here is one reference part0 predict net with AsyncIf ops: https://www.internalfb.com/intern/paste/P145812115/ As part of this change, I am also updating dag memonger traversal to always start from root ops, i.e. ops with in-degree 0. The earlier logic started traversing ops based on input head blobs, and if one of the head inputs was used in a non-root op that got visited before its parent, the traversal threw an assertion error here: https://fburl.com/diffusion/ob110s9z . It threw this assertion error for almost all distributed inference part0 nets. Reviewed By: hlu1 Differential Revision: D24346771 fbshipit-source-id: ad2dd2e63f3e822ad172682f6d63f8474492255d	2020-11-10 09:35:28 -08:00
Gary Zheng	8b3f1d1288	[caffe2] Add __slots__ to all classes in schema.py (#47541 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47541 The profiler has guided us to `schema.py`. Since these `Field`s are used everywhere and in huge quantities, we can easily make some optimizations system wide by adding `__slots__`. From StackOverflow, benefits include: * faster attribute access. * space savings in memory. Read more: https://stackoverflow.com/a/28059785/ Reviewed By: dzhulgakov Differential Revision: D24771078 fbshipit-source-id: 13f6064d367440069767131a433c820eabfe931b	2020-11-09 16:16:28 -08:00
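The effect of `__slots__` mentioned in 8b3f1d1288 can be illustrated with a small standalone sketch. `PlainField` and `SlottedField` are hypothetical stand-ins, not the actual `Field` classes in `schema.py`.

```python
class PlainField:
    """Ordinary class: each instance carries its own __dict__."""

    def __init__(self, name, offset):
        self.name = name
        self.offset = offset


class SlottedField:
    """Same attributes, but stored in fixed slots with no per-instance __dict__."""

    __slots__ = ("name", "offset")

    def __init__(self, name, offset):
        self.name = name
        self.offset = offset


plain = PlainField("a", 0)
slotted = SlottedField("a", 0)

# Slotted instances have no __dict__, which is where the memory saving comes from.
has_dict = hasattr(slotted, "__dict__")

# The trade-off: attributes not listed in __slots__ cannot be added later.
try:
    slotted.extra = 1
    rejected = False
except AttributeError:
    rejected = True

print(has_dict, rejected)
```

One design consequence worth noting: every class in an inheritance chain has to define `__slots__` for instances to stay free of `__dict__`, which may be why the commit touches all classes in the module.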
Gary Zheng	4c52a56c40	[caffe2] Properly call super init in schema.py (#47542 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47542 The previous way of doing `Field.__init__(self, [])` is just wrong. Switching to Python2 compatible way: `super(ObjectName, self).__init__(...)` Reviewed By: dzhulgakov Differential Revision: D24771077 fbshipit-source-id: d6798c72090c0264b6c583602cae441a1b14587c	2020-11-09 15:02:22 -08:00
Gary Zheng	4a58f35bef	[caffe2] Fix duplicate name bug in Net.AddExternalInput (#47530 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47530 `Net.AddExternalInput` should raise if there are duplicate names. The previous code would only raise if the addition of duplicates was in separate calls, but not if it was in the same call. Test Plan: Added two new regression tests ``` ✓ Pass: caffe2/caffe2/python:core_test - testSetInputRecordWithBlobs (caffe2.caffe2.python.core_test.TestExternalInputs) (9.622) ✓ Pass: caffe2/caffe2/python:core_test - testAddExternalInputShouldRaiseIfDuplicate (caffe2.caffe2.python.core_test.TestExternalInputs) (9.639) ✓ Pass: caffe2/caffe2/python:core_test - testSetInputRecordWithoutBlobs (caffe2.caffe2.python.core_test.TestExternalInputs) (9.883) ✓ Pass: caffe2/caffe2/python:core_test - testAddExternalInputShouldRaiseIfDuplicateInSameCall (caffe2.caffe2.python.core_test.TestExternalInputs) (10.153) ``` Test trained 2 models. No issues f230755456 f230754926 Reviewed By: dzhulgakov Differential Revision: D24763586 fbshipit-source-id: c87088441d76f7198f8b07508b2607aec13521ed	2020-11-09 08:30:58 -08:00
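The bug class fixed in 4a58f35bef (duplicates slipping through when they arrive in the same call) can be sketched as follows. This is an assumption-laden illustration, not the `Net.AddExternalInput` implementation: `add_external_inputs` and its list-based storage are hypothetical.

```python
def add_external_inputs(existing, *new_names):
    """Append new_names to existing, raising on any duplicate.

    Checks against names already registered AND names seen earlier
    in this same call, then mutates `existing` only if all pass.
    """
    seen = set(existing)
    for name in new_names:
        if name in seen:
            raise ValueError(f"duplicate external input: {name}")
        seen.add(name)  # makes later duplicates in this call detectable
    existing.extend(new_names)
    return existing


inputs = ["data"]
add_external_inputs(inputs, "label", "weight")

try:
    add_external_inputs(inputs, "bias", "bias")  # duplicate within one call
    same_call_raised = False
except ValueError:
    same_call_raised = True

print(inputs, same_call_raised)
```

Validating everything before calling `extend` keeps the list unchanged when a call fails, so a rejected call leaves no partial state behind.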
Shiyan Deng	c19eb4ad73	BoxWithNMSLimit support int `batch_splits` input (#47504 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47504 allow int type input of `batch_splits` Test Plan: ``` buck test caffe2/caffe2/python/operator_test:torch_integration_test -- test_box_with_nms_limits ``` Reviewed By: jackm321 Differential Revision: D24629522 fbshipit-source-id: 61cb132e792bddd8f9f1bca5b808f1a9131808f0	2020-11-07 00:27:51 -08:00
Gary Zheng	582e852fba	[caffe2] Add unittests for schema.Field init (#47512 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47512 I deleted the last line of `__init__` -- `self._field_offsets.append(offset)` -- and the unittests didn't fail. So this diff is to improve test coverage. Test Plan: ``` ✓ Pass: caffe2/caffe2/python:schema_test - testInitShouldSetEmptyParent (caffe2.caffe2.python.schema_test.TestField) (8.225) ✓ Pass: caffe2/caffe2/python:schema_test - testInitShouldSetFieldOffsetsIfNoChildren (caffe2.caffe2.python.schema_test.TestField) (8.339) ✓ Pass: caffe2/caffe2/python:schema_test - testInitShouldSetFieldOffsets (caffe2.caffe2.python.schema_test.TestField) (8.381) ``` Reviewed By: dzhulgakov Differential Revision: D24767188 fbshipit-source-id: b6ce8cc96ecc61768b55360e0238f7317a2f18ea	2020-11-06 13:27:58 -08:00
Bugra Akyildiz	c26c4690fe	Add sub operator Summary: Add sub operator for caffe2 Test Plan: ``` buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test ``` Reviewed By: houseroad Differential Revision: D24685090 fbshipit-source-id: 60d745065d01b634ebd3087e533d8b9ddab77a1f	2020-11-06 12:31:17 -08:00
Tristan Rice	47198e3208	[caffe2] improve core.Net cloning/init performance (24x for large models!) (#47475 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47475 This improves the core.Net cloning/init performance by quite a bit. It makes set_input_record run in linear time instead of O(n^2) by checking the external_input map instead of regenerating the external inputs each time and then iterating over it. Test Plan: unit tests + canary runs Reviewed By: dzhulgakov Differential Revision: D24765346 fbshipit-source-id: 92d9f6dec158512bd50513b78675174686f0f411	2020-11-06 11:34:12 -08:00
Yen-Jung Chang	6e22b6008d	[MLF] Allow for computing prune quantile thresholds on absolute value of indicators in distributed-inference-compatible embedding LUT pruning (#46789 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46789 1. Now `SelfBinningHistogram` can calculate the binning histogram using the absolute values from the given array of values. 2. Update the invocation of `SelfBinningHistogram` in `post_training_prune`. Test Plan: 1. [buck test caffe2/caffe2/python/operator_test:self_binning_histogram_test](https://www.internalfb.com/intern/testinfra/testconsole/testrun/6473924488326108/) 2. [buck test dper3/dper3_backend/delivery/tests:post_training_prune_test](https://www.internalfb.com/intern/testinfra/testconsole/testrun/2251799854023163/) Reviewed By: hwangjeff Differential Revision: D24494097 fbshipit-source-id: 95e47137b25746e686ef9baa9409560af5d58fc1	2020-11-02 11:31:31 -08:00
Basil Hosmer	f05b66b70d	pass TypeMeta by value (#45026 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45026 Test Plan: Imported from OSS Reviewed By: ezyang Differential Revision: D23802943 Pulled By: bhosmer fbshipit-source-id: 81b06ef00bf8eb4375c0e0ff2032e03bd1d1188a	2020-10-30 10:14:17 -07:00
Bugra Akyildiz	eec201c138	Add last_n_window_collector Summary: Add `last_n_window_collector`, as C2 supports this operator and PyTorch currently does not: https://www.internalfb.com/intern/diffusion/FBS/browsefile/master/fbcode/caffe2/caffe2/operators/last_n_window_collector.cc?lines=139 ## Problem that we are solving This operator works on multiple pieces of data and collects the last `n` elements that have been seen. Suppose the following pieces of data are passed in: ``` [1, 2, 3, 4] [5, 6, 7] [8, 9, 10, 11] ``` over 3 passes, and the collector size is 6. The expected result is: ``` [6, 7, 8, 9, 10, 11] ``` In other words, we need a FIFO (First In, First Out) mechanism: as we pass data through the collector, new data is pushed at the end and the oldest data is dropped. In this particular example, in the first pass (the data is `[1, 2, 3, 4]`), we hold `[1, 2, 3, 4]` in the queue, as our queue size is 6. In the second pass (the data is `[5, 6, 7]`), we hold `[2, 3, 4, 5, 6, 7]` in the queue; since 1 was inserted first, it is dropped due to the size limit of the queue. In the third pass (the data is `[8, 9, 10, 11]`), we hold `[6, 7, 8, 9, 10, 11]` in the queue, and `2, 3, 4, 5` are dropped due to the size of the queue. For the multi-dimensional case, when we have the following data: ``` [[1, 2], [2, 3], [3, 4], [4, 5]] [[5, 6], [6, 7], [7, 8]] [[8, 9], [9, 10], [10, 11], [11, 12]] ``` and our queue size is 6: In the first pass, we will have `[[1, 2], [2, 3], [3, 4], [4, 5]]`. In the second pass, we will have `[[2, 3], [3, 4], [4, 5], [5, 6], [6, 7], [7, 8]]`. In the third pass, we will have `[[6, 7], [7, 8], [8, 9], [9, 10], [10, 11], [11, 12]]`. ### The implementation I am using the FIFO queue from Python's collections library. It accepts a `maxlen` argument, which can be used to set the size of the queue. I select the last n indices of the tensor through list indexing, and this operator does not copy the data. 
In the test plan, I have both single-dimension tensors and multi-dimension tensors. ### Benchmark I used various configurations and added a benchmark test. The PyTorch implementation is much faster than the Caffe2 implementation: #### CPU Benchmark ``` torch_response.median 0.00019254473969340324 caffe_response.median 0.00030233583599794657 ``` #### GPU Benchmark ``` torch_response.mean 0.000081007429903838786 caffe_response.mean 0.00010279081099724863 ``` Test Plan: ### For CPU: ``` buck test //caffe2/torch/fb/sparsenn:test ``` ### For GPU: - Used an on-demand machine and did the following commands: ``` jf get D24435544 buck test mode/opt //caffe2/torch/fb/sparsenn:test ``` https://www.internalfb.com/intern/testinfra/testconsole/testrun/4222124688138052/ Reviewed By: dzhulgakov, radkris-git Differential Revision: D24435544 fbshipit-source-id: 8193b4746b20f2a4920fd4d41271341045cdcee1	2020-10-30 02:35:54 -07:00
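The windowing behaviour described in eec201c138 can be reproduced with a bounded `collections.deque`, which the commit message names as its building block. This is a minimal sketch on plain Python lists rather than tensors; `collect_last_n` is a hypothetical helper, not the operator added by the commit.

```python
from collections import deque


def collect_last_n(batches, n):
    """Feed batches through a bounded FIFO and return the last n rows seen."""
    window = deque(maxlen=n)  # oldest entries fall off the left when full
    for batch in batches:
        window.extend(batch)  # each element (or row) of the batch is pushed in order
    return list(window)


flat = collect_last_n([[1, 2, 3, 4], [5, 6, 7], [8, 9, 10, 11]], 6)

nested = collect_last_n(
    [
        [[1, 2], [2, 3], [3, 4], [4, 5]],
        [[5, 6], [6, 7], [7, 8]],
        [[8, 9], [9, 10], [10, 11], [11, 12]],
    ],
    6,
)

print(flat)
print(nested)
```

In the multi-dimensional case the deque holds whole rows, so the window size counts rows along the first dimension, matching the example in the commit message.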
Brandon Lin	4a581ba6c2	Implement LengthsToOffsets operator in Caffe2 (#46590 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46590 This operator is very similar to LengthsToRanges but doesn't pack the offsets next to the original lengths. Reviewed By: yf225 Differential Revision: D24419746 fbshipit-source-id: aa8b014588bb22eced324853c545f8684086c4e4	2020-10-29 07:03:34 -07:00
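The relationship described in 4a581ba6c2 can be sketched in plain Python. This assumes offsets are the exclusive prefix sum of the lengths and that the ranges form pairs each offset with its length; the helper names are hypothetical and any extra operator arguments are not modelled.

```python
from itertools import accumulate


def lengths_to_offsets(lengths):
    """Exclusive prefix sum: where each segment starts in the flat data."""
    return [0, *accumulate(lengths)][:-1]


def lengths_to_ranges(lengths):
    """Assumed ranges form: each offset packed next to its original length."""
    return [[off, length] for off, length in zip(lengths_to_offsets(lengths), lengths)]


lengths = [2, 3, 1]
offsets = lengths_to_offsets(lengths)
ranges = lengths_to_ranges(lengths)
print(offsets)
print(ranges)
```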
Kunal Bhalla	18d273dc0e	[RFC][LocalSession] Fix workspace type Summary: I was reading/looking into how LocalSession works and realized that the workspace type being passed around was the bound function on TaskGroup instead of the actual type. This meant that all workspaces for localsession would always be global, because they'd never match the private workspace type. Test Plan: <not sure, could use some suggestions> Reviewed By: cryptopic Differential Revision: D24458428 fbshipit-source-id: 0f87874babe9c1ddff25b5363b443f9ca37e03c1	2020-10-29 04:12:17 -07:00
Dmytro Dzhulgakov	115bbf9945	[caffe2] Disable running full grad check in tests by default Summary: We've been seeing a lot of Hypothesis timeouts and from profiling a few of the failing tests one of the contributing factors is really slow grad checker. In short, it launches the whole op for each of the input elements so the overall complexity is O(numel^2) at least. This applies a very unscientific hack to just run grad check on the first and last few elements. It's not ideal, but it's better than flaky tests. One can still explicitly opt in with the env var. Reviewed By: malfet Differential Revision: D23336220 fbshipit-source-id: f04d8d43c6aa1590c2f3e72fc7ccc6aa674e49d2	2020-10-27 16:10:03 -07:00
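The sampling hack described in 115bbf9945 (check only the first and last few elements unless a full check is requested) might look roughly like this. It is a sketch on plain Python lists: the function names, the `edge` count, and the `full` flag are hypothetical, and the flag stands in for the environment variable the commit mentions without naming.

```python
def indices_to_check(numel, edge=3, full=False):
    """Pick which input elements get a numerical gradient check."""
    if full or numel <= 2 * edge:
        return list(range(numel))
    return list(range(edge)) + list(range(numel - edge, numel))


def numeric_grad(f, x, i, eps=1e-6):
    """Central-difference estimate of df/dx[i]; costs two full evaluations of f."""
    hi, lo = list(x), list(x)
    hi[i] += eps
    lo[i] -= eps
    return (f(hi) - f(lo)) / (2 * eps)


def check_grad(f, grad_f, x, full=False, tol=1e-4):
    """Compare the analytic gradient with numeric estimates on the sampled indices."""
    analytic = grad_f(x)
    return all(
        abs(numeric_grad(f, x, i) - analytic[i]) <= tol
        for i in indices_to_check(len(x), full=full)
    )


def sum_of_squares(v):
    return sum(t * t for t in v)


def sum_of_squares_grad(v):
    return [2 * t for t in v]


x = [float(k) for k in range(10)]
sampled = indices_to_check(len(x))
ok = check_grad(sum_of_squares, sum_of_squares_grad, x)
print(sampled, ok)
```

Because each numeric estimate re-runs the whole function, a full check is quadratic in the number of elements; the sampled variant caps the number of re-runs at `2 * edge`.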
Huan Gui	b5662ba0f0	[uhm][0/n] add cuda Mod Op (#46732 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46732 as titled Test Plan: unittest buck test mode/dev-nosan //caffe2/caffe2/python/operator_test:mod_op_test Reviewed By: xianjiec Differential Revision: D24368100 fbshipit-source-id: 1232d22a67ac268986043911d548fa9d657470ec	2020-10-26 11:07:51 -07:00
Yunfan Zhong	e519fcd1aa	Remap net name inside arg.n for AsyncIf operator Summary: Similar to If operator, AsyncIf also contains nets in args. It needs the same handling. Test Plan: New unit test test_control_op_remap `buck test caffe2/caffe2/python:core_test` Also it worked end to end in prototype of dist bulk eval workflow f226680903 Reviewed By: yyetim Differential Revision: D24451775 fbshipit-source-id: 50594e2ab9bb457329ed8da7b035f7409461b5f6	2020-10-23 10:41:06 -07:00
Alexander Grund	93719440b8	Replace map(lambda constructs (#46462 ) Summary: Follow-up of https://github.com/pytorch/pytorch/issues/46461 with a similar goal Makes them more readable and possibly faster. Care has to be taken because `map` applies the function immediately while `(x for x in xs)` is a generator expression which gets evaluated later. This is a benefit in some cases where it is not required to actually create the list of values in memory (e.g. when passing to `tuple` or `extend` or `join`) Pull Request resolved: https://github.com/pytorch/pytorch/pull/46462 Reviewed By: zou3519 Differential Revision: D24422343 Pulled By: ezyang fbshipit-source-id: 252e33499c92ac0b15238f2df32681dbbda2b237	2020-10-22 09:50:22 -07:00
Jeff Hwang	9b5197b763	[mlf][efficiency] add tensor inference function to last-n collector op (#46693 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46693 title Test Plan: unit tests Reviewed By: hx89 Differential Revision: D23946770 fbshipit-source-id: f7c3d4a1b4ef3b0e5f56e5a9a30f5003ce9f40b0	2020-10-22 01:15:00 -07:00
Alexander Grund	5b0f400488	Replace list(map(...)) constructs by list comprehensions (#46461 ) Summary: As discussed in https://github.com/pytorch/pytorch/issues/46392 this makes the code more readable and possibly more performant. It also fixes a bug detected by this where the argument order of `map` was confused: `030a24906e (diff-5bb26bd3a23ee3bb540aeadcc0385df2a4e48de39f87ed9ea76b21990738fe98L1537-R1537)` Fixes https://github.com/pytorch/pytorch/issues/46392 Pull Request resolved: https://github.com/pytorch/pytorch/pull/46461 Reviewed By: ailzhang Differential Revision: D24367015 Pulled By: ezyang fbshipit-source-id: d55a67933cc22346b00544c9671f09982ad920e7	2020-10-19 18:42:49 -07:00
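The kind of rewrite described in 5b0f400488 is illustrated below on made-up data. The variable names are hypothetical and the snippet is not taken from the PyTorch diff.

```python
values = [1, -2, 3]

# Before: the lambda hides the per-element expression inside a call.
via_map = list(map(lambda v: abs(v) * 2, values))

# After: the same result as a list comprehension.
via_comprehension = [abs(v) * 2 for v in values]

# When the consumer only iterates once (sum, tuple, join, ...),
# a generator expression avoids building the intermediate list.
total = sum(abs(v) for v in values)

print(via_map, via_comprehension, total)
```

A comprehension also removes the chance of swapping the function and iterable arguments of `map`, the bug the commit reports having found.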
Jongsoo Park	c37baa9177	[caffe2] add concat benchmark (#46457 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46457 Wanted to see if using CopyMatrix specialized for float that uses mkl_somatcopy can be faster but it wasn't. Still want to check in benchmark that can be used later. Test Plan: . Reviewed By: dskhudia Differential Revision: D24345901 fbshipit-source-id: d3e68dbb560e3138fda11c55789cd41bc0715c6d	2020-10-16 08:48:42 -07:00
Nikita Shulga	84771fc64f	[caffe2] Add 10s deadline for all Caffe2 hypothesis fuzz tests Test Plan: CI Reviewed By: walterddr Differential Revision: D24298118 fbshipit-source-id: 2286c1e37ed9c43f404b888386c0bd4b0b6a55c6	2020-10-14 06:30:09 -07:00
Jianyu Huang	5c67cc7a9e	[caffe2] Enable fp16 for SparseNormalize op (#45551 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45551 The FP16 version of SparseNormalize op in Caffe2 is missing. This Diff adds FP16 support to unblock MC process of adding FP16 to Dper3. Check https://fb.quip.com/L0T2AXGwUY3n#EReACAeifk3 . One question is whether the pure FP16 Sparse Normalized op will affect the accuracy? Maybe we should do it in FP32 domain. ghstack-source-id: 114184398 Test Plan: ``` buck run mode/opt //caffe2/caffe2/python/operator_test:sparse_normalize_test ``` ``` buck run mode/opt -c python.package_style=inplace mode/no-gpu //caffe2/caffe2/python/benchmarks:sparse_normalize_benchmark -- --fp16 ``` Reviewed By: jspark1105 Differential Revision: D24005618 fbshipit-source-id: 8b918ec4063fdaafa444779b95206ba2b7b38537	2020-10-13 15:35:22 -07:00
Bugra Akyildiz	298e0e0d57	Refactor gather_ranges_to_dense from Python to C++ (#46021 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46021 Refactor gather_ranges_to_dense from Python to C++ https://www.internalfb.com/intern/tasks/?t=71935517 Test Plan: General build/test: ``` buck build -c python.helpers=true fbcode/caffe2 buck test -c python.helpers=true fbcode/caffe2 ``` Specific Test: ```buck test mode/dev-nosan //caffe2/torch/fb/sparsenn:test -- 'test_gather_ranges_to_dense \(caffe2\.torch\.fb\.sparsenn\.tests\.sparsenn_operators_test\.SparseNNOperatorsTest\)' ``` Reviewed By: houseroad Differential Revision: D23858186 fbshipit-source-id: 8bce7c279275c8ff7316901b455e1d1dd7e36b13	2020-10-08 11:03:06 -07:00
Pawel Garbacki	fb50fcaa82	[C2] Add string equality operator (#45886 ) Summary: This diff adds a string equality checking operator. Another attempt at reverted D24042344 (`cf48872d28`) Pull Request resolved: https://github.com/pytorch/pytorch/pull/45886 Test Plan: unit tests, github builds Reviewed By: dzhulgakov Differential Revision: D24129953 fbshipit-source-id: caa53c7eac5c67c414c37e9d93416104f72556b9	2020-10-06 12:08:26 -07:00
Dmytro Dzhulgakov	519c086418	Revert D24042344: [C2] Add string equality operator Test Plan: revert-hammer Differential Revision: D24042344 (`cf48872d28`) Original commit changeset: c8997c6130e3 fbshipit-source-id: 3d8aec1104a2a59c67ab4b7e77caeaf9fc94ae1d	2020-10-05 15:09:03 -07:00
Pawel Garbacki	cf48872d28	[C2] Add string equality operator Summary: This diff adds a string equality checking operator. Test Plan: Unit tests Differential Revision: D24042344 fbshipit-source-id: c8997c6130e3438f2ae95dae69f76978e2e95527	2020-10-05 10:47:53 -07:00
Marcio Porto	c31066ac9d	Torch Integration Test Formatting Changes (#45740 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45740 Reviewed By: esqu1 Differential Revision: D23869021 fbshipit-source-id: 5910d44f9475bd7a53dc0478b69b39572dc8666f	2020-10-02 14:02:31 -07:00
Marcio Porto	b234acd414	Exposes SparseToDenseMask Caffe2 Operator (#45670 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45670 Reviewed By: esqu1 Differential Revision: D23868280 fbshipit-source-id: d6afa129c073fe611cb43a170025bc3c880a4bec	2020-10-02 10:05:13 -07:00
Kunal Bhalla	4564444c91	[RFC][caffe2] TaskGroup.__repr__ shouldn't have side effects Summary: `__repr__` calling self.tasks() ends up marking the instance as "used", which doesn't seem appropriate. I was debugging a value being passed around and then ran into `Cannot add Task to an already used TaskGroup.` because the value had been logged once. Test Plan: Added a unit test -- didn't see a clean public method to test it, but I'm happy to add one if that makes sense. Will wait for sandcastle to trigger everything else; I'm not at all familiar with this code so any other recommendations would be great! Reviewed By: cryptopic Differential Revision: D23541198 fbshipit-source-id: 5d1ec674a1ddaedf113140133b90e0da6afa7270	2020-10-01 14:21:03 -07:00
Thomas Bredillet	0fa551f0ab	[c2] Fix int types for learning rate Summary: Currently GetSingleArgument is overflowing since it's expecting an int instead of an int64 when using a 1cycle (hill policy) annealing schedule Test Plan: unittest buck test caffe2/caffe2/python/operator_test:learning_rate_op_test Differential Revision: D23938169 fbshipit-source-id: 20d65df800d7a0f1dd9520705af31f63ae716463	2020-09-26 10:59:29 -07:00
Dianshi Li	03dde4c62a	Resend diff D23858329 (#45315 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45315 Pull Request resolved: https://github.com/pytorch/pytorch/pull/45314 in D23858329 (`721cfbf842`), we put PriorCorrectionCalibrationPrediction unit test in OSS file which causes test failure issue in public trunk. this diff moves it to FB only test file. Test Plan: ``` buck test //caffe2/caffe2/python/operator_test:torch_integration_test -- test_gather_ranges_to_dense_op buck test //caffe2/caffe2/fb/python/operator_test:torch_integration_test -- test_prior_correct_calibration_prediction_op ``` all pass. Reviewed By: houseroad Differential Revision: D23899012 fbshipit-source-id: 1ed97d8702e2765991e6caf5695d4c49353dae82	2020-09-24 18:41:49 -07:00
Danny Huang	cd7a682282	[caffe2] adds hypothesis test for queue ops cancel (#45178 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45178 ## Motivation * To be able to make C2 ops cancellable so we can safely exit. * Some C2 operators are now blocking thus being non-cancellable. If an error occurs we need to be able to safely stop all net execution so we can throw the exception to the caller. ## Summary * Adds a hypothesis test for queue ops cancellation. Test Plan: ## Unit test added to verify that queue ops propagate errors ``` buck test caffe2/caffe2/python:hypothesis_test buck test caffe2/caffe2/python:hypothesis_test -- test_safe_dequeue_blob__raises_exception_when_hang --stress-runs 1000 ``` ``` Summary Pass: 1000 ListingSuccess: 1 ``` Reviewed By: d4l3k Differential Revision: D23847576 fbshipit-source-id: 2fc351e1ee13ea8b32d976216d2d01dfb6fcc1ad	2020-09-24 14:43:52 -07:00

2835 commits