Commit graph

2846 commits

Author SHA1 Message Date
pbialecki
7b85adf20f Add back pycuda.autoinit to test_pt_onnx_trt (#51106)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/51105 by adding back the `import pycuda.autoinit`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51106

Reviewed By: mingzhe09088

Differential Revision: D26086808

Pulled By: heitorschueroff

fbshipit-source-id: 88d98796c87a44cedaa1f6666e9f71a424293641
2021-01-27 07:10:11 -08:00
Arindam Roy
09b896261c Skip test_lc_1d for ROCM (#50964)
Summary:
The test is flaky on ROCM when deadline is set to 1 second. This is affecting builds as it is failing randomly.
Disabling for now.

Signed-off-by: Arindam Roy <rarindam@gmail.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50964

Reviewed By: houseroad

Differential Revision: D26049370

Pulled By: BIT-silence

fbshipit-source-id: 22337590a8896ad75f1281e56fbbeae897f5c3b2
2021-01-25 11:43:37 -08:00
Lu Fang
f32b10e564 [BE] Fix the broken test caffe2/caffe2/python:lazy_dyndep_test - test_allcompare (#50696)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50696

Set no deadline for test_allcompare.

Test Plan: buck test mode/dev //caffe2/caffe2/python:lazy_dyndep_test -- --exact 'caffe2/caffe2/python:lazy_dyndep_test - test_allcompare (caffe2.caffe2.python.lazy_dyndep_test.TestLazyDynDepAllCompare)' --run-disabled

Reviewed By: hl475

Differential Revision: D25947800

fbshipit-source-id: d2043f97128e257ef06ebca9b68262bb1c0c5e6b
2021-01-18 16:21:06 -08:00
Lu Fang
1fdc35da2c [BE] Fix the broken test -- caffe2/caffe2/python:hypothesis_test - test_recurrent (#50668)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50668

GPU initialization is sometimes slow.

Test Plan: buck test mode/opt //caffe2/caffe2/python:hypothesis_test -- --exact 'caffe2/caffe2/python:hypothesis_test - test_recurrent (caffe2.caffe2.python.hypothesis_test.TestOperators)' --run-disabled

Reviewed By: hl475

Differential Revision: D25939037

fbshipit-source-id: 832700cf42ece848cda66dd629a06ecda207f086
2021-01-17 21:21:38 -08:00
Zhijing Li
05542f6222 EMA op (#50393)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50393

Exponential Moving Average

Usage:

Add `ema_options` to the adagrad optimizer. For details, please refer to the test workflow setting.

If `ema_end == -1`, the EMA never ends.
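The update itself is a standard exponential moving average, gated by a step window; a minimal sketch (the names `ema_start`, `ema_end`, and `ema_alpha` are illustrative, not the actual option fields):

```python
def ema_update(param, ema, step, ema_start=0, ema_end=-1, ema_alpha=0.1):
    """Blend the current parameter value into its running average.

    ema_end == -1 means the averaging window never closes.
    """
    if step < ema_start or (ema_end != -1 and step > ema_end):
        return ema  # outside the EMA window: leave the average untouched
    return ema_alpha * param + (1.0 - ema_alpha) * ema
```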

Test Plan:
buck test caffe2/caffe2/fb/optimizers:ema_op_optimizer_test

buck test caffe2/caffe2/fb/optimizers:ema_op_test

f240459719

Differential Revision: D25416056

fbshipit-source-id: a25e676a364969e3be2bc47750011c812fc3a62f
2021-01-13 08:58:01 -08:00
Hugo van Kemenade
473e78c0fa Remove redundant code for unsupported Python versions (#49486)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49486

Remove code for Python 3.5 and lower.

There's more that can be removed/modernised, but sticking mainly to redundant version checks here, to keep the diff/PR smaller.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46579

Reviewed By: zou3519

Differential Revision: D24453571

Pulled By: ezyang

fbshipit-source-id: c2cfcf05d6c5f65df64d89c331692c9aec09248e
2021-01-06 12:45:46 -08:00
Richard Barnes
9945fd7253 Drop unused imports from caffe2/python (#49980)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49980

From
```
./python/libcst/libcst codemod remove_unused_imports.RemoveUnusedImportsWithGlean --no-format caffe2/
```

Test Plan: Standard sandcastle tests

Reviewed By: xush6528

Differential Revision: D25727359

fbshipit-source-id: c4f60005b10546423dc093d31d46deb418352286
2021-01-05 13:17:46 -08:00
Samuel Marks
e6779d4357 [*.py] Rename "Arguments:" to "Args:" (#49736)
Summary:
I've written custom parsers and emitters for everything from docstrings to classes and functions. However, I recently came across an issue when I was parsing/generating from the TensorFlow codebase: inconsistent use of `Args:` and `Arguments:` in its docstrings.

```sh
(pytorch#c348fae)$ for name in 'Args:' 'Arguments:'; do
    printf '%-10s %04d\n' "$name" "$(rg -IFtpy --count-matches "$name" | paste -s -d+ -- | bc)"; done
Args:      1095
Arguments: 0336
```

It is easy enough to extend my parsers to support both variants, however it looks like `Arguments:` is wrong anyway, as per:

  - https://google.github.io/styleguide/pyguide.html#doc-function-args @ [`ddccc0f`](https://github.com/google/styleguide/blob/ddccc0f/pyguide.md)

  - https://chromium.googlesource.com/chromiumos/docs/+/master/styleguide/python.md#describing-arguments-in-docstrings @ [`9fc0fc0`](https://chromium.googlesource.com/chromiumos/docs/+/9fc0fc0/styleguide/python.md)

  - https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html @ [`c0ae8e3`](https://github.com/sphinx-contrib/napoleon/blob/c0ae8e3/docs/source/example_google.rst)

Therefore, only `Args:` is valid. This PR replaces them throughout the codebase.
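For reference, a docstring in the Google style the PR converges on (this example is illustrative, not taken from the codebase):

```python
def scale(tensor, factor):
    """Multiply every element of a sequence by a scalar.

    Args:
        tensor: Input values.
        factor: Scalar multiplier.

    Returns:
        A list with the scaled values.
    """
    return [x * factor for x in tensor]
```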

PS: For related PRs, see tensorflow/tensorflow/pull/45420

PPS: The trackbacks automatically appearing below are sending the same changes to other repositories in the [PyTorch](https://github.com/pytorch) organisation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49736

Reviewed By: albanD

Differential Revision: D25710534

Pulled By: soumith

fbshipit-source-id: 61e8ff01abb433e9f78185c2d1d0cbd7c22c1619
2020-12-28 09:34:47 -08:00
skyline75489
46b83212d1 Remove unused six code for Python 2/3 compatibility (#48077)
Summary:
This is basically a reborn version of https://github.com/pytorch/pytorch/issues/45254 .

Ref: https://github.com/pytorch/pytorch/issues/42919

Pull Request resolved: https://github.com/pytorch/pytorch/pull/48077

Reviewed By: ngimel

Differential Revision: D25687042

Pulled By: bugra

fbshipit-source-id: 05f20a6f3c5212f73d0b1505b493b720e6cf74e5
2020-12-22 18:07:08 -08:00
Taylor Robie
faf6032945 Remove deadlines for Caffe2 hypothesis_test when running on GPU. (#49591)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49591

A bunch of these tests are marked flaky, and have been since time immemorial. (Read: as far back as Buck will build.) However closer inspection reveals that they fail if and only if run on a GPU worker. What seems to be going on is that there are more jobs than GPUs, so the contention causes waits which registers as timeouts on the test.

This diff is kind of hacky, but it basically just drops deadlines if a GPU is present. Because Caffe2 is going away I'm not too terribly concerned about a beautiful solution, but we may as well keep some test coverage if it's easy.
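The shape of the hack, sketched (the helper name and default are assumptions, not the actual diff): the returned value would be handed to `hypothesis.settings(deadline=...)`.

```python
def choose_deadline_ms(gpu_present, default_ms=10000):
    """Pick the deadline for hypothesis.settings(deadline=...).

    GPU workers share devices between test jobs, so contention shows up
    as spurious timeouts; drop the deadline entirely when a GPU is present.
    """
    return None if gpu_present else default_ms
```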

CC Sebastian, Ilia, Min, and Hongzheng who also have tasks for what seems to be the same flakiness.

Test Plan: Turn the tests back on and see if they fall over. (The failure repros reliably on an OnDemand GPU and is fixed by this change, so it's not really just a hail Mary.)

Reviewed By: ngimel

Differential Revision: D25632981

fbshipit-source-id: 43dcce416fea916ba91f891e9e5b59b2c11cca1a
2020-12-18 10:00:24 -08:00
Andrey Malevich
f5a26a554b [C2] Revive unsafe CoalesceOp (#49402)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49402

In cases of NCCLAllReduce operations there could be non-trivial overhead for
launching cooperative kernels (especially in case of async execution of
different parts of the model). This diff is reviving this operator to make it
possible to fuse multiple operations into a single kernel.

Test Plan:
Unit-test.
Used in a later diff.

Reviewed By: xianjiec

Differential Revision: D25531206

fbshipit-source-id: 64b1c161233a726f9e2868f1059316e42a8ea1fc
2020-12-17 04:31:29 -08:00
Andrey Malevich
46debe7f23 [DPER] Introduce barrier operation to force synchronization of threads in async execution (#49322)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49322

In some cases async execution might lose dependencies (alias-like ops) or produce suboptimal scheduling when there is a choice of which parts to schedule first. An example of the latter can happen in ModelParallel training, where a copy can get lower priority than the rest of the execution on a given GPU, causing the other GPUs to starve.

This operator makes it possible to address these issues by introducing extra explicit dependencies between ops.

Test Plan:
Unit test.
E2E testing in future diffs.

Reviewed By: xianjiec

Differential Revision: D24933471

fbshipit-source-id: 1668994c7856d73926cde022378a99e1e8db3567
2020-12-15 16:13:42 -08:00
Newsha Ardalani
0fb58d76a1 Support ArgMin in c2_pt_converter
Summary:
+ Add ArgMin support to Caffe2 to PyTorch converter
+ Using hypothesis to parameterize different conditions for test

Test Plan: buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test

Reviewed By: houseroad

Differential Revision: D25016203

fbshipit-source-id: 94489fcf1ed3183ec96f9796a5b4fb348fbde5bc
2020-12-05 16:35:34 -08:00
Rahul Manghwani
142b21fd44 Add SparseLengthsSum4BitRowwiseSparse in c2_pt_converter (#48240)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48240

Adds the support for converting the SparseLengthsSum4BitRowwiseSparse operator from caffe2 to pytorch as a part of c2_pt_converter

Test Plan:
Added a unit test

buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test

Tests passed:
https://our.intern.facebook.com/intern/testinfra/testrun/2251799856412296

Reviewed By: houseroad

Differential Revision: D25067833

fbshipit-source-id: 45cbc331ca35bee27e083714e65a1e87a2a2d2e0
2020-12-04 14:16:25 -08:00
Tristan Rice
dc7d8a889e caffe2: refactor context to allow being typed (#48340)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48340

This changes the context managed classes from using a decorator to define them to using inheritance. Inheritance allows the python static type checking to work correctly.

```
context.define_context()
class Bar(object): ...

context.define_context(allow_default=True)
class Foo(object): ...
```

becomes
```
class Foo(context.Managed): ...

class Bar(context.DefaultManaged): ...
```

Behavior differences:
* arg_name has been removed since it's not used anywhere
* classes need to call `super()` in `__enter__/__exit__` methods if they override (none do)

This also defines a context.pyi file to add types for python3. python2 support should not be affected
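To illustrate the inheritance-based design, here is a minimal sketch of what a `Managed` base class can look like (a simplified stand-in, not the actual caffe2 implementation): each subclass keeps a per-thread stack of active instances, and `__enter__`/`__exit__` push and pop it.

```python
import threading

class Managed(object):
    """Context-managed registry via inheritance instead of a decorator."""
    _local = threading.local()

    @classmethod
    def _stack(cls):
        stacks = getattr(cls._local, 'stacks', None)
        if stacks is None:
            stacks = cls._local.stacks = {}
        return stacks.setdefault(cls, [])

    @classmethod
    def current(cls):
        stack = cls._stack()
        return stack[-1] if stack else None

    def __enter__(self):
        self._stack().append(self)
        return self

    def __exit__(self, *exc_info):
        self._stack().pop()

class Bar(Managed):
    pass
```

Because `Bar` is an ordinary subclass, static type checkers see `current()` and the context-manager protocol without any decorator magic.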

Test Plan:
ci

  buck test //caffe2/caffe2/python:context_test //caffe2/caffe2/python:checkpoint_test

Reviewed By: dongyuzheng

Differential Revision: D25133469

fbshipit-source-id: 16368bf723eeb6ce3308d6827f5ac5e955b4e29a
2020-11-30 18:31:14 -08:00
Frank Seide
29f0e1e2ce Fused8BitRowwiseQuantizedToFloat operator support (#48407)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48407

T79817692: Fused8BitRowwiseQuantizedToFloat operator support for c2_pt_converter.

Also refactored some repeated code from the existing test functions. (Initial commit only has refactoring.)

Test Plan: buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test

Reviewed By: bugra

Differential Revision: D25069936

fbshipit-source-id: 72f6a845a1b4639b9542c6b230c8cd74b06bc5a0
2020-11-30 17:11:39 -08:00
Xiaodong Wang
d386d3323f [dper] supress excessive msg (#48404)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48404

On bento this is printing a lot of msgs like (see N408483 if you're an internal user)
```
W1123 12:09:52.322 schema.py:811] Scalar should be considered immutable. Only call Scalar.set() on newly created Scalar with unsafe=True. This will become an error soon.
```
And it ignores the log level I set globally. Removing this line unless it is super important.

Test Plan: build a local dper package and verify

Differential Revision: D25163808

fbshipit-source-id: 338d01c82b4e67269328bbeafc088987c4cbac75
2020-11-30 14:55:52 -08:00
shubhambhokare1
bdf360f9f2 [ONNX] Update onnx submodule (#47366)
Summary:
Update onnx submodule to 1.8 release

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47366

Reviewed By: hl475

Differential Revision: D24968733

Pulled By: houseroad

fbshipit-source-id: 2f0a3436ab3c9380ed8ff0887a483743c1209721
2020-11-30 00:05:46 -08:00
Tristan Rice
6eaf1e358c caffe2/core.Net: is_external_input rebuild lookup tables when necessary
Summary: is_external_input doesn't check if the lookup tables are valid. Calling .Proto() should invalidate all lookup tables and have them rebuilt on call to any methods depending on them. This adds this check to is_external_input.

Test Plan: internal unit tests

Reviewed By: dzhulgakov, esqu1

Differential Revision: D25100464

fbshipit-source-id: d792dec7e5aa9ffeafda88350e05cb757f4c4831
2020-11-20 10:53:24 -08:00
Xiaomeng Yang
2039ff3fbb [Caffe2] Optimize MishOp on CPU (#48212)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48212

Optimize MishOp on CPU

Test Plan: buck test mode/dev-nosan //caffe2/caffe2/python/operator_test:activation_ops_test -- "mish"

Reviewed By: houseroad

Differential Revision: D25071304

fbshipit-source-id: fe94bfab512188d60412d66962983eff4f37bc07
2020-11-19 14:17:27 -08:00
Scott Wolchok
4c9eb57914 [PyTorch] Narrow Device to 2 bytes by narrowing DeviceType and DeviceIndex (#47023)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47023

DeviceType pretty clearly only needs 1 byte. DeviceIndex only needs 1 byte given that machines don't have anywhere near 255 GPUs in them as far as I know.
ghstack-source-id: 116901430

Test Plan: Existing tests, added assertion to catch if my assumption about DeviceIndex is incorrect

Reviewed By: dzhulgakov

Differential Revision: D24605460

fbshipit-source-id: 7c9a89027fcf8eebd623b7cdbf6302162c981cd2
2020-11-18 19:39:40 -08:00
Tristan Rice
b10d6c6089 [caffe2] cache NextName indexes for faster name generation (#47768)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47768

This stores the next ID for a given NextName(prefix, output_id), so repeated calls to NextName are significantly faster. Name generation accounts for ~65% of the time spent for large models.
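A sketch of the caching idea (names and structure are illustrative, not the actual diff): remember where the scan left off for each prefix so the next call does not rescan all existing names.

```python
class NameCache(object):
    """Cache the next free index per prefix so repeated NextName-style
    calls are near O(1) instead of rescanning all existing names."""

    def __init__(self, existing=()):
        self._used = set(existing)
        self._next_idx = {}

    def next_name(self, prefix):
        idx = self._next_idx.get(prefix, 0)
        name = '%s_%d' % (prefix, idx)
        while name in self._used:
            idx += 1
            name = '%s_%d' % (prefix, idx)
        self._next_idx[prefix] = idx + 1  # resume here on the next call
        self._used.add(name)
        return name
```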

Test Plan:
buck test //caffe2/caffe2/python/...

will launch canary job before landing to ensure no regressions + confirm speedup

Reviewed By: dzhulgakov

Differential Revision: D24876961

fbshipit-source-id: 668d73060d800513bc72d7cd405a47d15c4acc34
2020-11-17 12:24:00 -08:00
Ankur Singla
549ef1d668 [caffe][memonger] Extend operator schema check to dag memonger (#48021)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48021

Extending the operator schema check from the simple memonger to the dag memonger as well. As part of this, a fix is made to handle in-place ops (ops with at least one output name identical to an input blob). Earlier, all output blobs from ops were treated as shareable, which failed the assertion that external input blobs with the same name are not allowed to be shared.

Test Plan: Added corresponding unit tests

Reviewed By: hlu1

Differential Revision: D24968862

fbshipit-source-id: b6679a388a82b0d68f65ade64b85560354aaa3ef
2020-11-16 19:17:55 -08:00
Ankur Singla
f743b5639a [caffe2][memonger] Add support for distributed inference predict nets in DAG memonger (#47718)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47718

Distributed Inference splits a predict net into multiple parts, part0 being the main part which contains ops to make remote calls to other parts. part0 predict net may contain AsyncIf ops to optimize rpc call usage. AsyncIf ops have internal nets which may refer to memongered blobs. This change handles AsyncIf ops to update internal nets to refer to memongered blobs.

As part of this change, I am also updating the dag memonger traversal to always start from root ops, i.e. ops with in-degree 0. The earlier logic started traversing ops from the input head blobs, and if one of the head inputs was used in a non-root op that got visited before its parent, the traversal threw an assertion error here: https://fburl.com/diffusion/ob110s9z . For almost all the distributed inference part0 nets, this assertion error was thrown.
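The root-first traversal described above can be sketched as a standard Kahn-style topological walk (illustrative, not the memonger code itself): visiting only from in-degree-0 ops guarantees every op is reached after all of its parents.

```python
from collections import deque

def traversal_order(ops):
    """Return a topological order over a DAG given as {op: [children]}."""
    indegree = {name: 0 for name in ops}
    for children in ops.values():
        for child in children:
            indegree[child] += 1
    # Start only from true roots: ops with in-degree 0.
    queue = deque(name for name, deg in indegree.items() if deg == 0)
    order = []
    while queue:
        name = queue.popleft()
        order.append(name)
        for child in ops[name]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    return order
```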

Test Plan: Added corresponding tests in memonger_test.py. Could not find unit tests for the C++ version of memonger.

Reviewed By: hlu1

Differential Revision: D24872010

fbshipit-source-id: 1dc99b2fb52b2bc692fa4fc0aff6b7e4c5e4f5b0
2020-11-13 14:12:07 -08:00
Jonathan Kwok
a3e08e5344 Support ReduceSum in c2_pt_converter (#47889)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47889

Adds support for converting the [caffe2 ReduceSum](https://caffe2.ai/docs/operators-catalogue#reducesum) operator to torch.
ghstack-source-id: 116580127

Test Plan:
buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test : [results](https://our.intern.facebook.com/intern/testinfra/testrun/6755399466095119)

    ✓ ListingSuccess: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - main (60.273)
    ✓ Pass: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - test_sub_op (caffe2.torch.fb.model_transform.c2_convert.c2_pt_converter_test.C2PTConverterTest) (101.119)
    ✓ Pass: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - test_layer_norm_conversion (caffe2.torch.fb.model_transform.c2_convert.c2_pt_converter_test.C2PTConverterTest) (101.404)
    ✓ Pass: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - test_local_model_conversion (caffe2.torch.fb.model_transform.c2_convert.c2_pt_converter_test.C2PTConverterTest) (101.966)
    ✓ Pass: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - test_reduce_sum (caffe2.torch.fb.model_transform.c2_convert.c2_pt_converter_test.C2PTConverterTest) (114.896)

Reviewed By: bugra

Differential Revision: D24925318

fbshipit-source-id: 3f3b791eff1b03e8f5adee744560fe8bc811c659
2020-11-13 12:02:58 -08:00
Gary Zheng
f1babb00f0 [caffe2] Fix ListWithEvicted _pprint_impl wrongly printing _evicted_values (#47881)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47881

ListWithEvicted's _pprint_impl was accidentally printing _items before this change.

Reviewed By: dzhulgakov

Differential Revision: D24928521

fbshipit-source-id: 0d7940719b4a27defbaae3b99af104d7fe7b5144
2020-11-13 09:23:10 -08:00
Alberto Alfarano
59e96c55f7 Support MatMul in c2_pt_converter
Summary: Added the MatMul operator for caffe2

Test Plan: buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test

Reviewed By: bugra

Differential Revision: D24920937

fbshipit-source-id: 7ba09ba0439cb9bd15d6a41fd8ff1a86d8d11437
2020-11-12 20:56:58 -08:00
Peiyao Zhou
4078f44668 [TB][embedding supporting] Modify histogram to accept multiple types, to skip CastOp and avoid OOMing in CastOp
Summary: To support min/max/mean/std, SummarizeOp needs to skip size checking (similar to the LpNorm error mentioned above) and accept multiple types

Test Plan:
unit test:
`buck test //caffe2/caffe2/fb/tensorboard/tests:tensorboard_accumulate_histogram_op_test`

https://our.intern.facebook.com/intern/testinfra/testrun/1407375057859572

`buck test //caffe2/caffe2/fb/tensorboard/tests:tensorboard_accumulate_histogram_op_test --stress-runs 1000`

https://our.intern.facebook.com/intern/testinfra/testrun/2533274832166362

Reviewed By: cryptopic

Differential Revision: D24605507

fbshipit-source-id: fa08372d7c9970083c38abd432d4c86e84fb10e0
2020-11-11 12:03:54 -08:00
Richard Zou
17c58720fe Revert D24346771: [caffe2][memonger] Add support for distributed inference predict nets in DAG memonger
Test Plan: revert-hammer

Differential Revision:
D24346771 (5882f2e540)

Original commit changeset: ad2dd2e63f3e

fbshipit-source-id: 90346f08c890eebe71f068748a8e24e4db88c250
2020-11-10 12:11:22 -08:00
Ankur Singla
5882f2e540 [caffe2][memonger] Add support for distributed inference predict nets in DAG memonger
Summary:
Distributed Inference splits a predict net into multiple parts, part0 being the main part which contains ops to make remote calls to other parts. part0 predict net may contain AsyncIf ops to optimize rpc call usage. AsyncIf ops have internal nets which may refer to memongered blobs. This change handles AsyncIf ops to update internal nets to refer to memongered blobs. Here is one reference part0 predict net with AsyncIf ops: https://www.internalfb.com/intern/paste/P145812115/

As part of this change, I am also updating the dag memonger traversal to always start from root ops, i.e. ops with in-degree 0. The earlier logic started traversing ops from the input head blobs, and if one of the head inputs was used in a non-root op that got visited before its parent, the traversal threw an assertion error here: https://fburl.com/diffusion/ob110s9z . For almost all the distributed inference part0 nets, this assertion error was thrown.

Reviewed By: hlu1

Differential Revision: D24346771

fbshipit-source-id: ad2dd2e63f3e822ad172682f6d63f8474492255d
2020-11-10 09:35:28 -08:00
Gary Zheng
8b3f1d1288 [caffe2] Add __slots__ to all classes in schema.py (#47541)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47541

The profiler has guided us to `schema.py`. Since these `Field`s are used everywhere and in huge quantities, we can easily make some optimizations system wide by adding `__slots__`.

From StackOverflow, benefits include:

* faster attribute access.
* space savings in memory.

Read more: https://stackoverflow.com/a/28059785/
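A quick illustration of the mechanism (not the schema.py code itself): with `__slots__`, instances carry no per-instance `__dict__`, which is where the memory savings and faster attribute access come from.

```python
class PlainField(object):
    def __init__(self, name):
        self.name = name  # stored in the instance __dict__

class SlottedField(object):
    __slots__ = ('name',)  # fixed storage; no per-instance __dict__

    def __init__(self, name):
        self.name = name
```

A side effect worth knowing: assigning an attribute not listed in `__slots__` raises `AttributeError`, which also catches typos early.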

Reviewed By: dzhulgakov

Differential Revision: D24771078

fbshipit-source-id: 13f6064d367440069767131a433c820eabfe931b
2020-11-09 16:16:28 -08:00
Gary Zheng
4c52a56c40 [caffe2] Properly call super init in schema.py (#47542)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47542

The previous way of doing `Field.__init__(self, [])` is just wrong. Switching to Python2 compatible way: `super(ObjectName, self).__init__(...)`
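The pattern in question, sketched on a stand-in `Field` hierarchy:

```python
class Field(object):
    def __init__(self, children):
        self.children = list(children)

class Struct(Field):
    def __init__(self, *fields):
        # Python 2-compatible super call, as used in the diff;
        # in Python 3-only code this shortens to super().__init__(fields).
        super(Struct, self).__init__(fields)
```

Unlike `Field.__init__(self, ...)`, the `super(...)` form respects the method resolution order, so cooperative multiple inheritance keeps working.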

Reviewed By: dzhulgakov

Differential Revision: D24771077

fbshipit-source-id: d6798c72090c0264b6c583602cae441a1b14587c
2020-11-09 15:02:22 -08:00
Gary Zheng
4a58f35bef [caffe2] Fix duplicate name bug in Net.AddExternalInput (#47530)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47530

`Net.AddExternalInput` should raise if there are duplicate names. The previous code would only raise if the addition of duplicates was in separate calls, but not if it was in the same call.
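The fix boils down to checking duplicates within the batch being added as well as against the existing inputs; a sketch (illustrative, not the actual `core.Net` code):

```python
def add_external_inputs(existing, new_names):
    """Append new_names to existing, raising on any duplicate name --
    whether it clashes with existing inputs or repeats within this call."""
    seen = set(existing)
    for name in new_names:
        if name in seen:
            raise ValueError('duplicate external input: %s' % name)
        seen.add(name)
    existing.extend(new_names)  # only mutate once the whole batch is valid
    return existing
```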

Test Plan:
Added two new regression tests

```
    ✓ Pass: caffe2/caffe2/python:core_test - testSetInputRecordWithBlobs (caffe2.caffe2.python.core_test.TestExternalInputs) (9.622)
    ✓ Pass: caffe2/caffe2/python:core_test - testAddExternalInputShouldRaiseIfDuplicate (caffe2.caffe2.python.core_test.TestExternalInputs) (9.639)
    ✓ Pass: caffe2/caffe2/python:core_test - testSetInputRecordWithoutBlobs (caffe2.caffe2.python.core_test.TestExternalInputs) (9.883)
    ✓ Pass: caffe2/caffe2/python:core_test - testAddExternalInputShouldRaiseIfDuplicateInSameCall (caffe2.caffe2.python.core_test.TestExternalInputs) (10.153)
```

Test trained 2 models. No issues

f230755456
f230754926

Reviewed By: dzhulgakov

Differential Revision: D24763586

fbshipit-source-id: c87088441d76f7198f8b07508b2607aec13521ed
2020-11-09 08:30:58 -08:00
Shiyan Deng
c19eb4ad73 BoxWithNMSLimit support int batch_splits input (#47504)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47504

Allow int-typed input for `batch_splits`.

Test Plan:
```
buck test caffe2/caffe2/python/operator_test:torch_integration_test -- test_box_with_nms_limits
```

Reviewed By: jackm321

Differential Revision: D24629522

fbshipit-source-id: 61cb132e792bddd8f9f1bca5b808f1a9131808f0
2020-11-07 00:27:51 -08:00
Gary Zheng
582e852fba [caffe2] Add unittests for schema.Field init (#47512)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47512

I deleted the last line of `__init__` -- `self._field_offsets.append(offset)` -- and the unittests didn't fail.

So this diff is to improve test coverage.

Test Plan:
```
    ✓ Pass: caffe2/caffe2/python:schema_test - testInitShouldSetEmptyParent (caffe2.caffe2.python.schema_test.TestField) (8.225)
    ✓ Pass: caffe2/caffe2/python:schema_test - testInitShouldSetFieldOffsetsIfNoChildren (caffe2.caffe2.python.schema_test.TestField) (8.339)
    ✓ Pass: caffe2/caffe2/python:schema_test - testInitShouldSetFieldOffsets (caffe2.caffe2.python.schema_test.TestField) (8.381)
```

Reviewed By: dzhulgakov

Differential Revision: D24767188

fbshipit-source-id: b6ce8cc96ecc61768b55360e0238f7317a2f18ea
2020-11-06 13:27:58 -08:00
Bugra Akyildiz
c26c4690fe Add sub operator
Summary: Add sub operator for caffe2

Test Plan:
```
buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test
```

Reviewed By: houseroad

Differential Revision: D24685090

fbshipit-source-id: 60d745065d01b634ebd3087e533d8b9ddab77a1f
2020-11-06 12:31:17 -08:00
Tristan Rice
47198e3208 [caffe2] improve core.Net cloning/init performance (24x for large models!) (#47475)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47475

This improves the core.Net cloning/init performance by quite a bit. It makes set_input_record run in linear time rather than quadratic by checking the external_input map instead of regenerating the external-input list on every call and iterating over it.
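The shape of the optimization, sketched (names are illustrative, not the actual diff): membership checks go against a prebuilt map instead of a list regenerated on every call.

```python
def register_input_blobs(external_input_map, blob_names):
    """O(len(blob_names)) per call: look each blob up in the map rather
    than rebuilding the full external-input list to test membership."""
    missing = [b for b in blob_names if b not in external_input_map]
    for b in missing:
        external_input_map[b] = True
    return missing
```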

Test Plan: unit tests + canary runs

Reviewed By: dzhulgakov

Differential Revision: D24765346

fbshipit-source-id: 92d9f6dec158512bd50513b78675174686f0f411
2020-11-06 11:34:12 -08:00
Yen-Jung Chang
6e22b6008d [MLF] Allow for computing prune quantile thresholds on absolute value of indicators in distributed-inference-compatible embedding LUT pruning (#46789)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46789

1. Now `SelfBinningHistogram` can calculate the binning histogram using the absolute values of a given array of values.
2. Update the invocation of `SelfBinningHistogram` in `post_training_prune`.
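The quantile-threshold-on-absolute-values idea, sketched in plain Python (illustrative; the real op builds a self-binning histogram rather than sorting):

```python
def prune_threshold(values, quantile, use_abs=False):
    """Return the value at the given quantile, optionally computed on
    the absolute values of the indicators (magnitude, not sign)."""
    data = sorted(abs(v) for v in values) if use_abs else sorted(values)
    idx = min(int(quantile * len(data)), len(data) - 1)
    return data[idx]
```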

Test Plan:
1. [buck test caffe2/caffe2/python/operator_test:self_binning_histogram_test](https://www.internalfb.com/intern/testinfra/testconsole/testrun/6473924488326108/)
2. [buck test dper3/dper3_backend/delivery/tests:post_training_prune_test](https://www.internalfb.com/intern/testinfra/testconsole/testrun/2251799854023163/)

Reviewed By: hwangjeff

Differential Revision: D24494097

fbshipit-source-id: 95e47137b25746e686ef9baa9409560af5d58fc1
2020-11-02 11:31:31 -08:00
Basil Hosmer
f05b66b70d pass TypeMeta by value (#45026)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45026

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D23802943

Pulled By: bhosmer

fbshipit-source-id: 81b06ef00bf8eb4375c0e0ff2032e03bd1d1188a
2020-10-30 10:14:17 -07:00
Bugra Akyildiz
eec201c138 Add last_n_window_collector
Summary:
Add `last_n_window_collector`, which C2 supports but PyTorch currently lacks: https://www.internalfb.com/intern/diffusion/FBS/browsefile/master/fbcode/caffe2/caffe2/operators/last_n_window_collector.cc?lines=139

## Problem that we are solving

This operator works on multiple pieces of data and collects the last `n` elements that have been seen.

Suppose the following pieces of data are passed in:

```
  [1, 2, 3, 4]
  [5, 6, 7]
  [8, 9, 10, 11]
```

over three passes, and the collector size is set to 6. The expected result is:

```
  [6, 7, 8, 9, 10, 11]
```
In effect we need a FIFO (first in, first out) mechanism: as we pass data through the collector, newly pushed data at the end evicts the oldest data at the front.

In this particular example, in the first pass (the data is `[1, 2, 3, 4]`), we hold `[1, 2, 3, 4]` in the queue, as our queue size is 6.

In the second pass (the data is `[5, 6, 7]`), we hold `[2, 3, 4, 5, 6, 7]` in the queue; since 1 was inserted first, it is dropped due to the size limit of the queue.

In the third pass (the data is `[8, 9, 10, 11]`), we hold `[6, 7, 8, 9, 10, 11]` in the queue, and `2, 3, 4, 5` are dropped due to the size of the queue.

For multidimension case, when we have the following data:

```
  [[1, 2], [2, 3], [3, 4], [4, 5]]
  [[5, 6], [6, 7], [7, 8]]
  [[8, 9], [9, 10], [10, 11], [11, 12]]
```

and our queue size is 6.

In the first pass, we will have `[[1, 2], [2, 3], [3, 4], [4, 5]]`.
In the second pass, we will have `[[2, 3], [3, 4], [4, 5], [5, 6], [6, 7], [7, 8]]`.
In the third pass, we will have `[[6, 7], [7, 8], [8, 9], [9, 10], [10, 11], [11, 12]]`.

### The implementation

I am using Python's FIFO queue from the collections library (`collections.deque`). It accepts a `maxlen` argument, which sets the size of the queue.

I take the last n rows of the tensor through list indexing, and this operator does not copy.

In the test plan, I have both single-dimension and multi-dimension tensors.
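The description above maps directly onto `collections.deque` with `maxlen`; a minimal sketch of the collector:

```python
from collections import deque

class LastNWindowCollector(object):
    """Keep only the last n rows seen across collect() calls (FIFO)."""

    def __init__(self, n):
        self._queue = deque(maxlen=n)  # old rows fall off the front

    def collect(self, rows):
        self._queue.extend(rows)

    def items(self):
        return list(self._queue)
```

Feeding `[1, 2, 3, 4]`, `[5, 6, 7]`, `[8, 9, 10, 11]` into a collector of size 6 yields `[6, 7, 8, 9, 10, 11]`, matching the walkthrough above.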

### Benchmark
I used various different configurations and added a benchmark test. The PyTorch implementation is much faster than the Caffe2 implementation:

#### CPU Benchmark
```
torch_response.median
0.00019254473969340324

caffe_response.median
0.00030233583599794657
```

#### GPU Benchmark

```
torch_response.mean
0.000081007429903838786

caffe_response.mean
0.00010279081099724863
```

Test Plan:
### For CPU:
```
buck test //caffe2/torch/fb/sparsenn:test
```

### For GPU:
- Used an on-demand machine and did the following commands:
```
jf get D24435544
buck test mode/opt  //caffe2/torch/fb/sparsenn:test
```
https://www.internalfb.com/intern/testinfra/testconsole/testrun/4222124688138052/

Reviewed By: dzhulgakov, radkris-git

Differential Revision: D24435544

fbshipit-source-id: 8193b4746b20f2a4920fd4d41271341045cdcee1
2020-10-30 02:35:54 -07:00
Brandon Lin
4a581ba6c2 Implement LengthsToOffsets operator in Caffe2 (#46590)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46590

This operator is very similar to LengthsToRanges but doesn't pack the offsets next to the original lengths.

Reviewed By: yf225

Differential Revision: D24419746

fbshipit-source-id: aa8b014588bb22eced324853c545f8684086c4e4
2020-10-29 07:03:34 -07:00
Kunal Bhalla
18d273dc0e [RFC][LocalSession] Fix workspace type
Summary: I was reading/looking into how LocalSession works and realized that the workspace type being passed around was the bound method on TaskGroup instead of the actual type. This meant that all workspaces for LocalSession would always be global, because they'd never match the private workspace type.

Test Plan: <not sure, could use some suggestions>

Reviewed By: cryptopic

Differential Revision: D24458428

fbshipit-source-id: 0f87874babe9c1ddff25b5363b443f9ca37e03c1
2020-10-29 04:12:17 -07:00
Dmytro Dzhulgakov
115bbf9945 [caffe2] Disable running full grad check in tests by default
Summary:
We've been seeing a lot of Hypothesis timeouts, and from profiling a few of the failing tests, one of the contributing factors is the really slow grad checker. In short, it launches the whole op for each input element, so the overall complexity is at least O(numel^2).

This applies a very unscientific hack to just run grad check on the first and last few elements. It's not ideal, but it's better than flaky tests. One can still explicitly opt in with the env var.
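The hack can be sketched as a central-difference check restricted to a handful of indices, e.g. the first and last few elements (illustrative, not the actual gradient_checker code):

```python
def numeric_grad_subset(f, x, indices, eps=1e-6):
    """Central-difference gradient of scalar f at selected indices of
    list x, dropping the cost from O(numel^2) to O(k * numel)."""
    grads = {}
    for i in indices:
        orig = x[i]
        x[i] = orig + eps
        f_plus = f(x)
        x[i] = orig - eps
        f_minus = f(x)
        x[i] = orig  # restore the input before the next probe
        grads[i] = (f_plus - f_minus) / (2 * eps)
    return grads
```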

Reviewed By: malfet

Differential Revision: D23336220

fbshipit-source-id: f04d8d43c6aa1590c2f3e72fc7ccc6aa674e49d2
2020-10-27 16:10:03 -07:00
Huan Gui
b5662ba0f0 [uhm][0/n] add cuda Mod Op (#46732)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46732

as titled

Test Plan:
unittest

buck test mode/dev-nosan //caffe2/caffe2/python/operator_test:mod_op_test

Reviewed By: xianjiec

Differential Revision: D24368100

fbshipit-source-id: 1232d22a67ac268986043911d548fa9d657470ec
2020-10-26 11:07:51 -07:00
Yunfan Zhong
e519fcd1aa Remap net name inside arg.n for AsyncIf operator
Summary: Similar to If operator, AsyncIf also contains nets in args. It needs the same handling.

Test Plan:
New unit test test_control_op_remap
`buck test caffe2/caffe2/python:core_test`

Also it worked end to end in prototype of dist bulk eval workflow f226680903

Reviewed By: yyetim

Differential Revision: D24451775

fbshipit-source-id: 50594e2ab9bb457329ed8da7b035f7409461b5f6
2020-10-23 10:41:06 -07:00
Alexander Grund
93719440b8 Replace map(lambda constructs (#46462)
Summary:
Follow-up of https://github.com/pytorch/pytorch/issues/46461 with a similar goal

Makes them more readable and possibly faster. Care has to be taken because `map` applies the function immediately while `(x for x in xs)` is a generator expression which gets evaluated later. This is a benefit in some cases where it is not required to actually create the list of values in memory (e.g. when passing to `tuple` or `extend` or `join`)
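For example, the kind of rewrite this PR performs:

```python
words = ['alpha', 'beta', 'gamma']

# Before: map with a lambda.
upper_map = map(lambda w: w.upper(), words)

# After: a generator expression, evaluated lazily when consumed --
# here fed straight into join() without building an intermediate list.
joined = '-'.join(w.upper() for w in words)
```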

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46462

Reviewed By: zou3519

Differential Revision: D24422343

Pulled By: ezyang

fbshipit-source-id: 252e33499c92ac0b15238f2df32681dbbda2b237
2020-10-22 09:50:22 -07:00
Jeff Hwang
9b5197b763 [mlf][efficiency] add tensor inference function to last-n collector op (#46693)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46693

title

Test Plan: unit tests

Reviewed By: hx89

Differential Revision: D23946770

fbshipit-source-id: f7c3d4a1b4ef3b0e5f56e5a9a30f5003ce9f40b0
2020-10-22 01:15:00 -07:00
Alexander Grund
5b0f400488 Replace list(map(...)) constructs by list comprehensions (#46461)
Summary:
As discussed in https://github.com/pytorch/pytorch/issues/46392 this makes the code more readable and possibly more performant.

It also fixes a bug detected by this where the argument order of `map` was confused: 030a24906e (diff-5bb26bd3a23ee3bb540aeadcc0385df2a4e48de39f87ed9ea76b21990738fe98L1537-R1537)

Fixes https://github.com/pytorch/pytorch/issues/46392

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46461

Reviewed By: ailzhang

Differential Revision: D24367015

Pulled By: ezyang

fbshipit-source-id: d55a67933cc22346b00544c9671f09982ad920e7
2020-10-19 18:42:49 -07:00
Jongsoo Park
c37baa9177 [caffe2] add concat benchmark (#46457)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46457

Wanted to see if a CopyMatrix specialized for float that uses mkl_somatcopy could be faster, but it wasn't. Still want to check in the benchmark, which can be used later.

Test Plan: .

Reviewed By: dskhudia

Differential Revision: D24345901

fbshipit-source-id: d3e68dbb560e3138fda11c55789cd41bc0715c6d
2020-10-16 08:48:42 -07:00
Nikita Shulga
84771fc64f [caffe2] Add 10s deadline for all Caffe2 hypothesis fuzz tests
Test Plan: CI

Reviewed By: walterddr

Differential Revision: D24298118

fbshipit-source-id: 2286c1e37ed9c43f404b888386c0bd4b0b6a55c6
2020-10-14 06:30:09 -07:00