Summary:
Freezing exists as a pass which partially evaluates your model and applies generic optimizations that should speed it up. Optimize-for-inference is a counterpart to these optimizations that runs build- and server-specific optimizations. The interaction with the existing `optimize_frozen_module` is not great; I guess we could just deprecate that API entirely? It was never officially released and only existed to document the `optimize_numerics` keyword.
Eventually, I would like to add a way of providing example inputs, but I didn't add that here because they are not being used at all yet. I also have not yet included a way to blacklist individual optimizations, and would like to wait until we move this to Beta and have a little more clarity on how everything will fit together. I also think blacklisting will be an uncommon use case for the current optimizations.
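A minimal usage sketch of the intended flow, assuming the `torch.jit.optimize_for_inference` entry point added in this PR (the exact name and signature may evolve):
```python
import torch

class MyModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 8)

    def forward(self, x):
        return torch.relu(self.linear(x))

scripted = torch.jit.script(MyModule().eval())
frozen = torch.jit.freeze(scripted)                    # partial evaluation + generic optimizations
optimized = torch.jit.optimize_for_inference(frozen)   # build- and server-specific optimizations
out = optimized(torch.randn(2, 8))
```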
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58193
Reviewed By: bertmaher, navahgar
Differential Revision: D28443714
Pulled By: eellison
fbshipit-source-id: b032355bb2585720a6d2f00c89d0d9a7ef60e649
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58202
This unit test was testing the wrong target. It should test the sampler under jit::mobile. This diff fixes it.
Test Plan: run unit tests
Reviewed By: shreyanb98
Differential Revision: D28384839
fbshipit-source-id: 35cc63be2e73ca9b1a7d30d6f67fffcfe5021fa2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58205
It's worth moving training-related files into their own folder since we are adding more code under the mobile directory.
This diff does that.
Test Plan: run unit tests and ci
Reviewed By: iseeyuan
Differential Revision: D28402432
fbshipit-source-id: cd76a1c4f8ff06508cdc3aad8a169fbf34bb4995
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58201
Add a light version of RandomSampler which can be used with torch mobile.
Test Plan: run unit test
Reviewed By: iseeyuan
Differential Revision: D28364467
fbshipit-source-id: 3148129fa56533f5f4b76b63b60e8778eeaf815f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56845
Handle forward/backward compatibility caused by added default arguments on mobile. As an example, in the older version, operator aten::foo's schema is
```
foo(Tensor a, Tensor b) -> Tensor
```
In the new version, the schema is updated to
```
foo(Tensor a, Tensor b, int groups=1) -> Tensor
```
## Model file
Serialize the number of specified arguments for each operator into the bytecode operator table. Previously, the operator table contained only the operator name and overload name:
```
('operators', (('aten::foo', ''),))
```
Now the number of specified arguments is added:
```
# bytecode version 6
('operators', (('aten::foo', '', 2),))
```
where "2" means the number of specified arguments.
Since there's bytecode schema change, the bytecode version number is bumped. This PR is to be landed after #56002 , where the version number is bumped from 4 to 5. This PR bumps the version number from 5 to 6.
## Runtime and backward compatibility
When the operator is found (either jit or c10), we have the OperatorHandle, where the operator schema can be accessed by
```
op.value().schema().arguments()
```
Adaptation is implemented to handle backward compatibility. For the example above, the new runtime holds the updated schema:
```
foo(Tensor a, Tensor b, int groups=1) -> Tensor
```
Whereas the model file carries
```
(('aten::foo', ''), 2)
```
We can implement a wrapper around the original function pointer to push the default argument to the stack.
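An illustrative sketch of that adaptation in Python (the real implementation wraps the C++ function pointer inside the mobile interpreter; all names here are hypothetical):
```python
def adapt_operator(op_fn, schema_defaults, num_specified_args):
    """Wrap an operator so that defaults added after the model was exported
    get pushed onto the argument stack before the original function runs.

    schema_defaults: default value for each argument in the *new* schema
    (placeholder None for required arguments, which are never used here).
    """
    trailing_defaults = schema_defaults[num_specified_args:]

    def wrapper(stack):
        stack.extend(trailing_defaults)   # push the defaults the old model didn't serialize
        return op_fn(stack)

    return wrapper

# e.g. the old model recorded ('aten::foo', '', 2) but the new schema is
# foo(Tensor a, Tensor b, int groups=1) -> Tensor:
# foo_call = adapt_operator(foo_impl, schema_defaults=[None, None, 1], num_specified_args=2)
```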
## Delivery time and forward compatibility
At model delivery time, two checks can be done:
### Operator check
Two APIs to be provided:
* Runtime: An API to get a runtime’s ops and their schemas (i.e. the # of args). D27920185(WIP)
* Model: An API to get a model’s ops and their schema requirements (i.e. the # of args required).
The APIs can be used to check
* runtime.ops() is a superset of model.ops()
* for each op in model.ops() validate their schemas are compatible with those in runtime.ops() -- i.e. the # args required in a model op are <= # args in the runtime op.
Note that only the root ops in the model need to be checked here; for transient ops it's not necessary. For example, if a root op "aten::root" calls "aten::foo", it's "aten::root"'s responsibility to adapt to "aten::foo"'s change, or "aten::root" itself needs to be updated too.
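A hedged sketch of these two checks; `model_ops`, `runtime_ops`, and `is_compatible` are hypothetical stand-ins for the APIs referenced above:
```python
def is_compatible(model_ops, runtime_ops):
    """model_ops / runtime_ops: dict mapping (op name, overload name) -> number of arguments."""
    for op, num_args_required in model_ops.items():
        if op not in runtime_ops:                  # runtime.ops() must be a superset of model.ops()
            return False
        if num_args_required > runtime_ops[op]:    # model must not require more args than runtime provides
            return False
    return True

# The runtime gained a default argument, which stays compatible:
runtime_ops = {("aten::foo", ""): 3}   # foo(Tensor a, Tensor b, int groups=1)
model_ops = {("aten::foo", ""): 2}     # model serialized ('aten::foo', '', 2)
assert is_compatible(model_ops, runtime_ops)
```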
### Bytecode version backport (PR coming)
When delivering a model with bytecode v6, if the runtime only works with bytecode v5 and lower, backport is needed.
* The number of arguments is removed from the operator table
* The bytecode version is changed from 6 to 5
Note that this backport is a pure format change; it does not guarantee that the backported model will always run in an old runtime. The operator check mentioned above should be done before the model is backported to v5.
Test Plan: Imported from OSS
Reviewed By: gmagogsfm
Differential Revision: D27986544
Pulled By: iseeyuan
fbshipit-source-id: 143e19d4798cfb96b65095538dd648eead4e3fda
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57570
Move runtime ops compatibility api to OSS and introduce schema information
ghstack-source-id: 128789159
Test Plan: unit test and manually ran it for a runtime with all (non custom) ops, and the bixray models unittest {P412728176}
Reviewed By: raziel
Differential Revision: D28203104
fbshipit-source-id: 432a7d0247bccfb2e1ce90e8d41f81596efa3d67
Summary:
Fixes https://github.com/pytorch/pytorch/issues/56608
- Adds a binding for the `c10::InferenceMode` RAII class in `torch._C._autograd.InferenceMode` through pybind. Also binds the `torch.is_inference_mode` function.
- Adds a context manager `torch.inference_mode` to manage a (global) instance of `c10::InferenceMode` (see the usage sketch below). Implemented in `torch.autograd.grad_mode.py` to reuse the `_DecoratorContextManager` class.
- Adds some tests based on those linked in the issue, plus several more just for the context manager
Issues/todos (not necessarily for this PR):
- Improve the short inference mode description
- Add a small example
- Improve testing, since there is no direct way of checking TLS/dispatch keys
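A minimal usage sketch of the new context manager (and decorator), as described above:
```python
import torch

x = torch.ones(3, requires_grad=True)

with torch.inference_mode():
    y = x * 2              # computed with autograd fully disabled
print(y.requires_grad)     # False

@torch.inference_mode()
def run(t):                # also usable as a decorator, like torch.no_grad()
    return t + 1
```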
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58045
Reviewed By: agolynski
Differential Revision: D28390595
Pulled By: soulitzer
fbshipit-source-id: ae98fa036c6a2cf7f56e0fd4c352ff804904752c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49352
In this PR, we replace all definitions of slice to take None parameters for start, end, and step. This simplifies the compiler logic.
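A small illustration of the intent (a sketch, not the exact schema text): with start, end, and step all optional, differently written slices can share a single `aten::slice` definition.
```python
import torch

@torch.jit.script
def slices(x: torch.Tensor):
    # omitted bounds are passed through as None arguments to slice
    return x[:], x[1:], x[:-1], x[::2]
```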
Test Plan:
test_jit test cases
Imported from OSS
Reviewed By: jamesr66a, nikithamalgifb
Differential Revision: D25929903
fbshipit-source-id: 5bfc6bad514a8aafbef2dacc706f95f867fe85f1
Summary:
This PR adds a new pass in JIT that optimizes `aten::cat` ops.
Specifically, here are optimizations performed:
* Eliminate redundant `cat` inputs by performing CSE on the list of inputs.
- This includes eliminating fully redundant `cat` ops when all the inputs are the same, as well as the case when "all but one" of the inputs have already been concatenated (see the sketch after this list).
* Expand `cat` into multiple copies and eliminate redundancies.
- This also includes eliminating redundancies in the underlying buffers used for `cat`.
These optimizations are not enabled in any compilation flow at this point.
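A conceptual eager-mode illustration of the "all but one inputs already concatenated" case (the actual transformation operates on the JIT graph):
```python
import torch

a, b, c = torch.randn(2, 4), torch.randn(3, 4), torch.randn(1, 4)

ab = torch.cat([a, b])
abc = torch.cat([a, b, c])        # recomputes the [a, b] prefix

abc_reused = torch.cat([ab, c])   # equivalent, reusing the earlier result
assert torch.equal(abc, abc_reused)
```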
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55474
Reviewed By: albanD
Differential Revision: D27624511
Pulled By: navahgar
fbshipit-source-id: d509289fafc23e73b02f64a90219148896817339
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57792
There are two problems when using CUDA RPC with distributed autograd
and distributed optimizer:
1) In the local autograd engine, all autograd functions/nodes, including AccumulateGrad, use the forward stream for backward computation. But distributed autograd skips the AccumulateGrad autograd function/node and directly calls into `AccumulateGrad::accumulateGrad`. As a result, it would use the default stream to accumulate gradients instead of the forward stream. This commit changes that and uses the forward stream to accumulate gradients, matching the forward behavior.
2) Distributed optimizer and distributed autograd backward are
separate RPC calls, and CUDA streams are not synchronized across
different RPC calls. As a result, distributed optimizer might
consume gradients before they are ready. This commit uses CUDA events to record the completion of gradient computation, and uses those events to block the current streams when getGradients() is called (see the sketch below).
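A hedged, illustrative sketch of the synchronization idea (the real logic lives in distributed autograd's C++ getGradients path; the stream and the work below are stand-ins):
```python
import torch

if torch.cuda.is_available():
    grad_stream = torch.cuda.Stream()   # stands in for the forward stream used for accumulation
    done = torch.cuda.Event()

    with torch.cuda.stream(grad_stream):
        grad = torch.randn(1024, device="cuda") * 2   # stands in for gradient accumulation
        done.record(grad_stream)        # record completion of the gradient computation

    # Before the (separately issued) optimizer step consumes `grad` on the current
    # stream, make the current stream wait on the recorded event.
    torch.cuda.current_stream().wait_event(done)
```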
Test Plan: Imported from OSS
Reviewed By: pritamdamania87
Differential Revision: D28274876
Pulled By: mrshenli
fbshipit-source-id: 22e607152324ae918084066cde8c5dbb418bba7c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57879
_save_data() and _load_data() were designed as a protocol for data serialization in the trainer client. As confirmed with kwanmacher and dreiss, they are not used. In addition, there's no plan to use them in the Federated Learning flow. Remove them for now.
Test Plan: Imported from OSS
Reviewed By: kwanmacher
Differential Revision: D28306682
Pulled By: iseeyuan
fbshipit-source-id: 1b993ce4d78e372ae9b83bcbe496a196f9269d47
Summary:
Add an API to backport a model from version n to version i. It accepts an input model (file or buffer) and outputs a model (file or buffer) with the expected bytecode version. In this change, both the input and the output model can be either a file path or a buffer.
When backport fails, the function returns false with a warning message:
```
/Users/chenlai/pytorch/cmake-build-debug/bin/test_jit --gtest_filter=LiteInterpreterTest.BackPortByteCodeModelV4:LiteInterpreterTest/*.BackPortByteCodeModelV4:*/LiteInterpreterTest.BackPortByteCodeModelV4/*:*/LiteInterpreterTest/*.BackPortByteCodeModelV4 --gtest_color=no
Testing started at 2:32 PM ...
CUDA not available. Disabling CUDA and MultiCUDA tests
[W backport.cpp:419] Warning: Backport doesn't support backport to version3 (function _backport_for_mobile_impl)
Process finished with exit code 0
```
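A hedged sketch of calling the backport from Python; the binding name and module (`torch.jit.mobile._backport_for_mobile`) and the boolean return value are assumptions based on this stack:
```python
from torch.jit.mobile import _backport_for_mobile

# input model, output destination, target bytecode version
ok = _backport_for_mobile("model_v6.ptl", "model_v5.ptl", 5)
if not ok:
    print("backport failed; see the warning emitted by _backport_for_mobile_impl")
```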
## Test
1. Run both `caffe2/test/cpp/jit/test_lite_interpreter.cpp` and `caffe2/test/mobile/test_bytecode.py`.
2. Run all prod models with backport api.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56802
ghstack-source-id: 128425510
Test Plan: CI
Reviewed By: raziel, iseeyuan
Differential Revision: D27844651
fbshipit-source-id: 8a803cf6c76433ee0a3049b1a5570585d569f8d6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57635
Note: this PR looks massive, but it's just one simple change, codemodded many times.
In many cases, a callback needs to access the value/error produced by the parent future. In Python this was easy because the callback was invoked with the parent future as an argument, and could thus inspect it. In C++ the callbacks didn't take any arguments, so in many cases we worked around this by capturing the future in its own callback. This is risky (it leads to a reference cycle and thus a memory leak) and must be done carefully (spoiler: sometimes we weren't).
ghstack-source-id: 128296580
Test Plan: CI
Reviewed By: wanchaol
Differential Revision: D28178783
fbshipit-source-id: 6de02c4568be42123372edc008f630d5ddae0081
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57634
`wrapPropagateTLSState` was restricting its argument to be an argument-less function, and I need to relax this for later work.
Also, it required its argument to be converted to `std::function`, and it also returned a `std::function`. Each creation of a `std::function` can cause a heap allocation. It's not particularly expensive, but here we can easily avoid it by having `wrapPropagateTLSState` operate directly on generic callables (thus, possibly, raw lambdas).
ghstack-source-id: 128295264
Test Plan: CI
Reviewed By: ilia-cher
Differential Revision: D28178782
fbshipit-source-id: d657f5751514974518606dd4fc4175e805dcb90a
Summary:
Fixes a bug introduced by https://github.com/pytorch/pytorch/issues/57057
cc ailzhang: while writing the tests, I realized that for these functions we don't properly set the CreationMeta in no-grad mode and inference mode. Added a todo there.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57669
Reviewed By: soulitzer
Differential Revision: D28231005
Pulled By: albanD
fbshipit-source-id: 08a68d23ded87027476914bc87f3a0537f01fc33
Summary:
Add an API `_get_bytecode_version` to get the bytecode version number of a given model, in both cxx and python; the input can come from either a file path or a buffer.
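A hedged sketch of the Python side; the binding is assumed to be exposed under `torch.jit.mobile` as `_get_model_bytecode_version` (the exact name and location may differ from the C++ `_get_bytecode_version`):
```python
from torch.jit.mobile import _get_model_bytecode_version

version = _get_model_bytecode_version("model.ptl")   # a file path or a file-like buffer
print(version)   # e.g. 6
```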
## Test
CI (the newly added unit test will run as part of `pytorch_core-buck`)
1. run test_lite_interpreter.cpp
2. `python test/mobile/test_bytecode.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56801
ghstack-source-id: 128169647
Test Plan:
CI (the newly added unit test will run as part of `pytorch_core-buck`)
1. run test_lite_interpreter.cpp
2. `python test/mobile/test_bytecode.py`
Reviewed By: iseeyuan
Differential Revision: D27961417
fbshipit-source-id: f786cc9573d855feecff0b4fe8e5363e25f5728c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57560
The new methods allow peeking into bufferArgs, which describe the parameters that codegen expects. This description includes whether a given parameter is a scalar var or a buffer, and in case it's a buffer, allows getting the corresponding `Buf*` pointer from which we can obtain the expected sizes.
Relanding #57074 which was reverted because I forgot to guard a new
test with `ifdef LLVM`.
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D28199048
Pulled By: ZolotukhinM
fbshipit-source-id: 636e838e7e242a3c63e97ec453b8fae9b6380231
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57552
This method uses `CodeGen::call_raw` instead of `CodeGen::call`.
Relanding #57328 (the entire stack) which was reverted because I forgot
to guard a new test with `ifdef LLVM`.
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D28195047
Pulled By: ZolotukhinM
fbshipit-source-id: bcfd3cb5b4f33a149b7549515ffd705e2c4f208f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57551
The new method allows passing input and output arguments by `void*` pointers instead of CallArgs. That helps to reduce the invocation overhead. Currently this is only supported in the LLVM codegen.
Relanding #55113 (the entire stack) which was reverted because I forgot
to guard a new test with `ifdef LLVM`.
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D28195049
Pulled By: ZolotukhinM
fbshipit-source-id: 035b77ae996dbbcd542b4b0e4c011b41e8d7828b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56562
Earlier, inlined callstacks were annotated only on nodes. This left out nodes such as If, which have blocks of nodes. Such nodes should also be updated similarly.
Test Plan:
Added test in test_misc
Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision: D27902516
fbshipit-source-id: 4e65c686fa6b4977e8719db45f71f7d2599d4d8e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55252
Earlier, for bytecode serialization, we were saving debug handles only for OPs and not for all instructions. This PR adds them for all instructions.
Test Plan:
python test/mobile/test_lite_script_module.py TestLiteScriptModule
Imported from OSS
Reviewed By: dreiss
Differential Revision: D27542502
fbshipit-source-id: cff75118c721ce9f0c2f60d2c9471481f05264ca
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55062
This diff introduces the following changes:
1. InlinedCallStack pickler/serializer is introduced. It is serialized
as a tuple of {module_instance_info, source range tag, callee:InlinedCallStack}
Module instance info is serialized as tuple of {class_type_name,
instance_name}.
Note that callee of the serialized inlined callstack points to the tuple
of already serialized callstack. This means the first callstack ptr to
serialize, will serialize entire path of the tree, where some callee
nodes might be shared with callstack pointers that will be serialized
subsequently. Pickler supports memoization of pickled objects: if a tuple has already been serialized, its object id is emitted instead of serializing the object again. Thus we still serialize the tree, and not every path from the root separately. Furthermore, InlinedCallStackSerializer also uses a cache to look up the pointer and return the serialized IValue.
Furthermore, note that we must also serialize the source range of
InlinedCallStack. To do this, the serializer requires a map of source-range tags to source ranges. This was done in the previous
diff, where as part of source range serialization we also generate
unique tags. These are the tags that are serialized in InlinedCallStack.
Thus during deserialization we would have to deserialize source range
before deserializing InlinedCallStacks.
2. Furthermore, each InlinedCallStack is serialized with a
unique debug_handle and source range tag.
BackendDebugHandleManager manages generation of
unique debug handles and saves the map of
debug-handles-to-{source_range_tag, inlined-callstack-ptr}.
This map is then serialized as callstack_debug_map.pkl. Note that
inlined callstack is not sufficient to get all the source information
since it contains source information about the nodes which are inlined.
The top-of-the-stack (or bottom) node, which is the actual op node, is
not part of the inlined callstack pointer and thus the source range of
this node is serialized separately using source_range_tag. This is
similar to how JIT creates callstack in
torch/csrc/jit/runtime/interpreter.cpp
Unique debug handles facilitate exception throwing or profiling using just the debug handle, without any further qualification such as which function or module the inlined callstack belongs to.
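A toy sketch of the serialized shape described above (field names are illustrative and do not reflect the actual pickle layout):
```python
def serialize_callstack(cs, memo):
    """cs: (module_instance_info, source_range_tag, callee_or_None); memo caches
    already-serialized entries so shared callees are emitted only once, as in pickling."""
    if cs is None:
        return None
    if id(cs) in memo:
        return memo[id(cs)]            # shared suffix of the tree: reuse the memoized object
    module_info, source_range_tag, callee = cs
    serialized = (module_info, source_range_tag, serialize_callstack(callee, memo))
    memo[id(cs)] = serialized
    return serialized

# Two callstacks that share the same callee serialize the shared part only once.
top = (("TopModule", "top"), 0, None)
a = (("SubA", "a"), 1, top)
b = (("SubB", "b"), 2, top)
memo = {}
sa, sb = serialize_callstack(a, memo), serialize_callstack(b, memo)
assert sa[2] is sb[2]
```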
Furthermore, this diff refactors the old mobile code for tracking
module hierarchy information per op. Mainly now bytecode serialization
will serialize debug handles corresponding to ops/nodes in graph and
have callstack_debug_map.pkl help generate:
1. Entire callstack and
2. Module hierarchy information.
Test Plan:
python test/mobile/test_lite_script_module.py TestLiteScriptModule
./build/bin/test_jit --gtest_filter=*ModuleInfo
Imported from OSS
Reviewed By: raziel
Differential Revision: D27468709
fbshipit-source-id: 53e2413e7703ead01c77718b7c333c7c6ff50a23
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57074
The new methods allow peeking into bufferArgs, which describe the parameters that codegen expects. This description includes whether a given parameter is a scalar var or a buffer, and in case it's a buffer, allows getting the corresponding `Buf*` pointer from which we can obtain the expected sizes.
Differential Revision: D28048289
Test Plan: Imported from OSS
Reviewed By: bertmaher
Pulled By: ZolotukhinM
fbshipit-source-id: 3867e862a0ec3593906820826c2344bd8a8f5c0a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56948
Add an API to get the runtime bytecode version.
## Test
Both `caffe2/test/cpp/jit/test_lite_interpreter.cpp` and `caffe2/test/mobile/test_bytecode.py` pass
ghstack-source-id: 127939889
Test Plan: Both `caffe2/test/cpp/jit/test_lite_interpreter.cpp` and `caffe2/test/mobile/test_bytecode.py` pass
Reviewed By: raziel, iseeyuan
Differential Revision: D27987811
fbshipit-source-id: 35ed9bd626aecffc226f6dacfa046e6cdabfed51
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57294
With the advent of CPUs in the device maps, to be more generic (e.g., to support AMD GPUs), and to avoid conversions when passing to Future, RRef, and such, it's easier to use Devices instead of DeviceIndices. This started as just migrating the TensorPipe agent, but the RPC layer is quite intertwined, so I had to migrate a lot of stuff.
ghstack-source-id: 127916562
Test Plan: CI
Reviewed By: mrshenli
Differential Revision: D28092733
fbshipit-source-id: 024dcb3648c5898ab13e770413c43958f04f1a8a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56850
This is part of the changes to enable NNC AOT compilation for mobile.
The generated kernels need to call these external functions, so the declarations are changed to use C linkage when building the mobile runtime.
Added nnc_aten_addmm external function.
ghstack-source-id: 127877411
Test Plan:
- build & CI;
- tested mobile build with stacked PRs;
Reviewed By: ZolotukhinM
Differential Revision: D27897154
fbshipit-source-id: 61d5499d7781a83bd2657859659fd1b5043d6b04
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55113
The new method allows passing input and output arguments by `void*` pointers instead of CallArgs. That helps to reduce the invocation overhead. Currently this is only supported in the LLVM codegen.
Differential Revision: D27487549
Test Plan: Imported from OSS
Reviewed By: bertmaher
Pulled By: ZolotukhinM
fbshipit-source-id: d8f3d92262cde1c155beefb629454370d9af2f89
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56807
If I understand correctly, there's no reason to create your own instance of these global singleton types.
ghstack-source-id: 127312270
Test Plan: CI
Reviewed By: SplitInfinity
Differential Revision: D27973447
fbshipit-source-id: f12df69d185f1baaa45f2ac6eac70570a7a65912
Summary:
Partial fix for https://github.com/pytorch/pytorch/issues/56157
This PR updates the `flatten` API in `LoopNest` to perform the flattening transformation in-place. After this transformation, the first loop in the input becomes the flattened loop.
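A conceptual before/after of the flattening transformation (plain Python, not the C++ `LoopNest` API): the two nested loops become one loop whose index is decomposed back into (i, j).
```python
M, N = 4, 8
A = [[0] * N for _ in range(M)]

# before:
# for i in range(M):
#     for j in range(N):
#         A[i][j] = i + j

# after (the flattened loop replaces the first loop in-place):
for f in range(M * N):
    i, j = divmod(f, N)
    A[i][j] = i + j
```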
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56629
Reviewed By: H-Huang
Differential Revision: D28004787
Pulled By: navahgar
fbshipit-source-id: 7474ae237fae3fff0cd1c64a276a8831dc5b7db0
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os

def get_compiled_files_list():
    import json
    with open("build/compile_commands.json") as f:
        data = json.load(f)
    files = [os.path.relpath(node['file']) for node in data]
    for idx, fname in enumerate(files):
        if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
            files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
    return files

def run_clang_tidy(fname):
    check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname, "-s"])
    changes = check_output(["git", "ls-files", "-m"])
    if len(changes) == 0:
        return
    check_call(["git", "commit", "--all", "-m", f"NOLINT stubs for {fname}"])

def main():
    git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
    compiled_files = get_compiled_files_list()
    for idx, fname in enumerate(git_files):
        if fname not in compiled_files:
            continue
        if fname.startswith("caffe2/contrib/aten/"):
            continue
        print(f"[{idx}/{len(git_files)}] Processing {fname}")
        run_clang_tidy(fname)

if __name__ == "__main__":
    main()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892
Reviewed By: H-Huang
Differential Revision: D27991944
Pulled By: malfet
fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57039
## Summary
Add two models (v4 and v5) for testing the runtime. (v5 will be introduced in https://github.com/pytorch/pytorch/pull/56002)
## Test plan
CI
Test Plan: Imported from OSS
Reviewed By: iseeyuan
Differential Revision: D28047615
Pulled By: cccclai
fbshipit-source-id: 47f7df3094dadb7e013ed57bc713cc8b3d1c8ce0