pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-15 21:00:47 +00:00

Author	SHA1	Message	Date
Philip Meier	0973c5a1cc	align signature of make_tensor with other creation ops (#72702 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72702 Test Plan: Imported from OSS Reviewed By: mrshenli Differential Revision: D34457729 Pulled By: mruberry fbshipit-source-id: 83d580c4201eef946dc9cf4b9e28a3d36be55609 (cherry picked from commit aa4cf20fbeb4b795595729b8ac2e6ba7707d8283)	2022-02-25 06:30:31 +00:00
Philip Meier	1f74e082e2	only compare attributes for meta tensors (#72508 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72508 Todo: - [x] document this behavior - [x] add tests Test Plan: Imported from OSS Reviewed By: zou3519 Differential Revision: D34262452 Pulled By: ezyang fbshipit-source-id: bc5c9653d5c3ad5c6efccc9c8e0efc0d28e15104 (cherry picked from commit 233142c88e4cff02825c7e233aba9411a6df3e9f)	2022-02-17 02:33:08 +00:00
Natalia Gimelshein	06d0536dad	Low precision support for jiterator (#70157 ) Summary: This adds support for bfloat16 and fp16 types for jiterator by adding at::Half and at::BFloat16 classes to the jiterator code template. The only methods defined in those classes are construction from float and implicit conversion to float. Mathematical operations on them never need to be defined, because jiterator is written in a way to implicitly upcast the inputs to the functor, so all math has to be performed on float only (e.g. the compute part of the kernel would always be written as ``` out[j] = i0<float>(arg0[j]); ```). It also adds support for casting to complex outputs, by adding a similar templated class c10::complex<T>. Originally I planned to only support float -> complex conversion for it, but to compile the fetch_and_cast function we also need complex -> float conversion. We can avoid it by compiling fetch_and_cast for a different subset of types, but I'm not doing it in this PR. Thus, technically, we can compile a kernel that would accept complex inputs and produce wrong results, but we are guarding against it by static asserting that none of the functor datatypes are complex, and runtime-checking that none of the inputs are complex. Adding bfloat16, half and complex support allows us to remove special handling for type promotion tests for gcd. i0 (that supports half and bfloat16 inputs) is moved to use jiterator. Pull Request resolved: https://github.com/pytorch/pytorch/pull/70157 Reviewed By: mruberry Differential Revision: D33221645 Pulled By: ngimel fbshipit-source-id: 9cfe8aba3498a0604c4ea62c217292ea06c826b1	2021-12-19 11:56:57 -08:00
Natalia Gimelshein	9ff8c49ed9	Enable cpu scalar arguments for jiterator (#69861 ) Summary: Creates analog of `gpu_kernel_with_scalars` for jiterator kernels Pull Request resolved: https://github.com/pytorch/pytorch/pull/69861 Reviewed By: mruberry Differential Revision: D33134013 Pulled By: ngimel fbshipit-source-id: fd2412e8d6432e15d5721e95a194d29fa70ad92c	2021-12-16 10:58:59 -08:00
Mike Ruberry	a974699633	Skips failing ROCm test (#69456 ) Summary: ROCm and CUDA type promotion are slightly divergent and need to be updated. cc jeffdaily sunway513 jithunnair-amd ROCmSupport KyleCZH Pull Request resolved: https://github.com/pytorch/pytorch/pull/69456 Reviewed By: anjali411, janeyx99 Differential Revision: D32883895 Pulled By: mruberry fbshipit-source-id: 3b0ba8a9d092c2d7ff20d78da42d4a147b1db12d	2021-12-06 09:12:31 -08:00
Mike Ruberry	b6f41bb848	The Jiterator (#69439 ) Summary: This PR: - creates the "jiterator" pattern, allowing elementwise unary and binary kernels that don't accept scalars to be jit compiled when called - ports the gcd and i1 CUDA kernels to use the jiterator - extends elementwise binary systemic testing to be comparable to elementwise unary systemic testing - separates one test case from test_out in test_ops.py - updates more OpInfos to use expected failures instead of skips The jiterator currently does not support half, bfloat16 or complex dtypes. It also (as mentioned above) doesn't support scalar inputs. In the future we expect to add support for those datatypes and scalars. Pull Request resolved: https://github.com/pytorch/pytorch/pull/69439 Reviewed By: ngimel Differential Revision: D32874968 Pulled By: mruberry fbshipit-source-id: d44bb9cde4f602703e75400ec5a0b209f085e9b3	2021-12-06 07:32:48 -08:00
soulitzer	83e8612d11	Clean up test autograd (#67413 ) Summary: Partially fixes https://github.com/pytorch/pytorch/issues/66066 This PR: - cleans up op-specific testing from test_autograd. test_autograd should be reserved for testing generic autograd functionality - tests related to an operator are better colocated - see the tracker for details What to think about when moving tests to their correct test suite: - naming, make sure its not too generic - how the test is parametrized, sometimes we need to add/remove a device/dtype parameter - can this be merged with existing tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/67413 Reviewed By: jbschlosser, albanD Differential Revision: D32031480 Pulled By: soulitzer fbshipit-source-id: 8e13da1e58a38d5cecbfdfd4fe2b4fe6f816897f	2021-11-03 15:26:09 -07:00
kshitij12345	885a8e53ba	replace onlyOnCPUAndCUDA with onlyNativeDeviceTypes (#65201 ) Summary: Reference https://github.com/pytorch/pytorch/issues/53849 Replace `onlyOnCPUandCUDA` with `onlyNativeDeviceTypes` which includes `cpu, cuda and meta`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/65201 Reviewed By: mrshenli Differential Revision: D31299718 Pulled By: mruberry fbshipit-source-id: 2d8356450c035d6a314209ab51b2c237583920fd	2021-11-01 09:22:34 -07:00
Yukio Siraichi	83f70db95c	Fix common device computation for comparison ops. (#66245 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66245 Fixes #66053 This PR splits `declare_static_dtype_and_device` into two new methods for `TensorIteratorBase`: `declare_static_dtype` and `declare_static_device`. Test Plan: Imported from OSS Reviewed By: ejguan Differential Revision: D31503849 Pulled By: ngimel fbshipit-source-id: 4b131b691d29ceb5f3709f5d6503997ea0875c54	2021-10-22 18:43:17 -07:00
Jane Xu	8a65047acc	[skip ci] Set test owners for everything considered with module: tests (#66865 ) Summary: Action following https://github.com/pytorch/pytorch/issues/66232 cc mruberry Pull Request resolved: https://github.com/pytorch/pytorch/pull/66865 Reviewed By: anjali411 Differential Revision: D31771147 Pulled By: janeyx99 fbshipit-source-id: 8bebe5ac2098364ef1ee93b590abb5f4455b0f89	2021-10-20 09:37:03 -07:00
jiayisun	0b8dc0f04a	add BFloat16 operators on CPU: logaddexp, logaddexp2, remainder (#63621 ) Summary: Fixes #{issue number} Pull Request resolved: https://github.com/pytorch/pytorch/pull/63621 Reviewed By: H-Huang Differential Revision: D31640811 Pulled By: mruberry fbshipit-source-id: 1fd061b65c196398738018eefc52bf459e424b1c	2021-10-15 13:11:45 -07:00
Freey0	2223737da9	restore test_inplace_comparison_ops_require_inputs_have_same_dtype Expected behavior (#64267 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64267 This test expects every operation to throw a runtime error. Also reinserts the in-place operation test and fixes a bug in the comparison operations. Fixes #64018 Test Plan: Imported from OSS Reviewed By: gchanan Differential Revision: D30720915 Pulled By: ezyang fbshipit-source-id: 215a6556d20770f70f4ced1c1f9a9753933f1d37	2021-09-08 06:42:12 -07:00
Kevin Tse	7e4ebe06ca	Fixes issue related torch.trapezoid broadcasting behavior and documentation (#64054 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64054 Fixes #63608 cc mruberry rgommers heitorschueroff Test Plan: Imported from OSS Reviewed By: saketh-are Differential Revision: D30617078 Pulled By: NivekT fbshipit-source-id: 815896ec56d447562790df4d662e94fd13457e2a	2021-09-07 11:41:55 -07:00
Philip Meier	26b7ff5aea	deprecate dtype getters from `torch.testing` namespace (#63554 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63554 Following https://github.com/pytorch/pytorch/pull/61840#issuecomment-884087809, this deprecates all the dtype getters publicly exposed in the `torch.testing` namespace. The reason for this is twofold: 1. If someone is not familiar with the C++ dispatch macros PyTorch uses, the names are misleading. For example `torch.testing.floating_types()` will only give you `float32` and `float64`, skipping `float16` and `bfloat16`. 2. The dtype getters provide very minimal functionality that can be easily emulated by downstream libraries. We thought about [providing a replacement](https://gist.github.com/pmeier/3dfd2e105842ad0de4505068a1a0270a), but ultimately decided against it. The major problem is BC: by keeping it, either the namespace is getting messy again after a new dtype is added or we need to somehow version the return values of the getters. Test Plan: Imported from OSS Reviewed By: H-Huang Differential Revision: D30662206 Pulled By: mruberry fbshipit-source-id: a2bdb10ab02ae665df1b5b76e8afa9af043bbf56	2021-09-07 08:58:51 -07:00
Saketh Are	83e28a7d28	Use stacklevel for floordiv deprecation warnings (#64034 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/60548 `Tensor.__floordiv__` was indirectly deprecated by deprecation of `torch.floor_divide` (see https://github.com/pytorch/pytorch/issues/43874). Deprecating it directly provides clearer feedback. Repro: ``` import torch x = torch.tensor(0) x // 1 ``` Before this change, a deprecation warning was triggered within the C++ implementation of floor_divide: ``` UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at ../aten/src/ATen/native/BinaryOps.cpp:571.) return torch.floor_divide(self, other) ``` After this change, the warning instead cites the user's offending line of Python code: ``` UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). x // 1 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/64034 Reviewed By: mruberry Differential Revision: D30658010 Pulled By: saketh-are fbshipit-source-id: b0e6c5008d741897509d102f4a89efb47de4aa2a	2021-08-31 11:27:56 -07:00
Kushashwa Ravi Shrimali	d37636901e	[Doc] `make_tensor` to `torch.testing` module (#63925 ) Summary: This PR aims to add `make_tensor` to the `torch.testing` module in PyTorch docs. TODOs: * [x] Add examples cc: pmeier mruberry brianjo Pull Request resolved: https://github.com/pytorch/pytorch/pull/63925 Reviewed By: ngimel Differential Revision: D30633487 Pulled By: mruberry fbshipit-source-id: 8e5a1f880c6ece5925b4039fee8122bd739538af	2021-08-30 12:25:40 -07:00
Philip Meier	70a3210eca	Add `BinaryUfuncOpInfo` and broadcasting tests (#61964 ) Summary: As proof of concept, this PR uses the new `BinaryUfuncOpInfo` in broadcasting tests for `add`, `sub`, `mul`, `div`, `floor_div`, and `true_div`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/61964 Reviewed By: ngimel Differential Revision: D30407734 Pulled By: mruberry fbshipit-source-id: ada28994f43b0635f279f45a02ecba18bc8ee033	2021-08-20 11:44:15 -07:00
Shen Li	1022443168	Revert D30279364: [codemod][lint][fbcode/c*] Enable BLACK by default Test Plan: revert-hammer Differential Revision: D30279364 (`b004307252`) Original commit changeset: c1ed77dfe43a fbshipit-source-id: eab50857675c51e0088391af06ec0ecb14e2347e	2021-08-12 11:45:01 -07:00
Zsolt Dollenstein	b004307252	[codemod][lint][fbcode/c*] Enable BLACK by default Test Plan: manual inspection & sandcastle Reviewed By: zertosh Differential Revision: D30279364 fbshipit-source-id: c1ed77dfe43a3bde358f92737cd5535ae5d13c9a	2021-08-12 10:58:35 -07:00
Kevin Tse	87465a6e68	adding operator cumulative_trapezoid (#61615 ) Summary: Stack from [ghstack](https://github.com/ezyang/ghstack): * https://github.com/pytorch/pytorch/issues/61616 * https://github.com/pytorch/pytorch/issues/61615 * https://github.com/pytorch/pytorch/issues/61475 Pull Request resolved: https://github.com/pytorch/pytorch/pull/61615 Reviewed By: malfet, mruberry Differential Revision: D29975064 Pulled By: NivekT fbshipit-source-id: 4d4e98f3efb720fdc44eb238ecbf0fa157ac13d7	2021-08-03 08:04:00 -07:00
kshitij12345	fd8004b42e	add bfloat16 impl for nextafter (#61829 ) Summary: Add `BFloat16` support for `nextafter`. * [x] Add OpInfo * [x] Add Implementation Test (C++ tests) * [x] Add credit Pull Request resolved: https://github.com/pytorch/pytorch/pull/61829 Reviewed By: ejguan Differential Revision: D29932498 Pulled By: mruberry fbshipit-source-id: 89524531a4800569ba1addd08a4ace330a6f72a4	2021-08-02 23:16:58 -07:00
Freey0	109bd5e78a	OpInfo: bitwise_and (#61349 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61349 Also add type promotion test for bugs found by pr #60813 Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D29592840 Pulled By: ezyang fbshipit-source-id: ee013b20e31baf6c6ebf2edb881ae6d8e215c7a6	2021-07-22 07:04:17 -07:00
Nikita Shulga	604f503d30	Revert D29794958 + compilation fix (#61937 ) Summary: This PR un-reverts https://github.com/pytorch/pytorch/issues/61475 + fixes compilation with MSVC, that does not recognize alternative operator spellings (i.e. using `or` instead of `||` ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/61937 Reviewed By: albanD Differential Revision: D29805941 Pulled By: malfet fbshipit-source-id: 01e5963c6717c1b44b260300d87ba0bf57f26ce9	2021-07-20 18:14:45 -07:00
Nikita Shulga	22fff61f06	Revert D29794958: [pytorch][PR] changing trapz to trapezoid Test Plan: revert-hammer Differential Revision: D29794958 (`95cec8f4fa`) Original commit changeset: 60b9c07efd47 fbshipit-source-id: 2dcda2d62e01c2521a86ae5ed8246cfb686d3f64	2021-07-20 16:00:46 -07:00
Kevin Tse	95cec8f4fa	changing trapz to trapezoid (#61475 ) Summary: This PR resolves issue https://github.com/pytorch/pytorch/issues/52606 while also adding support for complex number Stack from [ghstack](https://github.com/ezyang/ghstack): * https://github.com/pytorch/pytorch/issues/61616 * https://github.com/pytorch/pytorch/issues/61615 * https://github.com/pytorch/pytorch/issues/61475 Pull Request resolved: https://github.com/pytorch/pytorch/pull/61475 Reviewed By: mruberry Differential Revision: D29794958 Pulled By: NivekT fbshipit-source-id: 60b9c07efd47fd85b9c8178768fc7828d7b57d29	2021-07-20 15:25:55 -07:00
Akifumi Imanishi	4d9fd8958b	Support `__rand__`, `__ror__` and `__rxor__` (#59240 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/58120. This PR implements `torch.Tensor.{__rand__/__ror__/__rxor__}` for the compatibility with NumPy’s interface. (cc: mruberry, rgommers, emcastillo, kmaehashi) Pull Request resolved: https://github.com/pytorch/pytorch/pull/59240 Reviewed By: ngimel Differential Revision: D29482304 Pulled By: mruberry fbshipit-source-id: 13789202c1d8dddf8658a45381aeedcc31e2f603	2021-07-07 13:34:14 -07:00
Joel Schlosser	03b5a225a7	Test parametrization for instantiated device-specific tests (#60233 ) Summary: The `ops` decorator provides a way to parameterize a test across a given list of ops. This would be useful for modules as well (e.g. a `modules` decorator), but the mechanism by which this is accomplished is specific to ops. In the details, the `ops` decorator tags a test function with the metadata needed (list of ops, `dtypes`) and the actual tests are generated according to this metadata during the call to `instantiate_device_type_tests()`. This PR makes this mechanism more generic, allowing for test parameterization across arbitrary dimensions. This makes a `modules` decorator (or any similar type of decorator) straightforward to implement without changes to the device-specific test instantiation logic. One caveat is that, since this is implemented where the old `ops` decorator was (within `instantiate_device_type_tests()`), this only works for tests instantiated using the device-specific instantiation logic. Longer term, even device-specific test instantiation could be treated as an optional parameterization across device types, but this PR takes a low-risk approach for now. In practice, this just means that a `device` kwarg is required for all test signatures used with the mechanism. The `ops` decorator has been refactored to use the generic mechanism and works the same as before, with one difference: when `OpDTypes.none` is specified, the test signature no longer needs an unused `dtype` kwarg. This is a nice bonus that demonstrates the added flexibility of a generic parameterization mechanism. The refactored form also has the bonus that all op-specific test generation logic is contained within the `ops` decorator class, improving readability. Behind the scenes, the generic mechanism is a base decorator class (`_TestParameterizer`) from which `ops` derives. 
The core functionality is in the `_parameterize_test()` method, which takes in a test function and returns a generator that produces parameterized tests, including names and parameter kwargs to pass to them. Using the `ops` decorator results in a set of op-specific tests from a given generic test. Pull Request resolved: https://github.com/pytorch/pytorch/pull/60233 Reviewed By: iramazanli Differential Revision: D29494995 Pulled By: jbschlosser fbshipit-source-id: a14446488c106094fafcaa75ccf8e9e3faf33bfc	2021-06-30 18:50:22 -07:00
kshitij12345	dfd2edc025	[special] add zeta (#59623 ) Summary: Reference https://github.com/pytorch/pytorch/issues/50345 `zeta` was already present in the codebase to support computation of `polygamma`. However, `zeta` only had a `double(double, double)` signature for CPU before the PR (which meant that `polygamma` computations were always upcast to `double` for the zeta part). With this PR, float computations will take place in float and double in double. Have also refactored the code and moved the duplicate code from `Math.cuh` to `Math.h` Note: For scipy, q is optional, and if it is `None`, it defaults to `1`, which corresponds to the Riemann zeta function. However, for `torch.special.zeta`, I made it mandatory because it feels odd to me that without `q` this is the Riemann zeta function and with `q` it is the general Hurwitz zeta function. I think sticking to just the general form made more sense, as passing `1` for q sounds trivial. Verify: * [x] Docs https://14234587-65600975-gh.circle-artifacts.com/0/docs/special.html#torch.special.zeta Pull Request resolved: https://github.com/pytorch/pytorch/pull/59623 Reviewed By: ngimel Differential Revision: D29348269 Pulled By: mruberry fbshipit-source-id: a3f9ebe1f7724dbe66de2b391afb9da1cfc3e4bb	2021-06-24 00:00:12 -07:00
Akifumi Imanishi	26cdec6ce4	Support `torch.bitwise_{left/right}_shift` and `__rlshift__`, `__rrshift__` (#59544 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/58121 This PR implements `torch.bitwise_left_shift` and `torch.bitwise_right_shift` and `torch.Tensor.{__rlshift__/__rrshift__}`for compatibility with Python array API standard. (cc: mruberry, rgommers, emcastillo, kmaehashi) Pull Request resolved: https://github.com/pytorch/pytorch/pull/59544 Reviewed By: ngimel Differential Revision: D29348869 Pulled By: mruberry fbshipit-source-id: 329aee296cf890735e8a9f858bccfe87c03d06ca	2021-06-23 23:57:16 -07:00
Masaki Kozuki	9e773ea7d5	Use `accscalar_t` for CUDA add/sub with Tensor and Scalar (#60454 ) Summary: Follow up of https://github.com/pytorch/pytorch/issues/60227, related to https://github.com/pytorch/pytorch/issues/59907 & https://github.com/pytorch/pytorch/issues/58833 With this pull request, `torch.add` & `torch.sub` use `acc_type` for `Scalar` if either of two arguments is `Scalar`. This mimics the behavior of [`torch.mul`](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cuda/BinaryMulDivKernel.cu#L18), `torch._foreach_(add|sub).Scalar` and `torch._foreach_(add|sub).ScalarList`. --- reference - torch.mul CUDA kernel: `b0c9762e2d/aten/src/ATen/native/cuda/BinaryMulDivKernel.cu (L17-L25)` - `torch._foreach_(add|sub).Scalar`: cast scalar `b0c9762e2d/aten/src/ATen/native/cuda/ForeachBinaryOpScalar.cu (L27)` - `torch._foreach_(add|sub).ScalarList`: `BinaryOpScalarListFunctor` `b0c9762e2d/aten/src/ATen/native/cuda/ForeachFunctors.cuh (L180-L182)` and multi_tensor_apply handles `scalar_t` and computes `opmath_t` (almost equivalent to `accscalar_t`) `b0c9762e2d/aten/src/ATen/native/cuda/MultiTensorApply.cuh (L60-L68)`. BinaryOpScalarListFunctor is used `b0c9762e2d/aten/src/ATen/native/cuda/ForeachBinaryOpScalarList.cu (L24)` cc ngimel ptrblck mcarilli Pull Request resolved: https://github.com/pytorch/pytorch/pull/60454 Reviewed By: VitalyFedyunin Differential Revision: D29345035 Pulled By: ngimel fbshipit-source-id: 5dbafbdfe029a9544ec2e58f17d547928e017a04	2021-06-23 18:59:22 -07:00
Philip Meier	0c916c8a4e	up the priority of numpy array comparisons in self.assertEqual (#59067 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/58988. Pull Request resolved: https://github.com/pytorch/pytorch/pull/59067 Reviewed By: jbschlosser Differential Revision: D28986642 Pulled By: heitorschueroff fbshipit-source-id: 3ef2d26b4010fc3519d0a1a020ea446ffeb46ba0	2021-06-22 13:07:07 -07:00
Masaki Kozuki	b298013cd5	[add/sub] Cast `alpha` to `acc_type` (#60227 ) Summary: This PR lets `torch.add` & `torch.sub` CUDA kernels cast `alpha` to `acc_type`, not `scalar_t`. I do not remove `cast`s from `test/test_foreach.py` because I'll do this in https://github.com/pytorch/pytorch/issues/59907 or a follow-up for it. Current upstream `torch._foreach_add` & `torch._foreach_sub` upcast the `alpha` parameter to `acc_type<scalar_t>` while `torch.add` & `torch.sub` do not. This is kind of problematic because the outputs of `torch.add` and `torch.sub` are different from those of `torch._foreach_add` and `torch._foreach_sub`, respectively, if the dtype of the input tensors is either `torch.half` or `torch.bfloat16`. The discrepancy is proportional-ish to `abs(alpha)` except when `alpha` is representable with 16 bits. ref: - `torch._foreach_add` & `torch._foreach_sub` cast `alpha`: `6d0fb85a62/aten/src/ATen/native/cuda/ForeachBinaryOpList.cu (L21-L28)`, `BinaryOpListAlphaFunctor` is defined here: `6d0fb85a62/aten/src/ATen/native/cuda/ForeachFunctors.cuh (L202)` related: https://github.com/pytorch/pytorch/issues/58833, https://github.com/pytorch/pytorch/pull/59907 cc ngimel ptrblck mcarilli Pull Request resolved: https://github.com/pytorch/pytorch/pull/60227 Reviewed By: mruberry Differential Revision: D29252759 Pulled By: ngimel fbshipit-source-id: 847f3b9493ae30a900f7445af00aef1abcc1ab21	2021-06-20 19:05:22 -07:00
Akifumi Imanishi	0a5bfa9919	Support `__rmod__` (#58476 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/58035. This PR implements `torch.Tensor.__rmod__` and `torch.remainder(scalar, tensor)` for the compatibility with NumPy’s interface. (cc: mruberry, rgommers, emcastillo, kmaehashi) TODO: - [x] Update `tensor_binary_op` in test/test_binary_ufuncs.py after https://github.com/pytorch/pytorch/issues/58216 is merged. Pull Request resolved: https://github.com/pytorch/pytorch/pull/58476 Reviewed By: ngimel Differential Revision: D28776810 Pulled By: mruberry fbshipit-source-id: 74f8aea80f439ef2cc370333524e39971eeb7bf4	2021-06-05 16:19:24 -07:00
Akifumi Imanishi	3113a1de4a	Fix some tensor operators to return `NotImplemented` for invalid inputs (#58216 ) Summary: Same as https://github.com/pytorch/pytorch/issues/57934. (cc/ albanD) Pull Request resolved: https://github.com/pytorch/pytorch/pull/58216 Reviewed By: ailzhang Differential Revision: D28494886 Pulled By: albanD fbshipit-source-id: 380205867ee1cde90e1c6fcfe2a31749e1243530	2021-05-19 13:09:57 -07:00
Xue Haotian	098d9975a7	Port heaviside to structured kernel (#57933 ) Summary: Port heaviside to structured kernel Related https://github.com/pytorch/pytorch/issues/55070 Pull Request resolved: https://github.com/pytorch/pytorch/pull/57933 Reviewed By: mruberry Differential Revision: D28362533 Pulled By: ezyang fbshipit-source-id: 96b4591db3f609434784bd0ef9e54c61c918fb88	2021-05-13 10:48:11 -07:00
Alban Desmaison	5e83c62a9e	Revert D28351931: [pytorch][PR] Fix some tensor operators to return `NotImplemented` for invalid inputs Test Plan: revert-hammer Differential Revision: D28351931 (`35521a2629`) Original commit changeset: 985457a44dba fbshipit-source-id: 10724c219e53648f10a70719e25bcf774c6c7852	2021-05-12 13:58:03 -07:00
Akifumi Imanishi	35521a2629	Fix some tensor operators to return `NotImplemented` for invalid inputs (#57934 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/57719. This PR fixes `torch.Tensor{__rsub__, __rdiv__, __rtruediv__, __pow__, __rmatmul__}` to return `NotImplemented` instead of raising a `TypeError`. cc/ mruberry: The first commit of this PR is the same as `1d209db1cc` except for the commit message. Pull Request resolved: https://github.com/pytorch/pytorch/pull/57934 Reviewed By: mruberry Differential Revision: D28351931 Pulled By: albanD fbshipit-source-id: 985457a44dba24d2496794dfb8c1661cbcd4ff8f	2021-05-12 11:03:23 -07:00
Akifumi Imanishi	14282232d9	Fix `generate_not_implemented_tests` not testing unknown types correctly (#56997 ) Summary: Currently, the test code is not testing unknown types correctly because `op` is overwritten in the for-loop (i.e., currently only `__ior__` is tested). This PR fixes the test `generate_not_implemented_tests` to bind the operator name to each method, and removes operators that are currently unsupported (`__rand__`, …). cc/ mruberry This fix is needed to add tests for the operators we are going to introduce (e.g., `__rand__`) Pull Request resolved: https://github.com/pytorch/pytorch/pull/56997 Reviewed By: astaff Differential Revision: D28118465 Pulled By: mruberry fbshipit-source-id: c5a466a7604262ed5490862300d47043aff63d0b	2021-05-09 05:34:10 -07:00
Wenlei Xie	20085f6d23	Support auto generation of device check (#56872 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56872 ghstack-source-id: 127914018 Test Plan: auto test Reviewed By: ezyang Differential Revision: D27986429 fbshipit-source-id: 0da8413b0b8e6810fcea27ed1de499f11f68bd1f	2021-05-01 12:02:09 -07:00
kshitij12345	d4ddb47719	[special] Add `xlog1py` (#55138 ) Summary: Reference : https://github.com/pytorch/pytorch/issues/50345 * [x] Check Rendered Document (https://12494173-65600975-gh.circle-artifacts.com/0/docs/special.html#torch.special.xlog1py) * [x] Tests in Binary Ufunc * [x] OpInfo * [x] Structured Kernel Pull Request resolved: https://github.com/pytorch/pytorch/pull/55138 Reviewed By: ngimel Differential Revision: D27961461 Pulled By: mruberry fbshipit-source-id: 30a8f41970a829bf50254aadf5615e8ce4148c7e	2021-04-30 05:51:13 -07:00
Peter Bell	5536cda19a	Update floor_divide behavior in line with NumPy 1.20 (#56893 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56893 Fixes gh-56814 Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D28025814 Pulled By: mruberry fbshipit-source-id: 8654978ea1d5aa7c12bcf5a8c939966287a2d34e	2021-04-28 05:01:23 -07:00
Brian Hirsh	e8faf69739	fix torch.pow type promotion issue (#54085 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54085 Fixes https://github.com/pytorch/pytorch/issues/50121. This fixes two similar issues pointed out with the dtype in which `torch.pow` performs its computation. Thanks ngimel for spotting the issues originally (comments [here](https://github.com/pytorch/pytorch/pull/53669#discussion_r594624355) and [here](https://github.com/pytorch/pytorch/pull/53669#discussion_r594719704))! Before: ``` >>> torch.pow(2, torch.tensor([17], dtype=torch.uint8), out=torch.tensor([0])) tensor([0]) >>> torch.pow(2, torch.tensor(17, dtype=torch.uint8), out=torch.tensor(0)) tensor(131072) >>> torch.pow(2, torch.tensor([17], dtype=torch.uint8, device='cuda'), out=torch.tensor([0], device='cuda')) tensor([131072], device='cuda:0') >>> torch.pow(2, torch.tensor(17, dtype=torch.uint8, device='cuda'), out=torch.tensor(0, device='cuda')) tensor(131072, device='cuda:0') ``` After: ``` >>> torch.pow(2, torch.tensor([17], dtype=torch.uint8), out=torch.tensor([0])) tensor([0]) >>> torch.pow(2, torch.tensor(17, dtype=torch.uint8), out=torch.tensor(0)) tensor(0) >>> torch.pow(2, torch.tensor([17], dtype=torch.uint8, device='cuda'), out=torch.tensor([0], device='cuda')) tensor([0], device='cuda:0') >>> torch.pow(2, torch.tensor(17, dtype=torch.uint8, device='cuda'), out=torch.tensor(0, device='cuda')) tensor(0, device='cuda:0') ``` In all four cases above, `tensor(0, ...)` is the correct value because the computed "common dtype" among the inputs is expected to be `uint8`. Computing `2 ** 17` in uint8 will then overflow to zero. Finally, we cast the computed output to the output tensor's dtype, which is `int64`. 
There were two separate issues fixed in this PR: one for cpu and one for cuda: * For CPU, The `pow(Scalar, Tensor)` overload wasn't calling `set_wrapped_number(true)` after wrapping the scalar in a Tensor, which caused the "promoted" scalar to incorrectly participate in type promotion (see the documented behavior [here](`aa8714dfed/c10/core/TensorImpl.h (L590)`)) * For CUDA, the cuda kernels defined in `PowKernel.cu` were using the output's dtype to run the computation, instead of the common dtype. As an aside: The CPU and CUDA kernels actually both use `iter.dtype()` instead of `iter.common_dtype()` to run the computation, which I fixed. The reason that only manifested here for CUDA is because TensorIterator has cpu-specific logic to create temporary outputs with the intermediate dtype (shown [here](`aa8714dfed/aten/src/ATen/TensorIterator.cpp (L349)`)). I'm not sure what the end state is there- I can imagine that being something we're more okay doing for cpu than for cuda, but it also leads to hard-to-track-down inconsistencies between the two like in this case. Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D27096330 Pulled By: bdhirsh fbshipit-source-id: a7e2909243851625cb3056d1e7abb2383bfe95f2	2021-04-15 08:55:53 -07:00
Winston Smith	aceceb3d5c	Reland #50999 (Added pow() on CPU for float16 & bfloat16) (#55280 ) Summary: #### Reason for relanding Line 1607 of `torch/testing/_internal/common_methods_invocations.py` of https://github.com/pytorch/pytorch/issues/50999 had `dtype` instead of `dtype=torch.bool`, so 4 of the 9 sample inputs for `bool` had incorrect dtype. This bug was caught by https://github.com/pytorch/pytorch/issues/54949. 1. Added support for pow() on CPU for `float16` (`Half`) and `bfloat16` types. Both `pow(Tensor, Scalar)` and `pow(Tensor, Tensor)` are now supported for the aforementioned types. However autograd isn't supported for `Float16` on CPU yet, as `log_vml_cpu` can't be enabled for it. 2. heitorschueroff added `pow_tensor_scalar_optimized_kernel` to refactor & simplify `PowKernel.cpp`. It provides a common path for all the complex types & floating point types (except Float16, due to lack of complete AVX2 vectorization support for it). It replaced code that had previously been duplicated for (float, double) and complex types, so PowKernel.cpp looks a lot cleaner now. 3. Enabled (unskipped) some tests for `erf`, `erfc`,`erfinv`, `tan` and `linalg.vector.norm` which were being skipped earlier due to `pow()` not having been implemented for `float16` & `bfloat16`. 4. Added an OpInfo for `pow()` & enabled some test cases for `pow()`. 5. Extended the coverage of existing tests for `pow` in `test_binary_ufuncs.py` in order to enable comparison with `numpy`, even with discontiguous tensors, and added a test to ensure that a runtime error is raised for `pow`'s inplace variant if resizing the base tensor is required during its invocation. 6. Added `float16` & `bfloat16` to `square`'s dtype lists in its `UnaryUfuncInfo`. 7. Removed redundant `dtypesIfCPU` and `dtypesIfCUDA` from `OpInfo`s where they are equal to `dtypes`. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55280 Reviewed By: jbschlosser Differential Revision: D27591772 Pulled By: heitorschueroff fbshipit-source-id: c7420811b32595bb3353149a61e54a73f2eb352b	2021-04-13 13:23:29 -07:00
Richard Zou	1e70d217e7	Add error message for complex alpha and non-complex inputs (#54964 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54964 Previously, the following would fail with a confusing error message: ``` import torch x=torch.randn(2) torch.rsub(x, 1, alpha=2j) Traceback (most recent call last) <ipython-input-2-caf2a1c03d0b> in <module> 1 import torch 2 x=torch.randn(2) ----> 3 torch.rsub(x, 1, alpha=2j) RuntimeError: value cannot be converted to type float without overflow: (-0,-2) ``` This happens because the alpha check doesn't cover the case where `x` is not complex but `alpha` is. The error gets thrown further along in the implementation of torch.sub, when it coerces `alpha` to the same dtype as the input tensor: https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cpu/BinaryOpsKernel.cpp#L53 This PR fixes the bad error message by adding that case to the alpha check. Test Plan: - pytest test/test_binary_ufuncs.py - NB: add, sub, and rsub all share the same alpha check. The test only tests it for torch.add, but that should be sufficient. Reviewed By: gchanan Differential Revision: D27504017 Pulled By: zou3519 fbshipit-source-id: 70b9aa75a7a4faaaa93f6ba235cae85998a91697	2021-04-07 14:12:34 -07:00
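The shape of such an up-front check can be sketched in plain Python. This is only an illustration of the idea: `check_alpha`, its parameters, the exception type and the message wording are hypothetical and are not taken from the PyTorch sources.

```python
def check_alpha(input_is_complex, alpha):
    """Hypothetical validation step: reject a complex `alpha` early when the
    input tensor is not complex, instead of failing later during coercion."""
    if isinstance(alpha, complex) and not input_is_complex:
        # Illustrative message; the real wording may differ.
        raise ValueError(
            "alpha must not be a complex number for non-complex inputs"
        )


check_alpha(input_is_complex=True, alpha=2j)   # accepted
check_alpha(input_is_complex=False, alpha=2)   # accepted
try:
    check_alpha(input_is_complex=False, alpha=2j)
except ValueError as err:
    print("rejected:", err)
```

Failing at the check keeps the message about `alpha` itself, rather than about an overflow that only appears once `alpha` is cast to the input's dtype.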
Peter Bell	2ee02b30b1	Replace rounding_mode="true" with rounding_mode=None (#51988 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51988 * #51988 Replace rounding_mode="true" with rounding_mode=None Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D27561817 Pulled By: mruberry fbshipit-source-id: 60d1d9c389570f60d599fc1876518717367fb368	2021-04-05 14:53:43 -07:00
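For readers unfamiliar with the option, the three division behaviours that `rounding_mode` selects can be imitated on plain Python numbers. This is a scalar sketch only, assuming the modes are `None` (true division), `"trunc"` and `"floor"`; the helper `div` below is hypothetical and is not the `torch.div` implementation.

```python
import math


def div(a, b, rounding_mode=None):
    """Scalar stand-in for the three rounding behaviours (hypothetical helper)."""
    quotient = a / b
    if rounding_mode is None:
        return quotient                      # true division, no rounding
    if rounding_mode == "trunc":
        return float(math.trunc(quotient))   # round toward zero
    if rounding_mode == "floor":
        return float(math.floor(quotient))   # round toward negative infinity
    raise ValueError(f"unknown rounding_mode: {rounding_mode!r}")


print(div(-7, 2))                         # -3.5
print(div(-7, 2, rounding_mode="trunc"))  # -3.0
print(div(-7, 2, rounding_mode="floor"))  # -4.0
```

The modes differ only for negative quotients with a fractional part, which is why the example uses `-7 / 2`.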
Nikita Shulga	8377e6221a	Revert D27478225: [pytorch][PR] Added pow() on CPU for float16 & bfloat16 Test Plan: revert-hammer Differential Revision: D27478225 (`6d030c14cf`) Original commit changeset: d309dd98d5a9 fbshipit-source-id: e0518f15185b41946caf3a8456c7af3f52e5a910	2021-04-03 10:26:44 -07:00
Winston Smith	6d030c14cf	Added pow() on CPU for float16 & bfloat16 (#50999 ) Summary: Added the functionality desired in https://github.com/pytorch/pytorch/issues/50789. 1. Added support for pow() on CPU for `float16` (`Half`) and `bfloat16` types. Both `pow(Tensor, Scalar)` and `pow(Tensor, Tensor)` are now supported for the aforementioned types. However, autograd isn't supported for `Float16` on CPU yet, as `log_vml_cpu` can't be enabled for it. 2. heitorschueroff added `pow_tensor_scalar_optimized_kernel` to refactor & simplify `PowKernel.cpp`. It provides a common path for all the complex types & floating point types (except Float16, due to lack of complete AVX2 vectorization support for it). It replaced code that had previously been duplicated for (float, double) and complex types, so PowKernel.cpp looks a lot cleaner now. 3. Enabled (unskipped) some tests for `erf`, `erfc`, `erfinv`, `linalg.norm` and `linalg.vector_norm` which were being skipped earlier due to `pow()` not having been implemented for `float16` & `bfloat16`. 4. Added an OpInfo for `pow()` & enabled some test cases for `pow()`. 5. Extended the coverage of existing tests for `pow` in `test_binary_ufuncs.py` in order to enable comparison with `numpy`, even with discontiguous tensors, and added a test to ensure that a runtime error is raised for `pow`'s inplace variant if resizing the base tensor is required during its invocation. 6. Added `float16` & `bfloat16` to `square`'s dtype lists in its `UnaryUfuncInfo`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/50999 Reviewed By: zou3519 Differential Revision: D27478225 Pulled By: heitorschueroff fbshipit-source-id: d309dd98d5a96d0cb9b08281757bb1c65266d011	2021-04-02 15:57:06 -07:00
kshitij12345	bac566bf61	torch.square : OpInfo and minor fixes (#52551 ) Summary: Reference: https://github.com/pytorch/pytorch/issues/42515 Add `out` variant to be consistent with Unary Ops. Pull Request resolved: https://github.com/pytorch/pytorch/pull/52551 Reviewed By: heitorschueroff Differential Revision: D27233482 Pulled By: mruberry fbshipit-source-id: fef6f241849a12c46028bd1aad8f5ecc1dc65ea1	2021-03-24 00:04:42 -07:00
kshitij12345	b93ab10b7a	torch.lerp: cuda complex support (#54129 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/54048 TODO * [x] Add test Pull Request resolved: https://github.com/pytorch/pytorch/pull/54129 Reviewed By: bdhirsh Differential Revision: D27261878 Pulled By: anjali411 fbshipit-source-id: 10937a2eab944c73b5a98ec6278f50a876b8c7dc	2021-03-23 19:58:43 -07:00
Brian Hirsh	779cae9e42	port at::pow to structured (#53669 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53669 This PR does two things: * Ports `pow` to be structured * Fixes a bug with how pow handles mixed cpu and cuda tensors. Bug fix: Pow is a binary op, and all binary ops that use TensorIterator are currently written to handle the case when one of the inputs is a CUDA tensor, and the other is a zero-dimensional cpu tensor. `pow` incidentally only handles one of the two cases: it fails when the CUDA tensor is passed as the exponent, e.g. `at::pow(torch.tensor(2.0, device='cpu'), torch.tensor([2, 2], device='cuda'))`. Porting `pow` to structured happened to change the error that was raised from a `TORCH_CHECK` in TensorIterator to an `INTERNAL_ASSERT` in loop.cuh, so I ended up trying to fix the error and update the tests. I added more details in a comment on the PR. Notes on the structured port: Pow is a little weird, so I wrote down a couple of issues I noticed during the port: * Multiple independent overloads. `pow` has two overloads that have their own cpu/cuda kernels, meaning one doesn't call the other. I had to update the names of the kernel overloads to make the compiler happy, since the codegen would otherwise try to generate two classes with the same name. `pow` actually has 3 overloads that all have `out` variants, so I ported all 3 to structured; one of them just happens to redispatch to one of the others in most cases. * Name propagation. Is name propagation implemented per operator? Or is it expected to work for most/all ops by default? Right now it looks like it happens for TensorIterator ops by default. For ops that don't use TensorIterator, we need to explicitly pass the names through to the `set_output()` call in the meta function. This happened to matter for `pow` because it has 3 overloads, but only two of them directly use TensorIterator.
I had to pass names directly to `set_output` in the 3rd overload to make tests happy. * Lack of `const Tensor &` in the C++ API. It's a goal to slowly make all `Tensor &` arguments const as part of the structured port, but in this case I needed to explicitly cast constness away because one structured kernel called back into the C++ API, which still has ordinary `Tensor &` arguments. This probably isn't something we'll fix soon, since we have boxing logic that actually relies on the `Tensor &` / `const Tensor &` distinction in some places. Test Plan: Imported from OSS Reviewed By: ezyang Differential Revision: D27029821 Pulled By: bdhirsh fbshipit-source-id: c1786e770de6e6c2474b9a48210b88057ab1018e	2021-03-19 14:30:48 -07:00
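The mixed-device rule described in the bug-fix part of this message (a CUDA tensor combined with a zero-dimensional CPU tensor, in either argument position) can be sketched as a small device-selection function. This is a toy model in plain Python and not TensorIterator's logic: `common_device` and its `(device, ndim)` operand representation are hypothetical.

```python
def common_device(operands):
    """Pick the device for a binary op from (device, ndim) pairs.

    Toy rule: a zero-dimensional CPU tensor may be combined with CUDA
    tensors, regardless of which argument position it occupies.
    """
    cuda_devices = {dev for dev, _ in operands if dev.startswith("cuda")}
    if not cuda_devices:
        return "cpu"
    if len(cuda_devices) > 1:
        raise RuntimeError("operands are on different CUDA devices")
    for dev, ndim in operands:
        if dev == "cpu" and ndim != 0:
            raise RuntimeError(
                "only zero-dimensional CPU tensors may mix with CUDA tensors"
            )
    return next(iter(cuda_devices))


# The result must not depend on whether the CPU scalar is the base or the exponent.
print(common_device([("cpu", 0), ("cuda:0", 1)]))  # cuda:0
print(common_device([("cuda:0", 1), ("cpu", 0)]))  # cuda:0
```

The point of the sketch is the symmetry: the check iterates over all operands instead of special-casing one argument position, which is the property the commit says `pow` was missing.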


68 commits