pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-14 20:57:59 +00:00

Author	SHA1	Message	Date
Aleksei Nikiforov	9174d14551	Don't install remaining caffe2 python files (#129067 ) It is assumed that they are no longer needed. And keeping their installation as is breaks "python setup.py develop --user" workflow when non-root user is used. This change is follow up for `3d617333e7` Pull Request resolved: https://github.com/pytorch/pytorch/pull/129067 Approved by: https://github.com/cyyever, https://github.com/r-barnes	2024-06-27 17:25:59 +00:00
cyy	e4c32d14a8	[3/N] Remove inclusion of c10/util/string_utils.h (#128504 ) Follows #128372 Pull Request resolved: https://github.com/pytorch/pytorch/pull/128504 Approved by: https://github.com/malfet	2024-06-15 06:38:40 +00:00
cyy	3008644297	[Caffe2] Remove remaining unused perfkernels (#128477 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/128477 Approved by: https://github.com/ezyang, https://github.com/r-barnes	2024-06-12 22:19:36 +00:00
cyy	059cae6176	[Caffe2] Remove Caffe2 proto and other files (#127655 ) Remove Caffe2 proto files altogether. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127655 Approved by: https://github.com/ezyang	2024-06-04 14:22:21 +00:00
cyy	a6bae1f6db	Remove more caffe2 files (#127511 ) Remove more caffe2 files. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127511 Approved by: https://github.com/r-barnes	2024-05-31 11:26:27 +00:00
Richard Barnes	cc6e72d882	Drop caffe2 core tests and some other stuff (#127089 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/127089 Approved by: https://github.com/Skylion007	2024-05-29 17:11:45 +00:00
Richard Barnes	af69a52f06	Reapply "Remove more of caffe2 (#126705 )" (#127317 ) This reverts commit `00fe0a0d79`. Originally was unnecessarily reverted by an oncall. Landing again. Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/127317 Approved by: https://github.com/izaitsevfb	2024-05-29 12:20:25 +00:00
Richard Barnes	1be7e4086a	Drop caffe2 nomnigraph (#127086 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/127086 Approved by: https://github.com/Skylion007	2024-05-28 23:20:46 +00:00
PyTorch MergeBot	00fe0a0d79	Revert "Remove more of caffe2 (#126705 )" This reverts commit `f95dbc1276`. Reverted https://github.com/pytorch/pytorch/pull/126705 on behalf of https://github.com/DanilBaibak due to Break internal build ([comment](https://github.com/pytorch/pytorch/pull/126705#issuecomment-2133325449))	2024-05-27 11:59:14 +00:00
Richard Barnes	f95dbc1276	Remove more of caffe2 (#126705 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/126705 Approved by: https://github.com/malfet	2024-05-24 06:53:08 +00:00
Richard Barnes	bbe68a16b9	[codemod][lowrisk] Remove extra semi colon from caffe2/caffe2/core/observer.h (#126976 ) Summary: `-Wextra-semi` or `-Wextra-semi-stmt` If the code compiles, this is safe to land. Test Plan: Sandcastle Reviewed By: palmje Differential Revision: D57632765 Pull Request resolved: https://github.com/pytorch/pytorch/pull/126976 Approved by: https://github.com/Skylion007	2024-05-23 17:31:19 +00:00
cyy	574ae9afb8	[Submodule] Remove third-party onnx-tensorrt (#126542 ) It seems that tensorrt is not used by the C++ code, may be due to the removal of Caffe2. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126542 Approved by: https://github.com/ezyang	2024-05-19 22:34:24 +00:00
Richard Barnes	ed327876f5	[codemod] `c10:optional` -> `std::optional` (#126135 ) Generated by running the following from PyTorch root: ``` find . -regex ".*\.$cpp\\|h\\|cu\\|hpp\\|cc\\|cxx$$" \| grep -v "build/" \| xargs -n 50 -P 4 perl -pi -e 's/c10::optional/std::optional/' ``` `c10::optional` is just an alias for `std::optional`. This removes usages of that alias in preparation for eliminating it entirely. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126135 Approved by: https://github.com/Skylion007, https://github.com/malfet, https://github.com/albanD, https://github.com/aaronenyeshi	2024-05-14 19:35:51 +00:00
Richard Barnes	b9e7b35912	Remove caffe2 from more build files (#125898 ) Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/125898 Approved by: https://github.com/Skylion007	2024-05-13 18:37:59 +00:00
Jeff Daily	ae9a4fa63c	[ROCm] enforce ROCM_VERSION >= 6.0 (#125646 ) Remove any code relying on ROCM_VERSION < 6.0. Pull Request resolved: https://github.com/pytorch/pytorch/pull/125646 Approved by: https://github.com/albanD, https://github.com/eqy	2024-05-12 18:01:28 +00:00
Eddie Yan	967dd31621	[cuDNN] Cleanup cuDNN < 8.1 ifdefs (#120862 ) Follow-up of #95722 Pull Request resolved: https://github.com/pytorch/pytorch/pull/120862 Approved by: https://github.com/Skylion007	2024-03-07 01:46:25 +00:00
cyy	507611f9ae	[CUDACachingAllocator] Turn Allocator::allocate into non-const (#120969 ) Ideally, the method should be non-const since it changes the allocator state. Some const_casts are also removed in the way. Pull Request resolved: https://github.com/pytorch/pytorch/pull/120969 Approved by: https://github.com/albanD	2024-03-05 09:53:05 +00:00
PyTorch MergeBot	a9d9077f12	Revert "Increased compile time max GPUs to 512. Switched to int16_t DeviceIndex. (#119639 )" This reverts commit `7c556428c7`. Reverted https://github.com/pytorch/pytorch/pull/119639 on behalf of https://github.com/kit1980 due to breaking internal builds, see D54286923 ([comment](https://github.com/pytorch/pytorch/pull/119639#issuecomment-1969634480))	2024-02-28 18:57:09 +00:00
Tobias Ringwald	7c556428c7	Increased compile time max GPUs to 512. Switched to int16_t DeviceIndex. (#119639 ) Fixes #115331. This PR increases the number of valid GPU devices to 512 (from 64) in order to future-proof PyTorch for providers that offer [single nodes with a large device count](https://www.tensorwave.com/). Until now, `DeviceIndex` was an `int8_t`, thus multiple changes were necessary: - `DeviceIndex` changed to `int16_t`. Updated consumers that assume it to be an `int8_t`. - Updated bounds checking for `torch.device()` in the Python frontend. Right now, we allow funny things like `torch.device('cpu', 200).index == -56`, which is undefined behavior. I inserted some checks to only allow values between 0 and `c10::Device::MAX_NUM_DEVICES - 1`. - Updated the `ArgumentInfo` struct as it hardcodes the device index as 8 bit field [^1]. Might be a breaking change, not sure if users rely on this. - Introduced `c10::Device::MAX_NUM_DEVICES` as a replacement for the old `C10_COMPILE_TIME_MAX_GPUS` [^1]: This field was unsigned, so I guess this has also been undef behavior the whole time? Our default device index is -1, so this always wrapped around to 255 when written to the `ArgumentInfo` struct. When I switched the `DeviceIndex` to `int16_t`, it actually stayed 255 after unpacking from `ArgumentInfo` again, as the `DeviceIndex` was now wide enough that it didn't wrap back to -1. Pull Request resolved: https://github.com/pytorch/pytorch/pull/119639 Approved by: https://github.com/cyyever, https://github.com/albanD, https://github.com/huydhn	2024-02-27 07:05:48 +00:00
PyTorch MergeBot	fff9d98e58	Revert "Increased compile time max GPUs to 512. Switched to int16_t DeviceIndex. (#119639 )" This reverts commit `e0268821dd`. Reverted https://github.com/pytorch/pytorch/pull/119639 on behalf of https://github.com/huydhn due to Sorry for reverting your change but I think the Window failures are legit as they are failing now in trunk, i.e. `450339ab2d` ([comment](https://github.com/pytorch/pytorch/pull/119639#issuecomment-1958428416))	2024-02-22 00:12:54 +00:00
Tobias Ringwald	e0268821dd	Increased compile time max GPUs to 512. Switched to int16_t DeviceIndex. (#119639 ) Fixes #115331. This PR increases the number of valid GPU devices to 512 (from 64) in order to future-proof PyTorch for providers that offer [single nodes with a large device count](https://www.tensorwave.com/). Until now, `DeviceIndex` was an `int8_t`, thus multiple changes were necessary: - `DeviceIndex` changed to `int16_t`. Updated consumers that assume it to be an `int8_t`. - Updated bounds checking for `torch.device()` in the Python frontend. Right now, we allow funny things like `torch.device('cpu', 200).index == -56`, which is undefined behavior. I inserted some checks to only allow values between 0 and `c10::Device::MAX_NUM_DEVICES - 1`. - Updated the `ArgumentInfo` struct as it hardcodes the device index as 8 bit field [^1]. Might be a breaking change, not sure if users rely on this. - Introduced `c10::Device::MAX_NUM_DEVICES` as a replacement for the old `C10_COMPILE_TIME_MAX_GPUS` [^1]: This field was unsigned, so I guess this has also been undef behavior the whole time? Our default device index is -1, so this always wrapped around to 255 when written to the `ArgumentInfo` struct. When I switched the `DeviceIndex` to `int16_t`, it actually stayed 255 after unpacking from `ArgumentInfo` again, as the `DeviceIndex` was now wide enough that it didn't wrap back to -1. Pull Request resolved: https://github.com/pytorch/pytorch/pull/119639 Approved by: https://github.com/cyyever, https://github.com/albanD	2024-02-21 21:10:49 +00:00
Edward Yang	b4a35632f9	Add function to materialize COW storages (#117053 ) Summary: From Kurt Mohler, see https://github.com/pytorch/pytorch/pull/113396 (manually imported due to ghimport problems) Test Plan: sandcastle, OSS CI Differential Revision: D52610522 Pull Request resolved: https://github.com/pytorch/pytorch/pull/117053 Approved by: https://github.com/malfet, https://github.com/kurtamohler	2024-01-10 15:34:16 +00:00
cyy	764b4cd44e	Remove outdated string function wrapper for Android and Caffe2 (#116186 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/116186 Approved by: https://github.com/janeyx99	2023-12-22 04:31:56 +00:00
Nikita Shulga	7ca6e0d38f	[EZ] Add `CUSPARSELT` to build variables (#116213 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/116213 Approved by: https://github.com/Skylion007, https://github.com/kit1980, https://github.com/atalman ghstack dependencies: #116212	2023-12-21 01:02:11 +00:00
Nikita Shulga	74119a3482	[EZ] Fix typo in `USE_GLOO` var (#116212 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/116212 Approved by: https://github.com/Skylion007, https://github.com/kit1980	2023-12-21 01:02:11 +00:00
Jeff Daily	602abf6b55	[ROCm] more 6.0 changes (#115946 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115946 Approved by: https://github.com/pruthvistony, https://github.com/huydhn, https://github.com/malfet	2023-12-20 20:19:29 +00:00
hongxyan	66a76516bf	[ROCm] Disabling Kernel Asserts for ROCm by default - fix and clean up and refactoring (#114660 ) Related to #103973 #110532 #108404 #94891 Context: As commented in `6ae0554d11/cmake/Dependencies.cmake (L1198)` Kernel asserts are enabled by default for CUDA and disabled for ROCm. However it is somewhat broken, and Kernel assert was still enabled for ROCm. Disabling kernel assert is also needed for users who do not have PCIe atomics support. These community users have verified that disabling the kernel assert in PyTorch/ROCm platform fixed their pytorch workflow, like torch.sum script, stable-diffusion. (see the related issues) Changes: This pull request serves the following purposes: * Refactor and clean up the logic, make it simpler for ROCm to enable and disable Kernel Asserts * Fix the bug that Kernel Asserts for ROCm was not disabled by default. Specifically, - Renamed `TORCH_DISABLE_GPU_ASSERTS` to `C10_USE_ROCM_KERNEL_ASSERT` for the following reasons: (1) This variable only applies to ROCm. (2) The new name is more align with #define CUDA_KERNEL_ASSERT function. (3) With USE_ in front of the name, we can easily control it with environment variable to turn on and off this feature during build (e.g. `USE_ROCM_KERNEL_ASSERT=1 python setup.py develop` will enable kernel assert for ROCm build). - Get rid of the `ROCM_FORCE_ENABLE_GPU_ASSERTS' to simplify the logic and make it easier to understand and maintain - Added `#cmakedefine` to carry over the CMake variable to C++ Tests: (1) build with default mode and verify that USE_ROCM_KERNEL_ASSERT is OFF(0), and kernel assert is disabled: ``` python setup.py develop ``` Verify CMakeCache.txt has correct value. ``` /xxxx/pytorch/build$ grep USE_ROCM_KERNEL_ASSERT CMakeCache.txt USE_ROCM_KERNEL_ASSERT:BOOL=0 ``` Tested the following code in ROCm build and CUDA build, and expected the return code differently. ``` subprocess.call([sys.executable, '-c', "import torch;torch._assert_async(torch.tensor(0,device='cuda'));torch.cuda.synchronize()"]) ``` This piece of code is adapted from below unit test to get around the limitation that this unit test now was skipped for ROCm. (We will check to enable this unit test in the future) ``` python test/test_cuda_expandable_segments.py -k test_fixed_cuda_assert_async ``` Ran the following script, expecting r ==0 since the CUDA_KERNEL_ASSERT is defined as nothing: ``` >> import sys >>> import subprocess >>> r=subprocess.call([sys.executable, '-c', "import torch;torch._assert_async(torch.tensor(0,device='cuda'));torch.cuda.synchronize()"]) >>> r 0 ``` (2) Enable the kernel assert by building with USE_ROCM_KERNEL_ASSERT=1, or USE_ROCM_KERNEL_ASSERT=ON ``` USE_ROCM_KERNEL_ASSERT=1 python setup.py develop ``` Verify `USE_ROCM_KERNEL_ASSERT` is `1` ``` /xxxx/pytorch/build$ grep USE_ROCM_KERNEL_ASSERT CMakeCache.txt USE_ROCM_KERNEL_ASSERT:BOOL=1 ``` Run the assert test, and expected return code not equal to 0. ``` >> import sys >>> import subprocess >>> r=subprocess.call([sys.executable, '-c', "import torch;torch._assert_async(torch.tensor(0,device='cuda'));torch.cuda.synchronize()"]) >>>/xxxx/pytorch/aten/src/ATen/native/hip/TensorCompare.hip:108: _assert_async_cuda_kernel: Device-side assertion `input[0] != 0' failed. :0:rocdevice.cpp :2690: 2435301199202 us: [pid:206019 tid:0x7f6cf0a77700] Callback: Queue 0x7f64e8400000 aborting with error : HSA_STATUS_ERROR_EXCEPTION: An HSAIL operation resulted in a hardware exception. code: 0x1016 >>> r -6 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/114660 Approved by: https://github.com/jeffdaily, https://github.com/malfet, https://github.com/jithunnair-amd	2023-12-13 15:44:53 +00:00
PyTorch MergeBot	f36d09fcb7	Revert "Add function to materialize COW storages (#113396 )" This reverts commit `e2f090086b`. Reverted https://github.com/pytorch/pytorch/pull/113396 on behalf of https://github.com/DanilBaibak due to Break internal build ([comment](https://github.com/pytorch/pytorch/pull/113396#issuecomment-1818769090))	2023-11-20 10:26:01 +00:00
Kurt Mohler	e2f090086b	Add function to materialize COW storages (#113396 ) Part of #109833 Pull Request resolved: https://github.com/pytorch/pytorch/pull/113396 Approved by: https://github.com/ezyang	2023-11-17 01:58:51 +00:00
cyy	ac603bc2f8	[Reland] Eliminate invocations of c10::stoi,c10::stod,c10::stoull,c10::stoll (#109566 ) This is reland of #87603 with definitions of c10::stoXX kept for further investigation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/109566 Approved by: https://github.com/huydhn	2023-09-19 07:15:25 +00:00
PyTorch MergeBot	4d44d8c00a	Revert "Eliminate c10::stoi,c10::stod,c10::stoull,c10::stoll (#109179 )" This reverts commit `852f1b8417`. Reverted https://github.com/pytorch/pytorch/pull/109179 on behalf of https://github.com/huydhn due to Sorry for reverting your change but this is breaking periodic buck build, so please fix the issue and reland the change https://github.com/pytorch/pytorch/actions/runs/6207458526/job/16852695272 ([comment](https://github.com/pytorch/pytorch/pull/109179#issuecomment-1724168571))	2023-09-18 18:41:12 +00:00
cyy	852f1b8417	Eliminate c10::stoi,c10::stod,c10::stoull,c10::stoll (#109179 ) We can remove these functions in favor of std ones. Pull Request resolved: https://github.com/pytorch/pytorch/pull/109179 Approved by: https://github.com/colesbury	2023-09-16 07:22:50 +00:00
v-s-2	60121e391b	[caffe2] Replace `CAFFE_` prefixes in `static_tracepoint.h` macros with `TORCH_` (#106380 ) Summary: Rename static tracepoint macros to better describe their targeted usage. Test Plan: Same as for D47159249: Tested the following macros on test scripts with libbpf USDTs: * `CAFFE_SDT` * `CAFFE_DISABLE_SDT` * `CAFFE_SDT_WITH_SEMAPHORE` Reviewed By: chaekit Differential Revision: D47727339 Pull Request resolved: https://github.com/pytorch/pytorch/pull/106380 Approved by: https://github.com/chaekit	2023-08-03 21:51:36 +00:00
v-s-2	e35950cd0d	[caffe2] Move CAFFE SDT macros' definitions to `c10/util/` (#105856 ) Summary: Moving static tracepoint macros header to a location where it can be easily used by various PyTorch components (`c10/utill`). Test Plan: Same as for D47159249: Tested the following macros on test scripts with libbpf USDTs: * `CAFFE_SDT` * `CAFFE_DISABLE_SDT` * `CAFFE_SDT_WITH_SEMAPHORE` Reviewed By: EDG-GH Differential Revision: D47636258 Pull Request resolved: https://github.com/pytorch/pytorch/pull/105856 Approved by: https://github.com/EDG-GH, https://github.com/chaekit	2023-08-01 14:42:55 +00:00
Jeff Daily	5379b5f927	[ROCm] use hipblas instead of rocblas (#105881 ) - BatchLinearAlgebraLib.cpp is now split into one additional file - BatchLinearAlgebraLib.cpp uses only cusolver APIs - BatchLinearAlgebraLibBlas.cpp uses only cublas APIs - hipify operates at the file level and cannot mix cusolver and cublas APIs within the same file - cmake changes to link against hipblas instead of rocblas - hipify mappings changes to map cublas -> hipblas instead of rocblas Pull Request resolved: https://github.com/pytorch/pytorch/pull/105881 Approved by: https://github.com/albanD	2023-07-31 20:42:55 +00:00
xvladus1	e47fad68a0	[caffe2] Update tracepoint USDT macros (#105232 ) Summary: Fix existing CAFFE static tracepoint macros and make them match the latest FOLLY version. Per anakryiko, current `CAFE_SDT` definition is broken. Quote: ``` "Arguments: -5@-16(%rbp) -4@$100 Arguments: -8@-16(%rbp) -4@$100 #define FOLLY_SDT_IS_ARRAY_POINTER(x) ((__builtin_classify_type(x) == 14) \|\| \ (__builtin_classify_type(x) == 5)) vs #define CAFFE_SDT_ISARRAY(x) (__builtin_classify_type(x) == 14) https://github.com/atgreen/gcc/blob/master/gcc/typeclass.h that 5 is "pointer_type_class" so you were right, it's just fixed up version of header I think it should be 8, not 5 5 is the size of literal, but you don't pass string literal as an argument, you pass its address, so actual argument is a pointer, and so 8 byte long you can try just fixing up CAFFE_SDT macro ``` {F1048035373} Test Plan: Tested the following macros on test scripts with libbpf USDTs: CAFFE_SDT CAFFE_DISABLE_SDT CAFFE_SDT_WITH_SEMAPHORE Reviewed By: RihamSelim Differential Revision: D47159249 Pull Request resolved: https://github.com/pytorch/pytorch/pull/105232 Approved by: https://github.com/chaekit, https://github.com/malfet	2023-07-20 22:56:11 +00:00
cyy	483f748dd5	[BE] Enforce missing `override` keyword (#104032 ) This PR enables `-Winconsistent-missing-destructor-override` and `-Winconsistent-missing-override` and fixes violations. <!-- copilot:summary --> ### <samp>🤖 Generated by Copilot at 47e904e</samp> This pull request updates the code of various classes and operators in the `caffe2` and `aten` subdirectories to use the `override` specifier instead of the `virtual` keyword for destructors and other virtual functions that override a base class function. This improves the code readability, quality, and consistency with C++ best practices. It also modifies the `./CMakeLists.txt` file to enable warnings for these specifiers, but disable errors. Pull Request resolved: https://github.com/pytorch/pytorch/pull/104032 Approved by: https://github.com/malfet	2023-06-24 02:34:24 +00:00
PyTorch MergeBot	b5594f7df0	Revert "Use missing-prototypes in torch_cpu (#103725 )" This reverts commit `716b3b893d`. Reverted https://github.com/pytorch/pytorch/pull/103725 on behalf of https://github.com/osalpekar due to Broke caffe2 builds due. More info at [D46920675](https://www.internalfb.com/diff/D46920675) ([comment](https://github.com/pytorch/pytorch/pull/103725#issuecomment-1603129273))	2023-06-22 18:30:31 +00:00
PyTorch MergeBot	626d8548df	Revert "add override to Caffe2 (#103795 )" This reverts commit `f5f020adb0`. Reverted https://github.com/pytorch/pytorch/pull/103795 on behalf of https://github.com/osalpekar due to Caused some breakages due to jobs using `-Winconsistent-missing-destructor-override` detecting inconsistent usage of override. Specifically the Tensor class destructor not being marked with override ([comment](https://github.com/pytorch/pytorch/pull/103795#issuecomment-1601812803))	2023-06-21 23:21:25 +00:00
cyy	716b3b893d	Use missing-prototypes in torch_cpu (#103725 ) This PR enables Wmissing-prototypes in torch_cpu except some generated cpp files and the mps and metal backends. Pull Request resolved: https://github.com/pytorch/pytorch/pull/103725 Approved by: https://github.com/albanD	2023-06-21 13:19:55 +00:00
cyy	f5f020adb0	add override to Caffe2 (#103795 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/103795 Approved by: https://github.com/kit1980	2023-06-17 19:46:40 +00:00
Sergii Dymchenko	c2402a9257	Change caffe2 branch links to main (#100129 ) Just a change pytorch/tree/master -> pytorch/tree/main pytorch/blob/master -> pytorch/blob/main Pull Request resolved: https://github.com/pytorch/pytorch/pull/100129 Approved by: https://github.com/huydhn	2023-04-27 10:31:50 +00:00
Scott Wolchok	c47464ed95	[PyTorch] Further reduce cost of TypeMeta::_typeMetaData (by 10x!) (#98105 ) Currently we should be paying a small cost for the thread-safe initialization of `index`. Now we should eliminate that cost. (10x figure in the title comes from internal benchmark that just calls `TypeMeta::Match<caffe2::Tensor>()` in a loop). Differential Revision: [D44597852](https://our.internmc.facebook.com/intern/diff/D44597852/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98105 Approved by: https://github.com/ezyang	2023-04-12 17:44:48 +00:00
mikey dagitses	ee0143bf65	distinguish mutability of TensorImpl::data<T>() (#98719 ) There already is a mutable_data<T>() with different semantics, so we introduce new names: TensorImpl::(mutable_)?data_dtype_initialized<T>(). Differential Revision: [D44824778](https://our.internmc.facebook.com/intern/diff/D44824778/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98719 Approved by: https://github.com/ezyang	2023-04-12 07:24:35 +00:00
Scott Wolchok	457afe48fd	[caffe2] Micro-optimizations in BlobGetMutableTensor (#98103 ) Make sure we don't call Tensor::GetDevice() twice. Remove redundant branch for the case when tensor->dtype() == options.dtype(); in this case we end up calling raw_mutable_data(options.dtype()) anyway! Differential Revision: [D44596695](https://our.internmc.facebook.com/intern/diff/D44596695/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98103 Approved by: https://github.com/jerryzh168	2023-04-10 19:43:02 +00:00
mikey dagitses	2400cb1d57	distinguish mutability of TensorImpl::data() (#97776 ) See D44409928. Differential Revision: [D44459999](https://our.internmc.facebook.com/intern/diff/D44459999/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/97776 Approved by: https://github.com/ezyang	2023-04-09 20:21:56 +00:00
mikey dagitses	5f5d675587	remove unused CAFFE2_VERSION macros (#97337 ) remove unused CAFFE2_VERSION macros Summary: Nothing reads these and they are completely subsumed by TORCH_VERSION. Getting rid of these will be helpful for build unification, since they are also not used internally. Test Plan: Rely on CI. Reviewers: sahanp Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/97337 Approved by: https://github.com/malfet	2023-03-24 16:02:35 +00:00
Mark Richardson	8bce88d9de	[caffe2] dont call cudnnDestroy on thread exit (crashes on windows with cuda 11/12) (#95382 ) Summary: My team has been hitting a mysterious crash for a few months on a windows binary that uses Caffe2 inside a worker thread. When this thread gets destroyed, there is an error at this line in context_gpu.h where the state of this operation gives CUDNN_STATUS_INTERNAL_ERROR instead of CUDNN_STATUS_SUCCESS. When enabling cudnn debug logs (via the env variables nvidia specifies), I can see that the context is destroyed twice, even though this code only destroys it once, so something mysterious is causing a double free. This seems very very similar to the issue/fix described here for pytorch: https://github.com/pytorch/pytorch/issues/17658 https://github.com/apache/tvm/pull/8267 And pytorch handles this in the same way, by just not calling cudnnDestroy This seems to have become an issue with cuda11, but I tested cuda12 as well and found that the issue persists so this needs to be somehow fixed. Test Plan: CI I checked that the specific windows binary I am using is able to create and drestroy caffe2-invoking threads without causing the application to crash. buck run arvr/mode/win/cuda11/opt //arvr/projects/nimble/prod/tools/MonoHandTrackingVis Differential Revision: D43538017 Pull Request resolved: https://github.com/pytorch/pytorch/pull/95382 Approved by: https://github.com/malfet	2023-03-10 06:42:51 +00:00
cyy	d0e4ca233e	some reference and move fixes (#95942 ) This PR introduces some modifications: 1. We find out some const function parameters that can be passed by reference and add the reference. 2. We find more opportunists of passing by value and change them accordingly. 3. Some use-after-move errors are fixed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/95942 Approved by: https://github.com/Skylion007	2023-03-10 03:44:09 +00:00
cyy	fa65ae8f56	cleanup unused include (#93359 ) Using `include-what-you-use` tool to find out and remove some unused includes Pull Request resolved: https://github.com/pytorch/pytorch/pull/93359 Approved by: https://github.com/malfet	2023-02-04 02:15:50 +00:00

1 2 3 4 5 ...

1472 commits