onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-17 18:40:28 +00:00

Author	SHA1	Message	Date
Ashwini Khade	68b5b2d7d3	Refactor training build options (#13964) ### Description 1. Renames all references of on device training to training apis. This is to keep the naming general. Nothing really prevents us from using the same apis on servers/non-edge devices. 2. Update ENABLE_TRAINING option: With this PR when this option is enabled, training apis and torch interop is also enabled. 3. Refactoring for onnxruntime_ENABLE_TRAINING_TORCH_INTEROP option: - Removed user facing option - Setting onnxruntime_ENABLE_TRAINING_TORCH_INTEROP to ON when onnxruntime_ENABLE_TRAINING is ON as we always build with torch interop. Once this PR is merged when --enable_training is selected we will do a "FULL Build" for training (with all the training entry points and features). Training entry points include: 1. ORTModule 2. Training APIs Features include: 1. ATen Fallback 2. All Training OPs includes communication and collectives 3. Strided Tensor Support 4. Python Op (torch interop) 5. ONNXBlock (Front end tools for training artifacts prep when using training apis) ### Motivation and Context Intention is to simplify the options for building training enabled builds. This is part of the larger work item to create dedicated build for learning on the edge scenarios with just training apis enabled.	2023-01-03 13:28:16 -08:00
Edward Chen	9e65f3bfdb	Replace deprecated Python dependency sklearn with scikit-learn. (#13585)	2022-11-08 09:08:29 -08:00
Jeff Daily	65c67764ae	remove line "ADD model ${WORKSPACE_DIR}/model" in the amdgpu Dockerfile (#12914) Follow-up to #12707. docker build is broken otherwise; model dir is gone.	2022-10-14 13:17:28 -07:00
Baiju Meswani	9e47eb68e0	Remove unused orttraining amd dockerfiles and scripts (#12707)	2022-09-02 18:43:21 -07:00
Justin Chu	fdce4fa6af	Format all python files under onnxruntime with black and isort (#11324) Description: Format all python files under onnxruntime with black and isort. After checking in, we can use .git-blame-ignore-revs to ignore the formatting PR in git blame. #11315, #11316	2022-04-26 09:35:16 -07:00
Weixing Zhang	840212e115	Enable OneHot kernel for ROCm EP and add Dockerfile for ROCm 4.3.1 (#9656) * enable OneHot for ROCm EP * add dockerfile for ROCm 4.3.1 Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2021-12-07 12:47:00 -08:00
Jeff Daily	d02de9c1bc	[ROCm] dockerfile updates (#7955) * do not remove onnxruntime build directory in Dockerfile.rocm4.1.pytorch * restore ONNX Runtime Training Examples to rocm 4.2 dockerfile	2021-06-10 23:50:19 -07:00
Weixing Zhang	dce76c15e7	add Dockerfile for ROCm 4.2 (#7749) * add Dockerfile for ROCm 4.2 * using rocm/pytorch:rocm4.2_ubuntu18.04_py3.6_pytorch_1.8.1	2021-06-08 08:02:27 -07:00
Peng	c2435d24ec	Clean up ROCm4.1 Dockerfile build directory (#7732) * Clean up ROCm4.1 Dockerfile build directory * remove the UCX and OMPI build directories after installation	2021-05-20 10:04:49 -07:00
Weixing Zhang	59b57d8322	HSA_NO_SCRATCH_RECLAIM and RCCL_ALLTOALL_KERNEL_DISABLE are not needed for ROCm 4.1 (#7224) Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2021-04-02 18:19:11 -07:00
Jeff Daily	65ce5f07b3	add Dockerfile.rocm4.1.pytorch (#7152)	2021-03-26 21:40:10 -07:00
Suffian Khan	5cb8934459	update Dockerfile for workaround for issue in RCCL for rocm4.0 (#7108)	2021-03-23 13:36:04 -07:00
Jesse Benson	c562952750	Dockerfile to build onnxruntime with ROCm 4.0	2020-12-22 10:21:12 -08:00
Weixing Zhang	2705115732	add dockerfile for ROCm3.10 and update BUILD.md for ROCm EP (#5821) * add HSA_NO_SCRATCH_RECLAIM=1 to dockerfile It is to work around an issue in AMD compiler which generates poor GPU ISA when the type of kernel parameter is a structure and “pass-by-value” is used * update BUILD.md * add dockerfile for rocm3.10	2020-12-08 23:14:56 -08:00
Weixing Zhang	fc614ad050	revert the code change which was based on `b4869926` The change `b4869926` which was to remove per-thread allocator would cause seg fault for distributed training. In addition, add dockerfile for ROCm3.9	2020-11-15 00:24:32 -08:00
Weixing Zhang	fff85a6a35	Add GPU kernels for ROCm EP (#5655) * Add kernels for AMD GPU. This PR is mostly about GPU kernels for ROCm EP. Due to similar GPU programming languages (CUDA and HIP) and similar math library calls, one principle in ROCm EP design is to share CUDA kernels as much as possible for ROCm. Thus, the script amd_hipify.py has been created for converting CUDA kernels to ROCm HIP kernels automatically during compilation phase. But, for some reasons such as perf issue, syntax difference..., some converted kernels need some manual intervention. These kernels will be checked in the repo physically for now. In order to avoid manual intervention, the plan is to refactor CUDA kernels to make them portable between CUDA EP and ROCm EP as much as possible. Please refer to "HIP Porting Guide" for details. * like lamb, multi-tensor-apply needs to be disabled for IsAllFiniteOp and ReduceAllL2, current AMD GPU compiler has perf issue for kernel parameter which is a structure with "pass by value". * Use hipMemsetAsync and add checks on HIP calls. * move the generated files to build folder. Co-authored-by: Jesse Benson <jesseb@microsoft.com>	2020-11-06 16:11:06 -08:00
Weixing Zhang	aec4cb489e	ROCm EP for AMD GPU (#5480) The ROCm EP is designed and implemented based on AMD GPU software stack named ROCm. Here is the link for the details about ROCm: https://rocmdocs.amd.com/en/latest/ ROCm EP was created based on the following things: 1. AMD GPU programming language: HIP 2. AMD GPU HIP language runtime: amdhip64 3. BLAS: rocBLAS, hipBLAS 4. DNN: MIOpen 5. Collective Communication library: RCCL 6. cub: hipCUB 7. … Current status: BERT-L and GPT2 training can be run on AMD GPU with data parallel. Next: 1. Make more GPU code be sharable between ROCm EP and CUDA EP since HIP language and HIP runtime API are very close to CUDA. 2. Continue improving the implementation. 3. Continue GPU kernel optimization. 4. Support model parallelism on ROCm EP. …… The rocm kernels have been removed from this commit and will be in a separate PR. Since the original PR was too big (~180 files), it was suggested to split the PR into two parts, one is rocm-kernels, the other is non rocm kernels. Co-authored-by: Weixing Zhang <wezhan@microsoft.com> Co-authored-by: sabreshao <sabre.shao@amd.com> Co-authored-by: anghostcici <11013544+anghostcici@users.noreply.github.com> Co-authored-by: Suffian Khan <sukha@microsoft.com> Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2020-10-29 17:13:04 -07:00

17 commits