onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-01 03:45:06 +00:00

Author	SHA1	Message	Date
Suffian Khan	e758870b18	Upgrade ROCm CI pipeline for ROCm 4.3.1 and permit run inside container (#9070 ) * try to run inside 4.3.1 container * no \ in container run command * remove networking options * try with adding video render groups * add job to build docker image * try without 1st stage * change alpha, beta to float * try adding service connection * retain huggingface directory * static video and render gid * use runtime expression for variables * install torch-ort * pin sacrebleu==1.5.1 * update curves for rocm 4.3.1 * try again * disable determinism and only check tail of loss curve and with a much larger threshold of 0.05 * disable RoBERTa due to high run variability on ROCm 4.3.1 * put reduction unit tests back in	2021-09-15 12:32:02 -07:00
Changming Sun	ae6fdd3333	Bring code coverage dashboard back (#8394 )	2021-08-16 20:54:39 -07:00
raviskolli	f641c0f4e8	Update requirements.txt Updated requests version to address component governance failure	2021-07-22 14:18:21 -07:00
Suffian Khan	35ca3c99d1	Fix ROCm wheels pipeline after changes to manylinux scripts (#8026 ) * update * try fix rocm pipeline * avoid already installed error * ignore python3.10 since build fails * fix * try setting user * try again * try again * try again * fix script * disable inference docs generation * try print device id * fix name qual * try again * try again * try again * provider_options * add device verify * try again * try again * try again * print video/render gid * try again * run as root * try again with uid, gid * cleanup * run as root * temp fix * add /bin/bash Co-authored-by: Changming Sun <chasun@microsoft.com>	2021-06-10 21:01:28 -07:00
Jesse Benson	f977644324	ROCM support int reductions	2021-05-17 16:42:06 -07:00
Jesse Benson	be79575c6a	Use built-in reduce_sum() for simple reduction cases, specifically reduce all to a scalar.	2021-04-14 08:55:35 -07:00
Weixing Zhang	75c0192e4f	enable more unit tests for ROCM EP (#7307 )	2021-04-09 15:15:13 -07:00
Weixing Zhang	c22963c23d	Polish Lamb Kernel (#7299 )	2021-04-09 09:55:57 -07:00
Weixing Zhang	8ad5007f8f	Polish Adam kernel (#7294 ) * Polish Adam kernel	2021-04-09 01:11:09 -07:00
Jesse Benson	4543459984	MIOpen supports MIOPEN_REDUCE_TENSOR_AVG now.	2021-04-01 16:00:34 -07:00
Weixing Zhang	40fa40f3ce	Enable more unit tests for ROCM EP (#6776 ) * enable more ops and unit tests for ROCM EP	2021-02-24 15:20:50 -08:00
Xavier Dupré	d3a2c8c1c7	Support double for operators ReduceMax, ReduceMin (#6265 ) * Support double for operators ReduceMax, ReduceMin * add unit test to pai-excluded-tests.txt Co-authored-by: xavier dupré <xavier.dupre@gmail.com>	2021-02-08 19:14:26 -08:00
Jesse Benson	d18aa45b46	Enable more ROCM ops that are sharing CUDA code. Some are needed for Turing NLG models.	2021-02-06 14:40:34 -08:00
Jesse Benson	21a47ec8d9	Disable a couple more unsupported tests.	2021-02-04 15:00:05 -08:00
Jesse Benson	0b147702af	Update remaining reduction ops to use MIOpen. double datatype is not supported, so disable those typed kernels.	2021-02-04 15:00:05 -08:00
Jesse Benson	a28ddb85b6	Reduction ops.	2021-02-04 15:00:05 -08:00
ashbhandare	85434273ff	Fix CUDA Reduction kernel for ArgMax/ArgMin for when reduction dim=1 (#6490 ) * Fix for when reduction dim=1 * Disable test for AMD GPUs * Specify Async	2021-02-02 09:50:16 -08:00
Suffian Khan	76bc0e479c	Enable dense sequence optimized version of Pytorch exported BERT-L on AMD GPU (#6504 ) * Permit dense seq optimization on BERT-L pytorch export by enabling ReduceSumTraining, Equal, and NonZero on AMD * enable Equal tests * enable fast_matrix_reduction test case	2021-01-29 13:12:34 -08:00
pengwa	453431f7bb	Add max_norm for gradient clipping. (#6289 ) * add max_norm as user option for gradient clipping * add adam and lamb test cases for clip norm * add frontend tests	2021-01-21 01:01:11 +08:00
Xavier Dupré	cd14c1af29	Support double for operator ArgMin (#6222 ) * Support double for operator ArgMin * add test specifically for double * add new test on pai-excluded-tests.txt	2020-12-31 11:25:46 +01:00
Xavier Dupré	84addcd2cf	Support double for operator ReduceMean, ReduceLogSumExp (#6217 ) * Support double for operators ReduceMean, ReduceLogSumExp	2020-12-31 11:24:54 +01:00
Vincent Wang	7ddeafdfcc	Add ReduceL2Grad and ClipGrad (#5970 ) * ReduceL2Grad and ClipGrad. * fix win build and amd ci pipeline * resolve comments. Co-authored-by: Vincent Wang <weicwang@AiFramework2080ti2.corp.microsoft.com>	2020-12-10 11:03:26 +08:00
Jesse Benson	98ea7372d3	Re-enable Lamb unit tests for AMD	2020-12-03 13:06:34 -08:00
Suffian Khan	9b8189dd0a	Rework AMD CI pipeline to use pool AMD-GPU and disable more tests in order to enable it. (#5885 ) Move AMD test pipeline to use self-hosted pool AMD-GPU. For time being, remove failing/flaky unit tests for AMD pipeline.	2020-11-24 09:38:14 -08:00
Weixing Zhang	aec4cb489e	ROCm EP for AMD GPU (#5480 ) The ROCm EP is designed and implemented based on the AMD GPU software stack named ROCm. Here is the link for the details about ROCm: https://rocmdocs.amd.com/en/latest/ ROCm EP was created based on the following things: 1. AMD GPU programming language: HIP 2. AMD GPU HIP language runtime: amdhip64 3. BLAS: rocBLAS, hipBLAS 4. DNN: MIOpen 5. Collective Communication library: RCCL 6. cub: hipCUB 7. … Current status: BERT-L and GPT2 training can be run on AMD GPU with data parallelism. Next: 1. Make more GPU code sharable between ROCm EP and CUDA EP since the HIP language and HIP runtime API are very close to CUDA. 2. Continue improving the implementation. 3. Continue GPU kernel optimization. 4. Support model parallelism on ROCm EP. …… The rocm kernels have been removed from this commit and will be in a separate PR. Since the original PR was too big (~180 files), it was suggested to split the PR into two parts: one with the rocm kernels, the other with the non-rocm-kernel changes. Co-authored-by: Weixing Zhang <wezhan@microsoft.com> Co-authored-by: sabreshao <sabre.shao@amd.com> Co-authored-by: anghostcici <11013544+anghostcici@users.noreply.github.com> Co-authored-by: Suffian Khan <sukha@microsoft.com> Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2020-10-29 17:13:04 -07:00

25 commits