onnxruntime/tools/ci_build
Hubert Lu dbcf54aa41
Add hipified SkipLayerNorm code for ROCmEP (#12107)
* First attempt for half2 vectorized memory access in SkipLayerNorm

* Add some functions for debugging

* Clean up the code

* Clean up the code

* Generalize the vectorized kernels with aligned_vector and remove cudaDeviceProp

* Add a unit test for a larger input size

* Fix some Lint C++ warnings

* Use ILP = 4 for the vectorized kernels

* Rewrite the vectorized kernel and templatize ComputeSkipLayerNorm

* Use conditional operator for input_v

* Refactor LaunchSkipLayerNormKernel and replace the original SkipLayerNormKernelSmall with the vectorized kernel

* Clean some comments and rename the layernorm function

* Use ComputeSkipLayerNorm to replace LaunchSkipLayerNormKernel

* Resolve a Lint C++ warning

* Fix SkipLayerNormBatch1_Float16_vec output data

* Add hipified code of bert SkipLayerNorm for ROCmEP

* Resolve some Lint C++ warnings

* Resolve some Lint C++ warnings

* Resolve some Lint C++ warnings

* Resolve Python formatting issue
2022-07-06 22:13:11 -07:00

| Name | Last commit | Date |
| --- | --- | --- |
| github | [ROCm] Temp disable AMD UT (#12105) | 2022-07-06 11:08:26 -07:00 |
| __init__.py | | |
| amd_hipify.py | Add hipified SkipLayerNorm code for ROCmEP (#12107) | 2022-07-06 22:13:11 -07:00 |
| build.py | Fix orttraining-linux-ci-pipeline - Symbolic shape infer (#11965) | 2022-06-23 08:23:36 -07:00 |
| clean_docker_image_cache.py | Format all python files under onnxruntime with black and isort (#11324) | 2022-04-26 09:35:16 -07:00 |
| coverage.py | Format all python files under onnxruntime with black and isort (#11324) | 2022-04-26 09:35:16 -07:00 |
| gen_def.py | Snpe ep (#11665) | 2022-06-03 14:10:02 -07:00 |
| get_docker_image.py | Set black's target version (#11370) | 2022-04-27 14:52:19 -07:00 |
| logger.py | Format all python files under onnxruntime with black and isort (#11324) | 2022-04-26 09:35:16 -07:00 |
| op_registration_utils.py | Format all python files under onnxruntime with black and isort (#11324) | 2022-04-26 09:35:16 -07:00 |
| op_registration_validator.py | Format all python files under onnxruntime with black and isort (#11324) | 2022-04-26 09:35:16 -07:00 |
| policheck_exclusions.xml | A new pipeline to replace the existing WindowsAI packaging pipeline (#10646) | 2022-03-03 08:56:49 -08:00 |
| reduce_op_kernels.py | Include layout transformation ops in extended minimal build and above. (#11355) | 2022-04-27 10:31:02 -07:00 |
| requirements.txt | Bump numpy from 1.19.2 to 1.21.0 in /tools/ci_build | 2022-01-12 17:45:35 -08:00 |
| upload_python_package_to_azure_storage.py | Format all python files under onnxruntime with black and isort (#11324) | 2022-04-26 09:35:16 -07:00 |
| upload_python_package_to_azure_storage_with_python.py | Format all python files under onnxruntime with black and isort (#11324) | 2022-04-26 09:35:16 -07:00 |