mirror of
https://github.com/saymrwulf/onnxruntime.git
synced 2026-05-16 21:00:14 +00:00
- Add CK GroupNorm to GroupNormTunable.
- Reduce the configurations of GroupNormNHWCOp, because the CK implementation is better.

Performance gain on Stable Diffusion v1.5.

Before:

```
'height': 512
'width': 512
'steps': 50
'batch_size': 1
'batch_count': 5
'num_prompts': 1
'average_latency': 2.4782688856124877
'median_latency': 2.4783748388290405
'provider': 'ROCMExecutionProvider'
'disable_safety_checker': True
```

After:

```
'height': 512,
'width': 512,
'steps': 50,
'batch_size': 1,
'batch_count': 5,
'num_prompts': 1,
'average_latency': 2.107170510292053,
'median_latency': 2.1067750453948975,
'first_run_memory_MB': -1,
'second_run_memory_MB': -1,
'provider': 'ROCMExecutionProvider',
'disable_safety_checker': True
```

| Name | | |
|---|---|---|
| eigen@d10b27fe37 | ||
| emsdk@0ab19024f0 | ||
| libprotobuf-mutator@7a2ed51a6b | ||
| onnx@9b7bca2a72 | ||
| onnxruntime-extensions@81e7799c69 | ||
| abseil-cpp.cmake | ||
| abseil-cpp.natvis | ||
| composable_kernel.cmake | ||
| cutlass.cmake | ||
| dml.cmake | ||
| dnnl.cmake | ||
| eigen.cmake | ||
| extensions.cmake | ||
| find_snpe.cmake | ||
| FindNumPy.cmake | ||
| helper_functions.cmake | ||
| ipp-crypto.cmake | ||
| mimalloc.cmake | ||
| onnx_minimal.cmake | ||
| onnx_protobuf.natvis | ||
| onnxruntime_external_deps.cmake | ||
| protobuf_function.cmake | ||
| pybind11.cmake | ||
| pyxir.cmake | ||
| triton.cmake | ||
| tvm.cmake | ||
| wil.cmake | ||
| xnnpack.cmake | ||
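
The commit message above reports average latencies before and after the change but does not state the size of the gain. As a minimal sketch, the relative improvement can be worked out from those two `average_latency` values. The helper names `relative_reduction` and `speedup` are illustrative only and are not part of onnxruntime or its benchmark scripts; the latencies are assumed to be in seconds.

```python
# Average latencies copied from the 'average_latency' fields in the
# commit message (assumed to be seconds per run).
before_avg = 2.4782688856124877
after_avg = 2.107170510292053


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which latency dropped, relative to the 'before' value."""
    return (before - after) / before


def speedup(before: float, after: float) -> float:
    """How many times faster the 'after' run is than the 'before' run."""
    return before / after


print(f"latency reduction: {relative_reduction(before_avg, after_avg):.1%}")
print(f"speedup: {speedup(before_avg, after_avg):.3f}x")
```

With the reported numbers this comes to about 15% lower average latency, or a speedup of roughly 1.18x, for the single configuration shown (512x512, 50 steps, batch size 1 on the ROCm execution provider).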