mirror of
https://github.com/saymrwulf/onnxruntime.git
synced 2026-05-31 23:27:43 +00:00
- Add CK GroupNorm to GroupNormTunable. - Reduce configuration of GroupNormNHWCOp because CK implementation is better. The performance gain on stable diffusion v1.5. Before: ``` 'height': 512 'width': 512 'steps': 50 'batch_size': 1 'batch_count': 5 'num_prompts': 1 'average_latency': 2.4782688856124877 'median_latency': 2.4783748388290405 'provider': 'ROCMExecutionProvider' 'disable_safety_checker': True ``` After: ``` 'height': 512, 'width': 512, 'steps': 50, 'batch_size': 1, 'batch_count': 5, 'num_prompts': 1, 'average_latency': 2.107170510292053, 'median_latency': 2.1067750453948975, 'first_run_memory_MB': -1, 'second_run_memory_MB': -1, 'provider': 'ROCMExecutionProvider', 'disable_safety_checker': True ``` |
||
|---|---|---|
| .. | ||
| contrib_ops | ||
| core | ||
| python | ||
| test | ||
| tool/etw | ||
| wasm | ||
| __init__.py | ||
| ReformatSource.ps1 | ||
| ReformatSourcePython.bat | ||
| VSCodeCoverage.runsettings | ||