onnxruntime/tools
Tianlei Wu 72186bbb71
[CUDA] Build nhwc ops by default (#22648)
### Description

* Build CUDA NHWC ops by default.
* Deprecate `--enable_cuda_nhwc_ops` in build.py and add a
`--disable_cuda_nhwc_ops` option.

Note that NHWC ops require cuDNN 9.x. If you build with cuDNN 8, NHWC ops
will be disabled automatically.
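
The flag behavior described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual build.py code: the function name `should_build_nhwc_ops` and the `cudnn_major` parameter are hypothetical, and only the two flag names come from this change.

```python
import argparse
import sys


def should_build_nhwc_ops(argv, cudnn_major):
    """Hypothetical helper: decide whether CUDA NHWC ops get built.

    `cudnn_major` is an assumed input (the detected cuDNN major version);
    the real build script may detect and apply it differently.
    """
    parser = argparse.ArgumentParser()
    # Deprecated flag: still accepted, but it no longer changes anything.
    parser.add_argument("--enable_cuda_nhwc_ops", action="store_true")
    # New opt-out flag.
    parser.add_argument("--disable_cuda_nhwc_ops", action="store_true")
    args = parser.parse_args(argv)

    if args.enable_cuda_nhwc_ops:
        print("--enable_cuda_nhwc_ops is deprecated; NHWC ops are built by default.",
              file=sys.stderr)

    # NHWC ops are on by default, off when explicitly disabled,
    # and off automatically when cuDNN is older than 9.x.
    return (not args.disable_cuda_nhwc_ops) and cudnn_major >= 9
```

With these assumptions, an empty argument list on cuDNN 9 enables NHWC ops, while either `--disable_cuda_nhwc_ops` or cuDNN 8 turns them off.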

### Motivation and Context

In general, NHWC is faster than NCHW for convolution on NVIDIA GPUs with
Tensor Cores, so this could improve performance for vision models.

This is the first step toward preferring NHWC for CUDA in the 1.21 release.
The next step is to run tests on popular vision models. If it helps on most
models and devices, set `prefer_nhwc=1` as the default CUDA provider option.
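
Until `prefer_nhwc=1` becomes the default, it can be requested per session through the CUDA provider options. A minimal sketch, assuming a GPU build of onnxruntime with NHWC ops available; the helper name `cuda_providers` and the model path in the usage comment are placeholders, not part of this change.

```python
def cuda_providers(prefer_nhwc=True):
    """Hypothetical helper: build a providers list for InferenceSession.

    Sets the CUDA execution provider option `prefer_nhwc` and keeps the
    CPU provider as a fallback.
    """
    cuda_options = {"prefer_nhwc": "1" if prefer_nhwc else "0"}
    return [("CUDAExecutionProvider", cuda_options), "CPUExecutionProvider"]


# Usage (needs onnxruntime with CUDA support; "model.onnx" is a placeholder):
#   import onnxruntime as ort
#   session = ort.InferenceSession("model.onnx", providers=cuda_providers())
```

Keeping the option in one helper makes it easy to compare NHWC and NCHW runs of the same model by flipping a single argument.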
2024-11-06 09:54:55 -08:00
android_custom_build Update Android NDK version to 27.0.12077973. (#21989) 2024-09-05 17:57:24 -07:00
ci_build [CUDA] Build nhwc ops by default (#22648) 2024-11-06 09:54:55 -08:00
doc Update ruff and clang-format versions (#21479) 2024-07-24 11:50:11 -07:00
nuget [DML EP] Update DML to 1.15.4 (#22635) 2024-10-29 17:13:57 -07:00
perf_view fixed #16873 (#16932) 2023-09-26 09:57:01 -07:00
python update pipeline name list in run_CIs_for_external_pr.py (#22540) 2024-10-22 17:14:48 -07:00
scripts Update DNNL CI python to 310 (#22691) 2024-11-05 09:14:48 -08:00