onnxruntime/tools/ci_build/github
Tianlei Wu 72186bbb71
[CUDA] Build nhwc ops by default (#22648)
### Description

* Build cuda nhwc ops by default.
* Deprecate `--enable_cuda_nhwc_ops` in build.py and add
`--disable_cuda_nhwc_ops` option

Note that NHWC ops require cuDNN 9.x. If you build with cuDNN 8, NHWC ops
will be disabled automatically.
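
The flag behavior described above can be sketched as follows. This is only an illustration under stated assumptions, not the actual `build.py` code: the helper names `parse_args` and `should_build_nhwc` and the `cudnn_major` parameter are hypothetical, and how the real build detects the cuDNN version is not shown here.

```python
import argparse
import warnings


def parse_args(argv):
    """Parse only the two NHWC-related flags (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    # Deprecated flag: still accepted so existing command lines keep working.
    parser.add_argument(
        "--enable_cuda_nhwc_ops",
        action="store_true",
        help="Deprecated; CUDA NHWC ops are built by default.",
    )
    parser.add_argument(
        "--disable_cuda_nhwc_ops",
        action="store_true",
        help="Do not build CUDA NHWC ops.",
    )
    return parser.parse_args(argv)


def should_build_nhwc(args, cudnn_major):
    """Decide whether NHWC ops are built (hypothetical helper).

    cudnn_major is assumed to be the major version of the cuDNN
    library used for the build, e.g. 8 or 9.
    """
    if args.enable_cuda_nhwc_ops:
        warnings.warn(
            "--enable_cuda_nhwc_ops is deprecated; NHWC ops are built by default.",
            DeprecationWarning,
        )
    if args.disable_cuda_nhwc_ops:
        return False
    # NHWC ops need cuDNN 9.x; with cuDNN 8 they are turned off automatically.
    return cudnn_major >= 9
```

With no flags, this sketch builds NHWC ops on cuDNN 9 and skips them on cuDNN 8; passing `--disable_cuda_nhwc_ops` skips them regardless of the cuDNN version.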

### Motivation and Context

In general, NHWC is faster than NCHW for convolution on NVIDIA GPUs with
Tensor Cores, and this could improve performance for vision models.

This is the first step toward preferring NHWC for CUDA in the 1.21 release.
The next step is to run tests on popular vision models. If it helps on most
models and devices, set `prefer_nhwc=1` as the default CUDA provider option.
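
Until that becomes the default, the option can be set explicitly per session. A minimal sketch, assuming the usual provider-list form of `(name, options)` tuples accepted by the Python API; the helper name `cuda_providers` and the model path in the usage comment are hypothetical.

```python
def cuda_providers(prefer_nhwc=True):
    """Build a provider list that opts in to (or out of) NHWC on CUDA.

    Hypothetical helper: returns the CUDA provider with its options,
    followed by the CPU provider as a fallback.
    """
    options = {"prefer_nhwc": "1" if prefer_nhwc else "0"}
    return [("CUDAExecutionProvider", options), "CPUExecutionProvider"]


# Usage (requires onnxruntime with CUDA support; "model.onnx" is a placeholder):
#   import onnxruntime as ort
#   session = ort.InferenceSession("model.onnx", providers=cuda_providers())
```

The option only has an effect when the NHWC ops were actually compiled into the build.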
2024-11-06 09:54:55 -08:00
..
android Remove webgpu ep in mobile packaging stages (#22725) 2024-11-05 09:14:26 -08:00
apple Remove webgpu ep in mobile packaging stages (#22725) 2024-11-05 09:14:26 -08:00
azure-pipelines [CUDA] Build nhwc ops by default (#22648) 2024-11-06 09:54:55 -08:00
js
linux [CUDA] Build nhwc ops by default (#22648) 2024-11-06 09:54:55 -08:00
pai Add python 3.13 support (#22380) 2024-10-14 18:07:54 -07:00
windows [TensorRT EP] Refactor TRT version update logic & apply TRT 10.5 (#22483) 2024-10-29 09:23:41 -07:00
Doxyfile_csharp.cfg