onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-04 04:07:22 +00:00

History

Jeff Bloomfield 3df3a85114

Default kOrtSessionOptionsDisableQuantQDQ to 1 when the DML EP is registered (#15725)

This addresses a performance regression in some INT8 models with the
DirectML EP by defaulting kOrtSessionOptionsDisableQuantQDQ to 1 when the
EP is registered.

This regression occurred due to the introduction of the QDQ propagation
transformer, which is controlled by this session option. That transformer
maximizes the number of nodes that are executed as quantized by
logically propagating quantize operators upstream and dequantize
operators downstream. However, it does this simply by inserting QDQ
pairs, on the expectation that a later pass will recognize sequences of
DQ->Op->Q. This logic and the related L2 transformers are not currently
enabled for the DirectML EP.

This change also removes a noisy warning when the session option for
memory pattern is overridden as the DirectML EP is registered.

2023-05-01 08:26:03 -07:00
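The defaulting behavior described in this commit can be pictured with a small, self-contained sketch. This is not the onnxruntime implementation: `resolve_disable_quant_qdq` is a hypothetical helper, the key string `session.disable_quant_qdq` is assumed to be the value behind `kOrtSessionOptionsDisableQuantQDQ`, the general default of `"0"` is assumed, and the rule that an explicit user setting takes precedence over the new default is also an assumption.

```python
# Hypothetical sketch of the defaulting rule; not the onnxruntime source.

# Assumed string value of kOrtSessionOptionsDisableQuantQDQ.
DISABLE_QUANT_QDQ_KEY = "session.disable_quant_qdq"
# Name under which the DirectML execution provider is registered.
DML_EP_NAME = "DmlExecutionProvider"


def resolve_disable_quant_qdq(config_entries, registered_eps):
    """Return the effective option value ("0" or "1") for a session.

    config_entries: dict of session config entries set by the user.
    registered_eps: iterable of registered execution provider names.
    """
    user_value = config_entries.get(DISABLE_QUANT_QDQ_KEY)
    if user_value is not None:
        # Assumption: an explicit user setting wins over any default.
        return user_value
    if DML_EP_NAME in registered_eps:
        # Default described by the commit: disable QDQ handling for DML.
        return "1"
    # Assumed default for all other execution providers.
    return "0"


if __name__ == "__main__":
    print(resolve_disable_quant_qdq({}, [DML_EP_NAME]))
    print(resolve_disable_quant_qdq({}, ["CPUExecutionProvider"]))
    print(resolve_disable_quant_qdq({DISABLE_QUANT_QDQ_KEY: "0"}, [DML_EP_NAME]))
```

In the onnxruntime Python API, session config entries are set with `SessionOptions.add_session_config_entry(key, value)`, so a user who wants the previous behavior with the DirectML EP would presumably set this key to `"0"` explicitly. Whether the real change honors such an override is not stated in the commit message.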
..
onnxruntime/core	Default kOrtSessionOptionsDisableQuantQDQ to 1 when the DML EP is registered (#15725)	2023-05-01 08:26:03 -07:00

