### Description
Disable a test with random failure in Windows GPU CI Pipeline like the
following:
```
11: [ OK ] BatchTest/BatchTest.BatchSupport/163 (0 ms)
11: [ RUN ] BatchTest/BatchTest.BatchSupport/164
11: D:\a\_work\1\s\winml\test\image\imagetests.cpp(186): error: Expected: m_model_binding.Bind(output_data_binding_name, output_video_frames) doesn't throw an exception.
11: Actual: it throws.
11: D:\a\_work\1\s\winml\test\image\imagetests.cpp(211): error: Expected: m_result = m_session.Evaluate(m_model_binding, L"") doesn't throw an exception.
11: Actual: it throws.
11: total errors is 0/2073600, errors rate is 0total errors is 0/2073600, errors rate is 0total errors is 0/2073600, errors rate is 0[ FAILED ] BatchTest/BatchTest.BatchSupport/164, where GetParam() = ((L"fns-candy_Bgr8_Batch3.onnx", 0, { L"1080.jpg", L"fish_720_Gray.png", L"fish_720.png" }, 3, false), 0, 1, 1, 1, 4-byte object <02-00 00-00>) (3203 ms)
```
Since https://github.com/microsoft/onnxruntime/pull/15468 merged to
main, about 10~15% build job failed in the test.
### Description
1. Disable XNNPack EP's tests in Windows CI pipeline
The EP code has a known problem(memory alignment), but the problem does
not impact the usages that we ship the code to. Now we only use XNNPack
EP in mobile apps and web usages. We have already pipelines to cover
these usages. We need to prioritize fixing the bugs found in these
pipelines, and there no resource to put on this Windows one. We can
re-enable the tests once we reached an agreement on how to fix the
memory alignment bug.
2. Delete anybuild.yml which was for an already deleted pipeline.
3. Move Windows CPU pipelines to AMD CPU machine pools which are
cheaper.
4. Disable some qdq/int8 model tests that will fail if the CPU doesn't
have Intel AVX512 8-bit instructions.
This change moves the DML CI pipeline to the A10 machines and fixes or
disables tests that were failing from this change.
- Max error rate threshold was increased for Image Tests
- Some failing batch tests were disabled
---------
Co-authored-by: Changming Sun <chasun@microsoft.com>
### Description
- Use same data type as input for mask_index tensor which is used as DML
GEMM API's C parameter.
- Remove gsl header include as it is already gets included transitively.
### Motivation and Context
- Why is this change required? What problem does it solve?
Bug found in internal conformance testing.
- If it fixes an open issue, please link to the issue here.
N/A
### Description
1. update model name structure in model_tests.cpp with source name. To
avoid
`Condition test_param_names.count(param_name) == 0 failed. Duplicate
parameterized test name 'BERT_Squad_opset10_CPU'`
2. skip some failed models https://github.com/onnx/models/issues/568
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
# Motivation
Currently, ORT minimal builds use kernel def hashes to map from nodes to
kernels to execute when loading the model. As the kernel def hashes must
be known ahead of time, this works for statically registered kernels.
This works well for the CPU EP.
For this approach to work, the kernel def hashes must also be known at
ORT format model conversion time, which means the EP with statically
registered kernels must also be enabled then. This is not an issue for
the always-available CPU EP. However, we do not want to require that any
EP which statically registers kernels is always available too.
Consequently, we explore another approach to match nodes to kernels that
does not rely on kernel def hashes. An added benefit of this is the
possibility of moving away from kernel def hashes completely, which
would eliminate the maintenance burden of keeping the hashes stable.
# Approach
In a full build, ORT uses some information from the ONNX op schema to
match a node to a kernel. We want to avoid including the ONNX op schema
in a minimal build to reduce binary size. Essentially, we take the
necessary information from the ONNX op schema and make it available in a
minimal build.
We decouple the ONNX op schema from the kernel matching logic. The
kernel matching logic instead relies on per-op information which can
either be obtained from the ONNX op schema or another source.
This per-op information must be available in a minimal build when there
are no ONNX op schemas. We put it in the ORT format model.
Existing uses of kernel def hashes to look up kernels are replaced
with the updated kernel matching logic. We no longer store
kernel def hashes in the ORT format model’s session state and runtime
optimization representations. We no longer keep the logic to
generate and ensure stability of kernel def hashes.
* Contrib Op: FusedMatMul for DML EP
* Added relevant comments and extra validation
* Polish
* More polish
* Last polish
* Addressed comment on the PR
* Addressed comment on the R
* Removed un-necessary comments
* Used c++ standard function
* used std::c++ algorithms function
* Removed unsed code
Co-authored-by: Sumit Agarwal <sumitagarwal@microsoft.com>
Co-authored-by: Dwayne Robinson <fdwr@hotmail.com>
* Fix SAME_UPPER/SAME_LOWER (auto_pad attribute) in ConvTranspose
* Bump ONNX 1.10.2 globally
* load ONNX_VERSION from VERSION_NUMBER
* /
* revert deprecate warning in ORT 1.12
* add a comment about why removing cntk_simple_seg
* correct the implem in DML as well
Prior to this every test shared the same tolerances. This meant
that if an ONNX test failed due to a small but acceptable difference in
output, the only alternative was to disable the test entirely.
In op set 17, the DFT operator is being added. Without this change, the
tests for that operator fail because the output is off by about 5e-5.
It's better to keep test coverage for this new op rather than disable
the test entirely.
Also prior to this change, the global tolerances were not shared between
C++, JavaScript, and Python tests. Now they are.
Also fix various minor issues raised by linters.
Unblocks https://github.com/microsoft/onnxruntime/issues/11640.
* Share thread pools between devices
* make tests reuse device
* Change cpu thread pool options for dml sessions to use 1 thread with no spinning
* fix test failure
* Update missing type constraints for dft
* Add comment and rename inference session parameter
* default missing causing inconsistent test behavior
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
* Add experimental API for editing model name
* Change EditModelName to 'SetName'
* Change API to pass c_string
* Update SetName to edit the proto
* Test that the model proto gets changed
* Remove comments
* Skip inbox tests
* Use filehelper path
Co-authored-by: Numfor Mbiziwo-Tiapo <numform@microsoft.com>
Extend opset 13 support for:
- Split-13
- Squeeze-13
- Unsqueeze-13
- Reshape-13
- QuantizeLinear-13
- DequantizeLinear-13
- ReduceSum-13
- Resize-13
Also:
- Rename the file where all the opset versions are stored from "OperatorRegistration.h" to "OperatorVersions.h", which will make it much less confusing in the future when looking given there's another file called "OperatorRegistration.h" that corresponds to "OperatorRegistration.cpp".
- Detemplatize many of the OperatorHelper.h constructors, which duplicate multiple instantiations due to the operator helper classes not sharing a common base class, by wrapping them with an adapter. Ideally there would be a common COM base interface that both IMLOperatorKernelCreationContext and IMLOperatorShapeInferenceContext implementation objects would implement, which a wrapper in MLOperatorAuthorHelper.h could QI for.
- Fix style formatting issues in OperatorHelper.h (sorry for the noise).
```
Summary: Total=4679, Passed=4355, Failed=0, Blocked=0, Not Run=0, Skipped=324
```
Corresponding WindowsAI PR:
https://microsoft.visualstudio.com/WindowsAI/_git/WindowsAI/pullrequest/6973645
Related work items: #36672908, #36672926
* Make DmlEp Clang compatible for EPIC
* Fix build issues occurred when engine/lotus points to ORT Github latest
* Fix more build errors
* Fixed one build issue and removed temporary changes for Clang
* Addressed comments on the PR.
* [WIP] - DmlEp ORT NO Exception
* Made DmlEp compatible with ORT_NO_EXCEPTION
* Fixed typo
* Addressed comments on the PR, mostly nit styling and using approriate HR error code
* Added dependency of ErrorHandling.h
* Addressed comment on the PR
Co-authored-by: Sumit Agarwal <sumitagarwal@microsoft.com>
* add options to disable cpu copy back
* null check proprties
* only affect gpu outputs
* change name to disabletensorcpusync
* slight refactoring
* Globally enable ms-experimental ops
* change meaning of ms_experimental to mean *all* ms_experimental ops. Some experimental ops will still be enabled globally without this flag like audio ops.
* remove changes incorrectly merged
* bad merge
* add test
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
* Remove APIs unavailable in Store in #8349, #8178, #8065
* Add UWP stubs of C runtime functions
* Remove UWP incompatible tests from UWP build
* Remove incompatible tests from Store
* Use UWP stubs in store only
* Skip partition check outside of Windows
* Remove unused WRL include
* Workaround Windows header not including what it uses
* Fix precompiled header name clash
* Workaround SDK bugs
* DXCore workaround in Win7
* Fix warning
* Fix more warnings
* Bump WinML to target Windows 8
* Fix more warnings
* Remove unnecessary workarounds
* Remove Desktop only APIs from DML adapter
Bug #31652854 also repros on Qualcomm Adreno (down to the exact same pixel). This change disables this model test for Qualcomm, in addition to the existing disablement for Intel.
* Remove APIs unavailable in Store in #8349, #8178, #8065
* Add UWP stubs of C runtime functions
* Remove UWP incompatible tests from UWP build
* Remove incompatible tests from Store
* Use UWP stubs in store only
* Skip partition check outside of Windows
* Remove unused WRL include
* Workaround Windows header not including what it uses
* Fix precompiled header name clash
* Workaround SDK bugs
* DXCore workaround in Win7
* Fix warning
* Fix more warnings
* Bump WinML to target Windows 8
* Fix more warnings
* Remove unnecessary workarounds