### Description
1. make parity_check use local model to avoid using hf token
2. del the model didn't work because it tried to del the object define
out of the function scope.
So it caused out of memory in A10.
3. In fact, 16G GPU memory (one T4) is enough. But the conversion
process always be killed in T4 and it works on A10/24G.
Standard_NC4as_T4_v3 has 28G CPU memory
Standard_NV36ads_A10_v5 has 440G memory.
It looks that the model conversion needs very huge memory.
### Motivation and Context
Last time, I came across some issues in convert_to_onnx.py so I use the
onnx model in https://github.com/microsoft/Llama-2-Onnx for testing.
Now, these issues could be fixed. So I use onnx model generated by this
repo and the CI can cover the model conversion.
Fix pytest version to 7.4.4, higher version will cause error
`from onnxruntime.capi import onnxruntime_validation
ModuleNotFoundError: No module named 'onnxruntime.capi'`
### Description
<!-- Describe your changes. -->
Setup usage of coremltools via dependencies instead of copying files.
Pull in some changes from
https://github.com/microsoft/onnxruntime/pull/19347 in preparation for
supporting ML Program and enabling building the ML Model on all
platforms to make development and testing of CoreML EP code easier.
- Update to coremltools 7.1
- Add patch for changes required for cross platform build of ML Program
related code
- Generate coreml proto files on all platforms
- mainly to test these changes work everywhere, as the proto files will
be used on all platforms when #19347 is checked in
- rename onnxruntime_coreml_proto target to coreml_proto as it contains
purely coreml protobuf code with no ORT related chagnes
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Improve setup.
### Description
1. save the model to pipeline cache
2. lower the similarly bar to 97
3. publish the generated image that we can check it once the test fails
### Motivation and Context
Reduce model downloads
### Description
<!-- Describe your changes. -->
Updates to only include ios archs framework in artifacts included in
Nuget Package.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Related issue:
https://github.com/microsoft/onnxruntime/issues/19295#issuecomment-1914143256
---------
Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
### Description
In PR #19073 I mistunderstood the value of "--parallel". Instead of
testing if args.parallel is None or not , I should test the returned
value of number_of_parallel_jobs function.
If build.py was invoked without --parallel, then args.parallel equals to
1. Because it is the default value. Then we should not add "/MP".
However, the current code adds it. Because if `args.paralllel` is
evaluated to `if 1` , which is True.
If build.py was invoked with --parallel with additional numbers, then
args.parallel equals to 0. Because it is unspecified. Then we should add
"/MP". However, the current code does not add it. Because `if
args.paralllel` is evaluated to `if 0` , which is False.
This also adds a new build flag: use_binskim_compliant_compile_flags, which is intended to be only used in ONNX Runtime team's build pipelines for compliance reasons.
### Motivation and Context
### Description
1. Add visual parity test based on openai clip model
2. Add trigger rules
### Motivation and Context
1. check generated image is expected
2. reduce unnecessary triggers
### Description
Fix two issues:
(1) We can only use single quote inside `bash -c "..."`. Current
pipeline job stopped at `python3 demo_txt2img.py astronaut` and skip the
following commands. In this change, we remove the remaining commands to
get same effect (otherwise, the pipeline runtime might be 2 hours
instead of 15 minutes).
(2) Fix a typo of Stable.
### Description
Update abseil to a release tag and register neural_speed to CG.
### Motivation and Context
Now we are using a non-relesed version of abseil. Using a tag is better.
### Description
1. Update Linux GPU machine from T4 to A10, sm=8.6
2. update the tolerance
### Motivation and Context
1. Free more T4 and test with higher compute capability.
2. ORT enables TF32 in GEMM for A10/100. TF32 will cause precsion loss
and fail this test
```
2024-01-19T13:27:18.8302842Z [ RUN ] ModelTests/ModelTest.Run/cuda__models_zoo_opset12_SSD_ssd12
2024-01-19T13:27:25.8438153Z /onnxruntime_src/onnxruntime/test/providers/cpu/model_tests.cc:347: Failure
2024-01-19T13:27:25.8438641Z Expected equality of these values:
2024-01-19T13:27:25.8438841Z COMPARE_RESULT::SUCCESS
2024-01-19T13:27:25.8439276Z Which is: 4-byte object <00-00 00-00>
2024-01-19T13:27:25.8439464Z ret.first
2024-01-19T13:27:25.8445514Z Which is: 4-byte object <01-00 00-00>
2024-01-19T13:27:25.8445962Z expected 0.145984 (3e157cc1), got 0.975133 (3f79a24b), diff: 0.829149, tol=0.0114598 idx=375. 20 of 388 differ
2024-01-19T13:27:25.8446198Z
2024-01-19T13:27:25.8555736Z [ FAILED ] ModelTests/ModelTest.Run/cuda__models_zoo_opset12_SSD_ssd12, where GetParam() = "cuda_../models/zoo/opset12/SSD/ssd-12.onnx" (7025 ms)
2024-01-19T13:27:25.8556077Z [ RUN ] ModelTests/ModelTest.Run/cuda__models_zoo_opset12_YOLOv312_yolov312
2024-01-19T13:27:29.3174318Z /onnxruntime_src/onnxruntime/test/providers/cpu/model_tests.cc:347: Failure
2024-01-19T13:27:29.3175144Z Expected equality of these values:
2024-01-19T13:27:29.3175389Z COMPARE_RESULT::SUCCESS
2024-01-19T13:27:29.3175812Z Which is: 4-byte object <00-00 00-00>
2024-01-19T13:27:29.3176080Z ret.first
2024-01-19T13:27:29.3176322Z Which is: 4-byte object <01-00 00-00>
2024-01-19T13:27:29.3178431Z expected 4.34958 (408b2fb8), got 4.51324 (40906c80), diff: 0.16367, tol=0.0534958 idx=9929. 22 of 42588 differ
```
3. some other test like SSD throw other exception, so skip them
'''
2024-01-22T09:07:40.8446910Z [ RUN ]
ModelTests/ModelTest.Run/cuda__models_zoo_opset12_SSD_ssd12
2024-01-22T09:07:51.5587571Z
/onnxruntime_src/onnxruntime/test/providers/cpu/model_tests.cc:358:
Failure
2024-01-22T09:07:51.5588512Z Expected equality of these values:
2024-01-22T09:07:51.5588870Z COMPARE_RESULT::SUCCESS
2024-01-22T09:07:51.5589467Z Which is: 4-byte object <00-00 00-00>
2024-01-22T09:07:51.5589953Z ret.first
2024-01-22T09:07:51.5590462Z Which is: 4-byte object <01-00 00-00>
2024-01-22T09:07:51.5590841Z expected 1, got 63
'''
### Description
Adds a job to create a nightly python package for ORT/QNN on Windows
ARM64.
Must build onnxruntime-qnn with python 3.11 and numpy 1.25.
**Note: pipeline run may take up to 3 hrs**
### Motivation and Context
Make it possible to get a nightly python package with the latest updates
to QNN EP.
Issue #19161
### Description
<!-- Describe your changes. -->
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
<!-- Describe your changes. -->
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
<!-- Describe your changes. -->
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
<!-- Describe your changes. -->
### Motivation and Context
Linux_GPU_x64 job in the pipeline has been canceled due to timeout since
0112.
### Description
This way, we will not need to update the windows images constantly and
allow more flexibility to choose the cuda version in the future.
### Description
Disable ccache for all the jobs in in Windows CPU CI pipeline.
Before disabling it, the build has a warning that:
"MSIL .netmodule or module compiled with /GL found; restarting link with
/LTCG; add /LTCG to the link command line to improve linker performance"
After disabling it, the warning is gone and the build doesn't use /GL or
/LTCG.
Cache itself should not cause this difference.
### Motivation and Context
### Description
<!-- Describe your changes. -->
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
1. Add two build jobs for enabling Address Sanitizer in CI. One for
Windows CPU, One for Linux CPU.
2. Set default compiler flags/linker flags in build.py for normal
Windows/Linux/MacOS build. This can help control compiler flags in a
more centralized way.
3. All Windows binaries in our official packages will be built with
"/PROFILE" flag. Symbols of onnxruntime.dll can be found at [Microsoft
public symbol
server](https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/microsoft-public-symbols).
Limitations:
1. On Linux Address Sanitizer ignores RPATH settings in ELF binaries.
Therefore once Address Sanitizer is enabled, before running tests we
need to manually set LD_LIBRARY_PATH properly otherwise
libonnxruntime.so may not be able to find custom ops and shared EPs.
4. On Linux we also need to set LD_PRELOAD before running some tests(if
the main executable, like python, is not built with address sanitizer.
On Windows we do not need to.
5. On Windows before running python tests we should manually copy
address sanitizer DLL to the onnxruntime/capi directory, because python
3.8 and above has enabled "Safe DLL Search Mode" that wouldn't use the
information provided by PATH env.
6. On Linux Address Sanitizer found a lot of memory leaks from our
python binding code. Therefore right now we cannot enable Address
Sanitizer when building ONNX Runtime with python binding.
7. Address Sanitizer itself uses a lot of memory address space and
delays memory deallocations, which is easy to cause OOM issues in 32-bit
applications. We cannot run all the tests in onnxruntime_test_all in
32-bit mode with Address Sanitizer due to this reason. However, we still
can run individual tests in such a way. We just cannot run all of them
in one process.
### Motivation and Context
To catch memory issues.
### Description
Set pythonInterpreter in set-python-manylinux-variables-step.yml. To fix
a build error:
```
Starting: Set Python manylinux variables
==============================================================================
Task : Python script
Description : Run a Python file or inline script
Version : 0.231.1
Author : Microsoft Corporation
Help : https://docs.microsoft.com/azure/devops/pipelines/tasks/utility/python-script
==============================================================================
##[error]Parameter 'toolPath' cannot be null or empty.
Finishing: Set Python manylinux variables
```
The error was because today I deleted a bunch of software from the VM
image. The task might fail if no Python versions are found in
$(Agent.ToolsDirectory).
### Description
<!-- Describe your changes. -->
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
---------
Co-authored-by: Yi Zhang <zhanyi@microsoft.com>
### Description
Adding python3.12 support to ORT
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
1. Remove Windows ARM32 from nuget packaging pipelines
2. Add missing component-governance-component-detection-steps.yml to
some build jobs.
### Motivation and Context
Stop supporting Windows ARM32 to align with [Windows's support
policy](https://learn.microsoft.com/en-us/windows/arm/arm32-to-arm64).
Users who need this feature still can build the DLLs from source.
However, later on we will remove that support too.
### Description
- Removes `--disable_ml_ops` build flag
- Automatically detects ORT version from VERSION file via
`templates/set-version-number-variables-step.yml`. We will no longer
need to create a commit to update ORT versions.
### Motivation and Context
- A new unit test caused failures in the QNN Nuget pipeline because it
did not enable ml ops.
- Automate ORT version specification
### Description
Change all macOS python packages to use universal2, to reduce the number
of packages we have.
### Motivation and Context
According to [wikipedia](https://en.wikipedia.org/wiki/MacOS_Big_Sur),
macOS 11 is the first macOS version that supports universal 2. And it is
the min macOS version we support. So we no longer need to maintain
separate binaries for different CPU archs.
### Description
- Add mutex to protect QNN API calls for executing a graph and
extracting the corresponding profile data.
- Ensures QNN EP's execute function does not store unnecessary state
(i.e., input and output buffer pointers do not need to be stored as
class members.)
### Motivation and Context
Allow calling `session.Run()` from multiple threads when using QNN EP.
### Description
1. Update donwload-artifacts to flex-downloadartifacts to make it eaiser
to debug.
2. Move the native files into Gpu.Windows and Gpu-linux packages.
Onnxruntime-Gpu has dependency on them.
3. update the package validation as well
4. Add 2 stages to run E2E test for GPU.Windows and GPU.Linux
for example:

### Motivation and Context
Single Onnxruntime.Gpu Package size has already excceded the Nuget size
limit.
We split the package into some smaller packages to make them can be
published.
For compatibility, the user can install or upgrade Onnxruntime.Gpu,
which will install Gpu.Windows and Gpu.Linux automatically.
And the user can only install Gpu.Windows and Gpu.Linux directly.
### Test Link
1. In ORT_NIGHTLY
2. Install the preview version in nuget-int. (nuget source:
https://apiint.nugettest.org/v3/index.json)
---------
Co-authored-by: Scott McKay <skottmckay@gmail.com>
Move QNN EP provider options to session options
### Description
Need to use session option to support multi-partition for context cache feature. To smooth the transaction, move the provider options to session options first.
This is the first step for PR:
PR https://github.com/microsoft/onnxruntime/pull/18865
### Description
<!-- Describe your changes. -->
Add LeakyRelu to the list as support was added a while ago.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
1. Add a CodeSign validation task before the binaries are published, to
make sure all DLL files are signed.
2. Auto-trigger the CUDA 12 pipeline's publishing job.
### Description
Fixes a failure in the ortmodule nightly pipeline.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
<!-- Describe your changes. -->
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
Change Nuget packaging pipeline's build TRT job to download CUDA SDK
on-the-fly, so that we do not need to put a CUDA SDK in the build
machine's image.
### Description
Update absl and googletest to their latest version to include some cmake
changes:
1. A googletest's cmake change that will allow using external absl and
re2.
2. Nullability enhancements that will allow our clang-based static
analysis detecting many kinds of null pointer errors.
### Motivation and Context
To fix a C4744 link warning in our Windows pipelines.
```
LINK : warning C4744: 'static char const absl::lts_20230802::base_internal::FastTypeTag<bool>::dummy_var' has different type in 'd:\a\_work\_temp\abseil_cpp\abseil-cpp-20230802.0\absl\flags\parse.cc' and 'd:\a\_work\1\b\relwithdebinfo\_deps\googletest-src\googletest\src\gtest-all.cc': 'signed char' and 'unsigned char' [D:\a\_work\1\b\RelWithDebInfo\onnxruntime_mlas_test.vcxproj]
LINK : warning C4744: 'static char const absl::lts_20230802::base_internal::FastTypeTag<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > >::dummy_var' has different type in 'd:\a\_work\_temp\abseil_cpp\abseil-cpp-20230802.0\absl\flags\parse.cc' and 'd:\a\_work\1\b\relwithdebinfo\_deps\googletest-src\googletest\src\gtest-all.cc': 'signed char' and 'unsigned char' [D:\a\_work\1\b\RelWithDebInfo\onnxruntime_mlas_test.vcxproj]
LINK : warning C4744: 'static char const absl::lts_20230802::base_internal::FastTypeTag<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > >::dummy_var' has different type in 'd:\a\_work\_temp\abseil_cpp\abseil-cpp-20230802.0\absl\flags\internal\usage.cc' and 'd:\a\_work\1\b\relwithdebinfo\_deps\googletest-src\googletest\src\gtest-all.cc': 'signed char' and 'unsigned char' [D:\a\_work\1\b\RelWithDebInfo\onnxruntime_mlas_test.vcxproj]
LINK : warning C4744: 'static char const absl::lts_20230802::base_internal::FastTypeTag<bool>::dummy_var' has different type in 'd:\a\_work\_temp\abseil_cpp\abseil-cpp-20230802.0\absl\flags\internal\flag.cc' and 'd:\a\_work\1\b\relwithdebinfo\_deps\googletest-src\googletest\src\gtest-all.cc': 'signed char' and 'unsigned char' [D:\a\_work\1\b\RelWithDebInfo\onnxruntime_mlas_test.vcxproj]
LINK : warning C4744: 'static char const absl::lts_20230802::base_internal::FastTypeTag<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > >::dummy_var' has different type in 'd:\a\_work\_temp\abseil_cpp\abseil-cpp-20230802.0\absl\flags\internal\flag.cc' and 'd:\a\_work\1\b\relwithdebinfo\_deps\googletest-src\googletest\src\gtest-all.cc': 'signed char' and 'unsigned char' [D:\a\_work\1\b\RelWithDebInfo\onnxruntime_mlas_test.vcxproj]
LINK : warning C4744: 'static char const absl::lts_20230802::base_internal::FastTypeTag<int>::dummy_var' has different type in 'd:\a\_work\_temp\abseil_cpp\abseil-cpp-20230802.0\absl\flags\internal\flag.cc' and 'd:\a\_work\1\b\relwithdebinfo\_deps\googletest-src\googletest\src\gtest-all.cc': 'signed char' and 'unsigned char' [D:\a\_work\1\b\RelWithDebInfo\onnxruntime_mlas_test.vcxproj]
```
### Description
ONNX model zoo changed their dir structure. So some our pipelines are
failing. In prevent such things happening again, we'd better to read the
test data for a cache from local disk instead of downloading it remotely
every time.
### Description
<!-- Describe your changes. -->
Add macos build for objc pod.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Follow up pr for #18550
---------
Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
### Description
The warning is:
```
C:\a\_work\1\s\onnxruntime\contrib_ops\cpu\math\sparse_dense_matmul.cc(88,54): warning C4244: 'argument': conversion from 'const __int64' to 'Eigen::EigenBase<Derived>::Index', possible loss of data [C:\a\_work\1\b\RelWithDebInfo\onnxruntime_providers.vcxproj]
2023-12-08T20:58:48.1812949Z with
2023-12-08T20:58:48.2144272Z [
2023-12-08T20:58:48.2145285Z Derived=Eigen::Map<const Eigen::SparseMatrix<uint64_t,1,int64_t>,0,Eigen::Stride<0,0>>
2023-12-08T20:58:48.2801935Z ]
2023-12-08T20:58:48.2804047Z C:\a\_work\1\s\onnxruntime\contrib_ops\cpu\math\sparse_dense_matmul.cc(82,8): message : while compiling class template member function 'void onnxruntime::contrib::`anonymous-namespace'::SparseToDenseCsr<uint64_t>::operator ()(const onnxruntime::contrib::`anonymous-namespace'::ComputeCtx &,const onnxruntime::SparseTensor &,const onnxruntime::Tensor &,onnxruntime::Tensor &) const' [C:\a\_work\1\b\RelWithDebInfo\onnxruntime_providers.vcxproj]
2023-12-08T20:58:48.2806197Z C:\a\_work\1\s\include\onnxruntime\core/framework/data_types_internal.h(302,27): message : see the first reference to 'onnxruntime::contrib::`anonymous-namespace'::SparseToDenseCsr<uint64_t>::operator ()' in 'onnxruntime::utils::mltype_dispatcher_internal::CallableDispatchableHelper::Invoke' (compiling source file C:\a\_work\1\s\onnxruntime\contrib_ops\cpu\math\sparse_dense_matmul.cc) [C:\a\_work\1\b\RelWithDebInfo\onnxruntime_providers.vcxproj]
2023-12-08T20:58:48.2871783Z C:\a\_work\1\s\include\onnxruntime\core/framework/data_types_internal.h(438,100): message : see reference to class template instantiation 'onnxruntime::contrib::`anonymous-namespace'::SparseToDenseCsr<uint64_t>' being compiled (compiling source file C:\a\_work\1\s\onnxruntime\contrib_ops\cpu\math\sparse_dense_matmul.cc) [C:\a\_work\1\b\RelWithDebInfo\onnxruntime_providers.vcxproj]
2023-12-08T20:58:48.2893010Z C:\a\_work\1\s\include\onnxruntime\core/framework/data_types_internal.h(414,5): message : see reference to function template instantiation 'void onnxruntime::utils::MLTypeCallDispatcher<float,double,int32_t,uint32_t,int64_t,uint64_t>::InvokeWithLeadingTemplateArgs<Fn,onnxruntime::TypeList<>,onnxruntime::contrib::`anonymous-namespace'::ComputeCtx&,const T&,const onnxruntime::Tensor&,onnxruntime::Tensor&>(onnxruntime::contrib::`anonymous-namespace'::ComputeCtx &,const T &,const onnxruntime::Tensor &,onnxruntime::Tensor &) const' being compiled [C:\a\_work\1\b\RelWithDebInfo\onnxruntime_providers.vcxproj]
2023-12-08T20:58:48.2894476Z with
2023-12-08T20:58:48.2911521Z [
2023-12-08T20:58:48.2912457Z Fn=onnxruntime::contrib::`anonymous-namespace'::SparseToDenseCsr,
2023-12-08T20:58:48.3067840Z T=onnxruntime::SparseTensor
2023-12-08T20:58:48.3068863Z ] (compiling source file C:\a\_work\1\s\onnxruntime\contrib_ops\cpu\math\sparse_dense_matmul.cc)
2023-12-08T20:58:48.3195854Z C:\a\_work\1\s\onnxruntime\contrib_ops\cpu\math\sparse_dense_matmul.cc(198,11): message : see reference to function template instantiation 'void onnxruntime::utils::MLTypeCallDispatcher<float,double,int32_t,uint32_t,int64_t,uint64_t>::Invoke<onnxruntime::contrib::`anonymous-namespace'::SparseToDenseCsr,onnxruntime::contrib::`anonymous-namespace'::ComputeCtx&,const T&,const onnxruntime::Tensor&,onnxruntime::Tensor&>(onnxruntime::contrib::`anonymous-namespace'::ComputeCtx &,const T &,const onnxruntime::Tensor &,onnxruntime::Tensor &) const' being compiled [C:\a\_work\1\b\RelWithDebInfo\onnxruntime_providers.vcxproj]
2023-12-08T20:58:48.3197946Z with
2023-12-08T20:58:48.3198565Z [
2023-12-08T20:58:48.3199093Z T=onnxruntime::SparseTensor
2023-12-08T20:58:48.3905678Z ]
2023-12-08T20:58:48.3907275Z C:\a\_work\1\s\onnxruntime\contrib_ops\cpu\math\sparse_dense_matmul.cc(198,36): message : see the first reference to 'onnxruntime::utils::MLTypeCallDispatcher<float,double,int32_t,uint32_t,int64_t,uint64_t>::Invoke' in 'onnxruntime::contrib::SparseToDenseMatMul::Compute' [C:\a\_work\1\b\RelWithDebInfo\onnxruntime_providers.vcxproj]
2023-12-08T20:58:48.3910999Z ##[warning]onnxruntime\contrib_ops\cpu\math\sparse_dense_matmul.cc(88,43): Warning C4244: 'argument': conversion from 'const __int64' to 'Eigen::EigenBase<Derived>::Index', possible loss of data
2023-12-08T20:58:48.3912734Z 182>C:\a\_work\1\s\onnxruntime\contrib_ops\cpu\math\sparse_dense_matmul.cc(88,43): warning C4244: 'argument': conversion from 'const __int64' to 'Eigen::EigenBase<Derived>::Index', possible loss of data [C:\a\_work\1\b\RelWithDebInfo\onnxruntime_providers.vcxproj]
2023-12-08T20:58:48.3913414Z with
2023-12-08T20:58:48.3913660Z [
2023-12-08T20:58:48.3914001Z Derived=Eigen::Map<const Eigen::SparseMatrix<uint64_t,1,int64_t>,0,Eigen::Stride<0,0>>
2023-12-08T20:58:48.3914499Z ]
2023-12-08T20:58:48.3914743Z qlinear_concat.cc
2023-12-08T20:58:48.3917082Z ##[warning]onnxruntime\contrib_ops\cpu\math\sparse_dense_matmul.cc(92,74): Warning C4244: 'argument': conversion from 'const __int64' to 'Eigen::EigenBase<Derived>::Index', possible loss of data
2023-12-08T20:58:48.3918624Z 182>C:\a\_work\1\s\onnxruntime\contrib_ops\cpu\math\sparse_dense_matmul.cc(92,74): warning C4244: 'argument': conversion from 'const __int64' to 'Eigen::EigenBase<Derived>::Index', possible loss of data [C:\a\_work\1\b\RelWithDebInfo\onnxruntime_providers.vcxproj]
2023-12-08T20:58:48.5534583Z with
2023-12-08T20:58:48.5541266Z [
2023-12-08T20:58:48.5542401Z Derived=Eigen::Map<const Eigen::Matrix<uint64_t,-1,-1,1,-1,-1>,0,Eigen::Stride<0,0>>
2023-12-08T20:58:48.5544914Z ]
2023-12-08T20:58:48.5548670Z ##[warning]onnxruntime\contrib_ops\cpu\math\sparse_dense_matmul.cc(92,63): Warning C4244: 'argument': conversion from 'const __int64' to 'Eigen::EigenBase<Derived>::Index', possible loss of data
2023-12-08T20:58:48.5552099Z 182>C:\a\_work\1\s\onnxruntime\contrib_ops\cpu\math\sparse_dense_matmul.cc(92,63): warning C4244: 'argument': conversion from 'const __int64' to 'Eigen::EigenBase<Derived>::Index', possible loss of data [C:\a\_work\1\b\RelWithDebInfo\onnxruntime_providers.vcxproj]
2023-12-08T20:58:48.5553712Z with
2023-12-08T20:58:48.5555569Z [
2023-12-08T20:58:48.5556779Z Derived=Eigen::Map<const Eigen::Matrix<uint64_t,-1,-1,1,-1,-1>,0,Eigen::Stride<0,0>>
2023-12-08T20:58:48.5558707Z ]
2023-12-08T20:58:48.5561428Z ##[warning]onnxruntime\contrib_ops\cpu\math\sparse_dense_matmul.cc(93,90): Warning C4244: 'argument': conversion from 'const __int64' to 'Eigen::EigenBase<Derived>::Index', possible loss of data
2023-12-08T20:58:48.5565624Z 182>C:\a\_work\1\s\onnxruntime\contrib_ops\cpu\math\sparse_dense_matmul.cc(93,90): warning C4244: 'argument': conversion from 'const __int64' to 'Eigen::EigenBase<Derived>::Index', possible loss of data [C:\a\_work\1\b\RelWithDebInfo\onnxruntime_providers.vcxproj]
2023-12-08T20:58:48.5566354Z with
2023-12-08T20:58:48.5568185Z [
2023-12-08T20:58:48.5569305Z Derived=Eigen::Map<Eigen::Matrix<uint64_t,-1,-1,1,-1,-1>,0,Eigen::Stride<0,0>>
2023-12-08T20:58:48.5571339Z ]
2023-12-08T20:58:48.5574864Z ##[warning]onnxruntime\contrib_ops\cpu\math\sparse_dense_matmul.cc(93,77): Warning C4244: 'argument': conversion from 'const __int64' to 'Eigen::EigenBase<Derived>::Index', possible loss of data
2023-12-08T20:58:48.5577866Z 182>C:\a\_work\1\s\onnxruntime\contrib_ops\cpu\math\sparse_dense_matmul.cc(93,77): warning C4244: 'argument': conversion from 'const __int64' to 'Eigen::EigenBase<Derived>::Index', possible loss of data [C:\a\_work\1\b\RelWithDebInfo\onnxruntime_providers.vcxproj]
2023-12-08T20:58:48.5578562Z with
2023-12-08T20:58:48.5580399Z [
2023-12-08T20:58:48.5581503Z Derived=Eigen::Map<Eigen::Matrix<uint64_t,-1,-1,1,-1,-1>,0,Eigen::Stride<0,0>>
2023-12-08T20:58:48.5583465Z ]
2023-12-08T20:58:48.5587661Z ##[warning]onnxruntime\contrib_ops\cpu\math\sparse_dense_matmul.cc(88,54): Warning C4244: 'argument': conversion from 'const __int64' to 'Eigen::EigenBase<Derived>::Index', possible loss of data
2023-12-08T20:58:48.5590705Z 182>C:\a\_work\1\s\onnxruntime\contrib_ops\cpu\math\sparse_dense_matmul.cc(88,54): warning C4244: 'argument': conversion from 'const __int64' to 'Eigen::EigenBase<Derived>::Index', possible loss of data [C:\a\_work\1\b\RelWithDebInfo\onnxruntime_providers.vcxproj]
2023-12-08T20:58:48.5591396Z with
2023-12-08T20:58:48.5593220Z [
2023-12-08T20:58:48.5593693Z Derived=Eigen::Map<const Eigen::SparseMatrix<int64_t,1,int64_t>,0,Eigen::Stride<0,0>>
2023-12-08T20:58:48.5595955Z ]
```
And the warning in #18195
### Motivation and Context
AB#22894
---------
Co-authored-by: Dmitri Smirnov <yuslepukhin@users.noreply.github.com>
### Description
Move NuGet nightly package publishing job to a separated pipeline.
Before this change, it runs at the end of 'Zip-Nuget-Java-Nodejs
Packaging Pipeline'. This PR moves it to a separate pipeline so that we
can manually trigger this step for any branch(e.g. release branches).
### Description
<!-- Describe your changes. -->
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
<!-- Describe your changes. -->
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
Updating transformers package in test pipeline to fix a security
vulnerability.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
Update absl and gtest to fix an ARM64EC build error
### Motivation and Context
We need to get an important fix into ORT.
The fix is:
8028a87c96
### Description
reuse EO pool in NPM pipeline.
### Motivation and Context
build_web_debug failed in onnxruntime-Win-CPU-2022 but it works in EO
pool.
Reuse EO pool to make the pipeline work now.
When I'm free, I'll try upgrading the chrome in the custom image.
### Description
<!-- Describe your changes. -->
As title.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
yolo-v8 model missing operator support.
---------
Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
### Description
- Update QNN CI Pipelines to use QNN SDK version 2.17.0
- **Print warning if unit test requires adjusted tolerance to pass**
- **Temporarily disable unloading QnnCpu.dll for windows x64 due to
crash when calling FreeLibrary**
- Enable fixed HTP tests
- QnnHTPBackendTests.LayerNorm1D_LastAxis_DynamicScale
- QnnHTPBackendTests.GlobalMaxPool_LargeInput2_u8
- QnnHTPBackendTests.ReduceSumS8Opset13_Rank5
- QnnHTPBackendTests.ReduceSumU8Opset13_Rank5_LastAxis
- QnnHTPBackendTests.WhereLargeDataBroadcastU8
- QnnHTPBackendTests.WhereLargeDataBroadcastTransformedU8
- Enabled fixed CPU tests
- QnnCPUBackendTests.Resize_DownSample_Linear_AlignCorners_scales
- Increased tolerance for HTP tests that are less accurate on QNN SDK
2.17.0
- QnnHTPBackendTests.AveragePool_CountIncludePad_HTP_u8
- QnnHTPBackendTests.AveragePool_AutopadSameUpper_HTP_u8
- QnnHTPBackendTests.AveragePool_AutopadSameLower_HTP_u8
- QnnHTPBackendTests.ConvU8U8S32_bias_dynamic_input
- QnnHTPBackendTests.ConvU8U8S32_bias_initializer
- QnnHTPBackendTests.ConvU8U8S32_large_input1_padding_bias_initializer
- QnnHTPBackendTests.LRNSize3
- QnnHTPBackendTests.LRNSize5
- QnnHTPBackendTests.MaxPool_Large_Input_HTP_u8
- QnnHTPBackendTests.MaxPool_LargeInput_1Pads
- QnnHTPBackendTests.Resize_DownSample_Linear_HalfPixel
- QnnHTPBackendTests.ResizeU8_2xLinearPytorchHalfPixel
- QnnHTPBackendTests.ResizeU8_2xLinearHalfPixel
- QnnHTPBackendTests.ResizeU8_2xLinearAlignCorners
- QnnHTPBackendTests.ResizeU8_2xLinearAsymmetric
- Disabled ONNX model tests
- averagepool_2d_ceil: Accuracy issues **only on Windows x64
QnnCpu.dll**
- Disabled QDQ model tests (onnx_test_runner)
- facedetection_op8_qdq: Accuracy issues
- Disabled CPU EP tests (these use QnnCpu.dll)
- ActivationOpTest.Relu: QNN SDK 2.17 Relu treats inf as FLT_MAX
- GemmOpTypedTests/0.TestGemmBroadcast: Inaccuracy when weight is
initializer and bias is not
- MathOpTest.MatMulFloatType "test padding and broadcast B > A":
Inaccuracy (**only linux**)
- Fix Gemm translation bugs in QNN EP:
- Do not skip processing of inputs that need to be transposed.
### Motivation and Context
- Allow testing with newest QNN SDK version
- Take advantage of improvements to enable new models.
### Description
To make the code more consistent. Now some TRT pipelines download TRT
binaries on-the-fly, while other TRT pipelines use a preinstalled
version. This PR make them the same.
### Description
<!-- Describe your changes. -->
Remove developement id and force codesign not required in the test macos
target.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Fix failure happened in iOS_Full_xcframwork stage in
Zip-Nuget-Java-NodeJS packaging pipeline.
---------
Co-authored-by: rachguo <rachguo@rachguos-Mac-mini.local>
### Description
1. Add a new stage to download java tools from https://oss.sonatype.org
and publish them to pipeline artifact
2. Remove downloads in other jobs, they get the java tools from pipeline
artifact
3. consolidate final_java_testing stages.
### Motivation and Context
Reduce downloads to reduce the connection error like below.
```
--2023-11-28 07:16:31-- https://oss.sonatype.org/service/local/repositories/releases/content/org/junit/platform/junit-platform-console-standalone/1.6.2/junit-platform-console-standalone-1.6.2.jar
Resolving oss.sonatype.org (oss.sonatype.org)... 3.227.40.198, 3.229.50.23
Connecting to oss.sonatype.org (oss.sonatype.org)|3.227.40.198|:443... connected.
HTTP request sent, awaiting response... 502 Bad Gateway
2023-11-28 07:16:32 ERROR 502: Bad Gateway.
```
### Description
Currently, the `drop-nuget` artifact only contains protoc.exe which is
also part of the `drop-extra` artifact.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
<!-- Describe your changes. -->
As title.
1. Add macos build as an optionally enabled arch for pod and changes to
exsiting build_ios_framework/assemble_c_pod scripts.
2. Enable macos build arch in ios packaging pipeline (currently for
variants other than Mobile) and check the output artifacts are correct.
3. Write MacOS Test Target scheme in the test app and integrate into ios
packaging CI testing pipeline.
Currently the changes only apply to onnxruntime-c pod. as the original
request was from ORT SPM which consumes the onnxruntime-c pod only as
the binary target. TODO: could look into adding macos platform to objc
pod as well.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Enable macos platform support in cocoapods. and also potentially produce
binary target for enabling macos platform in SPM as well.
Replace https://github.com/microsoft/onnxruntime/pull/18334
---------
Co-authored-by: rachguo <rachguo@rachguos-Mac-mini.local>
Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
### Description
Update Azure-Pipelines-EO-Windows2022-aiinfra to
onnxruntime-win-CPU-2022 in Nuget_Package_CPU.
To make the debugging easier, use flex-downloadPipelineArtifact
### Motivation and Context
Azure-Pipelines-EO-Windows2022-aiinfra is using 1ES window-latest image.
The pipeline might be failed by unexpected upgrade.
Verified:
https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=384425&view=results
### P.S.
I think we should replace all Azure-Pipelines-EO-Windows2022-aiinfra.
### Description
<!-- Describe your changes. -->
As title.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Added for yolov8 model missing operator support.
https://github.com/microsoft/onnxruntime/issues/17654
Now the model support info looks like:
_CoreMLExecutionProvider::GetCapability, number of partitions supported
by CoreML: 3 number of nodes in the graph: 233 number of nodes supported
by CoreML: 230_
(only missing 3 concat op support due to input 3d shape is not currently
support in CoreML EP Concat).
---------
Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
Co-authored-by: rachguo <rachguo@rachguos-Mac-mini.local>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
### Description
<!-- Describe your changes. -->
Build ORT-training packaging pipeline for CUDA 12.2
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
This will help any customer using CUDA 12 and would not need to build
ORT-training from source
Test run:
https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=382993&view=logs&s=130be951-c2f3-5601-5709-434b5e50ddb0
### Description
<!-- Describe your changes. -->
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
<!-- Describe your changes. -->
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Recent PyTorch breaks DORT CI and [a
patch](https://github.com/pytorch/pytorch/pull/113697) has been merged
into PyTorch main. In order to update DORT's CI, we made dummy change in
this PR.
### Description
It causes our "NPM Packaging Pipeline" to fail.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
Always run emsdk_env.sh before build.py, even when ccache is disabled
This is a follow up to #18434. That PR didn't handle the case when
ccache was disabled.
This also set the Path variable for the downloaded libraries.
### Description
<!-- Describe your changes. -->
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
<!-- Describe your changes. -->
1. Introduce MoE CUDA op to ORT based on FT implementation.
2. Upgrade cutlass to 3.1.0 to avoid some build failures on Windows.
Remove patch file for cutlass 3.0.0.
3. Sharded MoE implementation will come with another PR
limitation: __CUDA_ARCH__ >= 700
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
add CI steps to log info for test failure investigating.
Currently Web CI is marked as 'optional'. This change adds some script
to dump debug info for investigating the random test failure