Commit graph

271 commits

Author SHA1 Message Date
Changming Sun
228dd16893
Bump clang-format from 18.1.8 to 19.1.6 (#23346)
To replace #23327
2025-01-14 09:02:04 -08:00
A-Satti
f5293d253c
Update Intel Thread Counts (#22894)
### Description
The default thread count methodology in onnxruntime did not account for
upcoming Intel microarchitectures, leading to a suboptimal thread
count. Optimizing the thread count for the new Intel microarchitectures
yields gains on the majority of models across datatypes, with speedups
of up to ~1.5x.


### Motivation and Context
Applications should run on Intel with the most performant thread
configuration for the majority of models. With new microarchitectures,
adjusting the thread count methodology is required to take advantage of
their differences.
2024-12-06 13:56:50 -08:00
Jake Mathern
c0b68e77af
Fix warnings (#21809)
### Description
Minor changes to resolve some warnings in ORT

### Motivation and Context
Binskim for WindowsAI (which consumes ORT) treats warnings as errors,
and has hit these warnings.
As a security requirement, warnings like "signed/unsigned mismatch" must
be resolved.
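
For illustration only (this is not code from the PR), a "signed/unsigned mismatch" warning (C4018 on MSVC) typically appears when a signed loop index is compared against an unsigned container size. A minimal sketch of the usual fix, where `CountPositive` is a hypothetical helper:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not taken from the ORT sources.
// Before: `for (int i = 0; i < values.size(); ++i)` compares a signed int
// with an unsigned size_t and triggers a signed/unsigned mismatch warning.
// After: the index uses the same unsigned type as the container size.
inline std::size_t CountPositive(const std::vector<int>& values) {
  std::size_t count = 0;
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (values[i] > 0) {
      ++count;
    }
  }
  return count;
}
```

Switching the index type is usually preferable to adding a cast at the comparison, since a cast can hide a genuinely negative value.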
2024-08-21 14:23:37 -07:00
Justin Chu
c203d89958
Update ruff and clang-format versions (#21479)
ruff -> 0.5.4
clang-format -> 18
2024-07-24 11:50:11 -07:00
mindest
5b9369e93c
Fix typos according to reviewdog report. (#21335)
### Description
Fix typos based on reviewdog report but with some
exceptions/corrections.
2024-07-22 13:37:32 -07:00
Changming Sun
2c53b4a534
Remove core/common/gsl.h (#20894)
### Description
It might be easier if we just directly include the original gsl headers.
"core/common/gsl.h" is an indirection that doesn't provide extra help.
2024-07-08 18:09:39 -07:00
Patrice Vignola
4d98f06f93
[DML EP] Add GroupQueryAttention (#20327) 2024-04-19 10:25:29 -07:00
Rachel Guo
19793de1b3
#19921 [Dup] LLC Core count calculations updated (#20171)
### Description

See #19921. This PR only addresses one review comment:
https://github.com/microsoft/onnxruntime/pull/19921#discussion_r1543398640

Because #19921 comes from an external branch, another pull request had
to be opened for this.

### Motivation and Context

---------

Co-authored-by: Sai Kishan Pampana <sai.kishan.pampana@intel.com>
Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
Co-authored-by: Jian Chen <cjian@microsoft.com>
2024-04-02 16:53:47 -07:00
Changming Sun
efad5bbc5a
Replace some old file system calls with C++17 std::filesystem APIs. (#19196)
### Description
1. Replace some old file system calls to use C++17 std::filesystem APIs.
2. Remove the tensorflow_C_PACKAGE_PATH cmake option, which was only used in
onnxruntime_perf_test and whose code is no longer maintained.
3. Exclude onnx_test_runner and onnxruntime_perf_test from the iOS build
because the C++17 filesystem library is not available there.
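
As a rough sketch of what such a replacement looks like (the helper names below are hypothetical and not taken from the PR), platform-specific calls for checking directories and joining paths can be expressed with `std::filesystem`:

```cpp
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: returns true if `path` names an existing directory.
// The error_code overload does not throw; a missing path simply yields false.
inline bool IsExistingDirectory(const fs::path& path) {
  std::error_code ec;
  return fs::is_directory(path, ec);
}

// Hypothetical helper: joins a directory and a file name using the
// platform's preferred separator instead of manual string concatenation.
inline fs::path JoinPath(const fs::path& dir, const std::string& name) {
  return dir / name;
}
```

For example, `JoinPath("models", "a.onnx")` produces a path whose `filename()` is `a.onnx` on every platform that ships the C++17 filesystem library.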
2024-03-09 09:17:36 -08:00
Sheil Kumar
5197db1980
Disable __cpuid call for ARM64EC (#19592)
Disable __cpuid call for ARM64EC

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2024-02-21 15:45:44 -08:00
Sheil Kumar
3c49aacd56
Disable __cpuid check on arm64 builds as intrinsic is not available (#19574)
Disable __cpuid check on arm64 builds as intrinsic is not available

Motivation
The unavailable intrinsic was breaking the arm64 build.

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2024-02-20 13:13:40 -08:00
Sheil Kumar
1508c2ee39
Restrict L2 Cache Core check to Intel devices (#19483)
### Description
Limit SoC core detection via the L2 cache core logic to Intel
hybrid processors.

### Motivation and Context
The following code was added to add support for a new class of CPU cores
present in Intel’s next generation Intel Core Ultra mobile processors.
This code is essential to avoid placing threads on low performing SoC
cores that don’t have L3 cache. SoC cores are meant to specialize in
system bringup and help improve responsiveness and power usage, in other
words they are not meant to run compute heavy AI workloads. In order to
avoid broad exposure of this logic, it is currently designed to be
restricted to Intel platforms that have hybrid enabled.
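
The intent described above can be sketched as follows. This is a simplified illustration, not the actual ORT detection code: the `CoreInfo` struct, the `has_l3_cache` field, the `is_intel_hybrid` flag, and the fallback when no core reports L3 cache are all assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical description of one physical core; the field name is an assumption.
struct CoreInfo {
  bool has_l3_cache;  // SoC cores are described above as lacking L3 cache
};

// Sketch: pick a default thread count that skips SoC cores, but only on
// Intel hybrid platforms; everywhere else all cores are used.
inline std::size_t DefaultIntraOpThreads(const std::vector<CoreInfo>& cores,
                                         bool is_intel_hybrid) {
  if (!is_intel_hybrid) {
    return cores.size();
  }
  std::size_t usable = 0;
  for (const CoreInfo& core : cores) {
    if (core.has_l3_cache) {
      ++usable;
    }
  }
  // Assumed fallback: if detection finds no core with L3 cache, use all cores.
  return usable > 0 ? usable : cores.size();
}
```

With two L3-backed cores and one SoC core, the sketch returns 2 on a hybrid Intel platform and 3 otherwise.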

---------

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2024-02-14 10:31:03 -08:00
Sheil Kumar
0b7048e7d6
Update winml to use #cores - #soc cores by Default as the number of intraopthreads (#18384)
Update winml to use #cores - #soc cores by Default as the number of
intraopthreads

---------

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2023-11-28 09:26:48 -08:00
Justin Chu
c250540722
Bump linter versions (#18341)
Bump linter versions and run format.
2023-11-08 13:04:40 -08:00
Yi Zhang
20798a9f03
Enable onnx_test_runner to run the whole models dir in CI machine (#17863)
### Description
1. If a model should be skipped, don't load it.
2. Print the loaded tests and the skipped tests.
3. Add more of the same filters that onnxruntime_test_all uses.


### Motivation and Context
2023-10-12 12:01:02 +08:00
Sheil Kumar
cb9408e89c
Enable cpp20 builds for DML EP and WinML API (#17800)
Enable cpp20 builds for DML EP and WinML API

1) Missing typename for templated types
2) unmove helper for inline references to rvalue temporaries
This is okay because, per the standard, a temporary bound to a reference
parameter in a function call exists until the end of the full expression
containing that function call. Only if the function returns a reference
that outlives the full expression does it become a dangling reference.

3) static now not needed for template specializations
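
A minimal sketch of what an `unmove` helper of this kind can look like is shown below. It is an illustration of the technique rather than the PR's exact code, and `AppendAndMeasure` is a hypothetical consumer added for the example.

```cpp
#include <cstddef>
#include <string>

// Sketch of an "unmove" helper: exposes an rvalue as an lvalue reference so a
// temporary can be passed to a function taking a non-const lvalue reference.
// The result must only be used within the same full expression, because the
// temporary is destroyed when that expression ends.
template <typename T>
T& unmove(T&& value) {
  return static_cast<T&>(value);
}

// Hypothetical consumer that requires a non-const lvalue reference.
inline std::size_t AppendAndMeasure(std::string& text) {
  text += "!";
  return text.size();
}
```

A call such as `AppendAndMeasure(unmove(std::string("abc")))` is safe because the temporary string lives until the end of that full expression; storing the returned reference for later use would dangle.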

---------

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2023-10-06 10:33:38 -07:00
Pranav Sharma
668c70ee11
Add support for specifying a custom logging function per session. (#17727)
### Description
Add support for specifying a custom logging function per session.
Bindings for other languages will be added after this PR is merged.

### Motivation and Context
Users want a way to override the logging provided by the environment.
2023-09-29 19:46:55 -07:00
Yi Zhang
9136748462
Fix: Fail to skip disabled model in winml (#17728)
### Description
Append the source name after calling ModifyNameIfDisabledTest.

### Motivation and Context
In winml, the disabled test name doesn't include the model source name.
The WinML job would be broken on the new image.

https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1151451&view=logs&s=4eef7ad1-5202-529d-b414-e2b14d056c05

### Verified

https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1151691&view=logs&s=4eef7ad1-5202-529d-b414-e2b14d056c05
2023-09-28 13:46:44 +08:00
cao lei
32f5658abb
remove gsl to make status.h independent from gsl (#17402)
### Description
Make status.h independent from gsl.


### Motivation and Context
In the coming new feature external EP API (see the prototype
https://github.com/microsoft/onnxruntime/pull/16718), we need to expose
stream in the public header, however, stream is dependent on status.h
which is dependent on gsl. We are seeking a way to decouple stream from
gsl.

Per Changming's offline comment, prefast is currently disabled, so none of
the GSL_SUPPRESS annotations have any effect now. He will handle the
warnings when prefast is enabled in the future.
2023-09-13 21:47:43 -07:00
Changming Sun
bc84f52633
Update C/C++ dependencies: abseil, date, nsync, googletest, wil, mp11, cpuinfo and safeint (#15470)
### Description
Update C/C++ dependencies abseil, date, nsync, googletest, wil, mp11,
cpuinfo and safeint to newer versions per request of @mayeut. He created
the following PRs to update the deps:
https://github.com/microsoft/onnxruntime/pull/15432
https://github.com/microsoft/onnxruntime/pull/15434
https://github.com/microsoft/onnxruntime/pull/15435
https://github.com/microsoft/onnxruntime/pull/15436
https://github.com/microsoft/onnxruntime/pull/15437

However, our build system needs to fetch the dependencies from an
internal mirror that only Microsoft employees have write access to. So I
closed his PRs and created this one.

This PR also updates abseil to a newer version. This is to prepare for
upgrading re2.
2023-09-08 13:35:04 -07:00
Justin Chu
2575b9aaa1
Improve comments in winml/ (#17163)
Follow up of #17144. Manually fixed indentation in block comments and
replaced all tabs with spaces.
2023-08-15 23:30:56 -04:00
Justin Chu
416dc2e84d
Fix clang-format comment indents on Windows for winml/ (#17144)
On Windows, clang-format has a bug when AlignTrailingComments.Kind is
set to `Leave`
(https://clang.llvm.org/docs/ClangFormatStyleOptions.html#aligntrailingcomments),
where it keeps adding indentation to comments each time formatting
runs.

This PR changes to always align comments so we do not hit the bug.

As a consequence of the options change we need to reformat some of the
files. Note that this option is aligned with the rest of the repository.
2023-08-14 23:50:14 -04:00
Jeff Bloomfield
0180c0429f
Fix DML regression from allocator refactor and enable unrounded weight allocation in ORT API (#17030)
This addresses a DML performance regression from the following PR
resulting in allocations not being rounded and pooled in the DML
execution provider.

https://github.com/microsoft/onnxruntime/pull/15833

This also fixes a pre-existing limitation that allocations during
session initialization (primarily large weights and persistent
resources) only bypassed rounding and pooling while using the Winml API.
The allocator now also respects a caller's rounding mode parameter when
provided.
2023-08-10 17:02:24 -07:00
Justin Chu
eeef157888
Format c++ code under winml/ (#16660)
winml/ was previously excluded from lintrunner config. This change
includes the directory and adds the clang-format config file specific to
winml/ that fits existing style.

---------

Signed-off-by: Justin Chu <justinchu@microsoft.com>
2023-07-25 21:56:50 -07:00
Sheil Kumar
0c956bef0a
[WinML] Fix warnings in OnnxruntimeEngine and OnnxruntimeEngineBuilder (#16679)
Fix [prefast:Warning]: C6101 (in
'_winml::OnnxruntimeEngine::CreateTensorValueFromDefaultAllocator'
Fix [prefast:Warning]: C6101 (in
'_winml::OnnxruntimeEngineBuilder::CreateEngine'

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2023-07-12 13:09:50 -07:00
cao lei
329e8156d4
clean unused parameter in ORT_UNUSED_PARAMETER (#16538)
### Description
clean unused parameter in ORT_UNUSED_PARAMETER


### Motivation and Context
clean unused parameters in ORT_UNUSED_PARAMETER which are introduced
from #15833
2023-07-07 13:20:36 -07:00
Sheil Kumar
f46956056d
Add WinML Experimental API to Register ORT CustomOps Libraries (#16535)
Add WinML Experimental API to Register ORT CustomOps Libraries

---------

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2023-06-30 22:17:35 -07:00
cao lei
0c5f492493
remove AllocatorMgr class (#16509)
### Description
Remove AllocatorManager class


### Motivation and Context
After the refactor PR #15833 is in, AllocatorManager class is not
referenced anymore.
2023-06-28 15:43:19 -07:00
Justin Chu
e2381c42f2
Use M_PI to replace 3.14 constants (#16421)
### Description

Use M_PI to replace 3.14 constants

### Motivation and Context

Fixes #16413
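
As a small illustration of the replacement (the helper name is hypothetical and not from the PR), a truncated `3.14` literal can be swapped for `M_PI`. Note that `M_PI` is not part of standard C++; MSVC only defines it when `_USE_MATH_DEFINES` is set before `<cmath>` is included.

```cpp
// MSVC exposes M_PI only if this macro is defined before including <cmath>.
#define _USE_MATH_DEFINES
#include <cmath>

// Hypothetical helper: converts degrees to radians using M_PI instead of a
// hand-written 3.14 constant, avoiding the precision loss of the short literal.
inline double DegreesToRadians(double degrees) {
  return degrees * M_PI / 180.0;
}
```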
2023-06-20 15:09:10 -07:00
cao lei
dd72192cf4
ExecutionProvider API refactor - move allocator from EP level to SessionState level and indexed by OrtDevice (#15833)
### Description
This PR is to refactor ExecutionProvider API for memory management,
which is to move allocators from EP level to SessionState level and
indexed by OrtDevice



### Motivation and Context
With this change, the EP level is relieved of the burden of maintaining
allocators, which makes things easier for EP developers.

---------

Co-authored-by: Lei Cao <leca@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2023-06-19 17:44:45 -07:00
Sheil Kumar
2b7f26af7c
Add GridSample implementation to DirectML (#15788)
Add GridSample implementation to DirectML EP.

Temporarily add an HLSL shader in the DirectML EP to handle GridSample
until it is officially added to DirectML.
2023-05-05 15:59:33 -07:00
Sheil Kumar
5bde1e8e37
Add Bluestein Z-Chirp Algorithm to DirectML DFT implementation (#15686)
Add Bluestein Z-Chirp Algorithm to DirectML DFT implementation

This will enable STFT and DFT on signals whose lengths are not powers of 2.
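
The idea behind Bluestein's algorithm can be sketched on the CPU as follows. This is not the DirectML (HLSL) implementation from the PR; it is a plain C++ illustration, and the names `FftPow2` and `BluesteinDft` are hypothetical. The DFT of arbitrary length n is rewritten as a convolution with a chirp sequence, and that convolution is evaluated with power-of-two FFTs.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using Complex = std::complex<double>;

// In-place iterative radix-2 FFT; data.size() must be a power of two.
// With inverse == true the unscaled inverse transform is computed.
inline void FftPow2(std::vector<Complex>& data, bool inverse) {
  const std::size_t n = data.size();
  // Bit-reversal permutation.
  for (std::size_t i = 1, j = 0; i < n; ++i) {
    std::size_t bit = n >> 1;
    for (; j & bit; bit >>= 1) j ^= bit;
    j ^= bit;
    if (i < j) std::swap(data[i], data[j]);
  }
  const double pi = std::acos(-1.0);
  for (std::size_t len = 2; len <= n; len <<= 1) {
    const double angle = (inverse ? 2.0 : -2.0) * pi / static_cast<double>(len);
    const Complex wlen(std::cos(angle), std::sin(angle));
    for (std::size_t i = 0; i < n; i += len) {
      Complex w(1.0, 0.0);
      for (std::size_t k = 0; k < len / 2; ++k) {
        const Complex u = data[i + k];
        const Complex v = data[i + k + len / 2] * w;
        data[i + k] = u + v;
        data[i + k + len / 2] = u - v;
        w *= wlen;
      }
    }
  }
}

// Forward DFT of arbitrary length via Bluestein's chirp-z formulation.
inline std::vector<Complex> BluesteinDft(const std::vector<Complex>& x) {
  const std::size_t n = x.size();
  if (n == 0) return {};
  // Convolution length: a power of two that is at least 2n - 1.
  std::size_t m = 1;
  while (m < 2 * n - 1) m <<= 1;
  const double pi = std::acos(-1.0);
  // chirp[k] = exp(-i*pi*k^2/n); k^2 is reduced mod 2n to keep the angle small.
  std::vector<Complex> chirp(n);
  for (std::size_t k = 0; k < n; ++k) {
    const double angle =
        -pi * static_cast<double>((k * k) % (2 * n)) / static_cast<double>(n);
    chirp[k] = Complex(std::cos(angle), std::sin(angle));
  }
  // a = x * chirp (zero padded); b = conj(chirp), wrapped for negative indices.
  std::vector<Complex> a(m), b(m);
  for (std::size_t k = 0; k < n; ++k) a[k] = x[k] * chirp[k];
  b[0] = std::conj(chirp[0]);
  for (std::size_t k = 1; k < n; ++k) b[k] = b[m - k] = std::conj(chirp[k]);
  // Circular convolution of a and b through power-of-two FFTs.
  FftPow2(a, false);
  FftPow2(b, false);
  for (std::size_t k = 0; k < m; ++k) a[k] *= b[k];
  FftPow2(a, true);
  std::vector<Complex> out(n);
  for (std::size_t k = 0; k < n; ++k) {
    out[k] = chirp[k] * a[k] / static_cast<double>(m);
  }
  return out;
}
```

Because only the padded convolution needs a power-of-two length, the input length itself can be any positive integer, which is what makes non-power-of-2 DFT and STFT window sizes workable.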
2023-04-27 14:03:40 -07:00
Numfor Tiapo
f44f6c5b2e
Fix Prefast Errors (#15651)
This PR adds fixes for prefast errors with the following codes:

- C26814
- C26451
- C26400
2023-04-25 16:41:39 -07:00
Patrice Vignola
b49d428299
[DML EP] Add missing newline to image test logging (#15596) 2023-04-21 13:39:07 -07:00
Tianlei Wu
5a675d9113
Disable random failing DML image batch test (#15624)
### Description
Disable a test with random failure in Windows GPU CI Pipeline like the
following:

```
11: [       OK ] BatchTest/BatchTest.BatchSupport/163 (0 ms)
11: [ RUN      ] BatchTest/BatchTest.BatchSupport/164
11: D:\a\_work\1\s\winml\test\image\imagetests.cpp(186): error: Expected: m_model_binding.Bind(output_data_binding_name, output_video_frames) doesn't throw an exception.
11:   Actual: it throws.
11: D:\a\_work\1\s\winml\test\image\imagetests.cpp(211): error: Expected: m_result = m_session.Evaluate(m_model_binding, L"") doesn't throw an exception.
11:   Actual: it throws.
11: total errors is 0/2073600, errors rate is 0total errors is 0/2073600, errors rate is 0total errors is 0/2073600, errors rate is 0[  FAILED  ] BatchTest/BatchTest.BatchSupport/164, where GetParam() = ((L"fns-candy_Bgr8_Batch3.onnx", 0, { L"1080.jpg", L"fish_720_Gray.png", L"fish_720.png" }, 3, false), 0, 1, 1, 1, 4-byte object <02-00 00-00>) (3203 ms)
```

Since https://github.com/microsoft/onnxruntime/pull/15468 was merged to
main, about 10-15% of build jobs have failed in this test.
2023-04-21 13:29:56 -07:00
Changming Sun
5bed8d0285
Disable XNNPack EP's tests in Windows CI pipeline (#15406)
### Description

1. Disable XNNPack EP's tests in Windows CI pipeline
The EP code has a known problem (memory alignment), but the problem does
not impact the usages that we ship the code to. Currently we only use the
XNNPack EP in mobile apps and web usages, and we already have pipelines
that cover these usages. We need to prioritize fixing the bugs found in
those pipelines, and there are no resources to put on this Windows one. We
can re-enable the tests once we reach an agreement on how to fix the
memory alignment bug.

2.  Delete anybuild.yml which was for an already deleted pipeline.
3. Move Windows CPU pipelines to AMD CPU machine pools which are
cheaper.
4. Disable some qdq/int8 model tests that will fail if the CPU doesn't
have Intel AVX512 8-bit instructions.
2023-04-13 12:19:32 -07:00
Numfor Tiapo
e3086b2ed8
Move DML CI Pipeline to A10 (#15468)
This change moves the DML CI pipeline to the A10 machines and fixes or
disables tests that were failing from this change.

- Max error rate threshold was increased for Image Tests
- Some failing batch tests were disabled

---------

Co-authored-by: Changming Sun <chasun@microsoft.com>
2023-04-12 10:19:40 -07:00
Dmitri Smirnov
ce3b4eabd3
Implement Optional Metadata support and C# test support (#15314)
### Description
Implement Optional Type metadata support in the library.
Implement optional support in C# API along with metadata.
Implement Sequence, Map, Optional test data support
and test execution.

Prune tests and provide more details for failing tests in C# code.

Note, this PR does not enable running onnx test models in C++.

### Motivation and Context
Opset18 optional type support.
2023-04-11 09:41:59 -07:00
Sheil Kumar
7ccdf9ad8c
User/sheilk/sequence fix (#15239)
Ensure that Loop operators run on CPU.
Fix memcpy for Sequence Tensors, so that empty sequences (like when
SequenceEmpty runs on DirectML) can be copied back to CPU.
2023-03-31 12:57:25 -07:00
cao lei
50fa151298
remove device_id parameter out of ExecutionProvider::GetAllocator() (#14580)
### Description
Remove the parameter device_id out of ExecutionProvider::GetAllocator()
function



### Motivation and Context
The parameter device_id is not necessary. We can fully rely on the
second parameter OrtMemType mem_type to determine the device_id when
getting allocator from executionProvider.
2023-02-13 10:01:07 -08:00
RandySheriffH
75584c5fa8
Enabling thread pool to be numa-aware (#13778)
The PR enables ort thread pool to be numa-aware, so that threads could
be evenly created and distributed among numa nodes.
In addition, to facilitate performance tuning, the PR opens a new API
allowing customers to attach threads to certain logical processors.
Please check the API
[definition](https://github.com/microsoft/onnxruntime/pull/13778/files#diff-5845a5c76fb64abdc8f0cffe21b37f8da1712674eb3abc4cd87190891be1bd48)
for details.

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2022-12-12 10:33:55 -08:00
Abhishek Udupa
83c59d2594
Session-aware and thread-safe CUDA profiler (#13706)
### Description
The existing CUDA profiler is neither session-aware, nor thread-safe.
This PR ensures both.

### Motivation and Context
[PR 13549](https://github.com/microsoft/onnxruntime/pull/13549) brought
thread-safety and session-awareness to the ROCm profiler. This PR brings
the same goodness to the CUDA profiler as well.

Sample outputs of a profiling run from the StableDiffusion model (this
model was chosen because it requires orchestration of multiple sessions,
and verifies that the profilers are now indeed session-aware) on both
CUDA and ROCm EPs are attached, along with a script that checks that the
trace files generated by the profile are well-formed.

Update 11/29: Updated the profile outputs. The older profile outputs
exhibited an issue where some timestamps were wildly out of range,
leading to problems visualizing the traces. The bug has been fixed and
the profile outputs have been updated, along with an update to the check
script to ensure that timestamps are monotonically increasing.
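
The check script itself is only attached as an archive, so its contents are not shown here. As a hypothetical sketch of the kind of monotonicity check described, assuming timestamps have already been extracted from a trace into a vector (the function name and the choice to allow equal neighbours are assumptions):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical check: returns true when no timestamp is smaller than the one
// before it. Equal neighbouring timestamps are accepted in this sketch.
inline bool TimestampsAreMonotonic(const std::vector<std::int64_t>& timestamps) {
  return std::is_sorted(timestamps.begin(), timestamps.end());
}
```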


[sd_profile_outputs_cuda.tar.gz](https://github.com/microsoft/onnxruntime/files/10118088/sd_profile_outputs_cuda.tar.gz)

[sd_profile_outputs_rocm.tar.gz](https://github.com/microsoft/onnxruntime/files/10118089/sd_profile_outputs_rocm.tar.gz)

[check_profile_output_well_formedness.zip](https://github.com/microsoft/onnxruntime/files/10118090/check_profile_output_well_formedness.zip)

Co-authored-by: Abhishek Udupa <abhishek.udupa@microsoft.com>
2022-12-09 13:22:12 -08:00
Sumit Agarwal
5b16593192
[DML EP] Attention Kernel bug fix (#13879)
### Description
- Use same data type as input for mask_index tensor which is used as DML
GEMM API's C parameter.
- Remove the gsl header include, as it already gets included transitively.



### Motivation and Context
- Why is this change required? What problem does it solve?
Bug found in internal conformance testing.
- If it fixes an open issue, please link to the issue here.
N/A
2022-12-07 15:24:27 -08:00
Yi Zhang
ae2a9373ab
reenable quant model tests (#13871)
### Description

### Motivation and Context
Test data in the image has been fixed.
2022-12-07 23:33:22 +08:00
Numfor Tiapo
e0dcbc3832
Fix C26436 prefast errors (#13774)
Fixes errors 9196, 9214, 9255, and 9314.

Co-authored-by: Numfor Mbiziwo-Tiapo <numform@microsoft.com>
2022-12-01 09:07:44 -08:00
Numfor Tiapo
aa1390e963
Fix Prefast Errors (#13675)
Fixes all C28204, C6031, and C26814 prefast errors.

Co-authored-by: Numfor Mbiziwo-Tiapo <numform@microsoft.com>
2022-11-28 09:16:22 -08:00
Yi Zhang
a9a9c34d98
Fix WinML Test Case: create LearningModelBinding for every testcase (#13587)
### Description
Fix #13509

### Motivation and Context
The exception was caused by incorrect fetches, which came from the
binding left over from the previous test case.

efcbdac58e/onnxruntime/core/session/onnxruntime_c_api.cc (L809-L815)
2022-11-09 11:20:48 +08:00
Numfor Tiapo
49e5a11ccd
Fix SDL and Prefast Errors (#13465)
Fixes Errors 1978844, 1978870, 1978850, 1978855, and 9245

Co-authored-by: Numfor Mbiziwo-Tiapo <numform@microsoft.com>
2022-10-28 09:41:18 -07:00
Yi Zhang
e160688a9b
Skip some failed models winml and training workflows on Windows CPU (#13407)
### Description
1. Update the model name structure in model_tests.cpp to include the
source name, in order to avoid
`Condition test_param_names.count(param_name) == 0 failed. Duplicate
parameterized test name 'BERT_Squad_opset10_CPU'`
2. Skip some failing models: https://github.com/onnx/models/issues/568


### Motivation and Context
2022-10-25 10:05:04 +08:00
Numfor Tiapo
56387c3c31
Fix SDL Unmatched Annotation Errors (#13162)
Fixes 3 SDL unmatched annotation errors.

Co-authored-by: Numfor Mbiziwo-Tiapo <numform@microsoft.com>
2022-09-30 15:36:30 -07:00