Commit graph

9472 commits

Author SHA1 Message Date
guyang3532
401129d484
Add support for more ops for padding elimination (#17217)
Add support for Gelu/ReduceMean/SimplifiedLayerNormalization for padding
elimination
2023-08-25 18:02:15 +08:00
mindest
735cc8e6c8
[ROCm] enable If op for ROCm EP. (#17279)
### Description
Enable If op for ROCm EP.
2023-08-25 17:49:49 +08:00
Yi Zhang
9cd33e07b4
Readd Tests in Window GPU Reduced Ops workflow (#17294)
### Description
Add a single test step to the Windows GPU Reduced Ops workflow.


### Motivation and Context
The old workflow ran building and testing in one command.
In PR #17263, the test step was removed by mistake, so this PR re-adds it.
How to consolidate the test step is still under consideration.
2023-08-25 15:56:59 +08:00
Yi Zhang
4a0f8f6672
Skip one flaky Test (#17290)
### Motivation and Context
It's skipped in the PR
```
2023-08-25T02:37:48.7772670Z 1: [ RUN      ] ModelTests/ModelTest.Run/cuda__models_opset9_Candy_candy
2023-08-25T02:37:48.7824755Z 1: D:\a\_work\1\s\onnxruntime\test\providers\cpu\model_tests.cc(91): Skipped
2023-08-25T02:37:48.7825343Z 1: Skipping single test It's in broken_tests
```
2023-08-25 14:48:41 +08:00
Changming Sun
3e934030f4
nodejs: Release Ort Env before main function returns (#17288)
### Description
Release OrtEnv before main function returns. Before this change, OrtEnv
is deleted when C/C++ runtime destructs all global variables in ONNX
Runtime's core framework.
The callstack is like this:
```
  * frame #0: 0x00007fffee39f5a6 libonnxruntime.so.1.16.0`onnxruntime::Environment::~Environment(this=0x00007fffee39fbf2) at environment.h:20:7
    frame #1: 0x00007fffee39f614 libonnxruntime.so.1.16.0`std::default_delete<onnxruntime::Environment>::operator()(this=0x00007ffff4c30e50, __ptr=0x0000000005404b00) const at unique_ptr.h:85:2
    frame #2: 0x00007fffee39edca libonnxruntime.so.1.16.0`std::unique_ptr<onnxruntime::Environment, std::default_delete<onnxruntime::Environment>>::~unique_ptr(this=0x5404b00) at unique_ptr.h:361:17
    frame #3: 0x00007fffee39e2ab libonnxruntime.so.1.16.0`OrtEnv::~OrtEnv(this=0x00007ffff4c30e50) at ort_env.cc:43:1
    frame #4: 0x00007fffee39fa96 libonnxruntime.so.1.16.0`std::default_delete<OrtEnv>::operator()(this=0x00007fffefff8f78, __ptr=0x00007ffff4c30e50) const at unique_ptr.h:85:2
    frame #5: 0x00007fffee39f394 libonnxruntime.so.1.16.0`std::unique_ptr<OrtEnv, std::default_delete<OrtEnv>>::~unique_ptr(this=0x7ffff4c30e50) at unique_ptr.h:361:17
    frame #6: 0x00007ffff78574b5 libc.so.6`__run_exit_handlers + 261
    frame #7: 0x00007ffff7857630 libc.so.6`exit + 32
    frame #8: 0x00007ffff783feb7 libc.so.6`__libc_start_call_main + 135
    frame #9: 0x00007ffff783ff60 libc.so.6`__libc_start_main@@GLIBC_2.34 + 128
    frame #10: 0x0000000000abbdee node`_start + 46
```
After this change, OrtEnv will be deleted before the main function
returns and nodejs is still alive.
2023-08-24 23:07:02 -07:00
mindest
93ae17d1bb
[ROCm] Add hipBLASLt workspace support (#17096)
### Description
* hipBLASLt extra workspace for split-k
* type update (due to extra support for fp8 in hipBLASLt)
* minor changes
2023-08-25 13:08:57 +08:00
pengwa
7c98f45928
Fix layernorm and softmax axis after upstream (#17255)
### Fix layernorm and softmax axis after upstream

For a Gather whose index is a scalar, the output rank is smaller than the
input rank.

When we upstream this kind of Gather before Softmax or LayerNorm, we
should also update the axis attribute.
Otherwise, the axis becomes stale and is incorrect for the updated
rank.

```
  File "/opt/conda/envs/ptca/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_fallback.py", line 157, in handle_exception
    raise exception
  File "/opt/conda/envs/ptca/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_training_manager.py", line 280, in forward
    self._build_graph(graph_transformer_config)
  File "/opt/conda/envs/ptca/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_logger.py", line 158, in wrapper
    result = func(graph_execution_manager, *args, **kwargs)
  File "/opt/conda/envs/ptca/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_logger.py", line 273, in wrapper
    result = func(graph_execution_manager, *args, **kwargs)
  File "/opt/conda/envs/ptca/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_training_manager.py", line 361, in _build_graph
    super()._build_graph(graph_transformer_config)
  File "/opt/conda/envs/ptca/lib/python3.8/site-packages/onnxruntime/training/ortmodule/_graph_execution_manager.py", line 184, in _build_graph
    self._graph_builder.build(config)
RuntimeError: /onnxruntime/orttraining/orttraining/python/orttraining_pybind_state.cc:823 onnxruntime::python::addObjectMethodsForTraining(pybind11::module&, onnxruntime::python::ExecutionProviderRegistrationFn)::<lambda(onnxruntime::training::OrtModuleGraphBuilder*, const onnxruntime::training::TrainingGraphTransformerConfiguration&)> [ONNXRuntimeError] : 1 : FAIL : Node (Softmax_2904) Op (Softmax) [ShapeInferenceError] 'axis' must be in [-3 , 2]. Its actual value is: 3
```
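
The axis update described above can be sketched in plain Python. This is a minimal illustration, not the ORT graph transformer code: `adjust_axis_after_gather` and its parameters are hypothetical names, and it assumes the upstreamed Gather uses a scalar index, so exactly one dimension is removed from the input of Softmax or LayerNorm.

```python
def _normalize(axis, rank):
    """Map a possibly negative axis to its non-negative position."""
    if not -rank <= axis < rank:
        raise ValueError(f"axis {axis} out of range for rank {rank}")
    return axis + rank if axis < 0 else axis


def adjust_axis_after_gather(axis, input_rank, gather_axis):
    """Return the axis that is valid once `gather_axis` has been removed.

    Hypothetical helper: `input_rank` is the rank before the Gather was
    moved in front of the op.
    """
    axis = _normalize(axis, input_rank)
    gather_axis = _normalize(gather_axis, input_rank)
    if gather_axis == axis:
        # The Gather would remove the very dimension the op works on.
        raise ValueError("cannot upstream a Gather on the op's own axis")
    # Dimensions after the removed one shift left by one position.
    return axis - 1 if gather_axis < axis else axis


# Rank-4 input with Softmax(axis=3); a scalar Gather on axis 1 leaves rank 3.
new_axis = adjust_axis_after_gather(axis=3, input_rank=4, gather_axis=1)
print(new_axis)
```

With a rank-3 input the valid range is [-3, 2], which is why the stale value 3 in the error above is rejected.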
2023-08-25 12:26:22 +08:00
Faith Xu
86238fb507
[Docs] Auto generate JS API (#17271)
### Description
Adds new workflow to generate js docs with latest changes so the API
page can stay up to date

[Test page of latest js
docs](https://faxu.github.io/onnxruntime/docs/api/js/modules/InferenceSession.html)
2023-08-24 17:35:37 -07:00
Yi Zhang
756eda2cc4
Windows CI build steps template (#17263)
### Description
1. New windows ci build steps template.
2. Remove useless variables.

### Motivation and Context
1. Make it easier to apply build cache to all windows CIs.
2. Other team's devs only need to take care of build options


### Comparison
Before: 

9f21f694cf/tools/ci_build/github/azure-pipelines/win-gpu-tensorrt-ci-pipeline.yml (L19-L82)

After:
b4c1f2261b/tools/ci_build/github/azure-pipelines/win-gpu-tensorrt-ci-pipeline.yml (L35-L54)
2023-08-25 05:58:49 +08:00
Hector Li
680fac64ed
[QNN EP] Support non-quantized Op on HTP (#17194)
### Description
[QNN EP] Support non-quantized Op on HTP

1. Remove the limitation in GetCapability that always requires a QDQ node
unit group to partition the node on the NPU backend, so that we can support
a non-quantized Slice op with int32 data input on HTP.
2. Enable the Where QDQ node unit.
3. Separate the flags is_npu_backend & is_quantized_node to make the
distinction clear.
4. Separate out QuantizeLinear and DequantizeLinear into QdqOpBuilder to
better identify quantized/un-quantized input/output tensors.
5. Separate out a TransposeOpBuilder to simplify Transpose
node processing, especially for a single Transpose node in a QDQ model. In a
case like Q->Transpose->DQ, the Transpose is not a QDQ node unit group; it is
a single node, but we should treat it as a quantized node. Its output should
have the same data type and quantization parameters as its input. Another case
is supporting non-quantized data for Transpose in a QDQ model.
6. Remove the is_npu_backend flag from the OpBuilder interface. Set the backend
type in QnnBackendManager, QnnModel & QnnModelWrapper, so that
OpBuilders can always get it from QnnModelWrapper.
7. Add unit tests for quantized/non-quantized Transpose (int32, float32)
on the HTP backend.
2023-08-24 14:57:16 -07:00
pengwa
18d5cfdb85
Fix build - redefinition of default argument for ‘long unsigned int Extent’ (#17281)
### Fix build - redefinition of default argument for ‘long unsigned int
Extent’

In one training customer's environment, building ORT fails with the build
error below. The GCC versions are:

```
aiscuser@node-0:/tmp/onnxruntime$ gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0


aiscuser@node-0:/tmp/onnxruntime$ g++ --version
g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0


```

On our dev node, which uses the same GCC/G++ version, the build does not
fail. We are not sure what the difference is, but giving an explicit type
when creating `gsl::span` fixed the problem.

```
/tmp/onnxruntime/build/Linux/RelWithDebInfo/_deps/gsl-src/include/gsl/span:394:7: error: redefinition of default argument for ‘long unsigned int Extent’
  394 | class span
      |       ^~~~
/tmp/onnxruntime/build/Linux/RelWithDebInfo/_deps/gsl-src/include/gsl/span_ext:46:51: note: original definition appeared here
   46 | template <class ElementType, std::size_t Extent = dynamic_extent>
      |                                                   ^~~~~~~~~~~~~~~
/tmp/onnxruntime/include/onnxruntime/core/common/span_utils.h:82:93: error: return type ‘class gsl::span<const std::byte>’ is incomplete
   82 | [[nodiscard]] inline gsl::span<const std::byte> AsByteSpan(const void* data, size_t length) {
      |                                                                                             ^
/tmp/onnxruntime/include/onnxruntime/core/common/span_utils.h: In function ‘void onnxruntime::AsByteSpan(const void*, size_t)’:
/tmp/onnxruntime/include/onnxruntime/core/common/span_utils.h:83:68: error: class template argument deduction failed:
   83 |   return gsl::span(reinterpret_cast<const std::byte*>(data), length);
      |                                                                    ^
/tmp/onnxruntime/include/onnxruntime/core/common/span_utils.h:83:68: error: no matching function for call to ‘span(const std::byte*, size_t&)’
/tmp/onnxruntime/build/Linux/RelWithDebInfo/_deps/gsl-src/include/gsl/span:740:1: note: candidate: ‘template<class Type, long unsigned int Extent> gsl::span(Type (&)[Extent])-> gsl::span<ElementType, FirstExtent>’
  740 | span(Type (&)[Extent]) -> span<Type, Extent>;
      | ^~~~
/tmp/onnxruntime/build/Linux/RelWithDebInfo/_deps/gsl-src/include/gsl/span:740:1: note:   template argument deduction/substitution failed:
/tmp/onnxruntime/include/onnxruntime/core/common/span_utils.h:83:68: note:   mismatched types ‘Type [Extent]’ and ‘const std::byte*’
   83 |   return gsl::span(reinterpret_cast<const std::byte*>(data), length);
      |                                                                    ^
/tmp/onnxruntime/build/Linux/RelWithDebInfo/_deps/gsl-src/include/gsl/span:743:1: note: candidate: ‘template<class Type, long unsigned int Size> gsl::span(std::array<_Tp, _Nm>&)-> gsl::span<ElementType, FirstExtent>’
  743 | span(std::array<Type, Size>&) -> span<Type, Size>;
      | ^~~~
/tmp/onnxruntime/build/Linux/RelWithDebInfo/_deps/gsl-src/include/gsl/span:743:1: note:   template argument deduction/substitution failed:
/tmp/onnxruntime/include/onnxruntime/core/common/span_utils.h:83:68: note:   mismatched types ‘std::array<_Tp, _Nm>’ and ‘const std::byte*’
   83 |   return gsl::span(reinterpret_cast<const std::byte*>(data), length);
      |                                                                    ^
```



2023-08-25 00:40:40 +08:00
pengwa
d90afc697b
Introduce ZeROOffloadSubscriber for ORTModule (#17006)
### Introduce ZeROOffloadSubscriber for ORTModule

As part of the work to integrate ORTModule with DeepSpeed stage 3, this PR
mainly focuses on moving the original PyTorch-based (hook-driven) param
partition/offload implementation to an ORTModule-compatible implementation.

Changes include:
1. Refactor `SubscriberBase`/`SubscriberManager` to support
pre-forward/post-forward hooks.
2. Implement the new `ZeROOffloadSubscriber` by re-using DeepSpeed hook
functions as much as possible. All hook functions are defined in
`DeepSpeedZeRoOffload._register_hooks_recursively` and
`DeepSpeedZeRoOffload.setup_zero_stage3_hooks`, and their closures are
not complex: every hook references only the owning
`DeepSpeedZeRoOffload` instance. So we can create new hook functions with
`FunctionType` by binding the owning `DeepSpeedZeRoOffload` instance,
then call the newly created functions in the subscriber's
`pre_forward_module_apply_impl` and `post_forward_module_apply_impl`
interfaces.
3. Monkey patch `DeepSpeedZeRoOffload.setup_zero_stage3_hooks` to
register the `ZeROOffloadSubscriber` for the model, so we don't need to
change any code in the DeepSpeed repo (at least so far).
4. Fix the ATen embedding custom symbolic exporter function by
tolerating a weight size of (0) (changed by DeepSpeed ZeRO stage 3).

UT will be added once stage3 is fully supported. 
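
The `FunctionType` technique from item 2 can be illustrated with a small standalone sketch. It is not the code from this PR: `Owner`, `register_hooks`, and `rebind_closure` are hypothetical stand-ins, and it assumes Python 3.8+ for `types.CellType`. The idea is to copy a nested hook function while swapping the object captured in its closure.

```python
import types


class Owner:
    """Stand-in for the object whose method defines hooks as closures."""

    def __init__(self, name):
        self.name = name

    def register_hooks(self):
        # The hook reaches `self` only through its closure.
        def _pre_forward_hook(module_name):
            return f"{self.name}: pre-forward for {module_name}"

        return _pre_forward_hook


def rebind_closure(func, **new_free_vars):
    """Copy `func`, replacing the named closure variables with new values."""
    cells = []
    for var_name, cell in zip(func.__code__.co_freevars, func.__closure__ or ()):
        if var_name in new_free_vars:
            # Build a fresh cell so the original function is left untouched.
            cells.append(types.CellType(new_free_vars[var_name]))
        else:
            cells.append(cell)
    return types.FunctionType(
        func.__code__, func.__globals__, func.__name__, func.__defaults__, tuple(cells)
    )


original_hook = Owner("offload-A").register_hooks()
rebound_hook = rebind_closure(original_hook, self=Owner("offload-B"))
print(original_hook("layer1"))
print(rebound_hook("layer1"))
```

Because the hook body is reused as-is, only the bound instance changes, which is what makes this approach workable when every hook references a single owning object.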

2023-08-25 00:15:22 +08:00
Baiju Meswani
fca81cc5d5
ConvTransposeGrad CUDA Kernel (#17201) 2023-08-24 09:08:06 -07:00
Jian Chen
33415b9da4
Removing 10.14 suffix from osx nuget package (#17277)
2023-08-24 08:51:54 -07:00
cloudhan
94f23882f7
Colorize terminal log output (#17196)
Make eyeball log parsing a little bit easier.
2023-08-24 17:38:21 +08:00
Baiju Meswani
34d18ee076
Build gradient graph starting at the loss alone (#17240) 2023-08-23 23:54:45 -07:00
Yulong Wang
fb51faea64
[js/webgpu] fix 2 build breaks introduced in merge (#17273)
### Description
fix 2 build breaks introduced in merge. Fixes web build
2023-08-23 18:09:50 -07:00
cloudhan
87bef1f3f2
Move composable_kernel to deps.txt (#17245) 2023-08-23 17:39:16 -07:00
Dmitri Smirnov
33c87f6283
ORT_ENFORCE on the iterator must come before iterator is dereferenced. (#17265)
### Description
Move `ORT_ENFORCE` on the iterator before iterator is used for the first
time.
2023-08-23 17:20:01 -07:00
Baiju Meswani
6c95d959f3
Make batchnorm training mode available in inference only package (#17270) 2023-08-23 15:19:11 -07:00
Dmitri Smirnov
fdc3bcae20
Disable local symbol table for function shape inferencing. (#17267)
### Description
Temporarily disable symbol tables.

### Motivation and Context
Local symbol tables mark unrelated shapes re-use and cause inference to
error out.

https://github.com/microsoft/onnxruntime/issues/17061
2023-08-23 14:46:21 -07:00
Yulong Wang
8b18d48c7c
[js/webgpu] make IndicesHelper implementation implicit (#17193)
### Description
This change makes it no longer required to call indicesHelper.impl() in
shader code.
2023-08-23 14:41:35 -07:00
Rachel Guo
aed7c6ffc7
Exclude fp16 support flag definition from minimal build (#17259)
### Description
As title.

### Motivation and Context

Reduce minimal build binary size for mobile to meet office team
requirement.

cc @chenfucn

Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
2023-08-23 10:13:19 -07:00
Scott McKay
b3cb775cf9
Two fixes involving minimal builds (#17000)
### Description
- The allocation planner was breaking if a graph had no nodes.
- In this particular model, a branch of an If node returned an outer
scope value directly.

- If a model used non-tensor types and sparse tensors were disabled, the
call to IsSparseTensor threw an exception, which prematurely terminated
the code.
- It's perfectly fine to check whether a value is a sparse tensor when
support for them is disabled. We just can't do anything with that
OrtValue, which is what the current ifdefs after the call to
IsSparseTensor handle.




### Motivation and Context
Fix model execution failure for partner with model that uses sequences
in a minimal build with sparse tensors disabled.
2023-08-23 16:01:22 +10:00
BoarQing
d21a2f064b
[VITISAI] fix compile error for onnxruntime (#17252)
### Description
Updated the code to pass in the missing parameter


### Motivation and Context
Compile error. See https://github.com/microsoft/onnxruntime/issues/17139

Co-authored-by: Yueqing Zhang <yueqingz@amd.com>
2023-08-22 22:40:39 -07:00
Ashwini Khade
56102ecbdd
On-Device Training - Enable loading from buffer (#16417) 2023-08-22 19:59:32 -07:00
Edward Chen
ae62d752d6
Prevent GSL_SUPPRESS arguments from being modified by clang-format (#17242)
Prevent `GSL_SUPPRESS` arguments from being modified by clang-format and update existing usages.

clang-format was changing something like `GSL_SUPPRESS(r.11)` to `GSL_SUPPRESS(r .11)`.

For some compilers (e.g., clang), the `gsl::suppress` attribute takes a quoted string argument. We don't want to insert spaces there.
2023-08-22 18:26:53 -07:00
kunal-vaishnavi
4b3477f171
Add Whisper scripts (#17043)
### Description
This PR adds benchmark scripts for Whisper. It is a follow-up to [this
PR](https://github.com/microsoft/onnxruntime/pull/17020) that adds the
LLaMA scripts.



### Motivation and Context
This PR enables benchmarking Whisper across various configurations.
2023-08-22 18:14:44 -07:00
Arthur Islamov
5842144d98
[js/web] JSEP Gemm for opset 13 (#16936)
### Description
Added JSEP Gemm registration for opset 13. It was falling back to CPU
provider as CPU has it for 13
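
The fallback described above can be pictured with a toy kernel lookup. This is only a sketch under assumptions: `KERNELS`, `pick_provider`, and the registered opset ranges are hypothetical and do not mirror ORT's actual kernel registry.

```python
# Hypothetical registry: (execution provider, op type) -> list of
# (first opset, last opset or None for open-ended) ranges.
KERNELS = {
    ("JSEP", "Gemm"): [(11, 12)],
    ("CPU", "Gemm"): [(11, 12), (13, None)],
}


def pick_provider(op_type, opset, preferred=("JSEP", "CPU")):
    """Return the first provider in `preferred` with a kernel covering `opset`."""
    for provider in preferred:
        for first, last in KERNELS.get((provider, op_type), []):
            if opset >= first and (last is None or opset <= last):
                return provider
    return None


before = pick_provider("Gemm", 13)          # no JSEP kernel covers opset 13
KERNELS[("JSEP", "Gemm")].append((13, None))  # the added registration
after = pick_provider("Gemm", 13)
print(before, after)
```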

---------

Co-authored-by: Guenther Schmuelling <guschmue@microsoft.com>
2023-08-22 18:13:20 -07:00
kunal-vaishnavi
edac3ef150
Add LLaMA scripts (#17020)
### Description
This PR adds the following scripts for LLaMA:
- LLaMA conversion (support for TorchScript and Dynamo exporters)
- LLaMA parity
- LLaMA benchmark
- LLaMA quantization
- LLaMA integration with [Hugging Face
Optimum](https://github.com/huggingface/optimum)



### Motivation and Context
This PR adds scripts for using LLaMA. There is a [follow-up
PR](https://github.com/microsoft/onnxruntime/pull/17043) for adding
scripts for Whisper.
2023-08-22 18:05:11 -07:00
Guenther Schmuelling
d3d3dde844
fix webgpu split (#17258)
fix webgpu split for the case of split_sizes coming from input[1]
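
In ONNX, Split takes its sizes from an attribute before opset 13 and from an optional second input from opset 13 on. A minimal sketch of resolving the sizes from either place follows; `resolve_split_sizes` is a hypothetical helper, not the WebGPU implementation, and it assumes the axis length divides evenly when no sizes are given.

```python
def resolve_split_sizes(dim_size, num_outputs, split_attr=None, split_input=None):
    """Return the per-output sizes along the split axis.

    Hypothetical helper: `split_input` models the optional input[1],
    `split_attr` models the older `split` attribute.
    """
    sizes = split_input if split_input is not None else split_attr
    if sizes is None:
        # No explicit sizes: fall back to an equal split.
        if dim_size % num_outputs != 0:
            raise ValueError("axis length is not divisible by the number of outputs")
        sizes = [dim_size // num_outputs] * num_outputs
    sizes = [int(s) for s in sizes]
    if len(sizes) != num_outputs or sum(sizes) != dim_size:
        raise ValueError("split sizes do not match the axis length or output count")
    return sizes


from_input = resolve_split_sizes(10, 3, split_input=[2, 3, 5])
equal = resolve_split_sizes(9, 3)
print(from_input, equal)
```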
2023-08-22 16:49:22 -07:00
shaahji
d76dbc4fc3
Issue#16990: Cast -> AllToAll -> Cast fails with random output (#17075) 2023-08-22 12:47:23 -07:00
Edward Chen
bd8a488f4b
Enable verbose logging in unit test program with environment variable. (#17133)
Enable verbose logging in unit test program with environment variable.
E.g., `ORT_UNIT_TEST_MAIN_LOG_LEVEL=0 ./onnxruntime_test_all --gtest_filter="<test that I want to see more logs for>"`.
2023-08-22 12:13:52 -07:00
Yi Zhang
61a79436e2
Common pre-build steps of Windows CI (#16970)
### Description
Unify some pre-build common steps.

### Motivation and Context
In the long run, other devs should only focus on build option and test
commands.
It would reduce mistakes and maintenance cost to use common template
steps.
There will be more PRs to achieve the goal.
2023-08-22 18:09:55 +08:00
PeixuanZuo
d5c565156d
[ROCm] add SimplifiedSkipLayerNorm implementation (#17213)
add SimplifiedSkipLayerNorm implementation
2023-08-22 12:06:58 +08:00
cloudhan
4e6cec4d09
Update ck and enable test (#16383)
Apply the fix in https://github.com/ROCmSoftwarePlatform/composable_kernel/issues/728
Introduce more kernel instances and allow the introduction of streamk and splitk.
2023-08-22 11:08:55 +08:00
Baiju Meswani
aae9a52e8b
Avoid pushing cpu package to https://download.onnxruntime.ai/ (#17238) 2023-08-21 15:47:07 -07:00
Dmitri Smirnov
ced0cfbfea
[C#]Fix API Comment (#17236)
### Description
Fix comment reference to a renamed public API.

### Motivation and Context
Avoid confusion of incorrect docs.

We want this in 1.16 release
2023-08-21 15:46:31 -07:00
Hector Li
618f4839d1
[QNN EP] Re-enable some node tests for QNN (#17237)
### Description
Re-enable some node tests for QNN
2023-08-21 13:58:17 -07:00
Changming Sun
e2b6827a59
Add a CUDA 12.x pipeline and improve install_third_party_deps.ps1 (#17231)
### Description
1. Add a CUDA 12.x pipeline
2. Improve install_third_party_deps.ps1: avoid using Start-process.
Directly call the command instead.

### Motivation and Context
Since our official packages and all CI pipelines still use CUDA 11.x, we need extra pipelines to validate our source-code-level compatibility with CUDA 12.x. Note that the prebuilt binaries on our release page are not compatible with CUDA 12.x; please do not report bugs for that.

AB#15152
2023-08-21 13:04:36 -07:00
Emmanuel Ferdman
08ca624d2b
Fix: update hyperlinks to the Jupyter notebooks (#16145)
### Description
This PR fixes broken hyperlinks in the documentation that should lead
users to Jupyter notebooks. Currently, the hyperlinks are not working as
intended. The PR resolves this issue by updating the hyperlinks to
correctly direct users to the Jupyter notebooks.


### Motivation and Context
It fixes broken hyperlinks leading to the Jupyter notebooks.
2023-08-21 09:53:05 -07:00
Sheil Kumar
cbaa008391
Bump DirectML version from 1.12.0 to 1.12.1 (#17225)
Bump DirectML version from 1.12.0 to 1.12.1

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2023-08-20 09:55:38 -07:00
kunal-vaishnavi
4bea5ec513
Add Whisper export with beam search test cases (#17228)
### Description
This PR adds test cases for the custom export of [Whisper with beam
search](https://github.com/microsoft/onnxruntime/tree/main/onnxruntime/python/tools/transformers/models/whisper).



### Motivation and Context
This PR checks that Whisper can be exported and runs with parity.
2023-08-20 00:58:08 -07:00
Chi Lo
9445539e2c
Update dependency for deps.txt (#17220)
https://github.com/microsoft/onnxruntime/pull/17059 updates deps.txt and
we also need to update cgmanifest.json and upload the files to Azure
DevOps


https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=342803&view=results
for testing
2023-08-19 00:43:25 -07:00
Yulong Wang
6fc3fd9ece
[js/webgpu] support Cast operator (#16489)
### Description
support `Cast` operator for webgpu backend.

Cast operator for webgpu backend currently only supports f32, u32, i32
and bool.
2023-08-18 23:51:03 -07:00
Yulong Wang
bf1c62c181
check in build script for webgpu (#17126)
### Description
check in build script for webgpu described in gist
https://gist.github.com/fs-eire/a55b2c7e10a6864b9602c279b8b75dce

Once this PR gets merged, I can update the gist to use this file.
2023-08-18 23:50:29 -07:00
Edward Chen
d6cd41cfc1
[CoreML EP] Add Shape, Gather, and Slice ops (#17153)
Add CoreML EP shape related ops:
- Shape
- Gather
- Slice

Add support for int64/int32 inputs in CoreML EP.
2023-08-18 22:34:34 -07:00
Edward Chen
2b4cc24d5c
[CoreML EP] Limit input shapes to at most rank 5 (#17086)
When considering nodes for the CoreML EP, limit input shapes to at most rank 5.
2023-08-18 20:33:40 -07:00
Yulong Wang
3426954525
disable browser stack tests (#17224)
### Description
disable browser stack tests
2023-08-18 17:14:12 -07:00
Changming Sun
3cec88bd12
FIX: memory leak checker is incompatible with std::stacktrace (#17209)
### Description
When I worked on PR #17173, I didn't notice that
onnxruntime\core\platform\windows\debug_alloc.cc also needs to call
dbghelp functions like SymInitialize. If we use the VC runtime's
stacktrace functionality, the VC runtime initializes/uninitializes the
dbghelp library independently, and the VC runtime's stacktrace helper DLLs
get unloaded before our memory leak checker starts to work. Then, when we
call SymSetOptions, it crashes.
More details:
In VC runtime the C++23 stacktrace functions are implemented on top of
dbgeng.dll. In C:\Program Files\Microsoft Visual
Studio\2022\Enterprise\VC\Tools\MSVC\14.37.32822\crt\src\stl\stacktrace.cpp,
you can see it has:
```
                dbgeng = LoadLibraryExW(L"dbgeng.dll", nullptr, LOAD_LIBRARY_SEARCH_SYSTEM32);
```
The dbgeng.dll is a wrapper around dbghelp.dll. It calls SymInitialize
and SymCleanup. dbgeng.dll gets unloaded before our memory leak check
starts to run. In theory we should be able to call SymInitialize again
if the previous user who called SymInitialize has also called
SymCleanup. However, users can use
SymRegisterCallback/SymRegisterCallback64/SymRegisterCallbackW64 to
register callback functions to dbghelp.dll. These callback functions
need to be alive when SymSetOptions(and some other dbghelp APIs) get
called.

2023-08-18 17:10:33 -07:00