Commit graph

7593 commits

Author SHA1 Message Date
Cheng
ea1bdb162f
[NNAPI] Refactor Resize as layout insensitive (#13412)
2022-10-25 16:50:05 +08:00
cloudhan
93f7a97a6d
Exclude hipify option from policheck (#13431) 2022-10-25 16:35:16 +08:00
PeixuanZuo
28f470c26c
[ROCm] Use SkipLayerNorm original implementation in kernel explorer (#13382)
### Description
Wrap the original SkipLayerNorm implementation as a function.
Use it as part of SkipLayerNormTunableOp.
Use it in Kernel Explorer to compare the gap between the TunableOp and
the original implementation.

The profile output looks like this:
`float16 8 512 768 <class
'_kernel_explorer.SkipLayerNorm_half_Original'> 23.48 us 804.04 GB/s

float16 8 512 768 <class '_kernel_explorer.SkipLayerNorm_half_Tunable'>
20.41 us 925.00 GB/s
...`

Co-authored-by: peixuanzuo <peixuanzuo@linmif39a000004.zvflicr54joexhdgnhvmxrxygg.phxx.internal.cloudapp.net>
2022-10-24 22:00:24 -07:00
cloudhan
2748f38362
Drop hip_add_library (#13406)
Switching to use CMake's builtin hip language support.
2022-10-25 12:57:48 +08:00
Yi Zhang
e160688a9b
Skip some failed models winml and training workflows on Windows CPU (#13407)
### Description
1. Update the model name structure in model_tests.cpp to include the source name.
This avoids the error
`Condition test_param_names.count(param_name) == 0 failed. Duplicate
parameterized test name 'BERT_Squad_opset10_CPU'`.
2. Skip some failing models; see https://github.com/onnx/models/issues/568


2022-10-25 10:05:04 +08:00
sumitsays
24818cfd73
[DML EP] Attention Kernel (#13371)
### Description
DML EP kernel for com.microsoft.attention operator. It has been
implemented via DML_Graph. References for this implementation:

1. [Hugging Face Attention for
BERT](310340d0d0/src/transformers/models/bert/modeling_bert.py (L245-L284))
2. Chapter 3 of the O'Reilly book Natural Language Processing with
Transformers, Revised Edition

This PR also

- includes a tiny fix for the QLinearSigmoid kernel: the temporary object
is now stored in a named variable.
- enables four L2 transformers: LayerNorm, Gelu, MatMulScale, and Attention.



### Motivation and Context
- Why is this change required? What problem does it solve? 
Attention is one of the main operators used in Transformer-based models. It
contributes to the overall perf of the DML EP for Transformer models.
- If it fixes an open issue, please link to the issue here. N/A

Co-authored-by: Sumit Agarwal <sumitagarwal@microsoft.com>
Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>
2022-10-24 14:32:37 -07:00
Yi Zhang
1885460776
skip some models failed in dynamic shape infer (#13400)
### Motivation and Context
Some models from model zoo failed in the Linux CPU workflow.
https://github.com/onnx/models/issues/562
Skip them temporarily.

### Verification
Linux CPU CI passed with beta image

https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=789772&view=results
**2022-10-21T13:31:17.6740348Z Skip symbolic shape inference on :
/mnt/vss/_work/1/b/Release/../models/zoo/opset12/Inception-1-int8/inception-v1-12-int8.onnx**
2022-10-21T13:31:17.6740998Z Running symbolic shape inference on :
/mnt/vss/_work/1/b/Release/../models/zoo/opset12/DenseNet-121-12-int8/densenet-12-int8.onnx
2022-10-21T13:31:17.6741618Z Running symbolic shape inference on :
/mnt/vss/_work/1/b/Release/../models/zoo/opset12/MNIST-12/mnist-12.onnx
**2022-10-21T13:31:17.6742207Z Skip symbolic shape inference on :
/mnt/vss/_work/1/b/Release/../models/zoo/opset12/SSD-int8/ssd-12-int8.onnx**
2022-10-21T13:31:17.6742898Z Running symbolic shape inference on :
/mnt/vss/_work/1/b/Release/../models/zoo/opset12/ResNet50_fp32/resnet50-v1-12.onnx
2022-10-21T13:31:17.6743544Z Running symbolic shape inference on :
/mnt/vss/_work/1/b/Release/../models/zoo/opset12/MobileNet
v2-1.0-fp32/mobilenetv2-12.onnx
2022-10-21T13:31:17.6744259Z Running symbolic shape inference on :
/mnt/vss/_work/1/b/Release/../models/zoo/opset12/ResNet101_DUC_HDC-12/ResNet101-DUC-12.onnx
2022-10-21T13:31:17.6744891Z Running symbolic shape inference on :
/mnt/vss/_work/1/b/Release/../models/zoo/opset12/YOLOv3-12-int8/yolov3-12-int8.onnx
2022-10-21T13:31:17.6745501Z Running symbolic shape inference on :
/mnt/vss/_work/1/b/Release/../models/zoo/opset12/AlexNet/bvlcalexnet-12.onnx
2022-10-21T13:31:17.6746114Z Running symbolic shape inference on :
/mnt/vss/_work/1/b/Release/../models/zoo/opset12/ZFNet-512-int8/zfnet512-12-int8.onnx
**2022-10-21T13:31:17.6746768Z Skip symbolic shape inference on :
/mnt/vss/_work/1/b/Release/../models/zoo/opset12/SSD-MobilenetV1-12-int8/ssd_mobilenet_v1_12-int8.onnx**
2022-10-25 01:48:46 +08:00
Yi Zhang
143725604e
Skip some models failed in Windows CPU C# tests (#13395)
### Description
For models from the model zoo, in the C# tests of the Windows CPU CI:
- skip models whose name contains int8 or qdq;
- skip some models (VGG16, VGG19) in the x86 workflow.
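
The skip rule described above can be sketched as a small predicate. This is only an illustration in Python, not the C# test code from the PR; the function name `should_skip`, the `is_x86` flag, and the case-insensitive matching are assumptions made for the example.

```python
def should_skip(model_name: str, is_x86: bool = False) -> bool:
    """Illustrative skip rule for model-zoo models (hypothetical helper)."""
    lowered = model_name.lower()  # assumption: match case-insensitively
    # Quantized models are identified by "int8" or "qdq" in the name.
    if "int8" in lowered or "qdq" in lowered:
        return True
    # VGG16 / VGG19 are skipped only in the x86 workflow.
    if is_x86 and ("vgg16" in lowered or "vgg19" in lowered):
        return True
    return False
```

For example, `should_skip("ssd-12-int8")` is true on every platform, while `should_skip("vgg16-12")` is true only when `is_x86=True`.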

### Motivation and Context
These models always failed in Windows CPU C# tests 

(https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=789442&view=results)


### verified

https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=789861&view=results
C# tests passed
2022-10-22 13:54:24 +08:00
Jian Chen
397edf9918
Bumping up version number to 1.14.0 on main branch (#13401)
### Description
Bumping up version number to 1.14.0



2022-10-21 19:16:44 -04:00
Ye Wang
928c9889a3
A few fixes for generative model ops (#13363)
### Description

Fix a bug in GreedySearch Op when batch > 1
Support custom attention mask in GreedySearch and BeamSearch with GPT2 


2022-10-21 15:00:18 -07:00
sumitsays
62cc927f05
[ORT+DML] Validate DML EP header files in ORT+DML NuGet package (#13359)
### Description
Today, the ORT+DML NuGet package does not validate the existence of the DML
EP header files and DML DLLs. This change extends the existing Python
script to verify the existence of the DML EP related headers.
For DML as a dependent package, we will use another task, and it
will be a separate PR.

### Motivation and Context
- Why is this change required? What problem does it solve?
Proactively verifies the ORT+DML release candidate, rather than waiting for a
customer to raise an issue after it is published to NuGet.
- If it fixes an open issue, please link to the issue here. N/A

Co-authored-by: Sumit Agarwal <sumitagarwal@microsoft.com>
2022-10-21 11:10:26 -07:00
cloudhan
a8701c2a59
Test TunableOp GEMM and MatMul (#13378)
1. Extends `OpTester` class with builder pattern to ease the parameter passing.
2. Add run option `kOpTesterRunOptionsConfigTestTunableOp` for testing purpose and let rocm ep subscribe to it.
3. Use the new builder pattern interface to launch test, with tunable op tests enabled.
2022-10-21 16:44:41 +08:00
cloudhan
928c9fc348
Hipify during build instead of before cmake config (#13333)
### Description

Currently, hipify runs before CMake is configured, and CMake then globs
the directories. This change gets rid of that customized Python threading
logic and lets the build system itself generate the files.

This also supersedes the half-baked branch
[sukha/hipify-with-cmake](https://github.com/microsoft/onnxruntime/tree/sukha/hipify-with-cmake)
2022-10-20 22:46:22 -07:00
Yi Zhang
bb16ee712e
skip 2 models in C# test (#13384)
### Motivation and Context

these 2 models are also skipped in gtest

fc12abf6b1/onnxruntime/test/providers/cpu/model_tests.cc (L119-L122)
2022-10-21 09:01:34 +08:00
George Wu
7a3486c3ee
enable arm32/arm64 target for .net apps built against OnnxRuntime.ML.OnnxRuntime (#13385)
couldn't build arm64 .net app due to target file not allowing it.
2022-10-20 15:34:36 -04:00
Adam Louly
bed169192d
Windows build fix for on device training. (#13354)
### Description
This is a fix for on device training wheel build.

### Motivation and Context
When building the Linux wheel, PathString is treated the same as std::string,
but building the wheel on Windows fails because the std::string needs to be
cast to a PathString.

This error was found manually because there is no pipeline that uses
--enable_training_on_device on Windows.

Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-10-20 09:58:02 -07:00
Jian Chen
ac5948cb48
Fix bug for percentile calibration module. (#13376)
### Description
Fix bug for percentile calibration module.


2022-10-20 12:33:07 -04:00
cloudhan
fc12abf6b1
Enable/Disable tunable GEMM by using tunable switch in provider options and env var (#13116)
Related PRs #12853

This allows the user to enable/disable tunable GEMM on demand.
2022-10-19 22:35:08 -07:00
PeixuanZuo
4b2b588895
[ROCm] Fix azcopy issue on ROCm ci pipeline (#13365)
### Description
Use a SAS token to fix the error `failed to perform copy command due to error:
no SAS token or OAuth token is present and the resource is not public`.

Generate a SAS token for the target data, add it to Key Vault, and use it as
a pipeline variable.


Co-authored-by: peixuanzuo <peixuanzuo@linmif39a000004.zvflicr54joexhdgnhvmxrxygg.phxx.internal.cloudapp.net>
2022-10-20 12:08:57 +08:00
cloudhan
24b25df641
Add verbose level log for TunableOp (#13369) 2022-10-19 20:59:48 -07:00
PeixuanZuo
665fb346ab
[ROCm] set parallel=16 when build on ROCm CI (#13368)
### Description
The ROCm CI build step takes more than one hour. Set parallel=16 when building
on ROCm CI to reduce build time.

Co-authored-by: peixuanzuo <peixuanzuo@linmif39a000004.zvflicr54joexhdgnhvmxrxygg.phxx.internal.cloudapp.net>
2022-10-20 11:36:00 +08:00
Vincent Wang
67150baa8d
[ORTModule] ATen Support for aten::upsample_nearest (#13364)
ATen support for aten::upsample_nearest, which is required for
Huggingface's diffusers model training using ORTModule.
2022-10-20 08:30:04 +08:00
Vincent Wang
b6b3f41636
Fixes of Hierarchical ORTModule and ORTModule PythonOp (#13347)
The PR applies some fixes to Hierarchical ORTModule and ORTModule
PythonOp.

For Hierarchical ORTModule:
- Don't wrap a module if the caller calls a function other than the
forward() function
- Support a single module instance being called multiple times with different
types of inputs
- Check whether a module can be wrapped from top to bottom instead of from
bottom to top

For ORTModule PythonOp:
- Add env variable control to allow using
torch.utils.checkpoint.CheckpointFunction
- Add env variable control to skip registering some autograd functions so
that there is no conflict for some models.
2022-10-20 08:16:03 +08:00
Adrian Lizarraga
418304743d
[EP-Perf-Dashboard] Update table schemas (#13327)
Updates EP perf benchmarking scripts to upload new data with an improved table schema. In order to preserve compatibility with the current benchmarking pipeline, we still upload data that uses the old schema as well. These changes are required in order to improve data filtering capabilities and general UX in dashboards that visualize this data.

Details:
- EP names no longer hardcoded as columns for tables that store inference latency, session creation times, memory usage, and model/EP status.
- Add explicit branch, commit ID, and commit date columns to all tables
- Improvements to the docker image building scripts (simplify docker image build; support installing binary TensorRT packages)
- Remove use of deprecated DataFrame.append in favor of pandas.concat.
2022-10-19 16:15:05 -07:00
Chi Lo
86c5c07ea4
TRT EP race condition fix during ep compile time (#13356)
### Description
The TRT EP can hit a race condition when multiple threads
serialize/deserialize the engine during EP compile time.
Suppose one thread is serializing the engine and has not yet
written all the data to the file. If another thread then finds that the
engine file exists and begins to deserialize it, that thread ends up
deserializing a corrupt file.
The fix is to put a lock around engine deserialization/serialization,
engine build and context build.
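
The locking idea can be illustrated with a minimal Python sketch. It is not the TensorRT EP code (which is C++); `load_or_build_engine`, the module-level `_engine_lock`, and the `build` callback are hypothetical names, and the sketch only assumes that the existence check, the read, the build, and the write all happen under one shared lock.

```python
import os
import threading

# Hypothetical: a single lock shared by every thread that touches the cache file.
_engine_lock = threading.Lock()


def load_or_build_engine(path: str, build) -> bytes:
    """Return cached engine bytes, building and caching them if the file is absent."""
    with _engine_lock:
        # No other thread can be half-way through writing `path` here,
        # because the write below happens under the same lock.
        if os.path.exists(path):
            with open(path, "rb") as f:
                return f.read()
        data = build()  # stands in for the expensive engine build
        with open(path, "wb") as f:
            f.write(data)
        return data
```

Without the lock, a second thread could see the file as soon as it is created and read it before the first thread has finished writing. With the lock, the first caller builds and writes the file, and later callers read the complete file.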



### Motivation and Context
The TensorRT EP Windows CI sometimes fails because the
`TensorrtExecutionProviderTest.MultiThreadsTestWithOneSessionSingleThreadInference`
unit test fails (this PR renames it to
SessionCreationWithMultiThreadsAndInferenceWithMultiThreads). This is
very likely due to the race condition.
The TensorRT CI failure has also been reported
[here](https://github.com/microsoft/onnxruntime/issues/13030)
2022-10-19 11:19:10 -07:00
Scott McKay
565da71275
Make 'env' argument to Session const (#13362)
### Description
The Env argument does not need to be mutable to call the underlying C
API. Update the Ort::Session ctor to have a const Env.

All other changes are from clang-format running. 

### Motivation and Context
Cleanup
2022-10-19 14:23:24 +10:00
Vincent Wang
9efa8e20bb
Add Symbolic Shape and Type Infer for aten::group_norm (#13348)
Add symbolic shape and type infer for aten::group_norm.
2022-10-19 10:37:33 +08:00
Edward Chen
2fa18ea77e
[React Native CI] Record more info to debug E2E test (#13329)
Record more info from the React Native CI E2E test. In particular, log the view hierarchy when exiting the test and dump logs from Android emulator to the build output.
2022-10-18 17:21:28 -07:00
Dmitri Smirnov
9189ebb415
Optimize slicing when possible by copying bigger blocks at once (#13261)
### Description
Currently, SliceIterator copies at most one inner dimension's worth of
elements at a time. However, in many slices several inner dimensions can be
copied at once.
Furthermore, even if a dimension is sliced, it may use step 1 and
therefore have a contiguous block of inner dimensions that can be copied
at once.

### Motivation and Context

For example, take input `[N, C, H, W]` with slice `[:, :, i:, :]`, giving
output `[N, C, H-i, W]`. That is, we slice along a single axis with step = 1.
The current implementation does `C * (H-i)` memcpy calls of W elements each.
With this change we can do `C` memcpy calls of `(H-i)*W` elements each.

The optimization produces ~11% savings on certain internal models.
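
The copy counts in the example can be reproduced with a small Python sketch. This is not SliceIterator itself; `copy_plan` is a hypothetical helper that assumes exactly one non-innermost axis is sliced as `[start:]` with step 1 and all other axes are taken whole. It counts copies over the whole tensor, so it matches the figures above when N = 1.

```python
from math import prod


def copy_plan(shape, sliced_axis, start):
    """Return ((copies, elems_per_copy) before, (copies, elems_per_copy) after)."""
    # Assumption: the sliced axis is not the innermost one.
    assert 0 <= sliced_axis < len(shape) - 1
    sliced_len = shape[sliced_axis] - start
    outer = prod(shape[:sliced_axis])      # axes outside the sliced axis
    inner = prod(shape[sliced_axis + 1:])  # axes inside the sliced axis
    innermost = shape[-1]
    total_elems = outer * sliced_len * inner
    # Before: one copy per innermost row of the output.
    before = (total_elems // innermost, innermost)
    # After: with step 1, the sliced axis and everything inside it form one
    # contiguous block per outer index.
    after = (outer, sliced_len * inner)
    return before, after
```

With `shape=(1, 3, 8, 5)`, `sliced_axis=2` and `start=2` (so C=3, H-i=6, W=5), the baseline is 18 copies of 5 elements and the optimized plan is 3 copies of 30 elements.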
2022-10-18 14:41:46 -07:00
Dmitri Smirnov
f5e3165cc3
Fix move Base::operator= (#13355)
### Description
Base::operator= move is broken, loses a valid ptr.

### Motivation and Context
Address
https://github.com/microsoft/onnxruntime/pull/13215#discussion_r997814275
2022-10-18 13:07:40 -07:00
Jake Mathern
f96f222526
Change CPU EP behavior with auto_pad when ConvTranspose output shape is specified. (#13311)
### Description
Based on the ORT spec for ConvTranspose:

```
output_shape can also be explicitly specified in which case pads values are auto generated using these equations:

total_padding[i] = stride[i] * (input_size[i] - 1) + output_padding[i] + ((kernel_shape[i] - 1) * dilations[i] + 1) - output_shape[i]
If (auto_pads == SAME_UPPER): pads[start_i] = total_padding[i]/2; pads[end_i] = total_padding[i] - (total_padding[i]/2)
Else: pads[start_i] = total_padding[i] - (total_padding[i]/2); pads[end_i] = (total_padding[i]/2).
```
However the CPU EP logic differs. Basically, unless SAME_UPPER is
specified, the default behavior (for VALID,NOTSET,SAME_LOWER) should be
SAME_LOWER.

I think this is the pragmatic fix, however it's perhaps still not
totally up to standard.
In the case tested, the operator is actually only valid if padding is
inserted. Perhaps it "should" throw some error then, if auto_pad is not
SAME_UPPER or SAME_LOWER, as the spec also mentions:

"VALID mean no padding." (For convtranspose-1 but this was removed in
convtranspose-11, making it less clear.)
"NOTSET, which means explicit padding is used" (should technically
require explicit padding then, and not generate it)

HOWEVER, changing it to throw errors could do more harm than good. For
now, probably just best to make it consistent.

### Motivation and Context
We noticed that there was a discrepancy in one of the DML tests between
CPU and DML.
auto_pad is not specified, and DML defaults to SAME_LOWER behavior,
whereas the CPU EP was using SAME_UPPER behavior.

```json
    {
      "graph_name": "ConvTranspose output_shape with even strides odd kernel autopad NOTSET",
      "op_type": "ConvTranspose",
      "dilations": [1,1],
      "group": 1,
      "strides": [2,2],
      "kernel_shape": [3,3],
      "output_shape": [1,1,4,4],
      "X": {"dims": [1,1,2,2], "function": "iota"},
      "W": {"dims": [1,1,3,3], "value": [1,2,3,4,5,6,7,8,9]},
      "B": [1],
      "Y": {"dims": [1,1,4,4], "value": [1,5,6,7,5,17,15,19,11,25,16,19,17,40,25,28]},
      "T": "float32"
    }
```
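
As a worked check, the pad equations quoted in the description can be evaluated for one spatial axis of this test case. The helper below is a sketch: `conv_transpose_pads` is a hypothetical name, it handles a single axis, and it assumes `total_padding` is non-negative so that integer division behaves the same as in C++.

```python
def conv_transpose_pads(input_size, stride, kernel, dilation,
                        output_padding, output_shape, auto_pad):
    """Return (pad_start, pad_end) for one spatial axis per the quoted equations."""
    total = (stride * (input_size - 1) + output_padding
             + ((kernel - 1) * dilation + 1) - output_shape)
    if auto_pad == "SAME_UPPER":
        return total // 2, total - total // 2
    # Every other auto_pad value takes the "Else" branch of the spec text.
    return total - total // 2, total // 2
```

For the case above (input 2, stride 2, kernel 3, dilation 1, no output_padding, output 4), `total_padding` is 2 * 1 + 0 + 3 - 4 = 1. SAME_UPPER gives pads (0, 1), while any other value, including NOTSET, gives (1, 0), which is the SAME_LOWER placement the description refers to.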
2022-10-18 12:57:47 -07:00
Hariharan Seshadri
15673b4537
Revert "Fix shape-related issues in FuseConv (#12410)" (#13353)
The commit causes subtle perf regressions in image models (caught by
Anubis). Since we are close to the release, reverting this change for
now so that the regression cause analysis doesn't push the release
timeline.

Once the PR is merged, I will re-open the GH issues that the original PR
closed.

### Motivation and Context
Fix regression in ORT 1.13 RC
2022-10-18 12:30:38 -07:00
Jian Chen
e3982416d3
Fix Bug where zero point isn't correct under entropy calibration (#13346)
### Description
Fix Bug where zero point isn't correct under entropy calibration 



2022-10-18 12:05:40 -04:00
Adam Louly
61ee5585b2
update the nightly build to use the latest ptca image. (#13309)
### Description
updating the ptca image used in the nightly pipeline

Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-10-17 14:12:03 -07:00
Adam Louly
68eff69ab1
Add Utils for federated learning scenarios (#13014)
**Description**: utils for federated learning.

**Motivation and Context**
- This PR includes utils that will be used on federated learning
scenarios.
- Exposing python bindings to some utils, and added a util to calculate
the difference between two buffers.

Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>
2022-10-17 12:39:43 -07:00
PeixuanZuo
b4853a978a
[ROCm] add rocm python package pipeline with --use_rocm_profiling (#13068)
### Description
ROCm developers always need to build the onnxruntime *.whl with
`--enable_rocm_profiling`.
Add a ROCm dev Python package pipeline that produces a *.whl built with
`--enable_rocm_profiling`.
The dev *.whl needs to be uploaded to Azure storage and can then be downloaded from
https://download.onnxruntime.ai/onnxruntime_nightly_rocm53.profiling.html


2022-10-17 10:11:20 +08:00
cloudhan
c4d3c7003f
Refactor provider test utils (#13272)
Refactor the OpTester core logic to make adding more code easier.
2022-10-17 09:46:42 +08:00
Dmitri Smirnov
4a63cd0290
Improve thread pool creation failure handling. (#13313)
### Description
Detect and report thread creation failure on Windows.
Do not throw out of the constructor after the thread is created;
otherwise the thread handle is lost and cannot be joined, resulting in a deadlock.

Make setting a thread priority on Linux consistent with Windows.
Set the thread priority in the thread itself. Log failure properly,
but do not exit the thread.

### Motivation and Context
Address issues https://github.com/microsoft/onnxruntime/issues/13291
And
https://github.com/microsoft/onnxruntime/issues/13285#issuecomment-1278063223
2022-10-15 17:57:19 -07:00
Maxiwell S. Garcia
1ab11a111c
ppc64le: mlas: fix both MaximumFloat and MinimumFloat to return NAN (#12628)
Avoid using vec_max/vec_min because their behavior is undefined if one
of the elements is NAN. The Power Vector Intrinsic Programming Reference
says:

"For floating-point types, if both source elements contain signed
zeros, or if either source element contains a NaN, it is
undefined which of the two source elements is copied into
the corresponding result element."

As the unittest Activation.ShortExecute expects NAN, this patch uses
vec_sel and vec_cmpgt to return NAN if one of the elements is NAN.
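
The select-on-compare idea can be shown with a scalar Python analogue. This is not the MLAS code: the real patch works on Power vector registers with vec_cmpgt and vec_sel, whereas `maximum_float` and `minimum_float` below are hypothetical scalar helpers that only illustrate why an explicit NaN check is needed.

```python
import math


def maximum_float(a: float, b: float) -> float:
    """Scalar max that returns NaN when either input is NaN."""
    # Comparison-based select: comparisons with NaN are always False, so this
    # step alone returns `a`, which is NaN only when `a` itself is NaN.
    selected = b if b > a else a
    # NaN is the only value that is not equal to itself.
    if a != a or b != b:
        return math.nan
    return selected


def minimum_float(a: float, b: float) -> float:
    """Scalar min that returns NaN when either input is NaN."""
    selected = b if b < a else a
    if a != a or b != b:
        return math.nan
    return selected
```

Python's built-in `max` shows the same order dependence that makes a plain compare unsuitable here: `max(math.nan, 1.0)` is NaN, but `max(1.0, math.nan)` is `1.0`.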


https://git.openpower.foundation/systemsoftware/Programming-Guides/src/branch/master/Intrinsics_Reference/ch_vec_reference.xml#L26808
2022-10-14 14:43:58 -07:00
fxmarty
4fe6b23699
Fix typo OpTypesToExcludeOutputQuantizatioin (#13096)
Change all occurrences of `OpTypesToExcludeOutputQuantizatioin` into `OpTypesToExcludeOutputQuantization`
2022-10-14 14:11:37 -07:00
donglinb
c4a52820a5
bug fix for symbolic shape infer (#13067) 2022-10-14 14:06:31 -07:00
Jeff Daily
65c67764ae
remove line "ADD model ${WORKSPACE_DIR}/model" in the amdgpu Dockerfile (#12914)
Follow-up to #12707. docker build is broken otherwise; model dir is
gone.
2022-10-14 13:17:28 -07:00
Ted Themistokleous
a561fde126
MIGraphX Execution Provider: Stream Synchronization (#12899)
**Description**: Changes to the MIGraphx execution provider code to
allow for stream synchronization on the gpu side

**Motivation and Context**
Performance boost by removing redundant host to device synchronizations 

The current implementation of the execution provider continuously calls
hipDeviceSynchronize() between computations which adds overhead and an
idle wait between the GPU's computations. This is noticeable during
device

This change leverages new functionality that's been added to MIGraphX to
allow for GPU side synchronization which avoids the need for
host->device waits.

To maintain backwards compatibility with older MIGraphX versions, the
compile time define MIGRAPHX_STREAM_SYNC has been added to the API to
allow older versions to operate with newer builds of onnxruntime without
loss of functionality relative to the current feature set as of 08/09/22.

Co-authored-by: Ted Themistokleous <tthemist@amd.com>
2022-10-14 10:23:51 -07:00
Yi Zhang
3c08f24efc
Use SearchOption for .NET Framework (#13321)
### Description
As Title


### Motivation and Context
.NET Core has EnumerationOptions but .NET Framework doesn't

**netframework-4.6.1**

https://learn.microsoft.com/en-us/dotnet/api/system.io.directory.enumeratedirectories?view=netframework-4.6.1#system-io-directory-enumeratedirectories(system-string-system-string-system-io-searchoption)

**net-5.0**

https://learn.microsoft.com/en-us/dotnet/api/system.io.directory.enumeratedirectories?view=net-5.0

It breaks the C# .NET Framework step in the Packaging Pipeline

Testing workflow

https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=238516&view=results
2022-10-14 10:22:05 -07:00
cloudhan
790e363909
Reland: Change ROCm to use tunable GEMM (#13231)
Reland: Change ROCm to use tunable GEMM (#12853)
2022-10-13 21:49:42 -07:00
George Wu
074e009e69
fix jetson build break caused by https://github.com/microsoft/onnxruntime/pull/12949 (#13310)
https://github.com/microsoft/onnxruntime/pull/12949 breaks Jetson/ARM
builds due to
invalid narrowing conversion from "char" to "signed char". (on ARM, char
is unsigned)
This was reported by
https://github.com/microsoft/onnxruntime/issues/13285#issuecomment-1276122505
2022-10-13 21:31:36 -07:00
Wei-Sheng Chin
dc324b1d90
[LazyTensor] Make LORT Build Again with Latest PyTorch (#13303)
`python setup.py develop` doesn't install PyTorch as a normal package in
site-packages anymore, and the user must stay in PyTorch's root
directory to call `import torch`. This breaks LORT tests because
they contain `import torch` and are called from outside PyTorch's root
directory. To make PyTorch a normal package again, this PR builds PyTorch
with `python setup.py install`.
2022-10-13 13:56:17 -07:00
Hector Li
db32eacda1
make the UNSIGNEDPD_CHECK for Windows only (#13260)
Fix issue reported from
https://github.com/microsoft/onnxruntime/issues/13247
The UNSIGNEDPD_CHECK should apply to Windows only
2022-10-13 11:08:35 -07:00
Vincent Wang
807b2f4dd5
[ORTModule] Use Env Variable to Set Provider Option cudnn_conv_algo_search (#13296)
This PR adds support for using an env variable to set the provider option
cudnn_conv_algo_search, so that the user can choose a better conv algo search
method for running the model. This is a quick fix to unblock the MoE model
test. Another PR will design and implement an ORTModule config
so that ORTModule can be configured from a Python script or config file
instead of env variables.
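
A minimal sketch of the pattern, reading an environment variable and turning it into a provider option, might look like the following. The variable name `EXAMPLE_CONV_ALGO_SEARCH` and the helper `cuda_provider_options` are hypothetical and are not the names used by ORTModule. The accepted values are the ones the CUDA execution provider documents for `cudnn_conv_algo_search`.

```python
import os

# Hypothetical variable name, for illustration only.
ENV_VAR = "EXAMPLE_CONV_ALGO_SEARCH"
VALID_VALUES = ("EXHAUSTIVE", "HEURISTIC", "DEFAULT")


def cuda_provider_options(environ=os.environ) -> dict:
    """Build a provider-options dict from the environment (sketch)."""
    # Fall back to EXHAUSTIVE, the CUDA EP's documented default.
    value = environ.get(ENV_VAR, "EXHAUSTIVE").upper()
    if value not in VALID_VALUES:
        raise ValueError(f"{ENV_VAR} must be one of {VALID_VALUES}, got {value!r}")
    return {"cudnn_conv_algo_search": value}
```

The resulting dict has the shape that onnxruntime accepts as CUDA provider options, for example paired with `"CUDAExecutionProvider"` in the `providers` argument of `InferenceSession`.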
2022-10-13 15:36:21 +08:00
Vincent Wang
6fb70a82df
[ORTModule] Update Supported DeepSpeed Version for FP16_Optimizer (#13305)
Update the highest supported DeepSpeed version from 0.7.1 to 0.7.3 for
FP16_Optimizer. Also add version info to the warning log.
2022-10-13 13:03:01 +08:00