7275 commits

Author SHA1 Message Date
Ye Wang
9aefcc251f
fix some prefast warnings (#12730)
fix warnings
2022-08-30 12:52:59 -07:00
cloudhan
9680ffd842
Fix rocm build caused by #12699 (#12787) 2022-08-30 20:26:16 +08:00
Yi Zhang
b4f6dad7c9
increase timeout limit of mac silicon package workflow (#12784)
increase timeout
2022-08-30 13:57:01 +08:00
cloudhan
9907b59a1e
Change cuda and rocm error checking helpers to return Status (#12699)
* CudaCall returns Status in non-throw and void in throw

* RocmCall returns Status in non-throw and void in throw
2022-08-30 13:18:47 +08:00
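The "Status in non-throw and void in throw" behaviour described in this commit can be illustrated with a compile-time switch on the return type. The following is only a minimal sketch, not the code from the commit: `Status`, `CheckCall`, and the integer error code are hypothetical stand-ins, and no CUDA or ROCm API is involved. It assumes a C++17 compiler.

```cpp
#include <stdexcept>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a status type; not the library's real class.
struct Status {
  bool ok = true;
  std::string message;
};

// Hypothetical checker. THROW_ON_ERROR selects the return type at compile time:
// void when errors are thrown, Status when they are reported to the caller.
template <bool THROW_ON_ERROR>
std::conditional_t<THROW_ON_ERROR, void, Status> CheckCall(int ret_code, const char* expr) {
  if (ret_code != 0) {
    std::string msg = std::string(expr) + " failed with code " + std::to_string(ret_code);
    if constexpr (THROW_ON_ERROR) {
      throw std::runtime_error(msg);
    } else {
      return Status{false, std::move(msg)};
    }
  }
  if constexpr (!THROW_ON_ERROR) {
    return Status{};  // success in the non-throwing variant
  }
}
```

With this shape, a caller of `CheckCall<false>` has to inspect the returned `Status`, while a caller of `CheckCall<true>` gets nothing back and relies on the exception.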
pengwa
a0c25e5c2f
Fix segmentation fault for alltoall (#12701)
* fix segmentation fault

* formatting
2022-08-30 11:27:14 +08:00
PeixuanZuo
19ca2a0089
[ADD] python package pipeline for ROCm5.2.3 (#12770)
* [TEST] test rocm5.2.3

[TEST] rm torchversion

[Update]sort

Co-authored-by: Ubuntu <peixuanzuo@peixuanzuomi200vm.zvflicr54joexhdgnhvmxrxygg.phxx.internal.cloudapp.net>
2022-08-30 11:05:59 +08:00
Chen Fu
d761a7ceb3
Pre-processing of Quantization (#12729)
Shape Inference and Model Optimization before Quantization

Model quantization with the QDQ format, i.e. inserting QuantizeLinear/DeQuantizeLinear on
the tensor, requires tensor shape information to perform at its best. Currently, shape inferencing
works best with an optimized model. As a result, it is highly recommended to run quantization
on an optimized model with shape information.

This change adds pre-processing code that performs model optimization and shape inferencing in the following three steps:

1. Symbolic shape inference
2. Model optimization
3. ONNX shape inference

At the same time, we recommend turning off model optimization during quantization,
because the optimization might change the computation graph, making it harder for the QDQ debugger
to locate matching tensors between the original and the quantized models.
2022-08-29 15:47:52 -07:00
Edward Chen
1ce14e752b
Increase timeout for clean-build-docker-image-cache-pipeline. (#12776) 2022-08-29 15:30:35 -07:00
RandySheriffH
17ccd6fa02
Fix shape-related issues in FuseConv (#12410)
* fix shape mismatch in FuseConv

* remove zeroed bias

* offset Z dim

* append UT

* add testing model

* remove output

* remove commented

* fix comments

* refactor output msg

* narrowly restrict the use of cudnn...ActFwd

* reset changes in cudnn_common

* add test cases covering all path

* move cases to conv test

* remove extra space

* fix build err

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2022-08-29 10:47:19 -07:00
G. Ramalingam
233f8c210e
Handle initializers in subgraphs when inlining (#12758) 2022-08-29 10:24:17 -07:00
tvkai
4f244e48d5
fix CalculateHash for Big Endian platforms. (#12752)
* fix CalculateHash for Big Endian platforms.
2022-08-29 10:23:58 -07:00
Baiju Meswani
b83ea3c2ff
Address prefast static analysis warnings (#12756) 2022-08-29 10:09:32 -07:00
Yi Zhang
27304d9082
gcc version should not be less than 7 (#12771) 2022-08-29 23:49:29 +08:00
Cassie Breviu
3e57cd88fc
Csharp docfx update (#12755)
* update dest to csharp folder, update ci to remove unused files, update git ignore

* add test branch to ci
2022-08-29 08:13:45 -05:00
Baiju Meswani
80c8d934b8
Add debug option to packaging pipeline (#12685) 2022-08-26 20:25:52 -07:00
mwootton
817dc94345
Add first pass of rocm kernel profiler (#10911)
* Add first pass of rocm kernel profiler

* Clean up rocm_profiler. Format args. Demangle kernel names.
Add Api EventRecords

* Remove debug output

* Temporarily disable profiling unit test 'api record check' for cupti

* Fix compile error for non-gpu builds

* Use common file for demangle and pid/tid. Namespace ThreadUtil. Fix gpu buffer clearing.

* Merge demangle into profiler_common

* Merge demangle into profiler_common part 2

* Style cleanup

* Resolve linking issues via ProviderHost interface

* Demangle cuda kernel names

* Clean up comments

* Fix formatting

* Fix anal retentive formatting
2022-08-26 19:38:03 -07:00
Adam Louly
ee543a47f6
upgrade cuda version on ci pipelines (training CI pipelines) (#12708)
* upgrade cuda version on ci pipelines

* keeping folder name same

* keeping folder name same

* setting manual seed for primitive test case

* resolving comments

* changing atol and rtol only for test case

Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-08-26 16:51:19 -07:00
edgchen1
64e8806148 Address some static analysis warnings. 2022-08-26 15:05:53 -07:00
edgchen1
c270ea148a Move 'using common::Status;' from common.h to status.h. 2022-08-26 15:05:53 -07:00
Dmitri Smirnov
3ff75fa05f
Address static analysis warnings (#12711)
Address static analysis warnings
2022-08-26 14:24:14 -07:00
Baiju Meswani
34d90dd5bd
mac-objc-static-analysis-ci-pipeline increase timeout (#12737) 2022-08-26 12:49:49 -07:00
Chi Lo
c9fd193ef6
Make TRT EP fully support control flow op and its subgraphs (#12692)
* sync graph proto in node's attributes

* Don't fuse nodes of control flow op until later in control flow op level

* remove unnecessary ep functions

* remove unnecessary ep functions

* remove unnecessary ep functions

* missing 'override' keyword which makes MacOS/Web CI fail

* Add one more test run for Test3LayerNestedSubgraph with disabling graph optimization

* Update the comments to better understand the 4 cases
2022-08-26 12:45:47 -07:00
Yi-Hong Lyu
a972db06bf
Disable SYMMQGEMM benchmark for CPU other than ARM (#12739)
Also, use MlasSymmQgemmPackBSize instead of MlasGemmPackBSize
2022-08-26 01:47:21 -07:00
cloudhan
5bdb1d4146
Add Tunable GEMM composed from rocblas and composable kernels (#12599)
* Add tunable gemm
2022-08-26 14:32:56 +08:00
cloudhan
46c074a6c8
Update composable kernel and enable experimental inter wave scheduling (#12626)
Update ck to latest master and enable interwave scheduling
2022-08-25 22:19:41 -07:00
Adam Louly
3bb5fb0f90
moving training pipelines from cuda 11.5 to 11.6 and deprecating 11.3 (packaging pipeline) (#12688)
* moving training pipelines from cuda 11.5 to 11.6 and deprecating cuda 11.3

* change to cuda 11.6.2

* change pytorch's & torchvision's cuda version to 11.6

* specify deps version to 11.6.2

* update pytorch and torch text version

* torch 1.12.1

* change torchvision and torchtext version to be compatible with torch 1.12.1

* change cuda to 11.6 for cuda_home compatibility

Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-08-25 22:12:01 -07:00
cloudhan
f76b40aa5b
Change TunableOp to use a type erased interface (#12597)
* Change to type erased interface, so that there is no need to implement a class for a simple kernel launch function
2022-08-25 19:46:04 -07:00
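The idea of a type-erased interface, where a simple launch function needs no dedicated class, can be sketched with `std::function`. This is an illustration under assumptions, not the TunableOp code from the commit: `GemmParams`, `Op`, `LaunchNaive`, `LaunchTiled`, and `RunAll` are hypothetical names, and the integer return values are placeholders rather than real kernel results.

```cpp
#include <functional>
#include <vector>

// Hypothetical parameter bundle for a kernel launch; not the library's real type.
struct GemmParams {
  int m;
  int n;
};

// Type-erased op: anything callable as int(const GemmParams&) fits.
using Op = std::function<int(const GemmParams&)>;

// A plain function can be registered directly, with no wrapper class.
int LaunchNaive(const GemmParams& p) { return p.m * p.n; }

// A stateful functor still works when a candidate needs configuration.
struct LaunchTiled {
  int tile;
  int operator()(const GemmParams& p) const { return (p.m / tile) * (p.n / tile); }
};

// Invokes every candidate in order and collects the results.
std::vector<int> RunAll(const std::vector<Op>& ops, const GemmParams& p) {
  std::vector<int> results;
  for (const Op& op : ops) results.push_back(op(p));
  return results;
}
```

Because free functions, functors, and lambdas all convert to `Op`, a list of candidates can mix them freely. The trade-off of this kind of erasure is an indirect call per invocation.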
Cheng
baf141a084
Enable xnnpack EP in Android AAR package (#12720)
* take new features to export symbols

* comments to explain why
2022-08-26 10:29:23 +08:00
Scott McKay
8483b9c6e3
MacOS pipeline and MAUI CoreML fixes (#12724)
* Add asm statement to model.mm to force linker to link against CoreML.Framework.

Update targets.xml as per Rolf's suggestions

* Remove explicit numpy version from macos build. We don't specify it for other CIs and the version specified doesn't have a pre-built 3.10 wheel. This leads to the CI attempting to build numpy which fails.
2022-08-26 08:51:37 +10:00
abhi-ort
ebff15d743
Pinning manual seed (#12714) 2022-08-25 10:09:02 -07:00
Cassie Breviu
e85dce8cea
Add csharp docfx (#12596)
* add docfx and gh action to build docs

* kick off build from feature branch

* Fix LGTM linting

* update az pipeline to win22 & remove nuget install

* remove azure ci changes

* fix implicit using to support 5.0

* fix more js issues

* remove resource designer changes

* remove space

* fix linting misspellings in autogenerated js temp

* fix misspellings in generated code

* delete log file
2022-08-25 09:51:32 -05:00
Vincent Wang
5104c7dbd3
Fix Prefast Warnings (#12717)
fix prefast warnings
2022-08-25 17:09:37 +08:00
Yulong Wang
5be3e87c71
[js] upgrade minimist@1.2.6 (#12689) 2022-08-25 01:40:42 -07:00
Hariharan Seshadri
cde504ebbf
Fix/Suppress some VC static analyzer warnings (#12713) 2022-08-24 23:39:40 -07:00
Yi Zhang
dee2fdffb0
Remove debug build/test in Mac CPU training (#12698)
* run mac training in parallel

* update jobname

* remove debug build/test
2022-08-25 13:38:53 +08:00
Yi Zhang
d91f017da1
remove redundant publish unit test results (#12697)
rm redundant publish unit test results
2022-08-25 11:18:07 +08:00
Cheng
eba4f77d00
enable xnnpack in default_full_aar_build_settings (#12682) 2022-08-25 10:41:06 +08:00
Pranav Sharma
f1528ea50f
Fix arithmetic overflow warning. (#12712)
Fix arithmetic overflow warning, using the fix suggested by the static analysis tool:
Arithmetic overflow: Using operator '+' on a 4 byte value and then casting the result to a 8 byte value.
Cast the value to the wider type before calling operator '+' to avoid overflow (io.2).
2022-08-24 18:27:30 -07:00
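The warning quoted in this commit comes down to the order of the addition and the widening cast. The sketch below shows that difference with two hypothetical helpers, `AddThenWiden` and `WidenThenAdd`, which are not taken from the commit. It uses unsigned 32-bit operands so that the wrapped result is well defined. With signed operands the narrow addition would be undefined behaviour on overflow.

```cpp
#include <cstdint>

// Hypothetical helper showing the flagged pattern: the sum is computed in
// 32 bits, so it can wrap around before the cast widens it.
inline std::uint64_t AddThenWiden(std::uint32_t a, std::uint32_t b) {
  return static_cast<std::uint64_t>(a + b);
}

// Hypothetical helper showing the suggested fix: one operand is widened
// first, so the addition itself is carried out in 64 bits.
inline std::uint64_t WidenThenAdd(std::uint32_t a, std::uint32_t b) {
  return static_cast<std::uint64_t>(a) + b;
}
```

For example, with `a = 4000000000` and `b = 1000000000`, the first helper wraps modulo 2^32 and the second returns the full sum of 5000000000.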
Changming Sun
7927d525a7
Remove CUDNN path from CI build scripts (#12671) 2022-08-24 18:21:50 -07:00
Dwayne Robinson
3f47119f33
DML EP Fix InstanceNormalization with 3D tensors (#12693)
Fix InstanceNormalization with 3D tensors
2022-08-24 14:58:38 -07:00
Adam Louly
94f76b944e
nightly pipeline build using PTCA image. (#12605)
* nightly pipeline yaml and requirements files

* changed names, removed torchvision installing

* delete old file

Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-08-24 10:40:55 -07:00
Nat Kershaw (MSFT)
0757d51334
Fix Java api docs broken link (#12686) 2022-08-24 09:56:51 -07:00
Vincent Wang
53ecb9e635
Update Supporting DS Version to 0.7.1 for ORTModule (#12696)
update ds version support for fp16_optimizer
2022-08-24 14:56:12 +08:00
Yi Zhang
de3d772995
Check GCC version (#12680)
* check gcc version
2022-08-24 12:10:08 +08:00
Edward Chen
8d657de4b2
Update Newtonsoft.Json version to 13.0.1. (#12691) 2022-08-23 18:45:38 -07:00
abhi-ort
73e5741a9a
Enabling softmax grad and logsoftmax grad on ORT (#12614)
* Enabling softmax grad and logsoftmax grad on ORT

* formatting changes

* formatting changes

* reverting changes

* Changing the OpType
2022-08-23 15:49:02 -07:00
Changming Sun
cb2601c5ea
Update mac-ci.yml to increase macOS build jobs' timeout value to 3 hours (#12675) 2022-08-22 21:31:30 -07:00
Tianlei Wu
8d78f96dfe
[CUDA] Fuse add bias and transpose into one kernel in Attention (#12670)
* fuse add bias and transpose in attention
2022-08-22 15:46:13 -07:00
Chun-Wei Chen
6246662b1d
[Dup] Fix SAME_UPPER/SAME_LOWER (auto_pad attribute) in ConvTranspose (#12537)
* Fix SAME_UPPER/SAME_LOWER (auto_pad attribute) in ConvTranspose

* Bump ONNX 1.10.2 globally

* load ONNX_VERSION from VERSION_NUMBER

* /

* revert deprecate warning in ORT 1.12

* add a comment about why removing cntk_simple_seg

* correct the implementation in DML as well
2022-08-22 15:35:34 -07:00
Yulong Wang
c144acc534
Replace 'master' branch ref to 'main' in the code (#12547) 2022-08-22 10:48:12 -07:00