Commit graph

6376 commits

Author SHA1 Message Date
Changming Sun
56be66a0ab Update c-api-cpu.yml: change nuget linux arm64 RID 2022-02-24 11:15:51 -08:00
Tang, Cheng
7660eeef3e
fix ortmodule's output device info when it runs on ort device (#10616)
Co-authored-by: Cheng Tang <chenta@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-02-24 10:22:55 -08:00
Yufeng Li
446258fa28
fix bug: quantize output of activation op(Relu, Clip) (#10649) 2022-02-24 09:06:04 -08:00
Alexey Gladyshev
7dc7529ec8
[TVM EP] Integrate tests for TVM EP into public onnxruntime CI (#10505)
* add support for bool type

* add TVM EP support for tests

* include TVM EP in python test pool

* fix pylint

* moved technical imports to a separate file

* clean up post build actions & move _ld_preload.py extension to CMake level

* add files for include TVM EP into CI

* implement custom logger for TVM

* replace TVM logging with ONNX RT logging

* update link for TVM EP tutorial

* clean up TVM EP cmake

* add pybind auto enabling for TVM EP

* fix blank spaces

* code review fixes

* replace print with comment

* add list of EP without TVM EP

* enable onnx tests

* disable contrib ops and ml ops

* reuse Dockerfile.ubuntu

* Move install_tvm_test_dependencies.sh out of Docker context dir, update build definition.

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
2022-02-24 16:24:23 +01:00
Scott McKay
ecf064f135
Exclude pdb from nuspec unless it's the winml package. (#10638) 2022-02-24 14:23:00 +10:00
Thomas Rondahl
e076f3a125 Fix incorrect target name
Now updating the arch variable after matching for tar file.
2022-02-23 19:17:34 -08:00
Thomas Rondahl
573e440d35 Fix no ARM64 natives for Linux nuget
Change from aarch64 to arm64 for natives in nuget packages.
2022-02-23 19:17:34 -08:00
Yi-Hong Lyu
bd08f11a58
Upsample support NHWC (#10554)
Implement bilinear interpolation for Upsample (Resize) 4-D input with the
outermost and innermost scale (usually channel of NHWC) as 1.

Besides, I revert the HandleResize back to the original implementation for
TransposeOptimizerTests.TestResize* tests.
2022-02-23 14:27:11 -08:00
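As a rough illustration of what commit bd08f11a58 describes, the sketch below bilinearly upsamples a single H x W x C image while leaving the channel axis at scale 1. It is a minimal pure-Python sketch, not the ORT kernel: the function name `upsample_bilinear_nhwc`, the nested-list layout, and the "asymmetric" coordinate mapping with border clamping are all assumptions made for the example.

```python
def upsample_bilinear_nhwc(image, scale_h, scale_w):
    """Bilinearly upsample one H x W x C image stored as nested lists.

    Assumption: 'asymmetric' coordinate mapping (x_in = x_out / scale),
    clamped at the border. The channel axis keeps a scale of 1.
    """
    in_h, in_w = len(image), len(image[0])
    channels = len(image[0][0])
    out_h, out_w = int(in_h * scale_h), int(in_w * scale_w)

    output = []
    for oy in range(out_h):
        # Map the output row back to a fractional input row.
        fy = min(oy / scale_h, in_h - 1)
        y0 = int(fy)
        y1 = min(y0 + 1, in_h - 1)
        wy = fy - y0
        row = []
        for ox in range(out_w):
            fx = min(ox / scale_w, in_w - 1)
            x0 = int(fx)
            x1 = min(x0 + 1, in_w - 1)
            wx = fx - x0
            # Channels are the innermost loop, so in NHWC the C values of
            # one pixel are read and written next to each other.
            pixel = []
            for c in range(channels):
                top = image[y0][x0][c] * (1 - wx) + image[y0][x1][c] * wx
                bottom = image[y1][x0][c] * (1 - wx) + image[y1][x1][c] * wx
                pixel.append(top * (1 - wy) + bottom * wy)
            row.append(pixel)
        output.append(row)
    return output
```

A batch (the N axis, also scale 1) could be handled by calling this once per image.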
Scott McKay
e0d1d6906a
Merge two helpers involving the kernel def hashes into one file (#10609)
* Merge two helpers involving the kernel def hashes used by ORT format models. Add codeowners entry to ensure updates involving hashes are checked.
2022-02-23 20:46:09 +10:00
Dwayne Robinson
ea7f773a6e
Merge pull request #10619 from microsoft/user/dwayner/DmlDev20220221
Update DirectML EP for ORT 1.11
2022-02-23 01:09:26 -08:00
ytaous
9ba2d9379f
[ROCm] Code sync from CUDA (#10631)
* code sync

* more sync

Co-authored-by: root <root@GCRAMDRR1-MI100-087.redmond.corp.microsoft.com>
2022-02-22 22:40:06 -08:00
Scott McKay
8cfa4b1c17
Fix build errors due to changes in warnings that VS 2022 17.1 produces. (#10621)
Disable warning about padding for abseil-cpp flat_hash_map.

Disable some warnings from compiling the test proto. This also required removing a line in CMakeList.txt where we move a level 4 warning to level 3. That ends up later on the command line and overrides the `/wd4800`. Couldn't find a way to handle that nicely. As we compile with `/W4` the value of moving 4800 to level 3 in dev mode is unclear so simplest was to remove that. Open to suggestions if there's a better way.
2022-02-23 07:32:07 +10:00
Dwayne Robinson
7de86d39d3 Build error int to bool 2022-02-21 22:00:48 -08:00
Rachel Guo
d6a8cba273
[NNAPI QDQ] Add nnapi qdq softmax op support (#10591)
* wip

* save

* update pr comments

* update

Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
2022-02-21 18:00:46 -08:00
Scott McKay
4d3cd2f685
Add helper for optimizing a QDQ format model for usage with ORT. (#10595)
* Add initial helper for optimizing a QDQ format model for usage with ORT.

If a DQ node has multiple consumers it will end up in multiple QDQ node units. This is complicated to handle as each qdq unit could end up being handled by different execution providers. By duplicating the DQ node we simplify this logic.

Generally the duplicate nodes will disappear when the qdq node unit is converted to a single node with a quantized operator. If there are qdq node units that are not able to be converted to use a quantized operator the ORT cleanup (pending) to drop remaining Q->DQ pairs between fp32 nodes can remove any remaining DQ nodes.

* Fix pep8 warning

Co-authored-by: Guoyu Wang <wanggy@outlook.com>
2022-02-21 09:26:19 +10:00
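The DQ duplication described in commit 4d3cd2f685 can be sketched on a toy graph. This is only an illustration under stated assumptions: nodes are plain dicts in topological order, the helper name `duplicate_shared_dq_nodes` is hypothetical, and graph outputs are ignored. It does not use the ONNX or ORT APIs.

```python
def duplicate_shared_dq_nodes(nodes):
    """Give every consumer of a DequantizeLinear (DQ) node its own copy.

    `nodes` is a topologically sorted list of dicts with 'name', 'op',
    'inputs' and 'outputs' keys (a stand-in for a real graph structure).
    Consumer dicts are rewired in place; the new node list is returned.
    """
    # Index the consumers of every value name.
    consumers_of = {}
    for node in nodes:
        for value in set(node["inputs"]):
            consumers_of.setdefault(value, []).append(node)

    result = []
    for node in nodes:
        result.append(node)
        if node["op"] != "DequantizeLinear":
            continue
        out = node["outputs"][0]
        consumers = consumers_of.get(out, [])
        # The first consumer keeps the original DQ; each other consumer
        # gets a duplicate DQ with a fresh output name.
        for index, consumer in enumerate(consumers[1:], start=1):
            new_out = f"{out}_dup{index}"
            result.append({
                "name": f"{node['name']}_dup{index}",
                "op": "DequantizeLinear",
                "inputs": list(node["inputs"]),
                "outputs": [new_out],
            })
            consumer["inputs"] = [
                new_out if v == out else v for v in consumer["inputs"]
            ]
    return result
```

After this pass each DQ output feeds exactly one node, so each QDQ node unit owns its DQ node.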
Ryan Hill
4a79ed62b4
Remove extra version of a function in dnnl (#10599) 2022-02-18 23:29:54 -08:00
Justin D. Harris
742694f679
[python] [orttraining] Add utility to export a graph to compute gradients (#8125) 2022-02-18 14:00:49 -08:00
Xavier Dupré
6f0640a57f
Optimize ReduceSum, ReduceMean, ReduceMin, ReduceMax (#10280)
* Optimize ReduceSum, ReduceMean, ReduceMin, ReduceMax
* improve reducemax, reducemin
* faster, smaller
* replace std::vector by gsl::span for shapes
* fix merging issues
2022-02-18 12:51:01 +01:00
Scott McKay
df841ee87d
Fix incorrect type constraint registration for operator kernels. (#10489)
* Fix incorrect type constraint registration for RoiAlign. This led to the input type not actually being checked when matching a kernel as the invalid constraint name is treated as a missing optional input.
  * fix missing dependency for the unit test exe. Whilst it doesn't link against the CUDA providers lib, without the dependency VS doesn't know it needs to rebuild the library if there are changes.
* Add check for invalid type constraints.
* Fix invalid registrations for other kernels.
* Add hash replacement logic to provide backwards compatibility in ORT format models when the registration is fixed.
* Add tests
2022-02-18 16:55:32 +10:00
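The "check for invalid type constraints" from commit df841ee87d can be pictured as a set comparison between the names a kernel registers and the names the operator schema declares. The sketch below is a hypothetical illustration; the function name and the constraint names in the usage note are assumptions, not ORT's registration API.

```python
def find_invalid_type_constraints(schema_constraints, kernel_constraints):
    """Return the constraint names a kernel registers that the schema lacks.

    schema_constraints: set of names declared by the operator schema.
    kernel_constraints: dict mapping registered name -> allowed types.
    """
    return sorted(
        name for name in kernel_constraints if name not in schema_constraints
    )
```

For example, with a schema declaring `{"T1", "T2"}` and a kernel registering `{"T": ["float"], "T2": ["int64"]}`, the function returns `["T"]`. Without such a check, the misnamed constraint would be silently treated as a missing optional input.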
Yulong Wang
893ee65e54
[js/web] fix lint error when run without ort-web TS types (#10429)
* [js/web] fix lint error when run without ort-web TS types

* update CI to run linter before 'npm ci' in /js/web
2022-02-17 22:34:38 -08:00
Dwayne Robinson
6db6ee5710 Merged PR 6973543: ORT DML EP Opset 13 more complete
Extend opset 13 support for:
- Split-13
- Squeeze-13
- Unsqueeze-13
- Reshape-13
- QuantizeLinear-13
- DequantizeLinear-13
- ReduceSum-13
- Resize-13

Also:
- Rename the file where all the opset versions are stored from "OperatorRegistration.h" to "OperatorVersions.h", which will make it much less confusing in the future, given there's another file called "OperatorRegistration.h" that corresponds to "OperatorRegistration.cpp".
- Detemplatize many of the OperatorHelper.h constructors, which duplicate multiple instantiations due to the operator helper classes not sharing a common base class, by wrapping them with an adapter. Ideally there would be a common COM base interface that both IMLOperatorKernelCreationContext and IMLOperatorShapeInferenceContext implementation objects would implement, which a wrapper in MLOperatorAuthorHelper.h could QI for.
- Fix style formatting issues in OperatorHelper.h (sorry for the noise).

```
Summary: Total=4679, Passed=4355, Failed=0, Blocked=0, Not Run=0, Skipped=324
```

Corresponding WindowsAI PR:
https://microsoft.visualstudio.com/WindowsAI/_git/WindowsAI/pullrequest/6973645

Related work items: #36672908, #36672926
2022-02-18 01:41:07 +00:00
Sunghoon
1af4c170ef
[js/react_native] publish onnxruntime-common npm package as web and node do (#10566)
* apply the same policy for onnxruntime-common as web and node

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* remove old comment
2022-02-17 15:25:27 -08:00
RandySheriffH
e056fbaa51
Add restrictions for hybrid cpus for thread pool task distribution (#10393)
* add restrictions for hybrid cpus

* add unit test to mock hybrid cpu

* attach hybrid flag

* add mocking interface to CpuInfo

* make is_hybrid

* make mock function const

* add force_hybrid for thread pool

* remove header
2022-02-17 14:34:09 -08:00
Jingqiao Fu
2fa333443a
Add telemetry for device kind (#10431)
Add telemetry for device kind
2022-02-17 13:56:22 -08:00
Scott McKay
2ca9566994
Add range of helpers for making usage of ORT Mobile easier. (#10458)
* Add range of helpers for making usage of ORT Mobile easier.
2022-02-18 07:35:25 +10:00
Chi Lo
fad590a059
Enhance TRT EP unit tests (#10493)
* Re-write tensorrt ep cache test

* refactor the code

* refactor

* move stdc++fs flag to CMakeLists.txt
2022-02-17 10:30:03 -08:00
Xavier Dupré
edbc844032
Fix misspelling in python documentation (#10588) 2022-02-17 18:10:21 +01:00
zhangyaobit
fd16085cea
Zhanyao/attention (#10545)
* Enable Attention op for ROCM EP.

As a note, potential hipify improvements: (1) handle math
constants (attention_softmax.h), (2) correctly generate transpose
options for the GEMM helpers, consider counterpart/dummy API for
CublasMathModeSetter (attention_impl.cu, attention_impl.cu). After
these improvements, we don't need to manually keep copies of the
above mentioned files any more.

* Clean up debugging code.
2022-02-17 09:02:45 -08:00
leqiao-1
8d06e5a9df
Add openvino base image option (#10581)
* add selectable python package build pipeline

* update tensorrt version

* update tensorrt version

* Update Dockerfile.ubuntu_openvino

* Update install_ubuntu.sh

* add parameters for openvino base image

* fix syntax error
2022-02-17 17:10:01 +08:00
Pallavi Deshmukh
ccd7a2d840 Fix build failure when using clang compiler 2022-02-16 17:52:45 -08:00
Changming Sun
09ac7595fc
update (#10573)
Move FuncMgr up the class so it is destroyed later
2022-02-16 17:43:29 -08:00
ytaous
4f76c38686
Revert "Reduce max gradient (#9859)" (#10574)
This reverts commit 7443edb0bf.
2022-02-16 16:02:30 -08:00
ytaous
e71d77e974
[ROCm] Fixing build for CIs (#10558)
* fix build

* fix build

* fix win build

* apply same fix for rocm

* fix CI

* update comments

* trigger build

Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-02-16 15:36:46 -08:00
Lingfeng Wu
3be3db5180
Use IntPtr instead of int conversion for pointer in Memory.Pin() (#10485)
* Use IntPtr instead of int conversion for pointer in Memory.Pin()
Co-authored-by: Lingfeng Wu <lingfw@microsoft.com>
2022-02-16 14:49:56 -08:00
Ye Wang
d198fbc4d5
Add a script for randomizing onnx weights (#10551)
* Add a script for randomizing onnx weights

Requested by a customer: when sharing an ONNX model for third-party debugging, a tool is needed to randomize all the weights in the model.

* Update onnx_randomizer.py

more comments
2022-02-16 14:40:03 -08:00
Anh Nguyen
7443edb0bf
Reduce max gradient (#9859)
* ReduceMax gradient builder

* Update gradient_builder.cc

* Add CI fix

* Remove whitespace

* Update gradient_builder.cc

* Update gradient_ops_test.cc

* Fix Windows CI tests

Co-authored-by: root <tuananhnguyen7198@gmail.com>
2022-02-15 22:38:19 -08:00
Ashwini Khade
f436d3437e
Add layout transformer for NNAPI (#10371)
* Add layout transformer for NNAPI

* plus merge fixes

* plus some more merge fixes

* test fixes

* comments + cleanup

* plus updates

* post merge changes

* enable layout transformer in extended minimal build

* plus more comments

* more tests + fix CI

* plus updates per review

* more updates per review

* fix file name

* fix qdq tests

* plus more updates

* plus updates

* typo fix

* fix qdq selection in 2nd optimization pass

* fix typo

* fix a test

* update dependency structure for layout transformer

* plus updates

* more updates

* plus change

* more updates to fix linker error in minimal build

* remove unnecessary headers
2022-02-15 20:25:29 -08:00
Vincent Wang
ceb1e2b1a6
[ROCm] Bugfix of BFloat16-float conversion and Add FastGelu Kernel for AMD (#10557)
* bf16 bugfix on amd

* enable fastgelu ut on amd
2022-02-16 11:11:08 +08:00
leqiao-1
f22cd3af5d
Leqiao/add selectable pipeline (#10560)
* add selectable python package build pipeline

* update tensorrt version

* update tensorrt version
2022-02-16 09:07:29 +08:00
Yufeng Li
05d6805830
clean up quantization of QAT model (#10549) 2022-02-15 15:37:21 -08:00
Rachel Guo
8e47bb9a4a
[NNAPI QDQ] Add QDQReshape op support (#10533)
* wip

* wip

* save

* address partial pr comments

* update

* minor change

* move isquantizedop to baseopbuilderorchecker

* update

* format

* update

* update

* address pr comments

* update

Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
2022-02-15 12:46:05 -08:00
Anh Nguyen
0c3e88944d
Fix create ort value hardcoded memory info to CPU (#10510)
* Fix create ort value hardcoded memory info to CPU

* Remove unneeded check

* Remove unneeded header

* Remove unneeded header

* Update ort_ops.cpp

* Update ort_ops.cpp

* Update ort_ops.cpp

* Update ort_ops.cpp

Co-authored-by: root <root@QTM-ANHNGUYEN-1.northamerica.corp.microsoft.com>
2022-02-15 10:40:44 -08:00
Valery Chernov
1cdc23aba4
[TVM EP] Rename Standalone TVM (STVM) Execution Provider to TVM EP (#10260)
* update java API for STVM EP. Issue is from PR#10019

* use_stvm -> use_tvm

* rename stvm worktree

* STVMAllocator -> TVMAllocator

* StvmExecutionProviderInfo -> TvmExecutionProviderInfo

* stvm -> tvm for cpu_targets. resolve onnxruntime::tvm and origin tvm namespaces conflict

* STVMRunner -> TVMRunner

* StvmExecutionProvider -> TvmExecutionProvider

* tvm::env_vars

* StvmProviderFactory -> TvmProviderFactory

* rename factory funcs

* StvmCPUDataTransfer -> TvmCPUDataTransfer

* small clean

* STVMFuncState -> TVMFuncState

* USE_TVM -> NUPHAR_USE_TVM

* USE_STVM -> USE_TVM

* python API: providers.stvm -> providers.tvm. clean TVM_EP.md

* clean build scripts #1

* clean build scripts, java frontend and others #2

* once more clean #3

* fix build of nuphar tvm test

* final transfer stvm namespace to onnxruntime::tvm

* rename stvm->tvm

* NUPHAR_USE_TVM -> USE_NUPHAR_TVM

* small fixes for correct CI tests

* clean after rebase. Last renaming stvm to tvm, separate TVM and Nuphar in cmake and build files

* update CUDA support for TVM EP

* roll back CudaNN home check

* ERROR for not positive input shape dimension instead of WARNING

* update documentation for CUDA

* small corrections after review

* update GPU description

* update GPU description

* misprints were fixed

* cleaned up error msgs

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>
Co-authored-by: Thierry Moreau <tmoreau@octoml.ai>
2022-02-15 10:21:02 +01:00
ytaous
d3f7459263
fix CI build (#10553)
Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-02-14 19:52:21 -08:00
Chen Fu
58f80c16ff
Create branch according to cpu core uarch (#10521)
This is a preparation change for a bigger goal.

On ARM64 CPUs with big.LITTLE, different cores always share the same architecture but differ in micro-architecture. Specifically, the little cores often have narrow memory buses that make 128b loads very slow, whereas always using 64b loads in our kernels would make the code run slower on big cores. As a result, we need to run different code on different cores to achieve better performance.

This change constructs a manifold that pivots based on the micro-architecture of the current core, so that we can develop and call different kernels accordingly.

Co-authored-by: Chen Fu <fuchen@microsoft.com>
2022-02-14 15:16:20 -08:00
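The per-core branching that commit 58f80c16ff prepares for can be sketched as a dispatch table keyed by micro-architecture. Everything here is hypothetical: the `CoreUarch` enum, the two stand-in kernels, and the choice to fall back to the wide-load kernel for unknown cores are assumptions made for illustration, not the ORT implementation.

```python
from enum import Enum


class CoreUarch(Enum):
    """Hypothetical micro-architecture classes for the current core."""
    BIG = "big"
    LITTLE = "little"
    UNKNOWN = "unknown"


def kernel_wide_loads(values):
    """Stand-in for a kernel tuned for cores with fast 128-bit loads."""
    return ("wide", sum(values))


def kernel_narrow_loads(values):
    """Stand-in for a kernel that sticks to 64-bit loads."""
    return ("narrow", sum(values))


# One entry per micro-architecture that has a specialised kernel.
KERNELS = {
    CoreUarch.BIG: kernel_wide_loads,
    CoreUarch.LITTLE: kernel_narrow_loads,
}


def run_kernel(values, current_uarch):
    """Pick the kernel for the core we are running on, then call it.

    Assumption: unknown cores fall back to the wide-load kernel.
    """
    kernel = KERNELS.get(current_uarch, kernel_wide_loads)
    return kernel(values)
```

Both kernels compute the same result; only the code path differs, which is what lets the choice be made per core.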
Edward Chen
3199074ac7
Update QDQ propagation transformer to insert QDQ nodes (#10487)
Update QDQ propagation transformer to insert new QDQ nodes instead of moving the existing one. This creates a more consistent `DQ -> op -> Q` pattern for other components to recognize.
Upgrade this transformer to a basic level optimization as it yields a valid ONNX graph.
2022-02-14 14:20:03 -08:00
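The "insert instead of move" behaviour from commit 3199074ac7 can be illustrated on a linear chain of operator names. This is a simplified sketch under assumptions: the set of ops that quantization is propagated through is hypothetical, the graph is a flat list, and scale and zero-point handling is omitted entirely.

```python
# Assumption: ops that only rearrange or select values, so a Q -> DQ pair
# can safely be inserted after them.
PROPAGATE_THROUGH = {"Reshape", "Transpose", "MaxPool"}


def propagate_dq_forward(chain):
    """Insert a new Q -> DQ pair after each op that directly follows a DQ.

    `chain` is a list of op type names in execution order. The original
    DQ stays where it is, so each affected op ends up as DQ -> op -> Q.
    """
    result = []
    after_dq = False  # True while the current value comes straight from a DQ
    for index, op in enumerate(chain):
        result.append(op)
        next_op = chain[index + 1] if index + 1 < len(chain) else None
        if op == "DequantizeLinear":
            after_dq = True
        elif after_dq and op in PROPAGATE_THROUGH:
            # Skip the insert if the op is already followed by a Q node.
            if next_op != "QuantizeLinear":
                result.extend(["QuantizeLinear", "DequantizeLinear"])
        else:
            after_dq = False
    return result
```

For instance, `["DequantizeLinear", "Reshape", "Conv"]` becomes `["DequantizeLinear", "Reshape", "QuantizeLinear", "DequantizeLinear", "Conv"]`, so `Reshape` sits in a `DQ -> op -> Q` pattern and `Conv` still sees a dequantized input.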
Baiju Meswani
7691e7ed12
Introduce load balancing dataset samplers (#10163) 2022-02-14 13:46:14 -08:00
Changming Sun
270dec7327
Return a Status instead of throwing an exception in GetAttrs (#10534) 2022-02-14 13:24:35 -08:00
Yi-Hong Lyu
3f37609994
Remove unneeded code in UpsampleBilinear (#10544) 2022-02-14 12:32:53 -08:00
dependabot[bot]
bfb20b315d Bump karma from 6.3.2 to 6.3.14 in /js/web
Bumps [karma](https://github.com/karma-runner/karma) from 6.3.2 to 6.3.14.
- [Release notes](https://github.com/karma-runner/karma/releases)
- [Changelog](https://github.com/karma-runner/karma/blob/master/CHANGELOG.md)
- [Commits](https://github.com/karma-runner/karma/compare/v6.3.2...v6.3.14)

---
updated-dependencies:
- dependency-name: karma
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-02-11 12:17:11 -08:00