Commit graph

6327 commits

Author SHA1 Message Date
Ye Wang
d198fbc4d5
Add a script for randomizing onnx weights (#10551)
* Add a script for randomizing onnx weights

Requested by a customer: when sharing an ONNX model for third-party debugging, a tool is needed to randomize all the weights in the model.

* Update onnx_randomizer.py

more comments
2022-02-16 14:40:03 -08:00
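The idea behind the weight-randomizing commit above can be sketched as follows. This is a minimal illustration, not the contents of `onnx_randomizer.py`: it avoids the `onnx` package entirely and uses a hypothetical stand-in for an initializer, a `(shape, flat values)` tuple. The function name `randomize_weights`, the seed parameter, and the uniform range are all assumptions.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical stand-in for an ONNX initializer: (shape, flat list of values).
Tensor = Tuple[Tuple[int, ...], List[float]]


def randomize_weights(initializers: Dict[str, Tensor], seed: int = 0) -> Dict[str, Tensor]:
    """Return a copy where every tensor keeps its name and shape but gets random values."""
    rng = random.Random(seed)
    randomized = {}
    for name, (shape, values) in initializers.items():
        # Draw the same number of elements so the model stays structurally valid.
        randomized[name] = (shape, [rng.uniform(-1.0, 1.0) for _ in values])
    return randomized


# Example: two small tensors standing in for a model's weights.
weights = {"fc.weight": ((2, 2), [0.5, -0.25, 1.5, 2.0]), "fc.bias": ((2,), [0.1, 0.2])}
scrambled = randomize_weights(weights, seed=42)
```

Names and shapes survive, so a third party can still load and step through the graph, while the original values are no longer present.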
Anh Nguyen
7443edb0bf
Reduce max gradient (#9859)
* ReduceMax gradient builder

* Update gradient_builder.cc

* Add CI fix

* Remove whitespace

* Update gradient_builder.cc

* Update gradient_ops_test.cc

* Fix Windows CI tests

Co-authored-by: root <tuananhnguyen7198@gmail.com>
2022-02-15 22:38:19 -08:00
Ashwini Khade
f436d3437e
Add layout transformer for NNAPI (#10371)
* Add layout transformer for NNAPI

* plus merge fixes

* plus some more merge fixes

* test fixes

* comments + cleanup

* plus updates

* post merge changes

* enable layout transformer in extended minimal build

* plus more comments

* more tests + fix CI

* plus updates per review

* more updates per review

* fix file name

* fix qdq tests

* plus more updates

* plus updates

* typo fix

* fix qdq selection in 2nd optimization pass

* fix typo

* fix a test

* update dependency structure for layout transformer

* plus updates

* more updates

* plus change

* more updates to fix linker error in minimal build

* remove unnecessary headers
2022-02-15 20:25:29 -08:00
Vincent Wang
ceb1e2b1a6
[ROCm] Bugfix of BFloat16-float conversion and Add FastGelu Kernel for AMD (#10557)
* bf16 bugfix on amd

* enable fastgelu ut on amd
2022-02-16 11:11:08 +08:00
leqiao-1
f22cd3af5d
Leqiao/add selectable pipeline (#10560)
* add selectable python package build pipeline

* update tensorrt version

* update tensorrt version
2022-02-16 09:07:29 +08:00
Yufeng Li
05d6805830
clean up quantization of QAT model (#10549) 2022-02-15 15:37:21 -08:00
Rachel Guo
8e47bb9a4a
[NNAPI QDQ] Add QDQReshape op support (#10533)
* wip

* wip

* save

* address partial pr comments

* update

* minor change

* move isquantizedop to baseopbuilderorchecker

* update

* format

* update

* update

* address pr comments

* update

Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
2022-02-15 12:46:05 -08:00
Anh Nguyen
0c3e88944d
Fix create ort value hardcoded memory info to CPU (#10510)
* Fix create ort value hardcoded memory info to CPU

* Remove unneeded check

* Remove unneeded header

* Remove unneeded header

* Update ort_ops.cpp

* Update ort_ops.cpp

* Update ort_ops.cpp

* Update ort_ops.cpp

Co-authored-by: root <root@QTM-ANHNGUYEN-1.northamerica.corp.microsoft.com>
2022-02-15 10:40:44 -08:00
Valery Chernov
1cdc23aba4
[TVM EP] Rename Standalone TVM (STVM) Execution Provider to TVM EP (#10260)
* update java API for STVM EP. Issue is from PR#10019

* use_stvm -> use_tvm

* rename stvm worktree

* STVMAllocator -> TVMAllocator

* StvmExecutionProviderInfo -> TvmExecutionProviderInfo

* stvm -> tvm for cpu_targets. resolve onnxruntime::tvm and origin tvm namespaces conflict

* STVMRunner -> TVMRunner

* StvmExecutionProvider -> TvmExecutionProvider

* tvm::env_vars

* StvmProviderFactory -> TvmProviderFactory

* rename factory funcs

* StvmCPUDataTransfer -> TvmCPUDataTransfer

* small clean

* STVMFuncState -> TVMFuncState

* USE_TVM -> NUPHAR_USE_TVM

* USE_STVM -> USE_TVM

* python API: providers.stvm -> providers.tvm. clean TVM_EP.md

* clean build scripts #1

* clean build scripts, java frontend and others #2

* once more clean #3

* fix build of nuphar tvm test

* final transfer stvm namespace to onnxruntime::tvm

* rename stvm->tvm

* NUPHAR_USE_TVM -> USE_NUPHAR_TVM

* small fixes for correct CI tests

* clean after rebase. Last renaming stvm to tvm, separate TVM and Nuphar in cmake and build files

* update CUDA support for TVM EP

* roll back CudaNN home check

* ERROR for non-positive input shape dimension instead of WARNING

* update documentation for CUDA

* small corrections after review

* update GPU description

* update GPU description

* misprints were fixed

* cleaned up error msgs

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>
Co-authored-by: Thierry Moreau <tmoreau@octoml.ai>
2022-02-15 10:21:02 +01:00
ytaous
d3f7459263
fix CI build (#10553)
Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-02-14 19:52:21 -08:00
Chen Fu
58f80c16ff
Create branch according to cpu core uarch (#10521)
This is a preparation change for a bigger goal.

On ARM64 CPUs with big.LITTLE, the cores share the same architecture but differ in micro-architecture. In particular, the little cores often have narrow memory buses that make 128-bit loads very slow, whereas always using 64-bit loads in our kernels would slow the code down on the big cores. As a result, we need to run different code on different cores to achieve better performance.

This change constructs a manifold that pivots on the micro-architecture of the current core, so that we can develop and call different kernels accordingly.

Co-authored-by: Chen Fu <fuchen@microsoft.com>
2022-02-14 15:16:20 -08:00
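The per-core branching described in the commit above can be pictured as a dispatch table keyed by micro-architecture. The sketch below is only an illustration of that pattern in Python, not the actual change: the labels `big` and `little`, the kernel names, and the fallback choice are all hypothetical, and the "kernels" merely copy data in chunks that mimic 128-bit and 64-bit loads of 32-bit elements.

```python
from typing import Callable, Dict, List

# Hypothetical micro-architecture labels; the commit does not name specific cores.
BIG, LITTLE = "big", "little"


def copy_wide(src: List[int]) -> List[int]:
    """Stand-in for a kernel built around 128-bit loads (4 x 32-bit lanes per step)."""
    out: List[int] = []
    for i in range(0, len(src), 4):
        out.extend(src[i:i + 4])
    return out


def copy_narrow(src: List[int]) -> List[int]:
    """Stand-in for a kernel built around 64-bit loads (2 x 32-bit lanes per step)."""
    out: List[int] = []
    for i in range(0, len(src), 2):
        out.extend(src[i:i + 2])
    return out


# One kernel variant per micro-architecture.
KERNELS: Dict[str, Callable[[List[int]], List[int]]] = {BIG: copy_wide, LITTLE: copy_narrow}


def run_kernel(core_uarch: str, src: List[int]) -> List[int]:
    # Assumption: unknown cores fall back to the wide-load variant in this sketch.
    return KERNELS.get(core_uarch, copy_wide)(src)
```

Both variants produce the same result; only the access pattern differs, which is the property that lets the caller pick a kernel per core without changing behavior.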
Edward Chen
3199074ac7
Update QDQ propagation transformer to insert QDQ nodes (#10487)
Update QDQ propagation transformer to insert new QDQ nodes instead of moving the existing one. This creates a more consistent `DQ -> op -> Q` pattern for other components to recognize.
Upgrade this transformer to a basic level optimization as it yields a valid ONNX graph.
2022-02-14 14:20:03 -08:00
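To make the `DQ -> op -> Q` pattern from the commit above concrete, here is a small sketch on a linear chain of op names. It is not the transformer's implementation: the graph is reduced to a Python list, the set of ops treated as pass-through is an illustrative assumption, and `insert_qdq` is a hypothetical name.

```python
from typing import List

# Assumption: ops that pass quantized values through unchanged; the set is illustrative.
PASS_THROUGH_OPS = {"Reshape", "Transpose", "MaxPool"}


def insert_qdq(chain: List[str]) -> List[str]:
    """Insert a new Q/DQ pair after each pass-through op that is fed by a DQ."""
    result: List[str] = []
    for op in chain:
        result.append(op)
        # result[-2] is the node feeding `op`; only act when it is a DQ.
        if op in PASS_THROUGH_OPS and len(result) >= 2 and result[-2] == "DQ":
            result.extend(["Q", "DQ"])
    return result


before = ["DQ", "Transpose", "Reshape", "Conv"]
after = insert_qdq(before)
# after == ["DQ", "Transpose", "Q", "DQ", "Reshape", "Q", "DQ", "Conv"]
```

In the result, `Transpose` and `Reshape` each sit between their own `DQ` and `Q`, rather than a single `DQ` being moved past them.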
Baiju Meswani
7691e7ed12
Introduce load balancing dataset samplers (#10163) 2022-02-14 13:46:14 -08:00
Changming Sun
270dec7327
Return a Status instead of throwing an exception in GetAttrs (#10534) 2022-02-14 13:24:35 -08:00
Yi-Hong Lyu
3f37609994
Remove unneeded code in UpsampleBilinear (#10544) 2022-02-14 12:32:53 -08:00
dependabot[bot]
bfb20b315d
Bump karma from 6.3.2 to 6.3.14 in /js/web
Bumps [karma](https://github.com/karma-runner/karma) from 6.3.2 to 6.3.14.
- [Release notes](https://github.com/karma-runner/karma/releases)
- [Changelog](https://github.com/karma-runner/karma/blob/master/CHANGELOG.md)
- [Commits](https://github.com/karma-runner/karma/compare/v6.3.2...v6.3.14)

---
updated-dependencies:
- dependency-name: karma
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-02-11 12:17:11 -08:00
Rachel Guo
5cfde7af29
[NNAPI QDQ] Add QDQTranspose op support (#10495)
* Squashed commit of the following:

commit 12380491a9
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Feb 7 12:59:04 2022 -0800

    Add qdq mul support

commit 9cadda7f2c
Merge: 7a32847761 0f5d0a091a
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Feb 7 11:24:47 2022 -0800

    Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul

commit 7a32847761
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Feb 7 00:41:30 2022 -0800

    move test case to util

commit c1a8f0d81e
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Fri Feb 4 13:04:26 2022 -0800

    update input/output check

commit a6f0a0d504
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Thu Feb 3 18:37:21 2022 -0800

    update quantized io check functions

commit 87f4d1dcfe
Merge: 7849f07109 97b8f6f394
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Wed Feb 2 17:22:58 2022 -0800

    Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul

commit 7849f07109
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Wed Feb 2 17:22:55 2022 -0800

    minor update

commit 7196cdf419
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Wed Feb 2 10:50:10 2022 -0800

    init change

commit 84c00772a1
Merge: a8c7dce22f 7318361645
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Tue Feb 1 18:21:17 2022 -0800

    Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul

commit a8c7dce22f
Merge: 55e536c182 ef7b4dc05c
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Tue Feb 1 13:51:04 2022 -0800

    Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul

commit 55e536c182
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Tue Feb 1 11:44:34 2022 -0800

    address cr comments

commit d460f5b776
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Tue Feb 1 00:33:54 2022 -0800

    fix android UT failure

commit 52146cf06f
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Jan 31 16:01:13 2022 -0800

    fix build break

commit ec6d07df8b
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Jan 31 15:41:52 2022 -0800

    minor update to UT

commit 8ec8490b4f
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Jan 31 15:01:30 2022 -0800

    Add NNAPI support of QDQ Resize

* Update qdq add/mul test case, fix build break

* Address CR comments

* Add QLinearMul support

* remove unused params

* Address CR comments

* wip

* save

* minor fix

* fix

* fix build

* address pr comments

* fix wrong ut tests

* address comments

* minor update

* fix addinitializersskip

Co-authored-by: Guoyu Wang <wanggy@outlook.com>
Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
2022-02-11 10:42:08 -08:00
Scott McKay
318d31ea12
Fix C# pipeline build error (#10524) 2022-02-11 08:56:40 -08:00
ytaous
4e2a974090
[ROCm] UTs and code clean up (#10511)
* Fix UT

* UT

* UTs

* enable ROCm UT

* fix build attempt

* minor

* fix UT

* fix UT

* fix UTs

Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: root <root@GCRAMDRR1-MI100-087.redmond.corp.microsoft.com>
2022-02-11 08:23:25 -08:00
Weixing Zhang
2002a96594
The memcpy transformer is needed for ROCm EP and MIGraphX EP when falling back to CPU (#10522)
Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
2022-02-11 00:53:24 -08:00
Edward Chen
f92e47e95b
Remove onnxruntime_util dependency on onnxruntime_framework (#10512)
There's a circular dependency between onnxruntime_util and onnxruntime_framework.
Remove onnxruntime_util's dependency on onnxruntime_framework.
2022-02-10 19:17:08 -08:00
satyajandhyala
a27aabad34
Fix formatting. (#10520)
Formatting related changes
2022-02-10 17:43:25 -08:00
Changming Sun
3185680b6c
Add NHWC CONV contrib op (#10506) 2022-02-10 15:47:49 -08:00
satyajandhyala
eba730500f
Remove file-scope non-constant static variables to support multiple inference sessions (#10481)
* Changed file-scope static variables to automatic variables or function-scope static const.

* Reduce load time overhead by using constexpr.

* Use node indices instead of node names to track inserted, deleted and changed nodes.

Co-authored-by: Satya Jandhyala <sajandhy@microsoft.com>
2022-02-10 13:31:12 -08:00
Ye Wang
4d6d4dfb9d
Add TRT ep perf benchmark (#10470) 2022-02-10 08:51:01 -08:00
Sunghoon
dd33ce0fdc
[js/react_native] Create ONNX Runtime React Native pipeline (#10474)
* Pipeline for ONNX Runtime react native

* Fix a test failure

* test with custom built binaries

* add onnxruntime-common package back

* don't bob build when bootstrap

* revise Android test

* rename example to e2e

* remove onnxruntime packages from package.json

* remove release-it package

* upgrade gradle version to the same as CI

* add a pipeline for react native

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* android and ios mobile build for react native e2e

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* use android aar package template

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* use android aar package template

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* publish ios test results

* add e2e tests and publish a npm package

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* remove aar from npm package

* wait for view displayed

* change a waiting logic

* increase wait time for app launching

* give more time to launch an app

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* disable metro server on testing

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* test ios simulator launching

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* fix iOS e2e test

* use a publishing version of npm packages

* make pretty

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* make only one onnxruntime-common package after packaging

* make a powershell script of packaging universal

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Add a warning for file changes during a test

* clean up

* fix lint errors

* fix js npm packaging

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* resolve comments

* fix a typo
2022-02-09 21:37:05 -08:00
Changming Sun
6f3ade55ec
Move QAttention/QEmbedLayerNormalization op defs to quantization_defs.cc (#10507) 2022-02-09 14:23:17 -08:00
Hubert Lu
c9fbd0b15a
Optimize cuComputePartGradGammaBeta kernel for MI100 (#10475)
* Optimize cuComputePartGradGammaBeta kernel for MI100

Co-authored-by: root <root@gb-sjc2-10.local.lan>
Co-authored-by: Jeff Daily <jeff.daily@amd.com>
2022-02-09 12:51:06 -08:00
Changming Sun
7a2bf3c24c
Reorganize contrib op schemas (#10494) 2022-02-09 09:31:58 -08:00
ytaous
399ffc9700
Fix Windows GPU CI (#10499)
* fix build

* fix win build

Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-02-08 22:06:23 -08:00
Guoyu Wang
e4dc4e4d3c
[NNAPI QDQ] Add QDQ Add/Mul, update to NNAPI QDQ handling, update some test settings (#10483)
* Squashed commit of the following:

commit 12380491a9
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Feb 7 12:59:04 2022 -0800

    Add qdq mul support

commit 9cadda7f2c
Merge: 7a32847761 0f5d0a091a
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Feb 7 11:24:47 2022 -0800

    Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul

commit 7a32847761
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Feb 7 00:41:30 2022 -0800

    move test case to util

commit c1a8f0d81e
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Fri Feb 4 13:04:26 2022 -0800

    update input/output check

commit a6f0a0d504
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Thu Feb 3 18:37:21 2022 -0800

    update quantized io check functions

commit 87f4d1dcfe
Merge: 7849f07109 97b8f6f394
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Wed Feb 2 17:22:58 2022 -0800

    Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul

commit 7849f07109
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Wed Feb 2 17:22:55 2022 -0800

    minor update

commit 7196cdf419
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Wed Feb 2 10:50:10 2022 -0800

    init change

commit 84c00772a1
Merge: a8c7dce22f 7318361645
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Tue Feb 1 18:21:17 2022 -0800

    Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul

commit a8c7dce22f
Merge: 55e536c182 ef7b4dc05c
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Tue Feb 1 13:51:04 2022 -0800

    Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul

commit 55e536c182
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Tue Feb 1 11:44:34 2022 -0800

    address cr comments

commit d460f5b776
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Tue Feb 1 00:33:54 2022 -0800

    fix android UT failure

commit 52146cf06f
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Jan 31 16:01:13 2022 -0800

    fix build break

commit ec6d07df8b
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Jan 31 15:41:52 2022 -0800

    minor update to UT

commit 8ec8490b4f
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Jan 31 15:01:30 2022 -0800

    Add NNAPI support of QDQ Resize

* Update qdq add/mul test case, fix build break

* Address CR comments

* Add QLinearMul support

* remove unused params

* Address CR comments
2022-02-08 20:44:15 -08:00
Vincent Wang
655f490c95
Remove BFloat16 Specialized Code for ReduceSum (#10476) 2022-02-09 07:39:57 +08:00
ashbhandare
7e5d68eea6
gradient and test (#10455)
Co-authored-by: Aishwarya Bhandare <aibhanda@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-02-08 10:18:22 -08:00
ytaous
435e14d60a
[ROCm] BFloat16 support (#10465)
* bf16 support

* minor clean up

* UTs

* fix build

* UTs

* UTs

* merge commit 6b5504c

* minor

* ROCm code cleanup

* fix build

* fix build

* minor

Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: root <root@GCRAMDRR1-MI100-087.redmond.corp.microsoft.com>
2022-02-07 22:55:15 -08:00
Yufeng Li
c696da36c7
fix unit test of quant gemm (#10469) 2022-02-07 09:14:37 -08:00
Chi Lo
0f5d0a091a
Make user capable of adding new field in OrtTensorRTProviderOptionsV2 as new provider option (#10450)
* modify code for add additional field in OrtTensorRTProviderOptionsV2

* add include file

* fix typo

* fix bug

* add comment

* fix code

* revert change
2022-02-05 11:15:12 -08:00
Rachel Guo
927f1f18c9
[NNAPI QDQ] Add QDQ AveragePool op support (#10464)
* wip

* save

* address pr comments

* update

* revert minor changes

Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
2022-02-04 17:04:48 -08:00
wraveane
d0ab881d07
Contrib ops for TRT plugins: EfficientNMS and Pyramid ROI Align (#9486)
* Contrib ops for TRT plugins: EfficientNMS and Pyramid ROI Align

* Contrib ops for TRT plugins: Multilevel Crop and Resize
2022-02-04 12:10:04 -08:00
Ye Wang
0d09dd5d20
Support fusion for TNLR based model (#10432)
* support tnlr based offensive V4 model

* Update onnx_model_tnlr.py

Co-authored-by: Ubuntu <wy@linux-v100.aidmrjtolptuzevavgwhrapqcd.jx.internal.cloudapp.net>
2022-02-03 23:59:05 -08:00
Changming Sun
4f13c8ac39
Update orttraining-linux-ci-pipeline.yml (#10462) 2022-02-03 13:46:16 -08:00
Maxiwell S. Garcia
6bbf016dc4
cmake: disable 'attributes' error to fix the build with GCC < 9.x
This patch fixes the error "requested alignment X is larger than Y" in older GCC's

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89357
2022-02-03 13:38:19 -08:00
Ye Wang
bb09acffed
Transformer model CUDA EP align with CPU on corner case (#9889)
* align with cpu on no input data

* review comments and add tests

Co-authored-by: Ubuntu <wy@linux-v100.aidmrjtolptuzevavgwhrapqcd.jx.internal.cloudapp.net>
2022-02-03 12:58:49 -08:00
ytaous
63198a6566
[ROCm] BFloat16 support (#10447)
* bf16 support

* bf16 support

* UTs

* fix build

* fix UTs

Co-authored-by: root <root@GCRAMDRR1-MI100-087.redmond.corp.microsoft.com>
2022-02-03 11:31:14 -08:00
zhangyaobit
239c6ad3f0
Support specifying an execution provider in benchmark script (#10453)
* Support specifying execution providers.

* Change default provider setting to None.

* Add support for bert_perf_test script.

* Fall back to ROCM/CUDA EP for MIGraphX/Tensorrt EP.

* Assert fall back EPs are included.

* Add model class AutoModelForCausalLM and other minor updates.

Co-authored-by: Yao Zhang <zhanyao@microsoft.com>
2022-02-02 19:11:31 -08:00
Yi-Hong Lyu
a405658370
Fuse Clip->Q to Q (#10434)
* Fuse Clip->Q to Q

* Remove unused variable argmax_node

* Remove braces around scalar initializer

* Move GetClipConstantMinMax under ORT_MINIMAL_BUILD

* Consider epsilon so we can fuse more cases
2022-02-02 18:29:30 -08:00
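One way to read the Clip->Q fusion above: a `Clip` placed directly before a `QuantizeLinear` does nothing when its range already covers every value the quantized type can represent, because quantization saturates at those bounds anyway. The check below is a sketch under that reading, not the code from the commit; the function name, the `uint8` defaults, and the epsilon value are assumptions.

```python
def clip_is_redundant(clip_min: float, clip_max: float, scale: float, zero_point: int,
                      qmin: int = 0, qmax: int = 255, epsilon: float = 1e-4) -> bool:
    """Return True when Clip(clip_min, clip_max) before Q cannot change the Q output."""
    # Float values that the smallest and largest quantized integers map back to.
    lowest = (qmin - zero_point) * scale
    highest = (qmax - zero_point) * scale
    # Epsilon absorbs float rounding in scale, so near-exact matches still fuse.
    return clip_min <= lowest + epsilon and clip_max >= highest - epsilon


# A Clip(0, 6) in front of a uint8 Q whose range spans roughly [0, 6].
fusable = clip_is_redundant(0.0, 6.0, scale=6.0 / 255.0, zero_point=0)
# A tighter Clip(0, 3) still changes values, so it has to stay.
not_fusable = clip_is_redundant(0.0, 3.0, scale=6.0 / 255.0, zero_point=0)
```

The tolerance matters because `scale * 255` may land a hair above or below `6.0` in floating point; without it, a Clip that matches the quantization range in intent could fail an exact comparison.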
Rachel Guo
97b8f6f394
Add logic to NNAPI EP to exclude pre-processing involving dynamic shapes when partitioning (#10452)
* wip

* wip

* wip

* save

* address pr comments

* address pr comments

Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
2022-02-02 15:54:19 -08:00
Sunghoon
6076a262dc
upgrade react-native packages to latest (#10454) 2022-02-02 15:19:40 -08:00
Viswanath Boga
ad9d2e2e89
Prefix match in first iteration of beam search OP (#10231)
* Add BeamSearch op schema

* Add ONNX conversion for beam search

* remove attention_mask and change input order

* add option to run baseline

* add check data type NULL

* applies VerifyNodeAndOpMatch to subgraph

* update input_ids shape

* Add node name for Cast node

* expose API for topk

* parse parameters

* Add beam search scorer

* output results

* fix typo

* use c++ template and format python

* fix build pipeline errors

* symbolic shape infer of input onnx

* output scores

* add kernel def hash

* Handle vocab_mask; move CheckSubgraph

* undo insert_cast_transformer.cc and fusion_utils.py

* fix typo

* fix merge

* update doc

* add repetition penalty

* refactoring: add GptSubgraph class

* move BeamSearchState from .h to .cc file

* adjust logits processor order

* add batch generation example

* fix repetition penalty for dup words in sequence

* Add test

* Add no repeat ngram processor

* refactoring: move logits processor to classes

* fix build warning

* show latency

* use allocator in beam state

* use allocator in sequences

* fix build error

* move next_positions to beam state

* Changes for prefix matching

* removing debugs

* removing more debugs

* clean up

* clean up

* cpu doc updated

* Updated docs

* updated prefix_vocab_mask dimension in convert script

* changes to support bxs prefix_vocab_mask in beamsearchop kernel

* doc update

* OperatorKernels.md updated

* matching docs from artifacts

* minor change in logits processor

* Addressing comments

* Updated the prefix vocab mask usage properly

Co-authored-by: Tianlei Wu <tlwu@microsoft.com>
2022-02-03 00:14:39 +05:30
Yufeng Li
1aa0789691
add qdq support for QGemm (#10414)
* add qgemm in quantization tool

* add qdq support for QGemm

* fix build break

* fix OperatorKernels.md
2022-02-02 10:35:29 -08:00
Guoyu Wang
7318361645
[NNAPI QDQ] Add QDQ Resize support (#10442)
* Add NNAPI support of QDQ Resize

* minor update to UT

* fix build break

* fix android UT failure

* address cr comments
2022-02-01 18:14:58 -08:00