Commit graph

6319 commits

Author SHA1 Message Date
Valery Chernov
1cdc23aba4
[TVM EP] Rename Standalone TVM (STVM) Execution Provider to TVM EP (#10260)
* update java API for STVM EP. Issue is from PR#10019

* use_stvm -> use_tvm

* rename stvm worktree

* STVMAllocator -> TVMAllocator

* StvmExecutionProviderInfo -> TvmExecutionProviderInfo

* stvm -> tvm for cpu_targets. resolve onnxruntime::tvm and origin tvm namespaces conflict

* STVMRunner -> TVMRunner

* StvmExecutionProvider -> TvmExecutionProvider

* tvm::env_vars

* StvmProviderFactory -> TvmProviderFactory

* rename factory funcs

* StvmCPUDataTransfer -> TvmCPUDataTransfer

* small clean

* STVMFuncState -> TVMFuncState

* USE_TVM -> NUPHAR_USE_TVM

* USE_STVM -> USE_TVM

* python API: providers.stvm -> providers.tvm. clean TVM_EP.md

* clean build scripts #1

* clean build scripts, java frontend and others #2

* once more clean #3

* fix build of nuphar tvm test

* final transfer stvm namespace to onnxruntime::tvm

* rename stvm->tvm

* NUPHAR_USE_TVM -> USE_NUPHAR_TVM

* small fixes for correct CI tests

* clean after rebase. Last renaming stvm to tvm, separate TVM and Nuphar in cmake and build files

* update CUDA support for TVM EP

* roll back CudaNN home check

* ERROR for not positive input shape dimension instead of WARNING

* update documentation for CUDA

* small corrections after review

* update GPU description

* update GPU description

* misprints were fixed

* cleaned up error msgs

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>
Co-authored-by: Thierry Moreau <tmoreau@octoml.ai>
2022-02-15 10:21:02 +01:00
ytaous
d3f7459263
fix CI build (#10553)
Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-02-14 19:52:21 -08:00
Chen Fu
58f80c16ff
Create branch according to cpu core uarch (#10521)
This is a preparation change for a bigger goal.

On ARM64 CPUs with big.LITTLE, the cores share the same architecture but differ in micro-architecture. In particular, the little cores often have narrow memory buses that make 128-bit loads very slow, whereas always using 64-bit loads in our kernels would make the code run slower on the big cores. As a result, we need to run different code on different cores to achieve better performance.

This change constructs a manifold that pivots on the micro-architecture of the current core, so that we can develop and call different kernels accordingly.

Co-authored-by: Chen Fu <fuchen@microsoft.com>
2022-02-14 15:16:20 -08:00
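
The per-core dispatch described in the commit above (#10521) can be illustrated with a minimal Python sketch. This is not the code from the change: the labels `BIG_CORE` and `LITTLE_CORE`, the two stand-in kernels, and `select_kernel` are all hypothetical names chosen for illustration, and a real implementation would detect the micro-architecture of the core it is currently running on.

```python
from typing import Callable, Dict, List

# Hypothetical micro-architecture labels; the commit does not list the real ones.
BIG_CORE = "big"
LITTLE_CORE = "little"


def copy_with_wide_loads(data: List[int]) -> List[int]:
    """Stand-in for a kernel tuned for big cores (e.g. 128-bit loads)."""
    return list(data)


def copy_with_narrow_loads(data: List[int]) -> List[int]:
    """Stand-in for a kernel tuned for little cores (e.g. 64-bit loads)."""
    return [x for x in data]


# Dispatch table keyed by micro-architecture (assumed structure).
KERNELS: Dict[str, Callable[[List[int]], List[int]]] = {
    BIG_CORE: copy_with_wide_loads,
    LITTLE_CORE: copy_with_narrow_loads,
}


def select_kernel(uarch: str) -> Callable[[List[int]], List[int]]:
    """Return the kernel for the given micro-architecture.

    Unknown micro-architectures fall back to the wide-load kernel,
    which is an arbitrary default chosen for this sketch.
    """
    return KERNELS.get(uarch, copy_with_wide_loads)
```

Both stand-in kernels return the same result, so only the code path differs per core type, which is the property the commit message asks for.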
Edward Chen
3199074ac7
Update QDQ propagation transformer to insert QDQ nodes (#10487)
Update QDQ propagation transformer to insert new QDQ nodes instead of moving the existing one. This creates a more consistent `DQ -> op -> Q` pattern for other components to recognize.
Upgrade this transformer to a basic level optimization as it yields a valid ONNX graph.
2022-02-14 14:20:03 -08:00
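
As a rough illustration of the "insert instead of move" behaviour described in the commit above (#10487), the sketch below works on a toy linear chain of op-type strings rather than a real ONNX graph. The set `PASS_THROUGH_OPS` and the function `propagate_dq_forward` are assumptions made for this example and are not taken from the transformer's source.

```python
from typing import List

# Assumed set of ops that a DQ can be propagated across in this toy model.
PASS_THROUGH_OPS = {"Transpose", "Reshape", "MaxPool"}


def propagate_dq_forward(chain: List[str]) -> List[str]:
    """Insert a new Q -> DQ pair after each pass-through op fed by a DQ.

    The existing DQ stays where it is, so every pass-through op ends up
    in a `DQ -> op -> Q` pattern.
    """
    result: List[str] = []
    for i, op in enumerate(chain):
        result.append(op)
        fed_by_dq = len(result) >= 2 and result[-2] == "DQ"
        next_op = chain[i + 1] if i + 1 < len(chain) else None
        # Skip ops that are already followed by a Q node.
        if op in PASS_THROUGH_OPS and fed_by_dq and next_op != "Q":
            result.extend(["Q", "DQ"])
    return result
```

For example, in this toy model the chain `DQ -> Transpose -> Conv` becomes `DQ -> Transpose -> Q -> DQ -> Conv`, while a chain that already ends in `DQ -> Transpose -> Q` is left unchanged.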
Baiju Meswani
7691e7ed12
Introduce load balancing dataset samplers (#10163) 2022-02-14 13:46:14 -08:00
Changming Sun
270dec7327
Return a Status instead of throwing an exception in GetAttrs (#10534) 2022-02-14 13:24:35 -08:00
Yi-Hong Lyu
3f37609994
Remove unneeded code in UpsampleBilinear (#10544) 2022-02-14 12:32:53 -08:00
dependabot[bot]
bfb20b315d
Bump karma from 6.3.2 to 6.3.14 in /js/web
Bumps [karma](https://github.com/karma-runner/karma) from 6.3.2 to 6.3.14.
- [Release notes](https://github.com/karma-runner/karma/releases)
- [Changelog](https://github.com/karma-runner/karma/blob/master/CHANGELOG.md)
- [Commits](https://github.com/karma-runner/karma/compare/v6.3.2...v6.3.14)

---
updated-dependencies:
- dependency-name: karma
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-02-11 12:17:11 -08:00
Rachel Guo
5cfde7af29
[NNAPI QDQ] Add QDQTranspose op support (#10495)
* Squashed commit of the following:

commit 12380491a9
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Feb 7 12:59:04 2022 -0800

    Add qdq mul support

commit 9cadda7f2c
Merge: 7a32847761 0f5d0a091a
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Feb 7 11:24:47 2022 -0800

    Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul

commit 7a32847761
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Feb 7 00:41:30 2022 -0800

    move test case to util

commit c1a8f0d81e
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Fri Feb 4 13:04:26 2022 -0800

    update input/output check

commit a6f0a0d504
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Thu Feb 3 18:37:21 2022 -0800

    update quantized io check functions

commit 87f4d1dcfe
Merge: 7849f07109 97b8f6f394
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Wed Feb 2 17:22:58 2022 -0800

    Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul

commit 7849f07109
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Wed Feb 2 17:22:55 2022 -0800

    minor update

commit 7196cdf419
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Wed Feb 2 10:50:10 2022 -0800

    init change

commit 84c00772a1
Merge: a8c7dce22f 7318361645
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Tue Feb 1 18:21:17 2022 -0800

    Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul

commit a8c7dce22f
Merge: 55e536c182 ef7b4dc05c
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Tue Feb 1 13:51:04 2022 -0800

    Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul

commit 55e536c182
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Tue Feb 1 11:44:34 2022 -0800

    address cr comments

commit d460f5b776
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Tue Feb 1 00:33:54 2022 -0800

    fix android UT failure

commit 52146cf06f
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Jan 31 16:01:13 2022 -0800

    fix build break

commit ec6d07df8b
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Jan 31 15:41:52 2022 -0800

    minor update to UT

commit 8ec8490b4f
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Jan 31 15:01:30 2022 -0800

    Add NNAPI support of QDQ Resize

* Update qdq add/mul test case, fix build break

* Address CR comments

* Add QLinearMul support

* remove unused params

* Address CR comments

* wip

* save

* minor fix

* fix

* fix build

* address pr comments

* fix wrong ut tests

* address comments

* minor update

* fix addinitializersskip

Co-authored-by: Guoyu Wang <wanggy@outlook.com>
Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
2022-02-11 10:42:08 -08:00
Scott McKay
318d31ea12
Fix C# pipeline build error (#10524) 2022-02-11 08:56:40 -08:00
ytaous
4e2a974090
[ROCm] UTs and code clean up (#10511)
* Fix UT

* UT

* UTs

* enable ROCm UT

* fix build attempt

* minor

* fix UT

* fix UT

* fix UTs

Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: root <root@GCRAMDRR1-MI100-087.redmond.corp.microsoft.com>
2022-02-11 08:23:25 -08:00
Weixing Zhang
2002a96594
The memcpy transformer is needed for the ROCm EP and MIGraphX EP when fallback to CPU happens (#10522)
Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
2022-02-11 00:53:24 -08:00
Edward Chen
f92e47e95b
Remove onnxruntime_util dependency on onnxruntime_framework (#10512)
There's a circular dependency between onnxruntime_util and onnxruntime_framework.
Remove onnxruntime_util's dependency on onnxruntime_framework.
2022-02-10 19:17:08 -08:00
satyajandhyala
a27aabad34
Fix formatting. (#10520)
Formatting related changes
2022-02-10 17:43:25 -08:00
Changming Sun
3185680b6c
Add NHWC CONV contrib op (#10506) 2022-02-10 15:47:49 -08:00
satyajandhyala
eba730500f
Remove file-scope non-constant static variables to support multiple inference sessions (#10481)
* Changed file-scope static variables to automatic variables or function-scope static const.

* Reduce load time overhead by using constexpr.

* Use node indices instead of node names to track inserted, deleted and changed nodes.

Co-authored-by: Satya Jandhyala <sajandhy@microsoft.com>
2022-02-10 13:31:12 -08:00
Ye Wang
4d6d4dfb9d
Add TRT ep perf benchmark (#10470) 2022-02-10 08:51:01 -08:00
Sunghoon
dd33ce0fdc
[js/react_native] Create ONNX Runtime React Native pipeline (#10474)
* Pipeline for ONNX Runtime react native

* Fix a test failure

* test with custom built binaries

* add onnxruntime-common package back

* don't bob build when bootstrap

* revise Android test

* rename example to e2e

* remove onnxruntime packages from package.json

* remove release-it package

* upgrade gradle version to the same as CI

* add a pipeline for react native

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* android and ios mobile build for react native e2e

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* use android aar package template

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* use android aar package template

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* publish ios test results

* add e2e tests and publish a npm package

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* remove aar from npm package

* wait for view displayed

* change a waiting logic

* increase wait time for app launching

* give more time to launch an app

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* disable metro server on testing

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* test ios simulator launching

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* fix iOS e2e test

* use a publishing version of npm packages

* make pretty

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* make only one onnxruntime-common package after packaging

* make a powershell script of packaging universal

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Add a warning for file changes during a test

* clean up

* fix lint errors

* fix js npm packaging

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* Update mac-react-native-ci-pipeline.yml for Azure Pipelines

* resolve comments

* fix a typo
2022-02-09 21:37:05 -08:00
Changming Sun
6f3ade55ec
Move QAttention/QEmbedLayerNormalization op defs to quantization_defs.cc (#10507) 2022-02-09 14:23:17 -08:00
Hubert Lu
c9fbd0b15a
Optimize cuComputePartGradGammaBeta kernel for MI100 (#10475)
* Optimize cuComputePartGradGammaBeta kernel for MI100

Co-authored-by: root <root@gb-sjc2-10.local.lan>
Co-authored-by: Jeff Daily <jeff.daily@amd.com>
2022-02-09 12:51:06 -08:00
Changming Sun
7a2bf3c24c
Reorganize contrib op schemas (#10494) 2022-02-09 09:31:58 -08:00
ytaous
399ffc9700
Fix Windows GPU CI (#10499)
* fix build

* fix win build

Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-02-08 22:06:23 -08:00
Guoyu Wang
e4dc4e4d3c
[NNAPI QDQ] Add QDQ Add/Mul, update to NNAPI QDQ handling, update some test settings (#10483)
* Squashed commit of the following:

commit 12380491a9
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Feb 7 12:59:04 2022 -0800

    Add qdq mul support

commit 9cadda7f2c
Merge: 7a32847761 0f5d0a091a
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Feb 7 11:24:47 2022 -0800

    Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul

commit 7a32847761
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Feb 7 00:41:30 2022 -0800

    move test case to util

commit c1a8f0d81e
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Fri Feb 4 13:04:26 2022 -0800

    update input/output check

commit a6f0a0d504
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Thu Feb 3 18:37:21 2022 -0800

    update quantized io check functions

commit 87f4d1dcfe
Merge: 7849f07109 97b8f6f394
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Wed Feb 2 17:22:58 2022 -0800

    Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul

commit 7849f07109
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Wed Feb 2 17:22:55 2022 -0800

    minor update

commit 7196cdf419
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Wed Feb 2 10:50:10 2022 -0800

    init change

commit 84c00772a1
Merge: a8c7dce22f 7318361645
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Tue Feb 1 18:21:17 2022 -0800

    Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul

commit a8c7dce22f
Merge: 55e536c182 ef7b4dc05c
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Tue Feb 1 13:51:04 2022 -0800

    Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul

commit 55e536c182
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Tue Feb 1 11:44:34 2022 -0800

    address cr comments

commit d460f5b776
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Tue Feb 1 00:33:54 2022 -0800

    fix android UT failure

commit 52146cf06f
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Jan 31 16:01:13 2022 -0800

    fix build break

commit ec6d07df8b
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Jan 31 15:41:52 2022 -0800

    minor update to UT

commit 8ec8490b4f
Author: Guoyu Wang <wanggy@outlook.com>
Date:   Mon Jan 31 15:01:30 2022 -0800

    Add NNAPI support of QDQ Resize

* Update qdq add/mul test case, fix build break

* Address CR comments

* Add QLinearMul support

* remove unused params

* Address CR comments
2022-02-08 20:44:15 -08:00
Vincent Wang
655f490c95
Remove BFloat16 Specialized Code for ReduceSum (#10476) 2022-02-09 07:39:57 +08:00
ashbhandare
7e5d68eea6
gradient and test (#10455)
Co-authored-by: Aishwarya Bhandare <aibhanda@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-02-08 10:18:22 -08:00
ytaous
435e14d60a
[ROCm] BFloat16 support (#10465)
* bf16 support

* minor clean up

* UTs

* fix build

* UTs

* UTs

* merge commit 6b5504c

* minor

* ROCm code cleanup

* fix build

* fix build

* minor

Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: root <root@GCRAMDRR1-MI100-087.redmond.corp.microsoft.com>
2022-02-07 22:55:15 -08:00
Yufeng Li
c696da36c7
fix unit test of quant gemm (#10469) 2022-02-07 09:14:37 -08:00
Chi Lo
0f5d0a091a
Make user capable of adding new field in OrtTensorRTProviderOptionsV2 as new provider option (#10450)
* modify code for add additional field in OrtTensorRTProviderOptionsV2

* add include file

* fix typo

* fix bug

* add comment

* fix code

* revert change
2022-02-05 11:15:12 -08:00
Rachel Guo
927f1f18c9
[NNAPI QDQ] Add QDQ AveragePool op support (#10464)
* wip

* save

* address pr comments

* update

* revert minor changes

Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
2022-02-04 17:04:48 -08:00
wraveane
d0ab881d07
Contrib ops for TRT plugins: EfficientNMS and Pyramid ROI Align (#9486)
* Contrib ops for TRT plugins: EfficientNMS and Pyramid ROI Align

* Contrib ops for TRT plugins: Multilevel Crop and Resize
2022-02-04 12:10:04 -08:00
Ye Wang
0d09dd5d20
Support fusion for TNLR based model (#10432)
* support tnlr based offensive V4 model

* Update onnx_model_tnlr.py

Co-authored-by: Ubuntu <wy@linux-v100.aidmrjtolptuzevavgwhrapqcd.jx.internal.cloudapp.net>
2022-02-03 23:59:05 -08:00
Changming Sun
4f13c8ac39
Update orttraining-linux-ci-pipeline.yml (#10462) 2022-02-03 13:46:16 -08:00
Maxiwell S. Garcia
6bbf016dc4
cmake: disable 'attributes' error to fix the build with GCC < 9.x
This patch fixes the error "requested alignment X is larger than Y" in older GCC's

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89357
2022-02-03 13:38:19 -08:00
Ye Wang
bb09acffed
Transformer model CUDA EP align with CPU on corner case (#9889)
* align with cpu on no input data

* review comments and add tests

Co-authored-by: Ubuntu <wy@linux-v100.aidmrjtolptuzevavgwhrapqcd.jx.internal.cloudapp.net>
2022-02-03 12:58:49 -08:00
ytaous
63198a6566
[ROCm] BFloat16 support (#10447)
* bf16 support

* bf16 support

* UTs

* fix build

* fix UTs

Co-authored-by: root <root@GCRAMDRR1-MI100-087.redmond.corp.microsoft.com>
2022-02-03 11:31:14 -08:00
zhangyaobit
239c6ad3f0
Support specifying an execution provider in benchmark script (#10453)
* Support specifying execution providers.

* Change default provider setting to None.

* Add support for bert_perf_test script.

* Fall back to ROCM/CUDA EP for MIGraphX/Tensorrt EP.

* Assert fall back EPs are included.

* Add model class AutoModelForCausalLM and other minor updates.

Co-authored-by: Yao Zhang <zhanyao@microsoft.com>
2022-02-02 19:11:31 -08:00
Yi-Hong Lyu
a405658370
Fuse Clip->Q to Q (#10434)
* Fuse Clip->Q to Q

* Remove unused variable argmax_node

* Remove braces around scalar initializer

* Move GetClipConstantMinMax under ORT_MINIMAL_BUILD

* Consider epsilon so we can fuse more cases
2022-02-02 18:29:30 -08:00
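
The idea behind fusing `Clip -> Q` into `Q` in the commit above (#10434) is that a Clip is redundant when its bounds already cover everything the following quantization can represent. The Python sketch below shows that check under stated assumptions: linear uint8 quantization by default, and hypothetical names (`quantized_range`, `clip_is_redundant`) and an epsilon value that are not taken from the ONNX Runtime source.

```python
from typing import Tuple


def quantized_range(scale: float, zero_point: int,
                    qmin: int = 0, qmax: int = 255) -> Tuple[float, float]:
    """Real-valued range representable by linear quantization."""
    return (qmin - zero_point) * scale, (qmax - zero_point) * scale


def clip_is_redundant(clip_min: float, clip_max: float,
                      scale: float, zero_point: int,
                      qmin: int = 0, qmax: int = 255,
                      epsilon: float = 1e-4) -> bool:
    """Return True if Clip(clip_min, clip_max) followed by Q can be replaced by Q.

    Quantization saturates to [qmin, qmax] anyway, so the Clip has no effect
    when its bounds lie outside the representable range. The epsilon tolerance
    (an assumed value) lets near-equal bounds count as covering the range.
    """
    lower, upper = quantized_range(scale, zero_point, qmin, qmax)
    return clip_min <= lower + epsilon and clip_max >= upper - epsilon
```

With `scale = 6 / 255` and `zero_point = 0`, the representable range is roughly `[0, 6]`, so a `Clip(0, 6)` in front of the Q node is redundant, while a `Clip(0, 3)` is not because it would cut off values the quantizer can still represent.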
Rachel Guo
97b8f6f394
Add logic to NNAPI EP to exclude pre-processing involving dynamic shapes when partitioning (#10452)
* wip

* wip

* wip

* save

* address pr comments

* address pr comments

Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
2022-02-02 15:54:19 -08:00
Sunghoon
6076a262dc
upgrade react-native packages to latest (#10454) 2022-02-02 15:19:40 -08:00
Viswanath Boga
ad9d2e2e89
Prefix match in first iteration of beam search OP (#10231)
* Add BeamSearch op schema

* Add ONNX conversion for beams search

* remove attention_mask and change input order

* add option to run baseline

* add check data type NULL

* applies VerifyNodeAndOpMatch to subgraph

* update input_ids shape

* Add node name for Cast node

* expose API for topk

* parse parameters

* Add beam search scorer

* output results

* fix typo

* use c++ template and format python

* fix build pipeline errors

* symbolic shape infer of input onnx

* output scores

* add kernel def hash

* Handle vocab_mask; move CheckSubgraph

* undo insert_cast_transformer.cc and fusion_utils.py

* fix typo

* fix merge

* update doc

* add repetition penalty

* refactoring: add GptSubgraph class

* move BeamSearchState from .h to .cc file

* adjust logits processor order

* add batch generation example

* fix repetition penalty for dup words in sequence

* Add test

* Add no repeat ngram processor

* refactoring: move logits processor to classes

* fix build warning

* show latency

* use allocator in beam state

* use allocator in sequences

* fix build error

* move next_positions to beam state

* Changes for prefix matching

* removing debugs

* removing more debugs

* clean up

* clean up

* cpu doc updated

* Updated docs

* updated prefix_vocab_mask dimension in convert script

* changes to support bxs prefix_vocab_mask in beamsearchop kernel

* doc update

* OperatorKernels.md updated

* matching docs from artifacts

* minor change in logits processor

* Addressing comments

* Updated the prefix vocab mask usage properly

Co-authored-by: Tianlei Wu <tlwu@microsoft.com>
2022-02-03 00:14:39 +05:30
Yufeng Li
1aa0789691
add qdq support for QGemm (#10414)
* add qgemm in quantization tool

* add qdq support for QGemm

* fix build break

* fix OperatorKernels.md
2022-02-02 10:35:29 -08:00
Guoyu Wang
7318361645
[NNAPI QDQ] Add QDQ Resize support (#10442)
* Add NNAPI support of QDQ Resize

* minor update to UT

* fix build break

* fix android UT failure

* address cr comments
2022-02-01 18:14:58 -08:00
Dmitri Smirnov
91b8ad5ee7
Allow users to bind arbitrary memory using raw pointers (#10428)
Add binding external allocation
  Add negative tests
  Add missing return status check
2022-02-01 18:09:24 -08:00
Weixing Zhang
3c96760192
support rocm/migraphx EP in perftest tool (#10449)
Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
2022-02-01 16:12:01 -08:00
Shucai Xiao
062129a5c4
Update rocm_ep and migraphx_ep to rocm4.5.2 and fix dockerfiles to build docker images correctly (#10445)
* fix build errors for the migraphx and rocm dockerfile

* add the numpy package in the migraphx and rocm dockerfile
2022-02-01 16:11:39 -08:00
Olivia Jain
a1d9a71b8b
Improve Perf System (#10404)
* move table names to one location

* remove session metadata

* reload trt inputs

* fix posting names

* Update linux-gpu-tensorrt-daily-perf-pipeline.yml for Azure Pipelines

* remove comments

* Split up anubis job and perf run

* add trt environ variables

* No embedded links
2022-02-01 16:01:34 -08:00
Chi Lo
a7c67860a5
Reduce test time for TensorRT EP CI (#10408)
* expand model tests name

* skip cpu/cuda for trt when running onnxruntime_test_all

* only run trt ep for c++ unit test

* Update CMAKE_CUDA_ARCHITECTURES for T4

* Use new t4 agent pool

* Update YAML for run T4 on Windows

* revert code

* Update CMAKE_CUDA_ARCHITECTURES

* fix wrong value

* Remove cpu/cuda directly in model tests

* add only CMAKE_CUDA_ARCHITECTURES=75

* remove expanding model test name to see difference

* revert code

* Add fallback execution provider for unit test

* Add fallback execution provider for unit test (cont)

* add conditional to add fallback cuda ep

* Reduction op takes much longer time for TRT 8.2, so we test smaller range of inputs

* use M60

* revert code

* revert code

* add comments

* Modify code and add comment

* modify comment

* update comment

* add comment
2022-02-01 15:56:33 -08:00
Yi-Hong Lyu
ef7b4dc05c
Add test quantization of ArgMax for TensorRT (#10325)
Make sure quantize_static would insert DQ -> Q before ArgMax.
2022-01-31 16:22:16 -08:00
Guoyu Wang
68262cce86
[NNAPI QDQ] Add QDQ Conv support (#10418)
* Add qdq conv to NNAPI

* fix build warning

* addressed CR comments

* fix a minor bug in my previous merge
2022-01-31 14:36:31 -08:00
Edward Chen
c43c1691ad
Enable transpose optimizer in minimal extended build (#10349)
Enable transpose optimizer and infrastructure it depends on in a minimal extended build.
2022-01-31 09:41:04 -08:00