onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-04 04:07:22 +00:00

Author	SHA1	Message	Date
Valery Chernov	1cdc23aba4	[TVM EP] Rename Standalone TVM (STVM) Execution Provider to TVM EP (#10260 ) * update java API for STVM EP. Issue is from PR#10019 * use_stvm -> use_tvm * rename stvm worktree * STVMAllocator -> TVMAllocator * StvmExecutionProviderInfo -> TvmExecutionProviderInfo * stvm -> tvm for cpu_targets. resolve onnxruntime::tvm and origin tvm namespaces conflict * STVMRunner -> TVMRunner * StvmExecutionProvider -> TvmExecutionProvider * tvm::env_vars * StvmProviderFactory -> TvmProviderFactory * rename factory funcs * StvmCPUDataTransfer -> TvmCPUDataTransfer * small clean * STVMFuncState -> TVMFuncState * USE_TVM -> NUPHAR_USE_TVM * USE_STVM -> USE_TVM * python API: providers.stvm -> providers.tvm. clean TVM_EP.md * clean build scripts #1 * clean build scripts, java frontend and others #2 * once more clean #3 * fix build of nuphar tvm test * final transfer stvm namespace to onnxruntime::tvm * rename stvm->tvm * NUPHAR_USE_TVM -> USE_NUPHAR_TVM * small fixes for correct CI tests * clean after rebase. Last renaming stvm to tvm, separate TVM and Nuphar in cmake and build files * update CUDA support for TVM EP * roll back CudaNN home check * ERROR for not positive input shape dimension instead of WARNING * update documentation for CUDA * small corrections after review * update GPU description * update GPU description * misprints were fixed * cleaned up error msgs Co-authored-by: Valery Chernov <valery.chernov@deelvin.com> Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru> Co-authored-by: Thierry Moreau <tmoreau@octoml.ai>	2022-02-15 10:21:02 +01:00
ytaous	d3f7459263	fix CI build (#10553 ) Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-02-14 19:52:21 -08:00
Chen Fu	58f80c16ff	Create branch according to cpu core uarch (#10521 ) This is a preparation change for a bigger goal. On ARM64 CPUs with Big.Little, different cores are always the same architecture but different micro-architecture. Specifically, it is often that the little core has narrow memory buses that makes 128b load very slow. While if we always use 64b load in our kernels, the code will run slower on big cores. As a result, we need to run different code on different cores to achieve better performance. This change constructs a manifold that pivot based on the core micro-architecture of the current core, so that we can develop and call different kernels accordingly. Co-authored-by: Chen Fu <fuchen@microsoft.com>	2022-02-14 15:16:20 -08:00
Edward Chen	3199074ac7	Update QDQ propagation transformer to insert QDQ nodes (#10487 ) Update QDQ propagation transformer to insert new QDQ nodes instead of moving the existing one. This creates a more consistent `DQ -> op -> Q` pattern for other components to recognize. Upgrade this transformer to a basic level optimization as it yields a valid ONNX graph.	2022-02-14 14:20:03 -08:00
Baiju Meswani	7691e7ed12	Introduce load balancing dataset samplers (#10163 )	2022-02-14 13:46:14 -08:00
Changming Sun	270dec7327	Return a Status instead of throw an exception in GetAttrs (#10534 )	2022-02-14 13:24:35 -08:00
Yi-Hong Lyu	3f37609994	Remove unneeded code in UpsampleBilinear (#10544 )	2022-02-14 12:32:53 -08:00
dependabot[bot]	bfb20b315d	Bump karma from 6.3.2 to 6.3.14 in /js/web Bumps [karma](https://github.com/karma-runner/karma) from 6.3.2 to 6.3.14. - [Release notes](https://github.com/karma-runner/karma/releases) - [Changelog](https://github.com/karma-runner/karma/blob/master/CHANGELOG.md) - [Commits](https://github.com/karma-runner/karma/compare/v6.3.2...v6.3.14) --- updated-dependencies: - dependency-name: karma dependency-type: direct:development ... Signed-off-by: dependabot[bot] <support@github.com>	2022-02-11 12:17:11 -08:00
Rachel Guo	5cfde7af29	[NNAPI QDQ] Add QDQTranspose op support (#10495 ) * Squashed commit of the following: commit `12380491a9` Author: Guoyu Wang <wanggy@outlook.com> Date: Mon Feb 7 12:59:04 2022 -0800 Add qdq mul support commit `9cadda7f2c` Merge: `7a32847761` `0f5d0a091a` Author: Guoyu Wang <wanggy@outlook.com> Date: Mon Feb 7 11:24:47 2022 -0800 Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul commit `7a32847761` Author: Guoyu Wang <wanggy@outlook.com> Date: Mon Feb 7 00:41:30 2022 -0800 move test case to util commit `c1a8f0d81e` Author: Guoyu Wang <wanggy@outlook.com> Date: Fri Feb 4 13:04:26 2022 -0800 update input/output check commit `a6f0a0d504` Author: Guoyu Wang <wanggy@outlook.com> Date: Thu Feb 3 18:37:21 2022 -0800 update quantized io check functions commit `87f4d1dcfe` Merge: `7849f07109` `97b8f6f394` Author: Guoyu Wang <wanggy@outlook.com> Date: Wed Feb 2 17:22:58 2022 -0800 Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul commit `7849f07109` Author: Guoyu Wang <wanggy@outlook.com> Date: Wed Feb 2 17:22:55 2022 -0800 minor update commit `7196cdf419` Author: Guoyu Wang <wanggy@outlook.com> Date: Wed Feb 2 10:50:10 2022 -0800 init change commit `84c00772a1` Merge: `a8c7dce22f` `7318361645` Author: Guoyu Wang <wanggy@outlook.com> Date: Tue Feb 1 18:21:17 2022 -0800 Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul commit `a8c7dce22f` Merge: `55e536c182` `ef7b4dc05c` Author: Guoyu Wang <wanggy@outlook.com> Date: Tue Feb 1 13:51:04 2022 -0800 Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul commit `55e536c182` Author: Guoyu Wang <wanggy@outlook.com> Date: Tue Feb 1 11:44:34 2022 -0800 address cr comments commit `d460f5b776` Author: Guoyu Wang <wanggy@outlook.com> Date: Tue Feb 1 00:33:54 2022 -0800 fix android UT failure commit `52146cf06f` Author: Guoyu Wang <wanggy@outlook.com> Date: Mon Jan 31 16:01:13 2022 -0800 fix build break commit `ec6d07df8b` Author: Guoyu Wang <wanggy@outlook.com> Date: Mon Jan 31 15:41:52 2022 -0800 minor update to UT commit `8ec8490b4f` Author: Guoyu Wang <wanggy@outlook.com> Date: Mon Jan 31 15:01:30 2022 -0800 Add NNAPI support of QDQ Resize * Update qdq add/mul test case, fix build break * Address CR comments * Add QLinearMul support * remove unused params * Address CR comments * wip * save * minor fix * fix * fix build * address pr comments * fix wrong ut tests * address comments * minor update * fix addinitializersskip Co-authored-by: Guoyu Wang <wanggy@outlook.com> Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>	2022-02-11 10:42:08 -08:00
Scott McKay	318d31ea12	Fix C# pipeline build error (#10524 )	2022-02-11 08:56:40 -08:00
ytaous	4e2a974090	[ROCm] UTs and code clean up (#10511 ) * Fix UT * UT * UTs * enable ROCm UT * fix build attempt * minor * fix UT * fix UT * fix UTs Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> Co-authored-by: root <root@GCRAMDRR1-MI100-087.redmond.corp.microsoft.com>	2022-02-11 08:23:25 -08:00
Weixing Zhang	2002a96594	The transformer of memcpy is needed for ROCm EP and MIGraphX EP when fallbacking CPU happens (#10522 ) Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2022-02-11 00:53:24 -08:00
Edward Chen	f92e47e95b	Remove onnxruntime_util dependency on onnxruntime_framework (#10512 ) There's a circular dependency between onnxruntime_util and onnxruntime_framework. Remove onnxruntime_util's dependency on onnxruntime_framework.	2022-02-10 19:17:08 -08:00
satyajandhyala	a27aabad34	Fix formatting. (#10520 ) Formatting related changes	2022-02-10 17:43:25 -08:00
Changming Sun	3185680b6c	Add NHWC CONV contrib op (#10506 )	2022-02-10 15:47:49 -08:00
satyajandhyala	eba730500f	Remove file-scope non-constant static variables to support multiple inference sessions (#10481 ) * Changed file-scope static variables to automatic variables or function-scope static const. * Reduce load time overhead by using constexpr. * Use node indices instead of node names to track inserted, deleted and changed nodes. Co-authored-by: Satya Jandhyala <sajandhy@microsoft.com>	2022-02-10 13:31:12 -08:00
Ye Wang	4d6d4dfb9d	Add TRT ep perf benchmark (#10470 )	2022-02-10 08:51:01 -08:00
Sunghoon	dd33ce0fdc	[js/react_native] Create ONNX Runtime React Native pipeline (#10474 ) * Pipeline for ONNX Runtime react native * Fix a test failure * test with custom built binaries * add onnxruntime-common package back * don't bob build when bootstrap * revise Android test * rename example to e2e * remove onnxruntime packages from package.json * remove release-it package * upgrade gradle version to the same as CI * add a pipeline for react native * Update mac-react-native-ci-pipeline.yml for Azure Pipelines * android and ios mobile build for react native e2e * use android aar package template * publish ios test results * add e2e tests and publish a npm package * remove aar from npm package * wait for view displayed * change a waiting logic * increase wait time for app launching * give more time to launch an app * disable metro server on testing * test ios simulator launching * fix iOS e2e test * use a publishing version of npm packages * make pretty * make only one onnxruntime-common package after packaging * make a powershell script of packaging universal * Add a warning for file changes during a test * clean up * fix lint errors * fix js npm packaging * resolve comments * fix a typo	2022-02-09 21:37:05 -08:00
Changming Sun	6f3ade55ec	Move QAttention/QEmbedLayerNormalization op defs to quantization_defs.cc (#10507 )	2022-02-09 14:23:17 -08:00
Hubert Lu	c9fbd0b15a	Optimize cuComputePartGradGammaBeta kernel for MI100 (#10475 ) * Optimize cuComputePartGradGammaBeta kernel for MI100 Co-authored-by: root <root@gb-sjc2-10.local.lan> Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2022-02-09 12:51:06 -08:00
Changming Sun	7a2bf3c24c	Reorganize contrib op schemas (#10494 )	2022-02-09 09:31:58 -08:00
ytaous	399ffc9700	Fix Windows GPU CI (#10499 ) * fix build * fix win build Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-02-08 22:06:23 -08:00
Guoyu Wang	e4dc4e4d3c	[NNAPI QDQ] AddQDQAdd/Mul, update to NNAPI QDQ handling, update some test settings (#10483 ) * Squashed commit of the following: commit `12380491a9` Author: Guoyu Wang <wanggy@outlook.com> Date: Mon Feb 7 12:59:04 2022 -0800 Add qdq mul support commit `9cadda7f2c` Merge: `7a32847761` `0f5d0a091a` Author: Guoyu Wang <wanggy@outlook.com> Date: Mon Feb 7 11:24:47 2022 -0800 Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul commit `7a32847761` Author: Guoyu Wang <wanggy@outlook.com> Date: Mon Feb 7 00:41:30 2022 -0800 move test case to util commit `c1a8f0d81e` Author: Guoyu Wang <wanggy@outlook.com> Date: Fri Feb 4 13:04:26 2022 -0800 update input/output check commit `a6f0a0d504` Author: Guoyu Wang <wanggy@outlook.com> Date: Thu Feb 3 18:37:21 2022 -0800 update quantized io check functions commit `87f4d1dcfe` Merge: `7849f07109` `97b8f6f394` Author: Guoyu Wang <wanggy@outlook.com> Date: Wed Feb 2 17:22:58 2022 -0800 Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul commit `7849f07109` Author: Guoyu Wang <wanggy@outlook.com> Date: Wed Feb 2 17:22:55 2022 -0800 minor update commit `7196cdf419` Author: Guoyu Wang <wanggy@outlook.com> Date: Wed Feb 2 10:50:10 2022 -0800 init change commit `84c00772a1` Merge: `a8c7dce22f` `7318361645` Author: Guoyu Wang <wanggy@outlook.com> Date: Tue Feb 1 18:21:17 2022 -0800 Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul commit `a8c7dce22f` Merge: `55e536c182` `ef7b4dc05c` Author: Guoyu Wang <wanggy@outlook.com> Date: Tue Feb 1 13:51:04 2022 -0800 Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul commit `55e536c182` Author: Guoyu Wang <wanggy@outlook.com> Date: Tue Feb 1 11:44:34 2022 -0800 address cr comments commit `d460f5b776` Author: Guoyu Wang <wanggy@outlook.com> Date: Tue Feb 1 00:33:54 2022 -0800 fix android UT failure commit `52146cf06f` Author: Guoyu Wang <wanggy@outlook.com> Date: Mon Jan 31 16:01:13 2022 -0800 fix build break commit `ec6d07df8b` Author: Guoyu Wang <wanggy@outlook.com> Date: Mon Jan 31 15:41:52 2022 -0800 minor update to UT commit `8ec8490b4f` Author: Guoyu Wang <wanggy@outlook.com> Date: Mon Jan 31 15:01:30 2022 -0800 Add NNAPI support of QDQ Resize * Update qdq add/mul test case, fix build break * Address CR comments * Add QLinearMul support * remove unused params * Address CR comments	2022-02-08 20:44:15 -08:00
Vincent Wang	655f490c95	Remove BFloat16 Specialized Code for ReduceSum (#10476 )	2022-02-09 07:39:57 +08:00
ashbhandare	7e5d68eea6	gradient and test (#10455 ) Co-authored-by: Aishwarya Bhandare <aibhanda@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-02-08 10:18:22 -08:00
ytaous	435e14d60a	[ROCm] BFloat16 support (#10465 ) * bf16 support * minor clean up * UTs * fix build * UTs * UTs * merge commit 6b5504c * minor * ROCm code cleanup * fix build * fix build * minor Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> Co-authored-by: root <root@GCRAMDRR1-MI100-087.redmond.corp.microsoft.com>	2022-02-07 22:55:15 -08:00
Yufeng Li	c696da36c7	fix unit test of quant gemm (#10469 )	2022-02-07 09:14:37 -08:00
Chi Lo	0f5d0a091a	Make user capable of adding new field in OrtTensorRTProviderOptionsV2 as new provider option (#10450 ) * modify code for add additional field in OrtTensorRTProviderOptionsV2 * add include file * fix typo * fix bug * add comment * fix code * revert change	2022-02-05 11:15:12 -08:00
Rachel Guo	927f1f18c9	[NNAPI QDQ] Add QDQ AveragePool op support (#10464 ) * wip * save * address pr comments * update * revert minor changes Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>	2022-02-04 17:04:48 -08:00
wraveane	d0ab881d07	Contrib ops for TRT plugins: EfficientNMS and Pyramid ROI Align (#9486 ) * Contrib ops for TRT plugins: EfficientNMS and Pyramid ROI Align * Contrib ops for TRT plugins: Multilevel Crop and Resize	2022-02-04 12:10:04 -08:00
Ye Wang	0d09dd5d20	Support fusion for TNLR based model (#10432 ) * support tnlr based offensive V4 model * Update onnx_model_tnlr.py Co-authored-by: Ubuntu <wy@linux-v100.aidmrjtolptuzevavgwhrapqcd.jx.internal.cloudapp.net>	2022-02-03 23:59:05 -08:00
Changming Sun	4f13c8ac39	Update orttraining-linux-ci-pipeline.yml (#10462 )	2022-02-03 13:46:16 -08:00
Maxiwell S. Garcia	6bbf016dc4	cmake: disable 'attributes' error to fix the build with GCC < 9.x This patch fixes the error "requested alignment X is larger than Y" in older GCC's https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89357	2022-02-03 13:38:19 -08:00
Ye Wang	bb09acffed	Transformer model CUDA EP align with CPU on corner case (#9889 ) * align with cpu on no input data * review comments and add tests Co-authored-by: Ubuntu <wy@linux-v100.aidmrjtolptuzevavgwhrapqcd.jx.internal.cloudapp.net>	2022-02-03 12:58:49 -08:00
ytaous	63198a6566	[ROCm] BFloat16 support (#10447 ) * bf16 support * bf16 support * UTs * fix build * fix UTs Co-authored-by: root <root@GCRAMDRR1-MI100-087.redmond.corp.microsoft.com>	2022-02-03 11:31:14 -08:00
zhangyaobit	239c6ad3f0	Support specifying an execution provider in benchmark script (#10453 ) * Support specifying execution providers. * Change default provider setting to None. * Add support for bert_perf_test script. * Fall back to ROCM/CUDA EP for MIGraphX/Tensorrt EP. * Assert fall back EPs are included. * Add model class AutoModelForCausalLM and other minor updates. Co-authored-by: Yao Zhang <zhanyao@microsoft.com>	2022-02-02 19:11:31 -08:00
Yi-Hong Lyu	a405658370	Fuse Clip->Q to Q (#10434 ) * Fuse Clip->Q to Q * Remove unused variable argmax_node * Remove braces around scalar initializer * Move GetClipConstantMinMax under ORT_MINIMAL_BUILD * Consider epsilon so we can fuse more cases	2022-02-02 18:29:30 -08:00
Rachel Guo	97b8f6f394	Add logic to NNAPI EP to exclude pre-processing involving dynamic shapes when partitioning (#10452 ) * wip * wip * wip * save * address pr comments * address pr comments Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>	2022-02-02 15:54:19 -08:00
Sunghoon	6076a262dc	upgrade react-native packages to latest (#10454 )	2022-02-02 15:19:40 -08:00
Viswanath Boga	ad9d2e2e89	Prefix match in first iteration of beam search OP (#10231 ) * Add BeamSearch op schema * Add ONNX conversion for beams search * remove attention_mask and change input order * add option to run baseline * add check data type NULL * applies VerifyNodeAndOpMatch to subgraph * update input_ids shape * Add node name for Cast node * expose API for topk * parse parameters * Add beam search scorer * output results * fix typo * use c++ template and format python * fix build pipeline errors * symbolic shape infer of input onnx * output scores * add kernel def hash * Handle vocab_mask; move CheckSubgraph * undo insert_cast_transformer.cc and fusion_utils.py * fix typo * fix merge * update doc * add repetition penalty * refactoring: add GptSubgraph class * move BeamSearchState from .h to .cc file * adjust logits processor order * add batch generation example * fix repetition penalty for dup words in sequence * Add test * Add no repeat ngram processor * refactoring: move logits processor to classes * fix build warning * show latency * use allocator in beam state * use allocator in sequences * fix build error * move next_positions to beam state * Changes for prefix matching * removing debugs * removing more debugs * clean up * clean up * cpu doc updated * Updated docs * updated prefix_vocab_mask dimension in convert script * changes to support bxs prefix_vocab_mask in beamsearchop kernel * doc update * OperatorKernels.md updated * matching docs from artifacts * minor change in logits processor * Addressing comments * Updated the prefix vocab mask usage properly Co-authored-by: Tianlei Wu <tlwu@microsoft.com>	2022-02-03 00:14:39 +05:30
Yufeng Li	1aa0789691	add qdq support for QGemm (#10414 ) * add qgemm in quantization tool * add qdq support for QGemm * fix build break * fix OperatorKernels.md	2022-02-02 10:35:29 -08:00
Guoyu Wang	7318361645	[NNAPI QDQ] Add QDQ Resize support (#10442 ) * Add NNAPI support of QDQ Resize * minor update to UT * fix build break * fix android UT failure * address cr comments	2022-02-01 18:14:58 -08:00
Dmitri Smirnov	91b8ad5ee7	Allow users to bind arbitrary memory using raw pointers (#10428 ) Add binding external allocation Add negative tests Add missing return status check	2022-02-01 18:09:24 -08:00
Weixing Zhang	3c96760192	support rocm/migraphx EP in perftest tool (#10449 ) Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2022-02-01 16:12:01 -08:00
Shucai Xiao	062129a5c4	Update rocm_ep and migraphx_ep to rocm4.5.2 and fix dockerfiles to build docker images correctly (#10445 ) * fix build errors for the migraphx and rocm dockerfile * add the numpy package in the migraphx and rocm dockerfile	2022-02-01 16:11:39 -08:00
Olivia Jain	a1d9a71b8b	Improve Perf System (#10404 ) * move table names to one location * remove session metadata * reload trt inputs * fix posting names * Update linux-gpu-tensorrt-daily-perf-pipeline.yml for Azure Pipelines * remove comments * Split up anubis job and perf run * add trt environ variables * No embedded links	2022-02-01 16:01:34 -08:00
Chi Lo	a7c67860a5	Reduce test time for TensorRT EP CI (#10408 ) * expand model tests name * skip cpu/cuda for trt when running onnxruntime_test_all * only run trt ep for c++ unit test * Update CMAKE_CUDA_ARCHITECTURES for T4 * Use new t4 agent pool * Update YAML for run T4 on Windows * revert code * Update CMAKE_CUDA_ARCHITECTURES * fix wrong value * Remove cpu/cuda directly in model tests * add only CMAKE_CUDA_ARCHITECTURES=75 * remove expanding model test name to see difference * revert code * Add fallback execution provider for unit test * Add fallback execution provider for unit test (cont) * add conditional to add fallback cuda ep * Reduction op takes much longer time for TRT 8.2, so we test smaller range of inputs * use M60 * revert code * revert code * add comments * Modify code and add comment * modify comment * update comment * add comment	2022-02-01 15:56:33 -08:00
Yi-Hong Lyu	ef7b4dc05c	Add test quantization of ArgMax for TensorRT (#10325 ) Make sure quantize_static would insert DQ -> Q before ArgMax.	2022-01-31 16:22:16 -08:00
Guoyu Wang	68262cce86	[NNAPI QDQ] Add QDQ Conv support (#10418 ) * Add qdq conv to NNAPI * fix build warning * addressed CR comments * fix a minor bug in my previous merge	2022-01-31 14:36:31 -08:00
Edward Chen	c43c1691ad	Enable transpose optimizer in minimal extended build (#10349 ) Enable transpose optimizer and infrastructure it depends on in a minimal extended build.	2022-01-31 09:41:04 -08:00

6319 commits