onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-28 03:20:58 +00:00

Author	SHA1	Message	Date
ytaous	399ffc9700	Fix Windows GPU CI (#10499 ) * fix build * fix win build Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-02-08 22:06:23 -08:00
Guoyu Wang	e4dc4e4d3c	[NNAPI QDQ] AddQDQAdd/Mul, update to NNAPI QDQ handling, update some test settings (#10483 ) * Squashed commit of the following: commit `12380491a9` Author: Guoyu Wang <wanggy@outlook.com> Date: Mon Feb 7 12:59:04 2022 -0800 Add qdq mul support commit `9cadda7f2c` Merge: `7a32847761` `0f5d0a091a` Author: Guoyu Wang <wanggy@outlook.com> Date: Mon Feb 7 11:24:47 2022 -0800 Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul commit `7a32847761` Author: Guoyu Wang <wanggy@outlook.com> Date: Mon Feb 7 00:41:30 2022 -0800 move test case to util commit `c1a8f0d81e` Author: Guoyu Wang <wanggy@outlook.com> Date: Fri Feb 4 13:04:26 2022 -0800 update input/output check commit `a6f0a0d504` Author: Guoyu Wang <wanggy@outlook.com> Date: Thu Feb 3 18:37:21 2022 -0800 update quantized io check functions commit `87f4d1dcfe` Merge: `7849f07109` `97b8f6f394` Author: Guoyu Wang <wanggy@outlook.com> Date: Wed Feb 2 17:22:58 2022 -0800 Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul commit `7849f07109` Author: Guoyu Wang <wanggy@outlook.com> Date: Wed Feb 2 17:22:55 2022 -0800 minor update commit `7196cdf419` Author: Guoyu Wang <wanggy@outlook.com> Date: Wed Feb 2 10:50:10 2022 -0800 init change commit `84c00772a1` Merge: `a8c7dce22f` `7318361645` Author: Guoyu Wang <wanggy@outlook.com> Date: Tue Feb 1 18:21:17 2022 -0800 Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul commit `a8c7dce22f` Merge: `55e536c182` `ef7b4dc05c` Author: Guoyu Wang <wanggy@outlook.com> Date: Tue Feb 1 13:51:04 2022 -0800 Merge remote-tracking branch 'origin/master' into gwang-msft/qdq_mul commit `55e536c182` Author: Guoyu Wang <wanggy@outlook.com> Date: Tue Feb 1 11:44:34 2022 -0800 address cr comments commit `d460f5b776` Author: Guoyu Wang <wanggy@outlook.com> Date: Tue Feb 1 00:33:54 2022 -0800 fix android UT failure commit `52146cf06f` Author: Guoyu Wang <wanggy@outlook.com> Date: Mon Jan 31 16:01:13 2022 -0800 fix build break commit `ec6d07df8b` Author: Guoyu Wang <wanggy@outlook.com> Date: Mon Jan 31 15:41:52 2022 -0800 minor update to UT commit `8ec8490b4f` Author: Guoyu Wang <wanggy@outlook.com> Date: Mon Jan 31 15:01:30 2022 -0800 Add NNAPI support of QDQ Resize * Update qdq add/mul test case, fix build break * Address CR comments * Add QLinearMul support * remove unused params * Address CR comments	2022-02-08 20:44:15 -08:00
Vincent Wang	655f490c95	Remove BFloat16 Specialized Code for ReduceSum (#10476 )	2022-02-09 07:39:57 +08:00
Ryan Lai	4388eaed1b	Merged PR 6937750: Restore history to dmldev. Merge without squash Related work items: #37712737	2022-02-08 23:24:02 +00:00
Ryan Lai	b14944f9f8	Merge commit 'b02f4ece5e4f48f5d303d6be0170c03d60b24efb' into user/rylai/restore_history	2022-02-08 14:58:23 -08:00
ashbhandare	7e5d68eea6	gradient and test (#10455 ) Co-authored-by: Aishwarya Bhandare <aibhanda@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-02-08 10:18:22 -08:00
ytaous	435e14d60a	[ROCm] BFloat16 support (#10465 ) * bf16 support * minor clean up * UTs * fix build * UTs * UTs * merge commit 6b5504c * minor * ROCm code cleanup * fix build * fix build * minor Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> Co-authored-by: root <root@GCRAMDRR1-MI100-087.redmond.corp.microsoft.com>	2022-02-07 22:55:15 -08:00
Yufeng Li	c696da36c7	fix unit test of quant gemm (#10469 )	2022-02-07 09:14:37 -08:00
Chi Lo	0f5d0a091a	Make user capable of adding new field in OrtTensorRTProviderOptionsV2 as new provider option (#10450 ) * modify code for add additional field in OrtTensorRTProviderOptionsV2 * add include file * fix typo * fix bug * add comment * fix code * revert change	2022-02-05 11:15:12 -08:00
Rachel Guo	927f1f18c9	[NNAPI QDQ] Add QDQ AveragePool op support (#10464 ) * wip * save * address pr comments * update * revert minor changes Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>	2022-02-04 17:04:48 -08:00
wraveane	d0ab881d07	Contrib ops for TRT plugins: EfficientNMS and Pyramid ROI Align (#9486 ) * Contrib ops for TRT plugins: EfficientNMS and Pyramid ROI Align * Contrib ops for TRT plugins: Multilevel Crop and Resize	2022-02-04 12:10:04 -08:00
Dwayne Robinson	6fd7ba5b7e	Merged PR 6917440: ONNX Runtime update from GitHub master Just RI. Related work items: #38034064	2022-02-04 10:13:38 +00:00
Ye Wang	0d09dd5d20	Support fusion for TNLR based model (#10432 ) * support tnlr based offensive V4 model * Update onnx_model_tnlr.py Co-authored-by: Ubuntu <wy@linux-v100.aidmrjtolptuzevavgwhrapqcd.jx.internal.cloudapp.net>	2022-02-03 23:59:05 -08:00
Changming Sun	4f13c8ac39	Update orttraining-linux-ci-pipeline.yml (#10462 )	2022-02-03 13:46:16 -08:00
Maxiwell S. Garcia	6bbf016dc4	cmake: disable 'attributes' error to fix the build with GCC < 9.x This patch fixes the error "requested alignment X is larger than Y" in older GCC's https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89357	2022-02-03 13:38:19 -08:00
Ye Wang	bb09acffed	Transformer model CUDA EP align with CPU on corner case (#9889 ) * align with cpu on no input data * review comments and add tests Co-authored-by: Ubuntu <wy@linux-v100.aidmrjtolptuzevavgwhrapqcd.jx.internal.cloudapp.net>	2022-02-03 12:58:49 -08:00
ytaous	63198a6566	[ROCm] BFloat16 support (#10447 ) * bf16 support * bf16 support * UTs * fix build * fix UTs Co-authored-by: root <root@GCRAMDRR1-MI100-087.redmond.corp.microsoft.com>	2022-02-03 11:31:14 -08:00
zhangyaobit	239c6ad3f0	Support specifying an execution provider in benchmark script (#10453 ) * Support specifying execution providers. * Change default provider setting to None. * Add support for bert_perf_test script. * Fall back to ROCM/CUDA EP for MIGraphX/Tensorrt EP. * Assert fall back EPs are included. * Add model class AutoModelForCausalLM and other minor updates. Co-authored-by: Yao Zhang <zhanyao@microsoft.com>	2022-02-02 19:11:31 -08:00
Yi-Hong Lyu	a405658370	Fuse Clip->Q to Q (#10434 ) * Fuse Clip->Q to Q * Remove unused variable argmax_node * Remove braces around scalar initializer * Move GetClipConstantMinMax under ORT_MINIMAL_BUILD * Consider epsilon so we can fuse more cases	2022-02-02 18:29:30 -08:00
Rachel Guo	97b8f6f394	Add logic to NNAPI EP to exclude pre-processing involving dynamic shapes when partitioning (#10452 ) * wip * wip * wip * save * address pr comments * address pr comments Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>	2022-02-02 15:54:19 -08:00
Sunghoon	6076a262dc	upgrade react-native packages to latest (#10454 )	2022-02-02 15:19:40 -08:00
Viswanath Boga	ad9d2e2e89	Prefix match in first iteration of beam search OP (#10231 ) * Add BeamSearch op schema * Add ONNX conversion for beams search * remove attention_mask and change input order * add option to run baseline * add check data type NULL * applies VerifyNodeAndOpMatch to subgraph * update input_ids shape * Add node name for Cast node * expose API for topk * parse parameters * Add beam search scorer * output results * fix typo * use c++ template and format python * fix build pipeline errors * symbolic shape infer of input onnx * output scores * add kernel def hash * Handle vocab_mask; move CheckSubgraph * undo insert_cast_transformer.cc and fusion_utils.py * fix typo * fix merge * update doc * add repetition penalty * refactoring: add GptSubgraph class * move BeamSearchState from .h to .cc file * adjust logits processor order * add batch generation example * fix repetition penalty for dup words in sequence * Add test * Add no repeat ngram processor * refactoring: move logits processor to classes * fix build warning * show latency * use allocator in beam state * use allocator in sequences * fix build error * move next_positions to beam state * Changes for prefix matching * removing debugs * removing more debugs * clean up * clean up * cpu doc updated * Updated docs * updated prefix_vocab_mask dimension in convert script * changes to support bxs prefix_vocab_mask in beamsearchop kernel * doc update * OperatorKernels.md updated * matching docs from artifacts * minor change in logits processor * Addressing comments * Updated the prefix vocab mask usage properly Co-authored-by: Tianlei Wu <tlwu@microsoft.com>	2022-02-03 00:14:39 +05:30
Yufeng Li	1aa0789691	add qdq support for QGemm (#10414 ) * add qgemm in quantization tool * add qdq support for QGemm * fix build break * fix OperatorKernels.md	2022-02-02 10:35:29 -08:00
Guoyu Wang	7318361645	[NNAPI QDQ] Add QDQ Resize support (#10442 ) * Add NNAPI support of QDQ Resize * minor update to UT * fix build break * fix android UT failure * address cr comments	2022-02-01 18:14:58 -08:00
Dmitri Smirnov	91b8ad5ee7	Allow users to bind arbitrary memory using raw pointers (#10428 ) Add binding external allocation Add negative tests Add missing return status check	2022-02-01 18:09:24 -08:00
Weixing Zhang	3c96760192	support rocm/migraphx EP in perftest tool (#10449 ) Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2022-02-01 16:12:01 -08:00
Shucai Xiao	062129a5c4	Update rocm_ep and migraphx_ep to rocm4.5.2 and fix dockerfiles to build docker images correctly (#10445 ) * fix build errors for the migraphx and rocm dockerfile * add the numpy package in the migraphx and rocm dockerfile	2022-02-01 16:11:39 -08:00
Olivia Jain	a1d9a71b8b	Improve Perf System (#10404 ) * move table names to one location * remove session metadata * reload trt inputs * fix posting names * Update linux-gpu-tensorrt-daily-perf-pipeline.yml for Azure Pipelines * remove comments * Split up anubis job and perf run * add trt environ variables * No embedded links	2022-02-01 16:01:34 -08:00
Chi Lo	a7c67860a5	Reduce test time for TensorRT EP CI (#10408 ) * expand model tests name * skip cpu/cuda for trt when running onnxruntime_test_all * only run trt ep for c++ unit test * Update CMAKE_CUDA_ARCHITECTURES for T4 * Use new t4 agent pool * Update YAML for run T4 on Windows * revert code * Update CMAKE_CUDA_ARCHITECTURES * fix wrong value * Remove cpu/cuda directly in model tests * add only CMAKE_CUDA_ARCHITECTURES=75 * remove expanding model test name to see difference * revert code * Add fallback execution provider for unit test * Add fallback execution provider for unit test (cont) * add conditional to add fackback cuda ep * Reduction op takes much longer time for TRT 8.2, so we test smaller range of inputs * use M60 * revert code * revert code * add comments * Modify code and add comment * modify comment * update comment * add comment	2022-02-01 15:56:33 -08:00
Yi-Hong Lyu	ef7b4dc05c	Add test quantization of ArgMax for TensorRT (#10325 ) Make sure quantize_statict would insert DQ -> Q before ArgMax.	2022-01-31 16:22:16 -08:00
Guoyu Wang	68262cce86	[NNAPI QDQ] Add QDQ Conv support (#10418 ) * Add qdq conv to NNAPI * fix build warning * addressed CR comments * fix a minor bug in my previous merge	2022-01-31 14:36:31 -08:00
Edward Chen	c43c1691ad	Enable transpose optimizer in minimal extended build (#10349 ) Enable transpose optimizer and infrastructure it depends on in a minimal extended build.	2022-01-31 09:41:04 -08:00
Scott McKay	baa1767922	Allow for an optional subgraph input to have no type info. (#10379 ) Add a test for a missing optional input to Loop.	2022-01-30 08:10:13 +10:00
ytaous	85cbe8367e	[ROCm] BFloat16 support (#10416 ) * reducesum bf16 support * bf16 for add/sub/mul/div * fix build * bf16 for Cast * bf16 for softmax Co-authored-by: root <root@GCRAMDRR1-MI100-087.redmond.corp.microsoft.com>	2022-01-28 22:43:27 -08:00
Dwayne Robinson	b02f4ece5e	Remove cbegin and cend calls which do not exist in std::span or gsl::span (#10426 )	2022-01-28 14:25:12 -08:00
Guoyu Wang	5f0ba31890	Remove coremltools submodule security vulnerability and copy the coreml model schema (#10424 ) * remove coremltools submodule * update cgmanifest * Copy proto files directly from coremltools	2022-01-28 12:48:48 -08:00
Chen Fu	c4f1dfcfaa	Cfu s8s8 (#10413 ) Adding S8S8 kernels for symmetric quantized indirect conv and depthwise conv. Perf number with single thread: Nokia G10 (baseline / new) in ms Pixel 4 (baseline/new) in ms mobilenet_edgetpu 220 / 213 18.5 / 17.6 cartoongan 8537 / 8521 967 / 928 Co-authored-by: Chen Fu <fuchen@microsoft.com>	2022-01-28 09:26:52 -08:00
Nat Kershaw (MSFT)	1a2925acce	Add sympy package as a dependency (#10406 )	2022-01-28 09:19:08 -08:00
Sheil Kumar	2dd5e75ba8	Incorrect output after GPU to GPU inference via VideoFrame and Gray8 models (#10425 ) * If the tensor is of gray8 format, we should call the gray8 shader * other check (which resolves to unknown in this case) is incorrectly being compared to constant and not DXGI_FORMAT Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2022-01-28 08:45:57 -08:00
Changming Sun	feae842a7c	Update pytorch-lightning (#10421 )	2022-01-27 21:15:00 -08:00
Changming Sun	b14da94fc1	Exclude CETCOMPAT from Windows ARM build (#10417 )	2022-01-27 17:57:01 -08:00
RandySheriffH	ce081fe655	Fix TopK with NAN on Cuda (#10314 ) * reset MIN for float/double * better logics for float/double comparision for equals	2022-01-27 16:19:55 -08:00
Rachel Guo	ff2057a817	Add sample qdq unit test case for nnapi ep qdq integration (#10358 ) * add sample unit test case and make qdq modeltestubuilder shared * update * address pr comments * modify redundant funcs impl * update * update * address pr comments * update * update * update * fix build breaks * minor update * fix bad_alloc in UT * address pr comments Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net> Co-authored-by: Guoyu Wang <wanggy@outlook.com>	2022-01-27 15:10:41 -08:00
Edward Chen	0e951d7d6b	Add some more documentation for the C/C++ API tensor creation functions. (#10394 )	2022-01-27 13:19:11 -08:00
Xavier Dupré	481b96d32a	STVM, NUPHAR, remove tvm from submodules list, checks pointers are not null. (#10211 ) * STVM, checks pointers are not null. * removes submodules tvm * add missing include(FetchContent) * add target tvm * fix stvm test * extend cgmanifest with dependencies of tvm	2022-01-27 20:31:13 +01:00
Changming Sun	ec4362f8f3	Enable more static analysis warnings and enable the analyzer for training cpu (#10176 )	2022-01-27 11:17:20 -08:00
Edward Chen	66acf50488	Document C/C++ API documentation version info conventions. (#10396 )	2022-01-27 10:20:13 -08:00
Dmitri Smirnov	3367ddc5ba	Add abseil cgmanifest declaration. Update coding standards. (#10374 ) Add abseil cgmanifest declaration. Update coding standards for InlinedContainers Adjust coding guidelines. Add default N calculation for InlinedVector<T, N> for general use. Rename T from InlinedShapeVectorT. Fix Eager build Add LLVM Copyright with modified derived code notice.	2022-01-27 08:32:05 -08:00
ytaous	4d305282da	[ROCm] Enable BFloat16 for Gemm and MatMul Op (#10398 ) * gemm-bf16 * gemm bf16 * gemm bf16 * matmul bf16 * minor style change Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> Co-authored-by: root <root@GCRAMDRR1-MI100-087.redmond.corp.microsoft.com>	2022-01-27 00:09:16 -08:00
dependabot[bot]	5f49f40fa5	Bump log4js from 6.3.0 to 6.4.0 in /js/web Bumps [log4js](https://github.com/log4js-node/log4js-node) from 6.3.0 to 6.4.0. - [Release notes](https://github.com/log4js-node/log4js-node/releases) - [Changelog](https://github.com/log4js-node/log4js-node/blob/master/CHANGELOG.md) - [Commits](https://github.com/log4js-node/log4js-node/compare/v6.3.0...v6.4.0) --- updated-dependencies: - dependency-name: log4js dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2022-01-26 20:51:49 -08:00

7863 commits