onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-09 17:28:58 +00:00

Author	SHA1	Message	Date
Ye Wang	bb09acffed	Transformer model CUDA EP align with CPU on corner case (#9889 ) * align with cpu on no input data * review comments and add tests Co-authored-by: Ubuntu <wy@linux-v100.aidmrjtolptuzevavgwhrapqcd.jx.internal.cloudapp.net>	2022-02-03 12:58:49 -08:00
ytaous	63198a6566	[ROCm] BFloat16 support (#10447 ) * bf16 support * bf16 support * UTs * fix build * fix UTs Co-authored-by: root <root@GCRAMDRR1-MI100-087.redmond.corp.microsoft.com>	2022-02-03 11:31:14 -08:00
zhangyaobit	239c6ad3f0	Support specifying an execution provider in benchmark script (#10453 ) * Support specifying execution providers. * Change default provider setting to None. * Add support for bert_perf_test script. * Fall back to ROCM/CUDA EP for MIGraphX/Tensorrt EP. * Assert fall back EPs are included. * Add model class AutoModelForCausalLM and other minor updates. Co-authored-by: Yao Zhang <zhanyao@microsoft.com>	2022-02-02 19:11:31 -08:00
Yi-Hong Lyu	a405658370	Fuse Clip->Q to Q (#10434 ) * Fuse Clip->Q to Q * Remove unused variable argmax_node * Remove braces around scalar initializer * Move GetClipConstantMinMax under ORT_MINIMAL_BUILD * Consider epsilon so we can fuse more cases	2022-02-02 18:29:30 -08:00
Rachel Guo	97b8f6f394	Add logic to NNAPI EP to exclude pre-processing involving dynamic shapes when partitioning (#10452 ) * wip * wip * wip * save * address pr comments * address pr comments Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>	2022-02-02 15:54:19 -08:00
Sunghoon	6076a262dc	upgrade react-native packages to latest (#10454 )	2022-02-02 15:19:40 -08:00
Viswanath Boga	ad9d2e2e89	Prefix match in first iteration of beam search OP (#10231 ) * Add BeamSearch op schema * Add ONNX conversion for beams search * remove attention_mask and change input order * add option to run baseline * add check data type NULL * applies VerifyNodeAndOpMatch to subgraph * update input_ids shape * Add node name for Cast node * expose API for topk * parse parameters * Add beam search scorer * output results * fix typo * use c++ template and format python * fix build pipeline errors * symbolic shape infer of input onnx * output scores * add kernel def hash * Handle vocab_mask; move CheckSubgraph * undo insert_cast_transformer.cc and fusion_utils.py * fix typo * fix merge * update doc * add repetition penalty * refactoring: add GptSubgraph class * move BeamSearchState from .h to .cc file * adjust logits processor order * add batch generation example * fix repetition penalty for dup words in sequence * Add test * Add no repeat ngram processor * refactoring: move logits processor to classes * fix build warning * show latency * use allocator in beam state * use allocator in sequences * fix build error * move next_positions to beam state * Changes for prefix matching * removing debugs * removing more debugs * clean up * clean up * cpu doc updated * Updated docs * updated prefix_vocab_mask dimension in convert script * changes to support bxs prefix_vocab_mask in beamsearchop kernel * doc update * OperatorKernels.md updated * matching docs from artifacts * minor change in logits processor * Addressing comments * Updated the prefix vocab mask usage properly Co-authored-by: Tianlei Wu <tlwu@microsoft.com>	2022-02-03 00:14:39 +05:30
Yufeng Li	1aa0789691	add qdq support for QGemm (#10414 ) * add qgemm in quantization tool * add qdq support for QGemm * fix build break * fix OperatorKernels.md	2022-02-02 10:35:29 -08:00
Guoyu Wang	7318361645	[NNAPI QDQ] Add QDQ Resize support (#10442 ) * Add NNAPI support of QDQ Resize * minor update to UT * fix build break * fix android UT failure * address cr comments	2022-02-01 18:14:58 -08:00
Dmitri Smirnov	91b8ad5ee7	Allow users to bind arbitrary memory using raw pointers (#10428 ) Add binding external allocation Add negative tests Add missing return status check	2022-02-01 18:09:24 -08:00
Weixing Zhang	3c96760192	support rocm/migraphx EP in perftest tool (#10449 ) Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2022-02-01 16:12:01 -08:00
Shucai Xiao	062129a5c4	Update rocm_ep and migraphx_ep to rocm4.5.2 and fix dockerfiles to build docker images correctly (#10445 ) * fix build errors for the migraphx and rocm dockerfile * add the numpy package in the migraphx and rocm dockerfile	2022-02-01 16:11:39 -08:00
Olivia Jain	a1d9a71b8b	Improve Perf System (#10404 ) * move table names to one location * remove session metadata * reload trt inputs * fix posting names * Update linux-gpu-tensorrt-daily-perf-pipeline.yml for Azure Pipelines * remove comments * Split up anubis job and perf run * add trt environ variables * No embedded links	2022-02-01 16:01:34 -08:00
Chi Lo	a7c67860a5	Reduce test time for TensorRT EP CI (#10408 ) * expand model tests name * skip cpu/cuda for trt when running onnxruntime_test_all * only run trt ep for c++ unit test * Update CMAKE_CUDA_ARCHITECTURES for T4 * Use new t4 agent pool * Update YAML for run T4 on Windows * revert code * Update CMAKE_CUDA_ARCHITECTURES * fix wrong value * Remove cpu/cuda directly in model tests * add only CMAKE_CUDA_ARCHITECTURES=75 * remove expanding model test name to see difference * revert code * Add fallback execution provider for unit test * Add fallback execution provider for unit test (cont) * add conditional to add fackback cuda ep * Reduction op takes much longer time for TRT 8.2, so we test smaller range of inputs * use M60 * revert code * revert code * add comments * Modify code and add comment * modify comment * update comment * add comment	2022-02-01 15:56:33 -08:00
Yi-Hong Lyu	ef7b4dc05c	Add test quantization of ArgMax for TensorRT (#10325 ) Make sure quantize_statict would insert DQ -> Q before ArgMax.	2022-01-31 16:22:16 -08:00
Guoyu Wang	68262cce86	[NNAPI QDQ] Add QDQ Conv support (#10418 ) * Add qdq conv to NNAPI * fix build warning * addressed CR comments * fix a minor bug in my previous merge	2022-01-31 14:36:31 -08:00
Edward Chen	c43c1691ad	Enable transpose optimizer in minimal extended build (#10349 ) Enable transpose optimizer and infrastructure it depends on in a minimal extended build.	2022-01-31 09:41:04 -08:00
Scott McKay	baa1767922	Allow for an optional subgraph input to have no type info. (#10379 ) Add a test for a missing optional input to Loop.	2022-01-30 08:10:13 +10:00
ytaous	85cbe8367e	[ROCm] BFloat16 support (#10416 ) * reducesum bf16 support * bf16 for add/sub/mul/div * fix build * bf16 for Cast * bf16 for softmax Co-authored-by: root <root@GCRAMDRR1-MI100-087.redmond.corp.microsoft.com>	2022-01-28 22:43:27 -08:00
Dwayne Robinson	b02f4ece5e	Remove cbegin and cend calls which do not exist in std::span or gsl::span (#10426 )	2022-01-28 14:25:12 -08:00
Guoyu Wang	5f0ba31890	Remove coremltools submodule security vulnerability and copy the coreml model schema (#10424 ) * remove coremltools submodule * update cgmanifest * Copy proto files directly from coremltools	2022-01-28 12:48:48 -08:00
Chen Fu	c4f1dfcfaa	Cfu s8s8 (#10413 ) Adding S8S8 kernels for symmetric quantized indirect conv and depthwise conv. Perf number with single thread: Nokia G10 (baseline / new) in ms Pixel 4 (baseline/new) in ms mobilenet_edgetpu 220 / 213 18.5 / 17.6 cartoongan 8537 / 8521 967 / 928 Co-authored-by: Chen Fu <fuchen@microsoft.com>	2022-01-28 09:26:52 -08:00
Nat Kershaw (MSFT)	1a2925acce	Add sympy package as a dependency (#10406 )	2022-01-28 09:19:08 -08:00
Sheil Kumar	2dd5e75ba8	Incorrect output after GPU to GPU inference via VideoFrame and Gray8 models (#10425 ) * If the tensor is of gray8 format, we should call the gray8 shader * other check (which resolves to unknown in this case) is incorrectly being compared to constant and not DXGI_FORMAT Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2022-01-28 08:45:57 -08:00
Changming Sun	feae842a7c	Update pytorch-lightning (#10421 )	2022-01-27 21:15:00 -08:00
Changming Sun	b14da94fc1	Exclude CETCOMPAT from Windows ARM build (#10417 )	2022-01-27 17:57:01 -08:00
RandySheriffH	ce081fe655	Fix TopK with NAN on Cuda (#10314 ) * reset MIN for float/double * better logics for float/double comparision for equals	2022-01-27 16:19:55 -08:00
Rachel Guo	ff2057a817	Add sample qdq unit test case for nnapi ep qdq integration (#10358 ) * add sample unit test case and make qdq modeltestubuilder shared * update * address pr comments * modify redundant funcs impl * update * update * address pr comments * update * update * update * fix build breaks * minor update * fix bad_alloc in UT * address pr comments Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net> Co-authored-by: Guoyu Wang <wanggy@outlook.com>	2022-01-27 15:10:41 -08:00
Edward Chen	0e951d7d6b	Add some more documentation for the C/C++ API tensor creation functions. (#10394 )	2022-01-27 13:19:11 -08:00
Xavier Dupré	481b96d32a	STVM, NUPHAR, remove tvm from submodules list, checks pointers are not null. (#10211 ) * STVM, checks pointers are not null. * removes submodules tvm * add missing include(FetchContent) * add target tvm * fix stvm test * extend cgmanifest with dependencies of tvm	2022-01-27 20:31:13 +01:00
Changming Sun	ec4362f8f3	Enable more static analysis warnings and enable the analyzer for training cpu (#10176 )	2022-01-27 11:17:20 -08:00
Edward Chen	66acf50488	Document C/C++ API documentation version info conventions. (#10396 )	2022-01-27 10:20:13 -08:00
Dmitri Smirnov	3367ddc5ba	Add abseil cgmanifest declaration. Update coding standards. (#10374 ) Add abseil cgmanifest declaration. Update coding standards for InlinedContainers Adjust coding guidelines. Add default N calculation for InlinedVector<T, N> for general use. Rename T from InlinedShapeVectorT. Fix Eager build Add LLVM Copyright with modified derived code notice.	2022-01-27 08:32:05 -08:00
ytaous	4d305282da	[ROCm] Enable BFloat16 for Gemm and MatMul Op (#10398 ) * gemm-bf16 * gemm bf16 * gemm bf16 * matmul bf16 * minor style change Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> Co-authored-by: root <root@GCRAMDRR1-MI100-087.redmond.corp.microsoft.com>	2022-01-27 00:09:16 -08:00
dependabot[bot]	5f49f40fa5	Bump log4js from 6.3.0 to 6.4.0 in /js/web Bumps [log4js](https://github.com/log4js-node/log4js-node) from 6.3.0 to 6.4.0. - [Release notes](https://github.com/log4js-node/log4js-node/releases) - [Changelog](https://github.com/log4js-node/log4js-node/blob/master/CHANGELOG.md) - [Commits](https://github.com/log4js-node/log4js-node/compare/v6.3.0...v6.4.0) --- updated-dependencies: - dependency-name: log4js dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2022-01-26 20:51:49 -08:00
Hariharan Seshadri	27a4af6074	Fix some BinSkim defects (#10400 )	2022-01-26 20:22:22 -08:00
Guoyu Wang	c6ef465011	minor fix in node unit change (#10405 )	2022-01-26 16:42:38 -08:00
Weixing Zhang	ea9c8a7cdc	support MIGraphXEP to work with ROCMEP for inference on AMD GPU (#10368 ) Co-authored-by: Weixing Zhang <wezhan@microsoft.com> Support MIGraphXEP to work with ROCMEP for inference on AMD GPU	2022-01-26 15:52:56 -08:00
Chi Lo	389d2db1ce	Make model tests name clear (#10220 ) * add clear test name for model tests * handle remove character * modify for test * Modify for correct test name * Remove test code * add comments * make it only on Linux * change function name * Convert from wchar_t to char	2022-01-26 15:08:27 -08:00
Yulong Wang	847801f5be	[wasm] update emscripten v2.0.34 (#10391 )	2022-01-26 14:46:02 -08:00
ashbhandare	cf13b9dd5e	Symbolic export for numpy_T (#10390 ) * Export numpy_T as onnx transpose * further fixes, test Co-authored-by: Aishwarya Bhandare <aibhanda@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-01-26 14:14:42 -08:00
RandySheriffH	a27503ebe4	use strict mode (#10397 )	2022-01-26 10:27:05 -08:00
Changming Sun	5576e3553d	Remove python 3.6 from our python packaging pipeline (#10395 )	2022-01-26 10:21:57 -08:00
Guoyu Wang	4af116649c	[QDQ] Hookup NNAPI GetCapability/Compile with shared QDQ selectors (#10347 ) * add qdqgroup as input for NodeUnit * minor update * hookup nnapi_ep * minor update * update compiler setting * Add a simple UT * Pipeline change to add build minimal extended with NNAPI for Android * move GetAllNodeUnits to node_unit.h, add UT for NodeUnits, minor updates * minor updates * address CR comments Co-authored-by: gwang0000 <62914304+gwang0000@users.noreply.github.com>	2022-01-25 17:13:46 -08:00
Tang, Cheng	9aa51379c9	[eager mode]: add configuration for ort virtual device count (#10346 ) * add configuration for ort virtual device count * fix build break * fix ci build break Co-authored-by: Cheng Tang <chenta@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-01-25 16:15:54 -08:00
Edward Chen	5eafbb50f9	Fix possible null pointer dereference. (#10373 ) NodeInfo::p_node was used directly but it can be null from here: `2afce4830c/onnxruntime/core/framework/session_state_utils.cc (L381-L382)` Add an additional check that it is not null before use.	2022-01-25 14:48:51 -08:00
sumitsays	e1012a8662	Added OnRunEnd and Sync method in ExecutionProvider (#10362 ) Co-authored-by: Sumit Agarwal <sumitagarwal@microsoft.com>	2022-01-25 13:00:44 -08:00
Edward Chen	df16c605e8	Add "available since" message for C API additions since v1.10.0. (#10348 )	2022-01-25 10:15:34 -08:00
Alexey Gladyshev	a0fe4a7c1c	[TVM EP] Improved usability of TVM EP (#10241 ) * improved usability of TVM EP * moved technical import under a condition related to TVM EP only * Revert "moved technical import under a condition related to TVM EP only" * add conditional _ld_preload.py file extension for TVM EP * improve readability of inserted code	2022-01-25 18:48:08 +01:00
Xavier Dupré	6e95c0316d	Builds onnxruntime + eager mode with the same value for _GLIBCXX_USE_CXX11_ABI as pytorch (#10114 ) * add _GLIBCXX_USE_CXX11_ABI * restrict to eager mode	2022-01-25 11:25:31 +01:00

6286 commits