onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-04 23:59:56 +00:00

Author	SHA1	Message	Date
Dwayne Robinson	6db6ee5710	Merged PR 6973543: ORT DML EP Opset 13 more complete Extend opset 13 support for: - Split-13 - Squeeze-13 - Unsqueeze-13 - Reshape-13 - QuantizeLinear-13 - DequantizeLinear-13 - ReduceSum-13 - Resize-13 Also: - Rename the file where all the opset versions are stored from "OperatorRegistration.h" to "OperatorVersions.h", which will make it much less confusing in the future when looking given there's another file called "OperatorRegistration.h" that corresponds to "OperatorRegistration.cpp". - Detemplatize many of the OperatorHelper.h constructors, which duplicate multiple instantiations due to the operator helper classes not sharing a common base class, by wrapping them with an adapter. Ideally there would be a common COM base interface that both IMLOperatorKernelCreationContext and IMLOperatorShapeInferenceContext implementation objects would implement, which a wrapper in MLOperatorAuthorHelper.h could QI for. - Fix style formatting issues in OperatorHelper.h (sorry for the noise). ``` Summary: Total=4679, Passed=4355, Failed=0, Blocked=0, Not Run=0, Skipped=324 ``` Corresponding WindowsAI PR: https://microsoft.visualstudio.com/WindowsAI/_git/WindowsAI/pullrequest/6973645 Related work items: #36672908, #36672926	2022-02-18 01:41:07 +00:00
Ryan Lai	4388eaed1b	Merged PR 6937750: Restore history to dmldev. Merge without squash Related work items: #37712737	2022-02-08 23:24:02 +00:00
Ryan Lai	b14944f9f8	Merge commit 'b02f4ece5e4f48f5d303d6be0170c03d60b24efb' into user/rylai/restore_history	2022-02-08 14:58:23 -08:00
Dwayne Robinson	6fd7ba5b7e	Merged PR 6917440: ONNX Runtime update from GitHub master Just RI. Related work items: #38034064	2022-02-04 10:13:38 +00:00
Dwayne Robinson	b02f4ece5e	Remove cbegin and cend calls which do not exist in std::span or gsl::span (#10426 )	2022-01-28 14:25:12 -08:00
Guoyu Wang	5f0ba31890	Remove coremltools submodule security vulnerability and copy the coreml model schema (#10424 ) * remove coremltools submodule * update cgmanifest * Copy proto files directly from coremltools	2022-01-28 12:48:48 -08:00
Chen Fu	c4f1dfcfaa	Cfu s8s8 (#10413 ) Adding S8S8 kernels for symmetric quantized indirect conv and depthwise conv. Perf number with single thread: Nokia G10 (baseline / new) in ms Pixel 4 (baseline/new) in ms mobilenet_edgetpu 220 / 213 18.5 / 17.6 cartoongan 8537 / 8521 967 / 928 Co-authored-by: Chen Fu <fuchen@microsoft.com>	2022-01-28 09:26:52 -08:00
Nat Kershaw (MSFT)	1a2925acce	Add sympy package as a dependency (#10406 )	2022-01-28 09:19:08 -08:00
Sheil Kumar	2dd5e75ba8	Incorrect output after GPU to GPU inference via VideoFrame and Gray8 models (#10425 ) * If the tensor is of gray8 format, we should call the gray8 shader * other check (which resolves to unknown in this case) is incorrectly being compared to constant and not DXGI_FORMAT Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2022-01-28 08:45:57 -08:00
Changming Sun	feae842a7c	Update pytorch-lightning (#10421 )	2022-01-27 21:15:00 -08:00
Changming Sun	b14da94fc1	Exclude CETCOMPAT from Windows ARM build (#10417 )	2022-01-27 17:57:01 -08:00
RandySheriffH	ce081fe655	Fix TopK with NAN on Cuda (#10314 ) * reset MIN for float/double * better logic for float/double comparison for equals	2022-01-27 16:19:55 -08:00
Rachel Guo	ff2057a817	Add sample qdq unit test case for nnapi ep qdq integration (#10358 ) * add sample unit test case and make qdq modeltestubuilder shared * update * address pr comments * modify redundant funcs impl * update * update * address pr comments * update * update * update * fix build breaks * minor update * fix bad_alloc in UT * address pr comments Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net> Co-authored-by: Guoyu Wang <wanggy@outlook.com>	2022-01-27 15:10:41 -08:00
Edward Chen	0e951d7d6b	Add some more documentation for the C/C++ API tensor creation functions. (#10394 )	2022-01-27 13:19:11 -08:00
Xavier Dupré	481b96d32a	STVM, NUPHAR, remove tvm from submodules list, checks pointers are not null. (#10211 ) * STVM, checks pointers are not null. * removes submodules tvm * add missing include(FetchContent) * add target tvm * fix stvm test * extend cgmanifest with dependencies of tvm	2022-01-27 20:31:13 +01:00
Changming Sun	ec4362f8f3	Enable more static analysis warnings and enable the analyzer for training cpu (#10176 )	2022-01-27 11:17:20 -08:00
Edward Chen	66acf50488	Document C/C++ API documentation version info conventions. (#10396 )	2022-01-27 10:20:13 -08:00
Dmitri Smirnov	3367ddc5ba	Add abseil cgmanifest declaration. Update coding standards. (#10374 ) Add abseil cgmanifest declaration. Update coding standards for InlinedContainers Adjust coding guidelines. Add default N calculation for InlinedVector<T, N> for general use. Rename T from InlinedShapeVectorT. Fix Eager build Add LLVM Copyright with modified derived code notice.	2022-01-27 08:32:05 -08:00
ytaous	4d305282da	[ROCm] Enable BFloat16 for Gemm and MatMul Op (#10398 ) * gemm-bf16 * gemm bf16 * gemm bf16 * matmul bf16 * minor style change Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> Co-authored-by: root <root@GCRAMDRR1-MI100-087.redmond.corp.microsoft.com>	2022-01-27 00:09:16 -08:00
dependabot[bot]	5f49f40fa5	Bump log4js from 6.3.0 to 6.4.0 in /js/web Bumps [log4js](https://github.com/log4js-node/log4js-node) from 6.3.0 to 6.4.0. - [Release notes](https://github.com/log4js-node/log4js-node/releases) - [Changelog](https://github.com/log4js-node/log4js-node/blob/master/CHANGELOG.md) - [Commits](https://github.com/log4js-node/log4js-node/compare/v6.3.0...v6.4.0) --- updated-dependencies: - dependency-name: log4js dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2022-01-26 20:51:49 -08:00
Hariharan Seshadri	27a4af6074	Fix some BinSkim defects (#10400 )	2022-01-26 20:22:22 -08:00
Guoyu Wang	c6ef465011	minor fix in node unit change (#10405 )	2022-01-26 16:42:38 -08:00
Weixing Zhang	ea9c8a7cdc	support MIGraphXEP to work with ROCMEP for inference on AMD GPU (#10368 ) Co-authored-by: Weixing Zhang <wezhan@microsoft.com> Support MIGraphXEP to work with ROCMEP for inference on AMD GPU	2022-01-26 15:52:56 -08:00
Chi Lo	389d2db1ce	Make model tests name clear (#10220 ) * add clear test name for model tests * handle remove character * modify for test * Modify for correct test name * Remove test code * add comments * make it only on Linux * change function name * Convert from wchar_t to char	2022-01-26 15:08:27 -08:00
Yulong Wang	847801f5be	[wasm] update emscripten v2.0.34 (#10391 )	2022-01-26 14:46:02 -08:00
ashbhandare	cf13b9dd5e	Symbolic export for numpy_T (#10390 ) * Export numpy_T as onnx transpose * further fixes, test Co-authored-by: Aishwarya Bhandare <aibhanda@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-01-26 14:14:42 -08:00
RandySheriffH	a27503ebe4	use strict mode (#10397 )	2022-01-26 10:27:05 -08:00
Changming Sun	5576e3553d	Remove python 3.6 from our python packaging pipeline (#10395 )	2022-01-26 10:21:57 -08:00
Guoyu Wang	4af116649c	[QDQ] Hookup NNAPI GetCapability/Compile with shared QDQ selectors (#10347 ) * add qdqgroup as input for NodeUnit * minor update * hookup nnapi_ep * minor update * update compiler setting * Add a simple UT * Pipeline change to add build minimal extended with NNAPI for Android * move GetAllNodeUnits to node_unit.h, add UT for NodeUnits, minor updates * minor updates * address CR comments Co-authored-by: gwang0000 <62914304+gwang0000@users.noreply.github.com>	2022-01-25 17:13:46 -08:00
Tang, Cheng	9aa51379c9	[eager mode]: add configuration for ort virtual device count (#10346 ) * add configuration for ort virtual device count * fix build break * fix ci build break Co-authored-by: Cheng Tang <chenta@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-01-25 16:15:54 -08:00
Edward Chen	5eafbb50f9	Fix possible null pointer dereference. (#10373 ) NodeInfo::p_node was used directly but it can be null from here: `2afce4830c/onnxruntime/core/framework/session_state_utils.cc (L381-L382)` Add an additional check that it is not null before use.	2022-01-25 14:48:51 -08:00
sumitsays	e1012a8662	Added OnRunEnd and Sync method in ExecutionProvider (#10362 ) Co-authored-by: Sumit Agarwal <sumitagarwal@microsoft.com>	2022-01-25 13:00:44 -08:00
Edward Chen	df16c605e8	Add "available since" message for C API additions since v1.10.0. (#10348 )	2022-01-25 10:15:34 -08:00
Alexey Gladyshev	a0fe4a7c1c	[TVM EP] Improved usability of TVM EP (#10241 ) * improved usability of TVM EP * moved technical import under a condition related to TVM EP only * Revert "moved technical import under a condition related to TVM EP only" * add conditional _ld_preload.py file extension for TVM EP * improve readability of inserted code	2022-01-25 18:48:08 +01:00
Xavier Dupré	6e95c0316d	Builds onnxruntime + eager mode with the same value for _GLIBCXX_USE_CXX11_ABI as pytorch (#10114 ) * add _GLIBCXX_USE_CXX11_ABI * restrict to eager mode	2022-01-25 11:25:31 +01:00
pallavides	790c3be7e9	Fix Reshape issue when shape size is -1 (#10356 ) * Fix Reshape issue (in_place) when shape size is -1	2022-01-24 19:30:52 -08:00
Edward Chen	4b87d2c172	Fix dockerfiles/Dockerfile.arm32v7 build. (#10360 ) Install CMake, ignore some Eigen warnings.	2022-01-24 19:06:09 -08:00
Chen Fu	df0c819850	fix compilation error due to semantic conflict with another PR (#10370 ) Resolve PR conflicts between: #10289 and #10334 Co-authored-by: Chen Fu <fuchen@microsoft.com>	2022-01-24 16:32:05 -08:00
Chen Fu	2afce4830c	Symmetric QGEMM (#10289 ) Adding code for symmetric quantized matrix multiplication. Used in quantized convolution, achieving significant perf gain. TODO, use Symmetric Quantized GEMM in other operators! TODO address activation buffer overread in custom allocators and tensors supplied by users. DOT kernel perf test: Pixel 5a: Cartoongan 513.539 ms 471.786 ms Efficient 57.5169 ms 56.4174 ms Edgetpu 14.6673 ms 13.5959 ms NEON kernel perf test Pixel 3a Cartoongan 1423.53 ms 1069.92 ms Efficient 114.086 ms 107.968 ms Edgetpu 39.2632 ms 36.9839 ms Co-authored-by: Chen Fu <fuchen@microsoft.com>	2022-01-24 10:49:04 -08:00
Dmitri Smirnov	7e092a7e3f	Reduce number of memory allocations based on a customer profiling case (#10193 ) Add abseil and inlined containers typedefs Introduce TensorShapeVector for shape building. Use gsl::span<const T> to make interfaces accept different types of vector like args. Introduce InlinedShapeVectorT for shape capacity typed instantiations Refactor cuda slice along with provider shared interfaces Refactor Concat, Conv, Pad Build with Conv Einsum and ConvTranspose refactored. Remove TensorShape::GetDimsAsVector() Refactor SliceIterator and SliceIteratorBase Refactor broadcast Refactor Pads for twice as long Remove memory planner intermediate shapes vector Refactor orttraining Fix passing TensorShapeVector to tests Remove abseil copy and submodule, use FetchContent_Declare/Fetch Path with separate command Make RocmAsyncBuffer accept anything convertible to span. Adjust Linux GPU pipeline.	2022-01-24 10:40:46 -08:00
wejoncy	5df15c5644	additional options of NNAPI for ORT_PERF_TOOL (#10351 ) * additional options of NNAPI for ORT_PERF_TOOL * reuse current key '-i' * fix * fix * _MSC_VER won't be defined when build with NDK * fix * fix	2022-01-24 10:17:56 -08:00
PeixuanZuo	3dfadf9031	[FIX] Add condition in amd ci pipeline yaml to stop test in time when onnxruntime build failed (#10335 ) * [FIX] Add condition in amd ci pipeline yaml to stop test in time when onnxruntime build failed.	2022-01-24 15:34:48 +08:00
Jeff Daily	42db893607	Add ThresholdedRelu to ROCm EP. (#9480 ) Sources were already hipified and compiled, but ROCm EP registration was missing.	2022-01-22 13:29:07 -08:00
Edward Chen	6876641c1e	Pin version of post to dashboard scripts' dependencies and update them to work with recent version. (#10353 )	2022-01-21 19:35:58 -08:00
Edward Chen	bfabef081d	Remove unused pipeline orttraining-linux-gpu-perf-test-ci-pipeline.yml and unused send_perf_metrics tool. (#10326 )	2022-01-21 14:31:34 -08:00
Baiju Meswani	141606534c	Add support for FusedAdam to be mathematically equivalent to pytorch/AdamW (#10106 )	2022-01-21 13:37:59 -08:00
Cheng Tang	13e277525c	fix whitelist	2022-01-21 13:30:53 -08:00
Olivia Jain	eee627fde9	Track Session Creation Time (#10281 ) * add back previous changes lost in merge * post session to dashboard * post session creation time to dashboard * fix trt 8 functionality: * add component governance * Remove hardcoded values * Update linux-gpu-tensorrt-daily-perf-pipeline.yml for Azure Pipelines * cleanup errors * post results only once * checkout 8.0 GA * try build 8.0 without building shared lib * add back build_shared_lib, not the problem * add upload_time to table * use identifier to post * Shorten to TRT x.x * shorten commit hash using rev_parse * use shortened commit hash * use nvidia's default TRT_VERSION	2022-01-21 13:20:53 -08:00
Yufeng Li	d2b1424968	fix bugs in cpuid_info (#10334 ) * fix several bugs in cpuid_info	2022-01-20 16:30:18 -08:00
Tang, Cheng	2dcb69685e	support type promotion in binary operators in eager mode (#10285 ) Co-authored-by: Cheng Tang <chenta@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-01-20 10:06:09 -08:00

6283 commits