onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-08 17:17:15 +00:00

Author	SHA1	Message	Date
Edward Chen	ffde44cd09	[iOS Packaging] Add full ORT build iOS package. (#10626 ) Add C/C++ and Objective-C packages with full ORT builds.	2022-02-28 15:39:07 -08:00
Scott McKay	1f6d8248da	Add optional optimizer to remove leftover Q->DQ pairs after all other QDQ processing has completed (#10659 ) Add an optimizer that can remove leftover Q->DQ pairs. Depending on the model this may help with performance and/or improve accuracy. Optional as it could make things worse so user needs to be aware of this and test what works best for their scenario. Enable with SessionOptions config param `session.enable_quant_qdq_cleanup`	2022-03-01 08:05:02 +10:00
Thiago Crepaldi	e788cc2a23	Convert com.microsoft::ATen into org.pytorch.aten::ATen onnx op (#10060 ) Signed-off-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>	2022-02-28 14:14:45 -05:00
Adam Pocock	e47434ea12	[java] Adding the graph description to the exposed model metadata. (#10318 )	2022-02-28 10:05:03 -08:00
harshithapv	037f08f1ff	Fix unsqueeze for opset 13 for ReduceMean Grad (#10668 ) * fix unsqueeze for opset 13 for reducemean grad * fix input for reduce mean	2022-02-28 09:55:52 -08:00
Ryan Hill	eb116595d4	Add ability to customize ORT_CXX_API_THROW (#10688 )	2022-02-28 00:15:10 -08:00
Guoyu Wang	240f31ef6e	fix softplus (#10576 )	2022-02-28 09:27:07 +10:00
Scott McKay	f2ca43fe0d	Enable CoreML in the macos package (#10675 ) * packaging pipeline change * Enable CoreML on macos Co-authored-by: Guoyu Wang <wanggy@outlook.com>	2022-02-28 09:12:37 +10:00
Dmitri Smirnov	b30e0e2283	Remove inline_containers include from tensor_shape (#10682 ) Hide Inlined Hash set and maps guts behind template forward declarations. Currently CUDA 10.2 compiler can not compile abseil but provider interfaces use those types in their signatures. InlinedVector seems to be fine. Introduce core/common/inlined_containers_fwd.h header	2022-02-26 20:07:18 -08:00
Changming Sun	81831201a8	Change C# tests to use C# 5.0 (#10686 ) .NET Core 2.1 has reached end of support on August 21, 2021. Use C# 5.0 instead. Our CI build machines do not have C# 6.0 yet. Later I will do it.	2022-02-26 00:28:30 -08:00
Numfor Tiapo	5fbfca3d58	Add Experimental API for setting model name (#10518 ) * Add experimental API for editing model name * Change EditModelName to 'SetName' * Change API to pass c_string * Update SetName to edit the proto * Test that the model proto gets changed * Remove comments * Skip inbox tests * Use filehelper path Co-authored-by: Numfor Mbiziwo-Tiapo <numform@microsoft.com>	2022-02-25 14:23:49 -08:00
Tianlei Wu	36c3271546	BeamSearch op cuda (#10556 ) Add BeamSearch cuda implementation with support of fp16 GPT-2 subgraph	2022-02-25 13:08:55 -08:00
Dmitri Smirnov	957dccb379	Fix compile (#10667 )	2022-02-25 10:08:30 -08:00
Chen Fu	12c44bfc4e	fix bug: getting current cpu core type (#10630 ) The previously merged pull request #10521 has a bug. It was aimed to detect the current CPU core micro-architecture and select the best-suited kernel. Unfortunately it assumes that a thread can never migrate from one core to another. This change tries to fix that problem. It introduces about 2-5% performance degradation on symmetric quantized matmul. Co-authored-by: Chen Fu <fuchen@microsoft.com>	2022-02-25 08:56:14 -08:00
David Fan	617474e298	Stop gradient edges for aten::argmax (#10650 )	2022-02-24 21:14:53 -08:00
Dmitri Smirnov	2679711bee	Refactor transformers and other code to reduce memory allocation calls (#10523 ) Work on minimizing memory management calls by reducing number of allocations and copies. Replace std::unordered_set with InlinedHashSet and add usage of InlinedVector. Employ std::move() to minimize copying and memory allocations. Remove copying of the const shared data into each of the PropagateCast transformer instances. Move inlined_containers.h header to include/common. Adjust AsSpan implementation for C++ < 17	2022-02-24 16:17:14 -08:00
Scott McKay	0b19a03361	Fix the debug dump of tensor values to output int8 and uint8 values correctly. Without the change they are treated as char/unsigned char by std::cout. (#10658 ) Other changes are from clang-format	2022-02-25 07:25:23 +10:00
Ashwini Khade	b993de9dd4	add reshape handler (#10627 ) * add reshape handler * plus test updates per review * add a check to validate only 1 dim is not equal to 1	2022-02-24 12:32:08 -08:00
Changming Sun	56be66a0ab	Update c-api-cpu.yml: change nuget linux arm64 RID	2022-02-24 11:15:51 -08:00
Tang, Cheng	7660eeef3e	fix ortmodule's output device info when it runs on ort device (#10616 ) Co-authored-by: Cheng Tang <chenta@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-02-24 10:22:55 -08:00
Yufeng Li	446258fa28	fix bug: quantize output of activation op(Relu, Clip) (#10649 )	2022-02-24 09:06:04 -08:00
Alexey Gladyshev	7dc7529ec8	[TVM EP] Integrate tests for TVM EP into public onnxruntime CI (#10505 ) * add support for bool type * add TVM EP support for tests * include TVM EP in python test pool * fix pylint * moved technical imports to a separate file * clean up post build actions & move _ld_preload.py extension to CMake level * add files for include TVM EP into CI * implement custom logger for TVM * replace TVM logging with ONNX RT logging * update link for TVM EP tutorial * clean up TVM EP cmake * add pybind auto enabling for TVM EP * fix blank spaces * code review fixes * replace print with comment * add list of EP without TVM EP * enable onnx tests * disable contrib ops and ml ops * reuse Dockerfile.ubuntu * Move install_tvm_test_dependencies.sh out of Docker context dir, update build definition. Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2022-02-24 16:24:23 +01:00
Scott McKay	ecf064f135	Exclude pdb from nuspec unless it's the winml package. (#10638 )	2022-02-24 14:23:00 +10:00
Thomas Rondahl	e076f3a125	Fix incorrect target name Now updating the arch variable after matching for tar file.	2022-02-23 19:17:34 -08:00
Thomas Rondahl	573e440d35	Fix no ARM64 natives for Linux nuget Change from aarch64 to arm64 for natives in nuget packages.	2022-02-23 19:17:34 -08:00
Yi-Hong Lyu	bd08f11a58	Upsample support NHWC (#10554 ) Implement bilinear interpolation for Upsample (Resize) 4-D input with the outermost and innermost scale (usually channel of NHWC) as 1. Besides, I revert the HandleResize back to the original implementation for TransposeOptimizerTests.TestResize* tests.	2022-02-23 14:27:11 -08:00
Scott McKay	e0d1d6906a	Merge two helpers involving the kernel def hashes into one file (#10609 ) * Merge two helpers involving the kernel def hashes used by ORT format models. Add codeowners entry to ensure updates involving hashes are checked.	2022-02-23 20:46:09 +10:00
Dwayne Robinson	ea7f773a6e	Merge pull request #10619 from microsoft/user/dwayner/DmlDev20220221 Update DirectML EP for ORT 1.11	2022-02-23 01:09:26 -08:00
ytaous	9ba2d9379f	[ROCm] Code sync from CUDA (#10631 ) * code sync * more sync Co-authored-by: root <root@GCRAMDRR1-MI100-087.redmond.corp.microsoft.com>	2022-02-22 22:40:06 -08:00
Scott McKay	8cfa4b1c17	Fix build errors due to changes in warnings that VS 2022 17.1 produces. (#10621 ) Disable warning about padding for abseil-cpp flat_hash_map. Disable some warnings from compiling the test proto. This also required removing a line in CMakeList.txt where we move a level 4 warning to level 3. That ends up later on the command line and overrides the `/wd4800`. Couldn't find a way to handle that nicely. As we compile with `/W4` the value of moving 4800 to level 3 in dev mode is unclear so simplest was to remove that. Open to suggestions if there's a better way.	2022-02-23 07:32:07 +10:00
Dwayne Robinson	7de86d39d3	Build error int to bool	2022-02-21 22:00:48 -08:00
Rachel Guo	d6a8cba273	[NNAPI QDQ] Add nnapi qdq softmax op support (#10591 ) * wip * save * update pr comments * update Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>	2022-02-21 18:00:46 -08:00
Scott McKay	4d3cd2f685	Add helper for optimizing a QDQ format model for usage with ORT. (#10595 ) * Add initial helper for optimizing a QDQ format model for usage with ORT. If a DQ node has multiple consumers it will end up in multiple QDQ node units. This is complicated to handle as each qdq unit could end up being handled by different execution providers. By duplicating the DQ node we simplify this logic. Generally the duplicate nodes will disappear when the qdq node unit is converted to a single node with a quantized operator. If there are qdq node units that are not able to be converted to use a quantized operator the ORT cleanup (pending) to drop remaining Q->DQ pairs between fp32 nodes can remove any remaining DQ nodes. * Fix pep8 warning Co-authored-by: Guoyu Wang <wanggy@outlook.com>	2022-02-21 09:26:19 +10:00
Ryan Hill	4a79ed62b4	Remove extra version of a function in dnnl (#10599 )	2022-02-18 23:29:54 -08:00
Justin D. Harris	742694f679	[python] [orttraining] Add utility to export a graph to compute gradients (#8125 )	2022-02-18 14:00:49 -08:00
Xavier Dupré	6f0640a57f	Optimize ReduceSum, ReduceMean, ReduceMin, ReduceMax (#10280 ) * Optimize ReduceSum, ReduceMean, ReduceMin, ReduceMax * improve reducemax, reducemin * faster, smaller * replace std::vector by gsl::span for shapes * fix merging issues	2022-02-18 12:51:01 +01:00
Scott McKay	df841ee87d	Fix incorrect type constraint registration for operator kernels. (#10489 ) * Fix incorrect type constraint registration for RoiAlign. This led to the input type not actually being checked when matching a kernel as the invalid constraint name is treated as a missing optional input. * fix missing dependency for the unit test exe. Whilst it doesn't link against the CUDA providers lib, without the dependency VS doesn't know it needs to rebuild the library if there are changes. * Add check for invalid type constraints. * Fix invalid registrations for other kernels. * Add hash replacement logic to provide backwards compatibility in ORT format models when the registration is fixed. * Add tests	2022-02-18 16:55:32 +10:00
Yulong Wang	893ee65e54	[js/web] fix lint error when run without ort-web TS types (#10429 ) * [js/web] fix lint error when run without ort-web TS types * update CI to run linter before 'npm ci' in /js/web	2022-02-17 22:34:38 -08:00
Dwayne Robinson	6db6ee5710	Merged PR 6973543: ORT DML EP Opset 13 more complete Extend opset 13 support for: - Split-13 - Squeeze-13 - Unsqueeze-13 - Reshape-13 - QuantizeLinear-13 - DequantizeLinear-13 - ReduceSum-13 - Resize-13 Also: - Rename the file where all the opset versions are stored from "OperatorRegistration.h" to "OperatorVersions.h", which will make it much less confusing in the future, given there's another file called "OperatorRegistration.h" that corresponds to "OperatorRegistration.cpp". - Detemplatize many of the OperatorHelper.h constructors, which duplicate multiple instantiations due to the operator helper classes not sharing a common base class, by wrapping them with an adapter. Ideally there would be a common COM base interface that both IMLOperatorKernelCreationContext and IMLOperatorShapeInferenceContext implementation objects would implement, which a wrapper in MLOperatorAuthorHelper.h could QI for. - Fix style formatting issues in OperatorHelper.h (sorry for the noise). ``` Summary: Total=4679, Passed=4355, Failed=0, Blocked=0, Not Run=0, Skipped=324 ``` Corresponding WindowsAI PR: https://microsoft.visualstudio.com/WindowsAI/_git/WindowsAI/pullrequest/6973645 Related work items: #36672908, #36672926	2022-02-18 01:41:07 +00:00
Sunghoon	1af4c170ef	[js/react_native] publish onnxruntime-common npm package as web and node do (#10566 ) * apply the same policy for onnxruntime-common as web and node * Update mac-react-native-ci-pipeline.yml for Azure Pipelines * Update mac-react-native-ci-pipeline.yml for Azure Pipelines * Update mac-react-native-ci-pipeline.yml for Azure Pipelines * remove old comment	2022-02-17 15:25:27 -08:00
RandySheriffH	e056fbaa51	Add restrictions for hybrid cpus for thread pool task distribution (#10393 ) * add restrictions for hybrid cpus * add unit test to mock hybrid cpu * attach hybrid flag * add mocking interface to CpuInfo * make is_hybrid * make mock function const * add force_hybrid for thread pool * remove header	2022-02-17 14:34:09 -08:00
Jingqiao Fu	2fa333443a	Add telemetry for device kind (#10431 ) Add telemetry for device kind	2022-02-17 13:56:22 -08:00
Scott McKay	2ca9566994	Add range of helpers for making usage of ORT Mobile easier. (#10458 ) * Add range of helpers for making usage of ORT Mobile easier.	2022-02-18 07:35:25 +10:00
Chi Lo	fad590a059	Enhance TRT EP unit tests (#10493 ) * Re-write tensorrt ep cache test * refactor the code * refactor * move stdc++fs flag to CMakeLists.txt	2022-02-17 10:30:03 -08:00
Xavier Dupré	edbc844032	Fix misspelling in python documentation (#10588 )	2022-02-17 18:10:21 +01:00
zhangyaobit	fd16085cea	Zhanyao/attention (#10545 ) * Enable Attention op for ROCM EP. As a note, potential hipify improvements: (1) handle math constants (attention_softmax.h), (2) correctly generate transpose options for the GEMM helpers, consider counterpart/dummy API for CublasMathModeSetter (attention_impl.cu, attention_impl.cu). After these improvements, we don't need to manually keep copies of the above mentioned files any more. * Clean up debugging code.	2022-02-17 09:02:45 -08:00
leqiao-1	8d06e5a9df	Add openvino base image option (#10581 ) * add selectable python package build pipeline * update tensorrt version * update tensorrt version * Update Dockerfile.ubuntu_openvino * Update install_ubuntu.sh * add parameters for openvino base image * fix syntax error	2022-02-17 17:10:01 +08:00
Pallavi Deshmukh	ccd7a2d840	Fix build failure when using clang compiler	2022-02-16 17:52:45 -08:00
Changming Sun	09ac7595fc	update (#10573 ) Move FuncMgr up the class so it is destroyed later	2022-02-16 17:43:29 -08:00
ytaous	4f76c38686	Revert "Reduce max gradient (#9859 )" (#10574 ) This reverts commit `7443edb0bf`.	2022-02-16 16:02:30 -08:00

6394 commits