onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-04 23:59:56 +00:00

Author	SHA1	Message	Date
Vincent Wang	5ecfaef042	ATen Fallback for Inference (#11597 ) * aten op for inference * fix build error * move some code to training only * remove domain from operator name * move aten_op_executor ext out from ortmodule * add pipeline * add exec mode * fix script * fix ut script * fix test pipeline * failure test * rollback * bugfix * resolve comments * enable aten for python build only * fix win build * use target_compile_definitions * support io binding * turn off aten by default * fix ut Co-authored-by: Vincent Wang <weicwang@microsoft.com> Co-authored-by: zhijxu <zhijxu@microsoft.com>	2022-06-09 16:07:30 +08:00
Scott McKay	927bac0f86	Rework allocator sharing to work for multiple devices. (#11700 ) * Rework allocator sharing to work for multiple devices. * Update SessionState to not use allocator name in matching for consistency with IExecutionProvider. The name doesn't have any clear meaning (e.g. we use the same name for the per-thread allocator in the CUDA EP as the shared allocate there and in the TRT EP). * NOTE: this means we will have one allocator per OrtMemType+OrtDevice. * Reverse order when doing allocator setup in SessionState. This will result in the CPU and CUDA EPs allocators being preferred (they are the most configurable), and also means the per-thread CUDA allocator for default GPU memory will be used even when TRT is enabled. * NOTE: Combined with the change to remove the allocator name from the key this will mean that if CUDA and TRT or ROCM and MIGraphX are both enabled the CUDA/ROCM per-thread allocator will be used to allocate GPU memory. * Use InsertAllocator instead of TryInsertAllocator. Each EP should be registered once, and we should only enter RegisterAllocator once, so the 'try' should not be required and would indicate an unexpected setup was involved. i.e. better to fail and figure out if we need to support that setup. * Add some clarifying comments around how replace allocator works. * Add unit testing for setup where EP has local allocator that may get out of sync with values in the IExecutionProvider base class. * Fix invalid check of whether data is on CPU to use device info instead of allocator name.	2022-06-09 17:38:38 +10:00
Alex Fuller	8156b9370c	[Abseil] Adding URL_HASH so that an existing archive can be used from disk (#11690 )	2022-06-08 17:12:59 -07:00
Justin Chu	913100885b	Remove the redundant black check in CI (#11790 ) We have two black checks in CI for different scopes (PR, full repo). Now that the repo level black check is required, we can remove the PR level check.	2022-06-08 16:58:43 -07:00
Gary Miguel	79db92f8fe	clang-format signal_defs.cc (#11767 )	2022-06-08 15:45:40 -07:00
dependabot[bot]	750cb42f87	Bump protobufjs from 6.10.2 to 6.11.3 in /js/node (#11722 ) Bumps [protobufjs](https://github.com/protobufjs/protobuf.js) from 6.10.2 to 6.11.3. - [Release notes](https://github.com/protobufjs/protobuf.js/releases) - [Changelog](https://github.com/protobufjs/protobuf.js/blob/v6.11.3/CHANGELOG.md) - [Commits](https://github.com/protobufjs/protobuf.js/compare/v6.10.2...v6.11.3) --- updated-dependencies: - dependency-name: protobufjs dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-08 11:17:56 -07:00
dependabot[bot]	bc4c771078	Bump protobufjs from 6.10.2 to 6.11.3 in /js/web (#11723 ) Bumps [protobufjs](https://github.com/protobufjs/protobuf.js) from 6.10.2 to 6.11.3. - [Release notes](https://github.com/protobufjs/protobuf.js/releases) - [Changelog](https://github.com/protobufjs/protobuf.js/blob/v6.11.3/CHANGELOG.md) - [Commits](https://github.com/protobufjs/protobuf.js/compare/v6.10.2...v6.11.3) --- updated-dependencies: - dependency-name: protobufjs dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-08 11:17:30 -07:00
Changming Sun	eeeb249a27	Update onnxruntime_providers.cmake to remove the reference of "onnxruntime_tvm_dependencies" (#11780 )	2022-06-08 09:06:00 -07:00
Alexey Gladyshev	331c387f4a	[TVM EP][DOC] Documentation update for TVM EP due to the addition of precompiled model support. (#11743 ) * update description of TVM EP options in docs * update sample notebook * update TVM EP documentation * add link to description of options Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>	2022-06-08 14:56:01 +02:00
Yi Zhang	7f8d0ba824	Update comments in Android workflow (#11311 ) * keep comments change only	2022-06-08 15:25:21 +08:00
Yufeng Li	f6f457aa57	not remove relu/clip for symmetric activation (#11696 ) * not remove relu/clip for symmetric activation	2022-06-07 18:02:31 -07:00
GPhilo	40f4304c7d	[Fix #11447 ] Use correct type for tensor shape vectors (#11448 ) * [Fix] Use correct type for tensor shape vectors * Replacing std::vector with absl::InlinedVector * Remove explicit use of absl:: namespace; Add back explicit size in constructors. * Remove explicit size for InlinedVector	2022-06-07 09:06:32 -07:00
Yi Zhang	b4f1e769c0	Add Mac Silicon/M1 Wheel (#11591 )	2022-06-07 08:58:20 -07:00
Yulong Wang	40d2c98e4d	[js/web] fix ORT Web dependency version mismatch	2022-06-06 23:41:40 -07:00
leqiao-1	8fb38e8a54	fix cmake warning (#11742 )	2022-06-07 09:37:16 +08:00
dependabot[bot]	9e33bfd29b	Bump simple-plist from 1.3.0 to 1.3.1 in /js/react_native/e2e (#11712 ) Bumps [simple-plist](https://github.com/wollardj/simple-plist) from 1.3.0 to 1.3.1. - [Release notes](https://github.com/wollardj/simple-plist/releases) - [Commits](https://github.com/wollardj/simple-plist/compare/v1.3.0...v1.3.1) --- updated-dependencies: - dependency-name: simple-plist dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-06 16:54:36 -07:00
PeixuanZuo	908e19dc16	[FIX] using torch.version.cuda/hip to ensure build ORTModule Torch C++ CUDA extension for docker build (#11675 ) * [FIX] cpp ext * Update orttraining/orttraining/python/training/ortmodule/torch_cpp_extensions/install.py Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com> * [FIX] fix python format Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>	2022-06-07 07:51:26 +08:00
George Nash	981d45d8d5	Add binary comparators to the OneDNN (dnnl) execution provider (#11641 ) * Added Bool output support by using u8 datatype Signed-off-by: George Nash <george.nash@intel.com> * Add Equal, Greater, GreaterOrEqual, Less, and LessOrEqual Operators Signed-off-by: George Nash <george.nash@intel.com> Co-authored-by: Erick Munoz Alvarado <erick.munoz.alvarado@intel.com>	2022-06-06 09:15:42 -07:00
Valery Chernov	4296968f20	[TVM EP] update set input method for VirtualMachine (#11674 ) * update TVM * get alignment constant from TVM * update TVM_VM_SetInputs to upstream with TVM API * fix CI issue: update TVM EP dependencies * add sudo * revert changes needed to install missing package * add package for TVM EP CI Co-authored-by: Valery Chernov <valery.chernov@deelvin.com> Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>	2022-06-04 09:31:01 +02:00
Changming Sun	d5e34acb82	Remove git and python packages from the docker images used by Zip-Nuget-Java-Nodejs Packaging Pipeline (#11651 )	2022-06-03 20:00:54 -07:00
Changming Sun	3c1dd9514d	Revert "fixed point based requantization on arm64 (#11540 )" (#11732 ) This reverts commit `1f2c926`. Because it makes our packaging pipeline crash Error message: [ RUN ] QLinearConvTest.Conv3D_S8S8_Depthwise Test #1: onnxruntime_test_all ...................Subprocess killed***Exception: 838.24 sec We haven't successfully reproduced the bug on a real ARM64 hardware. Currently we only saw it showed up with qemu. More investigations are on-going.	2022-06-03 19:12:25 -07:00
Scott McKay	ef64b2ee52	Fix clash between QDQ propagation and TransposeOptimizer (#11636 ) * Initial changes with comments on potential unit test changes. * Update tests to disable TransposeOptimizer as that's simpler. Add some extra comments. Cleanup. * Update comments in TransformGraph * Add regression test. Add limitation that transpose optimizer will ignore assigned nodes that do not match the context EP if that is set. * Fix test. I removed a trailing Transpose after initial validation to simplify but that changed things so that the transpose optimizer didn't kick in, and the DQ -> Transpose -> Q was actually converted to a single Transpose by the CPU EP QDQ handling. Same end result in most builds so the subtle difference wasn't noticed, but in a build without contrib ops the CPU EP QDQ handling is disabled so the end result was different. Update the test to re-instate the trailing Transpose so transpose optimizer alters the graph as desired. * Don't run level 1 optimizers after partitioning as they don't guarantee to handle EP assignment for new nodes they create.	2022-06-03 16:16:35 -07:00
Hector Li	95a16c1ffe	Snpe ep (#11665 ) * Initiate Ort SNPE EP * fix snpe ep windows build which is caused by the utility method (ToUTF8String) name change on master * correct the source path for libonnxruntime.so while building for android package * add AdditionalDependencies for arm64 * On MS-Windows, the patchfile must be a text file, i.e. CR-LF must be used as line endings. A file with LF may give the error: "Assertion failed, hunk, file patch.c, line 343," unless the option '--binary' is given. * fix build failure if snpe is not enabled * update doc for contrib op * separate out snpe ep settings to onnxruntime_snpe_provider.cmake * renaming according review comments * update according review comments	2022-06-03 14:10:02 -07:00
G. Ramalingam	98960c53fe	Replace ORT's function shape inference with ONNX's (#11538 ) * Function inlining tests * Replace ORT copy of function shape inference * Remove std::move * Fix memory leak Signed-off-by: Ganesan Ramalingam <grama@microsoft.com> * Address feedback	2022-06-03 12:54:28 -07:00
Changming Sun	ec05313cd9	Split the GPU pipeline to 3 different machine pools (#11724 )	2022-06-03 10:57:32 -07:00
Scott McKay	4445dd6bc1	XNNPACK EP (#11445 ) * Implement XNNPACK support via an EP. * Layout transform uses the GraphPartitioner infrastructure. * Node fusion is supported. * Conv and MaxPool implementations were ported from Changming's PR. * Added optional mutex in InferenceSession::Run as we only want to allow sequential calls if xnnpack is enabled	2022-06-03 20:22:34 +10:00
ytaous	ce4ac6d328	Optimizer - add missing supported version for BiasSoftmaxFusion (#11616 ) * add missing version * opset check * fix format * reject fusion if type not allowed * per comments * trigger new build Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-06-02 23:23:51 -07:00
Valery Chernov	196cd7aed1	Pylint fix after #11647 (#11704 )	2022-06-02 17:14:20 -07:00
Yufeng Li	1f2c92673b	fixed point based requantization on arm64 (#11540 ) * fixed point based requantization on arm64 * reverse MlasConvSymDepthwiseKernel u8s8 and s8s8 order	2022-06-02 12:34:17 -07:00
Changming Sun	453c57f92f	Revert "Directly use memory mapped data for external data initializers (#11127 )" (#11650 ) This reverts commit `541eff8d89`, because it broke CUDA EP. See: https://github.com/microsoft/onnxruntime/issues/11511 for more details.	2022-06-01 14:46:13 -07:00
Vincent Wang	54d1573d2f	[ORTModule] Enable SimplifiedLayerNormalization Fusion (#11580 ) * enable SimplifiedLayerNormalization fuse * remove allow_layer_norm_mod_precision flag	2022-06-01 15:09:39 +08:00
Chih-Hsuan Yen	03abcb0640	Correctly unpack tensor values (#11639 ) This change fixes two issues: * protobuf 3.20 incompatibility * Potential incorrect results on big-endian machines	2022-06-01 00:03:11 -07:00
leqiao-1	2ac3649752	Update requirements.txt (#11682 ) set protobuf version	2022-06-01 12:31:21 +08:00
Yufeng Li	f437945926	fix output shape of ReduceMin/ReduceMax in calibration tool (#11647 )	2022-05-31 14:26:08 -07:00
Yulong Wang	004560f1fe	[js/rn] upgrade dependency packages' version (#11586 ) * [js/rn] upgrade dependency packages' version * clean up yarn.lock	2022-05-31 13:54:17 -07:00
Gary Miguel	74bc4c07f6	Fix C# and numbering (#11643 ) * C# protocol buffer code can be updated on Linux. Link to the relevant instructions. * Fix numbering.	2022-05-31 11:33:36 -07:00
Yi-Hong Lyu	3076061aca	Fix a comment in test_op_reshape.py (#11606 )	2022-05-30 12:04:40 -07:00
Sheil Kumar	22739137c4	Update signal op defs to match onnx17 defs, and add more tests (#11631 )	2022-05-28 16:00:09 -07:00
Scott McKay	4fabc400de	Fix CUDA 11.6 build error on Windows (#11578 ) * Avoid windows header that defines 'small'	2022-05-28 08:04:46 +10:00
Scott McKay	7e6d052275	Add better error message for subgraph output coming directly from outer scope value. (#11638 ) * Add better message for subgraph output coming directly from outer scope value. * Use regex to match value name as the test model is processed in a different order on different platforms.	2022-05-28 08:04:27 +10:00
Gary Miguel	b67c0f639c	Remove filter_mode input from pyflakes GitHub action (#11644 ) Previously it triggered: `Warning: Unexpected input(s) 'filter_mode', valid inputs are ['entryPoint', 'args', 'github_token', 'level', 'reporter']`	2022-05-27 07:59:17 -07:00
pengwa	44f7b1bf2c	MTA AdamWOptimizer (#11506 ) * skeleton change * adam compute kernels * add rtol/atol for tests * some clean up * optional outputs * more clean up * add tests * adamw mode=1 test pass * clean up tests * add HF AdamW test cases * refactor adam test file * make test pass * all test pass, fix comments * rename to adamw * make test pass again * fix cpplint * minor fixes * fix python lint * Fix build and tests * fix builds * fix windows build * fix win build * minor fix * Refine based on comments * resolve comments * formatting * resolve comments * add ut	2022-05-27 19:52:04 +08:00
Vincent Wang	02724c54ff	[CUDA] Implement BitmaskDropout, BitmaskBiasDropout and BitmaskDropoutGrad (#11534 ) * Implement BitmaskDropout and associated unit tests. * Implement BitmaskDropoutGrad and associated unit tests. * Implement Dropout -> BitmaskDropout rewrite rule and associated unit tests. * Implement (Dropout,DropoutGrad) -> (BitmaskDropout,BitmaskDropoutGrad) rewrite rule. This commit does not yet include unit tests for this rewrite rule. This commit also introduces improved documentation for all changes which will be grouped into this PR. * bitmask dropout * fix win build * bugfix for rocm * bugfix * fix code format * fix ut * fix build break * fix ut in win * resolve comments * fix ut in trt * resolve comments * fix rocm build error * fix typo Co-authored-by: Aidan Beggs <aidanbeggs@microsoft.com>	2022-05-27 17:24:47 +08:00
Vincent Wang	eadb1a3128	Speed Up GradientChecker Running (#11579 ) * fix gradient tester * test size adjust * fix win build	2022-05-27 15:14:53 +08:00
Changming Sun	6a45f9f059	Pin protobuf version to 3.18.1 (#11645 )	2022-05-26 21:14:56 -07:00
microsoft-github-policy-service[bot]	006597b9b8	Microsoft mandatory file (#11619 ) Co-authored-by: microsoft-github-policy-service[bot] <77245923+microsoft-github-policy-service[bot]@users.noreply.github.com>	2022-05-25 13:56:10 -07:00
Yulong Wang	f0dff6bb74	[js/rn] add expo config plugin support (#11556 ) * [js/rn] add expo config plugin support * resolve comments	2022-05-25 11:55:35 -07:00
Ryan Hill	d03d7afef8	Fix build errors when building with enable_memory_profile (#11617 )	2022-05-25 10:08:33 -07:00
Hariharan Seshadri	6e65bac5c2	Memory usage optimization in LongFormer Attention (#11611 )	2022-05-25 10:07:41 -07:00
Adrian Lizarraga	883e4bc341	Update the 'Linux-GPU-EP-Perf' pipeline to build ORT from source by default. (#11610 )	2022-05-25 09:29:49 -07:00

6823 commits