onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-28 03:20:58 +00:00

Author	SHA1	Message	Date
G. Ramalingam	98960c53fe	Replace ORT's function shape inference with ONNX's (#11538 ) * Function inlining tests * Replace ORT copy of function shape inference * Remove std::move * Fix memory leak Signed-off-by: Ganesan Ramalingam <grama@microsoft.com> * Address feedback	2022-06-03 12:54:28 -07:00
Changming Sun	ec05313cd9	Split the GPU pipeline to 3 different machine pools (#11724 )	2022-06-03 10:57:32 -07:00
Scott McKay	4445dd6bc1	XNNPACK EP (#11445 ) * Implement XNNPACK support via an EP. * Layout transform uses the GraphPartitioner infrastructure. * Node fusion is supported. * Conv and MaxPool implementations were ported from Changming's PR. * Added optional mutex in InferenceSession::Run as we only want to allow sequential calls if xnnpack is enabled	2022-06-03 20:22:34 +10:00
ytaous	ce4ac6d328	Optimizer - add missing supported version for BiasSoftmaxFusion (#11616 ) * add missing version * opset check * fix format * reject fusion if type not allowed * per comments * trigger new build Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-06-02 23:23:51 -07:00
Valery Chernov	196cd7aed1	Pylint fix after #11647 (#11704 )	2022-06-02 17:14:20 -07:00
Yufeng Li	1f2c92673b	fixed point based requantization on arm64 (#11540 ) * fixed point based requantization on arm64 * reverse MlasConvSymDepthwiseKernel u8s8 and s8s8 order	2022-06-02 12:34:17 -07:00
Changming Sun	453c57f92f	Revert "Directly use memory mapped data for external data initializers (#11127 )" (#11650 ) This reverts commit `541eff8d89`, because it broke CUDA EP. See: https://github.com/microsoft/onnxruntime/issues/11511 for more details.	2022-06-01 14:46:13 -07:00
Vincent Wang	54d1573d2f	[ORTModule] Enable SimplifiedLayerNormalization Fusion (#11580 ) * enable SimplifiedLayerNormalization fuse * remove allow_layer_norm_mod_precision flag	2022-06-01 15:09:39 +08:00
Chih-Hsuan Yen	03abcb0640	Correctly unpack tensor values (#11639 ) This change fixes two issues: * protobuf 3.20 incompatibility * Potential incorrect results on big-endian machines	2022-06-01 00:03:11 -07:00
leqiao-1	2ac3649752	Update requirements.txt (#11682 ) set protobuf version	2022-06-01 12:31:21 +08:00
Yufeng Li	f437945926	fix output shape of ReduceMin/ReduceMax in calibration tool (#11647 )	2022-05-31 14:26:08 -07:00
Yulong Wang	004560f1fe	[js/rn] upgrade dependency packages' version (#11586 ) * [js/rn] upgrade dependency packages' version * clean up yarn.lock	2022-05-31 13:54:17 -07:00
Gary Miguel	74bc4c07f6	Fix C# and numbering (#11643 ) * C# protocol buffer code can be updated on Linux. Link to the relevant instructions. * Fix numbering.	2022-05-31 11:33:36 -07:00
ashbhandare	1c316d0e39	Parameter,Module and Optimizer changes (#11494 ) * Module step * On device training offline composition * Working grad accumulation with test for TrainStep * Temp changes * Revert "On device training offline composition" This reverts commit `ec3da68247`. * cleanup * Implement eval step * Use new graphs and checkpoints * Optimizer test, changes * review comments * review comments * review comments Co-authored-by: Aishwarya Bhandare <aibhanda@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>	2022-05-31 09:20:47 -07:00
Yi-Hong Lyu	3076061aca	Fix a comment in test_op_reshape.py (#11606 )	2022-05-30 12:04:40 -07:00
Sheil Kumar	22739137c4	Update signal op defs to match onnx17 defs, and add more tests (#11631 )	2022-05-28 16:00:09 -07:00
pengwa@microsoft.com	e1c63cb06a	Merge branch 'master' of https://github.com/microsoft/onnxruntime into training_dev/on_device_poc	2022-05-28 01:54:17 +00:00
Baiju Meswani	c318b19307	Add support for BCEWithLogitsLoss (#11630 )	2022-05-27 15:50:16 -07:00
Scott McKay	4fabc400de	Fix CUDA 11.6 build error on Windows (#11578 ) * Avoid windows header that defines 'small'	2022-05-28 08:04:46 +10:00
Scott McKay	7e6d052275	Add better error message for subgraph output coming directly from outer scope value. (#11638 ) * Add better message for subgraph output coming directly from outer scope value. * Use regex to match value name as the test model is processed in a different order on different platforms.	2022-05-28 08:04:27 +10:00
Jeff Bloomfield	a7fa735286	Merge remote-tracking branch 'origin/master' into WindowsAI	2022-05-27 12:53:54 -07:00
Gary Miguel	b67c0f639c	Remove filter_mode input from pyflakes GitHub action (#11644 ) Previously it triggered: `Warning: Unexpected input(s) 'filter_mode', valid inputs are ['entryPoint', 'args', 'github_token', 'level', 'reporter']`	2022-05-27 07:59:17 -07:00
pengwa	44f7b1bf2c	MTA AdamWOptimizer (#11506 ) * skeleton change * adam compute kernels * add rtol/atol for tests * some clean up * optional outputs * more clean up * add tests * adamw mode=1 test pass * clean up tests * add HF AdamW test cases * refactor adam test file * make test pass * all test pass, fix comments * rename to adamw * make test pass again * fix cpplint * minor fixes * fix python lint * Fix build and tests * fix builds * fix windows build * fix win build * minor fix * Refine based on comments * resolve comments * formatting * resolve comments * add ut	2022-05-27 19:52:04 +08:00
Vincent Wang	02724c54ff	[CUDA] Implement BitmaskDropout, BitmaskBiasDropout and BitmaskDropoutGrad (#11534 ) * Implement BitmaskDropout and associated unit tests. * Implement BitmaskDropoutGrad and associated unit tests. * Implement Dropout -> BitmaskDropout rewrite rule and associated unit tests. * Implement (Dropout,DropoutGrad) -> (BitmaskDropout,BitmaskDropoutGrad) rewrite rule. This commit does not yet include unit tests for this rewrite rule. This commit also introduces improved documentation for all changes which will be grouped into this PR. * bitmask dropout * fix win build * bugfix for rocm * bugfix * fix code format * fix ut * fix build break * fix ut in win * resolve comments * fix ut in trt * resolve comments * fix rocm build error * fix typo Co-authored-by: Aidan Beggs <aidanbeggs@microsoft.com>	2022-05-27 17:24:47 +08:00
Vincent Wang	eadb1a3128	Speed Up GradientChecker Running (#11579 ) * fix gradient tester * test size adjust * fix win build	2022-05-27 15:14:53 +08:00
Changming Sun	6a45f9f059	Pin protobuf version to 3.18.1 (#11645 )	2022-05-26 21:14:56 -07:00
microsoft-github-policy-service[bot]	006597b9b8	Microsoft mandatory file (#11619 ) Co-authored-by: microsoft-github-policy-service[bot] <77245923+microsoft-github-policy-service[bot]@users.noreply.github.com>	2022-05-25 13:56:10 -07:00
Yulong Wang	f0dff6bb74	[js/rn] add expo config plugin support (#11556 ) * [js/rn] add expo config plugin support * resolve comments	2022-05-25 11:55:35 -07:00
Ryan Hill	d03d7afef8	Fix build errors when building with enable_memory_profile (#11617 )	2022-05-25 10:08:33 -07:00
Hariharan Seshadri	6e65bac5c2	Memory usage optimization in LongFormer Attention (#11611 )	2022-05-25 10:07:41 -07:00
Adrian Lizarraga	883e4bc341	Update the 'Linux-GPU-EP-Perf' pipeline to build ORT from source by default. (#11610 )	2022-05-25 09:29:49 -07:00
Thiago Crepaldi	427230431a	Fix torch cpp ext build when CPU wheel is installed but GPU card is present (#11608 ) * Fix torch cpp ext build when CPU wheel is installed but GPU card is present Also there is a minor improvement for ATen operator that allows both "::op" and "aten::op" name for operators * Fix flake8 false positive	2022-05-25 09:44:26 -04:00
George Nash	147a1737f9	MatMul postop fusion for dnnl ep (#11565 ) This includes a series of unit tests that exercise the MatMul fusion. This is not an exhaustive list of tests. The tests focus on patterns seen in models, with additional tests to cover at least one instance of each operator type that can be part of the fusion. Signed-off-by: George Nash <george.nash@intel.com>	2022-05-24 22:19:38 -07:00
Yulong Wang	4e9ad7b6ae	Update .flake8 to exclude .git directory (#11615 )	2022-05-24 19:43:02 -07:00
Baiju Meswani	3a22a866a1	On device training offline tooling (#11520 )	2022-05-24 18:21:39 -07:00
Gary Miguel	e3a2d5cca8	Add additional python requirements (#11522 ) These are used by some of the python code in the package, e.g., `0292356bd7/onnxruntime/python/tools/transformers/optimizer.py (L25)` `c8270c2940/onnxruntime/python/tools/symbolic_shape_infer.py (L10)` `0292356bd7/onnxruntime/python/tools/transformers/torch_onnx_export_helper.py (L9)`	2022-05-20 16:16:18 -07:00
Yulong Wang	69aaf03345	allow catch all exceptions (#11498 )	2022-05-20 03:35:47 -07:00
PeixuanZuo	a67994316a	Update rocm ci to ROCm5.1.1 + torch1.10.0 * [UPDATE] update amd ci pipeline 2 rocm5.1.1 * [FIX] json format error * [ERROR] disable unit tests * [FIX] ucx error * [FIX] cmake version * [FIX] units test	2022-05-20 11:07:21 +08:00
Tang, Cheng	abecb56832	fix build break (#11492 ) Co-authored-by: Cheng Tang <chenta@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-05-19 16:10:45 -07:00
Dwayne Robinson	d2519ec0c2	DirectML EP add CastLike-15 and Reshape-14 (#11568 ) * Add CastLike15 * Add Reshape14 * Fix allowzero comment * Rename REG_INFO_ID to REG_INFO_COPY to be clearer to readers	2022-05-19 14:34:11 -07:00
Vincent Wang	436c4f9b79	Add BFloat16 (bf16) support for ATen (#11546 ) Co-authored-by: Vincent Wang <weicwang@microsoft.com>	2022-05-19 10:04:08 -04:00
Dwayne Robinson	3a867d83d5	DirectML opset14 type updates for Add/Sub/Mul/Div and Relu/PRelu (#11560 ) * Add more type support for Add/Sub/Mul/Div and Relu/PRelu * Remove stale Remap64bitDmlDataTypeTo32bit	2022-05-18 17:34:15 -07:00
Adrian Lizarraga	e45197fa8c	[trt-ep-perf] Fix upload time of EP perf data (#11531 ) Fix the post.py script to use the actual "upload time" in ISO format instead of the day/month/year of the commit date.	2022-05-18 15:36:21 -07:00
Valery Chernov	8092d9f9a2	[TVM EP] Support inference by shared library created by TVM (#11389 ) * add so_folder option to TVM EP options. add TvmSoEP class and update TVM EP factory * compilation from so_folder was implemented * update TVMCompiler for default pipeline and compilation from shared lib * filter excess so-file in so_folder * clean Compile method and vm conditions * implementation of TVMSoCompile on native side instead of python API * cpplint fixes * some fixes after review * more cpplint fixes * more fixes after review * align TVMso EP with new API for compilation from #10632 * small fixes for cpplint Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>	2022-05-18 14:50:54 +02:00
Adrian Lizarraga	48efeca66c	[trt-ep-perf] Fix bug that suppresses latency gain reporting (#11321 ) Fix bug that prevents EP perf script from reporting latency gain for TensorRT/CUDA	2022-05-17 14:00:52 -07:00
Edward Chen	782f9e394d	[CoreML EP] Fix condition in PRelu op supported check. (#11543 )	2022-05-17 09:03:24 -07:00
Ryan Hill	deef214772	Update gather to use multiple threads (#11524 )	2022-05-16 19:31:14 -07:00
Edward Chen	5eaa893936	[CoreML EP] Add support for PRelu (#11474 )	2022-05-16 16:30:09 -07:00
Justin Chu	d9c9adb78b	Add python static type checking in CI checks (#11518 ) - Enable pyright and pylint (https://github.com/microsoft/pyright) in CI - Enable pyright, pylint and bandit by default in VS code Pylint has some good style checks. pyright is Microsoft's static type checker.	2022-05-16 13:26:56 -07:00
PeixuanZuo	c556f5f22f	Add AMD python package ROCm5.1.1+torch1.11 (#11516 ) * [FIX] fix name error * [ADD] add rocm5.1.1 python package * [ADD] torch1.10.0 rocm requirements * [UPDATE] update docker Repository name	2022-05-16 08:14:11 +08:00

7863 commits