onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-16 18:31:27 +00:00

Author	SHA1	Message	Date
Nat Kershaw (MSFT)	998bf0fdb6	Remove advice to use IO Binding for this scenario (#11006 )	2022-03-30 10:23:50 -07:00
Xavier Dupré	c37d2728bf	Implement TreeEnsemble for opset(ai.onnx.ml)==3 (#10821 ) * Implement TreeEnsemble for opset(ai.onnx.ml)==3 * use of InlineVector * refactoring * improve attributes retrieval * avoid creating a temporary buffer * modifies onnx.ml.cpu.json * use unordered_map * update docs/OperatorKernels.md * address PR comments (TH -> ThresholdType, ORT_RETURN...) * add a python unit test to load a TreeEnsembleRegressor following ai.onnx.ml==3 specifications	2022-03-30 12:53:12 +02:00
Yulong Wang	1424b796ff	[js/web] disable test_tan temporarily (#11048 )	2022-03-29 21:47:52 -07:00
Yi Zhang	d1bdd2cd94	allow trailing slash in directory (#11001 ) * allow trailing slash in directory * fix lint	2022-03-30 09:42:57 +08:00
ytaous	5868413caf	fix seg fault (#11038 ) Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-03-29 14:12:45 -07:00
Edward Chen	8f456735d1	Remove unused variable. (#11043 )	2022-03-29 14:11:07 -07:00
Erick Alejandro Muñoz Alvarado	6c005bfdbc	Enabled Cast operator on OneDNN EP (#11023 )	2022-03-29 08:16:01 -07:00
Vincent Wang	6a6840d5c6	Fuse LayerNormalization for Apex O2 (#10233 )	2022-03-29 21:22:04 +08:00
Vincent Wang	3b6cee8059	[CUDA] Optimize Conv and ConvGrad for Training (#10999 ) * Optimize Conv and ConvGrad for Training * add provider option to control * fix typo	2022-03-29 07:31:36 +08:00
Chi Lo	8ba52b0a05	Bump master version to 1.12 (#10797 ) * bump master version to 1.11 * bump master version to 1.12	2022-03-28 12:30:11 -07:00
Edward Chen	9371401746	Move node EP assignment for ORT format into SessionState::FinalizeSessionState() (#10944 ) Follow up to #10904. - Move node EP assignment for ORT format into SessionState::FinalizeSessionState(). - Add unit test for #10904. - Make convert_onnx_models_to_ort.py optimization level configurable via environment variable.	2022-03-28 10:37:22 -07:00
Baiju Meswani	9c6cc018a9	Add utility to get the gradient graph from GradientGraphBuilder (#10995 ) * Add pybind method to get the gradient graph * Fix segmentation fault because of logging for gradient building	2022-03-25 17:13:56 -07:00
Chen Fu	dc72159105	Symmetric Quant indirect Conv kernel for ARMv8 A55 chip (#10862 ) The ARM A55 micro-architecture (with dot product instructions), similar to the A53, is widely used for the little cores in big.LITTLE configurations. The A55 has narrower memory load/store hardware, where a 128b load instruction blocks the pipeline for 2 whole cycles, during which no other instructions can be executed. On the other hand, a 64b load instruction can be dual-issued with many other instructions. This change adds a Symmetric Quant indirect Conv kernel for the A55 micro-architecture, where we replace ldr q4,[x1], with ldr d4,[x1], ldr x11,[x1], ins v4.d[1],x11 so that we can try to hide the memory load cycles behind computing cycles in the kernel. With this new kernel, the cartoongan model shows a significant perf improvement on Pixel5a little cores (2 threads running on two little cores): new kernel: 2188.59 ms; old kernel: 2360.61 ms	2022-03-25 17:10:47 -07:00
leqiao-1	8ddc45f52d	Add linux and macos arm64 java artifacts (#10981 )	2022-03-25 16:23:17 -07:00
Jack·Boos·Yu	d1be71eaa3	[cmake] Add keyword STATIC to add_library in function onnxruntime_add_static_library (#10998 )	2022-03-25 16:19:36 -07:00
Chandru Ramakrishnan	cb31b7eab1	Fixed creation of ORT_Value to pass offset of 0 (#11004 )	2022-03-25 15:52:10 -04:00
Scott McKay	47c09e6701	Clarify usage of kOnnxDomainAlias. (#10962 ) * Clarify usage of kOnnxDomainAlias.	2022-03-25 09:52:59 +10:00
pengwa	89ef987ab1	Improve NonZero on CUDA/ROCM (#10307 ) * improve NonZero * fix megatron_fp16 optimizer, fix the doc * multi_tensor_applier * resolve comment * fix building warning * fix build error when enabling training and use tensorrt	2022-03-25 07:35:45 +08:00
mpapdiwala	1e917c879e	Adding support for saving and loading train step info properties in the state dict and checkpoint file. (#10569 ) * Adding optimization step and step parameter to the ORTTrainer constructor * Added ORTTrainerOptions for optimization step * Adding Train Step Info Settings to State Dictionary * Adding train step info key * Updating comments * Reverting changes * Updating test case for new state dict entry train_step_info	2022-03-24 11:50:45 -07:00
Christoph Hausner	989e640009	Update docstrings in quantize.py (#10952 )	2022-03-24 10:49:33 -07:00
mindest	3c5853dcbc	register custom_op_symbolic for squeeze (#10970 ) * register custom_op_symbolic for squeeze * remove misleading warning msg from symbolic_opset9	2022-03-24 10:28:21 +08:00
Shucai Xiao	7ee52fb8a0	amdmigraphx_ep-add ops to be supported by migraphx and fixed a bug in check ops to be supported (#10496 ) * backup debugging information related to debugging a jira ticket * fixed a bug in checking whether an input can be constant folded * added more operators that are supported by migraphx * revert unnecessary changes * remove unused logger parameter * rename function to make name style consistent * backup code changes * fix review comments * refactor graph utility functions to add unit tests * backup additional changes * fixed a link error in build migraphx_basic_test * add unit test for some migraphx utility functions * add more supported ops in migraphx	2022-03-23 19:17:19 -07:00
Adrian Tsai	ae08f9666d	Fix type constraints in registration of DequantizeLinear (#10986 )	2022-03-23 17:05:12 -07:00
Sheil Kumar	938f3857a5	Set the default for the STFT onesided attribute to 1, which tests expect (#10984 ) Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2022-03-23 14:20:54 -07:00
Chandru Ramakrishnan	07201726ed	Fixed macros for graph transformer registration. (#10983 )	2022-03-23 14:55:17 -04:00
Olivia Jain	de384805cd	Custom parameters (#10964 ) * get inputs independently for trtexec * track one process only * remove engine and profile files * change time to commit time * add runtime option for io binding * move to commit date * fixes * add option for graph optimization * cleanup docker script * note second time creation * allow for parameters to be configured from pipeline at runtime * uncomment * include optional arguments at runtime * post second session creation * update cmake version * Revert "update cmake version" This reverts commit 09a1364eae68610724c8e90eeea777b7ee03f74b. * Move data format import	2022-03-23 09:47:24 -07:00
Jeff Daily	9a3be9b46a	use #include <hiprand/hiprand.h>, not deprecated #include <hiprand.h> (#10966 )	2022-03-23 08:56:45 -07:00
Yi Zhang	0efbe92296	fix coverage report error in master build (#10969 ) * fix error in master * check NNAPI_EP_MASTER * Revert "check NNAPI_EP_MASTER" This reverts commit 59c9043b7c9bbcb4b495d2dd121ef6d4271be408. * rm coverage in PR build	2022-03-23 16:00:57 +08:00
raviskolli	480c793125	Update training packages to Pytorch 1.11.0 (#10851 ) * Update ortmodule training packages to Pytorch 1.11.0 Co-authored-by: Harshitha Venkata <havenka@microsoft.com> Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>	2022-03-22 16:45:51 -07:00
Baiju Meswani	565318ce86	Support ORT WASM compilation with the training flag (#10973 ) * Add training support for ORT web assembly compilation * Use wrapper for eigen includes in training	2022-03-22 16:13:35 -07:00
Scott McKay	b28e5064f3	Ignore DequantizeLinear nodes in CommonSubexpressionElimination optimizer (#10934 ) * Ignore DequantizeLinear nodes in CommonSubexpressionElimination. Coalescing DQ nodes results in QDQ node groups having overlaps, which the QDQ processing does not support.	2022-03-23 08:46:01 +10:00
Xavier Dupré	b88fb68fac	Adds missing numpy type when looking for the ort correspondence (#10943 )	2022-03-22 14:44:48 -07:00
Yulong Wang	dce5d719c5	add build flag for emscripten settings (#10963 ) * allows multiple '--cmake_extra_defines' flags * fix flake8 error * Add build flag for emscripten settings * remove "emscripten_settings" in generate_build_tree() * format code	2022-03-22 11:55:45 -07:00
Sheil Kumar	027565b3b2	Add multi-dim dft test, and fix complex idft (#10947 ) * fix complex multi-dim dft * Add multi-dim dft test, and fix complex idft * remove incorrect inplace specification * Add DFT tests * update epsilon to 1000ths place Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2022-03-22 10:08:12 -07:00
Yulong Wang	2da82fd0b9	allows multiple '--cmake_extra_defines' flags (#10953 ) * allows multiple '--cmake_extra_defines' flags * fix flake8 error	2022-03-21 19:10:47 -07:00
Sunghoon	6d19c295d0	use lf as eol for node package (#10965 )	2022-03-21 15:50:03 -07:00
Sunghoon	b34d9f6867	[js/wasm] Add WebAssembly static library build into web CI pipeline (#10959 ) * add webassembly static library build into ci * add webassembly static library build into ci * skip publishing on static lib * fix type	2022-03-21 15:49:49 -07:00
Chandru Ramakrishnan	4a5b5328a4	Added support to Eager CodeGen for multiple in-place parameters. (#10945 ) * Added support to CodeGen for multiple inplace output parameters. * Updated output Tensor to references.	2022-03-21 13:10:22 -07:00
Leandro Gracia Gil	1cc2cfb7b8	Move #ifndef ORT_CXX_API_THROW to the no exceptions case. (#10937 ) This is related to https://github.com/microsoft/onnxruntime/issues/10564 which introduced a fix in the wrong case where exceptions are enabled.	2022-03-21 11:12:56 -07:00
leqiao-1	a6ea278502	add python3.10 support (#10848 ) * add python3.10 support * upgrade numpy version in build pipeline * add python 3.10 path * upgrade torch version in build pipeline * update docker run arguments * change torch version * fix typo * fix permission issue * change python version * remove python3.10 for openvino build * remove python 3.10 for openvino build	2022-03-21 09:46:02 +08:00
G. Ramalingam	8703d37517	Extend DropoutGrad function to support bfloat16 (#10662 ) * Update DropoutGrad function to support bfloat16 * Eliminate dead comments * Set opset version for testcase Signed-off-by: Ganesan Ramalingam <grama@microsoft.com> * Update to new builder Signed-off-by: Ganesan Ramalingam <grama@microsoft.com>	2022-03-20 15:11:08 -07:00
Scott McKay	91722e2bc4	Fix typos (#10935 )	2022-03-20 08:27:35 +10:00
Yi Zhang	c1e37e4ebf	Android CI Pipeline: Fix post coverage bug (#10949 )	2022-03-19 11:17:08 -07:00
Ella Charlaix	fe6ab719f3	Fix a typo in quantization tools (#10940 )	2022-03-18 21:03:16 -07:00
soundarthiaga	eabb14788a	[perf_metric] added inferences per second metric (#10921 )	2022-03-18 21:01:11 -07:00
Yi Zhang	3897b93606	optimize Android CI (#10938 )	2022-03-19 11:00:21 +08:00
Kotaro Yamamoto	2dea7dc27f	Skip python arena shrinkage test on ppc (#10901 )	2022-03-18 19:31:21 -07:00
soundarthiaga	de06d95096	[parallel_inference] added support for parallel inference with timed duration perf test (#10922 )	2022-03-18 19:05:28 -07:00
Scott McKay	5cbacec854	Maintain aspect ratio by doing resize + crop in image_to_pb tool (#10887 )	2022-03-19 07:08:45 +10:00
ytaous	f058c59407	Performance: add io_binding support for bert benchmark util (#10907 ) * io_binding support * cover all test cases * per comments Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-03-18 10:33:30 -07:00

6582 commits