onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-04 04:07:22 +00:00

Author	SHA1	Message	Date
Changming Sun	87b1fddd97	Add Linux/MacOS ARM64 support to nuget packaging pipeline (#9570 )	2021-10-27 19:00:43 -07:00
Ginés Hidalgo	2d44bd525b	DML functions always returning a value (#9485 ) * Always return a value * @fdwr advice added	2021-10-27 15:21:32 -07:00
Scott McKay	a2b3e6bb23	Remove pointless assert. (#9571 )	2021-10-28 07:33:40 +10:00
Dmitri Smirnov	4e76360261	Prevent PySparseTensor from being garbage collected if we have an outstanding OrtValue (#9540 ) Improve comments.	2021-10-27 11:28:37 -07:00
Changming Sun	aa76520e60	Update macOS build agents to macOS 11 (#9562 )	2021-10-27 10:00:04 -07:00
Thiago Crepaldi	5d5c03bcdc	Fix opset version change by not using copy of global constant (#9393 )	2021-10-27 12:42:06 -04:00
Scott McKay	b5a652c578	Add Xamarin support (#9436 ) Add Xamarin support to the ORT nuget packages. - Update C# code to support Xamarin builds for iOS and Android - refactor some things to split out common code - include iOS and Android ORT native shared library in native nuget package	2021-10-27 20:07:07 +10:00
Ginés Hidalgo	12f216aab5	Bug in DmlOperatorResize.cpp with m_inputDimensions (#9456 )	2021-10-27 02:50:54 -07:00
Ginés Hidalgo	9639eded4b	Missing `#pragma once` in dml_provider_factory.h (#9457 )	2021-10-27 02:49:52 -07:00
Ginés Hidalgo	1efdbff1a3	Fixed compiler error in Clang (for Win64) for ExecutionProvider (#9482 )	2021-10-27 02:47:22 -07:00
Yi-Hong Lyu	0301f401ee	Cleanup unnecessary opset_version arguments (#9558 )	2021-10-27 02:25:54 -07:00
Sunghoon	c79307e7b4	[js/web] support opset-13 of softmax (#9493 ) * add p50 in test * support opset-13 of softmax * update a operators.md * resolve comments * fix lint and format Co-authored-by: Yulong Wang <yulongw@microsoft.com>	2021-10-26 23:58:50 -07:00
Ginés Hidalgo	d079e0d48f	Fixed Clang (on Windows) compiler error with `#pragma`'s (#9484 )	2021-10-26 21:31:45 -07:00
RajalakshmiSR	c54ad0dd0b	POWER: Add Dgemm kernel for POWER processor (#9459 ) * POWER: Add Dgemm kernel for POWER processor This patch adds new dgemm kernel specific to POWER processor. * POWER: Restrict new functions to VSX in header * Remove warning check in header * POWER: Dgemm Adjust indentation Fixing indentation based on review comments. Co-authored-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>	2021-10-26 20:27:24 -07:00
Yulong Wang	90555bf96d	[node.js binding] enable CI for macOS arm64 (#9532 ) * nodejs aggr * add dependency * no unzip * fix aggregation * add arm64 for mac * mac arm64 build * fix commandline * add check for multi-CMAKE_OSX_ARCHITECTURES * fix	2021-10-26 16:42:19 -07:00
Zhang Lei	c1b0f924b7	quantization tool better support operator when subgraph is enabled (#9463 ) * Fix is_valid_quantize_weight recursive issue when enable subgraph. * some clear	2021-10-26 15:36:19 -07:00
Zhang Lei	33ef1d7700	disable inner parallel for global avg pool as normally they are small (#9487 ) * Using cost model's thread count rather than max number of threads when parallel tasks. * according to perf test result, decrease parallel on channels. * Seems no use on parallel channels for qavg_pool according several models, remove it. * Revert "Using cost model's thread count rather than max number of threads when" This reverts commit 5fa47cd5b5ddbaa4e5ef97ccbc53200324379544.	2021-10-26 15:35:49 -07:00
Changming Sun	df7a5342a5	Upgrade com.diffplug.spotless to 5.17.0 (#9546 )	2021-10-26 14:29:46 -07:00
Changming Sun	f39821adbc	Fix a bug in CMakeLists.txt when handling NO RTTI (#9547 )	2021-10-26 14:29:29 -07:00
Jingqiao Fu	da15f5fc2f	change cmake condition to prevent WCOS from linking advapi32 (#9500 ) * change condition to prevent WCOS from linking advapi32.dll * Remove linkage to advapi32.lib	2021-10-26 12:16:49 -07:00
Stella Stamenova	542f1a9737	Cleanup some whitespace and capitalization for set (#9504 )	2021-10-26 12:02:07 -07:00
Ginés Hidalgo	a036cc6d4b	Fixing bugs in ORT_NO_EXCEPTIONS (#9479 ) ORT_NO_EXCEPTIONS is not working after the latest changes in: onnxruntime/core/graph/function.cc onnxruntime/core/graph/graph.cc	2021-10-26 10:50:32 -07:00
Ginés Hidalgo	1aabba7120	Avoided warning C4458: declaration of 'X' hides class member. (#9541 )	2021-10-26 10:49:24 -07:00
satyajandhyala	f29057c7c0	Added TanhGrad. (#9507 )	2021-10-26 09:10:03 -07:00
pengwa	b125446f9c	Optimize python overhead of APEX amp (#9447 ) * optimize python overhead of _post_amp_backward * overwrite apex amp's zero_grad for faster implementation * move unscale_fp16_grads_into_fp32_grads into C++ impl * improve the efficiency further, reducing 3.5ms to 1.7ms for unilm. * unilm 1.7ms to 338us: 1). optimize python list <==> std::vector copy, 2). launch the kernels as long as num_elem reaches the threshold. This helps reduce the CUDA idle time. * refine the logic a bit after validating Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>	2021-10-26 13:13:49 +08:00
Yi-Hong Lyu	27ad20df23	Add QDQ support of Resize to be able to fuse it into a quantized Resize (#9476 )	2021-10-25 21:48:15 -07:00
ashbhandare	0270ff7951	Minor import fix (#9538 )	2021-10-25 21:29:31 -07:00
Changming Sun	f92b8e2ac8	Clean up optional-lite references (#9534 )	2021-10-25 21:05:45 -07:00
Yulong Wang	bf4c3fa3d6	[node.js binding] aggregate binaries for multiple platforms in single NPM package (#9501 )	2021-10-25 20:16:10 -07:00
Vincent Wang	fb4f7dbbb7	Call ATenOp for ReduceSum on ORTModule (#9471 ) * call ATenOp for ReduceSum * Enable ReduceSum ATenOp for training only * always load extension	2021-10-26 09:48:57 +08:00
marcusfreisleben	651955d3c9	CUDA: Enable parallel compilation (#8974 ) * Pass on parallel option to nvcc * Fixed build.py * Added missing string conversion * Addressed review points	2021-10-25 16:42:58 -07:00
Scott McKay	39d1b9e1c1	Fix bug in Slice helper when dim value is zero (#9492 ) * Don't clamp if dim_value is zero as that will set `step` to an invalid value.	2021-10-25 17:39:01 +10:00
Ginés Hidalgo	dbe1b57a71	Update thread_utils.cc	2021-10-22 16:59:09 -07:00
Ginés Hidalgo	a79d375d24	Added fixes for Clang on Win64	2021-10-22 16:59:09 -07:00
Ginés Hidalgo	9335cf102a	Deleted duplicated "core/graph/function.h" "core/graph/function.h" appears twice: - `include/onnxruntime/core/graph/function.h` - `onnxruntime/core/graph/function.h` --> This one is redundant and not used anywhere	2021-10-22 16:58:29 -07:00
Stella Stamenova	d608504438	Don't use legacy mode for protobuf (#9498 )	2021-10-22 16:50:29 -07:00
Changming Sun	d83adaaf9f	Remove optional-lite (#9424 )	2021-10-22 16:45:45 -07:00
Sherlock	3ed8ade675	Use SafeInt for malloc related computation (#9503 )	2021-10-22 16:42:12 -07:00
Wei-Sheng Chin	beddbdec5a	Fix PythonOp exporter (#9318 ) Register PythonOp exporter with the right symbol.	2021-10-22 10:45:45 -07:00
stevenlix	5adf175847	pad shape 0 is not allowed in edge mode to comply with latest numpy (#9488 )	2021-10-22 10:42:51 -07:00
Wei-Sheng Chin	d2d480a0db	Allow None As Autograd Context (#9315 ) * Allow none ctx * Update orttraining/orttraining/test/python/orttraining_test_ortmodule_autograd.py Co-authored-by: pengwa <pengwa@microsoft.com> * Address a comment Co-authored-by: pengwa <pengwa@microsoft.com>	2021-10-21 20:37:36 -07:00
Guoyu Wang	b64b2d48f3	Move iOS e2e test to XCUITest (#9422 ) * Move iOS test to user UITest * minor update * Update readme * update test's ios deployment target * address cr comments	2021-10-21 18:51:13 -07:00
Ginés Hidalgo	7f2f56633c	Fixed implicit conversion warnings (#9481 )	2021-10-21 16:13:28 -07:00
Stella Stamenova	49b66c7486	NFC: Normalize whitespace around if statements in CMakeLists.txt (#9464 ) Always add a space after if to make the file consistent	2021-10-21 15:35:58 -07:00
Jeff Daily	ca7116ca3e	CUDA EP's ResizeImpl now uses functors, hipify for ROCm EP (#9466 ) Support for device function pointers is not yet available for ROCm. Instead, the device function pointers were converted to device functors. Case statements, lambdas, and macros are used for dispatch; as a result, all combinations of kernels are compiled with inlined functors. The basis of this approach can be found in PyTorch. Lastly, hipify and register Resize and Upsample for ROCm EP.	2021-10-21 15:02:41 -07:00
Jeff Daily	66ceb6926d	rehipify ROCm EP files under orttraining (#9443 ) * rehipify rocm ep files under orttraining committed to source control * fix flake8 error	2021-10-21 13:36:21 -07:00
Sherlock	ff23b9ff55	Avoid cudaStreamSync at the end of Forward/Backward (#9470 ) * Skip cudaStreamSynchronize at the end of fw * skip sync stream for end of backward	2021-10-21 11:28:25 -07:00
Xavier Dupré	5797bd6db3	Remove one unnecessary deepcopy in unflatten_user_output (#9353 ) * Removes one unnecessary deepcopy	2021-10-21 10:44:27 +02:00
Sunghoon	4028e51e7e	Update the compatibility of ONNX Runtime Web (#9444 )	2021-10-20 18:03:12 -07:00
George Nash	1249c7c29e	Resolve issue when running Yolov4 on DNNL EP (#9355 ) The dnnl_binary ops need the memory format to match the format expected by Onnxruntime. If the memory formats of the inputs do not match each other there will be an error in the calculated results. Additionally, since the code manually pads the tensor dimensions for broadcasting, the inputs are expected to be in Onnxruntime's format. Since detecting and reordering the memory to Ort format matches what was previously done for the Reshape op, the code was moved from dnnl_reshape to dnnl_subgraph_primitive under the name GetMemoryInOrtFormat. One small additional change was made to the capability code log to also print the percentage of nodes run by the dnnl execution provider. Signed-off-by: George Nash <george.nash@intel.com>	2021-10-20 13:10:31 -07:00


5753 commits