onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-23 22:13:38 +00:00

Author	SHA1	Message	Date
Justin Stoecker	bd236ecc26	Switch to unified DirectML 1.4.0 redistributable (#5794 ) Transitions from the ORT-only DML NuGet (hosted on the onnxruntime_public feed) to the new unified DirectML NuGet (Microsoft.AI.DirectML) on nuget.org. In addition, the Microsoft.AI.MachineLearning (WinML) and Microsoft.ML.OnnxRuntime.DirectML packages now take a dependency on the Microsoft.AI.DirectML package. This means we can remove the extra copy of DML binaries in these packages since they will be installed by the DML package.	2020-11-17 13:42:23 -08:00
Scott McKay	7b76b57fc8	Support EPs that compile nodes in a minimal build. (#5776 ) * Support EPs that compile nodes in a minimal build. This enables NNAPI being used.	2020-11-17 13:52:22 +10:00
Tiago Koji Castro Shibata	794e8479eb	Revert #5805 (#5823 ) * Fix race condition in msbuild * Revert "Named Dimension Override internals test and experimental API (#5805)" This reverts commit `157d1844fb`.	2020-11-16 17:05:28 -08:00
Sheil Kumar	671fa60327	Enable direct tensorization and detensorization to many buffers in WinML (#5791 ) * switch to work PC * back with iterable of buffers * add raw api tests * tensorization * last test * all tests pass! * small cleanup * whitespace * newline * whitespace * refactor common code into DisjointBufferHelpers * remove unused file * warning * skip gpu tests when hardware not available * Add error condition when createreference is invoked * add null check to cretereference * uncomment out check Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2020-11-16 10:06:22 -08:00
RandySheriffH	20ae1ea21f	Remerge custom gpu op (#5818 ) * add case for cpu custom op on gpu * format doc * restrict GPU custom op on Linux GPU CI only * separate cu file to a independent project * fix typo * include cuda_add lib * move lib def * add file header Co-authored-by: RandySheriffH <rashuai@microsoft.com>	2020-11-16 09:27:46 -08:00
Guoyu Wang	c4818d36ed	[NNAPI EP] Make NNAPI EP build on non-Android Platform (#5779 ) * Make NNAPI EP build on non-Android Platform * minor updates * Adress CR comments * Fix build issue using Windows, address CR comments * Fix linux build warnings * Fix for test failure * Fix for test failure * Fix model_tests failure	2020-11-15 17:04:45 -08:00
Ori Levari	157d1844fb	Named Dimension Override internals test and experimental API (#5805 )	2020-11-13 21:21:11 -08:00
sfatimar	dfbf6d78be	OpenVino: fix allocation failure on Window for RelWithDebInfo build (#5713 ) * ng_supported_ops * Remove ng_supported_ops * Revert "Remove ng_supported_ops" This reverts commit 3c27385b2d88c6e8cf7ac4e8c290a367ad5d0bd8. * Revert "ng_supported_ops" This reverts commit 650721ae2913b79739521d58838298e031abdac1. * cmake changes to ensure that the debug build on windows link to debug builds of openvino and do not result in bad allocation error Co-authored-by: sfatimar <sahar.fatima@intel/com>	2020-11-13 07:59:52 -08:00
jeyblu	435b904f0e	add dnnl gpu engine (#5788 )	2020-11-12 20:17:54 -08:00
stevenlix	54de618c2e	Improve TensorRT engine caching (#5737 ) * add profile caching to improve engine caching feature * Add comments * fix typo * add decryption for engine caching * Update tensorrt_execution_provider.cc * Update tensorrt_execution_provider.cc * Update tensorrt_execution_provider.cc * Update tensorrt_execution_provider.cc * Update tensorrt_execution_provider.cc * update onnx-tensorrt submodule * set opt profile to max value of the range * add hash to engine/profile name * Add calibration based INT8 quantization * add an option to enable both FP16 and INT8 * Update tensorrt_execution_provider.cc * add env variable to specify calibration file name * clean up code * Add comments and update TRT document * enable tensorrt basic test and add EngineCachingTest * clean up * update envrionment variable in the test * clean up	2020-11-12 08:56:45 -08:00
Hariharan Seshadri	b92fc66ea1	Support opset-13 specs of controlflow ops (Loop, If) (#5665 )	2020-11-11 23:44:14 -08:00
Maajid khan	a84a058f9e	[OpenVINO-EP] Enabling Multi Device support (#5740 ) * Enabling Multi Device support for UEP Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Minor fix added *Added a simple fix to determine OpenVINO version for Arm build as well Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>	2020-11-11 15:16:30 -08:00
Xueyun Zhu	d8ace07ad7	Add CPU send/recv for pipeline (#5315 ) * cpu send/recv * clean up send/recv * remove unused code * assert and nccl option for mnist * add build option to enable build with only cpu. Without this, nccl is always enabled which will break build on machine that only contains cpu * Add USE_MPI distinct from USE_NCCL/USE_HOROVOD * fix * fix * exclude cpu send/recv for machines without mpi Co-authored-by: Tim Harris <tiharr@microsoft.com>	2020-11-11 12:41:39 -08:00
Yufeng Li	2ba637c558	Implement Scale function for quant gemm (#5632 ) * Implement a Scale function for quantization Quantized GEMM is always followed by Scaling (PerTensor Or PerColumn), and often need to be accumulated to an existing matrix. This PR implements a post-processor for quantized GEMM result and accumulate it to another matrix.	2020-11-10 23:34:38 -08:00
Alberto Magni	c75b7c5c47	[CMake] Enable NCCL only when enabling CUDA or ROCm support (#5516 ) Conditionally enable NCCL depending on CUDA and ROCM Before this change NCCL support was enabled unconditionally, even when building without CUDA or ROCM support. This caused the command: $ ./build.sh --enable_training To trigger the following cmake warning -- Could NOT find NCCL (missing: NCCL_INCLUDE_DIR NCCL_LIBRARY) CMake Warning at CMakeLists.txt:1282 (message): NCCL is not found. Please use --nccl_home to specify the path of NCCL. Otherwise, NCCL is disabled. This is a spurious warning because the user did not ask to search for NCCL.	2020-11-10 12:39:23 -08:00
Weixing Zhang	fff85a6a35	Add GPU kernels for ROCm EP (#5655 ) * Add kernels for AMD GPU. This PR is mostly about GPU kernels for ROCm EP. Due to similar GPU programming language (CUDA and HIP and similar math library calls, one principle in ROCM EP design is to share CUDA kernels as much as possible for ROCm. Thus, the script amd_hipify.py has been created for converting CUDA kernels to ROCm HIP kernels automatically during compilation phase. But, for some reasons such as perf issue, syntax difference..., some converted kernels need some manual intervention. These kernels will be checked in the repo physically for now. In order to avoid manual intervention, the plan is to refactor CUDA kernels to make them portable between CUDA EP and ROCm EP as much as possible. Please refer to "HIP Porting Guide" for details. * like lamb, multi-tensor-apply needs to be disabled for IsAllFiniteOp and ReduceAllL2, current AMD GPU compiler has perf issue for kernel parameter which is a structure with "pass by value". * Use hipMemsetAsync and add checks on HIP calls. * move the generated files to build folder. Co-authored-by: Jesse Benson <jesseb@microsoft.com>	2020-11-06 16:11:06 -08:00
Maajid khan	d6f9cc181d	Modify logic to determine OV Version (#5701 ) Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>	2020-11-05 15:12:02 -08:00
edgchen1	07bd4ef470	Upgrade optional implementation to https://github.com/martinmoene/optional-lite . (#5563 )	2020-11-03 15:27:47 -08:00
Hector Li	b6eeadf420	Enable OpenVino build on Arm64 platform (#5682 )	2020-11-03 13:55:34 -08:00
Ashwini Khade	1cca903680	update onnx commit id (#5594 ) * update onnx commit id * update onnx commit for docker images * update docker images	2020-11-02 09:46:36 -08:00
Maajid khan	d98062da0c	[OpenVINO-EP] Hetero support (#5627 ) * Implement Hetero in UEP * Added security checks to take valid Hetero combinations as device type * Integrating Hetero features * Get the statistics Report in Debug Mode Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Passing right device type for vadm_baackend Added simple fix to pick the right device type when using vadm_backend with Hetero as well. Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Fixed batching logic for 2020.4 and above * Fixed flake8 PEP8 errors Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Minor Fixes Added Added security checks for device_type passed in for Hetero build during run time code cleanup Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Minor changes Added Fixed batch_size bug in vadm_backend code cleanup *Documentation updated for Hetero Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> Co-authored-by: suryasidd <surya.siddharth.pemmaraju@intel.com>	2020-10-30 22:35:08 -07:00
Changming Sun	d9293f38e6	Revert "Custom Op on GPU (#5620 )" This reverts commit `2c63196600`.	2020-10-30 21:23:51 -07:00
RandySheriffH	2c63196600	Custom Op on GPU (#5620 ) * add case for cpu custom op on gpu * format doc * restrict GPU custom op on Linux GPU CI only * separate cu file to a independent project * fix typo Co-authored-by: RandySheriffH <rashuai@microsoft.com>	2020-10-30 12:25:44 -07:00
Weixing Zhang	aec4cb489e	ROCm EP for AMD GPU (#5480 ) The ROCm EP is designed and implemented based on AMD GPU software stack named ROCm. Here is the link for the details about ROCm: https://rocmdocs.amd.com/en/latest/ ROCm EP was created based on the following things: 1. AMD GPU programming language: HIP 2. AMD GPU HIP language runtime: amdhip64 3. BLAS: rocBLAS, hipBLAS 4. DNN: miOpen 5. Collective Communication library: RCCL 6. cub: hipCub 7. … Current status: BERT-L and GPT2 training can be ran on AMD GPU with data parallel. Next: 1. Make more GPU code be sharable between ROCm EP and CUDA EP since HIP language and HIP runtime API are very close to CUDA. 2. Continue improving the implementation. 3. Continue GPU kernel optimization. 4. Support model parallelism on ROCm EP. …… The rocm kernels have been removed from this commit and will be in a separate PR. Since the original PR was too big(~180 files), it was suggested to split the PR into two parts, one is rocm-kernels, the other is non rocm kernels. Co-authored-by: Weixing Zhang <wezhan@microsoft.com> Co-authored-by: sabreshao <sabre.shao@amd.com> Co-authored-by: anghostcici <11013544+anghostcici@users.noreply.github.com> Co-authored-by: Suffian Khan <sukha@microsoft.com> Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2020-10-29 17:13:04 -07:00
Ramakrishnan Sivakumar	5bcb5f5a3d	MLAS: Add support for AVXVNNI (#5592 ) Adds Gemm kernels with AVXVNNI support for Int8 acceleration	2020-10-26 16:27:48 -07:00
Hariharan Seshadri	44773c60e3	Add a CUDA based IOBinding test (#5572 )	2020-10-26 10:57:36 -07:00
Tracy Sharpe	502f67ba58	MLAS: implement u8x8 GEMM for aarch32 (#5580 )	2020-10-25 23:05:12 -07:00
Andrews548	20bc83400b	ACL/ArmNN update (#5515 ) * Build ACL and ArmNN with custom library path * Define import to tensor as a separate function for maintenance and readability * Enabled optimized depthwise convolution for ACL v20.02 * Check operation status for ACL and ArmNN Execution Providers * Enabled fused operation for convolution-activation Co-authored-by: Andrei-Alexandru <andrei-alexandru.avram@nxp.com>	2020-10-22 09:29:44 -07:00
Changming Sun	5802fe1699	Remove MKLML build config (#5559 ) Remove MKLML build config	2020-10-21 13:11:25 -07:00
Ashwini Khade	df22611026	Update ONNX commit (#5487 ) * update ONNX * update onnx + register kernels for reduction ops * bug fix kernel reg * update cgmanifests * revert unsqueeze op 13 registration * filter ops which are not implemented yet * filter some tests * update onnx commit to include conv transpose bug fix * update docker images * undo not required test changes * fix test failures	2020-10-21 07:22:20 -07:00
Tracy Sharpe	45483dcf1f	Add QLinearConv for activations=u8, weights=s8 (#5510 )	2020-10-20 08:45:13 -07:00
Ryan Lai	f207f0bf5e	Add WinML Model testing (#5417 ) * Model test start with float * Clean up code and add environment variable detection * Move into namespace * PR comments * Fix linker errors in latest merge to master and also fix warning * add skipping model test mechanism * Return std::string instead of writing to buffer * Address case where env variable is larger than max_path * use const static string for test reason * Disable x86 tests and don't build if ort memory checker is enabled * Add comment * Add additional failing x86 tests and ifdef for checking fo rx86 build * PR comments	2020-10-15 19:04:12 -07:00
sfatimar	6d2a30eae3	[OPENVINO-EP] 2021.1 Release (#5431 ) * Cmake changes for 2021.1 * added new ov version 2020.1 for faster rcnn * Added missing defs * equal op modified * changes to incoroporate faster rcnn * backend util.cc * hddl_plugin_config.hpp is depreceated . instead use hddl_config.hpp * changing myriad precision bool to i32 * gather is not enabled for gpu * conv2D and pooltest auto_pad attribute should not be null * negative indices are not valid for scatter op in myriad * non max suppression op only supported in faster rcnn mode * maxpool indices output is not supported * Cleaned redundant code in backends * Added ifdefs for HDDL config * cast output dimensions check topk operator k input it seems only resolved for myriad as it is throwing issues for ask rcnn . need to verify * we are limiting the subgraph size to 3 here * taking care of review comments * Fixed minor bugs * Modified Slice op checks * Added NonZero, Upsample * Removed TopK if it's in the middle of a subgraph * incorporated upsample conditions too * Dockerfile changes for 2021.1 release * dockerfile aptkey update * Minor fixes * ceil condition added again * Fixed few gpu models * Disabled LSTM and yolov3 in ModelTests * python softmax cross entropy tests and negative log likelihood * Update Build.md Updated for openvino 2021.1 * Update OpenVINO-ExecutionProvider.md update openvino execution provider for 2021.1 * Update READMe.md updated new openvino version * Update Dockerfile.openvino added environment variable for DEBIAN Frontend * Fixed myriad models * Fixed gather condition * Fixed mask rcnn model on myriad * Modified Gather condition * set default target of MCR dockerfile to MYRIAD_FP16 * Fixed tinyolov3 on CPU * Update OpenVINO-ExecutionProvider.md update openvino execution provider documentation * Update Dockerfile.openvino Removed environment variable * Update OpenVINO-ExecutionProvider.md update image manipulation networks supported * Update onnx_backend_test_series_filters.jsonc removed test_upsample_nearest from cpu test cases * New InternalCI changes for 2021.1 * Full protobuf removed for OpenVINO * Protobuf added * Updated with apt installation for openvino * Revert the testing changes * Reverted testing changes * File permessions are changed to original * Deleted openvino installation and cmake change * Optimized Dockerfile Removed unnecessary cmake installation, numpy * Added missing ifdefs * delete array fix * backend_utils.cc output_shape * Revert "set default target of MCR dockerfile to MYRIAD_FP16" This reverts commit 928d3e2b71e2f589cf51dacd3a133951cf9ca18d. Co-authored-by: suryasidd <surya.siddharth.pemmaraju@intel.com> Co-authored-by: sfatimar <sahar.fatima@intel/com> Co-authored-by: suryasidd <48925384+suryasidd@users.noreply.github.com> Co-authored-by: S. Manohar Karlapalem <manohar.karlapalem@intel.com> Co-authored-by: Aravind <aravindx.gunda@intel.com> Co-authored-by: Aravind Gunda <38353114+gundaarx@users.noreply.github.com>	2020-10-14 15:56:00 -07:00
Scott McKay	5544391e79	Fix linking of MLAS unit test lib on platforms where libatomic is required. (#5469 )	2020-10-14 07:25:43 +10:00
Wenbing Li	80d36eab86	enable the onnxruntime shared library test on iOS (#5443 ) * enable the onnxruntime shared library test on iOS * fixing as commented. * add return status check.	2020-10-12 21:40:57 -07:00
stevenlix	186f0668b0	update onnx-tensorrt submodule (#5442 )	2020-10-09 21:49:40 -07:00
Scott McKay	a92ccbe1bc	Various armv7 related fixes (#5394 ) * - Link with libatomic if needed - Install pip differently so it doesn't clash with the system pip which may involve a wrapper script - Remove ability to specify offset when Tensor allocates the data. The data prior to offset isn't accessible by anything. - Fix use of offset in TensorOpTest to work on armv7 where it must be aligned to the type it points to. - Fix ActivationOpNoInfTest.Softsign to allow for armv7 behavior - Fix ReductionOpTest.ReduceMean_keepdims to allow for armv7 floating point inaccuracy Address PR comments	2020-10-09 22:34:32 +10:00
Tianlei Wu	8133223871	clear cudaDelayLoadedLibs since delayload is disabled (#5386 )	2020-10-07 11:33:12 -07:00
Guoyu Wang	deb708d3b1	Move flatbuffers to 1.12 release (#5392 )	2020-10-07 09:23:03 -07:00
Tracy Sharpe	0122e890d9	MLAS: implement u8x8 GEMM for ARM64 (#5380 ) Add an implementation for u8u8/u8s8 GEMM for use on ARM64 (Windows/Linux).	2020-10-06 19:22:23 -07:00
Guoyu Wang	b4934b0016	Mitigate pybind11 build break using Xcode 12 on macOS (#5381 ) * turn dev_mode off if we are using macos to build python with xcode 12 * Address CR comments * Add ways to check compiler version	2020-10-06 19:03:33 -07:00
Wenbing Li	ed102e9d88	Add iOS test pipeline and a sample app. (#5298 ) * Add iOS test pipeline and a sample app. * clean up the unused code. * clean up. * revert the unknown change * disable the shared library for iOS. * add open source notice text. * ignore the skipped test. * extract the common ortenv setup	2020-09-29 13:53:11 -07:00
Changming Sun	1a04b8f8b7	Add valgrind support to our cmake files (#5296 )	2020-09-28 09:31:08 -07:00
Guoyu Wang	fec890a09a	fix build break (#5306 )	2020-09-28 00:10:48 -07:00
George Wu	16d35266ab	add install targets for ep shared libs (#5286 )	2020-09-25 07:10:43 -07:00
Guoyu Wang	3a3f26f38e	Move ort flatbuffers helper functions and value info r/w functions into separated lib (#5276 ) * Move fbs include from header to cc * add initial cmake for flatbuffers * Move most flatbuffers util to ort_flatbuffers * move code around * fix * move test/perf runner to use flatbuffer directly instead of model * minor update * Fix build break * Clean up includes and foward decl * Fix traning CI build breaks * Addressed PR comment, replaced some include with forward decls * Remove ORT_MUST_USE_RESULT temporarily	2020-09-25 05:36:29 -07:00
Ryan Lai	71b52ad5de	Fix inbox telemetry (#5265 ) * ifdef to check if redist or not * Fix redist telemetry Co-authored-by: Ryan Lai <ryalai96@gamil.com>	2020-09-24 14:58:07 -07:00
Dwayne Robinson	6ad39819c2	Update DirectML Nuget to 1.3.0 (#5274 ) Update to 1.3.0	2020-09-23 22:53:02 -07:00
Justin Stoecker	56862f4022	Add way to disable additional linker opt flags	2020-09-23 12:56:40 -07:00
George Wu	b5a6a8e847	remove implicit linking of tensorrt and dnnl ep shared libs (#5262 ) * remove trt and dnnl from link command * add comment	2020-09-23 05:47:18 -07:00

606 commits