onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-04 04:07:22 +00:00

Author	SHA1	Message	Date
Yulong Wang	054464dce2	fix XNNPACK on WebAssembly SIMD (#13161 ) ### Description Fix XNNPACK on WebAssembly SIMD. The flag "-msimd128" needs to be applied to every source file when compiling WASM SIMD. Currently only some of the source files are compiled with this flag, so we get inconsistent results for `sizeof(xnn_f32_minmax_params)` because the type definition includes an `#ifdef` for `__wasm_simd128__`. The inconsistency causes garbage data to be written to a stack variable and eventually causes the crash. XNNPACK libraries are C libraries, so the build flags need to be applied not only to `CMAKE_CXX_FLAGS` but also to `CMAKE_C_FLAGS`.	2022-09-30 16:34:15 -07:00
George Nash	b76a65c784	Upgrade the oneDNN ep to use oneDNNv2.7 (#13175 ) ### Description This updates the oneDNN library used by oneDNN ep from version 2.6 to version 2.7 ### Motivation and Context This brings in the many improvements incorporated into the oneDNN library to the oneDNN execution provider. Signed-off-by: George Nash <george.nash@intel.com>	2022-09-30 12:29:17 -07:00
cloudhan	c93cb8f949	Revert "Enable ROCm to use tunable GEMM" (#13160 ) Reverts microsoft/onnxruntime#12853 due to CI pipeline problem.	2022-09-30 14:01:16 +08:00
cloudhan	32c2c4b480	Change ROCm to use tunable GEMM (#12853 ) Change ROCm to use tunable GEMM. It is not enabled in this PR. This will drastically improve GEMM performance in some shapes and dtypes configuration. This will benefit the overall performance for BERT inference and hopefully, training, when enabled.	2022-09-28 16:21:54 +08:00
Rachel Guo	9a44a69653	Refactor NNAPI EP OpBuilder/OpSupportChecker structure (#13065 ) ### Description As title -Split long OpBuilder and OpSupportChecker files into individual operator files. -Add OpBuilder/SupportChecker registry factories. -Combine the functionality of op_builder and op_support_checker into one op_builder. ### Motivation and Context The NNAPI OPBuilder was split into OPBuilder (for EP::Compile) and OPSupportChecker (for EP::GetCapability). At the time it was a reasonable choice, but OPBuilder/OPSupportChecker share some logic and have to use an additional helper. Clean up now to make NNAPI OPBuilder/OPSupportChecker into a single OPBuilder (similar to what CoreML EP has)	2022-09-27 17:12:09 -07:00
Changming Sun	b25437ec41	Upgrade protobuf version (#13100 ) Upgrade protobuf version from 3.18.1 to 3.18.3 to address CVE-2022-1941	2022-09-26 21:30:28 -07:00
RandySheriffH	77a066c700	Drop nuphar from java API (#13107 ) Drop nuphar from: - java API - tvm.cmake - run_build.sh	2022-09-26 17:06:08 -07:00
RandySheriffH	a83a9ed6b0	Remove miscellaneous nuphar configs (#13070 ) Remove a handful of nuphar related configurations after deprecation. Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2022-09-26 13:41:28 -07:00
Dale Phurrough	2ae33b3613	fix CuDNN lib path for Windows (#12974 ) Fixes microsoft/onnxruntime#12969 ### Motivation and Context Build is broken, can't find cudnn.lib with nvidia official install of cuDNN Alternative method is to use `IF(EXISTS ${onnxruntime_CUDNN_HOME}/lib/x64/cudnn.lib)` to test for legacy location and only add the legacy dir to the path, else add the current official `lib/` dir.	2022-09-26 13:23:38 -07:00
Changming Sun	eafd67b8fd	Update CUDA version to 11.6 and refactor python packaging pipeline (#13002 ) 1. Update CUDA version from 11.4 to 11.6. 2. Update Manylinux version 3. Upgrade GCC version from 10 to 11 for most x86_64 pipelines. CentOS 7 ARM64 doesn't have GCC 11 yet. 4. Refactor python packaging pipeline: a. Split Linux GPU build job to two parts, build and test, so that the build part doesn't need to use a GPU machine b. Make the Linux GPU build job and Linux CPU build job more similar: share the same bash script and yaml file. 5. Temporarily disable Attention_Mask1D_Fp16_B2_FusedNoPadding because it is causing one of our packaging pipeline to fail. I have created an ADO task for this.	2022-09-23 00:29:27 -07:00
cloudhan	a24b41d92e	Move all TunableOp related facilities to EP level directory (#12857 ) Some Ops in EP directory instead of contrib_ops directory will require TunableOp. We will also need to add EP level session tuning options for it. So move all of that code at once. Also remove duplicated utility functions.	2022-09-23 11:10:19 +08:00
wangxiyuan	952c99304a	Add CANN EP (#12416 ) Description: This PR adds Ascend CANN execution provider support. Motivation and Context: As shown in the linked issue, CANN is the API layer for Ascend processors. Adding the CANN EP allows users to run ONNX models on Ascend hardware via onnxruntime. Detailed changes: 1. Added CANN EP framework. 2. Added the basic operators to support ResNet and VGG models. 3. Added C/C++ and Python API support. Related issue: https://github.com/microsoft/onnxruntime/issues/11477 Author: lijiawei <lijiawei19@huawei.com> wangxiyuan <wangxiyuan1007@gmail.com> Co-authored-by: FFrog <ljw1101.vip@gmail.com>	2022-09-22 14:53:40 -07:00
sfatimar	cccbe90764	Openvino ep 2022.2 v4.2 (#13023 ) These changes align the OV 2022.2 release with ORT. Changes: CPU FP16 Support, dGPU Support, RHEL Dockerfile, Ubuntu 20 Dockerfile. Motivation and Context - This change is required to ensure ORT-OpenVINO Execution Provider is aligned with latest changes. Co-authored-by: mayavijx <mayax.vijayan@intel.com> Co-authored-by: shamaksx <shamax.kshirsagar@intel.com> Co-authored-by: pratiksha <pratikshax.bapusaheb.vanse@intel.com> Co-authored-by: pratiksha <mohsinx.mohammad@intel.com> Co-authored-by: Sahar Fatima <sfatima.3001@gmail.com> Co-authored-by: Preetha Veeramalai <preetha.veeramalai@intel.com> Co-authored-by: nmaajidk <n.maajid.khan@intel.com> Co-authored-by: Mateusz Tabaka <mateusz.tabaka@intel.com> Co-authored-by: intel <intel@iotgecsp-nuc04.iind.intel.com>	2022-09-22 12:31:40 -07:00
Adam Louly	268bfe2a5d	python training api bindings (#12610 ) Description: Python API Bindings for on device training. Motivation and Context - This PR contains api bindings so python users can perform a whole training loop. Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>	2022-09-16 09:38:24 -07:00
sumitsays	363c695dad	Update DML 1.9.0 to 1.9.1 (#12966 ) Update DML to 1.9.1 Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>	2022-09-15 10:54:22 -07:00
cloudhan	10f9a69707	Use CMake EXCLUDE_FROM_ALL for composable kernels to avoid building of conv related kernels (#12855 )	2022-09-14 22:11:31 -07:00
Chun-Wei Chen	d819b56fba	Consume ONNX 1.12.1 to prevent vulnerability issue while loading external file (#12915 ) * consume ONNX 1.12.1 to prevent vulnerability issue while loading external tensors * update ONNX 1.12.1 * test updated PR * use official rel-1.12.1 commit	2022-09-14 21:10:24 -07:00
Scott McKay	022d9e2d0c	Get files for XNNPACK wasm build from BUILD.bazel. (#12892 ) Get files for wasm build from BUILD.bazel.	2022-09-09 12:38:57 -07:00
pallavides	6ebb7b91eb	Re-apply fix for mkl issue for eager mode (#12881 ) * reapply fix for mkl issue for eager mode * add comment, update link libs	2022-09-08 12:29:24 -07:00
RandySheriffH	d3b684cd9e	Drop nuphar (#11555 ) * drop nuphar code and configs * refactor test case * format python * remove nuphar from training test * remove commented nuphar logics * restore llvm setting * drop nuphar ci * fix compile err * fix compile err Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2022-09-07 15:11:18 -07:00
Hariharan Seshadri	ad69aac491	Introduce ordered quantization ops for the CUDA EP [1/n] (#12582 ) Initial core small set for the ordered quantization ops for cuda EP.	2022-09-07 11:58:15 -07:00
Guenther Schmuelling	f856be162e	fix xnnpack wasm build (#12845 )	2022-09-06 19:20:07 -07:00
Jan Tilly	437409c343	Add DONT_VECTORIZE flag to cmake (#12169 ) Add DONT_VECTORIZE flag.	2022-09-07 12:14:14 +10:00
Yulong Wang	726251609a	increase max memory to 4G for wasm (#12798 )	2022-09-06 17:07:13 -07:00
Xavier Dupré	54360c88d2	Disable two warnings raised by tensorboard on Visual Studio (#12773 )	2022-09-06 20:42:52 +02:00
Baiju Meswani	295bd26980	Remove orttraining-distributed CI pipeline (#12738 )	2022-09-02 14:34:26 -07:00
Changming Sun	ca5af24765	Update Sdl.ruleset to remove C26812 from the rules (#12695 )	2022-09-01 20:05:20 -07:00
Sheil Kumar	e3b501125d	DFT on DirectML (#12710 ) * DFT on DirectML * feedback * fix misc build issues * fixes * fix constant cpu inputs and optional tensors for external operators * disable dft tests on 'pure' dml	2022-09-01 08:31:14 -07:00
Yulong Wang	82a28cc2c3	upgrade emsdk to 3.1.19 (#12690 ) * upgrade emsdk to 3.1.19 * fix build break * ignore '-Wunused-but-set-variable' in eigen * add malloc and free in exported functions * EXPORTED_FUNCTIONS	2022-08-30 13:42:45 -07:00
Yi Zhang	27304d9082	GCC version should not be less than 7 (#12771 )	2022-08-29 23:49:29 +08:00
mwootton	817dc94345	Add first pass of rocm kernel profiler (#10911 ) * Add first pass of rocm kernel profiler * Clean up rocm_profiler. Format args. Demangle kernel names. Add Api EventRecords * Remove debug output * Temporarily disable profiling unit test 'api record check' for cupti * Fix compile error for non-gpu builds * Use common file for demangle and pid/tid. Namespace ThreadUtil. Fix gpu buffer clearing. * Merge demangle into profiler_common * Merge demangle into profiler_common part 2 * Style cleanup * Resolve linking issues via ProviderHost interface * Demangle cuda kernel names * Clean up comments * Fix formatting * Fix anal retentive formatting	2022-08-26 19:38:03 -07:00
cloudhan	46c074a6c8	Update composable kernel and enable experimental inter wave scheduling (#12626 ) Update ck to latest master and enable interwave scheduling	2022-08-25 22:19:41 -07:00
Changming Sun	7927d525a7	Remove CUDNN path from CI build scripts (#12671 )	2022-08-24 18:21:50 -07:00
Yi Zhang	de3d772995	Check GCC version (#12680 ) * check gcc version	2022-08-24 12:10:08 +08:00
Wei-Sheng Chin	dc486d146b	Make ORT callable from various Pytorch compilers (LazyTensor, TorchDynamo, etc) (#10460 ) * Make ORT as Pytorch JIT backend LORT likely doesn't work with aten fallback so we only test LORT in its own CI. * Revert changes to enable external CUDA allocator. Will add it later. Revert "Revert changes to enable external CUDA allocator. Will add it later." This reverts commit d5487f2e193014c805505afae8fb577c53667658. Fix external allocator * Relax tolerance and remove commented code * Print more information in CI * Fix pointer * Address comments. 1. Reuse ORT-eager mode's environment. 2. Remove unused ctor. * Use Pytorch master branch as all PRs are merged Fix * Refine based on cpplint feedbacks * Revert changes to allow custom CUDA allocator in public APIs * Use torch.testing.assert_close * Use unittest framework * Switch docker repo * Rename .cpp to .cc * Address comments * Add comment * Use same pipeline file for eager and lort pipelines * Address comments * Add yaml comment * Fix cmake files * Address comments * Rename flags, remove printing code, remove dead comment	2022-08-22 09:40:40 -07:00
Yulong Wang	bfdd191eec	[wasm] use same export name for SIMD/NOSIMD build (#12545 )	2022-08-19 18:17:50 -07:00
yf711	9d10badc55	Add build option to link TensorRT prebuilt parser (#12602 ) * Add build option to link prebuilt TensorRT parser * Test without the build option to link prebuilt TRTParser * Minor: update name of build option * Minor: update name of build option	2022-08-16 14:09:58 -07:00
Dmitri Smirnov	616677104a	ONNX Protobuf natvis with some google::protobuf (#12580 ) ONNX Protobuf natvis with some google::protobuf structures Add leading underscore to local Intrinsic	2022-08-15 09:59:07 -07:00
Xinya Zhang	eb827bd3e5	[ROCm] NGramRepeatBlock, LongformerAttention and DecoderAttention Ops (#11971 ) * [ROCm] enable NGramRepeatBlock Op * [ROCm] Enable testing ROCm in NGramRepeatBlockTest.NGramSize_3 Also link onnxruntime_test_all with amdhip64 when USE_ROCM=1 * [ROCm] add LongformerAttention Op * [ROCm] Enable LongformerAttentionTest * [ROCm] Add DecoderAttention Op * Enable DecoderAttention Test for ROCm. * [ROCM] Updates according to reviews	2022-08-11 19:32:08 -07:00
Changming Sun	ac7538b909	Remove CUDA 10.2 support (#12541 )	2022-08-10 22:46:41 -07:00
Cheng	819c36701f	[xnnpack] basic QDQ operators support (#11912 ) * basic ops for mobilenet,qconv,qsoftmax,qavgpool update Xnnpack to latest unit test * NodeUnit: use outputedge to replace output-node * qdq model e2e test * use inlinedvector to replace vector * conv bias check * tensorshape helpers * Refactor xnn_op minmax * Qlinearsoftmax schema update * Remove qlinearsoftmax registration Co-authored-by: Jicheng Wen <jicwen@microsoft.com>	2022-08-11 10:12:51 +08:00
Dwayne Robinson	eb90b52a75	DML EP fix training build error (#12461 ) Fix onnxruntime_training.cmake missing linkage issue	2022-08-05 16:01:25 -07:00
cloudhan	f39354d7cb	Add composable kernel GEMM baseline for kernel explorer (#12364 ) * Split GemmBase RocBlasGemm * Add composable kernel GEMM baseline * Make linter happy * Address review comment * Update bert cases with batchsize * Adjust includes to fix IWYU lint * Only builds and links used ck kernels to improve building time * Remove warmup run on SelectImpl * Add comment to utility function * Mute cpplint * Make RocBlasGemm<T>::SelectImpl semantically correct * Add reduced basic test cases for ck gemm * More robust gemm testing * Fix warnings * Fix grammar	2022-08-04 17:32:20 -07:00
Dmitri Smirnov	a4ef0e7f7b	Remove dynamic allocation for ThreadPool ParallelSection (#12429 ) Use InlinedVector in a TP Store per thread parallel section in std::optional and avoid memory allocation	2022-08-04 09:46:16 -07:00
Dmitri Smirnov	eebaf5f270	Adjust and fix abseil-cpp debugging visualization (#12415 ) Move abseil-cpp.natvis file, add it to PDB, adjust visualization	2022-08-02 15:08:17 -07:00
Edward Chen	f77ab4fea6	Manually add optimization flag for Android Release builds. (#12390 ) With recent versions of NDK (since 23), the `-O` optimization level compile flag is not being passed when building in the "Release" configuration. More details here: https://github.com/android/ndk/issues/1740 Our "Release" Android builds have been built without the optimization flag since we upgraded from NDK 21. This change is a workaround to manually add `-O3` for "Release" Android builds.	2022-08-01 12:49:03 -07:00
George Wu	6bb807ef74	add cuda compute 8.7 to Cmakelists.txt to support Nvidia Orin devices (#12377 ) * add cuda arch 8.7 to cmakelists.txt to support Nvidia Orin devices * add cuda version >= 11 check for orin support	2022-08-01 09:45:58 -07:00
Valery Chernov	1a4868e5c4	[TVM EP] Hot fix of build on Windows of TVM EP with ipp-crypto (#12381 ) fix of build on Windows with ipp-crypto. cmake warnings fix Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>	2022-07-31 14:36:54 +02:00
Valery Chernov	e2423bb55c	[TVM EP] Build on Windows with ipp-crypto support (#12336 ) * update TVM EP docs for ipp-crypto build conditions * add ipp-crypto by ExternalProject_Add Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>	2022-07-28 15:40:19 +02:00
msftlincoln	9cf6912bba	Fix ORT Eager Mode to work with Pytorch 1.12 (#12323 )	2022-07-27 16:24:46 -04:00
Ashwini Khade	ceb76429db	Merge pull request #12056 from microsoft/bmeswani/merge-training_dev/on_device_poc Merge On-Device-Training Offline Tooling and C/C++ APIs	2022-07-21 15:09:48 -07:00
Baiju Meswani	cbf08c7a7b	Make GetTrainingApi as a part of the OrtApis, add Training API documentation and address other pull request review comments	2022-07-21 18:11:48 +00:00
cloudhan	a0074ba9bc	Add baseline gemm for kernel explorer (#12050 ) Use rocblasGemmHelper gemm wrapper from ORT and profile for bert param size only.	2022-07-20 13:49:26 +08:00
Michael Melesse	bb5bd08545	[ROCM] Navi21 fixes pr (#11368 ) * add scripts * update docker scripts * update build script * create run script * add test script * add log 3 flags * use the right build function * build navi * add clean script * add pytorch like soln * only build gfx 1030 * use HOST side var * ignore logs * update scripts * GPU_WARP_SIZE_HOST * update scripts * remove scripts/amd * match main * add GPU_WARP_SIZE_HOST on cuda side * match main * correct gfx1030 * remove print * move gfx add to rocm5.0 * remove inline * make constexpr on cuda side	2022-07-18 22:26:57 -07:00
Valery Chernov	3b0aaa9e0e	[TVM EP] support build on Windows (#11851 ) * add description of build ORT+TVM EP on Windows * fix cmake error related to symlink creation on Windows * add llvm config path to build flags for correct build on Windows * update TVM_EP.md for llvm_config build arg * fix warnings skipping during build on Windows * fix using string or wstring for model path to correct build on Windows (MSVC error) * fix error in custom logger for correct build on Windows * implement glob algorithm for Windows * additional build fixes * update TVM with export of VM symbols for dll * description of nasm issue and workaround * update TVM with export of Executable from VM symbols for dll * description of installation of ipp-crypto dependencies on Windows * cmake key for ipp-crypto build * fix wstring for TVMso EP * fix ipp-crypto build * cmake key onnxruntime_TVM_USE_HASH switch off not specific methods, but full hash functionality * fix absolute path to compiled lib * update TVM_EP.md, fix lint warnings * update TVM_EP.md * small fixes after review * switch on handshake functionality for Linux workflow Co-authored-by: Valery Chernov <valery.chernov@deelvin.com> Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>	2022-07-13 10:48:42 +02:00
cloudhan	785f74979b	Rework cmake for kernel_explorer (#12079 ) Improve CMake for deep integration with ORT, so that we can easily hook ort function of microbenchmarking purpose.	2022-07-13 15:43:32 +08:00
Dwayne Robinson	32a8751dc4	DML EP Update to DML 1.9 (#12090 ) * Update to DML 1.9 * Appease obnoxious Python formatting tool	2022-07-05 16:30:54 -07:00
Baiju Meswani	1aa27e127c	Resolve build conflicts with master	2022-07-05 19:53:54 +00:00
Wenbing Li	479e71a7a8	enable the extensions custom build for java and android (#11823 )	2022-07-05 10:34:14 -07:00
Baiju Meswani	a457ddc41d	Merge branch 'master' of https://github.com/microsoft/onnxruntime into bmeswani/merge_pr	2022-06-30 21:53:07 +00:00
ashbhandare	0ce14c7068	Fix windows cpu build VS2022 (#12032 ) Fix windows cpu build VS2022	2022-06-29 15:45:00 -07:00
Baiju Meswani	6e8edfff0c	Separate training apis from shared core apis (#12027 )	2022-06-29 14:12:29 -07:00
Valery Chernov	8ba8146650	[TVM] handshake mechanism for support of TVMso EP (#11437 ) * infrastructure for handshake mechanism was implemented. sha256 was selected as first hash algorithm * check hash during compile in TVMso EP * add IPP-CRYPTO to external dependencies for TVM EP * made checkHash method constant * removed the public implementation of the SHA-256 algorithm so as not to cause a license conflict * implemented SHA-256 calculation using ipp-crypto library * fix dependency for ipp-crypto * add provider options for hash check * update documentation for added provider options * add hash check condition * fix docs * fix lint * fix ORT_THROW Co-authored-by: Valery Chernov <valery.chernov@deelvin.com> Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>	2022-06-29 14:57:18 +02:00
Edward Chen	f045994389	[NNAPI EP] Update NNAPI headers (#11954 ) Update the NNAPI headers to a more recent version (copied from TF Lite v2.9.1).	2022-06-27 18:54:06 -07:00
Baiju Meswani	d25cf4df26	Merge branch 'master' into training_dev/on_device_poc	2022-06-24 20:18:19 +00:00
Preetha Veeramalai	f54476a42f	Dll version fix ovep4.1 (#11953 ) * Setting default version values for ovep dlls as well * Update backend_manager.cc Co-authored-by: mayavijx <mayax.vijayan@intel.com> Co-authored-by: mohsin <mohsinx.mohammad@intel.com>	2022-06-22 11:09:36 -07:00
Gary Miguel	4bf22e2a40	Update ONNX to 1.12 (#11924 ) Follow-ups that need to happen after this and before the next ORT release: * Support SequenceMap with https://github.com/microsoft/onnxruntime/pull/11731 * Support signal ops with https://github.com/microsoft/onnxruntime/pull/11778 Follow-ups that need to happen after this but don't necessarily need to happen before the release: * Implement LayerNormalization kernel for opset version 17: https://github.com/microsoft/onnxruntime/issues/11916 Fixes #11640	2022-06-21 17:19:52 -07:00
Dwayne Robinson	64f95d400a	Update DML 1.9 Nuget package to fix WindowsAI nuget pipeline build issue (#11934 )	2022-06-21 15:55:51 -07:00
sfatimar	f97bd38c4f	UEP 4.1 release (#11834 ) * Add pypi build changes to latest Master * Add ORT training part of OV build * Disabling SqueezeOpTest.BadAxes * Add ONNXruntime branch ARG to Docker build * Changes to include file details versions * Commit File Version Updates * Change naming for linux build * Add fix for pylint format errors * Fix pylint warnings. * Fix pylint errors - stage 2 Signed-off-by: Preetha Veeramalai <preetha.veeramalai@intel.com> * Fix pylint errors - stage 3 * Fix pylint format - stage4 Signed-off-by: Preetha Veeramalai <preetha.veeramalai@intel.com> * Commit for Wheel Release >0.35.1 Co-authored-by: Preetha Veeramalai <preetha.veeramalai@intel.com> Co-authored-by: mayavijx <mayax.vijayan@intel.com> Co-authored-by: Sahar Fatima <sfatima.3001@gmail.com> Co-authored-by: nmaajidk <n.maajid.khan@intel.com>	2022-06-17 14:49:04 -07:00
Dwayne Robinson	3d99f16e98	Merge pull request #11827 from microsoft/user/dwayner/DmlEp1.9 Integrate WindowsAI feature branch with DML EP features and DML 1.9	2022-06-16 13:04:00 -07:00
George Wu	df5ee6aa4e	[TensorRT EP] support TensorRT 8.4 (#11866 ) * update trt 8.4ga * trt 8.4 linux ci pipeline * fix cmake * placeholder_builder * trt 8.4 windows pipeline * gpu package pipeline * trt 8.4.1.5 , packaging pipeline updates * python packaging * ctest timeout * python packaging test * bump timeout * python format * format * revert * newline * enable trt python tests * typo * python format * disable on windows	2022-06-16 07:46:40 -07:00
Dwayne Robinson	babd6e3fcd	Update DirectML preview package with unmangled names	2022-06-15 18:16:58 -07:00
Scott McKay	d64f23fec0	EP factory creation cleanup and enhancements. (#11798 ) * Rework the EP factory creation setup so we're not cut-and-pasting function declarations in multiple places. Convert append EP for SNPE to be generic, and also use for XNNPACK. Add XNNPACK to C# API * Don't need stub for MIGraphX as it's using provider bridge. * Remove old 'create' functions that aren't applicable now that the EPs are built as separate libraries. * Only use EPs that require the layout transform if the opset is supported by the layout transformer. * Update wasm registration of xnnpack.	2022-06-16 07:01:41 +10:00
Ashwini Khade	f63e28c92f	C API version 0.001 (#11758 ) * C API version 0.001 * fix linker issues * fixes for save checkpoint api * plus fixes based on tests * plus test_runner and other changes * Plus cosmetic updates * remove unnecessary headers * plus some updates * plus more changes Co-authored-by: Ashwini Khade <askhade@microsoft.com@orttrainingdev10.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-06-15 11:13:35 -07:00
Dwayne Robinson	ff8b173286	Typo in DirectML.Debug.dll	2022-06-15 00:18:40 -07:00
Dwayne Robinson	508c76a246	Add missing DirectML.Debug.dll	2022-06-15 00:16:10 -07:00
Dwayne Robinson	4c1a410d54	Unmangle DML preview package filenames	2022-06-14 23:12:58 -07:00
daquexian	3cbbf9dcae	Fix wasm static lib in sub-project (#11671 ) * wasm_static_lib_global Signed-off-by: daquexian <daquexian566@gmail.com> * make wasm static lib global Signed-off-by: daquexian <daquexian566@gmail.com> * fix the property Signed-off-by: daquexian <daquexian566@gmail.com> * add code missing after merge Signed-off-by: daquexian <daquexian566@gmail.com>	2022-06-14 15:18:11 -07:00
Gary Miguel	e8b0d24071	Support per-test tolerances for ONNX tests (#11775 ) Prior to this every test shared the same tolerances. This meant that if an ONNX test failed due to a small but acceptable difference in output, the only alternative was to disable the test entirely. In op set 17, the DFT operator is being added. Without this change, the tests for that operator fail because the output is off by about 5e-5. It's better to keep test coverage for this new op rather than disable the test entirely. Also prior to this change, the global tolerances were not shared between C++, JavaScript, and Python tests. Now they are. Also fix various minor issues raised by linters. Unblocks https://github.com/microsoft/onnxruntime/issues/11640.	2022-06-14 15:12:23 -07:00
Scott McKay	6bf6bac1fd	Add patching of xnnpack CMakeLists.txt to allow building with Emscripten. (#11829 )	2022-06-14 09:31:17 +10:00
Hector Li	7582644f57	cmake changes for SNPE EP (#11821 ) * move code used to find the SNPE libs to a separate cmake file * Roll back the change for libc++_shared, it's the one from SNPE SDK, otherwise it will cause uncaught exception of type std::bad_cast because of conflict	2022-06-13 08:15:37 -07:00
Dwayne Robinson	50e0a193c8	Merge branch 'master' into user/dwayner/DmlEp1.9	2022-06-11 19:01:51 -07:00
Dwayne Robinson	76024b8a6a	Update DirectML.dll to 1.9.0 Preview	2022-06-11 18:51:32 -07:00
pengwa	fb88efbe18	End to end run pass (on device training) (#11694 ) * lr_scheduler implementation (cherry picked from commit d9c2552b3a3b2ff38ee0a14770257aa1169f6fa9) * refactor Module/Optimizer constructor. * add intermediate API layer bridging public interfaces with internal ones. * synthetic data loader * make end to end run pass * avoid many session input copy (CPU to GPU) some clean up * NVTX for runner * minor fix after sync * revert to let Module/Optimizer handle session creation. * fix tests & test file folder consolidation * refine based on comments & fix cpplint * typos	2022-06-10 15:25:44 -07:00
Guenther Schmuelling	d4ea59654c	make xnnpack build for ort-web (#11745 ) * make xnnpack build for ort-web * make ci happy	2022-06-10 08:47:57 -07:00
Vincent Wang	5ecfaef042	ATen Fallback for Inference (#11597 ) * aten op for inference * fix build error * move some code to training only * remove domain from operator name * move aten_op_executor ext out from ortmodule * add pipeline * add exec mode * fix script * fix ut script * fix test pipeline * failure test * rollback * bugfix * resolve comments * enable aten for python build only * fix win build * use target_compile_definitions * support io binding * turn off aten by default * fix ut Co-authored-by: Vincent Wang <weicwang@microsoft.com> Co-authored-by: zhijxu <zhijxu@microsoft.com>	2022-06-09 16:07:30 +08:00
Alex Fuller	8156b9370c	[Abseil] Adding URL_HASH so that an existing archive can be used from disk (#11690 )	2022-06-08 17:12:59 -07:00
pengwa	540935aace	lr scheduler implementation (on device training) (#11714 ) * lr_scheduler implementations * rename test_runner to test_trainer. * add unit tests * address comments	2022-06-09 08:04:30 +08:00
Changming Sun	eeeb249a27	Update onnxruntime_providers.cmake to remove the reference of "onnxruntime_tvm_dependencies" (#11780 )	2022-06-08 09:06:00 -07:00
Valery Chernov	4296968f20	[TVM EP] update set input method for VirtualMachine (#11674 ) * update TVM * get alignment constant from TVM * update TVM_VM_SetInputs to upstream with TVM API * fix CI issue: update TVM EP dependencies * add sudo * revert changes needed to install missing package * add package for TVM EP CI Co-authored-by: Valery Chernov <valery.chernov@deelvin.com> Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>	2022-06-04 09:31:01 +02:00
Hector Li	95a16c1ffe	Snpe ep (#11665 ) * Initiate Ort SNPE EP * fix snpe ep windows build which is caused by the utility method (ToUTF8String) name change on master * correct the source path for libonnxruntime.so while building for Android package * add AdditionalDependencies for arm64 * On MS-Windows, the patchfile must be a text file, i.e. CR-LF must be used as line endings. A file with LF may give the error: "Assertion failed, hunk, file patch.c, line 343," unless the option '--binary' is given. * fix build failure if snpe is not enabled * update doc for contrib op * separate out snpe ep settings to onnxruntime_snpe_provider.cmake * renaming according to review comments * update according to review comments	2022-06-03 14:10:02 -07:00
Scott McKay	4445dd6bc1	XNNPACK EP (#11445 ) * Implement XNNPACK support via an EP. * Layout transform uses the GraphPartitioner infrastructure. * Node fusion is supported. * Conv and MaxPool implementations were ported from Changming's PR. * Added optional mutex in InferenceSession::Run as we only want to allow sequential calls if xnnpack is enabled	2022-06-03 20:22:34 +10:00
ashbhandare	1c316d0e39	Parameter,Module and Optimizer changes (#11494 ) * Module step * On device training offline composition * Working grad accumulation with test for TrainStep * Temp changes * Revert "On device training offline composition" This reverts commit `ec3da68247`. * cleanup * Implement eval step * Use new graphs and checkpoints * Optimizer test, changes * review comments * review comments * review comments Co-authored-by: Aishwarya Bhandare <aibhanda@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>	2022-05-31 09:20:47 -07:00
pengwa@microsoft.com	e1c63cb06a	Merge branch 'master' of https://github.com/microsoft/onnxruntime into training_dev/on_device_poc	2022-05-28 01:54:17 +00:00
Scott McKay	4fabc400de	Fix CUDA 11.6 build error on Windows (#11578 ) * Avoid windows header that defines 'small'	2022-05-28 08:04:46 +10:00
Jeff Bloomfield	a7fa735286	Merge remote-tracking branch 'origin/master' into WindowsAI	2022-05-27 12:53:54 -07:00
Baiju Meswani	3a22a866a1	On device training offline tooling (#11520 )	2022-05-24 18:21:39 -07:00
Yi Zhang	a3f05da338	Revert "[TVM EP] update set input to remove excess copying inside TVM (#11247 )" (#11504 ) This reverts commit `5ae461ec0a`.	2022-05-13 02:27:36 +08:00
Tianlei Wu	ece1274ffa	revert safeint version (#11500 )	2022-05-12 11:24:43 -07:00
Tianlei Wu	f5473596fa	Change longformer default kernel (#11470 ) * change default to compact memory kernel * Remove a cuda stream synchronize that is not needed * Update longformer benchmark tool	2022-05-11 10:54:59 -07:00


1209 commits