onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-26 22:35:43 +00:00

Author	SHA1	Message	Date
Yifan Li	e2c214d81f	[TensorRT EP] TRT 8.6 minor version update (#16475 ) ### Description * Minor version update: TRT 8.6.0.12->8.6.1.6 * CI pipeline ymls/dockerfiles are updated * cgmanifest.json/deps.txt/download-deps.yml are updated; Win trt binaries uploaded to [win img 307029](https://aiinfra.visualstudio.com/AI%20Infra%20Management/_build/results?buildId=307029&view=results) * Re-enable unit tests which failed in 8.6.0 and regained support in 8.6.1	2023-06-26 10:44:27 -07:00
Scott McKay	48eff09664	Fix file list for test of build with IO debug (#16474 ) ### Description <!-- Describe your changes. --> Update file list to adjust for recent changes to test infra. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-06-26 16:36:22 +10:00
Chen Fu	5c125b4366	Cfu revertamx (#16455 ) ### Description This is to revert two PRs that aim at reducing AMX toolchain requirements. Unfortunately we still have some pipeline issues. https://github.com/microsoft/onnxruntime/pull/16390 https://github.com/microsoft/onnxruntime/pull/16086 ### Motivation and Context Looks like gcc link time optimization does not work very well with inline assembly in the above PRs.	2023-06-23 09:20:23 -07:00
Baiju Meswani	10ba1e270c	Minimal Build for On-Device Training (#16326 ) 🛠️ __Changes in this pull request:__ This pull request introduces two significant changes to the project: - Changing on device training checkpoint format: The current implementation stores the on device training checkpoint as a sequence of tensors in multiple files inside a checkpoint folder, which can be inefficient in terms of storage and performance. In this PR, I have modified the checkpoint format to utilize the flatbuffer table to save the checkpoint to a single file, providing a more compact and efficient representation. The changes around this are twofold: - Add the checkpoint flatbuffer schema that will generate the necessary checkpoint source files. - Update the checkpoint saving and loading functionality to use the new format. - Adding support for onnxruntime minimal build: To support scenarios where binary size is a constraint, I made changes to ensure that the training build can work well with the minimal build. 🔍 __Open Issues:__ - In order to extract the optimizer type, the existing implementation re-loaded the onnx optimizer model and parsed it. This is no longer possible, since the model format can either be onnx or ort. One idea is to do the same for ort format optimizer model. This needs some investigation. - Changes to the offline tooling to generate ort format training artifacts. - End-to-end training example showcasing the use of the minimal training build. - Add support for export model for inferencing in a minimal build.	2023-06-22 12:27:23 -07:00
RandySheriffH	6e29e185f3	Clean AzureEP logics (#16367 ) Moving out AzureEP invokers out of core runtime. --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-06-21 09:38:52 -07:00
Chi Lo	4e3cff60fd	CUDA graph support for TRT EP (#16081 ) CUDA EP already supports [CUDA graph](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#cuda-graphs), also we observed some models can benefit from using CUDA graph with `trtexec`. Therefore, this PR enables the CUDA graph support for TRT EP. The implementation is based on https://github.com/microsoft/onnxruntime/pull/9978 with the same [constraints](https://github.com/microsoft/onnxruntime/pull/9978) as below: - Models with control-flow ops (i.e. If, Loop and Scan ops) are not supported. - Usage of CUDA Graphs is limited to models where-in all the model ops (graph nodes) can be partitioned to the TRT EP. - The input/output types of models need to be tensors. - Shapes of inputs/outputs cannot change across inference calls. - IObinding is required.	2023-06-21 09:36:45 -07:00
Yuhong Guo	48e6186b1a	Move tests from core/providers/cuda/test/* to test/providers/cuda/ and refactor CUDA UT (#16161 ) ### Description <!-- Describe your changes. --> 1. Add a new test lib `onnxruntime_providers_cuda_ut`, which is similar to `onnxruntime_providers_cuda` but is only built if `onnxruntime_BUILD_UNIT_TESTS` is set. We can call all CUDA UTs through this ut lib without affecting the production lib `onnxruntime_providers_cuda`. 2. Move all test cases from `core/providers/cuda/test/` to `test/providers/cuda/`. These test cases are built into lib `onnxruntime_providers_cuda_ut` and run by `./onnxruntime_test_all --gtest_filter="CUDA_EP_Unittest"`. Since the lib is only for testing, we can use gtest macros in the test cases. The previous implementation did not support using the gtest lib in the CUDA UT cases. 3. The cmake code in `cmake/onnxruntime_providers.cmake` is refactored a bit. A new function `onnxruntime_add_object_library` builds an object target. The 2 libs `onnxruntime_providers_cuda_ut` & `onnxruntime_providers_cuda` share most of the code, so the object files can be used in both libs, which helps reduce build time. Another function `config_cuda_provider_shared_module` is used to configure all 3 similar targets (onnxruntime_providers_cuda_obj/onnxruntime_providers_cuda/onnxruntime_providers_cuda_ut). 4. Refactored the test to call `testing::InitGoogleTest` & `RUN_ALL_TESTS` in `libonnxruntime_providers_cuda_ut.so`'s `TestAll`. After this change, we can see all the cases running in `CUDA_EP_Unittest.All`: ![image](https://github.com/microsoft/onnxruntime/assets/19584326/8ff80df6-060b-4ef0-90b7-657e68d3db87) ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. 
--> After https://github.com/microsoft/onnxruntime/pull/13016, there are still test files in test/providers/cuda/ that were not moved to core/providers/cuda/test/, and their test cases are disabled. This PR helps to clean up the unfinished TODOs. Even though onnxruntime_shared_lib_test covers some tests for the CUDA provider, it works like a coarse-grained end-to-end test. If the CUDA unit tests can run cases for a single component, this would be helpful for CUDA developers. --------- Co-authored-by: Yuhong Guo <yuhong.gyh@antgroup.com>	2023-06-20 14:54:55 -07:00
Prateek Chokse	12dffef768	added support for cmake "find_package" (#8919 ) Description: Adds support for cmake find_package. Motivation and Context As mentioned in issue #7150 onnxruntime doesn't have support for CMake find_package, this PR adds that and also adds the CMake package version file. Now anyone can link onnxruntime like this: ```cmake find_package(onnxruntime) add_executable(test Source.cpp) target_link_libraries(test PRIVATE onnxruntime::onnxruntime) ``` this also simplifies #3124	2023-06-19 22:20:31 -07:00
Dipanjan Sengupta	35fa6af428	Fix for the build break in AMX feature on Mac OS. (#16390 ) ### Description Fixing the build break issue in Apple pipeline due to AMX flag removal.	2023-06-16 21:00:41 -07:00
Scott McKay	8fdfd20191	Separate out operator vs model testing. (#16228 ) ### Description <!-- Describe your changes. --> Split up OpTester to separate out operator vs model testing. This led to a lot of other cleanups/refactoring. - create BaseTester class and derived OpTester/ModelTester classes to limit APIs to what is applicable for each test type - e.g. adding an attribute isn't relevant to a model test - cleanup structure - don't expose member variables either directly or via public methods returning them - split out checkers so they can be easily re-used - refactor so there's one public Check method for comparing two OrtValue instances containing any data type - refactor the GradientOpTester usage - it required a lot of OpTester internals to be exposed and no other tests needed this - it also returned Status through various parts which prevented the usage of the google test macros which provide better output. change to return void and use the macros. - fix some other minor issues - update some cmake files so all the source files are included - remove some low value helpers (FetchTensor and GetShapeVector) - remove some outdated code to allow unreleased opset versions from when onnx opset 15 wasn't released - move files from test/util/include/test to test/util/include - doesn't seem to be any reason for the additional subdirectory given they're not files use to test the code in test/util - files were moved with no changes ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Cleanup test infrastructure. --------- Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2023-06-17 12:58:57 +10:00
saurabh	a6ce7b339f	Enable model subgraph execution in OVEP and setting the OpenVINO dll's to the path from the OpenVINO pypi package in OVEP and fix OVEP windows io buffer sample (#16147 ) ### Description This PR enables execution of subgraphs in OVEP. Currently, when OVEP developers install the onnxruntime-openvino package on Windows from PyPI, they have to additionally download the OpenVINO Windows binaries and run the setupvars.bat script, which sets the environment PATH to locate the OV DLLs. This PR also fixes issues with the OVEP Windows IO buffer sample. ### Motivation and Context Fix: We want to make the user experience easy for OVEP Python developers on the Windows platform. This fix introduces a function add_openvino_libs_to_path at tools/python/util/add_openvino_win_libs.py. OVEP Python users can call this function in their application code, and it takes care of adding the OpenVINO DLLs to the path from the installed OpenVINO PyPI package (openvino). This change also makes sure that the add_openvino_libs_to_path() function is added to the onnxruntime Python package only when it is built for the OpenVINO Execution Provider for ONNXRuntime, and not for default ORT Python package builds. New user experience for Python OVEP developers on the Windows platform: step 1: pip install onnxruntime-openvino step 2: pip install openvino step 3: <Add these 2 lines in the application code> import onnxruntime.tools.add_openvino_win_libs as utils utils.add_openvino_libs_to_path() --------- Signed-off-by: MaajidKhan <n.maajid.khan@intel.com> Co-authored-by: MaajidKhan <n.maajid.khan@intel.com> Co-authored-by: Suryaprakash Shanmugam <suryaprakash.shanmugam@intel.com>	2023-06-16 19:47:09 -07:00
Silvio Traversaro	4915191e63	Fix build of Python wheel on Windows with single-config generator (#16337 ) ### Description Before this PR, the CMake code assumed that a multi-config CMake generator was used on Windows and a single-config CMake generator on non-Windows platforms. After this PR, this information is obtained from the [`GENERATOR_IS_MULTI_CONFIG`](https://cmake.org/cmake/help/latest/prop_gbl/GENERATOR_IS_MULTI_CONFIG.html) global CMake property. ### Motivation and Context I discovered this problem when building with the Ninja generator on Windows, but I guess this should also fix problems on non-Windows platforms when using a multi-config generator (such as Xcode on macOS or "Ninja Multi-Config" on all platforms). See https://cmake.org/cmake/help/latest/prop_gbl/GENERATOR_IS_MULTI_CONFIG.html for more info.	2023-06-16 09:17:49 -07:00
Dipanjan Sengupta	681a0d084d	Removing AMX build flag (#16086 ) ### Description 1. Replacing AMX intrinsics with machine code macro instructions in QGEMM kernel. 2. Removing AMX build flags for GCC in cmake file. ### Motivation and Context The additional AMX flag in cmake adds an extra layer of dependency on the GCC version to use the feature. These changes should allow the usage of the AMX feature with just the CPU ID check.	2023-06-15 11:22:59 -07:00
satyajandhyala	889f80082f	[js/web] Added Reduce operators support (#16122 ) ### Description Added support for ReduceL1, ReduceL2, ReduceMean, ReduceMin, ReduceMax, ReduceSum, ReduceLogSum, ReduceLogSumExp, ReduceProd and ReduceSquareSum. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> --------- Co-authored-by: Satya Jandhyala <sajandhy@microsoft.com> Co-authored-by: guschmue <guschmue@microsoft.com>	2023-06-12 07:46:27 -07:00
Vrajang Parikh	67f4a4fd16	Objective-C binding for ORT training (#16127 ) ### Description Implement Objective-C binding for `ORTCheckPoint`. Additionally, - Modify `onnxruntime_objectivec.cmake` to only include training header and sources when training flag is enabled - Enable objective-c binding for `orttraining-mac-ci-pipeline` ### Motivation and Context This PR is part of implementing Objective-C bindings for training API. It implements objective-c binding for ORTCheckPoint class. The objective-C API closely resembles the C++ API. Note: The test for saving checkpoint is skipped as it requires use of training session. It will be added when the objective-c binding for `ORTTrainingSession` is added.	2023-06-07 14:01:30 -07:00
Edward Chen	1261d0b8ba	Fix some build issues on MacOS with Xcode 14.3. (#15878 ) - Fix flatbuffers flatc warning, unused-but-set-variable. - Address `-Wshorten-64-to-32` warnings (fix in our code, allow in dependencies' code). - Update CI builds to use Xcode 14.3. - Update minimum iOS version to 12.0. - Update Mac hosted agents to MacOS 13 where possible.	2023-06-07 12:07:11 -07:00
Wanming Lin	a8c2f24ae0	[WebNN EP] Merge support for segment anything into main branch (#16208 ) We implemented a number of new ops and data types to support running the segment anything model on the Chromium WebNN DML backend (POC) in a forked branch https://github.com/honry/onnxruntime/tree/stable-diffusion In this PR, we migrate the changes in the forked branch to the main branch, including: - 22 new ops - New tensor data types: bool, int32, uint32, uint64, int64, float16 (As JavaScript hasn't shipped Float16Array, we use Uint16Array as a workaround) - Handle empty input tensors and duplicated outputs - Fixed some nits	2023-06-07 09:56:37 -07:00
Changming Sun	7686193c40	Fix DNNL build (#16201 )	2023-06-02 09:46:03 +08:00
Xavier Dupré	e726151b5c	Introduce float 8 types (#14731 ) ### Description The PR implements FloatE4M3FN, FloatE5M2, FloatE4M3FNUZ, FloatE5M2FNUZ as described in PR https://github.com/onnx/onnx/pull/4805. It uses the CUDA API to cast float/half to float8 if CUDA>=11.8, and a custom implementation if CUDA<11.8. * It implements Cast, QuantizeLinear, DequantizeLinear for all types on CPU, and only for types FloatE4M3FN, FloatE5M2 on CUDA. * It extends the supported types for control flow operators, Shape, Reshape, Identity, If, Loop, Scan * It implements Equal(19). * Cast, QuantizeLinear, DequantizeLinear operators now support a parameter `saturate` only valid for float 8 types. It is true by default. In that case, any value out of range is converted into the maximum float 8 value. If false, it is infinite. * QuantizeLinear, DequantizeLinear now support multiple scales on CUDA (and ROCm by extension), scale = 1D tensor with one scale per channel ### Motivation and Context Supports latest onnx version. Fixes [AB#15395](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/15395) --------- Co-authored-by: Xavier Dupre <xadupre@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> Co-authored-by: Randy Shuai <rashuai@microsoft.com> Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com> Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>	2023-05-30 13:25:58 -07:00
神楽坂帕琪	abd94b65b7	eigen.cmake use url info from deps.txt (#16129 ) ### Description `eigen.cmake` use url info provided by deps.txt instead of using raw url.	2023-05-30 11:07:20 -07:00
Yuhong Guo	04a8f50674	New configuration to limit the arena extension (#15983 ) Add a configuration `max_power_of_two_extend_bytes` to limit the arena extension size. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> In our real scenario, we observe that if the model is big enough, the BfcArena will extend uncontrollably. As shown by the following figures, if a model uses more than 16GB of memory, the BfcArena will request 32GB of memory in total according to the `kNextPowerOfTwo` strategy. With the new strategy, the extension is limited. The default maximum extension size is 1GB. #### Without the new configuration After loading the model, ORT uses 32G GPU memory. ![image](https://github.com/microsoft/onnxruntime/assets/19584326/42b93c66-b957-4f20-a13b-d34cb390afff) #### With the new configuration After loading the model, ORT uses 23G GPU memory. ![image](https://github.com/microsoft/onnxruntime/assets/19584326/5abffeff-9ca3-4187-a262-37fd2764fe1b) Co-authored-by: Yuhong Guo <yuhong.gyh@antgroup.com>	2023-05-25 02:19:07 -07:00
Sumit Agarwal	70d2dc8209	[DML EP] Fix issue with --dml_path build option (#15972 ) ### Description DML_PACKAGE_DIR cmake variable is not getting set properly when dml_path build options is used. ### Motivation and Context - Why is this change required? What problem does it solve? It is required for DML Perf dashboard. <!--- If it fixes an open issue, please link to the issue here. -->	2023-05-24 19:20:40 -05:00
yf711	105f5f0f20	Avoid trt deprecated api warnings shown as errors during ORT-TRT build (#16035 ) ### Description Avoid trt deprecated api warnings shown as errors when building onnxruntime_test_all. This issue is only visible when installing trt via binaries, rather than deb/rpm pkg (CI pipelines). The change is similar to the existing set_property for onnxruntime_providers_tensorrt `89ea503024/cmake/onnxruntime_providers.cmake (L421)` ### Motivation and Context onnxruntime/test/unittest_main/[test_main.cc](https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/test/unittest_main/test_main.cc#L32) includes nvinfer.h, which includes deprecated trt apis and generates warnings. When building onnxruntime_test_all, it will show warnings as errors and block the build. ### Doubts Although this issue is visible on trt tar binaries but not on trt deb/rpm pkgs, their file size&hash are the same (creation time vary), regarding headers/libs installing in different ways. \| tarBin \| pkg \| \| ------------------------------------------------------------ \| ------------------------------------------------------------ \| \| 997284784 Apr 26 15:15 libnvinfer_builder_resource.so.8.6.1 \| 997284784 Apr 26 22:21 libnvinfer_builder_resource.so.8.6.1 \| \| 235369632 Apr 26 15:14 libnvinfer.so.8.6.1 \| 235369632 Apr 26 22:21 libnvinfer.so.8.6.1 \|	2023-05-24 13:19:27 -07:00
PeixuanZuo	2fddc65c8c	[ROCm] add hipblaslt into GemmFastGelu TunableOp (#15945 ) add hipblaslt into GemmFastGelu TunableOp.	2023-05-23 11:07:09 +08:00
RandySheriffH	d35361bf9d	Fix python pipeline for AzureEP without using root (#16023 ) Fix python pipeline for AzureEP without using root, this is for 1.15. --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-05-22 16:38:47 -07:00
Changming Sun	0204594f90	Cleanup WASM cmake code (#15996 ) ### Description Remove the "onnxruntime_BUILD_WEBASSEMBLY" cmake option. Use `if (CMAKE_SYSTEM_NAME STREQUAL "Emscripten")` instead. It makes some code look more natural. For example, ```cmake if (CMAKE_SYSTEM_NAME STREQUAL "iOS" OR CMAKE_SYSTEM_NAME STREQUAL "Android" OR onnxruntime_BUILD_WEBASSEMBLY) ``` becomes ```cmake if (CMAKE_SYSTEM_NAME STREQUAL "iOS" OR CMAKE_SYSTEM_NAME STREQUAL "Android" OR CMAKE_SYSTEM_NAME STREQUAL "Emscripten") ```	2023-05-20 18:07:39 -07:00
RandySheriffH	4dfb89b3ad	Implement mutex-free spin lock for task queue (#14834 ) Implemented "lock-free" spinlock to save CPU usage on context switching. The change has been tested on queene service of Ads team, the lock-free version of ort (40 threads) saves CPU usage on gen8 (128 logical processors on 8 numa nodes) windows by nearly half, from 65% to 35%. For 32 cores, the curve is flat: Anubis, 32 vCPU, windows, hugging face models, 95 percentile E2E latency in ms: model \| mutex(ms) \| mutex-free --- \| --- \| --- alvert_base_v2 \| 34.21 \| 34.09 bert_large_uncased \| 116.27\| 117.84 bart_base \| 72.06 \| 71.99 distilgpt2 \| 25.43 \| 25.02 vit_base_patch16_224 \| 37.33 \| 37.76 Anubis, 32 vCPU win, Linux, 1st party models, 95 percentile E2E latency in ms: model \| mutex(ms) \| mutex-free --- \| --- \| --- deepthink_v2 \| 24.35 \| 22.95 bing_feeds \| 36.96 \| 36.48 deep_writes \| 14.46 \| 14.32 keypoints \| 9.34 \| 7.69 model11 \| 1.71 \| 1.66 model12 \| 1.82 \| 1.44 model2 \| 4.21 \| 3.95 model6 \| 1.08 \| 1.05 agiencoder \| 0.99 \| 0.93 geminet_transformer \| 5.32 \| 5.24 --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-05-19 10:12:10 -07:00
Patrice Vignola	310b22aa0c	[DML EP] Update DirectML version to 1.12.0 (#16011 )	2023-05-18 19:37:12 -07:00
Ashwini Khade	0c815a95b7	android package fix (#15999 ) ### Description This PR adds the training headers to the training android packages. ### Motivation and Context Training headers need to be added as part of the training android packages, however because of the typo in the cmake these headers were not being added. This PR fixes the issue.	2023-05-18 09:21:03 -07:00
Changming Sun	842b1a3472	Revert a change in #15797 : restore the correct version of emsdk (#15995 ) ### Description Revert a change in #15797: restore the correct version of emsdk ### Motivation and Context Without change, when you build it on Windows you will see: ``` 2023-05-17 19:41:30,093 build [INFO] - Activating emsdk... 2023-05-17 19:41:30,093 util.run [INFO] - Running subprocess in 'C:\src\onnxruntime2\cmake\external\emsdk' 'C:\src\onnxruntime2\cmake\external\emsdk\emsdk.bat' activate 3.1.37 error: tool or SDK not found: '3.1.37' ```	2023-05-18 07:41:38 -07:00
kailums	f62f722c70	integrate triton into ort (#15862 ) ### Description In some scenarios, triton-written kernels are more performant than CK or other handwritten kernels, so we implement a framework that lets onnxruntime use these triton-written kernels. This PR integrates triton into ort, so that ort can use kernels that are written and compiled by triton. The main change focuses on two parts: 1. a build part to compile triton-written kernels and combine them into libonnxruntime_providers_rocm.so 2. a loader and launcher in C++, for loading and launching triton-written kernels. #### Build To compile triton-written kernels, add a script `tools/ci_build/compile_triton.py`. This script dynamically loads all kernel files, compiles them, and generates `triton_kernel_infos.a` and `triton_kernel_infos.h`. `triton_kernel_infos.a` contains all compiled kernel instructions; this file is combined into libonnxruntime_providers_rocm.so using the --whole-archive flag. `triton_kernel_infos.h` defines a const array that contains all the metadata for each compiled kernel. This metadata is used for load and launch, so this header file is included by 'triton_kernel.cu', which defines the load and launch functions. Add a build flag in build.py and CMakeLists.txt; when building the rocm provider, it calls the triton_kernel build command and generates all necessary files. #### C++ Load and Launch On the C++ side, we implement load and launch functions in triton_kernel.cu and triton_kernel.h. These two files are located in `providers/cuda`, and when compiling rocm, they are hipified, so this part supports both cuda and rocm. But currently we only call triton kernels in rocm. We also implement a softmax triton op as an example. Because many kernels are generated for the different input shapes of softmax, we use TunableOp to select the best one. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. 
-->	2023-05-17 09:35:28 +08:00
cloudhan	dc383ed4ce	Basic CSharp packaging support for ROCm EP (#15535 ) This PR mainly fixes build errors when trying to build the nupkg for ROCm EP. It also slightly improves the packaging logic so that developers can produce the nupkg on Linux natively.	2023-05-16 07:27:38 +08:00
Dmitri Smirnov	896a963492	Adjust GetVersionString() GetBuildInfoString() signatures and move them to OrtApi (#15921 ) ### Description This PR partially reverts changes introduced in https://github.com/microsoft/onnxruntime/pull/15643 We make the two APIs always return std::string in UTF-8. We also move the entry points from OrtApiBase to OrtApi to make them versioned. ### Motivation and Context `GetVersionString` always returns x.y.z numbers that are not subject to internationalization. `GetBuildInfoString` can hold international chars, but UTF-8 should be fine to contain those. We prefix them with u8"" in case the compiler default charset is not UTF-8. Furthermore, creating platform dependent APIs is discouraged. `ORTCHAR_T` is platform dependent and was created for paths only. On non-Unix platforms it would still produce `std::string` that can only contain UTF-8. The API was introduced after the latest release, and can still be adjusted.	2023-05-13 13:45:07 -07:00
RandySheriffH	7c4e8267e7	Implement openAI endpoint invoker for nuget (#15797 ) Implement openAI audio endpoint, and enable nuget packaging. --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-05-11 22:04:02 -07:00
Jian Chen	1a73d61829	Update eigen to 3.4 and remove the eigen from git submodule (#15875 ) ### Description Update eigen to 3.4 and remove the eigen from git submodule ### Motivation and Context We need to have eigen 3.4 for c++20	2023-05-11 11:56:59 -07:00
Changming Sun	7c58d013aa	Remove Ubuntu 18.04 usages (#15781 ) ### Description Remove Ubuntu 18.04 usages because it will be EOL this month. ### Motivation and Context	2023-05-11 11:44:00 -07:00
sdegrande	cf062dbdb1	FlatBuffers fails to compile with gcc13. (#15787 ) When building the FlatBuffers dependencies, gcc13 emits a stringop-overflow warning. Since all warnings are turned into errors, this fails the compilation of FlatBuffers and, as a consequence, the build of onnxruntime. This commit applies a patch to FlatBuffers's CMakeLists.txt to add -Wno-error=stringop-overflow to the CMAKE_CXX_FLAGS.	2023-05-11 11:20:19 -07:00
liqun Fu	ac9ae9f7c5	update onnx release 1.14 for docker files (#15680 ) ### Description this is for ort 1.15 release to work with onnx 1.14 It shall be merged after onnx 1.14 release and before ort 1.15 release. ### Motivation and Context --------- Signed-off-by: Liqun Fu <liqfu@microsoft.com>	2023-05-10 13:15:56 -07:00
Sumit Agarwal	b473e3f3c6	[DML EP] Update DirectML version to 1.11.0 (#15858 ) ### Description - Update DML version to 1.11.0 - Disable Gemm+Softmax fusion ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-05-09 12:48:15 -07:00
Wanming Lin	00b1e79e04	Support WebNN EP (#15698 ) Description: This PR intends to enable WebNN EP in ONNX Runtime Web. It translates the ONNX nodes by [WebNN API](https://webmachinelearning.github.io/webnn/), which is implemented in C++ and uses Emscripten [Embind API](https://emscripten.org/docs/porting/connecting_cpp_and_javascript/embind.html#). It temporarily uses the preferred layout NHWC for WebNN graph partitions, due to the restriction in the WebNN XNNPack backend implementation and the ongoing [discussion](https://github.com/webmachinelearning/webnn/issues/324) in the WebNN spec on whether WebNN should support both 'NHWC' and 'NCHW' layouts. No WebNN native EP, only for Web. Motivation and Context: Allow ONNXRuntime Web developers to access WebNN API to benefit from hardware acceleration. WebNN API Implementation Status in Chromium: - Tracked in Chromium issue: [#1273291](https://bugs.chromium.org/p/chromium/issues/detail?id=1273291) - CPU device: based on the XNNPack backend, and has been available on Chrome Canary M112 behind the "#enable-experimental-web-platform-features" flag for Windows and Linux platforms. Further implementation for more ops is ongoing. - GPU device: based on DML, implementation is ongoing. Open: - GitHub CI: WebNN is currently only available on Chrome Canary/Dev with the XNNPack backend for Linux and Windows. This is an open question for reviewers: please help identify which GitHub CI should involve the WebNN EP and guide me to enable it. Thanks!	2023-05-08 21:25:10 -07:00
Yulong Wang	0457fd0b40	upgrade emsdk to 3.1.37 (#15817 ) ### Description upgrade emsdk to 3.1.37 WIP branch to debug the mystery memory issue in web assembly multi-thread build.	2023-05-08 16:49:47 -07:00
Guenther Schmuelling	5a43828b3d	update ort extensions to 94142d8391c9791ec71c38336436319a2d4ac7a0 (#15688 ) needed to get tokenizers/decode for whisper --------- Co-authored-by: Shalva Mist <shalvamist@microsoft.com>	2023-05-05 09:48:07 -07:00
cloudhan	412d05a1d2	[ROCm] Update cmake (#15807 ) Followup of #15775	2023-05-04 11:20:56 -07:00
Yulong Wang	33d1372729	[wasm] revert emsdk to v3.1.19 (#15793 ) ### Description The multi-thread build generated by the latest emsdk sometimes crashes for an unknown reason (error: memory access out of bounds). We don't want to break existing ort-web users, so revert emsdk back to 3.1.19 (the same version that ort v1.14.0 uses)	2023-05-04 01:15:01 -07:00
Baiju Meswani	ba7b83ff3c	Remove onnxruntime_PYBIND_EXPORT_OPSCHEMA definition from onnxruntime (#15776 )	2023-05-03 13:08:35 -07:00
Changming Sun	41c082fdde	Add a Github workflow for Prefast (#15763 )	2023-05-03 11:42:51 -07:00
Changming Sun	328cabb194	Download protoc from Github Release instead of Nuget (#15731 ) ### Description Download protoc from Github Release instead of Nuget to avoid having a dependency on nuget.exe on Linux ### Motivation and Context To avoid having a dependency on nuget.exe on Linux. Many users' build environments do not have nuget or dotnet.	2023-05-02 12:18:59 -07:00
Changming Sun	5352f6d9b0	Make "--cuda_version" build arg optional (#15758 ) ### Description This change will allow us to build the CUDA EP without installing the CUDA SDK on Windows. ### Motivation and Context Nvidia's CUDA installer comes with a VS extension. In the past, we required installing the extension. It is a little bit inconvenient since: 1. Visual Studio must be installed before the CUDA SDK. CUDA's installer will not install the extension if your machine doesn't have Visual Studio. 2. We need to install the CUDA SDK on our build machines, instead of just downloading it and using it. After this change, we will not need to install the CUDA SDK on our build machines, so it will be easier to add support for a different CUDA version. Also, fix two PreFast warnings.	2023-05-01 18:00:47 -07:00
Ashwini Khade	0ffae8073b	Creating Nuget and Android packages for Training (#15712 ) ### Description This PR creates Nuget and Android for Training. ### Motivation and Context These packages are intended to be released in ORT 1.15 to enable On-Device Training Scenarios. ## Packaging Story for Learning On The Edge Release ### Nuget Packages: 1. New Native package -> Microsoft.ML.OnnxRuntime.Training (Native package will contain binaries for: win-x86, win-x64, win-arm, win-arm64, linux-x64, linux-arm64, android) 2. C# bindings will be added to existing package -> Microsoft.ML.OnnxRuntime.Managed ### Android Package published to Maven: 1. New package for training (full build) -> onnxruntime-training-android-full-aar ### Python Package published to PyPi: 1. Python bindings and offline tooling will be added to the existing ort training package -> onnxruntime-training	2023-05-01 12:59:56 -07:00
Sumit Agarwal	4c4f688a93	[DML EP] Fix dml_external_project (#15656 ) ### Description While building ORT for DML EP with `dml_EXTERNAL_PROJECT` flag, 2 variables (`DML_SHARED_LIB`, `DML_PACKAGE_DIR`) value is not set properly. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-05-01 12:02:56 -07:00
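
Note on commit 04a8f50674 (arena extension limit): the message describes an arena whose extensions keep growing by powers of two, and a new `max_power_of_two_extend_bytes` setting that caps each extension. The sketch below is only a toy model of that idea, not ONNX Runtime's BfcArena code; the function name, the 1 MiB starting extension, and the 17 GiB requirement are assumptions chosen for illustration.

```python
# Toy model (assumption: NOT the actual BfcArena implementation).
# Each extension doubles in size unless a cap is supplied.

MIB = 1 << 20
GIB = 1 << 30


def arena_total_after_growth(required_bytes, first_extend_bytes=MIB,
                             max_extend_bytes=None):
    """Return the total bytes reserved once the arena covers required_bytes."""
    total = 0
    extend = first_extend_bytes
    while total < required_bytes:
        # With a cap, no single extension exceeds max_extend_bytes.
        step = extend if max_extend_bytes is None else min(extend, max_extend_bytes)
        total += step
        extend *= 2
    return total


need = 17 * GIB  # hypothetical model needing a bit more than 16 GiB
uncapped = arena_total_after_growth(need)
capped = arena_total_after_growth(need, max_extend_bytes=GIB)

print(f"uncapped: {uncapped / GIB:.2f} GiB")
print(f"capped:   {capped / GIB:.2f} GiB")
```

In this toy model the uncapped arena ends up reserving close to 32 GiB for a 17 GiB requirement, while the capped one overshoots by less than one extension (under 1 GiB). The real figures in the commit message (32G vs 23G) come from the author's measurements and depend on details this sketch does not model.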

1394 commits