onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-16 21:00:14 +00:00

Author	SHA1	Message	Date
Changming Sun	27f595a2d8	update	2025-02-06 12:34:12 -08:00
Adrian Lizarraga	3b4c7df4e9	[QNN EP] Make QNN EP a shared library (#23120 ) ### Description - Makes QNN EP a shared library by default when building with `--use_qnn` or `--use_qnn shared_lib`. Generates the following build artifacts: - Windows: `onnxruntime_providers_qnn.dll` and `onnxruntime_providers_shared.dll` - Linux: `libonnxruntime_providers_qnn.so` and `libonnxruntime_providers_shared.so` - Android: Not supported. Must build QNN EP as a static library. - Allows QNN EP to still be built as a static library with `--use_qnn static_lib`. This is primarily for the Android QNN AAR package. - Unit tests run for both the static and shared QNN EP builds. ### Detailed changes - Updates Java bindings to support both shared and static QNN EP builds. - Provider bridge API: - Adds logging sink ETW to the provider bridge. Allows EPs to register ETW callbacks for ORT logging. - Adds a variety of methods for onnxruntime objects that are needed by QNN EP. - QNN EP: - Adds `ort_api.h` and `ort_api.cc` that encapsulates the API provided by ORT in a manner that allows the EP to be built as either a shared or static library. - Adds custom function to transpose weights for Conv and Gemm (instead of adding util to provider bridge API). - Adds custom function to quantize data for LeakyRelu (instead of adding util to provider bridge API). - Adds custom ETW tracing for QNN profiling events: - shared library: defines its own TraceLogging provider handle - static library: uses ORT's TraceLogging provider handle and existing telemetry provider. - ORT-QNN Packages: - Python: Pipelines build QNN EP as a shared library by default. User can build a local python wheel with QNN EP as a static library by passing `--use_qnn static_lib`. - NuGet: Pipelines build QNN EP as a shared library by default. `build.py` currently enforces QNN EP to be built as a shared library. Can add support for building a QNN NuGet package with static later if deemed necessary. - Android: Pipelines build QNN EP as a static library. `build.py` enforces QNN EP to be built as a static library. Packaging multiple shared libraries into an Android AAR package is not currently supported due to the added need to also distribute a shared libcpp.so library.	2025-01-22 12:11:00 -08:00
Yifan Li	5c3c7643db	Update range of gpu arch (#23309 ) ### Description * Remove deprecated gpu arch to control nuget/python package size (latest TRT supports sm75 Turing and newer arch) * Add 90 to support blackwell series in next release (86;89 not considered as adding them will rapidly increase package size) \| arch_range \| Python-cuda12 \| Nuget-cuda12 \| \| -------------- \| ------------------------------------------------------------ \| ---------------------------------- \| \| 60;61;70;75;80 \| Linux: 279MB Win: 267MB \| Linux: 247MB Win: 235MB \| \| 75;80 \| Linux: 174MB Win: 162MB \| Linux: 168MB Win: 156MB \| \| 75;80;90 \| Linux: 299MB Win: 277MB \| Linux: 294MB Win: 271MB \| \| 75;80;86;89 \| [Linux: MB Win: 390MB](https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=647457&view=results) \| Linux: 416MB Win: 383MB \| \| 75;80;86;89;90 \| [Linux: MB Win: 505MB](https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=646536&view=results) \| Linux: 541MB Win: 498MB \| ### Motivation and Context Callout: While adding sm90 support, the build of cuda11.8+cudnn8 will be dropped in the coming ORT release, as the build has issue with blackwell (mentioned in comments) and demand on cuda 11 is minor, according to internal ort-cuda11 repo.	2025-01-14 14:27:34 -08:00
Changming Sun	0ec2171b9f	Update Linux docker images (#23244 ) The new images contain the following updates: 1. Added Git, Ninja and VCPKG to all docker images 2. Updated CPU containers' GCC version from 12 to 14 3. Pinned CUDA 12 images' CUDNN version to 9.5 (the latest one is 9.6) 4. Addressed container supply chain warnings by building CUDA 12 images from scratch (to avoid using Nvidia's prebuilt images) 5. Updated manylinux commit id to 75aeda9d18eafb323b00620537c8b4097d4bef48 Also, this PR updated some source code to make the CPU EP's source code compatible with GCC 14.	2025-01-09 10:20:33 -08:00
Jian Chen	f423b737a9	Fix Linux python CUDA package pipeline (#22803 ) ### Description Making ::p optional in the Linux python CUDA package pipeline ### Motivation and Context Linux stage from Python-CUDA-Packaging-Pipeline has failed since merge of #22773	2024-11-13 14:20:21 -08:00
Yi Zhang	ef281f850a	Add XNNPack build on Linux ARM64 and improve Linux CPU (#22773 ) ### Description 1. Add XNNPack build on Linux ARM64 2. Build only one python wheel for PR request. [AB#49763](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/49763) ### Motivation and Context The XNNPack build is added on Linux ARM64 rather than Windows ARM64 because KleidiAI doesn't support Windows: ``` IF(XNNPACK_TARGET_PROCESSOR STREQUAL "arm64" AND XNNPACK_ENABLE_ARM_I8MM AND NOT CMAKE_C_COMPILER_ID STREQUAL "MSVC") IF (XNNPACK_ENABLE_KLEIDIAI) MESSAGE(STATUS "Enabling KleidiAI for Arm64") ENDIF() ELSE() SET(XNNPACK_ENABLE_KLEIDIAI OFF) ENDIF() ```	2024-11-09 11:26:19 +08:00
Changming Sun	f9e623e4d1	Update CMake to 3.31.0rc1 (#22433 ) To include a bug fix: https://gitlab.kitware.com/cmake/cmake/-/merge_requests/9890 Discussion: https://discourse.cmake.org/t/cmake-incorrectly-links-to-nvrtc-builtins/12723/4 This bug fix should be included in our upcoming release, because right now our GPU package depends on "libnvrtc-builtins.so.12.2", which has a hardcoded CUDA version: 12.2. The minor CUDA version should not be there.	2024-10-16 11:50:13 -07:00
Changming Sun	4af593a722	Add python 3.13 support (#22380 ) 1. Add python 3.13 to our python packaging pipelines 2. Because numpy 2.0.0 doesn't support free-threaded python, this PR also upgrades numpy to the latest 3. Delete some unused files.	2024-10-14 18:07:54 -07:00
Changming Sun	d98340968e	Stop publishing python 3.8/3.9 packages (#22343 ) ### Description 1. Stop publishing python 3.8/3.9 packages, to align with numpy. 2. Add a trigger for CUDA12's python test pipeline.	2024-10-08 09:50:05 -07:00
George Wu	944d87381d	[QNN EP] set up py packaging pipeline for Linux x64 (#22132 ) Set up a pipeline to produce nightly Linux x64 whls for onnxruntime-qnn. This can be used for offline context binary generation.	2024-09-18 23:24:32 -07:00
Changming Sun	67bc9438d7	Update training packaging pipeline's docker files (#20853 ) ### Description Similar to #20786 . The last PR was able to update all pipelines and all docker files. This is a follow-up to that PR. ### Motivation and Context 1. To extract the common part as a reusable build infra among different ONNX Runtime projects. 2. Avoid hitting docker hub's limit: 429 Too Many Requests - Server message: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit	2024-05-30 23:48:42 -07:00
Changming Sun	e91d91ae4f	Fix a build issue: /MP was not enabled correctly (#19190 ) ### Description In PR #19073 I misunderstood the value of "--parallel". Instead of testing whether args.parallel is None, I should test the value returned by the number_of_parallel_jobs function. If build.py is invoked without --parallel, args.parallel equals 1, the default value, so we should not add "/MP". However, the current code adds it, because `if args.parallel` evaluates to `if 1`, which is True. If build.py is invoked with --parallel but without an explicit number, args.parallel equals 0, meaning the job count is unspecified, so we should add "/MP". However, the current code does not add it, because `if args.parallel` evaluates to `if 0`, which is False. This also adds a new build flag, use_binskim_compliant_compile_flags, which is intended to be used only in the ONNX Runtime team's build pipelines for compliance reasons.	2024-01-29 12:45:38 -08:00
Changming Sun	81d363045b	Upgrade Ubuntu machine pool from 20.04 to 22.04 (#19117 ) ### Description Upgrade Ubuntu machine pool from 20.04 to 22.04	2024-01-16 17:25:18 -08:00
Changming Sun	0e8d4c3d21	Enable Address Sanitizer in CI (#19073 ) ### Description 1. Add two build jobs for enabling Address Sanitizer in CI: one for Windows CPU, one for Linux CPU. 2. Set default compiler flags/linker flags in build.py for normal Windows/Linux/MacOS build. This can help control compiler flags in a more centralized way. 3. All Windows binaries in our official packages will be built with "/PROFILE" flag. Symbols of onnxruntime.dll can be found at [Microsoft public symbol server](https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/microsoft-public-symbols). Limitations: 1. On Linux Address Sanitizer ignores RPATH settings in ELF binaries. Therefore once Address Sanitizer is enabled, before running tests we need to manually set LD_LIBRARY_PATH properly, otherwise libonnxruntime.so may not be able to find custom ops and shared EPs. 2. On Linux we also need to set LD_PRELOAD before running some tests (if the main executable, like python, is not built with Address Sanitizer). On Windows we do not need to. 3. On Windows before running python tests we should manually copy the Address Sanitizer DLL to the onnxruntime/capi directory, because python 3.8 and above has enabled "Safe DLL Search Mode" that wouldn't use the information provided by the PATH env. 4. On Linux Address Sanitizer found a lot of memory leaks from our python binding code. Therefore right now we cannot enable Address Sanitizer when building ONNX Runtime with python binding. 5. Address Sanitizer itself uses a lot of memory address space and delays memory deallocations, which can easily cause OOM issues in 32-bit applications. For this reason we cannot run all the tests in onnxruntime_test_all in 32-bit mode with Address Sanitizer in one process. However, we can still run individual tests that way. ### Motivation and Context To catch memory issues.	2024-01-12 07:24:40 -08:00
Jian Chen	2eb3db6bf0	Adding python3.12 support to ORT (#18814 ) ### Description Adding python3.12 support to ORT	2024-01-11 08:34:28 -08:00
Jian Chen	d97fc1824f	Create a new Python Package pipeline for CUDA 12 (#18348 )	2023-11-20 09:48:28 -08:00
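
Commit 0e8d4c3d21 notes that on Linux, Address Sanitizer ignores RPATH, so LD_LIBRARY_PATH must be set before running tests, and LD_PRELOAD is needed when the main executable is not built with Address Sanitizer. A minimal sketch of preparing such an environment follows. The helper name `asan_test_env` and both path arguments are hypothetical and are not taken from build.py.

```python
import os


def asan_test_env(build_dir, asan_runtime, base_env=None):
    """Return a copy of the environment prepared for an ASan test run.

    build_dir:    directory holding libonnxruntime.so, custom ops and shared
                  EPs (hypothetical; depends on the local build layout).
    asan_runtime: path to the Address Sanitizer runtime library
                  (hypothetical; the location varies by compiler and distro).
    """
    env = dict(os.environ if base_env is None else base_env)
    # ASan ignores RPATH, so the loader has to be pointed at the build output.
    existing = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = build_dir + (os.pathsep + existing if existing else "")
    # Needed when the main executable (for example python) is not ASan-built.
    env["LD_PRELOAD"] = asan_runtime
    return env
```

The returned dictionary can be passed as `env=` to `subprocess.run` when launching a test binary.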

16 commits
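
The `--parallel` behaviour described in commit e91d91ae4f can be sketched as below. This is a simplified illustration, not the actual build.py code: the parser setup is an assumption inferred from the commit message (default 1, and 0 when the flag is given without a number), and `number_of_parallel_jobs` here takes a hypothetical `cpu_count` parameter so the example is deterministic.

```python
import argparse


def make_parser():
    parser = argparse.ArgumentParser()
    # Assumed setup: no flag -> 1, bare "--parallel" -> 0, "--parallel N" -> N.
    parser.add_argument("--parallel", nargs="?", const=0, default=1, type=int)
    return parser


def number_of_parallel_jobs(args, cpu_count):
    # 0 means "unspecified": fall back to the number of available CPUs.
    return cpu_count if args.parallel == 0 else args.parallel


def add_mp_buggy(args):
    # Tests the raw argument: 1 (serial) is truthy, 0 (use all CPUs) is falsy.
    return bool(args.parallel)


def add_mp_fixed(args, cpu_count):
    # Tests the resolved job count instead of the raw argument.
    return number_of_parallel_jobs(args, cpu_count) > 1


if __name__ == "__main__":
    parser = make_parser()
    for argv in ([], ["--parallel"], ["--parallel", "4"]):
        args = parser.parse_args(argv)
        print(argv, add_mp_buggy(args), add_mp_fixed(args, cpu_count=8))
```

With no flag the buggy check would add "/MP" and the fixed check would not; with a bare `--parallel` the two results are reversed.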