onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-15 01:23:42 +00:00

Author	SHA1	Message	Date
Adam Pocock	14d1bfc34b	[java] Multi-LoRA support (#22280 ) ### Description Java parts of Multi-LoRA support - #22046. ### Motivation and Context API equivalence with Python & C#. --------- Co-authored-by: Dmitri Smirnov <dmitrism@microsoft.com>	2024-10-01 13:54:37 -07:00
Adam Pocock	cfa45df6b5	[java] Migrate OnnxTensors created from arrays over to a backing Java buffer (#18556 ) ### Description Following from #16578 and #16835 this migrates over `OnnxTensor.createTensor(<array>)` to first instantiate a `java.nio.Buffer` and then copy the array into that buffer in Java before creating the tensor. It also changes the `OnnxTensor.getValue()` method which returns a multidimensional array so it does the array construction and value copy in Java. This allows the removal of some unpleasant recursive C code which repeatedly calls into the JVM to traverse Java's arrays. The equivalent Java code is still unpleasant and recursive, but it's easier to reason about and memory safe. As a bonus, more `OnnxTensor`s are now backed by buffers which allow users to pin memory and reduce allocations by reusing them for same sized inputs. Some of the JNI code which parses Java arrays still exists as it's used by `OnnxMap`, removing that will be the target of a future refactor. Strings are still processed in JNI as it is easier to work with String tensors and UTF-8 arrays in C. ### Motivation and Context Minimizing the amount of JNI code makes it easier to maintain and using buffers in preference to arrays allows for fewer allocations.	2024-09-24 15:36:52 +10:00
Adam Pocock	6d7235ba5a	[Java] Exposing SessionOptions.SetDeterministicCompute (#18998 ) ### Description Exposes `SetDeterministicCompute` in Java, added to the C API by #18944. ### Motivation and Context Parity between C and Java APIs.	2024-09-16 11:55:38 +10:00
Adam Pocock	02e00dc023	[java] Adding ability to load a model from a memory mapped byte buffer (#20062 ) ### Description Adds support for constructing an `OrtSession` from a `java.nio.ByteBuffer`. These buffers can be memory mapped from files which means there doesn't need to be copies of the model protobuf held in Java, reducing peak memory usage during session construction. ### Motivation and Context Reduces memory usage on model construction by not requiring as many copies on the Java side. Should help with #19599.	2024-09-16 08:31:55 +10:00
Michael Tyler	904b850b44	Update Arm Compute Library Execution Provider (#22032 ) ### Description This PR makes the following updates to the Arm Compute Library execution provider: - Target Arm Compute Library 24.07 - Add support for the following operators: - Conv (FP16) - NhwcConv - QLinearConv - MatMul - FusedMatMul - MatMulIntegerToFloat - Optimize memory usage and performance - Expose the enable_fast_math setting - Use the main runtime thread pool ### Motivation and Context These updates improve performance and memory usage, and enable use of a more recent version of Arm Compute Library. @microsoft-github-policy-service agree company="Arm Ltd" --------- Signed-off-by: Michael Tyler <michael.tyler@arm.com>	2024-09-12 20:51:59 -07:00
Adam Pocock	a36692066d	[java] CUDA & TensorRT options fix (#20549 ) ### Description I misunderstood how UpdateCUDAProviderOptions and UpdateTensorRTProviderOptions work in the C API, I had assumed that they updated the options struct, however they re-initialize the struct to the defaults then only apply the values in the update. I've rewritten the Java bindings for those classes so that they aggregate all the updates and apply them in one go. I also updated the C API documentation to note that these classes have this behaviour. I've not checked if any of the other providers with an options struct have this behaviour, we only expose CUDA and TensorRT's options in Java. There's a small unrelated update to add a private constructor to the Fp16Conversions classes to remove a documentation warning (they shouldn't be instantiated anyway as they are utility classes containing static methods). ### Motivation and Context Fixes #20544.	2024-05-05 00:16:55 -07:00
Adam Pocock	262b6bd3b7	[java][DML EP] Modifying dml_provider_factory.h so it can compile as a C header file (#20157) ### Description The dml_provider_factory header file can't be used in C programs as it defines C++ inline operators. This PR rearranges that header file so that it looks like valid C when used from C, and also makes a couple of small modifications to the Java code so it correctly binds to the DML EP at build time. I'm having some difficulty testing it as I think it's pulling in the old version of DirectML on my computer and I can't figure out what the library loading path is in Java to make it look at the recent version I downloaded. So the test I added fails with: ``` InferenceTest > testDirectML() FAILED ai.onnxruntime.OrtException: Error code - ORT_RUNTIME_EXCEPTION - message: Exception during initialization: <path-to-ort>\onnxruntime\core\providers\dml\DmlExecutionProvider\src\AbiCustomRegistry.cpp(518)\onnxruntime.dll!00007FFF74819333: (caller: 00007FFF74793509) Exception(3) tid(4f58) 80070057 The parameter is incorrect. at app//ai.onnxruntime.OrtSession.createSession(Native Method) at app//ai.onnxruntime.OrtSession.<init>(OrtSession.java:74) at app//ai.onnxruntime.OrtEnvironment.createSession(OrtEnvironment.java:236) at app//ai.onnxruntime.OrtEnvironment.createSession(OrtEnvironment.java:221) at app//ai.onnxruntime.InferenceTest.openSessionSqueezeNet(InferenceTest.java:1961) at app//ai.onnxruntime.InferenceTest.runProvider(InferenceTest.java:665) at app//ai.onnxruntime.InferenceTest.testDirectML(InferenceTest.java:657) ``` But it does correctly compile, and this error seems very similar to other issues with the DML provider when it doesn't like a model due to the loaded library being old. The test is using the squeezenet file that's been in the repo since 2019. If someone can help me figure out how to get the right version of DML in the library path I can test it more on my end. I tried adding the folder with the new version into the system path, but I'm not very familiar with Windows' library loading behaviour. ### Motivation and Context Fixes #19656 to allow use of the DirectML EP from ORT Java. cc @martinb35	2024-04-01 21:58:50 -07:00
Changming Sun	a28abeb241	Change "#ifdef WIN32" to "#ifdef _WIN32" (#19254 ) ### Description `_WIN32` is a standard macro listed at https://learn.microsoft.com/en-us/cpp/preprocessor/predefined-macros?view=msvc-170 . But `WIN32` is not.	2024-01-24 14:35:44 -08:00
Heflin Stephen Raj	0ea48fc73e	Modified the condition to load the optimiser model (#18891 )	2024-01-23 10:10:54 -08:00
Adam Pocock	191525301f	[java] Updating TensorInfo so it contains the named dimensions (#18962 ) ### Description The Java `TensorInfo` object which is used to describe a tensor's shape, along with the input and output placeholders for a model couldn't show any symbolic/named dimensions in that tensor. Now this information is stored in Java strings on construction and included in the toString. ### Motivation and Context Setting symbolic dimensions required external information in Java, the names were not discoverable from within the API.	2024-01-15 14:42:50 -08:00
Chi Lo	569876fb16	[TensorRT EP] Refactor OrtTensorRTProviderOptions initialization and make it easy to add new field (#17617) Two major modifications of this PR: 1. Refactor OrtTensorRTProviderOptions initialization and make it easy to add new fields. 2. Make the Python API capable of using TensorRT plugins by adding a new Python binding API, `register_tensorrt_plugins_as_custom_ops`. (The EP's custom op domain needs to be registered before model load. The C++ API is slightly different: calling SessionOptionsAppendExecutionProvider_TensorRT_XX appends the custom op domain to the session options, and ORT can later register the custom op domain from the session options before model loading.)	2023-10-06 14:12:20 -07:00
Adam Pocock	aed43f429a	[java] Enable output pinning in OrtSession and OrtTrainingSession (#16835 )	2023-09-26 01:49:13 -07:00
Adam Pocock	a8e776b78b	[java] Adds support for fp16 and bf16 tensors (#16703 ) ### Description The Java API currently only supports fp16 output tensors which it automatically casts to floats on the way out. This PR adds support for creating fp16 and bf16 tensors (from `java.nio.Buffer` objects or as the output of models, creation from Java short arrays is not supported), along with efficient methods for casting `FloatBuffer` into `ShortBuffer` filled with fp16 or bf16 values and vice versa. The fp16 conversions use a trick to pull in the efficient conversion methods added to Java 20, falling back to ports of the MLAS methods otherwise. The Java 20 methods can be special cased by the C2 JIT compiler to emit the single instruction on x86 and ARM which converts fp32<->fp16, or the vectorized versions thereof, so they should be quite a bit faster than the MLAS ported one. ### Motivation and Context fp16 and bf16 are increasingly popular formats and we've had several requests for this functionality. Fixes #7003. cc @yuslepukhin @cassiebreviu --------- Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>	2023-07-21 21:14:41 +10:00
Adam Pocock	ba91457183	[java] Adding addExternalInitializers and addInitializer to OrtSession.SessionOptions (#16198 ) ### Description Adds support for adding external initializers or overriding initializers to a session options from Java. ### Motivation and Context We want to instantiate large models from Java without filesystem access. cc @yuslepukhin	2023-07-05 12:51:59 -07:00
Adam Pocock	bca49d62a0	Fixing CoreML in Java (#16231 ) ### Description The name of the flag we set when compiling the JNI binding to enable the CoreML EP changed at some point in the past. This PR fixes it by updating the flag in the JNI. I also added a quick smoke test for the CoreML provider to make sure it doesn't crash and can be enabled. ### Motivation and Context All the EPs should work as expected in Java. Fixes #16230.	2023-06-07 12:24:57 -07:00
Jian Chen	ea7b2deffd	Removing C4090 warning suppression (#15994) ### Description Removing the C4090 warning suppression after the Windows pipelines moved to VS2022.	2023-05-18 10:08:05 -07:00
Dmitri Smirnov	896a963492	Adjust GetVersionString() GetBuildInfoString() signatures and move them to OrtApi (#15921) ### Description This PR partially reverts changes introduced in https://github.com/microsoft/onnxruntime/pull/15643. We make the two APIs always return a UTF-8 std::string. We also move the entry points from OrtApiBase to OrtApi to make them versioned. ### Motivation and Context `GetVersionString` always returns x.y.z numbers that are not subject to internationalization. `GetBuildInfoString` can hold international chars, but UTF-8 should be fine to contain those. We prefix them with u8"" in case the compiler default charset is not UTF-8. Furthermore, creating platform dependent APIs is discouraged. `ORTCHAR_T` is platform dependent and was created for paths only. On non-Unix platforms it would still produce a `std::string` that can only contain UTF-8. The API was introduced after the latest release, and can still be adjusted.	2023-05-13 13:45:07 -07:00
Yuhong Guo	41dcf0d32e	Expose build information in dynamic lib (#15643) ### Description 1. Add Build Info API to onnx. 2. Fix compile error while building onnxruntime_benchmark on macOS. ### Motivation and Context 1. When the Onnxruntime lib is serving online, we need a way to detect how this lib was built. This PR helps the developer to get the build information using `strings`, such as git branch, git commit id, build type and cmake cxx flags, which is shown as follows. ![image](https://user-images.githubusercontent.com/19584326/233794371-b2f95a2c-27fb-4709-a6dd-bf4bb12b0b5b.png) ![image](https://user-images.githubusercontent.com/19584326/233794360-f96f5d2e-332c-405c-83f1-370ccc2b86f8.png) If the build env has no git, there will be no git related info: ![image](https://user-images.githubusercontent.com/19584326/234558596-298c1b01-9a90-41bf-9372-7259a8f8e5be.png) 3. Fix the following compile error while building benchmark on macOS. ![image](https://user-images.githubusercontent.com/19584326/233793571-c261ac1f-47b2-434d-a293-7e9edc6c8a66.png) --------- Co-authored-by: Yuhong Guo <yuhong.gyh@antgroup.com>	2023-04-28 21:57:31 -07:00
Adam Pocock	8a1a40ac63	[Java] CheckpointState AddProperty & GetProperty support (#15730 )	2023-04-28 09:52:52 -07:00
Baiju Meswani	b5a1941835	C, C++, Python, C# API update for on device training (#15518 )	2023-04-21 11:36:01 -07:00
Changming Sun	d9436407b6	Use safe allocator for JNI code (#13999 ) ### Description Use a customized allocarray function to replace the original malloc calls to avoid integer overflow. ### Motivation and Context Fix Prefast warnings. Fixed [AB#8990](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/8990) Fixed [AB#8991](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/8991) Fixed [AB#9016](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/9016)	2023-03-08 11:40:55 -08:00
Adam Pocock	47f00b5d49	[Java] Initial on device training support (#14027 ) contributor: @Craigacp	2023-03-08 10:01:08 -08:00
Adam Pocock	150043f74f	Adds a Java accessor for GetVersionString (#14876 ) ### Description Java part of #14873.	2023-03-07 09:46:56 -08:00
Erick Muñoz	d1533c27eb	[oneDNN] Improved thread handling (#13618) * Added the OrtDnnlProviderOptions structure to expose configuration options to the user * The number of threads can be defined by the user with the -i flag on the perftest * Number of threads can also be configured via the OMP_NUM_THREADS environment variable * The number of threads defined in the OrtDnnlProviderOptions is prioritized over the environment variable ### Description Avoids thread oversubscription caused by OpenMP allocating the maximum number of threads possible for the oneDNN EP. Adds support for OrtDnnlProviderOptions, which allows for more EP customization and a user-defined number of threads. ### Motivation and Context - Improves performance and allows the user to fine-tune the number of threads	2023-01-31 14:37:13 -08:00
Wei-Sheng Chin	679ae7ff33	[Java] Fix warnings (#14076) Fix C6011, C6385, C6386 found by Visual Studio. Basically, I set the maximum number of options for every EP to 128. To my knowledge, 128 is big enough to support all EPs. To support an arbitrary number of EP options, we probably need #13999 and to create a "std::vector"-like struct in C.	2023-01-30 09:22:28 -08:00
Scott McKay	114f18357a	Add Java and Objective-C bindings for RegisterCustomOpsUsingFunction. (#14256 ) Description Add bindings for Android and iOS. Motivation and Context Enable mobile app linking against ort-extensions library and registering the custom ops with ORT.	2023-01-13 09:04:26 -08:00
Adam Pocock	dd2c031d95	[java] Sparse tensor support (#10653 ) Description: Adds support for creating and receiving sparse tensors in the ORT Java API. CSRC and COO tensors as inputs are tested, but there is no op which accepts a block sparse tensor to test. COO tensors are tested as outputs, but there is no op which emits a CSRC or block sparse tensor to test. Motivation and Context - Why is this change required? What problem does it solve? Request to expose ORT sparse tensor support in Java. cc @yuslepukhin	2022-11-22 10:29:24 -08:00
Adam Pocock	388d3cf847	[Java] Fix OnnxSequence semantics (#13012) Previously OnnxSequence would flatten out a list of tensors into a single output array assuming they were all scalar values. This doesn't accurately represent the semantics of an ONNX sequence, but was what the semantics appeared to be years ago when I first wrote that class. This PR changes it so that the `getValue` method on `OnnxSequence` unwraps the sequence and returns `List<? extends OnnxValue>` allowing the user to process the individual ONNX values separately. It's done this way rather than returning a multidimensional array for a tensor and a Java map for a map as multidimensional arrays are very inefficient in Java and best practice when operating with an OnnxTensor in Java is to use a `java.nio.ByteBuffer`. So allowing users to access each `OnnxTensor` individually allows them to control how the data is materialised on the Java heap.	2022-09-28 15:53:30 -07:00
RandySheriffH	77a066c700	Drop nuphar from java API (#13107 ) Drop nuphar from: - java API - tvm.cmake - run_build.sh	2022-09-26 17:06:08 -07:00
RandySheriffH	a83a9ed6b0	Remove miscellaneous nuphar configs (#13070 ) Remove a handful of nuphar related configurations after deprecation. Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2022-09-26 13:41:28 -07:00
Adam Pocock	5d55b0730e	[Java] JNI refactor for OrtJniUtil (#12516 ) Refactoring more JNI methods in OrtJniUtil. Make the strings const. Removing unnecessary use of OrtAllocator.	2022-09-08 17:04:42 -07:00
Cheng	76d17b0f48	Add java API for xnnpack (#12788 ) * Add java API for xnnpack * provider option support * a more general interface for creating EP	2022-09-03 08:29:40 +08:00
Adam Pocock	733db31420	[Java] JNI refactor for OrtSession (#12496 ) Refactor JNI error reporting	2022-08-16 13:43:06 -07:00
Adam Pocock	8a86b346a5	[Java] JNI refactor for ONNX Tensor (#12281 ) Working on JNI refactor for OnnxTensor. Simplifying the error handling logic in createTensor. Collapsing casting branches and migrating to ONNX element type enum. Disable cpplint for JNI C files.	2022-08-08 12:48:30 -07:00
Adam Pocock	e0ed9f0f2f	[java] First part of the JNI error handling rewrite (#12013 ) Description: This fixes error handling in the JNI code in OnnxMap, OnnxSequence, OnnxRuntime, RunOptions. SessionOptions and OrtEnvironment are correct as is. The bulk of the work will be in rewriting OnnxTensor, OnnxSparseTensor (after the merge of #10653) and OrtSession, along with the helper methods in OrtJniUtil. I plan to tackle those in separate PRs to reduce the amount of code to review. Motivation and Context - Why is this change required? What problem does it solve? The current native interop code doesn't return control to Java immediately on throwing an exception from an ORT error code, which can cause incorrect interactions with native ORT, and issues with exception propagation on the Java side. - If it fixes an open issue, please link to the issue here. Partial work towards solving #11451.	2022-07-12 15:16:54 -07:00
Wenbing Li	479e71a7a8	enable the extensions custom build for java and android (#11823 )	2022-07-05 10:34:14 -07:00
Adam Pocock	9616ad483f	[Java] Support configuring CUDA and TensorRT execution providers (#10697 ) Java side parts for configuring CUDA and TensorRT. Adding tests for CUDA and TensorRT. Refactoring library loading logic as provider options need to have their shared library loaded before they can be constructed.	2022-03-30 14:26:51 -07:00
Adam Pocock	e47434ea12	[java] Adding the graph description to the exposed model metadata. (#10318 )	2022-02-28 10:05:03 -08:00
Valery Chernov	1cdc23aba4	[TVM EP] Rename Standalone TVM (STVM) Execution Provider to TVM EP (#10260 ) * update java API for STVM EP. Issue is from PR#10019 * use_stvm -> use_tvm * rename stvm worktree * STVMAllocator -> TVMAllocator * StvmExecutionProviderInfo -> TvmExecutionProviderInfo * stvm -> tvm for cpu_targets. resolve onnxruntime::tvm and origin tvm namespaces conflict * STVMRunner -> TVMRunner * StvmExecutionProvider -> TvmExecutionProvider * tvm::env_vars * StvmProviderFactory -> TvmProviderFactory * rename factory funcs * StvmCPUDataTransfer -> TvmCPUDataTransfer * small clean * STVMFuncState -> TVMFuncState * USE_TVM -> NUPHAR_USE_TVM * USE_STVM -> USE_TVM * python API: providers.stvm -> providers.tvm. clean TVM_EP.md * clean build scripts #1 * clean build scripts, java frontend and others #2 * once more clean #3 * fix build of nuphar tvm test * final transfer stvm namespace to onnxruntime::tvm * rename stvm->tvm * NUPHAR_USE_TVM -> USE_NUPHAR_TVM * small fixes for correct CI tests * clean after rebase. Last renaming stvm to tvm, separate TVM and Nuphar in cmake and build files * update CUDA support for TVM EP * roll back CudaNN home check * ERROR for not positive input shape dimension instead of WARNING * update documentation for CUDA * small corrections after review * update GPU description * update GPU description * misprints were fixed * cleaned up error msgs Co-authored-by: Valery Chernov <valery.chernov@deelvin.com> Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru> Co-authored-by: Thierry Moreau <tmoreau@octoml.ai>	2022-02-15 10:21:02 +01:00
Shucai Xiao	ce103ace93	Amdmigraphx fix build error (#9272 ) * fix build error * rename a missing api for the MIGraphX EP	2022-01-10 15:18:43 -08:00
Valery Chernov	b327e89efa	Standalone TVM Executor Provider (#10019) * squashed commit for standalone tvm execution provider * critical fix for correct python build with stvm ep * get tuning log file from ep options. It has priority over AUTOTVM_TUNING_LOG * updates and fixes * update parsing of stvm provider options * add support of external data for onnx model * add conditional dump of subgraphs * remove unused code * get input tensor shapes through provider options. get output shapes for fixed input ones by TVM API * support AUTO_TVM tuning log file inside ORT. Selector for Ansor and Auto_TVM is provider option (tuning_type) * add fp16 * add functionality of conversion of model layout to NHWC if needed. Necessary parameter was added to STVM provider options * fix license text in header. fix log format * small fixes * fix issues from flake8 * remove model proto construction from GetCapability * reserve memory for vector of DLTensors * add simple tutorial for STVM EP * STVM docs * jroesch/tvm -> apache/tvm * remove dead code, unnecessary logs and comments * fix in readme * improve tutorial notebook * tvm update * update STVM_EP.md * fix default value * update STVM_EP.md * some TODOs for the future development * shorten long lines * add hyperlink to STVM_EP.md * fix Linux CI error * fix error in csharp test Co-authored-by: Jared Roesch <jroesch@octoml.ai> Co-authored-by: Valery Chernov <valery.chernov@deelvin.com> Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>	2021-12-15 16:59:20 -08:00
Hariharan Seshadri	bbeceb7541	Support optional type in ORT (#8339 )	2021-11-04 15:01:42 -07:00
Jeff Daily	c8789d3047	[ROCm] static re-hipify of CUDA EP to ROCm EP, now a shared provider (#8877 ) * re-hipify all rocm EP sources * fix all other files affected by re-hipify * add cuda_provider_factory.h to amd_hipify.py * do not use cudnn_conv_algo_search in ROCm EP, missing reduce min registration * Fix ReduceConsts template specialization introduced in #9101. Fixes the error when building for ROCm 4.3.1: error: too many template headers for onnxruntime::rocm::ReduceConsts<__half>::One (should be 0) * fix flake8 error in amd_hipify.py * speed up hipify with concurrent.futures * flake8 fix in amd_hipify.py	2021-10-14 15:15:51 -07:00
Ryan Hill	cc9f793b48	Move one function from cuda_provider_factory.h (#8407 )	2021-07-19 17:55:59 -07:00
Adam Pocock	7ed9f5fc90	[Java] Fixing the creation of OnnxTensors from scalars, adding tests (#8023 ) * Fixing the creation of OnnxTensors from scalars, adding tests. * Documentation fixes from the review.	2021-06-24 13:21:35 -07:00
Guoyu Wang	9003df5d87	Fix 32bit Android java API crash (#8122) * Fix 32bit Android java API crash * fix code formatting	2021-06-22 17:41:11 -07:00
Changming Sun	7b003967b1	Add static code analyzer to Windows CPU/GPU CI builds and fix the warnings (#7489 )	2021-04-29 11:54:57 -07:00
Changming Sun	d68cedfa85	Fix some C/C++ warnings in the jni part (#7385 )	2021-04-28 14:25:58 -07:00
Adam Pocock	5a473216b7	[Java] Adds extra providers (#6770 ) Add providers for CoreML, ROCM, NNAPI, ArmNN Adding the structs for OrtCUDAProviderOptions and OrtOpenVINOProviderOptions Updating NNAPI flags. Adding the new CoreML flag. Adding hooks to the build system to tell Java about the new providers.	2021-02-24 10:25:05 -08:00
Adam Pocock	77d0eb3f56	Fixing a leak in OnnxSequences with String keys or values. (#6473 )	2021-01-28 11:28:56 -08:00
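
The array-to-buffer approach described in the message for #18556 (copy a Java array into a `java.nio.Buffer` on the Java side, and rebuild multidimensional arrays in Java as well) can be illustrated with plain `java.nio`. This is a minimal sketch and not ONNX Runtime's own code: the `ArrayToBuffer` class and its `flatten`/`unflatten` helpers are hypothetical names, and it assumes a rectangular 2-D `float` array stored in row-major order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.util.Arrays;

// Hypothetical helper, for illustration only; not part of the ONNX Runtime API.
public final class ArrayToBuffer {
    private ArrayToBuffer() {} // utility class, not meant to be instantiated

    /** Copies a rectangular float[][] into a direct, native-order FloatBuffer (row-major). */
    public static FloatBuffer flatten(float[][] data) {
        int rows = data.length;
        int cols = rows == 0 ? 0 : data[0].length;
        FloatBuffer buffer = ByteBuffer.allocateDirect(rows * cols * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        for (float[] row : data) {
            if (row.length != cols) {
                throw new IllegalArgumentException("Ragged arrays are not supported");
            }
            buffer.put(row); // bulk copy of one row
        }
        buffer.rewind(); // reset position so the buffer can be read from the start
        return buffer;
    }

    /** Rebuilds a float[rows][cols] array from the buffer without moving its position. */
    public static float[][] unflatten(FloatBuffer buffer, int rows, int cols) {
        FloatBuffer view = buffer.duplicate(); // independent position, shared contents
        float[][] out = new float[rows][cols];
        for (float[] row : out) {
            view.get(row); // bulk read of one row
        }
        return out;
    }

    public static void main(String[] args) {
        float[][] input = {{1f, 2f, 3f}, {4f, 5f, 6f}};
        FloatBuffer buffer = flatten(input);
        float[][] roundTrip = unflatten(buffer, 2, 3);
        System.out.println(Arrays.deepToString(roundTrip));
    }
}
```

Because the buffer is allocated once and has a fixed capacity, a caller could refill it for later inputs of the same size instead of allocating a new array each time, which is the reuse benefit the commit message mentions.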

66 commits