Commit graph

80 commits

Author SHA1 Message Date
Changming Sun
5d7030e4c6
Revert DML pipeline changes (#23135)
### Description
Previously we wanted to add DirectML EP to existing onnxruntime Windows
CUDA packages. After careful consideration, we will postpone the change.
This PR reverts some pipeline changes previously made by @mszhanyi and @jchen351.
2024-12-18 10:42:10 -08:00
Jian Chen
9ed0c7fe26
Redo "Update Gradle version 8.7 and java version 17 within onnxruntime/java" (#22923)
2024-12-02 18:34:25 -08:00
Yi Zhang
85751e7276
Build DML in Windows GPU CI pipeline (#22869)
### Description
Add a new stage to build CUDA and DML in the Windows GPU CI pipeline (PR
checks) to prevent regressions introduced by new CUDA tests.
Rename all tests under cuda/testcases with the CudaEp prefix so they can
be skipped easily.

### Motivation and Context
1. CudaNhwcEp is added by default when using the CUDA EP.
2. If onnxruntime_ENABLE_CUDA_EP_INTERNAL_TESTS is enabled, the tests in
tests/provider/cuda/testcases are added too.

### To do
Add enable_pybind in the new stage.
Currently, --enable_pybind triggers some Python tests, such as
onnxruntime_test_python.py, which use the get_available_providers() API.
More discussion is needed to decide how to make this work.
2024-11-25 10:50:52 +08:00
Yi Zhang
a28246a994
Revert "Update Gradle version 8.7 and java version 17 within onnxruntime/java (#22771)" (#22914)

This reverts commit 632a36a233.

### Motivation and Context
Running E2E tests using BrowserStack failed due to this PR.
2024-11-21 18:12:28 +08:00
Jian Chen
632a36a233
Update Gradle version 8.7 and java version 17 within onnxruntime/java (#22771)
### Description
This change updates the Gradle version within the Java project to 8.7;
it also upgrades Java to 17. The Gradle version for react-native was
also updated to 7.5 to make it compatible with the changes from the Java
directory. However, the target Java version there remains the same; it
will be upgraded in a separate PR.

This is split from #22206

### Motivation and Context
This is the first step to upgrade the react native version.
2024-11-14 17:10:44 -08:00
sheetalarkadam
e8f1d73b0b
Add Android QNN Browserstack test (#22434)
### Motivation and Context
Real device test in CI
2024-11-10 16:10:29 -08:00
Yi Zhang
8e8b62b8b5
Build CUDA and DML together (#22602)
### Description
Now we need to build CUDA and DML in one package, but the CUDA EP and
DML EP can't run in one process; doing so throws the exception `the GPU
device instance has been suspended`.
So the issue is that the CUDA EP and DML EP can coexist at compile time
but not at run time.

This PR splits the CUDA EP tests and the DML EP tests in all unit tests.
The solution is to use two environment variables, NO_CUDA_TEST and
NO_DML_TEST, in CI.

For example, if NO_CUDA_TEST is set, the DefaultCudaExecutionProvider
will be nullptr and the tests will not run with the CUDA EP; in
debugging, the CUDAExecutionProvider is never called.
I think that as long as CUDA functions, like cudaSetDevice, are not
called, the DML EP tests can pass.

Disabled the Java testDirectML test because it doesn't work now, even
without the CUDA EP.
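The environment-variable gating described above can be sketched as follows. This is an illustrative Java sketch, not the actual C++ test-helper code; only the NO_CUDA_TEST / NO_DML_TEST variable names come from the PR, the helper names are hypothetical:

```java
// Hypothetical sketch of the env-var gating; in the real PR this logic
// lives in the C++ unit tests, and these helper names are illustrative.
public class EpTestGate {
    /** Returns null (no provider) when NO_CUDA_TEST is set, mirroring
     *  DefaultCudaExecutionProvider() returning nullptr. */
    static String defaultCudaProvider() {
        return System.getenv("NO_CUDA_TEST") != null ? null : "CUDAExecutionProvider";
    }

    /** Same gating for the DML EP via NO_DML_TEST. */
    static String defaultDmlProvider() {
        return System.getenv("NO_DML_TEST") != null ? null : "DmlExecutionProvider";
    }

    public static void main(String[] args) {
        System.out.println("CUDA provider: " + defaultCudaProvider());
        System.out.println("DML provider:  " + defaultDmlProvider());
    }
}
```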
2024-10-31 15:51:13 -07:00
jingyanwangms
5036e63d58
Increase TensorRT tolerance from default 1e-5 to 1e-3 after TRT 10.4 (#22321)
### Description
Increase TensorRT tolerance from the default 1e-5 to 1e-3 after TRT 10.4.

2024-10-05 18:06:06 +10:00
Adam Pocock
14d1bfc34b
[java] Multi-LoRA support (#22280)
### Description
Java parts of Multi-LoRA support - #22046.

### Motivation and Context
API equivalence with Python & C#.

---------

Co-authored-by: Dmitri Smirnov <dmitrism@microsoft.com>
2024-10-01 13:54:37 -07:00
Edward Chen
c24e55b1f1
[Java] Add API for appending QNN EP (#22208)
- Add Java API for appending QNN EP
- Update Java unit test setup
  - Fix issues with setting system properties for tests
  - Unify Windows/non-Windows setup to simplify
2024-10-01 10:18:04 -07:00
Adam Pocock
cfa45df6b5
[java] Migrate OnnxTensors created from arrays over to a backing Java buffer (#18556)
### Description
Following from #16578 and #16835 this migrates over
`OnnxTensor.createTensor(<array>)` to first instantiate a
`java.nio.Buffer` and then copy the array into that buffer in Java
before creating the tensor. It also changes the `OnnxTensor.getValue()`
method which returns a multidimensional array so it does the array
construction and value copy in Java. This allows the removal of some
unpleasant recursive C code which repeatedly calls into the JVM to
traverse Java's arrays. The equivalent Java code is still unpleasant and
recursive, but it's easier to reason about and memory safe. As a bonus,
more `OnnxTensor`s are now backed by buffers which allow users to pin
memory and reduce allocations by reusing them for same sized inputs.

Some of the JNI code which parses Java arrays still exists as it's used
by `OnnxMap`, removing that will be the target of a future refactor.
Strings are still processed in JNI as it is easier to work with String
tensors and UTF-8 arrays in C.

### Motivation and Context
Minimizing the amount of JNI code makes it easier to maintain and using
buffers in preference to arrays allows for fewer allocations.
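The copy-into-a-buffer approach described above can be sketched like this. It is an illustrative example, not the actual `OnnxTensor` implementation, and `flatten` is a hypothetical helper:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Illustrative sketch of the PR's approach: copy a Java array into a
// direct buffer with bulk puts, then hand the buffer to native code
// instead of traversing the array recursively from JNI.
public class ArrayToBuffer {
    static FloatBuffer flatten(float[][] data) {
        int rows = data.length, cols = data[0].length;
        FloatBuffer buf = ByteBuffer.allocateDirect(rows * cols * 4)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        for (float[] row : data) {
            buf.put(row);          // bulk copy, no per-element JNI calls
        }
        buf.rewind();
        return buf;
    }

    public static void main(String[] args) {
        FloatBuffer b = flatten(new float[][]{{1f, 2f}, {3f, 4f}});
        System.out.println(b.get(3)); // 4.0
    }
}
```

Because the resulting tensor is buffer-backed, the same buffer can be refilled and reused for later same-sized inputs.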
2024-09-24 15:36:52 +10:00
Adam Pocock
6d7235ba5a
[Java] Exposing SessionOptions.SetDeterministicCompute (#18998)
### Description
Exposes `SetDeterministicCompute` in Java, added to the C API by #18944.

### Motivation and Context
Parity between C and Java APIs.
2024-09-16 11:55:38 +10:00
Adam Pocock
02e00dc023
[java] Adding ability to load a model from a memory mapped byte buffer (#20062)
### Description
Adds support for constructing an `OrtSession` from a
`java.nio.ByteBuffer`. These buffers can be memory mapped from files
which means there doesn't need to be copies of the model protobuf held
in Java, reducing peak memory usage during session construction.

### Motivation and Context
Reduces memory usage on model construction by not requiring as many
copies on the Java side. Should help with #19599.
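Memory-mapping a model file into a buffer can be sketched as below. The mapping itself is standard `java.nio`; the session-creation call is shown only as a comment since it needs the ORT native library, and the exact overload should be checked against the Java API:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch: map a model file read-only so Java never holds a heap copy of
// the model protobuf. Only the mapping is shown here; constructing the
// session from the buffer is the new capability this PR adds.
public class MmapModel {
    static MappedByteBuffer map(Path model) throws IOException {
        try (FileChannel ch = FileChannel.open(model, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }
    // Then, with ORT on the classpath (illustrative, check the actual API):
    //   OrtSession session = env.createSession(map(modelPath), options);
}
```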
2024-09-16 08:31:55 +10:00
Adam Pocock
22437b581b
[java] Fix for OnnxTensor creation when passing in a ByteBuffer containing elements of a different type (#21774)
### Description
Fixes a bug where the buffer offset and position were incorrectly
computed if the user supplied a `ByteBuffer` to `createTensor` but set
the type of the tensor to something other than `INT8`. This would be
more common if the user was trying to load the initializers from a
serialized representation and didn't want to bother with the type
information (which is the case in #21321).

### Motivation and Context
Partial fix for #21321. The remainder of the fix is to add a helper
which allows users to load initializers out of an `onnx_data` file, but
that will require adding protobuf as a dependency for the Java API to
allow the parsing of an ONNX file separately from the native code. It
might be nicer to put that functionality into ORT's C API so it can
return the lengths & offsets of the initializers when provided with an
ONNX file containing external initializers. We hit this kind of thing in
Java more often than other languages as in Java models can be supplied
as classpath resources which we can easily read, but not materialize on
disk for the ORT native library to read.
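The scaling issue behind the fix can be illustrated with a small sketch (hypothetical helper names, not the actual `OnnxTensor` internals): a `ByteBuffer`'s position is in bytes, so offsets and element counts for a wider tensor type must be scaled by that type's width rather than treated as INT8 indices.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative sketch: the position of a ByteBuffer is already in bytes,
// while the element count for a non-INT8 tensor type must divide the
// remaining bytes by that type's width. Names are hypothetical.
public class BufferExtents {
    /** Byte offset of the buffer's current position. */
    static long byteOffset(ByteBuffer buf) {
        return buf.position();
    }

    /** Number of elements remaining for a tensor of the given element width. */
    static long elementCount(ByteBuffer buf, int bytesPerElement) {
        return buf.remaining() / bytesPerElement;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64).order(ByteOrder.nativeOrder());
        buf.position(16); // e.g. skip a 16-byte prefix
        // 48 bytes remain; interpreted as FLOAT (4 bytes each) that is 12 elements
        System.out.println(elementCount(buf, 4));
    }
}
```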
2024-09-13 12:38:17 +10:00
mindest
5b9369e93c
Fix typos according to reviewdog report. (#21335)
### Description
Fix typos based on reviewdog report but with some
exceptions/corrections.
2024-07-22 13:37:32 -07:00
Edward Chen
981893c318
Remove deprecated "mobile" packages (#20941)
# Description

This PR removes the building of the ORT "mobile" packages and much of the associated infrastructure which is no longer needed.

Not removed yet - tools/ci_build/github/android/mobile_package.required_operators.config and the helper scripts that depend on it.

# Motivation and Context

The mobile packages were deprecated in 1.18. Users should use the full packages (Android - onnxruntime-android, iOS - onnxruntime-c/onnxruntime-objc) instead or do a custom build.
2024-06-07 16:20:32 -05:00
Adam Pocock
a36692066d
[java] CUDA & TensorRT options fix (#20549)
### Description
I misunderstood how UpdateCUDAProviderOptions and
UpdateTensorRTProviderOptions work in the C API: I had assumed that they
updated the options struct, but they actually re-initialize the struct
to the defaults and then apply only the values in the update. I've rewritten the
Java bindings for those classes so that they aggregate all the updates
and apply them in one go. I also updated the C API documentation to note
that these classes have this behaviour. I've not checked if any of the
other providers with an options struct have this behaviour, we only
expose CUDA and TensorRT's options in Java.

There's a small unrelated update to add a private constructor to the
Fp16Conversions classes to remove a documentation warning (they
shouldn't be instantiated anyway as they are utility classes containing
static methods).

### Motivation and Context
Fixes #20544.
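The "aggregate all updates, then apply them in one go" pattern described above can be sketched as follows. Class and method names are illustrative, not the actual Java provider-options bindings:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of aggregating option updates and applying them once, since the
// native update call resets the struct to defaults each time it runs.
public class AggregatedOptions {
    private final Map<String, String> pending = new LinkedHashMap<>();

    /** Records an update; nothing is sent to native code yet. */
    public AggregatedOptions add(String key, String value) {
        pending.put(key, value);
        return this;
    }

    /** Applies every recorded update in a single (simulated) native call. */
    public Map<String, String> applyAll() {
        return Map.copyOf(pending);
    }
}
```

Because every recorded key is passed in one call, earlier updates are not lost when the native layer re-initializes the struct to its defaults.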
2024-05-05 00:16:55 -07:00
Adam Pocock
262b6bd3b7
[java][DML EP] Modifying dml_provider_factory.h so it can compile as a C header file (#20157)
### Description
The dml_provider_factory header file can't be used in C programs as it
defines C++ inline operators. This PR rearranges that header file so
that it looks like valid C when used from C, and also makes a couple of
small modifications to the Java code so it correctly binds to the DML EP
at build time.

I'm having some difficulty testing it as I think it's pulling in the old
version of DirectML on my computer and I can't figure out what the
library loading path is in Java to make it look at the recent version I
downloaded. So the test I added fails with:

```
InferenceTest > testDirectML() FAILED
    ai.onnxruntime.OrtException: Error code - ORT_RUNTIME_EXCEPTION - message: Exception during initialization: <path-to-ort>\onnxruntime\core\providers\dml\DmlExecutionProvider\src\AbiCustomRegistry.cpp(518)\onnxruntime.dll!00007FFF74819333: (caller: 00007FFF74793509) Exception(3) tid(4f58) 80070057 The parameter is incorrect.
        at app//ai.onnxruntime.OrtSession.createSession(Native Method)
        at app//ai.onnxruntime.OrtSession.<init>(OrtSession.java:74)
        at app//ai.onnxruntime.OrtEnvironment.createSession(OrtEnvironment.java:236)
        at app//ai.onnxruntime.OrtEnvironment.createSession(OrtEnvironment.java:221)
        at app//ai.onnxruntime.InferenceTest.openSessionSqueezeNet(InferenceTest.java:1961)
        at app//ai.onnxruntime.InferenceTest.runProvider(InferenceTest.java:665)
        at app//ai.onnxruntime.InferenceTest.testDirectML(InferenceTest.java:657)
```

But it does correctly compile, and this error seems very similar to
other issues with the DML provider when it doesn't like a model due to
the loaded library being old. The test is using the squeezenet file
that's been in the repo since 2019. If someone can help me figure out
how to get the right version of DML in the library path I can test it
more on my end. I tried adding the folder with the new version into the
system path, but I'm not very familiar with Windows' library loading
behaviour.

### Motivation and Context
Fixes #19656 to allow use of the DirectML EP from ORT Java.

cc @martinb35
2024-04-01 21:58:50 -07:00
Adam Pocock
2f82400b13
[java] Java 21 build support (#19876)
### Description
Bump spotless and the Gradle wrapper to 6.25.0 and 8.6 respectively to
allow compiling ORT on Java 21. The build still targets Java 8.

I'm not sure if there will be CI changes necessary to use this PR,
specifically for the Gradle version as I don't know if that is cached
somewhere earlier in the CI build process.

The new Gradle version adds a warning that using `--source` and
`--target` to select the Java language version is obsolete, which is
annoying; we can fix it if we decide to only allow building on newer
versions of Java while still supporting running on Java 8.

### Motivation and Context
Java 21 is the latest LTS release of Java and ORT should be able to
build on it.
2024-03-28 15:51:22 -07:00
Tianlei Wu
fbff99a432
Change Java Test Threshold (#19508)
### Description
Increase the threshold to 1e-5 to avoid test failures in CUDA when the
difference is slightly larger than 1e-6. This may be because TF32 is
used in those CUDA tests.

### Motivation and Context


https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1291322&view=logs&j=f2f63060-d9d6-52d0-adee-b97db5a9ab91&t=28e21ca6-87a4-5e1e-0441-72b5e8326f2d

ProviderOptionsTest > testCUDAOptions() FAILED
    org.opentest4j.AssertionFailedError: array contents differ at index [103], expected: <0.0102678> but was: <0.010266338>
        at app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
        at app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
        at app//org.junit.jupiter.api.AssertArrayEquals.failArraysNotEqual(AssertArrayEquals.java:440)
        at app//org.junit.jupiter.api.AssertArrayEquals.assertArrayEquals(AssertArrayEquals.java:290)
        at app//org.junit.jupiter.api.AssertArrayEquals.assertArrayEquals(AssertArrayEquals.java:123)
        at app//org.junit.jupiter.api.AssertArrayEquals.assertArrayEquals(AssertArrayEquals.java:119)
        at app//org.junit.jupiter.api.Assertions.assertArrayEquals(Assertions.java:1360)
        at app//ai.onnxruntime.providers.ProviderOptionsTest.runProvider(ProviderOptionsTest.java:99)
        at app//ai.onnxruntime.providers.ProviderOptionsTest.testCUDAOptions(ProviderOptionsTest.java:43)

https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1293200&view=logs&jobId=f2f63060-d9d6-52d0-adee-b97db5a9ab91&j=f2f63060-d9d6-52d0-adee-b97db5a9ab91&t=28e21ca6-87a4-5e1e-0441-72b5e8326f2d
        
InferenceTest > testCUDA() FAILED
    org.opentest4j.AssertionFailedError: array contents differ at index [103], expected: <0.0102678> but was: <0.010266337>
        at app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
        at app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
        at app//org.junit.jupiter.api.AssertArrayEquals.failArraysNotEqual(AssertArrayEquals.java:440)
        at app//org.junit.jupiter.api.AssertArrayEquals.assertArrayEquals(AssertArrayEquals.java:290)
        at app//org.junit.jupiter.api.AssertArrayEquals.assertArrayEquals(AssertArrayEquals.java:123)
        at app//org.junit.jupiter.api.AssertArrayEquals.assertArrayEquals(AssertArrayEquals.java:119)
        at app//org.junit.jupiter.api.Assertions.assertArrayEquals(Assertions.java:1360)
        at app//ai.onnxruntime.InferenceTest.runProvider(InferenceTest.java:676)
        at app//ai.onnxruntime.InferenceTest.testCUDA(InferenceTest.java:615)
2024-02-14 10:08:46 -08:00
Adam Pocock
191525301f
[java] Updating TensorInfo so it contains the named dimensions (#18962)
### Description
The Java `TensorInfo` object, which is used to describe a tensor's shape
along with the input and output placeholders for a model, couldn't show
any symbolic/named dimensions in that tensor. Now this information is
stored in Java strings on construction and included in the toString.

### Motivation and Context
Setting symbolic dimensions required external information in Java, the
names were not discoverable from within the API.
2024-01-15 14:42:50 -08:00
Adam Pocock
71657d1eb8
[java] Fix double close (#19133)
### Description
The `OnnxValue` and `OrtProviderOptions` implementations now check to
see if they've been closed before accessing the native pointer, and also
before close is called.

### Motivation and Context
Before they could be closed twice which SIGSEGV'd the JVM. Fixes #19125.
2024-01-14 14:53:26 -08:00
Adam Pocock
3456831413
[java] Make the backing byte buffer in an OrtValue accessible (#16578)
### Description
Adds a method to access the backing direct byte buffer from a Java
`OnnxTensor` object, assuming it is backed by a direct byte buffer
(tensors created by ORT's run call or ones created in Java from
multidimensional arrays are not). Also adds a method to check if the
backing byte buffer was copied from the user's buffer supplied on
creation (this could be tested via a pointer comparison from the output
of `getBufferRef` and the user's input buffer, so I'm not sure if it's
necessary).

### Motivation and Context
This is the first part of changes necessary to support output pinning in
Java OrtSession.run/OrtTrainingSession.run calls. I split it out from
the rest of the work as it's useful by itself (e.g. to allow users to
keep a single input tensor and rewrite it each time with new inputs
rather than allocate a fresh one) and the other change will be much more
involved so splitting it makes it easier to review.

cc @yuslepukhin
2023-10-17 10:03:49 -07:00
Adam Pocock
aed43f429a
[java] Enable output pinning in OrtSession and OrtTrainingSession (#16835) 2023-09-26 01:49:13 -07:00
Adam Pocock
03c3e91b0d
[java] Relaxing CoreML test (#16777)
### Description
Reduces precision on the CoreML provider test as it returns slightly
different answers than the other tested providers. Checked on a 2020 13"
M1 MBP.

### Motivation and Context
Fixes Java CoreML test failure after #16763.
2023-08-09 11:43:05 -07:00
Adam Pocock
a1bb670536
[java] Fp16 fix for android/react native (#16832)
### Description
This PR splits out the FP16 conversions into a separate package we can
override in the android build with a version which works on old versions
of Android.

I'm not sure the android build system changes are correct as I haven't
got an android build environment configured on my workstation.
@YUNQIUGUO if the CI build fails we should follow up offline to get my
environment configured so I can iterate on it.

### Motivation and Context
Fixes the CI failure after #16703.
2023-07-25 12:31:32 -07:00
Adam Pocock
a8e776b78b
[java] Adds support for fp16 and bf16 tensors (#16703)
### Description
The Java API currently only supports fp16 output tensors which it
automatically casts to floats on the way out. This PR adds support for
creating fp16 and bf16 tensors (from `java.nio.Buffer` objects or as the
output of models, creation from Java short arrays is not supported),
along with efficient methods for casting `FloatBuffer` into
`ShortBuffer` filled with fp16 or bf16 values and vice versa.

The fp16 conversions use a trick to pull in the efficient conversion
methods added to Java 20, falling back to ports of the MLAS methods
otherwise. The Java 20 methods can be special cased by the C2 JIT
compiler to emit the single instruction on x86 and ARM which converts
fp32<->fp16, or the vectorized versions thereof, so they should be quite
a bit faster than the MLAS ported one.
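The fallback-style conversion can be sketched with the well-known bit-twiddling routine below (round-to-nearest-even fp32 to fp16). This is an illustrative port in the spirit of the fallback described above, not ORT's actual code; on Java 20+ the intrinsified `Float.floatToFloat16` would replace it:

```java
// Illustrative fp32 -> fp16 conversion with round-to-nearest rounding,
// in the spirit of the MLAS-port fallback described above.
public class Fp16 {
    static short floatToFp16(float f) {
        int bits = Float.floatToIntBits(f);
        int sign = (bits >>> 16) & 0x8000;          // sign moved to fp16 slot
        int val = (bits & 0x7fffffff) + 0x1000;     // round mantissa up
        if (val >= 0x47800000) {                    // overflows fp16 range
            if ((bits & 0x7fffffff) >= 0x47800000) {
                if (val < 0x7f800000) {
                    return (short) (sign | 0x7c00); // becomes +/-infinity
                }
                // NaN: keep the top payload bits
                return (short) (sign | 0x7c00 | ((bits & 0x007fffff) >>> 13));
            }
            return (short) (sign | 0x7bff);         // clamp to max fp16
        }
        if (val >= 0x38800000) {                    // normal fp16 value
            return (short) (sign | ((val - 0x38000000) >>> 13));
        }
        if (val < 0x33000000) {                     // too small: +/-0
            return (short) sign;
        }
        int exp = (bits & 0x7fffffff) >>> 23;       // subnormal fp16
        return (short) (sign | ((((bits & 0x7fffff) | 0x800000)
                + (0x800000 >>> (exp - 102))) >>> (126 - exp)));
    }

    public static void main(String[] args) {
        System.out.printf("1.0f  -> 0x%04X%n", floatToFp16(1.0f) & 0xffff);
        System.out.printf("-2.0f -> 0x%04X%n", floatToFp16(-2.0f) & 0xffff);
    }
}
```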

### Motivation and Context
fp16 and bf16 are increasingly popular formats and we've had several
requests for this functionality. Fixes #7003.

cc @yuslepukhin  @cassiebreviu

---------

Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>
2023-07-21 21:14:41 +10:00
Adam Pocock
ba91457183
[java] Adding addExternalInitializers and addInitializer to OrtSession.SessionOptions (#16198)
### Description
Adds support for adding external initializers or overriding initializers
to a session options from Java.

### Motivation and Context
We want to instantiate large models from Java without filesystem access.

cc @yuslepukhin
2023-07-05 12:51:59 -07:00
Baiju Meswani
10ba1e270c
Minimal Build for On-Device Training (#16326)
🛠️ __Changes in this pull request:__

This pull request introduces two significant changes to the project:

- Changing on device training checkpoint format: The current
implementation stores the on device training checkpoint as a sequence of
tensors in multiple files inside a checkpoint folder, which can be
inefficient in terms of storage and performance. In this PR, I have
modified the checkpoint format to utilize the flatbuffer table to save
the checkpoint to a single file, providing a more compact and efficient
representation. The changes around this are twofold:
- Add the checkpoint flatbuffer schema that will generate the necessary
checkpoint source files.
- Update the checkpoint saving and loading functionality to use the new
format.

- Adding support for onnxruntime minimal build: To support scenarios
where binary size is a constraint, I made changes to ensure that the
training build can work well with the minimal build.

🔍 __Open Issues:__
- In order to extract the optimizer type, the existing implementation
re-loaded the onnx optimizer model and parsed it. This is no longer
possible, since the model format can either be onnx or ort. One idea is
to do the same for ort format optimizer model. This needs some
investigation.
- Changes to the offline tooling to generate ort format training
artifacts.
- End-to-end training example showcasing the use of the minimal training
build.
- Add support for export model for inferencing in a minimal build.
2023-06-22 12:27:23 -07:00
Adam Pocock
bca49d62a0
Fixing CoreML in Java (#16231)
### Description
The name of the flag we set when compiling the JNI binding to enable the CoreML EP changed at some point in the past. This PR fixes it by updating the flag in the JNI. I also added a quick smoke test for the CoreML provider to make sure it doesn't crash and can be enabled.

### Motivation and Context
All the EPs should work as expected in Java. Fixes #16230.
2023-06-07 12:24:57 -07:00
Adam Pocock
3c2a11f2f1
[java] Allow the creation of boolean tensors from ByteBuffer (#15556)
### Description
The tensor creation code now allows the creation of boolean tensors from
non-direct `ByteBuffer` instances. It previously only allowed them from
arrays and direct `ByteBuffer` instances and this fixes that
inconsistency. The boolean tensor test has been updated to cover all
three cases.

### Motivation and Context
Fixes #15509.
2023-06-05 09:58:50 -07:00
Adam Pocock
8a1a40ac63
[Java] CheckpointState AddProperty & GetProperty support (#15730) 2023-04-28 09:52:52 -07:00
Ashwini Khade
ccb2243ee7
Update build option for training in java to enable_training_api (#15638)
### Description
Updating the build option for enabling training in Java builds from
ENABLE_TRAINING -> ENABLE_TRAINING_APIS.
In the native codebase, ENABLE_TRAINING is used for enabling full
training and ENABLE_TRAINING_APIS is used for creating the lite builds
with training APIs. Making the change to sync the naming convention
across all the language bindings.

It was a bit confusing to see ENABLE_TRAINING when debugging the Android
build failures for training. Making this change just to improve the
readability of logs during debugging.

2023-04-24 11:53:08 -07:00
Adam Pocock
ef11032c89
[java] Allows the creation and extraction of zero length tensors (#15116)
### Description
Allows the creation of zero length tensors via the buffer path (the
array path with zero length arrays still throws as the validation logic
to check it's not ragged would require more intrusive revision), and
allows the `tensor.getValue()` method to return a Java multidimensional
array with a zero dimension. Also added a test for the creation and
extraction behaviour.

### Motivation and Context
The Python interface can return zero length tensors (e.g. if object
detection doesn't find any objects), and before this PR in Java calling
`tensor.getValue()` throws an exception with a confusing error message.
Fixes #7270 & #15107.
2023-04-05 10:49:59 -07:00
Edward Chen
c46c7ccba5
Update Gradle version (#14862)
- Update Gradle version used in most places from 6.8.3 to 8.0.1. Update Android Gradle Plugin version where applicable.
  Not updated in this change: React Native Android projects (under `js/react_native/`). That can be done later along with updating the React Native projects.

- Add Gradle wrapper in `java/` to make it easier to consistently use a specific Gradle version.
2023-03-08 12:22:06 -08:00
Adam Pocock
47f00b5d49
[Java] Initial on device training support (#14027)
contributor: @Craigacp
2023-03-08 10:01:08 -08:00
Adam Pocock
150043f74f
Adds a Java accessor for GetVersionString (#14876)
### Description
Java part of #14873.
2023-03-07 09:46:56 -08:00
Scott McKay
114f18357a
Add Java and Objective-C bindings for RegisterCustomOpsUsingFunction. (#14256)
Description
Add bindings for Android and iOS.

Motivation and Context
Enable mobile app linking against ort-extensions library and registering the custom ops with ORT.
2023-01-13 09:04:26 -08:00
Adam Pocock
dd2c031d95
[java] Sparse tensor support (#10653)
**Description**:

Adds support for creating and receiving sparse tensors in the ORT Java
API.

CSRC and COO tensors as inputs are tested, but there is no op which
accepts a block sparse tensor to test. COO tensors are tested as
outputs, but there is no op which emits a CSRC or block sparse tensor to
test.

**Motivation and Context**
Request to expose ORT sparse tensor support in Java.

cc @yuslepukhin
2022-11-22 10:29:24 -08:00
Adam Pocock
388d3cf847
[Java] Fix OnnxSequence semantics (#13012)
Previously OnnxSequence would flatten out a list of tensors into a
single output array assuming they were all scalar values. This doesn't
accurately represent the semantics of an ONNX sequence, but was what the
semantics appeared to be years ago when I first wrote that class. This
PR changes it so that the `getValue` method on `OnnxSequence` unwraps
the sequence and returns `List<? extends OnnxValue>` allowing the user
to process the individual ONNX values separately. It's done this way
rather than returning a multidimensional array for a tensor and a Java
map for a map as multidimensional arrays are very inefficient in Java
and best practice when operating with an OnnxTensor in Java is to use a
`java.nio.ByteBuffer`. So allowing users to access each `OnnxTensor`s
individually allows them to control how the data is materialised on the
Java heap.
2022-09-28 15:53:30 -07:00
RandySheriffH
a83a9ed6b0
Remove miscellaneous nuphar configs (#13070)
Remove a handful of nuphar related configurations after deprecation.

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2022-09-26 13:41:28 -07:00
Edward Chen
454f77cd94
Update kernel matching logic: decouple from op schemas and remove kernel def hashes (#12791)
# Motivation
Currently, ORT minimal builds use kernel def hashes to map from nodes to
kernels to execute when loading the model. As the kernel def hashes must
be known ahead of time, this works for statically registered kernels.
This works well for the CPU EP.
For this approach to work, the kernel def hashes must also be known at
ORT format model conversion time, which means the EP with statically
registered kernels must also be enabled then. This is not an issue for
the always-available CPU EP. However, we do not want to require that any
EP which statically registers kernels is always available too.
Consequently, we explore another approach to match nodes to kernels that
does not rely on kernel def hashes. An added benefit of this is the
possibility of moving away from kernel def hashes completely, which
would eliminate the maintenance burden of keeping the hashes stable.

# Approach
In a full build, ORT uses some information from the ONNX op schema to
match a node to a kernel. We want to avoid including the ONNX op schema
in a minimal build to reduce binary size. Essentially, we take the
necessary information from the ONNX op schema and make it available in a
minimal build.
We decouple the ONNX op schema from the kernel matching logic. The
kernel matching logic instead relies on per-op information which can
either be obtained from the ONNX op schema or another source.
This per-op information must be available in a minimal build when there
are no ONNX op schemas. We put it in the ORT format model.
Existing uses of kernel def hashes to look up kernels are replaced
with the updated kernel matching logic. We no longer store
kernel def hashes in the ORT format model’s session state and runtime
optimization representations. We no longer keep the logic to
generate and ensure stability of kernel def hashes.
2022-09-20 14:24:59 -07:00
Cheng
76d17b0f48
Add java API for xnnpack (#12788)
* Add java API for xnnpack

* provider option support

* a more general interface for creating EP
2022-09-03 08:29:40 +08:00
Yulong Wang
1a402a3f25
replace 'master' branch ref to 'main' for onnx repo (#12678) 2022-08-30 13:41:42 -07:00
Yulong Wang
c144acc534
Replace 'master' branch ref to 'main' in the code (#12547) 2022-08-22 10:48:12 -07:00
Mina Asham
6cd1931a93
Specify list/map capacity when initializing where possible (#11110)
* Specify list/map capacity when initializing where possible

- This really depends on the use case, but in some cases array/map resizing can be slightly costly; there is effectively no downside to setting the initial capacity for a collection if we know its final size for sure
- Introduce an extra utility to help create maps with expected capacity

* Move utility function to OrtUtil and drop MapUtil, also add Java doc to method

* Move test to the right class
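The capacity helper mentioned above can be sketched like this (the name is hypothetical; the actual method was moved into `OrtUtil`). A `HashMap` resizes once its size exceeds capacity times load factor, so the initial capacity must be padded for the default 0.75 load factor:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a capacity helper: picks an initial capacity large enough
// that inserting `expected` entries never triggers a resize with the
// default 0.75 load factor. Name is illustrative, not the OrtUtil API.
public class Capacity {
    static <K, V> Map<K, V> newMapWithExpectedSize(int expected) {
        return new HashMap<>((int) Math.ceil(expected / 0.75) + 1);
    }
}
```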
2022-04-27 20:59:18 -07:00
Adam Pocock
9616ad483f
[Java] Support configuring CUDA and TensorRT execution providers (#10697)
Java side parts for configuring CUDA and TensorRT.
Adding tests for CUDA and TensorRT. Refactoring library loading logic as provider options need to have their shared library loaded before they can be constructed.
2022-03-30 14:26:51 -07:00
Adam Pocock
4ef81b142d
Making the Java tests faster by optionally disabling ones which require running multiple JVMs. (#10811) 2022-03-08 22:19:37 -08:00
Adam Pocock
f856608599
[java] Changes OrtEnvironment so it can't be closed by users (#10670)
* Changes OrtEnvironment so it can't be closed by users.

* Fix the formatting and add a same instance check.
2022-02-28 21:03:40 -08:00
Adam Pocock
e47434ea12
[java] Adding the graph description to the exposed model metadata. (#10318) 2022-02-28 10:05:03 -08:00