Commit graph

20 commits

Author SHA1 Message Date
Changming Sun
82036b0497
Remove references to the outdated CUDA EP factory method (#21549)
The function "OrtSessionOptionsAppendExecutionProvider_CUDA" is
deprecated.
2024-07-29 21:59:16 -07:00
Hector Li
55e0aaeeef
fix android build issue (#20389)
fix android build issue
2024-04-19 14:21:34 -07:00
Hector Li
5daeb5e0b0
enable model with external data be loaded from memory buffer (#19089)
### Description
Background:
User save large model with initializer data in external file. e.g:
onnx.save_model(onnx_model, "path/to/save/the/model.onnx", save_as_external_data=True, all_tensors_to_one_file=True,
location="filename", size_threshold=1024).
In that case, Ort loads the model, get the external initializer information (external file name, offset, length) and use the model path to find the external file, and locate to the tensor data via the offset and length.
But it won't work if user load the model from memory, since Ort lost track of the model path.

This PR adds API/session option to let user provide a table with external initializer file name as the key, the pointer to the loaded external file in memory and the buffer length as value. So that

1. user can load the model from memory buffer with external initializers in memory buffer too.
2. the initializers can be shared across sessions, for different EPs.
3. user can load the file in any way they want, e.g mmap.

Internally, 
1. at session creation time, Ort goes through the external initializers in the graph, gets the file name, offset, data length of the external initializers from Tensorproto .
2. With the file name, Ort get the file in memory buffer and buffer length from the table user provided.
4. Ort locates the tensor buffer from file in memory buffer (user provided) using the offset and data length (from Tensorproto ).
5. Ort creates the Tensor and replace the existing Tensor in the graph.


### Motivation and Context
https://github.com/onnx/onnx/blob/main/docs/ExternalData.md
For a model with external data, the Tensorproto may have initializer data in a separate file. The external file location is set via the file path relative to the model path. With the API to load model from memory buffer, it lost track of the
model path. So it causes error if the model has external data. By adding a session option to set the external data buffer, Ort can find the external data correctly if model loaded from memory buffer.
2024-04-17 19:01:01 -07:00
Edward Chen
1b8d5c43c2
Fix builds (#16646)
- Fix some more `shorten-64-to-32` warnings
- Move minimum build.py Python version back to 3.6
2023-07-11 19:21:25 -07:00
Justin Chu
cf19c3697d
Run clang-format in CI (#15524)
### Description

Run clang-format in CI. Formatted all c/c++, objective-c/c++ files.

Excluded

```
    'onnxruntime/core/mlas/**',
    'onnxruntime/contrib_ops/cuda/bert/tensorrt_fused_multihead_attention/**',
```

because they contain assembly or is data heavy


### Motivation and Context

Coding style consistency
2023-04-18 09:26:58 -07:00
RandySheriffH
a061fedb5d
Exclude affinity-setting logic from minimal build (#13967)
Comment out the affinity-setting logic which introduced an unnecessary
binary size increase for the minimal build.

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2022-12-15 14:43:42 -08:00
RandySheriffH
75584c5fa8
Enabling thread pool to be numa-aware (#13778)
The PR enables ort thread pool to be numa-aware, so that threads could
be evenly created and distributed among numa nodes.
In addition, to facilitate performance tuning, the PR opens a new API
allowing customers to attach threads to certain logical processors.
Please check the API
[definition](https://github.com/microsoft/onnxruntime/pull/13778/files#diff-5845a5c76fb64abdc8f0cffe21b37f8da1712674eb3abc4cd87190891be1bd48)
for details.

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2022-12-12 10:33:55 -08:00
Dmitri Smirnov
2700261f7c
Provide an API to supply external initializers data from user buffers (#11109)
Imlpement AddExternalInitializers
2022-04-07 12:21:53 -07:00
Ryan Hill
cc9f793b48
Move one function from cuda_provider_factory.h (#8407) 2021-07-19 17:55:59 -07:00
alonre24
374acf1423
Disable external initializers build option (#7635)
* Merge set custom allocator to master

* Add documentation for the new API.
Reset global env in testCustomArenaAllocator so won't have a registered allocator of type arena (from previous test)

* Add a session option config that will allow to disable loading model with initializers that have an external data (+test it).

* Add the model used for the test and its external initializers data

* Change the session config option that disable external initializers to a build option.

* Addressing PR comments
2021-05-19 14:16:36 -07:00
Scott McKay
b5c2932ae8
Last major set of ORT format model changes (#5056)
* Add minimal build option to build.py
Group some of the build settings so binary size reduction options are all together
Make some cmake variable naming more consistent
Replace usage of std::hash with murmurhash3 for kernel. std::hash is implementation dependent so can't be used.
Add initial doco and ONNX to ORT model conversion script
Misc cleanups of minimal build breaks.
2020-09-05 07:59:01 +10:00
gwang-msft
64237d999c
Add Cmake config for onnxruntime_NO_EXCEPTIONS (#4975)
* additional noexception setting, added compile options

* more no exception changes

* addressed PR comments

* Fix build issue when MSVC static library is used.

* Clarify comment

* add fatal message for onnxruntime_NO_EXCEPTIONS enabled without onnxruntime_MINIMAL_BUILD

Co-authored-by: Scott McKay <skottmckay@gmail.com>
2020-09-01 10:17:50 -07:00
KeDengMS
ade4fa108f
Disable delayload for cuda dlls (#3147)
This change fixes #3129. When running onnxruntime as dll on Windows, CUDA does some internal cleanups when process exits. After this, any call to CUDA would cause crash. Delayload makes thread_local destructor to happen after CUDA cleanup, thus the crash.
2020-03-05 14:40:22 -08:00
Changming Sun
7ff5c0e5a3
CMake changes (#2961)
1. Add support for vstest. 
2. Add support for vcpkg. To use it:
  ```bat
   vcpkg install zlib:x64-windows benchmark:x64-windows gtest:x64-windows protobuf:x64-windows pybind11:x64-windows re2:x64-windows
   mkdir build
   cmake ..\cmake -DCMAKE_BUILD_TYPE=Debug -A x64 -T host=x64 -DCMAKE_TOOLCHAIN_FILE=C:\vcpkg\scripts\buildsystems\vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows -Donnxruntime_PREFER_SYSTEM_LIB=ON
  ```
3. New cmake option: onnxruntime_PREFER_SYSTEM_LIB, which allows user using the preinstall libs instead of the things in onnxruntime submodule.
4. New cmake option: onnxruntime_ENABLE_MEMLEAK_CHECKER, which allows user turn on/off the memory leak checker by @RyanUnderhill in Windows Debug Build. The checker doesn't work with vstest.
4. Fix the post merge pipeline(Mainly for test coverage report).
5. Ignore the compile warning from the Featurizer library code
6. Apply "/utf-8" VC compile flag to our code. Without this, you can't build onnxruntime on Chinese Windows.
7. Remove the SingleUnitTestProject cmake option because it's deprecated more than one year and nobody is using it.
8. Move opaque api tests to onnxruntime_test_all
9. Enable "/W4" on CUDA ep's C++ code(Not the *.cu files), and fix some warnings, add some extra checks.
10. Delete the onnxruntime::test::TestEnvironment class.
11. Add a DLLmain for onnxruntime.dll. 
12. Allow dynamic link to libprotobuf
2020-02-03 19:33:14 -08:00
Ryan Hill
5781222456
Ryanunderhill/api interface (#1855)
* Convert ABI to a versioned interface.
* Convert ORT_THROW_ON_ERROR to inline function to fix link errors.
2019-09-20 13:39:11 -07:00
Scott McKay
e3919d3fce
Cleanup naming of test input to use .onnx for models. (#1337)
* Cleanup naming of test input to use .onnx for models.

* Remove file deleted on master
2019-07-04 13:10:29 +10:00
Ryan Hill
9129a652c5
Ryanunderhill/cxx api2 (#1091)
More C++ API improvements and cleanup
Add templates to tensor creation
Add run method that allows preallocated outputs
Simplify CreateTensor<T> to multiply by sizeof(T)
Convert io_types code
Optimize away vector copies in Session::Run
2019-05-24 11:15:51 -07:00
Mika Fischer
c0acb8b6c3 Allow loading model from in-memory byte-array (#718) 2019-05-11 05:58:50 +08:00
Pranav Sharma
5d452b3029
Use protobuf-lite to reduce onnxruntime.dll size. (#639)
* Test protobuf-lite

* Test protobuf-lite

* Test protobuf-lite

* Optimize protobuf usage for LITE_RUNTIME to reduce the binary size of
onnxruntime.dll. More details can be found here https://developers.google.com/protocol-buffers/docs/proto.
The reduction is significant. For commit id: 4873b452151bafe49da332aaeab639ef0318fc1ca28d728, the size
reduced by ~700K; from 4873728 to 4172800.

* Add LITE_RUNTIME flag in in.proto files

* Fix merge conflict.

* Address PR comments

* Forgot to add 2 files + fix linux and gpu build errors.

* Fix build errors + test failures

* Fix cuda tests

* Fix tensor rt build

* Use full protobuf for trt

* Address PR comments

* Print tensor shape proto as text string for easier debugging
2019-03-21 14:06:38 -07:00
Changming Sun
8e0fff7b8d
Support large model(>2GB) (#520)
1. Support the new external data extension in ONNX 1.4 onnx/onnx#678
2. Enable onnxruntime_perf_test in Mac Build
3. move path_lib.h from onnx_test_runner source dir to onnxruntime_framework
4. Enable memory planner for string tensors
5. Make memory planner always enabled, to simplify model loading logic
6. Delete some duplicated code between onnxruntime_perf_test and onnx_test_runner
7. Delete win_getopt_mb lib.
8. Remove the dependency on Pathcch lib, which is only available on Windows 8 and newer.
2019-03-05 21:27:12 -08:00