Commit graph

19 commits

Author SHA1 Message Date
Vincent Wang
198fc901b1
Cherry-pick 1st Round (#17308)
Cherry-pick 1st round for rel-1.16.0 from
https://github.com/microsoft/onnxruntime/issues?q=label%3Arelease%3A1.16+label%3Atriage%3Aapproved+is%3Aclosed
except #17201 because it caused UT failure and is not fixed yet.

PR list:
#16417
#16936
#17000
#17236
#17238
#17240
#17252
#17255
#17258
#17265
#17267
#17277
2023-08-28 12:34:27 -07:00
Dmitri Smirnov
322237f482
[C#] Implement OrtValue APIs (#16206)
### Description

Expose `OrtValue` class API as first-class citizen.
Make it simular with C++ API.
Enable safe direct native memory access.
Make string tensor manipulation more efficient.
Avoid intermediate structures such as `NamedOnnxValue`,
`DisposableNamedOnnxvalue` and etc.

Provide more examples with `IOBinding`, although `OrtValue` API
potentially makes `IOBinding` redundant for most of scenarios, since
`OrtValue` can be created on top of any memory.

Run all the pre-trained models now with `OrtValue` API as well.
Obsolete `OrtExternalMemory class`. Obsolete IOBinding API that takes
`FixedBufferOnnxValue`.

### Motivation and Context
Make the API efficient and uniform with C++.

This aspires to address: 
https://github.com/microsoft/onnxruntime/issues/14918
https://github.com/microsoft/onnxruntime/issues/15381

Cc: @Craigacp
2023-06-29 08:59:23 -07:00
Baiju Meswani
1f60414bc2
Load CheckpointState from a buffer (#16457) 2023-06-26 09:18:38 -07:00
Baiju Meswani
de0a973b6e
[Bug Fix] Incorrect comparison for FromBuffer in TrainingSession.cs (#16022) 2023-05-22 21:21:54 -07:00
Baiju Meswani
6b7181d31d
Add C# API documentation for training (and some other changes) (#15935) 2023-05-16 03:15:24 -07:00
Changming Sun
1fb2f2605b
Update VERSION_NUMBER (#15773)
### Description

1. Update VERSION_NUMBER for preparing the upcoming release. This PR's
commit will not be included in the 1.15 release branch
2. Delete package/rpm/onnxruntime.spec since it was not used in past
years.

### Motivation and Context
Preparing the release.

Fixed
[AB#15311](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/15311)
2023-05-03 15:07:34 -07:00
Baiju Meswani
bb33285ec2
C# training api updates for on device training (#15720) 2023-05-01 10:01:38 -07:00
Baiju Meswani
b5a1941835
C, C++, Python, C# API update for on device training (#15518) 2023-04-21 11:36:01 -07:00
Dmitri Smirnov
a5dec8eedf
[C# ] Improve string marshalling and reduce GC pressure (#15545)
### Description

  Reduce a number of auxillary objects created to reduce GC pressure.
Eliminate GCHandle type of memory pinning in most of the places.
Improve string marshalling by allocating unmanaged memory that does not
require pinning. Change native methods from `IntPtr` to `byte[]`
(marshalling pinning is more efficient).

Allocate input/output UTF-8 names in unmanaged heap for the lifetime of
InferenceSession. So we do not keep converting them and pinning on every
Run.

Introduce a new native API that allows to allocate and convert/copy
strings directly into a native tensor.

The PR delivers around 50% latency improvements and less GC pauses.

Inspired by: https://github.com/microsoft/onnxruntime/pull/15520

### Motivation and Context
Client experience GC pressure and performance degradation when dealing
with string tensors.


Co-Authored-By: @tannergooding
2023-04-20 15:12:51 -07:00
Dmitri Smirnov
a66af390fa
[C#] Allow passing various options when creating singleton Environment object. (#14723)
### Description
Re-work OrtEnv class so we can pass various options when creating the
environment such as:
- logId
- initial logging level
- thread options
- user supplied logging function

Create the default instance when SessionOptions are instantiated as
users often forget to do so.

### Motivation and Context
We lack this capability.
Inspired by
https://github.com/microsoft/onnxruntime/pull/13822
https://github.com/microsoft/onnxruntime/pull/13951
https://github.com/microsoft/onnxruntime/pull/11593


Cc: @thoron

---------

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
2023-04-18 21:49:55 -07:00
Baiju Meswani
3d8fa4d77b
GetTrainingApi to not print to stderr when not an ort training build (#14515) 2023-02-02 13:28:32 -08:00
Ashwini Khade
d92c663f28
Create dedicated build for training api (#14136)
### Description
Enable creating dedicated build for on device training. With this PR we
can build a lean binary for on device training using flag
--enable_training_apis. This binary includes only the essentials like
training ops, optimizers etc and NOT features like Aten fallback,
strided tensors, gradient builders etc . This binary also removes all
the deprecated components like training::TrainingSession and OrtTrainer
etc

### Motivation and Context
This enables our partners to create a lean binary for on device
training.
2023-01-10 20:58:04 -08:00
Ashwini Khade
68b5b2d7d3
Refactor training build options (#13964)
### Description
1. Renames all references of on device training to training apis. This
is to keep the naming general. Nothing really prevents us from using the
same apis on servers\non-edge devices.
2. Update ENABLE_TRAINING option: With this PR when this option is
enabled, training apis and torch interop is also enabled.
3. Refactoring for onnxruntime_ENABLE_TRAINING_TORCH_INTEROP option: 
   -  Removed user facing option
- Setting onnxruntime_ENABLE_TRAINING_TORCH_INTEROP to ON when
onnxruntime_ENABLE_TRAINING is ON as we always build with torch interop.

Once this PR is merged when --enable_training is selected we will do a
"FULL Build" for training (with all the training entry points and
features).
Training entry points include:
1. ORTModule
2. Training APIs

Features include:
1. ATen Fallback
2. All Training OPs includes communication and collectives
3. Strided Tensor Support
4. Python Op (torch interop)
5. ONNXBlock (Front end tools for training artifacts prep when using
trianing apis)

### Motivation and Context
Intention is to simply the options for building training enabled builds.
This is part of the larger work item to create dedicated build for
learning on the edge scenarios with just training apis enabled.
2023-01-03 13:28:16 -08:00
Baiju Meswani
5a55fac402
Miscellaneous updates to training apis (#13929) 2022-12-14 13:33:07 -08:00
Ashwini Khade
a7bc927b4b
fix typos in training apis (#13908)
### Description
This PR fixes some typos in the training apis.

We need to add more tests and make sure they are all run on the CIs to
capture such issues. These changes are out of scope of this PR.



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

Co-authored-by: Ashwini Khade <askhade@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-12-09 16:01:11 -08:00
Ashwini Khade
65201e47bf
Enable nuget packages for on device training (#13637)
### Description
This PR enables building nuget packages locally for on device training
using --build_nuget arg.
This PR also enables the C# bindings by default in the managed package.
If a user triggers any training apis when the native binary is not built
for training, an exception with message "Training is disabled in the
current build. Please build ONNXRuntime from source with the build flags
enable_training and enable_training_on_device. " is thrown.

Build command for creating nuget packes for on device training:
build.bat --enable_training --enable_training_on_device --build_nuget 

2 Nuget packages are built
1. Microsoft.ML.OnnxRuntime.Managed
2. Microsoft.ML.OnnxRuntime.Training OR
Microsoft.ML.OnnxRuntime.Training.Gpu



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2022-12-05 14:54:09 -08:00
Baiju Meswani
a46c599a40
Training API to export the eval model to an inference model (#13345) 2022-10-27 09:34:01 -07:00
Ashwini Khade
4fc8f7139a
Bug Fix - C# API order incompatibile with C API (#13191)
### Description
Training C# bindings (ReleaseTrainingSession and ReleaseCheckpointState)
broke after an API order change in Training C API. This PR fixes this
issue.



### Motivation and Context
Bug Fix for Training C# bindings
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2022-10-04 09:29:20 -07:00
ashbhandare
27dde0b51f
Csharp bindings for on-device training APIs (#12404) 2022-09-02 13:13:48 -07:00