* Add iOS test pipeline and a sample app.
* clean up the unused code.
* clean up.
* revert the unknown change
* disable the shared library for iOS.
* add open source notice text.
* ignore the skipped test.
* extract the common ortenv setup
* Add ORTTrainerOptions class for the new pytorch frontend (#4382)
Add ORTTrainerOptions class and some placeholders
* Add _ORTTrainerModelDesc to perform validation for model description (#4416)
* Add Loss Scaler classes to the new frontend (#4306)
* Add TrainStepInfo used on the new frontend API (#4256)
* Add Optimizer classes to the new frontend (#4280)
* Add LRScheduler implementation (#4357)
* Add basic ORTTrainer API (#4435)
This PR introduces the public API for ORTTrainer for short-term
development.
It also validates and saves the input parameters, which will be used in the
next stages, such as building the ONNX model, post-processing it, and
configuring the training session.
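The shape of that early API can be sketched in plain Python (the constructor signature mirrors the frontend; the validation rules shown are illustrative assumptions, not the actual implementation):

```python
# Illustrative stand-in for the experimental ORTTrainer constructor,
# which at this stage only validates and stores its inputs.
class ORTTrainerSketch:
    def __init__(self, model, model_desc, optim_config, loss_fn=None, options=None):
        if model is None:
            raise ValueError("'model' must not be None")
        if not isinstance(model_desc, dict):
            raise TypeError("'model_desc' must be a dict")
        if loss_fn is not None and not callable(loss_fn):
            raise TypeError("'loss_fn' must be callable when provided")
        # Parameters are saved for later stages: ONNX export,
        # model post-processing, and training session configuration.
        self.model = model
        self.model_desc = model_desc
        self.optim_config = optim_config
        self.loss_fn = loss_fn
        self.options = options if options is not None else {}
```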
* Add opset_version into ORTTrainerOptions and change type of ORTTrainer.loss_fn (#4592)
* Update ModelDescription and minor fix on ORTTrainer ctor (#4605)
* Update ModelDescription and minor fix on ORTTrainer/ORTTrainerOptions
This PR keeps the public API intact but changes how the model description is stored on the backend.
Currently, users create a dict with two lists of tuples.
One list is called 'inputs', and each of its tuples has the format tuple(name, shape).
The second list is called 'outputs', and each tuple can be either tuple(name, shape) or tuple(name, shape, is_loss).
With this PR, when this dict is passed into ORTTrainer, it is fully validated as usual.
However, the tuples are internally replaced by namedtuples, and all output tuples take the
tuple(name, shape, is_loss) format instead of is_loss being optionally present.
In addition to that normalization of the internal representation (which eases coding),
two internal methods were created to replace a namedtuple(name, shape) with namedtuple(name, shape, dtype)
or namedtuple(name, shape, is_loss, dtype), depending on whether the tuple is an input or an output.
This is necessary because ORTTrainer discovers the data type of each input/output during model export to ONNX.
Finally, a minor fix was made to ORTTrainer: it could initialize ORTTrainerOptions incorrectly when options=None.
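The normalization described above can be sketched as follows (the namedtuple names and helper are illustrative, not the actual internal identifiers):

```python
from collections import namedtuple

# Hypothetical namedtuples standing in for the internal representation.
ModelInput = namedtuple("ModelInput", ["name", "shape"])
ModelOutput = namedtuple("ModelOutput", ["name", "shape", "is_loss"])

def normalize_model_desc(model_desc):
    """Replace the user's plain tuples with namedtuples, forcing every
    output into (name, shape, is_loss) form, with is_loss defaulting
    to False when the user omitted it."""
    inputs = [ModelInput(*i) for i in model_desc["inputs"]]
    outputs = []
    for o in model_desc["outputs"]:
        is_loss = o[2] if len(o) == 3 else False
        outputs.append(ModelOutput(o[0], o[1], is_loss))
    return {"inputs": inputs, "outputs": outputs}
```

For example, a user-supplied `{"inputs": [("x", [2, 784])], "outputs": [("loss", [], True), ("probs", [2, 10])]}` comes out with every output carrying an explicit `is_loss` field.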
* Rename input name for test
* Add ONNX Model Export to New Frontend (#4612)
Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
* Create training session + minor improvements (#4668)
Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
* Save ONNX model in file (#4671)
Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
* Add eval step (#4674)
Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
* Add train_step (#4677)
Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
* Add LR Scheduler (#4694)
Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
* Add deterministic compute tests (#4716)
Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
* Add legacy vs experimental ORTTrainer accuracy comparison (#4727)
Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
* Add Mixed precision/LossScaler + several fixes (#4739)
In addition to the mixed precision/loss scaler code, this PR includes:
* Fix CUDA training
* Add optimization_step into TrainStepInfo class
* Refactor LRScheduler to use optimization_step instead of step
* Update several default values in ORTTrainerOptions
* Add initial (untested) Gradient Accumulation support
* Fix ONNX model post processing
* Refactor unit tests
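A minimal sketch of the dynamic loss-scaling behavior this adds (the class name, constants, and update policy here are assumptions for illustration, not ORT's actual loss scaler):

```python
class DynamicLossScalerSketch:
    """Illustrative dynamic loss scaler: grow the scale after a run of
    overflow-free steps, back off immediately when gradients overflow."""

    def __init__(self, init_scale=2.0**16, up_factor=2.0, down_factor=0.5,
                 window=2000, max_scale=2.0**24, min_scale=1.0):
        self.loss_scale = init_scale
        self.up_factor = up_factor
        self.down_factor = down_factor
        self.window = window          # good steps required before growing
        self.max_scale = max_scale
        self.min_scale = min_scale
        self._good_steps = 0

    def update(self, all_finite):
        """Call once per optimization step with whether grads were finite."""
        if all_finite:
            self._good_steps += 1
            if self._good_steps >= self.window:
                self.loss_scale = min(self.loss_scale * self.up_factor,
                                      self.max_scale)
                self._good_steps = 0
        else:
            # Overflow: shrink the scale and restart the counting window.
            self.loss_scale = max(self.loss_scale * self.down_factor,
                                  self.min_scale)
            self._good_steps = 0
        return self.loss_scale
```

The loss is multiplied by `loss_scale` before backward, and gradients are divided by it before the optimizer step; the scale adapts so fp16 gradients stay representable.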
* Add ONNX BERT example + minor fixes (#4757)
* Fix training issue when passing ONNX file into ORTTrainer
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
* Add Dynamic Shape support (#4758)
* Update DeepSpeed Zero Stage option to a separate option group (#4772)
* Add support to fetches (#4777)
* Add Gradient Accumulation Steps support (#4793)
* Fix Dynamic Axes feature and add unit test (#4795)
* Add frozen weights test (#4807)
* Move new pytorch front-end to 'experimental' namespace (#4814)
* Fix build
Co-authored-by: Rayan-Krishnan <rayankrishnan@live.com>
Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
* Move nnapi dnnlib to subfolder
* dnnlib compile settings
* add nnapi builtin option to build.py
* add onnxruntime_USE_NNAPI_BUILTIN
* compile using onnxruntime_USE_NNAPI_BUILTIN
* remove dnnlib from built in code
* Group onnxruntime_USE_NNAPI_BUILTIN sources
* add file stubs
* fix java 32-bit compile error
* built in nnapi support 5-26
* init working version
* initializer support
* fix crash on free execution
* add dynamic input support
* bug fixes for dynamic input shape, add mul support, working on conv and batchnorm
* Add batchnormalization, add overflow check for int64 attributes
* add global average/max pool and reshape
* minor changes
* minor changes
* add skip relu and options to use different types of memory
* small bug fix in the relu operator
* bug fix for nnapi
* add transpose support, minor bug fix
* Add transpose support
* minor bug fixes, depthwise conv weight fix
* fixed the bug where the onnx model inputs were in a different order than the nnapi model inputs
* add helper to add scalar operand
* add separate opbuilders to handle single operators
* add cast operator
* fixed reshape, moved some logs to verbose
* Add softmax and identity support, change shaper calling signature, and add support for int32 output
* changed the way the NNAPI model is executed
* move NNMemory and InputOutputInfo into Model class
* add limited support for input dynamic shape
* add gemm support, fixed crash when allocating big array on stack
* add abs/exp/floor/log/sigmoid/neg/sin/sqrt/tanh support
* better dynamic input shape support
* add more check for IsOpSupportedImpl, refactored some code
* some code style fix, switch to safeint
* Move opbuilders to a map with single instance, minor bug fixes
* add GetUniqueName for new temp tensors
* change from throwing std exceptions to ORT_THROW
* build settings change and 3rd party notice update
* add readme for nnapi_lib, move to ort log, add comments to public functions, clean the code
* add android log sink and more logging changes, add new string for NnApiErrorDescription
* add nnapi execution options/fp16 relax
* fix a dnnlibrary build break
* addressed review comments
* address review comments; change how subgraph outputs are added in NnapiExecutionProvider::GetCapability; minor issue fixes
* formatting in build.py
* more formatting fix in build.py, return fail status instead of throw in compute_func
* moved android_log_sink to platform folder, minor coding style changes
* addressed review comments
* Add amd migraphx execution provider to onnx runtime
* rename MiGraphX to MIGraphX
* remove unnecessary changes in migraphx_execution_provider.cc
* add migraphx EP to tests
* add input requirements for the batchnorm operator
* add support for the onnx operator PRelu
* update migraphx dockerfile and remove one unused line
* sync submodules with master branch
* fixed a small bug
* fix various bugs to run msft real models correctly
* some code cleanup
* fix python file format
* fixed a code style issue
* add default provider for migraphx execution provider
Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>
* Initial update of readme
* Readme updates
* Review of consolidated README (#3930)
* Proposed updates for readme (#3953)
I found some of the information was duplicated within the doc, so I attempted to streamline it
* Fix links
* More updates
- fix build instructions
- nodejs doc reorganization
- roadmap update
- version fixes
* Update ORT Server build instructions
* More doc cleanup
* fix python dev notes name
* Update nodejs and some links
* sync eigen version back to master
* Minor fixes
* add nodejs to sample table of contents
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* address PR feedback
* address PR feedback
* nodejs build instruction
* Update Java instructions to include gradle
* Roadmap refresh
Reformat some data, fix link, minor rewording
* Clarify Visual C++ runtime req
Co-authored-by: Nat Kershaw (MSFT) <nakersha@microsoft.com>
Co-authored-by: Prasanth Pulavarthi <prasantp@microsoft.com>
Co-authored-by: manashgoswami <magoswam@microsoft.com>
This change adds a new execution provider powered by [DirectML](https://aka.ms/DirectML).
DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning on Windows. DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers.
The DirectML execution provider is capable of greatly improving evaluation time of models using commodity GPU hardware, without sacrificing broad hardware support or requiring vendor-specific extensions to be installed.
**Note** that the DML EP code was moved verbatim from the existing WindowsAI project, which is why it doesn't yet conform to the onnxruntime coding style. This is something that can be fixed later; we would like to keep formatting/whitespace changes to a minimum for the time being to make it easier to port fixes from WindowsAI to ORT during this transition.
Summary of changes:
* Initial commit of DML EP files under onnxruntime/core/providers/dml
* Add cmake entries for building the DML EP and for pulling down the DirectML redist using nuget
* Add a submodule dependency on the Windows Implementation Library (WIL)
* Add docs under docs/execution_providers/DirectML-ExecutionProvider.md
* Add support for DML EP to provider tests and perf tests
* Add support for DML EP to fns_candy_style_transfer sample
* Add entries to the C ABI for instantiating the DML EP
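As a sketch of how an application might opt into the new EP from Python with CPU fallback (the provider strings match ONNX Runtime's naming; the helper function is a hypothetical illustration, and a DML-enabled build is assumed):

```python
# Hypothetical helper: keep the preferred provider ordering, dropping any
# providers the local onnxruntime build does not offer.
def select_providers(available,
                     preferred=("DmlExecutionProvider", "CPUExecutionProvider")):
    return [p for p in preferred if p in available]

providers = select_providers(["DmlExecutionProvider", "CPUExecutionProvider"])
# With a DML-enabled build, the session would then be created as:
#   sess = onnxruntime.InferenceSession("model.onnx", providers=providers)
```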
* Updates
* Remove preview texts
* Update README.md
* Updates
* Update README.md
* Update README.md
* Minor wording update
* Update README.md
* Update doc on CUDA version
* revert update
* Update readme for issue #1558
* Clean up example section
* Cosmetic updates
- Add a index of build instructions for browsability
- Update build CUDA version from 9.1 to 10
* Fix broken link
* Update README to reflect upgrade to pip requirement
* Update CuDNN version for Linux Python packages
* Clean up content
Updated ordering and add table of contents
* Minor format fixes
* Move Android NNAPI under EP section
* Add link to operator support documentation
* Fix typo
* typo fix
* remove todo section
* Mention OrtCreateSessionFromArray in C API doc
* Don't create the default allocator every single time. Rename API accordingly.
* updates...
* PR comments
* fix typo in license header
* fix build