ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
TomWildenhain-Microsoft e8268c9a18
Add Transpose Optimizer and modify nhwc optimizer to use it. (#9284)
* Add Transpose Optimizer and modify nhwc optimizer to use it.

* Fix casts

* Fix casts2

* Fix move

* Add tests

* Add headers

* Fixes and tests

* Remove explicit template instantiation

* Fix build warning

* Name unit tests

* Code review fixes

* Add some comments

* Fix some casts

* Make optimization slightly less agressive

* Some unit test fixes

* Update Attention pattern to work with transpose optimizer

* Update attention fuser

* Fix attention fusion python script

* Improve transpose optimizer documentation

* Create OptimizerCtx struct

* Disable Slice handler for testing

* Implement Slice int32

* Only push transposes leading up to other transposes

* Improve optimization heuristic

* Add exemption for MaxPool

* Document transpose optimizer api.h

* Revert fusion tests to master

* Remove temp files

* Replace typedef with using

* Trim trailing whitespace

* Move class declarations from api_impl.h to api_impl.cc

* Remove copy constructors and move allocator

* Alphabetize headers

* Add override keyword

* Comments for nhwc_transformer

* Rename OrtGraph to ApiGraph, etc.

* Wrap line

* Remove extra qualifier on ApiGraph

* Refector attention fusion

* Remove c-style casts from api_impl.cc

* Improve documentation

* Avoid printing vector in ORT_ENSURES

* Revert attention fusion refactor

* Remove duplicate cost heuristics and improve documentation

* Fix size_t casts

* Fixes from Scott's review

* Unrevert attention refactor and more updates from Scott's review

* Revert api_impl.cc ValueInfo change

* only optimize first transpose input

* Unrevert api_impl.cc changes

* Make vector call reserve

* transpose_optimizer.cc update from Scott's comments

* Rename api::Graph to api::GraphRef etc.

* Consider domains 'onnx.ai' and '' equal

* Replace AddInput with SetInput

* Improve tests

* quantization and heuristic tests

* Comments for tests

* Replace const string_view with string_view and update tests

* Fixes requested by Edward

* Fix std::string to string_view conversion

* Add <string> to includes

* Fix bug for broadcasting ops with unknown rank. Slight safety improvements

* Changes requested by Edward

* Fix formatting

* Improve description of cost metric
2021-10-27 22:10:39 -07:00
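
The commit above pushes Transpose nodes through the graph until they meet other transposes. The permutation arithmetic behind that idea can be sketched as follows. This is an illustration only, not the code in transpose_optimizer.cc. The helper names (`transpose_shape`, `compose`, `invert`) are hypothetical. The sketch assumes ONNX Transpose semantics, where output axis i takes input axis perm[i].

```python
def transpose_shape(shape, perm):
    """Shape produced by Transpose(perm) on a tensor of the given shape."""
    return [shape[p] for p in perm]


def compose(first, second):
    """Single permutation equivalent to Transpose(first) followed by Transpose(second)."""
    return [first[i] for i in second]


def invert(perm):
    """Permutation that undoes Transpose(perm)."""
    inverse = [0] * len(perm)
    for out_axis, in_axis in enumerate(perm):
        inverse[in_axis] = out_axis
    return inverse


def is_identity(perm):
    """True when a Transpose with this permutation is a no-op and can be dropped."""
    return list(perm) == list(range(len(perm)))


NCHW_TO_NHWC = [0, 2, 3, 1]
NHWC_TO_NCHW = invert(NCHW_TO_NHWC)

# Two back-to-back transposes with inverse permutations cancel out.
assert is_identity(compose(NCHW_TO_NHWC, NHWC_TO_NCHW))
```

When a pushed transpose lands next to its inverse, the composed permutation is the identity and both nodes can be removed. That is why pushing only pays off when another transpose lies further along the path.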
.gdn Update compliance tasks in python packaging pipeline and fix some compile warnings (#8471) 2021-07-30 17:16:37 -07:00
.github Update issue template to ask users to check known issues to avoid repetition. (#8288) 2021-07-02 15:36:14 -07:00
cgmanifests Clean up optional-lite references (#9534) 2021-10-25 21:05:45 -07:00
cmake Add Transpose Optimizer and modify nhwc optimizer to use it. (#9284) 2021-10-27 22:10:39 -07:00
csharp Remove pointless assert. (#9571) 2021-10-28 07:33:40 +10:00
dockerfiles Update dockerfile readme (#9241) 2021-10-01 17:28:26 -07:00
docs Bifurcation detector for aggressive decoding (#9432) 2021-10-19 19:53:56 -07:00
include/onnxruntime/core Add Transpose Optimizer and modify nhwc optimizer to use it. (#9284) 2021-10-27 22:10:39 -07:00
java Upgrade com.diffplug.spotless to 5.17.0 (#9546) 2021-10-26 14:29:46 -07:00
js [js/web] support opset-13 of softmax (#9493) 2021-10-26 23:58:50 -07:00
objectivec [Objective-C API] WIgnore clang documentation warnings from C/C++ header usage. (#9057) 2021-09-14 13:03:48 -07:00
onnxruntime Add Transpose Optimizer and modify nhwc optimizer to use it. (#9284) 2021-10-27 22:10:39 -07:00
orttraining Fix opset version change by not using copy of global constant (#9393) 2021-10-27 12:42:06 -04:00
package/rpm Bumping up to 1.10 (#9006) 2021-09-22 16:34:28 -07:00
samples Add Python checks pipeline (#7032) 2021-08-09 10:37:05 -07:00
server fix boost download url (#7843) 2021-05-26 16:08:57 -07:00
tools Add Linux/MacOS ARM64 support to nuget packaging pipeline (#9570) 2021-10-27 19:00:43 -07:00
winml Make onnxruntime::Status nodiscard (#9279) 2021-10-08 17:10:31 -07:00
.clang-format
.clang-tidy
.dockerignore Update dockerfiles (#5929) 2020-11-25 15:38:22 -08:00
.flake8 Add Python checks pipeline (#7032) 2021-08-09 10:37:05 -07:00
.gitattributes
.gitignore Add Xamarin support (#9436) 2021-10-27 20:07:07 +10:00
.gitmodules Remove optional-lite (#9424) 2021-10-22 16:45:45 -07:00
build.amd64.1411.bat
build.bat
build.sh Add iOS test pipeline and a sample app. (#5298) 2020-09-29 13:53:11 -07:00
CODEOWNERS Update ORTTraiing frontend codeowner (#9427) 2021-10-18 23:56:21 -07:00
CONTRIBUTING.md fixed the link (#8757) 2021-08-18 11:45:42 -07:00
LICENSE Remove year from license (#6658) 2021-02-12 00:25:56 -08:00
NuGet.config Delete nuget extra configs (#6477) 2021-01-27 20:25:45 -08:00
ort.wprp
packages.config Update DirectML version to 1.5.1 and enable ARM/ARM64 builds with DML (#7511) 2021-04-30 00:49:30 -07:00
README.md Fix typo 2021-08-12 15:57:15 -07:00
requirements-dev.txt Add post-install command to build PyTorch CPP extensions from within onnxruntime package (#8027) 2021-06-28 18:11:58 -07:00
requirements-doc.txt Add auto doc gen for ORTModule API during CI build (#7046) 2021-03-22 10:20:33 -07:00
requirements-training.txt Add post-install command to build PyTorch CPP extensions from within onnxruntime package (#8027) 2021-06-28 18:11:58 -07:00
requirements.txt.in Chang how numpy version is handled. (#8130) 2021-06-23 14:08:37 -07:00
setup.py Optimize python overhead of APEX amp (#9447) 2021-10-26 13:13:49 +08:00
ThirdPartyNotices.txt Clean up optional-lite references (#9534) 2021-10-25 21:05:45 -07:00
VERSION_NUMBER Bumping up to 1.10 (#9006) 2021-09-22 16:34:28 -07:00

ONNX Runtime is a cross-platform inference and training machine-learning accelerator.

ONNX Runtime inference can enable faster customer experiences and lower costs. It supports models from deep learning frameworks such as PyTorch and TensorFlow/Keras, as well as classical machine learning libraries such as scikit-learn, LightGBM, and XGBoost. ONNX Runtime is compatible with different hardware, drivers, and operating systems, and provides optimal performance by leveraging hardware accelerators where applicable alongside graph optimizations and transforms. Learn more →
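
As a minimal sketch of what inference looks like from Python, the snippet below loads a model and runs it with whichever execution providers are available. It assumes the `onnxruntime` package is installed and that you already have an exported ONNX model. The helper names `choose_providers` and `run_model` and the default preference order are illustrative assumptions, not part of the library.

```python
def choose_providers(available,
                     preferred=("CUDAExecutionProvider", "CPUExecutionProvider")):
    """Return the preferred execution providers that are available, in preference order."""
    return [p for p in preferred if p in available]


def run_model(model_path, inputs):
    """Load an ONNX model and run it once.

    `inputs` maps model input names to NumPy arrays.
    """
    # Imported lazily so this sketch can be read without the package installed.
    import onnxruntime as ort

    providers = choose_providers(ort.get_available_providers())
    session = ort.InferenceSession(model_path, providers=providers)
    # Passing None as the first argument requests every model output.
    return session.run(None, inputs)
```

A call might look like `run_model("model.onnx", {"input": batch})`, where the path and the input name are placeholders for your own model. The model's input names can be read from `session.get_inputs()`.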

ONNX Runtime training can reduce model training time on multi-node NVIDIA GPUs for transformer models with a one-line addition to existing PyTorch training scripts. Learn more →
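
The one-line addition can be sketched as wrapping the existing model in `ORTModule`. This assumes the separately distributed `torch-ort` package and PyTorch are installed. The function name `wrap_for_ort_training` is hypothetical and only there to isolate the change.

```python
def wrap_for_ort_training(model):
    """Wrap a torch.nn.Module so training runs through ONNX Runtime."""
    # Imported lazily; assumes the torch-ort package is installed.
    from torch_ort import ORTModule

    return ORTModule(model)
```

In an existing script this amounts to `model = wrap_for_ort_training(model)` before the training loop. The optimizer, loss, and data loading code stay as they were.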

Get Started

General Information: onnxruntime.ai

Usage documentation and tutorials: onnxruntime.ai/docs

Companion sample repositories:

Build Pipeline Status

[Build status badges for Windows, Linux, Mac, Android, iOS, and WebAssembly, in columns System, CPU, GPU, and EPs]

Data/Telemetry

Windows distributions of this project may collect usage data and send it to Microsoft to help improve our products and services. See the privacy statement for more details.

Contributions and Feedback

We welcome contributions! Please see the contribution guidelines.

For feature requests or bug reports, please file a GitHub Issue.

For general discussion or questions, please use GitHub Discussions.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

License

This project is licensed under the MIT License.