ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
Viswanath Boga ad9d2e2e89
Prefix match in first iteration of beam search OP (#10231)
* Add BeamSearch op schema

* Add ONNX conversion for beams search

* remove attention_mask and change input order

* add option to run baseline

* add check data type NULL

* applies VerifyNodeAndOpMatch to subgraph

* update input_ids shape

* Add node name for Cast node

* expose API for topk

* parse parameters

* Add beam search scorer

* output results

* fix typo

* use c++ template and format python

* fix build pipeline errors

* symbolic shape infer of input onnx

* output scores

* add kernel def hash

* Handle vocab_mask; move CheckSubgraph

* undo insert_cast_transformer.cc and fusion_utils.py

* fix typo

* fix merge

* update doc

* add repetition penalty

* refactoring: add GptSubgraph class

* move BeamSearchState from .h to .cc file

* adjust logits processor order

* add batch generation example

* fix repetition penalty for dup words in sequence

* Add test

* Add no repeat ngram processor

* refactoring: move logits processor to classes

* fix build warning

* show latency

* use allocator in beam state

* use allocator in sequences

* fix build error

* move next_positions to beam state

* Changes for prefix matching

* removing debugs

* removing more debugs

* clean up

* clean up

* cpu doc updated

* Updated docs

* updated prefix_vocab_mask dimension in convert script

* changes to support bxs prefix_vocab_mask in beamsearchop kernel

* doc update

* OperatorKernels.md updated

* matching docs from artifacts

* minor change in logits processor

* Addressing comments

* Updated the prefix vocab mask usage properly

Co-authored-by: Tianlei Wu <tlwu@microsoft.com>
2022-02-03 00:14:39 +05:30
.gdn Update compliance tasks in python packaging pipeline and fix some compile warnings (#8471) 2021-07-30 17:16:37 -07:00
.github Update C/C++ API docs automation to create a PR (instead of push to publish branch) (#10093) 2022-01-07 16:16:47 -08:00
cgmanifests Remove coremltools submodule *security vulnerability* and copy the coreml model schema (#10424) 2022-01-28 12:48:48 -08:00
cmake Reduce test time for TensorRT EP CI (#10408) 2022-02-01 15:56:33 -08:00
csharp Allow users to bind arbitrary memory using raw pointers (#10428) 2022-02-01 18:09:24 -08:00
dockerfiles Update rocm_ep and migraphx_ep to rocm4.5.2 and fix dockerfiles to build docker images correctly (#10445) 2022-02-01 16:11:39 -08:00
docs Prefix match in first iteration of beam search OP (#10231) 2022-02-03 00:14:39 +05:30
include/onnxruntime/core Enable transpose optimizer in minimal extended build (#10349) 2022-01-31 09:41:04 -08:00
java Amdmigraphx fix build error (#9272) 2022-01-10 15:18:43 -08:00
js Bump log4js from 6.3.0 to 6.4.0 in /js/web 2022-01-26 20:51:49 -08:00
objectivec [Objective-C API] WIgnore clang documentation warnings from C/C++ header usage. (#9057) 2021-09-14 13:03:48 -07:00
onnxruntime Prefix match in first iteration of beam search OP (#10231) 2022-02-03 00:14:39 +05:30
orttraining Enable more static analysis warnings and enable the analyzer for training cpu (#10176) 2022-01-27 11:17:20 -08:00
package/rpm Bump master version to 1.11 (#9957) 2021-12-14 23:32:06 -08:00
samples Add Python checks pipeline (#7032) 2021-08-09 10:37:05 -07:00
server Standalone TVM Executor Provider (#10019) 2021-12-15 16:59:20 -08:00
tools Improve Perf System (#10404) 2022-02-01 16:01:34 -08:00
winml Incorrect output after GPU to GPU inference via VideoFrame and Gray8 models (#10425) 2022-01-28 08:45:57 -08:00
.clang-format
.clang-tidy Add remaining build options and make minor changes in documentation (#39) 2018-11-27 19:59:40 -08:00
.dockerignore Update dockerfiles (#5929) 2020-11-25 15:38:22 -08:00
.flake8 Add Python checks pipeline (#7032) 2021-08-09 10:37:05 -07:00
.gitattributes
.gitignore Remove unused pipeline orttraining-linux-gpu-perf-test-ci-pipeline.yml and unused send_perf_metrics tool. (#10326) 2022-01-21 14:31:34 -08:00
.gitmodules Remove coremltools submodule *security vulnerability* and copy the coreml model schema (#10424) 2022-01-28 12:48:48 -08:00
build.amd64.1411.bat
build.bat
build.sh Add iOS test pipeline and a sample app. (#5298) 2020-09-29 13:53:11 -07:00
CITATION.cff Add citation file (#10061) 2021-12-16 19:56:21 -08:00
CODEOWNERS Update ORTTraiing frontend codeowner (#9427) 2021-10-18 23:56:21 -07:00
CONTRIBUTING.md fixed the link (#8757) 2021-08-18 11:45:42 -07:00
LICENSE Remove year from license (#6658) 2021-02-12 00:25:56 -08:00
NuGet.config Delete nuget extra configs (#6477) 2021-01-27 20:25:45 -08:00
ort.wprp Add Tracelogging for profiling (#1639) 2019-11-11 21:34:10 -08:00
packages.config Bump winrt version (#10243) 2022-01-12 10:52:27 -08:00
README.md Fix typo 2021-08-12 15:57:15 -07:00
requirements-dev.txt Add post-install command to build PyTorch CPP extensions from within onnxruntime package (#8027) 2021-06-28 18:11:58 -07:00
requirements-doc.txt Add auto doc gen for ORTModule API during CI build (#7046) 2021-03-22 10:20:33 -07:00
requirements-training.txt Add post-install command to build PyTorch CPP extensions from within onnxruntime package (#8027) 2021-06-28 18:11:58 -07:00
requirements.txt.in Chang how numpy version is handled. (#8130) 2021-06-23 14:08:37 -07:00
setup.py STVM, NUPHAR, remove tvm from submodules list, checks pointers are not null. (#10211) 2022-01-27 20:31:13 +01:00
ThirdPartyNotices.txt add copyright (#9943) (#9970) 2021-12-08 14:34:53 -08:00
VERSION_NUMBER Bump master version to 1.11 (#9957) 2021-12-14 23:32:06 -08:00

ONNX Runtime is a cross-platform inference and training machine-learning accelerator.

ONNX Runtime inference can enable faster customer experiences and lower costs. It supports models from deep learning frameworks such as PyTorch and TensorFlow/Keras, as well as classical machine learning libraries such as scikit-learn, LightGBM, and XGBoost. ONNX Runtime is compatible with different hardware, drivers, and operating systems, and provides optimal performance by leveraging hardware accelerators where applicable alongside graph optimizations and transforms. Learn more →
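As a rough illustration of "leveraging hardware accelerators where applicable", the sketch below picks execution providers (EPs) in a preference order and falls back to CPU. The preference order and the helper names `pick_providers` and `open_session` are assumptions made for this example, not part of ONNX Runtime; `get_available_providers` and the `providers` argument of `InferenceSession` come from the `onnxruntime` Python package.

```python
from typing import Iterable, List, Optional

# Assumed preference order, for illustration only; adjust for your deployment.
PREFERRED_PROVIDERS = [
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]


def pick_providers(available: Iterable[str],
                   preferred: Optional[List[str]] = None) -> List[str]:
    """Return the preferred providers that are actually available, in order."""
    order = PREFERRED_PROVIDERS if preferred is None else preferred
    available_set = set(available)
    chosen = [name for name in order if name in available_set]
    if not chosen:
        raise ValueError("none of the preferred execution providers are available")
    return chosen


def open_session(model_path: str):
    """Hypothetical helper: create a session using the best available providers."""
    # Imported lazily so pick_providers can be used without onnxruntime installed.
    import onnxruntime as ort

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

With a model file on disk, `open_session("model.onnx")` would return a session whose `run` method takes a list of output names (or `None` for all outputs) and a dictionary mapping input names to arrays.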

ONNX Runtime training can reduce model training time on multi-node NVIDIA GPUs for transformer models with a one-line addition to existing PyTorch training scripts. Learn more →
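The one-line addition can be sketched as wrapping an existing model in `ORTModule`. This assumes the separately installed `torch-ort` package and an existing `torch.nn.Module`; the helper name `wrap_with_ortmodule` is hypothetical and exists only to keep the example self-contained.

```python
def wrap_with_ortmodule(model):
    """Wrap an existing torch.nn.Module so training runs through ONNX Runtime."""
    # Imported lazily: torch-ort is a separate install and needs PyTorch.
    from torch_ort import ORTModule

    # This wrap is the single line added to a training script.
    return ORTModule(model)
```

In an existing script this amounts to `model = wrap_with_ortmodule(model)` before the training loop, with the rest of the loop left unchanged.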

Get Started

General Information: onnxruntime.ai

Usage documentation and tutorials: onnxruntime.ai/docs

Companion sample repositories:

Build Pipeline Status

[Table of build status badges by system (Windows, Linux, Mac, Android, iOS, WebAssembly) with columns CPU, GPU, and EPs]

Data/Telemetry

Windows distributions of this project may collect usage data and send it to Microsoft to help improve our products and services. See the privacy statement for more details.

Contributions and Feedback

We welcome contributions! Please see the contribution guidelines.

For feature requests or bug reports, please file a GitHub Issue.

For general discussion or questions, please use GitHub Discussions.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

License

This project is licensed under the MIT License.