Commit graph

475 commits

Author SHA1 Message Date
Yi Zhang
ea128cdb18
skip windows GPU check if changes only in doc (#13248)
### Description
Use Path filter and fake workflow to skip windows GPU check if there's
only changes in doc.
Refs:

https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/defining-the-mergeability-of-pull-requests/troubleshooting-required-status-checks#handling-skipped-but-required-checks

The fake github yaml is generated by code.

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

###verifications:###
In this PR:
since the win-gpu-ci-pipeline.yml and .github are updated, so the real
Windows GPU workflows are always triggered.

in #13256
To avoid update win-gpu-ci-pipleline.yml, I added the path filter in
devops page. the fake win GPU workflows triggered, and the real
workflows are skipped.
2022-10-11 13:51:44 +08:00
garanews
38906625a3
fix some typo in docs (#13212)
### Description
<!-- Describe your changes. -->
fix some typo in docs


### Motivation and Context
singed vs signed
succeding vs succeeding 
fileter vs filter
kernal vs kernel
libary vs library
2022-10-07 15:58:18 -07:00
ashari4
b09dd11ece
BFP schemas: Change block dimension type to Int (#13169)
* Change block dimension type to Int from Ints.
* In response to feedback that the block dimension corresponds to the
reduction dimension of the consuming matrix multiplication. There is
always only 1 reduction dimension.
2022-10-06 11:11:43 -07:00
Tony Xia
c7522e547a
Fixed a minor typo (#13194)
### Description
binraries ==> binaries



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2022-10-05 12:10:14 -07:00
Tony Xia
962fee5fe5
Fix typo enviroment => environment (#13195) 2022-10-03 17:02:26 -07:00
Changming Sun
dd2aec170d
Update Coding_Conventions_and_Standards.md (#7705) 2022-09-29 23:32:37 -07:00
ashari4
c4a7e88fc8
QuantizeBFP and DequantizeBFP (#12833)
* `QuantizeBFP` and `DequantizeBFP` schemas - similar to
`QuantizeLinear` and `DeQuantizeLinear`.
* BFP datatype is represented as a `uint8` tensor with shape and stride
metadata. This is preferrable to adding a new datatype for BFP, which is
more disruptive and [discouraged by
PyTorch](https://discuss.pytorch.org/t/training-with-custom-quantized-datatype/152132/2).

Context: 

The Microsoft Floating Point (BFP) datatype shares an exponent for every
n numbers called a “bounding box.” Each number still has its own
mantissa and sign bits. BFP has been shown to incur 3-4 less cost
(energy and area) than BFloat16 and INT8 counterparts without reductions
in accuracy for the ImageNet benchmark as described in [Rouhani
2020](https://proceedings.neurips.cc/paper/2020/file/747e32ab0fea7fbd2ad9ec03daa3f840-Paper.pdf).

Requirements:

* There are many variants of BFP (number of mantissa bits, number of
shared exponent bits, size of bounding box, custom bit fields, etc.)
* The size and layout of an BFP variant varies across hardware
* bounding box can be over arbitrary dimensions; for example, for the
channel "C" dimension in a N x C x H x W tensor for convolution

Goals of this PR:

* Add initial versions of QuantizeBFP and DequantizeBFP operators to
enable QDQ-style quantization with BFP. Once the schemas stabilize, we
can consider upstreaming to ONNX.
* Add some basic type and shape inferencing tests; tests that run on an
EP will be a follow-up.
2022-09-22 14:02:55 -07:00
Edward Chen
454f77cd94
Update kernel matching logic: decouple from op schemas and remove kernel def hashes (#12791)
# Motivation
Currently, ORT minimal builds use kernel def hashes to map from nodes to
kernels to execute when loading the model. As the kernel def hashes must
be known ahead of time, this works for statically registered kernels.
This works well for the CPU EP.
For this approach to work, the kernel def hashes must also be known at
ORT format model conversion time, which means the EP with statically
registered kernels must also be enabled then. This is not an issue for
the always-available CPU EP. However, we do not want to require that any
EP which statically registers kernels is always available too.
Consequently, we explore another approach to match nodes to kernels that
does not rely on kernel def hashes. An added benefit of this is the
possibility of moving away from kernel def hashes completely, which
would eliminate the maintenance burden of keeping the hashes stable.

# Approach
In a full build, ORT uses some information from the ONNX op schema to
match a node to a kernel. We want to avoid including the ONNX op schema
in a minimal build to reduce binary size. Essentially, we take the
necessary information from the ONNX op schema and make it available in a
minimal build.
We decouple the ONNX op schema from the kernel matching logic. The
kernel matching logic instead relies on per-op information which can
either be obtained from the ONNX op schema or another source.
This per-op information must be available in a minimal build when there
are no ONNX op schemas. We put it in the ORT format model.
Existing uses of kernel def hashes to look up kernels are replaced
with the updated kernel matching logic. We no longer store
kernel def hashes in the ORT format model’s session state and runtime
optimization representations. We no longer keep the logic to
generate and ensure stability of kernel def hashes.
2022-09-20 14:24:59 -07:00
Alexey Gladyshev
2b5b11d373
[C#][TVM EP] Fix issues related to using TVM EP in C# front-end (#12958)
Changes in this PR:
* Update building of Nuget package for TVM EP
* Update of documentation  for using TVM EP in C#
2022-09-16 16:04:59 +02:00
RandySheriffH
64466c2d62
Remove nuphar provider folder (#12939) 2022-09-13 09:10:52 -07:00
Dwayne Robinson
8e4eb24648
Update operator kernel table to include DML operators (#12887)
* Fix bug in pybind get_all_operator_schema due to premature reference dropping
* Add updated operator kernels markdown table
* Update build.py to include documentation generation for DML operators too
* Update GPU pipeline to include DML in the build to so operators can be generated.
* Use a separate pipeline stage, feedback from Changming and Scott
* Appease annoying Python linter
* Add onnxruntime_BUILD_UNIT_TESTS=OFF and remove stale --use_dml in cuda stage
2022-09-09 10:21:25 -07:00
Hariharan Seshadri
ad69aac491
Introduce ordered quantization ops for the CUDA EP [1/n] (#12582)
Initial core small set for the ordered quantization ops for cuda EP.
2022-09-07 11:58:15 -07:00
Yulong Wang
1a402a3f25
replace 'master' branch ref to 'main' for onnx repo (#12678) 2022-08-30 13:41:42 -07:00
Yulong Wang
c144acc534
Replace 'master' branch ref to 'main' in the code (#12547) 2022-08-22 10:48:12 -07:00
Wei-Sheng Chin
dc486d146b
Make ORT callable from various Pytorch compilers (LazyTensor, TorchDynamo, etc) (#10460)
* Make ORT as Pytorch JIT backend

LORT likely doesn't work with aten fallback so we only test LORT in its own CI.

* Revert changes to enable external CUDA allocator. Will add it later.

Revert "Revert changes to enable external CUDA allocator. Will add it later."

This reverts commit d5487f2e193014c805505afae8fb577c53667658.

Fix external allocator

* Relax tolerance and remove commented code

* Print more information in CI

* Fix pointer

* Address comments.
1. Reuse ORT-eager mode's environment.
2. Remove unused ctor.

* Use Pytorch master branch as all PRs are merged

Fix

* Refine based on cpplint feedbacks

* Revert changes to allow custom CUDA allocator in public APIs

* Use torch.testing.assert_close

* Use unittest framework

* Switch docker repo

* Rename *.cpp to *.cc

* Address comments

* Add comment

* Use same pipeline file for eager and lort pipelines

* Address comments

* Add yaml comment

* Fix cmake files

* Address comments

* Rename flags, remove printing code, remove dead comment
2022-08-22 09:40:40 -07:00
Cheng
64e991a9fc
[Qlinearsoftmax] contrib cpu (#12177)
* [Qlinearsoftmax] contrib cpu

* int8 implementation

* contrib operator md

* qdq transformer test

* new attribute: opset

* doc

* quantized tool

* remove template to reduce Binary size

* doc of contribe operators

* enforce x_shape is valid

* fix reduce_size if input-shape is dynamic

* add UT

* register one op for reducing binarysize

* kernel hash update

* docs/ContribOperators.md
2022-08-10 10:52:02 +08:00
Vincent Wang
cfa09d16d9
[CUDA] Mod Op Kernel (#12499)
* mod for cuda and rocm

* fix bfloat16 ut

* change bf16 ut number

* fix opset version

* fix op kernel doc
2022-08-09 13:05:40 +08:00
Vincent Wang
37995a7245
[CUDA] BiasSoftmax Supporting New Pattern (#12361) 2022-08-05 06:59:24 +08:00
Scott McKay
a3de1bbf7d
Update script to find optimizers that potentially need supported opset updates (#12330)
* Update to handle multiline declarations for the kernels which are typical these days.
* Update to new path for the cpu contrib_op kernel registrations.
* Update tools/python/find_optimizer_opset_version_updates_required.py

Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
2022-08-04 07:37:27 +10:00
Dmitri Smirnov
dc984a03d5
Container and memory allocation guidelines (#12387)
Container and memory allocation guidelines
  Re-org and add code samples
  Clarify the wording on returning gsl::span
2022-08-03 10:31:59 -07:00
Changming Sun
44ec2cf088
Update publish-python-apidocs.yml (#12433) 2022-08-03 10:17:00 -07:00
Ye Wang
b622e5fa9b
Support vocab_mask/prefix_vocab_mask/no_repeat_number in greedysearch op (#12327)
* support more inputs for greedy search

* fix docs

* refactor test

* lint

* review comments
2022-08-03 10:10:08 -07:00
Valery Chernov
e2423bb55c
[TVM EP] Build on Windows with ipp-crypto support (#12336)
* update TVM EP docs for ipp-crypto build conditions

* add ipp-crypto by ExternalProject_Add

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
2022-07-28 15:40:19 +02:00
Ye Wang
89ac61f4d4
support gpt2 model with greedy search (#12068)
* greedy search gpt2 cpu checkin

* add cuda support

* add test

* provider

* update

* fix some bugs

* refactor impl class

* refactor test

* remove unused func

* refactor parameters class

* simplify padding

* fix lint warnings

* python format

* Revert "python format"

This reverts commit f25fe1017fa33d960b2418ebbb5dba6a4bd043cf.

* python format

* fix pipelines

* fix pipeline

* move bufferallocater to generate_impl_base

* review comments(alignment, filename/namespace change)

* rebase2

* python reformat

* reformat

* fix rocm build

* review comment

* review comments

* review comments

* fix a bug

* rebase test files

* python format

* format import order

* review comments

* fix build
2022-07-22 15:45:16 -07:00
RandySheriffH
0264a9c29b
Bump ort version number (#11948)
* bump ort version number

* update link and note url

* update version to silence assert

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2022-07-22 12:55:53 -07:00
Valery Chernov
3b0aaa9e0e
[TVM EP] support build on Windows (#11851)
* add description of build ORT+TVM EP on Windows

* fix cmake error related to symlink creation on Windows

* add llvm config path to build flags for correct build on Windows

* update TVM_EP.md for llvm_config build arg

* fix warnings skipping during build on Windows

* fix using string or wstring for model path to correct build on Windows (MSVC error)

* fix error in custom logger for correct build on Windows

* implement glob algorithm for Windows

* additional build fixes

* update TVM with export of VM symbols for dll

* description of nasm issue and workaround

* update TVM with export of Executable from VM symbols for dll

* description of installation of ipp-crypto dependencies on Windows

* cmake key for ipp-crypto build

* fix wstring for TVMso EP

* fix ipp-crypto build

* cmake key onnxruntime_TVM_USE_HASH switch off not specific methods, but full hash functionality

* fix absolute path to compiled lib

* update TVM_EP.md, fix lint warnings

* update TVM_EP.md

* small fixes after review

* switch on handshake functionality for Linux workflow

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>
2022-07-13 10:48:42 +02:00
PeixuanZuo
5579d81fc8
[add] Add operator gemmfastgelu for ROCM (#12101)
* [ADD] add gemm fast gelu

* [UPDATE] refunction matmul_impl

* [Update] delete tuning_ in this pr

* [FIX] code format

* [FIX] compiler warning

* [Update] update doc
2022-07-13 15:40:16 +08:00
Preetha Veeramalai
99a370dd02
Update readme for OVEP (#12122)
* Add changes for training module in Readme

* Update ReadMeOV.rst
2022-07-11 10:54:12 -07:00
Valery Chernov
8ba8146650
[TVM] handshake mechanism for support of TVMso EP (#11437)
* infrastructure for handshake mechanism was implemented. sha256 was selected as first hash algorithm

* check hash during compile in TVMso EP

* add IPP-CRYPTO to external dependencies for TVM EP

* made checkHash method constant

* removed the public implementation of the SHA-256 algorithm so as not to cause a license conflict

* implemented SHA-256 calculation using ipp-crypto library

* fix dependency for ipp-crypto

* add provider options for hash check

* update documentation for added provider options

* add hash check condition

* fix docs

* fix lint

* fix ORT_THROW

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>
2022-06-29 14:57:18 +02:00
Gary Miguel
dc5d6b9515
register signal ops for opset 17 (#11778)
* Register signal ops for op set 17

Note code is mostly being moved, not added. These ops were previously
only registered as Microsoft contrib ops and only built if
`BUILD_MS_EXPERIMENTAL_OPS=1`. They've been added to the ai.onnx
standard op set in version 17.

Main components of this change:

* Move the kernels from the conrib_ops directory to the
  core directory.
* Add function bodies for ms experimental ops. This will allow
  old models that use the contrib ops to continue to function.
  All the function bodies consist of a single op (the
  new standard op), so performance overhead should be minimal.

Minor clean-up also in this change:

* De-duplicate get_scalar_value_from_tensor: put it in a new utils.h.
* Fix some bugs that caused compilation errors with the experimental
  ops. Tested with `build.sh --ms_experimental`
* Fix some spelling errors and lint violations.
* Replace a couple of switch statements with `MLTypeCallDispatcher`.
* Use `InlineVector` instead of `std::vector`.

Unblocks https://github.com/microsoft/onnxruntime/issues/11640
2022-06-27 10:26:55 +10:00
Gary Miguel
4bf22e2a40
Update ONNX to 1.12 (#11924)
Follow-ups that need to happen after this and before the next ORT release:
* Support SequenceMap with https://github.com/microsoft/onnxruntime/pull/11731
* Support signal ops with https://github.com/microsoft/onnxruntime/pull/11778

Follow-ups that need to happen after this but don't necessarily need to happen before the release:
* Implement LayerNormalization kernel for opset version 17: https://github.com/microsoft/onnxruntime/issues/11916

Fixes #11640
2022-06-21 17:19:52 -07:00
Ye Wang
859ef277a0
apply zcode changes to the beam search op (#11880)
* apply zcode  changes to the beam search op

* fix pipeline failure

* add doc

* workaround for C#

* update

* update

* use name zcode

* review comment

* review comments

* fix cpplint

* review coments
2022-06-20 18:39:07 -07:00
Tianlei Wu
6ee2c1b5fc
Remove temperature input from BeamSearch operator (#11896)
* remove temperature input
* update index of remaining inputs
2022-06-20 09:50:45 -07:00
sfatimar
f97bd38c4f
UEP 4.1 release (#11834)
* Add pypi build changes to latest Master

* Add ORT training part of OV build

* Disabling SqueezeOpTest.BadAxes

* Add ONNXruntime branch ARG to Docker build

* Changes to include file details versions

* Commit File Version Updates

* Change naming for linux build

* Add fix for pylint format errors

* Fix pylint warnings.

* Fix pylint errors - stage 2

Signed-off-by: Preetha Veeramalai <preetha.veeramalai@intel.com>

* Fix pylint errors - stage 3

* Fix pylint format - stage4

Signed-off-by: Preetha Veeramalai <preetha.veeramalai@intel.com>

* Commit for Wheel Release >0.35.1

Co-authored-by: Preetha Veeramalai <preetha.veeramalai@intel.com>
Co-authored-by: mayavijx <mayax.vijayan@intel.com>
Co-authored-by: Sahar Fatima <sfatima.3001@gmail.com>
Co-authored-by: nmaajidk <n.maajid.khan@intel.com>
2022-06-17 14:49:04 -07:00
Gary Miguel
e8b0d24071
Support per-test tolerances for ONNX tests (#11775)
Prior to this every test shared the same tolerances. This meant
that if an ONNX test failed due to a small but acceptable difference in
output, the only alternative was to disable the test entirely.

In op set 17, the DFT operator is being added. Without this change, the
tests for that operator fail because the output is off by about 5e-5.
It's better to keep test coverage for this new op rather than disable
the test entirely.

Also prior to this change, the global tolerances were not shared between
C++, JavaScript, and Python tests. Now they are.

Also fix various minor issues raised by linters.

Unblocks https://github.com/microsoft/onnxruntime/issues/11640.
2022-06-14 15:12:23 -07:00
Chun-Wei Chen
63c483a998
1.12.0 is the right TBD instead of released 1.11.0 (#11817) 2022-06-13 14:27:59 -07:00
Tianlei Wu
def78a1b81
Support T5 in BeamSearch operator (#11450)
(1) Support T5 in BeamSearch operator, and add both CPU and CUDA implementation.
(2) Change BeamSearch op: rename encoder_decoder_init attribute to encoder, and add decoder_start_token_id attribute
(3) Update convert_to_onnx for T5 to use int32 instead of int64 inputs as default.
(4) Add more tests in best_beam_search.py
(5) fix ORT_ENFORCE of hypothesis_buffer_offset_
(6) Improve ONNX conversion:
   (a) Change encoder some dynamic axes to fixed dim value
   (b) add --separate_encoder_and_decoder_init
   (c) correct name t5-3B => t5-3b, t5-11B => t5-11b
   (d) Add --use_int32_inputs in convert t5 to onnx
   (e) Allow t5 beam search conversion in one step
2022-06-10 15:06:57 -07:00
Alexey Gladyshev
331c387f4a
[TVM EP][DOC] Documentation update for TVM EP due to the addition of precompiled model support. (#11743)
* update description of TVM EP options in docs

* update sample notebook

* update TVM EP documentation

* add link to description of options

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
2022-06-08 14:56:01 +02:00
Hector Li
95a16c1ffe
Snpe ep (#11665)
* Initiate Ort SNPE EP
* fix snpe ep windows build which is caused by the utility method (ToUTF8String) name change on master
* correct the source path for libonnxruntime.so while building for andorid package
* add AdditionalDependencies for amr64
* On MS-Windows, the patchfile must be a text file, i.e. CR-LF must be used as line endings. A file with LF may give the error: "Assertion failed, hunk, file patch.c, line 343," unless the option '--binary' is given.
* fix build failure if snpe is not enabled
* update doc for contrib op
* separate out snpe ep settings to onnxruntime_snpe_provider.cmake
* renaming according review comments
* update according review comments
2022-06-03 14:10:02 -07:00
Gary Miguel
74bc4c07f6
Fix C# and numbering (#11643)
* C# protocol buffer code can be updated on Linux. Link to the relevant instructions.
* Fix numbering.
2022-05-31 11:33:36 -07:00
Vincent Wang
02724c54ff
[CUDA] Implement BitmaskDropout, BitmaskBiasDropout and BitmaskDropoutGrad (#11534)
* Implement BitmaskDropout and associated unit tests.

* Implement BitmaskDropoutGrad and associated unit tests.

* Implement Dropout -> BitmaskDropout rewrite rule and associated unit tests.

* Implement (Dropout,DropoutGrad) -> (BitmaskDropout,BitmaskDropoutGrad) rewrite rule.

This commit does not yet include unit tests for this rewrite rule.

This commit also introduces improved documentation for all changes which will be grouped
into this PR.

* bitmask dropout

* fix win build

* bugfix for rocm

* bugfix

* fix code format

* fix ut

* fix build break

* fix ut in win

* resolve comments

* fix ut in trt

* resolve comments

* fix rocm build error

* fix typo

Co-authored-by: Aidan Beggs <aidanbeggs@microsoft.com>
2022-05-27 17:24:47 +08:00
Justin Chu
c541063245
Format coding conventions documentation (#11405)
Add proper formatting to code blocks to make the doc more readable.

- Wrap code blocks with `
- Fix typos
2022-05-09 10:19:15 -07:00
Justin Chu
fdce4fa6af
Format all python files under onnxruntime with black and isort (#11324)
Description: Format all python files under onnxruntime with black and isort.

After checking in, we can use .git-blame-ignore-revs to ignore the formatting PR in git blame.

#11315, #11316
2022-04-26 09:35:16 -07:00
Justin Chu
6fb29f5b9a
Add python docstring linting in vscode settings (#11316)
Add python docstring linting in vscode settings
Use black and isort for python code formatting in VScode. Import sorting enabled on save. Code formatting available in VSCode with manual trigger.
Adopted from pytorch https://github.com/pytorch/pytorch/blob/master/.vscode/settings_recommended.json
2022-04-23 06:23:04 -07:00
Chun-Wei Chen
b9279f637d
update How_To_Update_ONNX_Dev_Notes with right paths (#11074) 2022-04-01 08:05:31 -07:00
Xavier Dupré
c37d2728bf
Implement TreeEnsemble for opset(ai.onnx.ml)==3 (#10821)
* Implement TreeEnsemble for opset(ai.onnx.ml)==3
* use of InlineVector
* refactoring
* improve attributes retrieval
* avoid creating a temporary buffer
* modifies onnx.ml.cpu.json
* use unordered_map
* update docs/OperatorKernels.md
* address PR comments (TH -> ThresholdType, ORT_RETURN...)
* add a python unit test to load a TreeEnsembleRegressor following ai.onnx.ml==3 specifications
2022-03-30 12:53:12 +02:00
Vincent Wang
6a6840d5c6
Fuse LayerNormalization for Apex O2 (#10233) 2022-03-29 21:22:04 +08:00
Chi Lo
8ba52b0a05
Bump master version to 1.12 (#10797)
* bump master version to 1.11

* bump master version to 1.12
2022-03-28 12:30:11 -07:00
pengwa
89ef987ab1
Improve NonZero on CUDA/ROCM (#10307)
* improve NonZero

* fix megatron_fp16 optimzier, fix the doc

* multi_tensor_applier

* resolve comment

* fix building warning

* fix build error when enabling training and use tensorrt
2022-03-25 07:35:45 +08:00
Nat Kershaw (MSFT)
2d961604b1
Refactor Python API docs to better explain IO binding scenarios (#10651) 2022-03-15 09:40:59 -07:00
Hariharan Seshadri
a9d9c6b486
Register CPU, CUDA and ROCM opset-16 kernels for some operators (#10643) 2022-03-08 09:18:39 -08:00
liqun Fu
da885a72e8
update with onnx 1.11 release (#10441) 2022-03-07 21:10:55 -08:00
Tianlei Wu
0e335aba37
Update BeamSearch operator spec to support t5 (#10777)
* change BeamSearch op to support encoder decoder model

* check model_type and decoder attribute

* fix

* update comments

* warn shape inference issue with onnx v1.11 or T5

* skip parity test when tempature != 1.0

* fix build
2022-03-04 21:52:45 -08:00
Tianlei Wu
36c3271546
BeamSearch op cuda (#10556)
Add BeamSearch cuda implementation with support of fp16 GPT-2 subgraph
2022-02-25 13:08:55 -08:00
Dmitri Smirnov
2679711bee
Refactor transformers and other code to reduce memory allocation calls (#10523)
Work on minimizing memory management calls by
  reducing number of allocations and copies.
  Replace std::unordered_set to InlinedHashSet
  and add usage of InlinedVector.
  Employ std::move() to minimize copying and memory allocations.
  Remove copying of the const shared data into each of the
  PropagateCast transformer instances.
  Move inlined_containers.h header to include/common
  Adjust AsSpan imlementation for C++ < 17
2022-02-24 16:17:14 -08:00
Alexey Gladyshev
7dc7529ec8
[TVM EP] Integrate tests for TVM EP into public onnxruntime CI (#10505)
* add support for bool type

* add TVM EP support for tests

* include TVM EP in python test pool

* fix pylint

* moved technical imports to a separate file

* clean up post build actions & move _ld_preload.py extension to CMake level

* add files for include TVM EP into CI

* implement custom logger for TVM

* replace TVM logging with ONNX RT logging

* update link for TVM EP tutorial

* clean up TVM EP cmake

* add pybind auto enabling for TVM EP

* fix blank spaces

* code review fixes

* replace print with comment

* add list of EP without TVM EP

* enable onnx tests

* disable contrib ops and ml ops

* reuse Dockerfile.ubuntu

* Move install_tvm_test_dependencies.sh out of Docker context dir, update build definition.

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
2022-02-24 16:24:23 +01:00
Scott McKay
df841ee87d
Fix incorrect type constraint registration for operator kernels. (#10489)
* Fix incorrect type constraint registration for RoiAlign. This led to the input type not actually being checked when matching a kernel as the invalid constraint name is treated as a missing optional input.
  * fix missing dependency for the unit test exe. Whilst it doesn't link against the CUDA providers lib, without the dependency VS doesn't know it needs to rebuild the library if there are changes.
* Add check for invalid type constraints.
* Fix invalid registrations for other kernels.
* Add hash replacement logic to provide backwards compatibility in ORT format models when the registration is fixed.
* Add tests
2022-02-18 16:55:32 +10:00
Valery Chernov
1cdc23aba4
[TVM EP] Rename Standalone TVM (STVM) Execution Provider to TVM EP (#10260)
* update java API for STVM EP. Issue is from PR#10019

* use_stvm -> use_tvm

* rename stvm worktree

* STVMAllocator -> TVMAllocator

* StvmExecutionProviderInfo -> TvmExecutionProviderInfo

* stvm -> tvm for cpu_targets. resolve onnxruntime::tvm and origin tvm namespaces conflict

* STVMRunner -> TVMRunner

* StvmExecutionProvider -> TvmExecutionProvider

* tvm::env_vars

* StvmProviderFactory -> TvmProviderFactory

* rename factory funcs

* StvmCPUDataTransfer -> TvmCPUDataTransfer

* small clean

* STVMFuncState -> TVMFuncState

* USE_TVM -> NUPHAR_USE_TVM

* USE_STVM -> USE_TVM

* python API: providers.stvm -> providers.tvm. clean TVM_EP.md

* clean build scripts #1

* clean build scripts, java frontend and others #2

* once more clean #3

* fix build of nuphar tvm test

* final transfer stvm namespace to onnxruntime::tvm

* rename stvm->tvm

* NUPHAR_USE_TVM -> USE_NUPHAR_TVM

* small fixes for correct CI tests

* clean after rebase. Last renaming stvm to tvm, separate TVM and Nuphar in cmake and build files

* update CUDA support for TVM EP

* roll back CudaNN home check

* ERROR for not positive input shape dimension instead of WARNING

* update documentation for CUDA

* small corrections after review

* update GPU description

* update GPU description

* misprints were fixed

* cleaned up error msgs

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>
Co-authored-by: Thierry Moreau <tmoreau@octoml.ai>
2022-02-15 10:21:02 +01:00
Changming Sun
3185680b6c
Add NHWC CONV contrib op (#10506) 2022-02-10 15:47:49 -08:00
Viswanath Boga
ad9d2e2e89
Prefix match in first iteration of beam search OP (#10231)
* Add BeamSearch op schema

* Add ONNX conversion for beams search

* remove attention_mask and change input order

* add option to run baseline

* add check data type NULL

* applies VerifyNodeAndOpMatch to subgraph

* update input_ids shape

* Add node name for Cast node

* expose API for topk

* parse parameters

* Add beam search scorer

* output results

* fix typo

* use c++ template and format python

* fix build pipeline errors

* symbolic shape infer of input onnx

* output scores

* add kernel def hash

* Handle vocab_mask; move CheckSubgraph

* undo insert_cast_transformer.cc and fusion_utils.py

* fix typo

* fix merge

* update doc

* add repetition penalty

* refactoring: add GptSubgraph class

* move BeamSearchState from .h to .cc file

* adjust logits processor order

* add batch generation example

* fix repetition penalty for dup words in sequence

* Add test

* Add no repeat ngram processor

* refactoring: move logits processor to classes

* fix build warning

* show latency

* use allocator in beam state

* use allocator in sequences

* fix build error

* move next_positions to beam state

* Changes for prefix matching

* removing debugs

* removing more debugs

* clean up

* clean up

* cpu doc updated

* Updated docs

* updated prefix_vocab_mask dimension in convert script

* changes to support bxs prefix_vocab_mask in beamsearchop kernel

* doc update

* OperatorKernels.md updated

* matching docs from artifacts

* minor change in logits processor

* Addressing comments

* Updated the prefix vocab mask usage properly

Co-authored-by: Tianlei Wu <tlwu@microsoft.com>
2022-02-03 00:14:39 +05:30
Yufeng Li
1aa0789691
add qdq support for QGemm (#10414)
* add qgemm in quantization tool

* add qdq support for QGemm

* fix build break

* fix OperatorKernels.md
2022-02-02 10:35:29 -08:00
Xavier Dupré
481b96d32a
STVM, NUPHAR, remove tvm from submodules list, checks pointers are not null. (#10211)
* STVM, checks pointers are not null.
* removes submodules tvm
* add missing include(FetchContent)
* add target tvm
* fix stvm test
* extend cgmanifest with dependencies of tvm
2022-01-27 20:31:13 +01:00
Edward Chen
66acf50488
Document C/C++ API documentation version info conventions. (#10396) 2022-01-27 10:20:13 -08:00
Dmitri Smirnov
3367ddc5ba
Add abseil cgmanifest declaration. Update coding standards. (#10374)
Add abseil cgmanifest declaration. Update coding standards for InlinedContainers
  Adjust coding guidelines. Add default N calculation for InlinedVector<T, N> for general use.
  Rename T from InlinedShapeVectorT. Fix Eager build
  Add LLVM Copyright with modified derived code notice.
2022-01-27 08:32:05 -08:00
Yi-Hong Lyu
e27f2dc932
int8/uint8 support for Argmax for opset 1, 11, 12 (#10296) 2022-01-18 14:37:34 -08:00
Vincent Wang
44e2db9397
CUDA BFloat16 Refactor (#10085) 2022-01-14 19:38:56 +08:00
Yi-Hong Lyu
499f1d5fd7
Quantization of Argmax (#10213)
This patch includes:
* int8/uint8 support for Argmax
* Quantization tool support for Argmax
2022-01-12 14:12:56 -08:00
Nat Kershaw (MSFT)
d52d3c0052
Update C/C++ API docs automation to create a PR (instead of push to publish branch) (#10093) 2022-01-07 16:16:47 -08:00
Edward Chen
3bc91c2151
Move reduced ops files into build directory (#10030)
In a reduced ops build, some source files get updated. This change moves the updated files into the build directory. This way, it is easier to simultaneously manage different build directories (with possibly different reduced ops configurations) based on a single source directory.
2021-12-28 19:04:20 -08:00
Vincent Wang
ceb17f82ff
Use FusedMatMul When Transpose is Between First Dim and Contiguous Batch Dims (#9734)
* fusedmatmul support transpose batches

* fix win build

* fix contrib op md

* more comments
2021-12-27 10:49:46 +08:00
Yufeng Li
12ee2e942f
add int8_t for Resize (#10067)
As we support quantization for format s8s8, we need Resize to support int8_t.
2021-12-17 15:36:09 -08:00
Tianlei Wu
ef36488df0
Add BeamSearch operator for GPT-2 decoding (#9680)
* Add BeamSearch operator and CPU implementation
* Add ONNX conversion script
2021-12-16 16:08:05 -08:00
Valery Chernov
b327e89efa
Standalone TVM Executor Provider (#10019)
* squashed commit for standalone tvm execution provider

* critical fix for correct python build with stvm ep

* get tuning log file from ep options. It has priority over AUTOTVM_TUNING_LOG

* updates and fixes

* update parsing of stvm provider options

* add support of external data for onnx model

* add conditional dump of subgraphs

* remove unused code

* get input tensor shapes through provider options. get output shapes for fixed input ones by TVM API

* support AUTO_TVM tuning log file inside ORT. Selector for Ansor and Auto_TVM is provider option (tuning_type)

* add fp16

* add functionality of conversion of model layout to NHWC if need. Necessary parameter was added to STVM provider options

* fix license text in header. fix log format

* small fixes

* fix issues from flake8

* remove model proto construction from GetCapability

* reserve memory for vector of DLTensors

* add simple tutorial for STVM EP

* STVM docs

* jroesch/tvm -> apache/tvm

* remove dead code, unneccessary logs and comments

* fix in readme

* improve tutorial notebook

* tvm update

* update STVM_EP.md

* fix default value

* update STVM_EP.md

* some TODOs for the future development

* shorten long lines

* add hyperlink to STVM_EP.md

* fix Linux CI error

* fix error in csharp test

Co-authored-by: Jared Roesch <jroesch@octoml.ai>
Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>
2021-12-15 16:59:20 -08:00
jingyanwangms
8043a9facc
Bump master version to 1.11 (#9957)
* Bump master version to 1.11

* Update Windows AI version

* update version in onnxruntime_c_api.cc
2021-12-14 23:32:06 -08:00
Nat Kershaw (MSFT)
b4434c7694
Automate generation of C/C++ API docs (#9997) 2021-12-10 17:45:50 -08:00
Yufeng Li
ffdafb2012
add fallback of s8s8 support on x64 (#9995)
* add fallback of s8s8 support on x64
2021-12-10 11:33:19 -08:00
Scott McKay
00c979db4d
Update doc for operators/opsets supported by mobile package (#9899) 2021-12-02 13:51:22 +10:00
Sherlock
6de79d82c8
Fix Training Packaging pipeline (#9885)
* Fix Training Packaging pipeline
2021-11-30 15:26:10 -08:00
Yufeng Li
a0afd7303d
add int8_t support for pool operators (#9852)
* add int8_t support for pool operators
2021-11-29 18:43:43 -08:00
Ye Wang
6856619b18
Decoder Attention CUDA Op (#9792)
* add kernel interface

* register kernel

* add self/cross qkv projection without cache

* add LaunchTransQkv2 for (S,B,X,N,H) -> (X,B,N,S,H)

* refactor ConcatPastToPresent

* DecoderQkvToContext interface

* q,k,v buffer and cache as output

* qk, pv and transctx

* fix compiler error on linux machine

* key_padding_mask

* add test_parity file. However not runnable

* add partial unittest

* made partial attributes to inputs

* --gen_doc

* change kernel interface, add more tests

* morre parity tests

* fix test

* fix typo

* transpose optimizer has bug. remove it temporarily

* add input shape checks

* add type/shape inference

* fix cache shape check

* fix rocm build failure

* fix rocm build error

* review comments

* review comments
2021-11-19 19:25:36 -08:00
Vincent Wang
f390347c11
Add CUDA Kernels of RandomNormal[Like], RandomUniform[Like] (#9761) 2021-11-19 08:18:34 +08:00
Viswanath Boga
9d84811fb6
fixing pypi pipeline for release (#9716)
* fixing pypi pipeline for release

* updated the script and correct python version

* updated the version correctly with script changes

* Remove 1.9.1
2021-11-10 17:33:51 -08:00
satyajandhyala
229c9a4e1c
Added Trilu CUDA kernel. (#9633)
* Added Trilu CUDA kernel.

* Added TriluGrad.

* Added a training testcase for Trilu.

* Added Trilu gradient checker test.
2021-11-09 11:20:17 -08:00
Hariharan Seshadri
bbeceb7541
Support optional type in ORT (#8339) 2021-11-04 15:01:42 -07:00
Viswanath Boga
85874bb315
embed layer fusion gpt2 (#9336)
* Changes to fuse embed layer for gpt2, kernal changes pending

* verified add output and regular add match

* Test added for additional output embedlayernorm, working on CUDA

* Test passing on CPU

* updated convert_to_onnx toll to check parity correctly

* removed some debugs

* couple of TODO left as in optimizer.py

* removed changes to optimizer.py

* fixing build

* fixing build

* updated order of initilization

* added a test case for float16

* updating the docs

* updating tests failing due to embed layer fusion

* update unit tests

* updating CUDA documentation in operatorkernels.md

* addressing comments

* OperatorKernels.md updated with CUDA

* adding TODO to qembed_layer

* minor edit

* updated docs

* addressing comments

* adding position ids to embed layer gpt2

* updating fused gpt2 model

* added extra test

* remove comments

* addressing comments

* contrib_defs.cc updated

* all tests passing

* fixing a typo

* minor edit

* trigger build

* qembedlayernorm checkinputs updated

* fixing build error

* fixing build error

* fixing build error
2021-10-28 11:06:26 -07:00
Bowen Bao
e983f37121
Bifurcation detector for aggressive decoding (#9432)
```
Component for aggressive decoding. Find the bifurcation index of predicted tokens, between source tokens,
starting from previous suffix match index, and predicted tokens.
Concat predicted tokens, starting from bifurcation index, to the back
of current tokens. This forms the output tokens.
Detect suffix match index in source tokens, between source tokens and output tokens.
Detection is based on finding the appearances of last n-gram in output tokens
in source tokens.
A match is considered found if source tokens contain a single matching n-gram.
Return the index of the start of the n-gram in source tokens.
No matching if found if src tokens contain multiple or zero matching n-grams. Return -1.
```
2021-10-19 19:53:56 -07:00
Hariharan Seshadri
4698b73725
Fix output shape description of Attention op's schema (#9406) 2021-10-19 15:56:35 -07:00
Xavier Dupré
11f0081c1e
Remove tensorflow, tf2onnx from the list of dependencies for the documentation (#9221)
* Remove tensorflow, tf2onnx from the list of dependencies for the documentation
* improve documentation
* update API
2021-10-14 18:07:35 +02:00
mindest
f9cf62912a
Add same_shape case for BiasDropout (#9188)
* bias dropout improvement

* add transform case for same shape case

* combine kernel

* merge with vectorized kernel

* use "has_same_shape_bias"

* minor: a "N % 4 != 0" case

* add op UT for has_same_shape_bias

* address comments; add param case for 1d bias;
add param case tests for 1d and same-shape bias

* rewrite logic condition

Co-authored-by: Peng Wang <pengwa@microsoft.com>
2021-10-12 19:57:38 +08:00
ashbhandare
35c2102cfa
Fixes for GatherND, Multinomial (#9143)
* register gathernd kernel, aten multinomial

* fix CI, add test

* review comments
2021-10-05 14:51:58 -07:00
Ye Wang
4934455ab6
Bumping up to 1.10 (#9006)
* bump to 1.10

* Update Versioning.md

* Update README.rst

* Change opset version to 15
2021-09-22 16:34:28 -07:00
Jason
4e5bc8365b
Add Paddle2ONNX to Versioning.md (#9067)
* Add Paddle2ONNX to Versioning.md
2021-09-22 13:38:14 -07:00
Pranav Sharma
dae37dc946
Fix S360 issue by using "use strict" for javascript code. (#9128) 2021-09-20 20:32:44 -07:00
Ryan Hill
6ae5f7a244
C API Docs - Add build instructions (#9106)
* Update Doxyfile, add build instructions to header
* Update paths in README.md
2021-09-17 18:40:27 -07:00
Ryan Hill
280e79463a
FIll in more documentation (#9088)
Fix plural values with %s
Fix more symbol links
Add custom header for web metrics
2021-09-16 17:08:27 -07:00
Zuwei Zhao
ff66cfdfa6
Enable linking in exception throwing support library when build onnxruntime wasm. (#8973)
* Enable linking in exception throwing support library when build onnxruntime webassembly containing onnxruntime-extensions.

* Add flag in build.py to enable linking exceptions throwing library.

* Update onnxruntime-extensions document and bind custom_ops build flag with use_extensions.

* Update doc.

* Update cgmanifest.json.

Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com>
2021-09-10 22:09:16 +08:00
Ryan Hill
2439ced3ec
API Documentation (#8948)
* Make help information compile properly
2021-09-09 22:04:51 -07:00
ytaous
0193490cbf
ReduceMin - add int64 cuda kernel support for opset12/13 (#8966)
* ReduceMin - int64 support

* fix doc

Co-authored-by: Ethan Tao <ettao@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2021-09-07 17:01:26 -07:00
Ye Wang
e2194797a7
bumping up to version 1.9 (#8982)
* bump up version

* makes the windowAI column align with ORT version

* update the hardcoded version string

* fix a typo
2021-09-07 14:30:55 -07:00
Zuwei Zhao
89e8bff121
Enable selecting custom ops in onnxruntime-extensions. (#8826)
* Enable selecting custom ops in onnxruntime-extensions.

* Move cmake_helper.py.

* Remove over-indented spaces.

* Add doc.

* Remove onnxruntime-extensions from git submodules, and user should pass path of onnxruntime-extensions for build.

* Modify doc.

* Remove argument --enable_onnxruntime_extensions and use --onnxruntime_extensions_path.

* Fix build error.

* Fix build error.

* Use onnxruntime_extensions_path.

* support both submodule and external source folders

* refinement

* Update cgmanifest.json

* Support building onnxruntime-extensions from either git submodule or pre-pulled path.

* Update doc.

* more standard name

* update docs

* add the copyright header

Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com>
Co-authored-by: Wenbing Li <wenbingl@outlook.com>
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
2021-08-27 21:45:52 -07:00