* register part of opset13 cpu kernels; fix a bug in func impl; adjust reshapefusion order
* remove useless function
Co-authored-by: Cheng Tang <chenta@microsoft.com>
* Optimized MatMulGrad for dB when B's shape is 2D
* Refactor for ConstantScalarNode
Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
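For context on the MatMulGrad entry above, a minimal sketch of why a 2D `B` allows a cheaper `dB`. It assumes `Y = A @ B` with `A` of shape `[batch, M, K]` and `B` of shape `[K, N]`; in that case `dB` is the batch sum of `A_b^T @ dY_b`, which equals one matrix product over the flattened leading dimensions. The helper names are hypothetical and this is not the kernel from this change.

```python
def matmul(a, b):
    """Plain 2D matrix product on nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Transpose a 2D nested list."""
    return [list(col) for col in zip(*m)]


def db_batched(a, dy):
    """Reference form: accumulate A_b^T @ dY_b over every batch entry."""
    k, n = len(a[0][0]), len(dy[0][0])
    total = [[0.0] * n for _ in range(k)]
    for a_b, dy_b in zip(a, dy):
        part = matmul(transpose(a_b), dy_b)
        total = [[t + p for t, p in zip(t_row, p_row)]
                 for t_row, p_row in zip(total, part)]
    return total


def db_flattened(a, dy):
    """Flatten [batch, M, K] to [batch*M, K] and [batch, M, N] to [batch*M, N],
    then compute dB with a single matrix product."""
    a2d = [row for a_b in a for row in a_b]
    dy2d = [row for dy_b in dy for row in dy_b]
    return matmul(transpose(a2d), dy2d)
```

Both functions return the same `[K, N]` gradient; the flattened form replaces a batched product plus a reduction with one larger product.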
* matching multiple choice between new and old apis
* update according to reviewer's comments
Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
* support Normalized_0_1 and Normalized_1_1
* add tests for Normalized_1_1
* fix build error
* fix imagetests failure
* support denormalization and add more tests
* fix build
* remove added models
* disable gpu tests for CPU pipeline
* refactor based on comments and moved two added models
* merge normalizer and Denormalizer into NominalRangeConverter
* add comments
* little change
* fix build failure for amd64
* cuda kernel support
* address comments
* test UT
* test UT
* revert settings
* attempt to fix broken UT
* corrected UT fix
Co-authored-by: Ethan Tao <ettao@microsoft.com>
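To illustrate the `Normalized_0_1` and `Normalized_1_1` entries above, a small sketch of the range conversion. It assumes 8-bit pixel values in `[0, 255]`, that `Normalized_0_1` targets `[0, 1]`, and that `Normalized_1_1` targets `[-1, 1]`. The function names are hypothetical; this is not the `NominalRangeConverter` code itself.

```python
def normalize(pixel, nominal_range):
    """Map a pixel value in [0, 255] into the given nominal range (assumed semantics)."""
    if nominal_range == "Normalized_0_1":
        return pixel / 255.0
    if nominal_range == "Normalized_1_1":
        return pixel / 127.5 - 1.0
    raise ValueError(f"unsupported nominal range: {nominal_range}")


def denormalize(value, nominal_range):
    """Inverse of normalize: map a normalized value back to [0, 255]."""
    if nominal_range == "Normalized_0_1":
        return value * 255.0
    if nominal_range == "Normalized_1_1":
        return (value + 1.0) * 127.5
    raise ValueError(f"unsupported nominal range: {nominal_range}")
```

Keeping both directions next to each other mirrors the idea of merging the normalizer and denormalizer into one converter.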
* Extend C++ API for Map/Sequence Type Info (#3517)
Expose functionality to view type information about sequences/maps
to C++ API.
- Add functions
- `TypeInfo::GetSequenceTypeInfo`
- `SequenceTypeInfo::GetSequenceElementType`
- `TypeInfo::GetMapTypeInfo`
- `MapTypeInfo::GetMapValueType`
- `MapTypeInfo::GetMapKeyType`
- Add structs
- `SequenceTypeInfo`
- `MapTypeInfo`
Co-authored-by: Dudeldu <mustermann.informatik@gmail.com>
Co-authored-by: Jonas-Heinrich <Jonas@JonasHeinrich.com>
* Extend tests to cover new type info functionality for sequences and maps
- two new test cases in test_nontensor_types for maps and sequences
Co-authored-by: Jonas-Heinrich <Jonas@JonasHeinrich.com>
* Export GPT-2 ONNX model without position_ids and attention_mask inputs
* allow benchmark_gpt2 on user's model
* refactor: get_dummy_inputs returns a data class.
* check whether the node has been cast before
* check cast node logically
* better naming convention
* nit: extra space
* change to skip for Cast Node
* remove hasNodeBeenCast
* Add a Unit test
* Add test onnx file
* nit: naming convention and comments
* check CI: try to remove test
* move test to existing test file
* Copy samples to build folder and load models from there. Fix CI
* This PR also includes a fix to path validation for save_as_onnx API
* Add torchtext to CI for GPU training
* Remove new frontend tests from CI
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
* Next round of changes.
Remove inclusion of ONNX schema header
Exclude custom registry related things
Move IsConstantInitializer from graph_utils to Graph as it's needed in a minimal build and graph_utils is excluded.
* Add support for sharing allocators
* Incremental update
* Address some PR comments, add unit tests, add documentation.
* Address PR comments, add tests and some documentation.
* Fix build and test issues
* Remove RegisterAllocator API restoring the OrtAllocator interface changes. Changed docs to reflect this.
Also fixed the orttraining segfault. It occurred because, in a training session,
the CPU execution provider is not available when the transformers are applied.
A new one is now created instead.
* create branch for debug
* move unit test
* more changes
* move relu to activations_grad*
* Fix ReluGrad Domain and opset version
* added unit test, CudaKernelTest.Relu_basic doesn't work yet
* remove CudaKernelTest.Relu_basic
* PR comment
* add unit test ReluGradTest_Basic
Co-authored-by: Jingyan Wang <jingywa@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
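As a reference for the ReluGrad entries above, a minimal sketch of the gradient rule: the incoming gradient passes through where the forward input was positive and is zero elsewhere. The function name and the flat-list layout are assumptions for illustration, not the kernel's actual signature.

```python
def relu_grad(dy, x):
    """Return dX where dX[i] = dY[i] if X[i] > 0, else 0."""
    return [g if v > 0 else 0.0 for g, v in zip(dy, x)]
```

Because the Relu output is positive exactly where its input is positive, the same rule holds when the forward output is passed in place of `x`.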
* cancel night build on pyop
* add rewriter to rewrite cpu provider
* skip BuildKernelCreateInfo<void>
* refactor variable name and comment
* include ops from csv file
* process multiple eps
* add default function to cuda provider
* rename function and add license header
* fix import
* add doc
* fix typo
* deal with empty kernel entry in cuda
* rename the rewriter file
* add comment into provider file
* add comment and rename function
* log warnings
* refactor extracting logic
* add entry for script to run solo
* add better example
* avoid onnx importing
* fix flake8 alerts
* minor fixes to better comments and doc
* add entries for all domains
* add void entry into contrib providers
* format cuda_contrib_kernels.cc
* format cpu_contrib_kernels.cc
* add all providers
* add default entry to all providers
* include op_kernel header
* cancelling change in providers beyond cpu/cuda
* rename file and switch file format to domain;opset;op1,op2...
* update doc
* restore non-regular ending grammar in cuda_contrib_kernels.cc
* add ort_root as input argument of script
* enable test in ci
* update doc
* update doc
* revert change on linux gnu ci
* switch to set to host ops
* simplify trimming logic
* add domain map to track current model
* allow ort_root to take relative path
* correct some errors in the flatbuffers schema, move flatbuffers submodule to cmake/external
* update the ort flatbuffers schema to use less namespace
* minor update
Co-authored-by: gwang0000 <62914304+gwang0000@users.noreply.github.com>
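The `domain;opset;op1,op2...` file format mentioned above could be read along these lines. This is a sketch only: the function name is hypothetical, and skipping blank lines and `#` comments is an assumption rather than documented behaviour of the script.

```python
from collections import defaultdict


def parse_required_ops(lines):
    """Parse 'domain;opset;op1,op2,...' lines into {domain: {opset: set(ops)}}."""
    required = defaultdict(lambda: defaultdict(set))
    for raw in lines:
        line = raw.strip()
        # Assumption: blank lines and '#' comment lines are ignored.
        if not line or line.startswith("#"):
            continue
        domain, opset, ops = (part.strip() for part in line.split(";", 2))
        # A set collapses operators repeated across lines for the same domain and opset.
        required[domain][int(opset)].update(
            op.strip() for op in ops.split(",") if op.strip()
        )
    return required
```

Usage: `parse_required_ops(open(path))` works as well, since the function only iterates over lines.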
* Initial set of changes to start disabling code in the minimal build. Breaking changes into multiple PRs so they're more easily reviewed. Focus on InferenceSession, Model and Graph here. SessionState will be next.
Needs to be integrated with de/serialization code before being testable so changes are all off by default.
Changes are limited to
- #ifdef'ing out code
- moving some things around so there are fewer #ifdef statements
- moving definition of some one-line methods into the header so we don't need to #ifdef out in a .cc as well
- exclude some things in the cmake setup
* Update session state and a few other places.
The core code builds if ORT_MINIMAL_BUILD is specified.
* update onnx to latest master
* implement per-channel for quantizelinear and dequantizelinear
* refine the unit test
* exclude sequence_insert tests
* refine onnx cmake
* add failure tests to broken_tests
* move qdq common code to a separate function
* refine code
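
For the per-channel QuantizeLinear and DequantizeLinear entry above, a minimal sketch of the arithmetic. It assumes a 2D input whose channel axis is 0, one scale and one zero point per channel, and a uint8 output range; the function names are hypothetical and this is not the kernel code. Python's `round` rounds halves to even, which matches the rounding the ONNX operator specifies.

```python
def quantize_linear_per_channel(x, scales, zero_points, qmin=0, qmax=255):
    """Quantize each row with its own scale and zero point:
    q = clamp(round(v / scale) + zero_point, qmin, qmax)."""
    out = []
    for row, scale, zp in zip(x, scales, zero_points):
        out.append([min(qmax, max(qmin, round(v / scale) + zp)) for v in row])
    return out


def dequantize_linear_per_channel(q, scales, zero_points):
    """Inverse mapping per row: v = (q - zero_point) * scale."""
    return [[(v - zp) * scale for v in row]
            for row, scale, zp in zip(q, scales, zero_points)]
```

Values that fall outside the representable range saturate at `qmin` or `qmax`, so dequantizing them does not recover the original input.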