Description: Describe your changes.
Add no scale check for resize and upsample
Motivation and Context
Why is this change required? What problem does it solve?
If it fixes an open issue, please link to the issue here.
* Update DNNLibrary
* Allow fp16 by default
* Add nnapi build in ci
* Fix nnapi ep after #1268
* Remove unused variables
* Support nnapi in onnx_test_runner
* Update DNNLibrary to fix tests
* Update build.py for android build support, solve conflict of
tools/ci_build/build.py
* Support non-ARM Android build, solve conflict of tools/ci_build/build.py
* Enable android test by x86_64 android emulator
* Add dnnlibrary/NNAPI support in build.py
* suppress the verbose adb output
* Remove debug logs
* Install cmake by pip
* Fix undefined host_protoc_path
* cmake==3.13.2 in pypi is actually 3.12.2, so install 3.13.2.post1 instead
* Fix Android ARM64 build
* Use android ndk r20 instead of r19c, fix conflicts in install_deps_android.sh
Description: Describe your changes.
Optimize the resize and upsample operators
Motivation and Context
Why is this change required? What problem does it solve?
For case with input with shape [1,128, 267, 200] and scales [1, 1, 1.97, 2], Resize and upsample get 15x gain (w/o: 1020ms, w: 71ms on my local box). It should benefit other scenarios at similar level.
If it fixes an open issue, please link to the issue here.
* Update version number to 0.5.0 in preparation for release
* Update to README.md to direct to Versioning doc
* Resolve PR comment
* Remove incorrect line generation
* Minor updates to update version script
* Minor comment update
* Remove invalid dim_param and dim_value values when creating a NodeArg.
* Allow re-use of a large enough buffer if there's a shape mismatch.
* Update handling in python to treat unset dimension the same as a dim_param (equivalent to None).
* Fix GetTensorShapeFromTensorShapeProto to handle neither dim_param and dim_value being set.
* Initial commit
* More ops
* fix missing declarations for ReduceSum and ReduceSumSquare
* Add tests for new ops supporting double
* isable Add_dobule for OpenVINO EP
The NCHWc transform was missing support for the Sum_6 operator from ONNX 1.2. Older models would add unnecessary reorder ops and also would not use the Conv/Add fusion.
* Add string attribute interface for C API.
* Add string attribute interface for C++ API accordingly.
* Update comment to say that string is also valid
* Initial commit
* Add test case
* Revert unintentional change
* Update comments
* Resolve PR feedback
* Craft test casse and add more logs
* Fix build failures
* Fix minor bug in the way modified is updated
* Remove full model inference session test
* Resolve PR comments
* Resolve more PR feedback
* Resolve more PR feedback
* Resolve more PR comments
* Remove logging
* Move GetInitializer() method to memcpy_transformer scope
* Remove some unnecessary blank lines
* Make GetInitializer static
* Validate input shapes.
* Cache some input def metadata
* Make some methods const and check for negative values of dims instead of just -1.
* Fix shape inferencing test.
* Fix testLabelEncoder test
* Fix more tests
* Fix more tests
* Use size_t for loop variable
* sync onnx to get equal op with float support
* doc update
* fix test failure because of updated shape inference logic for roialign.
* filter consum test cases since it's not implemented yet.
This change makes some optimizations on various places. This change consists of a part of PR #1240 (removed the problematic part) and some other trivial fix.
1. reduce unnecessary copy when constructing vector or objects that contains vector as member. use std::move when applicable.
2. use std::vector<std::reference_wrapper<const TensorShape>> instead of std::vector<TensorShape>, when it is only for constant reference usage.
3. calculate key BEFORE (instead of AFTER) acquire lock in SessionState::GetMemoryPatternGroup
other trivial fixes (code should be straightforward and self-explainable).
Description: Describe your changes.
Change the logic to find cublas dll
Motivation and Context
Why is this change required? What problem does it solve?
The name pattern of cublas changed since 10.1. It doesn't include minor version in its name anymore.
If it fixes an open issue, please link to the issue here.