* If there is an outer scope value that matches a subgraph input, don't create an implicit input from the outer scope value.
Minor unrelated change for an issue noticed while debugging: use unordered_set for implicit inputs so we don't add them multiple times.
* Add unit test based on onnx issue.
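The rule above can be sketched roughly as follows. This is a simplified Python illustration, not the actual graph-resolution code; the function and argument names are hypothetical.

```python
def collect_implicit_inputs(consumed_names, subgraph_inputs,
                            locally_produced, outer_scope_values):
    """Return the outer scope names a subgraph needs as implicit inputs.

    All names here are hypothetical; this only illustrates the rule.
    """
    explicit = set(subgraph_inputs)
    local = set(locally_produced)
    outer = set(outer_scope_values)
    # A set, so a name consumed by several nodes is only recorded once.
    implicit = set()
    for name in consumed_names:
        if name in explicit or name in local:
            # A subgraph input (or a value produced inside the subgraph)
            # takes precedence over an outer scope value with the same name.
            continue
        if name in outer:
            implicit.add(name)
    return implicit
```

Here "x" is both a subgraph input and an outer scope value, so it is not treated as an implicit input, and "y" is recorded once even though two nodes consume it.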
* Bug fix for shape of optional output in Dropout op
* Exclude new test from NGraph EP
* Account for the fact that the mask could have a different type in different opset versions of the op
* Make accompanying CUDA changes
* Fix build break
* Exclude opset 7 test for TensorRT EP
* PR comments
Description:
Crash if the output shape contains a 0, because the code divides by output_shape[i].
Fix:
If the output shape contains a 0, output_shape.Size() is 0, so the output should be null.
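A minimal sketch of this kind of guard, in Python rather than the actual kernel code. The function name is hypothetical and the index arithmetic is only a generic stand-in for whatever the real code divides by output_shape[i] for.

```python
from math import prod


def unravel_indices(output_shape):
    """Map every flat output offset to a multi-dimensional index.

    Hypothetical helper: returns None when the shape contains a 0.
    """
    size = prod(output_shape)
    if size == 0:
        # Guard: a 0 dimension means there are no elements to produce,
        # and the modulo/division below would divide by zero.
        return None
    indices = []
    for offset in range(size):
        idx = []
        for dim in reversed(output_shape):
            idx.append(offset % dim)
            offset //= dim
        indices.append(tuple(reversed(idx)))
    return indices
```

Without the early return, a shape such as [2, 0, 3] would hit `offset % 0` if the inner loop were ever entered for that dimension.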
* Add macOS leg of Python packaging job
* Update copy files source directory for macOS leg
* Add a task to display the contents of the binaries directory after the wheel is built
* Revert some changes
* Add task to log
* Update
* Remove unnecessary logs
Add a Python script and the necessary changes to the azure-pipelines YAML file to post binary size data from the NuGet package build. Data is currently posted only from the CPU pipeline; GPU and other pipelines may be added as necessary.
1. Move non_max_suppression_test.cc to object_detection folder
2. Move class CudnnDropout to cudnn_common.h so that it can be shared with other ops. Move the CUDA memory allocation out of CudnnDropout to avoid a memory leak.
Description:
Add no scale check for resize and upsample
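One plausible reading of the no-scale check is a short-circuit when every scale is 1, so the input is copied instead of resampled. The commit message does not say this explicitly, so treat the sketch below as an assumption; it is a 1-D nearest-neighbour illustration in Python with hypothetical names, not the operator implementation.

```python
def is_no_scale(scales):
    """True when every axis is scaled by exactly 1 (nothing to resample)."""
    return all(s == 1 for s in scales)


def resize_nearest_1d(values, scale):
    """Nearest-neighbour resize of a 1-D sequence (illustrative only)."""
    if is_no_scale([scale]):
        # Assumed fast path: output equals input, so just copy it.
        return list(values)
    out_len = int(len(values) * scale)
    last = len(values) - 1
    return [values[min(int(i / scale), last)] for i in range(out_len)]
```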
* Update DNNLibrary
* Allow fp16 by default
* Add NNAPI build in CI
* Fix NNAPI EP after #1268
* Remove unused variables
* Support NNAPI in onnx_test_runner
* Update DNNLibrary to fix tests
* Update build.py for Android build support, solve conflict of tools/ci_build/build.py
* Support non-ARM Android build, solve conflict of tools/ci_build/build.py
* Enable Android tests on an x86_64 Android emulator
* Add DNNLibrary/NNAPI support in build.py
* Suppress the verbose adb output
* Remove debug logs
* Install cmake by pip
* Fix undefined host_protoc_path
* cmake==3.13.2 in pypi is actually 3.12.2, so install 3.13.2.post1 instead
* Fix Android ARM64 build
* Use Android NDK r20 instead of r19c, fix conflicts in install_deps_android.sh
Description:
Optimize the resize and upsample operators
Motivation and Context
For an input with shape [1, 128, 267, 200] and scales [1, 1, 1.97, 2], Resize and Upsample get a roughly 14x speedup (1020 ms before, 71 ms after, on my local box). Other scenarios should benefit at a similar level.
* Update version number to 0.5.0 in preparation for release
* Update README.md to point to the Versioning doc
* Resolve PR comment
* Remove incorrect line generation
* Minor updates to update version script
* Minor comment update
* Remove invalid dim_param and dim_value values when creating a NodeArg.
* Allow re-use of a large enough buffer if there's a shape mismatch.
* Update handling in python to treat unset dimension the same as a dim_param (equivalent to None).
* Fix GetTensorShapeFromTensorShapeProto to handle neither dim_param nor dim_value being set.
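The dimension handling described in the last two bullets can be illustrated like this. Plain dicts stand in for TensorShapeProto dimensions and the function name is hypothetical; it is not the actual Python binding code.

```python
def shape_from_dims(dims):
    """Convert a list of dimension records to a Python-side shape.

    Each record is a dict that may hold "dim_value" (a concrete size),
    "dim_param" (a symbolic name), or neither. Hypothetical stand-in
    for the protobuf dimension type.
    """
    shape = []
    for dim in dims:
        if "dim_value" in dim:
            shape.append(dim["dim_value"])
        else:
            # A symbolic dim_param and a completely unset dimension are
            # both unknown sizes, so both map to None.
            shape.append(None)
    return shape
```

For example, `shape_from_dims([{"dim_value": 1}, {"dim_param": "N"}, {}])` gives `[1, None, None]`.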