* Add CumSum and Round for Opset 11
* add back 1 test
* Add back one broken test
* Add back more broken tests
* activate cumsum, round, dynamicquantizelinear tests
* removed python backend tests
* re-comment out dynamicquantizelinear_* tests. ReduceMin(11) not implemented yet
* comment out cumsum_1d_reverse_exclusive
* Remove a few types for CumSum. Keep only float, int32, int64
* Added friendly error message
* Added double type to pass ONNX tests.
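For reference, the CumSum behaviour exercised by the tests above (including the `exclusive` and `reverse` attributes) can be sketched in NumPy. This is an illustrative reference only, not the operator kernel added in this change; the function name `cumsum_reference` is hypothetical.

```python
import numpy as np

def cumsum_reference(x, axis=0, exclusive=False, reverse=False):
    """Illustrative reference for CumSum with exclusive/reverse attributes."""
    x = np.asarray(x)
    if reverse:
        # Sum from the end of the axis towards the start.
        x = np.flip(x, axis=axis)
    out = np.cumsum(x, axis=axis)
    if exclusive:
        # An exclusive sum leaves out the element at the current position.
        out = out - x
    if reverse:
        out = np.flip(out, axis=axis)
    return out

x = np.array([1, 2, 3], dtype=np.int64)
plain = cumsum_reference(x)
exclusive = cumsum_reference(x, exclusive=True)
reverse = cumsum_reference(x, reverse=True)
reverse_exclusive = cumsum_reference(x, exclusive=True, reverse=True)
```

The `reverse_exclusive` case corresponds to the combination named in the commented-out cumsum_1d_reverse_exclusive test.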
* Don't return shape for non-const initializer in InferenceContextImpl::getInputType
Don't return initializer for non-const initializer in InferenceContextImpl::getInputData
Update graph_utils to support these scenarios
- fix GetConstantInitializer to make sure a name is for an outer scope value before checking a parent graph, as local name could shadow an outer scope initializer.
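The shadowing rule described above can be sketched as follows. This is a simplified model under stated assumptions, not the graph_utils code: the `Graph` class, its fields, and `get_constant_initializer` are hypothetical stand-ins.

```python
class Graph:
    """Hypothetical, simplified graph with an optional parent (outer scope)."""
    def __init__(self, parent=None):
        self.parent = parent
        self.constant_initializers = {}  # name -> constant value
        self.local_values = set()        # names defined inside this graph

def get_constant_initializer(graph, name):
    # A constant initializer in the current graph wins.
    if name in graph.constant_initializers:
        return graph.constant_initializers[name]
    # A locally defined name shadows any outer scope initializer with the
    # same name, so do not look in the parent graph.
    if name in graph.local_values:
        return None
    # Only a genuine outer scope value is resolved through the parent.
    if graph.parent is not None:
        return get_constant_initializer(graph.parent, name)
    return None

outer = Graph()
outer.constant_initializers["w"] = 1.0
inner = Graph(parent=outer)
found_outer = get_constant_initializer(inner, "w")
inner.local_values.add("w")  # the subgraph now defines its own "w"
found_shadowed = get_constant_initializer(inner, "w")
```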
* call MLAS's pooling function as an external call for Nuphar
Note that at the moment Nuphar provider doesn't handle the cases below:
- symbolic height/width dimensions
- Indices output of MaxPool
- non-default dilations
* unify the pool interface for mti and mti_x86
* Address two issues:
Thread-safety issue with LSTM/RNN running lambdas in parallel
Propagate lambda exceptions and report them when running in
parallel.
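The second point, collecting exceptions from tasks that run in parallel and reporting them to the caller, can be sketched with Python's standard thread pool. This is a generic illustration of the pattern, not the C++ change itself; `run_in_parallel` is a hypothetical helper.

```python
from concurrent.futures import ThreadPoolExecutor

def run_in_parallel(tasks, max_workers=4):
    """Run callables in parallel; re-raise if any of them failed."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(task) for task in tasks]
    # Leaving the `with` block waits for all futures to finish.
    errors = [f.exception() for f in futures if f.exception() is not None]
    if errors:
        # Report the failure instead of silently dropping it.
        raise RuntimeError(
            f"{len(errors)} of {len(tasks)} parallel tasks failed: {errors[0]!r}"
        ) from errors[0]
    return [f.result() for f in futures]

results = run_in_parallel([lambda: 1, lambda: 2])
```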
Refactor cpu provider's pool ops by extracting pooling attributes
into a separate helper class PoolAttributes. With this change,
other providers such as Nuphar can re-use the same routines
for processing pooling attributes. This refactoring introduces
no functional changes.
* add ctor overloads that accept model byte array
* doxygen. mark Init method as private.
* doxygen
* rename test method for clarity
* PR feedback - add two overloads that accept either model path or model byte array
* update native signature to align with latest codebase
* fix native call
Fix issue #1591
Root Cause:
CUDA Equal, Greater, and Less do not support multi-directional broadcast
Fix:
Add code to support the multi-directional broadcast
Also add tests to cover more cases.
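As an illustration of what multi-directional broadcast means for these comparison ops, NumPy follows the same style of broadcasting: both inputs may be expanded, so neither operand needs to already have the output shape. The arrays below are made-up example data.

```python
import numpy as np

a = np.array([[1], [2], [3]])   # shape (3, 1)
b = np.array([[1, 2, 3, 4]])    # shape (1, 4)

# Neither input has the output shape; both are broadcast to (3, 4).
equal = np.equal(a, b)
greater = np.greater(a, b)
less = np.less(a, b)
```

With unidirectional broadcast only one input could be expanded to match the other, so this (3, 1) against (1, 4) case would be rejected.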
* Mention OrtCreateSessionFromArray in C API doc
* Add C API for free dim override
* Add C API for free dim override, fix missing API mention in InferenceTest.cs, fix confusing print statement in perf_test.
* Remaining C# files
* fix c# build
* Run the tests in blame mode. This option is helpful in isolating a problematic test causing the test host to crash.
* fix order
* Avoid variable length stack array variables for VC++ compatibility
Use dynamically allocated arrays or vectors instead.
* windows enabling
* openvino windows build
* Update build instructions
* resolve conflicts for PR
* remove debug messages from cmake
* PR fix for Windows support
* Disabled Div unit test on GPU
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Disabled div unit test for GPU in python backend tests
* Added more backends for OpenVINO
* Disabled div unit test in onnx_test_runner
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Disabled div for GPU_FP16 in python backend tests
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Handle std::bad_alloc when growing arena.
Allow more than one attempt at reducing the buffer if allocation fails. More memory may have become available, and giving up after a single backoff means we can fail even when a large enough buffer could have been allocated.
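The retry idea can be sketched as a loop that shrinks the request after each failed allocation until it reaches the minimum size that would still satisfy the caller. This is an assumption-laden illustration, not the arena code: `grow_arena`, `try_alloc`, and the halving factor are all hypothetical, and `MemoryError` stands in for std::bad_alloc.

```python
def grow_arena(requested, min_required, try_alloc):
    """Retry a failed allocation with smaller sizes down to min_required."""
    size = requested
    while True:
        try:
            return try_alloc(size)
        except MemoryError:
            if size <= min_required:
                # Even the smallest useful buffer failed; give up.
                raise
            # Back off (here by halving, an arbitrary choice) and try again.
            size = max(min_required, size // 2)

attempts = []

def fake_alloc(n):
    # Stand-in allocator that cannot satisfy requests above 100 bytes.
    attempts.append(n)
    if n > 100:
        raise MemoryError
    return bytearray(n)

buf = grow_arena(1024, 64, fake_alloc)
```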
Description: Refine threading control options and move inter op thread pool to session state.
Added thread_utils.h/cc to centralize the decision around the thread pool size under various conditions.
Motivation and Context
Currently the thread pool size of the parallel executor is hardcoded to 32 for some reason. This PR makes the options to configure the thread pool sizes clearer.
* Initial commit
* Uncomment tests
* Updates
* Updates
* Disable CRD mode DepthToSpace for NGraph builds
* Disable test
* Update tests
* PR feedback
* Add unit test for CRD mode
* Reflect class variable in naming
* Add a test to NGRAPH disabled list
* Update main.cc
* Update main.cc
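For context on the CRD mode mentioned above, the difference between the two DepthToSpace modes can be sketched in NumPy, following the reshape/transpose description in the ONNX operator documentation. This is a reference sketch, not the kernel from this change; `depth_to_space` is a hypothetical helper name.

```python
import numpy as np

def depth_to_space(x, blocksize, mode="DCR"):
    """Reference DepthToSpace for an NCHW input in DCR or CRD mode."""
    b, c, h, w = x.shape
    bs = blocksize
    if mode == "DCR":
        # Channel index is split as (block_row, block_col, out_channel).
        tmp = x.reshape(b, bs, bs, c // (bs * bs), h, w)
        tmp = tmp.transpose(0, 3, 4, 1, 5, 2)
    elif mode == "CRD":
        # Channel index is split as (out_channel, block_row, block_col).
        tmp = x.reshape(b, c // (bs * bs), bs, bs, h, w)
        tmp = tmp.transpose(0, 1, 4, 2, 5, 3)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return tmp.reshape(b, c // (bs * bs), h * bs, w * bs)

x = np.arange(8).reshape(1, 8, 1, 1)
dcr = depth_to_space(x, 2, mode="DCR")
crd = depth_to_space(x, 2, mode="CRD")
```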
* Fix symbolic shape inference for faster_rcnn, mask_rcnn, yolov3
Force merge when --auto_merge, on symbolic dims which sympy cannot simplify
Add symbolic inference for Resize opset 10
Add support for step != 1 in Slice
Add support for computed dim in TopK
Bug fixes in passing symbolic dims from subgraph
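To illustrate why a step other than 1 matters for Slice shape inference, the output length along one axis can be computed for concrete dimensions as below. This sketch covers only known integer dims and leans on Python's own slice clamping rules; it does not reproduce the symbolic (sympy) handling in the script, and `sliced_dim` is a hypothetical helper.

```python
def sliced_dim(dim, start, end, step=1):
    """Length of one axis of size `dim` after slicing start:end:step."""
    if step == 0:
        raise ValueError("step must be non-zero")
    # slice.indices clamps out-of-range and negative bounds to the axis.
    return len(range(*slice(start, end, step).indices(dim)))

forward = sliced_dim(10, 1, 8, 3)     # picks indices 1, 4, 7
backward = sliced_dim(10, 8, 1, -2)   # picks indices 8, 6, 4, 2
```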
Fix an outdated comment in Nuphar provider header
1. remove sudo from the cleanup step for Linux so that the vstsagent build user doesn't need sudo access
2. a minor fix in install_ubuntu.sh to make the openvino image smaller