* attention fusion kernel refactored
* consider the case of none in add_qk
* variabled added to check for pre-pack weights
* added a comment to PrePack()
* Optimized prepack and try to free the weights
* making comment sound better
* fixing a bug with optimizer.py
* commented out changes to be done
* removed comments
* make the private fn() private
* fix build
* making clean up fn static
* backed out optimizer tool change, needs more looking into
* freeze/fastpath support
* more comments on _fast_path
* per comments
* minor fix
* IntFlag improve
* address comments
Co-authored-by: Ethan Tao <ettao@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
* atenop for inference
* assert if dtype mismatch
* atenop config in frontend
* fix orttrainer test
* gradient def not only for ATenOp
* bugfix
* fix gradient input shape and type issue
* fix after merge master
SparseTensor support
Implement Builder pattern
Fix support for 1-D and 2-D COO indices
Implement and test CSR support.
Handle shape inference for SparseTensors
Implement conversion for COO, CSR and tests.
Address the case where constant sparse initializer is the output.
Implement test infra for SparseTensors
Implement SparseDenseMatMul for Csr and COO and tested it.
Add hash for SparseToDenseMatMul
Finish shared provider refactor
Refactor GetOrCreate to Create
Working on py interface
Expose OrtDevice and use it in allocate_numpy
Adjust Sparse interfaces, add support for string SparseTensor. Add tests.
Add and test to_cuda()
Add accessors to format specific indices
Test values and indices views, read-only flag, after GC access
Add sparse related methods to OrtValue
Re-work SparseTensor wrapper, add OrtValue methods
Rework numpy_array_to_cuda/to_cpu
Add run_with_ort_values
Add models and test sparse_mat_mul with run_with_ort_values
Refactor sparse tensor to use a single buffer
Ifdef x86 Eigen CSR sparse matmul implementation
Exclude broken test, check for string type when copying cross device
Split pybind schema, regenerate docs, add exclusion
Conditionally exclude schema module
Update docs fix cuda build
Add test to a filter and renerate JS docs
Add conversion and test string support for sparse tensors
Exclude conversion utils from minimal build
Add CUDA Memcpy and adjust provider interfaces
* add Gridsampler contrib op
* fix gridsampler_paddingmode_border test
* disable the tests until the kernel added
* fix CI failure
* change GridSampler to GridSample
When compiling with newer gcc and older glibc, there is a chance
for new POWER10 macros to be not available in hwcap.h. This patch
checks whether hwcap macros are available before using that in
platform.cpp.
* changes working to convert akv nodes
* changes to replace nodes
* changes to accomodate qkv hidden sizes as attributes
* kernel to accept qkv_hidden_size attributes
* Working till compute for varied dimension, todo applyattention()
* changes to make all regression tests work
* inference running successfully without prepack
* success inference with pre-pack weights
* add test for diff sizes
* bias shape need not be a mul of 3
* get the output_hidden_size from input
* infer output shape from input
* merge with master
* cleaning up files that got merged wrong
* accurancy at accepted level
* added unit test case for different dimensions
* all unit tests passing
* packed weights working for attention
* prepacked weights working
* added test case for newly added extra qk input
* updated unit test to test only extra add qk
* fixing build error
* removing few debugs
* reverting test changes
* all python test passing
* cleaning up
* new unit test added, major clean up of code
* removed extra code
* minor
* minor fix to tests
* prepack weights code cleaned up
* compacted compute() in attention.cc
* reformat compute()
* making a parameter T
* adding 3 q,k,v buffers in all cases
* fixing build
* running tests only on cpu
* Updating docs
* trigger ci builds
* Addressing comments in PR
* addressing some more comments
* get add_qk_str from add_qk node directly
* updating docs, added extra check to verify attn inputs
* Optimized the extra add by parallelizing
* added attention_shape to symbolic_shape_infer.py
* minor refactoring to address comments
* Changes to ensure the openvino-ep-2021.4 branch is created
* Fix failing cpp and python unit tests
* Fixed Myriad Tests for Ov_2021.4
* Disabled failing python tests for myriad
* Fixes models which were breaking w.r.t 2021.4
* Added fixes to Fix tinyyolov3 working on Myriad
and MaskRcnn, FasterRcnn using GPU_FP32
* Added FP16 output data type support for ngraph
* Implemented ReadNetwork() method
->Using Core::ReadNetwork() method for reading and creating a CNNNework
->Since OpenVINO™ 2020.4 version, Inference Engine enables reading ONNX models
via the Inference Engine Core API and there is no need to use directly the low-level
ONNX* Importer API anymore. To read ONNX* models, it's recommended to use the
Core::ReadNetwork() method that provide a uniform way to read models from ONNX format.
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fixed ngraph f16 supported output type
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Added comments in data_ops.cc
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fixed broken windows build
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Disable failing CPP tests on CPU
Some of the convtranspose tests are failing on
OpenVINO-EP CPU due to accuracy mismatch w.r.t
default CPU. so currently we are disbaling
these tests.
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Updated for ov version 2021.4
* Changes to include qdq ops in code
* Disabled failing python tests on GPU
Disabled two maxpool python tests on
GPU as they were passing but throwing
segfault
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fix the backward compatibility issue
ReadNetwork() API has a bug and will only work
starting from OpenVINO 2021.4 version.
The previous versions will still have to use
onnx importer route
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fix CMakeLists.txt for OpenVINO EP
If a directory with OpenVINO is sourced,
the latest OpenVINO settings have to
be imported.
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
Co-authored-by: sfatimar <sahar.fatima@intel/com>
Co-authored-by: sfatimar <64512376+sfatimar@users.noreply.github.com>
Co-authored-by: Aravind Gunda <aravindx.gunda@intel.com>
* Add ability to generate ios static framework
* Fix typos
* Add pod cache clean, update some comments of previous commit
* Fix CI failure with newly added cpuinfo library
* Update test model (CoreML requires node has a name)
* Addressed CR comments
Fix bug that shows up when running tests (in particular, GraphTransformationTests.ConcatSliceEliminationTest) with more verbose logging level.
There is a log statement that doesn't get evaluated at the default test logging level (warning). It was accessing the first element of an empty vector. This change moves that log statement before the point where that vector is cleared.