* special case concat and split when sizes are equal
* add tests for 16 and 32 inputs with same dim
* add tests for 16/64 inputs on concat or 16/64 outputs on split
* try eliminate windows warning
* outter => outer
* Change the strided copy to switch on data size not data type.
Move to header so we can reduce on the enabled types.
Setup type reduction for Concat now that it's using this implementation.
* test running hf bert-large
* try again
* try again
* include other models
* correct names
* disable deberta-v2-xxlarge
* avoid torch.distributed
* add compare json loss and perf for bert-large to test
* fix sed expression
* remove pytest
* add more models
* move unit tests u
* display samples/sec
* support register external ep lib inforation; make eager mode share the same ep pools with training workloads
* fix inference code
* fix build break
* fix the message
* Made rank a template parameter of _UpampleNearestKernel
* Added error checking for rank specified to UpampleImpl
* Added __restrict__ keyboard to input and output arrays in Upsample
The following ops have been added to the DNNL execution provider
Abs, Elu, Exp, Log, *Relu, Round, Sigmoid, Softplus, Sqrt, and Tanh
*Relu op was moved from its individual file to the elementwise operators
The error tolerance for the LogGrad unit test had to be decreased slightly
when using OneDNN. Still investigating why a differet tolerance value is
needed.
DnnlSubgraph::AddKernels() member function was moved to the top of the file
since this is eddited every time a new operator is added to the the execution
provider this places the code at the top which mean less scrooling when
adding new kernels.
Signed-off-by: George Nash <george.nash@intel.com>
* Fuse HardSigmoid with conv.
Add transform test case and FusedConv testcase.
* Limit Conv/HardSigmoid fusion in CpuExecutionProvider.
* Fix typo for arm build.
* change format one place
* Add command to skip tests
* Remove support for OV_2021.3_LTS and ov_2021.1
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Removed request_id parameter from all references
request_id parameter was being used with ov_2020.3
release. Starting from 2020.4 OV release, input_name
paramater is being used instead to get the
KernelContext_GetInput.
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Enabling CI Logs in the branch
* CI Commits to enable logs
* Enable CI Print
* Added Imagescaler op to the supported op's list
Fixes test_tiny_yolo_V2 opset 8 model to support
fully on OV-EP. This model is the older variation
of tiny_yolo_v2 model which has Imagescaler op.
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Added ops to fully support yolov3 model
-Added changes to support yolov3 opset 10 model
fully on CPU_FP32.
-This also increases the operator coverage for GPU
hardware. There by enabling yolov3 model on GPU
with fewer subgraphs.
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Enabling tiny_yolov3 model fully on CPU
->Enabled tiny_yolov3 model fully on CPU.
-> Also reduces the number of subgraphs
to infer this model on GPU
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Adding GatherND op support for CPU and GPU
->This enables yolov3_pytorch model to work
with fewer subgraphs on CPU and GPU Devices.
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fixes Albert model for ISV customer
ConvTranspose op was getting rejected
due to a condition. Fixed it.
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Disabling this 4 cpp tests for openvino-ep
These unit tests are failing with special conditions
for conv_transpose op with output_shape attribute.
so disabling them for now.
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Docker file changes for 2021.4-v3.1
* Remvoing duplicate code
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* ReduceMax No dimension supported
* Fixes failing protobuf issue for docker
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Excluding openvinoep type for convtranpose test
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Disabled 2 Failing convtranspose tests with TensorRT EP
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
Co-authored-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
Co-authored-by: Aravind Gunda <aravindx.gunda@intel.com>
Co-authored-by: sfatimar <sahar.fatima@intel/com>
The previous attempt to enable static analysis (#8842) didn't actually run the static analysis checks.
- Run clang-tidy directly.
- Address static analysis warnings.
* Expose symbols in onnx and protobuf namespaces in python when building with --enable_external_custom_op_schemas
* Add external onnx and protobuf files to wheel
* Added an example to demonstrate external custom ops use-case
* Added a Linux build pipeline to test external custom ops
* Enable selecting custom ops in onnxruntime-extensions.
* Move cmake_helper.py.
* Remove over-indented spaces.
* Add doc.
* Remove onnxruntime-extensions from git submodules, and user should pass path of onnxruntime-extensions for build.
* Modify doc.
* Remove argument --enable_onnxruntime_extensions and use --onnxruntime_extensions_path.
* Fix build error.
* Fix build error.
* Use onnxruntime_extensions_path.
* support both submodule and external source folders
* refinement
* Update cgmanifest.json
* Support building onnxruntime-extensions from either git submodule or pre-pulled path.
* Update doc.
* more standard name
* update docs
* add the copyright header
Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com>
Co-authored-by: Wenbing Li <wenbingl@outlook.com>
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
* seperate the training python module; share the execution proivder instance
* fix build break
* fix cuda test crash; reorg the python module code base
* se correct env
* use provider customized hash func
* fixbuild break
* fix rocm break
* use const ref in argument
* rename the file
* move hash func to trainiing module
* Revert "Cleanup C# bindings to add EP (#8810)"
This reverts commit b21ea00020.
* Add back in a minimal set of changes.
Provide stubs in for a limited set of things
- things called from C# using a static lib of ORT built for mac/ios
- things in OrtApis that are not included in the build by default
- things in OrtApis that are excluded in a minimal build
* Cleanup order or EPs in test
* Fix unused function in ROCM build
Support of sparse initializers with smaller indices data type to save space.
Make the script more efficient by selecting indices data type and checking resulting sparse bytes
Exclude new code from SPARSE_TENSORS