Working on JNI refactor for OnnxTensor.
Simplifying the error handling logic in createTensor.
Collapsing casting branches and migrating to ONNX element type enum.
Disable cpplint for JNI C files.
* adding conditional variable again
* Adding split test cases in python
* Adding python cases for split
* Enable s8s8 split
* Optimize input
* Revert "Remove git and python packages from the docker images used by Zip-Nuget-Java-Nodejs Packaging Pipeline (#11651)"
This reverts commit d5e34acb
* Revert "Revert "Remove git and python packages from the docker images used by Zip-Nuget-Java-Nodejs Packaging Pipeline (#11651)""
This reverts commit 3c1a330dd3afeb55aa7eabb8ebea39b6deb37bad.
* format file
* Update c-api-linux-cpu.yml
* Update c-api-linux-cpu.yml
* Update c-api-linux-cpu.yml
* Reformat file
* Reformat file
* format file
* Optimize input
* Remove unused import
* Remove useless init
* Format split.py with black
* set zero point to 0 if all values are 0.0
* fix bug: older versions of numpy.finfo don't have smallest_subnormal
* check scale to make sure it is not subnormal
* Work around false positive error produced by clang
ROCm's hip clang complains that "use 'template' keyword to treat 'Foo' as a dependent template name"
where Foo is not a dependent template name. Avoiding the auto keyword fixes the error
here.
* Split GemmBase RocBlasGemm
* Add composable kernel GEMM baseline
* Make linter happy
* Address review comment
* Update bert cases with batchsize
* Adjust includes to fix IWYU lint
* Only build and link the ck kernels that are actually used, to improve build time
* Remove warmup run on SelectImpl
* Add comment to utility function
* Mute cpplint
* Make RocBlasGemm<T>::SelectImpl semantically correct
* Add reduced basic test cases for ck gemm
* More robust gemm testing
* Fix warnings
* Fix grammar
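The quantization entries above (zero point set to 0 when all values are 0.0, and the check that the scale is not subnormal) can be illustrated with a minimal sketch. This is not the code from the change itself: `sketch_scale_zero_point` and `FLOAT32_TINY` are hypothetical names, and the sketch assumes that comparing the scale against the smallest normal float32 value is an acceptable way to detect a subnormal scale without relying on `numpy.finfo(...).smallest_subnormal`, which older NumPy releases lack.

```python
# Smallest positive *normal* float32 value (what numpy.finfo(numpy.float32).tiny
# reports). Hard-coded here so the sketch does not depend on the NumPy version.
FLOAT32_TINY = 2.0 ** -126


def sketch_scale_zero_point(rmin, rmax, qmin, qmax, tiny=FLOAT32_TINY):
    """Hypothetical helper: asymmetric scale/zero point for [rmin, rmax] -> [qmin, qmax]."""
    # The real-valued range must contain 0.0 so that zero is exactly representable.
    rmin = min(float(rmin), 0.0)
    rmax = max(float(rmax), 0.0)

    if rmin == 0.0 and rmax == 0.0:
        # All values are 0.0: pin the zero point to 0 and use a neutral scale.
        return 1.0, 0

    scale = (rmax - rmin) / float(qmax - qmin)
    if scale < tiny:
        # The scale would be subnormal in float32; fall back to a neutral
        # scale and zero point (an assumption of this sketch).
        return 1.0, 0

    zero_point = int(round(qmin - rmin / scale))
    # Clamp to the quantized range to guard against rounding at the edges.
    return scale, max(qmin, min(qmax, zero_point))
```

Called with an all-zero range, for example `sketch_scale_zero_point(0.0, 0.0, 0, 255)`, the helper returns a scale of 1.0 and a zero point of 0; a range so narrow that the scale would fall below `FLOAT32_TINY` takes the same fallback.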
Fix comparison of path characters when checking for ".ort" suffix.
Some clean up of InferenceSession Load functions.
- Reduce duplication between std::string/std::wstring versions.
- Renaming for clarity.
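The intent of the ".ort" suffix check, with one code path serving both narrow and wide path inputs, can be sketched in Python rather than in the C++ of the actual change. `has_ort_suffix` is a hypothetical name, and the case-insensitive comparison is an assumption of this sketch, not something the commit message states.

```python
import os


def has_ort_suffix(path):
    """Hypothetical helper: True if `path` ends with ".ort" (case-insensitive)."""
    # os.fsdecode accepts str, bytes, and os.PathLike objects, so a single
    # implementation covers every path representation instead of one copy
    # per string type.
    text = os.fsdecode(path)
    return text.lower().endswith(".ort")
```

Normalizing the path to a single text type before comparing keeps the suffix logic in one place, which is the same motivation as reducing duplication between the std::string and std::wstring versions.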
* first draft
* plus fixes
* plus more links
* Plus updates per review
* plus more clarifications
* plus updates
* plus more nit fixes
* plus some additions
* Update to handle multi-line kernel declarations, which are typical these days.
* Update to new path for the cpu contrib_op kernel registrations.
* Update tools/python/find_optimizer_opset_version_updates_required.py
Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
* [ROCm] Add InstanceNormalization Op
* Enable InstanceNormBatch1_fp16 and InstanceNormBatch2_fp16 for ROCm
* [ROCm] Add BatchNormalization for fp32 and fp16
* Enable BatchNormTest for ROCm
* [ROCm] Add LRN Op
* [ROCm] replace miCompat functions with Helper functions
* MatMulInteger + post op fusion
This fuses MatMulInteger with up to 32 binary/elementwise
operators if running on the oneDNN execution provider.
Signed-off-by: George Nash <george.nash@intel.com>
* Remove the un-needed transformer
The MatMulIntegerToFloat transformer is not needed since
the transform done is handled by the MatMulIntegerBinaryEltwise
transformer code.
Signed-off-by: George Nash <george.nash@intel.com>
* Refactor of the post op transformer code
This separates the code that finds the post op
nodes for MatMul and MatMulInteger to reduce code
repetition.
Signed-off-by: George Nash <george.nash@intel.com>
* Minor cleanup based on cpplint
resolved unused-variable build failure
Signed-off-by: George Nash <george.nash@intel.com>
Loosen the following test timeouts:
1. "Test Web Multi-Browsers" stage in "ONNX Runtime Web CI Pipeline": 30min -> 60min
2. Node.js binding default per-case timeout: 30 sec -> 90 sec
Using ensureSymlinkSync with 'dir' might cause permission issues; changed to 'junction' to avoid this.
If the folder generation fails, it will cause the test to fail as well.
* Make multiple-level nested control flow op model work
* find correct input index
* find correct input index (cont.)
* enable nested layer unit tests for TRT EP
* add comment
* add Scan op to the current workaround support for control flow ops
With recent versions of NDK (since 23), the `-O` optimization level compile flag is not being passed when building in the "Release" configuration.
More details here: https://github.com/android/ndk/issues/1740
Our "Release" Android builds have been built without the optimization flag since we upgraded from NDK 21.
This change is a workaround to manually add `-O3` for "Release" Android builds.