Fix minor bug in qdq quantization tool
Motivation and Context
The qdq quantization tool removes a Relu node if it can be merged into its input node. When performing the removal, we forgot to check whether that input is actually a graph input.
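The missing check can be sketched as follows. This is a minimal illustration, not the tool's actual code: `Node`, `Graph`, and `can_remove_relu` are hypothetical stand-ins for the ONNX graph structures, and the rule shown (keep the Relu when its input is a graph input, because no producer node exists to merge into) is an assumption based on the description above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical, simplified stand-in for an ONNX node."""
    op_type: str
    inputs: List[str]
    outputs: List[str]


@dataclass
class Graph:
    """Hypothetical, simplified stand-in for an ONNX graph."""
    inputs: List[str]  # names of the graph-level inputs
    nodes: List[Node] = field(default_factory=list)


def can_remove_relu(graph: Graph, relu: Node) -> bool:
    """Return True only if the Relu's input is produced by another node."""
    if relu.op_type != "Relu":
        return False
    source = relu.inputs[0]
    # A graph input has no producer node, so there is nothing to merge into.
    if source in graph.inputs:
        return False
    return any(source in node.outputs for node in graph.nodes)


conv = Node("Conv", ["x", "w"], ["conv_out"])
relu_after_conv = Node("Relu", ["conv_out"], ["y"])
relu_on_input = Node("Relu", ["x"], ["z"])
graph = Graph(inputs=["x"], nodes=[conv, relu_after_conv, relu_on_input])
```

Here `can_remove_relu(graph, relu_after_conv)` is true, while `can_remove_relu(graph, relu_on_input)` is false because `x` is a graph input.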
Python module for dumping activation tensors when running an ONNX model
This is the first step towards a quantization debugging tool. We dump the activation tensors. The next step is to compare them (original model vs. quantized model, run with the same input) to see where the difference becomes significant.
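The comparison step could look roughly like the sketch below. It is not part of this change: `max_abs_diff`, `first_divergent_tensor`, and the threshold are hypothetical. It assumes the dumped activations have been flattened to lists of floats keyed by tensor name, with the dictionary in execution order.

```python
from typing import Dict, List, Optional


def max_abs_diff(a: List[float], b: List[float]) -> float:
    """Largest element-wise absolute difference between two flat tensors."""
    if len(a) != len(b):
        raise ValueError("activation tensors differ in size")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def first_divergent_tensor(
    original: Dict[str, List[float]],
    quantized: Dict[str, List[float]],
    threshold: float,
) -> Optional[str]:
    """Return the first tensor name whose difference exceeds the threshold."""
    for name, values in original.items():
        if name not in quantized:
            continue  # tensor may have been fused away in the quantized model
        if max_abs_diff(values, quantized[name]) > threshold:
            return name
    return None


# Made-up activations for illustration only.
original = {"conv1": [0.10, 0.20], "relu1": [0.10, 0.20], "fc": [1.00, 2.00]}
quantized = {"conv1": [0.11, 0.20], "relu1": [0.10, 0.21], "fc": [1.30, 2.00]}
divergent = first_divergent_tensor(original, quantized, threshold=0.05)
```

With these made-up values, `divergent` is `"fc"`, the first tensor whose difference exceeds the threshold.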
* Load checkpoint in cpp
* removed unused imports
* throw error on invalid name and change function name
* inplace model assignment, change name and other comments resolved
* name change on import
* Added unit test, resolved comments
* remove unused imports
* resolved comments
* refactoring to reduce memory allocation
* resolved extra comments
* changed file hierarchy and force added onnx model
* fixed order of function arguments
* used gtest macros on test cases
Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Working on JNI refactor for OnnxTensor.
Simplifying the error handling logic in createTensor.
Collapsing casting branches and migrating to ONNX element type enum.
Disable cpplint for JNI C files.
* adding conditional variable again
* Adding split test cases in python
* Adding python cases for split
* Enable s8s8 split
* Optimize input
* Revert "Remove git and python packages from the docker images used by Zip-Nuget-Java-Nodejs Packaging Pipeline (#11651)"
This reverts commit d5e34acb
* Revert "Revert "Remove git and python packages from the docker images used by Zip-Nuget-Java-Nodejs Packaging Pipeline (#11651)""
This reverts commit 3c1a330dd3afeb55aa7eabb8ebea39b6deb37bad.
* format file
* Update c-api-linux-cpu.yml
* Update c-api-linux-cpu.yml
* Update c-api-linux-cpu.yml
* Reformat file
* Reformat file
* format file
* Optimize input
* Remove unused import
* Remove useless init
* Format split.py with black
* set zero point to 0 if all values are 0.0
* fix bug: lower version of numpy.finfo doesn't have smallest_subnormal
* check scale to make sure it is not subnormal
* Work around false positive error produced by clang
ROCm's hip clang complains that "use 'template' keyword to treat 'Foo' as a dependent template name"
even though Foo is not a dependent template name. Avoiding the auto keyword fixes the error
here.
* Split GemmBase RocBlasGemm
* Add composable kernel GEMM baseline
* Make linter happy
* Address review comment
* Update bert cases with batchsize
* Adjust includes to fix IWYU lint
* Only build and link the ck kernels that are used, to improve build time
* Remove warmup run on SelectImpl
* Add comment to utility function
* Mute cpplint
* Make RocBlasGemm<T>::SelectImpl semantically correct
* Add reduced basic test cases for ck gemm
* More robust gemm testing
* Fix warnings
* Fix grammar
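The zero-point and scale items earlier in this list ("set zero point to 0 if all values are 0.0", "check scale to make sure it is not subnormal") can be sketched as below. This is not the tool's actual code: `compute_scale_zero_point` and `FLOAT32_TINY` are hypothetical names, and falling back to a scale of 1.0 is an assumption. The threshold is hard-coded as the smallest normal float32 value, so the sketch does not rely on `numpy.finfo(...).smallest_subnormal`, which older numpy versions lack.

```python
FLOAT32_TINY = 2.0 ** -126  # smallest positive normal float32 value


def compute_scale_zero_point(rmin, rmax, qmin=0, qmax=255):
    """Return (scale, zero_point) for mapping [rmin, rmax] onto [qmin, qmax]."""
    # The real range must contain 0.0 so that zero is exactly representable.
    rmin = min(rmin, 0.0)
    rmax = max(rmax, 0.0)
    scale = (rmax - rmin) / float(qmax - qmin)
    if scale < FLOAT32_TINY:
        # Covers the all-zero case (scale == 0) and subnormal scales.
        # Assumed fallback: scale 1.0 with zero point 0.
        return 1.0, 0
    zero_point = int(round(qmin - rmin / scale))
    return scale, zero_point
```

For example, `compute_scale_zero_point(0.0, 0.0)` and `compute_scale_zero_point(0.0, 1e-40)` both take the fallback path, while `compute_scale_zero_point(-128.0, 127.0)` yields a scale of 1.0 and a zero point of 128.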
Fix comparison of path characters when checking for ".ort" suffix.
Some clean up of InferenceSession Load functions.
- Reduce duplication between std::string/std::wstring versions.
- Renaming for clarity.
* first draft
* plus fixes
* plus more links
* Plus updates per review
* plus more clarifications
* plus updates
* plus more nit fixes
* plus some additions
* Update to handle multiline declarations for the kernels, which are typical these days.
* Update to new path for the cpu contrib_op kernel registrations.
* Update tools/python/find_optimizer_opset_version_updates_required.py
Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
* [ROCm] Add InstanceNormalization Op
* Enable InstanceNormBatch1_fp16 and InstanceNormBatch2_fp16 for ROCm
* [ROCm] Add BatchNormalization for fp32 and fp16
* Enable BatchNormTest for ROCm
* [ROCm] Add LRN Op
* [ROCm] replace miCompat functions with Helper functions