catch symbolic shape inference exception.
do not prune graph when there is an inner graph (Loop/If/Scan)
add a wrapper for numpy_helper.to_array so that we can debug an onnx graph without external data
remove fuse_mask that is not used any more in onnx_model_bert_tf.py
* Use positivity everywhere; handle negative index in Slice
* limit positivity to inputs
* make handle_negative_index private
* strengthen sympy comparison
* further strengthen comparison and a minor refactoring
* Add flip test
* Fall through if -int_max in handle_negative_index()
* minor fix for infer_Concat to include initializers
* Add more tests
* use simplify
* more tests
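
A minimal sketch of the negative-index handling for Slice described in the bullets above, assuming plain Python ints instead of sympy expressions. The name normalize_slice_index and the INT_MAX sentinel are hypothetical and only illustrate the idea; this is not the actual handle_negative_index code.

```python
import sys

# Hypothetical sentinel: exporters often use a huge negative value
# to mean "slice from the very beginning".
INT_MAX = sys.maxsize


def normalize_slice_index(index, dim):
    """Map a possibly negative Slice index to a non-negative one.

    index: start or end value of the slice (int)
    dim:   size of the sliced dimension (int, assumed positive)
    """
    if index <= -INT_MAX:
        # Fall through: leave the sentinel untouched so the caller
        # can clamp it later.
        return index
    if index < 0:
        # A negative index counts from the end of the dimension.
        return index + dim
    return index


print(normalize_slice_index(-1, 5))
print(normalize_slice_index(2, 5))
```

With symbolic dimensions the addition index + dim would produce an expression instead of an int, which is why the comparison logic has to be robust.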
* check in early stop search as separate type
* rename to beam search configurations
* update do sample configuration flag help
* rename to configurable search step
* add option groups
* add more unit tests
Co-authored-by: Xiaoyu Liu <xiaoyu@xiaoyu-VM.z4vh1dzj5eoevgybsksdpz2izh.jx.internal.cloudapp.net>
* Update symbolic_shape_infer.py
don't rely on static code infer in _infer_Squeeze_
* checking if dropped axes might be != 1
* Checking opset. Logging assumption that symbolic dimensions are unequal to 1.
* more checks
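
A rough sketch of the Squeeze checks listed above, assuming a shape is a list whose entries are either ints or strings (symbolic dimensions). The function squeeze_shape is hypothetical and much simpler than _infer_Squeeze; it only illustrates the assumption that symbolic dimensions are unequal to 1.

```python
def squeeze_shape(shape, axes=None):
    """Infer the output shape of Squeeze.

    shape: list of ints or strings (strings stand for symbolic dims)
    axes:  optional list of axes to drop; may contain negative values
    """
    rank = len(shape)
    if axes is None:
        # No axes given: drop every dim that is known to be 1.
        # Symbolic dims are assumed to be unequal to 1 and are kept.
        return [d for d in shape if d != 1]

    normalized = {a + rank if a < 0 else a for a in axes}
    out = []
    for i, d in enumerate(shape):
        if i in normalized:
            # A dropped axis with a known size other than 1 is an error.
            if isinstance(d, int) and d != 1:
                raise ValueError(f"cannot squeeze axis {i} with size {d}")
            continue
        out.append(d)
    return out


print(squeeze_shape([1, "batch", 3, 1]))
print(squeeze_shape([1, "batch", 1], axes=[0, -1]))
```

When a symbolic axis is listed in axes, the sketch drops it without complaint; a real implementation would probably log that assumption.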
* Implement qlinear concat and unit test.
Add quantization tools for QLinearConcat and its quantization tests.
* Add kernel def hash for QLinearConcat.
* Change according to PR. Add qdq transformer support for QLinearConcat.
* Add QDQ Transformer unittest. Fix typo on domain.
* remove dup logic of no use.
* fix x86 build error.
* Update operator docs.
* initial dynamic load example
* support load EP in the provider options
* support dynamic load EP in orttrainer
* split the provider interface; fix comments in pr
* remove experiment code
* add test
* remove useless file
* add test model file; fix linux break
* fix linux build and missing file
* fix python build
* fix python build
* fix python binding
* fix python test
* fix runtime path for posix env
* exclude the shared library from minimal build
* fix comments in pr;
* separate the provider shared lib loading
* excluded from minimal / macos / ios build
* skip copying the provider shared lib for minimal build and macos
* fix macos build
* exclude the test for macos build
* exclude from android build
* exclude from web assembly build
* enable the invalid ep test
Co-authored-by: Cheng Tang <chenta@microsoft.com>
* beam search refactoring checkin
* add factory class and deduplicate code
* one step beam search works on gpu
Co-authored-by: Xiaoyu Liu <xiaoyu@xiaoyu-VM.z4vh1dzj5eoevgybsksdpz2izh.jx.internal.cloudapp.net>
* Enabling save/Load blob feature for OpenVINO-EP
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Added changes to enhance save/load feature
->This feature applies only for MYRIAD device target
->cleaned up the code and added error checks
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Enabled the feature only for MyriadX and only for Linux
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fixed compilation issues on windows
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Added changes to fix const subgraph issue
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fixed issues on windows
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Added changes for the feature
-> Removed default location dir dump using cmake
-> Enabled saving blob dumps at the executable path
by default
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Made save/load dump path configurable
-> The save/load blob dump path is now also configurable
using the C/Python APIs.
-> Introduced a flag named blob_dump_path
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Minor fixes added
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fixed python API issues
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Using GetEnvironmentVar to get the path
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fixed python runtime option issue
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fixes import network issue on windows
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Enabled rocm support for graph transformations
* Support for external Hip allocator
* Added const_cast to reinterpret_cast to fix compiler issue
* Another crack at fixing the compile error
* More compilation fixes
* Added compilation flags to load_inline extension
* Added ROCM, ROCM_PINNED constants
* Changes to address PR comments
* Changed gpu identifier from ROCM to CUDA
* Added HIP compilation flag for torch inline functions
* Fixed a typo in header allocator string formatting
* Fix for runtime error with external_cuda_allocator
* Removed cuda/rocm specific code paths for allocators
* More name changes to generic gpu from rocm/cuda
* Removed duplicate allocator creation
* Rename cuda_external_ config options as gpu_external_
* Rename hip_mem_limit to gpu_mem_limit
* Rename cuda_mem_limit to gpu_mem_limit
With this change, the training script no longer needs to differentiate between the CUDA EP and the ROCm EP when setting the mem_limit option.
Co-authored-by: Weixing Zhang <wezhan@microsoft.com>