* Fixed issue in python cmake to update wheel package
* Fixes python cmake issue for OV EP
Added post build step for libonnxruntime_providers_openvino
that copies the updated libonnxruntime_providers_openvino.so file
to /onnxruntime/capi directory every time this target is rebuilt.
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Removed post_build step from onnxruntime_python.cmake
Now that we have added the post build step to copy
onnxruntime_providers_openvino.so and providers_shared.so
to /onnxruntime/capi directory in onnxruntime_providers.cmake file.
so removing the duplication of the same from here.
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Fixed python cmake issue for OpenVINO-EP
->Fixed issue for both Linux and windows
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
Co-authored-by: MaajidKhan <n.maajidkhan@gmail.com>
* If unit tests are manually excluded via `--cmake_extra_defines onnxruntime_BUILD_UNIT_TESTS=OFF` (e.g. testing changes to binary size where you want to keep the build time as quick as possible) it should still be possible to create the python bindings.
Update CMakeLists.txt to decouple the inclusion of onnxruntime_python.cmake from unit tests being enabled.
Update onnxruntime_python.cmake so it works when unit tests are disabled. Also skip copying of test py files when unit tests are disabled.
* Add robust dependency check for Python package
* Add version_info.py to .gitignore
* Fix Linux build
* Fix Windows CPU build
* Fix Windows 32-bit build
* Minor tweak
* Generate version_info.py earlier in onnxruntime_python.cmake
* Print a user-friendly message if cuDNN is not found in
* Relax version requirements for CUDA 11 - only the major version has to match
* Fix PATH environment variable to include CUDA 11 in 'Python packaging pipeline' (Windows/GPU)
* Fix the build with cuDNN 7
1. Merge Nuget CPU pipeline, Java CPU pipeline, C-API pipeline into a single one.
2. Enable compile warnings for cuda files(*.cu) on Windows.
3. Enable static code analyze for the Windows builds in these jobs. For example, this is our first time scanning the JNI code.
4. Fix some warnings in the training code.
5. Enable code sign for Java. Previously we forgot it.
6. Update TPN.txt to remove Jemalloc.
Update ORT model conversion script
- add args for specifying optimization level and whether to use NNAPI
- add logic to create a list of required ops and ORT format model that can be used with NNAPI
* Remove nGraph Execution Provider
Pursuant to nGraph deprecation notice: https://github.com/microsoft/onnxruntime/blob/master/docs/execution_providers/nGraph-ExecutionProvider.md#deprecation-notice
**Deprecation Notice**
| | |
| --- | --- |
| Deprecation Begins | June 1, 2020 |
| Removal Date | December 1, 2020 |
Starting with the OpenVINO™ toolkit 2020.2 release, all of the features
previously available through nGraph have been merged into the OpenVINO™
toolkit. As a result, all the features previously available through
ONNX RT Execution Provider for nGraph have been merged with ONNX RT
Execution Provider for OpenVINO™ toolkit.
Therefore, ONNX RT Execution Provider for **nGraph** will be deprecated
starting June 1, 2020 and will be completely removed on December 1,
2020. Users are recommended to migrate to the ONNX RT Execution Provider
for OpenVINO™ toolkit as the unified solution for all AI inferencing on
Intel® hardware.
* Remove nGraph Licence info from ThirdPartyNotices.txt
* Use simple Test.Run() for tests without EP exclusions
To be consistent with rest of test code.
* Remove nGraph EP functions from Java code
Transitions from the ORT-only DML NuGet (hosted on the onnxruntime_public feed) to the new unified DirectML NuGet (Microsoft.AI.DirectML) on nuget.org. In addition, the Microsoft.AI.MachineLearning (WinML) and Microsoft.ML.OnnxRuntime.DirectML packages now take a dependency on the Microsoft.AI.DirectML package. This means we can remove the extra copy of DML binaries in these packages since they will be installed by the DML package.
The ROCm EP is designed and implemented based on AMD GPU software stack named ROCm. Here is the link for the details about ROCm: https://rocmdocs.amd.com/en/latest/
ROCm EP was created based on the following things:
1. AMD GPU programming language: HIP
2. AMD GPU HIP language runtime: amdhip64
3. BLAS: rocBLAS, hipBLAS
4. DNN: miOpen
5. Collective Communication library: RCCL
6. cub: hipCub
7. …
Current status:
BERT-L and GPT2 training can be ran on AMD GPU with data parallel.
Next:
1. Make more GPU code be sharable between ROCm EP and CUDA EP since HIP language and HIP runtime API are very close to CUDA.
2. Continue improving the implementation.
3. Continue GPU kernel optimization.
4. Support model parallelism on ROCm EP.
……
The rocm kernels have been removed from this commit and will be in a separate PR. Since the original PR was too big(~180 files), it was suggested to split the PR into two parts, one is rocm-kernels, the other is non rocm kernels.
Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
Co-authored-by: sabreshao <sabre.shao@amd.com>
Co-authored-by: anghostcici <11013544+anghostcici@users.noreply.github.com>
Co-authored-by: Suffian Khan <sukha@microsoft.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
* Move fbs include from header to cc
* add initial cmake for flatbuffers
* Move most flatbuffers util to ort_flatbuffers
* move code around
* fix
* move test/perf runner to use flatbuffer directly instead of model
* minor update
* Fix build break
* Clean up includes and foward decl
* Fix traning CI build breaks
* Addressed PR comment, replaced some include with forward decls
* Remove ORT_MUST_USE_RESULT temporarily
* Fix places where MinSizeRel wasn't having relevant flags added in the same way as Release and RelWithDebInfo
Enable LTO for minimal build. Cleanups onnx_minimal.cmake to remove some things handled when LTO is enabled in CMakeLists.txt
* Only enable LTO for MSVC in a minimal build
Improve quantization tools:
1. Support QAT
2. Make quantization tool to register Operators.
3. Make the API clear to use
Co-authored-by: t-yguo <t-yguo@microsoft.com>
* Add ORTTrainerOptions class for the new pytorch frontend (#4382)
Add ORTTrainerOptions class and some placeholders
* Add _ORTTrainerModelDesc to perform validation for model description (#4416)
* Add Loss Scaler classes to the new frontend (#4306)
* Add TrainStepInfo used on the new frontend API (#4256)
* Add Optimizer classes to the new frontend (#4280)
* Add LRScheduler implementation (#4357)
* Add basic ORTTrainer API (#4435)
This PR presents the public API for ORTTrainer for the short term
development.
It also validates and saves input parameters, which will be used in the
next stages, such as building ONNX model, post processing the model and
configuring the training session
* Add opset_version into ORTTrainerOptions and change type of ORTTrainer.loss_fn (#4592)
* Update ModelDescription and minor fix on ORTTrainer ctor (#4605)
* Update ModelDescription and minor fix on ORTTrainer/ORTTrainerOptions
This PR keeps the public API intact, but changes how model description is stored on the backend
Currently, users creates a dict with two lists of tuples.
One list called 'inputs' and each tuple has the following format tuple(name, shape).
The second list is called 'outputs' and each tuple can be either tuple(name, shape) or tuple(name, shape, is_loss).
With this PR, when this dict is passed in to ORTTrainer, it is fully validated as usual.
However, tuples are internally replaced by namedtuples and all output tuples will have
tuple(name, shape, is_loss) format instead of is_loss being optionally present.
Additionally to that normalization in the internal representation (which eases coding),
two internal methods were created to replace a namedtuple(name, shape) to namedtuple(name, shape, dtype)
or namedtuple(name, shape, is_loss, dtype) dependeing whether the tuple is an input or output.
This is necessary as ORTTRainer finds out data types of each input/output during model export to onnx.
Finally, a minor fix was done on ORTTrainer. It could initialize ORTTrainerOptions incorrectly when options=None
* Rename input name for test
* Add ONNX Model Export to New Frontend (#4612)
Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
* Create training session + minor improvements (#4668)
Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
* Save ONNX model in file (#4671)
Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
* Add eval step (#4674)
Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
* Add train_step (#4677)
Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
* Add LR Scheduler (#4694)
Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
* Add deterministic compute tests (#4716)
Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
* Add legacy vs experimental ORTTrainer accuracy comparison (#4727)
Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
* Add Mixed precision/LossScaler + several fixes (#4739)
Additionally to the mixed precision/loss scaler code, this PR includes:
* Fix CUDA training
* Add optimization_step into TrainStepInfo class
* Refactor LRSCheduler to use optimization_step instead of step
* Updated several default values at ORTTrainerOptions
* Add initial Gradient Accumulation supported. Untested
* Fix ONNX model post processing
* Refactor unit tests
* Add ONNX BERT example + minor fixes (#4757)
* Fix training issue when passing ONNX file into ORTTrainer
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
* Add Dynamic Shape support (#4758)
* Update DeepSpeed Zero Stage option to a separate option group (#4772)
* Add support to fetches (#4777)
* Add Gradient Accumulation Steps support (#4793)
* Fix Dynamic Axes feature and add unit test (#4795)
* Add frozen weights test (#4807)
* Move new pytorch front-end to 'experimental' namespace (#4814)
* Fix build
Co-authored-by: Rayan-Krishnan <rayankrishnan@live.com>
Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
* test
* test
* add missing CUDA header include
* debug
* fix
* fix python package for dnnl and tensorrt.
* fix
* fix windows build.
* revert
* target_link_directories for tensorrt shared lib.
* Add python API for specifying CUDA device id
* Modification for providing session based python api for specifying
device id
* When include header file pybind11/stl.h, conversion between c++
containers and Python list, vector and dict data structure are
automatically enabled.
https://pybind11.readthedocs.io/en/stable/advanced/cast/stl.html#
Therefore, refactor the code for better leverage this advantage.
* Make struct CudaDeviceOptions as default cuda device options
* Implement sess.set_providers(list_of_providers, list_of_provider_option_dicts)
But still stay consistent with existing sess.set_providers(list_of_provider)
* Add cuda provider option default setting
* Add support for setting cuda cuda_mem_limit and arena_extend_strategy.
Also resolved the merge conflict on session.py
* Use python ctypes to call cuda library to help python unittest
* Refine the code with reviewer's suggestions
* Add the capability of getting execution provider's configuration
- Once we introduced the capability to set execution provider's
configuration, it makes sense to add capability of getting ep's configuration.
* Modify the code with reviewer's suggestions.
* Using stoull() and stoul() depends on 32/64-bits architecture.
* Rewrite the testcases for testing setting CUDA device id
Note: We need to make sure every ORT process be run on one CUDA device
at a time.
* Make sure old session object is destroyed by python gc before new
session object is being created
* Move testcases to original onnxruntime_test_python.py
* Fix bugs to pass CI build
* Make it pass CI build (cont.)
* Make it pass CI build (cont.)
* expose ACL/ARMNN providers to python
* add -acl / -armnn to package name when use_acl / use_armnn is specified
* build python wheel for ARMNN EP
* link ACL/ARMNN EPs into onnxruntime_pybind11_state
* wrong argument order in build_python_wheel for wheel_name_suffix
* Add amd migraphx execution provider to onnx runtime
* rename MiGraphX to MIGraphX
* remove unnecessary changes in migraphx_execution_provider.cc
* add migraphx EP to tests
* add input requests of the batchnorm operator
* add to support an onnx operator PRelu
* update migrapx dockerfile and removed one unused line
* sync submodules with mater branch
* fixed a small bug
* fix various bugs to run msft real models correctly
* some code cleanup
* fix python file format
* fixed a code style issue
* add default provider for migraphx execution provider
Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>
* initial change to transformer.py
* prepare e2e transformer tests
* refactor transformer tests
* put test python files in a flat folder
* fix typo pip install transform(s)
* python 3.6
* python version to 3.6 in install_ubuntu.sh
* remove argparser
* to use opset ver 12
* workaround loss_scale naming patch in case of loss_fn_
* assign self.loss_fn_ so it can be checked
* skip a few un-needed post-process steps
* fix loss_scale_input_name, clean up post process steps
* skip non-frontend tests
* move cpu/cuda related files to coresponding cpu/cuda folder (#3668)
Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
* type cast for ratio is not necessary for dropout (#3682)
Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
* thrustallocator is not needed since cub is used directly for gather now. (#3683)
Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
* GatherND-12 Implementation (#3645)
* Renamed, UT passing
* Move GatherND CUDA Kerenl into onnxruntime
* Merge GatherNDOpTest
* Refactor Test code
* Merge CPU Kernel Impl
* Handle Negative Indice, Fix UT
* Improve CUDA kernel to handle negative index
* Minor Fixes
* Preserve GatherND-1 Cuda kernel
* Fix Mac build
* fix UT
* Fix Build
* fix GatherNDOpTest.double > CUDA error cudaErrorInvalidDeviceFunction:invalid device function
Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: Peng Wang (pengwa) <pengwa@microsoft.com>
* update with reviewers' comments
* testBertTrainingGradientAccumulation was not using rtol and may fail occasionally with small (e-06) difference
* fix merge mistakes
Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: Weixing Zhang <weixingzhang@users.noreply.github.com>
Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
Co-authored-by: Sherlock <baihan.huang@gmail.com>
Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: Peng Wang (pengwa) <pengwa@microsoft.com>
* Added FP16 transformations
* Revert "Added CMAKE_BUILD_TYPE to make building dynamic"
This reverts commit d3e17af1af655cfdc4d2fec33f52055caa525e85.
* Added FP16 transformations for FP16 builds
* Backend logic cleanup
Cleans the backend(intel_graph.*) code in the following ways:-
1. Minimize global usage: Since all the IR graphs need to be
re-generated on every Infer, it is bad practice to rely on globals
for their saving and usage as there would be multiple readers and
writers to the same global variable leading to incorrect usages or
contentions. This change replaces globals with locals where possible.
This change also fixes an existing bug with due to
incorrect global usage.
2. Remove all unused functions.
3. Remove all unused headers and prepocessor directives.
* removed commented out code
* Disabled default optimization for Intel EP
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Fix missed plugins.xml for python bindings
* Fixed the build after latest master changes
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Disabled unsupported ops for accelerators
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Added some more disabled ops
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Added environment variable to enable debugging
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Added more debug statements
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Fixed unsupported ops list for GPU and VPU
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Fixed unsqueeze unit tests
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Added error message to the status
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Overwrite Model proto with shape info from data
Overwrites the shape info of Model proto with the shape from
actual input data. Needed for inferring models with Dynamic
shapes.
* Removed print statement and disabled where op
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
* Disabled Reshape with Empty initializer
* Added more debug statements for 1P
* Don't allow 1D inputs with symbol for dimension
* Disabled some 3rd phase ops
* Disabled split and added zero dimension check for OutputDefs
* Cleanup zero dimensionality check
* Added different data type check for inputs and initializers
* Added conditions for Mod, Cast and Pad
* Removed unused variable
* Disabled scan and added conditions for squeeze
* Added changes for fixing all C++ unit tests
* Implements Backend Manager class for caching
Backend Manager provides a layer of indirection between EP interface
and OV backend that provides caching services for models with
symbolic dims in input shapes.
* clean up commented blocks
* clang-formatting
* Read I/O type info from ModleProto
Read the tensor element type information from ModelProto object,
as FusedNode is no longer available.
* code cleanup
* clang-formatting
* Added print statement for jenkins
* Disabled some python tests
* Changed the path of convert fp32 to fp16 hpp
* Added conditions for BatchNorm in GetCapability
* Fixed failed tests
* Revert "Added conditions for BatchNorm in GetCapability"
This reverts commit c3c28c3b00d27892c42546b35dacdd807a48ee90.
* Added Intel to onnxruntime backends
* pick up vars set by OV package setupvars.sh
* Added conditions for Identity
* remove a few cout prints
* Added conditions for GPU_FP32 unit tests
* Revert "pick up vars set by OV package setupvars.sh"
This reverts commit 8199e029c03eae21a1a7ef6bfdc93d00e5d0198b.
* Commented out fatal message for protobuf
* Might need to be removed
* Add interface class for current backend
* moved common logic to base class
* simplified cpu backend
* Removed unused headers
* use vectors to save i/o tensors for windows compatibility
* move utils fxns to backend_utils namespace
* rename ov_backend to ibackend
* Factory pattern for backend creation
* rename CPU backend to Basic backend
* renamed to vad-M and added to factory list
* Added conditions for VPU
* Added print statements
* Changed the logic for checking for symbolic shapes
* Modified logic for zero dimension check
* Removed VPU single dimension condition
* Removed comments
* Modified logic in DimensionCheck method
* Remove legacy OpenVINO EP
Remove all the legacy code for OpenVINO EP. UEP code will take its
place going forward.
This change does NOT remove OVEP files in the following areas asa
they will be reused by UEP:-
1. Documentation: All .md files
2. Docker releated files
3. Python bindings
4. Java bindings
5. C# bindings
6. ORT Server
7. CI pipeline setup files
* Rename Intel EP to OpenVINO EP
* Added unique names to the subgraphs
* Removed subgraphs with only constant inputs
* Modified subgraph partitioning algorithm to remove const input subgraphs
* Apply suggestion to onnxruntime/core/providers/openvino/openvino_execution_provider.cc
* Tracking output names to fix the output order bug
* Changed output names to a unordered map
* Modified logic to check for symbolic input shapes
* Fixed a bug in Reshape check
* Added empty model path to Model constructor
* Made necessary changes to cmake to build from the binary package
* Changed INTEL_CVSDK_DIR to INTEL_OPENVINO_DIR
* Enable dyn device selection with C++ API
* Added Round operator to unsupported list
* Modified subgraph partition logic for MYRIAD
* Removed supported ops from the list
* Enable dyn dev selection in Py API's
* Add documentation for dynamic device selection
* Use MYRIAD || HDDL instead of VPU
* Removed temporary cast of Int64 to FP32
* Disabled unit Tests for CPU_FP32 and GPU_FP32
* Removed default "CPU" from unit tests to allow overriding
* Removed ops Concat, Squeeze, Unsqueeze from unsupported list
* Get the device id from info
* Removed overwriting device_id and precision
* Enabled ConvTranspose and EyeLike
* Reordered unsupported ops in alphabetical order
* Fixed syntax error
* Fixed syntax error
* Code clean-up: Handle exceptions, logs and formatting
Code formatted according to ORT coding guidelines.
* remove debug print from pybind code
* updated docs with ops and models
* formatting prints
* Added default values for c and j for openvino
* Overriding the values set for c and j to be 1
* BACKEND_OPENVINO should be empty if openvino is not in build
* Overriding c value with default for perftest
* fix VAD-M device string bug
* Add IE error details to exceptions
* Use IE specific device names in EP
* Add VAD-F (FPGA) device support
* Removed unecessary libraries from whl package
* Code changes for Windows compatibility
* Add VAD-F option to python API
* [revert before merge] cmake changes for RC
* Enable Windows build in CMake
* Unset macro OPTIONAL for windows builds
inference_engine.hpp's include chain defines a macro 'OPTIONAL'
which conflicts with onnx project's headers when using MSVC. So
would need to explictly unset it for MSVC.
* Use a single copy of plugin/IE::Core
Defined as a static member in Backend manager
* Remove restriction of single subgraphs for myriad
* Passed subgraph name to Backend to enhance log statements
* Disabled zero dimension conditions
* Disabled concat to remove zero dims
* Enabled building ngraph as part of ORT
* Removed serializing and added versioning
* Fix CPU_FP32 unit tests
* Removed unecessary condition
* add ngraph.so.0.0 to .whl
* Check for zero dimensions only for inputs and outputs
* Restrict loading only 10 subgraphs on myriad
* Build ngraph.dll within UEP. Doesn't link yet
* Rename Linux included libngraph.so to libovep_ngraph.so
Renames locally built libngraph.so containing ONNX importer to
libovep_ngraph.so in order to avoid linkage conflicts with
libngraph.so supplied by OpenVINO binary installer.
Applies only for Linux builds.
* use output_name cmake properties for lib name
* fix .so name format in lib_name.patch
* CMake code cleanup
* Rename WIN32 included ngraph.dll to ovep_ngraph.dll
To avoid conflict with ngraph.dll distributed by openvino.
* Added myriad config for networks without 4 dimensions
* Loading the 10 max clusters for inference on myriad
* Refactor code and add Batching support
Encapsulate subgraph settings into context structs.
Add batching support for completely supported models.
* Disabled some broken tests
* use input_indexes to avoid batch-checking initializers
* Avoid static initialization order error on WOS
* Added candy to broken tests
* InternalCI changes for 2020.2
* Updated DLDT instructions
* Unsaved changed in install_openvino.sh
* Changes after manual check
* Remove custom ngraph onnx_import build for WOS
ONNX Importer on WOS does not have protobuf issue.
* Remove FP32ToFP16 ngraph pass
This conversion is performed implicitly within IE.
* Surround debug logic by #ifndef NDEBUG
* remove invalid TODO comments
* removed references to ngrpah-ep
* clang-formatting
* remove commented code
* comment edits
* updating copyright year to that of first OpenVINO-EP release
* remove redundant log msg
* Modified operator and topology support
* Update build instructions
* doc formatting
* Fixed clip unit tests
* Revert "Remove FP32ToFP16 ngraph pass"
This reverts commit ec962ca5f315a5658ad980e740196f19de2639c1.
* Applying FP16 transformation only for GPU FP16
* Fixed GPU FP32 python tests
* automatically use full protobuf
* disable onnxrt server for now
* Disabled upsample
* update dockerfile instructions
* Removed MO paths and added ngraph path
* Remove OVEP from ORT Server docs
Will put it back in after validation
* Updated path to Ngraph lib
* Disabled Resize and some other python tests
* Removed unnecesary header files
* Use commit SHA to fetch ngraph repo
* Avoid un-needed file changes due to version update
* Fixed clip tests
* Fixed Pow, max and min onnx tests
* build.md doc typo
* Update cmake patch command for ngraph src
* remove dead cmake code for onnxruntime_USE_OPENVINO_BINARY
* use spaces instead of tab
* remove commented code
* Add info about protobuf version
* edit debug env var and enable for WIN32
* specify only version tag of 2020.2 for dockerbuilds
* remove unnecessary file changes
* Pass empty string as default argument to C# tests
* Use ${OPENVINO_VERSION} to name openvino install directory in CI builds
* Enabled unnecessarily disabled tests
* Fixed ngraph protobuf patch
* Fixed error in protobuf patch
* Revert "Use ${OPENVINO_VERSION} to name openvino install directory in CI builds"
This reverts commit 89e72adb8bf3b9712f5c81c5e13fe68c6c0df002.
* Remove unsetting OPTIONAL macro
This is no longer used in recent ONNX update onnx/onnx@da13be2,
so this unset workaround is no longer necessary.
* Use a null string default argument for C# API
* Set OpenVINO version yml files and pass to CI Docker builds
Git Tag info for DLDT as well as install directory are set
using this value.
This reverts commit 9fa9c20348ed72ae360a95c98e9b074d2f9fafc5.
* Documentation: recommendation and instructions for disabling ORT graph optimizations
* more doc updates
* Reduced the number of models according to CI time constraints
Co-authored-by: ynimmaga <yamini.nimmagadda@intel.com>
Co-authored-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
Co-authored-by: Mikhail Treskin <mikhail.treskin@intel.com>
Co-authored-by: mbencer <mateusz.bencer@intel.com>
Co-authored-by: Aravind <aravindx.gunda@intel.com>
Co-authored-by: suryasidd <48925384+suryasidd@users.noreply.github.com>
1. Add support for vstest.
2. Add support for vcpkg. To use it:
```bat
vcpkg install zlib:x64-windows benchmark:x64-windows gtest:x64-windows protobuf:x64-windows pybind11:x64-windows re2:x64-windows
mkdir build
cmake ..\cmake -DCMAKE_BUILD_TYPE=Debug -A x64 -T host=x64 -DCMAKE_TOOLCHAIN_FILE=C:\vcpkg\scripts\buildsystems\vcpkg.cmake -DVCPKG_TARGET_TRIPLET=x64-windows -Donnxruntime_PREFER_SYSTEM_LIB=ON
```
3. New cmake option: onnxruntime_PREFER_SYSTEM_LIB, which allows user using the preinstall libs instead of the things in onnxruntime submodule.
4. New cmake option: onnxruntime_ENABLE_MEMLEAK_CHECKER, which allows user turn on/off the memory leak checker by @RyanUnderhill in Windows Debug Build. The checker doesn't work with vstest.
4. Fix the post merge pipeline(Mainly for test coverage report).
5. Ignore the compile warning from the Featurizer library code
6. Apply "/utf-8" VC compile flag to our code. Without this, you can't build onnxruntime on Chinese Windows.
7. Remove the SingleUnitTestProject cmake option because it's deprecated more than one year and nobody is using it.
8. Move opaque api tests to onnxruntime_test_all
9. Enable "/W4" on CUDA ep's C++ code(Not the *.cu files), and fix some warnings, add some extra checks.
10. Delete the onnxruntime::test::TestEnvironment class.
11. Add a DLLmain for onnxruntime.dll.
12. Allow dynamic link to libprotobuf
1. Enable warning "4503" # Decorated name length exceeded.
2. Enable warning "4146" # unary minus operator applied to unsigned type.
3. Enable float64 support for the Softmax operator
4. Enable compliance checks for Windows x86 32bits build
5. Use TryBatchParallelFor to replace some fallback code in mlas pooling.cc
6. Fix Android CI pipeline.
* added cache version for nuphar JIT binaries
Previously, when the user wrongfully loaded a JIT binary generated
from a Nuphar version different from the current used one, she
would get mysterious runtime failures, because we didn't perform
any version check on JIT binaries.
This change added cache versions to the Nuphar runtime and
JIT binaries. The Nuphar runtime will issue verbose message that
informs the user version-mismatch errors.
* address CR feedback
* include NUPHAR_CACHE_VERSION in python wheel