* Update to flatbuffers v2.0.0 (#10866)
* Fix Reduced ops pipeline (#10861)
* Fix a couple of issues with the python package tools (#10858)
* Tweaks to the model utils
* Add handling for a dim_value of -1 when replacing the entire input shape. This occurs in models exported from PaddlePaddle
* make pytorch helpers accessible in package
* make QDQ helpers accessible in package
* Fix wrong percentile values returned during calibration (#10847)
* Use numpy.percentile to get the lookup value.
* Use 1.0 as float value rather than integer.
* Add missing cdf parameter for `np.percentile`.
* Use 100. instead of 1.0
* Remove print.
* Update from @yufenglee
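
The percentile fix above comes down to the scale of the argument: `numpy.percentile` expects `q` in the range 0 to 100, not 0 to 1. A minimal pure-Python sketch of that difference follows. The `percentile` helper is a hypothetical stand-in that mimics numpy's default linear interpolation; it is not the calibrator's code.

```python
import math


def percentile(values, q):
    """Linear-interpolation percentile with q on a 0-100 scale.

    Hypothetical stand-in for numpy.percentile's default behaviour.
    """
    if not 0.0 <= q <= 100.0:
        raise ValueError("q must be in [0, 100]")
    data = sorted(values)
    rank = (len(data) - 1) * q / 100.0
    lo = math.floor(rank)
    hi = math.ceil(rank)
    # Interpolate between the two neighbouring samples.
    return data[lo] + (data[hi] - data[lo]) * (rank - lo)


samples = [float(i) for i in range(101)]  # 0.0, 1.0, ..., 100.0

right = percentile(samples, 99.0)  # the 99th percentile
wrong = percentile(samples, 0.99)  # the 0.99th percentile: close to the minimum
```

Passing a fraction where a percentage is expected silently selects a value near the bottom of the distribution, which is consistent with the "wrong percentile values" symptom named in the commit title.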
* Add support for opset 16 to transpose optimizer. (#10841)
* Add support for opset 16 to transpose optimizer.
The only change required is adding GridSample to the layout-sensitive ops. The existing handling for layout transpose works for it because its first input and first output are layout sensitive.
Update the optimizer so it can return an error message if it fails.
* Use separate build directories for full and mobile iOS packages. (#10835)
* Address performance issue with abseil flat_hash_table. (#10819)
When the hash table is returned by value from a cross-DLL call,
lookups fail to find at least some entries even though the table
still contains every entry that was originally there. Revert to
std::unordered_set pending further investigation.
* Mark end of version 11 C API. (#10803)
* Mark end of version 11 C API
* Add static_assert
* avoid using LocalFree on FormatMessageW buffer (#10796)
* remove local free
* Remove local free from onnxruntime
* don't allocate
* Change to use constexpr to satisfy CPU build warning
* Integrate C-API tests into Pipelines for release packages (#10794)
* add c-api test for package
* fix bug for running c-api test for package
* refine run application script
* remove redundant code
* include CUDA test
* Remove testing CUDA EP temporarily
* fix bug
* Code refactor
* try to fix YAML bug
* try to fix YAML bug
* try to fix YAML bug
* fix bug for multiple directories in Pipelines
* fix bug
* add comments and fix bug
* Update c-api-noopenmp-packaging-pipelines.yml
* Remove failOnStandardError flag in Pipelines
* Detect runtime CUDA JIT and warn the user (#10781)
* Use cudaMalloc vs cudaDeviceSynchronize and show the total time
* Update convert_onnx_models_to_ort.py to support runtime optimizations. (#10765)
Add runtime optimization support to ONNX -> ORT format conversion script.
Replace `--optimization_level`, `--use_nnapi`, and `--use_coreml` with a new `--optimization_style` option.
* Add multithreading test and put a lock on nvinfer1::createInferRuntime() for TRT EP (#10714)
* Add multithread unit test and put lock on library call
* update code
* remove debug code
* add comment
* add one session multi-threads inference
* Put lock for build engine all the time
* Update naming and comment
* remove unnecessary lock
* Revert "remove unnecessary lock"
This reverts commit 9c2317b1d2273dec0ebdeb52160bc757839e5edc.
* Fix handling of nodes inserted by NHWC transformer. (#10904) (#10925)
* Revert "Upsample support NHWC (#10554)" (#10917)
This reverts commit bd08f11a58.
Co-authored-by: Yufeng Li <liyufeng1987@gmail.com>
* [python API] Change the import error raised when `C:\Windows\System32\vcruntime140_1.dll` is not found to a warning (#10927)
* remove throw if C:\\Windows\\System32\\vcruntime140_1.dll cannot be found
* Add comments and update warning message
* adding back accidentally removed line
Co-authored-by: gwang0000 <62914304+gwang0000@users.noreply.github.com>
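
The change from a raised error to a warning can be sketched as below. This is an illustration only: `check_runtime_dll` is a hypothetical helper name and the warning text is invented for the example; the actual check in the onnxruntime Python package may look different.

```python
import os
import warnings


def check_runtime_dll(path):
    """Return True if the DLL exists; otherwise warn instead of raising."""
    if os.path.isfile(path):
        return True
    # Previously this situation would have raised an ImportError.
    warnings.warn(
        f"{path} was not found; the Visual C++ runtime may be missing.",
        stacklevel=2,
    )
    return False


# Demonstration with a path that does not exist on any platform.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    found = check_runtime_dll(os.path.join("no_such_dir", "vcruntime140_1.dll"))
```

A warning lets the import proceed when the DLL is available from another location, while still telling the user what to look at if loading fails later.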
* [js] Create npm packaging pipeline (#10886)
* create npm packaging pipeline
* fix indentations
* Update npm-packaging-pipeline.yml for Azure Pipelines
* Update npm-packaging-pipeline.yml for Azure Pipelines
* Update npm-packaging-pipeline.yml for Azure Pipelines
* react-native-ci as a template
* fix typos
* fix template paths
* add a dependency
* change a stage name
* set different artifact name for each package
* fix typo
* Update npm-packaging-pipeline.yml for Azure Pipelines
Set a build Id for node npm package as a parameter
* Update npm-packaging-pipeline.yml for Azure Pipelines
Set a build Id for node npm package as a parameter
* Update npm-packaging-pipeline.yml for Azure Pipelines
* Follow up update for python API checking if `vcruntime140_1.dll` is available (#10927) (#10933)
Co-authored-by: Hariharan Seshadri <hasesh@microsoft.com>
Co-authored-by: Scott McKay <skottmckay@gmail.com>
Co-authored-by: Funtowicz Morgan <mfuntowicz@users.noreply.github.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Dmitri Smirnov <yuslepukhin@users.noreply.github.com>
Co-authored-by: Pranav Sharma <prs@microsoft.com>
Co-authored-by: Ryan Lai <rylai@microsoft.com>
Co-authored-by: Ryan Hill <38674843+RyanUnderhill@users.noreply.github.com>
Co-authored-by: Yi-Hong Lyu <yilyu@microsoft.com>
Co-authored-by: Yufeng Li <liyufeng1987@gmail.com>
Co-authored-by: Guoyu Wang <62914304+gwang-msft@users.noreply.github.com>
Co-authored-by: gwang0000 <62914304+gwang0000@users.noreply.github.com>
Co-authored-by: Sunghoon <35605090+hanbitmyths@users.noreply.github.com>
* Add initial helper for optimizing a QDQ format model for usage with ORT.
If a DQ node has multiple consumers it will end up in multiple QDQ node units. This is complicated to handle as each qdq unit could end up being handled by different execution providers. By duplicating the DQ node we simplify this logic.
Generally the duplicate nodes will disappear when the qdq node unit is converted to a single node with a quantized operator. If there are qdq node units that are not able to be converted to use a quantized operator the ORT cleanup (pending) to drop remaining Q->DQ pairs between fp32 nodes can remove any remaining DQ nodes.
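
The DQ duplication described above can be sketched on a toy graph. The dict-based node layout and the `duplicate_dq_nodes` name are assumptions made for illustration; the real helper works on ONNX protobuf nodes and also has to deal with graph outputs and node ordering, which this sketch ignores.

```python
import copy


def duplicate_dq_nodes(nodes):
    """Give every consumer of a DequantizeLinear output its own DQ node."""
    result = [copy.deepcopy(n) for n in nodes]
    dq_nodes = [n for n in result if n["op_type"] == "DequantizeLinear"]
    for dq in dq_nodes:
        out = dq["outputs"][0]
        consumers = [n for n in result if out in n["inputs"]]
        # The first consumer keeps the original DQ node; the rest get copies.
        for idx, consumer in enumerate(consumers[1:], start=1):
            dup = copy.deepcopy(dq)
            dup["name"] = f'{dq["name"]}_dup{idx}'
            dup["outputs"] = [f"{out}_dup{idx}"]
            consumer["inputs"] = [
                dup["outputs"][0] if name == out else name
                for name in consumer["inputs"]
            ]
            # Appended at the end, so the list is no longer topologically sorted.
            result.append(dup)
    return result


graph = [
    {"name": "dq0", "op_type": "DequantizeLinear",
     "inputs": ["w_q", "scale", "zp"], "outputs": ["w"]},
    {"name": "conv", "op_type": "Conv", "inputs": ["x", "w"], "outputs": ["y"]},
    {"name": "matmul", "op_type": "MatMul", "inputs": ["z", "w"], "outputs": ["v"]},
]
new_graph = duplicate_dq_nodes(graph)
```

After the call, `conv` still reads `w` from `dq0`, while `matmul` reads `w_dup1` from the copy `dq0_dup1`, so each DQ node belongs to exactly one QDQ node unit.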
* Fix pep8 warning
Co-authored-by: Guoyu Wang <wanggy@outlook.com>
* Update required operators for prebuilt package to add opsets 14 and 15.
Add helper script to check if the prebuilt package will support the model and if not why not.
* Add support for multiple opsets being specified on a single line in the required operators config. This makes it easier to update the pre-built package config.
It's also required for validation tools to work as they only have a single opset from the model and not per-operator opsets. If we only list the incremental ops we could merge in the ops from the previous opset, but that wouldn't give a way to drop an operator from being supported.
Left the info on which ops changed though so we have a better feel for the cost of supporting each opset.
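
As a rough illustration of listing several opsets on one config line, the sketch below parses a line of the assumed form `domain;opset[,opset...];op[,op...]`. Both the exact line format and the `parse_required_ops_line` name are assumptions for this example and are not taken from the config reader itself.

```python
def parse_required_ops_line(line):
    """Expand one config line into {(domain, opset): set_of_ops}."""
    domain, opsets, ops = [part.strip() for part in line.split(";", 2)]
    opset_list = [int(o) for o in opsets.split(",")]
    op_list = [o.strip() for o in ops.split(",") if o.strip()]
    # Every listed opset gets the full operator set, so a validation tool
    # that only knows the model's single opset can look it up directly.
    return {(domain, opset): set(op_list) for opset in opset_list}


parsed = parse_required_ops_line("ai.onnx;14,15;Add,Relu,Reshape")
```

Listing the complete operator set for each opset, instead of only the incremental changes, also keeps it possible to drop an operator from a later opset.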
* schema change
* cc changes
* remove temp debug code
* Adding fbs namespace to session_state_flatbuffers_utils.h
* Add fbs namespace to all ort format utils
* Change the strided copy to switch on data size not data type.
Move to header so we can reduce on the enabled types.
Setup type reduction for Concat now that it's using this implementation.
* Include ORT format model conversion scripts and infrastructure in ORT python package.
- tweak existing script setup so it can be easily run directly and from the ORT python package
Add config file and readme for Android minimal build package
Update ORT Mobile documentation
Disable warning if 'all' optimizations are enabled but NCHWc transformer is excluded (device specific optimizations don't apply in this scenario so the warning is moot).
* Address PR comments
* Updates to some operators to always support int32 and int64 based on testing of Android package build config with a minimal build.
If an operator can be used for shape manipulation (int64) it is frequently used for indices manipulation (int32), so we enable both types for that set of ops.
- e.g. BERT models take indices as input
- Scatter/Gather ops utilize indices
Misc. fix to python bindings to exclude call that fails in a minimal build.
Enable type reduction for Scatter/ScatterElements CPU kernels. Some refactoring to reduce binary size.
Add MLTypeCallDispatcher methods.
Minor cleanup for Pad CPU kernel.
Enable type reduction for Shrink, Sign, SplitToSequence CPU kernels.
Some other type reduction changes including refactoring to specify element types in a single place.
Update the kernel def hashing in ORT format models. The new hashing logic ignores the ordering of type constraint types.
This is a backward compatibility breaking change, but we don't guarantee backward compatibility yet.
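
One way to make a hash ignore the ordering of type constraint types is to sort them before hashing. The sketch below shows the idea only: `kernel_def_hash`, its field list, and the use of SHA-256 are assumptions for illustration and do not reflect the actual hashing function or the fields ORT includes.

```python
import hashlib


def kernel_def_hash(op_type, domain, since_version, type_constraints):
    """Hash a kernel definition so that the order of types does not matter."""
    h = hashlib.sha256()
    for field in (op_type, domain, str(since_version)):
        h.update(field.encode("utf-8"))
        h.update(b"\0")  # separator so adjacent fields cannot run together
    # Sort constraint names and the types within each constraint.
    for name in sorted(type_constraints):
        h.update(name.encode("utf-8"))
        h.update(b"\0")
        for type_name in sorted(type_constraints[name]):
            h.update(type_name.encode("utf-8"))
            h.update(b"\0")
    return h.hexdigest()


hash_a = kernel_def_hash("Gather", "ai.onnx", 13, {"T": ["float", "int64_t"]})
hash_b = kernel_def_hash("Gather", "ai.onnx", 13, {"T": ["int64_t", "float"]})
hash_c = kernel_def_hash("Gather", "ai.onnx", 13, {"T": ["float"]})
```

Here `hash_a` and `hash_b` are equal because only the order differs, while `hash_c` differs because the set of enabled types changed.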
* Add support for custom ops library to the ORT model conversion script
Simplify model conversion now that we read ops from the ORT format model.
Enable custom ops in the python bindings if custom ops are turned on in a minimal build.
* Add test of model conversion involving custom ops.
* Add type reduction support to Min, Max and Pow
Update the C++ type reduction infrastructure to allow specifying an opset for the supported types list, as those can change across opset versions.
Minor updates to the type usage tracking script
* Add 'all opsets' macros and constant
* Add ability to generate configuration that includes required types for individual operators, to allow build size reduction based on that.
- Add python bindings for ORT format models
- Add script to update bindings and help info
- Add parsing of ORT format models
- Add ability to enable type reduction to config generation
- Update build.py to only allow operator/type reduction via config
- simpler to require config to be generated first
- can't mix a type aware (ORT format model only) and non-type aware config as that may result in insufficient types being enabled
- Add script to create reduced build config
- Update CIs
Follow up to #5811 to automate cleanup of the build docker image cache.
Added a script and build definition to clean up docker images that haven't been accessed recently.
This PR adds infrastructure to automatically cache docker images used in CI builds in a container registry.
Currently, build images are pulled from a container registry for some builds and built every time for others. The container registry requires maintenance to keep the images up to date and building images every time wastes build agent resources.
With this change, a given build image can be looked up in a cache container registry and if present, pulled, and otherwise, built and pushed. The uniqueness of a build image is determined by a hash digest of the dockerfile, docker build context directory, and certain "docker build" options. This digest is part of the image tag in the cache container repository.
The cache container registry will need to be cleaned up periodically. This is not automated yet.
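
The image tag derivation described above can be sketched as follows. This is a minimal illustration under stated assumptions: `image_cache_tag` is a hypothetical name, SHA-256 is an arbitrary choice of digest, and the real script may select files and options differently.

```python
import hashlib
import os
import tempfile


def image_cache_tag(dockerfile_path, context_dir, build_options):
    """Digest the dockerfile, the build context, and the build options."""
    h = hashlib.sha256()
    with open(dockerfile_path, "rb") as f:
        h.update(f.read())
    # Walk the context in a fixed order so the digest is reproducible.
    for root, dirs, files in os.walk(context_dir):
        dirs.sort()
        for fname in sorted(files):
            path = os.path.join(root, fname)
            rel = os.path.relpath(path, context_dir).replace(os.sep, "/")
            h.update(rel.encode("utf-8"))
            h.update(b"\0")
            with open(path, "rb") as f:
                h.update(f.read())
    for opt in build_options:
        h.update(opt.encode("utf-8"))
        h.update(b"\0")
    return h.hexdigest()


with tempfile.TemporaryDirectory() as ctx:
    dockerfile = os.path.join(ctx, "Dockerfile")
    with open(dockerfile, "w") as f:
        f.write("FROM ubuntu:20.04\n")
    tag_a = image_cache_tag(dockerfile, ctx, ["--build-arg", "PYTHON_VERSION=3.8"])
    tag_b = image_cache_tag(dockerfile, ctx, ["--build-arg", "PYTHON_VERSION=3.8"])
    tag_c = image_cache_tag(dockerfile, ctx, ["--build-arg", "PYTHON_VERSION=3.9"])
```

Identical inputs give the same tag (`tag_a == tag_b`), so the image can be pulled from the cache, while any change to the dockerfile, context, or options (`tag_c`) yields a new tag and triggers a build and push.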