onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-29 03:30:52 +00:00

Author	SHA1	Message	Date
Scott McKay	159fe9d4f3	Update to mobile model usability checker (#19843 ) ### Description - Add check for CoreML MLProgram supported ops - Only check usability with ORT Mobile package if requested - this package will be deprecated so info is a) of minimal value and b) can be confusing. - Output more things at INFO level - a lot of meaningful info was only output at DEBUG level. The default INFO level is more useful - dump full partition info at DEBUG level - Check subgraphs fully - CoreML can handle a subgraph - TBD if we want to add support for adding a subgraph to the parent graph for Loop and If nodes - most likely will be required for simple If nodes to be performant - Check 5D CoreML limitation ### Motivation and Context Improve helper tools --------- Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2024-06-18 07:50:33 +10:00
Scott McKay	d4470fe653	Update Android SDK tools path lookup to be more strongly anchored to the provided root. (#21046 ) ### Description The tools should really all come from the same Android NDK, so using `shutil.which` adds potential confusion when we do a lookup for the target program by name first due to adding `dirnames.insert(0, "")` as the first directory entry to lookup as it will match the filename anywhere in the current path. That's problematic as the emulator should come from <sdk_tools>/emulator/emulator (see [here](https://www.stkent.com/2017/08/10/update-your-path-for-the-new-android-emulator-location.html)), but the paths on the CI machines result in the old location of <sdk_tools>/tools/emulator being selected. This leads to the emulator failing to run on arm64 macOS CIs as the old emulator does not look for the arm64 binary. At the most you may have multiple cmdline-tools versions installed, but if we need to support explicitly specifying a version for that path that can be added. ### Motivation and Context Make emulator run on arm64 macOS machines.	2024-06-17 09:24:43 +10:00
Justin Chu	faea42af95	Bump ruff to 0.3.2 and black to 24 (#19878 ) ### Motivation and Context Routine updates	2024-03-13 10:00:32 -07:00
Justin Chu	3d2ddf96e3	Bump ruff linter to 0.2.1 (#19471 ) ### Motivation and Context Include new lint rules	2024-02-08 16:08:27 -08:00
Scott McKay	e7a524fea9	Update to allow large models to be checked for mobile support. (#18357 ) ### Description Update usability checker and related infrastructure to support checking models > 2GB. - Add ability to set flag to keep initializers as external data - we optimize the model as part of the checking so need to write out a new copy. - Handle issue with ONNX shape inferencing silently failing - use API that supports large models but requires writing the model to a new file - automate cleanup of that copy of the model ### Motivation and Context Allow analysis of LLMs to determine gaps for mobile usage. --------- Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2023-11-17 07:20:16 +10:00
Scott McKay	885bf3561d	Add tool to fix lines > 120 chars. (#18293 ) ### Description Helper to run clang-format on lines that are > 120 chars. We disable clang-format enforcing 120 chars by default because its formatting can negatively impact readability. If a developer has not manually kept a line within the 120 char limit this tool will fix it. It will leave all other lines alone to honor the formatting the developer chose. ### Motivation and Context Help developers fix lint errors. Preferred is to use a vertical ruler/guideline in your editor when actually writing the code.	2023-11-09 10:12:57 +10:00
Scott McKay	ae211999dd	Attempt to make the usage of the Android emulator in CIs more robust (#17903 ) ### Description Android emulator usage updates: - Change approach to detecting boot has completed - use `-delay-adb` and a simple command (`ls`) with `wait-for-device` as the first step - this ensures enough startup has occurred for adb to be responsive - use secondary loop on the python side to check for sys.boot_completed to be set - doing the check on the python side provides more feedback and seems to work well - make the 'stop' logic more precise by using psutil - add internal timeout of 20 mins for emulator startup - waiting for the CI jobs overall timeout is way too long - value is hardcoded for now (most CIs startup in under 10 mins) but could be made configurable if needed CI updates: - add template for using the Android emulator - update CIs to use template - reorder React Native CI - minimize the time the Android emulator or iOS simulator is running by moving some build steps around - don't run both at the same time - unnecessary and potentially adds significant memory pressure to the machine - fix QNN Android emulator CI as much as possible - now everything works apart from running onnx_test_runner with the QNN EP ### Motivation and Context Fix inconsistent detection of the emulator boot completing. --------- Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2023-10-15 08:42:36 +10:00
Justin Chu	be7541ef4a	[Linter] Bump ruff and remove pylint (#17797 ) Bump ruff version and remove pylint from the linter list. Fix any new error detected by ruff. ### Motivation and Context Ruff covers many of the pylint rules. Since pylint is not enabled in this repo and runs slow, we remove it from the linters	2023-10-05 21:07:33 -07:00
Bowen Bao	152e61da37	Avoid `get_logger` overriding root logger level (#17569 ) ### Description Instead, set level to DEBUG for the logger returned. ### Motivation and Context Otherwise, this function call overrides root logger level setting, which affects logging facility of other python packages.	2023-09-19 10:42:27 -07:00
Justin Chu	0c1a5098dc	Disable PERF* rules in ruff to allow better readability (#16834 ) ### Description Disable two PERF* rules in ruff to allow better readability. Rationale commented inline. This change also removes the unused noqa directives because of the rule change. ### Motivation and Context Readability	2023-07-25 15:38:22 -07:00
Justin Chu	d79515041c	[Better Engineering] Bump ruff to 0.0.278 and fix new lint errors (#16789 ) Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #16789 Bump ruff to 0.0.278 and fix new lint errors. I added noqa to all existing RUF012 errors which requires mutable class variables to be annotated with `ClassVar`, as well as all PERF issues. Signed-off-by: Justin Chu <justinchu@microsoft.com>	2023-07-21 12:53:41 -07:00
Xavier Dupré	2bc9fbb621	Fix url in the code documentation (graph optimizations) (#16770 ) ### Description Fix a wrong url in the documentation as mentioned in issue #16678. ### Motivation and Context Better documentation.	2023-07-20 07:02:22 -07:00
Baiju Meswani	42489a8a24	Add ability to create ort format models from training offline utility (#16360 )	2023-06-21 18:51:43 -07:00
saurabh	a6ce7b339f	Enable model subgraph execution in OVEP and setting the OpenVINO DLLs to the path from the OpenVINO pypi package in OVEP and fix OVEP windows io buffer sample (#16147 ) ### Description This PR enables execution of subgraphs in OVEP. Currently, when OVEP developers install the onnxruntime-openvino package on windows from pypi, they have to additionally download OpenVINO windows binaries and run the setupvars.bat script, which sets the environment PATH to locate the OV DLLs. This PR also fixes issues with the OVEP windows io buffer sample. ### Motivation and Context Fix: We want to make the user experience easy for OVEP Python developers on windows platform. This fix introduces a function add_openvino_libs_to_path at the location tools/python/util/add_openvino_win_libs.py. The above function can be called by OVEP python users in the application code, and it takes care of setting the OpenVINO DLLs to the path from the OpenVINO pypi package (openvino) which was installed. This change also makes sure that the add_openvino_libs_to_path() function is added to the onnxruntime python package only when it is built for OpenVINO Execution Provider for ONNXRuntime and not for default ORT python package builds. New user experience for Python OVEP developers on windows platform: step 1: pip install onnxruntime-openvino step 2: pip install openvino step 3: <Add these 2 lines in the application code> import onnxruntime.tools.add_openvino_win_libs as utils utils.add_openvino_libs_to_path() --------- Signed-off-by: MaajidKhan <n.maajid.khan@intel.com> Co-authored-by: MaajidKhan <n.maajid.khan@intel.com> Co-authored-by: Suryaprakash Shanmugam <suryaprakash.shanmugam@intel.com>	2023-06-16 19:47:09 -07:00
Xavier Dupré	e726151b5c	Introduce float 8 types (#14731 ) ### Description The PR implements FloatE4M3FN, FloatE5M2, FloatE4M3FNUZ, FloatE5M2FNUZ as described in PR https://github.com/onnx/onnx/pull/4805. It uses CUDA API to cast float/half to float8 if CUDA>=11.8, a custom implementation if CUDA<11.8. * It implements Cast, QuantizeLinear, DequantizeLinear for all types on CPU, only for types FloatE4M3FN, FloatE5M2 on CUDA. * It extends the supported types for control flow operators, Shape, Reshape, Identity, If, Loop, Scan * It implements Equal(19). * Cast, QuantizeLinear, DequantizeLinear operators now support a parameter `saturate` only valid for float 8 types. It is true by default. In that case, any value out of range is converted into the maximum float 8 value. If false, it is infinite. * QuantizeLinear, DequantizeLinear now support multiple scales on CUDA (and ROCm by extension), scale = 1D tensor with one scale per channel ### Motivation and Context Supports latest onnx version. Fixes [AB#15395](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/15395) --------- Co-authored-by: Xavier Dupre <xadupre@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> Co-authored-by: Randy Shuai <rashuai@microsoft.com> Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com> Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>	2023-05-30 13:25:58 -07:00
Justin Chu	a36caba073	Bump ruff in CI (#15533 ) ### Description Bump ruff version in CI and fixed new lint errors. - This change enables the flake8-implicit-str-concat rules which helps detect unintended string concatenations: https://beta.ruff.rs/docs/rules/#flake8-implicit-str-concat-isc - Update gitignore to include common python files that we want to exclude. ### Motivation and Context Code quality	2023-04-17 10:11:44 -07:00
Edward Chen	9f942e1a3e	Graph transformer to ensure unique DQ nodes for QDQ node units (#15145 ) ### Description Add required graph transformer to duplicate DQ nodes to ensure that QDQ node units have unique DQ nodes. This condition is necessary for QDQ node unit processing. ### Motivation and Context There is an existing Python utility that does this: `c7ced7a5e9/tools/python/util/qdq_helpers/qdq_model_utils.py (L77)` This PR implements it as a graph transformer so it is integrated into ORT and does not require a separate step to update the model. There are also tests to ensure that its effects are not undone by basic level graph optimizations.	2023-03-31 08:39:43 +10:00
Justin Chu	938e2136c6	Enable pylint and numpy rules (#15218 ) ### Description Enable pylint and numpy rules ### Motivation and Context Modernize numpy usage and enable more quality checks	2023-03-27 20:37:53 -07:00
Justin Chu	d834ec895a	Adopt lintrunner as the linting tool - take 2 (#15085 ) ### Description `lintrunner` is a linter runner successfully used by pytorch, onnx and onnx-script. It provides a uniform experience running linters locally and in CI. It supports all major dev systems: Windows, Linux and macOS. The checks are enforced by the `Python format` workflow. This PR adopts `lintrunner` to onnxruntime and fixed ~2000 flake8 errors in Python code. `lintrunner` now runs all required python lints including `ruff` (replacing `flake8`), `black` and `isort`. Future lints like `clang-format` can be added. Most errors are auto-fixed by `ruff` and the fixes should be considered robust. Lints that are more complicated to fix are applied `# noqa` for now and should be fixed in follow up PRs. ### Notable changes 1. This PR removed some suboptimal patterns: - `not xxx in` -> `xxx not in` membership checks - bare excepts (`except:` -> `except Exception`) - unused imports The follow up PR will remove: - `import *` - mutable values as default in function definitions (`def func(a=[])`) - more unused imports - unused local variables 2. Use `ruff` to replace `flake8`. `ruff` is much (40x) faster than flake8 and is more robust. We are using it successfully in onnx and onnx-script. It also supports auto-fixing many flake8 errors. 3. Removed the legacy flake8 ci flow and updated docs. 4. The added workflow supports SARIF code scanning reports on github, example snapshot: ![image](https://user-images.githubusercontent.com/11205048/212598953-d60ce8a9-f242-4fa8-8674-8696b704604a.png) 5. Removed `onnxruntime-python-checks-ci-pipeline` as redundant ### Motivation and Context Unified linting experience in CI and local. Replacing https://github.com/microsoft/onnxruntime/pull/14306 --------- Signed-off-by: Justin Chu <justinchu@microsoft.com>	2023-03-24 15:29:03 -07:00
James Yuzawa	d925055a3e	Fix broken and outdated links in documentation (#14092 ) ### Description I fixed some broken links in the C API documentation, but then did a quick pass over all of the links I could find and then fixed those. ### Motivation and Context I got some 404's when exploring the documentation and wanted to fix it.	2023-02-23 10:48:04 -08:00
Edward Chen	3b382ea7e1	Free OrtStatus in ASSERT_ORT_STATUS_OK, make run_android_emulator.py work with newer JDK version (#14369 ) - Free OrtStatus in ASSERT_ORT_STATUS_OK in model_tests.cc - Make run_android_emulator.py work with newer JDK version	2023-01-20 09:27:47 -08:00
Edward Chen	31a1403e06	Add --output_dir option to convert_onnx_models_to_ort.py. (#12844 ) Add --output_dir option to convert_onnx_models_to_ort.py. Allows one to optionally specify an output directory for the converted model files.	2022-09-12 15:36:03 -07:00
Justin Chu	d64769c38e	Set black's target version (#11370 ) Description: Set black's target version to be py37 - py310 Motivation and Context Black by default targets its format for py3.10. Since our project supports python 3.7, we need to set the target version to cover all the python versions supported. Re-ran black. 13 files reformatted.	2022-04-27 14:52:19 -07:00
Justin Chu	fdce4fa6af	Format all python files under onnxruntime with black and isort (#11324 ) Description: Format all python files under onnxruntime with black and isort. After checking in, we can use .git-blame-ignore-revs to ignore the formatting PR in git blame. #11315, #11316	2022-04-26 09:35:16 -07:00
Edward Chen	269be2fe63	Remove unnecessary option from convert_onnx_models_to_ort.py, fix old instructions. (#11088 ) Remove unnecessary --nnapi_partitioning_stop_ops option from convert_onnx_models_to_ort.py, fix old instructions.	2022-04-11 11:19:21 -07:00
Edward Chen	9371401746	Move node EP assignment for ORT format into SessionState::FinalizeSessionState() (#10944 ) Follow up to #10904. - Move node EP assignment for ORT format into SessionState::FinalizeSessionState(). - Add unit test for #10904. - Make convert_onnx_models_to_ort.py optimization level configurable via environment variable.	2022-03-28 10:37:22 -07:00
Scott McKay	91722e2bc4	Fix typos (#10935 )	2022-03-20 08:27:35 +10:00
Scott McKay	f385c73058	Fix a couple of issues with the python package tools (#10858 ) * Tweaks to the model utils * Add handling for a dim_value of -1 when replacing the entire input shape. This occurs in models exported from PaddlePaddle * make pytorch helpers accessible in package * make QDQ helpers accessible in package	2022-03-15 15:52:12 +10:00
Edward Chen	e53422c6d0	Update convert_onnx_models_to_ort.py to support runtime optimizations. (#10765 ) Add runtime optimization support to ONNX -> ORT format conversion script. Replace `--optimization_level`, `--use_nnapi`, and `--use_coreml` with a new `--optimization_style` option.	2022-03-14 16:50:41 -07:00
Scott McKay	6072c6b65e	Simplify QLinearConv registration so type reduction works with it. (#10747 ) * Simplify QLinearConv registration so type reduction works with it. * Update QLinearMatMul registration to be a standard typed registration	2022-03-04 14:06:04 +10:00
Rachel Guo	a9dc50ba8b	Add option to force QDQIsInt8Allowed to return true when exporting to ORT format (#10719 ) * wip * save * minor update * fix * fix * Revert "fix" This reverts commit `a76f364b2d`. * revert * revert * revert submodule removal * address pr comments * minor fix * address cr comments * fix format Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>	2022-03-02 23:26:14 -08:00
Scott McKay	4d3cd2f685	Add helper for optimizing a QDQ format model for usage with ORT. (#10595 ) * Add initial helper for optimizing a QDQ format model for usage with ORT. If a DQ node has multiple consumers it will end up in multiple QDQ node units. This is complicated to handle as each qdq unit could end up being handled by different execution providers. By duplicating the DQ node we simplify this logic. Generally the duplicate nodes will disappear when the qdq node unit is converted to a single node with a quantized operator. If there are qdq node units that are not able to be converted to use a quantized operator the ORT cleanup (pending) to drop remaining Q->DQ pairs between fp32 nodes can remove any remaining DQ nodes. * Fix pep8 warning Co-authored-by: Guoyu Wang <wanggy@outlook.com>	2022-02-21 09:26:19 +10:00
Scott McKay	2ca9566994	Add range of helpers for making usage of ORT Mobile easier. (#10458 ) * Add range of helpers for making usage of ORT Mobile easier.	2022-02-18 07:35:25 +10:00
Scott McKay	6545e24b60	Update mobile prebuilt package ops to add support for opset 14 and 15 (#9717 ) * Update required operators for prebuilt package to add opsets 14 and 15. Add helper script to check if the prebuilt package will support the model and if not why not. * Add support for multiple opsets being specified on a single line in the required operators config. This makes it easier to update the pre-built package config. It's also required for validation tools to work as they only have a single opset from the model and not per-operator opsets. If we only list the incremental ops we could merge in the ops from the previous opset, but that wouldn't give a way to drop an operator from being supported. Left the info on which ops changed though so we have a better feel for the cost of supporting each opset.	2021-11-18 10:44:39 +10:00
Guoyu Wang	5ad6dbb314	Remove experimental from ORT format namespace (#9729 ) * schema change * cc changes * remove temp debug code * Adding fbs namespace to session_state_flatbuffers_utils.h * Add fbs namespace to all ort format utils	2021-11-11 19:46:30 -08:00
Edward Chen	011cb8fd48	Fix Where op type reduction processing (#9033 ) * Update type reduction script to track Where Op's second input type. * Clean up op_kernel_type_control.h includes. * Use more maintainable include.	2021-09-13 08:37:58 -07:00
Scott McKay	858989293d	Reduce binary size of strided copy used by Concat (#8913 ) * Change the strided copy to switch on data size not data type. Move to header so we can reduce on the enabled types. Setup type reduction for Concat now that it's using this implementation.	2021-09-02 08:19:20 +10:00
Edward Chen	94c3e2048b	[convert_onnx_models_to_ort.py] Add option to specify NNAPI EP partitioning stop ops. (#8668 ) Add option to specify NNAPI EP partitioning stop ops from the ORT format model conversion script.	2021-08-19 13:02:28 -07:00
Rachel Guo	78759059f1	[CoreML EP]Make coreml ep build on non-macOS platform (#8677 ) * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * clean * remove unused defs * correct typo * remove onnxruntime_coreml_proto * cr comments * enable nnapi/coreml in minimal build * enable nnapi/coreml in one build * refine dependencies * fix nnapi build failure and remove onnxruntime_coreml_proto dependencies in unit tests cmake files * small fix * fix * fix build * revert * fix build Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>	2021-08-18 09:35:32 -07:00
Edward Chen	dda9f53bed	Build script logging updates (#8618 ) Log build.py command line arguments. Update subprocess logging to format arguments in way that is easier to copy.	2021-08-05 09:41:17 -07:00
Edward Chen	e09321f4db	Update ORT format model conversion utility to optionally fail fast on model conversion failure. (#8589 )	2021-08-03 11:12:56 -07:00
Rachel Guo	0cf2ed029b	Add python binding for CoreML EP (#8472 ) * add pybind binding for coreml ep * update merged files * address comments * format * remove lines for non-macOS platform Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>	2021-07-29 10:06:47 -07:00
Edward Chen	c254c3c355	Fix issue with ONNX to ORT format model conversion script when given single model file as input. (#8323 )	2021-07-07 14:08:47 -07:00
Scott McKay	57782b3463	Add supported operators/types documentation for the ORT Mobile package (#7807 ) * Add ability to generate documentation for the ORT Mobile package using the build configuration as input.	2021-05-26 15:57:40 +10:00
Scott McKay	d6df5764d7	Android package infrastructure (#7430 ) * Include ORT format model conversion scripts and infrastructure in ORT python package. - tweak existing script setup so it can be easily run directly and from the ORT python package Add config file and readme for Android minimal build package Update ORT Mobile doco Disable warning if 'all' optimizations are enabled but NCHWc transformer is excluded (device specific optimizations don't apply in this scenario so the warning is moot). * Address PR comments	2021-04-30 14:23:54 +10:00
Scott McKay	329fd03bb4	Add int32_t as required type to some operators (#7192 ) * Updates to some operators to always support int32 and int64 based on testing of Android package build config with a minimal build. If an operator can be used for shape manipulation (int64) it is frequently used for indices manipulation (int32), so we enable both types for that set of ops. - e.g. BERT models take indices as input - Scatter/Gather ops utilize indices Misc. fix to python bindings to exclude call that fails in a minimal build.	2021-04-01 19:32:34 +10:00
Edward Chen	0ccfe6c86a	Enable type reduction for Scatter/ScatterElements CPU kernels (#7171 ) Enable type reduction for Scatter/ScatterElements CPU kernels. Some refactoring to reduce binary size. Add MLTypeCallDispatcher methods. Minor cleanup for Pad CPU kernel.	2021-03-30 11:02:24 -07:00
Edward Chen	53392664d3	Enable type reduction for Shrink, Sign, SplitToSequence CPU kernels (#7090 ) Enable type reduction for Shrink, Sign, SplitToSequence CPU kernels. Some other type reduction changes including refactoring to specify element types in a single place.	2021-03-23 09:57:33 -07:00
Edward Chen	4cbb8e166a	Update kernel def hashing (#7019 ) Update the kernel def hashing in ORT format models. The new hashing logic ignores the ordering of type constraint types. This is a backward compatibility breaking change, but we don't guarantee backward compatibility yet.	2021-03-22 09:28:27 -07:00
Edward Chen	aa60a8368f	Update type reduction operator type usage processors set. (#6976 )	2021-03-11 09:22:53 -08:00

62 commits