onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-17 21:10:43 +00:00

Author	SHA1	Message	Date
Justin Chu	d834ec895a	Adopt linrtunner as the linting tool - take 2 (#15085 ) ### Description `lintrunner` is a linter runner successfully used by pytorch, onnx and onnx-script. It provides a uniform experience running linters locally and in CI. It supports all major dev systems: Windows, Linux and macOS. The checks are enforced by the `Python format` workflow. This PR adopts `lintrunner` in onnxruntime and fixes ~2000 flake8 errors in Python code. `lintrunner` now runs all required Python lints including `ruff` (replacing `flake8`), `black` and `isort`. Future lints like `clang-format` can be added. Most errors are auto-fixed by `ruff` and the fixes should be considered robust. Lints that are more complicated to fix are suppressed with `# noqa` for now and should be fixed in follow-up PRs. ### Notable changes 1. This PR removed some suboptimal patterns: - `not xxx in` -> `xxx not in` membership checks - bare excepts (`except:` -> `except Exception`) - unused imports The follow-up PR will remove: - `import *` - mutable values as default in function definitions (`def func(a=[])`) - more unused imports - unused local variables 2. Use `ruff` to replace `flake8`. `ruff` is much (40x) faster than flake8 and is more robust. We are using it successfully in onnx and onnx-script. It also supports auto-fixing many flake8 errors. 3. Removed the legacy flake8 CI flow and updated docs. 4. The added workflow supports SARIF code scanning reports on GitHub, example snapshot: ![image](https://user-images.githubusercontent.com/11205048/212598953-d60ce8a9-f242-4fa8-8674-8696b704604a.png) 5. Removed `onnxruntime-python-checks-ci-pipeline` as redundant ### Motivation and Context Unified linting experience in CI and local. Replacing https://github.com/microsoft/onnxruntime/pull/14306 --------- Signed-off-by: Justin Chu <justinchu@microsoft.com>	2023-03-24 15:29:03 -07:00
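The lint patterns named in the commit above (membership checks, bare excepts, mutable default arguments) can be illustrated with a small sketch. The function names here are hypothetical and are not taken from the onnxruntime code base; they only show the "after" form of each pattern.

```python
def is_missing(key, mapping):
    # Preferred membership check: `key not in mapping`
    # rather than `not key in mapping`; both evaluate identically.
    return key not in mapping


def parse_int(text, default=0):
    # `except Exception` instead of a bare `except:` so that
    # KeyboardInterrupt and SystemExit still propagate.
    try:
        return int(text)
    except Exception:
        return default


def append_item(item, items=None):
    # Avoids a mutable default (`items=[]`), which would be shared
    # across calls; a fresh list is created on each call instead.
    if items is None:
        items = []
    items.append(item)
    return items
```

With a mutable default, a second call to `append_item` would see the item left behind by the first call; the `None` sentinel keeps calls independent.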
pengwa	1d32285536	Statistics tool for ORTModule convergence parity (#15020 ) ### Statistics tool for ORTModule convergence parity As ORTModule gets more and more validated, it is pretty fast to integrate a PyTorch-based model with ORT. At the same time, we need to make sure that once there is a convergence issue, we don't spend months investigating it. As part of this effort, this PR introduces a tool to dump activation statistics without much involvement from users. The dumped results contain only some statistics plus sampled data, which is not big; compared with dumping all the tensors, it is much faster and more space efficient. To use it, two lines are needed before wrapping the model with ORTModule. The baseline run needs the same change. ``` + from onnxruntime.training.utils.hooks import SubscriberManager, StatisticsSubscriber + SubscriberManager.subscribe(model, [StatisticsSubscriber("pt_out", override_output_dir=True)]) ``` Once you run the steps, the following command can be used to merge the results into per-step summaries for the ORT and baseline runs respectively. ```bash python -m onnxruntime.training.utils.hooks.merge_activation_summary --pt_dir pt_out --ort_dir ort_out --output_dir /tmp/output ``` Docs are added here as part of this PR: [convergence investigation notes](https://github.com/microsoft/onnxruntime/blob/pengwa/conv_tool/docs/ORTModule_Convergence_Notes.md) Based on the generated merged files, we can compare them with tools. ![image](https://user-images.githubusercontent.com/10530022/224653929-4e4480bd-bb02-4bbe-bd44-2672bdf91a87.png) ### Design and Implementation This PR introduces a common mechanism for registering custom logic for nn.Module's post-forward hooks. Statistics for activations (StatisticsSubscriber) is one of the implementations. If there are other needs, we can define another XXSubscriber to do the customized things.	2023-03-23 20:34:24 +08:00
George Wu	289f7dbcdd	enable pybind for qnn ep (#14897 ) enable python bindings for QNN EP. tested on Windows Dev Kit 2023 (ARM64) with python 3.11 (ARM64) from https://www.python.org/ftp/python/3.11.1/python-3.11.1-arm64.exe	2023-03-03 07:26:53 -08:00
Tianlei Wu	742658d171	Stable Diffusion CUDA optimizations Part 2 (#14597 ) ### Description This is a follow-up of https://github.com/microsoft/onnxruntime/pull/14428 for Stable Diffusion CUDA optimizations: (1) use NchwConv to replace Conv in onnx graph and add Transpose nodes accordingly (2) reduce sequential Transpose nodes to at most one. (3) symbolic shape infer of NchwConv (4) fix add bias transpose which causes CUDA error (launching more than 1024 threads per block) in inferencing fp32 model. (5) add models (bert, bart, stable_diffusion subdirectories) to package; (6) remove option --disable_channels_last Note that (1) We can add a few graph transformations to reduce Transpose nodes further. It is not done in this PR due to time limit. (2) Stable diffusion 2.1 model outputs black images. It seems that forcing Attention to float32 could avoid the issue. However, it is much slower to use float32 Attention. ### Motivation and Context	2023-02-07 07:49:15 -08:00
Baiju Meswani	d06ad9462b	[Bug Fix] Include python training apis when enable_training is enabled (#14485 )	2023-01-31 17:17:26 -08:00
sfatimar	7654cd50e8	Openvino ep 2022.3 v4.3 (#14210 ) ### Description Changes to incorporate OpenVINO EP 2022.3 ### Motivation and Context This change is required to incorporate OpenVINO EP 2022.3 Co-authored-by: mohsinmx <mohsinx.mohammad@intel.com> Co-authored-by: Preetha Veeramalai <preetha.veeramalai@intel.com> Co-authored-by: Aravind <aravindx.gunda@intel.com> Co-authored-by: mayavijx <mayax.vijayan@intel.com> Co-authored-by: flexci <mohsinmx>	2023-01-11 16:31:26 -08:00
RandySheriffH	83ad562826	Rename CloudEP to AzureEP (#14175 ) Rename CloudEP to AzureEP. Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-01-11 12:25:04 -08:00
Xavier Dupré	79dc39600f	Replace distutils by setuptools to import build_ext (#14108 ) ### Description Uses setuptools instead of distutils. ### Motivation and Context Fixes #14107.	2023-01-09 11:48:01 +01:00
Ashwini Khade	68b5b2d7d3	Refactor training build options (#13964 ) ### Description 1. Renames all references of on device training to training apis. This is to keep the naming general. Nothing really prevents us from using the same apis on servers\non-edge devices. 2. Update ENABLE_TRAINING option: With this PR, when this option is enabled, training apis and torch interop are also enabled. 3. Refactoring for onnxruntime_ENABLE_TRAINING_TORCH_INTEROP option: - Removed user facing option - Setting onnxruntime_ENABLE_TRAINING_TORCH_INTEROP to ON when onnxruntime_ENABLE_TRAINING is ON as we always build with torch interop. Once this PR is merged, when --enable_training is selected we will do a "FULL Build" for training (with all the training entry points and features). Training entry points include: 1. ORTModule 2. Training APIs Features include: 1. ATen Fallback 2. All Training OPs, including communication and collectives 3. Strided Tensor Support 4. Python Op (torch interop) 5. ONNXBlock (Front end tools for training artifacts prep when using training apis) ### Motivation and Context The intention is to simplify the options for building training enabled builds. This is part of the larger work item to create a dedicated build for learning on the edge scenarios with just training apis enabled.	2023-01-03 13:28:16 -08:00
RandySheriffH	587e891cae	CloudEP (#13855 ) Implement CloudEP for hybrid inferencing. The PR introduces no new API; customers can configure session and run options to do inferencing with an Azure [triton endpoint.](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-with-triton?tabs=azure-cli%2Cendpoint) A sample configuration in Python looks like: ``` sess_opt.add_session_config_entry('cloud.endpoint_type', 'triton') sess_opt.add_session_config_entry('cloud.uri', 'https://cloud.com') sess_opt.add_session_config_entry('cloud.model_name', 'detection2') sess_opt.add_session_config_entry('cloud.model_version', '7') # optional, default 1 sess_opt.add_session_config_entry('cloud.verbose', '1') # optional, default '0', meaning no verbose ... run_opt.add_run_config_entry('use_cloud', '1') # 0 for local inferencing, 1 for cloud endpoint. run_opt.add_run_config_entry('cloud.auth_key', '...') ... sess.run(None, {'input':input_}, run_opt) ``` Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-01-03 10:03:15 -08:00
FFFrog	6705915af8	[CANN] Add the ability to run graph (#13728 ) ### Description Add the ability to run graph ### Motivation and Context A brief description is as follows: 1) If the whole graph is supported, then will be processed by the graph engine, directly. 2) If the whole graph is not supported, the whole graph will be divided into subgraphs and single operators; The sub-graphs will be run on graph engine, and the single operators will fallback to the traditional mode.	2022-12-16 06:57:40 -08:00
Wei-Sheng Chin	b5904c40dd	Enable ORT in TorchDynamo (#13259 ) This PR enables ORT to execute graphs captured by TorchDynamo. Major compilation code is in `OrtBackend.compile` in ort_backend.py. `register_backend.py` is for plugging `OrtBackend` into TorchDynamo as a compiler.	2022-11-01 11:19:29 -07:00
Adam Louly	68eff69ab1	Add Utils for federated learning scenarios (#13014 ) Description: utils for federated learning. Motivation and Context - This PR includes utils that will be used on federated learning scenarios. - Exposing python bindings to some utils, and added a util to calculate the difference between two buffers. Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>	2022-10-17 12:39:43 -07:00
PeixuanZuo	b4853a978a	[ROCm] add rocm python package pipeline with --use_rocm_profiling (#13068 ) ### Description ROCm developers always need to build the onnxruntime whl with `--enable_rocm_profiling`. Add a ROCm dev python package pipeline which produces a .whl with build args `--enable_rocm_profiling`. The dev whl needs to be uploaded to azure storage and can be downloaded from https://download.onnxruntime.ai/onnxruntime_nightly_rocm53.profiling.html ### Motivation and Context	2022-10-17 10:11:20 +08:00
RandySheriffH	a83a9ed6b0	Remove miscellaneous nuphar configs (#13070 ) Remove a handful of nuphar related configurations after deprecation. Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2022-09-26 13:41:28 -07:00
Chih-Hsuan Yen	9abd6e3a30	setup.py: use packaging instead of wheel.vendored.packaging (#13083 )	2022-09-24 08:32:44 -07:00
Changming Sun	eafd67b8fd	Update CUDA version to 11.6 and refactor python packaging pipeline (#13002 ) 1. Update CUDA version from 11.4 to 11.6. 2. Update Manylinux version 3. Upgrade GCC version from 10 to 11 for most x86_64 pipelines. CentOS 7 ARM64 doesn't have GCC 11 yet. 4. Refactor python packaging pipeline: a. Split Linux GPU build job to two parts, build and test, so that the build part doesn't need to use a GPU machine b. Make the Linux GPU build job and Linux CPU build job more similar: share the same bash script and yaml file. 5. Temporarily disable Attention_Mask1D_Fp16_B2_FusedNoPadding because it is causing one of our packaging pipeline to fail. I have created an ADO task for this.	2022-09-23 00:29:27 -07:00
wangxiyuan	952c99304a	Add CANN EP (#12416 ) Description: This PR adds Ascend CANN execution provider support. Motivation and Context - Why is this change required? What problem does it solve? As the info shown in the issue. CANN is the API layer for the Ascend processor. Adding the CANN EP allows users to run ONNX models on Ascend hardware via onnxruntime. The detailed changes: 1. Added CANN EP framework. 2. Added the basic operators to support ResNet and VGG models. 3. Added C/C++ and Python API support - If it fixes an open issue, please link to the issue here. https://github.com/microsoft/onnxruntime/issues/11477 Author: lijiawei <lijiawei19@huawei.com> wangxiyuan <wangxiyuan1007@gmail.com> Co-authored-by: FFrog <ljw1101.vip@gmail.com>	2022-09-22 14:53:40 -07:00
Baiju Meswani	4ed5a5b2a8	Disable local versions based on environment variable (#12997 )	2022-09-16 22:51:18 -07:00
Ashwini Khade	ceb76429db	Merge pull request #12056 from microsoft/bmeswani/merge-training_dev/on_device_poc Merge On-Device-Training Offline Tooling and C/C++ APIs	2022-07-21 15:09:48 -07:00
RandySheriffH	178a413ca1	List 3.10 as supported python version and remove 3.6 (#12141 ) list 3.10 as supported python version and remove 3.6 Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2022-07-12 15:28:30 -07:00
Baiju Meswani	a457ddc41d	Merge branch 'master' of https://github.com/microsoft/onnxruntime into bmeswani/merge_pr	2022-06-30 21:53:07 +00:00
Baiju Meswani	fac8dae9df	Add support for gradient clipping, AdamWOptimizer and tensorseq as inputs (#11697 )	2022-06-22 10:27:58 -07:00
sfatimar	f97bd38c4f	UEP 4.1 release (#11834 ) * Add pypi build changes to latest Master * Add ORT training part of OV build * Disabling SqueezeOpTest.BadAxes * Add ONNXruntime branch ARG to Docker build * Changes to include file details versions * Commit File Version Updates * Change naming for linux build * Add fix for pylint format errors * Fix pylint warnings. * Fix pylint errors - stage 2 Signed-off-by: Preetha Veeramalai <preetha.veeramalai@intel.com> * Fix pylint errors - stage 3 * Fix pylint format - stage4 Signed-off-by: Preetha Veeramalai <preetha.veeramalai@intel.com> * Commit for Wheel Release >0.35.1 Co-authored-by: Preetha Veeramalai <preetha.veeramalai@intel.com> Co-authored-by: mayavijx <mayax.vijayan@intel.com> Co-authored-by: Sahar Fatima <sfatima.3001@gmail.com> Co-authored-by: nmaajidk <n.maajid.khan@intel.com>	2022-06-17 14:49:04 -07:00
Yi Zhang	8bb0062873	add manylinux_2_27 CPU wheel (#11886 ) * add manylinux_2_27 * minor refactory * change base image * minor refactor * add tests * fix condition	2022-06-17 19:38:38 +08:00
Changming Sun	10478a09ca	Revert "add manylinux_2_27 wheel (#11832 )" This reverts commit `bbace23d0c`.	2022-06-16 18:28:12 -07:00
Yi Zhang	bbace23d0c	add manylinux_2_27 wheel (#11832 ) * add manylinux_2_27	2022-06-15 10:26:51 +08:00
pengwa@microsoft.com	e1c63cb06a	Merge branch 'master' of https://github.com/microsoft/onnxruntime into training_dev/on_device_poc	2022-05-28 01:54:17 +00:00
Baiju Meswani	3a22a866a1	On device training offline tooling (#11520 )	2022-05-24 18:21:39 -07:00
Scott McKay	833ded4b0e	Update setup.py to include config files used by model analysis in wheel. (#11381 ) * Update setup.py to include config files used by model analysis in wheel.	2022-04-28 16:13:26 +10:00
Justin Chu	fdce4fa6af	Format all python files under onnxruntime with black and isort (#11324 ) Description: Format all python files under onnxruntime with black and isort. After checking in, we can use .git-blame-ignore-revs to ignore the formatting PR in git blame. #11315, #11316	2022-04-26 09:35:16 -07:00
Tianlei Wu	1d96cbec73	Move gpt2 script to models\gpt2 sub-directory (#11256 ) * move gpt-2 scripts to models\gpt2 * change gpt2 beam search helper to make test_gpt2 passes	2022-04-20 11:09:26 -07:00
Scott McKay	3b3b23bcf9	Add new python helper dirs to wheel. (#11196 )	2022-04-13 13:34:07 +10:00
Tianlei Wu	00b595e389	move longformer and t5 to models subdirectory (#11161 ) * move longformer scripts to models subdirectory * Copy transformers\models\t5 to python package as well	2022-04-09 22:35:14 -07:00
Alexey Gladyshev	7dc7529ec8	[TVM EP] Integrate tests for TVM EP into public onnxruntime CI (#10505 ) * add support for bool type * add TVM EP support for tests * include TVM EP in python test pool * fix pylint * moved technical imports to a separate file * clean up post build actions & move _ld_preload.py extension to CMake level * add files for include TVM EP into CI * implement custom logger for TVM * replace TVM logging with ONNX RT logging * update link for TVM EP tutorial * clean up TVM EP cmake * add pybind auto enabling for TVM EP * fix blank spaces * code review fixes * replace print with comment * add list of EP without TVM EP * enable onnx tests * disable contrib ops and ml ops * reuse Dockerfile.ubuntu * Move install_tvm_test_dependencies.sh out of Docker context dir, update build definition. Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2022-02-24 16:24:23 +01:00
Justin D. Harris	742694f679	[python] [orttraining] Add utility to export a graph to compute gradients (#8125 )	2022-02-18 14:00:49 -08:00
Valery Chernov	1cdc23aba4	[TVM EP] Rename Standalone TVM (STVM) Execution Provider to TVM EP (#10260 ) * update java API for STVM EP. Issue is from PR#10019 * use_stvm -> use_tvm * rename stvm worktree * STVMAllocator -> TVMAllocator * StvmExecutionProviderInfo -> TvmExecutionProviderInfo * stvm -> tvm for cpu_targets. resolve onnxruntime::tvm and origin tvm namespaces conflict * STVMRunner -> TVMRunner * StvmExecutionProvider -> TvmExecutionProvider * tvm::env_vars * StvmProviderFactory -> TvmProviderFactory * rename factory funcs * StvmCPUDataTransfer -> TvmCPUDataTransfer * small clean * STVMFuncState -> TVMFuncState * USE_TVM -> NUPHAR_USE_TVM * USE_STVM -> USE_TVM * python API: providers.stvm -> providers.tvm. clean TVM_EP.md * clean build scripts #1 * clean build scripts, java frontend and others #2 * once more clean #3 * fix build of nuphar tvm test * final transfer stvm namespace to onnxruntime::tvm * rename stvm->tvm * NUPHAR_USE_TVM -> USE_NUPHAR_TVM * small fixes for correct CI tests * clean after rebase. Last renaming stvm to tvm, separate TVM and Nuphar in cmake and build files * update CUDA support for TVM EP * roll back CudaNN home check * ERROR for not positive input shape dimension instead of WARNING * update documentation for CUDA * small corrections after review * update GPU description * update GPU description * misprints were fixed * cleaned up error msgs Co-authored-by: Valery Chernov <valery.chernov@deelvin.com> Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru> Co-authored-by: Thierry Moreau <tmoreau@octoml.ai>	2022-02-15 10:21:02 +01:00
Baiju Meswani	7691e7ed12	Introduce load balancing dataset samplers (#10163 )	2022-02-14 13:46:14 -08:00
Xavier Dupré	481b96d32a	STVM, NUPHAR, remove tvm from submodules list, checks pointers are not null. (#10211 ) * STVM, checks pointers are not null. * removes submodules tvm * add missing include(FetchContent) * add target tvm * fix stvm test * extend cgmanifest with dependencies of tvm	2022-01-27 20:31:13 +01:00
Weixing Zhang	ea9c8a7cdc	support MIGraphXEP to work with ROCMEP for inference on AMD GPU (#10368 ) Co-authored-by: Weixing Zhang <wezhan@microsoft.com> Support MIGraphXEP to work with ROCMEP for inference on AMD GPU	2022-01-26 15:52:56 -08:00
Alexey Gladyshev	a0fe4a7c1c	[TVM EP] Improved usability of TVM EP (#10241 ) * improved usability of TVM EP * moved technical import under a condition related to TVM EP only * Revert "moved technical import under a condition related to TVM EP only" * add conditional _ld_preload.py file extension for TVM EP * improve readability of inserted code	2022-01-25 18:48:08 +01:00
Valery Chernov	b327e89efa	Standalone TVM Executor Provider (#10019 ) * squashed commit for standalone tvm execution provider * critical fix for correct python build with stvm ep * get tuning log file from ep options. It has priority over AUTOTVM_TUNING_LOG * updates and fixes * update parsing of stvm provider options * add support of external data for onnx model * add conditional dump of subgraphs * remove unused code * get input tensor shapes through provider options. get output shapes for fixed input ones by TVM API * support AUTO_TVM tuning log file inside ORT. Selector for Ansor and Auto_TVM is provider option (tuning_type) * add fp16 * add functionality of conversion of model layout to NHWC if need. Necessary parameter was added to STVM provider options * fix license text in header. fix log format * small fixes * fix issues from flake8 * remove model proto construction from GetCapability * reserve memory for vector of DLTensors * add simple tutorial for STVM EP * STVM docs * jroesch/tvm -> apache/tvm * remove dead code, unneccessary logs and comments * fix in readme * improve tutorial notebook * tvm update * update STVM_EP.md * fix default value * update STVM_EP.md * some TODOs for the future development * shorten long lines * add hyperlink to STVM_EP.md * fix Linux CI error * fix error in csharp test Co-authored-by: Jared Roesch <jroesch@octoml.ai> Co-authored-by: Valery Chernov <valery.chernov@deelvin.com> Co-authored-by: KJlaccHoeUM9l <wotpricol@mail.ru>	2021-12-15 16:59:20 -08:00
Chi Lo	7242627fec	Integrate TensorRT into GPU Python package (#9785 ) * add use_tensorrt build option * Add use_tensorrt to running tests * add use_tensorrt for Windows * make trt ep to skip backend test * make trt ep to skip backend test * Fix bug * Add/Modify description * modify for debug * switch pool to test * modify to debug * modify to debug * add verbosity * refine the code * refine the code * refine the code * fix flake8 warning * refine the code * add pre_load check for trt as well as add cupti lib to cuda dependencies * modify script to make trt build path the same as cuda * show error message when user wants to run TensorRT but TensorRT is not installed in the env * fix bug * fix bug * add trt lib for manylinux * include cuda_dependencies for trt * rewrite the condition to throw exception * make code more compact	2021-11-18 13:26:51 -08:00
Suffian Khan	b409cbe62c	Fix incorrect library reference in Python manylinux package for CUDA (#9769 )	2021-11-16 13:40:17 -08:00
Guoyu Wang	5ad6dbb314	Remove experimental from ORT format namespace (#9729 ) * schema change * cc changes * remove temp debug code * Adding fbs namespace to session_state_flatbuffers_utils.h * Add fbs namespace to all ort format utils	2021-11-11 19:46:30 -08:00
Suffian Khan	e6f0fdd653	Strip AMD libraries bundled with Python package due to libonnxruntime_providers_rocm.so change (#9679 ) * remove AMD library dependence from libonnxruntime_providers_rocm.so * fix flake error * remove rocm dependency from original library as well	2021-11-11 09:32:09 -08:00
Weixing Zhang	e11fde0179	libonnxruntime_providers_rocm.so and libonnxruntime_providers_shared.so are not included in python package. (#9618 ) * libonnxruntime_providers_rocm.so and libonnxruntime_providers_shared.so are not included in python package. Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2021-11-01 19:12:09 -07:00
pengwa	b125446f9c	Optimize python overhead of APEX amp (#9447 ) * optimize python overhead of _post_amp_backward * overwrite apex amp's zero_grad for faster implementation * move unscale_fp16_grads_into_fp32_grads into C++ impl * improve the efficiency further, reducing 3.5ms to 1.7ms for unilm. * unilm 1.7ms to 338us: 1). optimize python list <==> std::vector copy, 2). launch the kernels as soon as num_elem reaches the threshold. This helps reduce the CUDA idle time. * refine the logic a bit after validating Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>	2021-10-26 13:13:49 +08:00
Changming Sun	406f1629c1	Remove Featurizers code (#9300 )	2021-10-20 10:20:35 -07:00
Abhishek Jindal	87e726d1a0	Abjindal/merge eager with external custom ops (#8986 ) * switching to pytorch nightly build * adding eager mode * enable pybind and remove install step * removing auditwheel repair process * installing package * adding auditwheel back * disabling auditwheel repair for eager mode * typo correction	2021-10-14 13:19:45 -07:00
baijumeswani	bcdb411c8d	Implement FusedAdam for ORT adapted from DeepSpeed (#9266 )	2021-10-05 20:50:34 -07:00
Thiago Crepaldi	ceb51dda4a	Support external torch cpp extensions on ORTModule (#9223 )	2021-09-30 10:37:35 -04:00
Wei-Sheng Chin	1b0816859f	Only wrap sub-modules which can be wrapped as ORTModule (#9021 )	2021-09-27 17:18:22 -07:00
Ryan Hill	b7971575f8	Fix python manylinux to not load cuda if it fails to load dependencies (#8882 ) * Fix python manylinux to not load cuda if it fails to load dependencies	2021-09-07 11:09:25 -07:00
liqun Fu	f126a12699	decouple pytorch from onnxruntime training build (#8815 )	2021-09-01 16:31:53 -07:00
pengwa	3eb08d4dc7	custom autograd func memory (#8901 ) * remove PythonOpGrad control dependency && avoid segmentation fault * comment alignment * fix bugs	2021-09-01 09:29:26 +08:00
satyajandhyala	31926176ac	Support external custom operator schemas on Ubuntu (#8807 ) * Expose symbols in onnx and protobuf namespaces in python when building with --enable_external_custom_op_schemas * Add external onnx and protobuf files to wheel * Added an example to demonstrate external custom ops use-case * Added a Linux build pipeline to test external custom ops	2021-08-28 11:05:21 -07:00
Thiago Crepaldi	6f2f4721ec	Update Python setuptools classfiers to remove windows and mac (#8776 )	2021-08-20 08:53:25 -07:00
liqun Fu	1a2b41dbbc	packaging pipeline produces -cpu- named packages due to a logical error (#8665 )	2021-08-09 16:49:59 -07:00
Edward Chen	baf8c39a8d	Add Python checks pipeline (#7032 ) This change adds a new pipeline for checking Python code. Currently this pipeline only runs flake8. flake8 is also run as part of the CMake project builds, but we can switch over completely to the new pipeline later. The .flake8 config file was also updated to make it easier to run standalone (flake8 --config ./.flake8) and some Python formatting issues were addressed in files that were not previously scanned.	2021-08-09 10:37:05 -07:00
liqun Fu	419fd5cc6e	reformat build suffix so that the latest is always correct (#8267 )	2021-08-06 16:44:51 -07:00
liqun Fu	eab6c51413	to create a training cpu package for torch-ort documentation (#7845 )	2021-08-05 16:43:37 -07:00
baijumeswani	816ad86d14	Configuring ORTModule - Internal Options (#8537 )	2021-07-30 13:05:32 -07:00
Suffian Khan	e71846b029	fix ld_preload for rocm (#8290 )	2021-07-02 17:15:28 -07:00
Thiago Crepaldi	97f1eea2ea	Propagate ROCM version to onnxruntime wheel package (#8247 )	2021-06-30 13:52:22 -07:00
Thiago Crepaldi	83be3759bc	Add post-install command to build PyTorch CPP extensions from within onnxruntime package (#8027 ) ORTModule requires two PyTorch CPP extensions that are currently JIT compiled. The runtime compilation can cause issues in some environments without all build requirements or in environments with multiple instances of ORTModule running in parallel This PR creates a custom command to compile such extensions that must be manually executed before ORTModule is executed for the first time. When users try to use ORTModule before the extensions are compiled, an error with instructions are raised PyTorch CPP Extensions for ORTModule can be compiled by running: python -m onnxruntime.training.ortmodule.torch_cpp_extensions.install Full build environment is needed for this	2021-06-28 18:11:58 -07:00
liqunfu	9366114028	make pipelines to support torch1.8.1 and torch1.9.0 (#8084 )	2021-06-25 14:55:49 -07:00
Changming Sun	1fa6986656	Change how numpy version is handled. (#8130 ) Numpy has binary compatibility, which means "binaries compiled against a given version of NumPy will still run correctly with newer NumPy versions, but not with older versions." So, if an onnx runtime package was built with numpy version A, then at run time it requires numpy version >=A. In this change, we read the numpy version from the installed packages at build time, to avoid manually keeping the build time and run time versions consistent.	2021-06-23 14:08:37 -07:00
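The rule described in the commit above (built against numpy version A, so require numpy >= A at run time) can be sketched as follows. This is only an illustration under stated assumptions, not the code in onnxruntime's `setup.py`: the helper names `numpy_requirement` and `detect_numpy_requirement` are hypothetical, and the installed version is looked up with the standard-library `importlib.metadata`.

```python
from importlib import metadata


def numpy_requirement(build_version):
    # Hypothetical helper: turn the numpy version present at build time
    # into a lower-bound requirement string for the built package.
    return f"numpy>={build_version}"


def detect_numpy_requirement():
    # Read the installed numpy version at build time; fall back to an
    # unpinned requirement if numpy is not installed in the environment.
    try:
        return numpy_requirement(metadata.version("numpy"))
    except metadata.PackageNotFoundError:
        return "numpy"
```

The resulting string could then be passed to the `install_requires` list of a `setuptools.setup()` call, so the pin no longer has to be updated by hand.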
Changming Sun	96989b83ee	Create python packages for DML (#8061 )	2021-06-16 16:59:12 -07:00
Changming Sun	b854f2399d	Update manylinux build scripts and GPU CUDA version from 11.0 to 11.1 (#7632 ) 1. Update manylinux build scripts. This will add [PEP600](https://www.python.org/dev/peps/pep-0600/)(manylinux2 tags) support. numpy has adopted this new feature, we should do the same. The old build script files were copied from https://github.com/pypa/manylinux, but they have been deleted and replaced in the upstream repo. The manylinux repo doesn't have a manylinux2014 branch anymore. So I'm removing the obsolete code and syncing the files with the latest master. 2. Update GPU CUDA version from 11.0 to 11.1 (after a discussion with PMs). 3. Delete tools/ci_build/github/linux/docker/Dockerfile.manylinux2014_cuda10_2. (Merged the content to tools/ci_build/github/linux/docker/Dockerfile.manylinux2014_cuda11) 4. Modernize the cmake code of how to locate python devel files. It was suggested in https://github.com/onnx/onnx/pull/1631 . 5. Remove `onnxruntime_MSVC_STATIC_RUNTIME` and `onnxruntime_GCC_STATIC_CPP_RUNTIME` build options. Now cmake has builtin support for it. Starting from cmake 3.15, we can use the `CMAKE_MSVC_RUNTIME_LIBRARY` cmake variable to choose which MSVC runtime library we want to use. 6. Update the Ubuntu docker images used in our CI build from Ubuntu 18.04 to Ubuntu 20.04. 7. Update GCC version in CUDA 11.1 pipelines from 8.x to 9.3.1 8. Split Linux GPU CI pipeline into two jobs: build the code on a CPU machine, then run the tests on another GPU machine. In the past we didn't test our python packages. We only tested the pre-packed files. So we didn't catch the rpath issue in CI build. 9. Add a CentOS machine pool and test our Linux GPU build on real CentOS machines. 10. Rework ARM64 Linux GPU python packaging pipeline. Previously it used cross-compiling, therefore we had to statically link to the C Runtime. But now we have a pluggable EP API and it doesn't support static linking. So I changed to use qemu emulation instead. Now the build is 10x slower than before. But it is more extensible.	2021-06-02 23:36:49 -07:00
liqunfu	bed6e87cbd	add environment variable to control default training package's local version (#7849 )	2021-05-26 22:44:20 -07:00
George Wu	1c6b6f696e	fixes for cuda centos/manylinux (#7830 ) * fixes for cuda centos/manylinux * remove providers_shared.so dep processing.	2021-05-25 19:38:59 -07:00
Scott McKay	c4f515d380	- Fix training cmake file so it builds if `--cmake_extra_defines onnxruntime_BUILD_UNIT_TESTS=OFF` is specified. (#7789 ) - Fix check on cudart_versions when building on Windows to handle None being returned	2021-05-23 09:53:15 +10:00
Ryan Hill	c99aa3a3f3	Ryanunderhill/cuda shared (#7626 ) * First iteration of making cuda a shared provider. Separated out shared OpKernel change, so doing this to merge with that change. * More cuda shared library refactoring * More cuda shared library refactoring * More build options tested, converted the training ops over. * Fix merge breaks * Fix submodules * Fix submodules * Fix submodules * Fix python * Fix compile errors * Duplicate symbol fix * Test fix for ROCM provider * Another ROCM test workaround * ROCM Build Test * ROCM build fix * ROCM * ROCM * ROCM * ROCM * ROCM * ROCM test * Reduce header dependencies * Remove redundant namespace * Test fix for linux * Fix linux build * Fix Eigen build error * Fix unused parameter warning * Test link error * Another linker test * Linker test * Linker test * Another test * Another build test * Fix linux link error * Build test * Fix control flow ops to use common base class with core code * Remove extra qualifiers * Fix template syntax for linux * Fix cuda memory leak * Fix pybind * Test disabling cast * Cleanup * Restore cuda in test * Remove more header dependencies * Test not adding cuda provider to session * Make GetProviderInfo_CUDA throw * No-op cuda provider creation * Fix some setup issues * Fix memory cleanup on unload * Diagnostics * Don't unload library * Add diagnostics * Fix deleting registry at right time. * Test disabling profiler * Fix merge break * Revert profiler change * Move unloading of shared providers into Environment * Free more global allocations before library unloads * Add more diagnostics * Move unloading back to the OrtEnv as there are multiple Environments created during a session. Remove some library dependencies for tests. 
* Fix more cmake files * ERROR -> WARNING * Fix python shutdown * Test not using dml in pipeline * Change python version and disable dml * Update python version * Test adding unload method for shared providers * Disable DLL test * Python test * Revert "Python test" This reverts commit `c7ec2cfe98`. * Revert "Disable DLL test" This reverts commit `e901cb93aa`. * Revert "Test adding unload method for shared providers" This reverts commit `c427b78799`. * Point to RyanWinGPU * Revert python version * Fix id_to_allocator_map * Another python exit test * Remove extra debug messages Try a more clean python shutdown through DllMain * Revert DllMain idea, it didn't work * Merge conflicts * Fix merge with master issues. * Comments * Undo edit to file * Cleanup + new training ops * Revert yml changes * Fix another merge error * ROCM fix * ROCM fix v2 * Put back Linux hack, it is necessary * Stupid fixes * Fix submodule out of sync * ROCM fix 3 * ROCM 4 * Test java fix * Fix typos * Java test on my VM * Fix build error * Spotless fix * Leave temp file around to load properly * Fix cleanup on exit * Fix break * Java comments * Remove LongformerAttentionBase workaround * Spotless fix * Switch yml back to regular build pool * Revert "Switch yml back to regular build pool" This reverts commit `be35fc2a5a`. * Code review feedback * Fix errors due to merge * Spotless fix * Fix minimal build * Java fix for non cuda case * Java fix for CPU build * Fix Nuphar? * Fix nuphar 2 * Fix formatting * Revert "Remove LongformerAttentionBase workaround" This reverts commit `648679b370`. * Training fix * Another java fix * Formatting * Formatting * For orttraining * Last orttraining build fix... * training fixes * Fix test provider error * Missing pass command * Removed in wrong spot * Python typo * Python typos * Python crash on exit, possibly due to unloading of libraries. 
* Remove test_execution_provider from training build Only enable python atexit on windows Remove assert on provider library exit * Still can't unload providers in python, alas. * Disable Nvtx temporarily * MPI Kernels for Training * MPI Kernels part 2 * Patch through INcclService * Oops, wrong CMakeLists * Missing namespace * Fix missing () * Move INcclService::GetInstance around to link nicer * Missing } * Missing MPI libraries for Cuda * Add extra GetType functions used by MPI * Missing Nccl library * Remove LOGS statements as a test * Add in a couple more missing GetType methods * Update comments * Missed a logging reference in mpi_context.h * Convert aten_op to shared (due to marge with master) * Test moving DistributedRunContext instance into shared provider layer (with purpose error to verify it's being built properly) * Test passed, now with fix * Missing static * Oops, scope DistributedRunContext to just NCCL * Merge related issues and code review feedback. * Merge error * Bump to rel-1.9.1 (#7684) * Formatting * Code review feedback for Java build on non Windows * Remove cupti library dependency from core library * Test Java pipeline fix * Linux build fix * Revert "Linux build fix" This reverts commit `a73a811516`. * Revert "Remove cupti library dependency from core library" This reverts commit `6a889ee8bf`. * Packaging pipeline fixes to copy cuda shared provider for tensorrt & standard packages * Add cuda to Tensorrt nuget package * onnxruntime_common still has a cuda header dependency Co-authored-by: ashbhandare <ash.bhandare@gmail.com>	2021-05-20 07:53:47 -07:00
liqunfu	359fe1d197	Liqun/ort training version (#7620 )	2021-05-14 09:54:19 -07:00
Adrian Tsai	70e67ddd2b	Update DirectML version to 1.5.1 and enable ARM/ARM64 builds with DML (#7511 ) * Update DirectML to version 1.5.1 * Enable --use_dml with ARM and ARM64 * Add ARM/ARM64 binaries to nuget packages	2021-04-30 00:49:30 -07:00
Scott McKay	d6df5764d7	Android package infrastructure (#7430 ) * Include ORT format model conversion scripts and infrastructure in ORT python package. - tweak existing script setup so it can be easily run directly and from the ORT python package Add config file and readme for Android minimal build package Update ORT Mobile doco Disable warning if 'all' optimizations are enabled but NCHWc transformer is excluded (device specific optimizations don't apply in this scenario so the warning is moot). * Address PR comments	2021-04-30 14:23:54 +10:00
liqunfu	4cbd2cce9b	. (#7466 ) Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-04-27 09:20:21 -07:00
Suffian Khan	7a3c1787af	Add CI pipeline to publish Python training package targeting Rocm (#7417 ) * first attempt rocm training wheel * modifications needed to python packaging pipeline for Rocm 4.1 * changges to not conflict with cuda missed stage1 changes remove package push add option r to getopt try again without python install try again without python install try again without python install split pipelines and add back push to remote storage try on cuda gpu pool try again try again try running without az subscription set try again on original pipeline change pool passing AMD Rocm whl on AMD-GPU pool split rocm pipeline from cuda pipeline remove comments * try adding Rocm tests as well * try with tests in place * fix trailing ws * add training data * try again as root for tests * use python3 * typo * try to map video, render group into container * try again * try again * try to avoid yum error code * make UID 1001 * try without yum downgrade * define rocm_version=None * remove CUDA related comments for Rocm Dockerfile * Dont pin nightly torch torchvision torchtext versions as they expire (for now nightly is required for Rocm 4.1) * missed requirements-rocm.txt from last commit * fix whitespace	2021-04-23 17:22:31 -07:00
liqunfu	75d8319286	Liqun/ort package name2 (#7337 )	2021-04-13 20:36:24 -07:00
liqunfu	4c862c73ed	for training to use new python package naming convention to explicitl… (#7204 )	2021-04-13 16:19:42 -07:00
jeyblu	61ba9ac1bb	matmul in dnnl (#7311 ) * update dnnl to v2.2 * dnnl matmul	2021-04-12 08:03:03 -07:00
KeDengMS	6987106bf5	Add missing Python dependencies for ORT training (#7104 ) * Add missing Python dependencies for training cerberus - option parsing h5py - checkpoint onnx - model proto packaging/sympy - symbolic shape inference * Separate requirements.txt for inference and training Python packages.	2021-03-23 18:43:19 -07:00
Chi Lo	8c3b59a026	Quantization calibration refactor (#6893 ) * Code refactor * Modify code to tackle OOM when calibrating on larget dataset * Fix mismatch issue when setting keepdims on ReduceMin/ReduceMax * Add COCO val 2017 annotation * Fix mismatch issue when setting keepdims on ReduceMin/ReduceMax * Fix bug of "No module named:onnxruntime.quantization.CalTableFlatBuffers" * Check and install flatbuffers module * Add script to donwload coco dataset image and refactor example * Fix bug of "No module named:onnxruntime.quantization.CalTableFlatBuffers" * Add CalTableFaltBuffers as module * Remove annotation, user can download by themselves. * Uncommet code * Add back instances_val2017.json * Make sure flatbuffers installed when ORT is installed * Refactor code to call coco api * Enable FP16 for example	2021-03-19 01:09:11 -07:00
Faith Xu	72eb5de0e2	Add Python 3.9 to pypi metadata	2021-02-12 20:00:17 -08:00
suryasidd	1a5b75a554	[OpenVINO-EP] Remove support for OpenVINO 2020.2 (#6493 ) * Removed OpenVINO 2020.2 support * Updated documentation and build.py * Removed unnecessary libraries from setup.py	2021-01-28 23:00:41 -08:00
Faith Xu	7a0ab9c450	Update pypi package metadata (#6354 ) * Update setup file data * add missing comma * remove python 3.5 * fix typo bracket	2021-01-27 19:27:37 -08:00
Tianlei Wu	ec81e29c84	Add longformer to python package (#6314 ) * add longformer to python package * move test related script and data to a new folder	2021-01-12 10:38:39 -08:00
S. Manohar Karlapalem	40926867c3	Add OpenVINO EP shared lib to Py Wheel (#5920 ) * Add OpenVINO EP shared lib to Py Wheel Include the libonnxruntime_providers_openvino.so/.dll to the wheel * Follow libs.extend pattern as other EPs	2020-11-24 21:27:13 -08:00
S. Manohar Karlapalem	ff58f621fa	Remove nGraph Execution Provider (#5858 ) * Remove nGraph Execution Provider Pursuant to nGraph deprecation notice: https://github.com/microsoft/onnxruntime/blob/master/docs/execution_providers/nGraph-ExecutionProvider.md#deprecation-notice Deprecation Notice \| \| \| \| --- \| --- \| \| Deprecation Begins \| June 1, 2020 \| \| Removal Date \| December 1, 2020 \| Starting with the OpenVINO™ toolkit 2020.2 release, all of the features previously available through nGraph have been merged into the OpenVINO™ toolkit. As a result, all the features previously available through ONNX RT Execution Provider for nGraph have been merged with ONNX RT Execution Provider for OpenVINO™ toolkit. Therefore, ONNX RT Execution Provider for nGraph will be deprecated starting June 1, 2020 and will be completely removed on December 1, 2020. Users are recommended to migrate to the ONNX RT Execution Provider for OpenVINO™ toolkit as the unified solution for all AI inferencing on Intel® hardware. * Remove nGraph Licence info from ThirdPartyNotices.txt * Use simple Test.Run() for tests without EP exclusions To be consistent with rest of test code. * Remove nGraph EP functions from Java code	2020-11-19 16:47:55 -08:00
Tianlei Wu	c5d4ae0401	Add transformers tools to python package (#5090 ) * Add transformers to onnxruntime python package	2020-09-10 15:42:15 -07:00
Thiago Crepaldi	6594d6672f	Move onnxruntime.experiment to onnxruntime.training namespace (#5045 )	2020-09-09 09:46:06 -07:00
Cameron Maske	4553b2eecd	Expose DirectML provider to python (conflicts resolved from #3359 ) (#4630 )	2020-09-08 14:34:09 -07:00
Yufeng Li	ffc2b25a3a	Quantization tool improvement (#4933 ) Improve quantization tools: 1. Support QAT 2. Make quantization tool to register Operators. 3. Make the API clear to use Co-authored-by: t-yguo <t-yguo@microsoft.com>	2020-09-01 09:07:46 -07:00
Thiago Crepaldi	42408aa3ed	Add new PytTrch front-end (#4815 ) * Add ORTTrainerOptions class for the new pytorch frontend (#4382) Add ORTTrainerOptions class and some placeholders * Add _ORTTrainerModelDesc to perform validation for model description (#4416) * Add Loss Scaler classes to the new frontend (#4306) * Add TrainStepInfo used on the new frontend API (#4256) * Add Optimizer classes to the new frontend (#4280) * Add LRScheduler implementation (#4357) * Add basic ORTTrainer API (#4435) This PR presents the public API for ORTTrainer for the short term development. It also validates and saves input parameters, which will be used in the next stages, such as building ONNX model, post processing the model and configuring the training session * Add opset_version into ORTTrainerOptions and change type of ORTTrainer.loss_fn (#4592) * Update ModelDescription and minor fix on ORTTrainer ctor (#4605) * Update ModelDescription and minor fix on ORTTrainer/ORTTrainerOptions This PR keeps the public API intact, but changes how model description is stored on the backend Currently, users creates a dict with two lists of tuples. One list called 'inputs' and each tuple has the following format tuple(name, shape). The second list is called 'outputs' and each tuple can be either tuple(name, shape) or tuple(name, shape, is_loss). With this PR, when this dict is passed in to ORTTrainer, it is fully validated as usual. However, tuples are internally replaced by namedtuples and all output tuples will have tuple(name, shape, is_loss) format instead of is_loss being optionally present. Additionally to that normalization in the internal representation (which eases coding), two internal methods were created to replace a namedtuple(name, shape) to namedtuple(name, shape, dtype) or namedtuple(name, shape, is_loss, dtype) dependeing whether the tuple is an input or output. This is necessary as ORTTRainer finds out data types of each input/output during model export to onnx. 
Finally, a minor fix was done on ORTTrainer. It could initialize ORTTrainerOptions incorrectly when options=None * Rename input name for test * Add ONNX Model Export to New Frontend (#4612) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com> * Create training session + minor improvements (#4668) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> * Save ONNX model in file (#4671) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> * Add eval step (#4674) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> * Add train_step (#4677) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> * Add LR Scheduler (#4694) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com> * Add deterministic compute tests (#4716) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com> * Add legacy vs experimental ORTTrainer accuracy comparison (#4727) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com> * Add Mixed precision/LossScaler + several fixes (#4739) Additionally to the mixed precision/loss scaler code, this PR includes: * Fix CUDA training * Add optimization_step into TrainStepInfo class * Refactor LRSCheduler to use optimization_step instead of step * Updated several default values at ORTTrainerOptions * Add initial Gradient Accumulation supported. 
Untested * Fix ONNX model post processing * Refactor unit tests * Add ONNX BERT example + minor fixes (#4757) * Fix training issue when passing ONNX file into ORTTrainer Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com> Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> * Add Dynamic Shape support (#4758) * Update DeepSpeed Zero Stage option to a separate option group (#4772) * Add support to fetches (#4777) * Add Gradient Accumulation Steps support (#4793) * Fix Dynamic Axes feature and add unit test (#4795) * Add frozen weights test (#4807) * Move new pytorch front-end to 'experimental' namespace (#4814) * Fix build Co-authored-by: Rayan-Krishnan <rayankrishnan@live.com> Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-08-17 09:45:25 -07:00
Changming Sun	5eec4f66ed	Refactor manylinux docker image and the related pipelines (#4751 ) 1. Publish the image ACR, instead of building it every time for every PR 2. Make USE_MKLML and USE_OPENMP be able to co-exist. Currently both of them are enabled in our Linux CI build but indeed only one of them is taking effect. 3. Split nuphar and DNNL to separated pipelines. 4. Fix two warnings in onnxruntime/core/optimizer/matmul_scale_fusion.cc and onnxruntime/test/tvm/tvm_basic_test.cc. 5. Update the manylinux2010_x86_64 image to the latest.	2020-08-17 09:40:31 -07:00
George Wu	f12e9de111	build fixes for https://github.com/microsoft/onnxruntime/pull/4721 (#4784 ) * test * test * add missing CUDA header include * debug * fix * fix python package for dnnl and tensorrt. * fix * fix windows build. * revert * target_link_directories for tensorrt shared lib.	2020-08-14 06:24:44 +08:00
Ryan Hill	ac725b53f6	Convert TensorRT provider into a shared library (#4721 ) Lots of changes to shared library interfaces, new lighter weight design.	2020-08-10 21:17:16 -07:00
Yufeng Li	5dc7339be6	Add quantization tool to python package (#4458 ) * Add quantization tool to python package	2020-07-08 21:42:53 -07:00
goloskokovic	478b923e19	Expose ACL/ARMNN providers to Python (#4260 ) * expose ACL/ARMNN providers to python * add -acl / -armnn to package name when use_acl / use_armnn is specified * build python wheel for ARMNN EP * link ACL/ARMNN EPs into onnxruntime_pybind11_state * wrong argument order in build_python_wheel for wheel_name_suffix	2020-06-18 20:24:14 +05:30

207 commits