onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-21 02:18:09 +00:00

Author	SHA1	Message	Date
RandySheriffH	9345894c82	Add build option to enable cuda profiling (#9875 )	2021-11-29 22:44:50 -08:00
Scott McKay	fb4a8e12fc	Limit inclusion of Xamarin mobile target frameworks. (#9834 ) - Only set them as targets for the ORT nuget package - Use OrtPackageId as the condition for inclusion, if installed - need to do the nuget restore via msbuild so that this property is set correctly - Add desktop-only version of the C# sln as there is no way to exclude the mobile specific csproj's from an sln - use this when applicable if someone is running build.py with the `--build_nuget` flag Other - remove attempt to include symbols in the nuget package as nuget doesn't support symbols in native packages - update build.py to use `nuget` and not a windows specific path and filename for a linux build with `--build_nuget`	2021-11-23 11:29:53 +10:00
Chi Lo	7242627fec	Integrate TensorRT into GPU Python package (#9785 ) * add use_tensorrt build option * Add use_tensorrt to running tests * add use_tensorrt for Windows * make trt ep to skip backend test * make trt ep to skip backend test * Fix bug * Add/Modify description * modify for debug * swtich pool to test * modify to debug * modify to debug * add vobersity * refine the code * refine the code * refine the code * fix flake8 warning * refine the code * add pre_load check for trt as well as add cupti lib to cuda depedencies * modify script to make trt build path the same as cuda * show error message when user wants to run TensorRT but TensorRT is not installed in the env * fix bug * fix bug * add trt lib for manylinux * include cuda_dependencies for trt * rewrite the condition to throw exception * make code more compact	2021-11-18 13:26:51 -08:00
Changming Sun	76715ad525	Delete ioscross code (#9793 )	2021-11-18 11:31:13 -08:00
sfatimar	1d03baa8cc	Openvino ep 2021.4 v3.3 (#9588 ) * Added checks for Hetero/Multi Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Remote Context Plugin * changes for IO Buffer plugin * erronous couts added * erronous entry rectified * Set the Openvino OP Buffer also as output * Enable AUTO plugin in OpenVINO EP Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Remote Context Plugin * changes for IO Buffer plugin * erronous couts added * erronous entry rectified * Added checks for Hetero/Multi Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Set the Openvino OP Buffer also as output * Enable AUTO plugin in OpenVINO EP Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Please commit error message and rectification of param.context * Alignment fixed Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Changed the string to OpenVINO_GPU * hanged OpenVINO to to OpenVINO_CPU * Onnxruntime updated API for memory location * Removing Duplicate LOG Error * Tensor.h removed DeviceType function. Updated comment * API Comments updated * Removing changes to Provider Indo * Erronous commit * Removing Extra logs * Merge CMAKE * Not copy from a local location * Duplicate Entry * Remove extra line Co-authored-by: MaajidKhan <n.maajidkhan@gmail.com>	2021-11-15 13:41:12 -08:00
Tang, Cheng	99257eb8e3	support build option to include external graph transformers (#9478 ) * temp code * support external graph transformer from build script * remove debug code * add test case * support register rewrite rule * fix source_group issue if external source is not share any common prefix * fix python code style checker * resolve merge conflict Co-authored-by: Cheng Tang <chenta@microsoft.com>	2021-11-15 08:16:20 -08:00
Edward Chen	9f69d8bbae	Disable partial runtime optimization implementation by default (#9748 ) * Only serialize runtime optimization records container if non-empty. * Remove runtime optimizations from onnxruntime/core/flatbuffers/schema/README.md as it's not completely implemented yet. * Disable partial runtime optimization implementation by default.	2021-11-12 17:37:29 -08:00
Edward Chen	997266a620	Add build.py option to disable ORT format model runtime optimization (#9723 ) ORT format model runtime optimization implementation is in progress. This change adds a build.py option to disable the partial runtime optimization implementation, adds CI builds to test it, and disables runtime optimizations in mobile package builds.	2021-11-11 18:05:45 -08:00
Changming Sun	1cbbafdbe0	Change the default value of onnxruntime_DISABLE_RTTI (#9674 )	2021-11-05 15:27:04 -07:00
Abhishek Jindal	dfe4d0a330	Abjindal/eager windows ci pipeline (#9587 ) * adding eager ci pipelines files * adding import torch before onnxruntime * finding os environ path * finding os environ path corrected * print OS environ path variables * adding environ path for torch * changing python version * changing python python for torch libs * removing import torch statements * removing unncecessary torch path * removing path variable * add dll_path * test for python 3.7 * adding dll directory path for python 3.8+ * print dll directory path for python 3.8+ * adding requirements file * change requirements directory * print more * adding dll dir path * removing setup eager file * adding details for dll directory * adding details for dll directory more * adding import torch in onnxruntime init file * removing dll dir path and moving requirements file * enabling pipeline for py3.7 * remove enter * removing debug build * removing openmp * adding comments for torch dll loading and cases of failure * cleaning up the pipeline	2021-11-05 09:09:09 -07:00
Sheil Kumar	71a1a7b471	Enable building winml with --build_nuget (#9632 ) * Enable building winml with --build_nuget * Fix flake8 errors * semicolor Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2021-11-04 00:42:51 -07:00
Edward Chen	c315d1b3cd	Always enable ORT format model loading. (#9586 )	2021-11-01 10:00:08 +10:00
Guoyu Wang	fa4658e8a9	Move to XCode new build system if building on Mac using XCode (#9617 ) * Use xcode new build system * Address cr comments	2021-10-29 18:44:55 -07:00
marcusfreisleben	651955d3c9	CUDA: Enable parallel compilation (#8974 ) * Pass on parallel option to nvcc * Fixed build.py * Added missing string conversion * Addressed review points	2021-10-25 16:42:58 -07:00
Changming Sun	d83adaaf9f	Remove optional-lite (#9424 )	2021-10-22 16:45:45 -07:00
Changming Sun	406f1629c1	Remove Featurizers code (#9300 )	2021-10-20 10:20:35 -07:00
Abhishek Jindal	87e726d1a0	Abjindal/merge eager with external custom ops (#8986 ) * switching to pytorch nightly build * adding eager mode * enable pybind and remove install step * removing auditwheel repair process * installing package * adding auditwheel back * disabling auditwheel repair for eager mode * typo correction	2021-10-14 13:19:45 -07:00
Abhishek Jindal	23700a15a0	Abjindal/eager windows build (#9326 ) * removing warnings which are causing errors from torch and changing flags for Windows * adding MKL library resolution and comments * cleaning up the code * fixing onnxruntime_python file for windows build * fix the include order to aovid the python_d.lib issue on win debug build * changes for warnings, typos and other comments * merge conflict * adding fix for mkl library error * Revert "adding fix for mkl library error" This reverts commit `73b87c73c2`. * fix for dll path for windows * typo for dll path Co-authored-by: Cheng Tang <chenta@microsoft.com>	2021-10-14 12:54:49 -07:00
Moshe David	510b747821	w (#9319 ) Co-authored-by: modav <modav@microsoft.com>	2021-10-12 16:02:40 -07:00
Sunghoon	2f1204a5d5	[js/web] Enable wasm profiling and preserve function names in profiling (#9314 ) * add p50 in test * allow WebAssembly profiling and preserve function names Co-authored-by: Yulong Wang <yulongw@microsoft.com>	2021-10-11 22:04:50 -07:00
Maajid khan	72c4cea9e6	[OpenVINO-EP] V3.2 Release (#9232 ) * model caching changes for 2021.4 Signed-off-by: Your Name <you@example.com> * changed the ov version check * Minor changes added Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Added support for external data format Starting from OpenVINO 2021.4 version, OpenVINO-EP will support onnx models with Weights saved in external file location. Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Introduced Hetero/Multi options for perf_test Enabled to use HETERO/MULTI device feature from OpenVINO-EP using the onnxruntime_perf_test tool. Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * cleaned up CMake code for older OV version support OV 2020.3 is now longer supported by OpenVINO-EP. This check is not required now. Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Add option to disable graph partitioning Added a option to diable graph partitioning during build time for OpenVINO-EP. with this option, when the model is not fully supported on OpenVINO-EP, the model fully fall backs to default CPU EP (MLAS). Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Changed the flag for diabling graph partitioning Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Fixes the flake8 check error Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Added changes for disable graph partition option Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Fixed flake8 indentation error Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> Co-authored-by: Your Name <you@example.com>	2021-10-07 16:02:19 -07:00
Guoyu Wang	ddafe50199	Fix Android build break after Virtual Environment update to 20210919 (#9163 )	2021-09-23 10:07:18 -07:00
Zuwei Zhao	ff66cfdfa6	Enable linking in exception throwing support library when building onnxruntime wasm. (#8973 ) * Enable linking in exception throwing support library when building onnxruntime webassembly containing onnxruntime-extensions. * Add flag in build.py to enable linking exception throwing library. * Update onnxruntime-extensions document and bind custom_ops build flag with use_extensions. * Update doc. * Update cgmanifest.json. Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com>	2021-09-10 22:09:16 +08:00
Changming Sun	91c15843cd	Fix a directml python packaging error (#8981 )	2021-09-07 16:29:33 -07:00
Gary Miguel	47435311f4	Include pytorch_export_contrib_ops in inference builds (#8878 ) * Include pytorch_export_contrib_ops in inference builds Rename / move it from tools/python/register_custom_ops_pytorch_exporter to onnxruntime/python/tools/pytorch_export_contrib_ops. Rationale for inclusion in inference builds: This code is potentially useful for anyone using ORT, not just training. Rationale for new name: "Contrib op" is the nomenclature used within ORT to refer to the set of ops that are not in the standard op set but are included by default with ORT. This is more specific than "custom op", which is what the PyTorch exporter uses to refer to any non-standard op. Step 1 of addressing #8818. After this is merged I will update the docs. * Enable test_pytorch_export_contrib_ops.py in CI Fixes AB#1342330	2021-09-02 14:26:58 -07:00
Maajid khan	b7129305be	[OpenVINO-EP] UEP v3.1 Release with OpenVINO 2021.4 (#8892 ) * Add command to skip tests * Remove support for OV_2021.3_LTS and ov_2021.1 Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Removed request_id parameter from all references request_id parameter was being used with ov_2020.3 release. Starting from 2020.4 OV release, input_name paramater is being used instead to get the KernelContext_GetInput. Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Enabling CI Logs in the branch * CI Commits to enable logs * Enable CI Print * Added Imagescaler op to the supported op's list Fixes test_tiny_yolo_V2 opset 8 model to support fully on OV-EP. This model is the older variation of tiny_yolo_v2 model which has Imagescaler op. Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Added ops to fully support yolov3 model -Added changes to support yolov3 opset 10 model fully on CPU_FP32. -This also increases the operator coverage for GPU hardware. There by enabling yolov3 model on GPU with fewer subgraphs. Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Enabling tiny_yolov3 model fully on CPU ->Enabled tiny_yolov3 model fully on CPU. -> Also reduces the number of subgraphs to infer this model on GPU Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Adding GatherND op support for CPU and GPU ->This enables yolov3_pytorch model to work with fewer subgraphs on CPU and GPU Devices. Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Fixes Albert model for ISV customer ConvTranspose op was getting rejected due to a condition. Fixed it. Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Disabling this 4 cpp tests for openvino-ep These unit tests are failing with special conditions for conv_transpose op with output_shape attribute. so disabling them for now. Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Docker file changes for 2021.4-v3.1 * Remvoing duplicate code Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * ReduceMax No dimension supported * Fixes failing protobuf issue for docker Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Excluding openvinoep type for convtranpose test Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Disabled 2 Failing convtranspose tests with TensorRT EP Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> Co-authored-by: suryasidd <surya.siddharth.pemmaraju@intel.com> Co-authored-by: Aravind Gunda <aravindx.gunda@intel.com> Co-authored-by: sfatimar <sahar.fatima@intel/com>	2021-08-31 09:23:13 -07:00
satyajandhyala	84f9271a8d	Enable registering external custom op schemas on Linux (#8889 ) * Use manylinux instead of Ubuntu to run external custom ops build pipeline.	2021-08-30 10:13:47 -07:00
satyajandhyala	31926176ac	Support external custom operator schemas on Ubuntu (#8807 ) * Expose symbols in onnx and protobuf namespaces in python when building with --enable_external_custom_op_schemas * Add external onnx and protobuf files to wheel * Added an example to demonstrate external custom ops use-case * Added a Linux build pipeline to test external custom ops	2021-08-28 11:05:21 -07:00
Zuwei Zhao	89e8bff121	Enable selecting custom ops in onnxruntime-extensions. (#8826 ) * Enable selecting custom ops in onnxruntime-extensions. * Move cmake_helper.py. * Remove over-indented spaces. * Add doc. * Remove onnxruntime-extensions from git submodules, and user should pass path of onnxruntime-extensions for build. * Modify doc. * Remove argument --enable_onnxruntime_extensions and use --onnxruntime_extensions_path. * Fix build error. * Fix build error. * Use onnxruntime_extensions_path. * support both submodule and external source folders * refinement * Update cgmanifest.json * Support building onnxruntime-extensions from either git submodule or pre-pulled path. * Update doc. * more standard name * update docs * add the copyright header Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com> Co-authored-by: Wenbing Li <wenbingl@outlook.com> Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>	2021-08-27 21:45:52 -07:00
Guoyu Wang	6a1939252f	Fix Android java API failure (#8865 ) * Fix Android Package break * Without java fix -- pipeline should fail * With java fix, should pass now * address CR comments	2021-08-27 15:58:56 -07:00
Yulong Wang	e8564d6597	[js/web] update emsdk to v2.0.26 (#8653 ) * update emsdk to v2.0.26 * fix pooling build warning * fix build break * use pragma diagnostic semantic only when __GNUC__ is defined * fix build break * disable AttentionPastState_dynamic	2021-08-26 15:31:34 -07:00
Chandru Ramakrishnan	2693af9799	Ported changes / bug fixes from torch/ort. (#8784 ) * Ported changes / bug fixes from torch/ort. * Fixed formatting * Renamed function * Renamed module_ to module. * Revert "Renamed module_ to module." This reverts commit b17fc114b3db20d174283811d90592b5b8154c19. * Include pybind common header to fix linker errors on windows debug. * Fix to generation of > 1 custom op. Co-authored-by: Ashwin Hari <ashari@microsoft.com>	2021-08-23 17:45:40 -04:00
Rachel Guo	78759059f1	[CoreML EP]Make coreml ep build on non-macOS platform (#8677 ) * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * clean * remove unused defs * correct typo * remove onnxruntime_coreml_proto * cr comments * enablie nnapi/coreml in minimal build * enable nnapi/coreml in one build * refine dependencies * fix nnapi build failure and remove onnxruntime_coreml_proto dependencies in unit tests cmake files * small fix * fix * fix build * revert * fix build Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>	2021-08-18 09:35:32 -07:00
Wei-Sheng Chin	47b3ecb53b	Packaging pipeline now builds with PythonOp (aka running autograd.Function) (#8652 ) This PR disables unit tests in the training package pipelines for building packages with PythonOp (torch.autograd.Function).	2021-08-17 10:55:13 -07:00
Tang, Cheng	6d3c2c85ef	Integrate eager mode source code into onnxruntime repo (#8584 ) * integrate eager mode source codde; build with cmake and integrate the python test * Adding the python path for importing libraries in the Eager mode * fix clang break;check if training and python enabled * handling the linking of torch libraries across multiple platforms * merge and fix the naming * add build instruction Co-authored-by: Abhishek Jindal <abjindal@OrtTrainingDev0.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: ajindal1 <abjindal@microsoft.com>	2021-08-06 08:30:27 -07:00
Edward Chen	dda9f53bed	Build script logging updates (#8618 ) Log build.py command line arguments. Update subprocess logging to format arguments in a way that is easier to copy.	2021-08-05 09:41:17 -07:00
satyajandhyala	87975bdeef	Use CUDA_HOME and CUDNN_HOME from the environment if they are not specified on the command line. (#8575 )	2021-08-02 09:18:44 -07:00
Changming Sun	0510688411	Update compliance tasks in python packaging pipeline and fix some compile warnings (#8471 ) 1. Update SDLNativeRules from v2 to v3. The new one allows us setting excluded paths. 2. Update TSAUpload from v1 to v2. And add a config file ".gdn/.gdntsa" for it. 3. Fix some parentheses warnings 4. Update cmake to the latest. 5. Remove "--x86" build option from pipeline yaml files. Now we can auto-detect cpu architecture from python. So we don't need to ask user to specify it.	2021-07-30 17:16:37 -07:00
Guoyu Wang	464fd28ee9	Update iOS packaging script to default build static framework, disable bitcode (#8533 ) * default package build to static, disable bitcode * fix pipeline failure * Address CR comments	2021-07-29 17:28:02 -07:00
Ye Wang	ad093b94b9	Restore transformers tests and disable some tests (#8530 ) * restore transformers tests and disable some tests * test * update * pass pep8 check * update	2021-07-29 14:09:36 -07:00
KeDengMS	d243b38929	[Symbolic Shape Infer] Bump up required onnx version and remove some stale comments in build.py	2021-07-29 09:36:20 -07:00
Dmitri Smirnov	950fe5e28b	Implement SparseTensor and infrastructure suppport and advance ONNX commit (#8038 ) SparseTensor support Implement Builder pattern Fix support for 1-D and 2-D COO indices Implement and test CSR support. Handle shape inference for SparseTensors Implement conversion for COO, CSR and tests. Address the case where constant sparse initializer is the output. Implement test infra for SparseTensors Implement SparseDenseMatMul for Csr and COO and tested it. Add hash for SparseToDenseMatMul Finish shared provider refactor Refactor GetOrCreate to Create Working on py interface Expose OrtDevice and use it in allocate_numpy Adjust Sparse interfaces, add support for string SparseTensor. Add tests. Add and test to_cuda() Add accessors to format specific indices Test values and indices views, read-only flag, after GC access Add sparse related methods to OrtValue Re-work SparseTensor wrapper, add OrtValue methods Rework numpy_array_to_cuda/to_cpu Add run_with_ort_values Add models and test sparse_mat_mul with run_with_ort_values Refactor sparse tensor to use a single buffer Ifdef x86 Eigen CSR sparse matmul implementation Exclude broken test, check for string type when copying cross device Split pybind schema, regenerate docs, add exclusion Conditionally exclude schema module Update docs fix cuda build Add test to a filter and renerate JS docs Add conversion and test string support for sparse tensors Exclude conversion utils from minimal build Add CUDA Memcpy and adjust provider interfaces	2021-07-22 15:24:36 -07:00
Guoyu Wang	c5038063ed	Add iOS/macOS static framework (#8357 ) * Add ability to generate ios static framework * Fix typos * Add pod cache clean, update some comments of previous commit * Fix CI failure with newly added cpuinfo library * Update test model (CoreML requires node has a name) * Addressed CR comments	2021-07-14 16:39:17 -07:00
Zuwei Zhao	b46310b349	Integrate onnxruntime-extensions into onnxruntime. (#8143 ) Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com>	2021-07-01 09:34:03 -07:00
Changming Sun	c716b56f26	Update C++ Standard from 14 to 17 (#8041 ) Switched the code to C++17. To build ONNX Runtime on old distros like CentOS 7, you need to install a newer GCC from additional repos. If you build onnxruntime with the newer GCC, the resulting binary typically can't be distributed elsewhere because it depends on the new GCC's runtime libraries, which the stock OS doesn't have. On RHEL/CentOS the situation is better: we use Red Hat devtoolset 8/9/10 on CentOS 7 to build our code. The new library features (like std::filesystem) that do not exist in the old C++ runtime are statically linked into the applications, with some restrictions: 1. GCC has a dual ABI, but we can only use the old one. This means std::string is still copy-on-write and std::list::size() is still O(n). Also, if you build onnxruntime on CentOS 7 and link it with binaries that were built on CentOS 8 or Ubuntu with the new ABI and that export C++ symbols directly (instead of using a C API), then it won't work. 2. We still can't use std::optional. This limitation comes from macOS. We will solve it when we get macOS 11 build machines. It won't be too long. 3. Please avoid using C++17 in CUDA files (.cu) and in the .h files that they include (like core/framework/float16.h), because CUDA 10.2 doesn't support C++17. You are welcome to use the new features in any *.cc files.	2021-06-25 14:08:01 -07:00
Changming Sun	96989b83ee	Create python packages for DML (#8061 )	2021-06-16 16:59:12 -07:00
Ye Wang	e6225c62a5	transformers test CI pipeline fix (#8016 ) * init checkin * Restore initial environment * -y * testtest * fix * fix indent	2021-06-11 12:57:52 -07:00
Ye Wang	d433aa2459	Add transformers tool test to pipeline (#7959 ) * checkin transformers pipeline * add docker requirements * only trigger linux cpu * temp remove tf instalation due to numpy version conflicts * test numpy>=1.7 * revert numpy and disable transformers * add coloredlogs * enable shape_infer_helper and install transformers when needed * pip3? * testtest * enable more tets * line too long * remove pytorch1.4 test and added back some onnx files * add tests * copy dir * disable 2 teests * trim lines * add missing onnx * fix type * fix version conflicts * install psutil * change file path * mfix path * remove cached files * add back attention fusion test * labeled the shape infer test as slow * fix * enable tf2onnx test and enable pytest * refactor path * fix typo * add cwd	2021-06-08 19:43:59 -07:00
Yulong Wang	9b5f749176	[wasm] emsdk: allow to install emscripten only (#7961 )	2021-06-07 09:45:02 -07:00
Changming Sun	b856e7ae3c	Update build.py: change default cmake generator for Windows to VS2019 (#7945 ) VS 2019 is well tested. VS 2017 is not. We should make "Visual Studio 16 2019" the default to avoid confusing people.	2021-06-04 15:56:53 -07:00
Changming Sun	5a7f65b831	Fix training e2e pipeline (#7942 ) 1. Fix training e2e pipeline. The failure was caused by my recent change #7632. The fix is adding "--cmake_extra_defines CMAKE_CUDA_ARCHITECTURES=70" to the build parameters because the machines have V100 GPUs. 2. Simplify Nuphar pipeline. It doesn't need to install a separate ONNX version (1.5.0). 3. Fix a problem where run_dockerbuild.sh ignored the OS version parameter. Now that it takes effect, I also set the Python version to the system default (3.8 for Ubuntu 20.04).	2021-06-04 09:37:09 -07:00
Yulong Wang	0723d16436	[wasm] allows to specify MALLOC setting for wasm build (#7934 )	2021-06-03 23:08:56 -07:00
Changming Sun	b854f2399d	Update manylinux build scripts and GPU CUDA version from 11.0 to 11.1 (#7632 ) 1. Update manylinux build scripts. This will add [PEP600](https://www.python.org/dev/peps/pep-0600/)(manylinux2 tags) support. numpy has adopted this new feature, we should do the same. The old build script files were copied from https://github.com/pypa/manylinux, but they have been deleted and replaced in the upstream repo. The manylinux repo doesn't have a manylinux2014 branch anymore, so I'm removing the obsolete code and syncing the files with the latest master. 2. Update GPU CUDA version from 11.0 to 11.1 (after a discussion with PMs). 3. Delete tools/ci_build/github/linux/docker/Dockerfile.manylinux2014_cuda10_2. (Merged the content to tools/ci_build/github/linux/docker/Dockerfile.manylinux2014_cuda11) 4. Modernize the cmake code of how to locate python devel files. It was suggested in https://github.com/onnx/onnx/pull/1631 . 5. Remove `onnxruntime_MSVC_STATIC_RUNTIME` and `onnxruntime_GCC_STATIC_CPP_RUNTIME` build options. Now cmake has builtin support for it. Starting from cmake 3.15, we can use `CMAKE_MSVC_RUNTIME_LIBRARY` cmake variable to choose which MSVC runtime library we want to use. 6. Update Ubuntu docker images used in our CI build from Ubuntu 18.04 to Ubuntu 20.04. 7. Update GCC version in CUDA 11.1 pipelines from 8.x to 9.3.1. 8. Split Linux GPU CI pipeline into two jobs: build the code on a CPU machine, then run the tests on another GPU machine. In the past we didn't test our python packages. We only tested the pre-packed files. So we didn't catch the rpath issue in CI build. 9. Add a CentOS machine pool and test our Linux GPU build on real CentOS machines. 10. Rework ARM64 Linux GPU python packaging pipeline. Previously it used cross-compiling, therefore we had to statically link to the C Runtime. But now we have the pluggable EP API and it doesn't support static linking. So I changed to use qemu emulation instead. Now the build is 10x slower than before. But it is more extensible.	2021-06-02 23:36:49 -07:00
Yulong Wang	faae347d9f	[wasm] upgrade emsdk version to 2.0.23 (#7893 ) * upgrade emsdk version to 2.0.23 * fix build * override gmock build options	2021-06-02 12:26:24 -07:00
Scott McKay	0fbec1b9c1	Update the operator documentation generation (#7787 ) * Update the operator documentation generation - Make layout a little nicer - Update to latest supported operators including training - Fix some links that are broken when the docs content is copied to github-pages - Fix incorrect usage of 'onnx.ai.ml' as the default domain - ML ops are now separated from the real default domain of 'onnx.ai' - Include CPU, CUDA and training kernels - exclude DNNL as it's not an EP we own * There are separate paths for CUDA and CUDNN as they are not guaranteed to be in the same location on a Windows machine. Use the CUDNN path when looking for the CUDNN library. * Enable validation of both contrib ops and operator kernels in build Filter generation so it's deterministic Add ability for CI to publish the md files as build artifacts if they differ so a developer can download and add to their PR to resolve any diffs. Remove workarounds for github-pages as that will now link to the github docs which display correctly	2021-06-02 17:47:40 +10:00
Guoyu Wang	e7e200ee59	Add test for iOS package (#7816 ) * Add test for iOS package * Add readme * fix pep8 warning * Addressed CR comments, fixed CI failure * Address CR comments * Update readme.md * Update package name and readme, added comments to the podspec	2021-06-01 11:01:37 -07:00
Gao, Chun	4dd724ef1a	Enable WebAssembly SIMD build (#7839 ) Add a build switch "--enable_wasm_simd" to enable WebAssembly SIMD build	2021-05-28 16:29:58 -07:00
liqunfu	bed6e87cbd	add environment variable to control default training package's local version (#7849 )	2021-05-26 22:44:20 -07:00
Yulong Wang	7c4a5faef5	[wasm] enable DWARF format debug info for ORT WASM (#7777 ) * [wasm] enable DWARF format debug info for ORT WASM * resolve comments	2021-05-21 01:32:00 -07:00
Taewoo Kim	1e6ad669cf	Support arm64e for osx. Add arm64e to choices variable	2021-05-18 14:58:58 -07:00
Changming Sun	26a472c948	Increase test timeout from 1 hour to 2 hours (#7735 ) I saw a test timeout in our nodejs packaging pipeline. I'm not sure if it is because it ran slower than before or it's a deadlock issue. Increasing the timeout will be helpful for investigating such issues.	2021-05-18 10:51:58 -07:00
liqunfu	359fe1d197	Liqun/ort training version (#7620 )	2021-05-14 09:54:19 -07:00
Guoyu Wang	a47a234b7e	Add minsdkver for AAR and AndroidTest (#7669 )	2021-05-12 16:01:25 -07:00
Guoyu Wang	69d1db83ac	Enable bitcode for iOS by default (#7640 )	2021-05-10 21:27:45 -07:00
Rachel Guo	d8cf960412	Add android test app to validate Java API for ORT-Mobile Android (#7477 ) * test * [gwang] make cmake compile work * [gwang] enble build apks * some build update * add simple sigmoid test android project and cmake * add build.py * refine and remove unused import lib * address CR comments * remove unnecessary files * add README.md * minor update * remove * minor change * fix ci failure and minor update * fix typo in project folder * remove * remove and minor update * refine * minor fix * fix * fix typo * add gradle spotlessApply task to fix CI failure * fix * enable spotlessApply in build gradle * revert some changes * minor fix * run spotless apply for format * address CR comments and fix CI version and format * refine * Refine * address comments * refine * refine * modify * reformat * resolve version conflicts * minor update * minor update * address comments * minor update Co-authored-by: Guoyu Wang <wanggy@outlook.com>	2021-05-04 15:39:14 -07:00
Tang, Cheng	54db6648af	kerne invoker api for eager mode (#7473 ) * initial draft for kernel invoke api * initial implementation of kernel invoker * [eager] fix build on Mac * [eager] increment input name in kernel invoker * temp fix for type in eager mode * use global default log manager * rollback the previous commit since it break linux build * Revert "rollback the previous commit since it break linux build" This reverts commit `58c2c3423a`. * Eager Mode: fix linking on macOS * optimizer_execution_frame: ignore unused lambda capture (model_path) * fix link issue * ORTInvoker: set correct input argument tensor element proto types Do not set a type proto on output arguments to allow ORT to deduce them * ORTInvoker: create only one logging manager * Minor fix to set execution provider type correctly. (#7000) Co-authored-by: Chandru Ramakrishnan <chandru-r@github.com> * training fix * support config output ml values in frame, so we can use it to implement inplace update * Fix range loop error while building. (#7087) Co-authored-by: Chandru Ramakrishnan <chandru-r@github.com> * Conditionally link with nsync_cpp if not windows. (#7151) Co-authored-by: Chandru Ramakrishnan <chandru-r@github.com> * Fixed initialization order in ORT kernel invoker (#7342) * Updated constructor of ort_kernel_invoker to take a logger. * Changed linking order. * Updated test. * add inplace ut * add build option * Update include/onnxruntime/core/eager/ort_kernel_invoker.h Co-authored-by: Derek Murray <Derek.Murray@microsoft.com> * resolve comments in pr * fix build break;merge from master * fix build break Co-authored-by: Cheng Tang <chenta@microsoft.com> Co-authored-by: Aaron Bockover <abock@microsoft.com> Co-authored-by: Chandru Ramakrishnan <41447659+chandru-r@users.noreply.github.com> Co-authored-by: Chandru Ramakrishnan <chandru-r@github.com> Co-authored-by: Derek Murray <Derek.Murray@microsoft.com>	2021-04-30 13:33:58 -07:00
Yulong Wang	00aaa6dabb	update CI for onnxruntime-web (#7497 )	2021-04-29 22:22:52 -07:00
Edward Chen	d21304ceb0	Initial Objective-C API (#7366 ) Initial implementation of an Objective-C API.	2021-04-27 10:06:30 -07:00
Suffian Khan	7a3c1787af	Add CI pipeline to publish Python training package targeting Rocm (#7417 ) * first attempt rocm training wheel * modifications needed to python packaging pipeline for Rocm 4.1 * changes to not conflict with cuda missed stage1 changes remove package push add option r to getopt try again without python install try again without python install try again without python install split pipelines and add back push to remote storage try on cuda gpu pool try again try again try running without az subscription set try again on original pipeline change pool passing AMD Rocm whl on AMD-GPU pool split rocm pipeline from cuda pipeline remove comments * try adding Rocm tests as well * try with tests in place * fix trailing ws * add training data * try again as root for tests * use python3 * typo * try to map video, render group into container * try again * try again * try to avoid yum error code * make UID 1001 * try without yum downgrade * define rocm_version=None * remove CUDA related comments for Rocm Dockerfile * Dont pin nightly torch torchvision torchtext versions as they expire (for now nightly is required for Rocm 4.1) * missed requirements-rocm.txt from last commit * fix whitespace	2021-04-23 17:22:31 -07:00
Guoyu Wang	d414039189	Add ios coreml ci, and speedup ios ci run (#7420 )	2021-04-22 23:41:58 -07:00
Yulong Wang	b56dd037d3	increase timeout for nodejs binding test (#7422 ) Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-04-22 21:40:40 -07:00
Changming Sun	afa7b23609	Update docs/ContribOperators.md and the script that generates it. (#7399 )	2021-04-21 16:20:56 -07:00
Yulong Wang	009f342caf	[JS] refactor Javascript/Typescript libraries in ONNX Runtime (#7308 ) * working on re-organizing js code for ortweb * remove dup files * move folder * fix common references * fix common es5 * add webpack to common * split interface/impl * use cjs for node * add npmignore for common * update sourcemap config for common * update node * adjust folder/path in CI and build * update folder * nit: readme * add bundle for dev * correct nodejs paths * enable ORT_API_MANUAL_INIT * set name for umd library * correct name for commonjs export * add priority into registerBackend() * fix npm ci pwd * update eslintrc * revise code * revert package-lock lockfileVersion 2->1 * update prebuild * resolve comments * update document * revise eslint config * update eslint for typescript rules * revert changes by mistake in backend.ts * add env * resolve comments	2021-04-16 01:33:10 -07:00
Sunghoon	ded2b08380	WebAssembly multi-threads support. (#7326 ) * WebAssembly multi-threads support. * PROXY_TO_PTHREAD is not required for wasm library * Remove an unnecessary line commented out	2021-04-15 21:46:11 -07:00
Guoyu Wang	28e229ac4c	Enable build dynamic framework for macOS/iOS (#7343 ) * Enable build dynamic framework for macOS/iOS * Address CR comments	2021-04-15 16:47:53 -07:00
liqunfu	4c862c73ed	for training to use new python package naming convention to explicitl… (#7204 )	2021-04-13 16:19:42 -07:00
Yulong Wang	405ca49012	build ONNXRuntime into WebAssembly (#6478 ) * Simplified version of WebAssembly support to keep most of existing data structures and add cmake using Ninja and emcmake * Clean up CMakeLists.txt and add an example to create and compute a kernel * Load a model from bytes and remove graph building steps * Add all cpu and contrib ops with mlas library * WebAssembly build with Onnxruntime C/CXX API * Use protobuf cmakefile directory instead of adding every necessary source file * Fix invalid output at example * add missing files * Change an example to use Teams model and support ort mobile format * add API for javascript * fix input releasing in _ort_run() * update API * Let onnxruntime cmake build WebAssembly with option '--wasm' * allow one-step building for wasm * Make build script working on Linux and MacOS * Fix broken build from Windows command * Enable unit test on building WebAssembly * Resolve comments * update build flags * wasm conv improvement from: 1) GemmV; 2) Depthwise direct convolution 3x3; 3) Direct convolution 3x3 * Cleaned mlas unittest. * use glob * update comments * Update baseline due to loss scale fix (#6948) * fix stream sync issue (#6954) * Enable type reduction in EyeLike, Mod, random.cc CPU kernels. (#6960) * Update EyeLike CPU kernel. * Update Mod CPU kernel. * Update Multinomial CPU kernel. * Slight improvement to Pad CPU kernel binary size. * Update RandomNormal[Like], RandomUniform[Like] CPU kernels. * Fix warning from setting multiple MSVC warning level options. (#6917) Fix warning from setting multiple MSVC warning level options. Replace an existing /Wn flag instead of always appending a new one. * MLAS: quantized GEMM update (#6916) Various updates to the int8_t GEMMs: 1) Add ARM64 udot kernel to take advantage of dot product instructions available in newer cores. Some models run 4x faster than the stock implementation we used before. 
2) Refactor the x64 kernels to share common code for AVX2(u8u8/u8s8/avxvnni) vs AVX512(u8u8/u8s8/avx512vnni) to reduce binary size. 3) Extend kernels to support per-column zero points for matrix B. This is not currently wired to an operator. * Implement QLinearAveragePool with unit tests. (#6896) Implement QLinearAveragePool with unit tests. * Attention fusion detect num_heads and hidden_size automatically (#6920) * fixed type to experimental session constructor (#6950) * fixed type to experimental session constructor Co-authored-by: David Medine <david.medine@brainproducts.com> * Update onnxruntime_perf_test.exe to accept free dimension overrides (#6962) Co-authored-by: Ori Levari <orlevari@microsoft.com> * Fix possible fd leak in NNAPI (#6966) * Release buffers for prepacked tensors (#6820) Unsolved problems: 1. One test failure was caused by a bug in Cudnn rnn kernels, when they can allocate a buffer and partially initialize it, the garbage data near tail of the buffer caused problem in some of the hardware. To attack this problem in a broader sense, should we add code in our allocators, and during a memory fuzzing test, fill an allocated buffer with garbage before returning to the caller? 2. Prepacking is used more widely than we know. For instance, Cudnn rnn kernels also cache their weights. They mix several weight tensors together into a single buffer, and never touch the original weight tensor anymore. This is the same idea with pre-pack, but they didn't override the virtual function, and they never tried to release those weight tensors, leading to memory waste. It also seems to me that there are some other kernels have similar behavior. Wonder how much memory we can save if we try to cleanup those too. 3. Turning off memory pattern planning does increase memory fragmentation, leading to out of memory error in some training test cases. 
Perhaps we can revisit the idea of pushing kernels-creation stage earlier, and then during initializer deserialization, we only avoid tracing those that will be prepacked. * Enable type reduction for Range, ReverseSequence, ScatterND, Split, and Unique CPU kernels. (#6963) * add CI * fix test in ci * fix flags for nsync in wasm build * add copyright banner * fix wasm source glob * add missing exports * resolve comments * Perf gain by make packb wide to 4 from 16 on GEMM for WASM. Remove no need direct conv in previous perf tuning. * fix buildbreak introduced from latest master merge * fix buildbreak in mlasi.h * resolve all comments except MLAS * rewrite packb related 3 functions for WASM_SCALAR separately rather than using #ifdef in each. and other changes according to PR feedback in mlas. * More complete scalar path in sgemm from Tracy. * Fix edge case handling in depthwise conv2d kernel 3x3. where: ) support input W==1 and H==1 ) recalc in accurate pad_right and pad_bottom ) support hidden pad_right == 2 or pad_bottom == 2 when W == 1 or H==1 and no pad left/top Add more test coverage for conv depthwise from Tracy. Fix one typo according to PR. 
* resolve comments * replace typedef by using * do not use throw in OrtRun() * output error message Co-authored-by: Sunghoon <35605090+hanbitmyths@users.noreply.github.com> Co-authored-by: Lei Zhang <zhang.huanning@hotmail.com> Co-authored-by: Wei-Sheng Chin <wschin@outlook.com> Co-authored-by: Tianlei Wu <tlwu@microsoft.com> Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com> Co-authored-by: Tracy Sharpe <42477615+tracysh@users.noreply.github.com> Co-authored-by: David Medine <david.eric.medine@gmail.com> Co-authored-by: David Medine <david.medine@brainproducts.com> Co-authored-by: Ori Levari <ori.levari@microsoft.com> Co-authored-by: Ori Levari <orlevari@microsoft.com> Co-authored-by: Guoyu Wang <62914304+gwang-msft@users.noreply.github.com> Co-authored-by: Chen Fu <chenfucs@gmail.com>	2021-04-06 16:18:10 -07:00
Changming Sun	2fcd69d644	Cleanup build.py (#7245 )	2021-04-05 18:49:29 -07:00
Changming Sun	5bd192c439	Update ContribOperators.md (#7246 )	2021-04-05 17:11:33 -07:00
Guoyu Wang	d500c5952b	Add Android AAR packaging script for ORT-Mobile (#7138 ) * Add Android aar packaging script for ORT-Mobile * Address CR comments	2021-03-30 18:42:18 -07:00
Ben Niu	d1acdd4f4b	Support building ARM64EC onnxruntime.dll (#6999 )	2021-03-29 15:35:30 -07:00
Yufeng Li	c965878a69	fix a bug in global average pool and add unit test (#6913 ) * fix bug in QGlobalAveragePool * add unit test for quant GlobalAveragePool * not run quantization tests if disable_contrib_ops enabled	2021-03-22 20:01:27 -07:00
Thiago Crepaldi	867804bea1	Add auto doc gen for ORTModule API during CI build (#7046 ) In addition to ORTModule auto documentation during packaging, this PR also update golden numbers to fix CI	2021-03-22 10:20:33 -07:00
Tianlei Wu	8a6f6bc38b	add --enable_cuda_line_info to build.py (#6773 )	2021-02-22 22:00:21 -08:00
Edward Chen	ee35be0129	Support specifying globally allowed types from build script (#6677 ) Add initial support for constraining operator kernel implementations (which support this type-granularity) to a set of allowed types from scripts.	2021-02-22 14:05:00 -08:00
Ivan Stojiljkovic	c91f314217	Add robust dependency check for Python package (#6436 ) * Add robust dependency check for Python package * Add version_info.py to .gitignore * Fix Linux build * Fix Windows CPU build * Fix Windows 32-bit build * Minor tweak * Generate version_info.py earlier in onnxruntime_python.cmake * Print a user-friendly message if cuDNN is not found in * Relax version requirements for CUDA 11 - only the major version has to match * Fix PATH environment variable to include CUDA 11 in 'Python packaging pipeline' (Windows/GPU) * Fix the build with cuDNN 7	2021-02-21 15:11:28 -08:00
liqunfu	2c5e603bad	Liqun/nuphar nuget (#6656 ) create nuphar nuget with correct name	2021-02-17 16:13:07 -08:00
Scott McKay	33279250b5	Update a couple of usages of args.minimal_build to check for not specified vs empty list correctly. (#6688 )	2021-02-16 14:46:51 +10:00
Scott McKay	25f7c93504	Require explicit inclusion of custom op support in a minimal build (#6663 ) * Remove support from custom ops from the base minimal build as they contribute too much binary growth to an Android build. Add ability to explicitly enable custom op support in a minimal build. Change one minimal build CI to test adding custom op support (unit tests are run in that build to validate)	2021-02-13 12:42:33 +10:00
Sheil Kumar	87cb6fd495	Add LearningModelBuilder to WinML Experimental Namespace along with various Audio operators (#6623 ) * model building * fix build * winml adapter model building api * model building * make build * make build again * add model building with audio op * inplace and inorder fft * add ifft * works! * cleanup * add comments * switch to iterative rather than recursive and use parallelization * batched parallelization * fft->dft * cleanup * window functions * add melweightmatrix op * updates to make spectrogram test work * push latest * add onesided * cleanup * Clean up building apis and fix mel * cleanup * cleanup * naive stft * fix test output * middle c complete * 3 tones * cleanup * signal def new line * Add save functionality * Perf improvements, 10x improvement * cleanup * use bitreverse lookup table for performance * implement constant initializers for tensors * small changes * add matmul tests * merge issues * support add attribute * add tests for double data type windowfunctions and minor cleanup * stft onesided/and not tests * cleanup * cleanup * clean up * cleanup * remove threading attribute * forward declare orttypeinfo * warnings * fwd declare * fix warnings * 1 more warning * remove saving to e drive... * cleanup and fix stft test * add opset picker * small additions * add onnxruntime tests * add signed/unsigned * fix warning * fix warning * finish onnxruntime tests * make windows namespace build succeed * add experimental flag * add experimental api into nuget package * add experimental api build flag and add to windows ai nuget package * turn experimental for tests * add minimum opset version to new experimental domain * api cleanup * disable ms experimental ops test when --ms_experimental is not enabled * add macro behind flag * remove unused x * pr feedback Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2021-02-12 14:17:10 -08:00
Scott McKay	13d7db9a98	Don't update the excluded ops/types unless args.update is true. Updating the exclusion info triggers rebuilding of all kernels using type reduction. (#6604 )	2021-02-09 07:15:31 +10:00
Edward Chen	2ef792ae6e	Don't resolve symlink in resolve_executable_path(). (#6540 )	2021-02-04 12:32:03 -08:00
Cian Hayes	6fc5237d9e	Introduce --enable_training_ops build flag (#6523 ) * minimal_build with training ops * Removing redundant comment from an earlier attempt at a fix * Fixing a bad merge conflict resolution * Responding to PR feedback * tweaking the makefiles based on feedback * combining two enable_training blocks in CMakeLists.txt	2021-02-01 21:54:16 -08:00
suryasidd	1a5b75a554	[OpenVINO-EP] Remove support for OpenVINO 2020.2 (#6493 ) * Removed OpenVINO 2020.2 support * Updated documentation and build.py * Removed unnecessary libraries from setup.py	2021-01-28 23:00:41 -08:00
liqunfu	00afd00059	merge e2e with distributed pipeline (#6443 ) merge e2e with distributed pipeline	2021-01-28 14:17:47 -08:00
Scott McKay	c84bb9df9f	Add ability to track per operator types in reduced build config. (#6428 ) * Add ability to generate configuration that includes required types for individual operators, to allow build size reduction based on that. - Add python bindings for ORT format models - Add script to update bindings and help info - Add parsing of ORT format models - Add ability to enable type reduction to config generation - Update build.py to only allow operator/type reduction via config - simpler to require config to be generated first - can't mix a type aware (ORT format model only) and non-type aware config as that may result in insufficient types being enabled - Add script to create reduced build config - Update CIs	2021-01-29 07:59:51 +10:00
Guoyu Wang	c05adb1147	Initial version of CoreML EP (#6392 )	2021-01-27 10:43:17 -08:00
liqunfu	6ed12402a4	Liqun/liqun/enable pipeline parallel test2 (#6399 ) * enable data and pipeline parallelism test Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-01-25 15:15:26 -08:00
Yufeng Li	c20965f9b2	enable pipeline to run quantization tests (#6416 ) * enable pipeline to run quantization tests setup test pipeline for quantization	2021-01-25 09:33:08 -08:00
wezuo	5b6753ce27	Wezuo/memory analysis (#5658 ) * merged alloc_plan * pass compilation * Start running, incorrect allocation memory info * add in comments * fix a bug of recording pattern too early. * debugging lifetime * fix lifetime * passed mnist * in process of visualization * Add code to generate chrome trace for allocations. * in process of collecting fragmentation * before rebuild * passed mnist * passed bert tiny * fix the inplace reuse * fix the exception of weight in pinned memory * add guards to ensure the tensor is in AllocPlan * add customized profiling * debugging * debugging * fix the reuse of different location type * add rank * add the rank * add fragmentation * add time_step_trace * Add summary for each execution step (total bytes, used/free bytes). * add top k * change type of top k parameter * remove prints * change heap to set{ * add the name pattern * add the usage for pattern * add partition * change to static class * add custom group * remove const * update memory_info * in process of adding it as runtime config * change the memory profiling to be an argument * add some comments * add checks to record memory_info in training session * set the "local rank setting" to correct argument. 
* addressing comments * format adjustment * formatting * remove alloc_interval * update memory_info.cc to skip session when there is no tensor for a particular memory type * fix memory_info multiple iteration seg-fault * consolidate mainz changes * fixed some minor errors * guard by ORT_MINIMAL_BUILD * add ORT_MEMORY_PROFILE flag * added compiler flag to turn on/off memory profiling related code * clean up the code regarding comments * add comments * revoke the onnx version * clean up the code to match master * clean up the code to match master * clean up the code to match master Co-authored-by: Jesse Benson <benson.jesse@gmail.com> Co-authored-by: Wei Zuo <wezuo@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: wezuo <wezuo@az-eus-v100-32gb-5-worker-mgtbby.eastus.cloudapp.azure.com> Co-authored-by: wezuo <wezuo@az-eus-v100-32gb-5-worker-yclzsf.eastus.cloudapp.azure.com>	2021-01-19 08:30:55 -08:00
Scott McKay	e54e2f969d	Use readelf for minimal build binary size checks. (#6338 ) * Use readelf for minimal build binary size checks. The on-disk size grows in 4KB chunks which makes it hard to see how much growth an individual checkin causes. Only downside is that the sum of the sections is larger than the on-disk size (presumably things get packed smaller on disk and some of the section alignment constraints can be ignored) * Remove unused function	2021-01-15 07:46:02 +10:00
Edward Chen	042053c55e	Add support for running Android emulator from build.py on Windows. (#6317 )	2021-01-13 19:21:49 -08:00
Alberto Magni	5623cc6d17	Use onnxruntime_USE_FULL_PROTOBUF=OFF for the cuda execution provider (#6340 ) This removes a special case of the cuda EP.	2021-01-13 18:27:13 +00:00
Changming Sun	5084ce0969	Update nuget build (#6297 ) 1. Update the ProtoSrc path. The old one is not used anymore. 2. Regenerate OnnxMl.cs 3. Delete some unused code in tools/ci_build/build.py 4. Avoid set intra_op_param.thread_pool_size in ModelTests in OpenMP build. 5. Fix a typo in the C API pipeline.	2021-01-11 10:49:05 -08:00
William Tambellini	39a988ce1c	Upgrade build.py to assert for python 3.6+ Upgrade build.py to assert for python 3.6+, as python 3.5 can no longer build today's master.	2020-12-30 20:17:09 -08:00
Changming Sun	1b23b28706	Remove MKLML/openblas/jemalloc build config (#6212 )	2020-12-30 17:18:19 -08:00
Michael Goin	bbb6b416f0	Fix ImportError in build.py (#6231 ) There is a possible ImportError where build.py can import the wrong 'util' package if there are others present in `sys.path` already	2020-12-30 14:22:55 -08:00
Tixxx	32c67c2944	Deprecating Horovod and refactored Adasum computations (#5468 ) deprecated horovod submodule refactored adasum logic to be ort-native added tests for native kernel and e2e tests	2020-12-17 16:21:33 -08:00
Edward Chen	64709b1335	Deprecate Python global configuration functions [Part 1] (#5923 ) Enable options to be set via execution provider (EP)-specific options and log deprecation warning from current global configuration functions.	2020-12-15 11:32:43 -08:00
baijumeswani	dd2e5a1a05	state_dict and load_state_dict for ORTTrainer (#6095 ) * add functions state_dict and load_state_dict to ORTTrainer * unit tests for state_dict and load_state_dict for ORTTrainer	2020-12-14 11:55:52 -08:00
baijumeswani	523d187193	save data to and load data from an hdf5 file for checkpointing (#5975 ) * save python dictionary to hdf5 representation and load an hdf5 file into a python dictionary * unit tests for saving data to and loading data from hdf5 file	2020-12-08 11:40:57 -08:00
satyajandhyala	f68a256140	Android code coverage (#6061 ) * Added Onnxruntime_GCOV_COVERAGE flag for Android. * Set CMAKE_SYSTEM_NAME explicitly for Android. * Added GCOV_PREFIX option to collect code coverage data. Added a new python script to generate code coverage info. Modified build pipeline to generate Android code coverage info * Added build command line option --android_coverage * Added a comment describing the GCOV environment variables * Fixed PEP8 issues. * Added --android_coverage option to the build command. * Increased Android emulator memory from 3K to 8K. * Increased Android partition-size from 2GB to 4GB to overcome no-space-left-on-device error * Removed source_dir from command line args. * Use cwd absolute path to run tests. * Added commands to output the contents of /data/local/tmp on the emulator. * Added run_adb_shell function. * Format changes. * Removed keywd argument cwd. * Removed Android in the --build_dir path. * Removed commands added for debugging. * Removed extra new-lines. * Fix MacOs build pipeline failures by uninstalling openssl before running build script. * Revert "Fix MacOs build pipeline failures by uninstalling openssl before running build script." This reverts commit 90d0568fe533e9456c20d061a2d435c8fea48266. * Change dir to the build directory where the tar file is copied. * Changed the option from --android_coverage to --code_coverage * Moved steps to generate Android code coverage to run_nnap_code_coverage.sh * Require --android option if --code_coverage is specified. * No code coverage needed for onnx_test_runner. * Expect that the emulator is running when the script is executed. * Fixed the title in the buildpipeline step. * Fixed the formatting issue. * Added a command line argument, ORT_ROOT, to run_nnapi_code_coverage.sh script Co-authored-by: Satya Jandhyala <satyajandhyala@Satyas-Mac-mini.local>	2020-12-08 10:55:02 -08:00
baijumeswani	2b35f7d4f6	Fix build.py bug which prevents running some unit tests (#5990 ) Also ignore an exception occurred for execution providers which generate compiled nodes	2020-12-03 08:57:55 -08:00
Guoyu Wang	6846c665ff	Use loose version in build.py (#5998 )	2020-12-01 20:57:44 -08:00
Wenbing Li	2ec211ea7b	Support the cross compiling for Apple Silicon (#5974 ) * support macos_arm64 cross compiling * update the build docs * update as commented. * Update BUILD.md	2020-12-01 10:00:06 -08:00
Wenbing Li	1852ade75d	Enable the xcode build for Apple Silicon (arm64 MacOS) (#5924 ) * fix the build script for macos/xcode * add the version check * correct the osx-arch configuration * typo	2020-11-30 11:22:08 -08:00
Changming Sun	5fdd9f0fd2	Fix Python Linux GPU package name (#5943 ) Fix Python Linux GPU package name. I accidentally added "noopenmp" to it.	2020-11-25 17:46:11 -08:00
Xueyun Zhu	58ea7b3572	temporarily disable test (#5868 )	2020-11-23 15:18:37 -08:00
Ryan Hill	ba739a8000	Convert OpenVINO into a shared provider (#5778 ) Same as Dnnl and TensorRT before it, now with more methods and more cleanup.	2020-11-20 17:39:57 -08:00
Edward Chen	bef06dac93	Automatically clean up build docker image cache. (#5843 ) Follow up to #5811 to automate cleanup of the build docker image cache. Added a script and build definition to clean up docker images that haven't been accessed recently.	2020-11-20 11:56:26 -08:00
S. Manohar Karlapalem	ff58f621fa	Remove nGraph Execution Provider (#5858 ) * Remove nGraph Execution Provider Pursuant to nGraph deprecation notice: https://github.com/microsoft/onnxruntime/blob/master/docs/execution_providers/nGraph-ExecutionProvider.md#deprecation-notice Deprecation Notice: Deprecation Begins: June 1, 2020; Removal Date: December 1, 2020. Starting with the OpenVINO™ toolkit 2020.2 release, all of the features previously available through nGraph have been merged into the OpenVINO™ toolkit. As a result, all the features previously available through ONNX RT Execution Provider for nGraph have been merged with ONNX RT Execution Provider for OpenVINO™ toolkit. Therefore, ONNX RT Execution Provider for nGraph will be deprecated starting June 1, 2020 and will be completely removed on December 1, 2020. Users are recommended to migrate to the ONNX RT Execution Provider for OpenVINO™ toolkit as the unified solution for all AI inferencing on Intel® hardware. * Remove nGraph Licence info from ThirdPartyNotices.txt * Use simple Test.Run() for tests without EP exclusions To be consistent with rest of test code. * Remove nGraph EP functions from Java code	2020-11-19 16:47:55 -08:00
Hariharan Seshadri	62508ef0e4	Revert "Remove MKLML build config (#5559 )" (#5855 )	2020-11-19 10:53:08 -08:00
Edward Chen	71e7c2b423	Cache build docker images in container registry. (#5811 ) This PR adds infrastructure to automatically cache docker images used in CI builds in a container registry. Currently, build images are pulled from a container registry for some builds and built every time for others. The container registry requires maintenance to keep the images up to date and building images every time wastes build agent resources. With this change, a given build image can be looked up in a cache container registry and if present, pulled, and otherwise, built and pushed. The uniqueness of a build image is determined by a hash digest of the dockerfile, docker build context directory, and certain "docker build" options. This digest is part of the image tag in the cache container repository. The cache container registry will need to be cleaned up periodically. This is not automated yet.	2020-11-17 17:02:24 -08:00
Scott McKay	7b76b57fc8	Support EPs that compile nodes in a minimal build. (#5776 ) * Support EPs that compile nodes in a minimal build. This enables NNAPI being used.	2020-11-17 13:52:22 +10:00
Scott McKay	a3f3a63206	Move OpenVINO specific validation function to somewhere more sensible, and rename to provide context on its usage. (#5822 )	2020-11-17 10:58:43 +10:00
Guoyu Wang	c4818d36ed	[NNAPI EP] Make NNAPI EP build on non-Android Platform (#5779 ) * Make NNAPI EP build on non-Android Platform * minor updates * Address CR comments * Fix build issue using Windows, address CR comments * Fix linux build warnings * Fix for test failure * Fix for test failure * Fix model_tests failure	2020-11-15 17:04:45 -08:00
jeyblu	435b904f0e	add dnnl gpu engine (#5788 )	2020-11-12 20:17:54 -08:00
Maajid khan	a84a058f9e	[OpenVINO-EP] Enabling Multi Device support (#5740 ) * Enabling Multi Device support for UEP Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Minor fix added *Added a simple fix to determine OpenVINO version for Arm build as well Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>	2020-11-11 15:16:30 -08:00
Xueyun Zhu	d8ace07ad7	Add CPU send/recv for pipeline (#5315 ) * cpu send/recv * clean up send/recv * remove unused code * assert and nccl option for mnist * add build option to enable build with only cpu. Without this, nccl is always enabled which will break build on machine that only contains cpu * Add USE_MPI distinct from USE_NCCL/USE_HOROVOD * fix * fix * exclude cpu send/recv for machines without mpi Co-authored-by: Tim Harris <tiharr@microsoft.com>	2020-11-11 12:41:39 -08:00
liqunfu	1416d12f0b	Liqun/merge e2e pipelines (#5702 ) * Create an Azure Pipeline to merge cpp and python e2e pipelines into one. Still keep cpp e2e pipeline until this new pipeline is stable. Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-11-11 09:42:08 -08:00
Weixing Zhang	fff85a6a35	Add GPU kernels for ROCm EP (#5655 ) * Add kernels for AMD GPU. This PR is mostly about GPU kernels for ROCm EP. Due to the similar GPU programming languages (CUDA and HIP) and similar math library calls, one principle in ROCM EP design is to share CUDA kernels as much as possible for ROCm. Thus, the script amd_hipify.py has been created for converting CUDA kernels to ROCm HIP kernels automatically during compilation phase. But, for some reasons such as perf issue, syntax difference..., some converted kernels need some manual intervention. These kernels will be checked in the repo physically for now. In order to avoid manual intervention, the plan is to refactor CUDA kernels to make them portable between CUDA EP and ROCm EP as much as possible. Please refer to "HIP Porting Guide" for details. * like lamb, multi-tensor-apply needs to be disabled for IsAllFiniteOp and ReduceAllL2, current AMD GPU compiler has perf issue for kernel parameter which is a structure with "pass by value". * Use hipMemsetAsync and add checks on HIP calls. * move the generated files to build folder. Co-authored-by: Jesse Benson <jesseb@microsoft.com>	2020-11-06 16:11:06 -08:00
Maajid khan	d98062da0c	[OpenVINO-EP] Hetero support (#5627 ) * Implement Hetero in UEP * Added security checks to take valid Hetero combinations as device type * Integrating Hetero features * Get the statistics Report in Debug Mode Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Passing right device type for vadm_backend Added simple fix to pick the right device type when using vadm_backend with Hetero as well. Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Fixed batching logic for 2020.4 and above * Fixed flake8 PEP8 errors Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Minor Fixes Added Added security checks for device_type passed in for Hetero build during run time code cleanup Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Minor changes Added Fixed batch_size bug in vadm_backend code cleanup *Documentation updated for Hetero Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> Co-authored-by: suryasidd <surya.siddharth.pemmaraju@intel.com>	2020-10-30 22:35:08 -07:00
Weixing Zhang	aec4cb489e	ROCm EP for AMD GPU (#5480 ) The ROCm EP is designed and implemented based on AMD GPU software stack named ROCm. Here is the link for the details about ROCm: https://rocmdocs.amd.com/en/latest/ ROCm EP was created based on the following things: 1. AMD GPU programming language: HIP 2. AMD GPU HIP language runtime: amdhip64 3. BLAS: rocBLAS, hipBLAS 4. DNN: miOpen 5. Collective Communication library: RCCL 6. cub: hipCub 7. … Current status: BERT-L and GPT2 training can be run on AMD GPU with data parallel. Next: 1. Make more GPU code be sharable between ROCm EP and CUDA EP since HIP language and HIP runtime API are very close to CUDA. 2. Continue improving the implementation. 3. Continue GPU kernel optimization. 4. Support model parallelism on ROCm EP. …… The rocm kernels have been removed from this commit and will be in a separate PR. Since the original PR was too big(~180 files), it was suggested to split the PR into two parts, one is rocm-kernels, the other is non rocm kernels. Co-authored-by: Weixing Zhang <wezhan@microsoft.com> Co-authored-by: sabreshao <sabre.shao@amd.com> Co-authored-by: anghostcici <11013544+anghostcici@users.noreply.github.com> Co-authored-by: Suffian Khan <sukha@microsoft.com> Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2020-10-29 17:13:04 -07:00
Changming Sun	e6956be40c	Publish no-openmp python packages to test pypi (#5610 ) Publish no-openmp python packages to test pypi	2020-10-28 19:49:53 -07:00
liqunfu	5129b4d5bc	batch size tests (#5508 ) * batch size tests Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-10-28 15:55:40 -07:00
liqunfu	92662659ba	Liqun/remove number matching (#5606 ) replace number matching with relaxed comparison in frontend tests Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-10-27 21:27:37 -07:00
Dmitri Smirnov	3433576fd3	Support for Sparse Initializers (#5540 ) Introduce sparse_initializers support. Convert them to dense on model load and prune graph_proto_ so they don't consume space. Convert back to sparse on ORT Format model save. Implement serializing sparse initializers to OrtFormat. Fix Model::ToProto() to return original sparse initializers Set a flag that graph_sync is needed when loading a simple ORT Format model. otherwise nothing is resolved. Add ORT Format history to README.md ifdef MINIMAL build for DenseToSparseTensorInitializer Allow duplicate initializers to support existing models. Issue a warning instead of aborting. * Revert "Remove SparseTensor support from minimal build. (#5114)" This reverts commit `59ee8ffb17`. Signed-off-by: Dmitri Smirnov <dmitrism@microsoft.com>	2020-10-27 10:32:06 -07:00
Andrews548	20bc83400b	ACL/ArmNN update (#5515 ) * Build ACL and ArmNN with custom library path * Define import to tensor as a separate function for maintenance and readability * Enabled optimized depthwise convolution for ACL v20.02 * Check operation status for ACL and ArmNN Execution Providers * Enabled fused operation for convolution-activation Co-authored-by: Andrei-Alexandru <andrei-alexandru.avram@nxp.com>	2020-10-22 09:29:44 -07:00
Changming Sun	5802fe1699	Remove MKLML build config (#5559 ) Remove MKLML build config	2020-10-21 13:11:25 -07:00
Guoyu Wang	915d475353	Android CI update (#5474 ) * Update Android CI * update comments	2020-10-14 16:56:50 -07:00
Wenbing Li	80d36eab86	enable the onnxruntime shared library test on iOS (#5443 ) * enable the onnxruntime shared library test on iOS * fixing as commented. * add return status check.	2020-10-12 21:40:57 -07:00
liqunfu	dbe7e6623b	only use/import pytest if needed (by enable_training) (#5437 ) Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-10-09 12:42:19 -07:00
liqunfu	1cceefc7d4	use run_orttraining_test_orttrainer_frontend_separately to work aroun… (#5408 ) * use run_orttraining_test_orttrainer_frontend_separately to work around a sporadic segfault. Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-10-09 09:16:10 -07:00
Changming Sun	09aef240d6	Skip running onnx tests in python mac os pipeline (#5416 )	2020-10-08 11:49:28 -07:00
Hariharan Seshadri	6f54113a1b	Support OrtValue binding in Python to enable interesting IOBinding scenarios in Python (#5248 )	2020-10-06 21:14:41 -07:00
Guoyu Wang	b4934b0016	Mitigate pybind11 build break using Xcode 12 on macOS (#5381 ) * turn dev_mode off if we are using macos to build python with xcode 12 * Address CR comments * Add ways to check compiler version	2020-10-06 19:03:33 -07:00
liqunfu	773992c7d4	Liqun/bert pretrain tb (#5377 ) * add tensor board, remove torch.distributed.launch because ort nccl depends on MPI. Use MPI to launch parallel training. Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-10-06 16:28:31 -07:00
Wenbing Li	4721729fdc	Enable iOS CI pipeline (#5360 ) * add the ios ci build. * no dependency on mac ci pipeline. * fix the command line. * keep sync * automatically retrieve sdpath * fix the case errors and warnings * fix the vlog switch issue. * add parallel flag for build. * update the display name of the pipeline.	2020-10-02 20:14:45 -07:00
Guoyu Wang	9df0790856	Update linux minimal CI to report Android minimal baseline binary size (#5361 ) * Update linux minimal CI to report Android minimal baseline binary size * Fix some issues in the script	2020-10-02 17:35:23 -07:00
Ashwini Khade	ce49cfa67c	add support for configurable build dir when building nuget packages (#5352 ) * add support for configurable build dir when building nuget packages * rename vars	2020-10-02 09:31:35 -07:00

481 commits