onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-21 02:18:09 +00:00

Author	SHA1	Message	Date
Changming Sun	5a7f65b831	Fix training e2e pipeline (#7942 ) 1. Fix training e2e pipeline. The failure was caused by my recent change #7632. The fix is adding "--cmake_extra_defines CMAKE_CUDA_ARCHITECTURES=70" to the build parameters because the machines have V100 GPUs. 2. Simplify Nuphar pipeline. It doesn't need to install a separate ONNX version (1.5.0). 3. Fix a problem where run_dockerbuild.sh ignored the OS version parameter. Now that it takes effect, I also set the python version to the system default (3.8 for ubuntu 20.04).	2021-06-04 09:37:09 -07:00
Yulong Wang	0723d16436	[wasm] allows to specify MALLOC setting for wasm build (#7934 )	2021-06-03 23:08:56 -07:00
Changming Sun	b854f2399d	Update manylinux build scripts and GPU CUDA version from 11.0 to 11.1 (#7632 ) 1. Update manylinux build scripts. This will add [PEP600](https://www.python.org/dev/peps/pep-0600/)(manylinux2 tags) support. numpy has adopted this new feature, we should do the same. The old build script files were copied from https://github.com/pypa/manylinux, but they have been deleted and replaced in the upstream repo. The manylinux repo doesn't have a manylinux2014 branch anymore. So I'm removing the obsolete code and syncing the files with the latest master. 2. Update GPU CUDA version from 11.0 to 11.1 (after a discussion with PMs). 3. Delete tools/ci_build/github/linux/docker/Dockerfile.manylinux2014_cuda10_2. (Merged the content to tools/ci_build/github/linux/docker/Dockerfile.manylinux2014_cuda11) 4. Modernize the cmake code of how to locate python devel files. It was suggested in https://github.com/onnx/onnx/pull/1631 . 5. Remove `onnxruntime_MSVC_STATIC_RUNTIME` and `onnxruntime_GCC_STATIC_CPP_RUNTIME` build options. Now cmake has builtin support for it. Starting from cmake 3.15, we can use `CMAKE_MSVC_RUNTIME_LIBRARY` cmake variable to choose which MSVC runtime library we want to use. 6. Update Ubuntu docker images used in our CI build from Ubuntu 18.04 to Ubuntu 20.04. 7. Update GCC version in CUDA 11.1 pipelines from 8.x to 9.3.1 8. Split Linux GPU CI pipeline into two jobs: build the code on a CPU machine, then run the tests on a GPU machine. In the past we didn't test our python packages. We only tested the pre-packed files. So we didn't catch the rpath issue in CI build. 9. Add a CentOS machine pool and test our Linux GPU build on real CentOS machines. 10. Rework ARM64 Linux GPU python packaging pipeline. Previously it used cross-compiling, so we had to statically link to the C Runtime. But now we have the pluggable EP API, which doesn't support static linking. So I changed to use qemu emulation instead. Now the build is 10x slower than before, but it is more extensible.	2021-06-02 23:36:49 -07:00
Yulong Wang	faae347d9f	[wasm] upgrade emsdk version to 2.0.23 (#7893 ) * upgrade emsdk version to 2.0.23 * fix build * override gmock build options	2021-06-02 12:26:24 -07:00
Scott McKay	0fbec1b9c1	Update the operator documentation generation (#7787 ) * Update the operator documentation generation - Make layout a little nicer - Update to latest supported operators including training - Fix some links that are broken when the docs content is copied to github-pages - Fix incorrect usage of 'onnx.ai.ml' as the default domain - ML ops are now separated from the real default domain of 'onnx.ai' - Include CPU, CUDA and training kernels - exclude DNNL as it's not an EP we own * There are separate paths for CUDA and CUDNN as they are not guaranteed to be in the same location on a Windows machine. Use the CUDNN path when looking for the CUDNN library. * Enable validation of both contrib ops and operator kernels in build Filter generation so it's deterministic Add ability for CI to publish the md files as build artifacts if they differ so a developer can download and add to their PR to resolve any diffs. Remove workarounds for github-pages as that will now link to the github docs which display correctly	2021-06-02 17:47:40 +10:00
Guoyu Wang	e7e200ee59	Add test for iOS package (#7816 ) * Add test for iOS package * Add readme * fix pep8 warning * Addressed CR comments, fixed CI failure * Address CR comments * Update readme.md * Update package name and readme, added comments to the podspec	2021-06-01 11:01:37 -07:00
Gao, Chun	4dd724ef1a	Enable WebAssembly SIMD build (#7839 ) Add a build switch "--enable_wasm_simd" to enable WebAssembly SIMD build	2021-05-28 16:29:58 -07:00
liqunfu	bed6e87cbd	add environment variable to control default training package's local version (#7849 )	2021-05-26 22:44:20 -07:00
Yulong Wang	7c4a5faef5	[wasm] enable DWARF format debug info for ORT WASM (#7777 ) * [wasm] enable DWARF format debug info for ORT WASM * resolve comments	2021-05-21 01:32:00 -07:00
Taewoo Kim	1e6ad669cf	Support arm64e for osx Add arm64e to choices variable	2021-05-18 14:58:58 -07:00
Changming Sun	26a472c948	Increase test timeout from 1 hour to 2 hours (#7735 ) I saw a test timeout in our nodejs packaging pipeline. I'm not sure if it is because it ran slower than before or it's a deadlock issue. Increasing the timeout will be helpful for investigating such issues.	2021-05-18 10:51:58 -07:00
liqunfu	359fe1d197	Liqun/ort training version (#7620 )	2021-05-14 09:54:19 -07:00
Guoyu Wang	a47a234b7e	Add minsdkver for AAR and AndroidTest (#7669 )	2021-05-12 16:01:25 -07:00
Guoyu Wang	69d1db83ac	Enable bitcode for iOS by default (#7640 )	2021-05-10 21:27:45 -07:00
Rachel Guo	d8cf960412	Add android test app to validate Java API for ORT-Mobile Android (#7477 ) * test * [gwang] make cmake compile work * [gwang] enable build apks * some build update * add simple sigmoid test android project and cmake * add build.py * refine and remove unused import lib * address CR comments * remove unnecessary files * add README.md * minor update * remove * minor change * fix ci failure and minor update * fix typo in project folder * remove * remove and minor update * refine * minor fix * fix * fix typo * add gradle spotlessApply task to fix CI failure * fix * enable spotlessApply in build gradle * revert some changes * minor fix * run spotless apply for format * address CR comments and fix CI version and format * refine * Refine * address comments * refine * refine * modify * reformat * resolve version conflicts * minor update * minor update * address comments * minor update Co-authored-by: Guoyu Wang <wanggy@outlook.com>	2021-05-04 15:39:14 -07:00
Tang, Cheng	54db6648af	kernel invoker api for eager mode (#7473 ) * initial draft for kernel invoke api * initial implementation of kernel invoker * [eager] fix build on Mac * [eager] increment input name in kernel invoker * temp fix for type in eager mode * use global default log manager * rollback the previous commit since it break linux build * Revert "rollback the previous commit since it break linux build" This reverts commit `58c2c3423a`. * Eager Mode: fix linking on macOS * optimizer_execution_frame: ignore unused lambda capture (model_path) * fix link issue * ORTInvoker: set correct input argument tensor element proto types Do not set a type proto on output arguments to allow ORT to deduce them * ORTInvoker: create only one logging manager * Minor fix to set execution provider type correctly. (#7000) Co-authored-by: Chandru Ramakrishnan <chandru-r@github.com> * training fix * support config output ml values in frame, so we can use it to implement inplace update * Fix range loop error while building. (#7087) Co-authored-by: Chandru Ramakrishnan <chandru-r@github.com> * Conditionally link with nsync_cpp if not windows. (#7151) Co-authored-by: Chandru Ramakrishnan <chandru-r@github.com> * Fixed initialization order in ORT kernel invoker (#7342) * Updated constructor of ort_kernel_invoker to take a logger. * Changed linking order. * Updated test. * add inplace ut * add build option * Update include/onnxruntime/core/eager/ort_kernel_invoker.h Co-authored-by: Derek Murray <Derek.Murray@microsoft.com> * resolve comments in pr * fix build break;merge from master * fix build break Co-authored-by: Cheng Tang <chenta@microsoft.com> Co-authored-by: Aaron Bockover <abock@microsoft.com> Co-authored-by: Chandru Ramakrishnan <41447659+chandru-r@users.noreply.github.com> Co-authored-by: Chandru Ramakrishnan <chandru-r@github.com> Co-authored-by: Derek Murray <Derek.Murray@microsoft.com>	2021-04-30 13:33:58 -07:00
Yulong Wang	00aaa6dabb	update CI for onnxruntime-web (#7497 )	2021-04-29 22:22:52 -07:00
Edward Chen	d21304ceb0	Initial Objective-C API (#7366 ) Initial implementation of an Objective-C API.	2021-04-27 10:06:30 -07:00
Suffian Khan	7a3c1787af	Add CI pipeline to publish Python training package targeting Rocm (#7417 ) * first attempt rocm training wheel * modifications needed to python packaging pipeline for Rocm 4.1 * changes to not conflict with cuda missed stage1 changes remove package push add option r to getopt try again without python install try again without python install try again without python install split pipelines and add back push to remote storage try on cuda gpu pool try again try again try running without az subscription set try again on original pipeline change pool passing AMD Rocm whl on AMD-GPU pool split rocm pipeline from cuda pipeline remove comments * try adding Rocm tests as well * try with tests in place * fix trailing ws * add training data * try again as root for tests * use python3 * typo * try to map video, render group into container * try again * try again * try to avoid yum error code * make UID 1001 * try without yum downgrade * define rocm_version=None * remove CUDA related comments for Rocm Dockerfile * Dont pin nightly torch torchvision torchtext versions as they expire (for now nightly is required for Rocm 4.1) * missed requirements-rocm.txt from last commit * fix whitespace	2021-04-23 17:22:31 -07:00
Guoyu Wang	d414039189	Add ios coreml ci, and speedup ios ci run (#7420 )	2021-04-22 23:41:58 -07:00
Yulong Wang	b56dd037d3	increase timeout for nodejs binding test (#7422 ) Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-04-22 21:40:40 -07:00
Changming Sun	afa7b23609	Update docs/ContribOperators.md and the script that generates it. (#7399 )	2021-04-21 16:20:56 -07:00
Yulong Wang	009f342caf	[JS] refactor Javascript/Typescript libraries in ONNX Runtime (#7308 ) * working on re-organizing js code for ortweb * remove dup files * move folder * fix common references * fix common es5 * add webpack to common * split interfact/impl * use cjs for node * add npmignore for common * update sourcemap config for common * update node * adjust folder/path in CI and build * update folder * nit: readme * add bundle for dev * correct nodejs paths * enable ORT_API_MANUAL_INIT * set name for umd library * correct name for commonjs export * add priority into registerBackend() * fix npm ci pwd * update eslintrc * revise code * revert package-lock lockfileVersion 2->1 * update prebuild * resolve comments * update document * revise eslint config * update eslint for typescript rules * revert changes by mistake in backend.ts * add env * resolve comments	2021-04-16 01:33:10 -07:00
Sunghoon	ded2b08380	WebAssembly multi-threads support. (#7326 ) * WebAssembly multi-threads support. * PROXY_TO_PTHREAD is not required for wasm library * Remove an unnecessary line commented out	2021-04-15 21:46:11 -07:00
Guoyu Wang	28e229ac4c	Enable build dynamic framework for macOS/iOS (#7343 ) * Enable build dynamic framework for macOS/iOS * Address CR comments	2021-04-15 16:47:53 -07:00
liqunfu	4c862c73ed	for training to use new python package naming convention to explicitl… (#7204 )	2021-04-13 16:19:42 -07:00
Yulong Wang	405ca49012	build ONNXRuntime into WebAssembly (#6478 ) * Simplified version of WebAssembly support to keep most of existing data structures and add cmake using Ninja and emcmake * Clean up CMakeLists.txt and add an example to create and compute a kernel * Load a model from bytes and remove graph building steps * Add all cpu and contrib ops with mlas library * WebAssembly build with Onnxruntime C/CXX API * Use protobuf cmakefile directory instead of adding every necessary source file * Fix invalid output at example * add missing files * Change an example to use Teams model and support ort mobile format * add API for javascript * fix input releasing in _ort_run() * update API * Let onnxruntime cmake build WebAssembly with option '--wasm' * allow one-step building for wasm * Make build script working on Linux and MacOS * Fix broken build from Windows command * Enable unit test on building WebAssembly * Resolve comments * update build flags * wasm conv improvement from: 1) GemmV; 2) Depthwise direct convolution 3x3; 3) Direct convolution 3x3 * Cleaned mlas unittest. * use glob * update comments * Update baseline due to loss scale fix (#6948) * fix stream sync issue (#6954) * Enable type reduction in EyeLike, Mod, random.cc CPU kernels. (#6960) * Update EyeLike CPU kernel. * Update Mod CPU kernel. * Update Multinomial CPU kernel. * Slight improvement to Pad CPU kernel binary size. * Update RandomNormal[Like], RandomUniform[Like] CPU kernels. * Fix warning from setting multiple MSVC warning level options. (#6917) Fix warning from setting multiple MSVC warning level options. Replace an existing /Wn flag instead of always appending a new one. * MLAS: quantized GEMM update (#6916) Various updates to the int8_t GEMMs: 1) Add ARM64 udot kernel to take advantage of dot product instructions available in newer cores. Some models run 4x faster than the stock implementation we used before. 2) Refactor the x64 kernels to share common code for AVX2(u8u8/u8s8/avxvnni) vs AVX512(u8u8/u8s8/avx512vnni) to reduce binary size. 3) Extend kernels to support per-column zero points for matrix B. This is not currently wired to an operator. * Implement QLinearAveragePool with unit tests. (#6896) Implement QLinearAveragePool with unit tests. * Attention fusion detect num_heads and hidden_size automatically (#6920) * fixed type to experimental session constructor (#6950) * fixed type to experimental session constructor Co-authored-by: David Medine <david.medine@brainproducts.com> * Update onnxruntime_perf_test.exe to accept free dimension overrides (#6962) Co-authored-by: Ori Levari <orlevari@microsoft.com> * Fix possible fd leak in NNAPI (#6966) * Release buffers for prepacked tensors (#6820) Unsolved problems: 1. One test failure was caused by a bug in Cudnn rnn kernels, when they can allocate a buffer and partially initialize it, the garbage data near tail of the buffer caused problem in some of the hardware. To attack this problem in a broader sense, should we add code in our allocators, and during a memory fuzzing test, fill an allocated buffer with garbage before returning to the caller? 2. Prepacking is used more widely than we know. For instance, Cudnn rnn kernels also cache their weights. They mix several weight tensors together into a single buffer, and never touch the original weight tensor anymore. This is the same idea with pre-pack, but they didn't override the virtual function, and they never tried to release those weight tensors, leading to memory waste. It also seems to me that there are some other kernels have similar behavior. Wonder how much memory we can save if we try to cleanup those too. 3. Turning off memory pattern planning does increase memory fragmentation, leading to out of memory error in some training test cases. Perhaps we can revisit the idea of pushing kernels-creation stage earlier, and then during initializer deserialization, we only avoid tracing those that will be prepacked. * Enable type reduction for Range, ReverseSequence, ScatterND, Split, and Unique CPU kernels. (#6963) * add CI * fix test in ci * fix flags for nsync in wasm build * add copyright banner * fix wasm source glob * add missing exports * resolve comments * Perf gain by make packb wide to 4 from 16 on GEMM for WASM. Remove no need direct conv in previous perf tuning. * fix buildbreak introduced from latest master merge * fix buildbreak in mlasi.h * resolve all comments except MLAS * rewrite packb related 3 functions for WASM_SCALAR separately rather than using #ifdef in each. and other changes according to PR feedback in mlas. * More complete scalar path in sgemm from Tracy. * Fix edge case handling in depthwise conv2d kernel 3x3. where: ) support input W==1 and H==1 ) recalc in accurate pad_right and pad_bottom ) support hidden pad_right == 2 or pad_bottom == 2 when W == 1 or H==1 and no pad left/top Add more test coverage for conv depthwise from Tracy. Fix one typo according to PR. * resolve comments * replace typedef by using * do not use throw in OrtRun() * output error message Co-authored-by: Sunghoon <35605090+hanbitmyths@users.noreply.github.com> Co-authored-by: Lei Zhang <zhang.huanning@hotmail.com> Co-authored-by: Wei-Sheng Chin <wschin@outlook.com> Co-authored-by: Tianlei Wu <tlwu@microsoft.com> Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com> Co-authored-by: Tracy Sharpe <42477615+tracysh@users.noreply.github.com> Co-authored-by: David Medine <david.eric.medine@gmail.com> Co-authored-by: David Medine <david.medine@brainproducts.com> Co-authored-by: Ori Levari <ori.levari@microsoft.com> Co-authored-by: Ori Levari <orlevari@microsoft.com> Co-authored-by: Guoyu Wang <62914304+gwang-msft@users.noreply.github.com> Co-authored-by: Chen Fu <chenfucs@gmail.com>	2021-04-06 16:18:10 -07:00
Changming Sun	2fcd69d644	Cleanup build.py (#7245 )	2021-04-05 18:49:29 -07:00
Changming Sun	5bd192c439	Update ContribOperators.md (#7246 )	2021-04-05 17:11:33 -07:00
Guoyu Wang	d500c5952b	Add Android AAR packaging script for ORT-Mobile (#7138 ) * Add Android aar packaging script for ORT-Mobile * Address CR comments	2021-03-30 18:42:18 -07:00
Ben Niu	d1acdd4f4b	Support building ARM64EC onnxruntime.dll (#6999 )	2021-03-29 15:35:30 -07:00
Yufeng Li	c965878a69	fix a bug in global average pool and add unit test (#6913 ) * fix bug in QGlobalAveragePool * add unit test for quant GlobalAveragePool * not run quantization tests if disable_contrib_ops enabled	2021-03-22 20:01:27 -07:00
Thiago Crepaldi	867804bea1	Add auto doc gen for ORTModule API during CI build (#7046 ) In addition to ORTModule auto documentation during packaging, this PR also update golden numbers to fix CI	2021-03-22 10:20:33 -07:00
Tianlei Wu	8a6f6bc38b	add --enable_cuda_line_info to build.py (#6773 )	2021-02-22 22:00:21 -08:00
Edward Chen	ee35be0129	Support specifying globally allowed types from build script (#6677 ) Add initial support for constraining operator kernel implementations (which support this type-granularity) to a set of allowed types from scripts.	2021-02-22 14:05:00 -08:00
Ivan Stojiljkovic	c91f314217	Add robust dependency check for Python package (#6436 ) * Add robust dependency check for Python package * Add version_info.py to .gitignore * Fix Linux build * Fix Windows CPU build * Fix Windows 32-bit build * Minor tweak * Generate version_info.py earlier in onnxruntime_python.cmake * Print a user-friendly message if cuDNN is not found in * Relax version requirements for CUDA 11 - only the major version has to match * Fix PATH environment variable to include CUDA 11 in 'Python packaging pipeline' (Windows/GPU) * Fix the build with cuDNN 7	2021-02-21 15:11:28 -08:00
liqunfu	2c5e603bad	Liqun/nuphar nuget (#6656 ) create nuphar nuget with correct name	2021-02-17 16:13:07 -08:00
Scott McKay	33279250b5	Update a couple of usages of args.minimal_build to check for not specified vs empty list correctly. (#6688 )	2021-02-16 14:46:51 +10:00
Scott McKay	25f7c93504	Require explicit inclusion of custom op support in a minimal build (#6663 ) * Remove support from custom ops from the base minimal build as they contribute too much binary growth to an Android build. Add ability to explicitly enable custom op support in a minimal build. Change one minimal build CI to test adding custom op support (unit tests are run in that build to validate)	2021-02-13 12:42:33 +10:00
Sheil Kumar	87cb6fd495	Add LearningModelBuilder to WinML Experimental Namespace along with various Audio operators (#6623 ) * model building * fix build * winml adapter model building api * model building * make build * make build again * add model building with audio op * inplace and inorder fft * add ifft * works! * cleanup * add comments * switch to iterative rather than recursive and use parallelization * batched parallelization * fft->dft * cleanup * window functions * add melweightmatrix op * updates to make spectrogram test work * push latest * add onesided * cleanup * Clean up building apis and fix mel * cleanup * cleanup * naive stft * fix test output * middle c complete * 3 tones * cleanup * signal def new line * Add save functionality * Perf improvements, 10x improvement * cleanup * use bitreverse lookup table for performance * implement constant initializers for tensors * small changes * add matmul tests * merge issues * support add attribute * add tests for double data type windowfunctions and minor cleanup * stft onesided/and not tests * cleanup * cleanup * clean up * cleanup * remove threading attribute * forward declare orttypeinfo * warnings * fwd declare * fix warnings * 1 more warning * remove saving to e drive... * cleanup and fix stft test * add opset picker * small additions * add onnxruntime tests * add signed/unsigned * fix warning * fix warning * finish onnxruntime tests * make windows namespace build succeed * add experimental flag * add experimental api into nuget package * add experimental api build flag and add to windows ai nuget package * turn experimental for tests * add minimum opset version to new experimental domain * api cleanup * disable ms experimental ops test when --ms_experimental is not enabled * add macro behind flag * remove unused x * pr feedback Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2021-02-12 14:17:10 -08:00
Scott McKay	13d7db9a98	Don't update the excluded ops/types unless args.update is true. Updating the exclusion info triggers rebuilding of all kernels using type reduction. (#6604 )	2021-02-09 07:15:31 +10:00
Edward Chen	2ef792ae6e	Don't resolve symlink in resolve_executable_path(). (#6540 )	2021-02-04 12:32:03 -08:00
Cian Hayes	6fc5237d9e	Introduce --enable_training_ops build flag (#6523 ) * minimal_build with training ops * Removing redundant comment from an earlier attempt at a fix * Fixing a bad merge conflict resolution * Responding to PR feedback * tweaking the makefiles based on feedback * combining two enable_training blocks in CMakeLists.txt	2021-02-01 21:54:16 -08:00
suryasidd	1a5b75a554	[OpenVINO-EP] Remove support for OpenVINO 2020.2 (#6493 ) * Removed OpenVINO 2020.2 support * Updated documentation and build.py * Removed unnecessary libraries from setup.py	2021-01-28 23:00:41 -08:00
liqunfu	00afd00059	merge e2e with distributed pipeline (#6443 )	2021-01-28 14:17:47 -08:00
Scott McKay	c84bb9df9f	Add ability to track per operator types in reduced build config. (#6428 ) * Add ability to generate configuration that includes required types for individual operators, to allow build size reduction based on that. - Add python bindings for ORT format models - Add script to update bindings and help info - Add parsing of ORT format models - Add ability to enable type reduction to config generation - Update build.py to only allow operator/type reduction via config - simpler to require config to be generated first - can't mix a type aware (ORT format model only) and non-type aware config as that may result in insufficient types being enabled - Add script to create reduced build config - Update CIs	2021-01-29 07:59:51 +10:00
Guoyu Wang	c05adb1147	Initial version of CoreML EP (#6392 )	2021-01-27 10:43:17 -08:00
liqunfu	6ed12402a4	Liqun/liqun/enable pipeline parallel test2 (#6399 ) * enable data and pipeline parallelism test Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-01-25 15:15:26 -08:00
Yufeng Li	c20965f9b2	enable pipeline to run quantization tests (#6416 ) * enable pipeline to run quantization tests setup test pipeline for quantization	2021-01-25 09:33:08 -08:00
wezuo	5b6753ce27	Wezuo/memory analysis (#5658 ) * merged alloc_plan * pass compilation * Start running, incorrect allocation memory info * add in comments * fix a bug of recording pattern too early. * debugging lifetime * fix lifetime * passed mnist * in process of visualization * Add code to generate chrome trace for allocations. * in process of collecting fragmentation * before rebuild * passed mnist * passed bert tiny * fix the inplace reuse * fix the exception of weight in pinned memory * add guards to ensure the tensor is in AllocPlan * add customized profiling * debugging * debugging * fix the reuse of different location type * add rank * add the rank * add fragmentation * add time_step_trace * Add summary for each execution step (total bytes, used/free bytes). * add top k * change type of top k parameter * remove prints * change heap to set{ * add the name pattern * add the usage for pattern * add partition * change to static class * add custom group * remove const * update memory_info * in process of adding it as runtime config * change the memory profiling to be an argument * add some comments * add checks to record memory_info in training session * set the "local rank setting" to correct argument. * addressing comments * format adjustment * formatting * remove alloc_interval * update memory_info.cc to skip session when there is no tensor for a particular memory type * fix memory_info multiple iteration seg-fault * consolidate mainz changes * fixed some minor errors * guard by ORT_MINIMAL_BUILD * add ORT_MEMORY_PROFILE flag * added compiler flag to turn on/off memory profiling related code * clean up the code regarding comments * add comments * revoke the onnx version * clean up the code to match master * clean up the code to match master * clean up the code to match master Co-authored-by: Jesse Benson <benson.jesse@gmail.com> Co-authored-by: Wei Zuo <wezuo@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: wezuo <wezuo@az-eus-v100-32gb-5-worker-mgtbby.eastus.cloudapp.azure.com> Co-authored-by: wezuo <wezuo@az-eus-v100-32gb-5-worker-yclzsf.eastus.cloudapp.azure.com>	2021-01-19 08:30:55 -08:00
Scott McKay	e54e2f969d	Use readelf for minimal build binary size checks. (#6338 ) * Use readelf for minimal build binary size checks. The on-disk size grows in 4KB chunks which makes it hard to see how much growth an individual checkin causes. Only downside is that the sum of the sections is larger than the on-disk size (presumably things get packed smaller on disk and some of the section alignment constraints can be ignored) * Remove unused function	2021-01-15 07:46:02 +10:00
Edward Chen	042053c55e	Add support for running Android emulator from build.py on Windows. (#6317 )	2021-01-13 19:21:49 -08:00
Alberto Magni	5623cc6d17	Use onnxruntime_USE_FULL_PROTOBUF=OFF for the cuda execution provider (#6340 ) This removes a special case of the cuda EP.	2021-01-13 18:27:13 +00:00
Changming Sun	5084ce0969	Update nuget build (#6297 ) 1. Update the ProtoSrc path. The old one is not used anymore. 2. Regenerate OnnxMl.cs 3. Delete some unused code in tools/ci_build/build.py 4. Avoid set intra_op_param.thread_pool_size in ModelTests in OpenMP build. 5. Fix a typo in the C API pipeline.	2021-01-11 10:49:05 -08:00
William Tambellini	39a988ce1c	Upgrade build.py to assert for python 3.6+ as python 3.5 can no longer build today's master.	2020-12-30 20:17:09 -08:00
Changming Sun	1b23b28706	Remove MKLML/openblas/jemalloc build config (#6212 )	2020-12-30 17:18:19 -08:00
Michael Goin	bbb6b416f0	Fix ImportError in build.py (#6231 ) There is a possible ImportError where build.py can import the wrong 'util' package if there are others present in `sys.path` already	2020-12-30 14:22:55 -08:00
Tixxx	32c67c2944	Deprecating Horovod and refactored Adasum computations (#5468 ) deprecated horovod submodule refactored adasum logic to be ort-native added tests for native kernel and e2e tests	2020-12-17 16:21:33 -08:00
Edward Chen	64709b1335	Deprecate Python global configuration functions [Part 1] (#5923 ) Enable options to be set via execution provider (EP)-specific options and log deprecation warning from current global configuration functions.	2020-12-15 11:32:43 -08:00
baijumeswani	dd2e5a1a05	state_dict and load_state_dict for ORTTrainer (#6095 ) * add functions state_dict and load_state_dict to ORTTrainer * unit tests for state_dict and load_state_dict for ORTTrainer	2020-12-14 11:55:52 -08:00
baijumeswani	523d187193	save data to and load data from an hdf5 file for checkpointing (#5975 ) * save python dictionary to hdf5 representation and load an hdf5 file into a python dictionary * unit tests for saving data to and loading data from hdf5 file	2020-12-08 11:40:57 -08:00
satyajandhyala	f68a256140	Android code coverage (#6061 ) * Added Onnxruntime_GCOV_COVERAGE flag for Android. * Set CMAKE_SYSTEM_NAME explicitly for Android. * Added GCOV_PREFIX option to collect code coverage data. Added a new python script to generate code coverage info. Modified build pipeline to generate Android code coverage info * Added build command line option --android_coverage * Added a comment describing the GCOV environment variables * Fixed PEP8 issues. * Added --android_coverage option to the build command. * Increased Android emulator memory from 3K to 8K. * Increased Android partition-size from 2GB to 4GB to overcome no-space-left-on-device error * Removed source_dir from command line args. * Use cwd absolute path to run tests. * Added commands to output the contents of /data/local/tmp on the emulator. * Added run_adb_shell function. * Format changes. * Removed keywd argument cwd. * Removed Android in the --build_dir path. * Removed commands added for debugging. * Removed extra new-lines. * Fix MacOs build pipeline failures by uninstalling openssl before running build script. * Revert "Fix MacOs build pipeline failures by uninstalling openssl before running build script." This reverts commit 90d0568fe533e9456c20d061a2d435c8fea48266. * Change dir to the build directory where the tar file is copied. * Changed the option from --android_coverage to --code_coverage * Moved steps to generate Android code coverage to run_nnapi_code_coverage.sh * Require --android option if --code_coverage is specified. * No code coverage needed for onnx_test_runner. * Expect that the emulator is running when the script is executed. * Fixed the title in the buildpipeline step. * Fixed the formatting issue. * Added a command line argument, ORT_ROOT, to run_nnapi_code_coverage.sh script Co-authored-by: Satya Jandhyala <satyajandhyala@Satyas-Mac-mini.local>	2020-12-08 10:55:02 -08:00
baijumeswani	2b35f7d4f6	Fix build.py bug which prevents running some unit tests (#5990 ) Also ignore an exception occurred for execution providers which generate compiled nodes	2020-12-03 08:57:55 -08:00
Guoyu Wang	6846c665ff	Use loose version in build.py (#5998 )	2020-12-01 20:57:44 -08:00
Wenbing Li	2ec211ea7b	Support the cross compiling for Apple Silicon (#5974 ) * support macos_arm64 cross compiling * update the build docs * update as commented. * Update BUILD.md	2020-12-01 10:00:06 -08:00
Wenbing Li	1852ade75d	Enable the xcode build for Apple Silicon (arm64 MacOS) (#5924 ) * fix the build script for macos/xcode * add the version check * correct the osx-arch configuration * typo	2020-11-30 11:22:08 -08:00
Changming Sun	5fdd9f0fd2	Fix Python Linux GPU package name (#5943 ) Fix Python Linux GPU package name. I accidentally added "noopenmp" to it.	2020-11-25 17:46:11 -08:00
Xueyun Zhu	58ea7b3572	temporarily disable test (#5868 )	2020-11-23 15:18:37 -08:00
Ryan Hill	ba739a8000	Convert OpenVINO into a shared provider (#5778 ) Same as Dnnl and TensorRT before it, now with more methods and more cleanup.	2020-11-20 17:39:57 -08:00
Edward Chen	bef06dac93	Automatically clean up build docker image cache. (#5843 ) Follow up to #5811 to automate cleanup of the build docker image cache. Added a script and build definition to clean up docker images that haven't been accessed recently.	2020-11-20 11:56:26 -08:00
S. Manohar Karlapalem	ff58f621fa	Remove nGraph Execution Provider (#5858 ) * Remove nGraph Execution Provider Pursuant to nGraph deprecation notice: https://github.com/microsoft/onnxruntime/blob/master/docs/execution_providers/nGraph-ExecutionProvider.md#deprecation-notice Deprecation Notice \| \| \| \| --- \| --- \| \| Deprecation Begins \| June 1, 2020 \| \| Removal Date \| December 1, 2020 \| Starting with the OpenVINO™ toolkit 2020.2 release, all of the features previously available through nGraph have been merged into the OpenVINO™ toolkit. As a result, all the features previously available through ONNX RT Execution Provider for nGraph have been merged with ONNX RT Execution Provider for OpenVINO™ toolkit. Therefore, ONNX RT Execution Provider for nGraph will be deprecated starting June 1, 2020 and will be completely removed on December 1, 2020. Users are recommended to migrate to the ONNX RT Execution Provider for OpenVINO™ toolkit as the unified solution for all AI inferencing on Intel® hardware. * Remove nGraph Licence info from ThirdPartyNotices.txt * Use simple Test.Run() for tests without EP exclusions To be consistent with rest of test code. * Remove nGraph EP functions from Java code	2020-11-19 16:47:55 -08:00
Hariharan Seshadri	62508ef0e4	Revert "Remove MKLML build config (#5559 )" (#5855 )	2020-11-19 10:53:08 -08:00
Edward Chen	71e7c2b423	Cache build docker images in container registry. (#5811 ) This PR adds infrastructure to automatically cache docker images used in CI builds in a container registry. Currently, build images are pulled from a container registry for some builds and built every time for others. The container registry requires maintenance to keep the images up to date and building images every time wastes build agent resources. With this change, a given build image can be looked up in a cache container registry and if present, pulled, and otherwise, built and pushed. The uniqueness of a build image is determined by a hash digest of the dockerfile, docker build context directory, and certain "docker build" options. This digest is part of the image tag in the cache container repository. The cache container registry will need to be cleaned up periodically. This is not automated yet.	2020-11-17 17:02:24 -08:00
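The lookup described in this commit (digest of the dockerfile, the build context directory, and selected "docker build" options, used as part of the image tag) can be sketched roughly as follows. This is only an illustration under assumptions: the function names `image_digest` and `cached_image_tag`, the choice of SHA-256, and the tag layout are hypothetical and are not taken from the actual CI scripts.

```python
import hashlib
from pathlib import Path
from typing import Sequence


def image_digest(dockerfile: Path, context_dir: Path, build_options: Sequence[str]) -> str:
    """Hash every input that influences the image (hypothetical helper)."""
    h = hashlib.sha256()
    h.update(dockerfile.read_bytes())
    h.update(b"\0")
    # Walk the build context in sorted order so the digest is reproducible.
    for path in sorted(p for p in context_dir.rglob("*") if p.is_file()):
        h.update(path.relative_to(context_dir).as_posix().encode())
        h.update(b"\0")
        h.update(path.read_bytes())
        h.update(b"\0")
    # Only the options that change the resulting image should be passed in.
    for option in build_options:
        h.update(option.encode())
        h.update(b"\0")
    return h.hexdigest()


def cached_image_tag(repository: str, digest: str) -> str:
    """Embed a shortened digest in the tag (the tag layout is an assumption)."""
    return f"{repository}:{digest[:16]}"
```

With a tag like this, a build would try to pull the tagged image first and fall back to building and pushing it when the pull fails; any change to the dockerfile, the context, or the hashed options yields a new tag and therefore a cache miss.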
Scott McKay	7b76b57fc8	Support EPs that compile nodes in a minimal build. (#5776 ) * Support EPs that compile nodes in a minimal build. This enables NNAPI being used.	2020-11-17 13:52:22 +10:00
Scott McKay	a3f3a63206	Move OpenVINO specific validation function to somewhere more sensible, and rename to provide context on its usage. (#5822 )	2020-11-17 10:58:43 +10:00
Guoyu Wang	c4818d36ed	[NNAPI EP] Make NNAPI EP build on non-Android Platform (#5779 ) * Make NNAPI EP build on non-Android Platform * minor updates * Address CR comments * Fix build issue using Windows, address CR comments * Fix linux build warnings * Fix for test failure * Fix for test failure * Fix model_tests failure	2020-11-15 17:04:45 -08:00
jeyblu	435b904f0e	add dnnl gpu engine (#5788 )	2020-11-12 20:17:54 -08:00
Maajid khan	a84a058f9e	[OpenVINO-EP] Enabling Multi Device support (#5740 ) * Enabling Multi Device support for UEP Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Minor fix added *Added a simple fix to determine OpenVINO version for Arm build as well Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>	2020-11-11 15:16:30 -08:00
Xueyun Zhu	d8ace07ad7	Add CPU send/recv for pipeline (#5315 ) * cpu send/recv * clean up send/recv * remove unused code * assert and nccl option for mnist * add build option to enable build with only cpu. Without this, nccl is always enabled which will break build on machine that only contains cpu * Add USE_MPI distinct from USE_NCCL/USE_HOROVOD * fix * fix * exclude cpu send/recv for machines without mpi Co-authored-by: Tim Harris <tiharr@microsoft.com>	2020-11-11 12:41:39 -08:00
liqunfu	1416d12f0b	Liqun/merge e2e pipelines (#5702 ) * Create an Azure Pipeline to merge cpp and python e2e pipelines into one. Still keep cpp e2e pipeline until this new pipeline is stable. Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-11-11 09:42:08 -08:00
Weixing Zhang	fff85a6a35	Add GPU kernels for ROCm EP (#5655 ) * Add kernels for AMD GPU. This PR is mostly about GPU kernels for ROCm EP. Due to similar GPU programming languages (CUDA and HIP) and similar math library calls, one principle in ROCm EP design is to share CUDA kernels as much as possible for ROCm. Thus, the script amd_hipify.py has been created for converting CUDA kernels to ROCm HIP kernels automatically during compilation phase. But, for some reasons such as perf issues and syntax differences, some converted kernels need some manual intervention. These kernels will be checked in the repo physically for now. In order to avoid manual intervention, the plan is to refactor CUDA kernels to make them portable between CUDA EP and ROCm EP as much as possible. Please refer to "HIP Porting Guide" for details. * like lamb, multi-tensor-apply needs to be disabled for IsAllFiniteOp and ReduceAllL2, current AMD GPU compiler has perf issue for kernel parameter which is a structure with "pass by value". * Use hipMemsetAsync and add checks on HIP calls. * move the generated files to build folder. Co-authored-by: Jesse Benson <jesseb@microsoft.com>	2020-11-06 16:11:06 -08:00
Maajid khan	d98062da0c	[OpenVINO-EP] Hetero support (#5627 ) * Implement Hetero in UEP * Added security checks to take valid Hetero combinations as device type * Integrating Hetero features * Get the statistics Report in Debug Mode Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Passing right device type for vadm_backend Added simple fix to pick the right device type when using vadm_backend with Hetero as well. Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Fixed batching logic for 2020.4 and above * Fixed flake8 PEP8 errors Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Minor Fixes Added Added security checks for device_type passed in for Hetero build during run time code cleanup Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Minor changes Added Fixed batch_size bug in vadm_backend code cleanup *Documentation updated for Hetero Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> Co-authored-by: suryasidd <surya.siddharth.pemmaraju@intel.com>	2020-10-30 22:35:08 -07:00
Weixing Zhang	aec4cb489e	ROCm EP for AMD GPU (#5480 ) The ROCm EP is designed and implemented based on AMD GPU software stack named ROCm. Here is the link for the details about ROCm: https://rocmdocs.amd.com/en/latest/ ROCm EP was created based on the following things: 1. AMD GPU programming language: HIP 2. AMD GPU HIP language runtime: amdhip64 3. BLAS: rocBLAS, hipBLAS 4. DNN: miOpen 5. Collective Communication library: RCCL 6. cub: hipCub 7. … Current status: BERT-L and GPT2 training can be run on AMD GPU with data parallel. Next: 1. Make more GPU code shareable between ROCm EP and CUDA EP since HIP language and HIP runtime API are very close to CUDA. 2. Continue improving the implementation. 3. Continue GPU kernel optimization. 4. Support model parallelism on ROCm EP. …… The rocm kernels have been removed from this commit and will be in a separate PR. Since the original PR was too big (~180 files), it was suggested to split the PR into two parts, one is rocm-kernels, the other is non rocm kernels. Co-authored-by: Weixing Zhang <wezhan@microsoft.com> Co-authored-by: sabreshao <sabre.shao@amd.com> Co-authored-by: anghostcici <11013544+anghostcici@users.noreply.github.com> Co-authored-by: Suffian Khan <sukha@microsoft.com> Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2020-10-29 17:13:04 -07:00
Changming Sun	e6956be40c	Publish no-openmp python packages to test pypi (#5610 ) Publish no-openmp python packages to test pypi	2020-10-28 19:49:53 -07:00
liqunfu	5129b4d5bc	batch size tests (#5508 ) * batch size tests Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-10-28 15:55:40 -07:00
liqunfu	92662659ba	Liqun/remove number matching (#5606 ) replace number matching with relaxed comparison in frontend tests Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-10-27 21:27:37 -07:00
Dmitri Smirnov	3433576fd3	Support for Sparse Initializers (#5540 ) Introduce sparse_initializers support. Convert them to dense on model load and prune graph_proto_ so they don't consume space. Convert back to sparse on ORT Format model save. Implement serializing sparse initializers to OrtFormat. Fix Model::ToProto() to return original sparse initializers Set a flag that graph_sync is needed when loading a simple ORT Format model. otherwise nothing is resolved. Add ORT Format history to README.md ifdef MINIMAL build for DenseToSparseTensorInitializer Allow duplicate initializers to support existing models. Issue a warning instead of aborting. * Revert "Remove SparseTensor support from minimal build. (#5114)" This reverts commit `59ee8ffb17`. Signed-off-by: Dmitri Smirnov <dmitrism@microsoft.com>	2020-10-27 10:32:06 -07:00
Andrews548	20bc83400b	ACL/ArmNN update (#5515 ) * Build ACL and ArmNN with custom library path * Define import to tensor as a separate function for maintenance and readability * Enabled optimized depthwise convolution for ACL v20.02 * Check operation status for ACL and ArmNN Execution Providers * Enabled fused operation for convolution-activation Co-authored-by: Andrei-Alexandru <andrei-alexandru.avram@nxp.com>	2020-10-22 09:29:44 -07:00
Changming Sun	5802fe1699	Remove MKLML build config (#5559 ) Remove MKLML build config	2020-10-21 13:11:25 -07:00
Guoyu Wang	915d475353	Android CI update (#5474 ) * Update Android CI * update comments	2020-10-14 16:56:50 -07:00
Wenbing Li	80d36eab86	enable the onnxruntime shared library test on iOS (#5443 ) * enable the onnxruntime shared library test on iOS * fixing as commented. * add return status check.	2020-10-12 21:40:57 -07:00
liqunfu	dbe7e6623b	only use/import pytest if needed (by enable_training) (#5437 ) Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-10-09 12:42:19 -07:00
liqunfu	1cceefc7d4	use run_orttraining_test_orttrainer_frontend_separately to work aroun… (#5408 ) * use run_orttraining_test_orttrainer_frontend_separately to work around a sporadic segfault. Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-10-09 09:16:10 -07:00
Changming Sun	09aef240d6	Skip running onnx tests in python mac os pipeline (#5416 )	2020-10-08 11:49:28 -07:00
Hariharan Seshadri	6f54113a1b	Support OrtValue binding in Python to enable interesting IOBinding scenarios in Python (#5248 )	2020-10-06 21:14:41 -07:00
Guoyu Wang	b4934b0016	Mitigate pybind11 build break using Xcode 12 on macOS (#5381 ) * turn dev_mode off if we are using macos to build python with xcode 12 * Address CR comments * Add ways to check compiler version	2020-10-06 19:03:33 -07:00
liqunfu	773992c7d4	Liqun/bert pretrain tb (#5377 ) * add tensor board, remove torch.distributed.launch because ort nccl depends on MPI. Use MPI to launch parallel training. Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-10-06 16:28:31 -07:00
Wenbing Li	4721729fdc	Enable iOS CI pipeline (#5360 ) * add the ios ci build. * no dependency on mac ci pipeline. * fix the command line. * keep sync * automatically retrieve sdpath * fix the case errors and warnings * fix the vlog switch issue. * add parallel flag for build. * update the display name of the pipeline.	2020-10-02 20:14:45 -07:00
Guoyu Wang	9df0790856	Update linux minimal CI to report Android minimal baseline binary size (#5361 ) * Update linux minimal CI to report Android minimal baseline binary size * Fix some issues in the script	2020-10-02 17:35:23 -07:00
Ashwini Khade	ce49cfa67c	add support for configurable build dir when building nuget packages (#5352 ) * add support for configurable build dir when building nuget packages * rename vars	2020-10-02 09:31:35 -07:00
liqunfu	fe50213491	Liqun/bert pretrain2 (#5327 ) * bert single node multi GPU pretrain w/o checkpoint Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-10-01 11:01:26 -07:00
Wenbing Li	ed102e9d88	Add iOS test pipeline and a sample app. (#5298 ) * Add iOS test pipeline and a sample app. * clean up the unused code. * clean up. * revert the unknown change * disable the shared library for iOS. * add open source notice text. * ignore the skipped test. * extract the common ortenv setup	2020-09-29 13:53:11 -07:00
liqunfu	24d8b1bf42	to skip an unstable test to unblock release (#5314 ) Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-09-28 22:30:11 -07:00
Changming Sun	1a04b8f8b7	Add valgrind support to our cmake files (#5296 )	2020-09-28 09:31:08 -07:00
Guoyu Wang	d957dbebea	Fix possible ios build break after update to Xcode 12 (#5246 ) * Fix possible ios build break after update to Xcode 12 * Address comments	2020-09-22 07:42:54 -07:00
Xueyun Zhu	55e4b5d302	add pipeline distributed training test (#5222 ) * add pipeline distributed training test * fix max line length error in windows build * function header indent * fix * fix flake8 error	2020-09-21 14:35:01 -07:00
Tiago Koji Castro Shibata	cd663d58f5	Fix WinML warnings (#5228 )	2020-09-19 12:41:42 -07:00
KeDengMS	ce3b67e0cd	[Python] Move symbolic_shape_infer from nuphar to tools (#5162 ) * [Python] Move symbolic shape inference from nuphar to tools * Fix PEP8 ERROR	2020-09-18 09:31:06 -07:00
Guoyu Wang	8156e0dd10	[ORT Mobile] Some updates to iOS/Android build settings (#5184 ) * Update android CI and build settings * add build_java to arm64 also * Add ios signing param * fix a small build warning * address pr comments	2020-09-17 15:53:14 -07:00
Tiago Koji Castro Shibata	f3f119a945	Use onecore umbrella lib in onecore builds (#5182 ) * delayload hack * Skip tests * Onecore uses onecore umbrella * Uncomment tests * cleanup * Disable dev mode for WinML	2020-09-16 10:46:27 -07:00
Changming Sun	8946d212bf	Remove the dependency on CUDA SDK's version.txt (#5155 )	2020-09-14 14:25:28 -07:00
Wenbing Li	2a456d16c0	Enable onnxruntime iOS shared library build. (#5148 )	2020-09-14 10:32:39 -07:00
Scott McKay	323a1ba8a4	Add option to exclude support for loading ORT format models in full build. (#5129 ) * Add ability to exclude support for loading ORT format models. Disable support for ORT format models in packages	2020-09-12 12:21:30 +10:00
Tiago Koji Castro Shibata	62848c4de5	Add store builds to nuget packaging (#5040 ) * Nuget store packaging * Move DNNL workaround to EP * Fix warning as error * Disable store tests * Skip store tests * msbuild target * Cross compile protoc in Store * Disable DML in store * Move store builds to CPU queue * Copy uap10 to final nuget * Fix pip8 error * Remove extra dml copies * Fix argparse * pep8 * Forward IsStoreBuild * Apply is_store_build to duplicate generate_nuspec * runtimes * Refactor uap10 * Store .NET * uap * PR feedback	2020-09-09 21:38:14 -07:00
Scott McKay	dbf4e7019d	Add ability to generate configuration file with required operators. (#5089 ) * Add ability to generate configuration file with required operators.	2020-09-09 21:39:17 +10:00
Scott McKay	80ada0291f	Improve the minimal build size on android and linux (#5086 ) Fix bug where linux build fails when python is enabled and rtti is disabled Update doco for new build settings	2020-09-09 21:38:34 +10:00
gwang-msft	a1a81470e3	Add minimal build binary size verification (arm64) to Android CI (#5087 ) * Add minimal build binary size verification (arm64) to Android CI * Add comments in the CI yaml	2020-09-09 19:06:20 +10:00
Cameron Maske	4553b2eecd	Expose DirectML provider to python (conflicts resolved from #3359 ) (#4630 )	2020-09-08 14:34:09 -07:00
liqunfu	de58720a97	Liqun/transformer test and e2e golden numbers (#5064 ) * match new/old api numbers * new golden numbers for Roberta and MC Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-09-04 18:11:37 -07:00
Scott McKay	b5c2932ae8	Last major set of ORT format model changes (#5056 ) * Add minimal build option to build.py Group some of the build settings so binary size reduction options are all together Make some cmake variable naming more consistent Replace usage of std::hash with murmurhash3 for kernel. std::hash is implementation dependent so can't be used. Add initial doco and ONNX to ORT model conversion script Misc cleanups of minimal build breaks.	2020-09-05 07:59:01 +10:00
Thiago Crepaldi	0fc9c504fe	Re-enable CI tests for the new PyTorch frontend (#5017 ) This PR includes: * Re-enable CI tests for new PyTorch frontend * Re-enable fp16 and adjust tolerances for number matching	2020-09-04 09:36:24 -07:00
Andrews548	bd215b79a2	ACL v20.02 (#4981 ) * Add ACL version 20.02 * fix logging typo * check depthwise operation based on group param * Generate ArmNN runtime inside class constructor * Update to the latest ONNX operation set * Update BUILD.md Co-authored-by: Andrei-Alexandru <andrei-alexandru.avram@nxp.com>	2020-09-03 20:44:27 -07:00
Nat Kershaw (MSFT)	8a03b6e5c7	Render Operator documentation as compliant markdown (#3658 )	2020-09-02 15:07:50 -07:00
RandySheriffH	14b51d6502	CiPipeline@ReducedOpsBuild (#4917 ) * cancel night build on pyop * setup ci pipeline for build of reduced ops * add back c# test * remove debugging print * add testing model * add more arg in pipeline script * disable pipeline trigger temporarily * fix yaml format * fix yaml format * fix pipeline error * rid c# test * add ops for test cases * add Conv from domain com.microsoft.nchwc * remove --reduce_ops * fix typo * remove --build_java * add test case for excluded op * update doc with --skip_test * formatting code, renaming files and simplify yaml * remove debug build from yaml * remove surplus ops from included_ops.txt * add MinSizeRel build to yaml * rename test cases and models * exclude ir test from minimum build * restrict ir test to be only applied to reduced ops build	2020-08-31 21:21:18 -07:00
Ashwini Khade	0d3bbfdd0f	enable nuget packaging in local builds (#4884 ) * enable building nuget packages * add nuget creation from build.py * add documentation * fix flake8 errors * fix nuget package version * enable csharp tests * update csharp tests * copy nuget packges to nuget-artifacts * add libmklml_gnu * plus review updates * fix references for release builds	2020-08-26 12:33:48 -07:00
Rayan-Krishnan	eb05db5a2a	Fix OptimizerConfig params groups (#4877 ) * Copy samples to build folder and load models from there. Fix CI * This PR also includes a fix to path validation for save_as_onnx API * Add torchtext to CI for GPU training * Remove new frontend tests from CI Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>	2020-08-22 22:04:17 -07:00
liqunfu	6260d073b3	Glue parallel training (#4550 ) add mpi size, rank python API add single node parallel training example	2020-08-21 21:24:27 -07:00
RandySheriffH	3fa73a5b6a	ReduceBinarySize (#4747 ) * cancel night build on pyop * add rewriter to rewrite cpu provider * skip BuildKernelCreateInfo<void> * refactor variable name and comment * include ops from csv file * process multiple eps * add default function to cuda provider * rename function and add license header * fix import * add doc * fix typo * deal with empty kernel entry in cuda * rename the rewriter file * add comment into provider file * add comment and rename function * log warnings * refactor extracting logic * add entry for script to run solo * add better example * avoid onnx importing * fix flake8 alerts * minor fixes to better comments and doc * add entries for all domains * add void entry into contrib providers * format cuda_contrib_kernels.cc * format cpu_contrib_kernels.cc * add all providers * add default entry to all providers * include op_kernel header * cancelling change in providers beyond cpu/cuda * rename file and switch file format to domain;opset;op1,op2... * update doc * restore non-regular ending grammar in cuda_contrib_kernels.cc * add ort_root as input argument of script * enable test in ci * update doc * update doc * revert change on linux gnu ci * switch to set to host ops * simplify trimming logic * add domain map to track current model * allow ort_root to take relative path	2020-08-21 19:50:13 -07:00
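The commit above mentions switching the included-ops file to a `domain;opset;op1,op2...` layout. A minimal reader for such a file might look like the sketch below. The function name `parse_required_ops`, the handling of blank lines and `#` comments, and the assumption of a single integer opset per line are all hypothetical; the real script may differ.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def parse_required_ops(lines: Iterable[str]) -> Dict[Tuple[str, int], Set[str]]:
    """Map (domain, opset) to the set of operator names listed for it."""
    required = defaultdict(set)
    for raw in lines:
        line = raw.strip()
        # Skipping blank lines and '#' comments is an assumption of this sketch.
        if not line or line.startswith("#"):
            continue
        # Exactly three ';'-separated fields are expected; anything else raises ValueError.
        domain, opset, ops = (field.strip() for field in line.split(";"))
        required[(domain, int(opset))].update(
            op.strip() for op in ops.split(",") if op.strip()
        )
    return dict(required)
```

For example, the two lines `ai.onnx;12;Conv,Relu` and `com.microsoft;1;FusedConv` would produce one entry per (domain, opset) pair, which a build script could then use to decide which kernel registrations to keep.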
Thiago Crepaldi	42408aa3ed	Add new PyTorch front-end (#4815 ) * Add ORTTrainerOptions class for the new pytorch frontend (#4382) Add ORTTrainerOptions class and some placeholders * Add _ORTTrainerModelDesc to perform validation for model description (#4416) * Add Loss Scaler classes to the new frontend (#4306) * Add TrainStepInfo used on the new frontend API (#4256) * Add Optimizer classes to the new frontend (#4280) * Add LRScheduler implementation (#4357) * Add basic ORTTrainer API (#4435) This PR presents the public API for ORTTrainer for the short term development. It also validates and saves input parameters, which will be used in the next stages, such as building ONNX model, post processing the model and configuring the training session * Add opset_version into ORTTrainerOptions and change type of ORTTrainer.loss_fn (#4592) * Update ModelDescription and minor fix on ORTTrainer ctor (#4605) * Update ModelDescription and minor fix on ORTTrainer/ORTTrainerOptions This PR keeps the public API intact, but changes how model description is stored on the backend. Currently, users create a dict with two lists of tuples. One list is called 'inputs' and each tuple has the following format tuple(name, shape). The second list is called 'outputs' and each tuple can be either tuple(name, shape) or tuple(name, shape, is_loss). With this PR, when this dict is passed in to ORTTrainer, it is fully validated as usual. However, tuples are internally replaced by namedtuples and all output tuples will have tuple(name, shape, is_loss) format instead of is_loss being optionally present. In addition to that normalization in the internal representation (which eases coding), two internal methods were created to convert a namedtuple(name, shape) to namedtuple(name, shape, dtype) or namedtuple(name, shape, is_loss, dtype) depending on whether the tuple is an input or output. This is necessary as ORTTrainer finds out data types of each input/output during model export to onnx. 
Finally, a minor fix was done on ORTTrainer. It could initialize ORTTrainerOptions incorrectly when options=None * Rename input name for test * Add ONNX Model Export to New Frontend (#4612) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com> * Create training session + minor improvements (#4668) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> * Save ONNX model in file (#4671) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> * Add eval step (#4674) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> * Add train_step (#4677) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> * Add LR Scheduler (#4694) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com> * Add deterministic compute tests (#4716) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com> * Add legacy vs experimental ORTTrainer accuracy comparison (#4727) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com> * Add Mixed precision/LossScaler + several fixes (#4739) Additionally to the mixed precision/loss scaler code, this PR includes: * Fix CUDA training * Add optimization_step into TrainStepInfo class * Refactor LRSCheduler to use optimization_step instead of step * Updated several default values at ORTTrainerOptions * Add initial Gradient Accumulation supported. 
Untested * Fix ONNX model post processing * Refactor unit tests * Add ONNX BERT example + minor fixes (#4757) * Fix training issue when passing ONNX file into ORTTrainer Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com> Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> * Add Dynamic Shape support (#4758) * Update DeepSpeed Zero Stage option to a separate option group (#4772) * Add support to fetches (#4777) * Add Gradient Accumulation Steps support (#4793) * Fix Dynamic Axes feature and add unit test (#4795) * Add frozen weights test (#4807) * Move new pytorch front-end to 'experimental' namespace (#4814) * Fix build Co-authored-by: Rayan-Krishnan <rayankrishnan@live.com> Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-08-17 09:45:25 -07:00
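The model-description normalization described in this commit (plain tuples replaced by namedtuples, with every output carrying an explicit `is_loss` field) can be illustrated with the sketch below. The names `InputDesc`, `OutputDesc`, and `normalize_model_desc` are hypothetical and chosen for this example only; defaulting `is_loss` to `False` when it is omitted is also an assumption, and the dtype-extension step mentioned in the commit is not shown.

```python
from collections import namedtuple

# Hypothetical internal representations; the real class layout may differ.
InputDesc = namedtuple("InputDesc", ["name", "shape"])
OutputDesc = namedtuple("OutputDesc", ["name", "shape", "is_loss"])


def normalize_model_desc(model_desc):
    """Validate a user dict and return it with namedtuple entries."""
    inputs = [InputDesc(name, shape) for name, shape in model_desc["inputs"]]
    outputs = []
    for entry in model_desc["outputs"]:
        if len(entry) == 2:
            name, shape = entry
            is_loss = False  # assumed default when the flag is omitted
        elif len(entry) == 3:
            name, shape, is_loss = entry
        else:
            raise ValueError(f"output must have 2 or 3 fields, got {len(entry)}")
        outputs.append(OutputDesc(name, shape, bool(is_loss)))
    return {"inputs": inputs, "outputs": outputs}
```

Normalizing once at the boundary means later code can always read `output.is_loss` instead of checking the tuple length at every use.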
Changming Sun	5eec4f66ed	Refactor manylinux docker image and the related pipelines (#4751 ) 1. Publish the image to ACR, instead of building it every time for every PR 2. Make USE_MKLML and USE_OPENMP be able to co-exist. Currently both of them are enabled in our Linux CI build but only one of them actually takes effect. 3. Split nuphar and DNNL into separate pipelines. 4. Fix two warnings in onnxruntime/core/optimizer/matmul_scale_fusion.cc and onnxruntime/test/tvm/tvm_basic_test.cc. 5. Update the manylinux2010_x86_64 image to the latest.	2020-08-17 09:40:31 -07:00
gwang-msft	8507bc1f48	[Android NNAPI EP] Enable test for BatchNormalization, enable dev_mode for Android, fix some issues in concat (#4715 ) * update batch_norm test, enable dev_mode for nnapi, ignore onnx protobuf warning for nnapi ep * fix some issues in concat and mark input without shape as not supported for now * address review comments * addressed comments	2020-08-06 14:11:59 -07:00
Changming Sun	1e054739b8	Remove the requirement of CUDA's version.txt (#4706 ) Sometimes there is a file named "version.txt" in your CUDA installation dir, but sometimes there isn't one. I couldn't figure out why, but the latest CUDA 11 on our CI build machines doesn't have this file. As the file is not needed for building onnxruntime, I removed the check.	2020-08-04 17:03:40 -07:00
RandySheriffH	1fcd3eb376	cancel night build on pyop (#4673 )	2020-07-30 19:51:52 -07:00
gwang-msft	c2ec3b734b	[Android NNAPI EP] Remove dependency on external JD/DNNLibrary (#4576 ) * remove dependency of external jd-dnnlibrary * remove extra variables not used any more * update /cgmanifest.json	2020-07-22 14:08:12 -07:00
Andrews548	f20afc4991	Update ACL/ArmNN EP (#4571 ) * Add BN to ArmNN EP * Add Concat to ArmNN EP * ACL logging improvements * ArmNN logging improvements * Fallback to CPU for 9x9 convolution in ACL EP * Fallback to CPU for 9x9 convolution in ArmNN EP * Enable python support for ACL and ArmNN EPs when compiled with BSP toolchain * Removed the matmul operator * Fix conv infer shape function * Fix provider_names list for armnn Co-authored-by: Andrei-Alexandru <andrei-alexandru.avram@nxp.com>	2020-07-21 22:25:58 -07:00
Changming Sun	bc1d197ddf	Re-enable dnnl in CI build (#4544 ) * Revert "Temporarily remove dnnl from Linux CI build to unblock the whole team (#4266)" Previously it fails because it used too much memory. Now we only run dnnl EP with opset12 models in unit tests, to reduce peak memory usage.	2020-07-19 23:20:03 -07:00
Changming Sun	8ada440961	Move model tests to onnxruntime_test_all (#4521 ) 1. Move model tests to onnxruntime_test_all 2. Publish TestResults of Windows CI build.	2020-07-15 16:46:18 -07:00
liqunfu	f721f5f1cd	Liqun/multiple choice (#4480 ) * multiple choice runner * add docker cleanup task to frontend pipeline	2020-07-14 17:57:58 -07:00
liqunfu	0bff55512e	updated expected values for frontend test to pass frontend e2e pipeline. raise tolerance to reduce future risk of failure (#4497 ) * updated expected values for frontend test, raise tol	2020-07-13 19:25:54 -07:00
gwang-msft	5f8f443ac4	Android CI build, test copy, emulator boot improvement (#4481 ) * Enable onnxruntime_test_all for NNAPI EP * switch to use ninja for Android CI * make android emulator boot faster in android ci * simplify adb push * more style change * more tweaking on android ci * build.py style update	2020-07-13 14:18:34 -07:00
Hariharan Seshadri	26ebcfab88	Fix Nuget GPU pipeline (#4462 )	2020-07-10 14:02:28 -07:00
Hariharan Seshadri	6d6b6b54a5	Support binding a graph output to a specific device via the Python binding (#4439 )	2020-07-07 21:09:37 -07:00
EronsJ	632b2896f3	Onnxruntime fuzzing (#4341 ) * Add protobuf mutator library as a git submodule * Added files and instructions to build the protobuf mutator library in CMake * Added fuzzing flag to build system and added fuzzing dependency library. To run fuzzing test use the flags --fuzz_testing --build_shared_lib --use_full_protobuf --cmake_generator 'Visual Studio 16 2019' * Added src files and build instructions for the main fuzzing engine * Removed Random number generation test from inside the engine * Added license header to files * Removed all pep8 violations introduced by this change and other E501 violations	2020-07-06 16:34:34 -07:00
Tiago Koji Castro Shibata	7fea332f93	Support builds without RTTI (#4333 ) * Support builds without RTTI * Disable RTTI in all builds	2020-07-01 13:05:35 -07:00
gwang-msft	9e0f5fc7af	The initial PR for NNAPI EP (#4287 ) * Move nnapi dnnlib to subfolder * dnnlib compile settings * add nnapi buildin build.py * add onnxruntime_USE_NNAPI_BUILTIN * compile using onnxruntime_USE_NNAPI_BUILTIN * remove dnnlib from built in code * Group onnxruntime_USE_NNAPI_BUILTIN sources * add file stubs * java 32bit compile error * built in nnapi support 5-26 * init working version * initializer support * fix crash on free execution * add dynamic input support * bug fixes for dynamic input shape, add mul support, working on conv and batchnorm * Add batchnormalization, add overflow check for int64 attributes * add global average/max pool and reshape * minor changes * minor changes * add skip relu and options to use different type of memory * small bug fix for in operator relu * bug fix for nnapi * add transpose support, minor bug fix * Add transpose support * minor bug fixes, depthwise conv weight fix * fixed the bug where the onnx model input has mismatch order than the nnapi model input * add helper to add scalar operand * add separated opbuilder to handle single operator * add cast operator * fixed reshape, moved some logs to verbose * Add softmax and identity support, change shaper calling signature, and add support for int32 output * changed the way to execute the NNAPI * move NNMemory and InputOutputInfo into Model class * add limited support for input dynamic shape * add gemm support, fixed crash when allocating big array on stack * add abs/exp/floor/log/sigmoid/neg/sin/sqrt/tanh support * better dynamic input shape support; * add more check for IsOpSupportedImpl, refactored some code * some code style fix, switch to safeint * Move opbuilders to a map with single instance, minor bug fixes * add GetUniqueName for new temp tensors * change from throw std to ort_throw * build settings change and 3rd party notice update * add readme for nnapi_lib, move to ort log, add comments to public functions, clean the code * add android log sink and 
more logging changes, add new string for NnApiErrorDescription * add nnapi execution options/fp16 relax * fix a dnnlibrary build break * addressed review comments * address review comments, changed adding output for subgraph in NnapiExecutionProvider::GetCapability, minor issue fixes * formatting in build.py * more formatting fix in build.py, return fail status instead of throw in compute_func * moved android_log_sink to platform folder, minor coding style changes * addressed review comments	2020-06-26 00:02:39 -07:00
Aaron Bockover	64264c3846	Allow --cmake_generator to work on macOS (#4278 )	2020-06-24 16:30:33 -07:00
Changming Sun	deea945f80	Remove openmp and scipy from build pipelines (#4305 ) 1. Remove openmp because the default thread pool is already good enough. 2. Remove scipy from build pipelines because it has dropped support for Python 3.5.	2020-06-23 20:18:16 -07:00
Yufeng Li	867ba846f7	Implement MinMax with SIMD (#4285 ) * Implement MinMax with SIMD	2020-06-23 20:07:53 -07:00
Pranav Sharma	2204d39a06	Add build option to disable traditional ML ops from the binary. (#4272 ) * Add build option to disable traditional ML ops from the binary. * Fix python tests by splitting tests for ML ops to a separate file. Exclude ML tests from onnx_test_runner and C# tests. Exclude ML op sources. * Update Edge pkg pipelines with new MLops env variable and fix C# packaging pipeline tests to skip ML ops.	2020-06-20 06:36:06 -07:00
goloskokovic	478b923e19	Expose ACL/ARMNN providers to Python (#4260 ) * expose ACL/ARMNN providers to python * add -acl / -armnn to package name when use_acl / use_armnn is specified * build python wheel for ARMNN EP * link ACL/ARMNN EPs into onnxruntime_pybind11_state * wrong argument order in build_python_wheel for wheel_name_suffix	2020-06-18 20:24:14 +05:30
Weixing Zhang	b4b1c6440a	Enable ORT with CUDA 11 toolkit (#4168 ) * ORT on CUDA 11 1. Separate HOROVOD and MPI 2. Separate NCCL from HOROVOD in CMakeLists.txt 3. Remove dependency on external cub 4. cudnnSetRNNDescriptor is changed in cuDNN 8.0 * polish the code about MPI/NCCL in CMakeLists.txt and build.py * check CUDA version * ${MPI_INCLUDE_DIRS} should be PUBLIC * sm30, sm50 are deprecated in CUDA 11 Toolkit * update change based on code review feedback. * add sm_52 * improve MPI/NCCL build path Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2020-06-15 08:47:03 -07:00
Yulong Wang	73bc6be5d1	build: split nodejs binding build and test to avoid timeout issue (#4188 ) * split nodejs binding build and test * enable nodejs tests	2020-06-10 19:16:32 -07:00
George Wu	e8ed14bcb3	disable MEMLEAK CHECKER for openvino	2020-06-10 11:12:17 -07:00
Changming Sun	a7366d82af	Disable nuphar large model test (#4173 ) Disable nuphar large model test because it takes too long (40+ minutes), while the default CPU provider takes about 5 minutes. After this change, we still keep a lot of other nuphar model tests, which I think should be enough.	2020-06-09 17:45:17 -07:00
Wenbing Li	ee35320974	Fixes for Python scripts in ONNXRuntime (#4135 ) * Fixes for Python scripts in ONNXRuntime * update according to the comments	2020-06-08 10:27:32 -07:00
Cecilia Liu	b8db8076cb	Fix MKLML Tests Run (#4144 ) Add a path to LD_LIBRARY_PATH to fix library not found error when running mklml test cases.	2020-06-06 20:28:53 -07:00
liqunfu	ffed43e9b8	handle loss and name marching wrappers (#4066 ) * handle loss and name marching wrappers Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-06-05 23:34:26 -07:00
Andrews548	62b44527e5	Add ArmNN Execution Provider (#3714 ) * Add ArmNN Execution Provider Add a new execution provider targeting Arm architecture based on ArmNN. Validated on NXP i.MX8QM CPU with ResNet50, MobileNetv2 and VGG models. reviewed-by: mike.caraman@nxp.com * Minor fixes - renamed onnxruntime_ARMNN_RELU_USECPU to onnxruntime_ARMNN_RELU_USE_CPU - fixed acl typo * remove extra includes. added exception for ArmNN in test * fix indentation * Separated the activation implementation from the cpu and fixed the blockage from the endif Co-authored-by: Andrei-Alexandru <andrei-alexandru.avram@nxp.com>	2020-06-03 22:57:51 +05:30
Ashwini Khade	70d91a8550	re-enable graph optimizations during build phase (#4044 ) * re-enable graph optimizations during build phase * fix * re-enable optimizers for all provider tests	2020-06-01 10:32:42 -07:00
Scott McKay	1d441f89ac	Re-enable PEP8 check in Win CI build (#4075 ) * Add flake8 to Win CI build so it's re-enabled. It was in the static analysis build that is currently disabled so checks are not running. Fix build.py to be compliant again. Add prefix to flake8 output so it's (hopefully) easier to identify the errors in build output. * Add to all builds in Windows CPU CI so they all fail quickly if there's an issue.	2020-05-30 09:10:05 +10:00
edgchen1	38d76cc904	Clean up training E2E test (#4078 ) Update training E2E build to not go through CTest and call test scripts directly.	2020-05-29 09:20:47 -07:00
liqunfu	6665d5e2bc	Liqun/a transformer example (#3845 ) Add transformer glue test example to show how to use ORTTrainer to fine-tune a transformer model Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-05-27 15:21:35 -07:00
Yulong Wang	b3ec8035ee	[Node.js binding] add build flag for node.js binding (#3948 )	2020-05-27 13:30:22 -07:00
Paul Fultz II	7759136610	Add amd migraphx execution provider to onnx runtime (#2929 ) * Add amd migraphx execution provider to onnx runtime * rename MiGraphX to MIGraphX * remove unnecessary changes in migraphx_execution_provider.cc * add migraphx EP to tests * add input requests of the batchnorm operator * add to support an onnx operator PRelu * update migraphx dockerfile and removed one unused line * sync submodules with master branch * fixed a small bug * fix various bugs to run msft real models correctly * some code cleanup * fix python file format * fixed a code style issue * add default provider for migraphx execution provider Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>	2020-05-27 04:24:59 +08:00
Wei-Sheng Chin	24eda3df33	Create Utils for Adding Range and Marker (#4013 ) In this PR, we 1. create some APIs for creating NVTX objects 2. apply those APIs in pipeline-related operators and sequential executor. As a result, we can explicitly see how a pipeline schedule is run by GPUs in Nvidia's visual profiler. Note that these APIs are Linux only due to Nvidia's limited support.	2020-05-24 22:55:24 -07:00
edelaye	64b5f7edf6	Initial release of Vitis-AI Execution Provider (#3771 ) * Initial release of Vitis-AI Execution Provider * Add documentation, fix for onnxruntime::Model changes and use stringstream instead of file dump for model passing * - Add Vitis-AI docker file - Add online quantization flow Vitis-AI execution provider - Fix remarks * - Add fatal error build message for Vitis-AI cmake build on Windows - Fix pep8 issue in build.py - Add Vitis-AI execution provider example in docs Co-authored-by: Elliott Delaye <elliott@xilinx.com> Co-authored-by: Jorn Tuyls <jornt@xilinx.com> Co-authored-by: Jorn Tuyls <jtuyls@users.noreply.github.com>	2020-05-19 05:32:32 -07:00
Scott McKay	c6a94f95cf	Update Android instructions (#3971 ) Update Android build instructions to provide more information. Add info on testing directly on Android. Update build.py to better support using Ninja generator to build Android on Windows.	2020-05-19 07:30:45 +10:00
Scott McKay	5e0928a777	Enable running PEP8 on python scripts using flake8 (#3928 ) * Enable running PEP8 checks via flake8 as part of the build if flake8 is installed. Update scripts in \tools and \onnxruntime\python. Excluding \onnxruntime\python\tools which needs a lot more work to be PEP8 compliant. Also excluding orttraining\tools for the same reason. Install flake8 as part of the static_analysis build task in the Win-CPU CI so the checks are run in one CI build. Update coding standards doc.	2020-05-15 07:15:06 +10:00
gwang-msft	cba8bdc790	Make some compile change for Android NNAPI provider using DNNLibrary (#3935 ) * Change compile settings for NNAPI with DNNLib * update build.py * update build readme	2020-05-14 10:53:37 -07:00
Hariharan Seshadri	c00945ae81	Build ORT by default for Mac OS X versions 10.12+ (#3626 )	2020-05-12 14:43:32 -07:00
airockchip	edaf8a542c	Initial PR for RKNPU execution provider (#3609 ) * Initial RKNPU execution provider * Init * Support Ops: Conv, Relu, Clip, LeakyRelu, MaxPool, AveragePool, GlobalAveragePool, Concat, Softmax, BatchNormalization, Gemm, Add, Mul, Sub, Reshape, Squeeze, Unsqueeze, Flatten, Transpose, QLinearConv, DequantizeLinear * Add rknpu unittest * Update BUILD.md and Add RKNPU-ExecutionProvider.md * misc code update * fix CLIP accuracy issue. * fix "Error: Duplicate definition of name". * move rknpu_ddk out of onnxruntime submodule. * remove temporary code. * add rknpu namespace. * update misc of node_attr_helper * add const & comment for onnx_converter * add const & comment for shaper * unify variable name Co-authored-by: dkm <dkm@rock-chips.com> Co-authored-by: George Wu <jywu@microsoft.com>	2020-05-05 20:36:47 -07:00
liqunfu	af3988198c	Liqun/e2e transformer test (#3540 ) * initial change to transformer.py * prepare e2e transformer tests * refactor transformer tests * put test python files in a flat folder * fix typo pip install transform(s) * python 3.6 * python version to 3.6 in install_ubuntu.sh * remove argparser * to use opset ver 12 * workaround loss_scale naming patch in case of loss_fn_ * assign self.loss_fn_ so it can be checked * skip a few un-needed post-process steps * fix loss_scale_input_name, clean up post process steps * skip non-frontend tests * move cpu/cuda related files to coresponding cpu/cuda folder (#3668) Co-authored-by: Weixing Zhang <wezhan@microsoft.com> * type cast for ratio is not necessary for dropout (#3682) Co-authored-by: Weixing Zhang <wezhan@microsoft.com> * thrustallocator is not needed since cub is used directly for gather now. (#3683) Co-authored-by: Weixing Zhang <wezhan@microsoft.com> * GatherND-12 Implementation (#3645) * Renamed, UT passing * Move GatherND CUDA Kerenl into onnxruntime * Merge GatherNDOpTest * Refactor Test code * Merge CPU Kernel Impl * Handle Negative Indice, Fix UT * Improve CUDA kernel to handle negative index * Minor Fixes * Preserve GatherND-1 Cuda kernel * Fix Mac build * fix UT * Fix Build * fix GatherNDOpTest.double > CUDA error cudaErrorInvalidDeviceFunction:invalid device function Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: Peng Wang (pengwa) <pengwa@microsoft.com> * update with reviewers' comments * testBertTrainingGradientAccumulation was not using rtol and may fail occasionally with small (e-06) difference * fix merge mistakes Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: Weixing Zhang <weixingzhang@users.noreply.github.com> Co-authored-by: Weixing Zhang <wezhan@microsoft.com> Co-authored-by: Sherlock <baihan.huang@gmail.com> Co-authored-by: Sherlock Huang 
<bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: Peng Wang (pengwa) <pengwa@microsoft.com>	2020-04-30 12:26:38 -07:00
David Brownell	7296e06dd5	Properly creating arguments to pass to setup.py (#3744 )	2020-04-29 09:47:51 -07:00
suryasidd	e529464a12	Limit the number of models run on OpenVINO (#3742 ) * Removed NMS from supported list	2020-04-29 02:23:09 -07:00
Tixxx	0638565fe0	Fix evaluation issues (#3538 ) * allow switching between eval and training modes dynamically Co-authored-by: Tixxx <root@525204a066204ea794f942530b05ae7f000000.axlncovkyjne5caro2tmz3zryb.xx.internal.cloudapp.net>	2020-04-28 21:03:37 -07:00
gwang-msft	12d7c2f6e4	iOS cross build on MacOS (#3699 ) * Enable iOS cross build on MacOS (step#1) * Changed parallel option * fixed style issues * Enable ios arm64 crossbuild on MacOS * Enable ios arm64 crossbuild on MacOS * Enable parallel build for xcode * Fix arm64 function not 4-byte aligned warning * Rename onnxruntime_ios.cmake to onnxruntime_ios.toolchain.cmake * change build.py to use the new ios toolchain file name	2020-04-28 17:09:31 -07:00
edgchen1	e22d97ba56	Merge pull request #3643 from microsoft/ort_training_for_merge_to_master Introduce ORT training implementation	2020-04-25 07:15:22 -07:00
Sheil Kumar	a475f2824d	Create the Nuget WindowsAI Pipeline (#3684 ) * add windowsai.yml for new Microsoft.AI.MachineLearning nuget * temporarily add windowsai.yml to gpu.yml * pass in build arch * remove install onnx task * no dml for arm or arm64 * refactor nuget pipeline defs * update package creation * pass in build and sources path * missing hyphens * copy license file * fix parameter variable * disable arm builds for now * remove commented script block * download pipeline atifcat name update * set working dir * Add bundling nuget script * path combine * null path * combine needs parentheses * binplace microsoft.* dlls in new nuget package * update artifact name * move merged nuget to artifacts directory * move to merged subfolder in artifacts staging dir * forward slash to back * enable arm * vcvarsall needs x64 vars setup * Run Tests * fix tests * move global variables * update yml to not have global variable in template * removed parameters * fixes * Add build arch as an env variable * ne not neq * %Var% for batch script * dont pass argument for x64 * disable arm tests * skip csharp/cxx tests for microsoft nuget package * remove test-win as it tests only c# cxx and capi * test build for store apps * dont build for store * tools/nuget/generate_nuspec_for_native_nuget.py * remove args. 
* add new props and targets for microsoft.ai * make windowsai props/targets static * add dependency * dont ship dot net props * Remove c# fom windowsai nuget * copy license file * native packages must have win10 as the platform, not win * cuda header in wrong if branch * no dml for arm builds * only build dml for x64/ x86 * User/sheilk/props update (#3616) * prelim store work * props * Fix desktop nuget props/targets * clean up targets and make store apps work Co-authored-by: Sheil Kumar <sheilk@microsoft.com> * update windowsai.yml with latest * remove extra dloadhelpers * Add abi headers to abi dir, and reference native includes * update windowsai.yml * minor update * remove parameters * add doesrp param * hard code esrp to true * add directml for x86/x64 * revert gpu yml changes * add store builds * add store builds * add checks again in old way * dup job names for store and desktop builds * move all of the runtime binaries to win10 folder * only set safeseh on x86 * disable the store builds for now... missing msvcprt.lib * copy paste deletion... * switch back to win- (#3646) Co-authored-by: Sheil Kumar <sheilk@microsoft.com> * use stahlworks * & not supported in ado * add cuda to cpu nuget(???) and EnableDelayedExpansion to enable x86 dml package * revert nocontribops * add underscore... * extra win/win10 change * merged nuget... still not being bundled... * files in merged directory * missing parens causing dml to be included in cpu package * more diagnostic info * switch dir to get-childitem * wait for compression to complete * add winml_adapter to mkml and gpu packages * enable_wcos * add mklml binaries * props and targets missing from mklml Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2020-04-24 20:20:04 -07:00
Ethan Tao	e9f1e7e797	resolve conflicts	2020-04-24 15:15:36 -07:00
edgchen1	7347c73139	Revert "resolving conflicts from master (#3691 )" (#3696 ) This reverts commit `c38a60a450`.	2020-04-24 14:49:00 -07:00
ytaous	c38a60a450	resolving conflicts from master (#3691 ) * resolving conflicts * resolving conflicts * resolving conflicts * resolve conflicts Co-authored-by: Ethan Tao <ettao@microsoft.com>	2020-04-24 14:38:30 -07:00
S. Manohar Karlapalem	6d4f2f5bf9	OpenVINO EP v2.0 (#3585 ) * Added FP16 transformations * Revert "Added CMAKE_BUILD_TYPE to make building dynamic" This reverts commit d3e17af1af655cfdc4d2fec33f52055caa525e85. * Added FP16 transformations for FP16 builds * Backend logic cleanup Cleans the backend(intel_graph.) code in the following ways:- 1. Minimize global usage: Since all the IR graphs need to be re-generated on every Infer, it is bad practice to rely on globals for their saving and usage as there would be multiple readers and writers to the same global variable leading to incorrect usages or contentions. This change replaces globals with locals where possible. This change also fixes an existing bug with due to incorrect global usage. 2. Remove all unused functions. 3. Remove all unused headers and prepocessor directives. removed commented out code * Disabled default optimization for Intel EP Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com> * Fix missed plugins.xml for python bindings * Fixed the build after latest master changes Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com> * Disabled unsupported ops for accelerators Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com> * Added some more disabled ops Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com> * Added environment variable to enable debugging Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com> * Added more debug statements Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com> * Fixed unsupported ops list for GPU and VPU Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com> * Fixed unsqueeze unit tests Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com> * Added error message to the status Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com> * Overwrite Model proto with shape info from data Overwrites the shape info of Model proto with the shape from actual input data. 
Needed for inferring models with Dynamic shapes. * Removed print statement and disabled where op Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com> * Disabled Reshape with Empty initializer * Added more debug statements for 1P * Don't allow 1D inputs with symbol for dimension * Disabled some 3rd phase ops * Disabled split and added zero dimension check for OutputDefs * Cleanup zero dimensionality check * Added different data type check for inputs and initializers * Added conditions for Mod, Cast and Pad * Removed unused variable * Disabled scan and added conditions for squeeze * Added changes for fixing all C++ unit tests * Implements Backend Manager class for caching Backend Manager provides a layer of indirection between EP interface and OV backend that provides caching services for models with symbolic dims in input shapes. * clean up commented blocks * clang-formatting * Read I/O type info from ModleProto Read the tensor element type information from ModelProto object, as FusedNode is no longer available. * code cleanup * clang-formatting * Added print statement for jenkins * Disabled some python tests * Changed the path of convert fp32 to fp16 hpp * Added conditions for BatchNorm in GetCapability * Fixed failed tests * Revert "Added conditions for BatchNorm in GetCapability" This reverts commit c3c28c3b00d27892c42546b35dacdd807a48ee90. * Added Intel to onnxruntime backends * pick up vars set by OV package setupvars.sh * Added conditions for Identity * remove a few cout prints * Added conditions for GPU_FP32 unit tests * Revert "pick up vars set by OV package setupvars.sh" This reverts commit 8199e029c03eae21a1a7ef6bfdc93d00e5d0198b. 
* Commented out fatal message for protobuf * Might need to be removed * Add interface class for current backend * moved common logic to base class * simplified cpu backend * Removed unused headers * use vectors to save i/o tensors for windows compatibility * move utils fxns to backend_utils namespace * rename ov_backend to ibackend * Factory pattern for backend creation * rename CPU backend to Basic backend * renamed to vad-M and added to factory list * Added conditions for VPU * Added print statements * Changed the logic for checking for symbolic shapes * Modified logic for zero dimension check * Removed VPU single dimension condition * Removed comments * Modified logic in DimensionCheck method * Remove legacy OpenVINO EP Remove all the legacy code for OpenVINO EP. UEP code will take its place going forward. This change does NOT remove OVEP files in the following areas asa they will be reused by UEP:- 1. Documentation: All .md files 2. Docker releated files 3. Python bindings 4. Java bindings 5. C# bindings 6. ORT Server 7. 
CI pipeline setup files * Rename Intel EP to OpenVINO EP * Added unique names to the subgraphs * Removed subgraphs with only constant inputs * Modified subgraph partitioning algorithm to remove const input subgraphs * Apply suggestion to onnxruntime/core/providers/openvino/openvino_execution_provider.cc * Tracking output names to fix the output order bug * Changed output names to a unordered map * Modified logic to check for symbolic input shapes * Fixed a bug in Reshape check * Added empty model path to Model constructor * Made necessary changes to cmake to build from the binary package * Changed INTEL_CVSDK_DIR to INTEL_OPENVINO_DIR * Enable dyn device selection with C++ API * Added Round operator to unsupported list * Modified subgraph partition logic for MYRIAD * Removed supported ops from the list * Enable dyn dev selection in Py API's * Add documentation for dynamic device selection * Use MYRIAD \|\| HDDL instead of VPU * Removed temporary cast of Int64 to FP32 * Disabled unit Tests for CPU_FP32 and GPU_FP32 * Removed default "CPU" from unit tests to allow overriding * Removed ops Concat, Squeeze, Unsqueeze from unsupported list * Get the device id from info * Removed overwriting device_id and precision * Enabled ConvTranspose and EyeLike * Reordered unsupported ops in alphabetical order * Fixed syntax error * Fixed syntax error * Code clean-up: Handle exceptions, logs and formatting Code formatted according to ORT coding guidelines. 
* remove debug print from pybind code * updated docs with ops and models * formatting prints * Added default values for c and j for openvino * Overriding the values set for c and j to be 1 * BACKEND_OPENVINO should be empty if openvino is not in build * Overriding c value with default for perftest * fix VAD-M device string bug * Add IE error details to exceptions * Use IE specific device names in EP * Add VAD-F (FPGA) device support * Removed unecessary libraries from whl package * Code changes for Windows compatibility * Add VAD-F option to python API * [revert before merge] cmake changes for RC * Enable Windows build in CMake * Unset macro OPTIONAL for windows builds inference_engine.hpp's include chain defines a macro 'OPTIONAL' which conflicts with onnx project's headers when using MSVC. So would need to explictly unset it for MSVC. * Use a single copy of plugin/IE::Core Defined as a static member in Backend manager * Remove restriction of single subgraphs for myriad * Passed subgraph name to Backend to enhance log statements * Disabled zero dimension conditions * Disabled concat to remove zero dims * Enabled building ngraph as part of ORT * Removed serializing and added versioning * Fix CPU_FP32 unit tests * Removed unecessary condition * add ngraph.so.0.0 to .whl * Check for zero dimensions only for inputs and outputs * Restrict loading only 10 subgraphs on myriad * Build ngraph.dll within UEP. Doesn't link yet * Rename Linux included libngraph.so to libovep_ngraph.so Renames locally built libngraph.so containing ONNX importer to libovep_ngraph.so in order to avoid linkage conflicts with libngraph.so supplied by OpenVINO binary installer. Applies only for Linux builds. * use output_name cmake properties for lib name * fix .so name format in lib_name.patch * CMake code cleanup * Rename WIN32 included ngraph.dll to ovep_ngraph.dll To avoid conflict with ngraph.dll distributed by openvino. 
* Added myriad config for networks without 4 dimensions * Loading the 10 max clusters for inference on myriad * Refactor code and add Batching support Encapsulate subgraph settings into context structs. Add batching support for completely supported models. * Disabled some broken tests * use input_indexes to avoid batch-checking initializers * Avoid static initialization order error on WOS * Added candy to broken tests * InternalCI changes for 2020.2 * Updated DLDT instructions * Unsaved changed in install_openvino.sh * Changes after manual check * Remove custom ngraph onnx_import build for WOS ONNX Importer on WOS does not have protobuf issue. * Remove FP32ToFP16 ngraph pass This conversion is performed implicitly within IE. * Surround debug logic by #ifndef NDEBUG * remove invalid TODO comments * removed references to ngrpah-ep * clang-formatting * remove commented code * comment edits * updating copyright year to that of first OpenVINO-EP release * remove redundant log msg * Modified operator and topology support * Update build instructions * doc formatting * Fixed clip unit tests * Revert "Remove FP32ToFP16 ngraph pass" This reverts commit ec962ca5f315a5658ad980e740196f19de2639c1. 
* Applying FP16 transformation only for GPU FP16 * Fixed GPU FP32 python tests * automatically use full protobuf * disable onnxrt server for now * Disabled upsample * update dockerfile instructions * Removed MO paths and added ngraph path * Remove OVEP from ORT Server docs Will put it back in after validation * Updated path to Ngraph lib * Disabled Resize and some other python tests * Removed unnecesary header files * Use commit SHA to fetch ngraph repo * Avoid un-needed file changes due to version update * Fixed clip tests * Fixed Pow, max and min onnx tests * build.md doc typo * Update cmake patch command for ngraph src * remove dead cmake code for onnxruntime_USE_OPENVINO_BINARY * use spaces instead of tab * remove commented code * Add info about protobuf version * edit debug env var and enable for WIN32 * specify only version tag of 2020.2 for dockerbuilds * remove unnecessary file changes * Pass empty string as default argument to C# tests * Use ${OPENVINO_VERSION} to name openvino install directory in CI builds * Enabled unnecessarily disabled tests * Fixed ngraph protobuf patch * Fixed error in protobuf patch * Revert "Use ${OPENVINO_VERSION} to name openvino install directory in CI builds" This reverts commit 89e72adb8bf3b9712f5c81c5e13fe68c6c0df002. * Remove unsetting OPTIONAL macro This is no longer used in recent ONNX update onnx/onnx@da13be2, so this unset workaround is no longer necessary. * Use a null string default argument for C# API * Set OpenVINO version yml files and pass to CI Docker builds Git Tag info for DLDT as well as install directory are set using this value. This reverts commit 9fa9c20348ed72ae360a95c98e9b074d2f9fafc5. 
* Documentation: recommendation and instructions for disabling ORT graph optimizations * more doc updates * Reduced the number of models according to CI time constraints Co-authored-by: ynimmaga <yamini.nimmagadda@intel.com> Co-authored-by: suryasidd <surya.siddharth.pemmaraju@intel.com> Co-authored-by: Mikhail Treskin <mikhail.treskin@intel.com> Co-authored-by: mbencer <mateusz.bencer@intel.com> Co-authored-by: Aravind <aravindx.gunda@intel.com> Co-authored-by: suryasidd <48925384+suryasidd@users.noreply.github.com>	2020-04-24 04:06:02 -07:00
Edward Chen	deac467683	Merge remote-tracking branch 'origin/master' into edgchen1/merge_from_master	2020-04-23 20:50:33 +00:00
David Brownell	3ce31933bb	Wheel file updates for FeaturizerLibrary data (#3640 )	2020-04-23 13:27:22 -07:00
gwang-msft	02bae6bd06	Not use OpenMP for android build (#3636 )	2020-04-22 21:17:05 -07:00
Changming Sun	00917917d6	Downgrade numpy requirement to 1.16.6 (#3635 )	2020-04-22 16:11:33 -07:00
Edward Chen	87fad09c7b	Fix merge issue.	2020-04-21 03:44:44 +00:00
Edward Chen	daa14b64e3	Merge remote-tracking branch 'origin/master' into edgchen1/merge_from_master	2020-04-21 03:31:32 +00:00
Prabhat	381fee47ab	Added support to build onnxruntime with ACL (#3586 ) * Added support to build onnxruntime with ACL * Added ACL build instructions	2020-04-20 13:35:28 +05:30
Prabhat	ea62b3435a	Clean up build.py code (#3466 )	2020-04-18 20:48:30 -07:00
Hariharan Seshadri	b4457ecb7a	Fix `gen_doc` build option and refresh documentation (#3545 ) * Support listing keys in custom metadata map via C/C++ API * nit * PR feedback * Nit * Initial commit * More changes * Support listing keys in custom metadata map via C/C++ API * nit * PR feedback * Nit * Initial commit * More changes * Add md files * Doc changes * Update * revert cmake changes * Update * Doc change * Update * Update	2020-04-17 14:41:04 -07:00
Sheil Kumar	2717c178cc	Fork the WinML APIs into the Microsoft namespace (#3503 ) * Migrate winml to Microsoft Namespace (packaging changes are pending) * add ns_prefix toggle * fix packaging * Users/sheilk/add missing raw header (#3484) * add dualapipartition * wrong variable for repo root Co-authored-by: Sheil Kumar <sheilk@microsoft.com> * remove existence check to force failures * extra paren * dualapipartition needs to be referenced from the source * add microsoft.ai.machinelearning.dll to the output dir * rename the idl file so that assembly info is correctly added into the winmd * fix namespaces * update namespaces * default to microsoft, and add namespace override as build argument * update cmakesetings.json as well * remove from cmakelists.txt Co-authored-by: Sheil Kumar <sheilk@microsoft.com> Co-authored-by: Changming Sun <chasun@microsoft.com>	2020-04-17 06:18:54 -07:00
Changming Sun	1a222b3f6e	Disable downloading test data on Windows (#3551 ) * Disable downloading test data on Windows	2020-04-16 22:15:20 -07:00
harshitha	80e0c64e2e	merged with master	2020-04-16 17:13:36 +00:00
David Brownell	006c5be1b1	Optionally produce a python wheel that includes featurizers (#3491 )	2020-04-14 09:00:13 -07:00
M. Zeeshan Siddiqui	5d99f179b9	Merge pull request #3486 from microsoft/sedymche/merge_master_ort_training Merge from master into ort_training	2020-04-13 10:55:36 -07:00
liqunfu	e7297e6c9d	create pipeline for ci frontend tests (#3422 ) create pipeline for nightly python front-end e2e tests	2020-04-09 15:31:22 -07:00
Sergii Dymchenko	6ba7c99e50	Merge branch 'master' into ort_training	2020-04-09 12:42:04 -07:00
Changming Sun	33006f48c0	Update onnx submodule to 1.7.0 release candidate (#3405 ) Update onnx submodule to 1.7.0 release candidate. This isn't a release tag, but it will be released soon, in 1-2 weeks.	2020-04-04 16:23:42 -07:00
Tiago Koji Castro Shibata	1c334ed0f1	Add Ninja generator to build.py (#3331 )	2020-04-01 14:19:22 -07:00

481 commits