Commit graph

217 commits

Author SHA1 Message Date
Scott McKay
dbf4e7019d
Add ability to generate configuration file with required operators. (#5089)
* Add ability to generate configuration file with required operators.
2020-09-09 21:39:17 +10:00
Scott McKay
80ada0291f
Improve the minimal build size on android and linux (#5086)
Fix bug where linux build fails when python is enabled and rtti is disabled
Update doco for new build settings
2020-09-09 21:38:34 +10:00
gwang-msft
a1a81470e3
Add minimal build binary size verification (arm64) to Android CI (#5087)
* Add minimal build binary size verification (arm64) to Android CI

* Add comments in the CI yaml
2020-09-09 19:06:20 +10:00
Cameron Maske
4553b2eecd
Expose DirectML provider to python (conflicts resolved from #3359) (#4630)
2020-09-08 14:34:09 -07:00
liqunfu
de58720a97
Liqun/transformer test and e2e golden numbers (#5064)
* match new/old api numbers

* new golden numbers for Roberta and MC

Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-04 18:11:37 -07:00
Scott McKay
b5c2932ae8
Last major set of ORT format model changes (#5056)
* Add minimal build option to build.py
Group some of the build settings so binary size reduction options are all together
Make some cmake variable naming more consistent
Replace usage of std::hash with murmurhash3 for kernel. std::hash is implementation dependent so can't be used.
Add initial doco and ONNX to ORT model conversion script
Misc cleanups of minimal build breaks.
2020-09-05 07:59:01 +10:00
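
The std::hash replacement in #5056 matters because a hash that is stored in a serialized model has to come out the same on every platform and standard library. As an illustration only, here is a minimal pure-Python sketch of the public MurmurHash3 x86 32-bit algorithm. The function name `murmur3_32` and the sample key are hypothetical, and this is not the code ORT itself uses.

```python
def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 (x86, 32-bit variant); returns an unsigned 32-bit integer."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    mask = 0xFFFFFFFF

    def rotl(x: int, r: int) -> int:
        # 32-bit rotate left.
        return ((x << r) | (x >> (32 - r))) & mask

    h = seed & mask
    n_blocks = len(data) // 4

    # Body: mix each complete 4-byte little-endian block into the state.
    for i in range(n_blocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = (k * c1) & mask
        k = rotl(k, 15)
        k = (k * c2) & mask
        h ^= k
        h = rotl(h, 13)
        h = (h * 5 + 0xE6546B64) & mask

    # Tail: mix the remaining 0-3 bytes, if any.
    tail = data[4 * n_blocks:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & mask
        k = rotl(k, 15)
        k = (k * c2) & mask
        h ^= k

    # Finalization: fold in the length, then avalanche the bits.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & mask
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & mask
    h ^= h >> 16
    return h


# Hypothetical key; the point is that the value depends only on the bytes and the seed.
print(hex(murmur3_32(b"ai.onnx:Conv:11")))
```

Because every step is fixed-width integer arithmetic on explicitly little-endian input, the result does not depend on the compiler or standard library, which is the property std::hash does not guarantee.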
Thiago Crepaldi
0fc9c504fe
Re-enable CI tests for the new PyTorch frontend (#5017)
This PR includes:

* Re-enable CI tests for new PyTorch frontend
* Re-enable fp16 and adjust tolerances for number matching
2020-09-04 09:36:24 -07:00
Andrews548
bd215b79a2
ACL v20.02 (#4981)
* Add ACL version 20.02

* fix logging typo

* check depthwise operation based on group param

* Generate ArmNN runtime inside class constructor

* Update to the latest ONNX operation set

* Update BUILD.md

Co-authored-by: Andrei-Alexandru <andrei-alexandru.avram@nxp.com>
2020-09-03 20:44:27 -07:00
Nat Kershaw (MSFT)
8a03b6e5c7
Render Operator documentation as compliant markdown (#3658)
2020-09-02 15:07:50 -07:00
RandySheriffH
14b51d6502
CiPipeline@ReducedOpsBuild (#4917)
* cancel night build on pyop

* setup ci pipeline for build of reduced ops

* add back c# test

* remove debugging print

* add testing model

* add more arg in pipeline script

* disable pipeline trigger temporarily

* fix yaml format

* fix yaml format

* fix pipeline error

* rid c# test

* add ops for test cases

* add Conv from domain com.microsoft.nchwc

* remove --reduce_ops

* fix typo

* remove --build_java

* add test case for excluded op

* update doc with --skip_test

* formatting code, renaming files and simplify yaml

* remove debug build from yaml

* remove surplus ops from included_ops.txt

* add MinSizeRel build to yaml

* rename test cases and models

* exclude ir test from minimum build

* restrict ir test to be only applied to reduced ops build
2020-08-31 21:21:18 -07:00
Ashwini Khade
0d3bbfdd0f
enable nuget packaging in local builds (#4884)
* enable building nuget packages

* add nuget creation from build.py

* add documentation

* fix flake8 errors

* fix nuget package version

* enable csharp tests

* update csharp tests

* copy nuget packages to nuget-artifacts

* add libmklml_gnu

* plus review updates

* fix references for release builds
2020-08-26 12:33:48 -07:00
Rayan-Krishnan
eb05db5a2a
Fix OptimizerConfig params groups (#4877)
* Copy samples to build folder and load models from there. Fix CI
* This PR also includes a fix to path validation for save_as_onnx API
* Add torchtext to CI for GPU training
* Remove new frontend tests from CI

Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
2020-08-22 22:04:17 -07:00
liqunfu
6260d073b3
Glue parallel training (#4550)
add mpi size, rank python API

add single node parallel training example
2020-08-21 21:24:27 -07:00
RandySheriffH
3fa73a5b6a
ReduceBinarySize (#4747)
* cancel night build on pyop

* add rewriter to rewrite cpu provider

* skip BuildKernelCreateInfo<void>

* refactor variable name and comment

* include ops from csv file

* process multiple eps

* add default function to cuda provider

* rename function and add license header

* fix import

* add doc

* fix typo

* deal with empty kernel entry in cuda

* rename the rewriter file

* add comment into provider file

* add comment and rename function

* log warnings

* refactor extracting logic

* add entry for script to run solo

* add better example

* avoid onnx importing

* fix flake8 alerts

* minor fixes to better comments and doc

* add entries for all domains

* add void entry into contrib providers

* format cuda_contrib_kernels.cc

* format cpu_contrib_kernels.cc

* add all providers

* add default entry to all providers

* include op_kernel header

* cancelling change in providers beyond cpu/cuda

* rename file and switch file format to domain;opset;op1,op2...

* update doc

* restore non-regular ending grammar in cuda_contrib_kernels.cc

* add ort_root as input argument of script

* enable test in ci

* update doc

* update doc

* revert change on linux gnu ci

* switch to set to host ops

* simplify trimming logic

* add domain map to track current model

* allow ort_root to take relative path
2020-08-21 19:50:13 -07:00
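
One of the changes in #4747 switches the operator list file to lines of the form `domain;opset;op1,op2...`. A minimal sketch of reading such lines follows. It assumes one entry per line, an integer opset, and `#` as a comment marker. The function name `parse_required_ops`, the comment syntax, and the sample entries are assumptions for illustration, not taken from the ORT script.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def parse_required_ops(lines: Iterable[str]) -> Dict[Tuple[str, int], Set[str]]:
    """Parse 'domain;opset;op1,op2,...' lines into {(domain, opset): {ops}}."""
    required: Dict[Tuple[str, int], Set[str]] = defaultdict(set)
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):  # assumed comment syntax
            continue
        # A malformed line (fewer than three fields) raises ValueError here.
        domain, opset, ops = (part.strip() for part in line.split(";", 2))
        required[(domain, int(opset))].update(
            op.strip() for op in ops.split(",") if op.strip()
        )
    return dict(required)


sample = [
    "# hypothetical entries",
    "ai.onnx;12;Add,Conv,Relu",
    "com.microsoft.nchwc;1;Conv",
]
print(parse_required_ops(sample))
```

Keying on the (domain, opset) pair lets repeated lines for the same pair merge into a single set of operators.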
Thiago Crepaldi
42408aa3ed
Add new PyTorch front-end (#4815)
* Add ORTTrainerOptions class for the new pytorch frontend (#4382)

Add ORTTrainerOptions class and some placeholders

* Add _ORTTrainerModelDesc to perform validation for model description (#4416)

* Add Loss Scaler classes to the new frontend (#4306)

* Add TrainStepInfo used on the new frontend API (#4256)

* Add Optimizer classes to the new frontend (#4280)

* Add LRScheduler implementation (#4357)

* Add basic ORTTrainer API (#4435)

This PR presents the public API for ORTTrainer for the short term
development.

It also validates and saves input parameters, which will be used in the
next stages, such as building ONNX model, post processing the model and
configuring the training session

* Add opset_version into ORTTrainerOptions and change type of ORTTrainer.loss_fn (#4592)

* Update ModelDescription and minor fix on ORTTrainer ctor (#4605)

* Update ModelDescription and minor fix on ORTTrainer/ORTTrainerOptions

This PR keeps the public API intact, but changes how model description is stored on the backend

Currently, users create a dict with two lists of tuples.
One list is called 'inputs', and each tuple has the format tuple(name, shape).
The second list is called 'outputs', and each tuple can be either tuple(name, shape) or tuple(name, shape, is_loss).

With this PR, when this dict is passed in to ORTTrainer, it is fully validated as usual.
However, tuples are internally replaced by namedtuples, and all output tuples will have the
tuple(name, shape, is_loss) format instead of is_loss being optionally present.

In addition to that normalization of the internal representation (which eases coding),
two internal methods were created to convert a namedtuple(name, shape) into namedtuple(name, shape, dtype)
or namedtuple(name, shape, is_loss, dtype), depending on whether the tuple is an input or an output.

This is necessary because ORTTrainer finds out the data type of each input/output during model export to ONNX.

Finally, a minor fix was made to ORTTrainer. It could initialize ORTTrainerOptions incorrectly when options=None

* Rename input name for test

* Add ONNX Model Export to New Frontend (#4612)

Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>

* Create training session + minor improvements (#4668)

Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>

* Save ONNX model in file (#4671)

Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>

* Add eval step (#4674)

Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>

* Add train_step (#4677)

Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>

* Add LR Scheduler (#4694)

Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>

* Add deterministic compute tests (#4716)


Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>

* Add legacy vs experimental ORTTrainer accuracy comparison (#4727)

Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>

* Add Mixed precision/LossScaler + several fixes (#4739)

In addition to the mixed precision/loss scaler code, this PR includes:

* Fix CUDA training
* Add optimization_step into TrainStepInfo class
* Refactor LRScheduler to use optimization_step instead of step
* Updated several default values at ORTTrainerOptions
* Add initial Gradient Accumulation support (untested)
* Fix ONNX model post processing
* Refactor unit tests

* Add ONNX BERT example + minor fixes (#4757)

* Fix training issue when passing ONNX file into ORTTrainer

Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>

* Add Dynamic Shape support (#4758)

* Update DeepSpeed Zero Stage option to a separate option group (#4772)

* Add support to fetches (#4777)

* Add Gradient Accumulation Steps support (#4793)

* Fix Dynamic Axes feature and add unit test (#4795)

* Add frozen weights test (#4807)

* Move new pytorch front-end to 'experimental' namespace (#4814)

* Fix build

Co-authored-by: Rayan-Krishnan <rayankrishnan@live.com>
Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-08-17 09:45:25 -07:00
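
The model description normalization described under #4605 above can be sketched roughly as follows. This is an illustration under stated assumptions, not the ORTTrainer implementation: the names `InputDesc`, `OutputDesc`, and `normalize_model_desc` are hypothetical, and defaulting a missing `is_loss` to `False` is an assumption.

```python
from collections import namedtuple
from typing import Dict, List

# Hypothetical internal representations of the user-supplied tuples.
InputDesc = namedtuple("InputDesc", ["name", "shape"])
OutputDesc = namedtuple("OutputDesc", ["name", "shape", "is_loss"])


def normalize_model_desc(desc: Dict[str, List[tuple]]) -> Dict[str, list]:
    """Validate a {'inputs': [...], 'outputs': [...]} dict and convert to namedtuples."""
    if set(desc) != {"inputs", "outputs"}:
        raise ValueError("model description needs exactly 'inputs' and 'outputs'")

    inputs = []
    for item in desc["inputs"]:
        if len(item) != 2:
            raise ValueError(f"input must be (name, shape), got {item!r}")
        inputs.append(InputDesc(*item))

    outputs = []
    for item in desc["outputs"]:
        if len(item) == 2:
            # is_loss omitted by the user: fill it in (assumed default False).
            outputs.append(OutputDesc(*item, is_loss=False))
        elif len(item) == 3:
            outputs.append(OutputDesc(*item))
        else:
            raise ValueError(f"output must be (name, shape[, is_loss]), got {item!r}")

    return {"inputs": inputs, "outputs": outputs}


desc = {
    "inputs": [("input_ids", ["batch", 128])],
    "outputs": [("loss", [], True), ("logits", ["batch", 2])],
}
print(normalize_model_desc(desc))
```

After normalization every output carries `is_loss`, so later code can read `out.is_loss` without checking the tuple length first.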
Changming Sun
5eec4f66ed
Refactor manylinux docker image and the related pipelines (#4751)
1. Publish the image to ACR, instead of building it every time for every PR
2. Allow USE_MKLML and USE_OPENMP to co-exist. Currently both are enabled in our Linux CI build, but only one of them actually takes effect.
3. Split nuphar and DNNL into separate pipelines.
4. Fix two warnings in onnxruntime/core/optimizer/matmul_scale_fusion.cc and onnxruntime/test/tvm/tvm_basic_test.cc.
5. Update the manylinux2010_x86_64 image to the latest.
2020-08-17 09:40:31 -07:00
gwang-msft
8507bc1f48
[Android NNAPI EP] Enable test for BatchNormalization, enable dev_mode for Android, fix some issues in concat (#4715)
* update batch_norm test, enable dev_mode for nnapi, ignore onnx protobuf warning for nnapi ep

* fix some issues in concat and mark input without shape as not supported for now

* address review comments

* addressed comments
2020-08-06 14:11:59 -07:00
Changming Sun
1e054739b8
Remove the requirement of CUDA's version.txt (#4706)
Sometimes there is a file named "version.txt" in the CUDA installation directory, and sometimes there isn't. I couldn't figure out why, but the latest CUDA 11 on our CI build machines doesn't have this file. Since the file is not needed for building onnxruntime, I removed the check.
2020-08-04 17:03:40 -07:00
RandySheriffH
1fcd3eb376
cancel night build on pyop (#4673)
2020-07-30 19:51:52 -07:00
gwang-msft
c2ec3b734b
[Android NNAPI EP] Remove dependency on external JD/DNNLibrary (#4576)
* remove dependency of external jd-dnnlibrary

* remove extra variables not used any more

* update /cgmanifest.json
2020-07-22 14:08:12 -07:00
Andrews548
f20afc4991
Update ACL/ArmNN EP (#4571)
* Add BN to ArmNN EP

* Add Concat to ArmNN EP

* ACL logging improvements

* ArmNN logging improvements

* Fallback to CPU for 9x9 convolution in ACL EP

* Fallback to CPU for 9x9 convolution in ArmNN EP

* Enable python support for ACL and ArmNN EPs when compiled with BSP toolchain

* Removed the matmul operator

* Fix conv infer shape function

* Fix provider_names list for armnn

Co-authored-by: Andrei-Alexandru <andrei-alexandru.avram@nxp.com>
2020-07-21 22:25:58 -07:00
Changming Sun
bc1d197ddf
Re-enable dnnl in CI build (#4544)
* Revert "Temporarily remove dnnl from Linux CI build to unblock the whole team (#4266)"

Previously it failed because it used too much memory.
Now we only run dnnl EP with opset12 models in unit tests, to reduce peak memory usage.
2020-07-19 23:20:03 -07:00
Changming Sun
8ada440961
Move model tests to onnxruntime_test_all (#4521)
1. Move model tests to onnxruntime_test_all
2. Publish TestResults of Windows CI build.
2020-07-15 16:46:18 -07:00
liqunfu
f721f5f1cd
Liqun/multiple choice (#4480)
* multiple choice runner

* add docker cleanup task to frontend pipeline
2020-07-14 17:57:58 -07:00
liqunfu
0bff55512e
updated expected values for frontend test to pass frontend e2e pipeline. raise tolerance to reduce future risk of failure (#4497)
* updated expected values for frontend test, raise tol
2020-07-13 19:25:54 -07:00
gwang-msft
5f8f443ac4
Android CI build, test copy, emulator boot improvement (#4481)
* Enable onnxruntime_test_all for NNAPI EP

* switch to use ninja for Android CI

* make android emulator boot faster in android ci

* simplify adb push

* more style change

* more tweaking on android ci

* build.py style update
2020-07-13 14:18:34 -07:00
Hariharan Seshadri
26ebcfab88
Fix Nuget GPU pipeline (#4462)
2020-07-10 14:02:28 -07:00
Hariharan Seshadri
6d6b6b54a5
Support binding a graph output to a specific device via the Python binding (#4439)
2020-07-07 21:09:37 -07:00
EronsJ
632b2896f3
Onnxruntime fuzzing (#4341)
* Add protobuf mutator library as a git submodule

* Added files and instructions to build the protobuf mutator library in CMake

* Added fuzzing flag to build system and added fuzzing dependency library. To run fuzzing test use the flags --fuzz_testing --build_shared_lib --use_full_protobuf --cmake_generator 'Visual Studio 16 2019'

* Added src files and build instructions for the main fuzzing engine

* Removed Random number generation test from inside the engine

* Added license header to files

* Removed all pep8 violations introduced by this change and other E501 violations
2020-07-06 16:34:34 -07:00
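
The fuzzing flags listed in #4341 can be assembled into a build invocation along these lines. This is only a sketch: the helper name `fuzz_build_command`, the script path `tools/ci_build/build.py`, the `--build_dir` option, and the directory `build/fuzz` are assumptions for illustration; only the four fuzzing-related flags come from the commit message. The command is printed, not executed.

```python
import shlex
import sys
from typing import List


def fuzz_build_command(build_script: str, build_dir: str) -> List[str]:
    """Return an argument list for a fuzz-testing build (not executed here)."""
    return [
        sys.executable, build_script,
        "--build_dir", build_dir,  # assumed option, not named in the commit
        # Flags quoted in the commit message:
        "--fuzz_testing",
        "--build_shared_lib",
        "--use_full_protobuf",
        "--cmake_generator", "Visual Studio 16 2019",
    ]


cmd = fuzz_build_command("tools/ci_build/build.py", "build/fuzz")
print(shlex.join(cmd))  # pass cmd to subprocess.run(cmd, check=True) to launch it
```

Keeping the command as a list avoids shell-quoting problems with the generator name, which contains spaces.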
Tiago Koji Castro Shibata
7fea332f93
Support builds without RTTI (#4333)
* Support builds without RTTI

* Disable RTTI in all builds
2020-07-01 13:05:35 -07:00
gwang-msft
9e0f5fc7af
The initial PR for NNAPI EP (#4287)
* Move nnapi dnnlib to subfolder

* dnnlib compile settings

* add nnapi buildin build.py

* add onnxruntime_USE_NNAPI_BUILTIN

* compile using onnxruntime_USE_NNAPI_BUILTIN

* remove dnnlib from built in code

* Group onnxruntime_USE_NNAPI_BUILTIN sources

* add file stubs

* java 32bit compile error

* built in nnapi support 5-26

* init working version

* initializer support

* fix crash on free execution

* add dynamic input support

* bug fixes for dynamic input shape, add mul support, working on conv and batchnorm

* Add batchnormalization, add overflow check for int64 attributes

* add global average/max pool and reshape

* minor changes

* minor changes

* add skip relu and options to use different type of memory

* small bug fix in operator relu

* bug fix for nnapi

* add transpose support, minor bug fix

* Add transpose support

* minor bug fixes, depthwise conv weight fix

* fixed the bug where the onnx model inputs are in a different order than the nnapi model inputs

* add helper to add scalar operand

* add separated opbuilder to handle single operator

* add cast operator

* fixed reshape, moved some logs to verbose

* Add softmax and identity support, change shaper calling signature, and add support for int32 output

* changed the way to execute the NNAPI

* move NNMemory and InputOutputInfo into Model class

* add limited support for input dynamic shape

* add gemm support, fixed crash when allocating big array on stack

* add abs/exp/floor/log/sigmoid/neg/sin/sqrt/tanh support

* better dynamic input shape support;

* add more check for IsOpSupportedImpl, refactored some code

* some code style fix, switch to safeint

* Move opbuilders to a map with single instance, minor bug fixes

* add GetUniqueName for new temp tensors

* change from throw std to ort_throw

* build settings change and 3rd party notice update

* add readme for nnapi_lib, move to ort log, add comments to public functions, clean the code

* add android log sink and more logging changes, add new string for NnApiErrorDescription

* add nnapi execution options/fp16 relax

* fix a dnnlibrary build break

* addressed review comments

* address review comments, changed adding output for subgraph in NnapiExecutionProvider::GetCapability, minor issue fixes

* formatting in build.py

* more formatting fix in build.py, return fail status instead of throw in compute_func

* moved android_log_sink to platform folder, minor coding style changes

* addressed review comments
2020-06-26 00:02:39 -07:00
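
One item in #4287 adds an overflow check for int64 attributes. ONNX stores integer attributes as 64-bit values, so narrowing them needs a range check. A minimal sketch of such a check, assuming the target type is a signed 32-bit integer (the helper name `checked_int32` and the choice of `OverflowError` are hypothetical, not the EP's actual code):

```python
INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1


def checked_int32(value: int, attr_name: str = "attribute") -> int:
    """Return value unchanged if it fits in a signed 32-bit integer, else raise."""
    if not INT32_MIN <= value <= INT32_MAX:
        raise OverflowError(f"{attr_name}={value} does not fit in int32")
    return value


print(checked_int32(3, "group"))  # in range, returned as-is
try:
    checked_int32(2 ** 40, "axis")
except OverflowError as err:
    print("rejected:", err)
```

Failing loudly at conversion time is the design choice here: a silently truncated attribute would produce a wrong model rather than an error.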
Aaron Bockover
64264c3846
Allow --cmake_generator to work on macOS (#4278)
2020-06-24 16:30:33 -07:00
Changming Sun
deea945f80
Remove openmp and scipy from build pipelines (#4305)
1. Remove openmp because the default thread pool is already good enough.
2. Remove scipy from build pipelines because it no longer supports Python 3.5.
2020-06-23 20:18:16 -07:00
Yufeng Li
867ba846f7
Implement MinMax with SIMD (#4285)
* Implement MinMax with SIMD
2020-06-23 20:07:53 -07:00
Pranav Sharma
2204d39a06
Add build option to disable traditional ML ops from the binary. (#4272)
* Add build option to disable traditional ML ops from the binary.

* Fix python tests by splitting tests for ML ops to a separate file. Exclude ML tests from onnx_test_runner and C# tests. Exclude ML op sources.

* Update Edge pkg pipelines with new MLops env variable and fix C# packaging pipeline tests to skip ML ops.
2020-06-20 06:36:06 -07:00
goloskokovic
478b923e19
Expose ACL/ARMNN providers to Python (#4260)
* expose ACL/ARMNN providers to python

* add -acl / -armnn to package name when use_acl / use_armnn is specified

* build python wheel for ARMNN EP

* link ACL/ARMNN EPs into onnxruntime_pybind11_state

* wrong argument order in build_python_wheel for wheel_name_suffix
2020-06-18 20:24:14 +05:30
Weixing Zhang
b4b1c6440a
Enable ORT with CUDA 11 toolkit (#4168)
* ORT on CUDA 11

1. Separate HOROVOD and MPI
2. Separate NCCL from HOROVOD in CMakeLists.txt
3. Remove dependency on external cub
4. cudnnSetRNNDescriptor is changed in cuDNN 8.0

* polish the code about MPI/NCCL in CMakeLists.txt and build.py

* check CUDA version

* ${MPI_INCLUDE_DIRS} should be PUBLIC

* sm30, sm50 are deprecated in CUDA 11 Toolkit

* update change based on code review feedback.

* add sm_52

* improve MPI/NCCL build path

Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
2020-06-15 08:47:03 -07:00
Yulong Wang
73bc6be5d1
build: split nodejs binding build and test to avoid timeout issue (#4188)
* split nodejs binding build and test

* enable nodejs tests
2020-06-10 19:16:32 -07:00
George Wu
e8ed14bcb3
disable MEMLEAK CHECKER for openvino
2020-06-10 11:12:17 -07:00
Changming Sun
a7366d82af
Disable nuphar large model test (#4173)
Disable the nuphar large model test because it takes too long (40+ minutes), while the default CPU provider takes about 5 minutes. After this change we still keep many other nuphar model tests, which I think should be enough.
2020-06-09 17:45:17 -07:00
Wenbing Li
ee35320974
Fixes for Python scripts in ONNXRuntime (#4135)
* Fixes for Python scripts in ONNXRuntime

* update according to the comments
2020-06-08 10:27:32 -07:00
Cecilia Liu
b8db8076cb
Fix MKLML Tests Run (#4144)
Add a path to LD_LIBRARY_PATH to fix library not found error when running mklml test cases.
2020-06-06 20:28:53 -07:00
liqunfu
ffed43e9b8
handle loss and name marching wrappers (#4066)
* handle loss and name marching wrappers

Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-06-05 23:34:26 -07:00
Andrews548
62b44527e5
Add ArmNN Execution Provider (#3714)
* Add ArmNN Execution Provider

Add a new execution provider targeting Arm architecture based on ArmNN.
Validated on NXP i.MX8QM CPU with ResNet50, MobileNetv2 and VGG models.

reviewed-by: mike.caraman@nxp.com

* Minor fixes

- renamed onnxruntime_ARMNN_RELU_USECPU to onnxruntime_ARMNN_RELU_USE_CPU
- fixed acl typo

* remove extra includes. added exception for ArmNN in test

* fix indentation

* Separated the activation implementation from the cpu and fixed the blockage from the endif

Co-authored-by: Andrei-Alexandru <andrei-alexandru.avram@nxp.com>
2020-06-03 22:57:51 +05:30
Ashwini Khade
70d91a8550
re-enable graph optimizations during build phase (#4044)
* re-enable graph optimizations during build phase

* fix

* re-enable optimizers for all provider tests
2020-06-01 10:32:42 -07:00
Scott McKay
1d441f89ac
Re-enable PEP8 check in Win CI build (#4075)
* Add flake8 to Win CI build so it's re-enabled. It was in the static analysis build that is currently disabled so checks are not running.
Fix build.py to be compliant again.
Add prefix to flake8 output so it's (hopefully) easier to identify the errors in build output.

* Add to all builds in Windows CPU CI so they all fail quickly if there's an issue.
2020-05-30 09:10:05 +10:00
edgchen1
38d76cc904
Clean up training E2E test (#4078)
Update training E2E build to not go through CTest and call test scripts directly.
2020-05-29 09:20:47 -07:00
liqunfu
6665d5e2bc
Liqun/a transformer example (#3845)
Add transformer glue test example to show how to use ORTTrainer to fine-tune a transformer model

Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-05-27 15:21:35 -07:00
Yulong Wang
b3ec8035ee
[Node.js binding] add build flag for node.js binding (#3948)
2020-05-27 13:30:22 -07:00
Paul Fultz II
7759136610
Add amd migraphx execution provider to onnx runtime (#2929)
* Add amd migraphx execution provider to onnx runtime

* rename MiGraphX to MIGraphX

* remove unnecessary changes in migraphx_execution_provider.cc

* add migraphx EP to tests

* add input requests of the batchnorm operator

* add support for the onnx operator PRelu

* update migraphx dockerfile and removed one unused line

* sync submodules with master branch

* fixed a small bug

* fix various bugs to run msft real models correctly

* some code cleanup

* fix python file format

* fixed a code style issue

* add default provider for migraphx execution provider

Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>
2020-05-27 04:24:59 +08:00