Commit graph

622 commits

Author SHA1 Message Date
Changming Sun
17f1178c2e
Downgrade GCC (#5269)
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
2020-09-24 21:14:54 -07:00
Dmitri Smirnov
89742411ec
Insert telemetry template into GPU build, add telemry build switches. (#5278) 2020-09-24 17:13:09 -07:00
edgchen1
6d5b93b805
Synchronize training dependency versions between Docker image and Python wheel. (#5261)
Synchronize training dependency versions between Docker image and wheel, update docs, refactor build scripts.
2020-09-23 19:03:42 -07:00
Guoyu Wang
d957dbebea
Fix possible ios build break after update to Xcode 12 (#5246)
* Fix possible ios build break after update to Xcode 12

* Address comments
2020-09-22 07:42:54 -07:00
suffian khan
417929b049 jobs timeout .. 2020-09-21 21:51:59 -07:00
Xueyun Zhu
55e4b5d302
add pipeline distributed training test (#5222)
* add pipeline distributed training test

* fix max line length error in windows build

* function header indent

* fix

* fix flake8 error
2020-09-21 14:35:01 -07:00
Tiago Koji Castro Shibata
cd663d58f5
Fix WinML warnings (#5228) 2020-09-19 12:41:42 -07:00
Guoyu Wang
78a29aebbc
[ORT Mobile] ORT Minimal E2E CI (#5200)
* Modify the ort minimal CI to ort minimal e2e ci
2020-09-19 18:43:22 +10:00
KeDengMS
ce3b67e0cd
[Python] Move symbolic_shape_infer from nuphar to tools (#5162)
* [Python] Move symbolic shape inference from nuphar to tools

* Fix PEP8 ERROR
2020-09-18 09:31:06 -07:00
Scott McKay
c46a480306
Update conversion script and process to simplify creating ORT format models and a minimal build (#5217)
* Update conversion script and process to simplify creating ORT format models and a minimal build.
2020-09-18 18:49:54 +10:00
liqunfu
f37e1292a1
--shm-size=1024m to fix nccl shared memory issue (#5214)
* --shm-size=256m to fix nccl shared memory issue

Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-17 17:21:47 -07:00
Guoyu Wang
8156e0dd10
[ORT Mobile] Some updates to iOS/Android build settings (#5184)
* Update android CI and build settings

* add build_java to arm64 also

* Add ios signing param

* fix a small build warning

* address pr comments
2020-09-17 15:53:14 -07:00
Tiago Koji Castro Shibata
f3f119a945
Use onecore umbrella lib in onecore builds (#5182)
* delayload hack

* Skip tests

* Onecore uses onecore umbrella

* Uncomment tests

* cleanup

* Disable dev mode for WinML
2020-09-16 10:46:27 -07:00
Tiago Koji Castro Shibata
1a2e289d2d
Fix nuget build (#5163)
* Fix nuget content

* Revert "Fix nuget content"

This reverts commit e2cdcec4e39964c50eac2fb306c7a4bb84352443.

* Nuget packaging

* skip tests

* msbuild path

* Force msbuild version

* Workaround https://github.com/NuGet/Home/issues/7621

* cleanup
2020-09-16 10:37:09 -07:00
Changming Sun
a0a435abc6
Add sympy==1.1.1 to Linux docker image (#5177) 2020-09-15 16:08:49 -07:00
Wenbing Li
de6e3fb61d
Reduce IOS shared library size by symbol file. (#5171) 2020-09-14 23:59:41 -07:00
Scott McKay
089789c135
Revert change to disable support for loading ORT format models in the packaging pipelines. (#5168) 2020-09-15 15:11:06 +10:00
RandySheriffH
1dde215d96
promote cuda version on packacking pipelines (#5154)
* promote cuda version on packacking pipelines

* fix cudnn version in py packaing template

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2020-09-14 21:09:09 -07:00
Changming Sun
8946d212bf
Remove the dependency on CUDA SDk's version.txt (#5155) 2020-09-14 14:25:28 -07:00
Wenbing Li
2a456d16c0
Enable onnxruntime iOS shared library build. (#5148) 2020-09-14 10:32:39 -07:00
RandySheriffH
9392aa2f64
Promote Cuda version to 10.2 for windows pipelines (#5138) 2020-09-13 20:32:06 -07:00
Scott McKay
323a1ba8a4
Add option to exclude support for loading ORT format models in full build. (#5129)
* Add ability to exclude support for loading ORT format models.
Disable support for ORT format models in packages
2020-09-12 12:21:30 +10:00
RandySheriffH
120e3cda74
fix path (#5131)
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2020-09-11 12:18:07 -07:00
Changming Sun
c5efb0085d
Update Linux GPU build pipelines to CUDA 10.2 (#5120)
* Update Linux GPU build pipelines to CUDA 10.2
2020-09-10 17:40:51 -07:00
Changming Sun
a5530358c9
Fix a path problem in Dockerfile.manylinux2014_cuda10_2 (#5106) 2020-09-10 10:30:13 -07:00
Tiago Koji Castro Shibata
62848c4de5
Add store builds to nuget packaging (#5040)
* Nuget store packaging

* Move DNNL workaround to EP

* Fix warning as error

* Disable store tests

* Skip store tests

* msbuild target

* Cross compile protoc in Store

* Disable DML in store

* Move store builds to CPU queue

* Copy uap10 to final nuget

* Fix pip8 error

* Remove extra dml copies

* Fix argparse

* pep8

* Forward IsStoreBuild

* Apply is_store_build to duplicate generate_nuspec

* runtimes

* Refactor uap10

* Store .NET

* uap

* PR feedback
2020-09-09 21:38:14 -07:00
RandySheriffH
5e10cde006
PipelinesForCuda11Cudnn8 (#4938)
* cancel night build on pyop

* setup win cuda11 pipeline

* add debug build

* test base gpu settings

* setup pipelines to test cuda 10.2 and 11

* rename linux docker images

* rename docker image tag and add clean up job

* fix typo in cuda 11 config

* set cuda11 env

* update linux cuda 11 pipeline

* reset docker image name

* disable uninitialized warning from linux build

* change the way to silence uninitialized warning

* add flags to linux gpu pipeline

* switch docker image for linux cuda 10.2

* switch linuc cuda 10.2 image

* test cuda11 with devtool8

* try latest built images

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2020-09-09 16:13:58 -07:00
Changming Sun
924ecb0623
Use manylinux2014 for Linux CPU build (#5091) 2020-09-09 10:09:52 -07:00
Scott McKay
dbf4e7019d
Add ability to generate configuration file with required operators. (#5089)
* Add ability to generate configuration file with required operators.
2020-09-09 21:39:17 +10:00
Scott McKay
80ada0291f
Improve the minimal build size on android and linux (#5086)
Fix bug where linux build fails when python is enabled and rtti is disabled
Update doco for new build settings
2020-09-09 21:38:34 +10:00
gwang-msft
a1a81470e3
Add minimal build binary size verification (arm64) to Android CI (#5087)
* Add minimal build binary size verification (arm64) to Android CI

* Add comments in the CI ymal
2020-09-09 19:06:20 +10:00
gwang-msft
a40d34386a
Add Linux CPU CI for ORT minimal build (#5074)
* initial test version

* update yml

* minor updates

* minor updates

* Test minimal build

* update with include ops for minimal build ut only

* error case to see build failure

* test no_exceptio

* Remove error cases

* address pr comments

Co-authored-by: gwang0000 <62914304+gwang0000@users.noreply.github.com>
2020-09-08 17:09:33 -07:00
Cameron Maske
4553b2eecd
Expose DirectML provider to python (conflicts resolved from #3359) (#4630) 2020-09-08 14:34:09 -07:00
liqunfu
de58720a97
Liqun/transformer test and e2e golden numbers (#5064)
* match new/old api numbers

* new golden numbers for Roberta and MC

Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-04 18:11:37 -07:00
Changming Sun
370d194db7
Add a docker file for CI build CUDA 10.2 (#5065) 2020-09-04 16:28:45 -07:00
Scott McKay
b5c2932ae8
Last major set of ORT format model changes (#5056)
* Add minimal build option to build.py
Group some of the build settings so binary size reduction options are all together
Make some cmake variable naming more consistent
Replace usage of std::hash with murmurhash3 for kernel. std::hash is implementation dependent so can't be used.
Add initial doco and ONNX to ORT model conversion script
Misc cleanups of minimal build breaks.
2020-09-05 07:59:01 +10:00
Thiago Crepaldi
0fc9c504fe
Re-enable CI tests for the new PyTorch frontend (#5017)
This PR includes:

* Re-enable CI tests for new PyTorch frontend
* Re-enable fp16 and adjust tolerances for number matching
2020-09-04 09:36:24 -07:00
Andrews548
bd215b79a2
ACL v20.02 (#4981)
* Add ACL version 20.02

* fix loging typo

* check depthwise operation based on group param

* Generate ArmNN runtime inside class constructor

* Update to the latest ONNX operation set

* Update BUILD.md

Co-authored-by: Andrei-Alexandru <andrei-alexandru.avram@nxp.com>
2020-09-03 20:44:27 -07:00
Nat Kershaw (MSFT)
8a03b6e5c7
Render Operator documentation as compliant markdown (#3658) 2020-09-02 15:07:50 -07:00
Changming Sun
d5d5e37e76
Build system enhancements (#5012)
1. Add a docker file for CUDA11
2. Support setting CUDA_ARCHITECTURES from command line.
2020-09-02 10:13:26 -07:00
RandySheriffH
14b51d6502
CiPipeline@ReducedOpsBuild (#4917)
* cancel night build on pyop

* setup ci pipeline for build of reduced ops

* add back c# test

* remove debugging print

* add testing model

* add more arg in pipeline script

* disable pipeline trigger temporarily

* fix yaml format

* fix yaml format

* fix pipeline error

* rid c# test

* add ops for test cases

* add Conv from domain com.microsoft.nchwc

* remove --reduce_ops

* fix typo

* remove --build_java

* add test case for excluded op

* update doc with --skip_test

* formatting code, renaming files and simplify yaml

* remove debug build from yaml

* remove surplus ops from included_ops.txt

* add MinSizeRel build to yaml

* rename test cases and models

* exclude ir test from minimum build

* restrict ir test to be only applied to reduced ops build
2020-08-31 21:21:18 -07:00
Ashwini Khade
8679a7244e
Enable rejecting models based on onnx opset (#4912)
* enable rejecting models based on onnx opset

* enable unreleased opsets in linux and mac CI

* test fixes and more updates

* enable unreleased opsets in CI builds

* enable released opsets in linux cis

* try fix windows ci yml

* yml fixes

* update yml

* yml updates post master merge

* review comments

* bug fix
2020-08-31 13:35:36 -07:00
Hariharan Seshadri
b945225de3
Include DirectML pdb in x86 bin folder (#4953) 2020-08-28 11:29:26 -07:00
Changming Sun
c37fa7c278
Delete Dockerfile.centos6_gpu (#4851) 2020-08-28 09:56:52 -07:00
Ashwini Khade
0d3bbfdd0f
enable nuget packaging in local builds (#4884)
* enable building nuget packages

* add nuget creation from build.py

* add documentation

* fix flake8 errors

* fix nuget package version

* enable csharp tests

* update csharp tests

* copy nuget packges to nuget-artifacts

* add libmklml_gnu

* plus review updates

* fix references for release builds
2020-08-26 12:33:48 -07:00
edgchen1
71d8846635
Fix telemetry-steps.yml (#4903)
Fix bug in telemetry-steps.yml that causes telemetry setup to be disabled even if TELEMETRYGUID is set.
2020-08-24 22:14:40 -07:00
Changming Sun
f34ed3a576
Hot fix for the python packaging pipeline Linux ARM build (#4902) 2020-08-24 20:14:33 -07:00
Rayan-Krishnan
eb05db5a2a
Fix OptimizerConfig params groups (#4877)
* Copy samples to build folder and load models from there. Fix CI
* This PR also includes a fix to path validation for save_as_onnx API
* Add torchtext to CI for GPU training
* Remove new frontend tests from CI

Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
2020-08-22 22:04:17 -07:00
liqunfu
6260d073b3
Glue parallel training (#4550)
add mpi size, rank python API

add single node parallel training example
2020-08-21 21:24:27 -07:00
RandySheriffH
3fa73a5b6a
ReduceBinarySize (#4747)
* cancel night build on pyop

* add rewriter to rewrite cpu provider

* skip BuildKernelCreateInfo<void>

* refactor variable name and comment

* include ops from csv file

* process multiple eps

* add default function to cuda provider

* rename function and add license header

* fix import

* add doc

* fix typo

* deal with empty kernel entry in cuda

* rename the rewriter file

* add comment into provider file

* add comment and rename function

* log warnings

* refactor extracting logic

* add entry for script to run solo

* add better example

* avoid onnx importing

* fix flake8 alerts

* minor fixes to better comments and doc

* add entries for all domains

* add void entry into contrib providers

* format cuda_contrib_kernels.cc

* format cpu_contrib_kernels.cc

* add all providers

* add default entry to all providers

* include op_kernel header

* cancelling change in providers beyond cpu/cuda

* rename file and switch file format to domain;opset;op1,op2...

* update doc

* restore non-regular ending grammar in cuda_contrib_kernels.cc

* add ort_root as input argument of script

* enable test in ci

* update doc

* update doc

* revert change on linux gnu ci

* switch to set to host ops

* simplify trimming logic

* add domain map to track current model

* allow ort_root to take relative path
2020-08-21 19:50:13 -07:00