Commit graph

363 commits

Author SHA1 Message Date
edgchen1
de543c0308
Add SafeInt include to WinML targets (#3558)
Fixing Windows builds on the ort_training branch in preparation for the merge to master.
SafeInt (included via onnxruntime/core/common/safeint.h) was recently made a dependency of onnxruntime/core/framework/bfc_arena.h. That requires consumers of bfc_arena to compile with the SafeInt include directory.
2020-04-17 09:54:01 -07:00
edgchen1
0ec90f7019
Put safeint_interface include directory into onnxruntime_common interface include directories to simplify usage by other targets. (#3546) 2020-04-16 10:34:32 -07:00
edgchen1
2f16172e69
Address PR comments and clean up. (#3536)
Address PR comments and clean up.
- https://github.com/microsoft/onnxruntime/pull/3174#discussion_r408549886
- https://github.com/microsoft/onnxruntime/pull/3174#discussion_r408551151
2020-04-15 15:51:52 -07:00
pengwa
2c7c45076b
MaxBatchSize E2E Test (#3454)
* max batch size e2e test

* update test data snapshot
2020-04-15 09:50:44 +08:00
M. Zeeshan Siddiqui
5d99f179b9
Merge pull request #3486 from microsoft/sedymche/merge_master_ort_training
Merge from master into ort_training
2020-04-13 10:55:36 -07:00
Sergii Dymchenko
8ea0e596ec Fix onnxruntime_unittests.cmake after merge. 2020-04-09 13:14:15 -07:00
Sergii Dymchenko
6ba7c99e50 Merge branch 'master' into ort_training 2020-04-09 12:42:04 -07:00
ytaous
f73008483a
safeint for region bytes in bfc arena and code clean up (#3447)
* PR comments

* remove build issue workaround

* SafeInt for region bytes

* fix build

* fix build

Co-authored-by: Ethan Tao <ettao@microsoft.com>
2020-04-08 13:54:42 -07:00
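The commit above applies SafeInt to the region byte counts in the BFC arena. As a rough illustration of what overflow-checked size arithmetic buys, here is a minimal Python sketch. It is not the SafeInt API or the arena code: `SIZE_MAX`, `OverflowDetected`, `checked_mul`, and `checked_add` are hypothetical names, and a 64-bit `size_t` is assumed.

```python
SIZE_MAX = 2**64 - 1  # assumption: the emulated size_t is 64 bits wide


class OverflowDetected(ArithmeticError):
    """Raised when a result would not fit in the emulated size_t."""


def _check_operands(a: int, b: int) -> None:
    # Sizes are unsigned, so negative inputs are rejected up front.
    if a < 0 or b < 0:
        raise ValueError("sizes must be non-negative")


def checked_mul(a: int, b: int) -> int:
    """Multiply two sizes, raising instead of silently wrapping around."""
    _check_operands(a, b)
    result = a * b  # Python ints are unbounded, so the product is exact
    if result > SIZE_MAX:
        raise OverflowDetected(f"{a} * {b} exceeds SIZE_MAX")
    return result


def checked_add(a: int, b: int) -> int:
    """Add two sizes, raising instead of silently wrapping around."""
    _check_operands(a, b)
    result = a + b
    if result > SIZE_MAX:
        raise OverflowDetected(f"{a} + {b} exceeds SIZE_MAX")
    return result


# Hypothetical usage: compute a region size from a chunk count and chunk size.
region_bytes = checked_mul(4, 1024)
```

With plain fixed-width arithmetic an oversized product would wrap to a small number and the allocator would reserve too little memory. The checked version turns that case into an explicit error.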
Yufeng Li
4d71958ccf
Use IMMA for int8 matmul to leverage Turing Tensor Core (#3413)
Use IMMA for int8 matmul to leverage Turing Tensor Core
Format files under onnxruntime/core/providers/cuda
2020-04-07 15:22:04 -07:00
Thiago Crepaldi
15e32b44fd
Merge pull request #3383
Merge from master into ort_training
2020-04-06 19:05:01 -07:00
Ye Wang
4ebad8805b
change (#3431) 2020-04-06 11:30:21 -07:00
Changming Sun
0dcc6035b1
Disable strong inline (#3399)
To bypass an MSVC bug. Without this change, people can't use VS2017 to build onnxruntime in Release or RelWithDebInfo mode.
2020-04-06 11:19:09 -07:00
Changming Sun
33006f48c0
Update onnx submodule to 1.7.0 release candidate (#3405)
Update onnx submodule to 1.7.0 release candidate. This isn't a release tag, but it will be released soon, in 1-2 weeks.
2020-04-04 16:23:42 -07:00
Pranav Sharma
14f4c3e25f
Fix issue in construction of DummyArena. (#3416) 2020-04-03 08:28:05 -07:00
Thiago Crepaldi
d89e5d91a6 Disable GradientCheckerTest tests for GPU/Debug build (#3407) 2020-04-03 01:01:58 +00:00
Thiago Crepaldi
675035b1a8
Disable GradientCheckerTest tests for GPU/Debug build (#3407) 2020-04-02 18:00:54 -07:00
Tiago Koji Castro Shibata
1671072b6b
[WIP] Port image tests from WAI (#3365)
* Copy image tests from ADO

* wip

* Port tests to googletest

* Add FNS-Candy license

* Add missing collaterals

* Remove brand images

* Fix typos

* Use PrepareModelSessionBinding in MnistImageTest

* Fix typos
2020-04-01 15:38:44 -07:00
ytaous
2ce90cff4c
PR comments (#3374)
* PR comments

* PR comments

* PR comments

* PR comments

* PR comments

* PR comments

* PR comments

Co-authored-by: Ethan Tao <ettao@microsoft.com>
2020-04-01 10:36:16 -07:00
Changming Sun
accffded5d
Build options for enabling AVX/AVX2/AVX512 (#3373)
1. Add build options for enabling AVX/AVX2/AVX512
2. Update eigen to a newer version, because the current one doesn't work with VC and AVX512.
2020-04-01 10:07:22 -07:00
Dmitri Smirnov
a4fe60c4d3
OpSet 12 ops (#3341)
Advance ONNX commit to pickup the latest ArgMax, ArgMin,
  ReduceMax/ReduceMin, MaxPool
  Declare new versions for CPU/CUDA.
  Implement infrastructure support for int8/uint8.
  Adjust GatherOp test for a new error.
  Adjust Scan9.BadShape test.
  Add exclusions for index out of bounds checks.
  Rework result verification for SVDTransformer.
2020-03-31 15:31:06 -07:00
Thiago Crepaldi
759818f2c1 Merge remote-tracking branch 'origin/master' into thiagofc/ort_training_merge_from_master 2020-03-31 10:53:22 -07:00
stevenlix
2332a93db0
Update onnx-tensorrt parser (#3369)
* sync onnx-tensorrt parser and update TensorRT doc

* remove --msvc_toolset 14.16 in tensorrt ci pipeline
2020-03-30 20:31:59 -07:00
Jan Scholz
ce9acf0c21
iOS crosscompilation under linux (#3298)
* added support for ios crosscompilation under linux

* reverted cmake generator change

* if --ios is added protoc can be compiled for host system

* accidentally reverted change to compile protoc for host system for ios if protoc exe is not set

* wdata is now used

* accidentally pasted CMAKE_OSX_ARCHITECTURES into CmakeLists.txt, also made bad merge on build.py previously

* removed print

* fixed typo, deleted commented statements for earlier debugging

* reverted accidental delete

* added asmmacro.h for aarch64 asm
now MlasSgemmKernel**** gets underscore added if needed
no need anymore to differentiate between iOS arm64 and normal arm64 build
onnxruntime.cmake: added check if iOSCross is set to properly set RPATH

* removed 2 spaces

* fix: logical error fixed, now protoc gets compiled if not supplied with --path_to_protoc_exe

* removed unnecessarily added spaces

* removed some more spaces
2020-03-30 19:39:17 -07:00
edgchen1
fb2f97a002
Address master merge PR comments (#3348)
Address some comments from https://github.com/microsoft/onnxruntime/pull/3174.

- https://github.com/microsoft/onnxruntime/pull/3174#discussion_r396855459
- https://github.com/microsoft/onnxruntime/pull/3174#discussion_r396855630
- https://github.com/microsoft/onnxruntime/pull/3174#discussion_r396857140
- https://github.com/microsoft/onnxruntime/pull/3174#discussion_r398094858
- https://github.com/microsoft/onnxruntime/pull/3174#issuecomment-599024924
2020-03-30 18:52:48 -07:00
Changming Sun
06fc9506fd
Thread pool changes (#3153)
1. Copy tensorflow's thread pool class to ORT, so that we can get a better implementation of thread pool based parallelfor
2. Copy Eigen's thread pool class to ORT
3. Support thread affinity
4. Remove RNN kernel’s private thread pool
5. Modify pool kernels to use the thread pool when openmp is disabled.
2020-03-30 12:18:40 -07:00
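The thread pool based parallel-for mentioned in item 1 above can be sketched in a few lines of Python. This only illustrates the general pattern (split an index range into blocks, run each block on a pool worker, wait for all of them). It is not the TensorFlow, Eigen, or ORT implementation, and `parallel_for`, `num_blocks`, and `square_block` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_for(pool, total, fn, num_blocks=4):
    """Call fn(begin, end) on disjoint blocks that together cover range(total)."""
    if total <= 0:
        return
    num_blocks = max(1, min(num_blocks, total))
    block = -(-total // num_blocks)  # ceiling division: items per block
    futures = [
        pool.submit(fn, begin, min(begin + block, total))
        for begin in range(0, total, block)
    ]
    for future in futures:
        future.result()  # wait for completion and re-raise worker exceptions


# Hypothetical usage: square every element, one block per task.
data = list(range(10))
out = [0] * len(data)


def square_block(begin, end):
    # Each block writes to its own slice of `out`, so no locking is needed.
    for i in range(begin, end):
        out[i] = data[i] ** 2


with ThreadPoolExecutor(max_workers=4) as pool:
    parallel_for(pool, len(data), square_block)
```

Handing each task a half-open `[begin, end)` block rather than a single index keeps scheduling overhead low when the per-element work is small.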
George Wu
355f39ddee
fix cuda build for cmake >= 3.17.0 (#3362) 2020-03-30 00:38:57 -07:00
Tiago Koji Castro Shibata
c3cea486d0
Port ConcurrencyTests from TAEF (#3086)
* Add ConcurrencyTests

* Make ConcurrencyTests compatible with TAEF

* Use test PCH in concurrency tests

* Fix include header

* Ignore unused code warnings on WINML_SKIP_TEST

* Remove BOM

* Remove conflicting namespace in older SDK

* Refactor duplicate code

* Fix unused DELAYLOAD

* Fix unused DELAYLOAD

* Remove link to internal bug

* Address code style fixes

* Add new concurrency tests
2020-03-27 17:39:22 -07:00
Sheil Kumar
b72fe13941
Update WinML Projection to accept sequence of tensors (#3287)
* Enable sequence of tensor

* add tests

* small updates

* There should only be 2 elements returned

* CR feedback, and another 6->2 check update in the test.

* missing semicolon...

* Add explicit to constructor taking pointer parameter

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2020-03-23 15:55:20 -07:00
Tracy Sharpe
57468c651c
QLinearMatMul speed up (#3283)
The equivalent of PR#3196 but done for QLinearMatMul. Use MLAS to do a u8u8=s32 GEMM and then requantize this intermediate buffer.
2020-03-21 15:37:25 -07:00
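The two-step scheme described in the commit above (a u8u8=s32 GEMM followed by requantization of the intermediate buffer) can be illustrated with a small pure-Python sketch. This is not the MLAS kernel. The function names are hypothetical, the zero-point handling and the single `multiplier` (standing for a_scale * b_scale / y_scale) are assumptions about a typical per-tensor quantization setup, and rounding uses Python's round-half-to-even.

```python
def gemm_u8u8_s32(a, b, a_zero_point=0, b_zero_point=0):
    """Multiply two uint8 matrices (lists of rows) into integer accumulators."""
    inner, cols = len(b), len(b[0])
    return [
        [
            sum((row[k] - a_zero_point) * (b[k][j] - b_zero_point)
                for k in range(inner))
            for j in range(cols)
        ]
        for row in a
    ]


def requantize_to_u8(acc, multiplier, y_zero_point=0):
    """Scale the accumulators, add the output zero point, saturate to [0, 255]."""
    def quantize(value):
        return max(0, min(255, round(value * multiplier) + y_zero_point))

    return [[quantize(value) for value in row] for row in acc]


# Hypothetical usage with tiny matrices and zero points of 0.
a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
acc = gemm_u8u8_s32(a, b)                              # integer intermediate buffer
y = requantize_to_u8(acc, multiplier=1.0, y_zero_point=3)
```

Keeping the accumulation in integers and applying the floating-point scale once per output element is what lets the inner loop run as a pure integer GEMM.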
Pranav Sharma
84015d9491
Fix post merge test. This doesn't get triggered as part of gated PR checks. (#3277) 2020-03-20 13:23:09 -07:00
Xueyun Zhu
ccc3535e72 resolve conflict 2020-03-20 20:20:35 +00:00
Ye Wang
c5149e89d9
Wangye/shortgraindropper (#3273) (#3274)
* Featurizer Library update

* update Featurizer Library

* add short_grain_dropper_transformer

* resolve comments

* resolve comments

* resolve comments
2020-03-20 11:48:31 -07:00
liqunfu
d521efd904
refactor frontend (#3235)
* refactor frontend

* remove training python files from inferencing build

* update according to reviewer's comments

* merge pybind_state.cc

* refactor pybind_state.cc

* code clean up

* missed a forward declaration in ort_pybind_state.cc

* passed pytest

* move training_session.py into a subfolder per reviewer's comment

* add copyright

Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-03-19 20:59:41 -07:00
ytaous
ca7985fd9f
Address PR comments (#3256)
* comments

* fix path

* fix path

Co-authored-by: Ethan Tao <ettao@microsoft.com>
2020-03-19 10:40:00 -07:00
Tiago Koji Castro Shibata
3bdb0b620a
Fix WCOS/Win32 linking bugs (#3126)
* Fix WCOS/Win32 linking bugs

* Remove unused NODEFAULTLIB flags

* Avoid plain target_link_libraries signature

* Avoid plain target_link_libraries signature

* Fix library list escaping

* Use library list instead of string

* Remove duplicate link to windowsapp.lib

* Remove Win32 build workarounds

* Specify CMake policies before initializing language

* Expose Win32 header definitions during build

* Force set API family

* Enable Win32 APIs in featurizer

* Use MT dynamic CRT

* Expose Win32 specific functions

* Disable app container globally

* Disable default wide functions in featurizers

* Add featurizers to test include path

* Workaround https://gitlab.kitware.com/cmake/cmake/issues/19428

* Revert pipeline debugging hacks

* Skip /FI in CUDA sources

* Default to Win32 builds

* Enable WCOS when using WinML

* Use generator expression to apply CMAKE_MSVC_RUNTIME_LIBRARY to C++ only
2020-03-19 08:52:40 -07:00
Pranav Sharma
435f014d71
Add support for sessions to share a global threadpool. (#3177)
* Add support for sessions to share a global threadpool.

* Fix build issues

* Add tests, fix build issues.

* Added some documentation

* Fix centos issue when threadpools become nullptr due to 1 core.

* Fix mac and x86 build issues

* Address some PR comments

* Disabled test for android, added few more tests and addressed more PR comments.

* const_cast
2020-03-18 15:42:46 -07:00
edgchen1
e03b8a1e2f
Move path_lib from onnxruntime/core/framework to onnxruntime/core/platform. (#3253)
Moved path_lib.h/cc from onnxruntime/core/framework to onnxruntime/core/platform and from the onnxruntime_framework to the onnxruntime_common libraries.
2020-03-18 11:53:46 -07:00
Tracy Sharpe
88c20eaef1
MLAS: rename AVX512BW->AVX512Core (#3216)
Cleanup change: remap functions and files with Avx512BW to Avx512Core.
2020-03-13 22:45:51 -07:00
Tracy Sharpe
fe0b2b2abd
QLinearConv speed up (#3196)
For x86/x64 builds, change the QLinearConv op to use MLAS for the u8u8=s32 GEMM, then requantize the intermediate buffer to u8.
2020-03-13 16:54:55 -07:00
Zeeshan Siddiqui
2cad08bd60 Merged PR 5688: Upgrade ONNX submodule to the latest from github ONNX master.
We want to implement SoftmaxCrossEntropyLoss and NegativeLogLikelihoodLoss forward training ops for opset-12, but that requires the ONNX submodule to point to the latest commit so that it has the latest ONNX spec.

- Reverse integrate changes from *.in.proto files in github ONNX repo.
- Regenerate csharp/test/Microsoft.ML.OnnxRuntime.Tests/OnnxMl.cs
- Disable ONNX tests that don't have op implementation for the latest opset.
2020-03-12 16:51:45 -07:00
Edward Chen
e542cfd0e0 Introduce training changes. 2020-03-11 14:39:03 -07:00
KeDengMS
ade4fa108f
Disable delayload for cuda dlls (#3147)
This change fixes #3129. When running onnxruntime as a DLL on Windows, CUDA does some internal cleanup when the process exits. After this, any call to CUDA causes a crash. Delayload makes the thread_local destructor run after the CUDA cleanup, hence the crash.
2020-03-05 14:40:22 -08:00
smk2007
6cdd2b4934
Enable DML Nuget Package for x64 or x86 architectures (#3120)
* add dml gpu pipelines

* add x86 to the gpu dml dev build pipeline

* Enable DML x86 builds

* Fix uint64_t -> size_t warning

* fix warnings

* enable dml on x86 ci builds

* operatorHelper 773 error uint32_t vs uint64_t

* operatorHelper 773 error uint32_t vs uint64_t

* make x86 pipeline use the gpu pool

* more warnings

* fix x86 directml path

* make dml nuget package

* disable tf_pnasnet_large

* disable zfnet512

* make validation use wildcards

* disable x86 dml gpu tests

* add args.

* update gpu.yml

* change nupkg wildcard

* add debug statements

* package x86 dml nupkg

* dont drop managed nuget again from dml pipeline build

* Add DML EULA

* directml license should be renamed to not clobber the existing license

* casing on dml package....

* {} to ()

* fix license name

* disable dml from x86 ci

* typo and cr feedback

* remove featurizers

* ship the dml pdb as well
2020-03-02 20:18:46 -08:00
edgchen1
37f5fd8fb8
Add support for loading TensorProtos with external data from optimizer Initializer (#3045)
- Added support for loading TensorProtos with external data from the optimizer Initializer class.
- Added some file path utilities.
2020-02-28 13:19:16 -08:00
Changming Sun
c6ed077441
Add d2FH4- flag to cuda (#3105) 2020-02-27 20:22:07 -08:00
Dmitri Smirnov
5008fc5b00
Featurizers: Import fix for Linux build adjust linkage (#3089)
Advance FeaturizersLibrary
  SetAbsError on Output
2020-02-27 15:49:18 -08:00
Changming Sun
d72639ef77
Fix CUDA 10.1 DLL names (#3102) 2020-02-27 14:43:16 -08:00
daquexian
37a905f557
Make Java API available on Android (#3030) 2020-02-27 08:23:50 -08:00
Ori Levari
5e0f7412cd
Properly handle downlevel and WCOS scenarios (#3075) 2020-02-25 17:47:02 -08:00
stevenlix
f4a5d17294
Upgrade to CUDA10.2 for TensorRT (#3084)
* Switch to CUDA10.2

* Update win-gpu-tensorrt-ci-pipeline.yml

* Update win-gpu-tensorrt-ci-pipeline.yml

* remove dynamic_shape

* update onnx-tensorrt submodule

* check if input shape is specified for TensorRT subgraph input and enable some TensorRT unit tests

* fix format issue

* add shape inference instruction for TensorRT

* update according to the reviews

* Update win-gpu-tensorrt-ci-pipeline.yml
2020-02-25 05:36:01 -08:00