Commit graph

2105 commits

Author SHA1 Message Date
Sergii Dymchenko
6bbc80951d Get onnxruntime/core/providers/cuda/tensor/slice.h from ort_training. 2020-04-09 17:03:58 -07:00
Sergii Dymchenko
0e4080f1d6 Get cuda_common.h from master. 2020-04-09 16:56:52 -07:00
Sergii Dymchenko
84773c61c6 Rename ONNX OPTIONAL to OPTIONAL_VALUE. 2020-04-09 16:22:30 -07:00
Sergii Dymchenko
eaa3f652df Fix dynamicslice.cc after merge. 2020-04-09 15:17:21 -07:00
Sergii Dymchenko
8ea0e596ec Fix onnxruntime_unittests.cmake after merge. 2020-04-09 13:14:15 -07:00
Sergii Dymchenko
6ba7c99e50 Merge branch 'master' into ort_training 2020-04-09 12:42:04 -07:00
Yufeng Li
4d71958ccf
Use IMMA for int8 matmul to leverage Turing Tensor Core (#3413)
Use IMMA for int8 matmul to leverage Turing Tensor Core
Format files under onnxruntime/core/providers/cuda
2020-04-07 15:22:04 -07:00
Tracy Sharpe
de60a14c16
Fix output range for int8_t QuantizeLinear op (#3445) 2020-04-07 15:01:20 -07:00
Yulong Wang
aabf47b107
Fix Split CUDA implementation for zero sized input (#2942)
* Fix Split CUDA implementation for zero sized input

* resolve comments

* add case

* test case update: split into 2 tensors
2020-04-07 14:44:20 -07:00
Scott McKay
48e96ea65f
Reduce binary size of Slice implementation (#3238)
* Make the Slice implementation based on type sizes and reduce templatized code to a minimum.

* Remove using 'dynamic' as a template param to Slice as well.
2020-04-08 07:19:29 +10:00
ytaous
b35468289a
View Op - new unit tests and add support for tensor memcpy by offset/size (#3439)
* view ops UTs

* update per comments

* PR comments - code clean up

* code clean up per comments

Co-authored-by: Ethan Tao <ettao@microsoft.com>
2020-04-07 13:07:11 -07:00
Dmitri Smirnov
53b9d52fc6
Rework TensorToTensorProto. Do not put string data into raw_data. Eliminate redundant argument. (#3438)
Rework TensorToTensorProto. Eliminate redundant argument.
  Do not put string data into raw_data.
2020-04-07 11:42:10 -07:00
Andrews548
43d6c464fc
Fix ACL EP pooling build breakage (#3429)
The commit 06fc9506fd which refactored cpu Pool class broke ACL EP build.
Also worked on the commit a4fe60c4d3 as it also affects the new class.
Move the declaration of the new MaxPoolV8 cpu class in the header file. Implement MaxPool 8-11 in ACL EP.

Co-authored-by: Andrei-Alexandru <andrei-alexandru.avram@nxp.com>
2020-04-07 07:03:52 -07:00
Tianlei Wu
4bdb5cc8e2
Add CPU implementation for FastGelu operator (#3398)
* Add CPU implementation for FastGelu operator
* Update optimization script to fuse Gelu or FastGelu according to whether Erf or Tanh is used in the graph.
* Merge BiasGelu and FastGelu into one class
* Enable FastGelu Fusion optimizer for CPU Execution Provider.
2020-04-07 00:19:30 -07:00
Changming Sun
9e65298d7a
Re-enable tests (#3437)
Re-enable some tests that were recently fixed.
2020-04-06 20:13:34 -07:00
Thiago Crepaldi
15e32b44fd
Merge pull request #3383
Merge from master into ort_training
2020-04-06 19:05:01 -07:00
Tianlei Wu
8ab09186b7
Bert Optimization Script Improvements (#3387)
Add opt_level option for graph optimization level in bert perf test.
Support BERT models that output each layer, where SkipLayerNormalization has more than 4 children.
Check weight and bias are 1D for layer norm fusion.
Add a dummy class Gpt2OnnxModel for further changes of GPT2 model.
2020-04-06 16:55:40 -07:00
Edward Chen
95707d22a5 Disable gradient clipping for E2E test. 2020-04-06 23:07:28 +00:00
Dmitri Smirnov
c8f5e6e632
Implement Min/Max/Clip(12) (#3410)
Implement Max/Min for opset 12.
  Add Clip(12) CPU impl.
  Implement Clip(12) for CPU and CUDA; add tests.
2020-04-06 14:24:59 -07:00
Yang Chen
7c69b1703b
Fixed a typo (no functional change) (#3433)
s/initailizer/initializer/
2020-04-06 13:46:17 -07:00
Ye Wang
4ebad8805b
change (#3431) 2020-04-06 11:30:21 -07:00
Changming Sun
0dcc6035b1
Disable strong inline (#3399)
To bypass an MSVC bug. Without this change, people can't use VS2017 to build onnxruntime in Release or RelWithDebInfo mode.
2020-04-06 11:19:09 -07:00
Sherlock
a3ab2ba036
Reapply commit 131c65d; Fix memory regression issue. (#3423)
* Reapply commit 131c65d

* fix merge error
2020-04-06 10:29:31 -07:00
Yang Chen
d361121d98
Do not inline ExternOp's scalar tensor inputs (#3426)
An ExternOp's input needs buffers, so we cannot add compute_inline
schedule on it even if it's a scalar tensor. Instead, we need to
schedule it as compute_root.
2020-04-05 18:35:09 -07:00
Tiago Koji Castro Shibata
517693a507
Fix race condition creating ConverterResourceStore (#3419) 2020-04-04 20:10:07 -07:00
Changming Sun
33006f48c0
Update onnx submodule to 1.7.0 release candidate (#3405)
Update onnx submodule to 1.7.0 release candidate. This isn't a release tag, but it will be released soon, in 1-2 weeks.
2020-04-04 16:23:42 -07:00
Tracy Sharpe
d4d19a75ba
Use MlasConv for 1D convolutions (#3425)
Use the existing 2D convolution code in MlasConv to also handle 1D convolutions.
2020-04-04 09:43:10 -07:00
Jesse Benson
5835349614 Add #pragma once to providers.h to avoid 'struct' redefinition error when including the header from multiple places. 2020-04-03 16:25:18 -07:00
edgchen1
82c1e1b3db
Enable loss scale input from Python frontend (#3327)
Made some fixes to enable loss scale to be wired up to ORT from the Python frontend. In particular, now addition of loss scaling is done unconditionally if mixed precision is enabled. The generated loss scale input name is passed back to the frontend.

Also fixed how inputs were added during the training graph configuration. Graph::SetInputs() was causing some issues - it seems to not be working correctly.

Also added some mixed precision Python frontend tests.

Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-04-03 16:02:14 -07:00
Sherlock
f437665360
Revert "Addressing PR comments (#3334)" (#3412)
This reverts commit 131c65d23d.
2020-04-03 11:59:47 -07:00
Pranav Sharma
14f4c3e25f
Fix issue in construction of DummyArena. (#3416) 2020-04-03 08:28:05 -07:00
Scott McKay
85131e760c
Enable upsample2x optimization for opset 11 Resize (#3388)
* Enable use_nearest2x_optimization for opset 11 of Resize when possible
2020-04-03 17:36:11 +10:00
Thiago Crepaldi
d89e5d91a6 Disable GradientCheckerTest tests for GPU/Debug build (#3407) 2020-04-03 01:01:58 +00:00
Thiago Crepaldi
675035b1a8
Disable GradientCheckerTest tests for GPU/Debug build (#3407) 2020-04-02 18:00:54 -07:00
Pranav Sharma
3568f8d186
Allow a custom op with the same name to be registered for several providers. (#3400) 2020-04-02 15:38:51 -07:00
Changming Sun
a5fea26cb4 Disable model tests for Mac OS X builds 2020-04-02 15:14:32 -07:00
Thiago Crepaldi
e2afe5e054 Revert Session and InferenceSession implementation 2020-04-02 11:47:44 -07:00
Changming Sun
aefa466334
Allow zero in split op (#3389)
Allow zero in split op (A change in onnx 1.7 without bumping up the op version)
2020-04-01 16:20:14 -07:00
Tiago Koji Castro Shibata
1671072b6b
[WIP] Port image tests from WAI (#3365)
* Copy image tests from ADO

* wip

* Port tests to googletest

* Add FNS-Candy license

* Add missing collaterals

* Remove brand images

* Fix typos

* Use PrepareModelSessionBinding in MnistImageTest

* Fix typos
2020-04-01 15:38:44 -07:00
Thiago Crepaldi
0b1e3f1e10 Revert _SliceKernel cuda implementation 2020-04-01 14:28:17 -07:00
Thiago Crepaldi
28ff88ce52 Disable tests (temporary) 2020-04-01 14:28:07 -07:00
Tiago Koji Castro Shibata
1c334ed0f1
Add Ninja generator to build.py (#3331) 2020-04-01 14:19:22 -07:00
Xavier Dupré
edec8043d4
Fix python examples in documentation (#3379) 2020-04-01 22:48:32 +02:00
ytaous
2ce90cff4c
PR comments (#3374)
* PR comments

* PR comments

* PR comments

* PR comments

* PR comments

* PR comments

* PR comments

Co-authored-by: Ethan Tao <ettao@microsoft.com>
2020-04-01 10:36:16 -07:00
Sherlock
614eb438ae
Update Op's Domain and Version (#3356)
* Update Nccl ops domain opset

* Update ZeroGradient Domain OpSet

* Update InPlaceAccumulator Domain OpSet

* Update SoftmaxGrad Domain and OpSet

* Update LayerNormalizationGrad Domain and OpSet

* Update BatchNormGrad Domain and Opset

* Update IsAllFinite Domain and Opset

* Update DivGrad Domain and Opset

* Update GatherGrad Domain and Opset

* Update IsFinite Domain and OpSet

* Update ReduceAllL2 Domain and Opset

* Update MixedPrecisionScale Domain and Opset

* Update AllOp Domain and Opset

* Update GroupOp Domain and OpSet

* Update ViewOp Domain and OpSet
2020-04-01 10:10:38 -07:00
Changming Sun
accffded5d
Build options for enabling AVX/AVX2/AVX512 (#3373)
1. Add build options for enabling AVX/AVX2/AVX512
2. Update eigen to a newer version, because the current one doesn't work with VC and AVX512.
2020-04-01 10:07:22 -07:00
Brian Martin
77c7d09ced
ERROR_NOT_SUPPORTED doesn't trigger Failed Hresult. Need E_NOTIMPL (#3396) 2020-04-01 10:06:00 -07:00
Brian Martin
052c1fda44
fix some warnings in concurrency tests (#3395) 2020-04-01 10:05:24 -07:00
Scott McKay
33d3239b67
Rework SVMClassifier to improve performance (#3363)
* Rework SVMClassifier
 - use GEMM for initial scoring
 - minimize data allocations and copies
 - parallelize the second half of the scoring for larger batches
2020-04-01 22:00:01 +10:00
Thiago Crepaldi
6d769d47c4 Fix InferenceSession API 2020-03-31 20:10:06 -07:00