1377 commits

Author SHA1 Message Date
Scott McKay
ddbc2086e4 Add support for opset 11 Clip in optimizers. (#2059) 2019-10-10 10:47:29 -07:00
Yulong Wang
a41c71cbf2
check and fix CUDA kernel launch errors in several OPs (#2047) 2019-10-10 23:47:00 +08:00
baowenlei
b4a98aab78
change MatMulInteger/MatMulInteger16 fallback option (#2064)
* change MatMulInteger/MatMulInteger16 fallback option when no initializer exist

* add AVX option

* fix condition for old machines
2019-10-09 22:03:21 -07:00
Hariharan Seshadri
d186c19c45
Add opset-11 TopK CPU kernel (#1912)
* initial commit

* Update

* Update top_k.cc

* PR comments

* Add more tests

* Update

* Add another test case

* Update

* Resolve conflicts

* Update

* Nits

* Nits

* Nits

* Pick sorted content using 2 different approaches

* Update to logic

* PR comments

* PR feedback

* Update

* Fix build

* Fix build

* Update
2019-10-09 19:09:30 -07:00
Colin Versteeg
8fda6593fe Update failing tests (#2038)
* Fix failing tests from when they were not enabled

* split into two

* fix failing test
2019-10-09 15:17:21 -07:00
Tracy Sharpe
57e0099425
MLAS: Implement U8S8 GEMV kernels (#2069)
This implements an optimization for U8S8 MlasGemm when M=1, aka GEMV.
2019-10-09 11:54:16 -07:00
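As a rough illustration of the U8S8 GEMV case named in commit 57e0099425 above (a GEMM with M=1, an unsigned 8-bit left operand, and a signed 8-bit right operand), the following is a minimal pure-Python sketch. The function name `gemv_u8s8`, the row-major flat-list layout, and the range checks are assumptions made for illustration; the MLAS kernels themselves are not shown in this log and will differ.

```python
def gemv_u8s8(a, b, n):
    """Compute c = a @ B for a 1 x K row `a` of unsigned 8-bit values and a
    K x N matrix `b` of signed 8-bit values stored row-major in a flat list.

    Hypothetical helper for illustration only. Python ints stand in for the
    wider integer accumulators a real kernel would use.
    """
    k = len(a)
    if len(b) != k * n:
        raise ValueError("b must hold exactly K * N elements")
    if any(not 0 <= v <= 255 for v in a):
        raise ValueError("a must contain unsigned 8-bit values")
    if any(not -128 <= v <= 127 for v in b):
        raise ValueError("b must contain signed 8-bit values")

    c = [0] * n
    for p in range(k):
        av = a[p]
        row = b[p * n:(p + 1) * n]  # row p of B
        for j in range(n):
            c[j] += av * row[j]     # accumulate one product per output column
    return c


result = gemv_u8s8([1, 2, 255], [1, -1, 2, 0, -128, 127], 2)
```

With M fixed at 1 there is only one output row, so the work reduces to a single pass over B. That is the property a dedicated GEMV path can exploit.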
Changming Sun
eee9c55030
C++11 fix for memcpy_transformer_test.cc (#2061) 2019-10-09 10:52:10 -07:00
Changming Sun
cefae93305
Add a test case for linearregressor (#1962) 2019-10-09 10:17:08 -07:00
Changming Sun
ccaf692ff2
Run auditwheel for manylinux1 (#2063) 2019-10-09 09:23:00 -07:00
Dmitri Smirnov
cae571c713 Add a test for AVX512 compilation before compiling 512 asm (#2055) 2019-10-08 21:18:04 -07:00
Changming Sun
af8fe0f980
Replace make_unique in cuda_utils.cu (#2052) 2019-10-08 18:32:08 -07:00
Scott McKay
db0dd09ded
Cleanup some aspects of the Initializer class used by optimizers (#2005)
* Move check on data type outside of the Initializer class as it's specific to Conv processing.
Use references for arguments that can't be null.
2019-10-09 10:37:44 +10:00
Changming Sun
a00ca56ae1
Remove gcc from manylinux1 docker image (#2048) 2019-10-08 13:49:15 -07:00
baowenlei
b82de794d5
Weba/update nuphar doc (#2026)
* update nuphar xp doc

* address comments

* address CR

* update doc
2019-10-08 12:41:25 -07:00
RandySheriffH
f501b6e234
pack pyop in nightly build (#2018)
* pack pyop in nightly build

* correct logic

* add comment

* exclude debug build

* add dependency

* reset postbuild rule

* remove dep
2019-10-08 12:02:45 -07:00
Changming Sun
e9bed8b23b
Change python packaging pipeline to use manylinux1 (#2035)
1. Change the python packaging pipeline to use manylinux1
2. Temporarily disable model test in the python pipeline.
2019-10-08 10:03:54 -07:00
Changming Sun
3053af812c
Fix a crash in deep_cpu_gru_op_test.cc (#2028) 2019-10-08 10:03:07 -07:00
Zhang Lei
71b389322e Implement cuda scatter op. (#1991)
* Implement cuda scatter op.
Disable Invalid Index of Scatter op only for cuda provider.

* Fix some pipeline's type narrow warning as error.
2019-10-08 09:53:33 -07:00
Yang Chen
a94c9bd88d throw exception using dmlc::LogMessageFatal (#2033)
* throw exception using dmlc::LogMessageFatal

On Windows, ORT_THROW couldn't be caught if the exception was thrown from
a jitted function. Let's call dmlc::LogMessageFatal instead.

* address CR

use LOG(FATAL)
2019-10-08 09:31:35 -07:00
Yang Chen
19b0d0af87
Enabled bool input type for Equal for op_ver 11 (#2034)
This change enabled bool type for Equal-11's inputs
2019-10-08 01:50:37 -07:00
Yang Chen
203c2f5b59
updated reduce_ops for op_ver 11 (#2039)
After enabling op_ver 11 for reduce ops, we need to check axes to
make sure it's not empty.
2019-10-08 01:05:05 -07:00
Pranav Sharma
f13b66768a
Fix build for gcc 4.8.5. (#2036) 2019-10-08 00:50:53 -07:00
shahasad
b70fc34fae
Fix C# end to end tests in NuGet pipeline, failing for missing test data file 2019-10-07 20:14:20 -07:00
shahasad
b0feaef9de
Update the C# pretrained model test to include opset9 and 10 models (#2003) 2019-10-07 19:14:34 -07:00
George Wu
0bd807f3b3
trt provider status return cleanup (#2032)
* status and code cleanup.

* revert change. seems like a bug in TRT causes intermittent failure return?
2019-10-07 18:34:48 -07:00
Tianlei Wu
b2c1937523
Add EmbedLayerNormalization and SkipLayerNormalization ops for bert optimization (#2012)
* Add Embed Layer Normalization and Skip Layer Normalization ops for bert optimization.

* add float16 test for skiplayernorm

* Add test for EmbedLayerNormalization op

* fix cpu build error

* fix build warning

* update HasCudaEnvironment function

* handle cuda error
2019-10-07 17:29:43 -07:00
Changming Sun
8f7657fa32
Ignore some gcc warnings (#1996) 2019-10-07 16:32:34 -07:00
Pranav Sharma
ea60469af5
Support seq(tensor), implement 2 sequence ops that use the new type. (#1983)
* Mention OrtCreateSessionFromArray in C API doc

* fix seq of tensors

* changes on 9/30

* All tests passing

* Add SequenceAt op

* Fix shared_lib non_tensor_types test

* Address some PR comments

* Address PR comments

* Add support in python bindings to accept seq(tensor)

* Change data type from vector<Tensor> to TensorSeq

* Change data type from vector<Tensor> to TensorSeq

* Added some documentation

* Added missing test model

* Fix Linux build

* Fix Mac build

* Fix Mac build
2019-10-07 15:35:09 -07:00
Hector Li
00e24ae4fe
refactor Cuda Ops Sum, Max, Min, remove dup code (#1946)
2019-10-07 13:17:49 -07:00
Tianlei Wu
7b39f5090c
Add Attention op for multi-head self attention in BERT (#1984)
* Add Attention op for multi head self attention in BERT

* Add test cases

* Move op from kOnnxDomain to kMSDomain.
Limit test to run by CUDA provider only.

* fix test

* Add float16 test

* fix cpu build error

* handle cuda error

* get last cuda error when failed
2019-10-07 12:22:54 -07:00
Yang Chen
7d2f0c79bd Bumped up to op_ver 11 for a bunch of Nuphar Ops (#2025)
This change enabled op_ver 11 for a dozen Nuphar Ops
2019-10-07 10:34:05 -07:00
Changming Sun
3c26ae5b6d
ThreadPool fix for roialign and CropAndResize (#2020) 2019-10-06 22:43:59 -07:00
Pranav Sharma
4cdb95e436
Resort to sequential execution if the inter op thread pool ptr is nullptr; (#2023) 2019-10-06 16:08:41 -07:00
stevenlix
544e53e24e Update TensorRT to version 6.0.1.5 (#1966)
* remove onnx-tensorrt submodule

* add new onnx-tensorrt submodule (experiment) for trt6

* update engine build for trt6

* update compile and compute for tensorrt6.0

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc

* switch to onnx-tensorrt master for TensorRT6

* Update tensorrt_execution_provider.cc

* Handle dynamic batch size and add memcpy in TensorRT EP

* update test cases

* Update tensorrt_execution_provider.cc

* update onnx-tensorrt submodule

* Update Dockerfile.ubuntu_tensorrt

* Update Dockerfile.ubuntu_tensorrt

* Update run_dockerbuild.sh

* Update run_dockerbuild.sh

* Update install_ubuntu.sh

* Update concat_op_test.cc

* Update tensorrt_execution_provider.cc

* Upgrade TensorRT to version 6.0.1.5

* Update onnxruntime_providers.cmake

* Update CMakeLists.txt

* Update reduction_ops_test.cc

* Update install_ubuntu.sh

* Update Dockerfile.ubuntu_tensorrt

* Update Dockerfile.tensorrt

* Update BUILD.md

* Update run_dockerbuild.sh

* Update install_ubuntu.sh

* Update onnxruntime_providers.cmake

* Update install_ubuntu.sh

* Update install_ubuntu.sh

* Update gemm_test.cc

* Update gather_op_test.cc

* Update CMakeLists.txt

* Removed submodule

* update onnx-tensorrt submodule

* Add Ubuntu18.04 build option

* Add Ubuntu18.04 build option

* Add Ubuntu18.04 build option

* Add Ubuntu18.04 build option

* Remove redundancy

* Fix an issue where the memcpy node is not added correctly when some nodes fall back to the CUDA EP.
For example, if partitioning yields TRT_Node -> Cuda_node (with CPU memory expected), a memcpy node is still needed between them.

* update for Trt Windows build

* Update onnxruntime_providers.cmake

* Disable opset11 tests on TensorRT

* Update pad_test.cc

* Update build.py

* update scripts for ubuntu18.04

* Disable warning for Windows build
2019-10-06 10:40:53 -07:00
baowenlei
4bb6385dca
Weba/merge ngemm (#2021)
* save status: add tiling layout; add avx512 skylake cpuid info

* unit tests and matmul integer model passed on skylake, need to verify model

* save commit before update master

* fix check

* address comments
2019-10-05 12:09:22 -07:00
Xavier Dupré
0b5aac0a2e fix python setup (#2022) 2019-10-05 09:46:41 -07:00
Yang Chen
e8285a7996
Added GatherElements to Nuphar (#2016)
* Added GatherElements to Nuphar

This change added GatherElements (op_ver 11) to the Nuphar provider.

* address CR feedback

* create a utility function for accessing index safely

* address more CR

* SafeIndex -> ClampIndex
2019-10-04 23:53:02 -07:00
Colin Versteeg
1ba76c5f74 add support for empty version and score route (#1995) 2019-10-04 22:53:11 -07:00
Changming Sun
a9e04a29b3
Ignore a test: ParallelExecutor.StatusPropagation (#2019) 2019-10-04 22:51:47 -07:00
Scott McKay
2a2e6e6641
Handle nullptr for NodeArg.Shape() (#2009) 2019-10-05 15:00:19 +10:00
Hariharan Seshadri
f528da35f2
Update ONNX to a newer commit (#2015)
* Update ONNX to a newer version

* PR comments
2019-10-04 19:41:00 -07:00
Dmitri Smirnov
f5a8a23951 Replace std::regex with re2 because CentOS std::regex is broken (#2017) 2019-10-04 18:47:03 -07:00
daquexian
e071a1249b Android CI (#1600) 2019-10-04 17:39:51 -07:00
Colin Versteeg
bfa1b0e96e Fix logger regression (#2011)
* Fix regression in creating default logger from custom function

* fix model naming issue in tests

* fix version in addition to model name
2019-10-04 16:39:40 -07:00
shahasad
b322e072b9
added the overridableinitializers api (#1977) 2019-10-04 16:38:00 -07:00
ybrnathan
19873c70dc
Implement Cuda Kernel of Where Op (#1997)
* Implement Cuda Kernel of Where Op

* Fix the template
2019-10-04 15:32:41 -07:00
Yufeng Li
a6bf1d0ad8
use mlaserf (#1999)
1. use MlasErf for Gelu. Eigen's erf is very slow.
2. change the ErfUpperAbsRange to 3.925 because MlasErf doesn't return 1 for 3.725
2019-10-04 15:17:26 -07:00
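For readers unfamiliar with why Gelu depends on erf (commit a6bf1d0ad8 above), the exact Gelu definition is 0.5 * x * (1 + erf(x / sqrt(2))). The sketch below uses Python's `math.erf` as a stand-in. It is not MlasErf, and it does not reproduce the `ErfUpperAbsRange` clamping mentioned in the commit; the function name `gelu` is chosen here for illustration.

```python
import math


def gelu(x):
    """Exact (erf-based) Gelu: 0.5 * x * (1 + erf(x / sqrt(2))).

    Uses math.erf from the standard library as a stand-in for whichever
    erf implementation a runtime actually provides.
    """
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


values = [gelu(v) for v in (-1.0, 0.0, 1.0)]
```

Because erf is evaluated once per element, the cost of the erf routine dominates the operator, which is consistent with the commit's motivation for switching implementations.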
Scott McKay
fdbe365c37
Add BitShift operator (#1981)
* Add BitShift operator. Enable uint32 and uint64 support initially.
2019-10-05 07:48:58 +10:00
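The BitShift entry above (commit fdbe365c37) mentions uint32 and uint64 support. As a hedged sketch of the element-wise semantics, the ONNX BitShift operator takes a `direction` attribute of "LEFT" or "RIGHT" and shifts each element of X by the matching element of Y. The helper name `bit_shift`, the `bits` parameter, and the use of plain lists are assumptions for illustration; broadcasting is omitted, and shift amounts are assumed to be smaller than the bit width.

```python
def bit_shift(x, y, direction, bits=32):
    """Element-wise unsigned bit shift over two equal-length lists.

    Hypothetical helper: `bits` selects the emulated width (32 or 64), and
    results are masked so that left shifts drop overflowing bits the way a
    fixed-width unsigned integer would.
    """
    if direction not in ("LEFT", "RIGHT"):
        raise ValueError("direction must be 'LEFT' or 'RIGHT'")
    if bits not in (32, 64):
        raise ValueError("only 32- and 64-bit widths are sketched here")
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    mask = (1 << bits) - 1
    out = []
    for xv, yv in zip(x, y):
        if direction == "LEFT":
            out.append((xv << yv) & mask)   # discard bits shifted past the width
        else:
            out.append((xv & mask) >> yv)   # logical right shift, zero-filled
    return out


left = bit_shift([1, 0x80000000], [1, 1], "LEFT")
right = bit_shift([16, 4], [1, 2], "RIGHT")
```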
Colin Versteeg
d5d1719c1f Fix integration_tests/test_main.py to have correct exit code (#2010) 2019-10-04 14:25:28 -07:00
Changming Sun
ace0b2ca1c
CentOS CI (#1998) 2019-10-04 10:48:43 -07:00