1672 commits

Author SHA1 Message Date
liuziyue
200f4b4ea6 EmbedLayerNormalization Fusion Improvement (#2553)
Embedding layer norm fusion improvements - add more checks
2019-12-07 23:14:26 -08:00
KeDengMS
0f12346d76
[Nuphar EP] fixes for some object detection models (#2581)
Update notebook tutorial with multi-threaded int8 GEMM from #2517
2019-12-07 13:37:00 -08:00
Ryan Hill
cbc398bb75
Ryanunderhill/packagename test (#2582) 2019-12-07 12:08:46 -08:00
Ashwini Khade
c06dbd8311 Add ConvTranspose1D (#2578) 2019-12-07 08:50:02 -08:00
Mark
79847f39b3 Fix file not found error during docker build. (#2569) 2019-12-07 08:49:47 -08:00
Yufeng Li
5575766a53
Add more check on SkipLayerNorm and BiasGelu fusion (#2574) 2019-12-06 15:36:02 -08:00
Changming Sun
262ee9dc5a Fix a warning found in the latest VS release 2019-12-06 15:07:21 -08:00
Yufeng Li
34beafc51c
make layernorm fusion to support opset 11 (#2545) 2019-12-06 13:06:36 -08:00
shahasad
eeb28a80c0
setup java ci mac (#2570) 2019-12-06 11:43:40 -08:00
Tianlei Wu
038ee91da5
Allow sequence length to be symbolic (#2559) 2019-12-06 10:13:56 -08:00
George Wu
73c682b97c
disable onnx_test_runner -x invocations for dnnl (#2568) 2019-12-05 23:05:34 -08:00
Changming Sun
7eddac16c2
Re-enable Windows C# tests (#2564) 2019-12-05 21:22:31 -08:00
Ryan Hill
854362cf05
Update win-x86-ci.yml (#2557)
Fix build pipeline break
2019-12-05 18:44:12 -08:00
Changming Sun
ace132f9aa
Fix android build (#2558) 2019-12-05 15:03:22 -08:00
Sreekanth Yalachigere
4c996a8699 DNNL CMAKE update (#2548) 2019-12-05 13:48:57 -08:00
Hariharan Seshadri
53a6bc2f07
Fix a bug handling negative begin pad values in Pad op (#2550)
* Fix bug in Pad op

* Update
2019-12-05 11:29:45 -08:00
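For context on the Pad fix above: in ONNX, a negative pad amount removes elements from that side of the axis rather than adding them. The following is a minimal 1-D sketch of that behaviour in plain Python, not the ONNX Runtime implementation; the function name `pad_1d` and its parameters are hypothetical and chosen only for illustration.

```python
def pad_1d(values, begin, end, fill=0):
    """Constant-mode pad of a 1-D list (illustrative helper, not ORT code).

    A positive amount adds `fill` elements on that side; a negative
    amount crops that many elements from that side instead.
    """
    start = max(-begin, 0)                # elements cropped from the front
    stop = len(values) - max(-end, 0)     # index one past the last kept element
    core = values[start:stop] if stop > start else []
    return [fill] * max(begin, 0) + core + [fill] * max(end, 0)


# A negative begin pad crops the front; a positive end pad appends fill values.
cropped_then_padded = pad_1d([1, 2, 3, 4], begin=-1, end=2)
```

Cropping more elements than the axis holds yields an empty result in this sketch; how a real kernel reports that case is outside what the commit message states.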
Changming Sun
bec4abf074 Add back executable bit to build.py 2019-12-04 21:22:02 -08:00
Ashwini Khade
281933fa1c
Fix C API tests for centos and mac (#2544)
* change c++14 to c++11

* add ld lib path for centos

* enable csharp tests on macos

* fix C API test on MacOS + fix manylinux dotnet install

* fix manylinux dotnet install

* fix lib link
2019-12-04 18:01:35 -08:00
Dmitri Smirnov
d34fb62012
Introduce container type runtime checks and other improvements (#2522)
Rework TensorSeq in a manner consistent with Tensor and SparseTensor
  in terms of type system setup.
  Reduce templating. Introduce helpers to ensure the same
  data type.
  Make OrtValue __dtor not virtual.
  Introduce ContainerChecker
2019-12-04 16:04:17 -08:00
Yulong Wang
be56d77a66
Fix integer overflow in cuda NonMaxSuppression implementation (#2540)
* add test case that should pass but fail

* fix nms

* extract int_max_output_boxes_per_class
2019-12-04 13:27:04 -08:00
Xiang Zhang
3e7aaf8fa1 User/xianz/telemetry (#2458)
* enable telemetry

* enable telemetry

* set enable telemetry as default

* for debugging

* remove log and set disable telemetry as default back

* delete private file while testing

* resolve comment: mainly add license header, rename macro and update docs

* rewording in privacy.md
2019-12-03 23:34:53 -08:00
stevenlix
293b15480b Add dynamic shape support in TensorRT execution provider (#2450)
* remove onnx-tensorrt submodule

* add new onnx-tensorrt submodule (experiment) for trt6

* update engine build for trt6

* update compile and compute for tensorrt6.0

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc

* switch to onnx-tensorrt master for TensorRT6'

* Update tensorrt_execution_provider.cc

* Handle dynamic batch size and add memcpy in TensorRT EP

* update test cases

* Update tensorrt_execution_provider.cc

* update onnx-tensorrt submodule

* Update Dockerfile.ubuntu_tensorrt

* Update Dockerfile.ubuntu_tensorrt

* Update run_dockerbuild.sh

* Update run_dockerbuild.sh

* Update install_ubuntu.sh

* Update concat_op_test.cc

* Update tensorrt_execution_provider.cc

* Upgrade TensorRT to version 6.0.1.5

* Update onnxruntime_providers.cmake

* Update CMakeLists.txt

* Update reduction_ops_test.cc

* Update install_ubuntu.sh

* Update Dockerfile.ubuntu_tensorrt

* Update Dockerfile.tensorrt

* Update BUILD.md

* Update run_dockerbuild.sh

* Update install_ubuntu.sh

* Update onnxruntime_providers.cmake

* Update install_ubuntu.sh

* Update install_ubuntu.sh

* Update gemm_test.cc

* Update gather_op_test.cc

* Update CMakeLists.txt

* Removed submodule

* update onnx-tensorrt submodule

* update header file

* Removed submodule

* add submodule onnx-tensorrt kevin's branch shape-test'

* add debugging code

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc

* merge master

* Removed submodule

* update onnx-tensorrt submodule

* add more changes for dynamic shapes

* Update tensorrt_execution_provider.cc

* update for dynamic shape

* update dynamic shape processing

* fix logger issue

* remove submodule onnx-tensorrt

* add submodule onnx-tensorrt

* add env variable min_subgraph_size

* remove redundancy

* update document

* use onnxruntime::make_unique

* fix multi-run issue

* remove some tests to save CI build time

* Add dynamic shape test

* Update TensorRT-ExecutionProvider.md

* Add example of running Faster R-CNN model on TensorRT EP

* Add more details on env variables

* update environment variables

* Update tensorrt_basic_test.cc

* Update model tests

* Update tensor_op_test.cc

* remove --use_full_protobuf

* Update build.py
2019-12-03 23:18:33 -08:00
Yulong Wang
d748f891d8
Revert "Disable thread pool creation when enabled OpenMP (#2485)" (#2535)
This reverts commit 7c7d5a149c.
2019-12-03 22:09:02 -08:00
Hariharan Seshadri
5c2e474751
Add provision in ORT for session options to be parsed when available via model file (#2449)
* Initial commit

* Fix gitmodules

* Nits

* Nits

* Updates

* Update

* More changes

* Updates

* Update

* Some updates

* More changes

* Update

* Update

* Merge

* Update

* Updates

* More changes

* Update

* Fix nits

* Updates

* Fix warning

* Fix build

* Add comment

* PR feedback

* PR feedback

* Updates

* Updates

* Update

* More changes

* Fix build break

* Comment test for now

* Updates

* Updates

* PR feedback

* Updates

* Nits

* Add tests

* Fix build

* Fix build

* Fix build

* Fix build break

* Fix build

* Nits

* PR feedback

* More change

* Expose GetSessionOptions in pybind logic and add unit test for python

* Fix build

* PR feedback

* PR feedback
2019-12-03 16:56:07 -08:00
shahasad
178d059111 Setup java ci (#2528) 2019-12-03 14:21:51 -08:00
Tianlei Wu
b50878dcf0 Disable Attention fusion tests when DISABLE_CONTRIB_OPS is defined (#2529) 2019-12-03 14:21:21 -08:00
Ashwini Khade
e32eff826c
enable nuget package testing on centos7 (#2527)
* add centos tests to linux cpu ci pipeline

* Disable failing test

* use centos6 instead of centos7

* change back to centos7

* add dotnet runtime dependency

* fix dotnet runtime dependencies

* install dotnet sdk instead of runtimes

* add more dotnet dependencies

* temporary skip failing test

* fix lib path

* reenable failing test
2019-12-03 10:16:45 -08:00
RandySheriffH
85a4ed8cf7
fix cuda kernel causing invalid mem access (#2523) 2019-12-03 09:16:00 -08:00
Tianlei Wu
66254eb25a
Update BERT model optimization python script (#2521)
Add support of GPT2 model optimization:
* Match subgraph of Gelu Approximation (using Tanh).
* Fuse LayerNormalization if SkipLayerNormalization is not ready.
* Output model even if embedding layer is not fused.
* Improve Reshape Fusion to improve coverage.
* Refine constant input checking, and output fused op counter.

Update script according to latest op improvements:
* Fusion of Add Bias and Gelu.
* Fuse SkipLayerNormalization and Add Bias.

Other:
* Add ReduceSum for mask as intermediate step.
* Refactor verbose setting.
2019-12-03 08:40:51 -08:00
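The "Gelu Approximation (using Tanh)" named in the commit above refers to the widely used tanh-based approximation of GELU. As a rough sketch of the math only (not the optimizer script's subgraph-matching code), the two forms can be compared in plain Python; the function names here are illustrative assumptions.

```python
import math


def gelu_exact(x: float) -> float:
    """GELU defined through the Gaussian error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh-based approximation of GELU, as used by GPT-2 style models."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


# Largest absolute gap between the two forms on a coarse grid over [-5, 5].
max_gap = max(abs(gelu_exact(i / 10.0) - gelu_tanh(i / 10.0))
              for i in range(-50, 51))
```

Because the two forms agree closely, a fusion pass can plausibly replace the tanh subgraph with a single fused Gelu node; whether the script treats the results as interchangeable is not stated in the commit message.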
Sreekanth Yalachigere
31ea11a696 Renaming MKL-DNN as DNNL (#2515)
* DNNL: Moving Files to rename file names

* DNNL name change

* azure pipeline updated

* disable ceil/dilation and enable Opset10

* disable ceil/dilation tests in Python

* mlperf_ssd_resnet34_1200 disabled
2019-12-03 07:34:23 -08:00
Changming Sun
3d627362a0
Upgrade Windows CPU CI pipeline to use VS 2019 (#2519) 2019-12-02 23:05:35 -08:00
Scott McKay
e8b327d657
Fix constant folding of node assigned to CUDA (#2510)
* Constant folding bug fix/improvements
  - Handle constant folding for node that is assigned to a non cpu EP
  - Check for errors in optimizer execution frame setup
  - Improve CUDA partitioning to look for initializers in parent graphs
  - Add unit test

Fixes #2474
2019-12-03 16:28:44 +10:00
Changming Sun
4354023913
Make link time optimization work on Linux (#2477) 2019-12-02 22:25:41 -08:00
baowenlei
25c260fdef Add parallel for tensorized gemm (#2517)
* add parallel for tensorize gemm

* add option to control parallel

* change to a cleaner way to control
2019-12-02 22:05:46 -08:00
KeDengMS
c1be615c45
[NupharEP] refine parallel schedule control (#2514)
* [NupharEP] Add parallel schedule to JIT function name
Update Nuphar docker to use Python 3.6 and ubuntu 18.04

* Update notebook

* Avoid JIT cache file name conflict
2019-12-02 17:40:51 -08:00
Zhang Lei
784eca0dcd
Cuda pad() for opset 11 (#2490)
* Cuda pad opset 11.

* Handle type conversion issue in building.
2019-12-02 16:28:17 -08:00
Jeff Bloomfield
b9faa0b6fd Fix kernel registry validation to reenable DML kernels 2019-12-02 15:43:44 -08:00
Scott McKay
ddaad86605
CUDA Loop (#2444)
* Implement CUDA Loop operator.

* Add control flow node implicit input handling to the memcpy transformer and allocation planner.
2019-12-03 08:29:21 +10:00
Zhang Lei
50eb140119
Cuda Resize Operator for opset 11. (#2484)
* Cuda Resize Operator for opset 11.
2019-12-02 13:42:21 -08:00
xavier dupré
c42148a0c3 Improves softmax function for standard ml 2019-12-02 10:48:46 -08:00
Dmitri Smirnov
ec88f6d8d6
Add DataFrameTool (#2456)
Add DataFrameTool to feed inputs from pandas DataFrame
2019-12-02 10:12:03 -08:00
Yulong Wang
89824b35e9
optimize CPU implementation of Attention (#2496) 2019-12-01 14:43:38 -08:00
Tianlei Wu
0f57e0a49e
Change mask input of EmbedLayerNormalization op to be optional (#2495)
Change mask input of EmbedLayerNormalization op to be optional
2019-12-01 08:36:06 -08:00
liuziyue
0edd4ef6ca
EmbedLayerNormalization fusion (#2452)
Embed Layer Normalization Fusion
2019-11-28 14:03:58 -08:00
KeDengMS
60208463a9
[NupharEP] Enable parallel schedule (#2505)
* [NupharEP] Enable parallel schedule
* Update TVM with the fix to TVM threadpool to use OpenMP if possible
* Add parallel schedule when trying to vectorize
With this change, BERT squad perf on a 4-core (8 HT) CPU goes from 187ms to 150ms

* Address CR, docs and cmake update

* Doc fix

* Fix mkl

* Fix TVM windows build when using mklml
2019-11-28 08:35:56 -08:00
Yufeng Li
005305be6e
Implement AddGelu and SkipLayerNorm (#2487)
* Implement AddGelu and SkipLayerNorm
2019-11-28 08:29:59 -08:00
Zhang Lei
ee0bde6b69 Enable three type of Equal() to version 11. (#2508) 2019-11-28 03:03:43 -08:00
Dmitri Smirnov
75b4747701
Fix a memleak in pybind. (#2503) 2019-11-27 15:32:05 -08:00
Scott McKay
1fdf1006ac
Various fixes coming out of discussions in #2436 (#2497)
- Add --skip_tests option to build.py based on github feedback
  - Add debug output at end of run_subprocess so it's clearer when the output is from a different process running
  - Add check for scipy as it's required by gen_test_models.py for the onnx tests
  - Use log.warning instead of warnings.warn for consistency. We use the logger almost everywhere and somewhat randomly used warnings.warn in two places.
  - Add check for 'wheel' dependency not being found in setup.py and handle more gracefully
  - Fix invalid input name in Keras tests
2019-11-28 07:03:23 +10:00
Zhang Lei
04b6097db4
Cuda Clip() for op set 11. (#2411)
* Cuda Clip() for op set 11.
* make min_val and max_value input CPU memory directly.
* Remove original cu file useless "#pragma once"
* merge duplicate logic into one class.
2019-11-27 12:42:45 -08:00