Changming Sun
6e08efa6a2
Fix lto bug for protobuf and ubuntu
2019-12-09 17:34:06 -08:00
daquexian
62de8fa841
Update docs for Android NNAPI EP ( #2586 )
2019-12-09 14:37:03 -08:00
Hector Li
0ab54521f4
Temporarily exclude vgg19 test from Python backend test
...
1. Temporarily exclude the vgg19 test, which consumes too much memory and runs out of memory on the Upsquared device. The vgg19 test passes when run on its own; needs further investigation (#2588 )
2. Update docker file to decrease the docker image size
2019-12-09 12:25:46 -08:00
Ryan Hill
36eb1771ba
Update version ( #2584 )
2019-12-08 18:00:12 -08:00
liuziyue
200f4b4ea6
EmbedLayerNormalization Fusion Improvement ( #2553 )
...
Embedding layer norm fusion improvements - add more checks
2019-12-07 23:14:26 -08:00
KeDengMS
0f12346d76
[Nuphar EP] fixes for some object detection models ( #2581 )
...
Update notebook tutorial with multi-threaded int8 GEMM from #2517
2019-12-07 13:37:00 -08:00
Ryan Hill
cbc398bb75
Ryanunderhill/packagename test ( #2582 )
2019-12-07 12:08:46 -08:00
Ashwini Khade
c06dbd8311
Add ConvTranspose1D ( #2578 )
2019-12-07 08:50:02 -08:00
Mark
79847f39b3
Fix file not found error during docker build. ( #2569 )
2019-12-07 08:49:47 -08:00
Yufeng Li
5575766a53
Add more checks on SkipLayerNorm and BiasGelu fusion ( #2574 )
2019-12-06 15:36:02 -08:00
Changming Sun
262ee9dc5a
Fix a warning found in the latest VS release
2019-12-06 15:07:21 -08:00
Yufeng Li
34beafc51c
make layernorm fusion to support opset 11 ( #2545 )
2019-12-06 13:06:36 -08:00
shahasad
eeb28a80c0
setup java ci mac ( #2570 )
2019-12-06 11:43:40 -08:00
Tianlei Wu
038ee91da5
Allow sequence length to be symbolic ( #2559 )
2019-12-06 10:13:56 -08:00
George Wu
73c682b97c
disable onnx_test_runner -x invocations for dnnl ( #2568 )
2019-12-05 23:05:34 -08:00
Changming Sun
7eddac16c2
Re-enable Windows C# tests ( #2564 )
2019-12-05 21:22:31 -08:00
Ryan Hill
854362cf05
Update win-x86-ci.yml ( #2557 )
...
Fix build pipeline break
2019-12-05 18:44:12 -08:00
Changming Sun
ace132f9aa
Fix android build ( #2558 )
2019-12-05 15:03:22 -08:00
Sreekanth Yalachigere
4c996a8699
DNNL CMAKE update ( #2548 )
2019-12-05 13:48:57 -08:00
Hariharan Seshadri
53a6bc2f07
Fix a bug handling negative begin pad values in Pad op ( #2550 )
...
* Fix bug in Pad op
* Update
2019-12-05 11:29:45 -08:00
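For context on the Pad fix above: in the ONNX Pad operator a negative pad value removes elements from that side of the axis instead of adding them. The sketch below illustrates that semantics with NumPy. It is not the ONNX Runtime kernel, and `pad_or_crop` is a hypothetical helper name used only for illustration.

```python
import numpy as np

def pad_or_crop(x, pads, value=0.0):
    """Constant-mode Pad where negative entries crop (hypothetical helper).

    pads uses the ONNX layout: [begin_0, ..., begin_{n-1}, end_0, ..., end_{n-1}].
    """
    n = x.ndim
    begins, ends = pads[:n], pads[n:]
    # Step 1: negative entries remove elements from the corresponding side.
    slices = tuple(
        slice(max(-b, 0), x.shape[i] - max(-e, 0))
        for i, (b, e) in enumerate(zip(begins, ends))
    )
    cropped = x[slices]
    # Step 2: positive entries add constant-valued elements.
    widths = [(max(b, 0), max(e, 0)) for b, e in zip(begins, ends)]
    return np.pad(cropped, widths, mode="constant", constant_values=value)

x = np.arange(6.0).reshape(2, 3)
# Drop the first column (begin pad of -1 on axis 1), add one column at the end.
y = pad_or_crop(x, [0, -1, 0, 1])
```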
Changming Sun
bec4abf074
Add back executable bit to build.py
2019-12-04 21:22:02 -08:00
Ashwini Khade
281933fa1c
Fix C API tests for centos and mac ( #2544 )
...
* change c++14 to c++11
* add ld lib path for centos
* enable csharp tests on macos
* fix C API test on MacOS + fix manylinux dotnet install
* fix manylinux dotnet install
* fix lib link
2019-12-04 18:01:35 -08:00
Dmitri Smirnov
d34fb62012
Introduce container type runtime checks and other improvements ( #2522 )
...
Rework TensorSeq in a manner consistent with Tensor and SparseTensor
in terms of type system setup.
Reduce templating. Introduce helpers to ensure the same
data type.
Make OrtValue __dtor not virtual.
Introduce ContainerChecker
2019-12-04 16:04:17 -08:00
Yulong Wang
be56d77a66
Fix integer overflow in cuda NonMaxSuppression implementation ( #2540 )
...
* add test case that should pass but fail
* fix nms
* extract int_max_output_boxes_per_class
2019-12-04 13:27:04 -08:00
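As an illustration of the kind of overflow the NonMaxSuppression fix above addresses: a 64-bit count narrowed to 32 bits without a range check silently changes value. The sketch below shows the effect with NumPy and one possible guard. It is an assumption about the general failure mode, not the actual CUDA change, and `clamp_to_int32` is a hypothetical name.

```python
import numpy as np

INT32_MAX = int(np.iinfo(np.int32).max)

def clamp_to_int32(value):
    """Clamp a non-negative 64-bit count into the int32 range (hypothetical guard)."""
    return min(max(int(value), 0), INT32_MAX)

# A max_output_boxes_per_class value that does not fit in 32 bits.
big = 2**31 + 5

# Narrowing an int64 array to int32 does not raise; the value is altered.
wrapped = int(np.array([big], dtype=np.int64).astype(np.int32)[0])

# Clamping first keeps the value meaningful as an upper bound.
safe = clamp_to_int32(big)
```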
Xiang Zhang
3e7aaf8fa1
User/xianz/telemetry ( #2458 )
...
* enable telemetry
* enable telemetry
* set enable telemetry as default
* for debugging
* remove log and set disable telemetry as default back
* delete private file while testing
* resolve comment: mainly add license header, rename macro and update docs
* rewording in privacy.md
2019-12-03 23:34:53 -08:00
stevenlix
293b15480b
Add dynamic shape support in TensorRT execution provider ( #2450 )
...
* remove onnx-tensorrt submodule
* add new onnx-tensorrt submodule (experiment) for trt6
* update engine build for trt6
* update compile and compute for tensorrt6.0
* Update tensorrt_execution_provider.cc
* Update tensorrt_execution_provider.cc
* Update tensorrt_execution_provider.cc
* Update tensorrt_execution_provider.cc
* switch to onnx-tensorrt master for TensorRT6
* Update tensorrt_execution_provider.cc
* Handle dynamic batch size and add memcpy in TensorRT EP
* update test cases
* Update tensorrt_execution_provider.cc
* update onnx-tensorrt submodule
* Update Dockerfile.ubuntu_tensorrt
* Update Dockerfile.ubuntu_tensorrt
* Update run_dockerbuild.sh
* Update run_dockerbuild.sh
* Update install_ubuntu.sh
* Update concat_op_test.cc
* Update tensorrt_execution_provider.cc
* Upgrade TensorRT to version 6.0.1.5
* Update onnxruntime_providers.cmake
* Update CMakeLists.txt
* Update reduction_ops_test.cc
* Update install_ubuntu.sh
* Update Dockerfile.ubuntu_tensorrt
* Update Dockerfile.tensorrt
* Update BUILD.md
* Update run_dockerbuild.sh
* Update install_ubuntu.sh
* Update onnxruntime_providers.cmake
* Update install_ubuntu.sh
* Update install_ubuntu.sh
* Update gemm_test.cc
* Update gather_op_test.cc
* Update CMakeLists.txt
* Removed submodule
* update onnx-tensorrt submodule
* update header file
* Removed submodule
* add submodule onnx-tensorrt kevin's branch shape-test
* add debugging code
* Update tensorrt_execution_provider.cc
* Update tensorrt_execution_provider.cc
* merge master
* Removed submodule
* update onnx-tensorrt submodule
* add more changes for dynamic shapes
* Update tensorrt_execution_provider.cc
* update for dynamic shape
* update dynamic shape processing
* fix logger issue
* remove submodule onnx-tensorrt
* add submodule onnx-tensorrt
* add env variable min_subgraph_size
* remove redundancy
* update document
* use onnxruntime::make_unique
* fix multi-run issue
* remove some tests to save CI build time
* Add dynamic shape test
* Update TensorRT-ExecutionProvider.md
* Add example of running Faster R-CNN model on TensorRT EP
* Add more details on env variables
* update environment variables
* Update tensorrt_basic_test.cc
* Update model tests
* Update tensor_op_test.cc
* remove --use_full_protobuf
* Update build.py
2019-12-03 23:18:33 -08:00
Yulong Wang
d748f891d8
Revert "Disable thread pool creation when enabled OpenMP ( #2485 )" ( #2535 )
...
This reverts commit 7c7d5a149c .
2019-12-03 22:09:02 -08:00
Hariharan Seshadri
5c2e474751
Add provision in ORT for session options to be parsed when available via model file ( #2449 )
...
* Initial commit
* Fix gitmodules
* Nits
* Nits
* Updates
* Update
* More changes
* Updates
* Update
* Some updates
* More changes
* Update
* Update
* Merge
* Update
* Updates
* More changes
* Update
* Fix nits
* Updates
* Fix warning
* Fix build
* Add comment
* PR feedback
* PR feedback
* Updates
* Updates
* Update
* More changes
* Fix build break
* Comment test for now
* Updates
* Updates
* PR feedback
* Updates
* Nits
* Add tests
* Fix build
* Fix build
* Fix build
* Fix build break
* Fix build
* Nits
* PR feedback
* More change
* Expose GetSessionOptions in pybind logic and add unit test for python
* Fix build
* PR feedback
* PR feedback
2019-12-03 16:56:07 -08:00
shahasad
178d059111
Setup java ci ( #2528 )
2019-12-03 14:21:51 -08:00
Tianlei Wu
b50878dcf0
Disable Attention fusion tests when DISABLE_CONTRIB_OPS is defined ( #2529 )
2019-12-03 14:21:21 -08:00
Ashwini Khade
e32eff826c
enable nuget package testing on centos7 ( #2527 )
...
* add centos tests to linux cpu ci pipeline
* Disable failing test
* use centos6 instead of centos7
* change back to centos7
* add dotnet runtime dependency
* fix dotnet runtime dependencies
* install dotnet sdk instead of runtimes
* add more dotnet dependencies
* temporary skip failing test
* fix lib path
* reenable failing test
2019-12-03 10:16:45 -08:00
RandySheriffH
85a4ed8cf7
fix cuda kernel causing invalid mem access ( #2523 )
2019-12-03 09:16:00 -08:00
Tianlei Wu
66254eb25a
Update BERT model optimization python script ( #2521 )
...
Add support of GPT2 model optimization:
* Match subgraph of Gelu Approximation (using Tanh).
* Fuse LayerNormalization if SkipLayerNormalization is not ready.
* Output model even if embedding layer is not fused.
* Improve Reshape Fusion to improve coverage.
* Refine constant input checking, and output fused op counter.
Update script according to latest op improvements:
* Fusion of Add Bias and Gelu.
* Fuse SkipLayerNormalization and Add Bias.
Other:
* Add ReduceSum for mask as intermediate step.
* Refactor verbose setting.
2019-12-03 08:40:51 -08:00
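The "Gelu Approximation (using Tanh)" mentioned in the commit above refers to the widely used tanh form of GELU. The sketch below compares it with the exact erf form. It only illustrates the two formulas and says nothing about how the optimization script matches the subgraph. The function names are chosen for illustration.

```python
import math

def gelu_exact(x):
    """GELU via the Gaussian CDF: x * Phi(x)."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x):
    """Tanh approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))

# Largest absolute gap between the two forms on a grid over [-5, 5].
grid = [i / 100.0 for i in range(-500, 501)]
max_gap = max(abs(gelu_exact(v) - gelu_tanh(v)) for v in grid)
```

The two forms agree closely over this range, which is why an optimizer can treat the tanh subgraph as a Gelu candidate when a small numerical difference is acceptable.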
Sreekanth Yalachigere
31ea11a696
Renaming MKL-DNN as DNNL ( #2515 )
...
* DNNL: Moving Files to rename file names
* DNNL name change
* azure pipeline updated
* disable ceil/dilation and enable Opset10
* disable ceil/dilation tests in Python
* mlperf_ssd_resnet34_1200 disabled
2019-12-03 07:34:23 -08:00
Changming Sun
3d627362a0
Upgrade Windows CPU CI pipeline to use VS 2019 ( #2519 )
2019-12-02 23:05:35 -08:00
Scott McKay
e8b327d657
Fix constant folding of node assigned to CUDA ( #2510 )
...
* Constant folding bug fix/improvements
- Handle constant folding for node that is assigned to a non cpu EP
- Check for errors in optimizer execution frame setup
- Improve CUDA partitioning to look for initializers in parent graphs
- Add unit test
Fixes #2474
2019-12-03 16:28:44 +10:00
Changming Sun
4354023913
Make link time optimization work on Linux ( #2477 )
2019-12-02 22:25:41 -08:00
baowenlei
25c260fdef
Add parallel for tensorized gemm ( #2517 )
...
* add parallel for tensorize gemm
* add option to control parallel
* change to a cleaner way to control
2019-12-02 22:05:46 -08:00
KeDengMS
c1be615c45
[NupharEP] refine parallel schedule control ( #2514 )
...
* [NupharEP] Add parallel schedule to JIT function name
Update Nuphar docker to use Python 3.6 and ubuntu 18.04
* Update notebook
* Avoid JIT cache file name conflict
2019-12-02 17:40:51 -08:00
Zhang Lei
784eca0dcd
Cuda pad() for opset 11 ( #2490 )
...
* Cuda pad opset 11.
* Handle type conversion issue in building.
2019-12-02 16:28:17 -08:00
Jeff Bloomfield
b9faa0b6fd
Fix kernel registry validation to reenable DML kernels
2019-12-02 15:43:44 -08:00
Scott McKay
ddaad86605
CUDA Loop ( #2444 )
...
* Implement CUDA Loop operator.
* Add control flow node implicit input handling to the memcpy transformer and allocation planner.
2019-12-03 08:29:21 +10:00
Zhang Lei
50eb140119
Cuda Resize Operator for opset 11. ( #2484 )
...
* Cuda Resize Operator for opset 11.
2019-12-02 13:42:21 -08:00
xavier dupré
c42148a0c3
Improves softmax function for standard ml
2019-12-02 10:48:46 -08:00
Dmitri Smirnov
ec88f6d8d6
Add DataFrameTool ( #2456 )
...
Add DataFrameTool to feed inputs from a pandas DataFrame
2019-12-02 10:12:03 -08:00
Yulong Wang
89824b35e9
optimize CPU implementation of Attention ( #2496 )
2019-12-01 14:43:38 -08:00
Tianlei Wu
0f57e0a49e
Change mask input of EmbedLayerNormalization op to be optional ( #2495 )
...
Change mask input of EmbedLayerNormalization op to be optional
2019-12-01 08:36:06 -08:00
liuziyue
0edd4ef6ca
EmbedLayerNormalization fusion ( #2452 )
...
Embed Layer Normalization Fusion
2019-11-28 14:03:58 -08:00
KeDengMS
60208463a9
[NupharEP] Enable parallel schedule ( #2505 )
...
* [NupharEP] Enable parallel schedule
* Update TVM with the fix to TVM threadpool to use OpenMP if possible
* Add parallel schedule when trying to vectorize
With this change, BERT squad perf on a 4-core (8 HT) CPU goes from 187ms to 150ms
* Address CR, docs and cmake update
* Doc fix
* Fix mkl
* Fix TVM windows build when using mklml
2019-11-28 08:35:56 -08:00
Yufeng Li
005305be6e
Implement AddGelu and SkipLayerNorm ( #2487 )
...
* Implement AddGelu and SkipLayerNorm
2019-11-28 08:29:59 -08:00