1808 commits

Author SHA1 Message Date
Saquib Nadeem Hashmi
daff4240f0 Updated README.md (#2910)
Corrected spelling mistake.
2020-01-27 13:37:22 -08:00
Yufeng Li
cd876720d9
Only fuse when output count of add is 1 (#2884)
* Only fuse when output count of add is 1

* add unit test for add with multi output
2020-01-24 13:47:34 -08:00
Scott McKay
a92e924ab2
Revert "Use IArenaAllocator::Reserve for initializers and mem pattern planner blocks (#2835)" (#2904)
This reverts commit 724ff0753b.
2020-01-24 14:02:30 +10:00
Changming Sun
e0c9cdaa73
Fix the nuget pipelines (#2901) 2020-01-23 20:02:18 -08:00
Tracy Sharpe
17b72d5578
Fix NCHWc BatchNormalization regression (#2903)
Fix the BatchNormalization optimization in the NCHWC optimizer. If the node has the optional training outputs specified, then skip the transform.
2020-01-23 18:54:11 -08:00
Jeff
ba336b5583
Disable DML EP on software adapter, fix float16 fallback bug, re-enable DML in CI (#2896)
* Re-enable DML in CI pipeline

* Fix bug with float16 fallback + fusion, and disallow DML EP with software adapter

* Address PR comments
2020-01-23 15:18:28 -08:00
Changming Sun
201b089a36
Fix some warnings on Windows (#2560)
1. Enable warning "4503" # Decorated name length exceeded.
2. Enable warning "4146" # unary minus operator applied to unsigned type.
3. Enable float64 support for the Softmax operator
4. Enable compliance checks for Windows x86 32bits build
5. Use TryBatchParallelFor to replace some fallback code in mlas pooling.cc
6. Fix Android CI pipeline.
2020-01-22 15:59:11 -08:00
Pranav Sharma
49725f896c
Disable openmp for the nocontribops pipeline. (#2888) 2020-01-22 12:07:44 -08:00
Scott McKay
fc51473b09
Update BFCArena logic to use backoff if cudaMalloc fails. Makes behaviour equivalent to when a CPU allocation fails. Add unit test. (#2748)
Clear error when throwing an exception for a failed CUDA call so that there is only one error mechanism being used at a time.
Minor improvements to logging to aid debugging of BFCArena behaviour.
2020-01-22 14:21:21 +10:00
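Note on fc51473b09: the backoff idea in this commit can be illustrated with a small sketch. This is not the BFCArena code. `ReserveWithBackoff`, the `try_alloc` callback (standing in for a call such as cudaMalloc), and the roughly 10% shrink step are all assumptions made for illustration. The sketch asks for a preferred size first and, on failure, retries with smaller sizes until it reaches the minimum the caller actually needs.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical helper, not part of onnxruntime.
// try_alloc(n) returns true if a region of n bytes could be reserved.
// Returns the number of bytes reserved, or 0 if even `required` bytes failed.
std::size_t ReserveWithBackoff(std::size_t preferred, std::size_t required,
                               const std::function<bool(std::size_t)>& try_alloc) {
  assert(required > 0);
  std::size_t bytes = preferred < required ? required : preferred;
  while (true) {
    if (try_alloc(bytes)) return bytes;
    if (bytes == required) return 0;  // nothing smaller would satisfy the caller
    // Shrink by about 10% (an assumed factor), but always make progress.
    std::size_t next = bytes - bytes / 10;
    if (next >= bytes) next = bytes - 1;
    // Never retry below what the caller needs.
    bytes = next < required ? required : next;
  }
}
```

With a device that can only supply 1000 bytes, a request that prefers 4096 bytes but needs 512 settles on some size between the two instead of failing outright.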
edgchen1
061f10fcd5 Fixed typo in ORT_RETURN_IF_NOT() message. (#2862) 2020-01-21 20:03:41 -08:00
Scott McKay
9f5e8c4ae8
InferenceSession::Run needs to call OnRunEnd for any EP that OnRunStart was called for so they can cleanup. Currently it only calls OnRunEnd if the Status is OK. Due to this the CUDA EP will throw during shutdown as the per-thread information has not been cleaned up prior to the CUDA library shutting down. (#2881)
Also update onnxruntime_perf_test to catch the exception from the call to Run and return a Status. Otherwise it exits with an 'unknown exception' error.
2020-01-22 12:17:52 +10:00
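Note on 9f5e8c4ae8: the rule stated in this commit (every provider whose OnRunStart was called must also get OnRunEnd, whatever the outcome) can be sketched as below. `FakeProvider` and `RunWithCleanup` are hypothetical stand-ins, not the onnxruntime classes, and a plain `bool` replaces the real Status type. Converting an exception from the run body into a failed result is likewise a simplification for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for an execution provider.
struct FakeProvider {
  bool start_ok = true;
  int starts = 0;
  int ends = 0;
  bool OnRunStart() { ++starts; return start_ok; }
  void OnRunEnd() { ++ends; }
};

// Calls OnRunStart on each provider, runs `body`, then calls OnRunEnd on every
// provider that was started, even when a start hook or the body failed.
bool RunWithCleanup(const std::vector<FakeProvider*>& providers,
                    const std::function<bool()>& body) {
  std::size_t started = 0;
  bool ok = true;
  for (FakeProvider* p : providers) {
    assert(p != nullptr);
    ++started;  // OnRunStart is about to be called, so OnRunEnd is owed
    if (!p->OnRunStart()) { ok = false; break; }
  }
  if (ok) {
    try {
      ok = body();
    } catch (...) {
      ok = false;  // report failure instead of skipping the cleanup below
    }
  }
  for (std::size_t i = 0; i < started; ++i) providers[i]->OnRunEnd();
  return ok;
}
```

The cleanup loop is driven by the count of started providers rather than by the success flag, which is the difference between the behaviour the commit describes as broken and the fixed one.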
RandySheriffH
38b34babe0
Rashuai/boost cuda TopK performance (#2826)
* Implement Bitonic and Radix TopK

* remove needless print out

* fix com err

* add negative support

* fix comments

Co-authored-by: Randy <45701928+RandyShuai@users.noreply.github.com>
2020-01-21 13:40:38 -08:00
Tracy Sharpe
08113b80cc
Optimize BatchNormalization to NCHWc Conv (#2855)
Update the NCHWc transformer to convert BatchNormalization ops to NCHWc convolutions when the input tensor is already in NCHWc.
2020-01-20 16:35:03 -08:00
Ashwini Khade
807a59c55d
Add calibration tool (#2845)
* add calibration tool

* add model for e2e example

* format readme

* some more formatting updates

* plus a few more updates

* plus review comments

* plus updates

* more updates
2020-01-20 14:49:35 -08:00
Xavier Dupré
22d9f3998e
Fix positive raw scores for TreeEnsembleClassifier (#2824)
2020-01-20 16:48:37 +01:00
Hariharan Seshadri
b21576eeb0 Support non-sequence tensor fed through as a python list (#2782)
* Support list feeds in Python
2020-01-20 09:45:10 +10:00
KeDengMS
f9f25ec047
Fix spurious component detection warning (#2857)
Use component detection template for all pipelines
2020-01-18 20:10:35 -08:00
Yufeng Li
25d7ad187f
Add float16 support back in the bert fusion script (#2870)
* Add float16 support back in the bert fusion script

* update readme
2020-01-17 20:00:39 -08:00
Yufeng Li
95f3eb6aeb
Bert fusion script for Tensorflow squad (#2858) 2020-01-17 15:27:04 -08:00
Tracy Sharpe
01f3a33c38
update protoc path to match protobuf version (#2865) 2020-01-17 14:48:39 -08:00
Changming Sun
e6f7658ade Update Windows GPU build to use cudnn 7.6 2020-01-17 12:23:13 -08:00
Pranav Sharma
3853ddf9c7
Fix topk type handling to accommodate more types. (#2842)
* Fix topk type handling to accommodate more types + add unit test for int64_t.

* Fix Linux build
2020-01-17 11:57:29 -08:00
Changming Sun
47e27ec9a1
Disable DML in Windows GPU CI build (#2856)
Disable DML in Windows GPU CI build for now, because there are some weird model test failures and I don't know how to fix them. Will seek help from WinML team.
2020-01-16 18:47:30 -08:00
Scott McKay
724ff0753b
Use IArenaAllocator::Reserve for initializers and mem pattern planner blocks (#2835)
* Use IArenaAllocator::Reserve for initializers and mem pattern planner blocks.
2020-01-17 07:41:48 +10:00
Tracy Sharpe
928b6bb210
MLAS: enable threading for quantized GEMMs (#2844) 2020-01-15 19:25:40 -08:00
Tianlei Wu
5db8543018
update optimization doc for BERT related fusions (#2819)
* Add bert related transformers to doc
* Add execution provider and comment for bert optimizations
* Add comment about accuracy impact of approximation
2020-01-15 16:01:11 -08:00
Changming Sun
56030f8d74 Fix Linux CUDA nuget packaging pipeline break 2020-01-14 21:13:41 -08:00
Tiago Koji Castro Shibata
cff266e1b9 Fix cgmanifest.json generating script (#2770)
* Fix protobuf submodule name

* Workaround pygit2 bug
2020-01-14 14:59:07 -08:00
Ori Levari
db05436fc0 User/orilevari/32bit comparison warning (#2800)
* use correct type for for loop

* explicitly specify void for the parameters of OrtGetApiBase. The function is declared in C, where an empty parameter list () means an unspecified number of parameters. This was causing compiler warning C4276.
2020-01-14 14:59:07 -08:00
Ashwini Khade
8643f3ebbb
add domain check for nodes + update documentation (#2831) 2020-01-14 11:15:50 -08:00
Dmitri Smirnov
aa37dea598
Convert ExternalProject Featurizers into git submodule (#2834)
Add git submodule for Featurizer library.
  Update cmake to build for git submodule.
2020-01-14 10:32:06 -08:00
Scott McKay
98cb41aa03
Ignore allocator type in ExecutionProviders allocator map. Make default initialization of OrtMemoryInfo more clearly invalid. (#2768)
* Remove allocator type from the key comparison in ExecutionProviders.
Remove usage of DummyArena as it's no longer necessary.

* Fix x86 tests where arena allocator is disabled.
Make initialization of OrtMemoryInfo clearer by adding Invalid enum value.

* Make OrtValueNameIdxMap::MaxIdx more intuitive.
2020-01-14 18:14:55 +10:00
Pranav Sharma
b308e826a8
Add support for int64_t for topk CPU. Fixes github issue #2806. (#2833) 2020-01-13 20:26:16 -08:00
Changming Sun
5c391854f4
Upgrade gtest to the latest version (#2827)
WinML would like to update the googletest submodule. They want some newer features (namely GTEST_SKIP to skip tests programmatically and be able to skip entire fixtures easily) and would need to update the submodule version.

However, the new version of the code hits a bug in gcc. The bug is already fixed in the latest gcc, but we use gcc 4.8.x, which will not be patched, so as a compromise we change our code slightly to make it work.

The gcc bug:  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51213
2020-01-13 20:16:48 -08:00
Dmitri Smirnov
120433c29d
Add OneHotEncoder and HashOneHotEncoder kernels. (#2830)
Add defs and implementation for OneHotEncoders, adjust date_time_transformer kernel and test.
  Add OneHotEncoder kernel test.
  Add HashOneHotVectorizerTransformer unit test.
  This does not link due to multiple definitions of functions
  that are included into header from a CPP file.
2020-01-13 17:58:33 -08:00
Qing
723cf83793 Update Ubuntu & TensorRT version in README (#2820)
Dockerfile.tensorrt uses nvcr.io/nvidia/tensorrt:19.09-py3 as the base image; update the Ubuntu and TensorRT versions according to
https://docs.nvidia.com/deeplearning/sdk/tensorrt-container-release-notes/rel_19-09.html#rel_19-09
2020-01-13 14:37:32 -08:00
Yingge WAN
07502ec14e Fix dnnl wheel package name (#2823)
* Append '-dnnl' to whl package name when --use_dnnl

* Update build.py
2020-01-13 14:37:11 -08:00
Ashwini Khade
7c6242b024
update default optimization level + fix gemm_activation fusion (#2791)
* update default optimization level + fix gemm_activation fusion

* fix typo

* add unit test and incorporate review comments

* fix test comment
2020-01-13 14:05:38 -08:00
Ashwini Khade
cc75e5a162
update quantization doc (#2783)
* update documentation for quantization script

* plus some spell corrections
2020-01-13 10:52:46 -08:00
Changming Sun
c4e4abce73
Run static code analyzer on most of our code (#2817) 2020-01-10 22:17:17 -08:00
Dmitri Smirnov
e37cdbed74 Add manifest missing comma 2020-01-10 16:02:19 -08:00
stevenlix
c4f6db7796 Fix memory leak in TRT (#2815)
* fix memory leak issue

* revert EP_FAIL on enqueueV2
2020-01-10 14:07:40 -08:00
Dmitri Smirnov
afa48b7e13
Add timeseries imputer transformer featurizer kernel (#2813)
Make kernels non-template. Add input constraint for learnt data.
  Fixup tests.
  Add two more featurizers along with tests. Tests fail.
  min_max_scalar_transformer
  robust_scalar_transformer
  Fix tests serialized stream by prepending version bytes.
  Add imputation_marker_transformer and the test.
  Fix up float/double type designations.
 Added label_encoder_transformer along with a test.
  string_throw case is broken at the moment.
  Fix labelencodertransfomer_test.cc string_throw case
  Rename maxabsscalertransformer_test.cc
  Add MissingDummiesTransformer along with the test.
  Update manifest.
  Add TimeSeriesImputerTransformer definition, implementation and tests
2020-01-10 13:27:51 -08:00
Changming Sun
48e042868f
Update test data (#2356) 2020-01-10 10:52:23 -08:00
George Wu
31200ed92c
speed up Windows TRT CI (#2811)
* don't run cuda tests if building with tensorrt

* remove unnecessary build options for win trt ci

* refactor win gpu tensorrt ci yml

* --numpy_version=1.17

* update

* update

* azcopy and cuda path
2020-01-10 08:40:40 -08:00
Ke Zhang
b0019ac7fe
add interface to copy batch tensors. (#2807)
* add interface to copy batch tensors.

* onnxruntime
2020-01-09 16:52:34 -08:00
Tracy Sharpe
7ef6570e27
MLAS: update SGEMM threading parameters (#2808) 2020-01-09 14:48:20 -08:00
Yufeng Li
71b5165ed3
Initialize max of softmax with lowest of float (#2786) 2020-01-09 13:48:18 -08:00
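Note on 71b5165ed3: the commit title does not say what the running maximum was seeded with before. The sketch below assumes the usual pitfall, where a seed that is not the lowest representable float (for example 0 or `std::numeric_limits<float>::min()`, which is a small positive number) yields a wrong maximum when every input is negative. `StableSoftmax` is a hypothetical helper, not the onnxruntime kernel.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: numerically stable softmax over one vector.
std::vector<float> StableSoftmax(const std::vector<float>& logits) {
  assert(!logits.empty());
  // lowest() is the most negative finite float, so any input can replace it.
  float max_value = std::numeric_limits<float>::lowest();
  for (float v : logits) {
    if (v > max_value) max_value = v;
  }
  std::vector<float> result(logits.size());
  float sum = 0.0f;
  for (std::size_t i = 0; i < logits.size(); ++i) {
    // Subtracting the true maximum keeps every exponent at or below zero.
    result[i] = std::exp(logits[i] - max_value);
    sum += result[i];
  }
  for (float& r : result) r /= sum;
  return result;
}
```

For inputs such as {-1000, -1001}, a seed of 0 would leave the maximum at 0, both exponentials would underflow to zero, and the division would produce NaN. Seeding with `lowest()` avoids that.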
Dmitri Smirnov
2c8179bee4
ML.NET team needs featurizers within a package (#2789)
Add AutoML featurizers to the Windows, macOS, and GPU packaging pipelines.
2020-01-09 10:54:12 -08:00
George Wu
1978376e1e
add session creation time cost. (#2798) 2020-01-08 11:17:48 -10:00