Commit graph

11997 commits

Author SHA1 Message Date
RandySheriffH
38b34babe0
Rashuai/boost cuda TopK performance (#2826)
* Implement Bitonic and Radix TopK

* remove needless print out

* fix com err

* add negative support

* fix comments

Co-authored-by: Randy <45701928+RandyShuai@users.noreply.github.com>
2020-01-21 13:40:38 -08:00
Tracy Sharpe
08113b80cc
Optimize BatchNormalization to NCHWc Conv (#2855)
Update the NCHWc transformer to convert BatchNormalization ops to NCHWc convolutions when the input tensor is already in NCHWc.
2020-01-20 16:35:03 -08:00
Ashwini Khade
807a59c55d
Add calibration tool (#2845)
* add calibration tool

* add model for e2e example

* format readme

* some more formatting updates

* plus a few more updates

* plus review comments

* plus updates

* more updates
2020-01-20 14:49:35 -08:00
Xavier Dupré
22d9f3998e
Fix positive raw scores for TreeEnsembleClassifier (#2824)
Fix positive raw scores for TreeEnsembleClassifier
2020-01-20 16:48:37 +01:00
Hariharan Seshadri
b21576eeb0 Support non-sequence tensor fed through as a python list (#2782)
* Support list feeds in Python
2020-01-20 09:45:10 +10:00
KeDengMS
f9f25ec047
Fix spurious component detection warning (#2857)
Fix spurious component detection warning
Use component detection template for all pipelines
2020-01-18 20:10:35 -08:00
Yufeng Li
25d7ad187f
Add float16 support back in the bert fusion script (#2870)
* Add float16 support back in the bert fusion script

* update readme
2020-01-17 20:00:39 -08:00
Yufeng Li
95f3eb6aeb
Bert fusion script for Tensorflow squad (#2858) 2020-01-17 15:27:04 -08:00
Tracy Sharpe
01f3a33c38
update protoc path to match protobuf version (#2865) 2020-01-17 14:48:39 -08:00
Changming Sun
e6f7658ade Update Windows GPU build to use cudnn 7.6 2020-01-17 12:23:13 -08:00
Pranav Sharma
3853ddf9c7
Fix topk type handling to accommodate more types. (#2842)
* Fix topk type handling to accommodate more types + add unit test for int64_t.

* Fix Linux build
2020-01-17 11:57:29 -08:00
Changming Sun
47e27ec9a1
Disable DML in Windows GPU CI build (#2856)
Disable DML in Windows GPU CI build for now because of some weird model test failures that I don't know how to fix. Will seek help from the WinML team.
2020-01-16 18:47:30 -08:00
Scott McKay
724ff0753b
Use IArenaAllocator::Reserve for initializers and mem pattern planner blocks (#2835)
* Use IArenaAllocator::Reserve for initializers and mem pattern planner blocks.
2020-01-17 07:41:48 +10:00
Tracy Sharpe
928b6bb210
MLAS: enable threading for quantized GEMMs (#2844) 2020-01-15 19:25:40 -08:00
Tianlei Wu
5db8543018
update optimization doc for BERT related fusions (#2819)
* Add bert related transformers to doc
* Add execution provider and comment for bert optimizations
* Add comment about accuracy impact of approximation
2020-01-15 16:01:11 -08:00
Changming Sun
56030f8d74 Fix Linux CUDA nuget packaging pipeline break 2020-01-14 21:13:41 -08:00
Tiago Koji Castro Shibata
cff266e1b9 Fix cgmanifest.json generating script (#2770)
* Fix protobuf submodule name

* Workaround pygit2 bug
2020-01-14 14:59:07 -08:00
Ori Levari
db05436fc0 User/orilevari/32bit comparison warning (#2800)
* use correct type for for loop

* explicitly specify void for the parameters of OrtGetApiBase. The function is defined in C, so a bare () is interpreted as taking an unknown number of parameters, which was causing compiler warning C4276.
2020-01-14 14:59:07 -08:00
Ashwini Khade
8643f3ebbb
add domain check for nodes + update documentation (#2831) 2020-01-14 11:15:50 -08:00
Dmitri Smirnov
aa37dea598
Convert ExternalProject Featurizers into git submodule (#2834)
Add git submodule for Featurizer library.
  Update cmake to build for git submodule.
2020-01-14 10:32:06 -08:00
Scott McKay
98cb41aa03
Ignore allocator type in ExecutionProviders allocator map. Make default initialization of OrtMemoryInfo more clearly invalid. (#2768)
* Remove allocator type from the key comparison in ExecutionProviders.
Remove usage of DummyArena as it's no longer necessary.

* Fix x86 tests where arena allocator is disabled.
Make initialization of OrtMemoryInfo clearer by adding Invalid enum value.

* Make OrtValueNameIdxMap::MaxIdx more intuitive.
2020-01-14 18:14:55 +10:00
Pranav Sharma
b308e826a8
Add support for int64_t for topk CPU. Fixes github issue #2806. (#2833) 2020-01-13 20:26:16 -08:00
Changming Sun
5c391854f4
Upgrade gtest to the latest version (#2827)
WinML would like to update the googletest submodule. They want some newer features (namely GTEST_SKIP to skip tests programmatically and be able to skip entire fixtures easily) and would need to update the submodule version.

However, the new version of the code hits a bug in gcc. The bug is already fixed in the latest gcc, but we are using gcc 4.8.x, which will not be patched, so as a compromise we change our code slightly to make it work.

The gcc bug:  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51213
2020-01-13 20:16:48 -08:00
Dmitri Smirnov
120433c29d
Add OneHotEncoder and HashOneHotEncoder kernels. (#2830)
Add defs and implementation for OneHotEncoders, adjust date_time_transformer kernel and test.
  Add OneHotEncoder kernel test.
  Add HashOneHotVectorizerTransformer unit test.
  This does not link due to multiple definitions of functions
  that are included into header from a CPP file.
2020-01-13 17:58:33 -08:00
Qing
723cf83793 Update Ubuntu & TensorRT version in README (#2820)
Dockerfile.tensorrt is using nvcr.io/nvidia/tensorrt:19.09-py3 as base Image, update Ubuntu and TensorRT version according to
https://docs.nvidia.com/deeplearning/sdk/tensorrt-container-release-notes/rel_19-09.html#rel_19-09
2020-01-13 14:37:32 -08:00
Yingge WAN
07502ec14e Fix dnnl wheel package name (#2823)
* Append '-dnnl' to whl package name when --use_dnnl

* Update build.py
2020-01-13 14:37:11 -08:00
Ashwini Khade
7c6242b024
update default optimization level + fix gemm_activation fusion (#2791)
* update default optimization level + fix gemm_activation fusion

* fix typo

* add unit test and incorporate review comments

* fix test comment
2020-01-13 14:05:38 -08:00
Ashwini Khade
cc75e5a162
update quantization doc (#2783)
* update documentation for quantization script

* plus some spell corrections
2020-01-13 10:52:46 -08:00
Changming Sun
c4e4abce73
Run static code analyzer on most of our code (#2817) 2020-01-10 22:17:17 -08:00
Dmitri Smirnov
e37cdbed74 Add manifest missing comma 2020-01-10 16:02:19 -08:00
stevenlix
c4f6db7796 Fix memory leak in TRT (#2815)
* fix memory leak issue

* revert EP_FAIL on enqueueV2
2020-01-10 14:07:40 -08:00
Dmitri Smirnov
afa48b7e13
Add timeseries imputer transformer featurizer kernel (#2813)
Make kernels non-template. Add input constraint for learnt data.
  Fixup tests.
  Add two more featurizers along with tests. Tests fail.
  min_max_scalar_transformer
  robust_scalar_transformer
  Fix tests serialized stream by prepending version bytes.
  Add inputation_marker_transfomer and the test.
  Fix up float/double type designations.
 Added label_encoder_transformer along with a test.
  string_throw case is broken at the moment.
  Fix labelencodertransfomer_test.cc string_throw case
  Rename maxabsscalertransformer_test.cc
  Add MissingDummiesTransformer along with the test.
  Update manifest.
  Add TimeSeriesImputerTransformer definition, implementation and tests
2020-01-10 13:27:51 -08:00
Changming Sun
48e042868f
Update test data (#2356) 2020-01-10 10:52:23 -08:00
George Wu
31200ed92c
speed up Windows TRT CI (#2811)
* don't run cuda tests if building with tensorrt

* remove unnecessary build options for win trt ci

* refactor win gpu tensorrt ci yml

* --numpy_version=1.17

* update

* update

* azcopy and cuda path
2020-01-10 08:40:40 -08:00
Ke Zhang
b0019ac7fe
add interface to copy batch tensors. (#2807)
* add interface to copy batch tensors.

* onnxruntime
2020-01-09 16:52:34 -08:00
Tracy Sharpe
7ef6570e27
MLAS: update SGEMM threading parameters (#2808) 2020-01-09 14:48:20 -08:00
Yufeng Li
71b5165ed3
Initialize max of softmax with lowest of float (#2786) 2020-01-09 13:48:18 -08:00
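A minimal sketch of why the starting value of the running maximum can matter, assuming the kernel previously started the maximum at 0 (the commit title does not say so explicitly). The function name `shifted_exp_sum` and the sample inputs are hypothetical and only illustrate the idea in plain Python floats, not the actual kernel.

```python
import math
import sys

def shifted_exp_sum(xs, init):
    """Sum of exp(x - m), where m is a running max that starts at `init`."""
    m = init
    for x in xs:
        if x > m:
            m = x
    return sum(math.exp(x - m) for x in xs)

xs = [-1000.0, -1001.0]

# Starting at 0 leaves m == 0 for all-negative input, so every term underflows to 0
# and the softmax denominator becomes 0.
sum_from_zero = shifted_exp_sum(xs, 0.0)

# Starting at the lowest finite float finds the true max (-1000), so the largest
# term is exp(0) == 1 and the denominator is at least 1.
sum_from_lowest = shifted_exp_sum(xs, -sys.float_info.max)
```

With a zero denominator the division that follows is undefined, which is the failure mode this sketch is meant to illustrate.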
Dmitri Smirnov
2c8179bee4
ML.NET team needs featurizers within a package (#2789)
Add auto ml featurizers to the Windows, MacOS, and GPU packaging pipelines.
2020-01-09 10:54:12 -08:00
George Wu
1978376e1e
add session creation time cost. (#2798) 2020-01-08 11:17:48 -10:00
Tianlei Wu
32c5e76a16
Improve bert optimization script: (#2712)
(1) Move input int64=>int32 conversion to embed layer fusion.
(2) Output epsilon attribute for LayerNormalization fusion.
2020-01-08 11:32:27 -08:00
Nathan
f84240db2b
add uint8 support to where op (#2792) 2020-01-08 09:59:42 -08:00
Hariharan Seshadri
ebfcad1c90
Add script for release Nuget validation (#2719)
* Initial commit

* Nits

* Disable a test temporarily

* Change working directory

* Test

* Add download python step

* Test update

* More changes

* Fix space issue

* Fix

* Verify nuget signing

* Fix

* Spaces

* PR feedback

* Nit

* Fix

* Fix

* Remove temporary changes
2020-01-08 18:42:22 +05:30
Andrews548
3e6f1836eb ACL EP convolution improvements (#2774)
Added the optimized implementation for depthwise convolution for both ACL v19.02 and ACL 19.05.
The pointwise convolution also seems to perform better in the CPU implementation, so we opted for that instead.
2020-01-07 06:42:03 -10:00
Andrews548
fdc0106f83 ACL EP GEMM improvements (#2780)
When possible, we use a fully connected layer instead of the gemm implementation.
This will let the library use the best implementation based on the input data.
2020-01-07 06:35:18 -10:00
Maher Jendoubi
f22bffe0f6 Contributing: Fix a typo (#2784) 2020-01-07 06:32:13 -10:00
Yufeng Li
72bdfc8cd4
Implement a more stable softmax (#2715)
* Implement a more stable SoftMax
e^x is represented as infinity if x is large enough, like 100.f. Infinity divided by infinity is NaN, so softmax returns NaN if one or more items are large enough.
The following identity is used to get a stable softmax:
e^xi / (e^x1 + ... + e^xn) = e^(xi - max) / (e^(x1 - max) + ... + e^(xn - max))

And for convenience, force max to 0.f if all xi are negative
2020-01-06 14:28:12 -08:00
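The max-subtraction identity in the commit message above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the kernel from the commit: the name `stable_softmax` is hypothetical, it works on Python floats rather than float32, and it always subtracts the true maximum instead of forcing max to 0.f for all-negative input.

```python
import math

def stable_softmax(xs):
    """Softmax computed as e^(xi - max) / (e^(x1 - max) + ... + e^(xn - max))."""
    m = max(xs)  # true maximum; the clamp-to-0 convenience is not reproduced here
    # Every exponent is <= 0, so exp() cannot overflow.
    exps = [math.exp(x - m) for x in xs]
    # The maximum element contributes exp(0) == 1, so the sum is at least 1.
    total = sum(exps)
    return [e / total for e in exps]

# Inputs this large would overflow a direct e^x evaluation.
probs = stable_softmax([1000.0, 1000.0])
```

Subtracting the maximum leaves the result mathematically unchanged because the factor e^(-max) cancels between numerator and denominator.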
Dmitri Smirnov
6f66260372
Import more featurizers (#2781)
Make kernels non-template. Add input constraint for learnt data.
  Add min_max_scalar_transformer, robust_scalar_transformer,
  inputation_marker_transfomer, label_encoder_transformer,
 missing_dummies_transformer along with tests.
 Advance Featurizers library commit.
2020-01-06 13:43:44 -08:00
Changming Sun
1b23118056 Fix nightly build version number issue 2020-01-06 11:16:44 -08:00
Changming Sun
e3f674b563 Disable featurizers in python packages 2020-01-06 11:16:44 -08:00
Changming Sun
7ace7a5bcd Pass BUILD_BUILDNUMBER to linux docker 2020-01-06 11:16:44 -08:00