3779 commits

Author SHA1 Message Date
satyajandhyala
b495ae8103
ORT fuzz testing (#5771)
* Added fuzz testing using ORT model.

* The onnxruntime_security_fuzz driver code should accept either ONNX or ORT (based on the file extension) input file if /f flag is provided.

*  Added ValidateOrtFormatModelDoesNotRunOptimizersInFullBuild test.

* Added win-ci-fuzz-testing.yml to run build pipeline.

* Prevent out-of-range access in the graph.cpp.
2020-11-18 16:07:36 -08:00
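
The extension-based input selection described in the commit above can be sketched as follows. This is an illustration only, not the fuzz driver's code: the function name `model_format_from_path` and the case-insensitive comparison are assumptions.

```python
import os


def model_format_from_path(path):
    """Return "ONNX" or "ORT" depending on the file extension of `path`.

    Hypothetical helper: the real driver's dispatch logic is not shown in the log.
    """
    # Assumption: extensions are compared case-insensitively.
    extension = os.path.splitext(path)[1].lower()
    if extension == ".onnx":
        return "ONNX"
    if extension == ".ort":
        return "ORT"
    raise ValueError(f"unsupported model file extension: {extension!r}")
```

Rejecting unknown extensions up front keeps a fuzzing run from silently parsing a file with the wrong loader.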
Sheil Kumar
84c1340f9b
Refactor implementation of Tensor<T> and underlying buffer stores to improve binary size and maintainability (#5836)
* refactor tensor buffers to make cleaner

* refactor to make tensor backing buffer implementation smaller and cleaner

* missed virtual on destructor

* remove unnecessary static_pointer_cast

* add string vector accessor

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2020-11-18 14:56:47 -08:00
Changming Sun
85f945a875
Regenerate CI build docker images (#5850) 2020-11-18 14:36:59 -08:00
Yufeng Li
6f86c4dbe3
Quantize LSTM (#5595)
Quantize LSTM:
1. Dynamically quantizes the MatMul inside the LSTM. It doesn't quantize the activation function.
2. Supports per-channel quantization of the input weight and recurrent weight.
2020-11-18 11:21:49 -08:00
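
As a rough illustration of what per-channel weight quantization means in the commit above, the sketch below gives each column of a 2-D weight matrix its own scale. It assumes symmetric signed 8-bit quantization and treats columns as channels; both are assumptions for the example, and the names `quantize_per_channel` and `dequantize_per_channel` are hypothetical, not the quantization tool's API.

```python
def quantize_per_channel(weight, num_bits=8):
    """Symmetric per-channel quantization of a 2-D weight (list of rows).

    Assumption: each column is one channel and gets its own scale.
    Returns (quantized_rows, scales).
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    scales = []
    for column in zip(*weight):
        max_abs = max(abs(value) for value in column)
        # An all-zero channel gets scale 1.0 to avoid division by zero.
        scales.append(max_abs / qmax if max_abs > 0 else 1.0)
    quantized = [
        [int(round(value / scale)) for value, scale in zip(row, scales)]
        for row in weight
    ]
    return quantized, scales


def dequantize_per_channel(quantized, scales):
    """Map quantized integers back to approximate float values."""
    return [[q * scale for q, scale in zip(row, scales)] for row in quantized]
```

With one scale per channel, a channel with small weights is not forced onto the coarse grid that a single per-tensor scale would impose on it.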
Pranav Sharma
c2a993e745
Add documentation for OrtArenaCfg for CreateAndRegisterAllocator API. (#5831)
* Add documentation for OrtArenaCfg for CreateAndRegisterAllocator API.

* Address PR comments

* More comments
2020-11-18 10:21:20 -08:00
Scott McKay
b3a6ed14d4
Prevent saving a model containing fused nodes as we don't have any way to save the compiled kernels so the saved model will be invalid. (#5840) 2020-11-18 16:17:07 +10:00
Peichen Xie
e8c0f5d0ff
Update the quantization script to support GEMM (transB==1) (#5432)
* Modify onnx_quantizer.py

* Fix topology order issues

* Handle more cases
2020-11-17 21:24:48 -08:00
Tracy Sharpe
f964bb94ba
Add QLinearConv NHWC transformer (#5824)
The implementation of QLinearConv internally does a transpose(NHWC)->im2col+GEMM->transpose(NCHW). This adds a graph transformer to change a model to use a com.microsoft.QLinearConv that supports NHWC natively to avoid unnecessary transposes.
2020-11-17 20:51:02 -08:00
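
Why the transposes in the commit above are avoidable can be seen from the axis permutations alone: NCHW to NHWC and NHWC to NCHW are inverses, so a transpose back to NCHW followed immediately by a transpose to NHWC is a no-op. The sketch below checks this on shapes and permutations only. It is not the graph transformer's code, and the helper names are hypothetical.

```python
NCHW_TO_NHWC = (0, 2, 3, 1)
NHWC_TO_NCHW = (0, 3, 1, 2)


def permute_shape(shape, perm):
    """Shape after a transpose with `perm`: output axis i takes input axis perm[i]."""
    return tuple(shape[axis] for axis in perm)


def compose(first, second):
    """Single permutation equivalent to transposing with `first`, then `second`."""
    return tuple(first[axis] for axis in second)


def is_identity(perm):
    """True when the permutation leaves every axis in place."""
    return perm == tuple(range(len(perm)))
```

For example, `compose(NHWC_TO_NCHW, NCHW_TO_NHWC)` is the identity permutation, which is the pattern that arises between two back-to-back convolutions when each one transposes on entry and exit.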
RandySheriffH
e814c9307a
Boost Expand cpu operator by multi-threading (#5739)
* implement multi-threading expand on cpu

* format code

* move expand op

* add test case

* format code

* optimize code

* fix comments

* handle empty tensor

* sync with master

* add ParallelSection

* add threshold for multi-threading

Co-authored-by: RandySheriffH <rashuai@microsoft.com>
2020-11-17 20:27:24 -08:00
Edward Chen
71e7c2b423
Cache build docker images in container registry. (#5811)
This PR adds infrastructure to automatically cache docker images used in CI builds in a container registry.

Currently, build images are pulled from a container registry for some builds and built every time for others. The container registry requires maintenance to keep the images up to date and building images every time wastes build agent resources.

With this change, a given build image is looked up in a cache container registry: if it is present, it is pulled; otherwise, it is built and pushed. The uniqueness of a build image is determined by a hash digest of the dockerfile, docker build context directory, and certain "docker build" options. This digest is part of the image tag in the cache container repository.

The cache container registry will need to be cleaned up periodically. This is not automated yet.
2020-11-17 17:02:24 -08:00
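
The cache key described in the commit above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's implementation: SHA-256, the traversal order, the separator bytes, and the names `build_image_digest` and `cache_image_tag` are all choices made for the example.

```python
import hashlib
import os


def build_image_digest(dockerfile_path, context_dir, build_options):
    """Hash the dockerfile, every file in the build context, and the build options."""
    hasher = hashlib.sha256()  # assumption: the commit does not name the hash function
    with open(dockerfile_path, "rb") as f:
        hasher.update(f.read())
    for root, dirs, files in os.walk(context_dir):
        dirs.sort()  # sort in place so the traversal order is deterministic
        for name in sorted(files):
            path = os.path.join(root, name)
            # Include the relative path so renaming a file changes the digest.
            hasher.update(os.path.relpath(path, context_dir).encode("utf-8"))
            hasher.update(b"\0")
            with open(path, "rb") as f:
                hasher.update(f.read())
    for option in build_options:
        hasher.update(b"\0")
        hasher.update(option.encode("utf-8"))
    return hasher.hexdigest()


def cache_image_tag(repository, digest):
    """Image reference whose tag embeds the digest (hypothetical naming scheme)."""
    return f"{repository}:{digest}"
```

A build step would compute the tag, try to pull it from the cache registry, and only build and push when the pull fails. Because any change to the inputs changes the tag, stale images are never reused, which is also why old tags accumulate and need periodic cleanup.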
Guoyu Wang
252dbf1182
fix build break (#5835) 2020-11-17 16:08:10 -08:00
Du Li
3b5ba1cf7e
Parallelizing Resize op (#5792)
* adding parallelization for resize bi-linear mode.

* Adding parallelization for resize op.

* Use TrySimpleParallelFor instead of TryParallelFor.
TryParallelFor has an unaddressed issue with its cost model.

* Addressing PR comments.
2020-11-17 13:59:18 -08:00
Justin Stoecker
bd236ecc26
Switch to unified DirectML 1.4.0 redistributable (#5794)
Transitions from the ORT-only DML NuGet (hosted on the onnxruntime_public feed) to the new unified DirectML NuGet (Microsoft.AI.DirectML) on nuget.org. In addition, the Microsoft.AI.MachineLearning (WinML) and Microsoft.ML.OnnxRuntime.DirectML packages now take a dependency on the Microsoft.AI.DirectML package. This means we can remove the extra copy of DML binaries in these packages since they will be installed by the DML package.
2020-11-17 13:42:23 -08:00
Scott McKay
c84bc25e28
Add validation of op registrations (#5817)
* Add validation of operator registrations to the reduction script
  - the script has all the logic to process the registrations, and there's a CI that uses it

Fix some operator registrations

* Fix CUDA PRelu registration

* Refactor to split out kernel registration file parsing and use in the exclude ops script and an op registration validation script.
Run op validation in minimal build CI

* Fix PEP8 error and some comments
2020-11-17 10:44:09 -08:00
Sherlock
241b2226a7
Update orttraining-linux-gpu-ci-pipeline.yml for Azure Pipelines (#5826) 2020-11-17 09:27:59 -08:00
Guoyu Wang
1a66dfc0f9
Enable Squeeze Opset 13 for NNAPI (#5717)
* Add copy sparse model in minimal CI

* Add squeeze 13 support

* fix small typo

* Add ut for squeeze in NNAPI

* Fix some issue in the UT and code

* Modify based on the master change

* Fix build break
2020-11-17 00:26:06 -08:00
Scott McKay
7b76b57fc8
Support EPs that compile nodes in a minimal build. (#5776)
* Support EPs that compile nodes in a minimal build. This enables NNAPI being used.
2020-11-17 13:52:22 +10:00
Tiago Koji Castro Shibata
794e8479eb
Revert #5805 (#5823)
* Fix race condition in msbuild

* Revert "Named Dimension Override internals test and experimental API (#5805)"

This reverts commit 157d1844fb.
2020-11-16 17:05:28 -08:00
Scott McKay
a3f3a63206
Move OpenVINO specific validation function to somewhere more sensible, and rename to provide context on its usage. (#5822) 2020-11-17 10:58:43 +10:00
Dwayne Robinson
732ffd12d2
DirectML Execution Provider integration 2020-11-13 (#5809)
* Merged PR 5253310: Fix 0-sized dimension broadcasting

Tensors that contain 0-sized dimensions were being broadcast to higher dimensions, which made it impossible to remove them from the graph. 0-sized dimensions represent empty tensors, so whatever operator needs to broadcast one shouldn't try to call into DML.

* Merged PR 5334334: Fix asserts and failure in GraphKernelHelper.cpp

This extends a workaround needed to match node inputs with Tensors to the EP code handling constant input upload.

This was causing issues in a couple of models, including EfficientDet, although that model still fails due to this bug:
https://microsoft.visualstudio.com/OS/_workitems/edit/29970551

Related work items: #29706035

* Merged PR 5344477: Disable GPU timeouts in DML EP command queue creation

GPU timeouts have already been disabled in command queues created by Winml, but not the ones created by the DML EP within the ORT API

* Merged PR 5380534: BatchNormalization failure in autopilot - fix output size

New validation [here](https://microsoft.visualstudio.com/DefaultCollection/WindowsAI/_git/WindowsAI/pullrequest/5354070?_a=files&path=%2Fdml%2FSharedValidation%2FDmlBatchNormalizationOperatorValidator.h) causes some BatchNorm cases to fail (e.g. OnnxConformanceTestsTaef::BatchNormalization (BatchNormalization_2x2x2)). I'm unsure how long this bug existed, but based on Nick's investigation, it apparently still worked anyway.

Related work items: #27678610

* Merged PR 5386132: Update 8D BatchNorm

Update 8D BatchNorm

Related work items: #27678610

* Merged PR 5390213: Tile allow 0 in repeats

0 is valid in Tile in "repeats" parameter. The CPU kernel handles it fine. So should the DML EP.

Related work items: #29970551

Co-authored-by: Justin Stoecker
Co-authored-by: Jeff Bloomfield
Co-authored-by: Patrice Vignola
Co-authored-by: Nick Feeney
2020-11-16 15:29:08 -08:00
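
Two of the fixes in the commit above concern empty tensors: a shape with a 0-sized dimension has no elements, and a 0 in the Tile "repeats" input produces such a shape. The sketch below shows only the shape arithmetic, following the ONNX Tile rule that each output dimension is the input dimension times its repeat count. The function names are hypothetical and this is not the DML EP's code.

```python
def tile_output_shape(input_shape, repeats):
    """Output shape of Tile: each dimension is multiplied by its repeat count."""
    if len(input_shape) != len(repeats):
        raise ValueError("repeats must have one entry per input dimension")
    if any(repeat < 0 for repeat in repeats):
        raise ValueError("repeats must be non-negative")
    return tuple(dim * repeat for dim, repeat in zip(input_shape, repeats))


def is_empty_shape(shape):
    """A tensor is empty when any of its dimensions is 0."""
    return any(dim == 0 for dim in shape)
```

An execution provider can use a check like `is_empty_shape` to return an empty output directly instead of dispatching work to the device.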
Guoyu Wang
339348bc46
Fix bug in resize IsOpSupported, and add nearest neighbor resize support (#5810) 2020-11-16 14:27:50 -08:00
Changming Sun
833432d7d1
Update mysql-connector-java (#5802) 2020-11-16 14:09:14 -08:00
Dmitri Smirnov
2a6c73cf8c
Address publishing pipelines failures. (#5806)
* Address pipelines failures.

* Address one more fp16 model failure.
2020-11-16 10:19:19 -08:00
Sheil Kumar
671fa60327
Enable direct tensorization and detensorization to many buffers in WinML (#5791)
* switch to work PC

* back with iterable of buffers

* add raw api tests

* tensorization

* last test

* all tests pass!

* small cleanup

* whitespace

* newline

* whitespace

* refactor common code into DisjointBufferHelpers

* remove unused file

* warning

* skip gpu tests when hardware not available

* Add error condition when createreference is invoked

* add null check to createreference

* uncomment out check

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2020-11-16 10:06:22 -08:00
RandySheriffH
20ae1ea21f
Remerge custom gpu op (#5818)
* add case for cpu custom op on gpu

* format doc

* restrict GPU custom op on Linux GPU CI only

* separate cu file to a independent project

* fix typo

* include cuda_add lib

* move lib def

* add file header

Co-authored-by: RandySheriffH <rashuai@microsoft.com>
2020-11-16 09:27:46 -08:00
Ryan Lai
e40df385ba
Skipping even more x86 tests (#5799) 2020-11-15 20:52:26 -08:00
zhijxu
89e5b3a24f resolve review comments 2020-11-16 11:23:01 +08:00
zhijxu
89902c2519 fix frontend bug.
An old ORT session may already exist when a new ORT session is created, which may cause an OOM error.
2020-11-16 11:23:01 +08:00
Guoyu Wang
c4818d36ed
[NNAPI EP] Make NNAPI EP build on non-Android Platform (#5779)
* Make NNAPI EP build on non-Android Platform

* minor updates

* Address CR comments

* Fix build issue using Windows, address CR comments

* Fix linux build warnings

* Fix for test failure

* Fix for test failure

* Fix model_tests failure
2020-11-15 17:04:45 -08:00
Weixing Zhang
5b7dc5aeee
fix build failure for ROCm EP (#5816)
The kernel declaration of Identity needs to be updated in ROCm EP since
ROCm EP shares the implementation of Identity with CUDA EP in which it
has been changed due to opset 13 support.
2020-11-15 10:36:15 -08:00
Jesse Benson
ced5b66306 Re-enable multi-tensor-apply for LAMB optimizer 2020-11-15 09:35:00 -08:00
Weixing Zhang
fc614ad050 revert the code change which was based on b4869926
Change b4869926, which removed the per-thread allocator, would cause a seg fault in
distributed training.

In addition, add dockerfile for ROCm3.9
2020-11-15 00:24:32 -08:00
RandySheriffH
c23fbba463
Fix reduce pipeline by replacing model (#5813)
* update model and better comment

* fix parameter

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2020-11-14 20:17:23 -08:00
Scott McKay
3269e59b2c
Add opset 13 registration for Identity. (#5800)
* Add opset 13 registration for Identity.
2020-11-14 21:40:24 +10:00
Ori Levari
157d1844fb
Named Dimension Override internals test and experimental API (#5805) 2020-11-13 21:21:11 -08:00
Ye Wang
262e9ef21d
Support input dimension swap in Attention op (#5774)
* checkin cpu

* checkin cpu

* add test

* cuda

* update comments

* review comments

* update

* modify var name

* remove unnecessary error msg

* fix comments

Co-authored-by: wangye <wangye@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-11-13 18:29:08 -08:00
sfatimar
dfbf6d78be
OpenVino: fix allocation failure on Windows for RelWithDebInfo build (#5713)
* ng_supported_ops

* Remove ng_supported_ops

* Revert "Remove ng_supported_ops"

This reverts commit 3c27385b2d88c6e8cf7ac4e8c290a367ad5d0bd8.

* Revert "ng_supported_ops"

This reverts commit 650721ae2913b79739521d58838298e031abdac1.

* cmake changes to ensure that the debug build on Windows links to debug builds of OpenVINO
and does not result in a bad allocation error

Co-authored-by: sfatimar <sahar.fatima@intel/com>
2020-11-13 07:59:52 -08:00
Vincent Wang
0c8902cbbe
Update Gradient Builder of Some Ops for OpSet13 (#5748)
* gradient builder for opset13

* code clean.

* resolve comments

* stop grad for axes input

* add split to stop grad list.

Co-authored-by: Vincent Wang <weicwang@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-11-13 16:20:34 +08:00
Yufeng Li
1f722863b2
Scale Bias post processor for ARM (#5795) 2020-11-12 21:12:23 -08:00
jeyblu
435b904f0e
add dnnl gpu engine (#5788) 2020-11-12 20:17:54 -08:00
Ryan Lai
0ea998134a
Skip new x86 tests in ort model tests (#5789) 2020-11-12 18:08:11 -08:00
Dmitri Smirnov
2f35e65135
Add Float16 and BFloat16 support to C# API (#5775)
Add Float16 and BFloat16 support.
2020-11-12 17:57:08 -08:00
edgchen1
4d517c68a3
Fix reference to old download_e2e_test_data.py script. It was renamed to download_azure_blob.py. (#5790) 2020-11-12 15:48:06 -08:00
Alberto Magni
88c3704257
Add shape inference for additional ops
This commit adds shape inference support for the following ops:

SoftmaxCrossEntropy
SoftmaxCrossEntropyLossGrad
SoftmaxCrossEntropyGrad
LayerNormalizationGrad
2020-11-12 20:18:54 +00:00
Ryan Lai
4e29f48010
skip gpt2 test on x86 (#5787) 2020-11-12 11:49:47 -08:00
pengwa
49288de17c
Fix memory planning issues (#5752)
* Fix memory planning issues

* fix build

* fix the wrong line...
2020-11-13 03:07:59 +08:00
alexzakv
44d3c31200
Winml_principles_change (#5727)
* Contributing page change

* Update WinML_principles.md

* Update WinML_principles.md

* Update WinML_principles.md

* Updated

* Update WinML_principles.md

* Update WinML_principles.md

* Update WinML_principles.md

* Update WinML_principles.md
2020-11-12 10:39:24 -08:00
Guoyu Wang
dc0f7b8f82
Remove onnxruntime_session_options_config_keys.h from c_api (#5772)
* Remove session config keys header from c_api

* remove copy session config header in release package

* Keep the session option config header in the package
2020-11-12 09:12:13 -08:00
stevenlix
54de618c2e
Improve TensorRT engine caching (#5737)
* add profile caching to improve engine caching feature

* Add comments

* fix typo

* add decryption for engine caching

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc

* update onnx-tensorrt submodule

* set opt profile to max value of the range

* add hash to engine/profile name

* Add calibration based INT8 quantization

* add an option to enable both FP16 and INT8

* Update tensorrt_execution_provider.cc

* add env variable to specify calibration file name

* clean up code

* Add comments and update TRT document

* enable tensorrt basic test and add EngineCachingTest

* clean up

* update environment variable in the test

* clean up
2020-11-12 08:56:45 -08:00
Vincent Wang
2a87108431
SoftmaxCrossEntropyLoss OpSet13. (#5777)
Co-authored-by: Vincent Wang <weicwang@microsoft.com>
2020-11-12 15:50:34 +08:00