Commit graph

3425 commits

Author SHA1 Message Date
gwang
a6f092300a Fix for CUDA build break 2020-10-12 13:27:37 -07:00
gwang
90e132c5d5 Cherry-pick 5378 2020-10-12 13:27:37 -07:00
Guoyu Wang
c6c4bab7ad Move flatbuffers to 1.12 release (#5392) 2020-10-12 13:27:37 -07:00
stevenlix
bcbcb2552a update onnx-tensorrt submodule (#5442) 2020-10-12 13:27:37 -07:00
Hariharan Seshadri
8206ff4a90 Support trilinear sampling in Resize CPU and CUDA kernels (#5300) 2020-10-12 13:27:37 -07:00
Changming Sun
e4f71abd90 Exclude GPT2_LM_HEAD from OpenVino's model test list (#5356)
GPT2_LM_HEAD is a new ONNX model zoo model that OpenVino doesn't support.

Error message:1: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running OpenVINO-EP-subgraph_1162 node. Name:'OpenVINOExecutionProvider_OpenVINO-EP-subgraph_1162_1' Status Message: _Map_base::at
2020-10-12 13:27:37 -07:00
Tiago Koji Castro Shibata
d77241e7f1 Fix WinML warnings (#5228) 2020-10-12 13:27:37 -07:00
Tianlei Wu
4227dd7df5 Revert "Move flatbuffers to 1.12 release (#5392)"
This reverts commit 0294d9aa5624b463ef34aa3ad8458e983715e2a4.
2020-10-12 13:27:37 -07:00
Tianlei Wu
fc0fc80db2 Revert "Add flatbuffers verifier for ORT format buffer (#5378)"
This reverts commit e8bf3ba2bb383055b5974e8904324c33e0f4cbb1.
2020-10-12 13:27:37 -07:00
Guoyu Wang
6782866529 Mitigate pybind11 build break using Xcode 12 on macOS (#5381) 2020-10-12 13:27:37 -07:00
Guoyu Wang
fda4363992 Add flatbuffers verifier for ORT format buffer (#5378) 2020-10-12 13:27:37 -07:00
Pranav Sharma
5f331af157 Include config keys header file in the release packages for Linux and Mac. (#5388) 2020-10-12 13:27:37 -07:00
Guoyu Wang
8adfa7ac70 Move flatbuffers to 1.12 release (#5392) 2020-10-12 13:27:37 -07:00
Tiago Koji Castro Shibata
8283526541 Fix com ptr refcount (#5404) 2020-10-12 13:27:37 -07:00
Tianlei Wu
58bf508ce0 bump version to 1.5.2 (#5420) 2020-10-12 13:27:37 -07:00
Tianlei Wu
4e983634ff clear cudaDelayLoadedLibs since delayload is disabled (#5386) 2020-10-12 13:27:37 -07:00
Yufeng Li
5de47affb1 fix quantization of EmbeddingLayerNorm (#5321) 2020-09-29 01:00:47 -07:00
Tianlei Wu
c00e13a291 Cherry pick (batch 2) to rel-1.5.1 (#5290)
* remove implicit linking of tensorrt and dnnl ep shared libs (#5262)
* Update DirectML Nuget to 1.3.0 (#5274)
* Update PyTorch TransformerModel sample (#5275)
* Insert telemetry template into GPU build, add telemetry build switches. (#5278)
* Synchronize training dependency versions between Docker image and Python wheel (#5261)
* Downgrade GCC (#5269)
* Remove --enable_symbolic_shape_infer_tests to fix linux ci pipeline build error.

Co-authored-by: Edward Chen
Co-authored-by: George Wu <jywu@microsoft.com>
Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
Co-authored-by: Dmitri Smirnov <yuslepukhin@users.noreply.github.com>
Co-authored-by: edgchen1 <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Changming Sun <chasun@microsoft.com>
2020-09-25 09:26:40 -07:00
Jeff Bloomfield
389cca7a45 Handle missing initializers in allocation planner to fix crashes with DML provider (#5244)
* Fix memory planning bug with DML EP

* Address PR comments

* Fix typo
2020-09-23 16:50:58 -07:00
Dwayne Robinson
b648fe5f74 ORT DirectML EP for Iron release, ONNX 1.5 (part 2) (#5263)
* Merged PR 5195856: Fix broken cases of zero size tensors in Cast/Reduce

MaskRCNN failed when `Cast` tried to execute `Xor` with emptiness (zero in dimensions). This is perfectly legal and should be treated as a nop.

Ultimately DML itself should treat this case as a nop, just like how C's `memcpy` treats 0 count as a nop, but I'm just addressing it in ORT now, as enabling it in DML would impact more operators to be consistent (probably should incrementally add a flag to tensor validation so operators can be opted in gradually).

Corresponding WindowsAI PR: https://microsoft.visualstudio.com/WindowsAI/_git/WindowsAI/pullrequest/5195850

Related work items: #27469839, #28761382

* Merged PR 5201369: Remove copy of initializers added in DMLXP refactor

When used in ORT, a common method shouldn't copy and return initializer data

Related work items: #29514403

Co-authored-by: Justin Stoecker <justoeck@microsoft.com>
Co-authored-by: Jeff Bloomfield <jeffbloo@microsoft.com>
2020-09-23 16:50:58 -07:00
Yufeng Li
eb75b492cc Fix bug in the back to back quantization of matmul and conv (#5264)
* fix bug in the back to back quantization of matmul and conv

* fix bug in back to back gather
2020-09-23 16:50:58 -07:00
Tianlei Wu
47447da4fd bump version to 1.5.1 (#5258) 2020-09-23 16:50:58 -07:00
Ye Wang
87b15f32ef Fix reshape fusion crash (#5252)
* fix reshape fusion crash

* handling start_node statelessly

* fix
2020-09-23 16:50:58 -07:00
Guoyu Wang
fc259de3bc Fix possible ios build break after update to Xcode 12 (#5246)
* Fix possible ios build break after update to Xcode 12

* Address comments
2020-09-23 16:50:58 -07:00
Sherlock
9fd76c8693 Place Shape's output in CPU memory (#5245)
Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-23 16:50:58 -07:00
edgchen1
9158679c43 Update BUILD.md training dependency info. (#5240)
Update training dependency versions based on Dockerfile.training.
2020-09-23 16:50:58 -07:00
Changming Sun
b9b7c279fa Update BUILD.md for CUDA versions (#5239) 2020-09-23 16:50:58 -07:00
George Wu
0cbe240ea3 update TensorRT docs (#5238)
* doc updates TensorRT

* update

* update

* fix warning

* newline

* format
2020-09-23 16:50:58 -07:00
Scott McKay
c93f292d1f Revert to using release SafeInt repo now that it supports a build with exceptions disabled. (#5233) 2020-09-23 16:50:58 -07:00
edgchen1
6371ad61c5 Fix TransposeScaleMatMul and MatMulScaleFusion issues (#5230)
- Rename TransposeScaleMatMul back to TransposeMatMul for backwards compatibility
- Fix MatMulScaleFusion issues:
  - Add check for supported execution providers
  - Add check for supported MatMul input types
2020-09-23 16:50:58 -07:00
stevenlix
c27f461c1d Create profile for all dynamic shape input tensors (#5229) 2020-09-23 16:50:58 -07:00
Adam Pocock
4427b1e2a3 [java] Fixing the buffer semantics. (#5223)
* [java] Fixing the buffer semantics.
* Renaming bufferCapacity to bufferRemaining.
* Adding a cast to char* so the pointer arithmetic works on Windows.
2020-09-23 16:50:58 -07:00
George Wu
c909c67701 fix _WIN32 (#5218) 2020-09-23 16:50:58 -07:00
Scott McKay
95b2e31659 Update conversion script and process to simplify creating ORT format models and a minimal build (#5217)
* Update conversion script and process to simplify creating ORT format models and a minimal build.
2020-09-23 16:50:58 -07:00
liqunfu
21a7afb2c6 --shm-size=1024m to fix nccl shared memory issue (#5214)
* --shm-size=256m to fix nccl shared memory issue

Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-23 16:50:58 -07:00
RRRachelllll555
b791402f84 Remove shape inference and fix save large model(>2g) issue (#5210)
* remove shape inference and fix save large model problem

* remove unnecessary import

* refine code and add external format for quantize_qat

* remove initializers in tensors_to_calibrate

* small refine

Co-authored-by: t-yguo <t-yguo@microsoft.com>
2020-09-23 16:50:58 -07:00
Pranav Prakash
0a31b9ed3c Fix order of returned values in quantize_weight_per_channel (#5205)
Must match returned order of `quantize_inputs`
2020-09-23 16:50:58 -07:00
Tracy Sharpe
f726af34e0 NCHWc optimizer fixes for quantized models (#5203)
This updates the NCHWc transformer to not interfere with quantized convolution models, based on observations from internal models. The tensor type for MaxPool must be float. The input to GlobalAveragePool/GlobalMaxPool must be in NCHWc format.
2020-09-23 16:50:58 -07:00
S. Manohar Karlapalem
84ffdbc467 Corrects doc typos and formatting (#5201) 2020-09-23 16:50:58 -07:00
Pranav Sharma
24d111c342 Add API to allow configuration of the global thread pools. (#5199) 2020-09-23 16:50:58 -07:00
Zhang Lei
498483b464 MaxPool versioning in quantization tools. (#5194)
MaxPool versioning in quantization tools.
2020-09-23 16:50:58 -07:00
Suffian Khan
39a7f96a44 Fix softmax_warp_backward math when is_log_softmax = True and register LogSoftmax CUDA kernel (#5160)
* register logsoftmax cuda kernel; fix logsoftmaxgrad cuda kernel; fix tests to invoke dispatch_softmax_*

* forgot to remove axis check

* add tests all axis

Co-authored-by: suffian khan <sukha@OrtTrainingDev1.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-23 16:50:58 -07:00
Shucai Xiao
8e650c5384 Amdmigraphx improvements (#5158)
* code backup

* remove unnecessary log info

* code backup

* code backup

* merge changes from master branch

* code backup

* code backup

* merge changes from master branch

* code backup

* code backup for constant folding enhancement

* code backup

* include more scenarios for constant folding

* code backup

* remove unnecessary code

* remove unnecessary log information

* fix an error in comments

* update algorithm to do graph partition

* code backup

* remove unnecessary log information

* remove an unused function

* remove unnecessary changes
2020-09-23 16:50:58 -07:00
Ye Wang
b693cb1370 Fix a bug in EmbedLayerNorm fusion (#5150)
* fix embedlayernorm bug

* review comments

* interim checkin

* review comments

* Fix core dump in MacOS

* remove unnecessary lines

* update document

* Update graph_utils.cc

* Update onnx_exporter.py

* resolve comments
2020-09-23 16:50:58 -07:00
Changming Sun
5b5bcba9e3 Update MCR CUDA docker image to 10.2 (#5181) 2020-09-17 08:39:47 -07:00
Dmitri Smirnov
ece9a7c1fc Refactor TensorAt, prepare for release (#5180)
* Refactor TensorAt
  locations* must be const and int64_t since our dims are int64_t
  Remove unnecessary copy of locations.
  Remove unnecessary casting and C-casting. Simplify implementation.
  Add a check for string type.
  Make CXX api return T& to fully expose C API in C++, const std::vector& by value as it
  covers more ground and eliminate redundant copy.
  Eliminate inner loop, compute strides first.
2020-09-17 08:39:47 -07:00
Tracy Sharpe
b2994492af MLAS: add sgemm weight prepacking (#5183)
Add support to MLAS to prepack weights for the float GEMM. Support for prepacking has been added to MatMul and Attention for this release.
2020-09-17 08:39:47 -07:00
Tiago Koji Castro Shibata
ecf04d23c4 Fix nuget build (#5163)
* Fix nuget content

* Revert "Fix nuget content"

This reverts commit e2cdcec4e39964c50eac2fb306c7a4bb84352443.

* Nuget packaging

* skip tests

* msbuild path

* Force msbuild version

* Workaround https://github.com/NuGet/Home/issues/7621

* cleanup
2020-09-17 08:39:47 -07:00
Tiago Koji Castro Shibata
b523fa08bc Use onecore umbrella lib in onecore builds (#5182)
* delayload hack

* Skip tests

* Onecore uses onecore umbrella

* Uncomment tests

* cleanup

* Disable dev mode for WinML
2020-09-17 08:39:47 -07:00
Chun-Wei Chen
393ff2f434 Add GetStartTime() for profiler to get private profiling_start_time_ (#4994)
* add GetStartTime() for profiler

* add function in inference_session

* remove qualified name

* add the api in cxx_api.h

* rename starttime to StartTimeNs, expose profiling object

* rename GetProfilingStartTime

* move Ortapis to the right place

* move to the end

* add const for session

* const the right place

* use const auto instead of const auto* for session

* remove const for auto getstarttime

* remove const for auto getstarttime

add unit tests

* nit: update test name and add comments
2020-09-17 08:39:47 -07:00