gwang
a6f092300a
Fix for CUDA build break
2020-10-12 13:27:37 -07:00
gwang
90e132c5d5
Cherry-pick 5378
2020-10-12 13:27:37 -07:00
Guoyu Wang
c6c4bab7ad
Move flatbuffers to 1.12 release ( #5392 )
2020-10-12 13:27:37 -07:00
stevenlix
bcbcb2552a
update onnx-tensorrt submodule ( #5442 )
2020-10-12 13:27:37 -07:00
Hariharan Seshadri
8206ff4a90
Support trilinear sampling in Resize CPU and CUDA kernels ( #5300 )
2020-10-12 13:27:37 -07:00
Changming Sun
e4f71abd90
Exclude GPT2_LM_HEAD from OpenVino's model test list ( #5356 )
...
GPT2_LM_HEAD is a new ONNX model zoo model that OpenVino doesn't support.
Error message:1: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running OpenVINO-EP-subgraph_1162 node. Name:'OpenVINOExecutionProvider_OpenVINO-EP-subgraph_1162_1' Status Message: _Map_base::at
2020-10-12 13:27:37 -07:00
Tiago Koji Castro Shibata
d77241e7f1
Fix WinML warnings ( #5228 )
2020-10-12 13:27:37 -07:00
Tianlei Wu
4227dd7df5
Revert "Move flatbuffers to 1.12 release ( #5392 )"
...
This reverts commit 0294d9aa5624b463ef34aa3ad8458e983715e2a4.
2020-10-12 13:27:37 -07:00
Tianlei Wu
fc0fc80db2
Revert "Add flatbuffers verifier for ORT format buffer ( #5378 )"
...
This reverts commit e8bf3ba2bb383055b5974e8904324c33e0f4cbb1.
2020-10-12 13:27:37 -07:00
Guoyu Wang
6782866529
Mitigate pybind11 build break using Xcode 12 on macOS ( #5381 )
2020-10-12 13:27:37 -07:00
Guoyu Wang
fda4363992
Add flatbuffers verifier for ORT format buffer ( #5378 )
2020-10-12 13:27:37 -07:00
Pranav Sharma
5f331af157
Include config keys header file in the release packages for Linux and Mac. ( #5388 )
2020-10-12 13:27:37 -07:00
Guoyu Wang
8adfa7ac70
Move flatbuffers to 1.12 release ( #5392 )
2020-10-12 13:27:37 -07:00
Tiago Koji Castro Shibata
8283526541
Fix com ptr refcount ( #5404 )
2020-10-12 13:27:37 -07:00
Tianlei Wu
58bf508ce0
bump version to 1.5.2 ( #5420 )
2020-10-12 13:27:37 -07:00
Tianlei Wu
4e983634ff
clear cudaDelayLoadedLibs since delayload is disabled ( #5386 )
2020-10-12 13:27:37 -07:00
Yufeng Li
5de47affb1
fix quantization of EmbeddingLayerNorm ( #5321 )
2020-09-29 01:00:47 -07:00
Tianlei Wu
c00e13a291
Cherry pick (batch 2) to rel-1.5.1 ( #5290 )
...
* remove implicit linking of tensorrt and dnnl ep shared libs (#5262 )
* Update DirectML Nuget to 1.3.0 (#5274 )
* Update PyTorch TransformerModel sample (#5275 )
* Insert telemetry template into GPU build, add telemetry build switches. (#5278 )
* Synchronize training dependency versions between Docker image and Python wheel (#5261 )
* Downgrade GCC (#5269 )
* Remove --enable_symbolic_shape_infer_tests to fix linux ci pipeline build error.
Co-authored-by: Edward Chen
Co-authored-by: George Wu <jywu@microsoft.com>
Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
Co-authored-by: Dmitri Smirnov <yuslepukhin@users.noreply.github.com>
Co-authored-by: edgchen1 <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Changming Sun <chasun@microsoft.com>
2020-09-25 09:26:40 -07:00
Jeff Bloomfield
389cca7a45
Handle missing initializers in allocation planner to fix crashes with DML provider ( #5244 )
...
* Fix memory planning bug with DML EP
* Address PR comments
* Fix typo
2020-09-23 16:50:58 -07:00
Dwayne Robinson
b648fe5f74
ORT DirectML EP for Iron release, ONNX 1.5 (part 2) ( #5263 )
...
* Merged PR 5195856: Fix broken cases of zero size tensors in Cast/Reduce
MaskRCNN failed when `Cast` tried to execute `Xor` with emptiness (zero in dimensions). This is perfectly legal and should be treated as a nop.
Ultimately DML itself should treat this case as a nop, just like how C's `memcpy` treats 0 count as a nop, but I'm just addressing it in ORT now, as enabling it in DML would impact more operators to be consistent (probably should incrementally add a flag to tensor validation so operators can be opted in gradually).
Corresponding WindowsAI PR: https://microsoft.visualstudio.com/WindowsAI/_git/WindowsAI/pullrequest/5195850
Related work items: #27469839 , #28761382
* Merged PR 5201369: Remove copy of initializers added in DMLXP refactor
When used in ORT, a common method shouldn't copy and return initializer data
Related work items: #29514403
Co-authored-by: Justin Stoecker <justoeck@microsoft.com>
Co-authored-by: Jeff Bloomfield <jeffbloo@microsoft.com>
2020-09-23 16:50:58 -07:00
Yufeng Li
eb75b492cc
Fix bug in the back to back quantization of matmul and conv ( #5264 )
...
* fix bug in the back to back quantization of matmul and conv
* fix bug in back to back gather
2020-09-23 16:50:58 -07:00
Tianlei Wu
47447da4fd
bump version to 1.5.1 ( #5258 )
2020-09-23 16:50:58 -07:00
Ye Wang
87b15f32ef
Fix reshape fusion crash ( #5252 )
...
* fix reshape fusion crash
* handling start_node statelessly
* fix
2020-09-23 16:50:58 -07:00
Guoyu Wang
fc259de3bc
Fix possible ios build break after update to Xcode 12 ( #5246 )
...
* Fix possible ios build break after update to Xcode 12
* Address comments
2020-09-23 16:50:58 -07:00
Sherlock
9fd76c8693
Place Shape's output in CPU memory ( #5245 )
...
Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-23 16:50:58 -07:00
edgchen1
9158679c43
Update BUILD.md training dependency info. ( #5240 )
...
Update training dependency versions based on Dockerfile.training.
2020-09-23 16:50:58 -07:00
Changming Sun
b9b7c279fa
Update BUILD.md for CUDA versions ( #5239 )
2020-09-23 16:50:58 -07:00
George Wu
0cbe240ea3
update TensorRT docs ( #5238 )
...
* doc updates TensorRT
* update
* update
* fix warning
* newline
* format
2020-09-23 16:50:58 -07:00
Scott McKay
c93f292d1f
Revert to using release SafeInt repo now that it supports a build with exceptions disabled. ( #5233 )
2020-09-23 16:50:58 -07:00
edgchen1
6371ad61c5
Fix TransposeScaleMatMul and MatMulScaleFusion issues ( #5230 )
...
- Rename TransposeScaleMatMul back to TransposeMatMul for backwards compatibility
- Fix MatMulScaleFusion issues:
- Add check for supported execution providers
- Add check for supported MatMul input types
2020-09-23 16:50:58 -07:00
stevenlix
c27f461c1d
Create profile for all dynamic shape input tensors ( #5229 )
2020-09-23 16:50:58 -07:00
Adam Pocock
4427b1e2a3
[java] Fixing the buffer semantics. ( #5223 )
...
* [java] Fixing the buffer semantics.
* Renaming bufferCapacity to bufferRemaining.
* Adding a cast to char* so the pointer arithmetic works on Windows.
2020-09-23 16:50:58 -07:00
George Wu
c909c67701
fix _WIN32 ( #5218 )
2020-09-23 16:50:58 -07:00
Scott McKay
95b2e31659
Update conversion script and process to simplify creating ORT format models and a minimal build ( #5217 )
...
* Update conversion script and process to simplify creating ORT format models and a minimal build.
2020-09-23 16:50:58 -07:00
liqunfu
21a7afb2c6
--shm-size=1024m to fix nccl shared memory issue ( #5214 )
...
* --shm-size=256m to fix nccl shared memory issue
Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-23 16:50:58 -07:00
RRRachelllll555
b791402f84
Remove shape inference and fix save large model(>2g) issue ( #5210 )
...
* remove shape inference and fix save large model problem
* remove unnecessary import
* refine code and add external format for quantize_qat
* remove initializers in tensors_to_calibrate
* small refine
Co-authored-by: t-yguo <t-yguo@microsoft.com>
2020-09-23 16:50:58 -07:00
Pranav Prakash
0a31b9ed3c
Fix order of returned values in quantize_weight_per_channel ( #5205 )
...
Must match returned order of `quantize_inputs`
2020-09-23 16:50:58 -07:00
Tracy Sharpe
f726af34e0
NCHWc optimizer fixes for quantized models ( #5203 )
...
This updates the NCHWc transformer to not interfere with quantized convolution models, based on observations from internal models. The tensor type for MaxPool must be float. The input to GlobalAveragePool/GlobalMaxPool must be in NCHWc format.
2020-09-23 16:50:58 -07:00
S. Manohar Karlapalem
84ffdbc467
Corrects doc typos and formatting ( #5201 )
2020-09-23 16:50:58 -07:00
Pranav Sharma
24d111c342
Add API to allow configuration of the global thread pools. ( #5199 )
2020-09-23 16:50:58 -07:00
Zhang Lei
498483b464
MaxPool versioning in quantization tools. ( #5194 )
...
MaxPool versioning in quantization tools.
2020-09-23 16:50:58 -07:00
Suffian Khan
39a7f96a44
Fix softmax_warp_backward math when is_log_softmax = True and register LogSoftmax CUDA kernel ( #5160 )
...
* register logsoftmax cuda kernel; fix logsoftmaxgrad cuda kernel; fix tests to invoke dispatch_softmax_*
* forgot to remove axis check
* add tests all axis
Co-authored-by: suffian khan <sukha@OrtTrainingDev1.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-23 16:50:58 -07:00
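Note on the softmax_warp_backward fix above: the backward formulas for softmax and log-softmax differ, which is why an is_log_softmax flag changes the math. The following is a minimal reference sketch of the two textbook formulas in plain Python. It is not the ORT CUDA kernel, and the function names are hypothetical, chosen only for illustration.

```python
import math


def log_softmax(x):
    """Numerically stable log-softmax over a 1-D list of floats."""
    m = max(x)
    lse = m + math.log(sum(math.exp(v - m) for v in x))
    return [v - lse for v in x]


def softmax_backward(y, dy):
    """Gradient w.r.t. the input, where y = softmax(x).

    dx_i = y_i * (dy_i - sum_j(dy_j * y_j))
    """
    dot = sum(yi * dyi for yi, dyi in zip(y, dy))
    return [yi * (dyi - dot) for yi, dyi in zip(y, dy)]


def log_softmax_backward(y, dy):
    """Gradient w.r.t. the input, where y = log_softmax(x).

    dx_i = dy_i - exp(y_i) * sum_j(dy_j)
    """
    total = sum(dy)
    return [dyi - math.exp(yi) * total for yi, dyi in zip(y, dy)]
```

Both functions take the forward output rather than the input, so a caller has to know which forward op produced y. A finite-difference check against log_softmax is a simple way to validate either formula.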
Shucai Xiao
8e650c5384
Amdmigraphx improvements ( #5158 )
...
* code backup
* remove unnecessary log info
* code backup
* code backup
* merge changes from master branch
* code backup
* code backup
* merge changes from master branch
* code backup
* code backup for constant folding enhancement
* code backup
* include more scenarios for constant folding
* code backup
* remove unnecessary code
* remove unnecessary log information
* fix an error in comments
* update algorithm to do graph partition
* code backup
* remove unnecessary log information
* remove an unused function
* remove unnecessary changes
2020-09-23 16:50:58 -07:00
Ye Wang
b693cb1370
Fix a bug in EmbedLayerNorm fusion ( #5150 )
...
* fix embedlayernorm bug
* review comments
* interim checkin
* review comments
* Fix core dump in MacOS
* remove unnecessary lines
* update document
* Update graph_utils.cc
* Update onnx_exporter.py
* resolve comments
2020-09-23 16:50:58 -07:00
Changming Sun
5b5bcba9e3
Update MCR CUDA docker image to 10.2 ( #5181 )
2020-09-17 08:39:47 -07:00
Dmitri Smirnov
ece9a7c1fc
Refactor TensorAt, prepare for release ( #5180 )
...
* Refactor TensorAt
locations* must be const and int64_t since our dims are int64_t
Remove unnecessary copy of locations.
Remove unnecessary casting and C-casting. Simplify implementation.
Add a check for string type.
Make CXX api return T& to fully expose C API in C++, const std::vector& by value as it covers more ground and eliminates redundant copy.
Eliminate inner loop, compute strides first.
2020-09-17 08:39:47 -07:00
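Note on the TensorAt refactor above: "compute strides first" can be illustrated with a short sketch that maps a multi-dimensional location to a flat element offset in a row-major tensor. This is an assumption-laden illustration in Python, not the ORT implementation. The helper names are hypothetical, and the bounds check is an added assumption about how invalid locations might be handled.

```python
def row_major_strides(dims):
    """Return element strides for a row-major (C-order) tensor shape."""
    strides = [1] * len(dims)
    # Walk from the second-to-last axis backwards; the last axis has stride 1.
    for i in range(len(dims) - 2, -1, -1):
        strides[i] = strides[i + 1] * dims[i + 1]
    return strides


def flat_offset(dims, locations):
    """Map per-axis indices to a flat element offset, validating bounds."""
    if len(dims) != len(locations):
        raise ValueError("locations must have one index per dimension")
    for loc, dim in zip(locations, dims):
        if not 0 <= loc < dim:
            raise IndexError("location out of range")
    strides = row_major_strides(dims)
    # Single pass over the axes: no nested loop is needed once strides exist.
    return sum(loc * stride for loc, stride in zip(locations, strides))
```

For a shape of [2, 3, 4], the strides are [12, 4, 1], so location [1, 2, 3] maps to offset 23.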
Tracy Sharpe
b2994492af
MLAS: add sgemm weight prepacking ( #5183 )
...
Add support to MLAS to prepack weights for the float GEMM. Support for prepacking has been added to MatMul and Attention for this release.
2020-09-17 08:39:47 -07:00
Tiago Koji Castro Shibata
ecf04d23c4
Fix nuget build ( #5163 )
...
* Fix nuget content
* Revert "Fix nuget content"
This reverts commit e2cdcec4e39964c50eac2fb306c7a4bb84352443.
* Nuget packaging
* skip tests
* msbuild path
* Force msbuild version
* Workaround https://github.com/NuGet/Home/issues/7621
* cleanup
2020-09-17 08:39:47 -07:00
Tiago Koji Castro Shibata
b523fa08bc
Use onecore umbrella lib in onecore builds ( #5182 )
...
* delayload hack
* Skip tests
* Onecore uses onecore umbrella
* Uncomment tests
* cleanup
* Disable dev mode for WinML
2020-09-17 08:39:47 -07:00
Chun-Wei Chen
393ff2f434
Add GetStartTime() for profiler to get private profiling_start_time_ ( #4994 )
...
* add GetStartTime() for profiler
* add function in inference_session
* remove qualified name
* add the api in cxx_api.h
* rename starttime to StartTimeNs, expose profiling object
* rename GetProfilingStartTime
* move Ortapis to the right place
* move to the end
* add const for session
* const the right place
* use const auto instead of const auto* for session
* remove const for auto getstarttime
* remove const for auto getstarttime
add unit tests
* nit: update test name and add comments
2020-09-17 08:39:47 -07:00