Pranav Sharma
5e48c0fd6c
Register opset13 ops: Dropout, Flatten, LRN, MeanVarianceNormalization, ArgMax, ArgMin, Reshape, Shape, Concat. ( #5451 )
2020-10-12 10:09:38 -07:00
stevenlix
186f0668b0
update onnx-tensorrt submodule ( #5442 )
2020-10-09 21:49:40 -07:00
Hariharan Seshadri
b9f90e297e
Support sharing of initializers between session via the Python API ( #5407 )
2020-10-09 20:26:28 -07:00
Ryan Hill
6132e1f6ae
Shared providers - fix logging plus cleanup ( #5406 )
...
* Fix logging, clean up, and implement the remaining unimplemented functions from the shared provider interface.
2020-10-09 17:31:03 -07:00
Wei-Sheng Chin
6cba42e942
Avoid inserting other CUDA calls in-between NCCL Send's and Recv's ( #5430 )
...
* Avoid inserting other CUDA calls in-between NCCL Send's and Recv's
* Add a comment
* Place CUDA EP on the right device
* Fix a warning
* Address a comment
2020-10-09 15:34:46 -07:00
liqunfu
dbe7e6623b
only use/import pytest if needed (by enable_training) ( #5437 )
...
Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-10-09 12:42:19 -07:00
Dmitri Smirnov
9642f1448e
Add OpSet 13 Registrations ( #5426 )
...
Register Sigmoid for OpSet13
Register OpSet 13 for Sum, Min, Max, Mean.
Add Erf OpSet 13 registration.
Register Clip for OpSet 13
Add Gemm/MatMul OpSet 13 registrations
Signed-off-by: Dmitri Smirnov <dmitrism@microsoft.com>
2020-10-09 12:39:22 -07:00
Sergii Dymchenko
3a9a1a4ef1
Fix registration for GatherGrad ( #5382 )
...
* Fix registration for GatherGrad to fix GatherGradOpTest.GatherGrad_axis0_indices2d_half.
* Fix GatherGrad registration for CUDA also.
2020-10-09 11:57:50 -07:00
liqunfu
1cceefc7d4
use run_orttraining_test_orttrainer_frontend_separately to work aroun… ( #5408 )
...
* use run_orttraining_test_orttrainer_frontend_separately to work around a sporadic segfault.
Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-10-09 09:16:10 -07:00
Scott McKay
a92ccbe1bc
Various armv7 related fixes ( #5394 )
...
* - Link with libatomic if needed
- Install pip differently so it doesn't clash with the system pip which may involve a wrapper script
- Remove ability to specify offset when Tensor allocates the data. The data prior to offset isn't accessible by anything.
- Fix use of offset in TensorOpTest to work on armv7 where it must be aligned to the type it points to.
- Fix ActivationOpNoInfTest.Softsign to allow for armv7 behavior
- Fix ReductionOpTest.ReduceMean_*keepdims to allow for armv7 floating point inaccuracy
* Address PR comments
2020-10-09 22:34:32 +10:00
Yufeng Li
b99eaa99cd
Prepacking MatMulInteger ( #5403 )
...
* prepack matmulinteger
Prepacking constant matrix B for MatMulInteger to get better performance.
2020-10-09 02:37:19 -07:00
Xavier Dupré
621fdb44e5
Fixes #4688 , remove CPUAllocator in TreeEnsemble ( #5375 )
2020-10-09 11:26:07 +02:00
Keizo Fujiwara
d4507e9331
Use relative path for HEADER_SEARCH_PATHS ( #5412 )
...
Currently HEADER_SEARCH_PATHS refers to a personal directory.
2020-10-08 23:06:11 -07:00
Ye Wang
90f976d060
Some improvements on transformers tool ( #5383 )
...
* modify tensorflow benchmark gpu setting
* add export from tf choice in script
* fix typo
* match more embedlayernorm pattern
* format
2020-10-08 19:35:17 -07:00
Tracy Sharpe
fab7f799a7
MLAS: fix ARM64 + VS2017 build break ( #5423 )
2020-10-08 18:03:45 -07:00
Sergii Dymchenko
8a632a903f
Remove unused imports from Python tests. ( #5405 )
2020-10-08 17:24:10 -07:00
Tianlei Wu
15696b8fce
bump version to 1.5.2 ( #5420 )
2020-10-08 16:30:13 -07:00
Suffian Khan
498f94668d
Keep all_finite tensor on CPU when using PyTorch Frontend ( #5371 )
2020-10-08 15:47:18 -07:00
Pranav Sharma
c2c78399ee
Include config keys header file in the release packages for Linux and Mac. ( #5388 )
2020-10-08 15:00:29 -07:00
Changming Sun
09aef240d6
Skip running onnx tests in python mac os pipeline ( #5416 )
2020-10-08 11:49:28 -07:00
Tiago Koji Castro Shibata
83ead3e2eb
Fix com ptr refcount ( #5404 )
2020-10-08 10:18:38 -07:00
Yufeng Li
b04cf2d229
Update ORT to 1.5.1 in Bert Quantization Notebook ( #5396 )
...
* Update ORT to 1.5.1 in Bert Quantization Notebook
2020-10-08 09:55:01 -07:00
manashgoswami
132ab2230d
Updated with image for creating the onnxruntime pkg ( #5400 )
...
* Create Mobile.png
* Update ONNX_Runtime_for_Mobile_Platforms.md
* Update ONNX_Runtime_for_Mobile_Platforms.md
2020-10-08 08:54:27 -07:00
Scott McKay
9684e1b5a8
Add documentation for the prerequisites needed to cross-compile for Android on Windows with Java bindings enabled. ( #5395 )
2020-10-08 12:31:46 +10:00
Tianlei Wu
8133223871
clear cudaDelayLoadedLibs since delayload is disabled ( #5386 )
2020-10-07 11:33:12 -07:00
Tianlei Wu
8ee2b08325
Allow benchmarking different threads ( #5390 )
2020-10-07 11:13:01 -07:00
Tianlei Wu
094384781e
Add --use_external_data_format in convert_to_onnx.py ( #5393 )
2020-10-07 09:42:02 -07:00
Guoyu Wang
5947445457
Add flatbuffers verifier for ORT format buffer ( #5378 )
...
* Add flatbuffers verifier before accessing data in ort format models
* Address review comments
2020-10-07 09:23:17 -07:00
Guoyu Wang
deb708d3b1
Move flatbuffers to 1.12 release ( #5392 )
2020-10-07 09:23:03 -07:00
Hariharan Seshadri
6f54113a1b
Support OrtValue binding in Python to enable interesting IOBinding scenarios in Python ( #5248 )
2020-10-06 21:14:41 -07:00
Tracy Sharpe
0122e890d9
MLAS: implement u8x8 GEMM for ARM64 ( #5380 )
...
Add an implementation for u8u8/u8s8 GEMM for use on ARM64 (Windows/Linux).
2020-10-06 19:22:23 -07:00
Guoyu Wang
b4934b0016
Mitigate pybind11 build break using Xcode 12 on macOS ( #5381 )
...
* turn dev_mode off if we are using macos to build python with xcode 12
* Address CR comments
* Add ways to check compiler version
2020-10-06 19:03:33 -07:00
Kaarthik Sivashanmugam
10f1902d90
Update code snippet in README.md
2020-10-06 17:41:56 -07:00
liqunfu
773992c7d4
Liqun/bert pretrain tb ( #5377 )
...
* add tensor board, remove torch.distributed.launch because ort nccl depends on MPI. Use MPI to launch parallel training.
Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-10-06 16:28:31 -07:00
manashgoswami
b5caa7cb12
Updated docs: Execution Provider overview ( #5328 )
...
* Update ReleaseManagement.md
* Create ONNX_Runtime_Execution_Providers.md
* Create ONNX_Runtime_EP3.png
* Create ONNX_Runtime_EP2.png
* Create ONNX_Runtime_EP1.png
* Delete ONNX_Runtime_Execution_Providers.md
* Create README.md
* Update README.md
* commit
* Updated in error.
Revert "Update ReleaseManagement.md"
This reverts commit 8530bd5fd46aebce3a6d6055d8952ae4f6458c4e.
* Update ReleaseManagement.md
* Update .gitignore
* Update README.md
* Update README.md
2020-10-06 15:01:25 -07:00
Du Li
323c4dfe02
Adding an option for cudnn conv algorithms. ( #5159 )
...
* adding cudnn conv algorithm selection options.
* export the api
* adding the perf test option.
* accommodating pr comments.
* Move OrtSessionOptionsAppendExecutionProvider_CUDA to onnxruntime_c_api.h
* Accommodating PR comments.
2020-10-05 16:53:52 -07:00
Shucai Xiao
a0b8218f9a
Amdmigraphx update to rocm3.7 ( #5362 )
...
* backup dockerfile for upgrading to rocm3.7
* fix build errors related to rocm3.7
* backup dockerfile for migraphx
* remove unnecessary component from dockerfile
* fix review comments
Co-authored-by: Shucai Xiao <scxiao@prj47-rack-99.local.lan>
2020-10-05 15:34:24 -07:00
Yufeng Li
24f99b3be8
Support OuterStride for QGemm when MLAS_SUPPORTS_GEMM_U8X8 undefined ( #5374 )
...
Quantized GEMM on ARM doesn't support the case where the leading dimension is not equal to the column size. This PR adds support for that case.
2020-10-05 13:06:12 -07:00
Ashwini Khade
668ab04917
rename all TransposeMatMul nodes to FusedMatMul ( #5373 )
2020-10-05 12:41:05 -07:00
Wei-Sheng Chin
4e3a420aa7
Use single thread when pipeline is not enabled in TrainingRunner ( #4265 )
...
* Use single thread when pipeline is not enabled in TrainingRunner
* Remove macro indents
* Format file and remove state variable
2020-10-05 10:42:09 -07:00
Vlad Burlik
c20fcf26eb
ONNX GPU runtime fails to fall back to CPU when GPU is not available/busy ( #5304 )
...
* ONNX GPU runtime fails to fall back to CPU when GPU is not available OR busy
https://github.com/microsoft/onnxruntime/issues/5299
* comments
* Init _fallback_providers before C.InferenceSession
* As per review: the fallback providers' order supersedes the user's providers order, IF they are included in the providers list.
* Code convention fix
* pep8
2020-10-02 22:45:14 -07:00
Wenbing Li
4721729fdc
Enable iOS CI pipeline ( #5360 )
...
* add the ios ci build.
* no dependency on mac ci pipeline.
* fix the command line.
* keep sync
* automatically retrieve sdpath
* fix the case errors and warnings
* fix the vlog switch issue.
* add parallel flag for build.
* update the display name of the pipeline.
2020-10-02 20:14:45 -07:00
Guoyu Wang
9df0790856
Update linux minimal CI to report Android minimal baseline binary size ( #5361 )
...
* Update linux minimal CI to report Android minimal baseline binary size
* Fix some issues in the script
2020-10-02 17:35:23 -07:00
Chun-Wei Chen
5bd7241839
Raise output mismatch error in ort_test_dir_utils.py ( #5364 )
2020-10-02 16:44:59 -07:00
Tianlei Wu
f5e4c0ea04
Fix benchmark_gpt2 model verification ( #5343 )
2020-10-02 13:53:02 -07:00
Guoyu Wang
6e4949e235
javadoc warning fix ( #5332 )
...
Co-authored-by: gwang0000 <62914304+gwang0000@users.noreply.github.com>
2020-10-02 11:52:07 -07:00
Hariharan Seshadri
06cd81d791
Support trilinear sampling in Resize CPU and CUDA kernels ( #5300 )
2020-10-02 11:02:43 -07:00
Sherlock
e71668f92c
Expose recompute configs to the frontend ( #5318 )
...
* Expose recompute configs to the frontend
* Add frontend test
* Ensure recompute graph transformer is only applied once
Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-10-02 09:49:47 -07:00
Tianlei Wu
e33de20861
Update gpt2 notebook for int8 quantization ( #5346 )
...
* Update gpt2 notebook for ORT 1.5
* add sections for int8 quantization including QAT note
2020-10-02 09:41:52 -07:00
Ashwini Khade
ce49cfa67c
add support for configurable build dir when building nuget packages ( #5352 )
...
* add support for configurable build dir when building nuget packages
* rename vars
2020-10-02 09:31:35 -07:00