Commit graph

458 commits

Author SHA1 Message Date
Ryan Hill
330339caa4 Remove test_execution_provider from training build
Only enable python atexit on windows
Remove assert on provider library exit
2021-05-11 03:10:20 -07:00
Ryan Hill
fcb98063a2 Python crash on exit, possibly due to unloading of libraries. 2021-05-11 02:08:02 -07:00
Ryan Hill
12224c88b7 Java fix for CPU build 2021-05-10 13:03:17 -07:00
Ryan Hill
1698c57fcc Merge with master 2021-05-10 00:41:07 -07:00
Hariharan Seshadri
4b691a5c0d
Add ability for memory arenas to "shrink" periodically (#7284) 2021-05-08 07:53:21 -07:00
Tianlei Wu
55c086b664
symbolic shape inference improvements for contrib ops (#7606)
* add EmbedLayerNormalization
* use onnx shape inference for Unsqueeze
* Fix type warning in Attention
2021-05-07 17:03:24 -07:00
stevenlix
8ab0deceed
Add DLA support to TensorRT EP (#7532)
* Add DLA to TensorRT EP, enable device_id options in pybind, fix cycle detection issue

* fix format

* remove unnecessary passing by pointer

* fix issue
2021-05-07 10:31:42 -07:00
Ryan Hill
af3824ce25 Merge branch 'master' of https://github.com/microsoft/onnxruntime into ryanunderhill/cuda_shared 2021-05-06 17:00:59 -07:00
Tianlei Wu
d88da44066
Allow flexible order of Add inputs in Attention fusion (#7565) 2021-05-06 09:43:28 -07:00
Ryan Hill
8ba8c26233 Merge with master 2021-05-05 15:05:00 -07:00
Ryan Hill
401751b4c7 Merge with master 2021-05-05 15:01:02 -07:00
Zhang Lei
9465948715
Quantization tools using one more extra_options on interface. (#7293)
handle nnapi special sigmoid options.
2021-05-05 13:51:50 -07:00
Ryan Hill
401ef4a634 Revert DllMain idea, it didn't work 2021-05-04 22:29:52 -07:00
Ryan Hill
b2c202d930 Remove extra debug messages
Try a more clean python shutdown through DllMain
2021-05-04 20:10:11 -07:00
Zhang Lei
f6cefc92e2
Add quantized value map after quantize input node added. (#7558) 2021-05-04 15:27:56 -07:00
Ryan Hill
076846190e Another python exit test 2021-05-03 21:11:10 -07:00
Tianlei Wu
3c9ece4a11
[transformers optimizer] catch symbolic shape inference exception and clean up (#7560)
catch symbolic shape inference exception.
no prune graph when there is inner graph (Loop/If/Scan)
add a wrapper for numpy_helper.to_array so that we can debug onnx graph without external data
remove fuse_mask that is not used any more in onnx_model_bert_tf.py
2021-05-03 20:42:13 -07:00
Ryan Hill
1ff2002534 Fix id_to_allocator_map 2021-05-03 19:34:22 -07:00
Tianlei Wu
731f9e5033
Fix symbolic shape inference for Unsqueeze (#7555)
* fix Unsqueeze shape inference
* add tests
2021-05-03 18:06:59 -07:00
Ryan Hill
9287e602b7 Revert "Test adding unload method for shared providers"
This reverts commit c427b78799.
2021-05-03 16:31:32 -07:00
Ryan Hill
acba6779df Revert "Python test"
This reverts commit c7ec2cfe98.
2021-05-03 16:30:48 -07:00
Ryota Tomioka
d1cb8c9dc9
Support negative indices and fix bound checking in symbolic shape inference for Slice (#7401)
* Use positivity everywhere; handle negative index in Slice

* limit positivity to inputs

* make handle_negative_index private

* strengthen sympy comparison

* further strengthen comparison and a minor refactoring

* Add flip test

* Fall through if -int_max in handle_negative_index()

* minor fix for infer_Concat to include initializers

* Add more tests

* use simplify

* more tests
2021-05-03 09:07:55 -07:00
Ryan Hill
c7ec2cfe98 Python test 2021-04-30 21:06:21 -07:00
Ryan Hill
c427b78799 Test adding unload method for shared providers 2021-04-29 22:41:11 -07:00
Changming Sun
1012535dab
Change onnxruntime::make_unique to std::make_unique (#7502)
1. Change onnxruntime::make_unique to std::make_unique
2. Add "-std=c++14" to ROCM EP's build flags.
2021-04-29 17:04:53 -07:00
Xiaoyu Liu
994c2ed420
GPT2 one step beam search update with configuration support (#7425)
* check in early stop search as separate type
* rename to beam search configurations
* update do sample configuration flag help
* rename to configurable search step
* add option groups
* add more unit tests

Co-authored-by: Xiaoyu Liu <xiaoyu@xiaoyu-VM.z4vh1dzj5eoevgybsksdpz2izh.jx.internal.cloudapp.net>
2021-04-29 13:19:56 -07:00
Ryan Hill
5a3a8fe2d0 Fix python shutdown 2021-04-28 23:53:28 -07:00
Lifu Huang
ab373d6f03
Lifhuan/force trt sequential (#7440)
* Support sequential TensorRT engine build.

* Add documentation.

* Add tests and fix typos.

* Fix missing field in pybind_state.
2021-04-28 13:59:37 -07:00
thilow
22d7cde725
Fix a 'Squeeze' related issue in symbolic_shape_infer.py (#7380)
* Update symbolic_shape_infer.py

don't rely on static code infer in _infer_Squeeze_

* checking if dropped axes might be != 1

* Checking opset. Logging assumption that symbolic dimensions are unequal to 1.

* more checks
2021-04-28 13:13:04 -07:00
Ryan Hill
9405a9cc72 Merge with master 2021-04-26 16:41:45 -07:00
Zhang Lei
ada0fbbd2d
Implement qlinear concat and unit test. (#7341)
* Implement qlinear concat and unit test.
Add quantization tools for QLinearConcat and its quantization tests.

* Add kernel def hash for QLinearConcat.

* Change according to PR. Add qdq transformer support for QLinearConcat.

* Add QDQ Transformer unittest. Fix typo on domain.

* remove dup logic of no use.

* fix x86 build error.

* Update operator docs.
2021-04-26 13:38:40 -07:00
Ryan Hill
06eac846b8 Add diagnostics 2021-04-23 21:39:41 -07:00
Tang, Cheng
1fa6d8fe1c
support loading external execution provider from python frontend (#7332)
* initial dynamic load example

* support load EP in the provider options

* support dynamic load EP in orttrainer

* split the provider interface; fix comments in pr

* remove experiment code

* add test

* remove useless file

* add test model file; fix linux break

* fix linux build and missing file

* fix python build

* fix python build

* fix python binding

* fix python test

* fix runtime path for posix env

* exclude the shared library from minimal build

* fix comments in pr;

* separate the provider shared lib loading

* excluded from minimal / macos / ios build

* skip copy the provider shared lib for minimal build and mac os

* fix macos build

* exclude the test for macos build

* exclude from android build

* exclude from web assembly build

* enable the invalid ep test

Co-authored-by: Cheng Tang <chenta@microsoft.com>
2021-04-23 09:54:09 -07:00
Ryan Hill
e74de85a53 Merge branch 'master' of https://github.com/microsoft/onnxruntime into ryanunderhill/cuda_shared 2021-04-21 19:55:50 -07:00
Thiago Crepaldi
771a6d235b
Fix IsContiguousTensor check on backend (#7391) 2021-04-21 17:01:17 -07:00
Ryan Hill
95fa78df38 Fix pybind 2021-04-20 22:40:38 -07:00
Xiaoyu Liu
913ea8264b
GPT2 with one step beam search (#7163)
* beam search refactoring checkin
* add factory class and deduplicate code
* one step beam search works on gpu

Co-authored-by: Xiaoyu Liu <xiaoyu@xiaoyu-VM.z4vh1dzj5eoevgybsksdpz2izh.jx.internal.cloudapp.net>
2021-04-20 06:23:52 -07:00
M. Zeeshan Siddiqui
6dda1e0681
Flag for tensor memory re-use in allocation planner. (#7359) 2021-04-16 17:53:25 -07:00
Tianlei Wu
aa9ab565f5
FastGelu fusion for Megatron model (#7344)
* add a fastgelu pattern from Megatron model

* update comment

* add test
2021-04-15 00:39:33 -07:00
Ryan Hill
80cae23393 Merge with master 2021-04-14 19:07:25 -07:00
Oliver Rausch
87bd836886
Fixes in symbolic shape inference (#7258)
* Add symbolic shape inference for Transpose

* Support steps in symbolic shape inference for Slice

* Add inference for BatchNormalization

* Address review changes

* Address review changes
2021-04-13 22:17:30 -07:00
Ryan Hill
4cf4cf3032 Fix python 2021-04-13 20:28:03 -07:00
Zhang Lei
f62db1a09c
quantization tools support qlinear average pool (#7309) 2021-04-13 18:22:42 -07:00
Zhang Lei
a4fdb4dbd9
Support transpose by merge Reshape etc into direct xint8 operators. (#7265)
* Support transpose by merge Reshape etc into direct xint8 operators.

* Add resize operator quantization support

* Add QDQ tests for resize, reshape, maxpool, transpose.
2021-04-08 18:00:35 -07:00
KeDengMS
0d49e53985
[Symbolic shape infer] fix scalar shape in Expand (#7285) 2021-04-08 10:26:28 -07:00
Maajid khan
27e778909d
[OpenVINO-EP] Enabling save/Load blob feature (#7054)
* Enabling save/Load blob feature for OpenVINO-EP

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Added changes to enhance save/load feature

->This feature applies only for MYRIAD device target
->cleaned up the code and added error checks

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Enabled the feature only for MyriadX and only for Linux

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Fixed compilation issues on windows

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Added changes to fix const subgraph issue

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Fixed issues on windows

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Added changes for the feature

-> Removed default location dir dump using cmake
-> Enabled saving blob dumps at the executable path
   by default

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Made save/load dump path configurable

-> The save/load blob dump path is now also made configurable
using the C/Python APIs.

-> Introduced a flag named blob_dump_path

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Minor fixes added

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Fixed python API issues

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Using GetEnvironmentVar to get the path

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Fixed python runtime option issue

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Fixes import network issue on windows

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
2021-04-07 20:59:16 -07:00
raviskolli
5d759e182b
Allocate external Rocm allocator via PyBind (#7148)
* Enabled rocm support for graph transformations

* Support for external Hip allocator

* Added const_cast to reinterpret_cast to fix compiler issue

* Another crack at fixing the compile error

* More compilation fixes

* Added compilation flags to load_inline extension

* Added ROCM, ROCM_PINNED constants

* Changes to address PR comments

* Changed gpu identifier from ROCM to CUDA

* Added HIP compilation flag for torch inline functions

* Fixed a typo in header allocator string formatting

* Fix for runtime error with external_cuda_allocator

* Removed cuda/rocm specific code paths for allocators

* More name changes to generic gpu from rocm/cuda

* Removed duplicate allocator creation

* Rename cuda_external_ config options as gpu_external_

* Rename hip_mem_limit to gpu_mem_limit

* Rename cuda_mem_limit to gpu_mem_limit
2021-04-06 15:23:51 -07:00
Olivia Jain
fb40602ea2
Mem trt (#6868)
* adding trt comparison and memory consumption

* creating separate docker file
2021-04-05 22:16:12 -07:00
Marek Šuppa
008065aab1
Update README.md (#7043)
* Fix the precision type (switch from nonexistent `int32` to `fp32`).
2021-04-05 10:03:14 -07:00
Weixing Zhang
74ee24cf7f
rename cuda_mem_limit and hip_mem_limit to gpu_mem_limit for both CUDA EP and ROCm EP (#7226)
With this change, differentiating CUDA EP and ROCm EP is not needed in training script when mem_limit option needs to be set.

Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
2021-04-05 09:04:04 -07:00