Commit graph

5286 commits

Author SHA1 Message Date
Edward Chen
e09321f4db
Update ORT format model conversion utility to optionally fail fast on model conversion failure. (#8589) 2021-08-03 11:12:56 -07:00
Weixing Zhang
deab284e4c
fix build failure with --cmake_extra_defines onnxruntime_DEBUG_NODE_INPUTS_OUTPUTS=1 (#8587)
* fix build failure with --cmake_extra_defines onnxruntime_DEBUG_NODE_INPUTS_OUTPUTS=1

* another compile error and add onnxruntime_USE_ROCM

* braces alignment

Co-authored-by: suffian khan <sukha@microsoft.com>
2021-08-03 09:02:49 -07:00
stevenlix
d14b08d09c
Update onnx-tensorrt parser and cgmanifest (#8585)
* update onnx-tensorrt parser and cgmanifest.json

* update cgmanifest
2021-08-02 18:55:33 -07:00
Maajid khan
9e07ad93ae
[OpenVINO-EP 2021.4] Add/update Dockerfiles w.r.t OpenVINO 2021.4 Version (#8491)
* Implement multi-stage Dockerfile

- Reduces image size from 2.3 GB to 1.46 GB.
- Uses Ubuntu based OpenVINO image as base image leading to fewer
required instructions
- Does not include unnecessary build time components in deploy image

* Remove wget after usage

* Uninstall wget in the same RUN statement

Avoids re-distributing wget package in any of the layers

* Update License header according to Intel guidelines

Updated the license header according to Intel corporate guidelines.

* Use Ubuntu18's default Python3

Don't install Miniconda and use the default Python3 provided by
the base Ubuntu 18 OS.

* OpenVINO EP with CentOS7

Dockerfile to build ONNX RT with OpenVINO EP with a CentOS 7 base.

* Dockerfile documentation changes

Updated documentation to show the latest docker image location and
usage details.

* updated ov-ep doc link

* Temporarily disabling VAD-M due to regression

* fix for vad-m daemon config setting

* Revert "Temporarily disabling VAD-M due to regression"

This reverts commit c503bea38397f332b220321823e0ca1c55f4aab3.
VAD-M issue fixed. this is no longer needed

* Revert "Revert "Temporarily disabling VAD-M due to regression""

This reverts commit 7ca53feb2ba585c050be81770698f9abae8dbe28.

* Revert "fix for vad-m daemon config setting"

This reverts commit 9964f8452194655c0b988bd8472da45996deca38.

* Ubuntu Dockerfile update w.r.t 2021.4

This dockerfile uses openvino 2021.4 runtime
base image from OpenVINO.

uses onnxruntime 1.8 release branch to generate the
image.

Added fix for VADM HDDL

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Added new dependency in deploy stage

Added sources for all the dependency
packages of unattended-upgrades package
which had GPL license into deploy stage.

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Updated CentOS Dockerfile to the latest 2021.4

-Dockerfile updated
-VADM Fix added

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Updated c# openvino dockerfile w.r.t 2021.4

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Updated the ubuntu dockerfile branch and repo

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Updated Dockerfile Documentation w.r.t 2021.4

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Updated GCC version to 10 for centos dockerfile

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

Co-authored-by: S. Manohar Karlapalem <manohar.karlapalem@intel.com>
2021-08-02 15:13:46 -07:00
Edward Chen
717627775a
Increase build timeout (#8583) 2021-08-02 14:50:01 -07:00
stevenlix
ee99fb400c
Upgrade TensorRT to v8.0.1 (#8512)
* update onnx-tensorrt parser to master

* disable unsupported tests

* add cuda sm 75 for T4

* update tensorrt pipeline

* update trt pipelines

* update trt pipelines

* Update linux-gpu-tensorrt-ci-pipeline.yml

* update trt cid pipeline

* Update linux-gpu-tensorrt-ci-pipeline.yml

* Update Tensorrt Windows build pool and TensorRT/CUDA/CuDNN version

* update to cuda11.4 in trt ci pipeline

* update base image to cuda11.4

* update packaging pipeline to cuda11.4

* clean up

* remove cuda11.1 and cuda11.3 docker file

* disable unsupported tensorrt tests at runtime

* Update linux-multi-gpu-tensorrt-ci-pipeline.yml
2021-08-02 11:20:31 -07:00
satyajandhyala
87975bdeef
Use CUDA_HOME and CUDNN_HOME from the environment if they are not specified on the command line. (#8575) 2021-08-02 09:18:44 -07:00
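The fallback described in this commit can be sketched as follows. This is a minimal illustration, not the build script's actual code: the helper name `resolve_cuda_paths` is hypothetical, and the flag names `--cuda_home` and `--cudnn_home` are assumed here for the example.

```python
import argparse
import os


def resolve_cuda_paths(argv, env=None):
    """Return (cuda_home, cudnn_home), preferring command-line values.

    Hypothetical helper: falls back to the CUDA_HOME / CUDNN_HOME
    environment variables when a flag is not given on the command line.
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser()
    parser.add_argument("--cuda_home")   # assumed flag name
    parser.add_argument("--cudnn_home")  # assumed flag name
    args = parser.parse_args(argv)

    # A command-line value wins; otherwise use the environment, if set.
    cuda_home = args.cuda_home or env.get("CUDA_HOME")
    cudnn_home = args.cudnn_home or env.get("CUDNN_HOME")
    return cuda_home, cudnn_home
```

Passing the environment as a parameter keeps the helper easy to test without modifying the real process environment.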
Changming Sun
49a6ff75e6
Update py-packaging-stage.yml (#8569) 2021-08-02 09:17:15 -07:00
KeDengMS
d8c145d218
[Nuphar] don't transpose B if A is a 1D array (#8568)
don't transpose B if A is a 1D array

Don't transpose and pre-pack B if A is a 1D array, because
we only handle non-transposed case when we compute MatMul's
shape in codegen/mti/math/matmul_ops.cc

Co-authored-by: Yang Chen <yanchen@microsoft.com>
2021-07-31 00:12:47 -07:00
Changming Sun
0510688411
Update compliance tasks in python packaging pipeline and fix some compile warnings (#8471)
1. Update SDLNativeRules from v2 to v3. The new one allows us to set excluded paths.
2. Update TSAUpload from v1 to v2. And add a config file ".gdn/.gdntsa" for it.
3. Fix some parentheses warnings
4. Update cmake to the latest.
5. Remove "--x86" build option from pipeline yaml files. Now we can auto-detect cpu architecture from python. So we don't need to ask user to specify it.
2021-07-30 17:16:37 -07:00
Tianlei Wu
330b8e74bd
Fix attention parity for GPT-2 (#8549)
* Use persistent softmax for parity with huggingface
* fix unidirectional mask logic
* add test
2021-07-30 16:49:20 -07:00
baijumeswani
816ad86d14
Configuring ORTModule - Internal Options (#8537) 2021-07-30 13:05:32 -07:00
Scott McKay
c6f95841dc
Add HardSigmoid to mobile packages. Used by PyTorch MobileNet v3 (#8552) 2021-07-30 12:08:11 +10:00
Guoyu Wang
464fd28ee9
Update iOS packaging script to default build static framework, disable bitcode (#8533)
* default package build to static, disable bitcode

* fix pipeline failure

* Address CR comments
2021-07-29 17:28:02 -07:00
Ye Wang
ad093b94b9
Restore transformers tests and disable some tests (#8530)
* restore transformers tests and disable some tests

* test

* update

* pass pep8 check

* update
2021-07-29 14:09:36 -07:00
Rachel Guo
0cf2ed029b
Add python binding for CoreML EP (#8472)
* add pybind binding for coreml ep

* update merged files

* address comments

* format

* remove lines for non-macOS platform

Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
2021-07-29 10:06:47 -07:00
KeDengMS
d243b38929
[Symbolic Shape Infer] Bump up required onnx ver
And remove some stale comments in build.py
2021-07-29 09:36:20 -07:00
Tang, Cheng
94c54718fb
fix build break (#8536) 2021-07-28 21:09:43 -07:00
Xiang Zhang
778680202b
remove unused functions to avoid warnings 2021-07-28 18:03:00 -07:00
satyajandhyala
5e2f4263db
Enable cast propagation in the frontend. (#8517) 2021-07-28 17:06:49 -07:00
Tang, Cheng
00d8f8ce95
enable shared lib based execution provider test on linux (#8480)
* enable shared lib test on linux

* fix build break

* add onnx dependency

* add rpath

* skip the test for linux training

* set ONNX_ML definition

* install training python dependency

* update

* fix format; add eigen include folder

* fix format

* skip amd build

* enable shared provider on training

* fix comments in pr

Co-authored-by: Ubuntu <chenta@chenta-orttraining-cpu.bxgbzpva45kedp3rhbsbit4phb.jx.internal.cloudapp.net>
Co-authored-by: Changming Sun <chasun@microsoft.com>
2021-07-28 16:58:13 -07:00
baijumeswani
2e28cbaa64
Configuring ORTModule - End User Facing Options (#8470) 2021-07-28 10:51:43 -07:00
Changming Sun
6f5bf8b8f2
Update Linux Training CPU CI pipeline (#8518) 2021-07-28 10:25:52 -07:00
Sherlock
1370cbe256
[ORTModule] Extract output schema in module's true train/eval mode (#8516)
* Extract output schema in module's true train/eval mode
2021-07-28 09:55:07 -07:00
mindest
a71dab691d
Implement BatchNormInternal for cuda (#8172)
* correct batchnorm replacement output order;

remove bn replacement in grad graph builder

* update op defs and kernel class

* implement batch norm internal and grad.

* change saved_var into saved_inv_std

* cuda test case: bn internal

* remove redundant include

* fix comment; add support and UT for 1d input.

* exclude batch_norm_internal in amd_hipify

* run BNInternal UT for CUDA only

* fix CI error

* fix comment errors

* fix error

* add comment for inconsistency with cudnnBN doc

* additional comments for cudnnBN inconsistency
2021-07-28 16:04:49 +08:00
Tracy Sharpe
539d1d44c1
Optimize ARM64EC build (#8515)
Add sgemm and qgemm optimized kernels for ARM64EC configuration.
2021-07-27 23:46:39 -07:00
Vincent Wang
1798698545
avgpool2d atenop (#8507) 2021-07-28 14:04:55 +08:00
Xiang Zhang
73660d78df
Fix WinML build warnings in HStringFromUTF8 (#8519) 2021-07-27 22:29:58 -07:00
Yufeng Li
ceeb1a65d6
Add quantization support of GEMM directly with QGemm (#8447)
QGemm takes in quantized A, B, C, and quantization parameters of output Y, in which C and quantization parameters of Y are optional. Its output can be quantized or full precision, depending on whether the quantization parameters of Y exist. If the quantization parameters of Y are provided, the output is requantized; otherwise it is full precision.

Compared with QLinearMatMul and MatMulInteger, QGemm supports the transpose, alpha, and beta attributes.

The formula for quantized GEMM is:
Y = alpha * scale_a * scale_b * ((A_int8 - zp_a) * (B_int8 - zp_b) + C_int32), in which,
C_int32 is quantized with formula: C_int32 = (beta * C) / (alpha * scale_a * scale_b)
2021-07-27 21:21:49 -07:00
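The quantized GEMM formula in the commit message above can be checked with a small pure-Python reference. This is only a sketch under stated assumptions: the function name `qgemm_reference` is hypothetical, per-tensor scales and zero points are assumed, C is assumed to have the full M x N output shape, and rounding `C_int32` to the nearest integer is an assumption the message does not spell out. It covers only the full-precision output case, with no transpose attributes.

```python
def qgemm_reference(a_q, b_q, scale_a, zp_a, scale_b, zp_b,
                    alpha=1.0, beta=1.0, c=None):
    """Full-precision Y for quantized A (M x K) and B (K x N).

    Y = alpha * scale_a * scale_b * ((A_q - zp_a) @ (B_q - zp_b) + C_int32)
    C_int32 = (beta * C) / (alpha * scale_a * scale_b)
    """
    m, k, n = len(a_q), len(b_q), len(b_q[0])
    s = alpha * scale_a * scale_b  # combined output scale
    y = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            # Integer accumulation over the zero-point-shifted operands.
            acc = sum((a_q[i][p] - zp_a) * (b_q[p][j] - zp_b)
                      for p in range(k))
            if c is not None:
                # Assumption: C is quantized by rounding to nearest.
                acc += round(beta * c[i][j] / s)
            y[i][j] = s * acc
    return y
```

With `a_q = [[3, 5]]`, `zp_a = 1`, `scale_a = 0.5`, `b_q = [[2], [4]]`, `zp_b = 0`, `scale_b = 0.25`, and `c = [[1.0]]`, the dequantized operands are `[[1.0, 2.0]]` and `[[0.5], [1.0]]`, so the result matches the float computation `1.0 * 0.5 + 2.0 * 1.0 + 1.0`.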
Zhang Lei
0f46b08646
improve the qlinear avg pool perf (#8514)
* use context buffer allocator, remove init cost of vector
* using lookup table to dequantize large input
* fall back to global average pool if it is
2021-07-27 20:56:59 -07:00
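The lookup-table dequantization mentioned in this commit can be illustrated with a short sketch. It assumes uint8 inputs with a single scale and zero point; the function names are hypothetical and do not come from the ONNX Runtime source.

```python
def build_dequant_table(scale, zero_point):
    """Precompute the real value for each of the 256 possible uint8 codes."""
    return [scale * (q - zero_point) for q in range(256)]


def dequantize(values, table):
    """Dequantize by table lookup instead of one multiply per element."""
    return [table[q] for q in values]
```

Building the table costs 256 multiplications once, which is why the approach pays off for large inputs, where each element then needs only an index operation.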
Tim Harris
56441dcd88
Limit work items to available threads, upgrade checks from assert to ORT_ENFORCE (#8495) 2021-07-27 19:25:12 -07:00
Sherlock
686f9b530b
ORTModule set_seed in int (#8511) 2021-07-27 15:43:13 -07:00
Tracy Sharpe
7d47175f76
cleanup NCHWc transformer (#8479) 2021-07-27 15:39:10 -07:00
ashari4
3850755feb
Fix: onnxruntime_eager library does not compile on Windows due to path string constant (#8487) 2021-07-27 15:15:18 -07:00
Oliver Rausch
1685ab8138
Implement Concat with Strided copy (#8336)
Adds a StridedCopy function that implements a copy from strided tensor to another.

This parallelizes the Concat operator, and can also be used in the future to parallelize many other data movement operators (e.g. Transpose, Split, etc.).
This operation is also required for the proposed data layout extensions to ORT.
2021-07-27 18:27:56 +02:00
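The idea of building Concat on a strided copy can be sketched in pure Python for the 2-D case. This is a simplified illustration with hypothetical signatures, not the `StridedCopy` function from the commit: tensors are flat row-major lists, and concatenation is along axis 1 only.

```python
def strided_copy(dst, dst_offset, dst_strides, src, src_strides, shape):
    """Copy a rows x cols block between flat buffers with different strides."""
    rows, cols = shape
    for r in range(rows):
        for c in range(cols):
            dst_index = dst_offset + r * dst_strides[0] + c * dst_strides[1]
            dst[dst_index] = src[r * src_strides[0] + c * src_strides[1]]


def concat_axis1(inputs, rows):
    """Concatenate flat row-major 2-D inputs along axis 1."""
    widths = [len(x) // rows for x in inputs]
    total = sum(widths)
    out = [None] * (rows * total)
    offset = 0
    for x, w in zip(inputs, widths):
        # Each input lands in its own column range of the output, so the
        # copies touch disjoint elements and could run in parallel.
        strided_copy(out, offset, (total, 1), x, (w, 1), (rows, w))
        offset += w
    return out
```

For example, concatenating the 2 x 2 input `[1, 2, 3, 4]` with the 2 x 1 input `[5, 6]` gives the 2 x 3 output `[1, 2, 5, 3, 4, 6]`.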
Guoyu Wang
4c939e1cb7
Add an option to use the input model bytes (ORT format only) directly without copy at session creation (#8502)
* Do not copy the model_data when session is started by CreateSessionFromArray

* Add config option for disabling copy model bytes

* Add one additional test

* Address CR comments
2021-07-27 09:11:42 -07:00
ytaous
1ae32655b3
fix t5 assert error (#8501)
Co-authored-by: Ethan Tao <ettao@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2021-07-27 09:04:01 -07:00
Edward Chen
b4baac888c
[NNAPI EP] Make partitioning stop ops configurable from Python API. (#8484) 2021-07-27 08:16:47 -07:00
Edward Chen
421c4059c0
[iOS Packaging] Update build definition (#8503)
* Add build number into version.

* Add parameter for archive upload.
2021-07-27 08:16:02 -07:00
KeDengMS
0a70c2de00
[Nuphar] Add support for opset 14 (#8483)
- For ops used in quantized LSTM
- Update nuphar model editing/quantizer scripts
2021-07-27 06:13:47 -07:00
Ankur Verma
91936864ce
Expose additional shared_provider APIs (#8478) 2021-07-26 18:03:12 -07:00
Tianlei Wu
534c22d769
use float for alpha in attention Gemm (#8477) 2021-07-26 11:04:56 -07:00
Xavier Dupré
a9fc3c448c
Improves documentation, show InferenceSession constructor attributes (#8494)
* include constructor parameters in the python documentation
* expose more classes into the documentation
2021-07-26 15:58:47 +02:00
Tianlei Wu
79097ef553
remove useless reshape node (#8419) 2021-07-23 18:12:21 -07:00
Viswanath Boga
6dee9b9d2d
attention fusion kernel refactoring (#8432)
* attention fusion kernel refactored

* consider the case of none in add_qk

* variable added to check for pre-pack weights

* added a comment to PrePack()

* Optimized prepack and try to free the weights

* making comment sound better

* fixing a bug with optimizer.py

* commented out changes to be done

* removed comments

* make the private fn() private

* fix build

* making clean up fn static

* backed out optimizer tool change, needs more looking into
2021-07-23 17:46:39 -07:00
Ryan Hill
a396c9e572
Add more safety checks to the C API (#8474) 2021-07-23 15:41:27 -07:00
Ye Wang
6a07172a93
Restore cpu affinity after loading tensorflow model from transformers (#8448)
* Update onnx_exporter.py

* update

* review comments
2021-07-23 15:20:44 -07:00
Yulong Wang
e66846da4a
revise terms according to guideline 2021-07-23 13:26:15 -07:00
ytaous
ab5289f109
Performance: enable faster training with skip checks config (#8411)
* freeze/fastpath support

* more comments on _fast_path

* per comments

* minor fix

* IntFlag improve

* address comments

Co-authored-by: Ethan Tao <ettao@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2021-07-23 10:23:13 -07:00
Vincent Wang
c8d210de29
Decouple Forward and Backward of ATenOp (#8301)
* atenop for inference

* assert if dtype mismatch

* atenop config in frontend

* fix orttrainer test

* gradient def not only for ATenOp

* bugfix

* fix gradient input shape and type issue

* fix after merge master
2021-07-23 16:53:26 +08:00