Commit graph

11997 commits

Author SHA1 Message Date
Tianlei Wu
289999af35
disable half2 kernel by default (#9034) 2021-09-10 20:09:21 -07:00
Tang, Cheng
8eb6546e8e
enable eager mode with ortmodule (#8961)
* initial change for eager/ortmodule integration

* update to latest pytorch api

* add test model; fix torch version issue

* fix comments in pr

* fix python test break

* fix api change

* fix comments in PR

* pass device into the fw function
2021-09-10 15:09:23 -07:00
Edward Chen
29d6573f3d
Increase timeouts for Mac CI builds. (#9024)
Increase timeouts for "orttraining-mac-ci-pipeline" and "iOS CI Pipeline" CI builds.
2021-09-10 12:57:08 -07:00
Chen Fu
b3c2725862
fix cpuinfo compilation flag usage (#9029)
Co-authored-by: Chen Fu <fuchen@microsoft.com>
Bug was introduced from PR #8716

When restricting cpuinfo to only known platforms, compilation flag change was not thorough, which accidentally turned off hybrid core detection for ARM systems.

This PR fixes this bug
2021-09-10 12:43:38 -07:00
satyajandhyala
ce7b12bf5d
Added new fp16 allow/safe opcodes in PropagateCastOps (#8964)
* Removed RemoveInputOutputUpDownCasts strategy in PropagateCastOps.

* Added Expand, Squeeze and Unsqueeze ops to fp16 allow ops

* Added onnx models for squeeze/unsqueeze tests.
2021-09-10 11:53:26 -07:00
Bowen Bao
31af88c0bc
Update cross_entropy_loss symbolic for new argument from upstream torch (#9007)
In torch 1.10, `label_smoothing` is added as additional input to `cross_entropy_loss`. Update the symbolic function to handle this change.
2021-09-10 10:32:59 -07:00
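As background for the `label_smoothing` argument mentioned in the commit above, here is a minimal pure-Python sketch of what label smoothing does to a cross-entropy loss: the one-hot target is mixed with a uniform distribution over the classes. This is not the symbolic function changed in the PR; the function name `cross_entropy` and its signature are illustrative assumptions.

```python
import math


def cross_entropy(logits, target, label_smoothing=0.0):
    """Cross-entropy for one sample, with optional label smoothing.

    Hypothetical helper for illustration only; it is not ORT or torch code.
    """
    # Numerically stable log-softmax.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]

    num_classes = len(logits)
    loss = 0.0
    for c, log_p in enumerate(log_probs):
        # Smoothed target: uniform mass everywhere, remaining mass on the true class.
        q = label_smoothing / num_classes
        if c == target:
            q += 1.0 - label_smoothing
        loss -= q * log_p
    return loss


plain = cross_entropy([2.0, 0.0], target=0)
smoothed = cross_entropy([2.0, 0.0], target=0, label_smoothing=0.5)
```

With a confident, correct prediction, the smoothed loss is larger than the plain one, because part of the target mass now sits on the other class.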
Zuwei Zhao
ff66cfdfa6
Enable linking in exception throwing support library when building onnxruntime wasm. (#8973)
* Enable linking in exception throwing support library when building onnxruntime webassembly containing onnxruntime-extensions.

* Add flag in build.py to enable linking the exception throwing library.

* Update onnxruntime-extensions document and bind custom_ops build flag with use_extensions.

* Update doc.

* Update cgmanifest.json.

Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com>
2021-09-10 22:09:16 +08:00
Tianlei Wu
e5ee0b435d
Attention Fusion for GPT-2 from Megatron (#8987)
(1) Attention Fusion for gpt-2 model from Megatron.
(2) Update symbolic shape inference of Attention to support 4D mask.
(3) Add an option in save_model_to_file to save external data in one file or not, and warn about existing external data
(4) Fix deprecation: logger.warn => logger.warning
(5) Add model loader to test model without external data
(6) Add an API of optimize_by_fusion, and topological sort after optimization.
2021-09-10 00:29:40 -07:00
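Item (4) in the commit above refers to Python's standard `logging` module, where `Logger.warn` is a deprecated alias of `Logger.warning`. A minimal sketch of the preferred call follows; the logger name and the message are made up for illustration and are not taken from the PR.

```python
import io
import logging

# Capture output in memory so the example is self-contained.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))

logger = logging.getLogger("fusion_example")  # hypothetical logger name
logger.setLevel(logging.WARNING)
logger.addHandler(handler)
logger.propagate = False  # keep the record out of the root logger

# Preferred spelling; logger.warn(...) is the deprecated alias.
logger.warning("external data file already exists: %s", "model.onnx.data")

output = stream.getvalue()
```

Using `warning` avoids the deprecation warning and keeps the code working on Python versions where the alias is no longer available.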
Du Li
57b7ab56cd
Adding async fetching for webgl backend (#8951)
* Adding async fetching for webgl backend

* fix PR comments and CI failure.

* fixing a bug

* adding a flag
2021-09-09 22:17:42 -07:00
Yulong Wang
5145fa236f
[js/web] fix ort web e2e test (#9025) 2021-09-09 22:08:27 -07:00
Ryan Hill
2439ced3ec
API Documentation (#8948)
* Make help information compile properly
2021-09-09 22:04:51 -07:00
liqun Fu
6412c6a362
do not add pkg wheel entry to the index html file if it already exists (#9004)
* do not add pkg wheel entry to the index html file if it already exists
2021-09-09 16:20:19 -07:00
Gary Miguel
e357022362
Remove onnxruntime team from CODEOWNERS (#8954)
There are currently 98 members in the team. Requesting review from
all of them for every PR is too noisy.
2021-09-09 15:26:59 -07:00
Spike Curtis
00fbc3b0bc
Instruct dockerfile users to do submodule updates
Signed-off-by: Spike Curtis <spike@lodestar.ai>
2021-09-09 11:17:21 -07:00
baijumeswani
d78e90d1af
Adding preprocessor checks for torch version during torch cpp extensions compilation (#8989) 2021-09-09 10:26:38 -07:00
Chi Lo
0367e1f1c2
Update Nuget Package Pipeline to CUDA11.4 and TensorRT8 on Windows (#9000)
* Update to CUDA11.4 and TensorRT-8.0.3.4

* update trt pool, remove cudnn from setup_env_gpu.bat

* revert pool

* test gpu package pipeline on t4

* back out changes

* back out changes

Co-authored-by: George Wu <jywu@microsoft.com>
2021-09-09 06:56:37 -07:00
pengwa
d209fe29b9
custom autograd func memory refinement (#8993)
* Release torch tensor referenced by torch gradient graph (created in PythonOp)

* Update orttraining/orttraining/python/training/ortmodule/torch_cpp_extensions/torch_interop_utils/torch_interop_utils.cc

* refine with comments

Co-authored-by: Wei-Sheng Chin <wschin@outlook.com>
2021-09-09 18:37:24 +08:00
Pranav Sharma
d39959172f
Fix fuzz testing build blocking release. (#9008) 2021-09-09 00:44:40 -07:00
Guoyu Wang
1533f574e4
Add full Android job in package pipeline (#9009)
* Add full Android job in package pipeline

* Address CR comments
2021-09-08 21:12:59 -07:00
Hariharan Seshadri
c20cb766be
Optimize sequence type usage on CUDA [3/n] (#9002) 2021-09-08 16:01:38 -07:00
Yulong Wang
2e8792ca42
[js/web] fix karma launch with chrome headless (#8998) 2021-09-08 11:52:41 -07:00
Ashwini Khade
ec63d10303
add model local function support (#8540)
* updates for picking onnx commit

* add tests filter to c# tests

* plus test fixes

* fix versioning for contrib ops

* fix tests

* test filter for optional ops

* more versioning related updates

* fix test

* fix layernorm spec

* more updates

* update docs

* add more test filters

* more filters

* update binary size threshold

* update docs

* draft - enable model local function

* enable model local functions in ORT

* update to latest rel onnx commit

* plus tests

* plus more updates

* plus updates

* test updates

* Fix for nested functions + shape inference

* plus bug fix and updates per review

* plus fixes per review

* plus test updates

* plus updates per review

* plus fixes

* fix a test
2021-09-08 11:47:01 -07:00
Vincent Wang
b7b42e0c5d
fast reduction for reducemean (#8976) 2021-09-08 10:28:57 -07:00
stevenlix
1c872f9d74
Fix issues in TensorRT EP (#8996)
* fix big engine load issue and add cuda_cpu_alloc

* remove redundancy

* fix minor issues
2021-09-08 10:28:16 -07:00
Olivia Jain
6fbd0a8233
Change cmake_cuda_architectures to double quotes (#8990) 2021-09-08 09:41:52 -07:00
Chi Lo
5ae4c54ab8
Fix bug for validating GPU packages (#8997) 2021-09-08 02:06:53 -07:00
George Wu
a30d9f5317
fix windows gpu pipelines that use cuda 10.2 (training, reduced_ops and 10.2 validation) (#8994)
* build for arch 52

* arch 52

* gpu arch 52
2021-09-07 22:01:06 -07:00
Sunghoon
450524359e
[js/web] WebAssembly profiling (#8932)
* add p50 in test

* Preallocate WebAssembly worker threads to minimize worker creation time

* WebAssembly profiling

* merge master

* merge with proxy changes

* disable profiling tests from WebAssembly build

* fix e2e test failure

Co-authored-by: Yulong Wang <yulongw@microsoft.com>
2021-09-07 17:18:08 -07:00
ytaous
0193490cbf
ReduceMin - add int64 cuda kernel support for opset12/13 (#8966)
* ReduceMin - int64 support

* fix doc

Co-authored-by: Ethan Tao <ettao@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2021-09-07 17:01:26 -07:00
Changming Sun
91c15843cd
Fix a directml python packaging error (#8981) 2021-09-07 16:29:33 -07:00
Ye Wang
e2194797a7
bumping up to version 1.9 (#8982)
* bump up version

* makes the WindowsAI column align with ORT version

* update the hardcoded version string

* fix a typo
2021-09-07 14:30:55 -07:00
George Wu
00eca42413
cmake_policy(SET CMP0104 OLD) (#8793) 2021-09-07 13:12:50 -07:00
Ryan Hill
b7971575f8
Fix python manylinux to not load cuda if it fails to load dependencies (#8882)
* Fix python manylinux to not load cuda if it fails to load dependencies
2021-09-07 11:09:25 -07:00
Changming Sun
0bb56a18cf
Add TRT header file to ORT GPU nuget package (#8962) 2021-09-07 09:50:09 -07:00
senysenyseny16
3be96f8a15
fix: import error in TrtTable::Dict method (#8940) 2021-09-07 00:28:49 -07:00
Ye Wang
5d47b2e431
Add Einsum and Reciprocal op support in symbolic shape inference (#8931)
* fix 1

* fix 2

* update

* support einsum

* format

* test

* format

* add test for einsum
2021-09-06 16:54:48 -07:00
Changming Sun
60c98a86b7
CMake file changes for macOS universal2 support (#8953) 2021-09-04 13:30:33 -07:00
stevenlix
a9776d1c70
Add QDQ model support in TensorRT EP (#8969)
* disable setting dynamic range for QDQ model

* update cgmanifest

* Update cgmanifest.json
2021-09-03 19:33:34 -07:00
ytaous
53eb79f9f6
Gemm/Transpose fusion - additional pattern coverage (#8941)
* gemm transpose fixes

* enforce condition

* add comments

* rm redundant code

Co-authored-by: Ethan Tao <ettao@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2021-09-03 15:24:47 -07:00
Scott McKay
eebcc20f10
Add netstandard2.0 framework to nuget managed package. (#8960)
* Add netstandard2.0 to nuget managed package.
Re-does PR that was backed out due to packaging pipeline changes.
Allows deprecation of netstandard1.1 in the following release as netstandard2 is the preferred lowest level framework.
2021-09-04 08:01:46 +10:00
Olivia Jain
a0c9408f0d
Make TRT Version Configurable (#8864)
* copy changes from trt_and_mem

* second edits

* Update linux-gpu-tensorrt-ci-perf-pipeline.yml for Azure Pipelines

* Update linux-gpu-tensorrt-ci-perf-pipeline.yml for Azure Pipelines

* Update linux-gpu-tensorrt-ci-perf-pipeline.yml for Azure Pipelines

* change to cuda 11.4

* build with cuda 11.4

* Update Dockerfile.ubuntu_cuda11_1_tensorrt7_2

* add cmake extra defines

* cmake architectures

* fix cmake arch

* Delete ubuntu-18.04.Dockerfile

* Rename Dockerfile.ubuntu_cuda11_1_tensorrt7_2 to Dockerfile.ubuntu_cuda11_4_tensorrt7_2

* Update linux-gpu-tensorrt-ci-perf-pipeline.yml

* Update linux-gpu-tensorrt-ci-perf-pipeline.yml for Azure Pipelines

* removing previous ort args

* rename to cuda 11.4

* remove cuda 10_2

* delete trt 7.1

* remove 7.1

* Passing in cuda architecture to reduce build time

* always add submodule sync due to recursive cloning

* fix run command

* add and

* take away unused arms and share python installation script

* Update linux-gpu-tensorrt-ci-perf-pipeline.yml

* Update Dockerfile.tensorrt

* cleanup file

* install python directly on dockerfile - move to scripts in future

* Update Dockerfile.custom-trt-perf

* adding cuda 11.1 for missing libnvrtc.so.11.1

* Delete install_python.sh
2021-09-03 13:32:27 -07:00
Chi Lo
1f576e1766
Detect necessary files inside GPU packages (#8955)
* Rename files

* Update YAML files

* Update validation script and YAML
2021-09-03 13:28:28 -07:00
liqun Fu
a7f5bd226b
retarget torch181 to torch182 (#8947)
Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2021-09-03 09:44:42 -07:00
baijumeswani
0cc2909573
Auto forward non method attribute lookups to the user's model and bind custom methods to ORTModule (#8798) 2021-09-03 08:25:44 -07:00
Vincent Wang
c343f7cb43
Add Algorithm Search for ConvGrad (#8613)
* algo search for conv grad

* global cache, bigger workspace size

* fix build error

* refactor

* refactor

* resolve comments

* fix rocm

* change lock places

* rename variable

* remove setting for inference

* resolve comments
2021-09-03 11:25:17 +08:00
Tianlei Wu
91f05f387a
Update embed layer norm fusion to work with transformers v4.9 (#8914) 2021-09-02 19:48:07 -07:00
Hariharan Seshadri
e348929019
Minor cleanup from #7592 (#8952) 2021-09-02 18:46:57 -07:00
Scott McKay
5f30be3e92
Exclude training support from BatchNorm in minimal build (#8939)
* Exclude changes to BatchNorm that are training specific from minimal build.

Previous changes [excluded](https://github.com/microsoft/onnxruntime/pull/7704) training specific code but that was recently [undone](https://github.com/microsoft/onnxruntime/pull/8269) to support a pytorch CI need that isn't relevant to minimal builds.
2021-09-03 08:02:19 +10:00
Gary Miguel
47435311f4
Include pytorch_export_contrib_ops in inference builds (#8878)
* Include pytorch_export_contrib_ops in inference builds

Rename / move it from tools/python/register_custom_ops_pytorch_exporter
to onnxruntime/python/tools/pytorch_export_contrib_ops.

Rationale for inclusion in inference builds:
This code is potentially useful for anyone using ORT, not just training.

Rationale for new name:
"Contrib op" is the nomenclature used within ORT to refer to the set of
ops that are not in the standard op set but are included by default with
ORT. This is more specific than "custom op", which is what the PyTorch
exporter uses to refer to any non-standard op.

Step 1 of addressing #8818. After this is merged I will update the docs.

* Enable test_pytorch_export_contrib_ops.py in CI

Fixes AB#1342330
2021-09-02 14:26:58 -07:00
Gary Miguel
06bb2ec561
ignore direnv configs (#8861)
https://direnv.net/ is a useful tool but its configs are developer-specific
2021-09-02 11:53:57 -07:00