Commit graph

20 commits

Author SHA1 Message Date
yf711
0ad0d6ebbf
Unblock Linux MultiGPU TensorRT CI (#16446)
### Description
Revert docker base image to
nvidia/cuda:11.8.0-cudnn8-devel-ubuntu20.04@sha256:b754c43fe9d62e88862d168c4ab9282618a376dbc54871467870366cacfa456e

### Motivation and Context
The default image environment of nvidia/cuda:11.8.0-cudnn8-devel-ubuntu20.04
received a minor upgrade, which makes the Linux MultiGPU TensorRT CI (an NV12
instance with a Maxwell GPU) fail on three CApiTestGlobalThreadPoolsWithProvider
tests (those three tests produce errors above the tolerance).

That minor upgrade includes cuDNN 8.7.0 -> 8.9.0, which might be a factor that
makes the Maxwell GPU generate higher error. CIs with T4 GPUs are not affected.
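
A quick way to catch this kind of silent image drift is to query the cuDNN version inside the container; a minimal sketch (not part of this PR):

```cpp
// Sketch: print the cuDNN version the image actually loads, to catch silent
// minor upgrades like 8.7.0 -> 8.9.0. Build inside the container with:
//   g++ cudnn_version.cc -lcudnn -o cudnn_version
#include <cudnn.h>
#include <cstdio>

int main() {
  // CUDNN_VERSION is what the headers advertise at compile time;
  // cudnnGetVersion() is what the shared library reports at runtime.
  std::printf("compiled against cuDNN %d\n", CUDNN_VERSION);
  std::printf("loaded cuDNN library %zu\n", cudnnGetVersion());
  return 0;
}
```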
2023-06-21 17:15:39 -07:00
yf711
d701dcd027
Fix Linux MultiGPU TensorRT CI (#15697)
### Description
* Reverting the default TensorRT version to 8.5 as a temporary fix

* Apart from that, this PR temporarily leaves this CI as a place to
validate user behavior on TRT 8.5 with the latest ORT

### Context
* This CI pool is equipped with 2x Tesla M60 GPUs, which are no longer
supported by TensorRT 8.6.
* Currently, other CIs use a single-T4 VM, but there is no VM with 2x T4
or another suitable dual-GPU option in the range.
* Once we decide which VM instance this CI should migrate to, TRT 8.6 can
be enabled on this CI.

* According to
[Nvidia](https://docs.nvidia.com/deeplearning/tensorrt/release-notes/index.html):
* TensorRT 8.5.3 was the last release supporting NVIDIA Kepler (SM 3.x)
and NVIDIA Maxwell (SM 5.x) devices. *These devices are no longer
supported in TensorRT 8.6*. NVIDIA Pascal (SM 6.x) devices are
deprecated in TensorRT 8.6.
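
As a small illustration of that support boundary, a sketch that lists each GPU's compute capability and flags the ones TRT 8.6 dropped (it assumes only the CUDA runtime API and is not part of the pipeline):

```cpp
// Sketch: list each GPU's compute capability and flag devices below SM 6.0,
// which TensorRT 8.6 dropped (Kepler SM 3.x, Maxwell SM 5.x such as the M60).
// Build with: nvcc sm_check.cu -o sm_check
#include <cuda_runtime.h>
#include <cstdio>

int main() {
  int count = 0;
  if (cudaGetDeviceCount(&count) != cudaSuccess) return 1;
  for (int i = 0; i < count; ++i) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, i) != cudaSuccess) continue;
    const bool dropped = prop.major < 6;  // SM < 6.0: unsupported in TRT 8.6
    std::printf("GPU %d: %s (SM %d.%d)%s\n", i, prop.name, prop.major,
                prop.minor, dropped ? " -- not supported by TensorRT 8.6" : "");
  }
  return 0;
}
```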
2023-04-26 10:01:33 -07:00
yf711
8cd5f3ad9c
[TensorRT EP] support TensorRT 8.6-EA (#15299)
### Description

* Integrate TRT 8.6 EA on relevant Linux/Windows/pkg pipelines
  * Update onnx-tensorrt to 8.6
  * Add new dockerfiles for TRT 8.6 and clean up old ones
* Update
[CGManifest](https://github.com/microsoft/onnxruntime/tree/main/cgmanifests)
files and ORT build dependency versions
  * Update yml files and scripts
* Enable the built-in TRT parser option on TRT-related pipelines by default
* Exclude test TopKOperator.Top3ExplicitAxisInfinity from TRT EP tests
(8.6 EA has an issue with the TopK operator)
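
The exclusion follows the excluded-providers pattern used throughout onnxruntime's operator tests; a sketch of that pattern (the test values below are illustrative, not the actual test body):

```cpp
// Illustrative sketch: skip the TensorRT EP for one test by passing it in
// OpTester::Run's excluded-providers set.
#include <limits>
#include "core/graph/constants.h"                // kTensorrtExecutionProvider
#include "test/providers/provider_test_utils.h"  // OpTester, TEST

namespace onnxruntime {
namespace test {

TEST(TopKOperator, Top3ExplicitAxisInfinity) {
  OpTester test("TopK", 11);
  test.AddAttribute("axis", static_cast<int64_t>(0));
  const float inf = std::numeric_limits<float>::infinity();
  test.AddInput<float>("X", {5}, {1.0f, 2.0f, inf, 4.0f, 0.0f});
  test.AddInput<int64_t>("K", {1}, {3});
  test.AddOutput<float>("Values", {3}, {inf, 4.0f, 2.0f});
  test.AddOutput<int64_t>("Indices", {3}, {2, 3, 1});
  // TRT 8.6 EA mishandles this TopK case, so exclude the TensorRT EP.
  test.Run(OpTester::ExpectResult::kExpectSuccess, "",
           {kTensorrtExecutionProvider});
}

}  // namespace test
}  // namespace onnxruntime
```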
2023-04-12 11:34:59 -07:00
Chi Lo
ba89cae3bd
Update package pipelines to support TRT 8.5 (#13998)
Update the following package pipelines to support TRT 8.5 after
https://github.com/microsoft/onnxruntime/pull/13867:

- [Linux Multi GPU TensorRT CI
Pipeline](https://aiinfra.visualstudio.com/Lotus/_build?definitionId=1016&_a=summary)
- [Python packaging
pipeline](https://aiinfra.visualstudio.com/Lotus/_build?definitionId=841&_a=summary)
-
[build-perf-test-binaries](https://aiinfra.visualstudio.com/Lotus/_build?definitionId=1130&_a=summary)
-
[Linux-GPU-EP-Perf](https://aiinfra.visualstudio.com/Lotus/_build?definitionId=841&_a=summary)
2022-12-16 15:01:50 -08:00
Adrian Lizarraga
b20daeda81
Update Linux Multi GPU TensorRT pipeline to TensorRT 8.4 (#11923)
* Try manually installing trt8.4 in multi-gpu pipeline

* Remove stmts that clean up cmake, ctest. Update tensorrt repository name passed to get_docker_image.py

* Update trt and cudnn home

* Don't install trtexec cli tool.

* Increase job timeout

* Revert timeout change and use trt placeholder builder build option
2022-06-21 07:59:11 -07:00
George Wu
3d6786c92e
update tensorrt multi gpu pipeline to tensorrt 8.2 (#10141) 2021-12-27 15:43:27 -08:00
stevenlix
ee99fb400c
Upgrade TensorRT to v8.0.1 (#8512)
* update onnx-tensorrt parser to master

* disable unsupported tests

* add cuda sm 75 for T4

* update tensorrt pipeline

* update trt pipelines

* update trt pipelines

* Update linux-gpu-tensorrt-ci-pipeline.yml

* update trt ci pipeline

* Update linux-gpu-tensorrt-ci-pipeline.yml

* Update Tensorrt Windows build pool and TensorRT/CUDA/CuDNN version

* update to cuda11.4 in trt ci pipeline

* update base image to cuda11.4

* update packaging pipeline to cuda11.4

* clean up

* remove cuda11.1 and cuda11.3 docker file

* disable unsupported tensorrt tests at runtime

* Update linux-multi-gpu-tensorrt-ci-pipeline.yml
2021-08-02 11:20:31 -07:00
baijumeswani
cab84d902e
Install and use conda on ortmodule CI pipelines (#7530)
* Install and use conda on ortmodule CI pipelines

* Update build script to install onnxruntime wheel before running unit tests

* Remove python 3.5 from install_python_deps

* Pinning deepspeed version to 0.3.15
2021-05-03 15:52:22 -07:00
Changming Sun
65b2b87f83
Update CI build docker images (#7386)
Update CI build docker images: delete ubuntu 16.04 support.
2021-04-21 13:18:34 -07:00
stevenlix
53eb948f4c
Upgrade TensorRT to v7.2.2 (#6452)
* upgrade to TensorRT 7.2.2

* extend GPU tensorrt CI timeout to 150 minutes

* update docker image name

* disable user interaction to avoid the tensorrt container getting stuck when installing tzdata

* upgrade to libssl1.1 for ubuntu20.04

* remove libicu60 from ubuntu20.04

* add libicu66 for ubuntu20.04

* debug

* llvm

* llvm

* disable ReverseSequenceTest.InvalidInput

* disable ReverseSequenceTest.InvalidInput

* fix issues

* fix issues

* Update linux-gpu-tensorrt-ci-pipeline.yml

* disable warning 4458 for TensorRT parser

* update onnx-tensorrt submodule

* disable warnings for TensorRT parser

* update onnx-tensorrt submodule to include latest bug fixes

* update setup_env_trt

* update pool for win trt ci pipeline

Co-authored-by: George Wu <jywu@microsoft.com>
2021-02-18 04:30:47 -08:00
stevenlix
77c69a0325
Upgrade TensorRT to v7.1.3.4 (#4704)
* upgrade to TensorRT 7.1.3.4

* Upgrade onnx-tensorrt parser for TensorRT 7.1.3.4

* fix format issue

* fix format issue

* fix format issue

* Update tensorrt_execution_provider.cc

* change cmake version to 3.14

* Remove --msvc_toolset 14.16

* change to onnxruntime::make_unique

* use onnxruntime::make_unique

* disable some tests for TensorRT

* disable some tests for TensorRT

* Update upsample_op_test.cc

* Update tile_op_test.cc

* disable some tests for TensorRT

* Update constant_of_shape_test.cc

* update parser

* Update Dockerfile.ubuntu_tensorrt
2020-08-07 17:43:56 -07:00
stevenlix
da653ccdac
Upgrade TensorRT to version 7.0.0.11 (#2973)
* update onnx-tensorrt submodule to trt7 branch

* add fp16 option for TRT7

* switch to master branch of onnx tensorrt

* update submodule

* update to TensorRT7.0.0.11

* update to onnx-tensorrt for TensorRT7.0

* switch to private branch due to issues in master branch

* remove trt_onnxify

* disable warnings c4804 for TensorRT parser

* disable warnings c4702 for TensorRT parser

* add back sanity check of shape tensor input in the parser

* disable some warnings for TensorRT7

* change fp16 threshold for TensorRT

* update onnx-tensorrt parser

* fix cycle issue in faster-rcnn and add cycle detection in GetCapability

* Update TensorRT container to v20.01

* Update TensorRT image name

* Update linux-multi-gpu-tensorrt-ci-pipeline.yml

* Update linux-gpu-tensorrt-ci-pipeline.yml

* disable rnn tests for TensorRT

* disable rnn tests for TensorRT

* disabled some unit test for TensorRT

* update onnx-tensorrt submodule

* update build scripts for TensorRT

* formatting the code

* Update TensorRT-ExecutionProvider.md

* Update BUILD.md

* Update tensorrt_execution_provider.h

* Update tensorrt_execution_provider.cc

* Update win-gpu-tensorrt-ci-pipeline.yml

* use GetEnvironmentVar function to get env variables and switch to Win-GPU-2019 agent pool for win CI build

* change tensorrt path

* change tensorrt path

* fix win ci build issue

* update code based on the reviews

* fix build issue

* roll back to cuda10.0

* add RemoveCycleTest for TensorRT

* fix windows ci build issues

* fix ci build issues

* fix file permission

* fix out of range issue for max_workspace_size_env
2020-02-12 07:03:58 -08:00
Changming Sun
5558b80774
clean up ubuntu docker scripts (#2103) 2019-10-14 07:20:20 -07:00
stevenlix
544e53e24e Update TensorRT to version 6.0.1.5 (#1966)
* remove onnx-tensorrt submodule

* add new onnx-tensorrt submodule (experiment) for trt6

* update engine build for trt6

* update compile and compute for tensorrt6.0

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc

* switch to onnx-tensorrt master for TensorRT6

* Update tensorrt_execution_provider.cc

* Handle dynamic batch size and add memcpy in TensorRT EP

* update test cases

* Update tensorrt_execution_provider.cc

* update onnx-tensorrt submodule

* Update Dockerfile.ubuntu_tensorrt

* Update Dockerfile.ubuntu_tensorrt

* Update run_dockerbuild.sh

* Update run_dockerbuild.sh

* Update install_ubuntu.sh

* Update concat_op_test.cc

* Update tensorrt_execution_provider.cc

* Upgrade TensorRT to version 6.0.1.5

* Update onnxruntime_providers.cmake

* Update CMakeLists.txt

* Update reduction_ops_test.cc

* Update install_ubuntu.sh

* Update Dockerfile.ubuntu_tensorrt

* Update Dockerfile.tensorrt

* Update BUILD.md

* Update run_dockerbuild.sh

* Update install_ubuntu.sh

* Update onnxruntime_providers.cmake

* Update install_ubuntu.sh

* Update install_ubuntu.sh

* Update gemm_test.cc

* Update gather_op_test.cc

* Update CMakeLists.txt

* Removed submodule

* update onnx-tensorrt submodule

* Add Ubuntu18.04 build option

* Add Ubuntu18.04 build option

* Add Ubuntu18.04 build option

* Add Ubuntu18.04 build option

* Remove redundancy

* Fix an issue where a memcpy node was not added correctly when some nodes fall back to the CUDA EP.
e.g. after partition there is TRT_Node -> Cuda_node (with CPU memory expected); we still need to add a memcpy node between them (a toy sketch follows at the end of this entry).

* update for Trt Windows build

* Update onnxruntime_providers.cmake

* Disable opset11 tests on TensorRT

* Update pad_test.cc

* Update build.py

* update scripts for ubuntu18.04

* Disable warning for Windows build
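
To make the memcpy fix above concrete, here is a toy sketch of post-partition copy-node insertion; every type and name is invented for illustration and none of it is onnxruntime's actual partitioner or memcpy transformer:

```cpp
// Toy model of the fix: after partitioning, splice a MemcpyToHost node onto
// every edge where a GPU-resident producer feeds a consumer expecting CPU
// memory (e.g. TRT_Node -> Cuda_node with CPU input). All types hypothetical.
#include <deque>
#include <string>
#include <vector>

struct Node {
  std::string op;
  std::string provider;    // "Trt", "Cuda", or "Cpu"
  bool expects_cpu_input;  // consumer-side memory requirement
  std::vector<Node*> outputs;
};

// `added` is a deque so pointers to spliced nodes stay valid as it grows.
void InsertMemcpyNodes(std::vector<Node>& graph, std::deque<Node>& added) {
  for (Node& producer : graph) {
    const bool writes_gpu_memory = producer.provider != "Cpu";
    for (Node*& consumer : producer.outputs) {
      if (writes_gpu_memory && consumer->expects_cpu_input) {
        added.push_back(Node{"MemcpyToHost", "Cuda", false, {consumer}});
        consumer = &added.back();  // reroute the edge through the copy node
      }
    }
  }
}

int main() {
  Node cuda_node{"Relu", "Cuda", /*expects_cpu_input=*/true, {}};
  Node trt_node{"Conv", "Trt", false, {&cuda_node}};
  std::vector<Node> graph{trt_node};
  std::deque<Node> added;
  InsertMemcpyNodes(graph, added);
  // graph[0].outputs[0]->op is now "MemcpyToHost", feeding the Cuda node.
  return 0;
}
```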
2019-10-06 10:40:53 -07:00
Changming Sun
c86d17754a
Dockerfile for CentOS CI build (#1986) 2019-10-03 11:46:27 -07:00
stevenlix
1c5b15c2b8
Remove memory copy between TensorRT and CUDA (#1561)
* remove memory copy between CUDA and TRT

* add info to RegisterExecutionProvider input

* use new IDeviceAllocator for trt allocator

* remove SetDefaultInputsMemoryType from TRT EP

* remove onnx-tensorrt 5.0

* add submodule onnx-tensorrt branch 5.1

* remove redundancy

* Update transformer_memcpy.cc

* Update tensorrt_execution_provider.cc

* switch to TensorRT 5.1.5.0

* update python binding

* disable failed test case on TensorRT

* Update activation_op_test.cc

* upgrade to TensorRT container 19.06

* update according to feedback

* add comments

* remove tensorrt allocator and use cuda(gpu) allocator

* update onnx-tensorrt submodule

* change ci build cuda directory name
2019-08-08 19:31:39 -07:00
Changming Sun
28759e2f6f Uninstall the preinstalled cmake in tensorrt image because it's too old (#1316) 2019-07-01 15:08:01 -07:00
Changming Sun
be36385a8c
Delete docker/scripts/install_deps_x86.sh and enable onnx tests for x86 (#1191) 2019-06-08 16:17:18 -07:00
Vinitra Swamy
0b5f06b0fd
removing LLVM dependency for ubuntu tensor_rt build dockerfiles (#954) 2019-05-02 10:41:04 -07:00
stevenlix
e8b0ae8923
Trt execution provider (#382)
* updated cmake files for trt

* added trt execution provider

* added trt basic test

* removed trt_path action attribute

* Add files via upload

* Update build.py

* Update trt_allocator.h

* fixed issues found by reviewers

* changed cast operator

* added comment for custom kernel implementation

* changed auto to auto&

* changed to function compile APIs for TRT execution provider

* changed to function compile APIs for TRT execution provider

* added new DType DInt64

* adapted to the changes of onnxruntime_c_api

* removed trt kernel (use function compile instead)

* updated onnx-tensorrt submodule

* set default memory type to TRT fused kernel

* resolve merge conflict

* fixed the issue that USE_CUDA conflicts with USE_TRT

* construct graph by adding nodes in topological order

* made changes for Windows

* change buffers type

* bypass HasImplementationOf check for TRT XP because TRT kernel is not registered

* added domain to version info in rebuilt model proto

* added trt to test option list

* added DomainToVersionMap() to GraphViewer

* removed Copy()

* fixed broken code

* format the code to clang format

* used local reference to the frequently used values

* fixed a couple of issues according to reviewers feedback

* fixed a couple of issues according to reviewers feedback

* added python binding for TRT and enable use_cuda when use_trt is on

* fixed a redefinition issue

* changed shared_ptr to unique_ptr on trt engines, and made a few changes required by reviewers

* enabled trt execution provider for unit tests

* renamed trt to tensorrt

* added tensorrt to python binding

* update submodule onnx and onnx-tensorrt

* made a couple of minor changes based on reviewer's feedback

* added CUDA_CHECK

* removed test code

* fixed broken code after merge

* updated onnx-tensorrt submodule

* added post processing to align trt inputs/outputs with graph inputs/outputs

* updated onnx submodule

* added CUDA fallback for TensorRT and fixed TensorRT cmake issue

* added ci pipeline for tensorrt and removed some redundant code from trt xp

* fixed syntax issue

* updated onnx-tensorrt submodule

* fix trt build problem by: (#602)

1. Add additional /wd for debug build
2. Add io.h for additional targets
3. Bring back mb version of getopt

* Update install_ubuntu.sh

* Update linux-gpu-tensorrt-ci-pipeline.yml

* Update linux-gpu-tensorrt-ci-pipeline.yml

* Update run_build.sh

* Update run_build.sh

* Update run_build.sh

* Update run_build.sh

* fixed the issue that GetKernelRegistry returns nullptr

* merged master to this branch

* moved some data types to private

* fixed tensorrt CI pipeline issue

* customized test data for TensorRT pipeline

* added onnx-tensorrt in json file and fixed an issue in ci script

* added comments
2019-03-14 12:00:39 -07:00