Updates to training pipelines to reduce CI time (#18116)

### Description
This PR reduces CI test time by removing unnecessary tests from the
training pipelines.

The following changes reduce test time in the pipelines:

- Skip CPU model tests in GPU builds. Training CIs run these tests only as
a sanity check; no training code is tested directly in these pipelines.
The CPU tests already run in the CPU pipelines, so there is no need to run
them again in GPU builds and block the GPU VM. This change reduces testing
time by 20-25 minutes in all training GPU pipelines.

- Delete the debug package build pipeline for Linux training packages.
It was required by the compiler team at some point, but these packages
have had 0 downloads.
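
The first change can be illustrated in isolation. The sketch below is not the code from this PR: `ProviderOpsets`, `SelectProviders`, and the sample opset values are hypothetical stand-ins, and the `training_cuda_build` argument stands in for the compile-time check `defined(ENABLE_TRAINING_CORE) && defined(USE_CUDA)` used in the actual test source. It only shows the idea of erasing the CPU entry from the provider table so that its model tests are skipped.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the provider -> opset table used by the model tests.
using ProviderOpsets = std::map<std::string, std::vector<int>>;

// Returns the providers whose model tests should run.
// `training_cuda_build` is an assumption for illustration: in the real code
// this decision is made by the preprocessor, not by a runtime flag.
ProviderOpsets SelectProviders(bool training_cuda_build) {
  ProviderOpsets providers = {
      {"cpu", {7, 8, 9}},   // sample opsets, not the real list
      {"cuda", {7, 8, 9}},
  };
  if (training_cuda_build) {
    // The CPU tests already run in the CPU pipelines, so drop them here.
    providers.erase("cpu");
  }
  return providers;
}
```

Calling `SelectProviders(true)` leaves only the `cuda` entry, while `SelectProviders(false)` keeps both entries.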



Ashwini Khade 2023-10-26 14:58:57 -07:00 committed by GitHub
parent a514a68770
commit f2e19a8ccf
3 changed files with 8 additions and 14 deletions


@@ -443,6 +443,13 @@ static ORT_STRING_VIEW provider_name_dml = ORT_TSTR("dml");
#ifdef USE_DML
provider_names[provider_name_dml] = {opset7, opset8, opset9, opset10, opset11, opset12, opset13, opset14, opset15, opset16, opset17, opset18};
#endif
+#if defined(ENABLE_TRAINING_CORE) && defined(USE_CUDA)
+// Removing the CPU EP tests from CUDA build for training as these tests are already run in the CPU pipelines.
+// Note: These are inference tests, we run these in training builds as an extra check. Therefore reducing
+// the number of times these are run to reduce the CI time.
+provider_names.erase(provider_name_cpu);
+#endif
std::vector<std::basic_string<ORTCHAR_T>> v;
// Permanently exclude following tests because ORT support only opset starting from 7,
// Please make no more changes to the list


@@ -9,7 +9,7 @@ resources:
ref: 5eda9aded5462201e6310105728d33016e637ea7
stages:
-- stage: Python_Packaging_Linux_Trainin_CPU
+- stage: Python_Packaging_Linux_Training_CPU
jobs:
- job: Linux_Training_CPU_Wheels


@@ -20,16 +20,3 @@ stages:
agent_pool: Onnxruntime-Linux-GPU
upload_wheel: 'yes'
debug_build: false
-# Added for triton compiler team. Can be potentially removed.
-- template: templates/py-packaging-training-cuda-stage.yml
-parameters:
-build_py_parameters: --enable_training --update --build
-torch_version: '2.0.0'
-opset_version: '15'
-cuda_version: '11.8'
-cmake_cuda_architectures: 70;75;80;86
-docker_file: Dockerfile.manylinux2_28_training_cuda11_8
-agent_pool: Onnxruntime-Linux-GPU
-upload_wheel: 'no'
-debug_build: true