onnxruntime/tools/ci_build/github/azure-pipelines/templates/py-win-gpu.yml
liqun Fu cd7112f800
Integration with ONNX 1.16.0 (#19745)
### Description
update with ONNX 1.16.0 branch according to
https://github.com/microsoft/onnxruntime/blob/main/docs/How_To_Update_ONNX_Dev_Notes.md

ONNX 1.16.0 release notes:
https://github.com/onnx/onnx/releases/tag/v1.16.0

#### Updated ops for CPU EP:
- DequantizeLinear(21)
  - Added int16 and uint16 support + various optimizer tests
  - Missing int4 and uint4 support
  - Missing block dequantization support
- QuantizeLinear(21)
  - Added int16 and uint16 support + various optimizer tests
  - Missing int4 and uint4 support
  - Missing block quantization support
- Cast(21)
  - Missing int4 and uint4 support
- CastLike(21)
  - Missing int4 and uint4 support
- ConstantOfShape(21)
  - Missing int4 and uint4 support
- Identity(21)
  - Missing int4 and uint4 support
- If(21)
  - Missing int4 and uint4 support
- Loop(21)
  - Missing int4 and uint4 support
- Reshape(21)
  - Missing int4 and uint4 support
- Scan(21)
  - Missing int4 and uint4 support
- Shape(21)
  - Missing int4 and uint4 support
- Size(21)
  - Missing int4 and uint4 support
- Flatten(21)
- Missing float8e4m3fnuz, float8e5m2, float8e5m2fnuz, int4, and uint4
support
- Pad(21)
- Missing float8e4m3fnuz, float8e5m2, float8e5m2fnuz, int4, and uint4
support
- Squeeze(21)
- Missing float8e4m3fnuz, float8e5m2, float8e5m2fnuz, int4, and uint4
support
- Transpose(21)
- Missing float8e4m3fnuz, float8e5m2, float8e5m2fnuz, int4, and uint4
support
- Unsqueeze(21)
- Missing float8e4m3fnuz, float8e5m2, float8e5m2fnuz, int4, and uint4
support

#### Unimplemented opset 21 features/ops
- int4 and uint4 data type
- QLinearMatMul(21)
- GroupNormalization(21)
- ai.onnx.ml.TreeEnsemble(5)

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

### Disabled tests
#### ORT Training

orttraining/orttraining/test/python/orttraining_test_ort_apis_py_bindings.py
- test_ort_custom_ops: Potential shape inference bug for custom ops

#### Python quantization unit tests
test/onnx/python/quantization (shape inference bug)
- test_op_conv_transpose.py: test_quantize_conv_transpose_u8u8_fp16
- test_op_conv_transpose.py: test_quantize_conv_transpose_s8s8_fp16
- test_op_gemm.py: test_quantize_qop_gemm_s8s8
- test_op_gemm.py: test_quantize_qop_gemm_e4m3fn_same
 - test_op_gemm.py: test_quantize_qop_gemm_e4m3fn_p3
- test_op_matmul.py: test_quantize_matmul_u8u8_f16
- test_op_matmul.py: test_quantize_matmul_s8s8_f16
- test_op_matmul.py: test_quantize_matmul_s8s8_f16_entropy
- test_op_matmul.py: test_quantize_matmul_s8s8_f16_percentile
- test_op_matmul.py: test_quantize_matmul_s8s8_f16_distribution
- test_op_relu.py: test_quantize_qop_relu_s8s8

#### ONNX tests
- test_maxpool_2d_ceil_output_size_reduce_by_one: ONNX 1.16.0 fixed a
maxpool output size bug and added this test. Enable this test when [ORT
PR](https://github.com/microsoft/onnxruntime/pull/18377) is merged.
Refer to original [ONNX PR](https://github.com/onnx/onnx/pull/5741).
- test_ai_onnx_ml_tree_ensemble_set_membership_cpu: new unimplemented op
ai.onnx.ml.TreeEnsemble
- test_ai_onnx_ml_tree_ensemble_single_tree_cpu: same
- test_ai_onnx_ml_tree_ensemble_set_membership_cuda: same
- test_ai_onnx_ml_tree_ensemble_single_tree_cuda: same
- test_cast_INT4_to_FLOAT_cpu: ORT Cast(21) impl doesn't support int4
yet
- test_cast_INT4_to_INT8_cpu: same
- test_cast_UINT4_to_FLOAT_cpu: same
- test_cast_UINT4_to_UINT8_cpu: same
- test_cast_INT4_to_FLOAT_cuda
- test_cast_INT4_to_INT8_cuda
- test_cast_UINT4_to_FLOAT_cuda
- test_cast_UINT4_to_UINT8_cuda
- test_constantofshape_float_ones_cuda: ConstantOfShape(21) not
implemented for cuda
- test_constantofshape_int_shape_zero_cuda: same
- test_constantofshape_int_zeros_cuda: same
- test_flatten_axis0_cuda: Flatten(21) not implemented for cuda
- test_flatten_axis1_cuda: same
- test_flatten_axis2_cuda: same
- test_flatten_axis3_cuda: same
- test_flatten_default_axis_cuda: same
- test_flatten_negative_axis1_cuda: same
- test_flatten_negative_axis2_cuda: same
- test_flatten_negative_axis3_cuda: same
- test_flatten_negative_axis4_cuda: same
- test_qlinearmatmul_2D_int8_float16_cpu: QLinearMatMul(21) for onnx not
implemented in ORT yet
- test_qlinearmatmul_2D_int8_float32_cpu: same
- test_qlinearmatmul_2D_uint8_float16_cpu: same
- test_qlinearmatmul_2D_uint8_float32_cpu: same
- test_qlinearmatmul_3D_int8_float16_cpu: same
- test_qlinearmatmul_3D_int8_float32_cpu: same
- test_qlinearmatmul_3D_uint8_float16_cpu: same
- test_qlinearmatmul_3D_uint8_float32_cpu: same
- test_qlinearmatmul_2D_int8_float16_cuda: same
- test_qlinearmatmul_2D_int8_float32_cuda: same
- test_qlinearmatmul_2D_uint8_float16_cuda: same
- test_qlinearmatmul_2D_uint8_float32_cuda: same
- test_qlinearmatmul_3D_int8_float16_cuda: same
- test_qlinearmatmul_3D_int8_float32_cuda: same
- test_qlinearmatmul_3D_uint8_float16_cuda: same
- test_qlinearmatmul_3D_uint8_float32_cuda: same
- test_size_cuda: Size(21) not implemented for cuda
- test_size_example_cuda: same
- test_dequantizelinear_blocked: Missing implementation for block
dequant for DequantizeLinear(21)
- test_quantizelinear_blocked_asymmetric: Missing implementation for
block quant for QuantizeLinear(21)
- test_quantizelinear_blocked_symmetric: Missing implementation for
block quant for QuantizeLinear(21)

---------

Signed-off-by: liqunfu <liqun.fu@microsoft.com>
Signed-off-by: Ganesan Ramalingam <grama@microsoft.com>
Co-authored-by: Ganesan Ramalingam <grama@microsoft.com>
Co-authored-by: George Wu <jywu@microsoft.com>
Co-authored-by: adrianlizarraga <adlizarraga@microsoft.com>
2024-04-12 09:46:49 -07:00

255 lines
9.8 KiB
YAML

parameters:
- name: MACHINE_POOL
type: string
default: 'onnxruntime-Win2022-GPU-T4'
- name: EP_NAME
type: string
- name: PYTHON_VERSION
type: string
- name: EP_BUILD_FLAGS
type: string
- name: ENV_SETUP_SCRIPT
type: string
default: ''
- name: BUILD_PY_PARAMETERS
displayName: >
Extra parameters to pass to build.py. Don't put newlines in here.
type: string
default: ''
- name: CudaVersion
type: string
default: '11.8'
values:
- 11.8
- 12.2
- name: SpecificArtifact
displayName: Use Specific Artifact
type: boolean
default: false
- name: BuildId
displayName: Specific Artifact's BuildId
type: string
default: '0'
stages:
- stage: Win_py_${{ parameters.EP_NAME }}_Wheels_${{ replace(parameters.PYTHON_VERSION,'.','_') }}_Build
dependsOn: []
jobs:
- job: Win_py_${{ parameters.EP_NAME }}_Wheels_${{ replace(parameters.PYTHON_VERSION,'.','_') }}_Build
timeoutInMinutes: 360
workspace:
clean: all
pool:
name: onnxruntime-Win-CPU-2022
variables:
GRADLE_OPTS: '-Dorg.gradle.daemon=false'
VSGenerator: 'Visual Studio 17 2022'
CUDA_MODULE_LOADING: 'LAZY'
steps:
- task: mspremier.PostBuildCleanup.PostBuildCleanup-task.PostBuildCleanup@3
displayName: 'Clean Agent Directories'
condition: always()
- checkout: self
clean: true
submodules: recursive
- template: telemetry-steps.yml
- task: UsePythonVersion@0
inputs:
versionSpec: ${{ parameters.PYTHON_VERSION }}
addToPath: true
architecture: 'x64'
- task: onebranch.pipeline.tsaoptions@1
displayName: 'OneBranch TSAOptions'
inputs:
tsaConfigFilePath: '$(Build.SourcesDirectory)\.config\tsaoptions.json'
appendSourceBranchName: false
- task: PythonScript@0
inputs:
scriptSource: inline
script: |
import sys
np_version = 'numpy==1.21.6' if sys.version_info < (3, 11) else 'numpy==1.26'
import subprocess
try:
subprocess.check_call(['pip', 'install', '-q', 'setuptools', 'wheel', np_version])
except subprocess.CalledProcessError:
sys.exit(1)
workingDirectory: '$(Build.BinariesDirectory)'
displayName: 'Install python modules'
- template: download-deps.yml
- ${{ if ne(parameters.ENV_SETUP_SCRIPT, '') }}:
- template: jobs/set-winenv.yml
parameters:
EnvSetupScript: ${{ parameters.ENV_SETUP_SCRIPT }}
${{ if or(contains(parameters.EP_BUILD_FLAGS, 'use_cuda'), contains(parameters.EP_BUILD_FLAGS, 'use_tensorrt')) }}:
DownloadCUDA: true
- ${{ if eq(parameters.ENV_SETUP_SCRIPT, '') }}:
- template: jobs/download_win_gpu_library.yml
parameters:
CudaVersion: ${{ parameters.CudaVersion }}
${{ if or(contains(parameters.EP_BUILD_FLAGS, 'use_cuda'), contains(parameters.EP_BUILD_FLAGS, 'use_tensorrt')) }}:
DownloadCUDA: true
${{ if contains(parameters.EP_BUILD_FLAGS, 'use_tensorrt') }}:
DownloadTRT: true
- task: PythonScript@0
displayName: 'Update deps.txt'
inputs:
scriptPath: $(Build.SourcesDirectory)/tools/ci_build/replace_urls_in_deps.py
arguments: --new_dir $(Build.BinariesDirectory)/deps
workingDirectory: $(Build.BinariesDirectory)
- task: PowerShell@2
displayName: 'Install ONNX'
inputs:
filePath: '$(Build.SourcesDirectory)/tools/ci_build/github/windows/install_third_party_deps.ps1'
workingDirectory: '$(Build.BinariesDirectory)'
arguments: -cpu_arch x64 -install_prefix $(Build.BinariesDirectory)\RelWithDebInfo\installed -build_config RelWithDebInfo
- template: set-nightly-build-option-variable-step.yml
- task: PythonScript@0
displayName: 'Generate cmake config'
inputs:
scriptPath: '$(Build.SourcesDirectory)\tools\ci_build\build.py'
arguments: >
--config RelWithDebInfo
--build_dir $(Build.BinariesDirectory)
--skip_submodule_sync
--cmake_generator "$(VSGenerator)"
--enable_pybind
--enable_onnx_tests
--parallel --use_binskim_compliant_compile_flags --update
$(TelemetryOption) ${{ parameters.BUILD_PY_PARAMETERS }} ${{ parameters.EP_BUILD_FLAGS }}
workingDirectory: '$(Build.BinariesDirectory)'
# building with build.py so the parallelization parameters are added to the msbuild command
- task: PythonScript@0
displayName: 'Build'
inputs:
scriptPath: '$(Build.SourcesDirectory)\tools\ci_build\build.py'
arguments: >
--config RelWithDebInfo
--build_dir $(Build.BinariesDirectory)
--parallel --build
$(TelemetryOption) ${{ parameters.BUILD_PY_PARAMETERS }} ${{ parameters.EP_BUILD_FLAGS }}
workingDirectory: '$(Build.BinariesDirectory)'
# Esrp signing
- template: win-esrp-dll.yml
parameters:
FolderPath: '$(Build.BinariesDirectory)\RelWithDebInfo\RelWithDebInfo\onnxruntime\capi'
DisplayName: 'ESRP - Sign Native dlls'
DoEsrp: true
Pattern: '*.pyd,*.dll'
- task: PythonScript@0
displayName: 'Build wheel'
inputs:
scriptPath: '$(Build.SourcesDirectory)\setup.py'
arguments: 'bdist_wheel ${{ parameters.BUILD_PY_PARAMETERS }} $(NightlyBuildOption) --wheel_name_suffix=${{ parameters.EP_NAME }}'
workingDirectory: '$(Build.BinariesDirectory)\RelWithDebInfo\RelWithDebInfo'
- task: CopyFiles@2
displayName: 'Copy Python Wheel to: $(Build.ArtifactStagingDirectory)'
inputs:
SourceFolder: '$(Build.BinariesDirectory)\RelWithDebInfo\RelWithDebInfo\dist'
Contents: '*.whl'
TargetFolder: '$(Build.ArtifactStagingDirectory)'
- task: PublishBuildArtifacts@1
displayName: 'Publish Artifact: ONNXRuntime python wheel'
inputs:
ArtifactName: onnxruntime_${{ parameters.EP_NAME }}
- script: |
7z x *.whl
workingDirectory: '$(Build.ArtifactStagingDirectory)'
displayName: 'unzip the package'
- task: CredScan@3
displayName: 'Run CredScan'
inputs:
debugMode: false
continueOnError: true
- task: BinSkim@4
displayName: 'Run BinSkim'
inputs:
AnalyzeTargetGlob: '+:file|$(Build.ArtifactStagingDirectory)\**\*.dll;-:file|$(Build.ArtifactStagingDirectory)\**\DirectML.dll'
- task: TSAUpload@2
displayName: 'TSA upload'
condition: and (succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/main'))
inputs:
GdnPublishTsaOnboard: false
GdnPublishTsaConfigFile: '$(Build.sourcesDirectory)\.gdn\.gdntsa'
- template: component-governance-component-detection-steps.yml
parameters:
condition: 'succeeded'
- stage: Win_py_${{ parameters.EP_NAME }}_Wheels_${{ replace(parameters.PYTHON_VERSION,'.','_') }}_Tests
dependsOn: Win_py_${{ parameters.EP_NAME }}_Wheels_${{ replace(parameters.PYTHON_VERSION,'.','_') }}_Build
jobs:
- job: Win_py_${{ parameters.EP_NAME }}_Wheels_${{ replace(parameters.PYTHON_VERSION,'.','_') }}_Tests
workspace:
clean: all
pool:
name: ${{parameters.MACHINE_POOL}}
steps:
- task: mspremier.PostBuildCleanup.PostBuildCleanup-task.PostBuildCleanup@3
displayName: 'Clean Agent Directories'
condition: always()
- checkout: self
clean: true
submodules: none
- task: UsePythonVersion@0
inputs:
versionSpec: ${{ parameters.PYTHON_VERSION }}
addToPath: true
architecture: 'x64'
- template: flex-downloadPipelineArtifact.yml
parameters:
ArtifactName: onnxruntime_${{ parameters.EP_NAME }}
StepName: 'Download Pipeline Artifact - Windows GPU Build'
TargetPath: '$(Build.ArtifactStagingDirectory)'
SpecificArtifact: ${{ parameters.SpecificArtifact }}
BuildId: ${{ parameters.BuildId }}
- powershell: |
pushd onnxruntime/test/python
python -m pip install --upgrade pip
python -m pip install -r requirements.txt
popd
workingDirectory: '$(Build.SourcesDirectory)'
displayName: 'Install ONNX'
- powershell: |
python -m pip uninstall -y ort-nightly-gpu ort-nightly onnxruntime onnxruntime-gpu -qq
Get-ChildItem -Path $(Build.ArtifactStagingDirectory)/*cp${{ replace(parameters.PYTHON_VERSION,'.','') }}*.whl | foreach {pip --disable-pip-version-check install --upgrade $_.fullname tabulate}
mkdir -p $(Agent.TempDirectory)\ort_test_data
Copy-Item -Path $(Build.sourcesDirectory)/onnxruntime/test/python/onnx_backend_test_series.py -Destination $(Agent.TempDirectory)\ort_test_data
Copy-Item -Recurse -Path $(Build.sourcesDirectory)/onnxruntime/test/testdata -Destination $(Agent.TempDirectory)\ort_test_data
cd $(Agent.TempDirectory)\ort_test_data
python onnx_backend_test_series.py
workingDirectory: '$(Build.sourcesDirectory)'
displayName: 'Run Python Tests'