### Description
1. Add XNNPack build on Linux ARM64
2. Build only one python wheel for PR request.
[AB#49763](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/49763)
### Motivation and Context
Why I add xnnpack build on Linux ARM64 rather than Windows ARM64.
Becuase KleidiAI doesn't support Windows
```
IF(XNNPACK_TARGET_PROCESSOR STREQUAL "arm64" AND XNNPACK_ENABLE_ARM_I8MM AND NOT CMAKE_C_COMPILER_ID STREQUAL "MSVC")
IF (XNNPACK_ENABLE_KLEIDIAI)
MESSAGE(STATUS "Enabling KleidiAI for Arm64")
ENDIF()
ELSE()
SET(XNNPACK_ENABLE_KLEIDIAI OFF)
ENDIF()
```
---------
### Description
Use a common set of prebuilt manylinux base images to build the
packages, to avoid building the manylinux part again and again. The base
images can be used in GenAI and other projects too.
This PR also updates the GCC version for inference python CUDA11/CUDA12
builds from 8 to 11. Later on I will update all other CUDA pipelines to
use GCC 11, to avoid the issue described in
https://github.com/onnx/onnx/issues/6047 and
https://github.com/microsoft/onnxruntime-genai/issues/257 .
### Motivation and Context
To extract the common part as a reusable build infra among different
ONNX Runtime projects.
### Description
1. Remove 'dnf update' from docker build scripts, because it upgrades TRT
packages from CUDA 11.x to CUDA 12.x.
To reproduce it, you can run the following commands in a CentOS CUDA
11.x docker image such as nvidia/cuda:11.8.0-cudnn8-devel-ubi8.
```
export v=8.6.1.6-1.cuda11.8
dnf install -y libnvinfer8-${v} libnvparsers8-${v} libnvonnxparsers8-${v} libnvinfer-plugin8-${v} libnvinfer-vc-plugin8-${v} libnvinfer-devel-${v} libnvparsers-devel-${v} libnvonnxparsers-devel-${v} libnvinfer-plugin-devel-${v} libnvinfer-vc-plugin-devel-${v} libnvinfer-headers-devel-${v} libnvinfer-headers-plugin-devel-${v}
dnf update -y
```
The last command will generate the following outputs:
```
========================================================================================================================
Package Architecture Version Repository Size
========================================================================================================================
Upgrading:
libnvinfer-devel x86_64 8.6.1.6-1.cuda12.0 cuda 542 M
libnvinfer-headers-devel x86_64 8.6.1.6-1.cuda12.0 cuda 118 k
libnvinfer-headers-plugin-devel x86_64 8.6.1.6-1.cuda12.0 cuda 14 k
libnvinfer-plugin-devel x86_64 8.6.1.6-1.cuda12.0 cuda 13 M
libnvinfer-plugin8 x86_64 8.6.1.6-1.cuda12.0 cuda 13 M
libnvinfer-vc-plugin-devel x86_64 8.6.1.6-1.cuda12.0 cuda 107 k
libnvinfer-vc-plugin8 x86_64 8.6.1.6-1.cuda12.0 cuda 251 k
libnvinfer8 x86_64 8.6.1.6-1.cuda12.0 cuda 543 M
libnvonnxparsers-devel x86_64 8.6.1.6-1.cuda12.0 cuda 467 k
libnvonnxparsers8 x86_64 8.6.1.6-1.cuda12.0 cuda 757 k
libnvparsers-devel x86_64 8.6.1.6-1.cuda12.0 cuda 2.0 M
libnvparsers8 x86_64 8.6.1.6-1.cuda12.0 cuda 854 k
Installing dependencies:
cuda-toolkit-12-0-config-common noarch 12.0.146-1 cuda 7.7 k
cuda-toolkit-12-config-common noarch 12.2.140-1 cuda 7.9 k
libcublas-12-0 x86_64 12.0.2.224-1 cuda 361 M
libcublas-devel-12-0 x86_64 12.0.2.224-1 cuda 397 M
Transaction Summary
========================================================================================================================
```
As you can see from the output, they are CUDA 12 packages.
The problem can also be solved by lock the packages' versions by using
"dnf versionlock" command right after installing the CUDA/TRT packages.
However, going forward, to get the better reproducibility, I suggest
manually fix dnf package versions in the installation scripts like we do
for TRT now.
```bash
v="8.6.1.6-1.cuda11.8" &&\
yum-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo &&\
yum -y install libnvinfer8-${v} libnvparsers8-${v} libnvonnxparsers8-${v} libnvinfer-plugin8-${v} libnvinfer-vc-plugin8-${v}\
libnvinfer-devel-${v} libnvparsers-devel-${v} libnvonnxparsers-devel-${v} libnvinfer-plugin-devel-${v} libnvinfer-vc-plugin-devel-${v} libnvinfer-headers-devel-${v} libnvinfer-headers-plugin-devel-${v}
```
When we have a need to upgrade a package due to security alert or some
other reasons, we manually change the version string instead of relying
on "dnf update". Though this approach increases efforts, it can make our
pipeines more stable.
2. Move python test to docker
### Motivation and Context
Right now the nightly gpu package mixes using CUDA 11.x and CUDA 12.x
and the result package is totally not usable(crashes every time)