### Description
<!-- Describe your changes. -->
This fix macos packaging build on universal2 arch.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Merge CPU/GPU nuget pipeline. The old GPU nuget pipeline will be only for DML.
TODO: the result GPU package contains PDB files for some of the DLLs, but not all. It is due to the refactoring of CUDA EP to pluggable DLLs. At that time we forgot to copy the PDB files. However, I can't add them in now. Because currently the package is already 220MB large. If the missed PDB files were added, then it will be oversize. nuget.org doesn't accept >250MB packages.
1. Remove some unused code and simplify tools/ci_build/github/linux/run_dockerbuild.sh.
2. Enable Nuget CUDA tests. The original design was we could leverage Directory.Build.props and let cmake generate the required properties(USE_CUDA/...) there. However, in nuget packaging pipeline we test the package on a different host that doesn't run cmake command and doesn't have the auto-generated Directory.Build.props file.
Add python 3.8/3.9 support for Windows GPU and Linux ARM64
Delete jemalloc from cgmanifest.json.
Add onnx node test to Nuphar pipeline.
Change $ANDROID_HOME/ndk-bundle to $ANDROID_NDK_HOME. The later one is more accurate.
Delete Java GPU packaging pipeline
Remove test data download step in Nuget Mac OS pipeline. Because these machines are out of control and out of our network, it's hard to make it reliable and the data secure.
Fix a doc problem in c-api-artifacts-package-and-publish-steps-windows.yml. It shouldn't copy C_API.md, because the file has been moved into a different branch.
Delete the CI build docker file for Ubuntu cuda 9.x and Ubuntu x86 32 bits
And, due to some internal restrictions, I need to rename some of the agent pools
* Add build option to disable traditional ML ops from the binary.
* Fix python tests by splitting tests for ML ops to a separate file. Exclude ML tests from onnx_test_runner and C# tests. Exclude ML op sources.
* Update Edge pkg pipelines with new MLops env variable and fix C# packaging pipeline tests to skip ML ops.
* change c++14 to c++11
* add ld lib path for centos
* enable csharp tests on macos
* fix C API test on MacOS + fix manylinux dotnet install
* fix manylinux dotnet install
* fix lib link
* add SAS token to download internal test data for nuget pipeline
* update azure endpoint
* fix keyvault download step
* fix variable declaration for secret group
* fix indentation
* fix yaml syntax for variables
* fix setting secrets for script
* fix env synctax
* Fix macos pipeline
* attempt to add secrets to windows download data
* fix mac and win data download
* fix windows data download
* update test data set url and location
* Initial check in
* Add win x86
* minor update to x86
* update win-ci
* update win-ci
* update win-x86ci
* add linux and mac templates
* add nuget pipelines and test templates
* remove buildConfig
* add compliance template
* fix minor typos
* update pool for macos
* update mac agent pool
* update macos pool
* update agent pools for tests
* turn off debug build for testing
* some modifications to packaging scripts
* change ordering of compliance tasks
* Add mklml pipeline
* Add packagename variable to mklml pipeline
* remove unrequired dependent jobs from mklml pipeline
* Update build command for macOS legs in mklml and cpu pipeline
* Set vcvars to true
* Add no contrib ops pipeline
* Add no-contrib-ops pipeline
* set vcvars to true for package tests
* remove repetition in nuget templates
* get buildarch correct
* get name of test template correct
* remove steps from test_all_os.yml
* add parameters to test_all_os.yml
* Need jobs, not steps
* set envars for disablecontrib ops
* add cleanup tasks and CG to package tests
* fix path to cleanup script for macos
* remove buildDirectory -- not needed
* remove fp16tiny_yolov2 model from nocontribops tests
* remove debugging info
* fix individual linux pipelines to use correct template
* remove unneeded bak_latest2
* increase timeout to 120 to allow for variance
* turn off code coverage report