Commit graph

1860 commits

Author SHA1 Message Date
Yi Zhang
14d7872ce9
Reuse T4 for Cuda12.2 training packaging pipeline. (#20244)
### Description
The training CUDA 12.2 packaging pipeline
https://dev.azure.com/aiinfra/Lotus/_build?definitionId=1308&_a=summary
has been running out of memory ever since PR #19910.
I tried other CPU agents, for example D64as_v5 (256 GB memory) and
D32as_v4 (128 GB memory and 256 GB SSD temp storage), which still run out
of memory, as in the image below.

![image](https://github.com/microsoft/onnxruntime/assets/16190118/5acde9ef-674f-4b6d-a1b3-b54647645083)


But the build succeeds on T4, even though a T4 agent only has 4 vCPUs,
28 GB memory, and 180 GB temp storage, and it takes much more time.

### Motivation and Context
Restore the CUDA 12.2 training packaging pipeline first; more time is
needed to investigate the root cause.


### Other Clues
These two compilation steps take nearly 6 minutes each with CUDA 12.2 on T4,
while the build runs out of memory on a CPU machine. @ajindal1
CUDA 12.2 on T4:
```
2024-03-14T05:39:08.7726865Z [ 90%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim32_fp16_sm80.cu.o
2024-03-14T05:45:01.3223393Z [ 90%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim64_bf16_sm80.cu.o

2024-03-14T05:46:07.9218003Z [ 90%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim96_fp16_sm80.cu.o
2024-03-14T05:52:59.2387051Z [ 90%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/group_query_attention_impl.cu.o

```

But the same steps finish in about one minute with CUDA 11.8 on a CPU agent:
```
cuda11.8 on CPU
2024-04-09T11:34:35.0849836Z [ 90%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim32_fp16_sm80.cu.o
2024-04-09T11:35:53.6648154Z [ 90%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim64_bf16_sm80.cu.o

cuda11.8 on GPU
2024-03-13T12:16:33.4102477Z [ 90%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim32_fp16_sm80.cu.o
2024-03-13T12:19:58.8268272Z [ 90%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim64_bf16_sm80.cu.o
```
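For reference, the per-file compile time is just the gap between consecutive `Building CUDA object` lines; a minimal Python sketch (illustrative only, using the timestamps from the CUDA 12.2 log above):

```python
from datetime import datetime


def elapsed(start, end):
    # Parse the build-log timestamps shown above, truncating the 7-digit
    # fractional seconds to the 6 digits strptime's %f accepts, and return
    # the elapsed time between two log lines.
    fmt = "%Y-%m-%dT%H:%M:%S.%f"
    return datetime.strptime(end[:26], fmt) - datetime.strptime(start[:26], fmt)


# CUDA 12.2 on T4: roughly 5 min 53 s for one flash-attention object file.
print(elapsed("2024-03-14T05:39:08.7726865", "2024-03-14T05:45:01.3223393"))
```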
2024-04-10 09:21:40 +08:00
Adrian Lizarraga
05d97e8d18
Update QNN python packages to use QNN SDK version 2.19.2 (#20213)
### Description
Update QNN python packages to use QNN SDK version 2.19.2.



### Motivation and Context
Our CI builds already use QNN SDK version 2.19.2. We should make sure
the ort-nightly-qnn python packages are also built with the same QNN SDK
version.
2024-04-05 17:15:25 -07:00
Yi Zhang
23a5d0a305
Extend time out in Windows GPU packaging jobs (#20207)
### Description
Extend the Windows GPU packaging job build timeout to 6 hours, and the
test stage timeout to 3 hours.



### Motivation and Context
There are still a few timeout issues after the refactoring; the probability
is about 20% in
https://dev.azure.com/aiinfra/Lotus/_build?definitionId=84.
I found the build can finish within 4 hours when it becomes slow,
https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=434340&view=logs&j=0c6ee496-b38e-55a9-3699-12934156e90f,
although in most cases it only takes about 30 minutes.
Unlike before, when the build could not complete at all, it now merely runs
slowly, so in this PR I extend the timeout to 6 hours.

One interesting observation: if one Windows GPU job becomes slow, all
other Windows GPU jobs in the same run become slow too.
So I suspect it is related to ADO or virtualization; that is,
it's not completely random.
https://dev.azure.com/aiinfra/Lotus/_build?definitionId=841
2024-04-06 08:03:42 +08:00
Yi Zhang
4ea54b82f9
[Fix] Upload training CUDA daily wheel (#20183)
### Description

### Motivation and Context
2024-04-03 13:18:26 +08:00
Yi Zhang
523ef04240
Enable lto in Python-CUDA-Packaging Pipeline (#20164)
### Description
Except for the [Python-CUDA-Packaging
pipeline](https://dev.azure.com/aiinfra/Lotus/_build?definitionId=1299&_a=summary),
all Windows CUDA packaging jobs have been running well now.
After comparison, enable_lto wasn't set in this pipeline, which might be
one root cause of the random hangs.


### Motivation and Context
2024-04-01 15:42:28 +08:00
Jeff Bloomfield
2f31560430
Enable generic feature level devices in DML EP (#20114)
### Description
Enable NPUs supporting DXCORE_ADAPTER_ATTRIBUTE_D3D12_GENERIC_ML and
D3D_FEATURE_LEVEL_1_0_GENERIC with DML EP. This also begins ingesting DX
headers through the DirectX-Headers repo.

Note that this includes an update to cgamanifest.json for onnx-tensorrt,
which is triggered during re-generation due to prior changes to
deps.txt.

### Motivation and Context
2024-03-29 14:37:30 -07:00
Adam Pocock
2f82400b13
[java] Java 21 build support (#19876)
### Description
Bump spotless and the Gradle wrapper to 6.25.0 and 8.6 respectively to
allow compiling ORT on Java 21. The build still targets Java 8.

I'm not sure if there will be CI changes necessary to use this PR,
specifically for the Gradle version as I don't know if that is cached
somewhere earlier in the CI build process.

The new Gradle version adds a warning that using `--source` and
`--target` to select the Java language version is obsolete, which is
annoying; we can fix it if we decide to only allow building on newer
versions of Java while still supporting running on Java 8.

### Motivation and Context
Java 21 is the latest LTS release of Java and ORT should be able to
build on it.
2024-03-28 15:51:22 -07:00
Yi Zhang
f7b52d2e3e
[Fix] Only copy java files when build_java is True (#20121)
### Description


### Motivation and Context
Fix error in Nuget-CUDA-Packaging-Pipeline
2024-03-28 14:06:28 -07:00
Yi Zhang
c5d7310f1b
Remove TSA upload in testing stage (#20115)
### Description

### Motivation and Context

---------

Co-authored-by: Yi Zhang <your@email.com>
2024-03-28 13:15:03 +08:00
Yi Zhang
8f069f81c4
Split more windows GPU workflow into 2 stages, building and testing, to make them more stable (#20080)
### Description
Refactor win-ci.yml to solve the random hang issue in more GPU workflows;
move the nuget-zip package and Python CUDA 12 package builds to a CPU
machine.

---------

Co-authored-by: Yi Zhang <your@email.com>
2024-03-28 12:55:44 +08:00
Dmitri Smirnov
b95fd4e644
Enable CUDA EP unit testing on Windows (#20039)
### Description
Address build issues and source code discrepancies.
Fix cuda_test_provider gtest argument stack corruption.

### Motivation and Context
The `OpTester` class that is widely used for kernel testing is not
suitable for testing internal classes of EPs that are built as shared
objects.
Currently, CUDA EP tests run only on Linux.
We want to enable testing and development on Windows,
and create a usable pattern for testing other EPs' internals.

Alternatives considered:
Abstracting the EP unit tests into a separate test executable such as
`onnxruntime_test_all`.
This alternative was rejected as it would create many more changes to
the established patterns,
and potentially interfere with CUDA functionality through more complex
source code maintenance.
2024-03-27 13:32:36 -07:00
Yi Zhang
ab2eaedfaa
Install ONNX by building source code in Windows DML stage (#20079)
### Description
In #20073, I pinned the onnx version to unblock the whole PR CI.
In fact, we can use the onnx installed by building from source,
so that the onnx version is controlled by deps.txt.
For historical reasons, the DML stage installed onnx from PyPI; now
onnx can be installed the same way as in the other stages.

Also, add an option to skip installing onnx in win-ci-prebuild-step.
2024-03-27 12:29:34 -07:00
Yi Zhang
4df9d16f98
[Fix] TSAUpload task must be in building stage (#20098)
### Description
In #20085, TSAUpload was in the testing stage, so the main branch failed.
2024-03-27 12:20:57 -07:00
Yulong Wang
47903e701a
fix condition in web CI YAML (#20095)
### Description
fix condition in web CI YAML
2024-03-27 10:35:43 -07:00
Yi Zhang
0561b9576e
Fix and Refactor Python Packaging Pipeline (#20085)
### Description
Make the Windows GPU packaging stage in the Python packaging pipeline run
on a CPU machine as well.



### Motivation and Context

### Test Link

https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=430961&view=results
2024-03-27 12:17:22 +08:00
Yulong Wang
0313dd1f65
Update Web CI to use data dir under Agent.TempDirectory (#20074)
### Description
Update Web CI to use data dir under Agent.TempDirectory

This change fixes the random failure caused by unstable access to karma
temp directory (which is under AppData\Local\Temp) on CI pipeline
2024-03-26 13:16:59 -07:00
Baiju Meswani
40efbd6c37
Fix training and macos ci pipelines (#20034) 2024-03-26 12:20:11 -07:00
sfatimar
eab35c20fc
Ort openvino npu 1.17 master (#19966)
### Description
Add NPU to the list of supported devices.
Added changes to support OpenVINO 2024.0.
NuGet packages no longer package the OpenVINO DLLs.
Bug fixes in the Python API.
Reverted Dockerfiles that are no longer maintained.



### Motivation and Context
NPU Device has been introduced by Intel in latest client systems
OpenVINO 2024.0 release is out.

---------

Co-authored-by: Suryaprakash Shanmugam <suryaprakash.shanmugam@intel.com>
Co-authored-by: Preetha Veeramalai <preetha.veeramalai@intel.com>
Co-authored-by: Ubuntu <ubuntu@ubuntu-118727.iind.intel.com>
Co-authored-by: hmamidix <hemax.sowjanya.mamidi@intel.com>
Co-authored-by: vthaniel <vishnudas.thaniel.s@intel.com>
Co-authored-by: saurabhkale17 <saurabh1.kale@intel.com>
2024-03-21 18:44:00 -07:00
Yi Zhang
cd6d3aea45
Refactor Python CUDA packaging pipeline to fix random hangs in building (#19989)
### Description
1. Move building to a CPU machine.
2. Optimize the pipeline.
3. Since there isn't an official ONNX package for Python 3.12, the Python
3.12 test stage uses the packages built from ONNX source in the build stage.


### Motivation and Context
1. Resolve the random hang in compilation.
2. Save a lot of GPU resources.

---------
2024-03-22 09:16:00 +08:00
Yi Zhang
30a0d80925
Fix exception in Publish unit test results step (#20007)
### Description
Test result files are all in the RelWithDebInfo\RelWithDebInfo directory,
so it's not necessary to stat the _deps directory.

### Motivation and Context
Recently this exception has occurred many times in the zip-nuget pipeline:
`##[error]Error: Failed find: EPERM: operation not permitted, stat
'D:\a\_work\1\b\RelWithDebInfo\_deps\flatbuffers-src\java\src\test\java\DictionaryLookup'`

https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=426981&view=logs&j=75fc0348-fe99-522b-3acb-90fd80ac5271&t=5d4ebcc1-bcde-574d-6f4e-8abd0f04ae4b
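The fix can be sketched as restricting the result-file search to the RelWithDebInfo output directory so the walk never touches `_deps` (a hypothetical Python illustration; the actual step uses the ADO Publish Test Results task, and `find_test_results` is an illustrative name):

```python
from pathlib import Path


def find_test_results(build_dir):
    # Search only the RelWithDebInfo output directory for test-result XML
    # files, instead of walking the whole build tree -- the full walk pulled
    # _deps into the search and hit EPERM on stat.
    return sorted(Path(build_dir, "RelWithDebInfo").rglob("*.xml"))
```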
2024-03-22 06:53:59 +08:00
Yi Zhang
175f149b30
Remove downloading deps in CUDA package test stage (#19993)
### Description



### Motivation and Context
Downloading deps is not needed in the test stage;
remove it to reduce random download errors.
2024-03-21 10:01:03 +08:00
Yufeng Li
15219e2e71
turn on neural_speed by default (#19627)
### Description
The crash caused by neural_speed turns out to be a very rare corner case,
so turn it on by default.


### Motivation and Context
2024-03-20 12:49:58 -07:00
Rachel Guo
6b305f95e0
Support xcframework for mac catalyst builds. (#19534)
### Description



### Motivation and Context

MAUI on macOS uses mac-catalyst which requires a different native
binary.

---------

Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
Co-authored-by: Scott McKay <skottmckay@gmail.com>
2024-03-20 10:55:19 -07:00
Yi Zhang
8adbc09314
[Fix] Error Python Packaging Pipeline (Training CPU) (#19992)
### Description
fix the error caused by
https://github.com/microsoft/onnxruntime/pull/19973
2024-03-20 09:02:50 -07:00
mindest
3dfe4a5e6d
[ROCm] Remove MPI dependency and collectives to use NCCL (#19830)
### Description
* Remove MPI dependency to use NCCL AllReduce, etc.
* Exclude unsupported collectives in hipify
2024-03-19 17:35:18 -07:00
Hariharan Seshadri
cd6ec50b50
Switch a portion of CI/packaging jobs to MacOS12 (#19908) 2024-03-19 14:54:58 -07:00
Yi Zhang
d4c8bc359e
Fix Training CPU docker image name to avoid unnecessary rebuilding (#19973)
### Description
The docker image name was fixed, but the docker argument was different
in different job.
It would trigger rebuilding the docker image almost every time!!!
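The caching scheme implied here can be sketched as deriving the image tag from the build arguments, so unchanged arguments reuse the cached image and changed arguments produce a new tag (a hypothetical sketch; `image_tag` and its parameters are illustrative, not the pipeline's actual code):

```python
import hashlib


def image_tag(base_name, build_args):
    # Fold the docker build arguments into the image tag: the same
    # arguments always hash to the same tag (cache hit), while any
    # changed argument yields a new tag (forced rebuild).
    digest = hashlib.sha256(
        repr(sorted(build_args.items())).encode()
    ).hexdigest()[:12]
    return f"{base_name}:{digest}"
```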
2024-03-19 09:33:24 -07:00
Yulong Wang
b29849a287
[js/common] fix typedoc warnings (#19933)
### Description
Fix a few warnings in typedoc (for generating JS API):
```
[warning] The signature TrainingSession.loadParametersBuffer has an @param with name "buffer", which was not used.
[warning] NonTensorType, defined in ./lib/onnx-value.ts, is referenced by OnnxValue but not included in the documentation.
[warning] TensorFactory, defined in ./lib/tensor-factory.ts, is referenced by Tensor but not included in the documentation.
[warning] ExternalDataFileType, defined in ./lib/onnx-model.ts, is referenced by InferenceSession.SessionOptions.externalData but not included in the documentation.
[warning] TensorToDataUrlOptions, defined in ./lib/tensor-conversion.ts, is referenced by Tensor.toDataURL.toDataURL.options but not included in the documentation.
[warning] TensorToImageDataOptions, defined in ./lib/tensor-conversion.ts, is referenced by Tensor.toImageData.toImageData.options but not included in the documentation.
[warning] Failed to resolve link to "GpuBufferType" in comment for Env.WebGpuFlags.adapter.
[warning] Failed to resolve link to "GpuBufferType" in comment for Env.WebGpuFlags.device.
```

Changes highlighted:
- Merge `CoreMlExecutionProviderOption` and
`CoreMLExecutionProviderOption`. They expose two different sets of options
for React Native and the ORT Node.js binding; this should be fixed in the
future.
- Fix a few naming inconsistencies between the JSDoc and the parameters
- Fix broken type links
- Exclude trace functions
2024-03-15 19:01:50 -07:00
Yifan Li
0b2a75b274
[EP Perf] Add concurrency test (#19804)
### Description
* Add a concurrency test to the EP Perf CI panel (implemented via
onnx_test_runner)
  * Model: the FasterRCNN-10 model within the CI image
  * `-c` param configurable via the CI panel when kicking off CI tasks
  * Test inputs/outputs are auto-replicated according to the `-c` param
* By default, the model test is executed for 100 iterations (~2 min
added to the T4 CI task load overall)

### Motivation and Context
To monitor potential concurrency issues of ORT-TRT
2024-03-15 07:41:21 -07:00
Justin Chu
bcf47d3546
Update install_deps_lort.sh to fix onnxscript installation (#19922)
Install onnxscript correctly with `pip install`. Dev dependencies are
not required.

### Motivation and Context

Fix build breaks.
2024-03-14 17:05:50 -07:00
Adam Louly
32558134a9
[On-Device-Training] Upgrade Flatbuffers to Support 2GB+ Checkpoints. (#19770)
### Description
Modifications to support 2GB+ checkpoint & Upgrading Flatbuffers


### Motivation and Context
This PR includes changes that will make ORT handle 2 GB+ checkpoints.
To do that we need to upgrade flatbuffers to 23.5.9:
https://github.com/google/flatbuffers/pull/7945

- Modified the commitHash and the hash for the new version
- Removed the patch for the Rust generator's unused-variable warning, as the
new version no longer produces it - [check it out
here](d121e09d89/src/idl_gen_rust.cpp)
- Updated the VerifyField calls with the alignment values that were
introduced in the new version.

---------

Co-authored-by: Sumit Agarwal <sumitagarwal@microsoft.com>
2024-03-14 16:36:24 -07:00
Yi Zhang
87a9f77c56
Refactor Python Packaging Pipeline (Training Cuda 11.8) (#19910)
### Description
1. Use stages to organize the pipeline and split building and testing.
2. Move compilation to a CPU machine.
3. The test stage can leverage existing artifacts.
4. Check the wheel size; warn if it is above 300 MB.
5. The docker image name wasn't changed even when the arguments changed,
which caused the docker image to always be rebuilt. Updating the docker
image name according to the arguments saves the docker build time.

Pipeline duration reduced by 60% (2 hours -> 50 minutes).
Compilation time reduced by 75% (1.5 hours -> 20 minutes).
GPU time reduced by 87% (8 hours -> 1 hour).
For debugging, GPU time can be reduced by over 95%, because we
can choose to run only one test stage and skip building.
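Item 4 (the wheel-size warning) can be sketched as follows; this is a hypothetical illustration rather than the pipeline's actual script, using the `##[warning]` prefix that ADO logging commands recognize:

```python
import os


def check_wheel_size(wheel_path, limit_mb=300):
    # Measure the built wheel and emit an ADO warning when it exceeds
    # the 300 MB threshold described above; return the size in MB.
    size_mb = os.path.getsize(wheel_path) / (1024 * 1024)
    if size_mb > limit_mb:
        print(f"##[warning]{wheel_path} is {size_mb:.1f} MB (limit {limit_mb} MB)")
    return size_mb
```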

### Motivation and Context
Make the pipeline efficient.
Optimized:
https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=424177&view=results
Current:
https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=422393&view=results

---------
2024-03-15 06:47:41 +08:00
Changming Sun
8b766bd24e
Change nuget pipeline's "Windows_Packaging_combined_GPU" job to download TRT binaries in every build (#19919)
### Description
Change the nuget pipeline's "Windows_Packaging_combined_GPU" job to download
TRT binaries in every build. Now all the other build jobs are already
doing this; this is the only one left.

Similar to #19909

### Motivation and Context

As a follow up of #19118
2024-03-14 15:07:56 -07:00
Changming Sun
ea4a5eea18
Change nuget pipeline's "Final_Jar_Testing_Windows_GPU" job to download TRT binaries in every build (#19909)
### Description
Change nuget pipeline's "Final_Jar_Testing_Windows_GPU" job to download
TRT binaries in every build. Now all the other build jobs are already
doing this. This is the only one left.


### Motivation and Context

As a follow up of #19118
2024-03-14 07:55:00 -07:00
Yulong Wang
e771a763c3
[js/test] align web test runner flags with ort.env (#19790)
### Description
The `npm test` flags are difficult to memorize because they are
different from the `ort.env` flags. This change makes those flags align
with the ORT JS API, e.g. `--wasm-enable-proxy` becomes `--wasm.proxy`.

Old flags are marked as deprecated, except `-x` (as a shortcut for
`--wasm.numThreads`).
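The rename can be illustrated with a small mapping (a hypothetical Python sketch, not the actual test runner's code; only the `--wasm-enable-proxy` -> `--wasm.proxy` pair is taken from the description above, and `migrate_flags` is an illustrative helper):

```python
# Map deprecated test-runner flags to their ort.env-aligned spellings.
FLAG_RENAMES = {
    "--wasm-enable-proxy": "--wasm.proxy",
}


def migrate_flags(argv):
    # Rewrite deprecated flags; anything unrecognized (including the
    # "-x" shortcut for --wasm.numThreads) passes through unchanged.
    return [FLAG_RENAMES.get(arg, arg) for arg in argv]
```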
2024-03-13 12:00:36 -07:00
Yi Zhang
d5d9dbd51d
reuse T4 on Linux GPU (#19879)
### Description

### Motivation and Context
Linux GPU test on A10 isn't very stable
2024-03-13 10:41:36 -07:00
Hariharan Seshadri
ed306b4f97
Fix Android CI pipeline (#19877) 2024-03-13 10:09:43 -07:00
Justin Chu
faea42af95
Bump ruff to 0.3.2 and black to 24 (#19878)
### Motivation and Context

Routine updates
2024-03-13 10:00:32 -07:00
Yi Zhang
9e0a0f0f32
Check whether required tests are executed. (#19884)
### Description
Check that the onnx node tests and model tests actually ran.

### Motivation and Context
The onnx node test data and model data are mounted in one directory,
and onnxruntime_test_all searches that directory and loads the data.
If the directory doesn't exist, or something changes in onnxruntime_test_all,
those tests may silently not be executed.
For example, all the onnx node test data is only 32 MB; we would hardly
be aware of the regression.
So I added a simple check to ensure those tests are executed.
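The guard described above can be sketched as follows (hypothetical Python; the real check lives in the CI scripts, and `assert_tests_loaded` is an illustrative name):

```python
import os


def assert_tests_loaded(test_data_dir, executed_count):
    # Fail fast if the mounted test-data directory is missing, or if the
    # test binary executed fewer tests than there are test-case
    # directories -- so a silent skip cannot go unnoticed.
    if not os.path.isdir(test_data_dir):
        raise RuntimeError(f"test data dir missing: {test_data_dir}")
    expected = len(os.listdir(test_data_dir))
    if executed_count < expected:
        raise RuntimeError(f"only {executed_count}/{expected} tests ran")
```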

---------

Co-authored-by: Yi Zhang <your@email.com>
2024-03-13 09:59:57 -07:00
Yi Zhang
7313aa4efe
Remove --extra-index-url (#19885)
### Description



### Motivation and Context
--extra-index-url is not allowed by the injected Secure Supply Chain step in
packaging pipelines:
```
> Starting Multifeed Python Security Analysis:
##[warning]tools/ci_build/github/azure-pipelines/bigmodels-ci-pipeline.yml - Found "extra-index-url". (https://aka.ms/cfs/pypi)
```
And those two packages can now be installed from PyPI as well.

Co-authored-by: Yi Zhang <your@email.com>
2024-03-13 09:45:22 -07:00
Edward Chen
860eb762c2
[Apple framework] Fix minimal build with training enabled. (#19858)
Fix some linker errors that come up when integrating the onnxruntime-training-c pod into another Xcode project. The problematic configuration is a minimal build with training APIs enabled.
- training_op_defs.o had some unresolved references to ONNX functions. It should not be included at all in a minimal build.
- tree_ensemble_helper.o also had unresolved references to ONNX ParseData. The containing function is unused in a minimal build.

Added a test to cover this configuration.
2024-03-12 11:33:30 -07:00
Yi Zhang
d4fa4f0276
Remove FFmpeg to meet compliance (#19859) 2024-03-12 09:06:59 -07:00
Changming Sun
5479124834
Remove remaining Windows ARM32 build jobs (#19840)
### Description
As a follow-up to #19788, remove the remaining Windows ARM32 build
jobs.


### Motivation and Context
Our nuget packaging pipeline is failing because it could not find an
artifact for Win ARM32:
```
##[error]Artifact onnxruntime-training-win-arm was not found for build 421397.
```
Deprecation of Win ARM32 was announced by the Windows team in January 2023;
we should follow it.
2024-03-11 11:25:11 +08:00
Yifan Li
069d2d6f54
[EP Perf] Update EP Perf dockerfiles with cuda12/cudnn9 (#19781)
### Description
* Update the names of the existing dockerfiles and add support for testing
the latest TensorRT EA binary located in the image
* Add a CUDA 12.3 / cuDNN 9 / TensorRT 8.6 dockerfile
* Add details to the CI prompts and configs

Instructions to test the latest TRT via BIN:
1. Select `BIN` as the TensorRT version.
2. In Variables, update the related tarCudaVersion, **clear**
tarCudnnVersion (not required by the latest TRT tar binary), and set the
path to the binary.
2024-03-08 13:58:22 -08:00
Yifan Li
3170a48e60
[EP Perf] Add tag to indicate which TRT parser is using (#19784)
### Description
* Add a tag to distinguish whether the TRT `builtin` or `oss` parser is
being used
* The `oss` tag is suffixed with the onnx-tensorrt commit id, to indicate
which version of the oss parser it is
### Validate
DB entries before/after this PR
(during testing, a `builtin` or `oss_{commit_id}` tag was inserted into the
database entries):

### Motivation and Context
To distinguish perf results using the builtin/oss parser in the database,
this parser tag is needed.
In the future, results using different parsers will be listed on different
Perf Dashboard pages.
2024-03-08 10:24:36 -08:00
Ashwini Khade
e93a860819
Remove arm build for training (#19788)
We no longer support Win ARM32, so remove the associated build and
packaging job.
2024-03-05 21:54:48 -08:00
Scott McKay
db59cec82f
Don't reduce warning level for CUDA build on Windows (#19663)
### Description
Address warnings so all the ORT projects build with /W4 on Windows.

Mainly 
- unused parameters
- variables shadowing other ones

### Motivation and Context
#19588 started on this.
2024-03-06 15:03:55 +10:00
Yulong Wang
a788514027
[js/web] dump debug logs for karma for diagnose purpose (#19785)
### Description
dump debug logs for karma for diagnose purpose.

This is for debugging the CI issue of Chrome launch failure and
considered temporary.
2024-03-05 18:27:26 -08:00
Yi Zhang
9460597b21
Update copying API header files (#19736)
### Description
Make the Linux logic consistent with Windows.


### Motivation and Context
onnxruntime_lite_custom_op.h is in the Windows zip package but not in the
Linux zip package:

acbfc29f27/tools/ci_build/github/azure-pipelines/templates/c-api-artifacts-package-and-publish-steps-windows.yml (L67)

Co-authored-by: Your Name <your@email.com>
2024-03-02 11:33:47 +08:00
Edward Chen
5672cdebdf
Update google benchmark to 1.8.3. (#19734)
Update google benchmark to 1.8.3.
Update deps_update_and_upload.py script to make it easier to use.
2024-03-01 11:01:58 -08:00