Commit graph

12350 commits

Author SHA1 Message Date
Jian Chen
cf2ba46ce4 Change hostArchitecture to arm64 for aarch64 2025-02-07 15:10:33 -08:00
Jian Chen
1176c96540 Change hostArchitecture to arm64 for aarch64 2025-02-07 15:08:03 -08:00
Jian Chen
3737846fae Change the build artifact to pipeline artifact. 2025-02-07 08:49:42 -08:00
Jian Chen
bb75778c7a Update os 2025-02-06 21:55:58 -08:00
Jian Chen
16c8269520 Update os 2025-02-06 19:36:09 -08:00
Jian Chen
15347b8841 Update os 2025-02-06 19:31:06 -08:00
Jian Chen
67c2221a32 Adding a new stage 2025-02-06 19:25:40 -08:00
Jian Chen
071463027f Adding a new stage 2025-02-06 19:22:56 -08:00
Jian Chen
58183eb1a1 Adding a new stage 2025-02-06 19:17:49 -08:00
Jian Chen
293d96f915 undo templates 2025-02-06 19:11:40 -08:00
Jian Chen
8705d205f2 undo pool 2025-02-06 19:10:09 -08:00
Jian Chen
7d2e988d02 undo binary 2025-02-06 19:08:48 -08:00
Jian Chen
c17734e624 onnxruntime-inference-examples 2025-02-06 19:06:16 -08:00
Jian Chen
5108e87b33 undo dml-vs-2022.yml 2025-02-06 19:02:41 -08:00
Jian Chen
979a1a78a1 undo dml-vs-2022.yml 2025-02-06 19:01:25 -08:00
Jian Chen
3d8da4d0b0 undo Binary c-api-cpu.yml 1 2025-02-06 18:59:24 -08:00
Jian Chen
645f2370ab undo Binary c-api-cpu.yml 1.25
1ES.PublishPipelineArtifact@0 to 1ES.PublishPipelineArtifact@1
2025-02-06 18:58:00 -08:00
Jian Chen
abb684873e undo Binary c-api-cpu.yml 1.25 2025-02-06 18:56:10 -08:00
Jian Chen
73b7200ffe undo Binary c-api-cpu.yml 1.5 2025-02-06 18:55:00 -08:00
Jian Chen
bcb0652298 undo Binary c-api-cpu.yml 1.75 2025-02-06 18:53:08 -08:00
Jian Chen
9746f1cb9b undo Binary c-api-cpu.yml 1.5 2025-02-06 18:51:58 -08:00
Jian Chen
791304291f undo Binary c-api-cpu.yml 1 2025-02-06 18:50:18 -08:00
Jian Chen
503b159f18 undo Binary c-api-cpu.yml 2 2025-02-06 18:49:15 -08:00
Jian Chen
073a6c5675 undo Binary c-api-cpu.yml 3 2025-02-06 18:48:07 -08:00
Jian Chen
b16aac39a2 undo Binary c-api-cpu.yml 3 2025-02-06 18:47:35 -08:00
Jian Chen
4bdec8e0bf Binary c-api-cpu.yml 3 2025-02-06 18:45:32 -08:00
Jian Chen
27eadb158b Binary c-api-cpu.yml 2 2025-02-06 18:44:52 -08:00
Jian Chen
10355d77e3 Binary c-api-cpu.yml 2 2025-02-06 18:44:14 -08:00
Jian Chen
165d968fb7 Binary c-api-cpu.yml 2 2025-02-06 18:43:55 -08:00
Jian Chen
faa38bffbf Binary c-api-cpu.yml 2025-02-06 18:41:35 -08:00
Jian Chen
de9ce655fc Disable c-api-cpu.yml 2025-02-06 18:40:03 -08:00
Jian Chen
1b3dcc89fe Update c-api-cpu.yml 2025-02-06 18:39:15 -08:00
Jian Chen
41f2aa32e1 Disable java-cuda-packaging-stage.yml 2025-02-06 18:34:04 -08:00
Jian Chen
471f287235 Disable c-api-cpu.yml 2025-02-06 18:33:11 -08:00
Jian Chen
77110697f4 Disable nuget-combine-cuda-stage.yml 2025-02-06 18:32:11 -08:00
Jian Chen
05dae73477 Disable dml 2025-02-06 18:30:52 -08:00
Jian Chen
7e22fb64cb 1ES 2025-02-06 18:27:42 -08:00
Jian Chen
39ab30674d publish 2025-02-06 18:24:30 -08:00
Jian Chen
6e45a7bf1d Try to skip validate-package.yml 2025-02-06 18:22:40 -08:00
Jian Chen
f9aa616b04 Try to skip validate-package.yml 2025-02-06 18:20:53 -08:00
Jian Chen
60749a2e5f Try to skip validate-package.yml 2025-02-06 18:17:58 -08:00
Jian Chen
5cf3f47137 Try to skip validate-package.yml 2025-02-06 18:16:42 -08:00
Jian Chen
361c41ed0a Try to skip ESRP 2025-02-06 18:15:27 -08:00
Jian Chen
9efa0b4965 Migrate Zip-Nuget Package Pipeline to 1ES 2025-02-06 18:07:35 -08:00
Jian Chen
33e6ebfe2f Migrate Zip-Nuget Package Pipeline to 1ES 2025-02-06 17:59:23 -08:00
Jian Chen
0469e1577d Migrate Zip-Nuget Package Pipeline to 1ES 2025-02-06 17:58:54 -08:00
Jian Chen
d9dda06456 Migrate Zip-Nuget Package Pipeline to 1ES 2025-02-06 17:57:59 -08:00
Jian Chen
702ed1ce0f Migrate Zip-Nuget Package Pipeline to 1ES 2025-02-06 17:56:34 -08:00
microsoft-github-policy-service[bot]
65008cbb73
Auto-generated baselines by 1ES Pipeline Templates (#23603) 2025-02-06 17:06:29 -08:00
Tianlei Wu
09e5724f3b
[CUDA] Fix beam search of num_beams > 32 (#23599)
### Description
* Pass topk_scores to beam scorer in slow topk path.
* Add an env variable `ORT_BEAM_SEARCH_USE_FAST_TOPK` to enable/disable fast topk.
* Add a test case for slow topk path.

### Motivation and Context

This bug was introduced in
https://github.com/microsoft/onnxruntime/pull/16272

Beam search uses a fast CUDA kernel when the number of beams is <= 32. When
the beam size exceeds that threshold, another code path (a slower
CUDA kernel) is used to get topk. In this `slow topk path`, topk_scores should
be passed to the beam scorer, but it was not.

This bug causes incorrect results when num_beams > 32. It was not
found previously because such a large beam size is rarely used.
2025-02-06 16:50:31 -08:00
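
The path selection described in the commit above can be illustrated with a minimal Python sketch. This is not the onnxruntime implementation, which lives in CUDA/C++. The helper names `use_fast_topk` and `slow_topk` are hypothetical. The accepted values of `ORT_BEAM_SEARCH_USE_FAST_TOPK` are also an assumption: here `"0"` disables the fast path and anything else, including unset, leaves it enabled.

```python
import os

# Threshold stated in the commit message: the fast kernel handles num_beams <= 32.
FAST_TOPK_MAX_BEAMS = 32


def use_fast_topk(num_beams, env=None):
    """Decide which topk path to take (hypothetical helper, not the real API)."""
    env = os.environ if env is None else env
    # Assumption: "0" turns the fast path off; the real parsing may differ.
    enabled = env.get("ORT_BEAM_SEARCH_USE_FAST_TOPK", "1") != "0"
    return enabled and num_beams <= FAST_TOPK_MAX_BEAMS


def slow_topk(scores, k):
    """Sort-based topk standing in for the slower kernel.

    Returns both the top-k scores and their indices. The fix described in the
    commit is that the scores, not only the indices, reach the beam scorer.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return [scores[i] for i in order], order
```

With these assumptions, `use_fast_topk(64)` selects the slow path regardless of the environment variable, and setting the variable to `"0"` forces the slow path even for small beam counts, which is how the added test case can exercise it.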