onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-12 17:57:38 +00:00

Author	SHA1	Message	Date
Yufeng Li	985acda28c	optimize int4 gemv kernel with cuda (#18818 ) ### Description optimize gemv kernel: 1. unroll the reduction to improve memory bandwidth utilization 2. use 4-bit-to-float16 conversion tricks to save instructions \| m \| n \| k \| symmetric \| latency before (us) \| latency after (us) \| \| - \| ----- \| ----- \| --------- \| ------------------ \| ----------------- \| \| 1 \| 4096 \| 4096 \| TRUE \| 15.54 \| 8.82 \| \| 1 \| 4096 \| 4096 \| FALSE \| 15.84 \| 9.89 \| \| 1 \| 4096 \| 11008 \| TRUE \| 42.44 \| 19.4 \| \| 1 \| 4096 \| 11008 \| FALSE \| 44.42 \| 21.48 \| \| 1 \| 11008 \| 4096 \| TRUE \| 34.65 \| 17.46 \| \| 1 \| 11008 \| 4096 \| FALSE \| 35.76 \| 20.87 \| \| 1 \| 12288 \| 4096 \| TRUE \| 39.27 \| 19.73 \| \| 1 \| 12288 \| 4096 \| FALSE \| 40.91 \| 25.2 \| \| 1 \| 22016 \| 4096 \| TRUE \| 65.78 \| 38.81 \| \| 1 \| 22016 \| 4096 \| FALSE \| 67.98 \| 48.36 \|	2023-12-21 19:32:34 -08:00
Changming Sun	3d8f229d39	Add ARM64EC build jobs (#18870 ) ### Description Add ARM64EC build jobs in post merge pipeline to validate if our code is compatible with Windows ARM64EC.	2023-12-21 16:31:38 -08:00
Changming Sun	5b93c465a8	Delete .github/workflows/generated_fake_win_gpu_ci.yml (#18074 ) ### Description No longer needed. Now Azure DevOps has the built-in support.	2023-12-21 16:31:11 -08:00
Yifan Li	0af946f35a	[EP Perf] Fix ORT-CUDAFp16 tests (#18908 ) ### Description All ORT-CUDAFp16 model tests failed because the latest `onnxmltools` (1.12.0) removed `onnxconverter-common` from its dependencies; the EP perf environment needs that package to test models with the CUDA EP under fp16. Add `onnxconverter-common` as a dependency of the environment to fix this.	2023-12-21 16:20:41 -08:00
Wanming Lin	4c3705cbea	[WebNN EP] Change some support status for XNNPack backend (#18858 ) WebNN XNNPack backend doesn't really support `pow` and `reduceSum`, and will support `sqrt` very soon.	2023-12-21 15:16:44 -08:00
Wanming Lin	1b64d30963	[WebNN EP] Infer the layout via ONNX domain for Resize (#18871 ) Previously we added EP-specific logic to generic core code to restrict Resize for the WebNN EP in https://github.com/microsoft/onnxruntime/pull/18687, which neither scales nor makes sense. This PR reverts the change in https://github.com/microsoft/onnxruntime/pull/18687 and uses ONNX domain information to infer the layout during layout transformation.	2023-12-21 11:30:29 -08:00
dependabot[bot]	8507c06f8e	Bump conda-incubator/setup-miniconda from 2 to 3 (#18685 ) Bumps [conda-incubator/setup-miniconda](https://github.com/conda-incubator/setup-miniconda) from 2 to 3.
Release notes, sourced from [conda-incubator/setup-miniconda's releases](https://github.com/conda-incubator/setup-miniconda/releases):
## Version 3.0.0
### Features
- [#308](https://redirect.github.com/conda-incubator/setup-miniconda/issues/308) Update to node20
- [#291](https://redirect.github.com/conda-incubator/setup-miniconda/issues/291) Add conda-solver option (defaults to libmamba)
### Fixes
- [#299](https://redirect.github.com/conda-incubator/setup-miniconda/issues/299) Fix condaBasePath when useBundled is false, and there's no pre-existing conda
### Documentation
- [#309](https://redirect.github.com/conda-incubator/setup-miniconda/issues/309) Switch to main branch based development
- [#313](https://redirect.github.com/conda-incubator/setup-miniconda/issues/313) Specify team conda-incubator/setup-miniconda as codeowners
- [#318](https://redirect.github.com/conda-incubator/setup-miniconda/issues/318) README: update actions in examples, add security section, similar actions
### Tasks and Maintenance
- [#307](https://redirect.github.com/conda-incubator/setup-miniconda/issues/307) Run dependabot against main branch and also update node packages
- [#311](https://redirect.github.com/conda-incubator/setup-miniconda/issues/311) Bump actions/checkout from 2 to 4
- [#310](https://redirect.github.com/conda-incubator/setup-miniconda/issues/310) Bump actions/cache from 1 to 3
- [#314](https://redirect.github.com/conda-incubator/setup-miniconda/issues/314) Strip/update dependencies
- [#315](https://redirect.github.com/conda-incubator/setup-miniconda/issues/315) Split lint into check and build, switch from `npm install` to `npm ci`
- [#317](https://redirect.github.com/conda-incubator/setup-miniconda/issues/317) Bump normalize-url from 4.5.1 to 8.0.0
- [#316](https://redirect.github.com/conda-incubator/setup-miniconda/issues/316) Faster workflow response / saving resources via timeout/concurrency policy
## New Contributors
- [`@isuruf`](https://github.com/isuruf) made their first contribution in [conda-incubator/setup-miniconda#299](https://redirect.github.com/conda-incubator/setup-miniconda/pull/299)
**Full Changelog**: https://github.com/conda-incubator/setup-miniconda/compare/v2...v3.0.0
## Version 2.3.0
### Documentation
- [#263](https://redirect.github.com/conda-incubator/setup-miniconda/issues/263) Update links to GitHub shell docs
... (truncated)
Changelog, sourced from [conda-incubator/setup-miniconda's changelog](https://github.com/conda-incubator/setup-miniconda/blob/main/CHANGELOG.md):
## [v3.0.1](https://github.com/conda-incubator/setup-miniconda/releases/tag/v3.0.1) (2023-11-29)
### Fixes
- [#325](https://redirect.github.com/conda-incubator/setup-miniconda/issues/325) Fix environment activation on windows (a v3 regression) due to hard-coded install PATH
## [v3.0.0](https://github.com/conda-incubator/setup-miniconda/releases/tag/v3.0.0) (2023-11-27)
Same Features, Fixes, Documentation, and Tasks and Maintenance entries as Version 3.0.0 in the release notes above.
... (truncated)
Commits:
- `11b5629` Prepare 3.0.1 ([#326](https://redirect.github.com/conda-incubator/setup-miniconda/issues/326))
- `8706aa7` Fix env activation on win (a v3 regression) due to hard-coded install PATH (#...
- `c585a97` Bump conda-incubator/setup-miniconda from 2.3.0 to 3.0.0 ([#321](https://redirect.github.com/conda-incubator/setup-miniconda/issues/321))
- `2defc80` Prepare release ([#320](https://redirect.github.com/conda-incubator/setup-miniconda/issues/320))
- `0d5a56b` Bump actions/checkout from 2 to 4 ([#319](https://redirect.github.com/conda-incubator/setup-miniconda/issues/319))
- `45fd3f9` Merge pull request [#316](https://redirect.github.com/conda-incubator/setup-miniconda/issues/316) from dbast/timeout
- `d1e04fc` Merge pull request [#299](https://redirect.github.com/conda-incubator/setup-miniconda/issues/299) from isuruf/condaBasePath
- `fab0073` Merge pull request [#318](https://redirect.github.com/conda-incubator/setup-miniconda/issues/318) from dbast/readme
- `fa6bdf9` Update with npm run build
- `d42f8b8` Fix condaBasePath when useBundled is false, and there's no pre-existing conda
- Additional commits viewable in [compare view](https://github.com/conda-incubator/setup-miniconda/compare/v2...v3)
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-12-21 10:34:24 -08:00
dependabot[bot]	914bc409b0	Bump transformers from 4.30.0 to 4.36.0 in /tools/ci_build (#18895 ) Bumps [transformers](https://github.com/huggingface/transformers) from 4.30.0 to 4.36.0.
Release notes, sourced from [transformers's releases](https://github.com/huggingface/transformers/releases):
## v4.36: Mixtral, Llava/BakLlava, SeamlessM4T v2, AMD ROCm, F.sdpa wide-spread support
## New model additions
### Mixtral
Mixtral is the new open-source model from Mistral AI, announced in the blog post [Mixtral of Experts](https://mistral.ai/news/mixtral-of-experts/). According to the benchmark results shared in the release blog post, the model has capabilities comparable to Chat-GPT.
The architecture is a sparse Mixture of Experts with a Top-2 routing strategy, similar to the `NllbMoe` architecture in transformers. You can use it through the `AutoModelForCausalLM` interface:
```py
>>> import torch
>>> from transformers import AutoModelForCausalLM, AutoTokenizer
>>> # device_map="auto" places the weights on the available devices
>>> model = AutoModelForCausalLM.from_pretrained("mistralai/Mixtral-8x7B", torch_dtype=torch.float16, device_map="auto")
>>> tokenizer = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x7B")
>>> prompt = "My favourite condiment is"
>>> # move the tokenized inputs to the device that holds the model
>>> model_inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
>>> generated_ids = model.generate(**model_inputs, max_new_tokens=100, do_sample=True)
>>> tokenizer.batch_decode(generated_ids)[0]
```
The model is compatible with existing optimisation tools such as Flash Attention 2, `bitsandbytes` and the PEFT library. The checkpoints are released under the [`mistralai`](https://huggingface.co/mistralai) organisation on the Hugging Face Hub.
### Llava / BakLlava
Llava is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data. It is an auto-regressive language model based on the transformer architecture. In other words, it is a multi-modal version of LLMs fine-tuned for chat / instructions.
The Llava model was proposed in [Improved Baselines with Visual Instruction Tuning](https://arxiv.org/pdf/2310.03744) by Haotian Liu, Chunyuan Li, Yuheng Li and Yong Jae Lee.
- [`Llava`] Add Llava to transformers by [`@younesbelkada`](https://github.com/younesbelkada) in [#27662](https://redirect.github.com/huggingface/transformers/issues/27662)
- [LLaVa] Some improvements by [`@NielsRogge`](https://github.com/NielsRogge) in [#27895](https://redirect.github.com/huggingface/transformers/issues/27895)
The integration also includes [`BakLlava`](https://github.com/SkunkworksAI/BakLLaVA), which is a Llava model trained with a Mistral backbone.
The model is compatible with the `"image-to-text"` pipeline:
```py
from transformers import pipeline
from PIL import Image
import requests

model_id = "llava-hf/llava-1.5-7b-hf"
```
... (truncated)
Commits:
- `1466677` Release: v4.36.0
- `accccdd` [`Add Mixtral`] Adds support for the Mixtral MoE ([#27942](https://redirect.github.com/huggingface/transformers/issues/27942))
- `0676d99` [`from_pretrained`] Make from_pretrained fast again ([#27709](https://redirect.github.com/huggingface/transformers/issues/27709))
- `9f18cc6` Fix SDPA dispatch & make SDPA CI compatible with torch<2.1.1 ([#27940](https://redirect.github.com/huggingface/transformers/issues/27940))
- `7ea21f1` [LLaVa] Some improvements ([#27895](https://redirect.github.com/huggingface/transformers/issues/27895))
- `5e620a9` Fix `SeamlessM4Tv2ModelIntegrationTest` ([#27911](https://redirect.github.com/huggingface/transformers/issues/27911))
- `e96c1de` Skip `UnivNetModelTest::test_multi_gpu_data_parallel_forward` ([#27912](https://redirect.github.com/huggingface/transformers/issues/27912))
- `8d8970e` [BEiT] Fix test ([#27934](https://redirect.github.com/huggingface/transformers/issues/27934))
- `235be08` [DETA] fix backbone freeze/unfreeze function ([#27843](https://redirect.github.com/huggingface/transformers/issues/27843))
- `df5c5c6` Fix typo ([#27918](https://redirect.github.com/huggingface/transformers/issues/27918))
- Additional commits viewable in [compare view](https://github.com/huggingface/transformers/compare/v4.30.0...v4.36.0)
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-12-21 00:44:36 -08:00
dependabot[bot]	f3c62bfad9	Bump actions/setup-node from 3 to 4 (#18148 ) Bumps [actions/setup-node](https://github.com/actions/setup-node) from 3 to 4.
Release notes, sourced from [actions/setup-node's releases](https://github.com/actions/setup-node/releases):
## v4.0.0
## What's Changed
In this release, we changed the Node runtime version for the action from node16 to node20 and updated dependencies in [actions/setup-node#866](https://redirect.github.com/actions/setup-node/pull/866).
The release also contains the following changes:
- Upgrade actions/checkout to v4 by [`@gmembre-zenika`](https://github.com/gmembre-zenika) in [actions/setup-node#868](https://redirect.github.com/actions/setup-node/pull/868)
- Update actions/checkout for documentation and yaml by [`@dmitry-shibanov`](https://github.com/dmitry-shibanov) in [actions/setup-node#876](https://redirect.github.com/actions/setup-node/pull/876)
## New Contributors
- [`@gmembre-zenika`](https://github.com/gmembre-zenika) made their first contribution in [actions/setup-node#868](https://redirect.github.com/actions/setup-node/pull/868)
**Full Changelog**: https://github.com/actions/setup-node/compare/v3...v4.0.0
## v3.8.2
## What's Changed
- Update semver by [`@dmitry-shibanov`](https://github.com/dmitry-shibanov) in [actions/setup-node#861](https://redirect.github.com/actions/setup-node/pull/861)
- Update temp directory creation by [`@nikolai-laevskii`](https://github.com/nikolai-laevskii) in [actions/setup-node#859](https://redirect.github.com/actions/setup-node/pull/859)
- Bump `@babel/traverse` from 7.15.4 to 7.23.2 by [`@dependabot`](https://github.com/dependabot) in [actions/setup-node#870](https://redirect.github.com/actions/setup-node/pull/870)
- Add notice about binaries not being updated yet by [`@nikolai-laevskii`](https://github.com/nikolai-laevskii) in [actions/setup-node#872](https://redirect.github.com/actions/setup-node/pull/872)
- Update toolkit cache and core by [`@dmitry-shibanov`](https://github.com/dmitry-shibanov) and [`@seongwon-privatenote`](https://github.com/seongwon-privatenote) in [actions/setup-node#875](https://redirect.github.com/actions/setup-node/pull/875)
**Full Changelog**: https://github.com/actions/setup-node/compare/v3...v3.8.2
## v3.8.1
## What's Changed
In this release, the filter was removed from the cache-save step by [`@dmitry-shibanov`](https://github.com/dmitry-shibanov) in [actions/setup-node#831](https://redirect.github.com/actions/setup-node/pull/831). It is filtered and checked in the toolkit/cache library.
**Full Changelog**: https://github.com/actions/setup-node/compare/v3...v3.8.1
## v3.8.0
## What's Changed
### Bug fixes:
- Add check for existing paths by [`@dmitry-shibanov`](https://github.com/dmitry-shibanov) in [actions/setup-node#803](https://redirect.github.com/actions/setup-node/pull/803)
- Resolve SymbolicLink by [`@dmitry-shibanov`](https://github.com/dmitry-shibanov) in [actions/setup-node#809](https://redirect.github.com/actions/setup-node/pull/809)
- Change passing logic for cache input by [`@dmitry-shibanov`](https://github.com/dmitry-shibanov) in [actions/setup-node#816](https://redirect.github.com/actions/setup-node/pull/816)
- Fix armv7 cache issue by [`@louislam`](https://github.com/louislam) in [actions/setup-node#794](https://redirect.github.com/actions/setup-node/pull/794)
- Update check-dist workflow name by [`@sinchang`](https://github.com/sinchang) in [actions/setup-node#710](https://redirect.github.com/actions/setup-node/pull/710)
### Feature implementations:
- feat: handling the case where "node" is used for tool-versions file. by [`@xytis`](https://github.com/xytis) in [actions/setup-node#812](https://redirect.github.com/actions/setup-node/pull/812)
### Documentation changes:
- Refer to semver package name in README.md by [`@olleolleolle`](https://github.com/olleolleolle) in [actions/setup-node#808](https://redirect.github.com/actions/setup-node/pull/808)
### Update dependencies:
- Update toolkit cache to fix zstd by [`@dmitry-shibanov`](https://github.com/dmitry-shibanov) in [actions/setup-node#804](https://redirect.github.com/actions/setup-node/pull/804)
- Bump tough-cookie and `@azure/ms-rest-js` by [`@dependabot`](https://github.com/dependabot) in [actions/setup-node#802](https://redirect.github.com/actions/setup-node/pull/802)
- Bump semver from 6.1.2 to 6.3.1 by [`@dependabot`](https://github.com/dependabot) in [actions/setup-node#807](https://redirect.github.com/actions/setup-node/pull/807)
... (truncated)
Commits:
- `8f152de` Update actions/checkout for documentation and yaml ([#876](https://redirect.github.com/actions/setup-node/issues/876))
- `23755b5` upgrade actions/checkout to v4 ([#868](https://redirect.github.com/actions/setup-node/issues/868))
- `54534a2` Change node version for action to node20 ([#866](https://redirect.github.com/actions/setup-node/issues/866))
- See full diff in [compare view](https://github.com/actions/setup-node/compare/v3...v4)
Note: Automatic rebases have been disabled on this pull request as it has been open for over 30 days.
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-12-20 23:12:17 -08:00
dependabot[bot]	f74389c976	Bump github/issue-labeler from 3.2 to 3.3 (#18408 ) Bumps [github/issue-labeler](https://github.com/github/issue-labeler) from 3.2 to 3.3. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/github/issue-labeler/releases">github/issue-labeler's releases</a>.</em></p> <blockquote> <h2>v3.3</h2> <h2>What's Changed</h2> <ul> <li>feat(config): support reading from local file if it exists by <a href="https://github.com/lrstanley"><code>@lrstanley</code></a> in <a href="https://redirect.github.com/github/issue-labeler/pull/48">github/issue-labeler#48</a></li> </ul> <h2>New Contributors</h2> <ul> <li><a href="https://github.com/lrstanley"><code>@lrstanley</code></a> made their first contribution in <a href="https://redirect.github.com/github/issue-labeler/pull/48">github/issue-labeler#48</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/github/issue-labeler/compare/v3.2...v3.3">https://github.com/github/issue-labeler/compare/v3.2...v3.3</a></p> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="`6bea9ed491`"><code>6bea9ed</code></a> feat(config): support reading from local file if it exists (<a href="https://redirect.github.com/github/issue-labeler/issues/48">#48</a>)</li> <li>See full diff in <a href="https://github.com/github/issue-labeler/compare/v3.2...v3.3">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=github/issue-labeler&package-manager=github_actions&previous-version=3.2&new-version=3.3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) You can trigger a rebase of this PR by commenting `@dependabot rebase`. 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-12-20 22:20:59 -08:00
Yifan Li	54e471a054	[EP Perf] Display percentage of cuda/trt ops in cuda/trt ep on EP Perf Dashboard (#18868 ) ### Description Display percentage of cuda/trt ops in cuda/trt ep on EP Perf Dashboard: ![image](https://github.com/microsoft/onnxruntime/assets/109183385/bafba098-1338-46fa-b10a-ca19eff2a746) Check [here](https://msit.powerbi.com/groups/d1ae6355-afd0-4c40-b78e-676a86cab1e2/reports/82101bbb-dad2-4f24-9ddf-a37f0d41509a/ReportSectionda402bdf6824e505a614?experience=power-bi) to preview on ep perf dashboard ### Motivation and Context - brief overview of op metrics towards various models - easy to identify models which haven't reached 100% ops on cuda/trt ep.	2023-12-20 22:11:47 -08:00
dependabot[bot]	ce70a30b94	Bump transformers from 4.35.2 to 4.36.0 in /onnxruntime/python/tools/transformers/models/stable_diffusion (#18896 ) Bumps [transformers](https://github.com/huggingface/transformers) from 4.35.2 to 4.36.0. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/huggingface/transformers/releases">transformers's releases</a>.</em></p> <blockquote> <h2>v4.36: Mixtral, Llava/BakLlava, SeamlessM4T v2, AMD ROCm, F.sdpa wide-spread support</h2> <h2>New model additions</h2> <h3>Mixtral</h3> <p>Mixtral is the new open-source model from Mistral AI announced by the blogpost <a href="https://mistral.ai/news/mixtral-of-experts/">Mixtral of Experts</a>. The model has been proven to have comparable capabilities to Chat-GPT according to the benchmark results shared on the release blogpost.</p> <!-- raw HTML omitted --> <p>The architecture is a sparse Mixture of Experts with Top-2 routing strategy, similar as <code>NllbMoe</code> architecture in transformers. You can use it through <code>AutoModelForCausalLM</code> interface:</p> <pre lang="py"><code>>>> import torch >>> from transformers import AutoModelForCausalLM, AutoTokenizer <p>>>> model = AutoModelForCausalLM.from_pretrained("mistralai/Mixtral-8x7B", torch_dtype=torch.float16, device_map="auto") >>> tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-8x7B")</p> <p>>>> prompt = "My favourite condiment is"</p> <p>>>> model_inputs = tokenizer([prompt], return_tensors="pt").to(device) >>> model.to(device)</p> <p>>>> generated_ids = model.generate(**model_inputs, max_new_tokens=100, do_sample=True) >>> tokenizer.batch_decode(generated_ids)[0] </code></pre></p> <p>The model is compatible with existing optimisation tools such Flash Attention 2, <code>bitsandbytes</code> and PEFT library. 
The checkpoints are released under the <a href="https://huggingface.co/mistralai"><code>mistralai</code></a> organisation on the Hugging Face Hub.</p> <h3>Llava / BakLlava</h3> <p>Llava is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data. It is an auto-regressive language model, based on the transformer architecture. In other words, it is a multimodal version of LLMs fine-tuned for chat / instructions.</p> <!-- raw HTML omitted --> <p>The Llava model was proposed in <a href="https://arxiv.org/pdf/2310.03744">Improved Baselines with Visual Instruction Tuning</a> by Haotian Liu, Chunyuan Li, Yuheng Li and Yong Jae Lee.</p> <ul> <li>[<code>Llava</code>] Add Llava to transformers by <a href="https://github.com/younesbelkada"><code>@younesbelkada</code></a> in <a href="https://redirect.github.com/huggingface/transformers/issues/27662">#27662</a></li> <li>[LLaVa] Some improvements by <a href="https://github.com/NielsRogge"><code>@NielsRogge</code></a> in <a href="https://redirect.github.com/huggingface/transformers/issues/27895">#27895</a></li> </ul> <p>The integration also includes <a href="https://github.com/SkunkworksAI/BakLLaVA"><code>BakLlava</code></a>, which is a Llava model trained with a Mistral backbone.</p> <p>The model is compatible with the <code>"image-to-text"</code> pipeline:</p> <pre lang="py"><code>from transformers import pipeline from PIL import Image import requests <p>model_id = "llava-hf/llava-1.5-7b-hf" </tr></table> </code></pre></p> </blockquote> <p>... 
(truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="`14666775a2`"><code>1466677</code></a> Release: v4.36.0</li> <li><a href="`accccdd008`"><code>accccdd</code></a> [<code>Add Mixtral</code>] Adds support for the Mixtral MoE (<a href="https://redirect.github.com/huggingface/transformers/issues/27942">#27942</a>)</li> <li><a href="`0676d992a5`"><code>0676d99</code></a> [<code>from_pretrained</code>] Make from_pretrained fast again (<a href="https://redirect.github.com/huggingface/transformers/issues/27709">#27709</a>)</li> <li><a href="`9f18cc6df0`"><code>9f18cc6</code></a> Fix SDPA dispatch & make SDPA CI compatible with torch<2.1.1 (<a href="https://redirect.github.com/huggingface/transformers/issues/27940">#27940</a>)</li> <li><a href="`7ea21f1f03`"><code>7ea21f1</code></a> [LLaVa] Some improvements (<a href="https://redirect.github.com/huggingface/transformers/issues/27895">#27895</a>)</li> <li><a href="`5e620a92cf`"><code>5e620a9</code></a> Fix <code>SeamlessM4Tv2ModelIntegrationTest</code> (<a href="https://redirect.github.com/huggingface/transformers/issues/27911">#27911</a>)</li> <li><a href="`e96c1de191`"><code>e96c1de</code></a> Skip <code>UnivNetModelTest::test_multi_gpu_data_parallel_forward</code> (<a href="https://redirect.github.com/huggingface/transformers/issues/27912">#27912</a>)</li> <li><a href="`8d8970efdd`"><code>8d8970e</code></a> [BEiT] Fix test (<a href="https://redirect.github.com/huggingface/transformers/issues/27934">#27934</a>)</li> <li><a href="`235be08569`"><code>235be08</code></a> [DETA] fix backbone freeze/unfreeze function (<a href="https://redirect.github.com/huggingface/transformers/issues/27843">#27843</a>)</li> <li><a href="`df5c5c62ae`"><code>df5c5c6</code></a> Fix typo (<a href="https://redirect.github.com/huggingface/transformers/issues/27918">#27918</a>)</li> <li>Additional commits viewable in <a href="https://github.com/huggingface/transformers/compare/v4.35.2...v4.36.0">compare view</a></li> 
</ul> </details> <br /> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-12-20 22:09:02 -08:00
dependabot[bot]	379c7c43eb	Bump actions/setup-java from 3 to 4 (#18686 ) Bumps [actions/setup-java](https://github.com/actions/setup-java) from 3 to 4. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/actions/setup-java/releases">actions/setup-java's releases</a>.</em></p> <blockquote> <h2>v4.0.0</h2> <h2>What's Changed</h2> <p>In the scope of this release, the version of the Node.js runtime was updated to 20. The majority of dependencies were updated to the latest versions. From now on, the code for the setup-java will run on Node.js 20 instead of Node.js 16.</p> <h2>Breaking changes</h2> <ul> <li>Update Node.js runtime to version 20 by <a href="https://github.com/aparnajyothi-y"><code>@aparnajyothi-y</code></a> in <a href="https://redirect.github.com/actions/setup-java/pull/558">actions/setup-java#558</a></li> </ul> <h2>Non-breaking changes</h2> <ul> <li>Adding support for microsoft openjdk 21.0.0 by <a href="https://github.com/ralfstuckert"><code>@ralfstuckert</code></a> in <a href="https://redirect.github.com/actions/setup-java/pull/546">actions/setup-java#546</a></li> <li>Update <code>@actions/cache</code> dependency and documentation by <a href="https://github.com/IvanZosimov"><code>@IvanZosimov</code></a> in <a href="https://redirect.github.com/actions/setup-java/pull/549">actions/setup-java#549</a></li> <li>Implementation of the cache-dependency-path option to control caching dependency by <a href="https://github.com/itchyny"><code>@itchyny</code></a> in <a href="https://redirect.github.com/actions/setup-java/pull/499">actions/setup-java#499</a></li> </ul> <h2>New Contributors</h2> <ul> <li><a href="https://github.com/ralfstuckert"><code>@ralfstuckert</code></a> made their first contribution in <a href="https://redirect.github.com/actions/setup-java/pull/546">actions/setup-java#546</a></li> <li><a href="https://github.com/itchyny"><code>@itchyny</code></a> made their first contribution in <a 
href="https://redirect.github.com/actions/setup-java/pull/499">actions/setup-java#499</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/actions/setup-java/compare/v3...v4.0.0">https://github.com/actions/setup-java/compare/v3...v4.0.0</a></p> <h2>v3.13.0</h2> <h2>What's changed</h2> <p>In the scope of this release, support for Dragonwell JDK was added by <a href="https://github.com/Accelerator1996"><code>@Accelerator1996</code></a> in <a href="https://redirect.github.com/actions/setup-java/pull/532">actions/setup-java#532</a></p> <pre lang="yaml"><code>steps: - name: Checkout uses: actions/checkout@v3 - name: Setup-java uses: actions/setup-java@v3 with: distribution: 'dragonwell' java-version: '17' </code></pre> <p>Several inaccuracies were also fixed:</p> <ul> <li>Fix XML namespaces wrongly using https by <a href="https://github.com/gnodet"><code>@gnodet</code></a> in <a href="https://redirect.github.com/actions/setup-java/pull/503">actions/setup-java#503</a></li> <li>Fix typo and remove unintentional(?) 
word by <a href="https://github.com/CyberFlameGO"><code>@CyberFlameGO</code></a> in <a href="https://redirect.github.com/actions/setup-java/pull/518">actions/setup-java#518</a></li> <li>Fix usage link within the README.md file by <a href="https://github.com/dassiorleando"><code>@dassiorleando</code></a> in <a href="https://redirect.github.com/actions/setup-java/pull/525">actions/setup-java#525</a></li> </ul> <h2>New Contributors</h2> <ul> <li><a href="https://github.com/CyberFlameGO"><code>@CyberFlameGO</code></a> made their first contribution in <a href="https://redirect.github.com/actions/setup-java/pull/518">actions/setup-java#518</a></li> <li><a href="https://github.com/dassiorleando"><code>@dassiorleando</code></a> made their first contribution in <a href="https://redirect.github.com/actions/setup-java/pull/525">actions/setup-java#525</a></li> <li><a href="https://github.com/gnodet"><code>@gnodet</code></a> made their first contribution in <a href="https://redirect.github.com/actions/setup-java/pull/503">actions/setup-java#503</a></li> <li><a href="https://github.com/Accelerator1996"><code>@Accelerator1996</code></a> made their first contribution in <a href="https://redirect.github.com/actions/setup-java/pull/532">actions/setup-java#532</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/actions/setup-java/compare/v3...v3.13.0">https://github.com/actions/setup-java/compare/v3...v3.13.0</a></p> <h2>v3.12.0</h2> <!-- raw HTML omitted --> </blockquote> <p>... 
(truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="`387ac29b30`"><code>387ac29</code></a> Upgrade Node to v20 (<a href="https://redirect.github.com/actions/setup-java/issues/558">#558</a>)</li> <li><a href="`9eda6b51cc`"><code>9eda6b5</code></a> feat: implement cache-dependency-path option to control caching dependency (#...</li> <li><a href="`78078da0cd`"><code>78078da</code></a> Update <code>@actions/cache</code> dependency and documentation (<a href="https://redirect.github.com/actions/setup-java/issues/549">#549</a>)</li> <li><a href="`5caaba646e`"><code>5caaba6</code></a> add support for microsoft openjdk 21.0.0 (<a href="https://redirect.github.com/actions/setup-java/issues/546">#546</a>)</li> <li>See full diff in <a href="https://github.com/actions/setup-java/compare/v3...v4">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/setup-java&package-manager=github_actions&previous-version=3&new-version=4)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-12-20 22:08:33 -08:00
Kevin Chen	1c6cb5dfeb	Remove usage of TRT deprecated APIs (#18879 ) ### Description - Wrap usage of kENABLE_TACTIC_HEURISTIC in version-checking macros - Use delete instead of the deprecated destroy() functions on TRT objects. ### Motivation and Context - Removes usages of deprecated TRT APIs. Signed-off-by: Kevin Chen <kevinch@nvidia.com>	2023-12-20 15:08:13 -08:00
Tianlei Wu	2d6e2e243d	update sdxl demo (#18889 ) ### Description (1) Support importing models from Olive. (2) Add a Torch backend engine (Eager and Compile modes) to the demo. (3) Use fp16 in most places. (4) Remove some old pipeline scripts that are no longer useful. They are replaced by the demo. (5) Remove old benchmark results that are out of date. (6) Include PIL image conversion in end-to-end latency (for a fair comparison with diffusers, whose default output type is PIL). (7) Remove seldom-used options such as force-rebuild-engine, hf-token, and refit.	2023-12-20 14:46:22 -08:00
Yulong Wang	9a61388f0a	[js/web] revise backend registration (#18715 ) ### Description This PR revises the backend registration. The expected behavior after this change is described below (changed behavior is in bold): - (ort.min.js - built without webgpu support) - loading: do not register the 'webgpu' backend - creating a session without an EP list: use the default EP list ['webnn', 'cpu', 'wasm'] - creating a session with ['webgpu'] as the EP list: should fail with backend not available - (ort.webgpu.min.js - built with webgpu support) - loading: always register the 'webgpu' backend (previous behavior: only register the 'webgpu' backend when `navigator.gpu` is available) - creating a session without an EP list: use the default EP list ['webgpu', 'webnn', 'cpu', 'wasm'] - when WebGPU is available (win): use the WebGPU backend - when WebGPU is unavailable (android): should fail backend init and try the next backend in the list, 'webnn' (previous behavior: backend init did not fail, but JSEP init failed, which was too late to switch to the next backend) - creating a session with ['webgpu'] as the EP list - when WebGPU is available (win): use the WebGPU backend - when WebGPU is unavailable (android): **should fail backend init and, because no more EPs are listed, fail.** Related PRs: #18190 #18144	2023-12-20 14:45:55 -08:00
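The fallback order described in the backend registration change above can be illustrated with a small model. This is a minimal sketch in Python, not the onnxruntime-web implementation: `resolve_backend`, the registry dictionary, and the init callables are hypothetical names introduced only for illustration. It assumes a backend signals an unavailable device by raising during init, so the next entry in the EP list can be tried.

```python
from typing import Callable, Dict, List


def resolve_backend(ep_list: List[str],
                    registry: Dict[str, Callable[[], None]]) -> str:
    """Return the first EP whose (hypothetical) init succeeds.

    An EP missing from the registry models a build without that backend.
    An init that raises RuntimeError models a registered backend whose
    device is unavailable at runtime.
    """
    errors = []
    for name in ep_list:
        init = registry.get(name)
        if init is None:
            errors.append(f"{name}: backend not available")
            continue
        try:
            init()  # fail early, so the next EP can still be tried
        except RuntimeError as exc:
            errors.append(f"{name}: {exc}")
            continue
        return name
    raise RuntimeError("no available backend found: " + "; ".join(errors))


def init_ok() -> None:
    """Hypothetical init for a backend that works."""


def init_webgpu_unavailable() -> None:
    """Hypothetical init for a device without WebGPU."""
    raise RuntimeError("WebGPU is not available")


if __name__ == "__main__":
    # Models a build with webgpu support on a device without WebGPU.
    registry = {
        "webgpu": init_webgpu_unavailable,
        "webnn": init_ok,
        "cpu": init_ok,
        "wasm": init_ok,
    }
    print(resolve_backend(["webgpu", "webnn", "cpu", "wasm"], registry))
```

With only `["webgpu"]` in the list, the same registry makes `resolve_backend` raise, which corresponds to the "no more EPs are listed, fail" case.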
Yifan Li	c0142c9108	[EP Perf] Fix model zoo url (#18808 ) ### Description The ONNX model zoo had a major update recently, and legacy models were relocated under /archive/	2023-12-20 10:54:45 -08:00
Hector Li	8931854528	Move some QNN EP provider options to session options (#18877 ) Move QNN EP provider options to session options ### Description Session options are needed to support multi-partition for the context cache feature. To smooth the transition, move the provider options to session options first. This is the first step toward PR https://github.com/microsoft/onnxruntime/pull/18865	2023-12-20 00:13:38 -08:00
Ye Wang	02eb17655d	Fix a bug in 4bits quantizer script (#18878 )	2023-12-19 22:53:33 -08:00
Scott McKay	666fcbde4d	Add LeakyRelu to list of NNAPI operators (#18880 ) ### Description Add LeakyRelu to the list as support was added a while ago.	2023-12-20 14:44:31 +10:00
Changming Sun	535a2403dd	Update Nuget publishing jobs (#18851 ) ### Description 1. Add a CodeSign validation task before the binaries are published, to make sure all DLL files are signed. 2. Auto-trigger the CUDA 12 pipeline's publishing job.	2023-12-19 16:54:46 -08:00
Yulong Wang	ffa6602686	[js/node] support manually dispose session (#18655 ) ### Description Support manually disposing a session in onnxruntime-node. Feature request: #16796	2023-12-19 16:20:00 -08:00
satyajandhyala	98510fb8fb	[JS/WebGPU] fix an error in Clip (#18799 ) ### Description Check whether the min/max inputs are provided and use default values if not provided.	2023-12-19 13:51:01 -08:00
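The Clip fix above comes down to treating `min` and `max` as optional inputs. The following is a minimal Python sketch of that idea, not the WebGPU kernel: the function name `clip` and its keyword arguments are hypothetical, and using negative and positive infinity as the defaults is an assumption made for float inputs (the ONNX operator specifies the lowest and highest representable values of the element type).

```python
import math
from typing import List, Optional


def clip(values: List[float],
         min_value: Optional[float] = None,
         max_value: Optional[float] = None) -> List[float]:
    """Clamp each value to [min_value, max_value].

    A bound that is not provided falls back to a default that leaves
    the values unchanged on that side (assumed here to be +/- infinity).
    """
    lo = -math.inf if min_value is None else min_value
    hi = math.inf if max_value is None else max_value
    return [min(max(v, lo), hi) for v in values]


if __name__ == "__main__":
    data = [-2.0, 0.5, 3.0]
    print(clip(data))                                # no bounds given
    print(clip(data, min_value=0.0))                 # only min given
    print(clip(data, min_value=0.0, max_value=1.0))  # both given
```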
liqun Fu	32fcf73740	Implement dft(20) (#17821 ) ### Description DFT is updated in opset 20; implement it in ORT. ### Motivation and Context This is for the ORT 1.17.0 release. Fixes #17723 --------- Signed-off-by: Liqun Fu <liqfu@microsoft.com>	2023-12-19 10:42:54 -08:00
luoyu-intel	5f00bc9931	Integrate high-performance x64 gemm library to MLAS (#17669 ) ### Description Improve MLAS to support high-performance x64 INT4 kernels ### Motivation and Context 1. improve LLM inference performance on Intel CPUs. 2. support more 4bit quantization types: nf4, fp4 3. support dynamic block size: block size aligned with kernel's tiling size(e.g. 4 for VNNI kernel), per channel on N dimension 4. support most Intel ISAs: avx2, avx_vnni, avx512f, avx512_vnni, amx_bf16, amx_int8, avx512_fp16 5. support MatMulNBits' data format ### Tasks - [x] support block_size: 32, 128, -1(per channel) - [x] get weight pack size without memory allocation - [x] use ort's thread pool for parallelism - [x] support ISAs: avx2, avx512f, avx_vnni, avx512_vnni, amx_int8 ### Benchmark Ubuntu 20.22 + Intel(R) Xeon(R) Platinum 8480+ 56 cores Benchmark \| Time \| CPU \| Iterations -- \| -- \| -- \| -- Q4GEMM_Jblas/Q4G32SymInt8/M:1/N:4096/K:4096/Threads:56/real_time \| 47613 \| 47401 \| 12970 Q4GEMM_Jblas/Q4G32SymInt8/M:1024/N:4096/K:4096/Threads:56/real_time \| 6347792 \| 6317562 \| 109 Q4GEMM_Jblas/Q4G32SymInt8/M:2048/N:4096/K:4096/Threads:56/real_time \| 11814014 \| 11757847 \| 59 Q4GEMM_Jblas/Q4G128SymInt8/M:1/N:4096/K:4096/Threads:56/real_time \| 50222 \| 50031 \| 13759 Q4GEMM_Jblas/Q4G128SymInt8/M:1024/N:4096/K:4096/Threads:56/real_time \| 2038222 \| 2028743 \| 341 Q4GEMM_Jblas/Q4G128SymInt8/M:2048/N:4096/K:4096/Threads:56/real_time \| 3792832 \| 3774485 \| 191 Q4GEMM_Jblas/Q4GPerNSymInt8/M:1/N:4096/K:4096/Threads:56/real_time \| 58717 \| 58501 \| 11467 Q4GEMM_Jblas/Q4GPerNSymInt8/M:1024/N:4096/K:4096/Threads:56/real_time \| 1360846 \| 1354598 \| 543 Q4GEMM_Jblas/Q4GPerNSymInt8/M:2048/N:4096/K:4096/Threads:56/real_time \| 2564232 \| 2551365 \| 266 Q4GEMM_Jblas/Q4G32SymFp32/M:1/N:4096/K:4096/Threads:56/real_time \| 57929 \| 57694 \| 12047 Q4GEMM_Jblas/Q4G32SymFp32/M:1024/N:4096/K:4096/Threads:56/real_time \| 5495330 \| 5465810 \| 126 
Q4GEMM_Jblas/Q4G32SymFp32/M:2048/N:4096/K:4096/Threads:56/real_time \| 10676240 \| 10617817 \| 66 Q4GEMM_Jblas/Q4G128SymFp32/M:1/N:4096/K:4096/Threads:56/real_time \| 68305 \| 68047 \| 10026 Q4GEMM_Jblas/Q4G128SymFp32/M:1024/N:4096/K:4096/Threads:56/real_time \| 5504862 \| 5476215 \| 126 Q4GEMM_Jblas/Q4G128SymFp32/M:2048/N:4096/K:4096/Threads:56/real_time \| 11758623 \| 11697337 \| 66 Q4GEMM_Jblas/Q4GPerNSymFp32/M:1/N:4096/K:4096/Threads:56/real_time \| 67713 \| 67451 \| 10298 Q4GEMM_Jblas/Q4GPerNSymFp32/M:1024/N:4096/K:4096/Threads:56/real_time \| 5508325 \| 5480237 \| 126 Q4GEMM_Jblas/Q4GPerNSymFp32/M:2048/N:4096/K:4096/Threads:56/real_time \| 10738528 \| 10681656 \| 64 Q4GEMM_Jblas/Q4G32AsymFp32/M:1/N:4096/K:4096/Threads:56/real_time \| 60708 \| 60486 \| 11321 Q4GEMM_Jblas/Q4G32AsymFp32/M:1024/N:4096/K:4096/Threads:56/real_time \| 5523784 \| 5495736 \| 126 Q4GEMM_Jblas/Q4G32AsymFp32/M:2048/N:4096/K:4096/Threads:56/real_time \| 10829633 \| 10772161 \| 67 Reference: Benchmark \| Time \| CPU \| Iterations -- \| -- \| -- \| -- Q4GEMM/Q4Sym/M:1/N:4096/K:4096/Threads:56/real_time \| 53088 \| 52911 \| 13364 Q4GEMM/Q4Sym/M:1024/N:4096/K:4096/Threads:56/real_time \| 6268981 \| 6230335 \| 110 Q4GEMM/Q4Sym/M:2048/N:4096/K:4096/Threads:56/real_time \| 11701237 \| 11632339 \| 59 Win11+12900K 8 cores: Benchmark \| Time \| CPU \| Iterations -- \| -- \| -- \| -- Q4GEMM_Jblas/Q4G32SymInt8/M:1/N:4096/K:4096/Threads:8/real_time \| 215976 \| 211295 \| 2884 Q4GEMM_Jblas/Q4G32SymInt8/M:1024/N:4096/K:4096/Threads:8/real_time \| 60960590 \| 60937500 \| 10 Q4GEMM_Jblas/Q4G32SymInt8/M:2048/N:4096/K:4096/Threads:8/real_time \| 1.18E+08 \| 1.19E+08 \| 5 Q4GEMM_Jblas/Q4G32SymInt8/M:1/N:11008/K:4096/Threads:8/real_time \| 470377 \| 453059 \| 1414 Q4GEMM_Jblas/Q4G32SymInt8/M:1024/N:11008/K:4096/Threads:8/real_time \| 1.54E+08 \| 1.53E+08 \| 5 Q4GEMM_Jblas/Q4G32SymInt8/M:2048/N:11008/K:4096/Threads:8/real_time \| 3.18E+08 \| 3.13E+08 \| 2 
Q4GEMM_Jblas/Q4G32SymInt8/M:1/N:4096/K:11008/Threads:8/real_time \| 569072 \| 559398 \| 1229 Q4GEMM_Jblas/Q4G32SymInt8/M:1024/N:4096/K:11008/Threads:8/real_time \| 1.54E+08 \| 1.52E+08 \| 4 Q4GEMM_Jblas/Q4G32SymInt8/M:2048/N:4096/K:11008/Threads:8/real_time \| 3.22E+08 \| 3.28E+08 \| 2 Q4GEMM_Jblas/Q4G32SymInt8/M:1/N:11008/K:11008/Threads:8/real_time \| 1486055 \| 1473325 \| 403 Q4GEMM_Jblas/Q4G32SymInt8/M:1024/N:11008/K:11008/Threads:8/real_time \| 4.14E+08 \| 4.14E+08 \| 2 Q4GEMM_Jblas/Q4G32SymInt8/M:2048/N:11008/K:11008/Threads:8/real_time \| 8.88E+08 \| 8.59E+08 \| 1 --------- Signed-off-by: Mengni Wang <mengni.wang@intel.com> Co-authored-by: Mengni Wang <mengni.wang@intel.com>	2023-12-19 09:36:31 -08:00
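The block sizes listed in the MLAS change above (32, 128, and -1 for per channel) refer to how many weights share one scale. A minimal Python sketch of symmetric 4-bit blockwise quantization follows. It is an illustration only and not the MLAS or jblas kernel: the function names are hypothetical, and mapping the largest magnitude in a block to 7 is an assumed scaling rule chosen for simplicity.

```python
from typing import List, Tuple


def quantize_blockwise_sym4(weights: List[float],
                            block_size: int) -> Tuple[List[int], List[float]]:
    """Quantize weights to signed 4-bit integers, one scale per block.

    block_size == -1 means a single block covering the whole channel.
    Assumption: scale = max(|w|) / 7 per block.
    """
    if not weights:
        return [], []
    if block_size == -1:
        block_size = len(weights)
    quantized: List[int] = []
    scales: List[float] = []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        max_abs = max(abs(w) for w in block)
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        scales.append(scale)
        # Clamp to the signed 4-bit range [-8, 7].
        quantized.extend(max(-8, min(7, round(w / scale))) for w in block)
    return quantized, scales


def dequantize_blockwise_sym4(quantized: List[int], scales: List[float],
                              block_size: int) -> List[float]:
    """Recover approximate weights from 4-bit values and block scales."""
    if block_size == -1:
        block_size = len(quantized)
    return [q * scales[i // block_size] for i, q in enumerate(quantized)]


if __name__ == "__main__":
    w = [0.7, -0.7, 0.1, 0.0]
    q, s = quantize_blockwise_sym4(w, block_size=2)
    print(q, s)
    print(dequantize_blockwise_sym4(q, s, block_size=2))
```

A smaller block size stores more scales but tracks local weight ranges more closely; per-channel quantization stores a single scale for the whole channel.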
Ashwini Khade	4dff154f51	Fix nightly pipeline failure (#18867 ) ### Description Fixes a failure in the ortmodule nightly pipeline.	2023-12-19 09:18:00 -08:00
Jian Chen	6d7519ede8	Adding new pipeline for python cuda testing (#18718 )	2023-12-18 18:13:03 -08:00
Frank	63b47ceaf8	[REACT NATIVE] Bugfix -> casing Podfile (#18861 ) ### Description The casing of Podfile is incorrect in the plugin. This causes issues when building iOS on case-sensitive systems such as Linux. ### Motivation and Context Without this fix, iOS cannot be built on case-sensitive systems.	2023-12-19 10:20:46 +10:00
dependabot[bot]	3ff4a4c393	Bump actions/stale from 8.0.0 to 9.0.0 (#18774 )	2023-12-18 14:59:03 -08:00
sophies927	ea6186efa8	Update stale.yml to correct close-issue-message (#18849 )	2023-12-18 09:57:33 -08:00
Yifan Li	9426bd50cb	[TensorRT EP] Update deprecated TRT api (#18834 ) ### Description Update deprecated TRT APIs: 1. [setMaxWorkspaceSize](https://docs.nvidia.com/deeplearning/tensorrt/api/c_api/classnvinfer1_1_1_i_builder_config.html#a8209999988ab480c60c8a905dfd2654d)(max_workspace_size_) --> setMemoryPoolLimit(nvinfer1::MemoryPoolType::kWORKSPACE, max_workspace_size_) 2. [kENABLE_TACTIC_HEURISTIC](https://docs.nvidia.com/deeplearning/tensorrt/api/c_api/namespacenvinfer1.html#abdc74c40fe7a0c3d05d2caeccfbc29c1a1215692ad24465e4d9e37a8a7fce1a38) --> superseded by TRT builder optimization level 2 Perf & warning log comparison: TRT EP options \| Warning log shown to the user \| Average inference time cost (FRCNN on A100) -- \| -- \| -- trt_build_heuristics_enable\\|true \| [TensorRT EP] trt_build_heuristics_enable is deprecated on TRT 8.6 onwards. Please set builder optimization level as 2 to enable builder heuristics. \| ~300ms trt_build_heuristics_enable\\|true trt_builder_optimization_level\\|2 \| [TensorRT EP] Builder heuristics are enabled automatically by builder optimization level 2. trt_build_heuristics_enable is deprecated on TRT 8.6 onwards. \| ~275ms trt_builder_optimization_level\\|2 \| \| ~275ms ### Motivation and Context Prepare for upcoming TRT 10	2023-12-18 09:16:09 -08:00
Changming Sun	ad476d5a1f	Change Nuget packaging pipeline's build TRT job to download CUDA SDK on-the-fly (#18847 ) ### Description Change Nuget packaging pipeline's build TRT job to download CUDA SDK on-the-fly, so that we do not need to put a CUDA SDK in the build machine's image.	2023-12-15 17:44:02 -08:00
Dmitri Smirnov	50cbcf9587	Build function bodies according to the imported global opset. (#18833 ) ### Description Build function bodies according to the imported global opset. The same applies to querying ONNX functions. ### Motivation and Context This addresses issues: https://github.com/microsoft/onnxruntime/issues/18781 https://github.com/microsoft/onnxruntime/issues/16438	2023-12-15 15:56:20 -08:00
RandySheriffH	2952cf82a5	Access map by iterator to silence sanity check. (#18835 ) Use iterator to refer to the set. Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-12-15 14:57:55 -08:00
Jiajia Qin	8f7b89bd5b	[js/webgpu] Optimize NCHW layout for InstanceNormalization (#18123 ) ### Description The changes in this PR include: 1) Fix f16 errors in InstanceNormalization with NCHW format. 2) Use vec to further optimize the original algorithm. 3) (Removed) Don't do layout conversion for InstanceNormalization for JSEP since InstanceNormalization itself is suitable for NCHW layout and has better performance in our current implementation. Tested on sd-vae-decoder-f16.onnx, the time drops from 314 ms to 285 ms. The aggregate gpu profiling data is shown below (note that the data is based on change 3). Before: \| Kernel \| Time (ms) \| Percentage (%) \| \| -- \| -- \| -- \| \| Conv \| 201.55 \| 69.56 \| \| InstanceNormalization \| 42.49 \| 14.67 \| \| Transpose \| 28.95 \| 9.99 \| \| Mul \| 5.69 \| 1.96 \| \| Add \| 3.82 \| 1.32 \| \| MatMul \| 3.27 \| 1.13 \| \| Sigmoid \| 2.24 \| 0.77 \| \| Resize \| 1.16 \| 0.40 \| \| Softmax \| 0.34 \| 0.12 \| \| Cast \| 0.24 \| 0.08 \| \| Sum \| 289.75 \| \| After: \| Kernel \| Time (ms) \| Percentage (%) \| \| -- \| -- \| -- \| \| Conv \| 205.44 \| 79.43 \| \| InstanceNormalization \| 18.24 \| 7.05 \| \| Transpose \| 17.64 \| 6.82 \| \| Mul \| 5.69 \| 2.20 \| \| Add \| 3.81 \| 1.47 \| \| MatMul \| 3.56 \| 1.38 \| \| Sigmoid \| 2.24 \| 0.86 \| \| Resize \| 1.19 \| 0.46 \| \| Softmax \| 0.59 \| 0.23 \| \| Cast \| 0.24 \| 0.09 \| \| Sum \| 258.65 \| \| From the tables above, we can see that the time of two ops is greatly reduced: InstanceNormalization and Transpose. 
The transpose time is reduced because each InstanceNormalization is surrounded by two reshape ops in sd-vae-decoder-f16.onnx. Because JSEP prefers NHWC and InstanceNormalization is a layout-sensitive op, two extra transpose ops are inserted dynamically when executing this model. After this change, those inserted transpose ops are no longer needed, so the overall transpose time is reduced.	2023-12-15 11:26:15 -08:00
Jiajia Qin	4bbed4c71a	[js/webgpu] Fix f16 errors in unary (#18839 ) ### Description This PR fixes below errors: ``` no matching overload for operator > (vec4<f16>, vec4<f32>)	2023-12-15 11:25:12 -08:00
Changming Sun	f52668cc68	Disable mlas unit test in ARM64EC build (#18747 ) ### Description Disable mlas unit test in ARM64EC build because the program has some link errors. We will fix the errors later. This PR only impacts Windows ARM64EC build. It has no impact on the existing build pipelines.	2023-12-15 09:17:47 -08:00
wirthual	89168b830d	Fix CI error: The workflow is not valid. .github/workflows/rust-ci.yml (Line: 27, Col: 7): Unexpected value 'ORT_RUST_STRATEGY=download' (#18836 ) Use colon for Env variable instead of =	2023-12-15 09:14:02 -08:00
Yang Gu	81ad1e6ac3	[js/webgpu] Fix typo of outputShapes in profiling message (#18837 )	2023-12-15 08:57:48 -08:00
Peishen Yan	d111eed726	[WebNN EP] Change axis to axes for argMax/argMin (#18838 ) In the latest spec, the axes option of WebNN's argMax and argMin requires the use of a sequence long type. Replace axis option (long type) with axes (sequence long type) for argMax and argMin.	2023-12-15 08:57:07 -08:00
Changming Sun	d795fc636c	FIX: Our cmake script didn't check googletest's hash (#18826 )	2023-12-15 08:48:15 -08:00
Changming Sun	fc9ecb59db	Add Windows ARM build jobs to post merge pipeline (#18832 ) ### Description Add Windows ARM build jobs to post merge pipeline to validate that our code is still compatible with these build settings.	2023-12-15 08:47:52 -08:00
pengwa	5eda79bdd3	Improve perf for stage3 training (#18099 ) ### Improve perf for stage3 training - first wave Port the existing PythonOp/PythonOpGrad Python runner to C++, and introduce an unsafe run mode (to skip inplace, save-for-backward, and materialized-grad detection on the fly). This reduces the overhead from XX~XXX us to X ~ lower end of XX us. In LLAMA2 7B training with 8x32GV100, we have observed 6.7% gains over PyTorch (1.59 vs. 1.49 it/s). Peak memory also dropped from 31GB to 28GB.	2023-12-15 13:32:19 +08:00
Changming Sun	cbad4fe49b	Update absl and googletest (#18827 ) ### Description Update absl and googletest to their latest version to include some cmake changes: 1. A googletest's cmake change that will allow using external absl and re2. 2. Nullability enhancements that will allow our clang-based static analysis detecting many kinds of null pointer errors. ### Motivation and Context To fix a C4744 link warning in our Windows pipelines. ``` LINK : warning C4744: 'static char const absl::lts_20230802::base_internal::FastTypeTag<bool>::dummy_var' has different type in 'd:\a\_work\_temp\abseil_cpp\abseil-cpp-20230802.0\absl\flags\parse.cc' and 'd:\a\_work\1\b\relwithdebinfo\_deps\googletest-src\googletest\src\gtest-all.cc': 'signed char' and 'unsigned char' [D:\a\_work\1\b\RelWithDebInfo\onnxruntime_mlas_test.vcxproj] LINK : warning C4744: 'static char const absl::lts_20230802::base_internal::FastTypeTag<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > >::dummy_var' has different type in 'd:\a\_work\_temp\abseil_cpp\abseil-cpp-20230802.0\absl\flags\parse.cc' and 'd:\a\_work\1\b\relwithdebinfo\_deps\googletest-src\googletest\src\gtest-all.cc': 'signed char' and 'unsigned char' [D:\a\_work\1\b\RelWithDebInfo\onnxruntime_mlas_test.vcxproj] LINK : warning C4744: 'static char const absl::lts_20230802::base_internal::FastTypeTag<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > >::dummy_var' has different type in 'd:\a\_work\_temp\abseil_cpp\abseil-cpp-20230802.0\absl\flags\internal\usage.cc' and 'd:\a\_work\1\b\relwithdebinfo\_deps\googletest-src\googletest\src\gtest-all.cc': 'signed char' and 'unsigned char' [D:\a\_work\1\b\RelWithDebInfo\onnxruntime_mlas_test.vcxproj] LINK : warning C4744: 'static char const absl::lts_20230802::base_internal::FastTypeTag<bool>::dummy_var' has different type in 'd:\a\_work\_temp\abseil_cpp\abseil-cpp-20230802.0\absl\flags\internal\flag.cc' and 
'd:\a\_work\1\b\relwithdebinfo\_deps\googletest-src\googletest\src\gtest-all.cc': 'signed char' and 'unsigned char' [D:\a\_work\1\b\RelWithDebInfo\onnxruntime_mlas_test.vcxproj] LINK : warning C4744: 'static char const absl::lts_20230802::base_internal::FastTypeTag<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > >::dummy_var' has different type in 'd:\a\_work\_temp\abseil_cpp\abseil-cpp-20230802.0\absl\flags\internal\flag.cc' and 'd:\a\_work\1\b\relwithdebinfo\_deps\googletest-src\googletest\src\gtest-all.cc': 'signed char' and 'unsigned char' [D:\a\_work\1\b\RelWithDebInfo\onnxruntime_mlas_test.vcxproj] LINK : warning C4744: 'static char const absl::lts_20230802::base_internal::FastTypeTag<int>::dummy_var' has different type in 'd:\a\_work\_temp\abseil_cpp\abseil-cpp-20230802.0\absl\flags\internal\flag.cc' and 'd:\a\_work\1\b\relwithdebinfo\_deps\googletest-src\googletest\src\gtest-all.cc': 'signed char' and 'unsigned char' [D:\a\_work\1\b\RelWithDebInfo\onnxruntime_mlas_test.vcxproj] ```	2023-12-14 16:15:07 -08:00
Yueqing Zhang	b42d4b8ea6	[VitisAI] 1. api compatible 2. dynamic load onnx (#18470 ) ### Description 1. Add a backward-compatible API for compiling a model. 2. Load vitisai-ep.dll at run time. --------- Co-authored-by: Yueqing Zhang <yueqingz@amd.com> Co-authored-by: Zhenze Wang <zhenzew@xilinx.com>	2023-12-14 14:43:41 -08:00
zesongw	6d5ee4d69b	[WebNN EP] Use explicit padding (#18688 ) WebNN will remove the autoPad option, so we need to use explicit padding values. Compute the padding values for autoPad (same-upper, same-lower) for the Pool, Conv and ConvTranspose ops.	2023-12-14 14:33:44 -08:00
Wanming Lin	1db1c75048	[WebNN EP] WebNN only supports 4-D input and weight for Conv/ConvTranspose (#18703 )	2023-12-14 14:33:19 -08:00
Changming Sun	b129f425fc	Fix test model URL issue (#18823 ) ### Description ONNX model zoo changed its directory structure, so some of our pipelines are failing. To prevent this from happening again, we had better read the test data from a cache on local disk instead of downloading it remotely every time.	2023-12-14 13:06:08 -08:00
Chi Lo	afe5cdc938	[TensorRT EP] Switch to enqueueV3 with support DDS output (copy version) (#18714 ) It's branched off from https://github.com/microsoft/onnxruntime/pull/17751 but removes KernelContext_SetOutput() API. It copies output allocation buffer to kernel context. --------- Co-authored-by: George Wu <jywu@microsoft.com>	2023-12-14 11:10:58 -08:00
Changming Sun	7386e21121	Replace some ORT_ENFORCE with ORT_THROW_IF_ERROR (#18812 ) ### Description Replace some ORT_ENFORCE with ORT_THROW_IF_ERROR to get better error messages.	2023-12-14 10:14:22 -08:00
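
The explicit-padding commit above (6d5ee4d69b, #18688) replaces the autoPad option with computed padding values. As a rough illustration only, the sketch below derives the begin/end padding for one spatial axis using the ONNX `auto_pad` semantics for Conv and pooling ops. The function name `explicit_padding` and its parameters are hypothetical and are not taken from the WebNN EP source; ConvTranspose uses a different formula that is not shown here.

```python
def explicit_padding(in_size, kernel, stride=1, dilation=1, auto_pad="SAME_UPPER"):
    """Return (pad_begin, pad_end) for one spatial axis.

    Hypothetical helper: follows the ONNX auto_pad rules for Conv/Pool,
    not the actual WebNN EP implementation.
    """
    if auto_pad not in ("SAME_UPPER", "SAME_LOWER"):
        raise ValueError("auto_pad must be SAME_UPPER or SAME_LOWER")

    # With SAME padding the output size is ceil(in_size / stride).
    out_size = -(-in_size // stride)

    # Extent covered by a dilated kernel.
    effective_kernel = (kernel - 1) * dilation + 1

    # Total padding needed so that out_size windows fit the input.
    total = max((out_size - 1) * stride + effective_kernel - in_size, 0)

    small = total // 2
    large = total - small

    # SAME_UPPER puts the odd extra element at the end, SAME_LOWER at the start.
    if auto_pad == "SAME_UPPER":
        return (small, large)
    return (large, small)


if __name__ == "__main__":
    print(explicit_padding(4, 3, stride=2, auto_pad="SAME_UPPER"))
    print(explicit_padding(4, 3, stride=2, auto_pad="SAME_LOWER"))
```

The two modes differ only when the total padding is odd, which is why the helper computes the total once and then decides which side receives the extra element.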


10208 commits