Commit graph

10208 commits

Author SHA1 Message Date
Yufeng Li
985acda28c
optimize int4 gemv kernel with cuda (#18818)
### Description
Optimize the gemv kernel:

1. Unroll the reduction to improve memory bandwidth utilization
2. Leverage 4-bit to float16 conversion tricks to save instructions
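
The commit message does not show the kernel itself. As a rough illustration of the second point, here is one widely used 4-bit to float16 trick, written in NumPy rather than CUDA. It is a sketch under assumptions, not the code from this PR: the function name `int4_to_fp16`, the unsigned nibble encoding, and the low-nibble-first packing order are all hypothetical. The idea is that float16 `1024.0` has the bit pattern `0x6400` and a unit in the last place of exactly 1, so OR-ing a 4-bit value into its mantissa yields `1024 + value` without any integer-to-float conversion instruction.

```python
import numpy as np

# Bit pattern of float16(1024.0): exponent field 25, mantissa 0.
FP16_1024_BITS = np.uint16(0x6400)


def int4_to_fp16(packed: np.ndarray) -> np.ndarray:
    """Unpack bytes holding two unsigned 4-bit values each into float16.

    Assumption (not from the PR): the low nibble comes first.
    """
    packed = np.asarray(packed, dtype=np.uint8)
    low = packed & 0x0F
    high = packed >> 4
    # Interleave as low, high, low, high, ... and widen to 16 bits.
    nibbles = np.stack([low, high], axis=-1).reshape(-1).astype(np.uint16)
    # OR the nibble into the mantissa of 1024.0, then reinterpret the bits.
    biased = (nibbles | FP16_1024_BITS).view(np.float16)
    # Remove the 1024 bias; the result is exact for values 0..15.
    return biased - np.float16(1024.0)
```

In a real kernel the same bit manipulation would typically be applied to several packed values per instruction, which is where the instruction savings would come from.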

| m | n | k | symmetric | latency before (us) | latency after (us) |
| - | ----- | ----- | --------- | ------------------- | ------------------ |
| 1 | 4096 | 4096 | TRUE | 15.54 | 8.82 |
| 1 | 4096 | 4096 | FALSE | 15.84 | 9.89 |
| 1 | 4096 | 11008 | TRUE | 42.44 | 19.4 |
| 1 | 4096 | 11008 | FALSE | 44.42 | 21.48 |
| 1 | 11008 | 4096 | TRUE | 34.65 | 17.46 |
| 1 | 11008 | 4096 | FALSE | 35.76 | 20.87 |
| 1 | 12288 | 4096 | TRUE | 39.27 | 19.73 |
| 1 | 12288 | 4096 | FALSE | 40.91 | 25.2 |
| 1 | 22016 | 4096 | TRUE | 65.78 | 38.81 |
| 1 | 22016 | 4096 | FALSE | 67.98 | 48.36 |
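
To make the table easier to read as speedups, the following sketch divides the latency before by the latency after for each row. The numbers are copied from the table above; the names `ROWS` and `speedup` are only illustrative.

```python
# (n, k, symmetric, latency_before_us, latency_after_us), all rows with m = 1
ROWS = [
    (4096, 4096, True, 15.54, 8.82),
    (4096, 4096, False, 15.84, 9.89),
    (4096, 11008, True, 42.44, 19.4),
    (4096, 11008, False, 44.42, 21.48),
    (11008, 4096, True, 34.65, 17.46),
    (11008, 4096, False, 35.76, 20.87),
    (12288, 4096, True, 39.27, 19.73),
    (12288, 4096, False, 40.91, 25.2),
    (22016, 4096, True, 65.78, 38.81),
    (22016, 4096, False, 67.98, 48.36),
]


def speedup(before_us: float, after_us: float) -> float:
    """Return how many times faster the optimized kernel is."""
    return before_us / after_us


for n, k, symmetric, before_us, after_us in ROWS:
    ratio = speedup(before_us, after_us)
    print(f"n={n:<6} k={k:<6} symmetric={symmetric!s:<5} speedup={ratio:.2f}x")
```

Every row comes out faster than before, and within each shape the symmetric case gains more than the asymmetric one.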
2023-12-21 19:32:34 -08:00
Changming Sun
3d8f229d39
Add ARM64EC build jobs (#18870)
### Description
Add ARM64EC build jobs to the post-merge pipeline to validate that our code is
compatible with Windows ARM64EC.
2023-12-21 16:31:38 -08:00
Changming Sun
5b93c465a8
Delete .github/workflows/generated_fake_win_gpu_ci.yml (#18074)
### Description
No longer needed. Azure DevOps now has built-in support.
2023-12-21 16:31:11 -08:00
Yifan Li
0af946f35a
[EP Perf] Fix ORT-CUDAFp16 tests (#18908)
### Description
All ORT-CUDAFp16 model tests failed because `onnxmltools` 1.12.0 dropped
`onnxconverter-common` from its dependencies, and the EP perf environment
needs that package to test models with the CUDA EP under fp16.

Add `onnxconverter-common` as an explicit dependency of the environment to fix this.
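
A failure like this can be caught before any model test runs with a small pre-flight check. The sketch below is an illustration only and is not part of this PR: `missing_modules` and `REQUIRED` are hypothetical names, and the check relies solely on the standard library. `onnxconverter_common` is the import name of the `onnxconverter-common` package.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# Hypothetical list of modules the fp16 test environment depends on.
REQUIRED = ["onnxmltools", "onnxconverter_common"]

missing = missing_modules(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
```

Listing the package explicitly, as this commit does, avoids relying on another package to pull it in transitively.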


2023-12-21 16:20:41 -08:00
Wanming Lin
4c3705cbea
[WebNN EP] Change some support status for XNNPack backend (#18858)
WebNN XNNPack backend doesn't really support `pow` and `reduceSum`, and will support `sqrt` very soon.
2023-12-21 15:16:44 -08:00
Wanming Lin
1b64d30963
[WebNN EP] Infer the layout via ONNX domain for Resize (#18871)
Previously, we added EP-specific logic to the generic core code to restrict
Resize for the WebNN EP in
https://github.com/microsoft/onnxruntime/pull/18687, which neither scales
nor makes sense.

This PR reverts the change in
https://github.com/microsoft/onnxruntime/pull/18687 and uses ONNX domain
information to infer the layout during layout transformation.
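
The idea of reading the layout from a node's domain can be sketched as follows. This is a simplified Python illustration, not the C++ code from this PR. It assumes that layout transformation moves converted nodes into an internal NHWC domain named `com.ms.internal.nhwc`; the `Node` class and `infer_layout` function are hypothetical.

```python
from dataclasses import dataclass

# Assumption: the domain assigned to nodes converted to NHWC.
NHWC_DOMAIN = "com.ms.internal.nhwc"


@dataclass
class Node:
    """Minimal stand-in for a graph node: only the fields used here."""
    op_type: str
    domain: str = ""  # the empty string is the default ONNX domain


def infer_layout(node: Node) -> str:
    """Infer the data layout from the node's domain alone."""
    return "NHWC" if node.domain == NHWC_DOMAIN else "NCHW"
```

Because the decision depends only on the node itself, the generic core code needs no knowledge of any particular EP.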
2023-12-21 11:30:29 -08:00
dependabot[bot]
8507c06f8e
Bump conda-incubator/setup-miniconda from 2 to 3 (#18685)
Bumps
[conda-incubator/setup-miniconda](https://github.com/conda-incubator/setup-miniconda)
from 2 to 3.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/conda-incubator/setup-miniconda/releases">conda-incubator/setup-miniconda's
releases</a>.</em></p>
<blockquote>
<h2>Version 3.0.0</h2>
<h3>Features</h3>
<ul>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/308">#308</a>
Update to node20</li>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/291">#291</a>
Add conda-solver option (defaults to libmamba)</li>
</ul>
<h3>Fixes</h3>
<ul>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/299">#299</a>
Fix condaBasePath when useBundled is false, and there's no pre-existing
conda</li>
</ul>
<h3>Documentation</h3>
<ul>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/309">#309</a>
Switch to main branch based development</li>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/313">#313</a>
Specify team conda-incubator/setup-miniconda as codeowners</li>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/318">#318</a>
README: update actions in examples, add security section, similar
actions</li>
</ul>
<h3>Tasks and Maintenance</h3>
<ul>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/307">#307</a>
Run dependabot against main branch and also update node packages</li>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/311">#311</a>
Bump actions/checkout from 2 to 4</li>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/310">#310</a>
Bump actions/cache from 1 to 3</li>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/314">#314</a>
Strip/update dependencies</li>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/315">#315</a>
Split lint into check and build, switch from <code>npm install</code> to
<code>npm ci</code></li>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/317">#317</a>
Bump normalize-url from 4.5.1 to 8.0.0</li>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/316">#316</a>
Faster workflow response / saving resources via timeout/concurrency
policy</li>
</ul>
<p><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/308">#308</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/308">conda-incubator/setup-miniconda#308</a>
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/291">#291</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/291">conda-incubator/setup-miniconda#291</a>
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/299">#299</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/299">conda-incubator/setup-miniconda#299</a>
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/309">#309</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/309">conda-incubator/setup-miniconda#309</a>
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/313">#313</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/313">conda-incubator/setup-miniconda#313</a>
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/318">#318</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/318">conda-incubator/setup-miniconda#318</a>
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/307">#307</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/307">conda-incubator/setup-miniconda#307</a>
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/311">#311</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/311">conda-incubator/setup-miniconda#311</a>
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/310">#310</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/310">conda-incubator/setup-miniconda#310</a>
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/314">#314</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/314">conda-incubator/setup-miniconda#314</a>
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/315">#315</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/315">conda-incubator/setup-miniconda#315</a>
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/317">#317</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/317">conda-incubator/setup-miniconda#317</a>
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/316">#316</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/316">conda-incubator/setup-miniconda#316</a></p>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/isuruf"><code>@​isuruf</code></a> made
their first contribution in <a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/299">conda-incubator/setup-miniconda#299</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/conda-incubator/setup-miniconda/compare/v2...v3.0.0">https://github.com/conda-incubator/setup-miniconda/compare/v2...v3.0.0</a></p>
<h2>Version 2.3.0</h2>
<h3>Documentation</h3>
<ul>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/263">#263</a>
Update links to GitHub shell docs</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/conda-incubator/setup-miniconda/blob/main/CHANGELOG.md">conda-incubator/setup-miniconda's
changelog</a>.</em></p>
<blockquote>
<h2><a
href="https://github.com/conda-incubator/setup-miniconda/releases/tag/v3.0.1">v3.0.1</a>
(2023-11-29)</h2>
<h3>Fixes</h3>
<ul>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/325">#325</a>
Fix environment activation on windows (a v3 regression) due to
hard-coded install PATH</li>
</ul>
<p><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/325">#325</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/325">conda-incubator/setup-miniconda#325</a></p>
<h2><a
href="https://github.com/conda-incubator/setup-miniconda/releases/tag/v3.0.0">v3.0.0</a>
(2023-11-27)</h2>
<h3>Features</h3>
<ul>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/308">#308</a>
Update to node20</li>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/291">#291</a>
Add conda-solver option (defaults to libmamba)</li>
</ul>
<h3>Fixes</h3>
<ul>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/299">#299</a>
Fix condaBasePath when useBundled is false, and there's no pre-existing
conda</li>
</ul>
<h3>Documentation</h3>
<ul>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/309">#309</a>
Switch to main branch based development</li>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/313">#313</a>
Specify team conda-incubator/setup-miniconda as codeowners</li>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/318">#318</a>
README: update actions in examples, add security section, similar
actions</li>
</ul>
<h3>Tasks and Maintenance</h3>
<ul>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/307">#307</a>
Run dependabot against main branch and also update node packages</li>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/311">#311</a>
Bump actions/checkout from 2 to 4</li>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/310">#310</a>
Bump actions/cache from 1 to 3</li>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/314">#314</a>
Strip/update dependencies</li>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/315">#315</a>
Split lint into check and build, switch from <code>npm install</code> to
<code>npm ci</code></li>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/317">#317</a>
Bump normalize-url from 4.5.1 to 8.0.0</li>
<li><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/316">#316</a>
Faster workflow response / saving resources via timeout/concurrency
policy</li>
</ul>
<p><a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/308">#308</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/308">conda-incubator/setup-miniconda#308</a>
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/291">#291</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/291">conda-incubator/setup-miniconda#291</a>
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/299">#299</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/299">conda-incubator/setup-miniconda#299</a>
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/309">#309</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/309">conda-incubator/setup-miniconda#309</a>
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/313">#313</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/313">conda-incubator/setup-miniconda#313</a>
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/318">#318</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/318">conda-incubator/setup-miniconda#318</a>
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/307">#307</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/307">conda-incubator/setup-miniconda#307</a>
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/311">#311</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/311">conda-incubator/setup-miniconda#311</a>
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/310">#310</a>:
<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/pull/310">conda-incubator/setup-miniconda#310</a></p>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="11b5629583"><code>11b5629</code></a>
Prepare 3.0.1 (<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/326">#326</a>)</li>
<li><a
href="8706aa744e"><code>8706aa7</code></a>
Fix env activation on win (a v3 regression) due to hard-coded install
PATH (#...</li>
<li><a
href="c585a97097"><code>c585a97</code></a>
Bump conda-incubator/setup-miniconda from 2.3.0 to 3.0.0 (<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/321">#321</a>)</li>
<li><a
href="2defc80cc6"><code>2defc80</code></a>
Prepare release (<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/320">#320</a>)</li>
<li><a
href="0d5a56b9eb"><code>0d5a56b</code></a>
Bump actions/checkout from 2 to 4 (<a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/319">#319</a>)</li>
<li><a
href="45fd3f9089"><code>45fd3f9</code></a>
Merge pull request <a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/316">#316</a>
from dbast/timeout</li>
<li><a
href="d1e04fc267"><code>d1e04fc</code></a>
Merge pull request <a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/299">#299</a>
from isuruf/condaBasePath</li>
<li><a
href="fab0073840"><code>fab0073</code></a>
Merge pull request <a
href="https://redirect.github.com/conda-incubator/setup-miniconda/issues/318">#318</a>
from dbast/readme</li>
<li><a
href="fa6bdf9643"><code>fa6bdf9</code></a>
Update with npm run build</li>
<li><a
href="d42f8b884a"><code>d42f8b8</code></a>
Fix condaBasePath when useBundled is false, and there's no pre-existing
conda</li>
<li>Additional commits viewable in <a
href="https://github.com/conda-incubator/setup-miniconda/compare/v2...v3">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=conda-incubator/setup-miniconda&package-manager=github_actions&previous-version=2&new-version=3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-12-21 10:34:24 -08:00
dependabot[bot]
914bc409b0
Bump transformers from 4.30.0 to 4.36.0 in /tools/ci_build (#18895)
Bumps [transformers](https://github.com/huggingface/transformers) from
4.30.0 to 4.36.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/huggingface/transformers/releases">transformers's
releases</a>.</em></p>
<blockquote>
<h2>v4.36: Mixtral, Llava/BakLlava, SeamlessM4T v2, AMD ROCm, F.sdpa
wide-spread support</h2>
<h2>New model additions</h2>
<h3>Mixtral</h3>
<p>Mixtral is the new open-source model from Mistral AI announced by the
blogpost <a href="https://mistral.ai/news/mixtral-of-experts/">Mixtral
of Experts</a>. The model has been proven to have comparable
capabilities to Chat-GPT according to the benchmark results shared on
the release blogpost.</p>
<!-- raw HTML omitted -->
<p>The architecture is a sparse Mixture of Experts with a Top-2 routing
strategy, similar to the <code>NllbMoe</code> architecture in transformers.
You can use it through the <code>AutoModelForCausalLM</code> interface:</p>
<pre lang="py"><code>&gt;&gt;&gt; import torch
&gt;&gt;&gt; from transformers import AutoModelForCausalLM, AutoTokenizer

&gt;&gt;&gt; model = AutoModelForCausalLM.from_pretrained(&quot;mistralai/Mixtral-8x7B&quot;, torch_dtype=torch.float16, device_map=&quot;auto&quot;)
&gt;&gt;&gt; tokenizer = AutoTokenizer.from_pretrained(&quot;mistralai/Mixtral-8x7B&quot;)

&gt;&gt;&gt; prompt = &quot;My favourite condiment is&quot;

&gt;&gt;&gt; # device_map=&quot;auto&quot; already places the model, so move only the inputs
&gt;&gt;&gt; model_inputs = tokenizer([prompt], return_tensors=&quot;pt&quot;).to(model.device)

&gt;&gt;&gt; generated_ids = model.generate(**model_inputs, max_new_tokens=100, do_sample=True)
&gt;&gt;&gt; tokenizer.batch_decode(generated_ids)[0]
</code></pre>
<p>The model is compatible with existing optimisation tools such as Flash
Attention 2, <code>bitsandbytes</code> and the PEFT library. The checkpoints
are released under the <a
href="https://huggingface.co/mistralai"><code>mistralai</code></a>
organisation on the Hugging Face Hub.</p>
<h3>Llava / BakLlava</h3>
<p>Llava is an open-source chatbot trained by fine-tuning LLaMA/Vicuna
on GPT-generated multimodal instruction-following data. It is an
auto-regressive language model, based on the transformer architecture.
In other words, it is a multi-modal version of LLMs fine-tuned for chat
/ instructions.</p>
<!-- raw HTML omitted -->
<p>The Llava model was proposed in <a
href="https://arxiv.org/pdf/2310.03744">Improved Baselines with Visual
Instruction Tuning</a> by Haotian Liu, Chunyuan Li, Yuheng Li and Yong
Jae Lee.</p>
<ul>
<li>[<code>Llava</code>] Add Llava to transformers by <a
href="https://github.com/younesbelkada"><code>@​younesbelkada</code></a>
in <a
href="https://redirect.github.com/huggingface/transformers/issues/27662">#27662</a></li>
<li>[LLaVa] Some improvements by <a
href="https://github.com/NielsRogge"><code>@​NielsRogge</code></a> in <a
href="https://redirect.github.com/huggingface/transformers/issues/27895">#27895</a></li>
</ul>
<p>The integration also includes <a
href="https://github.com/SkunkworksAI/BakLLaVA"><code>BakLlava</code></a>
which is a Llava model trained with Mistral backbone.</p>
<p>The model is compatible with the <code>&quot;image-to-text&quot;</code>
pipeline:</p>
<pre lang="py"><code>from transformers import pipeline
from PIL import Image
import requests

model_id = &quot;llava-hf/llava-1.5-7b-hf&quot;
</code></pre>
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="14666775a2"><code>1466677</code></a>
Release: v4.36.0</li>
<li><a
href="accccdd008"><code>accccdd</code></a>
[<code>Add Mixtral</code>] Adds support for the Mixtral MoE (<a
href="https://redirect.github.com/huggingface/transformers/issues/27942">#27942</a>)</li>
<li><a
href="0676d992a5"><code>0676d99</code></a>
[<code>from_pretrained</code>] Make from_pretrained fast again (<a
href="https://redirect.github.com/huggingface/transformers/issues/27709">#27709</a>)</li>
<li><a
href="9f18cc6df0"><code>9f18cc6</code></a>
Fix SDPA dispatch &amp; make SDPA CI compatible with torch&lt;2.1.1 (<a
href="https://redirect.github.com/huggingface/transformers/issues/27940">#27940</a>)</li>
<li><a
href="7ea21f1f03"><code>7ea21f1</code></a>
[LLaVa] Some improvements (<a
href="https://redirect.github.com/huggingface/transformers/issues/27895">#27895</a>)</li>
<li><a
href="5e620a92cf"><code>5e620a9</code></a>
Fix <code>SeamlessM4Tv2ModelIntegrationTest</code> (<a
href="https://redirect.github.com/huggingface/transformers/issues/27911">#27911</a>)</li>
<li><a
href="e96c1de191"><code>e96c1de</code></a>
Skip <code>UnivNetModelTest::test_multi_gpu_data_parallel_forward</code>
(<a
href="https://redirect.github.com/huggingface/transformers/issues/27912">#27912</a>)</li>
<li><a
href="8d8970efdd"><code>8d8970e</code></a>
[BEiT] Fix test (<a
href="https://redirect.github.com/huggingface/transformers/issues/27934">#27934</a>)</li>
<li><a
href="235be08569"><code>235be08</code></a>
[DETA] fix backbone freeze/unfreeze function (<a
href="https://redirect.github.com/huggingface/transformers/issues/27843">#27843</a>)</li>
<li><a
href="df5c5c62ae"><code>df5c5c6</code></a>
Fix typo (<a
href="https://redirect.github.com/huggingface/transformers/issues/27918">#27918</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/huggingface/transformers/compare/v4.30.0...v4.36.0">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=transformers&package-manager=pip&previous-version=4.30.0&new-version=4.36.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)


Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-12-21 00:44:36 -08:00
dependabot[bot]
f3c62bfad9
Bump actions/setup-node from 3 to 4 (#18148)
Bumps [actions/setup-node](https://github.com/actions/setup-node) from 3
to 4.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/setup-node/releases">actions/setup-node's
releases</a>.</em></p>
<blockquote>
<h2>v4.0.0</h2>
<h2>What's Changed</h2>
<p>In scope of this release we changed version of node runtime for
action from node16 to node20 and updated dependencies in <a
href="https://redirect.github.com/actions/setup-node/pull/866">actions/setup-node#866</a></p>
<p>Besides, release contains such changes as:</p>
<ul>
<li>Upgrade actions/checkout to v4 by <a
href="https://github.com/gmembre-zenika"><code>@​gmembre-zenika</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/868">actions/setup-node#868</a></li>
<li>Update actions/checkout for documentation and yaml by <a
href="https://github.com/dmitry-shibanov"><code>@​dmitry-shibanov</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/876">actions/setup-node#876</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/gmembre-zenika"><code>@​gmembre-zenika</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-node/pull/868">actions/setup-node#868</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-node/compare/v3...v4.0.0">https://github.com/actions/setup-node/compare/v3...v4.0.0</a></p>
<h2>v3.8.2</h2>
<h2>What's Changed</h2>
<ul>
<li>Update semver by <a
href="https://github.com/dmitry-shibanov"><code>@​dmitry-shibanov</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/861">actions/setup-node#861</a></li>
<li>Update temp directory creation by <a
href="https://github.com/nikolai-laevskii"><code>@​nikolai-laevskii</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/859">actions/setup-node#859</a></li>
<li>Bump <code>@​babel/traverse</code> from 7.15.4 to 7.23.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-node/pull/870">actions/setup-node#870</a></li>
<li>Add notice about binaries not being updated yet by <a
href="https://github.com/nikolai-laevskii"><code>@​nikolai-laevskii</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/872">actions/setup-node#872</a></li>
<li>Update toolkit cache and core by <a
href="https://github.com/dmitry-shibanov"><code>@​dmitry-shibanov</code></a>
and <a
href="https://github.com/seongwon-privatenote"><code>@​seongwon-privatenote</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/875">actions/setup-node#875</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-node/compare/v3...v3.8.2">https://github.com/actions/setup-node/compare/v3...v3.8.2</a></p>
<h2>v3.8.1</h2>
<h2>What's Changed</h2>
<p>In scope of this release, the filter was removed within the
cache-save step by <a
href="https://github.com/dmitry-shibanov"><code>@​dmitry-shibanov</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/831">actions/setup-node#831</a>.
It is filtered and checked in the toolkit/cache library.</p>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-node/compare/v3...v3.8.1">https://github.com/actions/setup-node/compare/v3...v3.8.1</a></p>
<h2>v3.8.0</h2>
<h2>What's Changed</h2>
<h3>Bug fixes:</h3>
<ul>
<li>Add check for existing paths by <a
href="https://github.com/dmitry-shibanov"><code>@​dmitry-shibanov</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/803">actions/setup-node#803</a></li>
<li>Resolve SymbolicLink by <a
href="https://github.com/dmitry-shibanov"><code>@​dmitry-shibanov</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/809">actions/setup-node#809</a></li>
<li>Change passing logic for cache input by <a
href="https://github.com/dmitry-shibanov"><code>@​dmitry-shibanov</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/816">actions/setup-node#816</a></li>
<li>Fix armv7 cache issue by <a
href="https://github.com/louislam"><code>@​louislam</code></a> in <a
href="https://redirect.github.com/actions/setup-node/pull/794">actions/setup-node#794</a></li>
<li>Update check-dist workflow name by <a
href="https://github.com/sinchang"><code>@​sinchang</code></a> in <a
href="https://redirect.github.com/actions/setup-node/pull/710">actions/setup-node#710</a></li>
</ul>
<h3>Feature implementations:</h3>
<ul>
<li>feat: handling the case where &quot;node&quot; is used for
tool-versions file. by <a
href="https://github.com/xytis"><code>@​xytis</code></a> in <a
href="https://redirect.github.com/actions/setup-node/pull/812">actions/setup-node#812</a></li>
</ul>
<h3>Documentation changes:</h3>
<ul>
<li>Refer to semver package name in README.md by <a
href="https://github.com/olleolleolle"><code>@​olleolleolle</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/808">actions/setup-node#808</a></li>
</ul>
<h3>Update dependencies:</h3>
<ul>
<li>Update toolkit cache to fix zstd by <a
href="https://github.com/dmitry-shibanov"><code>@​dmitry-shibanov</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/804">actions/setup-node#804</a></li>
<li>Bump tough-cookie and <code>@​azure/ms-rest-js</code> by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-node/pull/802">actions/setup-node#802</a></li>
<li>Bump semver from 6.1.2 to 6.3.1 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-node/pull/807">actions/setup-node#807</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="8f152de45c"><code>8f152de</code></a>
Update actions/checkout for documentation and yaml (<a
href="https://redirect.github.com/actions/setup-node/issues/876">#876</a>)</li>
<li><a
href="23755b521f"><code>23755b5</code></a>
upgrade actions/checkout to v4 (<a
href="https://redirect.github.com/actions/setup-node/issues/868">#868</a>)</li>
<li><a
href="54534a2a9b"><code>54534a2</code></a>
Change node version for action to node20 (<a
href="https://redirect.github.com/actions/setup-node/issues/866">#866</a>)</li>
<li>See full diff in <a
href="https://github.com/actions/setup-node/compare/v3...v4">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/setup-node&package-manager=github_actions&previous-version=3&new-version=4)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

You can trigger a rebase of this PR by commenting `@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

> **Note**
> Automatic rebases have been disabled on this pull request as it has
been open for over 30 days.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-12-20 23:12:17 -08:00
dependabot[bot]
f74389c976
Bump github/issue-labeler from 3.2 to 3.3 (#18408)
Bumps [github/issue-labeler](https://github.com/github/issue-labeler)
from 3.2 to 3.3.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/github/issue-labeler/releases">github/issue-labeler's
releases</a>.</em></p>
<blockquote>
<h2>v3.3</h2>
<h2>What's Changed</h2>
<ul>
<li>feat(config): support reading from local file if it exists by <a
href="https://github.com/lrstanley"><code>@​lrstanley</code></a> in <a
href="https://redirect.github.com/github/issue-labeler/pull/48">github/issue-labeler#48</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/lrstanley"><code>@​lrstanley</code></a>
made their first contribution in <a
href="https://redirect.github.com/github/issue-labeler/pull/48">github/issue-labeler#48</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/github/issue-labeler/compare/v3.2...v3.3">https://github.com/github/issue-labeler/compare/v3.2...v3.3</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="6bea9ed491"><code>6bea9ed</code></a>
feat(config): support reading from local file if it exists (<a
href="https://redirect.github.com/github/issue-labeler/issues/48">#48</a>)</li>
<li>See full diff in <a
href="https://github.com/github/issue-labeler/compare/v3.2...v3.3">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=github/issue-labeler&package-manager=github_actions&previous-version=3.2&new-version=3.3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

You can trigger a rebase of this PR by commenting `@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---


> **Note**
> Automatic rebases have been disabled on this pull request as it has
been open for over 30 days.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-12-20 22:20:59 -08:00
Yifan Li
54e471a054
[EP Perf] Display percentage of cuda/trt ops in cuda/trt ep on EP Perf Dashboard (#18868)
### Description
Display percentage of cuda/trt ops in cuda/trt ep on EP Perf Dashboard:

![image](https://github.com/microsoft/onnxruntime/assets/109183385/bafba098-1338-46fa-b10a-ca19eff2a746)

Check
[here](https://msit.powerbi.com/groups/d1ae6355-afd0-4c40-b78e-676a86cab1e2/reports/82101bbb-dad2-4f24-9ddf-a37f0d41509a/ReportSectionda402bdf6824e505a614?experience=power-bi)
to preview on ep perf dashboard


### Motivation and Context
- Gives a brief overview of op metrics across various models.
- Makes it easy to identify models that haven't reached 100% of ops on the cuda/trt ep.
2023-12-20 22:11:47 -08:00
dependabot[bot]
ce70a30b94
Bump transformers from 4.35.2 to 4.36.0 in /onnxruntime/python/tools/transformers/models/stable_diffusion (#18896)
Bumps [transformers](https://github.com/huggingface/transformers) from
4.35.2 to 4.36.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/huggingface/transformers/releases">transformers's
releases</a>.</em></p>
<blockquote>
<h2>v4.36: Mixtral, Llava/BakLlava, SeamlessM4T v2, AMD ROCm, F.sdpa
wide-spread support</h2>
<h2>New model additions</h2>
<h3>Mixtral</h3>
<p>Mixtral is the new open-source model from Mistral AI, announced in the
blog post <a href="https://mistral.ai/news/mixtral-of-experts/">Mixtral
of Experts</a>. According to the benchmark results shared in that post,
its capabilities are comparable to ChatGPT.</p>
<p>The architecture is a sparse Mixture of Experts with a Top-2 routing
strategy, similar to the <code>NllbMoe</code> architecture in transformers.
You can use it through the <code>AutoModelForCausalLM</code> interface:</p>
<pre lang="py"><code>&gt;&gt;&gt; import torch
&gt;&gt;&gt; from transformers import AutoModelForCausalLM, AutoTokenizer
&gt;&gt;&gt; # device_map=&quot;auto&quot; places the weights, so no explicit model.to(...) is needed
&gt;&gt;&gt; model = AutoModelForCausalLM.from_pretrained(&quot;mistralai/Mixtral-8x7B&quot;, torch_dtype=torch.float16, device_map=&quot;auto&quot;)
&gt;&gt;&gt; tokenizer = AutoTokenizer.from_pretrained(&quot;mistralai/Mixtral-8x7B&quot;)
&gt;&gt;&gt; prompt = &quot;My favourite condiment is&quot;
&gt;&gt;&gt; # move the inputs to the device the model was placed on
&gt;&gt;&gt; model_inputs = tokenizer([prompt], return_tensors=&quot;pt&quot;).to(model.device)
&gt;&gt;&gt; generated_ids = model.generate(**model_inputs, max_new_tokens=100, do_sample=True)
&gt;&gt;&gt; tokenizer.batch_decode(generated_ids)[0]
</code></pre>
<p>The model is compatible with existing optimisation tools such as Flash
Attention 2, <code>bitsandbytes</code> and the PEFT library. The checkpoints
are released under the <a
href="https://huggingface.co/mistralai"><code>mistralai</code></a>
organisation on the Hugging Face Hub.</p>
<h3>Llava / BakLlava</h3>
<p>Llava is an open-source chatbot trained by fine-tuning LLaMA/Vicuna
on GPT-generated multimodal instruction-following data. It is an
auto-regressive language model, based on the transformer architecture.
In other words, it is a multi-modal version of LLMs fine-tuned for chat
/ instructions.</p>
<p>The Llava model was proposed in <a
href="https://arxiv.org/pdf/2310.03744">Improved Baselines with Visual
Instruction Tuning</a> by Haotian Liu, Chunyuan Li, Yuheng Li and Yong
Jae Lee.</p>
<ul>
<li>[<code>Llava</code>] Add Llava to transformers by <a
href="https://github.com/younesbelkada"><code>@​younesbelkada</code></a>
in <a
href="https://redirect.github.com/huggingface/transformers/issues/27662">#27662</a></li>
<li>[LLaVa] Some improvements by <a
href="https://github.com/NielsRogge"><code>@​NielsRogge</code></a> in <a
href="https://redirect.github.com/huggingface/transformers/issues/27895">#27895</a></li>
</ul>
<p>The integration also includes <a
href="https://github.com/SkunkworksAI/BakLLaVA"><code>BakLlava</code></a>
which is a Llava model trained with Mistral backbone.</p>
<p>The model is compatible with the <code>&quot;image-to-text&quot;</code>
pipeline:</p>
<pre lang="py"><code>from transformers import pipeline
from PIL import Image
import requests
model_id = &quot;llava-hf/llava-1.5-7b-hf&quot;
</code></pre>
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="14666775a2"><code>1466677</code></a>
Release: v4.36.0</li>
<li><a
href="accccdd008"><code>accccdd</code></a>
[<code>Add Mixtral</code>] Adds support for the Mixtral MoE (<a
href="https://redirect.github.com/huggingface/transformers/issues/27942">#27942</a>)</li>
<li><a
href="0676d992a5"><code>0676d99</code></a>
[<code>from_pretrained</code>] Make from_pretrained fast again (<a
href="https://redirect.github.com/huggingface/transformers/issues/27709">#27709</a>)</li>
<li><a
href="9f18cc6df0"><code>9f18cc6</code></a>
Fix SDPA dispatch &amp; make SDPA CI compatible with torch&lt;2.1.1 (<a
href="https://redirect.github.com/huggingface/transformers/issues/27940">#27940</a>)</li>
<li><a
href="7ea21f1f03"><code>7ea21f1</code></a>
[LLaVa] Some improvements (<a
href="https://redirect.github.com/huggingface/transformers/issues/27895">#27895</a>)</li>
<li><a
href="5e620a92cf"><code>5e620a9</code></a>
Fix <code>SeamlessM4Tv2ModelIntegrationTest</code> (<a
href="https://redirect.github.com/huggingface/transformers/issues/27911">#27911</a>)</li>
<li><a
href="e96c1de191"><code>e96c1de</code></a>
Skip <code>UnivNetModelTest::test_multi_gpu_data_parallel_forward</code>
(<a
href="https://redirect.github.com/huggingface/transformers/issues/27912">#27912</a>)</li>
<li><a
href="8d8970efdd"><code>8d8970e</code></a>
[BEiT] Fix test (<a
href="https://redirect.github.com/huggingface/transformers/issues/27934">#27934</a>)</li>
<li><a
href="235be08569"><code>235be08</code></a>
[DETA] fix backbone freeze/unfreeze function (<a
href="https://redirect.github.com/huggingface/transformers/issues/27843">#27843</a>)</li>
<li><a
href="df5c5c62ae"><code>df5c5c6</code></a>
Fix typo (<a
href="https://redirect.github.com/huggingface/transformers/issues/27918">#27918</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/huggingface/transformers/compare/v4.35.2...v4.36.0">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=transformers&package-manager=pip&previous-version=4.35.2&new-version=4.36.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---


Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-12-20 22:09:02 -08:00
dependabot[bot]
379c7c43eb
Bump actions/setup-java from 3 to 4 (#18686)
Bumps [actions/setup-java](https://github.com/actions/setup-java) from 3
to 4.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/setup-java/releases">actions/setup-java's
releases</a>.</em></p>
<blockquote>
<h2>v4.0.0</h2>
<h2>What's Changed</h2>
<p>In the scope of this release, the version of the Node.js runtime was
updated to 20. The majority of dependencies were updated to the latest
versions. From now on, the code for the setup-java will run on Node.js
20 instead of Node.js 16.</p>
<h2>Breaking changes</h2>
<ul>
<li>Update Node.js runtime to version 20 by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-java/pull/558">actions/setup-java#558</a></li>
</ul>
<h2>Non-breaking changes</h2>
<ul>
<li>Adding support for microsoft openjdk 21.0.0 by <a
href="https://github.com/ralfstuckert"><code>@​ralfstuckert</code></a>
in <a
href="https://redirect.github.com/actions/setup-java/pull/546">actions/setup-java#546</a></li>
<li>Update <code>@​actions/cache</code> dependency and documentation by
<a href="https://github.com/IvanZosimov"><code>@​IvanZosimov</code></a>
in <a
href="https://redirect.github.com/actions/setup-java/pull/549">actions/setup-java#549</a></li>
<li>Implementation of the cache-dependency-path option to control
caching dependency by <a
href="https://github.com/itchyny"><code>@​itchyny</code></a> in <a
href="https://redirect.github.com/actions/setup-java/pull/499">actions/setup-java#499</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/ralfstuckert"><code>@​ralfstuckert</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-java/pull/546">actions/setup-java#546</a></li>
<li><a href="https://github.com/itchyny"><code>@​itchyny</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/setup-java/pull/499">actions/setup-java#499</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-java/compare/v3...v4.0.0">https://github.com/actions/setup-java/compare/v3...v4.0.0</a></p>
<h2>v3.13.0</h2>
<h2>What's changed</h2>
<p>In the scope of this release, support for Dragonwell JDK was added by
<a
href="https://github.com/Accelerator1996"><code>@​Accelerator1996</code></a>
in <a
href="https://redirect.github.com/actions/setup-java/pull/532">actions/setup-java#532</a></p>
<pre lang="yaml"><code>steps:
 - name: Checkout
   uses: actions/checkout@v3
 - name: Setup-java
   uses: actions/setup-java@v3
   with:
     distribution: 'dragonwell'
     java-version: '17'
</code></pre>
<p>Several inaccuracies were also fixed:</p>
<ul>
<li>Fix XML namespaces wrongly using https by <a
href="https://github.com/gnodet"><code>@​gnodet</code></a> in <a
href="https://redirect.github.com/actions/setup-java/pull/503">actions/setup-java#503</a></li>
<li>Fix typo and remove unintentional(?) word by <a
href="https://github.com/CyberFlameGO"><code>@​CyberFlameGO</code></a>
in <a
href="https://redirect.github.com/actions/setup-java/pull/518">actions/setup-java#518</a></li>
<li>Fix usage link within the README.md file by <a
href="https://github.com/dassiorleando"><code>@​dassiorleando</code></a>
in <a
href="https://redirect.github.com/actions/setup-java/pull/525">actions/setup-java#525</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/CyberFlameGO"><code>@​CyberFlameGO</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-java/pull/518">actions/setup-java#518</a></li>
<li><a
href="https://github.com/dassiorleando"><code>@​dassiorleando</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-java/pull/525">actions/setup-java#525</a></li>
<li><a href="https://github.com/gnodet"><code>@​gnodet</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/setup-java/pull/503">actions/setup-java#503</a></li>
<li><a
href="https://github.com/Accelerator1996"><code>@​Accelerator1996</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-java/pull/532">actions/setup-java#532</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-java/compare/v3...v3.13.0">https://github.com/actions/setup-java/compare/v3...v3.13.0</a></p>
<h2>v3.12.0</h2>
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="387ac29b30"><code>387ac29</code></a>
Upgrade Node to v20 (<a
href="https://redirect.github.com/actions/setup-java/issues/558">#558</a>)</li>
<li><a
href="9eda6b51cc"><code>9eda6b5</code></a>
feat: implement cache-dependency-path option to control caching
dependency (#...</li>
<li><a
href="78078da0cd"><code>78078da</code></a>
Update <code>@​actions/cache</code> dependency and documentation (<a
href="https://redirect.github.com/actions/setup-java/issues/549">#549</a>)</li>
<li><a
href="5caaba646e"><code>5caaba6</code></a>
add support for microsoft openjdk 21.0.0 (<a
href="https://redirect.github.com/actions/setup-java/issues/546">#546</a>)</li>
<li>See full diff in <a
href="https://github.com/actions/setup-java/compare/v3...v4">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/setup-java&package-manager=github_actions&previous-version=3&new-version=4)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---


Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-12-20 22:08:33 -08:00
Kevin Chen
1c6cb5dfeb
Remove usage of TRT deprecated APIs (#18879)
### Description

- Wrap usage of `kENABLE_TACTIC_HEURISTIC` in version-checking macros.
- Use `delete` instead of the deprecated `destroy()` functions on TRT objects.


### Motivation and Context
- Removes usages of deprecated TRT APIs.

Signed-off-by: Kevin Chen <kevinch@nvidia.com>
2023-12-20 15:08:13 -08:00
Tianlei Wu
2d6e2e243d
update sdxl demo (#18889)
### Description
(1) Support importing model from Olive.
(2) Add backend engine Torch (Eager and Compile modes) to the demo.
(3) Use fp16 in most places.
(4) Remove some old pipeline scripts that are not useful anymore. They
are replaced by the demo.
(5) Remove old benchmark results that are out of date.
(6) Add PIL image conversion to end-to-end latency (for a fair comparison
with diffusers, since its default output type is PIL).
(7) Remove some seldom-used options such as force-rebuild-engine,
hf-token, refit, etc.

### Motivation and Context
2023-12-20 14:46:22 -08:00
Yulong Wang
9a61388f0a
[js/web] revise backend registration (#18715)
### Description
This PR revises the backend registration.

The following describes the expected behavior after this change:
(**bolded are changed behavior**)

- (ort.min.js - built without webgpu support)
    - loading: do not register 'webgpu' backend
    - creating session without EP list: use default EP list ['webnn', 'cpu', 'wasm']
    - creating session with ['webgpu'] as EP list: should fail with backend not available
- (ort.webgpu.min.js - built with webgpu support)
    - loading: **always register 'webgpu' backend**
      (previous behavior: only register 'webgpu' backend when `navigator.gpu` is available)
    - creating session without EP list: use default EP list ['webgpu', 'webnn', 'cpu', 'wasm']
        - when WebGPU is available (win): use WebGPU backend
        - when WebGPU is unavailable (android): **should fail backend init,** and try to use the next backend in the list, 'webnn'
          (previous behavior: did not fail backend init, but failed in JSEP init, which was too late to switch to the next backend)
    - creating session with ['webgpu'] as EP list
        - when WebGPU is available (win): use WebGPU backend
        - when WebGPU is unavailable (android): **should fail backend init,** and because no more EPs are listed, fail.


related PRs: #18190 #18144
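
The fallback behavior described above can be modeled roughly as follows. This is only a sketch in Python, not the onnxruntime-web TypeScript code: `Backend`, `BackendInitError`, `resolve_backend`, and the `available` flag are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


class BackendInitError(Exception):
    """Raised when a backend cannot be initialized on this platform."""


@dataclass
class Backend:
    name: str
    available: bool  # hypothetical stand-in for e.g. a usable WebGPU adapter

    def init(self) -> None:
        # An unavailable backend fails here, at init time, so the caller
        # can still move on to the next entry in the EP list.
        if not self.available:
            raise BackendInitError(f"backend '{self.name}' is not available")


def resolve_backend(ep_list, registry):
    """Return the name of the first backend in ep_list that initializes."""
    errors = []
    for name in ep_list:
        backend = registry.get(name)
        if backend is None:
            errors.append(f"backend '{name}' is not registered")
            continue
        try:
            backend.init()
            return name
        except BackendInitError as exc:
            errors.append(str(exc))
    raise RuntimeError("no available backend found: " + "; ".join(errors))


# 'webgpu' is registered but unavailable on this (hypothetical) device.
registry = {
    "webgpu": Backend("webgpu", available=False),
    "webnn": Backend("webnn", available=False),
    "wasm": Backend("wasm", available=True),
}
chosen = resolve_backend(["webgpu", "webnn", "cpu", "wasm"], registry)
```

With the default list the sketch falls through to a later entry, while `resolve_backend(["webgpu"], registry)` raises because no further EP is listed.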
2023-12-20 14:45:55 -08:00
Yifan Li
c0142c9108
[EP Perf] Fix model zoo url (#18808)
### Description
The ONNX Model Zoo had a major update recently, and legacy models were
relocated under /archive/. Update the model URLs accordingly.


### Motivation and Context
2023-12-20 10:54:45 -08:00
Hector Li
8931854528
Move some QNN EP provider options to session options (#18877)
Move QNN EP provider options to session options

### Description
Session options are needed to support multi-partition for the context cache feature. To smooth the transition, move the provider options to session options first.

This is the first step for PR https://github.com/microsoft/onnxruntime/pull/18865
2023-12-20 00:13:38 -08:00
Ye Wang
02eb17655d
Fix a bug in 4bits quantizer script (#18878)
### Description



### Motivation and Context
2023-12-19 22:53:33 -08:00
Scott McKay
666fcbde4d
Add LeakyRelu to list of NNAPI operators (#18880)
### Description
Add LeakyRelu to the list as support was added a while ago. 


### Motivation and Context
2023-12-20 14:44:31 +10:00
Changming Sun
535a2403dd
Update Nuget publishing jobs (#18851)
### Description
1. Add a CodeSign validation task before the binaries are published, to
make sure all DLL files are signed.
2. Auto-trigger the CUDA 12 pipeline's publishing job.
2023-12-19 16:54:46 -08:00
Yulong Wang
ffa6602686
[js/node] support manually dispose session (#18655)
### Description
Support manually disposing a session in onnxruntime-node.

feature request: #16796
2023-12-19 16:20:00 -08:00
satyajandhyala
98510fb8fb
[JS/WebGPU] fix an error in Clip (#18799)
### Description
Check whether the min/max inputs are provided and use default values if not provided.
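
The idea of falling back to defaults when the optional bounds are missing can be sketched as below. This is plain Python rather than the WebGPU kernel, and `clip` with its `min_value`/`max_value` parameters is a hypothetical helper; using infinities as the defaults is an assumption made for the illustration.

```python
import math


def clip(values, min_value=None, max_value=None):
    """Clamp each value, treating a missing bound as 'no limit'."""
    # Assumption: a missing bound defaults to the widest possible range,
    # modeled here with +/- infinity.
    lo = -math.inf if min_value is None else min_value
    hi = math.inf if max_value is None else max_value
    return [min(max(v, lo), hi) for v in values]


data = [-2.0, 0.5, 3.0]
both = clip(data, 0.0, 1.0)           # both bounds provided
only_max = clip(data, max_value=1.0)  # min omitted
neither = clip(data)                  # both omitted: values pass through
```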


### Motivation and Context
2023-12-19 13:51:01 -08:00
liqun Fu
32fcf73740
Implement dft(20) (#17821)
### Description
DFT is updated in opset 20. Implement it in ORT.



### Motivation and Context
This is for the ORT 1.17.0 release.

Fixes #17723

---------

Signed-off-by: Liqun Fu <liqfu@microsoft.com>
2023-12-19 10:42:54 -08:00
luoyu-intel
5f00bc9931
Integrate high-performance x64 gemm library to MLAS (#17669)
### Description
Improve MLAS to support high-performance x64 INT4 kernels



### Motivation and Context
1. improve LLM inference performance on Intel CPUs.
2. support more 4bit quantization types: nf4, fp4
3. support dynamic block size: block size aligned with kernel's tiling
size(e.g. 4 for VNNI kernel), per channel on N dimension
4. support most Intel ISAs: avx2, avx_vnni, avx512f, avx512_vnni,
amx_bf16, amx_int8, avx512_fp16
5. support MatMulNBits' data format

### Tasks
- [x] support block_size: 32, 128, -1(per channel)
- [x] get weight pack size without memory allocation
- [x] use ort's thread pool for parallelism
- [x] support ISAs: avx2, avx512f, avx_vnni, avx512_vnni, amx_int8
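
To make the block-size options concrete, here is a minimal sketch of symmetric 4-bit quantization with one scale per block. It is not the MLAS/Jblas packing format or kernel code; `quantize_int4_symmetric` and `dequantize` are hypothetical helpers, the scale rule (largest magnitude maps to 7) is an assumption, and the nf4/fp4 types are not covered.

```python
def quantize_int4_symmetric(row, block_size):
    """Quantize one row of weights to signed 4-bit values with per-block scales.

    block_size == -1 means a single block spanning the whole row
    (the "per channel" case).
    """
    if block_size == -1:
        block_size = len(row)
    blocks = []
    for start in range(0, len(row), block_size):
        block = row[start:start + block_size]
        max_abs = max(abs(v) for v in block)
        # Map the largest magnitude to 7 so every value fits in [-8, 7].
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        q = [max(-8, min(7, round(v / scale))) for v in block]
        blocks.append((scale, q))
    return blocks


def dequantize(blocks):
    """Rebuild approximate float weights from (scale, values) blocks."""
    return [scale * q for scale, qs in blocks for q in qs]


weights = [0.1 * (i % 16 - 8) for i in range(256)]
per_block_32 = quantize_int4_symmetric(weights, 32)
per_block_128 = quantize_int4_symmetric(weights, 128)
per_channel = quantize_int4_symmetric(weights, -1)
```

Smaller blocks store more scales but track local weight ranges more closely; the per-channel case keeps a single scale for the whole row.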

### Benchmark
Ubuntu 20.22 + Intel(R) Xeon(R) Platinum 8480+ 56 cores

Benchmark | Time | CPU | Iterations
-- | -- | -- | --
Q4GEMM_Jblas/Q4G32SymInt8/M:1/N:4096/K:4096/Threads:56/real_time | 47613 | 47401 | 12970
Q4GEMM_Jblas/Q4G32SymInt8/M:1024/N:4096/K:4096/Threads:56/real_time | 6347792 | 6317562 | 109
Q4GEMM_Jblas/Q4G32SymInt8/M:2048/N:4096/K:4096/Threads:56/real_time | 11814014 | 11757847 | 59
Q4GEMM_Jblas/Q4G128SymInt8/M:1/N:4096/K:4096/Threads:56/real_time | 50222 | 50031 | 13759
Q4GEMM_Jblas/Q4G128SymInt8/M:1024/N:4096/K:4096/Threads:56/real_time | 2038222 | 2028743 | 341
Q4GEMM_Jblas/Q4G128SymInt8/M:2048/N:4096/K:4096/Threads:56/real_time | 3792832 | 3774485 | 191
Q4GEMM_Jblas/Q4GPerNSymInt8/M:1/N:4096/K:4096/Threads:56/real_time | 58717 | 58501 | 11467
Q4GEMM_Jblas/Q4GPerNSymInt8/M:1024/N:4096/K:4096/Threads:56/real_time | 1360846 | 1354598 | 543
Q4GEMM_Jblas/Q4GPerNSymInt8/M:2048/N:4096/K:4096/Threads:56/real_time | 2564232 | 2551365 | 266
Q4GEMM_Jblas/Q4G32SymFp32/M:1/N:4096/K:4096/Threads:56/real_time | 57929 | 57694 | 12047
Q4GEMM_Jblas/Q4G32SymFp32/M:1024/N:4096/K:4096/Threads:56/real_time | 5495330 | 5465810 | 126
Q4GEMM_Jblas/Q4G32SymFp32/M:2048/N:4096/K:4096/Threads:56/real_time | 10676240 | 10617817 | 66
Q4GEMM_Jblas/Q4G128SymFp32/M:1/N:4096/K:4096/Threads:56/real_time | 68305 | 68047 | 10026
Q4GEMM_Jblas/Q4G128SymFp32/M:1024/N:4096/K:4096/Threads:56/real_time | 5504862 | 5476215 | 126
Q4GEMM_Jblas/Q4G128SymFp32/M:2048/N:4096/K:4096/Threads:56/real_time | 11758623 | 11697337 | 66
Q4GEMM_Jblas/Q4GPerNSymFp32/M:1/N:4096/K:4096/Threads:56/real_time | 67713 | 67451 | 10298
Q4GEMM_Jblas/Q4GPerNSymFp32/M:1024/N:4096/K:4096/Threads:56/real_time | 5508325 | 5480237 | 126
Q4GEMM_Jblas/Q4GPerNSymFp32/M:2048/N:4096/K:4096/Threads:56/real_time | 10738528 | 10681656 | 64
Q4GEMM_Jblas/Q4G32AsymFp32/M:1/N:4096/K:4096/Threads:56/real_time | 60708 | 60486 | 11321
Q4GEMM_Jblas/Q4G32AsymFp32/M:1024/N:4096/K:4096/Threads:56/real_time | 5523784 | 5495736 | 126
Q4GEMM_Jblas/Q4G32AsymFp32/M:2048/N:4096/K:4096/Threads:56/real_time | 10829633 | 10772161 | 67


Reference:

Benchmark | Time | CPU | Iterations
-- | -- | -- | --
Q4GEMM/Q4Sym/M:1/N:4096/K:4096/Threads:56/real_time | 53088 | 52911 | 13364
Q4GEMM/Q4Sym/M:1024/N:4096/K:4096/Threads:56/real_time | 6268981 | 6230335 | 110
Q4GEMM/Q4Sym/M:2048/N:4096/K:4096/Threads:56/real_time | 11701237 | 11632339 | 59

Win11+12900K 8 cores:

Benchmark | Time | CPU | Iterations
-- | -- | -- | --
Q4GEMM_Jblas/Q4G32SymInt8/M:1/N:4096/K:4096/Threads:8/real_time | 215976 | 211295 | 2884
Q4GEMM_Jblas/Q4G32SymInt8/M:1024/N:4096/K:4096/Threads:8/real_time | 60960590 | 60937500 | 10
Q4GEMM_Jblas/Q4G32SymInt8/M:2048/N:4096/K:4096/Threads:8/real_time | 1.18E+08 | 1.19E+08 | 5
Q4GEMM_Jblas/Q4G32SymInt8/M:1/N:11008/K:4096/Threads:8/real_time | 470377 | 453059 | 1414
Q4GEMM_Jblas/Q4G32SymInt8/M:1024/N:11008/K:4096/Threads:8/real_time | 1.54E+08 | 1.53E+08 | 5
Q4GEMM_Jblas/Q4G32SymInt8/M:2048/N:11008/K:4096/Threads:8/real_time | 3.18E+08 | 3.13E+08 | 2
Q4GEMM_Jblas/Q4G32SymInt8/M:1/N:4096/K:11008/Threads:8/real_time | 569072 | 559398 | 1229
Q4GEMM_Jblas/Q4G32SymInt8/M:1024/N:4096/K:11008/Threads:8/real_time | 1.54E+08 | 1.52E+08 | 4
Q4GEMM_Jblas/Q4G32SymInt8/M:2048/N:4096/K:11008/Threads:8/real_time | 3.22E+08 | 3.28E+08 | 2
Q4GEMM_Jblas/Q4G32SymInt8/M:1/N:11008/K:11008/Threads:8/real_time | 1486055 | 1473325 | 403
Q4GEMM_Jblas/Q4G32SymInt8/M:1024/N:11008/K:11008/Threads:8/real_time | 4.14E+08 | 4.14E+08 | 2
Q4GEMM_Jblas/Q4G32SymInt8/M:2048/N:11008/K:11008/Threads:8/real_time | 8.88E+08 | 8.59E+08 | 1

---------

Signed-off-by: Mengni Wang <mengni.wang@intel.com>
Co-authored-by: Mengni Wang <mengni.wang@intel.com>
2023-12-19 09:36:31 -08:00
Ashwini Khade
4dff154f51
Fix nightly pipeline failure (#18867)
### Description
Fixes a failure in the ortmodule nightly pipeline. 



### Motivation and Context
2023-12-19 09:18:00 -08:00
Jian Chen
6d7519ede8
Adding new pipeline for python cuda testing (#18718)
### Description



### Motivation and Context
2023-12-18 18:13:03 -08:00
Frank
63b47ceaf8
[REACT NATIVE] Bugfix -> casing Podfile (#18861)
### Description
The casing of Podfile is incorrect in the plugin. This causes issues
when building iOS on case-sensitive systems such as Linux.

### Motivation and Context
Without this fix, iOS cannot be built on case-sensitive systems.
2023-12-19 10:20:46 +10:00
dependabot[bot]
3ff4a4c393
Bump actions/stale from 8.0.0 to 9.0.0 (#18774)
2023-12-18 14:59:03 -08:00
sophies927
ea6186efa8
Update stale.yml to correct close-issue-message (#18849)
### Description



### Motivation and Context
2023-12-18 09:57:33 -08:00
Yifan Li
9426bd50cb
[TensorRT EP] Update deprecated TRT api (#18834)
### Description
Update deprecated TRT APIs:
1. [setMaxWorkspaceSize](https://docs.nvidia.com/deeplearning/tensorrt/api/c_api/classnvinfer1_1_1_i_builder_config.html#a8209999988ab480c60c8a905dfd2654d)(max_workspace_size_) -> setMemoryPoolLimit(nvinfer1::MemoryPoolType::kWORKSPACE, max_workspace_size_)
2. [kENABLE_TACTIC_HEURISTIC](https://docs.nvidia.com/deeplearning/tensorrt/api/c_api/namespacenvinfer1.html#abdc74c40fe7a0c3d05d2caeccfbc29c1a1215692ad24465e4d9e37a8a7fce1a38) -> superseded by TRT builder optimization level 2

Perf & warning log comparison

TRT EP options | Warning log shown to the user | Average inference time (FRCNN on A100)
-- | -- | --
trt_build_heuristics_enable\|true | [TensorRT EP] trt_build_heuristics_enable is deprecated on TRT 8.6 onwards. Please set builder optimization level as 2 to enable builder heuristics. | ~300ms
trt_build_heuristics_enable\|true   trt_builder_optimization_level\|2 | [TensorRT EP] Builder heuristics are enabled automatically by builder optimization level 2. trt_build_heuristics_enable is deprecated on TRT 8.6 onwards. | ~275ms
trt_builder_optimization_level\|2 |   | ~275ms

### Motivation and Context
Prepare for upcoming TRT 10
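
The option change described above can also be handled on the caller side. The following is a minimal sketch, assuming the provider options are kept in a plain dict keyed by the option names from the comparison table. `migrate_trt_options` is a hypothetical helper for illustration and is not part of ONNX Runtime.

```python
def migrate_trt_options(options):
    """Return a copy of `options` without the deprecated heuristics flag.

    Hypothetical helper: if `trt_build_heuristics_enable` was switched on and
    no builder optimization level was chosen, level 2 is requested instead,
    which is what the deprecation warning asks the user to do.
    """
    migrated = dict(options)  # leave the caller's dict untouched
    flag = migrated.pop("trt_build_heuristics_enable", None)
    # Accept both booleans and the string forms often used for EP options.
    enabled = str(flag).lower() in ("1", "true")
    if enabled and "trt_builder_optimization_level" not in migrated:
        migrated["trt_builder_optimization_level"] = 2
    return migrated


if __name__ == "__main__":
    print(migrate_trt_options({"trt_build_heuristics_enable": True}))
```

An explicitly chosen optimization level is kept as-is in this sketch. The resulting dict could then be supplied as the TensorRT EP provider options when creating a session.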
2023-12-18 09:16:09 -08:00
Changming Sun
ad476d5a1f
Change Nuget packaging pipeline's build TRT job to download CUDA SDK on-the-fly (#18847)
### Description
Change Nuget packaging pipeline's build TRT job to download CUDA SDK
on-the-fly, so that we do not need to put a CUDA SDK in the build
machine's image.
2023-12-15 17:44:02 -08:00
Dmitri Smirnov
50cbcf9587
Build function bodies according to the imported global opset. (#18833)
### Description
Build function bodies according to the imported global opset.
The same applies to querying ONNX functions.

### Motivation and Context
This addresses issues:
https://github.com/microsoft/onnxruntime/issues/18781
https://github.com/microsoft/onnxruntime/issues/16438
2023-12-15 15:56:20 -08:00
RandySheriffH
2952cf82a5
Access map by iterator to silence sanity check. (#18835)
Use iterator to refer to the set.

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2023-12-15 14:57:55 -08:00
Jiajia Qin
8f7b89bd5b
[js/webgpu] Optimize NCHW layout for InstanceNormalization (#18123)
### Description
The changes in this PR include:
1) Fix f16 errors in InstanceNormalization with NCHW format.
2) Use vec to further optimize the original algorithm.
3) (Removed) Don't do layout conversion for InstanceNormalization in
JSEP, since InstanceNormalization itself is suitable for NCHW layout and
has better performance in our current implementation.

Tested on sd-vae-decoder-f16.onnx, the time drops from 314 ms to 285 ms.
The aggregate GPU profiling data is shown below (note that the data is
based on change 3):
Before:

Kernel | Time (ms) | Percentage (%)
-- | -- | --
Conv | 201.55 | 69.56
InstanceNormalization | 42.49 | 14.67
Transpose | 28.95 | 9.99
Mul | 5.69 | 1.96
Add | 3.82 | 1.32
MatMul | 3.27 | 1.13
Sigmoid | 2.24 | 0.77
Resize | 1.16 | 0.40
Softmax | 0.34 | 0.12
Cast | 0.24 | 0.08
Sum | 289.75 |  

After:

Kernel | Time (ms) | Percentage (%)
-- | -- | --
Conv | 205.44 | 79.43
InstanceNormalization | 18.24 | 7.05
Transpose | 17.64 | 6.82
Mul | 5.69 | 2.20
Add | 3.81 | 1.47
MatMul | 3.56 | 1.38
Sigmoid | 2.24 | 0.86
Resize | 1.19 | 0.46
Softmax | 0.59 | 0.23
Cast | 0.24 | 0.09
Sum | 258.65 |  

From the tables above, we can see that the time of two ops is greatly
reduced: InstanceNormalization and Transpose. The Transpose time is
reduced because each InstanceNormalization is surrounded by two reshape
ops in sd-vae-decoder-f16.onnx. Because JSEP prefers NHWC and
InstanceNormalization is a layout-sensitive op, two extra transpose ops
are inserted dynamically when executing this model. After this change,
those inserted transpose ops are no longer needed, so the overall
transpose time is reduced.
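
To put the profiling numbers quoted above in relative terms, here is a short sketch that computes the percentage reduction per kernel and end to end. The figures are copied from the description. The helper name `reduction_pct` is only illustrative.

```python
def reduction_pct(before_ms, after_ms):
    """Percentage by which the time dropped, relative to the old time."""
    return (before_ms - after_ms) / before_ms * 100.0


# (before, after) timings in milliseconds, taken from the description above.
timings = {
    "InstanceNormalization": (42.49, 18.24),
    "Transpose": (28.95, 17.64),
    "End to end": (314.0, 285.0),
}

if __name__ == "__main__":
    for name, (before_ms, after_ms) in timings.items():
        print(f"{name}: {reduction_pct(before_ms, after_ms):.1f}% less time")
```

Rounded to one decimal place, this gives about 57.1% for InstanceNormalization, 39.1% for Transpose, and 9.2% end to end.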
2023-12-15 11:26:15 -08:00
Jiajia Qin
4bbed4c71a
[js/webgpu] Fix f16 errors in unary (#18839)
### Description
This PR fixes the errors below:
```
no matching overload for operator > (vec4<f16>, vec4<f32>)
```
2023-12-15 11:25:12 -08:00
Changming Sun
f52668cc68
Disable mlas unit test in ARM64EC build (#18747)
### Description
Disable mlas unit test in ARM64EC build because the program has some
link errors. We will fix the errors later.
This PR only impacts Windows ARM64EC build. It has no impact on the
existing build pipelines.
2023-12-15 09:17:47 -08:00
wirthual
89168b830d
Fix CI error: The workflow is not valid. .github/workflows/rust-ci.yml (Line: 27, Col: 7): Unexpected value 'ORT_RUST_STRATEGY=download' (#18836)
Use colon for Env variable instead of =
2023-12-15 09:14:02 -08:00
Yang Gu
81ad1e6ac3
[js/webgpu] Fix typo of outputShapes in profiling message (#18837)
2023-12-15 08:57:48 -08:00
Peishen Yan
d111eed726
[WebNN EP] Change axis to axes for argMax/argMin (#18838)
In the latest spec, the axes option of WebNN's argMax and argMin
requires the use of a sequence long type. Replace axis option (long
type) with axes (sequence long type) for argMax and argMin.
2023-12-15 08:57:07 -08:00
Changming Sun
d795fc636c
FIX: Our cmake script didn't check googletest's hash (#18826)
2023-12-15 08:48:15 -08:00
Changming Sun
fc9ecb59db
Add Windows ARM build jobs to post merge pipeline (#18832)
### Description
Add Windows ARM build jobs to the post merge pipeline to validate that
our code is still compatible with these build settings.
2023-12-15 08:47:52 -08:00
pengwa
5eda79bdd3
Improve perf for stage3 training (#18099)
### Improve perf for stage3 training - first wave

Port the existing PythonOp/PythonOpGrad Python runner to C++, and also
introduce an unsafe run mode (to skip inplace, save for backward, and
materialized grad detection on the fly).

This reduces the overhead from XX~XXX us to X ~ lower end of XX us. In
LLAMA2 7B training with 8x32G V100, we have observed 6.7% gains over
PyTorch (1.59 it/s vs. 1.49 it/s).

Peak memory also dropped from 31GB to 28GB.

### Motivation and Context
2023-12-15 13:32:19 +08:00
Changming Sun
cbad4fe49b
Update absl and googletest (#18827)
### Description
Update absl and googletest to their latest versions to include some
changes:
1. A googletest CMake change that allows using external absl and re2.
2. Nullability enhancements that allow our clang-based static analysis
to detect many kinds of null pointer errors.



### Motivation and Context
To fix a C4744 link warning in our Windows pipelines.
```
LINK : warning C4744: 'static char const absl::lts_20230802::base_internal::FastTypeTag<bool>::dummy_var' has different type in 'd:\a\_work\_temp\abseil_cpp\abseil-cpp-20230802.0\absl\flags\parse.cc' and 'd:\a\_work\1\b\relwithdebinfo\_deps\googletest-src\googletest\src\gtest-all.cc': 'signed char' and 'unsigned char' [D:\a\_work\1\b\RelWithDebInfo\onnxruntime_mlas_test.vcxproj]
LINK : warning C4744: 'static char const absl::lts_20230802::base_internal::FastTypeTag<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > >::dummy_var' has different type in 'd:\a\_work\_temp\abseil_cpp\abseil-cpp-20230802.0\absl\flags\parse.cc' and 'd:\a\_work\1\b\relwithdebinfo\_deps\googletest-src\googletest\src\gtest-all.cc': 'signed char' and 'unsigned char' [D:\a\_work\1\b\RelWithDebInfo\onnxruntime_mlas_test.vcxproj]
LINK : warning C4744: 'static char const absl::lts_20230802::base_internal::FastTypeTag<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > >::dummy_var' has different type in 'd:\a\_work\_temp\abseil_cpp\abseil-cpp-20230802.0\absl\flags\internal\usage.cc' and 'd:\a\_work\1\b\relwithdebinfo\_deps\googletest-src\googletest\src\gtest-all.cc': 'signed char' and 'unsigned char' [D:\a\_work\1\b\RelWithDebInfo\onnxruntime_mlas_test.vcxproj]
LINK : warning C4744: 'static char const absl::lts_20230802::base_internal::FastTypeTag<bool>::dummy_var' has different type in 'd:\a\_work\_temp\abseil_cpp\abseil-cpp-20230802.0\absl\flags\internal\flag.cc' and 'd:\a\_work\1\b\relwithdebinfo\_deps\googletest-src\googletest\src\gtest-all.cc': 'signed char' and 'unsigned char' [D:\a\_work\1\b\RelWithDebInfo\onnxruntime_mlas_test.vcxproj]
LINK : warning C4744: 'static char const absl::lts_20230802::base_internal::FastTypeTag<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > >::dummy_var' has different type in 'd:\a\_work\_temp\abseil_cpp\abseil-cpp-20230802.0\absl\flags\internal\flag.cc' and 'd:\a\_work\1\b\relwithdebinfo\_deps\googletest-src\googletest\src\gtest-all.cc': 'signed char' and 'unsigned char' [D:\a\_work\1\b\RelWithDebInfo\onnxruntime_mlas_test.vcxproj]
LINK : warning C4744: 'static char const absl::lts_20230802::base_internal::FastTypeTag<int>::dummy_var' has different type in 'd:\a\_work\_temp\abseil_cpp\abseil-cpp-20230802.0\absl\flags\internal\flag.cc' and 'd:\a\_work\1\b\relwithdebinfo\_deps\googletest-src\googletest\src\gtest-all.cc': 'signed char' and 'unsigned char' [D:\a\_work\1\b\RelWithDebInfo\onnxruntime_mlas_test.vcxproj]
```
2023-12-14 16:15:07 -08:00
Yueqing Zhang
b42d4b8ea6
[VitisAI] 1. api compatbile 2. dynamic load onnx (#18470)
### Description

1. Add a backward-compatible API for compiling a model.
2. Load vitisai-ep.dll at run time.


### Motivation and Context

---------

Co-authored-by: Yueqing Zhang <yueqingz@amd.com>
Co-authored-by: Zhenze Wang <zhenzew@xilinx.com>
2023-12-14 14:43:41 -08:00
zesongw
6d5ee4d69b
[WebNN EP] Use explicit padding (#18688)
WebNN will remove the autoPad option, so we need to use explicit padding
values.
Compute the padding values for autoPad (same-upper, same-lower) for the
Pool, Conv and ConvTranspose ops.
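
As an illustration of what computing explicit padding involves, here is a minimal sketch for one spatial dimension of Conv or Pool. It follows the usual ONNX `SAME_UPPER` / `SAME_LOWER` definition and is not the code from this PR. The function name `same_padding` and its parameters are assumptions, and ConvTranspose, which needs a different formula, is not covered.

```python
import math


def same_padding(input_size, kernel_size, stride=1, dilation=1, mode="same-upper"):
    """Return (pad_begin, pad_end) for one spatial dimension.

    Illustrative helper: the output size is ceil(input / stride), and the
    total padding is whatever the sliding window needs to cover that output.
    """
    output_size = math.ceil(input_size / stride)
    effective_kernel = (kernel_size - 1) * dilation + 1
    total = max((output_size - 1) * stride + effective_kernel - input_size, 0)
    small, large = total // 2, total - total // 2
    if mode == "same-upper":
        return small, large  # the odd extra element goes at the end
    if mode == "same-lower":
        return large, small  # the odd extra element goes at the beginning
    raise ValueError(f"unsupported mode: {mode}")


if __name__ == "__main__":
    print(same_padding(4, 3, stride=2, mode="same-upper"))
    print(same_padding(4, 3, stride=2, mode="same-lower"))
```

The two modes differ only when the total padding is odd, in which case the extra element is placed at the end (same-upper) or at the beginning (same-lower).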
2023-12-14 14:33:44 -08:00
Wanming Lin
1db1c75048
[WebNN EP] WebNN only supports 4-D input and weight for Conv/ConvTranspose (#18703)
2023-12-14 14:33:19 -08:00
Changming Sun
b129f425fc
Fix test model URL issue (#18823)
### Description
The ONNX model zoo changed its directory structure, so some of our
pipelines are failing. To prevent this from happening again, we should
read the test data from a cache on local disk instead of downloading it
remotely every time.
2023-12-14 13:06:08 -08:00
Chi Lo
afe5cdc938
[TensorRT EP] Switch to enqueueV3 with support DDS output (copy version) (#18714)
It's branched off from
https://github.com/microsoft/onnxruntime/pull/17751 but removes the
KernelContext_SetOutput() API. It copies the output allocation buffer to
the kernel context.

---------

Co-authored-by: George Wu <jywu@microsoft.com>
2023-12-14 11:10:58 -08:00
Changming Sun
7386e21121
Replace some ORT_ENFORCE with ORT_THROW_IF_ERROR (#18812)
### Description
Replace some ORT_ENFORCE with ORT_THROW_IF_ERROR to get better error
messages.
2023-12-14 10:14:22 -08:00