onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-07 17:15:29 +00:00

Author	SHA1	Message	Date
Yi Zhang	14d7872ce9	Reuse T4 for Cuda12.2 training packaging pipeline. (#20244 ) ### Description The training CUDA 12.2 packaging pipeline https://dev.azure.com/aiinfra/Lotus/_build?definitionId=1308&_a=summary has been running out of memory ever since PR #19910. I tried other CPU agents, for example D64as_v5 (256G memory) and D32as_v4 (128G memory and 256G SSD temp storage), and they still run out of memory, as in the image below ![image](https://github.com/microsoft/onnxruntime/assets/16190118/5acde9ef-674f-4b6d-a1b3-b54647645083) But it works on T4, even though T4 only has 4 vCPUs, 28G memory and 180G temp storage, and it takes much more time. ### Motivation and Context Restore the CUDA 12.2 training packaging pipeline first. More time is needed to investigate the root cause. ### Other Clues These 2 compilation steps take nearly 6 minutes with Cuda 12.2 on T4, and the build runs out of memory on the CPU machine. @ajindal1 cuda12.2 on T4 ``` 2024-03-14T05:39:08.7726865Z [ 90%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim32_fp16_sm80.cu.o 2024-03-14T05:45:01.3223393Z [ 90%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim64_bf16_sm80.cu.o 2024-03-14T05:46:07.9218003Z [ 90%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim96_fp16_sm80.cu.o 2024-03-14T05:52:59.2387051Z [ 90%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/group_query_attention_impl.cu.o ``` But they could be finished in about one minute with Cuda 11.8 on CPU ``` cuda11.8 on CPU 2024-04-09T11:34:35.0849836Z [ 90%] Building CUDA object 
CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim32_fp16_sm80.cu.o 2024-04-09T11:35:53.6648154Z [ 90%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim64_bf16_sm80.cu.o cuda11.8 on GPU 024-03-13T12:16:33.4102477Z [ 90%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim32_fp16_sm80.cu.o 2024-03-13T12:19:58.8268272Z [ 90%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim64_bf16_sm80.cu.o ```	2024-04-10 09:21:40 +08:00
Dmitri Smirnov	7d8dea9f10	Reduce Heap contention in StringNormalizer (#20182 ) ### Description Re-use pre-computed and pre-allocated buffers for UNICODE conversions. Make sure we do not introduce unnecessary intermediate `std::string` instances. Create a Utf8Generic converter for use with non-Windows platforms. ### Motivation and Context This reduces heap contention for a P1 customer. ![image](https://github.com/microsoft/onnxruntime/assets/11303988/fd39fb01-7361-47d2-8f83-69dbc3bbc65c)	2024-04-09 16:10:31 -07:00
pengwa	81005e2c92	Optimize constant sharing perf (#20143 ) ### Optimize constant sharing perf Avoid renaming the first time we detect a constant pattern. Currently, every time ConstantSharing runs, for each initializer whose pattern does not exist yet, we create a new NodeArg with a unique name. Later, if other initializers share the same pattern, they are replaced by that NodeArg. The problem is that even when there are no real constant sharing cases, we still modify the graph for each initializer. This is not needed.	2024-04-09 12:04:36 +08:00
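The idea in the commit above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the ONNX Runtime implementation: `share_constants`, the dictionary of initializers, and the `shared_N` naming scheme are hypothetical names invented for illustration. The point it shows is that a shared name is created only when a pattern is seen a second time, so a graph with no duplicate constants is left untouched.

```python
def share_constants(initializers):
    """Toy constant sharing with lazy renaming (hypothetical helper).

    `initializers` maps an initializer name to a hashable "pattern",
    e.g. (dtype, value). Returns (name_map, renamed_count), where
    name_map tells each consumer which name it should reference.
    """
    first_seen = {}   # pattern -> name of the first initializer with it
    shared_name = {}  # pattern -> shared name, created on the second hit
    name_map = {}

    for name, pattern in initializers.items():
        if pattern not in first_seen:
            # First occurrence: remember it, but do not touch the graph yet.
            first_seen[pattern] = name
            name_map[name] = name
            continue
        if pattern not in shared_name:
            # Second occurrence: only now create the shared name and
            # redirect the first initializer to it.
            shared = f"shared_{len(shared_name)}"
            shared_name[pattern] = shared
            name_map[first_seen[pattern]] = shared
        name_map[name] = shared_name[pattern]

    renamed = sum(1 for old, new in name_map.items() if old != new)
    return name_map, renamed
```

With no repeated pattern, `renamed` is zero and every name maps to itself, which is the case the commit optimizes for.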
danyue	07b5377f7c	Add INT16 and UINT16 compatibility for relu_quantizelinear (#20187 ) ### Description There is a problem in the relu_quantizelinear transformer that causes wrong results. The purpose of this PR is to solve this problem. ### Motivation and Context The transformer does not take into account the situation where Q's zero point is tensor(int16) or tensor(uint16), so when this happens, an error will occur. How to verify:
```python
import onnx
import onnxruntime as ort
import numpy as np

model_name = 'relu_quantize_testcase.onnx'
model = onnx.load(model_name)
# np.random.rand takes the dimensions as separate arguments and returns float64
ort_input0 = np.random.rand(1, 64, 64, 128).astype(np.float32)

# infer with GraphOptimizationLevel=0
so = ort.SessionOptions()
so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL
ort_session = ort.InferenceSession(
    model_name, providers=["CPUExecutionProvider"], sess_options=so
)
outputs = [x.name for x in ort_session.get_outputs()]
ort_outs_mod = ort_session.run(
    outputs, {'generator/conv2d_input/conv2d/Conv2D:0': ort_input0}
)
del ort_session

# infer with GraphOptimizationLevel=default
model_orig = onnx.load(model_name)
ort_session_orig = ort.InferenceSession(model_orig.SerializeToString())
outputs_orig = [x.name for x in ort_session_orig.get_outputs()]
ort_outs_orig = ort_session_orig.run(
    outputs_orig, {'generator/conv2d_input/conv2d/Conv2D:0': ort_input0}
)

# diff
print(np.linalg.norm(ort_outs_mod[0].astype(np.float32) - ort_outs_orig[0].astype(np.float32)))
del ort_session_orig
```
[relu_quantize_testcase.zip](https://github.com/microsoft/onnxruntime/files/14848160/relu_quantize_testcase.zip) --------- Co-authored-by: genmingz <genming.zhong@amd.com>	2024-04-08 19:41:43 -07:00
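Why the zero point's type matters can be sketched in plain Python. The following is an illustration under assumptions, not the transformer's code: it assumes that dropping a Relu in front of QuantizeLinear is only safe when the zero point equals the minimum of the quantized type, and the helper names (`quantize`, `relu`, `relu_is_redundant`, `INT_RANGES`) are hypothetical. Under that assumption the condition has to be checked per type, including int16 and uint16.

```python
# Value ranges of the integer types discussed in the commit.
INT_RANGES = {
    "int8": (-128, 127),
    "uint8": (0, 255),
    "int16": (-32768, 32767),
    "uint16": (0, 65535),
}


def quantize(x, scale, zero_point, dtype):
    """Quantize one value: round, shift by the zero point, saturate."""
    qmin, qmax = INT_RANGES[dtype]
    q = round(x / scale) + zero_point  # Python's round() rounds half to even
    return max(qmin, min(qmax, q))


def relu(x):
    return max(0.0, x)


def relu_is_redundant(dtype, zero_point):
    """Assumed fusion condition: the zero point sits at the type minimum,
    so saturation already maps every negative input to the value for 0."""
    return zero_point == INT_RANGES[dtype][0]


if __name__ == "__main__":
    scale = 0.1
    for x in (-3.0, -0.04, 0.0, 0.26, 5.0):
        with_relu = quantize(relu(x), scale, -32768, "int16")
        without_relu = quantize(x, scale, -32768, "int16")
        print(x, with_relu, without_relu)
```

When the zero point is not at the type minimum (for example 0 for int16), negative inputs quantize to different values with and without the Relu, so removing the Relu would change the result.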
pengwa	41acd8c543	Support more ops for recompute (#20234 ) ### Support more ops for recompute To cover Mistral model, and support padding elimination ops.	2024-04-09 09:24:48 +08:00
Adam Louly	22a61a3cf5	Fix Mixtral Parity test to keep it consistent with Transformers. (#20210 ) ### Description I recently opened a PR in hf transformers repo to fix an issue on the indexing part. https://github.com/huggingface/transformers/issues/29857 onnx exporter was failing because of the tolist() conversion so we had to remove it. I found out that the code was also a part of our codebase so this PR is to keep the code consistent.	2024-04-08 13:04:12 -07:00
wejoncy	908a76d675	fix "4bit quantization scales and zeropoint tensor shape" (#19986 )	2024-04-08 10:15:28 -07:00
Jiajie Hu	23d3afd4fe	[js/webgpu] Implement com.microsoft.RotaryEmbedding (#20209 ) ### Description https://github.com/microsoft/onnxruntime/blob/main/docs/ContribOperators.md#commicrosoftrotaryembedding ### Motivation and Context As per customer request, this helps Phi-2 and Gemma.	2024-04-08 09:11:26 -07:00
cloudhan	e19c778934	Improve KE for commandline and programmatically tuning dispatch (#18778 )	2024-04-08 11:08:59 +08:00
Ye Wang	cc3faba616	Support seq_len > 64K in rotary embedding cuda kernel (#20204 )	2024-04-05 19:52:55 -07:00
Francesco	6af02ae06a	Remove non-existing function call (#19416 ) This function call is confusing, since it calls a function that has no definition. It was correctly replaced from compute_data to compute_range, but the function call was reintroduced in a later PR. ### Description Problem as described in [this issue](https://github.com/microsoft/onnxruntime/issues/18893 ) In the examples, different calls of compute_range() from calibrate.py can be found, also in calibrate.py itself. The problem is that it was [replaced here](https://github.com/microsoft/onnxruntime/pull/16550/files#diff-75e84436a983e17527f8b5bc585087e7ad75b3b515c2101c2a82dcaecca490de ) from `compute_range()` to `compute_data() -> TensorsData` and then mistakenly [added as call here](https://github.com/microsoft/onnxruntime/pull/17029/files#diff-75e84436a983e17527f8b5bc585087e7ad75b3b515c2101c2a82dcaecca490de ). ### Motivation and Context I suggest in this PR to remove this confusing call `self.calibrate_range()` in calibrate.py. Once it is removed and packaged, the examples from the onnx-runtime-examples repository must be adapted, since they are already not working. Examples of `compute_range()` in the examples are linked in [this issue](https://github.com/microsoft/onnxruntime/issues/18893 ).	2024-04-05 19:48:48 -07:00
Adrian Lizarraga	05d97e8d18	Update QNN python packages to use QNN SDK version 2.19.2 (#20213 ) ### Description Update QNN python packages to use QNN SDK version 2.19.2. ### Motivation and Context Our CI builds already use QNN SDK version 2.19.2. We should make sure the ort-nightly-qnn python packages are also built with the same QNN SDK version.	2024-04-05 17:15:25 -07:00
Yi Zhang	23a5d0a305	Extend time out in Windows GPU packaging jobs (#20207 ) ### Description Extend the Windows GPU packaging job build timeout to 6 hours, and the test stage timeout to 3 hours. ### Motivation and Context There are still a few timeout issues after refactoring. The probability is about 20% in https://dev.azure.com/aiinfra/Lotus/_build?definitionId=84. I found that the build can finish in 4 hours when it becomes slow, https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=434340&view=logs&j=0c6ee496-b38e-55a9-3699-12934156e90f, although in most cases it only takes about 30 minutes. This is unlike before, when the build couldn't be completed at all. So, in this PR, I extend the timeout to 6 hours. One interesting thing: if one Windows GPU job becomes slow, all other Windows GPU jobs in the same run become slow too. So I suspect it has something to do with ADO or virtualization. That is, it's not completely random. https://dev.azure.com/aiinfra/Lotus/_build?definitionId=841	2024-04-06 08:03:42 +08:00
Andrew Grigorev	a6611409cc	Fix HalideIR title in third party notices reference (#20190 )	2024-04-05 11:12:43 -07:00
dependabot[bot]	2a323eb670	Bump Sixlabors.ImageSharp from 2.1.1 to 2.1.7 in /csharp/sample/Microsoft.ML.OnnxRuntime.ResNet50v2Sample (#19805 ) Bumps [Sixlabors.ImageSharp](https://github.com/SixLabors/ImageSharp) from 2.1.1 to 2.1.7. Release notes (sourced from [Sixlabors.ImageSharp's releases](https://github.com/SixLabors/ImageSharp/releases)): v2.1.7: [release/2.1] Disallow allocation attempts of unrepresentable sizes by @antonfirsov in SixLabors/ImageSharp#2553; [release/2.1] Tiff decoding robustness improvements (#2550) by @antonfirsov in SixLabors/ImageSharp#2554; [release/2.1] PBM decoder robustness improvements and BufferedReadStream observability by @antonfirsov in SixLabors/ImageSharp#2555; Backport 2681 by @JimBobSquarePants in SixLabors/ImageSharp#2688. v2.1.6: Backport - Handle EOF in Jpeg bit reader when data is bad to prevent DOS attack, by @JimBobSquarePants in SixLabors/ImageSharp#2524. v2.1.5: Backport #2501 by @JimBobSquarePants in SixLabors/ImageSharp#2509. v2.1.4: Backport WebP fix to 2.1 by @antonfirsov in SixLabors/ImageSharp#2420. v2.1.3: V2 Backport: 2133, 2154 by @JimBobSquarePants in SixLabors/ImageSharp#2157. v2.1.2: Backport - Issue 2123 by @JimBobSquarePants in SixLabors/ImageSharp#2126. Commits: `fa7d712` Merge pull request #2688 from SixLabors/js/backport-2681; `36b3533` Use correct property to disable upstream warnings.; `94bb761` Update ImageSharp.csproj; `3ea2574` Update PngDecoderCore.cs; `e74a55f` [release/2.1] PBM decoder robustness improvements and BufferedReadStream obse...; `749b1c0` [release/2.1] Tiff decoding robustness improvements (#2550) (#2554); `3064b78` Merge pull request #2553 from SixLabors/backport/2.1.x/2545; `f36ec12` Disallow allocation attempts of unrepresentable sizes; `688e242` Merge pull request #2524 from SixLabors/js/backport-fix-jpeg-dos; `0f17a8b` Handle EOF in Jpeg bit reader when data is bad to prevent DOS attack. Additional commits viewable in the [compare view](https://github.com/SixLabors/ImageSharp/compare/v2.1.1...v2.1.7). Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-04-05 11:11:52 -07:00
Hector Li	1ccb164c12	Improve the script to add Q, DQ nodes around EPContext node (#20107 ) Improve the script to add Q, DQ nodes around the EPContext node so that the wrapper model uses float data as inputs and outputs. Users don't need to quantize or dequantize the data in their application.	2024-04-05 10:12:01 -07:00
Guenther Schmuelling	c529e05e38	fix ConvTranspose 1D (#20194 )	2024-04-05 10:05:32 -07:00
dependabot[bot]	4f2d454211	Bump Sixlabors.ImageSharp from 2.1.1 to 2.1.7 in /csharp/sample/Microsoft.ML.OnnxRuntime.FasterRcnnSample (#19806 ) Bumps [Sixlabors.ImageSharp](https://github.com/SixLabors/ImageSharp) from 2.1.1 to 2.1.7. Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-04-05 08:32:18 -07:00
Edward Chen	2b3071119a	Add onnxruntime/test/run_benchmark.py helper script. (#19234 ) ### Description Add onnxruntime/test/run_benchmark.py helper script to repeat benchmark runs until a target coefficient of variance is reached. It works with [Google Benchmark](https://github.com/google/benchmark) programs like `onnxruntime_mlas_benchmark`. ### Motivation and Context Sometimes there is variability in benchmark run results. This automates the repeated running needed to get results that are stable enough.	2024-04-05 07:02:01 -07:00
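The "repeat until stable" loop described in the commit above might look roughly like the sketch below. It is not the contents of `run_benchmark.py`: the function names, the default thresholds, and the `run_once` callable are hypothetical, and it assumes the commit's "coefficient of variance" means standard deviation divided by mean.

```python
import statistics


def coefficient_of_variation(samples):
    """Sample standard deviation divided by the mean (mean must be non-zero)."""
    return statistics.stdev(samples) / statistics.mean(samples)


def run_until_stable(run_once, target_cv=0.05, min_runs=3, max_runs=50):
    """Call `run_once()` repeatedly and collect its timings.

    Stops as soon as at least `min_runs` samples exist and their
    coefficient of variation is at or below `target_cv`, or when
    `max_runs` samples have been collected. Returns the samples.
    """
    samples = []
    while len(samples) < max_runs:
        samples.append(run_once())
        if (len(samples) >= min_runs
                and coefficient_of_variation(samples) <= target_cv):
            break
    return samples
```

In practice `run_once` would launch the benchmark program and parse one timing from its output. The `max_runs` cap is a design choice of this sketch so that a noisy machine cannot make the loop run forever.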
Hans	6abfb6b928	[js/rn] Support load external data (#20090 ) Support load external data by passing local model path	2024-04-05 05:55:03 -07:00
Scott McKay	f61cca1b8f	NNAPI: Improve MatMul diagnostic output (#19721 ) ### Description Re-order so that we don't get two messages for the one node. Currently the batched matmul 'not supported' message will appear for valid 2D input, which can be confusing. Change the order so we only check if batched matmul can be used when the input ranks are > 3, as that is one of the requirements. `c311d1faf5/onnxruntime/core/providers/nnapi/nnapi_builtin/builders/op_builder_helpers.cc (L257-L264)`	2024-04-04 21:58:39 -07:00
Thomas Boby	254bdbb19d	OneDNN/dnnl: Fix filepath after dnnl move (#20086 ) ### Description This adjusts the path used in the nuget script for dnnl to the new location of the file. There isn't a CI pipeline for this as far as I can tell, and I can't easily confirm this change works on master, so please check. ### Motivation and Context It is currently not possible to build onednn nuget packages. It's possible that the correct action would be to move the file not fix this path, but I'm not familiar enough with the repository layout. --------- Co-authored-by: Tianlei Wu <tlwu@microsoft.com>	2024-04-04 21:24:49 -07:00
Yi Zhang	4ea54b82f9	[Fix] Upload training CUDA daily wheel (#20183 )	2024-04-03 13:18:26 +08:00
Andrew Fantino	7303a90f49	Fix build errors from date/date.h C++20 compatibility (#20139 ) ### Description For C++ standards >= 20, use `std::chrono::operator<<` in place of `date::operator<<` to fix ambiguous operator compile error. ### Motivation and Context The external dependency HowardHinnant/date has a conflict with std::chrono for >=C++20. Solves #20137	2024-04-02 22:10:25 -07:00
Yi Zhang	dae77e6014	Support building Windows CUDA with Ninja (#20176 ) ### How to run it locally 1. conda install ninja 2. "C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\Build\vcvarsall.bat" x64 3. python.exe {ort_repo}\tools\ci_build\build.py --config RelWithDebInfo --build_dir {ort_repo}\build_cuda --skip_submodule_sync --build_csharp --update --parallel --cmake_generator "Ninja" --build_shared_lib --enable_onnx_tests --enable_pybind --build_java --build_nodejs --use_cuda "--cuda_home=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8" --enable_cuda_profiling --cmake_extra_defines CMAKE_CUDA_ARCHITECTURES=60 4. cd build_cuda\RelWithDebInfo 5. cmake --build . -j16 ### Motivation and Context In packaging pipelines, we often come across a random issue where building with CUDA on Windows takes too much time, although it has been reduced considerably by moving the build to a CPU machine. We're planning to build with Ninja instead of msbuild in the packaging pipelines so that nvcc can run in parallel. This is the first step: supporting it locally.	2024-04-03 11:19:31 +08:00
Yulong Wang	fa1917b81b	[js/webgpu] add validation to workgroup size (#20110 ) ### Description add validation to workgroup size in `shaderHelper.mainStart()`.	2024-04-02 19:29:20 -07:00
Shubham Bhokare	be831e1ba3	Export of Openai Whisper with batched prompts (#19854 ) Adds an example to demonstrate the export of the OpenAI Whisper implementation with batch_size > 1 and the addition of prompts for each audio snippet. Also handles the scenario where prompts are not of the same size. For example, if our prompt ids are [p1_id_1, p1_id_2] and [p2_id_1], the final decoder_input_ids will look like this after padding: `[prev_token, p1_id_1, p1_id_2, start_token, lang_token, transcribe_token] [prev_token, p2_id_1, PAD_TOKEN, start_token, lang_token, transcribe_token]` --------- Co-authored-by: kunal-vaishnavi <115581922+kunal-vaishnavi@users.noreply.github.com>	2024-04-02 17:01:48 -07:00
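The padding layout described in the commit above can be reproduced with a small helper. This is a sketch only: `build_decoder_input_ids` and its parameters are hypothetical names, and the tokens are passed in as plain values rather than taken from a real Whisper tokenizer.

```python
def build_decoder_input_ids(prompts, prev_token, start_token,
                            lang_token, transcribe_token, pad_token):
    """Build one decoder_input_ids row per prompt.

    Each row is: prev_token, the prompt ids, padding up to the longest
    prompt in the batch, then the start, language and transcribe tokens.
    """
    longest = max(len(prompt) for prompt in prompts)
    rows = []
    for prompt in prompts:
        padding = [pad_token] * (longest - len(prompt))
        rows.append([prev_token, *prompt, *padding,
                     start_token, lang_token, transcribe_token])
    return rows


if __name__ == "__main__":
    batch = build_decoder_input_ids(
        [["p1_id_1", "p1_id_2"], ["p2_id_1"]],
        prev_token="prev_token", start_token="start_token",
        lang_token="lang_token", transcribe_token="transcribe_token",
        pad_token="PAD_TOKEN",
    )
    for row in batch:
        print(row)
```

Padding goes after the prompt ids and before the start token, so every row has the same length and the fixed tokens line up in the same columns across the batch.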
Rachel Guo	19793de1b3	#19921 [Dup] LLC Core count calculations updated (#20171 ) ### Description See #19921. This PR only addresses one comment: https://github.com/microsoft/onnxruntime/pull/19921#discussion_r1543398640 Since the original is an external branch, another pull request had to be opened for this. --------- Co-authored-by: Sai Kishan Pampana <sai.kishan.pampana@intel.com> Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net> Co-authored-by: Jian Chen <cjian@microsoft.com>	2024-04-02 16:53:47 -07:00
Dmitri Smirnov	12e2538065	Add new SessionOptions config entry to disable specific transformers and rules (#20135 ) ### Description ### Motivation and Context Certain transformers slow down session loading time while providing no runtime perf benefits. Allow clients to exclude them.	2024-04-02 16:33:05 -07:00
Chi Lo	e916929371	[TensorRT EP] Address compiler warnings on Windows (#20134 ) A previous [PR](https://github.com/microsoft/onnxruntime/pull/19663) changed the MSVC compiler warning level from set_msvc_c_cpp_compiler_warning_level(3) to set_msvc_c_cpp_compiler_warning_level(4) when using the CUDA EP (this also applies to the TRT EP). Some warnings still need to be addressed in the TRT EP code.	2024-04-02 10:39:46 -07:00
Xu Xing	a2998e5d42	[js/webgpu] Use global id in attention and instance-norm (#20008 ) ### Description ### Motivation and Context	2024-04-02 01:42:39 -07:00
Adam Pocock	262b6bd3b7	[java][DML EP] Modifying dml_provider_factory.h so it can compile as a C header file (#20157 ) ### Description The dml_provider_factory header file can't be used in C programs as it defines C++ inline operators. This PR rearranges that header file so that it looks like valid C when used from C, and also makes a couple of small modifications to the Java code so it correctly binds to the DML EP at build time. I'm having some difficulty testing it as I think it's pulling in the old version of DirectML on my computer and I can't figure out what the library loading path is in Java to make it look at the recent version I downloaded. So the test I added fails with: ``` InferenceTest > testDirectML() FAILED ai.onnxruntime.OrtException: Error code - ORT_RUNTIME_EXCEPTION - message: Exception during initialization: <path-to-ort>\onnxruntime\core\providers\dml\DmlExecutionProvider\src\AbiCustomRegistry.cpp(518)\onnxruntime.dll!00007FFF74819333: (caller: 00007FFF74793509) Exception(3) tid(4f58) 80070057 The parameter is incorrect. at app//ai.onnxruntime.OrtSession.createSession(Native Method) at app//ai.onnxruntime.OrtSession.<init>(OrtSession.java:74) at app//ai.onnxruntime.OrtEnvironment.createSession(OrtEnvironment.java:236) at app//ai.onnxruntime.OrtEnvironment.createSession(OrtEnvironment.java:221) at app//ai.onnxruntime.InferenceTest.openSessionSqueezeNet(InferenceTest.java:1961) at app//ai.onnxruntime.InferenceTest.runProvider(InferenceTest.java:665) at app//ai.onnxruntime.InferenceTest.testDirectML(InferenceTest.java:657) ``` But it does correctly compile, and this error seems very similar to other issues with the DML provider when it doesn't like a model due to the loaded library being old. The test is using the squeezenet file that's been in the repo since 2019. If someone can help me figure out how to get the right version of DML in the library path I can test it more on my end. 
I tried adding the folder with the new version into the system path, but I'm not very familiar with Windows' library loading behaviour. ### Motivation and Context Fixes #19656 to allow use of the DirectML EP from ORT Java. cc @martinb35	2024-04-01 21:58:50 -07:00
Xiaoyu	3979f53aa4	Update api backward compatibility (#20136 ) ### Description Update api backward compatibility ### Motivation and Context Update api backward compatibility	2024-04-01 21:37:56 -07:00
wangshuai09	3e2b659fce	[CANN] Add dump_om_model flag (#20075 ) ### Description Adds a new `dump_om_model` flag for the CANN EP, which defaults to "True". ### Motivation and Context When building an ONNX model with the CANN EP, the intermediate OM (offline model for Ascend NPU) is automatically saved. Some users don't want to dump the OM when resources are limited. This PR resolves this with `dump_om_model=False`.	2024-04-01 21:35:29 -07:00
Dhruv Matani	742d413586	Fix bug related to export failure for DynamicQuantizeLSTM [issue 15465] (#20160 ) ### Description See issue 15465: https://github.com/microsoft/onnxruntime/issues/15465 This PR just applies the workaround suggested in the thread that I and numerous others on the thread have validated to work for them and allows them to successfully export a PyTorch model with LSTM layers that are dynamically quantized by ONNX. ### Motivation and Context It is not possible to successfully export a dynamically quantized LSTM model that I have trained for use in the onnx runtime without this change. Currently, this workaround lives as a local change in my python package directory, and makes it basically impossible for anyone else at the place I work at to successfully export the quantized model that I am exporting. See issue 15465: https://github.com/microsoft/onnxruntime/issues/15465 Co-authored-by: Dhruv Matani <dhruv.matani@grammarly.com>	2024-04-01 21:33:00 -07:00
Yufeng Li	91654988fd	optimize threading of mha (#20088 ) ### Description The cost computation of ComputeVxAttentionScore is wrong. It should be sequence_length * v_head_size * total_sequence_length instead of sequence_length * v_head_size * sequence_length. The PR also fine-tunes the cost computation. On my local box with an i9 CPU, performance is the same as the unfused version, but it is much faster on an Azure VM with 16 threads. ### Motivation and Context https://github.com/microsoft/onnxruntime/issues/19924	2024-04-01 21:32:36 -07:00
Atanas Dimitrov	9d06e1bfa4	Label encoder fusion (#19761 ) ### Description Created a new `LabelEncoderFusion` pass. This is useful for models produced by automatic conversion tools used in data science: sometimes the resulting model contains consecutive `LabelEncoder`-s. To merge 2 `LabelEncoder`-s, the optimizer propagates the outputs of the first encoder through the second one. ### Motivation and Context This enhances the capabilities of the `onnxruntime::optimizer` by fusing consecutive `LabelEncoder` nodes. ### Fusion examples ``` Applying fusion node1: (a,C) (b,B) (c,A) -> Default: _Unused node2: (A,1) (B,2) (C,3) -> Default: -1 fused: (a,3) (b,2) (c,1) -> Default: -1 Applying fusion node1: (a,C) (b,B) (c,A) -> Default: D node2: (A,a) (B,b) (C,c) (D,d) -> Default: default fused: (a,c) (b,b) (c,a) -> Default: d Applying fusion node1: (a,0) (b,1) (c,2) -> Default: -1 node2: (2,a) (1,b) (0,c) -> Default: default fused: (a,c) (b,b) (c,a) -> Default: default Applying fusion node1: (a,3) (b,2) (c,1) -> Default: -1 node2: (1,a) (2,b) (3,c) -> Default: d fused: (a,c) (b,b) (c,a) -> Default: d ``` --------- Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>	2024-04-01 09:41:10 -07:00
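The fusion rule in the commit above (propagate the first encoder's outputs, including its default, through the second encoder) can be sketched with plain Python dictionaries. This is a minimal illustration rather than the optimizer's C++ implementation; `fuse_label_encoders` is a hypothetical name, and each encoder is modelled as a mapping plus a default value.

```python
from typing import Dict, Hashable, Tuple


def fuse_label_encoders(
    map1: Dict[Hashable, Hashable],
    default1: Hashable,
    map2: Dict[Hashable, Hashable],
    default2: Hashable,
) -> Tuple[Dict[Hashable, Hashable], Hashable]:
    """Return the mapping and default of a single encoder equal to encoder2(encoder1(x))."""

    def second(value: Hashable) -> Hashable:
        # Apply the second encoder to one value, falling back to its default.
        return map2.get(value, default2)

    # Every output of the first encoder is pushed through the second one.
    fused_map = {key: second(value) for key, value in map1.items()}
    # Inputs unknown to the first encoder yield default1, which the second
    # encoder then maps like any other value.
    fused_default = second(default1)
    return fused_map, fused_default
```

Applied to the first two examples from the commit message, this sketch gives the same fused mappings and defaults as listed there.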
Yi Zhang	523ef04240	enable lto in Python-CUDA-Packaging Pipeline (#20164 ) ### Description All Windows CUDA packaging jobs are now running well except the [Python-CUDA-Packaging pipeline](https://dev.azure.com/aiinfra/Lotus/_build?definitionId=1299&_a=summary). A comparison shows that enable_lto is not set in that pipeline, which might be one root cause of the random hang. ### Motivation and Context	2024-04-01 15:42:28 +08:00
Sumit Agarwal	e1e292f94c	[DML EP] DML Graph Serialization Bug (#19748 ) ### Description This pull request addresses several issues: - The DML Graph's nodes were not sorted in topological order, leading to crashes during deserialization when a child node preceded its parent node. This PR resolves this issue by implementing a topological sorting algorithm before serialization. - During the `RemoveUnconnectedNodes` process: - we update `intermediateEdge.FromNodeIndex`. Additionally, we must update `intermediateEdge.Name` when it includes `intermediateEdge.FromNodeIndex`, as serialization/deserialization relies heavily on edge names. - we also eliminate unused edges. Consequently, we must erase the (now unused) inputs from the corresponding maps `serializedGraphInputIndexToSubgraphInputIndex` and `serializedGraphLargeConstantNameToSubgraphInputIndex`. ### Motivation and Context Why is this change required? What problem does it solve? A few public ONNX Zoo models were crashing during deserialization. --------- Co-authored-by: Jeff Bloomfield <38966965+jeffbloo@users.noreply.github.com>	2024-03-31 14:41:42 -07:00
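The commit above does not say which topological sorting algorithm the DML EP uses. As a generic illustration of the idea (every parent is emitted before its children), here is a sketch of Kahn's algorithm in Python. The function name and the edge-list representation are assumptions made for this example only, not the DML graph data structures.

```python
from collections import deque
from typing import List, Tuple


def topological_order(num_nodes: int, edges: List[Tuple[int, int]]) -> List[int]:
    """Order nodes 0..num_nodes-1 so that each (parent, child) edge points forward."""
    children: List[List[int]] = [[] for _ in range(num_nodes)]
    in_degree = [0] * num_nodes
    for parent, child in edges:
        children[parent].append(child)
        in_degree[child] += 1

    # Start from nodes that have no parents.
    ready = deque(node for node in range(num_nodes) if in_degree[node] == 0)
    order: List[int] = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for child in children[node]:
            in_degree[child] -= 1
            # A child becomes ready once all of its parents have been emitted.
            if in_degree[child] == 0:
                ready.append(child)

    if len(order) != num_nodes:
        raise ValueError("graph contains a cycle; no topological order exists")
    return order
```

Serializing nodes in the returned order means a reader never meets a child before its parent, which is the failure mode the commit describes.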
kunal-vaishnavi	a0ebd5fee5	Add flash attention v2 and INT4 CUDA for LLaMA E2E benchmarking (#20149 ) ### Description This PR adds flash attention v2 and support for INT4 CUDA benchmarking in PyTorch. ### Motivation and Context The [flash attention v2](https://github.com/Dao-AILab/flash-attention) algorithm helps improve model performance in PyTorch. Support for INT4 CUDA in PyTorch is done through the [`bitsandbytes`](https://github.com/TimDettmers/bitsandbytes) package.	2024-03-29 23:09:37 -07:00
mo-ja	00244ea143	fix quantization errors of ConvTranspose with per_channel=True (#19996 ) ### Description - update axis value for per_channel quantization of QDQConv - we should use `axis=1` for ConvTranspose operator. ### Motivation and Context - this PR fixes https://github.com/microsoft/onnxruntime/issues/19694, which I have opened	2024-03-29 21:36:15 -07:00
Ye Wang	f3a864217f	Fix MoE tensor parallelism tests (#20147 ) ### Description Previously the expert weights were stored in row-major order. With the updated cutlass extension introduced by https://github.com/microsoft/onnxruntime/pull/20108, weights are stored in column-major order, which aligns with the PyTorch implementation. This change fixes the way the tensors are sliced across shards. ### Motivation and Context	2024-03-29 16:10:09 -07:00
Jeff Bloomfield	2f31560430	Enable generic feature level devices in DML EP (#20114 ) ### Description Enable NPUs supporting DXCORE_ADAPTER_ATTRIBUTE_D3D12_GENERIC_ML and D3D_FEATURE_LEVEL_1_0_GENERIC with the DML EP. This also begins ingesting DX headers through the DirectX-Headers repo. Note that this includes an update to cgamanifest.json for onnx-tensorrt, which is triggered during regeneration due to prior changes to deps.txt. ### Motivation and Context	2024-03-29 14:37:30 -07:00
cao lei	604b284261	add API function GetAliasMap and ReleaseAliasMap in OrtCustomOp (#20145 ) ### Description Add API function GetAliasMap and ReleaseAliasMap in OrtCustomOp ### Motivation and Context Add API function GetAliasMap and ReleaseAliasMap in OrtCustomOp	2024-03-29 13:49:56 -07:00
inisis	8396845806	fix shape inference bug (#19848 ) ### Description for nodes like add, their input should be merged dynamically ### Motivation and Context when doing shape inference, for nodes like Add, currently when doing _onnx_infer_single_node, their inputs are generated from last node's output, but they should be merged.	2024-03-29 13:06:27 -07:00
Adrian Lizarraga	b1a5eb255e	[Quant] Fix accuracy_level config option for MatMul 4bits quantizer (#20146 ) ### Description Fixes code that extracts the accuracy level when creating a MatMulNBits node in the `DefaultWeightOnlyQuantizer` class. ### Motivation and Context Error from line 443: `AttributeError: 'DefaultWeightOnlyQuantizer' object has no attribute 'accuracy_level'`. The solution is to access `self.config.accuracy_level` instead of `self.accuracy_level`. Relevant commit: https://github.com/microsoft/onnxruntime/pull/19106	2024-03-29 11:54:55 -07:00
Ye Wang	17919717b5	add QMoE (#20108 ) ### Description 1. Introduce the latest cutlass extension from TRTLLM, which gives us the opportunity to upgrade cutlass (to 3.4) from the MoE side. 2. Fix a Windows build issue. 3. Add an Int4 MoE op and UT. ### Motivation and Context	2024-03-29 10:24:19 -07:00
pengwa	2092bebc78	Fix transformer layer detection for recompute (#20106 ) ### Fix transformer layer detection for recompute The original logic failed to detect the layer boundary nodes in the Mistral model. This PR simplifies the search by using stronger pattern matching, to make sure it is flexible enough to cover different transformer variants. It also adds a UT, and adds a warning when the user enables layerwise recompute but no layer boundary nodes are found.	2024-03-29 17:44:38 +08:00
cao lei	2a184ac1a1	use OrtCustomOp's new API GetMayInplace in CreateKernelCreateInfo (#20037 ) ### Description use OrtCustomOp's new API GetMayInplace in CreateKernelCreateInfo to hook the inplace map of custom ops ### Motivation and Context This PR is to use OrtCustomOp's new API GetMayInplace in CreateKernelCreateInfo to hook the inplace map of custom ops	2024-03-28 20:45:37 -07:00
Adam Pocock	2f82400b13	[java] Java 21 build support (#19876 ) ### Description Bump spotless and the Gradle wrapper to 6.25.0 and 8.6 respectively to allow compiling ORT on Java 21. The build still targets Java 8. I'm not sure if there will be CI changes necessary to use this PR, specifically for the Gradle version as I don't know if that is cached somewhere earlier in the CI build process. The new Gradle version adds a warning that using `--source` and `--target` to select the Java language version is obsolete which is annoying, we can fix it if we decide to only allow building on newer versions of Java, while still supporting running on Java 8. ### Motivation and Context Java 21 is the latest LTS release of Java and ORT should be able to build on it.	2024-03-28 15:51:22 -07:00


10864 commits