Bumps
[follow-redirects](https://github.com/follow-redirects/follow-redirects)
from 1.15.4 to 1.15.6.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="35a517c586"><code>35a517c</code></a>
Release version 1.15.6 of the npm package.</li>
<li><a
href="c4f847f851"><code>c4f847f</code></a>
Drop Proxy-Authorization across hosts.</li>
<li><a
href="8526b4a1b2"><code>8526b4a</code></a>
Use GitHub for disclosure.</li>
<li><a
href="b1677ce001"><code>b1677ce</code></a>
Release version 1.15.5 of the npm package.</li>
<li><a
href="d8914f7982"><code>d8914f7</code></a>
Preserve fragment in responseUrl.</li>
<li>See full diff in <a
href="https://github.com/follow-redirects/follow-redirects/compare/v1.15.4...v1.15.6">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/microsoft/onnxruntime/network/alerts).
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
### Description
Fix a few warnings in typedoc (for generating JS API):
```
[warning] The signature TrainingSession.loadParametersBuffer has an @param with name "buffer", which was not used.
[warning] NonTensorType, defined in ./lib/onnx-value.ts, is referenced by OnnxValue but not included in the documentation.
[warning] TensorFactory, defined in ./lib/tensor-factory.ts, is referenced by Tensor but not included in the documentation.
[warning] ExternalDataFileType, defined in ./lib/onnx-model.ts, is referenced by InferenceSession.SessionOptions.externalData but not included in the documentation.
[warning] TensorToDataUrlOptions, defined in ./lib/tensor-conversion.ts, is referenced by Tensor.toDataURL.toDataURL.options but not included in the documentation.
[warning] TensorToImageDataOptions, defined in ./lib/tensor-conversion.ts, is referenced by Tensor.toImageData.toImageData.options but not included in the documentation.
[warning] Failed to resolve link to "GpuBufferType" in comment for Env.WebGpuFlags.adapter.
[warning] Failed to resolve link to "GpuBufferType" in comment for Env.WebGpuFlags.device.
```
Changes highlighted:
- Merge `CoreMlExecutionProviderOption` and
`CoreMLExecutionProviderOption`. They expose 2 different sets of options
for React Native and the ORT Node.js binding. This should be fixed in the future.
- Fix a few naming inconsistencies between JSDoc and parameters
- Fix broken type links
- Exclude trace functions
### Description
Fix #19931: broken Get Started link
HTTP 404 for the "Get Started" link on the "ONNX Runtime JavaScript API" page
Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
### Description
This PR rewrites the backend resolution logic to support specifying multiple
EPs.
#### Backend
The first version of ONNX Runtime Web carried over some existing
code from [ONNX.js](https://github.com/microsoft/onnxjs), which includes
the "backend" concept. The original "backend" in ONNX.js was designed
on the assumption that only one backend from the user's backend hint list
will be used. For example, in ONNX.js, if the user specifies a backend hint of
`['webgl', 'wasm']`, ONNX.js first tries to use the WebGL backend. If it
loads successfully (the browser supports WebGL), the "webgl" backend
is used and "wasm" is ignored; otherwise, "webgl" is
ignored and ONNX.js tries to load the "wasm" backend.
In short: only one backend is used when initializing a session.
#### Execution Provider
Execution Provider, or EP, in ONNX Runtime is a different concept. One
of the differences is that users are allowed to specify multiple EPs, and
if one does not support a particular kernel, execution can fall back to another
EP. This is a very common case when using a GPU EP in ONNX Runtime.
#### Current Status: Backend vs. EP
For the historical reasons mentioned above, the current status is
quite confusing. There are **real backends**, which are distinct
implementations in code; there are **backend hints**, which are the
string names used to request a backend; and there are **EPs**, the ONNX
Runtime concept.
Currently there are only 2 **backends** in our code base: the "onnxjs
backend" and the "wasm backend". The "onnxjs backend" currently only
powers the backend hint "webgl", which goes into the old onnx.js code path.
All other backend hints, including "wasm", "cpu" (an alias of "wasm"), "webgpu"
and "webnn", are powered by the "wasm backend".
Because ORT Web treats "backend" as an internal concept and wants to
align with ONNX Runtime, the names of the backend hints are becoming EP
names.
The following table shows today's status:
| Execution Provider Name (public) / Backend Hint (internal) | Backend | EP in ORT |
| -------- | ------- | ------- |
| "wasm"/"cpu" | WasmBackend | CPU EP |
| "webgl" | OnnxjsBackend | \* technically not an EP |
| "webgpu" | WasmBackend | JSEP |
| "webnn" | WasmBackend | WebNN EP |
#### Problem
While the API allows specifying multiple EPs, backend resolution only
allows one backend. When a user specifies multiple EP
names in session options, the backend resolution behavior and the EP
registration behavior are inconsistent. Specifically, in this issue:
https://github.com/microsoft/onnxruntime/issues/15796#issuecomment-1925363908:
the EP list `['webgpu', 'wasm']` on a browser without WebGPU support
resolves to the 'wasm' backend, but the full EP list is passed in session
options, so JSEP is still enabled, causing the runtime error.
#### Solution
Since we still need the WebGL backend, we cannot totally remove the backend
register/resolve system. In this PR I made the following changes:
- initialize every backend from the EP list, instead of only doing that for
the first successful one.
- for the first resolved backend, keep all EPs that use the exact same
backend. Remove all EPs not using this backend from session options.
- for every explicitly specified EP, if it is removed, show a warning
message in the console.
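The three steps above can be sketched roughly as follows. This is a minimal illustration, not the actual ORT Web code: the names `BACKEND_OF_EP`, `resolveBackend` and the `isEpAvailable` callback are hypothetical, and the EP-to-backend mapping is copied from the table in this description.

```js
// EP name -> backend implementation, copied from the table above.
const BACKEND_OF_EP = {
  wasm: 'WasmBackend',
  cpu: 'WasmBackend',
  webgl: 'OnnxjsBackend',
  webgpu: 'WasmBackend',
  webnn: 'WasmBackend',
};

// Hypothetical helper: `isEpAvailable(name)` stands in for "the backend
// behind this EP name initialized successfully in the current environment".
function resolveBackend(epNames, isEpAvailable, warn = console.warn) {
  // Step 1: try every EP in the list, not only the first successful one.
  const available = epNames.filter((name) => isEpAvailable(name));
  if (available.length === 0) {
    throw new Error(`no available backend found for: ${epNames.join(', ')}`);
  }

  // Step 2: the first available EP decides the backend; keep only the EPs
  // that are powered by that same backend.
  const backend = BACKEND_OF_EP[available[0]];
  const kept = available.filter((name) => BACKEND_OF_EP[name] === backend);

  // Step 3: warn for every explicitly requested EP that was dropped.
  for (const name of epNames) {
    if (!kept.includes(name)) {
      warn(`removing requested execution provider "${name}" from session options`);
    }
  }
  return { backend, executionProviders: kept };
}
```

With `['webgpu', 'wasm']` on a browser without WebGPU, this sketch returns the wasm backend with only `'wasm'` left in the EP list, so JSEP is no longer requested.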
### Description
The `npm test` flags are difficult to memorize because they
differ from the `ort.env` flags. This change aligns those flags
with the ORT JS API, e.g. `--wasm-enable-proxy` became `--wasm.proxy`.
Old flags are marked as deprecated, except `-x` (kept as a shortcut for
`--wasm.numThreads`).
Vectorize hit 2 failing cases on a CI bot with an NVIDIA GPU, but we
couldn't reproduce them on any of the GPUs at hand, including NVIDIA GPUs. This PR
introduces GPUAdapterInfo and enables this optimization on non-NVIDIA GPUs to
make the bots happy.
No obvious perf gain can be seen if we enable vectorize on NVIDIA.
However, it shows a big perf improvement on Intel. On my Gen12 Intel GPU,
mobilenetv2-12 perf improved from 11.14ms to 7.1ms.
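A vendor check of this kind could look roughly like the sketch below. The function name `shouldEnableVectorize` is hypothetical, and the sketch assumes that the adapter info exposes a `vendor` string that reads `'nvidia'` on NVIDIA hardware and that an unknown vendor is treated as non-NVIDIA; the actual check in the PR may differ.

```js
// Hypothetical helper: decide whether to enable the vectorize optimization
// from a GPUAdapterInfo-like object ({ vendor: string, ... }).
function shouldEnableVectorize(adapterInfo) {
  const vendor = adapterInfo && typeof adapterInfo.vendor === 'string'
    ? adapterInfo.vendor.toLowerCase()
    : '';
  // Assumption: an unknown or empty vendor is treated as non-NVIDIA.
  return vendor !== 'nvidia';
}
```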
### Description
This change exposes a few properties in `ort.env.webgpu` to address the
feature request raised in
https://github.com/microsoft/onnxruntime/pull/14579#discussion_r1519612619.
- Add `powerPreference` and `forceFallbackAdapter` in `ort.env.webgpu`,
to allow users to set the value of the properties before the first
inference session is created.
- Add readonly property `adapter` in `ort.env.webgpu` to allow users to
get the adapter instance. Now users can access `ort.env.webgpu.device`
and `ort.env.webgpu.adapter`.
@xenova @beaufortfrancois
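As a rough usage sketch, the new properties could be set through a small helper like the one below. The helper `configureWebGpu` is hypothetical; it takes the `env` object as a parameter so the snippet runs without the `onnxruntime-web` package, and the two accepted `powerPreference` strings are the values defined by the WebGPU API.

```js
// Hypothetical helper. `env` stands for `ort.env` from onnxruntime-web.
function configureWebGpu(env, { powerPreference, forceFallbackAdapter } = {}) {
  if (powerPreference !== undefined) {
    // 'low-power' and 'high-performance' are the WebGPU power preference values.
    if (!['low-power', 'high-performance'].includes(powerPreference)) {
      throw new TypeError(`unsupported powerPreference: ${powerPreference}`);
    }
    env.webgpu.powerPreference = powerPreference;
  }
  if (forceFallbackAdapter !== undefined) {
    env.webgpu.forceFallbackAdapter = Boolean(forceFallbackAdapter);
  }
  return env;
}
```

In an application this would be called as `configureWebGpu(ort.env, { powerPreference: 'high-performance' })` before the first inference session is created; afterwards `ort.env.webgpu.adapter` and `ort.env.webgpu.device` can be read.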
### Description
For the Concat operation, the shape of a zero-size input tensor needs to be
preserved and, unlike non-zero tensors, its dims are not constrained to
match the other input tensors' dims.
### Description
Use Chromium Headless for WebGPU tests by default. Still use normal
windowed Chromium when debug=true or perfMode=true.
Use the
[`--headless=new`](https://developer.chrome.com/docs/chromium/new-headless)
mode.
### Motivation and Context
Try a more stable way to launch npm tests to avoid a "chrome not
found" issue in the pipeline, which may be caused by launching a windowed
application.
### Description
Move 'env.wasm.trace' to 'env.trace' to make it less confusing,
because it also works in WebGPU. 'env.wasm.trace' is marked as deprecated.
### Description
Fixes build break brought by #19614
Currently the WebGL backend does not support zero-sized tensors. This change
splits the test data into 2 parts and only enables zero-sized tensor tests
for WebGPU.
### Description
This PR allows zero-sized output.
To keep the implementation simple, it does not support partially
zero-sized outputs: either all outputs are zero-sized, or an
error is reported.
Added 2 tests:
- op test of `Add` with input T[2,0] T[2,1], and
- test_split_zero_size_splits
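The all-or-nothing rule can be illustrated with a small check over output shapes. This is only a sketch of the rule as described; the function name `allOutputsZeroSized` and the error message are hypothetical, and a tensor is treated as zero-sized when any of its dims is 0.

```js
// Hypothetical check: returns true when every output is zero-sized, false
// when none is, and throws when only some of them are.
function allOutputsZeroSized(outputDims) {
  // A tensor has zero elements exactly when one of its dims is 0.
  const zeroCount = outputDims.filter((dims) => dims.includes(0)).length;
  if (zeroCount !== 0 && zeroCount !== outputDims.length) {
    throw new Error('partially zero-sized outputs are not supported');
  }
  return zeroCount > 0;
}
```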
### Description
1. Fix Where operator to handle Boolean input less than 4 bytes.
2. Fix JSEP test harness to use tensor names consistently.
### Description
Switch to `setImmediate` to avoid starving the Node.js event loop.
There should really be a true async version though: running
computationally intensive things on the event loop will stop everything
else from happening while it is running, e.g. a web server from
answering requests.
This can be done by wrapping `RunAsync` behind a
[`napi::Promise`](https://github.com/nodejs/node-addon-api/blob/main/doc/promises.md)
to run on the onnxruntime thread pool or [`AsyncWorker`](
https://github.com/nodejs/node-addon-api/blob/main/doc/async_worker.md)
for the Node.js/libuv thread pool.
### Motivation and Context
Without this, if you run inference in a tight loop with nothing
async or deferred in between, `process.nextTick` starves the event loop
and lets nothing else run. `setImmediate` at least lets the event loop
spin between calls to `run`.
See
https://dev.to/ynmanware/setimmediate-settimeout-and-process-nexttick-3mfd
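The difference can be observed with a small Node.js experiment. The sketch below is illustrative only: `busyWait` stands in for a blocking `run` call, and `stepsBeforeTimer` (a hypothetical helper) reports how many chained steps finished before a 0 ms timer, standing in for "other work", was allowed to fire.

```js
// Block the thread for roughly `ms` milliseconds (stand-in for a blocking run()).
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spin */ }
}

// Chain `steps` callbacks through `schedule` and resolve with the number of
// steps that had completed when a 0 ms timer finally got to run.
function stepsBeforeTimer(schedule, steps, workMs) {
  return new Promise((resolve) => {
    let done = 0;
    let seenByTimer = -1;
    setTimeout(() => { seenByTimer = done; }, 0);
    const step = () => {
      busyWait(workMs);
      done += 1;
      if (done < steps) {
        schedule(step);
      } else {
        // Queued after the first timer, so it observes the recorded value.
        setTimeout(() => resolve(seenByTimer), 0);
      }
    };
    schedule(step);
  });
}
```

Calling `stepsBeforeTimer((fn) => process.nextTick(fn), 5, 2)` resolves with 5, because the whole chain runs before the timer. With `(fn) => setImmediate(fn)` the timer fires after at most the first step or so, since the event loop gets a turn between steps.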
Contributed on behalf of [Swimm](https://swimm.io/)
Bumps [ip](https://github.com/indutny/node-ip) from 1.1.8 to 1.1.9.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="1ecbf2fd8c"><code>1ecbf2f</code></a>
1.1.9</li>
<li><a
href="6a3ada9b47"><code>6a3ada9</code></a>
lib: fixed CVE-2023-42282 and added unit test</li>
<li>See full diff in <a
href="https://github.com/indutny/node-ip/compare/v1.1.8...v1.1.9">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
Dependabot will merge this PR once CI passes on it, as requested by
@fs-eire.
[//]: # (dependabot-automerge-end)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
This is used in sam-h-decoder-f16.
BUG: https://github.com/microsoft/onnxruntime/issues/18855
### Description
This change adds only the code necessary for ort-web to work with any
Float16Array polyfill. Unlike #19302, in this PR ort-web does not
include any specific polyfill; instead, it is the user's choice how to
use a polyfill.
ORT-web uses Float16Array if it's available; otherwise, it falls back to
Uint16Array.
```js
// case 1: user does not use polyfill:
import * as ort from 'onnxruntime-web';
const myF16Data = new Uint16Array(...); // need to use Uint16Array
const myF16tensor = new ort.Tensor('float16', myF16Data, dims);
```
```js
// case 2: user uses a polyfill:
import * as ort from 'onnxruntime-web';
import {
Float16Array, isFloat16Array, isTypedArray,
getFloat16, setFloat16,
f16round,
} from "@petamoriken/float16";
globalThis.Float16Array = Float16Array; // ort-web will pick the global Float16Array
const myF16Data = new Float16Array(...); // Use the polyfilled Float16Array type
const myF16tensor = new ort.Tensor('float16', myF16Data, dims);
```
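The fallback described above amounts to a feature check on the global object. A minimal sketch follows; the function name `pickFloat16Storage` is hypothetical and the real ort-web code may perform the check differently.

```js
// Hypothetical helper: pick the typed-array constructor used to store
// 'float16' tensor data. `globalObject` defaults to globalThis so that a
// polyfill assigned to globalThis.Float16Array is picked up.
function pickFloat16Storage(globalObject = globalThis) {
  return typeof globalObject.Float16Array === 'function'
    ? globalObject.Float16Array // native or polyfilled Float16Array
    : Uint16Array;              // fallback: raw 16-bit patterns
}
```

Passing the global object as a parameter keeps the check easy to exercise with and without a polyfill present.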
### Description
This is required to make shape uniforms really work.
### Motivation and Context
The bug was unveiled in a model with multiple Split nodes. The later
nodes would try to reuse a previous pipeline cache, while the old shapes
were hardcoded as constants in cache.
### Description
Add MatMulNBits to support MatMul using 4-bit quantized weights
### Description
Since TypeScript v4.7, types need to be specified inside the "exports" field
when it is available. This PR adds "types" just before each "default" (which
is required by the spec to be the last item).
Fixes #19403.
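The key ordering described above can be sketched with a small transformation over one "exports" entry. The helper name `addTypesBeforeDefault` and the file paths in the usage note are hypothetical; the sketch relies on JavaScript objects preserving insertion order for string keys.

```js
// Hypothetical helper: return a copy of one "exports" conditions object
// with a "types" condition inserted right before "default".
function addTypesBeforeDefault(conditions, typesPath) {
  const result = {};
  for (const [key, value] of Object.entries(conditions)) {
    if (key === 'default') {
      result.types = typesPath; // inserted just ahead of "default"
    }
    result[key] = value;
  }
  return result;
}
```

For example, `addTypesBeforeDefault({ import: './dist/index.mjs', default: './dist/index.js' }, './types.d.ts')` yields the keys `import`, `types`, `default`, in that order.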
### Description
This PR 1) adds LeakyRelu activation for fusedConv; 2) makes `vec4<f16>`
values work with `float32` uniform attributes.
For example:
`clamp(value, vec4<f16>(uniforms.clip_min),
vec4<f16>(uniforms.clip_max))` will throw compilation errors since
`uniforms.clip_min` and `uniforms.clip_max` are `f32`, not `f16`. So we
need to change it to `clamp(value, vec4<f16>(f16(uniforms.clip_min)),
vec4<f16>(f16(uniforms.clip_max)))`.
The above problem was introduced when we made activation attributes
uniforms instead of constants.
BTW, after adding LeakyRelu, the `realesrgan-t256` model can pass.
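Since the shader source is assembled from strings, the cast can be illustrated with a small string builder. The helper names `uniformAs` and `clampExpr` are hypothetical, and the sketch assumes the uniform is always stored as `f32` and that `valueType` is a vector type such as `vec4<f16>` or `vec4<f32>`.

```js
// Hypothetical helper: build a WGSL expression that converts an f32 uniform
// to the shader's value type, casting to f16 first when needed.
function uniformAs(valueType, uniformName) {
  const scalar = valueType.includes('f16')
    ? `f16(uniforms.${uniformName})` // f32 uniform -> f16 scalar
    : `uniforms.${uniformName}`;     // already f32
  return `${valueType}(${scalar})`;
}

// Build the clamp expression from the example above.
function clampExpr(valueType) {
  return `clamp(value, ${uniformAs(valueType, 'clip_min')}, ${uniformAs(valueType, 'clip_max')})`;
}
```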
### Description
Support external data in npm test.
This allows the test runner to detect whether external data is available
in the test folder and, if it is, load it as external data
automatically.
This feature does not parse every model to figure out whether the model
has external data. The following comments in the code explain how it
decides whether to parse the model file.
```js
// for performance consideration, we do not parse every model. when we think it's likely to have external
// data, we will parse it. We think it's "likely" when one of the following conditions is met:
// 1. any file in the same folder has the similar file name as the model file
// (e.g., model file is "model_abc.onnx", and there is a file "model_abc.pb" or "model_abc.onnx.data")
// 2. the file size is larger than 1GB
```
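The two conditions in the comment can be written as a small predicate. This is a sketch under assumptions: the function name `likelyHasExternalData` is hypothetical, "similar file name" is interpreted as "another file whose name starts with the model's base name", and 1GB is taken as 1024³ bytes.

```js
const ONE_GB = 1024 ** 3; // assumption: 1GB means 1024^3 bytes

// Hypothetical predicate: should the model file be parsed to look for
// external data? `siblingFiles` are the file names in the model's folder.
function likelyHasExternalData(modelFile, siblingFiles, modelSizeBytes) {
  const fileName = (p) => p.split(/[\\/]/).pop();
  const modelName = fileName(modelFile);
  const dot = modelName.lastIndexOf('.');
  const stem = dot > 0 ? modelName.slice(0, dot) : modelName;

  // Condition 1: another file in the folder has a similar name,
  // e.g. "model_abc.pb" or "model_abc.onnx.data" next to "model_abc.onnx".
  const hasSimilarSibling = siblingFiles
    .map(fileName)
    .some((name) => name !== modelName && name.startsWith(stem));

  // Condition 2: the model file itself is larger than 1GB.
  return hasSimilarSibling || modelSizeBytes > ONE_GB;
}
```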
### Description
This PR extends the graph capture capability to JS EP, similar
to #16081. For JS EP we don't use CUDA Graph; instead, we
record all GPU commands and replay them, which removes most of the CPU
overhead and avoids the situation where the GPU waits for the CPU.
mobilenetv2-12 goes from 6ms to 3.7ms on NV 3090 and from 4.58ms to
3.38ms on Intel A770.
All limitations are similar to those of CUDA EP:
1. Models with control-flow ops (i.e. If, Loop and Scan ops) are not
supported.
2. Usage of graph capture is limited to models where all ops in the
model can be partitioned to the JS EP or CPU EP with no memory copy
between them.
3. Shapes of inputs/outputs cannot change across inference calls.
4. IO binding is required.
The usage is as follows:
Method 1: specify output buffers explicitly.
```js
const sessionOptions = {
  executionProviders: [
    {
      name: "webgpu",
    },
  ],
  enableGraphCapture: true,
};
const session = await ort.InferenceSession.create('./models/mobilenetv2-12.onnx', sessionOptions);
// prepare the inputBuffer/outputBuffer
... ...
const feeds = {
  'input': ort.Tensor.fromGpuBuffer(inputBuffer, { dataType: 'float32', dims })
};
const fetches = {
  'output': ort.Tensor.fromGpuBuffer(outputBuffer, { dataType: 'float32', dims: [1, 1000] })
};
let results = await session.run(feeds, fetches); // The first run will begin to capture the graph.
// update inputBuffer content
... ...
results = await session.run(feeds, fetches); // The 2nd run and after will directly call replay to execute the graph.
... ...
session.release();
```
Method 2: don't specify output buffers explicitly. Internally, when
graph capture is enabled, all output locations are set to
'gpu-buffer'.
```js
const sessionOptions = {
  executionProviders: [
    {
      name: "webgpu",
    },
  ],
  enableGraphCapture: true,
};
const session = await ort.InferenceSession.create('./models/mobilenetv2-12.onnx', sessionOptions);
// prepare the inputBuffer
... ...
const feeds = {
  'input': ort.Tensor.fromGpuBuffer(inputBuffer, { dataType: 'float32', dims })
};
let results = await session.run(feeds); // The first run will begin to capture the graph.
// update inputBuffer content
... ...
results = await session.run(feeds); // The 2nd run and after will directly call replay to execute the graph.
... ...
session.release();
```
### Description
```math
\tanh(x)=\frac{e^x-e^{-x}}{e^x+e^{-x}}=
\left\{
\begin{array}{cc}
-\frac{1-e^{-2\cdot(-x)}}{1+e^{-2\cdot(-x)}}, & x<0 \\
0, & x=0 \\
\frac{1-e^{-2x}}{1+e^{-2x}}, & x>0
\end{array}
\right.
```
### Motivation and Context
On some platforms,
$$\tanh(1000)=\frac{e^{1000}-e^{-1000}}{e^{1000}+e^{-1000}}$$ would
produce NaN instead of 0.999... or 1 (imagine $e^{1000}=\infty$ and
$\frac{\infty}{\infty}$ explodes).
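The effect can be reproduced in plain JavaScript. The sketch below is only an illustration of the formula above, not the shader code from the PR; `naiveTanh` and `stableTanh` are hypothetical names.

```js
// Textbook definition: overflows for large |x| because e^x becomes Infinity.
function naiveTanh(x) {
  const pos = Math.exp(x);
  const neg = Math.exp(-x);
  return (pos - neg) / (pos + neg);
}

// Piecewise form from the description: only e^{-2|x|} is evaluated, which
// lies in (0, 1] and therefore cannot overflow.
function stableTanh(x) {
  if (x === 0) {
    return 0;
  }
  const e = Math.exp(-2 * Math.abs(x));
  return Math.sign(x) * (1 - e) / (1 + e);
}
```

Here `naiveTanh(1000)` evaluates to `NaN` (infinity divided by infinity), while `stableTanh(1000)` evaluates to `1`.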