onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-20 19:12:24 +00:00

Author	SHA1	Message	Date
Arthur Islamov	d0519a7603	[js/web] BiasSplitGelu and BiasAdd kernels (#17161 ) ### Description Two contrib kernels that supposed to speed-up StableDiffusion according to this doc https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/python/tools/transformers/models/stable_diffusion/README.md However, there is no noticable effect in speed or memory consumption. So i guess the only way to make it faster is to implement MultiHeadAttention but i'm not capable of doing that right now. So i'll focus on existing PRs and finding the JSEP kernel that produces incorrect results. It should be one of the old ones (i suspect Conv or ConvTranspose), as SD was not generating images correctly on webgpu since i started working on it. I hoped someone else would fix that by the time i finish with kernels/optimizations 😅 --------- Co-authored-by: Guenther Schmuelling <guschmue@microsoft.com> Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>	2023-10-03 12:20:20 -07:00
Kaz Nishimura	d11e053412	Add option to specify the EP to use, enabling DML EP and others (#17490 ) ### Description Add DML EP to the acceptable provider list in the optimizer. ### Motivation and Context With DML EP, graph optimization was not performed in onnxruntime.	2023-10-02 23:53:09 -07:00
Yulong Wang	451c02543a	[js/webgpu] allow specify preferredLayout (#17756 ) ### Description Allow WebGPU backend to specify `preferredLayout`. Default is NHWC. ```js const options = {executionProviders: [{name:'webgpu', preferredLayout: 'NCHW'}]}; sess1 = await ort.InferenceSession.create('./mobilenetv2-12.onnx', options); ``` ### Motivation and Context - implement @qjia7's requirement for an easier way to do performance comparison between NCHW vs NHWC. - It's possible that NCHW does better on some models and NHWC on others. So offer user the capability to switch.	2023-10-02 21:25:12 -07:00
zesongw	f158f394d6	[WebNN EP] Support Softmax since version 13 (#17714 ) ### Description <!-- Describe your changes. --> WebNN only supports 2-D input tensor along axis 1. For now, we use Reshape and Transpose wraparound to get the compatible input. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Enable more models to run on WebNN.	2023-10-02 13:01:04 -07:00
Scott McKay	ac4e726046	Add bytes model loading test to react native e2e (#17749 ) ### Description <!-- Describe your changes. --> Update E2E test to also check InferenceSession.create with bytes. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Add tests to validate #17739	2023-10-02 12:25:28 +10:00
Ella Charlaix	63acaf47d2	Fix onnx quantizer activation and weight type attribute (#17651 ) In [`quantize_subgraph`](https://github.com/microsoft/onnxruntime/blob/v1.16.0/onnxruntime/python/tools/quantization/onnx_quantizer.py#L188-L189) `self.weight_qType` and `self.activation_qType` are [integers](https://github.com/microsoft/onnxruntime/blob/v1.16.0/onnxruntime/python/tools/quantization/onnx_quantizer.py#L115-L116) while `ONNXQuantizer` expects `QuantType`	2023-09-30 18:06:34 -07:00
xhcao	0d60604638	[JS/WebGPU] support Range operator (#17233 ) The patch also introduces the method which copies data from GPU to CPU synchronously. ### Description <!-- Describe your changes. --> ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-09-30 02:05:32 -07:00
Arthur Islamov	a941dd583e	[js/web] FP16 Conv, ConvTranspose and MatMul (#17514 ) ### Description Another three ops for fp16 --------- Co-authored-by: Guenther Schmuelling <guschmue@microsoft.com> Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>	2023-09-30 00:00:23 -07:00
Changming Sun	9aad78721c	Update debug_alloc.cc: filter out one more memory leak from absl (#17746 )	2023-09-29 20:40:09 -07:00
Pranav Sharma	668c70ee11	Add support for specifying a custom logging function per session. (#17727 ) ### Description Add support for specifying a custom logging function per session. Bindings for other languages will be added after this PR is merged. ### Motivation and Context Users want a way to override the logging provided by the environment.	2023-09-29 19:46:55 -07:00
Caroline Zhu	6a5f469d44	Add training interfaces to js/common (#17333 ) ### Description Following the design document: * Added CreateTrainingSessionHandler to the Backend interface * All existing Backend implementations throw an error for the new method createTrainingSessionHandler * Created TrainingSession namespace, interface, and TrainingSessionFactory interface * Created TrainingSessionImpl class implementation As methods are implemented, the TrainingSession interface will be added to or modified. ### Motivation and Context Adding the public-facing interfaces to the onnxruntime-common package is one of the first steps to support ORT training for web bindings. --------- Co-authored-by: Caroline Zhu <carolinezhu@microsoft.com>	2023-09-29 19:05:10 -07:00
Rachel Guo	e106b1eb8f	Fix react native load from Uint8Array buffer bug (#17739 ) ### Description <!-- Describe your changes. --> Use `.buffer` of Uint8Array to get ArrayBuffer. TODO: Add E2E React Native test case to cover JS level testing to avoid future breakage. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> #17732 Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>	2023-09-29 18:03:28 -07:00
shaahji	5a623dca01	Python API to check whether collective ops are available or not (#17730 ) Python API to check whether collective ops are available or not ### Description <!-- Describe your changes. --> Adding an API to check whether collective ops are available or not. Since there is no independent MPI enabled build, this flag can be used on Python front for branching. Specifically, to conditionally enable tests. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Flag to be used in Python to check whether onnxruntime supports collective ops or not. Handy for conditionally enabling/disabling tests and for other branching decisions.	2023-09-29 14:11:05 -07:00
Changming Sun	14d349e290	Enable backtrace in unit tests (#17655 ) ### Description Google test can be built either with absl/re2 or not. This PR enables the build option so that google test framework can print out a nice stacktrace when something went wrong. It helps locate test errors in CI build pipelines. Also, Google test will remove the build option and make it always ON. So sooner or later we must make this change.	2023-09-29 12:32:56 -07:00
Yulong Wang	561aca97cf	[js/webgpu] support IO binding (#17480 ) <del> This PR is based on a few prerequisites PRs. They are listed as below: - #17465 - #17469 - #17470 - #17472 - #17473 - #17484 Please review the current change by only looking at commit e2e6623e673ec6de55a5c1f8edcbd3a46b535a89 and later. </del> ### Description This PR introduces WebGPU IO binding. This new feature allows onnxruntime-web users to use tensors created from GPU as model input/output so that a model inferencing can be done without unnecessary data copy between CPU and GPU for model input/output. ### Examples An E2E demo/example is being worked on. Following is some simple demo with code snippet. Let's first check today how we do: ```js // STEP.1 - create an inference session: const mySession = await ort.InferenceSession.create('./my_model.onnx', { executionProviders: ['webgpu'] }); // STEP.2 - create model input: (supposing myImageCpuData is a Float32Array) const feeds = { 'input_image:0': new ort.Tensor('float32', myImageCpuData, [1, 224, 224, 3]) }; // STEP.3 - run model const myResults = await mySession.run(feeds); // STEP.4 - get output data const myData = myResults['output_image:0'].data; // Float32Array ``` #### for inputs (GPU tensor): Now, with IO binding, you can create a tensor from a GPU buffer, and feed it to the model: ```js // new STEP.2.A - create model input from a GPU buffer: (supposing myInputGpuBuffer is a `GPUBuffer` object with input data) const feeds = { 'input_image:0': ort.Tensor.fromGpuBuffer(myInputGpuBuffer, { dataType: 'float32', dims: [1, 224, 224, 3] }) }; ``` ### for outputs (pre-allocated GPU tensor) you can also do that for output, if you know the output shape: ```js // new STEP.2.B - create model output from a GPU buffer: (supposing myOutputGpuBuffer is a pre-allocated `GPUBuffer` object) const fetches = { 'output_image:0': ort.Tensor.fromGpuBuffer(myOutputGpuBuffer, { dataType: 'float32', dims: [1, 512, 512, 3] }) }; // new STEP.3 - run model with pre-allocated output (fetches) const myResults = await mySession.run(feeds, fetches); ``` ### for outputs (specify location) if you do not know the output shape, you can specify the output location when creating the session: ```js // new STEP.1 - create an inference session with an option "preferredOutputLocation": const mySession = await ort.InferenceSession.create('./my_model.onnx', { executionProviders: ['webgpu'], preferredOutputLocation: "gpu-buffer" }); ``` if the model has multiple outputs, you can specify them seperately: ```js // new STEP.1 - create an inference session with an option "preferredOutputLocation": const mySession = await ort.InferenceSession.create('./my_model.onnx', { executionProviders: ['webgpu'], preferredOutputLocation: { "output_image:0": "gpu-buffer" } }); ``` now you don't need to prepare the `fetches` object and onnxruntime-web will prepare output data on the location that specified. #### read data when you get the output tensor, you can: ```js // get the gpu buffer object: const gpuBuffer = myOutputTensor.gpuBuffer; // GPUBuffer // get the CPU data asynchronizely const cpuData = await myOutputTensor.getData(); // get the CPU data asynchronizely and release the underlying GPU resources const cpuData = await myOutputTensor.getData(true); // dispose the tensor (release the underlying GPU resources). This tensor object will be invalid after dispose() is called. myOutputTensor.dispose(); ``` #### resource management JavaScript has GC so you don't need to worry about managing JavaScript objects. But there are 2 types of resources that are not managed by GC: - GPU buffer that used in tensors - Underlying ORT native resources To simplify, most of the unmanaged resources and handled inside ORT web. But there are a few resources that need users to manage: - All external GPU resources, including GPU buffers inside all tensors created by `Tensor.fromGpuBuffer()`, will not be managed by ORT. User should manage those GPU buffers themselves. - When a session is created with `preferredOutputLocation` == "gpu-buffer" specified in session options, and the corresponding output is not pre-allocated, user need to call the output tensor's `dispose()` or `getData(true)` to manually release the underlying GPU buffers. - ORT internal errors (including providing a pre-allocated output tensor with wrong type/dims) will invalidate the whole wasm memory and is not recoverable. An exception is thrown in this situation.	2023-09-29 11:24:42 -07:00
satyajandhyala	b4fbc25b1f	[JS/Web] Add ConvTranspose implementation using MatMul (#17573 ) ### Description Add ConvTranspose implementation using MatMul to increase perf. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-09-29 11:00:44 -07:00
Changming Sun	caf98128c1	Update linux-wasm-ci.yml: remove the ln command (#17735 ) ### Description /usr/local/bin can only be modified by root. This command seems unnecessary	2023-09-28 21:43:29 -07:00
Scott McKay	9cb60c5b86	Resize and EP specific transpose optimization updates (#17664 ) ### Description <!-- Describe your changes. --> - Treat Resize as layout sensitive by default - whilst the ONNX spec does not specify a layout, EPs tend to implement only one - add second usage in L2 of TransposeOptimizer to plugin the ability to push a Transpose through a Resize assigned to the CPU EP - Allow EP specific logic for changes the ops considered to be layout sensitive to be plugged in - expected usage is for #17200 ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Finish simplifying/clarifying transpose optimization and layout transformation that was proposed in #15552. This PR along with #17618 should complete the changes. --------- Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2023-09-29 08:11:36 +10:00
Tianlei Wu	20f96fd096	Fix Attention Runtime Error for CLIP model (#17729 ) ### Description The condition check is not correct ``` if (is_unidirectional_ && enable_fused_causal_attention_) { // GPT } else { // BERT } ``` Change it to ``` if (is_unidirectional_) { // GPT } else { // BERT } ``` Another walkaround is to enable fused causal attention by adding an environment variable `ORT_ENABLE_FUSED_CAUSAL_ATTENTION=1` before running stable diffusion. ### Motivation and Context Without the fix, optimized CLIP model of stable diffusion will encounter error in running Attention node: 2023-09-24 16:15:31.206037898 [E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Attention node. Name:'Attention_0' Status Message: /onnxruntime_src/onnxruntime/contrib_ops/cuda/bert/tensorrt_fused_multihead_attention/mha_runner.cu:207 bool onnxruntime::contrib::cuda::FusedMHARunnerFP16v2::mhaImpl::is_flash_attention(int) const interface->mHasCausalMask == false was false. Note that the bug has been there for a long time. It is just surfaced since we recently added a fusion for CLIP, which will trigger the error. We will add a comprehensive test for causal attention later to avoid such corner cases.	2023-09-28 14:32:08 -07:00
Jian Chen	fc9a69dcae	Update VecAddMoveOnlyFunctor and VecAddWithIsSupportedMethod with Default constructor (#17705 ) ### Description <!-- Describe your changes. --> ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-09-28 09:30:42 -07:00
Yi Zhang	9136748462	Fix: Fail to skip disabledmodel in winml (#17728 ) ### Description Move appending source name behind the ModifyNameIfDisabledTest ### Motivation and Context In winml, disabled test name doesn't include the model source name. WinML job will be broken in the new image. https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1151451&view=logs&s=4eef7ad1-5202-529d-b414-e2b14d056c05 ### Verified https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1151691&view=logs&s=4eef7ad1-5202-529d-b414-e2b14d056c05	2023-09-28 13:46:44 +08:00
Jambay Kinley	1f4a3529dd	Bugfix: Add initializer to model in AttentionMask directly (#17719 )	2023-09-27 14:53:37 -07:00
MistEO	8621bdcd44	Remove unnecessary #incldue <stacktrace> (#17716 )	2023-09-27 14:23:26 -07:00
MistEO	870b0bc305	Fix typo of cmake (#17715 ) This caused a cmake configuration error.	2023-09-27 11:48:46 -07:00
Adam Pocock	522cc968e8	[java] Filling out the javadoc for the float8 types (#17694 )	2023-09-27 10:52:11 -07:00
Mustafa Ateş Uzun	13b0f8a6ce	fix: supported typo (#17216 )	2023-09-27 10:45:27 -07:00
Changming Sun	276e8733bd	Update onnx python package and setuptools (#17709 ) ### Description A follow-up for #17125	2023-09-27 07:54:48 -07:00
Vincent Wang	e6aa0fa174	Add Gelu Related Ops to Triton Codegen (#17713 ) Add Gelu/QuickGelu/GeluGrad/QuickGeluGrad support to Triton Codegen so that it can be fused with some other connected supported Ops. For example, in llama2, it can be fused with Mul so we will have extra 1-2% perf gain.	2023-09-27 19:57:39 +08:00
Scott McKay	a99c965d05	Make transpose optimizer able to look past DQ node for const initializer (#17618 ) ### Description <!-- Describe your changes. --> Add ability for transpose optimizer to look past a DQ node if it has a constant initializer as input. This allows UnsqueezeInput/TransposeInput to modify the initializer in-place in the same way it would for a non-QDQ format model. Shared initializers are also handled, and any additional Squeeze/Transpose added to the other usages of the initializer should cancel out when we push the same Transpose though them. The in-place modification means we don't need to run QDQ fixup and constant folding after layout transformation. This means we do not need to enable those optimizers in a minimal build to get an optimal model post-layout transformation. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Ensure layout transformation produces optimal model in full and minimal builds. --------- Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2023-09-27 21:17:16 +10:00
Scott McKay	33295ed883	Handle string initializers in constant folding (#17422 ) ### Description <!-- Describe your changes. --> * Allow either an allocator or a MemBuffer to be used when creating an OrtValue from an TensorProto * `Tensor<std::string>` requires an allocator to allocate/free the string values * Forcing the buffer to be allocated outside of the Tensor doesn't seem to provide any benefit in this usage as the Tensor class disables copy and assignment (so we wouldn't create 2 copies of the buffer via the Tensor class that externally managing the would buffer avoid) * New approach means we don't need to manage the buffers in the optimizer Info class as the Tensor dtor will do that * Update naming - MLValue was replaced by OrtValue a long time ago ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> #17392	2023-09-27 21:15:58 +10:00
trajep	bcc6205161	🐛 Bugfix win del file err (#17697 ) ### Description <!-- Describe your changes. --> Fix for this issue which raise the error of FileNotAccessd in windows when the context of TemporaryDirectory finished. https://github.com/microsoft/onnxruntime/issues/17627 ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> https://github.com/microsoft/onnxruntime/issues/17627	2023-09-26 15:32:04 -07:00
liqun Fu	2be4dc6d04	ONNX 1.15 integration (#17125 ) ### Description this is for ORT 1.17.0 - make ORT to use ONNX release 1.15.0 branch. Eventually will update to the release tag once ONNX 1.15.0 is released ### Motivation and Context Prepare for ORT 1.17.0 release. People can start work on new and updated ONNX ops in ORT. --------- Signed-off-by: Liqun Fu <liqfu@microsoft.com>	2023-09-26 14:44:48 -07:00
RandySheriffH	37dcefb5b7	Patch lite custom op API (#17605 ) A few enhancements: - Support compute returning status; - Support variadic; --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-09-26 14:02:18 -07:00
Nicolò Lucchesi	4ab0e17fe8	[Technical docs] Fixed a couple of old links in `FAQ.md` (#17415 ) ### Description Updated a couple of old links in the technical documentation that where pointing to files present prior to the migration to https://onnxruntime.ai/docs.	2023-09-26 13:38:24 -07:00
zesongw	93f22aa189	[WebNN EP] Fix bug in Softmax (#17665 ) ### Description <!-- Describe your changes. --> For now, WebNN Softmax only support 2D (or implicitly coerce to 2D) inputs and the last axis. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Fallback some cases to pass the CI.	2023-09-26 12:51:29 -07:00
Brian Lambert	614af3742c	Add prepacked weights container to subgraphs (#17671 ) ### Description Adds prepacked weights container to model subgraphs. ### Motivation and Context Allows for initializer sharing when the initializers are located in subgraphs. I encountered this bug when attempting to share weights between T5 BeamSearch models where the shareable initializers are located in the encoder and decoder subgraphs and it failed to reduce memory usage.	2023-09-26 12:01:41 -07:00
Jian Chen	0141e27ca1	Enabling c++ 20 in MacOS build (#16187 ) ### Description <!-- Describe your changes. --> ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-09-26 11:27:02 -07:00
Vadym Stupakov	b8e348145c	fixed #16873 (#16932 )	2023-09-26 09:57:01 -07:00
Kaz Nishimura	f43acf2d33	Close the JSON object in settings.json (#17583 ) ### Description This patch adds a closing curly bracket at the end of `settings.json`. ### Motivation and Context `settings.json` is just not closed. It was accidentally removed at `4e6ea730d6`	2023-09-26 09:51:13 -07:00
RandySheriffH	1c245e6775	Stop throwing exception on python binding when multiple EP available (#17659 ) Stop throwing the exception when the provider list is empty but there are multiple available EPs. Other language bindings throw no exception at all, this change will align them up. --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-09-26 09:46:30 -07:00
Chi Lo	7572e6055c	[TensorRT EP] Back out the PerThreadContext (#17690 ) Current TRT EP's PerthreadContext allows more than one IExecutionContext instance to be created by one engine instance. But, it's possible to hit an error that caused by TRT API context.setBindingDimensions() in our TRT EP code [here](https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fmicrosoft%2Fonnxruntime%2Fblob%2Fmain%2Fonnxruntime%2Fcore%2Fproviders%2Ftensorrt%2Ftensorrt_execution_provider.cc%23L2775&data=05%7C01%7CChi.Lo%40microsoft.com%7Cd8b23c3a4c0b4dcce9b408dbbd9309de%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638312211465211140%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=5EZoAoXgWFSuz%2BIRMH%2FXZaO%2BfKNP%2FZDZYEZg3W%2Ff30w%3D&reserved=0) under the case of the input shape changes ( meaning engine being rebuilt) with multithreading. From the [doc](https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.nvidia.com%2Fdeeplearning%2Ftensorrt%2Fapi%2Fc_api%2Fclassnvinfer1_1_1_i_execution_context.html%23ada050e88320bcc40987b0acadc2ef962&data=05%7C01%7CChi.Lo%40microsoft.com%7Cd8b23c3a4c0b4dcce9b408dbbd9309de%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638312211465211140%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=%2BmVZU5iLD97B3YBPdHZP7jOQ2dGoleI3R0mSMVgopG4%3D&reserved=0) and the [discussion](https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FNVIDIA%2FTensorRT%2Fissues%2F846&data=05%7C01%7CChi.Lo%40microsoft.com%7Cd8b23c3a4c0b4dcce9b408dbbd9309de%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638312211465211140%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=c8v%2FK2UkQ%2FNbf8w1sHNDGsB2kxw4sSmkyQ2QuCs8Fs8%3D&reserved=0), it seems we should have different OptimizationProfile for different IExecutionContext which our current TRT EP doesn’t support regardless of using PerThreadContext implementation. Back out the PerThreadContext until we completely solve this issue.	2023-09-26 09:28:17 -07:00
Adam Pocock	aed43f429a	[java] Enable output pinning in OrtSession and OrtTrainingSession (#16835 )	2023-09-26 01:49:13 -07:00
Baiju Meswani	ccb73fd827	[On-Device Training] Expose Parameters through the Training API (#17364 )	2023-09-25 20:03:24 -07:00
aimilefth	95e8dfaea5	Update quant_utils.py/write_calibration_table (#17314 )	2023-09-25 15:56:03 -07:00
Changming Sun	a942bbf489	Update nodejs to 18.x (#17657 ) 1. Upgrade nodejs from 16.x to 18.x for Windows pipelines 2. Avoid using Azure DevOps "NodeTool" on Linux. The tool installs nodejs from internet or local disk cache. But we already moved all Linux tests to docker. So we do not need the installer anymore. 3. Remove some other unused code.	2023-09-25 14:12:11 -07:00
Yulong Wang	b2b1408608	[js/web] update browser launch cmd flags (#17658 ) ### Description update Chromium browser launch command line flags Canary already using dxc so no need to specify '--enable-dawn-features=use_dxc' for canary.	2023-09-25 12:24:46 -07:00
Yulong Wang	f50fa46fe0	[JSEP] allow DataTransfer to deal with zero sized input (#17661 ) ### Description allow DataTransfer to deal with zero sized input. This is a standalone fix for zero-sized tensor handling for JSEP DataTransfer. There are other components in JSEP not supporting zero-sized tensors need to be fixed.	2023-09-25 12:21:20 -07:00
Yulong Wang	fcfc2391b8	[JSEP] allow JsCustomAllocator to deal with zero sized input (#17660 ) ### Description allow JsCustomAllocator to deal with zero sized input. This is a standalone fix for zero-sized tensor handling for JsCustomAllocator. There are other components in JSEP not supporting zero-sized tensors need to be fixed.	2023-09-25 12:20:56 -07:00
Xavier Dupré	905faea3b2	Fix static quantization for QDQ and Percentile distribution (#17649 ) ### Description One quantization case was not covered by the current list of unit tests. This PR adds a unit test to cover that case with the fix. It fixes the issue #17619. ### Motivation and Context	2023-09-25 10:11:58 -07:00
Yulong Wang	df15a3a335	[js/web] configure 5GB memory space for webpack build (#17684 ) ### Description ort-web build step - webpack consumes the amount of memory on the edge of Node.js(V8)'s default max-old-space-size, so increase the default memory size to 5GB to avoid this issue.	2023-09-25 09:22:00 -07:00

1 2 3 4 5 ...

9700 commits