onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-01 03:45:06 +00:00

Arthur Islamov fac3e33da5 [js/web] JSEP Attention & MultiHeadAttention (#17742)	2023-11-17 12:23:52 -08:00

### Description
This is a narrow implementation of Attention/MultiHeadAttention as it does not support:
a. inputs 5-7 for MHA
b. packed QKV/KV
c. past/present
d. attention mask

But it works well for StableDiffusion and can be extended later. It reduces VRAM usage as it combines many ops into few.
I've updated the demo here: https://islamov.ai/stable-diffusion-webgpu/ It takes ~13sec for 1 image with 20 steps on RTX3090Ti and about 25s on M1 Pro.
VRAM usage is about 8gb if you don't use img2img.

Going to focus on SDXL now.

---------

Co-authored-by: Guenther Schmuelling <guschmue@microsoft.com>
Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>

build.ts	Add "glue" between training WASM artifacts and training web (#17474)	2023-10-12 11:16:56 -07:00
generate-webgl-operator-md.ts	[js/webgpu] generate operator table for webgpu (#15954)	2023-05-20 12:20:41 -07:00
generate-webgpu-operator-md.ts	[js/web] JSEP Attention & MultiHeadAttention (#17742)	2023-11-17 12:23:52 -08:00
parse-profiler.ts	[JS/Web] WebGL Profiling Tool (#7724)	2021-05-18 06:31:00 -07:00
prepack.ts	[js] update prepack script to use exact version (#17484)	2023-09-13 00:07:16 -07:00
pull-prebuilt-wasm-artifacts.ts	Add training WASM generation to Web CI pipeline (#17319)	2023-09-08 15:49:47 -07:00
test-runner-cli-args.ts	[js/web] use esbuild to accelerate bundle build (#17745)	2023-10-06 13:37:37 -07:00
test-runner-cli.ts	[js/web] use esbuild to accelerate bundle build (#17745)	2023-10-06 13:37:37 -07:00
tsconfig.json	[js/web] set noUnusedParameters to true and fix a few bugs (#18404)	2023-11-15 09:16:29 -08:00