onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-21 19:18:55 +00:00

Tianlei Wu af80542e65 Update optimize_pipeline for SDXL (#17536)	2023-09-15 10:17:20 -07:00

- [x] Optimize SDXL models exported by optimum.
- [x] Enable it to run locally instead of using module.
- [x] Detect external data file in original model, and save with same format by default.
- [x] Add tests

### Example

```
pip install optimum transformers diffusers onnx "onnxruntime-gpu>=1.16"
optimum-cli export onnx --model stabilityai/stable-diffusion-xl-base-1.0 --task stable-diffusion-xl ./sd_xl_base_onnx
python -m onnxruntime.transformers.models.stable_diffusion.optimize_pipeline -i ./sd_xl_base_onnx -o ./sd_xl_base_fp16 --float16
```

### Known issues

(1) The VAE decoder cannot be converted to float16; otherwise the output image is black.

(2) To use the float16 models, optimum needs a minor change that converts the VAE decoder inputs from float16 to float32, since the VAE decoder is kept as float32. The change is to append a line like the following after [this line](`afd2b5a366/optimum/pipelines/diffusers/pipeline_stable_diffusion_xl.py (L483)`):

```
latents = latents.astype(np.float32)
```
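Known issue (2) comes down to a dtype cast at the boundary between the float16 part of the pipeline and the float32 VAE decoder. The sketch below shows that cast in isolation with NumPy. The helper name `to_vae_decoder_input` and the latent shape are assumptions made for illustration; they are not part of optimum or onnxruntime.

```python
import numpy as np


def to_vae_decoder_input(latents: np.ndarray) -> np.ndarray:
    """Hypothetical helper: cast latents to float32 for a VAE decoder kept in float32."""
    return latents.astype(np.float32)


# Stand-in for float16 latents produced upstream; the shape is illustrative only.
rng = np.random.default_rng(0)
latents = rng.standard_normal((1, 4, 128, 128)).astype(np.float16)

decoder_input = to_vae_decoder_input(latents)
print(latents.dtype, "->", decoder_input.dtype)  # float16 -> float32
```

The cast only widens the storage type. Values already rounded to float16 are carried over unchanged, so it restores dtype compatibility with the decoder, not precision.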
common	[C#, CPP] Introduce Float16/BFloat16 support and tests for C#, C++ (#16506)	2023-07-14 10:46:52 -07:00
contrib_ops	Flash Attention v2 MHA (#17227)	2023-08-31 13:52:21 -07:00
custom_op_registration
debug_node_inputs_outputs
framework	Fix float 8 rounding on CPU (#16940)	2023-09-07 20:48:25 +02:00
fuzzing
global_thread_pools	Allow RunAsync with global TP (#17157)	2023-08-15 14:29:10 -07:00
ir
logging_apis
mlas	Upgrade Centos7 to Alamlinux8 (#16907)	2023-08-29 21:05:36 -07:00
onnx	[QNN EP] Update QNN SDK to version 2.14.1 (#17467)	2023-09-11 21:07:50 -07:00
opaque_api
optimizer	Fix CPU constant folding not reverting the node to its previous EP (#17399)	2023-09-11 17:38:37 -07:00
perftest	Add checks for session options and fix gsubgraph fallback exceptions (#17095)	2023-08-16 10:06:25 -07:00
platform
proto
providers	[QNN EP] Enable Pad op support for QNN EP (#17508)	2023-09-14 14:22:45 -07:00
python	Update optimize_pipeline for SDXL (#17536)	2023-09-15 10:17:20 -07:00
quantization
shared_lib	Introduce output type/shape validation (#17301)	2023-09-05 15:25:12 -07:00
testdata	Fix CPU constant folding not reverting the node to its previous EP (#17399)	2023-09-11 17:38:37 -07:00
unittest_main	Enable verbose logging in unit test program with environment variable. (#17133)	2023-08-22 12:13:52 -07:00
util	[CoreML EP] Add Shape, Gather, and Slice ops (#17153)	2023-08-18 22:34:34 -07:00
wasm
win_getopt
xctest