onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-25 02:50:42 +00:00


Adam Louly c55c6255e0 Eliminate safe nodes that are followed by a shape node. (#16065)	2023-06-26 16:35:07 +08:00

### Description

Eliminate Cast operator if Shape is the next one.

### Motivation and Context

#### Cast

When working with onnx opset 15 and above, the shape operator now accepts all types of variables. This change is documented in the [onnx Changelog](https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Shape-15). As a result, casting variables right before the shape operation becomes unnecessary. Removing these unnecessary casts will improve the graph and potentially provide better performance gains.

## Results

On:

    torchrun examples/onnxruntime/training/language-modeling/run_clm.py --model_name_or_path gpt2 --do_train --overwrite_output_dir --output_dir ./outputs/ --seed 1337 --fp16 True --per_device_train_batch_size 4 --num_train_epochs 1 --dataset_name wikitext --dataset_config_name wikitext-2-raw-v1 --learning_rate 2e-5 --report_to none --optim adamw_ort_fused

Without changes:

    ***** train metrics *****
    epoch = 1.0
    train_loss = 3.2981
    train_runtime = 0:02:13.29
    train_samples = 2318
    train_samples_per_second = 17.39
    train_steps_per_second = 4.351

With my changes:

    ***** train metrics *****
    epoch = 1.0
    train_loss = 3.2981
    train_runtime = 0:02:08.98
    train_samples = 2318
    train_samples_per_second = 17.971
    train_steps_per_second = 4.497

We see around 3% gain.

---------

Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
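The Cast elimination described in the commit message can be illustrated with a minimal sketch. This is not the ONNX Runtime implementation: the `Node` class and the `eliminate_cast_before_shape` function are hypothetical names made up for illustration, the graph is a plain list of nodes, and the opset 15 check mentioned in the commit is assumed to have already passed.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a graph node; not an ONNX Runtime type."""
    op_type: str
    inputs: list
    outputs: list


def eliminate_cast_before_shape(nodes, graph_outputs=()):
    """Drop Cast nodes whose output is read only by Shape nodes (single pass)."""
    # Map each tensor name to the nodes that read it.
    consumers = {}
    for node in nodes:
        for name in node.inputs:
            consumers.setdefault(name, []).append(node)

    kept = []
    for node in nodes:
        out = node.outputs[0] if node.outputs else None
        readers = consumers.get(out, [])
        removable = (
            node.op_type == "Cast"
            and out not in graph_outputs
            and len(readers) > 0
            and all(r.op_type == "Shape" for r in readers)
        )
        if removable:
            # Shape ignores the element type, so each reader can take the
            # Cast's own input directly.
            for reader in readers:
                reader.inputs = [node.inputs[0] if n == out else n
                                 for n in reader.inputs]
        else:
            kept.append(node)
    return kept


graph = [
    Node("Cast", ["x"], ["x_fp32"]),          # feeds only Shape: removable
    Node("Shape", ["x_fp32"], ["x_shape"]),
    Node("Cast", ["y"], ["y_fp32"]),          # also feeds Relu: must stay
    Node("Shape", ["y_fp32"], ["y_shape"]),
    Node("Relu", ["y_fp32"], ["y_relu"]),
]
optimized = eliminate_cast_before_shape(graph)
print([(n.op_type, n.inputs) for n in optimized])
```

The second Cast is kept because its output also feeds a non-Shape consumer, which is the sense in which only "safe" nodes are removed in this sketch.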
api_tests_without_env	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
common	Move tests from core/providers/cuda/test/* to test/providers/cuda/ and refactor CUDA UT (#16161)	2023-06-20 14:54:55 -07:00
contrib_ops	Use M_PI to replace 3.14 constants (#16421)	2023-06-20 15:09:10 -07:00
custom_op_registration	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
debug_node_inputs_outputs	Separate out operator vs model testing. (#16228)	2023-06-17 12:58:57 +10:00
framework	Move tests from core/providers/cuda/test/* to test/providers/cuda/ and refactor CUDA UT (#16161)	2023-06-20 14:54:55 -07:00
fuzzing	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
global_thread_pools	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
ir	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
logging_apis	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
mlas	NhwcFusedConv: Add before Activation (#15837)	2023-05-08 21:02:35 -07:00
onnx	ExecutionProvider API refactor - move allocator from EP level to SessionState level and indexed by OrtDevice (#15833)	2023-06-19 17:44:45 -07:00
opaque_api	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
optimizer	Eliminate safe nodes that are followed by a shape node. (#16065)	2023-06-26 16:35:07 +08:00
perftest	CUDA graph support for TRT EP (#16081)	2023-06-21 09:36:45 -07:00
platform	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
proto
providers	Fix file list for test of build with IO debug (#16474)	2023-06-26 16:36:22 +10:00
python	Allow saving of large models after optimization (github issue 12882) (#16440)	2023-06-21 22:46:26 -07:00
quantization	ExecutionProvider API refactor - move allocator from EP level to SessionState level and indexed by OrtDevice (#15833)	2023-06-19 17:44:45 -07:00
shared_lib	CUDA graph support for TRT EP (#16081)	2023-06-21 09:36:45 -07:00
testdata	Eliminate safe nodes that are followed by a shape node. (#16065)	2023-06-26 16:35:07 +08:00
unittest_main	[TensorRT EP] avoid excessive library load/unload overhead when running unit tests. (#15639)	2023-04-24 14:43:13 -07:00
util	ExecutionProvider API refactor - move allocator from EP level to SessionState level and indexed by OrtDevice (#15833)	2023-06-19 17:44:45 -07:00
wasm	Enable Web CI on Linux (#16419)	2023-06-22 15:42:58 +08:00
win_getopt	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
xctest	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00