onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-14 20:48:00 +00:00

Xavier Dupré e726151b5c Introduce float 8 types (#14731)

### Description
The PR implements FloatE4M3FN, FloatE5M2, FloatE4M3FNUZ, FloatE5M2FNUZ as described in PR https://github.com/onnx/onnx/pull/4805. It uses the CUDA API to cast float/half to float8 if CUDA>=11.8, and a custom implementation if CUDA<11.8.

* It implements Cast, QuantizeLinear, DequantizeLinear for all types on CPU, and only for types FloatE4M3FN, FloatE5M2 on CUDA.
* It extends the supported types for control flow operators, Shape, Reshape, Identity, If, Loop, Scan.
* It implements Equal(19).
* Cast, QuantizeLinear, DequantizeLinear operators now support a parameter `saturate`, only valid for float 8 types. It is true by default. In that case, any value out of range is converted into the maximum float 8 value. If false, the value becomes infinite.
* QuantizeLinear, DequantizeLinear now support multiple scales on CUDA (and ROCm by extension): scale is a 1D tensor with one scale per channel.

### Motivation and Context
Supports latest onnx version.

Fixes [AB#15395](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/15395)

---------

Co-authored-by: Xavier Dupre <xadupre@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>

2023-05-30 13:25:58 -07:00
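
The `saturate` behavior described in the commit message can be illustrated with a minimal sketch. This is not the onnxruntime implementation: the function name `clamp_to_e5m2_range` is hypothetical, the example covers FloatE5M2 only (largest finite value 57344), and it models only the out-of-range handling, with no rounding to 8 bits.

```python
import math

# Largest finite value representable in FloatE5M2 (1.75 * 2**15).
E5M2_MAX = 57344.0


def clamp_to_e5m2_range(x: float, saturate: bool = True) -> float:
    """Hypothetical helper: apply only the out-of-range rule, no 8-bit rounding."""
    if math.isnan(x):
        return x  # NaN is passed through unchanged
    if abs(x) <= E5M2_MAX:
        return x  # in range: a real cast would round to the nearest float8 here
    if saturate:
        # saturate=True: out-of-range values become the maximum float 8 value
        return math.copysign(E5M2_MAX, x)
    # saturate=False: out-of-range values become infinite
    return math.copysign(math.inf, x)


print(clamp_to_e5m2_range(1e6))                   # saturated to the maximum
print(clamp_to_e5m2_range(-1e6, saturate=False))  # negative infinity
```

A real cast would also round in-range values to the nearest representable float 8 value, which this sketch deliberately leaves out.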
..
util	Introduce float 8 types (#14731)	2023-05-30 13:25:58 -07:00
__init__.py
check_onnx_model_mobile_usability.py	Format all python files under onnxruntime with black and isort (#11324)	2022-04-26 09:35:16 -07:00
convert_onnx_models_to_ort.py	Format all python files under onnxruntime with black and isort (#11324)	2022-04-26 09:35:16 -07:00
create_reduced_build_config.py	Bump ruff in CI (#15533)	2023-04-17 10:11:44 -07:00
dump_ort_model.py	Adopt linrtunner as the linting tool - take 2 (#15085)	2023-03-24 15:29:03 -07:00
dump_subgraphs.py	Format all python files under onnxruntime with black and isort (#11324)	2022-04-26 09:35:16 -07:00
example_operator_perf_test.py	Adopt linrtunner as the linting tool - take 2 (#15085)	2023-03-24 15:29:03 -07:00
find_optimizer_opset_version_updates_required.py	Adopt linrtunner as the linting tool - take 2 (#15085)	2023-03-24 15:29:03 -07:00
gen_contrib_doc.py	Bump ruff in CI (#15533)	2023-04-17 10:11:44 -07:00
gen_opkernel_doc.py	Bump ruff in CI (#15533)	2023-04-17 10:11:44 -07:00
gen_ort_mobile_pkg_doc.py	Adopt linrtunner as the linting tool - take 2 (#15085)	2023-03-24 15:29:03 -07:00
get_submodules.py	Bump ruff in CI (#15533)	2023-04-17 10:11:44 -07:00
onnx2tfevents.py	Tool to Convert ONNX Model to TFEvents (#14160)	2023-01-28 15:09:15 +08:00
onnx_test_data_utils.py	Bump ruff in CI (#15533)	2023-04-17 10:11:44 -07:00
ort_test_dir_utils.py	Bump ruff in CI (#15533)	2023-04-17 10:11:44 -07:00
PythonTools.md	Adopt linrtunner as the linting tool - take 2 (#15085)	2023-03-24 15:29:03 -07:00
remove_initializer_from_input.py	Format all python files under onnxruntime with black and isort (#11324)	2022-04-26 09:35:16 -07:00
run_adb.py	[React Native CI] Record more info to debug E2E test (#13329)	2022-10-18 17:21:28 -07:00
run_android_emulator.py	Adopt linrtunner as the linting tool - take 2 (#15085)	2023-03-24 15:29:03 -07:00
run_CIs_for_external_pr.py	Add new QNN CIs to azp run tool (#16109)	2023-05-27 08:46:16 +10:00
sparsify_initializers.py	Adopt linrtunner as the linting tool - take 2 (#15085)	2023-03-24 15:29:03 -07:00
update_version.py	Update VERSION_NUMBER (#15773)	2023-05-03 15:07:34 -07:00