onnxruntime/include/onnxruntime/core/framework
Xavier Dupré e726151b5c
Introduce float 8 types (#14731)
### Description
The PR implements FloatE4M3FN, FloatE5M2, FloatE4M3FNUZ, FloatE5M2FNUZ
as described in PR https://github.com/onnx/onnx/pull/4805. It uses the CUDA
API to cast float/half to float8 when CUDA>=11.8, and a custom implementation
when CUDA<11.8.

* It implements Cast, QuantizeLinear, and DequantizeLinear for all four types on
CPU, and only for FloatE4M3FN and FloatE5M2 on CUDA.
* It extends the supported types of Shape, Reshape, Identity, and the
control flow operators If, Loop, and Scan.
* It implements Equal(19).
* The Cast, QuantizeLinear, and DequantizeLinear operators now support a
parameter `saturate` that is only valid for float 8 types. It is true by
default, in which case any out-of-range value is converted to the
maximum float 8 value. If false, an out-of-range value becomes infinity
(or NaN for the float 8 types that have no infinity).
* QuantizeLinear and DequantizeLinear now support multiple scales on CUDA
(and ROCm by extension): the scale is a 1D tensor with one scale per channel.
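
To make the `saturate` and per-channel scale behaviour above concrete, here is a minimal C++ sketch. It is not the code from this PR: `FloatToE4M3FN` and `QuantizePerChannel` are hypothetical names, the tensor is assumed to be row-major with the channel on axis 0, the zero point is assumed to be 0, and rounding is assumed to be round-to-nearest-even (the default floating-point rounding mode). It relies only on the E4M3FN layout: 1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits, maximum finite value 448, no infinity, and NaN encoded as 0x7F.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: encode a float as an E4M3FN byte.
// Out-of-range values map to the maximum (0x7E, i.e. 448) when `saturate`
// is true, and to NaN (0x7F) otherwise, because E4M3FN has no infinity.
std::uint8_t FloatToE4M3FN(float v, bool saturate) {
  const int sign = std::signbit(v) ? 0x80 : 0x00;
  const int kMax = 0x7E;  // 448
  const int kNaN = 0x7F;
  if (std::isnan(v)) return static_cast<std::uint8_t>(sign | kNaN);
  const float a = std::fabs(v);
  if (std::isinf(a)) return static_cast<std::uint8_t>(sign | (saturate ? kMax : kNaN));
  if (a == 0.0f) return static_cast<std::uint8_t>(sign);

  int e = 0;
  std::frexp(a, &e);                // a = m * 2^e with m in [0.5, 1)
  int exp = std::max(e - 1, -6);    // -6 is the smallest normal exponent
  // Scale so that one unit equals the spacing of float8 values at this exponent.
  int q = static_cast<int>(std::nearbyint(std::ldexp(a, 3 - exp)));
  if (q == 16) {                    // rounding carried into the next exponent
    q = 8;
    ++exp;
  }
  if (exp > 8 || (exp == 8 && q > 14)) {  // above 448
    return static_cast<std::uint8_t>(sign | (saturate ? kMax : kNaN));
  }
  if (q < 8) return static_cast<std::uint8_t>(sign | q);  // subnormal or zero
  return static_cast<std::uint8_t>(sign | ((exp + 7) << 3) | (q - 8));
}

// Hypothetical per-channel quantization: x has shape [scales.size(), inner],
// stored row-major, and row c is divided by scales[c] before the cast.
std::vector<std::uint8_t> QuantizePerChannel(const std::vector<float>& x,
                                             const std::vector<float>& scales,
                                             std::size_t inner,
                                             bool saturate) {
  std::vector<std::uint8_t> y(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    const std::size_t c = i / inner;  // channel index of element i
    y[i] = FloatToE4M3FN(x[i] / scales[c], saturate);
  }
  return y;
}
```

With `x = {1, 1000, 1, 1000}`, `scales = {1, 4}` and `inner = 2`, the value 1000 in the first channel is out of range and becomes 0x7E (448) with `saturate = true` or 0x7F (NaN) with `saturate = false`, while 1000 in the second channel is first scaled to 250 and rounds to 256 (0x78).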

### Motivation and Context
Supports the latest ONNX version.

Fixes
[AB#15395](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/15395)

---------

Co-authored-by: Xavier Dupre <xadupre@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>
2023-05-30 13:25:58 -07:00
| File | Last commit | Date |
| --- | --- | --- |
| alloc_kind.h | | |
| allocator.h | New configuration to limit the arena extension (#15983) | 2023-05-25 02:19:07 -07:00 |
| buffer_deleter.h | Multi-stream execution support (#13495) | 2022-12-15 07:39:29 -08:00 |
| customregistry.h | | |
| data_types.h | Introduce float 8 types (#14731) | 2023-05-30 13:25:58 -07:00 |
| data_types_internal.h | Introduce float 8 types (#14731) | 2023-05-30 13:25:58 -07:00 |
| endian.h | | |
| execution_provider.h | ExecutionProvider API refactor - replace OrtMemoryInfo with OrtDevice (#15618) | 2023-05-01 10:06:00 -07:00 |
| float8.h | Introduce float 8 types (#14731) | 2023-05-30 13:25:58 -07:00 |
| float16.h | Run clang-format in CI (#15524) | 2023-04-18 09:26:58 -07:00 |
| framework_common.h | | |
| framework_provider_common.h | Add TRT plugins support using custom ops (#13847) | 2023-04-18 20:24:32 -07:00 |
| func_api.h | Run clang-format in CI (#15524) | 2023-04-18 09:26:58 -07:00 |
| kernel_def_builder.h | Remove onnxruntime_PYBIND_EXPORT_OPSCHEMA definition from onnxruntime (#15776) | 2023-05-03 13:08:35 -07:00 |
| kernel_registry.h | Remove onnxruntime_PYBIND_EXPORT_OPSCHEMA definition from onnxruntime (#15776) | 2023-05-03 13:08:35 -07:00 |
| op_kernel.h | ExecutionProvider API refactor - replace OrtMemoryInfo with OrtDevice (#15618) | 2023-05-01 10:06:00 -07:00 |
| op_kernel_context.h | ExecutionProvider API refactor - replace OrtMemoryInfo with OrtDevice (#15618) | 2023-05-01 10:06:00 -07:00 |
| op_kernel_info.h | ExecutionProvider API refactor - replace OrtMemoryInfo with OrtDevice (#15618) | 2023-05-01 10:06:00 -07:00 |
| op_node_proto_helper.h | Switch GSL to MS GSL 4.0.0 (#13416) | 2022-10-29 04:15:20 -07:00 |
| ort_value.h | Enable Opset11 Sequence Ops on DirectML, and make the CPU implementations agnostic to backend EP (#14442) | 2023-02-21 18:08:28 -08:00 |
| ortdevice.h | ExecutionProvider API refactor - replace OrtMemoryInfo with OrtDevice (#15618) | 2023-05-01 10:06:00 -07:00 |
| ortmemoryinfo.h | Run clang-format in CI (#15524) | 2023-04-18 09:26:58 -07:00 |
| provider_options.h | | |
| provider_options_utils.h | | |
| provider_shutdown.h | | |
| run_options.h | Adding RunOptions synchronization behaviour to C/C++ API (#14088) | 2023-02-07 19:59:28 -08:00 |
| sparse_tensor.h | Run clang-format in CI (#15524) | 2023-04-18 09:26:58 -07:00 |
| stream_handles.h | Run clang-format in CI (#15524) | 2023-04-18 09:26:58 -07:00 |
| tensor.h | Enable Opset11 Sequence Ops on DirectML, and make the CPU implementations agnostic to backend EP (#14442) | 2023-02-21 18:08:28 -08:00 |
| tensor_shape.h | Switch GSL to MS GSL 4.0.0 (#13416) | 2022-10-29 04:15:20 -07:00 |
| to_tensor_proto_element_type.h | Introduce float 8 types (#14731) | 2023-05-30 13:25:58 -07:00 |