onnxruntime/cmake
Chi Lo c964da7ea2
FasterTransformer model wrapper using custom op (#15013)
### Description
<!-- Describe your changes. -->
We are introducing the FasterTransfomer model-level integration using
ORT [custom op runtime
wrapper](https://github.com/microsoft/onnxruntime/pull/13427).
In order to make the FT wrapper/integration work, two things need to be
done:

- New API `KernelInfoGetConstantInput_tensor`. (Done in this PR)
During custom op kernel initialization, it needs to get the model
weights (saved as node's constant inputs) ready for FT's weights
instantiation. What's why we need to add this new API to make kernel
info capable of getting constant inputs.

- Custom op and custom op kernel to wrap FT model. (Will provide in
onnxruntime extensions or inference examples)
During custom op kernel initialization, it can fetch attributes from
kernel info to determine which kind of FT model instance create. During
custom op kernel compute/inference, it can get input/output from kernel
context and then assign input/output buffers for model instance to run.
2023-03-20 09:05:30 -07:00
..
external ROCm Flash Attention (#14838) 2023-03-16 10:39:58 +08:00
patches ROCm Flash Attention (#14838) 2023-03-16 10:39:58 +08:00
tensorboard Improve dependency management (#13523) 2022-12-01 09:51:59 -08:00
adjust_global_compile_flags.cmake Use safe allocator for JNI code (#13999) 2023-03-08 11:40:55 -08:00
CMakeLists.txt Use safe allocator for JNI code (#13999) 2023-03-08 11:40:55 -08:00
CMakeSettings.json
codeconv.runsettings
deps.txt Consume ONNX 1.13.1 in ONNX Runtime (#14812) 2023-03-02 14:57:35 -08:00
EnableVisualStudioCodeAnalysis.props Fix SDL warnings in CPU EP (#9975) 2021-12-19 20:54:29 -08:00
gdk_toolchain.cmake Enable building with a GDK (#11126) 2022-04-07 15:06:31 -07:00
Info.plist.in
libonnxruntime.pc.cmake.in
nuget_helpers.cmake
onnxruntime.cmake OnnxRuntime QNN EP (#14791) 2023-03-01 13:48:20 -08:00
onnxruntime_codegen_tvm.cmake Use target name for flatbuffers (#13991) 2022-12-20 11:44:02 -08:00
onnxruntime_common.cmake Enabling thread pool to be numa-aware (#13778) 2022-12-12 10:33:55 -08:00
onnxruntime_config.h.in Use safe allocator for JNI code (#13999) 2023-03-08 11:40:55 -08:00
onnxruntime_csharp.cmake Refactor training build options (#13964) 2023-01-03 13:28:16 -08:00
onnxruntime_eager.cmake Use target name for flatbuffers (#13991) 2022-12-20 11:44:02 -08:00
onnxruntime_flatbuffers.cmake Use target name for flatbuffers (#13991) 2022-12-20 11:44:02 -08:00
onnxruntime_framework.cmake Introduce collective ops to ort inference build (#14399) 2023-02-07 13:47:48 -08:00
onnxruntime_fuzz_test.cmake Fix fuzz test (#14385) 2023-01-22 22:17:43 -08:00
onnxruntime_graph.cmake Create dedicated build for training api (#14136) 2023-01-10 20:58:04 -08:00
onnxruntime_ios.toolchain.cmake
onnxruntime_java.cmake Update Gradle version (#14862) 2023-03-08 12:22:06 -08:00
onnxruntime_java_unittests.cmake [Java] Initial on device training support (#14027) 2023-03-08 10:01:08 -08:00
onnxruntime_kernel_explorer.cmake Add TuningContext for TunableOp (#14557) 2023-02-10 14:27:43 +08:00
onnxruntime_language_interop_ops.cmake Use target name for flatbuffers (#13991) 2022-12-20 11:44:02 -08:00
onnxruntime_mlas.cmake fix_macos (#15018) 2023-03-14 21:54:44 -07:00
onnxruntime_nodejs.cmake Change how to find npm (#15001) 2023-03-15 11:10:10 -07:00
onnxruntime_objectivec.cmake Remove SafeInt dependency from Objective-C API. (#13698) 2022-11-18 17:06:12 -08:00
onnxruntime_opschema_lib.cmake Use target name for flatbuffers (#13991) 2022-12-20 11:44:02 -08:00
onnxruntime_optimizer.cmake Create dedicated build for training api (#14136) 2023-01-10 20:58:04 -08:00
onnxruntime_providers.cmake fix miopen new API cannot be supported by ROCm5.2.3 (#15077) 2023-03-17 08:40:35 +08:00
onnxruntime_pyop.cmake Use target name for flatbuffers (#13991) 2022-12-20 11:44:02 -08:00
onnxruntime_python.cmake enable pybind for qnn ep (#14897) 2023-03-03 07:26:53 -08:00
onnxruntime_rocm_hipify.cmake [ROCm] support optimized Stable Diffusion model (#14980) 2023-03-14 23:15:37 +08:00
onnxruntime_session.cmake fix headers for training apis (#14350) 2023-01-19 10:26:53 -08:00
onnxruntime_snpe_provider.cmake Use target name for flatbuffers (#13991) 2022-12-20 11:44:02 -08:00
onnxruntime_training.cmake Create dedicated build for training api (#14136) 2023-01-10 20:58:04 -08:00
onnxruntime_unittests.cmake FasterTransformer model wrapper using custom op (#15013) 2023-03-20 09:05:30 -07:00
onnxruntime_util.cmake Improve dependency management (#13523) 2022-12-01 09:51:59 -08:00
onnxruntime_webassembly.cmake [js/web] support flag 'optimizedModelFilePath' in session options (#14355) 2023-02-24 15:50:15 -08:00
precompiled_header.cmake Fix Windows Store build (#8753) 2021-08-23 11:19:03 -07:00
Sdl.ruleset Update Sdl.ruleset to remove C26812 from the rules (#12695) 2022-09-01 20:05:20 -07:00
set_winapi_family_desktop.h
target_delayload.cmake Remove Windows Store specific code 2022-03-17 23:38:14 -07:00
uwp_stubs.h Fix Windows Store build (#8753) 2021-08-23 11:19:03 -07:00
wcos_rules_override.cmake
winml.cmake Use target name for flatbuffers (#13991) 2022-12-20 11:44:02 -08:00
winml_cppwinrt.cmake Fix Windows Store build (#8753) 2021-08-23 11:19:03 -07:00
winml_sdk_helpers.cmake
winml_unittests.cmake Use target name for flatbuffers (#13991) 2022-12-20 11:44:02 -08:00