Kernel Explorer

Kernel Explorer connects GPU kernel code to a Python frontend to help develop, test, profile, and auto-tune GPU kernels. The initial scope is BERT-like models running on the ROCm execution provider (EP).

Build

#!/bin/bash

set -ex

build_dir="build"
config="Release"

rocm_home="/opt/rocm"

./build.sh --update \
    --build_dir ${build_dir} \
    --config ${config} \
    --cmake_extra_defines \
        CMAKE_HIP_COMPILER=${rocm_home}/llvm/bin/clang++ \
        onnxruntime_BUILD_KERNEL_EXPLORER=ON \
    --skip_submodule_sync --skip_tests \
    --use_rocm --rocm_home=${rocm_home} --nccl_home=${rocm_home} \
    --build_wheel

cmake --build ${build_dir}/${config} --target kernel_explorer --parallel

Run

The following steps use vector_add_test.py as an example, with the build configuration from the previous section (build_dir="build" and config="Release").

Set up the native library search path with the following environment variable:

export KERNEL_EXPLORER_BUILD_DIR="$(realpath build/Release)"
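
If the tests fail to import the native module, it can help to confirm that the variable points where you expect. The following is a minimal sketch, not part of Kernel Explorer itself: the helper names `resolve_build_dir` and `find_candidate_libraries` are hypothetical, and the glob pattern assumes that the native library's file name contains `kernel_explorer`, which may not match your build output.

```python
import os
from pathlib import Path


def resolve_build_dir(environ=None):
    """Return the directory named by KERNEL_EXPLORER_BUILD_DIR, or raise RuntimeError."""
    environ = os.environ if environ is None else environ
    value = environ.get("KERNEL_EXPLORER_BUILD_DIR")
    if not value:
        raise RuntimeError("KERNEL_EXPLORER_BUILD_DIR is not set")
    path = Path(value).resolve()
    if not path.is_dir():
        raise RuntimeError(f"{path} is not a directory")
    return path


def find_candidate_libraries(build_dir):
    """List files in build_dir whose names contain 'kernel_explorer'.

    Assumption: the native module follows this naming; adjust the pattern if not.
    """
    return sorted(p.name for p in Path(build_dir).glob("*kernel_explorer*") if p.is_file())


if __name__ == "__main__":
    try:
        build_dir = resolve_build_dir()
    except RuntimeError as err:
        print(f"error: {err}")
    else:
        print(f"build dir: {build_dir}")
        print(f"candidates: {find_candidate_libraries(build_dir)}")
```

An empty candidate list suggests that the kernel_explorer target has not been built in that directory yet.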

To test a kernel implementation, install pytest (pip install pytest) and then run:

pytest onnxruntime/python/tools/kernel_explorer/kernels/vector_add_test.py

To run the microbenchmarks:

python onnxruntime/python/tools/kernel_explorer/kernels/vector_add_test.py
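
The same file serves both commands: pytest collects the `test_` functions, while running the file directly executes its `__main__` block. The sketch below illustrates only that general layout. It is not the contents of vector_add_test.py: `vector_add` is a pure-Python stand-in for a GPU kernel binding, and `profile_vector_add` is a hypothetical helper that times it with the standard library.

```python
import timeit


def vector_add(x, y):
    """Stand-in for a GPU kernel binding; plain Python so the sketch runs anywhere."""
    return [a + b for a, b in zip(x, y)]


def test_vector_add():
    # pytest collects this function because its name starts with "test_".
    x = [1.0, 2.0, 3.0]
    y = [10.0, 20.0, 30.0]
    assert vector_add(x, y) == [11.0, 22.0, 33.0]


def profile_vector_add(size, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` single runs."""
    x = [float(i) for i in range(size)]
    y = [float(2 * i) for i in range(size)]
    return min(timeit.repeat(lambda: vector_add(x, y), number=1, repeat=repeats))


if __name__ == "__main__":
    # Runs only when the file is executed directly, not when pytest imports it.
    for size in (1024, 65536):
        best = profile_vector_add(size)
        print(f"size={size:>6}  best={best * 1e6:.1f} us")
```

Keeping the correctness check and the timing loop in one file means a kernel change can be verified and benchmarked against the same inputs.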

Kernel Explorer currently targets kernel developers rather than end users of the onnxruntime package, so it is not installed via setup.py.