onnxruntime/docs/execution_providers/Vitis-AI-ExecutionProvider.md

<p align="center">
  <img src="./images/Vitis-AI.png">
</p>

# Vitis-AI Execution Provider

[Vitis-AI](https://github.com/Xilinx/Vitis-AI) is Xilinx's development stack for hardware-accelerated AI inference on Xilinx platforms, including both edge devices and Alveo cards. It consists of optimized IP, tools, libraries, models, and example designs. It is designed with high efficiency and ease of use in mind, unleashing the full potential of AI acceleration on Xilinx FPGA and ACAP.

The current Vitis-AI execution provider inside ONNXRuntime enables acceleration of Neural Network model inference using DPUv1. DPUv1 is a hardware accelerator for Convolutional Neural Networks (CNN) on top of the Xilinx [Alveo](https://www.xilinx.com/products/boards-and-kits/alveo.html) platform and targets U200 and U250 accelerator cards.

On this page you will find information on how to [build](#Build) ONNXRuntime with Vitis-AI and on how to [get started](#Getting-started) with an example.

## Build

For building ONNXRuntime with the Vitis-AI execution provider, you will have to setup the hardware environment and build the docker, see [build steps](#Hardware-setup-and-docker-build).

### System requirements

The following table lists system requirements for running docker containers as well as Alveo cards.  


| **Component**                                       | **Requirement**                                            |
|-----------------------------------------------------|------------------------------------------------------------|
| Motherboard                                         | PCI Express 3\.0\-compliant with one dual\-width x16 slot  |
| System Power Supply                                 | 225W                                                       |
| Operating System                                    | Ubuntu 16\.04, 18\.04                                      |
|                                                     | CentOS 7\.4, 7\.5                                          |
|                                                     | RHEL 7\.4, 7\.5                                            |
| CPU                                                 | Intel i3/i5/i7/i9/Xeon 64-bit CPU                          |
| GPU \(Optional to accelerate quantization\)         | NVIDIA GPU with a compute capability > 3.0                 |
| CUDA Driver \(Optional to accelerate quantization\) | nvidia\-410                                                |
| FPGA                                                | Xilinx Alveo U200 or U250                                  |
| Docker Version                                      | 19\.03\.1                                                  |

### Hardware setup and docker build

1. Clone the Vitis AI repository:
    ```
    git clone https://github.com/xilinx/vitis-ai
    ```
2. Install the Docker, and add the user to the docker group. Link the user to docker installation instructions from the following docker's website:
    * https://docs.docker.com/install/linux/docker-ce/ubuntu/
    * https://docs.docker.com/install/linux/docker-ce/centos/
    * https://docs.docker.com/install/linux/linux-postinstall/
3. Any GPU instructions will have to be separated from Vitis AI.
4. Set up Vitis AI to target Alveo cards. To target Alveo cards with Vitis AI for machine learning workloads, you must install the following software components:
    * Xilinx Runtime (XRT)
    * Alveo Deployment Shells (DSAs)
    * Xilinx Resource Manager (XRM) (xbutler)
    * Xilinx Overlaybins (Accelerators to Dynamically Load - binary programming files)

    While it is possible to install all of these software components individually, a script has been provided to automatically install them at once. To do so:
      * Run the following commands:
        ```
        cd Vitis-AI/alveo/packages
        sudo su
        ./install.sh
        ```
      * Power cycle the system.
5. Build and start the ONNXRuntime Vitis-AI Docker Container.
   ```
   cd {onnxruntime-root}/dockerfiles
   docker build -t onnxruntime-vitisai -f Dockerfile.vitisai .
   ./scripts/docker_run_vitisai.sh
   ```
   
   Setup inside container
   ```
   source /opt/xilinx/xrt/setup.sh
   conda activate vitis-ai-tensorflow
   ```

## Getting started

### On-the-fly quantization

Usually, to be able to accelerate inference of Neural Network models with Vitis-AI DPU accelerators, those models need to quantized upfront. In the ONNXRuntime Vitis-AI execution provider we make use of on-the-fly quantization to remove this additional preprocessing step. In this flow, one doesn't need to quantize his/her model upfront but can make use of the typical inference execution calls (InferenceSession.run) to quantize the model on-the-fly using the first N inputs that are provided (see more information below). This will set up and calibrate the Vitis-AI DPU and from that point onwards inference will be accelerated for all next inputs.

### Config/Settings

A couple of environment variables can be used to customize the Vitis-AI execution provider.

| **Environment Variable**   | **Default if unset**      | **Explanation**                                         |
|----------------------------|---------------------------|---------------------------------------------------------|
| PX_QUANT_SIZE              | 128                    | The number of inputs that will be used for quantization (necessary for Vitis-AI acceleration) |
| PX_BUILD_DIR               | Use the on-the-fly quantization flow | Loads the quantization and compilation information from the provided build directory and immediately starts Vitis-AI hardware acceleration. This configuration can be used if the model has been executed before using on-the-fly quantization during which the quantization and comilation information was cached in a build directory. |

### Samples

When using python, you can base yourself on the following example:

```
# Import pyxir before onnxruntime
import pyxir
import pyxir.frontend.onnx
import pyxir.contrib.dpuv1.dpuv1

import onnxruntime

# Add other imports 
# ...

# Load inputs and do preprocessing
# ...

# Create an inference session using the Vitis-AI execution provider
session = onnxruntime.InferenceSession('[model_file].onnx', None,["VitisAIExecutionProvider"])

# First N (default = 128) inputs are used for quantization calibration and will
#   be executed on the CPU
# This config can be changed by setting the 'PX_QUANT_SIZE' (e.g. export PX_QUANT_SIZE=64)
imput_name = [...]
outputs = [session.run([], {input_name: calib_inputs[i]})[0] for i in range(128)]

# Afterwards, computations will be accelerated on the FPGA
input_data = [...]
result = session.run([], {input_name: input_data})
```
Initial release of Vitis-AI Execution Provider (#3771) * Initial release of Vitis-AI Execution Provider * Add documentation, fix for onnxruntime::Model changes and use stringstream instead of file dump for model passing * - Add Vitis-AI docker file - Add online quantization flow Vitis-AI execution provider - Fix remarks * - Add fatal error build message for Vitis-AI cmake build on Windows - Fix pep8 issue in build.py - Add Vitis-AI execution provider example in docs Co-authored-by: Elliott Delaye <elliott@xilinx.com> Co-authored-by: Jorn Tuyls <jornt@xilinx.com> Co-authored-by: Jorn Tuyls <jtuyls@users.noreply.github.com> 2020-05-19 12:32:32 +00:00			`<p align="center">`
			`<img src="./images/Vitis-AI.png">`
			`</p>`

			`# Vitis-AI Execution Provider`

			`[Vitis-AI](https://github.com/Xilinx/Vitis-AI) is Xilinx's development stack for hardware-accelerated AI inference on Xilinx platforms, including both edge devices and Alveo cards. It consists of optimized IP, tools, libraries, models, and example designs. It is designed with high efficiency and ease of use in mind, unleashing the full potential of AI acceleration on Xilinx FPGA and ACAP.`

[Vitis-AI EP] Fix to enable multi-output subgraphs inside Vitis-AI EP + edit docs (#4171) 2020-06-13 11:56:07 +00:00			`The current Vitis-AI execution provider inside ONNXRuntime enables acceleration of Neural Network model inference using DPUv1. DPUv1 is a hardware accelerator for Convolutional Neural Networks (CNN) on top of the Xilinx [Alveo](https://www.xilinx.com/products/boards-and-kits/alveo.html) platform and targets U200 and U250 accelerator cards.`

			`On this page you will find information on how to [build](#Build) ONNXRuntime with Vitis-AI and on how to [get started](#Getting-started) with an example.`

Initial release of Vitis-AI Execution Provider (#3771) * Initial release of Vitis-AI Execution Provider * Add documentation, fix for onnxruntime::Model changes and use stringstream instead of file dump for model passing * - Add Vitis-AI docker file - Add online quantization flow Vitis-AI execution provider - Fix remarks * - Add fatal error build message for Vitis-AI cmake build on Windows - Fix pep8 issue in build.py - Add Vitis-AI execution provider example in docs Co-authored-by: Elliott Delaye <elliott@xilinx.com> Co-authored-by: Jorn Tuyls <jornt@xilinx.com> Co-authored-by: Jorn Tuyls <jtuyls@users.noreply.github.com> 2020-05-19 12:32:32 +00:00			`## Build`

[Vitis-AI EP] Fix to enable multi-output subgraphs inside Vitis-AI EP + edit docs (#4171) 2020-06-13 11:56:07 +00:00			`For building ONNXRuntime with the Vitis-AI execution provider, you will have to setup the hardware environment and build the docker, see [build steps](#Hardware-setup-and-docker-build).`
Initial release of Vitis-AI Execution Provider (#3771) * Initial release of Vitis-AI Execution Provider * Add documentation, fix for onnxruntime::Model changes and use stringstream instead of file dump for model passing * - Add Vitis-AI docker file - Add online quantization flow Vitis-AI execution provider - Fix remarks * - Add fatal error build message for Vitis-AI cmake build on Windows - Fix pep8 issue in build.py - Add Vitis-AI execution provider example in docs Co-authored-by: Elliott Delaye <elliott@xilinx.com> Co-authored-by: Jorn Tuyls <jornt@xilinx.com> Co-authored-by: Jorn Tuyls <jtuyls@users.noreply.github.com> 2020-05-19 12:32:32 +00:00
			`### System requirements`

			`The following table lists system requirements for running docker containers as well as Alveo cards.`


			`\| Component \| Requirement \|`
			`\|-----------------------------------------------------\|------------------------------------------------------------\|`
			`\| Motherboard \| PCI Express 3\.0\-compliant with one dual\-width x16 slot \|`
			`\| System Power Supply \| 225W \|`
			`\| Operating System \| Ubuntu 16\.04, 18\.04 \|`
			`\| \| CentOS 7\.4, 7\.5 \|`
			`\| \| RHEL 7\.4, 7\.5 \|`
			`\| CPU \| Intel i3/i5/i7/i9/Xeon 64-bit CPU \|`
			`\| GPU \(Optional to accelerate quantization\) \| NVIDIA GPU with a compute capability > 3.0 \|`
			`\| CUDA Driver \(Optional to accelerate quantization\) \| nvidia\-410 \|`
			`\| FPGA \| Xilinx Alveo U200 or U250 \|`
			`\| Docker Version \| 19\.03\.1 \|`

[Vitis-AI EP] Fix to enable multi-output subgraphs inside Vitis-AI EP + edit docs (#4171) 2020-06-13 11:56:07 +00:00			`### Hardware setup and docker build`
Initial release of Vitis-AI Execution Provider (#3771) * Initial release of Vitis-AI Execution Provider * Add documentation, fix for onnxruntime::Model changes and use stringstream instead of file dump for model passing * - Add Vitis-AI docker file - Add online quantization flow Vitis-AI execution provider - Fix remarks * - Add fatal error build message for Vitis-AI cmake build on Windows - Fix pep8 issue in build.py - Add Vitis-AI execution provider example in docs Co-authored-by: Elliott Delaye <elliott@xilinx.com> Co-authored-by: Jorn Tuyls <jornt@xilinx.com> Co-authored-by: Jorn Tuyls <jtuyls@users.noreply.github.com> 2020-05-19 12:32:32 +00:00
			`1. Clone the Vitis AI repository:`
			```
			`git clone https://github.com/xilinx/vitis-ai`
			```
			`2. Install the Docker, and add the user to the docker group. Link the user to docker installation instructions from the following docker's website:`
			`* https://docs.docker.com/install/linux/docker-ce/ubuntu/`
			`* https://docs.docker.com/install/linux/docker-ce/centos/`
			`* https://docs.docker.com/install/linux/linux-postinstall/`
			`3. Any GPU instructions will have to be separated from Vitis AI.`
			`4. Set up Vitis AI to target Alveo cards. To target Alveo cards with Vitis AI for machine learning workloads, you must install the following software components:`
			`* Xilinx Runtime (XRT)`
			`* Alveo Deployment Shells (DSAs)`
			`* Xilinx Resource Manager (XRM) (xbutler)`
			`* Xilinx Overlaybins (Accelerators to Dynamically Load - binary programming files)`

			`While it is possible to install all of these software components individually, a script has been provided to automatically install them at once. To do so:`
			`* Run the following commands:`
			```
			`cd Vitis-AI/alveo/packages`
			`sudo su`
			`./install.sh`
			```
			`* Power cycle the system.`
			`5. Build and start the ONNXRuntime Vitis-AI Docker Container.`
			```
			`cd {onnxruntime-root}/dockerfiles`
			`docker build -t onnxruntime-vitisai -f Dockerfile.vitisai .`
			`./scripts/docker_run_vitisai.sh`
			```

			`Setup inside container`
			```
			`source /opt/xilinx/xrt/setup.sh`
			`conda activate vitis-ai-tensorflow`
			```

[Vitis-AI EP] Fix to enable multi-output subgraphs inside Vitis-AI EP + edit docs (#4171) 2020-06-13 11:56:07 +00:00			`## Getting started`

			`### On-the-fly quantization`

			Usually, to be able to accelerate inference of Neural Network models with Vitis-AI DPU accelerators, those models need to quantized upfront. In the ONNXRuntime Vitis-AI execution provider we make use of on-the-fly quantization to remove this additional preprocessing step. In this flow, one doesn't need to quantize his/her model upfront but can make use of the typical inference execution calls (InferenceSession.run) to quantize the model on-the-fly using the first N inputs that are provided (see more information below). This will set up and calibrate the Vitis-AI DPU and from that point onwards inference will be accelerated for all next inputs.

			`### Config/Settings`

			`A couple of environment variables can be used to customize the Vitis-AI execution provider.`
Initial release of Vitis-AI Execution Provider (#3771) * Initial release of Vitis-AI Execution Provider * Add documentation, fix for onnxruntime::Model changes and use stringstream instead of file dump for model passing * - Add Vitis-AI docker file - Add online quantization flow Vitis-AI execution provider - Fix remarks * - Add fatal error build message for Vitis-AI cmake build on Windows - Fix pep8 issue in build.py - Add Vitis-AI execution provider example in docs Co-authored-by: Elliott Delaye <elliott@xilinx.com> Co-authored-by: Jorn Tuyls <jornt@xilinx.com> Co-authored-by: Jorn Tuyls <jtuyls@users.noreply.github.com> 2020-05-19 12:32:32 +00:00
[Vitis-AI EP] Fix to enable multi-output subgraphs inside Vitis-AI EP + edit docs (#4171) 2020-06-13 11:56:07 +00:00			`\| Environment Variable \| Default if unset \| Explanation \|`
			`\|----------------------------\|---------------------------\|---------------------------------------------------------\|`
			`\| PX_QUANT_SIZE \| 128 \| The number of inputs that will be used for quantization (necessary for Vitis-AI acceleration) \|`
			`\| PX_BUILD_DIR \| Use the on-the-fly quantization flow \| Loads the quantization and compilation information from the provided build directory and immediately starts Vitis-AI hardware acceleration. This configuration can be used if the model has been executed before using on-the-fly quantization during which the quantization and comilation information was cached in a build directory. \|`
Initial release of Vitis-AI Execution Provider (#3771) * Initial release of Vitis-AI Execution Provider * Add documentation, fix for onnxruntime::Model changes and use stringstream instead of file dump for model passing * - Add Vitis-AI docker file - Add online quantization flow Vitis-AI execution provider - Fix remarks * - Add fatal error build message for Vitis-AI cmake build on Windows - Fix pep8 issue in build.py - Add Vitis-AI execution provider example in docs Co-authored-by: Elliott Delaye <elliott@xilinx.com> Co-authored-by: Jorn Tuyls <jornt@xilinx.com> Co-authored-by: Jorn Tuyls <jtuyls@users.noreply.github.com> 2020-05-19 12:32:32 +00:00
			`### Samples`

[Vitis-AI EP] Fix to enable multi-output subgraphs inside Vitis-AI EP + edit docs (#4171) 2020-06-13 11:56:07 +00:00			`When using python, you can base yourself on the following example:`
Initial release of Vitis-AI Execution Provider (#3771) * Initial release of Vitis-AI Execution Provider * Add documentation, fix for onnxruntime::Model changes and use stringstream instead of file dump for model passing * - Add Vitis-AI docker file - Add online quantization flow Vitis-AI execution provider - Fix remarks * - Add fatal error build message for Vitis-AI cmake build on Windows - Fix pep8 issue in build.py - Add Vitis-AI execution provider example in docs Co-authored-by: Elliott Delaye <elliott@xilinx.com> Co-authored-by: Jorn Tuyls <jornt@xilinx.com> Co-authored-by: Jorn Tuyls <jtuyls@users.noreply.github.com> 2020-05-19 12:32:32 +00:00
			```
			`# Import pyxir before onnxruntime`
			`import pyxir`
			`import pyxir.frontend.onnx`
			`import pyxir.contrib.dpuv1.dpuv1`

			`import onnxruntime`

			`# Add other imports`
			`# ...`

			`# Load inputs and do preprocessing`
			`# ...`

			`# Create an inference session using the Vitis-AI execution provider`
			`session = onnxruntime.InferenceSession('[model_file].onnx', None,["VitisAIExecutionProvider"])`

			`# First N (default = 128) inputs are used for quantization calibration and will`
			`# be executed on the CPU`
[Vitis-AI EP] Fix to enable multi-output subgraphs inside Vitis-AI EP + edit docs (#4171) 2020-06-13 11:56:07 +00:00			`# This config can be changed by setting the 'PX_QUANT_SIZE' (e.g. export PX_QUANT_SIZE=64)`
Initial release of Vitis-AI Execution Provider (#3771) * Initial release of Vitis-AI Execution Provider * Add documentation, fix for onnxruntime::Model changes and use stringstream instead of file dump for model passing * - Add Vitis-AI docker file - Add online quantization flow Vitis-AI execution provider - Fix remarks * - Add fatal error build message for Vitis-AI cmake build on Windows - Fix pep8 issue in build.py - Add Vitis-AI execution provider example in docs Co-authored-by: Elliott Delaye <elliott@xilinx.com> Co-authored-by: Jorn Tuyls <jornt@xilinx.com> Co-authored-by: Jorn Tuyls <jtuyls@users.noreply.github.com> 2020-05-19 12:32:32 +00:00			`imput_name = [...]`
			`outputs = [session.run([], {input_name: calib_inputs[i]})[0] for i in range(128)]`

			`# Afterwards, computations will be accelerated on the FPGA`
			`input_data = [...]`
			`result = session.run([], {input_name: input_data})`
			```