Here are answers to some questions commonly raised by ONNX Runtime users and brought up in [Issues](https://github.com/microsoft/onnxruntime/issues).
## Do the GPU builds support quantized models?
The default CUDA build supports three standard quantization operators: QuantizeLinear, DequantizeLinear, and MatMulInteger. The TensorRT EP has limited support for INT8 quantized ops. In general, ONNX Runtime's support for quantized models continues to expand on a model-driven basis. Note that quantization is not always required to improve performance; we suggest trying alternative strategies to [performance tune](./ONNX_Runtime_Perf_Tuning.md) before determining that quantization is necessary.
## How do I change the severity level of the default logger to something other than the default (WARNING)?
Setting the severity level to VERBOSE is most useful when debugging errors.
* C - [SetSessionLogSeverityLevel](./../include/onnxruntime/core/session/onnxruntime_c_api.h)
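For illustration, here is a minimal sketch using the C++ wrapper, which forwards to the C API function above; the model path is a placeholder:

```cpp
#include <onnxruntime_cxx_api.h>

int main() {
  // The Env constructor takes the minimum severity for the default logger;
  // ORT_LOGGING_LEVEL_VERBOSE emits everything.
  Ort::Env env(ORT_LOGGING_LEVEL_VERBOSE, "log-example");

  // Per-session severity, equivalent to SetSessionLogSeverityLevel in the C API.
  Ort::SessionOptions options;
  options.SetLogSeverityLevel(ORT_LOGGING_LEVEL_VERBOSE);

  // "model.onnx" is a placeholder; on Windows the path must be a wide string.
  Ort::Session session(env, "model.onnx", options);
  return 0;
}
```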
## How do I load and run models that have multiple inputs and outputs using the C/C++ API?
See the 'override initializer' test in [test_inference.cc](./../onnxruntime/test/shared_lib/test_inference.cc), which runs a model with 3 inputs and 3 outputs.
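As a minimal sketch of the pattern that test exercises, the snippet below feeds three inputs and fetches three outputs in a single `Run` call using the C++ API. The model path, input/output names, and tensor shapes are placeholders for illustration; for a real model, query the names and shapes from the session:

```cpp
#include <onnxruntime_cxx_api.h>

#include <array>
#include <vector>

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "multi-io");
  Ort::Session session(env, "model.onnx", Ort::SessionOptions{});

  // Hypothetical names; query the real ones from the session.
  const char* input_names[] = {"input_0", "input_1", "input_2"};
  const char* output_names[] = {"output_0", "output_1", "output_2"};

  Ort::MemoryInfo mem_info =
      Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);

  // Placeholder data: three 1x3 float tensors.
  std::array<int64_t, 2> shape{1, 3};
  std::vector<float> a(3, 1.0f), b(3, 2.0f), c(3, 3.0f);

  // CreateTensor wraps the existing buffers without copying, so the
  // vectors must outlive the Run call.
  std::vector<Ort::Value> inputs;
  inputs.push_back(Ort::Value::CreateTensor<float>(
      mem_info, a.data(), a.size(), shape.data(), shape.size()));
  inputs.push_back(Ort::Value::CreateTensor<float>(
      mem_info, b.data(), b.size(), shape.data(), shape.size()));
  inputs.push_back(Ort::Value::CreateTensor<float>(
      mem_info, c.data(), c.size(), shape.data(), shape.size()));

  // Run takes parallel arrays of names and values covering all inputs,
  // and returns one Ort::Value per requested output.
  std::vector<Ort::Value> outputs =
      session.Run(Ort::RunOptions{nullptr}, input_names, inputs.data(),
                  inputs.size(), output_names, 3);

  float* out0 = outputs[0].GetTensorMutableData<float>();
  (void)out0;
  return 0;
}
```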