From d5b2cd74939f79fa0e75736ace91f4962b66f3f4 Mon Sep 17 00:00:00 2001
From: Jeff Bloomfield <38966965+jeffbloo@users.noreply.github.com>
Date: Sat, 2 May 2020 09:53:33 -0700
Subject: [PATCH] Add performance best practices to DML EP doc (#2859)

* Add performance best practices to DML EP doc


Co-authored-by: Jeff <38966965+jeffbloo@users.noreply.github.com>
---
 .../DirectML-ExecutionProvider.md                  | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/docs/execution_providers/DirectML-ExecutionProvider.md b/docs/execution_providers/DirectML-ExecutionProvider.md
index 4372b6385c..8eef40bd6f 100644
--- a/docs/execution_providers/DirectML-ExecutionProvider.md
+++ b/docs/execution_providers/DirectML-ExecutionProvider.md
@@ -105,6 +105,20 @@ Additionally, as the DirectML execution provider does not support parallel execu
 
 A complete sample of onnxruntime using the DirectML execution provider can be found under [samples/c_cxx/fns_candy_style_transfer](../../samples/c_cxx/fns_candy_style_transfer).
 
+## Performance best practices
+The DirectML execution provider works most efficiently when tensor shapes are known at the time a session is created.  This provides a few performance benefits:
+1) Because constant folding can occur more often, there may be fewer CPU / GPU copies and stalls during evaluations.
+2) More initialization work occurs when sessions are created rather than during the first evaluation.
+3) Weights may be pre-processed within DirectML, enabling more efficient algorithms to be used.
+4) Graph optimization occurs within DirectML. For example, Concat operators may be removed, and more optimal tensor layouts may be used for the input and output of operators.
+
+Normally, when the shapes of model inputs are known during session creation, OnnxRuntime infers the shapes for the rest of the model at that time.  However, if a model input contains a free dimension (such as for batch size), steps must be taken to retain the above performance benefits.
+
+In this case, there are two options:
+- Edit the model to replace an input's free dimension (specified through ONNX using "dim_param") with a fixed size.
+- Edit the model to ensure that an input's free dimension has a [denotation](https://github.com/onnx/onnx/blob/master/docs/DimensionDenotation.md) (such as "DATA_BATCH", or a custom denotation).  Then, when creating the session, specify the dimension size for each denotation.  This can be done using the OnnxRuntime *AddFreeDimensionOverride* ABI.
+
+
 ## See also
 
 [DirectML documentation \(docs.microsoft.com\)](https://docs.microsoft.com/en-us/windows/win32/direct3d12/dml)