Minor wording changes to design doc (#51)

* Update HighLevelDesign.md
* Update HighLevelDesign.md
* Update HighLevelDesign.md
2018-11-28 19:43:03 -08:00
commit 7523e76649
parent 6371025860
1 changed file with 17 additions and 18 deletions
--- a/docs/HighLevelDesign.md
+++ b/docs/HighLevelDesign.md
@@ -1,7 +1,7 @@
 # ONNX Runtime High Level Design

 This document outlines the high level design of
-ONNXRuntime - a high performance, cross platform engine.
+ONNX Runtime - a high performance, cross platform engine.

 ## Key objectives
 * Maximally and automatically leverage the custom accelerators and runtimes
@@ -10,8 +10,8 @@ available on disparate platforms.
 runtimes. We call this abstraction an [execution
 provider](../include/onnxruntime/core/framework/execution_provider.h). It defines and exposes a set of
 its capabilities to ONNXRuntime: a set of single or fused nodes it can
-execute, its memory allocator and more. Custom accelerators and runtimes are
-instances of execution provider.
+execute, its memory allocator, and more. Custom accelerators and runtimes are
+instances of execution providers.
 * We don't expect that an execution provider can always run an ONNX model fully
 on its device. This means that ONNXRuntime must be able to execute a single
 model in a heterogeneous environment involving multiple execution providers.
@@ -35,46 +35,45 @@ provider using the GetCapability() API.

 ![ONNXRuntime high level system architecture](https://azurecomcdn.azureedge.net/mediahandler/acomblog/media/Default/blog/228d22d3-6e3e-48b1-811c-1d48353f031c.png)

-*Note: TensorRT and nGraph support in the works.*
+*Note: TensorRT and nGraph support are in progress*

 ### More about partitioning
-ONNXRuntime partitions a model graph based on the available execution providers
-into subgraphs, each for a distinct provider respectively. ONNXRuntime provides
-a default execution provider that is used for fallback execution for the
+ONNXRuntime partitions a model graph into subgraphs based on the available execution providers, one for each distinct provider. ONNXRuntime provides
+a default execution provider that is used as the fallback execution for the
 operators that cannot be pushed onto the more specialized but more efficient
-execution providers. Intuitively we probably want to push computation to the
-specialized execution providers as much as possible.
+execution providers. Intuitively we want to push computation to more
+specialized execution providers whenever possible.

 We use a simple graph partitioning technique. The available execution providers
 will be considered in a specific order, and each will be assigned the maximal
 subgraphs (possibly more than one) that it is able to handle. The
-ONNXRuntime-provided default execution provider will be the last one to be
+ONNXRuntime-provided default execution provider will be the last one
 considered, and it ensures completeness. More sophisticated optimizations can be
 considered in the future (or can even be implemented as a composite execution
 provider).

 Conceptually, each partition is reduced to a single fused operator. It is
-created by invoking the execution provider's Compile() method and wrap it as a
+created by invoking the execution provider's Compile() method and wraps it as a
 custom operator. Currently we support only synchronous mode of execution. An execution
 provider exposes its memory allocator, which is used to allocate the input
 tensors for the execution provider. The rewriting and partitioning transform the
-initial model graph into a new graph composed with operators assigned to either
+initial model graph into a new graph composed of operators assigned to either
 the default execution provider or other registered execution
-providers. ONNXRuntime execution engine is responsible for running this graph.
+providers. The ONNXRuntime execution engine is responsible for running this graph.

 ## Key design decisions
-* Multiple threads should be able to inovke the Run() method on the same
+* Multiple threads can invoke the Run() method on the same
 inference session object. See [API doc](C_API.md) for more details.
-* To facilitate the above the Compute() function of all kernels is const
+* To facilitate this, the Compute() function of all kernels is const
 implying the kernels are stateless.
-* We call implementations of the operators by execution providers as
+* Implementations of the operators by execution providers are called
 kernels. Each execution provider supports a subset of the (ONNX)
 operators/kernels.
-* ONNXRuntime runtime guarantees that all operators are supported by the default
+* The ONNXRuntime runtime guarantees that all operators are supported by the default
 execution provider.
 * Tensor representation: ONNXRuntime will utilize a standard representation for
 the tensor runtime values. The execution providers can internally use a
-different representation, if they choose to, but it is their responsibility to
+different representation if they choose to, but it is their responsibility to
 convert the values from/to the standard representation at the boundaries of
 their subgraph.
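
Reviewer note on the "More about partitioning" text touched by this diff: the greedy technique it describes (providers considered in a fixed order, each claiming the maximal subgraphs it can handle, the default provider last) can be sketched roughly as below. This is a minimal Python illustration, not ONNX Runtime code. The function `partition`, the node/edge representation, and the use of `None` to mark the default provider are all hypothetical, and "maximal subgraph" is simplified to "connected component of claimable nodes", which ignores constraints a real fusion pass would need (for example, avoiding cycles between fused nodes).

```python
from collections import defaultdict


def partition(nodes, edges, providers):
    """Greedy partitioning sketch (hypothetical helper, not an ONNX Runtime API).

    nodes:     dict mapping node id -> operator type
    edges:     list of (src, dst) node id pairs
    providers: ordered list of (name, supported_ops); supported_ops is a set
               of operator types, or None for a default provider that
               accepts every operator (assumption made for this sketch)
    """
    # Treat the graph as undirected when grouping nodes into subgraphs.
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    unassigned = set(nodes)
    partitions = []  # list of (provider name, sorted node ids)

    for name, supported in providers:
        # Nodes this provider can take among those nobody has claimed yet.
        claimable = {
            n for n in unassigned
            if supported is None or nodes[n] in supported
        }
        # Split the claimable nodes into maximal connected subgraphs.
        seen = set()
        for start in sorted(claimable):
            if start in seen:
                continue
            component, stack = [], [start]
            seen.add(start)
            while stack:
                n = stack.pop()
                component.append(n)
                for m in neighbors[n]:
                    if m in claimable and m not in seen:
                        seen.add(m)
                        stack.append(m)
            partitions.append((name, sorted(component)))
        unassigned -= claimable

    if unassigned:
        # Without a catch-all default provider, completeness is not guaranteed.
        raise ValueError(f"no provider for nodes: {sorted(unassigned)}")
    return partitions


# A four-node chain where the specialized provider cannot run "CustomOp".
example_nodes = {"a": "Conv", "b": "Relu", "c": "CustomOp", "d": "Conv"}
example_edges = [("a", "b"), ("b", "c"), ("c", "d")]
example_providers = [("accel", {"Conv", "Relu"}), ("cpu", None)]
result = partition(example_nodes, example_edges, example_providers)
```

In this example the hypothetical `accel` provider receives two separate subgraphs (`a`, `b` and `d`) because the unsupported node `c` splits them, and the default `cpu` provider picks up `c`, which is the "ensures completeness" role the document assigns to the default provider.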