* Adding a custom op interface to the C API to remove shared library dependency.
* Remove old custom op test
* Rework how custom ops handle inputs/outputs to enable custom op output shape calculation in the compute method
* Add a nicer C++ API for custom ops and switch the tests to use it.
Currently, when using OrtEnableProfiling to enable profiling using the C API,
the profile output file is created but is always empty.
The reason is that InferenceSession::EndProfiling() needs to be called to
write the profiling data to the output file.
However, there is currently no way to call this function via the C API.
This adds a call to EndProfiling() to the destructor of the session if
profiling is enabled in the session options.
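
The destructor-based flush described above can be sketched as follows. This is a minimal illustration, not the actual InferenceSession code: `SessionSketch`, `SessionOptionsSketch`, and the `sink` vector (standing in for the profile output file) are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the session options; not the real onnxruntime type.
struct SessionOptionsSketch {
  bool enable_profiling = false;
};

// Hypothetical stand-in for the session; not the real InferenceSession.
class SessionSketch {
 public:
  // `sink` plays the role of the profile output file so the flush is observable.
  SessionSketch(SessionOptionsSketch options, std::vector<std::string>* sink)
      : options_(std::move(options)), sink_(sink) {
    assert(sink_ != nullptr);
  }

  SessionSketch(const SessionSketch&) = delete;
  SessionSketch& operator=(const SessionSketch&) = delete;

  ~SessionSketch() {
    // Flush only if profiling was requested and nobody flushed explicitly.
    if (options_.enable_profiling && !profiling_ended_) {
      EndProfiling();
    }
  }

  // Records one profiling event per call while profiling is active.
  void Run(const std::string& event) {
    if (options_.enable_profiling && !profiling_ended_) {
      events_.push_back(event);
    }
  }

  // Writes the buffered events to the sink; calling it again is a no-op.
  void EndProfiling() {
    if (profiling_ended_) return;
    profiling_ended_ = true;
    sink_->insert(sink_->end(), events_.begin(), events_.end());
    events_.clear();
  }

 private:
  SessionOptionsSketch options_;
  std::vector<std::string>* sink_;
  std::vector<std::string> events_;
  bool profiling_ended_ = false;
};
```

With this shape, a caller that never calls EndProfiling() still gets the buffered events written when the session goes out of scope, and a session created without profiling writes nothing.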
Generalize node removal method in graph_utils. This is a higher-level method that keeps the graph consistent so that no Resolve is needed after the removal of a node.
The new method supports the removal of nodes with a single input (be it an incoming node or an initializer) and a single output (which may have multiple output edges). It also handles the case where one of the output edges feeds a subgraph.
Also updated the rewrite rules to use this new, less restrictive method, and improved the rules' conditions. Introduced a GraphEdge struct to simplify various methods in graph_utils.
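
As a rough illustration of the removal described above, the sketch below rewires every output edge of a single-input, single-output node to that node's producer, so the graph stays consistent without a separate resolve step. `GraphSketch`, `EdgeSketch`, and `RemoveSingleInputNode` are hypothetical simplifications, not the real Graph or GraphEdge types in graph_utils, and the subgraph case is not modelled here.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified edge; not the real GraphEdge struct.
struct EdgeSketch {
  int src_node;          // producer node id, or -1 for an initializer
  int dst_node;          // consumer node id
  std::string arg_name;  // name of the value carried by the edge
};

// Hypothetical, simplified graph; not the real onnxruntime Graph.
struct GraphSketch {
  std::vector<int> nodes;
  std::vector<EdgeSketch> edges;
};

// Removes `node` if it has exactly one input edge and all of its output
// edges carry the same output value. Output edges are rewired to the
// node's producer. Returns false and leaves the graph untouched otherwise.
bool RemoveSingleInputNode(GraphSketch& g, int node) {
  if (std::find(g.nodes.begin(), g.nodes.end(), node) == g.nodes.end()) {
    return false;
  }
  const EdgeSketch* input = nullptr;
  const std::string* output_name = nullptr;
  for (const EdgeSketch& e : g.edges) {
    if (e.dst_node == node) {
      if (input != nullptr) return false;  // more than one input
      input = &e;
    }
    if (e.src_node == node) {
      if (output_name != nullptr && *output_name != e.arg_name) {
        return false;  // more than one distinct output
      }
      output_name = &e.arg_name;
    }
  }
  if (input == nullptr || input->src_node == node) return false;
  const EdgeSketch in = *input;  // copy before the edge list is rebuilt

  std::vector<EdgeSketch> kept;
  for (const EdgeSketch& e : g.edges) {
    if (e.dst_node == node) continue;  // drop the input edge
    if (e.src_node == node) {
      kept.push_back({in.src_node, e.dst_node, in.arg_name});  // rewire
    } else {
      kept.push_back(e);
    }
  }
  g.edges.swap(kept);
  g.nodes.erase(std::remove(g.nodes.begin(), g.nodes.end(), node),
                g.nodes.end());
  for (const EdgeSketch& e : g.edges) {
    assert(e.src_node != node && e.dst_node != node);  // consistency check
  }
  return true;
}
```

For a chain 0 -> 1 -> {2, 3}, removing node 1 leaves the edges 0 -> 2 and 0 -> 3, both carrying the value that originally fed node 1.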
* Update NMS to be compatible with both TF & PyTorch models
* update text
* set max_output_boxes_per_batch, iou_threshold, score_threshold as optional inputs to support dynamic values
* fix typo
* Set the last output selected_indices as optional output
* fix shape inference in case the input doesn't have a shape
* Update schema to remove scores & boxes from output. Add support for class broadcast.
* change max_output_boxes_per_batch to max_output_boxes_per_class
* update schema to remove the class dimension from boxes
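
The NMS changes listed above can be pictured with the sketch below: boxes are shared across classes (no class dimension), the three thresholds are plain runtime arguments, and only the selected indices are returned. This is a simplified single-batch illustration under stated assumptions, not the operator's implementation: `NmsSketch`, the `{x1, y1, x2, y2}` box layout, and the `(class, box)` index pairs are all choices made for this example.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Assumed box layout for this sketch: {x1, y1, x2, y2} with x1 <= x2, y1 <= y2.
using Box = std::array<float, 4>;

// Intersection over union of two axis-aligned boxes.
float IoU(const Box& a, const Box& b) {
  const float iw = std::max(0.0f, std::min(a[2], b[2]) - std::max(a[0], b[0]));
  const float ih = std::max(0.0f, std::min(a[3], b[3]) - std::max(a[1], b[1]));
  const float inter = iw * ih;
  const float area_a = (a[2] - a[0]) * (a[3] - a[1]);
  const float area_b = (b[2] - b[0]) * (b[3] - b[1]);
  const float uni = area_a + area_b - inter;
  return uni > 0.0f ? inter / uni : 0.0f;
}

// Greedy per-class NMS over one batch. `boxes` is shared by all classes and
// `scores` is indexed as [class][box]. Returns (class index, box index) pairs.
std::vector<std::pair<std::size_t, std::size_t>> NmsSketch(
    const std::vector<Box>& boxes,
    const std::vector<std::vector<float>>& scores,
    std::size_t max_output_boxes_per_class,
    float iou_threshold,
    float score_threshold) {
  std::vector<std::pair<std::size_t, std::size_t>> selected;
  for (std::size_t c = 0; c < scores.size(); ++c) {
    assert(scores[c].size() == boxes.size());
    // Candidates above the score threshold, highest score first.
    std::vector<std::size_t> order;
    for (std::size_t i = 0; i < boxes.size(); ++i) {
      if (scores[c][i] > score_threshold) order.push_back(i);
    }
    std::stable_sort(order.begin(), order.end(),
                     [&](std::size_t l, std::size_t r) {
                       return scores[c][l] > scores[c][r];
                     });
    std::vector<std::size_t> kept;
    for (std::size_t i : order) {
      if (kept.size() >= max_output_boxes_per_class) break;
      bool suppressed = false;
      for (std::size_t k : kept) {
        if (IoU(boxes[i], boxes[k]) > iou_threshold) {
          suppressed = true;
          break;
        }
      }
      if (!suppressed) {
        kept.push_back(i);
        selected.emplace_back(c, i);
      }
    }
  }
  return selected;
}
```

Because the caller passes the thresholds at call time, they can vary per invocation, which is the point of making them inputs rather than fixed attributes.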
* Update BUILD.md
* Update README.md
* Update tensorrt_execution_provider.cc
Remap node indices to handle the case where nodes in the graph have been deleted and the node indices are no longer contiguous.
* Update onnxruntime_providers.cmake
Resolve conflicts with onnx-tensorrt
* Update tensorrt_execution_provider.h
* Update tensorrt_execution_provider.cc
* Update tensorrt_execution_provider.h
* Update tensorrt_execution_provider.cc
* Update tensorrt_execution_provider.h
* Update tensorrt_execution_provider.cc
* Update tensorrt_execution_provider.cc
* Update tensorrt_execution_provider.cc
* Update build.py
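
The node index remapping mentioned in the tensorrt_execution_provider.cc update above can be sketched as a pair of lookup tables between the original (possibly sparse) node indices and a dense 0..N-1 range. `NodeIndexRemap` and `BuildNodeIndexRemap` are hypothetical names for this illustration; this is not the code in the execution provider.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical helper type; maps between sparse and dense node indices.
struct NodeIndexRemap {
  std::vector<std::size_t> dense_to_original;
  std::unordered_map<std::size_t, std::size_t> original_to_dense;
};

// Builds the mapping from the indices of the nodes that still exist in the
// graph, in the order given. Repeated indices are ignored.
NodeIndexRemap BuildNodeIndexRemap(
    const std::vector<std::size_t>& live_node_indices) {
  NodeIndexRemap remap;
  remap.dense_to_original.reserve(live_node_indices.size());
  for (std::size_t original : live_node_indices) {
    const std::size_t dense = remap.dense_to_original.size();
    const bool inserted =
        remap.original_to_dense.emplace(original, dense).second;
    if (inserted) remap.dense_to_original.push_back(original);
  }
  // Both tables must describe the same set of nodes.
  assert(remap.dense_to_original.size() == remap.original_to_dense.size());
  return remap;
}
```

For live nodes {0, 2, 5}, the dense indices are 0, 1, 2, so code that expects contiguous indices can work on the dense range and translate back through `dense_to_original` when it needs the original node.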
* Update onnx
* Support updated function schema in ORT
* Update onnx related commit hash
* Check out an older commit in ONNX
* Add support for subgraph attribute
* Add comments
* added tools for doc gen, added doc
* doc updated
* some fixes
* hooked up with build.py
* hooked up with build.py and fail when the doc is out of date
* update
* fix graph transformers and refactor tests
* fix merge with master
* Set default optimization level to Level1
* fix build warnings for Linux
* try to root-cause TensorRT test failures
* try to root-cause TensorRT test failure
* Test level2 transformers with all CI builds
* remove ConvActivation fusion transformer
* change default level back to level1
* remove providers from apply api
* more changes
* Convert unsqueeze elimination to rewrite rule
* Simplify the way we register predefined transformers and rules in the inference session (all details are now moved to the graph transformer utils)
* Some reorganization and renaming of methods in graph_utils
* Updates in graph transformers test
* Update edge removal to skip an unnecessary check of node args that led to race conditions when updating the graph
* Improve documentation for rewrite rules
* Remove top-down rule-based transformer (given we currently have only one type of rule-based transformer)