Commit graph

565 commits

Author SHA1 Message Date
Hariharan Seshadri
bbeceb7541
Support optional type in ORT (#8339) 2021-11-04 15:01:42 -07:00
Edward Chen
ddb4c05852
Save graph runtime optimizations for minimal build (#9508)
Add support for saving graph runtime optimizations in an ORT format model. The idea is to allow some optimizations to be "replayed" at runtime in a minimal build. The replaying part will be in a future change.
2021-11-04 10:49:46 -07:00
Edward Chen
c315d1b3cd
Always enable ORT format model loading. (#9586) 2021-11-01 10:00:08 +10:00
TomWildenhain-Microsoft
e8268c9a18
Add Transpose Optimizer and modify nhwc optimizer to use it. (#9284)
* Add Transpose Optimizer and modify nhwc optimizer to use it.

* Fix casts

* Fix casts2

* Fix move

* Add tests

* Add headers

* Fixes and tests

* Remove explicit template instantiation

* Fix build warning

* Name unit tests

* Code review fixes

* Add some comments

* Fix some casts

* Make optimization slightly less agressive

* Some unit test fixes

* Update Attention pattern to work with transpose optimizer

* Update attention fuser

* Fix attention fusion python script

* Improve transpose optimizer documentation

* Create OptimizerCtx struct

* Disable Slice handler for testing

* Implement Slice int32

* Only push transposes leading up to other transposes

* Improve optimization heuristic

* Add exemption for MaxPool

* Document transpose optimizer api.h

* Revert fusion tests to master

* Remove temp files

* Replace typedef with using

* Trim trailing whitespace

* Move class declarations from api_impl.h to api_impl.cc

* Remove copy constructors and move allocator

* Alphabetize headers

* Add override keyword

* Comments for nhwc_transformer

* Rename OrtGraph to ApiGraph, etc.

* Wrap line

* Remove extra qualifier on ApiGraph

* Refector attention fusion

* Remove c-style casts from api_impl.cc

* Improve documentation

* Avoid printing vector in ORT_ENSURES

* Revert attention fusion refactor

* Remove duplicate cost heuristics and improve documentation

* Fix size_t casts

* Fixes from Scott's review

* Unrevert attention refactor and more updates from Scott's review

* Revert api_impl.cc ValueInfo change

* only optimize first transpose input

* Unrevert api_impl.cc changes

* Make vector call reserve

* transpose_optimizer.cc update from Scott's comments

* Rename api::Graph to api::GraphRef etc.

* Consider domains 'onnx.ai' and '' equal

* Replace AddInput with SetInput

* Improve tests

* quantization and heuristic tests

* Comments for tests

* Replace const string_view with string_view and update tests

* Fixes requested by Edward

* Fix std::string to string_view conversion

* Add <string> to includes

* Fix bug for broadcasting ops with unknown rank. Slight safety improvements

* Changes requested by Edward

* Fix formatting

* Improve description of cost metric
2021-10-27 22:10:39 -07:00
Ginés Hidalgo
9639eded4b
Missing #pragma once in dml_provider_factory.h (#9457) 2021-10-27 02:49:52 -07:00
Ginés Hidalgo
a79d375d24 Added fixes for Clang on Win64 2021-10-22 16:59:09 -07:00
Changming Sun
d83adaaf9f
Remove optional-lite (#9424) 2021-10-22 16:45:45 -07:00
Sherlock
ff23b9ff55
Avoid cudaStreamSync at the end of Forward/Backward (#9470)
* Skip cudaStreamSynchronize at the end of fw 

* skip sync stream for end of backward
2021-10-21 11:28:25 -07:00
Changming Sun
406f1629c1
Remove Featurizers code (#9300) 2021-10-20 10:20:35 -07:00
Jeff Daily
c8789d3047
[ROCm] static re-hipify of CUDA EP to ROCm EP, now a shared provider (#8877)
* re-hipify all rocm EP sources

* fix all other files affected by re-hipify

* add cuda_provider_factory.h to amd_hipify.py

* do not use cudnn_conv_algo_search in ROCm EP, missing reduce min registration

* Fix ReduceConsts template specialization introduced in #9101.

Fixes the error when building for ROCm 4.3.1:

error: too many template headers for onnxruntime::rocm::ReduceConsts<__half>::One (should be 0)

* fix flake8 error in amd_hipify.py

* speed up hipify with concurrent.futures

* flake8 fix in amd_hipify.py
2021-10-14 15:15:51 -07:00
Edward Chen
79e736ed25
Make onnxruntime::Status nodiscard (#9279)
Mark onnxruntime::Status class with [[nodiscard]] attribute.
Fix existing warnings.
2021-10-08 17:10:31 -07:00
Guoyu Wang
60bbdf1403
Remove unused NodeArgs in Graph::Resolve (#9213)
* Remove unused NodeArgs

* Handle case where a node arg from an initializer from initializer_names_to_preserve

* Fix CI failure

* update test

* Fix outer scope node args failure

* Use NodeArg* as the key of the std::set instead of string

* Minor updates
2021-10-01 11:44:26 -07:00
RandySheriffH
058108bef9
Execution Provider Profiler (#8406)
* implement cuda provider

* define profiler common

* call start after register

* add memcpy event

* add cuda correlation

* format code

* add cupti to test path

* switch to CUpti_ActivityKernel3

* reset cupti path

* fix test case

* fix trt pipeline

* add namespace

* format code

* exclude training from testing

* remove mutex
2021-09-28 13:59:52 -07:00
Hariharan Seshadri
f7dedc9002
Fix default initialization value in C API header (#9126)
* fix default initialization value in C API header

* Fix conflicts

* Nits
2021-09-20 20:58:13 -07:00
Ryan Hill
6ae5f7a244
C API Docs - Add build instructions (#9106)
* Update Doxyfile, add build instructions to header
* Update paths in README.md
2021-09-17 18:40:27 -07:00
Ryan Hill
b876e5675b
C API Enum Name Fixes (#9092) 2021-09-17 15:11:26 -07:00
Ryan Hill
280e79463a
FIll in more documentation (#9088)
Fix plural values with %s
Fix more symbol links
Add custom header for web metrics
2021-09-16 17:08:27 -07:00
Ryan Hill
26509465f0
Add default C++ initialization to OrtCUDAProviderOptions (#9064)
* Add default C++ initialization to OrtCUDAProviderOptions
2021-09-16 15:03:58 -07:00
Guoyu Wang
bee5c26580
Add CPU_ONLY runtime option to NNAPI EP (#9066)
* Add NNAPI cpu only option

* update java

* Update comments
2021-09-15 15:50:18 -07:00
Edward Chen
e574be4a53
[C API Docs] Add docs for run options tag/log level accessors/modifiers. (#9045)
Add documentation for these C API functions:
RunOptionsGetRunLogSeverityLevel
RunOptionsGetRunLogVerbosityLevel
RunOptionsGetRunTag
RunOptionsSetRunLogSeverityLevel
RunOptionsSetRunLogVerbosityLevel
RunOptionsSetRunTag

Update some existing documentation.
2021-09-14 08:53:35 -07:00
satyajandhyala
ce7b12bf5d
Added new fp16 allow/safe opcodes in PropagateCastOps (#8964)
* Removed RemoveInputOutputUpDownCasts strategy in PropagatCastOps.

* Added Expand, Squeeze and Unsqueeze ops to fp16 allow ops

* Added onnx models for squeeze/unsqueeze tests.
2021-09-10 11:53:26 -07:00
Ryan Hill
2439ced3ec
API Documentation (#8948)
* Make help information compile properly
2021-09-09 22:04:51 -07:00
Ashwini Khade
ec63d10303
add model local function support (#8540)
* updates for picking pnnx commit

* add tests filter to c# tests

* plus test fixes

* fix versioning for contrib ops

* fix tests

* test filter for optional ops

* more versioning related updates

* fix test

* fix layernorm spec

* more updates

* update docs

* add more test filters

* more filters

* update binary size threshold

* update docs

* draft - enable model local function

* enable model local functions in ORT

* update to latest rel onnx commit

* plus tests

* plus more updates

* plus updates

* test updates

* Fix for nested functions + shape inference

* plus bug fix and updates per review

* plus fixes per review

* plus test updates

* plus updates per review

* plus fixes

* fix a test
2021-09-08 11:47:01 -07:00
Vincent Wang
c343f7cb43
Add Algorithm Search for ConvGrad (#8613)
* algo search for conv grad

* global cache, bigger workspace size

* fix build error

* refactor

* refactor

* resolve comments

* fix rocm

* change lock places

* rename variable

* remove setting for inference

* resolve comments
2021-09-03 11:25:17 +08:00
Hariharan Seshadri
acd9db7fad
Fix location planning for initializers used only in nested subgraphs (#8642) 2021-09-01 00:02:08 -07:00
Tang, Cheng
4dc0ddf606
support register external ep lib information (#8897)
* support register external ep lib inforation; make eager mode share the same ep pools with training workloads

* fix inference code

* fix build break

* fix the message
2021-08-31 20:51:22 -07:00
Tang, Cheng
ae7f2d824d
Share the execution provider instance for training (#8719)
* seperate the training python module; share the execution proivder instance

* fix build break

* fix cuda test crash; reorg the python module code base

* se correct env

* use provider customized hash func

* fixbuild break

* fix rocm break

* use const ref in argument

* rename the file

* move hash func to trainiing module
2021-08-27 16:23:35 -07:00
Scott McKay
0034ad72e6
Minimize changes to fix missing symbols used from C# (#8867)
* Revert "Cleanup C# bindings to add EP (#8810)"

This reverts commit b21ea00020.

* Add back in a minimal set of changes.
Provide stubs in for a limited set of things
  - things called from C# using a static lib of ORT built for mac/ios
  - things in OrtApis that are not included in the build by default
  - things in OrtApis that are excluded in a minimal build

* Cleanup order or EPs in test

* Fix unused function in ROCM build
2021-08-28 07:10:14 +10:00
Edward Chen
7e53a1df6f
Enable selector action transformer infrastructure in minimal build. (#8804) 2021-08-27 17:16:05 +10:00
Rachel Guo
1886f1a737
Make SparseTensor infrastructure optional (#8802)
Add cmake parameter and #ifdefs to allow for disabling sparse tensor support. This comes with a significant binary size cost so we want to be able to exclude it in a minimal build.
2021-08-27 17:12:26 +10:00
Scott McKay
b21ea00020
Cleanup C# bindings to add EP (#8810)
Fix C# add EP bindings.
Add stubs to ORT so that if EP is not included in the build we return a graceful error message.
Move declaration of stubs into C API and out for EP so they're in one place and are easier to use (no extra header required in the C/C++ world and consistent with the CUDA EP setup).
Fix inconsistency in ROCM EP.
Cleanup a few other things.
2021-08-26 13:59:40 +10:00
Hariharan Seshadri
cee79526fd
Add opset 15 kernels for Pow, BatchNorm, and Shape (#8442) 2021-08-25 12:04:20 -07:00
Changming Sun
4bfff45859
Downgrade Eigen (#8817) 2021-08-23 18:06:23 -07:00
Dmitri Smirnov
8713d76dd1
Introduce C and C++ APIs for Sparse Tensors (#8621)
Add IsSparseTensor
  Add CreateSparseTensor
 Add utilities and test fully sparse instantiation
 Fully sparse blocksparse
 Add test and docs for fully sparse tensor instantiation
 Rework creation API
 Use API
 Non string API
 Retrofit of existing String API
 Add tests
 Add documentation
 Address build issues (Winml pending)
 Add inference test
 Bump binary size
 Add ifdef DISABLE CONTRIB
2021-08-16 16:33:47 -07:00
Changming Sun
436ac6dd5f
Rename ml_value.h to ort_value.h (#8726) 2021-08-13 07:04:56 -07:00
Dmitri Smirnov
1a8adb96fe
Reduce templatization of C API and refactor for InitOrtValue (#8700)
Refactor for OrtInit
  Simplify C API
  Add ort_provider bridge interfaces
2021-08-12 16:51:18 -07:00
Edward Chen
89601ee6b3
[EP Partitioning Utils] Add check for assigned node. (#8473)
Adds a check that a node is not already assigned to an EP before adding it to an EP partition.
2021-08-12 16:08:25 -07:00
Hariharan Seshadri
e791faeca5
Fix bug in CPU force fallback logic (#8597) 2021-08-05 21:36:28 -07:00
Tim Harris
56441dcd88
Limit work items to available threads, upgrade checks from assert to ORT_ENFORCE (#8495) 2021-07-27 19:25:12 -07:00
Guoyu Wang
4c939e1cb7
Add an option to use the input model bytes (ORT format only) directly without copy at session creation (#8502)
* Do not copy the model_data when session is started by CreateSessionFromArray

* Add config option for disabling copy model bytes

* Add one additional test

* Address CR comments
2021-07-27 09:11:42 -07:00
Vincent Wang
619a8782a5
Improve AddValueInfo (#8451)
* change AddValueInfo

* fix after merge master
2021-07-23 16:39:55 +08:00
Dmitri Smirnov
950fe5e28b
Implement SparseTensor and infrastructure suppport and advance ONNX commit (#8038)
SparseTensor support
  Implement Builder pattern
  Fix support for 1-D and 2-D COO indices
  Implement and test CSR support.
  Handle shape inference for SparseTensors
  Implement conversion for COO, CSR and tests.
  Address the case where constant sparse initializer is the output.
  Implement test infra for SparseTensors
  Implement SparseDenseMatMul for Csr and COO and tested it.
  Add hash for SparseToDenseMatMul
  Finish shared provider refactor
  Refactor GetOrCreate to Create
  Working on py interface
  Expose OrtDevice and use it in allocate_numpy
	Adjust Sparse interfaces, add support for string SparseTensor. Add tests.
	Add and test to_cuda()
	Add accessors to format specific indices
	Test values and indices views, read-only flag, after GC access
	Add sparse related methods to OrtValue
	Re-work SparseTensor wrapper, add OrtValue methods
	Rework numpy_array_to_cuda/to_cpu
	Add run_with_ort_values
	Add models and test sparse_mat_mul with run_with_ort_values
	Refactor sparse tensor to use a single buffer
        Ifdef x86 Eigen CSR sparse matmul implementation
        Exclude broken test, check for string type when copying cross device
       Split pybind schema, regenerate docs, add exclusion
       Conditionally exclude schema module
       Update docs fix cuda build
       Add test to a filter and renerate JS docs
      Add conversion and test string support for sparse tensors
      Exclude conversion utils from minimal build
      Add CUDA Memcpy and adjust provider interfaces
2021-07-22 15:24:36 -07:00
Hariharan Seshadri
3360024a0b
Support plugging in custom user-defined allocators for sharing between sessions (#8059) 2021-07-22 10:17:35 -07:00
Edward Chen
989491c333
[NNAPI EP] Make partitioning stop ops configurable. (#8444)
Enable NNAPI EP partitioning stop ops to be overridden by a session configuration option.
2021-07-22 09:21:42 -07:00
Edward Chen
695536a7ac
Make some common macros safer to use. (#8445) 2021-07-21 12:14:36 -07:00
Ryan Hill
cc9f793b48
Move one function from cuda_provider_factory.h (#8407) 2021-07-19 17:55:59 -07:00
satyajandhyala
84bc20fe9d
Enable cast propagation with level one by default. (#8286) 2021-07-08 14:38:09 -07:00
RandySheriffH
f40df30219
Replace functions with secured version for OSX compliance (#7586)
* replace strlen with strnlen

* replace vsnprintf with vsnprintf_l

* add macro

* switch to std numeric::limits

* apply uint16 max

* fix build err

* fix mac build

* define MAX_STR_LEN

* define MAX_STR_LEN

* fix typo

* trim empty lines

* apply constexpr

* fix typo

* add namespace

* fix build err

* rename global constant

Co-authored-by: Randy <Randy@randysmac.attlocal.net>
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
Co-authored-by: Randy <Randy@randysmac.local>
2021-07-08 11:02:36 -07:00
Zuwei Zhao
b46310b349
Integrate onnxruntime-extensions into onnxruntime. (#8143)
Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com>
2021-07-01 09:34:03 -07:00
Scott McKay
4993680e56
Graph::GetNodeProvidesGraphOutput -> NodeProducesGraphOutput (#8243)
'GetNode' is a little confusing as it returns a bool.

Update a couple more places where GetNodeOutputsInGraphOutputs was being used unnecessarily.
2021-06-30 20:43:33 +10:00