* Revert to ignoring optional subgraph inputs due to abandoning PR 216. Restores previous behaviour that changed a couple of days ago with the Scan v9 checkin.
* Update to allow either all inputs, or just required inputs to be provided for the subgraph.
* Update IterateSequence to prefer all inputs over required inputs.
* switch to nonblocking threadpool in inference session and session state
* switch to eigen threadpool - first draft
* refine
* refine
* add a switch to easily revert back to windows thread pool
* switch thread pool in test runner and turn on leak checker
* remove unnecessary files
* fix build error
* more build fixes
* catch exceptions in parallel executor
* fix mac build error
* fix mac build error
* more build fixes
* more mac build fixes
* fix cv issue
* change macro to include cuda compiler for disabled compiler warning
* try switching the macro to win32 only
* test #error
* move #disable warning to the top
* Update onnxruntime_framework.cmake
* move eigen include to public scope
* turn off eigenthreadpool by default and add todo comment
* update
* cmake change
* rename
* update
* update
* add cmake
* fix build warnings.
* fix comments
* update cmake to avoid running gemmlowp tests
* update cmake
* update
* fix build break
* update
* fix comments
* fix test failure
* add one more test case with padding.
* fix conv implementation of mkldnn and cuda to use updated computekernelshape function.
* fix linux ci build break
* Check the pads attribute on Conv, and automatically fall back to CPU if the padding is not symmetric
* Insert copy nodes after all graph transformers. Running the cast transformer before the memory copy transformer causes issues.
* Fix for non-wide characters in strings on Linux, for C#-native interop
* update some unit tests
* added unicode and utf-8 encoding explicitly for file names
* mkldnn:Conv weight optimization
* weight optimization: review changes
* lock_guard and mutex for thread safety
* mutex added to provider
* lock so that ReOrder is done only once
* removed #ifndef mkldnn_hpp
* keep re-ordered mem buffer in scope
* applied clang format
* review updates: map to unordered map
* conv_mutex to mutex_
* implement dynamic slice cuda
* add template parameter
* add declaration
* init base class
* exclude case from cuda
* use cuda mapped type
* separate function implementation
* add cpy logic
* refactor
* add type check
* use InputMemoryType
* merge functions
* Make OrtAllocator not be reference counted
* Make the allocator interface more type safe
* Fix build break
* Build break fix
* Build break fix
* Mistake in previous build fix.
* Fix review comments + build break
* Missed the export symbols
* C specific error, need 'struct' keyword in one case.
* Function calling OrtReleaseObject instead of OrtReleaseEnv
* Added test data arguments to build.py, modified win-ci-pipeline build.
* Updated CI builds to use template tasks, added test data args, removed AZURE_BLOB_KEY uses.
* Fixed up set test data step template.
* Templatize Scan as step 1
* Pre-thunderstorm save
* Initial v8 and v9 implementations.
Need to add transpose to v9 and unit tests.
* Make Transpose operator implementation re-usable by Scan.
Add transpose logic to Scan.
* Rework a bunch of things. First Scan 9 unit test passes
* Add more tests.
Need to add axis validation and handling of negative values.
* Convert remaining Scan 8 tests to also work for Scan 9 if applicable.
Add invalid input tests for new Scan 9 attributes.
* Add transpose unit test.
Some cleanups.
* Cleanups
* Check number of direction entries for outputs at kernel instantiation.
* have Im2ColNd support all types and allow customized padding value.
* only specialize the template in order NCHW.
* fix build break.
* fix build break
* Implement N-gram
Do not load unnecessary pool n-grams. Add String typed tests.
Set output size to the max ngram_index value plus 1.
* Address security warnings and some review comments.
* Fix build issues, rework sampling to try all n-gram sizes at a given offset.
* Rework the loop so that every n is tried at a given offset
without adding the same items again: after b,c we next try
b,c,d, but we no longer add b,c a second time.
* Compute hash incrementally so we do not re-hash elements that were
already there when we add more elements to n-gram.
* Address review comments.
TODO: Remove the 'all' attribute.
* Remove the 'all' attribute, adjust tests. Correct docs.
* Address more review comments.
* Create Type And Shape inference function.
* Address review comments. Implement batch mode per new spec.
* Correct switch bracing in OutputResult and re-test.
* Fix shape error message within TypeAndShapeInferenceFunction.
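
The n-gram loop rework and incremental hashing noted above can be illustrated with a minimal sketch. It is not the code from this change: `NgramHash`, `HashCombine` and `HashAllNgrams` are hypothetical names, the items are assumed to be 64-bit integers, and the Boost-style hash combine is picked only for illustration. The point is that at each offset the hash for length n is extended by one element to get the hash for length n + 1, so earlier elements are never re-hashed.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical record of one n-gram: start offset, length, and its hash.
struct NgramHash {
  std::size_t offset;
  std::size_t n;
  std::size_t hash;
};

// Boost-style hash combine (an assumption, chosen for illustration only).
inline std::size_t HashCombine(std::size_t seed, std::int64_t value) {
  return seed ^ (std::hash<std::int64_t>{}(value) + 0x9e3779b9 + (seed << 6) + (seed >> 2));
}

// For every offset, try all n in [1, max_n] that fit in the input.
// The running hash is extended by exactly one element per step.
std::vector<NgramHash> HashAllNgrams(const std::vector<std::int64_t>& items,
                                     std::size_t max_n) {
  std::vector<NgramHash> result;
  for (std::size_t offset = 0; offset < items.size(); ++offset) {
    std::size_t hash = 0;  // restart the running hash at each offset
    for (std::size_t n = 1; n <= max_n && offset + n <= items.size(); ++n) {
      hash = HashCombine(hash, items[offset + n - 1]);  // add only the new element
      result.push_back({offset, n, hash});
    }
  }
  return result;
}
```

With input b,c,d and max_n = 3, offset 0 yields b, then b,c, then b,c,d, each reusing the previous hash rather than starting over.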
Implement inverse hyperbolic ops
Eigen will add support for asinh, acosh and atanh in an upcoming release. Until then, for opset 9 completeness, we use a std-based implementation.
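
A std-based fallback of the kind described above might look like the following sketch. It is an assumption about the general shape, not the kernels from this change: `ElementWise`, `Asinh`, `Acosh` and `Atanh` are hypothetical helpers operating on a plain `std::vector<float>` rather than on onnxruntime tensors.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper (not an onnxruntime API): apply fn to every element.
template <typename F>
std::vector<float> ElementWise(const std::vector<float>& input, F fn) {
  std::vector<float> output;
  output.reserve(input.size());
  for (float x : input) output.push_back(fn(x));
  return output;
}

// Element-wise inverse hyperbolic functions built on the C++11 <cmath> calls.
std::vector<float> Asinh(const std::vector<float>& x) {
  return ElementWise(x, [](float v) { return std::asinh(v); });
}

std::vector<float> Acosh(const std::vector<float>& x) {
  return ElementWise(x, [](float v) { return std::acosh(v); });
}

std::vector<float> Atanh(const std::vector<float>& x) {
  return ElementWise(x, [](float v) { return std::atanh(v); });
}
```

Inputs outside a function's domain (for example, values below 1 for acosh) produce NaN, as the standard library calls do.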
* merge function compile interface
* fix build error
* fix linux build break
* fix static cast issue; fix clang style
* fix argument change
* use aligned allocation; address PR comments
* fix linux break
* apply clang format
* rename according to comments in pr
* rename according to pr comments; remove useless file
* remove the need_compile flag
* avoid passing whole session state