* Add missing env variables for mac pipeline test (#2595)
* Java API for onnxruntime (#2215)
* Rename automl python tools folder to featurizer_ops. (#2593)
* Make sure fenced tensor could not reuse other tensor. (#2561)
* Add support for opset 11 in reshape fusion (#2592)
* Support opset 11 subgraph of Squad model in Embed Layer Normalization (#2605)
* Allow providers to be set for InferenceSession at construction (#2606)
* EmbedLayerNormalization Fusion For Dynamic Squad Model Opset 10 (#2613)
* Improve Embed Layer Norm Fusion for SQuAD with static input shape (#2621)
* Improve cuda expand() operator's performance. (#2624)
* Cuda pad optimize when no padding is needed. (#2625)
* Shortcut cuda Pad() when no padding is needed.
* Improve performance of resize() in Nearest mode (#2626)
* Optimize cuda scatter() on 2D compatible. (#2628)
* fix float16 comparison in initializer (#2629)
* epsilon attribute for layernormalization fusion (#2639)
* Fix memory exception in Layer Norm Fusion (#2644)
* change c++14 to c++11
* add ld lib path for centos
* enable csharp tests on macos
* fix C API test on MacOS + fix manylinux dotnet install
* fix manylinux dotnet install
* fix lib link
Rework TensorSeq in a manner consistent with Tensor and SparseTensor
in terms of type system setup.
Reduce templating. Introduce helpers to ensure the same
data type.
Make OrtValue __dtor not virtual.
Introduce ContainerChecker
* enable telemetry
* set enable telemetry as default
* for debugging
* remove log and set disable telemetry as default back
* delete private file while testing
* resolve comment: mainly add license header, rename macro and update docs
* rewording in privacy.md
* add centos tests to linux cpu ci pipeline
* Disable failing test
* use centos6 instead of centos7
* change back to centos7
* add dotnet runtime dependency
* fix dotnet runtime dependencies
* install dotnet sdk instead of runtimes
* add more dotnet dependencies
* temporary skip failing test
* fix lib path
* reenable failing test
Add support of GPT2 model optimization:
* Match subgraph of Gelu Approximation (using Tanh).
* Fuse LayerNormalization if SkipLayerNormalization is not ready.
* Output model even if embedding layer is not fused.
* Improve Reshape Fusion to improve coverage.
* Refine constant input checking, and output fused op counter.
Update script according to latest op improvements:
* Fusion of Add Bias and Gelu.
* Fuse SkipLayerNormalization and Add Bias.
Other:
* Add ReduceSum for mask as intermediate step.
* Refactor verbose setting.
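
For reference, the "Gelu Approximation (using Tanh)" mentioned above is commonly written as 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3))). The following is a minimal sketch of that formula next to the exact erf-based Gelu. The function names are illustrative only and are not taken from the optimizer script, and the exact subgraph the script matches is not shown here.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact Gelu: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh-based Gelu approximation (hypothetical helper name)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


if __name__ == "__main__":
    # Print both variants side by side for a few sample inputs.
    for v in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(v, gelu_exact(v), gelu_tanh(v))
```

The two variants agree closely over typical activation ranges, which is why a graph that spells out the tanh form can be treated as a Gelu candidate for fusion.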
* Constant folding bug fix/improvements
- Handle constant folding for node that is assigned to a non cpu EP
- Check for errors in optimizer execution frame setup
- Improve CUDA partitioning to look for initializers in parent graphs
- Add unit test
Fixes #2474
* [NupharEP] Add parallel schedule to JIT function name
Update Nuphar docker to use Python 3.6 and ubuntu 18.04
* Update notebook
* Avoid JIT cache file name conflict
* [NupharEP] Enable parallel schedule
* Update TVM with the fix to TVM threadpool to use OpenMP if possible
* Add parallel schedule when trying to vectorize
With this change, BERT squad perf on a 4-core (8 HT) CPU goes from 187ms to 150ms
* Address CR, docs and cmake update
* Doc fix
* Fix mkl
* Fix TVM windows build when using mklml