* Add Attention fusion for GPT2
* Support distilgpt2 in benchmark_gpt2.py
* Add options to disable Attention/SkipLayerNormalization/EmbedLayerNormalization/BiasGelu fusions
* Add logging at the beginning of each fusion
* Update notebooks: Add Gpt2OnnxModel.py to list of script files.
* Add test for gpt2 model optimization
* Add optional parameters (--input_ids --segment_ids --input_mask) for graph inputs
* Fuse BiasGelu
* Handle models that do not have a segment_ids input.
* Allow fusing the embed layer without a mask
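For readers unfamiliar with the BiasGelu fusion listed above, here is a minimal sketch of the idea. It assumes the fused op computes Gelu(x + bias) with the erf-based Gelu. The function names are hypothetical and this is not the fusion code from the optimizer scripts.

```python
import math


def gelu(x: float) -> float:
    """Exact (erf-based) GELU for a single value."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def add_then_gelu(xs, bias):
    """Unfused form: an Add node followed by a Gelu node."""
    added = [x + b for x, b in zip(xs, bias)]  # intermediate result
    return [gelu(v) for v in added]


def bias_gelu(xs, bias):
    """Fused form (hypothetical name): one pass, no intermediate result."""
    return [gelu(x + b) for x, b in zip(xs, bias)]
```

Both functions return the same values. The fused form only avoids materializing the intermediate sum, which is the usual motivation for this kind of graph fusion.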
* Make QuantizeLinear support half
* remove unnecessary type constraint
* refine kernel definition
* add fp16 support for dequantizelinear
* disable QuantizeLinear_per_tensor_half_int8 for tensorrt
* refine unit test and fix saturate issue for MSDomain QuantizeLinear
* fix build break
* include tensorrt for half_uint8 test
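The commit does not say what the saturate issue was. As background, the sketch below shows what saturation means for QuantizeLinear, assuming the ONNX definition `y = saturate(round(x / scale) + zero_point)` with round-half-to-even. The function name and defaults are illustrative and this is not the kernel code.

```python
def quantize_linear(x, scale, zero_point, qmin=-128, qmax=127):
    """Quantize one float to an integer in [qmin, qmax].

    Defaults describe int8; pass qmin=0, qmax=255 for uint8.
    """
    # Python's built-in round() rounds halves to the nearest even integer,
    # which matches the rounding mode assumed above.
    q = round(x / scale) + zero_point
    # Saturate: clamp to the target range instead of wrapping around.
    return max(qmin, min(qmax, q))
```

Without the final clamp, an out-of-range value cast to int8 would wrap around instead of pinning to the range limit.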
Fixing Windows builds on the ort_training branch in preparation for the merge to master.
SafeInt (included via onnxruntime/core/common/safeint.h) was recently made a dependency of onnxruntime/core/framework/bfc_arena.h. That requires consumers of bfc_arena to compile with the SafeInt include directory.
* Migrate winml to Microsoft Namespace (packaging changes are pending)
* add ns_prefix toggle
* fix packaging
* Users/sheilk/add missing raw header (#3484)
* add dualapipartition
* wrong variable for repo root
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
* remove existence check to force failures
* extra paren
* dualapipartition needs to be referenced from the source
* add microsoft.ai.machinelearning.dll to the output dir
* rename the idl file so that assembly info is correctly added into the winmd
* fix namespaces
* update namespaces
* default to microsoft, and add namespace override as build argument
* update cmakesettings.json as well
* remove from cmakelists.txt
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
Co-authored-by: Changming Sun <chasun@microsoft.com>
Add ORT DML EP kernels for:
- MaxPooling with dilations
- TopK with sort order
- SpaceToDepth with order
- Mvn with axes
- Resample-11 with transformation modes (minus cubic, related cubic attributes, and nearest neighbor rounding mode).
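As an illustration of the transformation modes mentioned in the last item, here is a sketch of three coordinate mappings. It assumes these correspond to the `coordinate_transformation_mode` attribute of ONNX Resize-11. The function names are hypothetical and the DML kernels are not implemented this way.

```python
def half_pixel(x_resized: int, scale: float) -> float:
    """Treat pixels as unit cells and align the cell centers."""
    return (x_resized + 0.5) / scale - 0.5


def asymmetric(x_resized: int, scale: float) -> float:
    """Plain division by the scale, with no center offset."""
    return x_resized / scale


def align_corners(x_resized: int, length_original: int, length_resized: int) -> float:
    """Map the first and last output samples onto the first and last input samples."""
    if length_resized == 1:
        return 0.0  # avoid dividing by zero for a single output sample
    return x_resized * (length_original - 1) / (length_resized - 1)
```

Each function maps an output index back to a (possibly fractional) input coordinate. The interpolation mode then decides how to sample the input at that coordinate.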
WinML model tests: Summary: Total=6879, Passed=434, Failed=0, Blocked=0, Not Run=0, Skipped=6445
ONNX conformance tests: Summary: Total=3241, Passed=3099, Failed=0, Blocked=0, Not Run=0, Skipped=142
* Fixed corner cases for acl ep gemm implementation by setting fully connected as the main layer
* Introduced versioned build for the acl ep. ACL versions supported are 1902, 1905 and 1908
* Added convolution-activation fusion optimization for acl ep. We see improvements of 12% for mobilenetv2 and 4% for resnet50
Co-authored-by: Andrei-Alexandru <andrei-alexandru.avram@nxp.com>
1. Fix static analysis warnings found by VC++
2. Add a new pipeline for static analysis
3. Merge all the Windows CI builds into a single YAML file (easier to queue them all).
4. Make DNNL build faster by disabling building the tests and examples.
5. Enable custom op unit test.