* fix filtered subgraph initializer issue
* minor fix
* Include implicit inputs of nodes when checking whether they are initializers
* Add test case
* minor update
* Address PR comments
* Fix some code error
* build off a specific commit and archive wheel file
* rename to fp32, prefix results w/ commit, add CPU col
* rename 99th percentile to 90th percentile
* get symbolic_shape from master each time
* add install archive wheel, parallel build
* shortening hash
Update ORT model conversion script
- add args for specifying optimization level and whether to use NNAPI
- add logic to create a list of required ops and ORT format model that can be used with NNAPI
* Add initial documentation on using NNAPI with a minimal build
* minor clarification
* Add note on avoiding local full build
* Address a couple of PR comments
Follow up to #5811 to automate cleanup of the build docker image cache.
Added a script and build definition to clean up docker images that haven't been accessed recently.
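The selection step of such a cleanup can be sketched as follows. This is a minimal illustration, not the script added in this change: the image tags, the access-time mapping, and the 7-day threshold are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def select_stale_images(last_access_by_tag, now, max_age_days=7):
    """Return the tags whose last access is older than max_age_days, sorted by name."""
    cutoff = now - timedelta(days=max_age_days)
    return sorted(tag for tag, accessed in last_access_by_tag.items() if accessed < cutoff)


now = datetime(2020, 12, 1, tzinfo=timezone.utc)
# Hypothetical access log: image tag -> time of the most recent pull.
last_access = {
    "build-image:aaa111": now - timedelta(days=1),
    "build-image:bbb222": now - timedelta(days=30),
    "build-image:ccc333": now - timedelta(days=8),
}
stale = select_stale_images(last_access, now)
print(stale)
```

Only the images older than the cutoff are returned, so recently pulled images stay in the cache.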
* Exclude some training specific code around the allocation planning and initializer handling from the minimal build.
Simplify the code around tracking start/end usage of a value.
* add int8
* support both native TRT cal table and ORT cal table
* add more comments
* Update env variable name and check platform availability for int8/fp16
* add backward compatibility on old env var ORT_TENSORRT_ENGINE_CACHE_PATH and switch to flatbuffers for ort cal table deserialization
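The backward-compatibility handling for the old environment variable could look roughly like the sketch below. Only `ORT_TENSORRT_ENGINE_CACHE_PATH` comes from the commit message; the new variable name is a placeholder, and the lookup order (new name first, then old) is an assumption.

```python
import os

# The old name is taken from the commit message above.
OLD_VAR = "ORT_TENSORRT_ENGINE_CACHE_PATH"
# Placeholder only: the real new variable name is not stated in this log.
NEW_VAR = "HYPOTHETICAL_NEW_CACHE_PATH_VAR"


def resolve_cache_path(env):
    """Prefer the new variable, fall back to the old one, else return None."""
    value = env.get(NEW_VAR)
    if value:
        return value
    return env.get(OLD_VAR) or None


# Typical use would pass the process environment.
cache_path = resolve_cache_path(os.environ)
```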
* Remove nGraph Execution Provider
Pursuant to nGraph deprecation notice: https://github.com/microsoft/onnxruntime/blob/master/docs/execution_providers/nGraph-ExecutionProvider.md#deprecation-notice
**Deprecation Notice**
| | |
| --- | --- |
| Deprecation Begins | June 1, 2020 |
| Removal Date | December 1, 2020 |
Starting with the OpenVINO™ toolkit 2020.2 release, all of the features
previously available through nGraph have been merged into the OpenVINO™
toolkit. As a result, all the features previously available through
ONNX RT Execution Provider for nGraph have been merged with ONNX RT
Execution Provider for OpenVINO™ toolkit.
Therefore, ONNX RT Execution Provider for **nGraph** will be deprecated
starting June 1, 2020 and will be completely removed on December 1,
2020. Users are recommended to migrate to the ONNX RT Execution Provider
for OpenVINO™ toolkit as the unified solution for all AI inferencing on
Intel® hardware.
* Remove nGraph Licence info from ThirdPartyNotices.txt
* Use simple Test.Run() for tests without EP exclusions
To be consistent with rest of test code.
* Remove nGraph EP functions from Java code
* address scalar expand to scalar case
* remove tensorrt unsupported type
* exclude tensorrt from scalar to scalar case
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
* init op_builder change
* Add reduce ops support for OpSupportChecker
* minor update
* Address CR comments
* Add scripts to exclude ops for NNAPI EP
* Fix the long string error in python script
* Remove extra debug print in the script
* Remove the op reduction scripts from this CR
* Remove unused FindOpBuilder functions
* Move CreateSharedOpBuilder to shared helper function
* Added fuzz testing using ORT model.
* The onnxruntime_security_fuzz driver code should accept either an ONNX or an ORT input file (based on the file extension) if the /f flag is provided.
* Added ValidateOrtFormatModelDoesNotRunOptimizersInFullBuild test.
* Added win-ci-fuzz-testing.yml to run build pipeline.
* Prevent out-of-range access in the graph.cpp.
Quantize LSTM:
1. Dynamically quantizes the MatMul operations inside the LSTM. The activation functions are not quantized.
2. Supports per-channel quantization of the input weight and recurrent weight.
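The two points above can be illustrated with a small pure-Python sketch. It is not the kernel from this change: the uint8/int8 choices, the rounding, and all function names are assumptions made for illustration. The input is quantized at run time from its observed range, each weight column (output channel) gets its own scale, and the result is rescaled to float so that any activation function would still run in floating point.

```python
def quantize_dynamic_u8(values):
    """Asymmetric uint8 quantization with the range computed at run time."""
    lo = min(0.0, min(values))
    hi = max(0.0, max(values))
    scale = (hi - lo) / 255.0 or 1.0  # avoid a zero scale for all-zero input
    zero_point = round(-lo / scale)
    q = [max(0, min(255, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def quantize_per_channel_s8(weight_columns):
    """Symmetric int8 quantization with one scale per output channel (column)."""
    quantized, scales = [], []
    for col in weight_columns:
        scale = max(abs(w) for w in col) / 127.0 or 1.0
        scales.append(scale)
        quantized.append([max(-127, min(127, round(w / scale))) for w in col])
    return quantized, scales


def quantized_matvec(x, weight_columns):
    """Compute x @ W with integer accumulation, then rescale to float."""
    qx, x_scale, x_zp = quantize_dynamic_u8(x)
    qw, w_scales = quantize_per_channel_s8(weight_columns)
    out = []
    for col, w_scale in zip(qw, w_scales):
        acc = sum((a - x_zp) * w for a, w in zip(qx, col))  # integer arithmetic
        out.append(acc * x_scale * w_scale)  # back to float
    return out


x = [0.5, -1.0, 2.0]
columns = [[1.0, 0.5, -0.25], [0.1, 0.2, 0.3]]  # one list per output channel
approx = quantized_matvec(x, columns)
exact = [sum(a * w for a, w in zip(x, col)) for col in columns]
```

Comparing `approx` with `exact` shows the quantization error; the second column has much smaller weights, which is where a per-channel scale helps compared with a single scale for the whole matrix.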
The implementation of QLinearConv internally does a transpose(NHWC)->im2col+GEMM->transpose(NCHW). This adds a graph transformer to change a model to use a com.microsoft.QLinearConv that supports NHWC natively to avoid unnecessary transposes.
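To see why the paired transposes are redundant, the sketch below composes the two layout permutations and checks that they cancel. It only illustrates the permutation arithmetic and says nothing about how the graph transformer itself is implemented; the helper names are made up for this example.

```python
NCHW_TO_NHWC = (0, 2, 3, 1)
NHWC_TO_NCHW = (0, 3, 1, 2)


def transpose_shape(shape, perm):
    """Shape after a transpose: output axis i takes input axis perm[i]."""
    return tuple(shape[axis] for axis in perm)


def compose(first, second):
    """Single permutation equivalent to applying `first`, then `second`."""
    return tuple(first[axis] for axis in second)


nhwc = transpose_shape((1, 3, 224, 224), NCHW_TO_NHWC)
round_trip = compose(NCHW_TO_NHWC, NHWC_TO_NCHW)
```

Because the round trip is the identity permutation, a kernel that consumes and produces NHWC directly does not need either transpose.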
* implement multi-threading expand on cpu
* format code
* move expand op
* add test case
* format code
* optimize code
* fix comments
* handle empty tensor
* sync with master
* add ParallelSection
* add threshold for multi-threading
Co-authored-by: RandySheriffH <rashuai@microsoft.com>
This PR adds infrastructure to automatically cache docker images used in CI builds in a container registry.
Currently, build images are pulled from a container registry for some builds and built every time for others. The container registry requires maintenance to keep the images up to date and building images every time wastes build agent resources.
With this change, a given build image is looked up in a cache container registry: if it is present, it is pulled; otherwise, it is built and pushed. The uniqueness of a build image is determined by a hash digest of the dockerfile, docker build context directory, and certain "docker build" options. This digest is part of the image tag in the cache container repository.
The cache container registry will need to be cleaned up periodically. This is not automated yet.
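A digest of this kind could be computed along the following lines. This is a sketch under assumptions, not the code from this PR: SHA-256, the file ordering, and the function name `build_image_digest` are illustrative choices.

```python
import hashlib
import tempfile
from pathlib import Path


def build_image_digest(dockerfile, context_dir, build_options):
    """Hash the dockerfile, every file in the context directory, and the build options."""
    digest = hashlib.sha256()
    digest.update(Path(dockerfile).read_bytes())
    context = Path(context_dir)
    # Sort the files so the digest does not depend on directory listing order.
    for path in sorted(p for p in context.rglob("*") if p.is_file()):
        digest.update(path.relative_to(context).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
    for option in build_options:
        digest.update(b"\0" + option.encode("utf-8"))
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    context = Path(tmp)
    dockerfile = context / "Dockerfile"
    dockerfile.write_text("FROM ubuntu:18.04\n")
    first = build_image_digest(dockerfile, context, ["--build-arg", "X=1"])
    again = build_image_digest(dockerfile, context, ["--build-arg", "X=1"])
    changed = build_image_digest(dockerfile, context, ["--build-arg", "X=2"])
```

Identical inputs give the same digest, so the image can be found in the cache by its tag; changing any input yields a new tag and therefore a rebuild.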
* adding parallelization for resize bi-linear mode.
* Adding parallelization for resize op.
* Use TrySimpleParallelFor instead of TryParallelFor.
TryParallelFor has an unaddressed issue with its cost model.
* Addressing PR comments.
Transitions from the ORT-only DML NuGet (hosted on the onnxruntime_public feed) to the new unified DirectML NuGet (Microsoft.AI.DirectML) on nuget.org. In addition, the Microsoft.AI.MachineLearning (WinML) and Microsoft.ML.OnnxRuntime.DirectML packages now take a dependency on the Microsoft.AI.DirectML package. This means we can remove the extra copy of DML binaries in these packages since they will be installed by the DML package.
* Add validation of operator registrations to the reduction script
- the script has all the logic to process the registrations, and there's a CI that uses it
Fix some operator registrations
* Fix CUDA PRelu registration
* Refactor to split out kernel registration file parsing and use in the exclude ops script and an op registration validation script.
Run op validation in minimal build CI
* Fix PEP8 error and some comments
* Add copy sparse model in minimal CI
* Add squeeze 13 support
* fix small typo
* Add ut for squeeze in NNAPI
* Fix some issues in the UT and code
* Modify based on the master change
* Fix build break
* Merged PR 5253310: Fix 0-sized dimension broadcasting
Tensors that contain 0-sized dimensions were being broadcast to higher dimensions, which prevented them from being removed from the graph. 0-sized dimensions represent empty tensors, so an operator that needs to broadcast one shouldn't try to call into DML.
* Merged PR 5334334: Fix asserts and failure in GraphKernelHelper.cpp
This extends a workaround needed to match node inputs with Tensors to the EP code handling constant input upload.
This was causing issues in a couple of models, including EfficientDet, although that model still fails due to this bug:
https://microsoft.visualstudio.com/OS/_workitems/edit/29970551
Related work items: #29706035
* Merged PR 5344477: Disable GPU timeouts in DML EP command queue creation
GPU timeouts have already been disabled in command queues created by WinML, but not in the ones created by the DML EP within the ORT API.
* Merged PR 5380534: BatchNormalization failure in autopilot - fix output size
New validation [here](https://microsoft.visualstudio.com/DefaultCollection/WindowsAI/_git/WindowsAI/pullrequest/5354070?_a=files&path=%2Fdml%2FSharedValidation%2FDmlBatchNormalizationOperatorValidator.h) causes some BatchNorm cases to fail (e.g. OnnxConformanceTestsTaef::BatchNormalization (BatchNormalization_2x2x2)). I'm unsure how long this bug existed, but based on Nick's investigation, it apparently still worked anyway.
Related work items: #27678610
* Merged PR 5386132: Update 8D BatchNorm
Related work items: #27678610
* Merged PR 5390213: Tile allow 0 in repeats
0 is a valid value in Tile's "repeats" parameter. The CPU kernel handles it fine, and so should the DML EP.
Related work items: #29970551
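The shape arithmetic behind the 0-sized dimension and Tile fixes above can be sketched in a few lines. This shows only standard broadcasting and Tile shape rules, not the DML EP code; the function names are made up for this example.

```python
def broadcast_shape(a, b):
    """Multidirectional (NumPy-style) broadcast of two shapes."""
    rank = max(len(a), len(b))
    a = (1,) * (rank - len(a)) + tuple(a)
    b = (1,) * (rank - len(b)) + tuple(b)
    out = []
    for x, y in zip(a, b):
        if x == y or y == 1:
            out.append(x)
        elif x == 1:
            out.append(y)
        else:
            raise ValueError(f"incompatible dimensions {x} and {y}")
    return tuple(out)


def tile_shape(shape, repeats):
    """Output shape of Tile: each dimension multiplied by its repeat count."""
    if len(shape) != len(repeats):
        raise ValueError("repeats must have one entry per dimension")
    return tuple(d * r for d, r in zip(shape, repeats))


def is_empty(shape):
    """A tensor is empty when any of its dimensions is 0."""
    return any(d == 0 for d in shape)


broadcast_result = broadcast_shape((2, 0, 3), (1,))
tile_result = tile_shape((2, 3), (0, 2))
```

In both cases the output keeps a 0-sized dimension, so the result is an empty tensor and there is no element data to compute.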
Co-authored-by: Justin Stoecker
Co-authored-by: Jeff Bloomfield
Co-authored-by: Patrice Vignola
Co-authored-by: Nick Feeney