* Add build option to disable traditional ML ops from the binary.
* Fix python tests by splitting tests for ML ops to a separate file. Exclude ML tests from onnx_test_runner and C# tests. Exclude ML op sources.
* Update Edge pkg pipelines with new MLops env variable and fix C# packaging pipeline tests to skip ML ops.
* expose ACL/ARMNN providers to python
* add -acl / -armnn to package name when use_acl / use_armnn is specified
* build python wheel for ARMNN EP
* link ACL/ARMNN EPs into onnxruntime_pybind11_state
* wrong argument order in build_python_wheel for wheel_name_suffix
* Fix a bug and add code to profile memory
1. Compile Send/Recv again (currently broken because of
HOROVOD refactor).
2. Add code to print out initializer allocation size and
activation memory size.
* Address comments
* Split memory counts per locations
* Fix a metric
* ORT on CUDA 11
1. Seperate HOROVOD and MPI
2. Seperate NCCL from HOROVOD in CMakeLists.txt
2. Remove dependency on external cub
3. cudnnSetRNNDescriptor is changed in cuDNN 8.0
* polish the code about MPI/NCCL in CMakeLists.txt and build.py
* check CUDA version
* ${MPI_INCLUDE_DIRS} should be PUBLIC
* sm30, sm50 are deprecated in CUDA 11 Toolkit
* update change based on code review feedback.
* add sm_52
* improve MPI/NCCL build path
Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
Modify gradle build so artifactID has _gpu for GPU builds.
Pass USE_CUDA flag on CUDA build
Adjust publishing pipelines to extract POM from a correct path.
Co-Authored-By: @Craigacp
* Add ArmNN Execution Provider
Add a new execution provider targeting Arm architecture based on ArmNN.
Validated on NXP i.MX8QM CPU with ResNet50, MobileNetv2 and VGG models.
reviewed-by: mike.caraman@nxp.com
* Minor fixes
- renamed onnxruntime_ARMNN_RELU_USECPU to onnxruntime_ARMNN_RELU_USE_CPU
- fixed acl typo
* remove extra includes. added exception for ArmNN in test
* fix indentation
* Separated the activation implementation from the cpu and fixed the blockage from the endif
Co-authored-by: Andrei-Alexandru <andrei-alexandru.avram@nxp.com>
- Add support for ENABLE_LANGUAGE_INTEROP_OPS in training build which is enabled for nightly builds
- Fix passing of environment variables to `sudo docker run` in build definitions
- Fix setup.py package naming logic
* Add amd migraphx execution provider to onnx runtime
* rename MiGraphX to MIGraphX
* remove unnecessary changes in migraphx_execution_provider.cc
* add migraphx EP to tests
* add input requests of the batchnorm operator
* add to support an onnx operator PRelu
* update migrapx dockerfile and removed one unused line
* sync submodules with mater branch
* fixed a small bug
* fix various bugs to run msft real models correctly
* some code cleanup
* fix python file format
* fixed a code style issue
* add default provider for migraphx execution provider
Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>
In this PR, we
1. create some APIs for creating NVTX objects
2. apply those APIs in pipeline-related operators and sequential executor.
As a result, we can explicitly see how a pipeline schedule is run by GPUs in
Nvidia's visual profiler. Note that these APIs are Linux only due to Nvidia's
limited support.
* [java] - adding a cuda enabled test.
* Adding --build_java to the windows gpu ci pipeline.
* Removing a stray line from the unit tests that always enabled CUDA for Java.
* Enable running PEP8 checks via flake8 as part of the build if flake8 is installed.
Update scripts in \tools and \onnxruntime\python. Excluding \onnxruntime\python\tools which needs a lot more work to be PEP8 compliant. Also excluding orttraining\tools for the same reason.
Install flake8 as part of the static_analysis build task in the Win-CPU CI so the checks are run in one CI build.
Update coding standards doc.
Detect os and arch and move the artifacts to a new folder.
Remove unnecesary jars so we cam focus on those we publish.
Add signing
Make signature simlper.
Fix indent.
Halt on 32-bit arch.
Credits: @Craigacp
* Merged PR 4616739: Update QLinear Ops fix 1D support layout
Update QLinear Ops fix 1D support layout
Related work items: #26011523
* Merged PR 4617257: Gather operator DML EP fails with scalar indices and 1D inputs
Fix gather with scalar value.
The ONNX conformance test case is in another PR:
// 0D, axis 1, rank 0 indices tensor
{
"op_type": "Gather",
"axis": 0,
"data": [1,2,3],
"indices": 0,
"output": 1,
"T": "float32"
}
* Merged PR 4632178: Re-enable ORT onnx_test_runner test case (DirectML ConvTranspose validation needs to be loosened to comply with ONNX definition of output_padding)
Re-enable 1D convolution tests.
Related work items: #23499747
* Merged PR 4656672: Make DML EP use Direct queue
While a Compute queue has benefits, Direct is consistent with Winml.
Related work items: #26324112
* Update DML nuget version
* Merged PR 4662079: Update DmlDev branch again from github master
Include Sheil's changes to fix namespace and header file include paths. Without this, the ONNX conformance tests all fail with E_NOTIMPL.
* Increment DML nuget version
Co-authored-by: Nick Feeney <nickfe@microsoft.com>
Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>
* Added aarch64 build pipeline
* Fix build error
* Remove auditwheel repair which doesn't work with cross compiling
* Statically link C++
* Added auditwheel repair back and fix stdlib.h
* Remove extra space
* dashboard integration - first phase
* change a field
* perf scripts
* addressing PR comments
* address comments and fix build
* minor
* make GetConfigFromData() const
* more update for comments
* addressing comments
* more on addressing comments
* minor
* fix build
* add condition check
* more on comments
* retrun status
* remove batch size
* on comments
* rename pkg path
* rename pkg path
* additional commentss
Co-authored-by: Ethan Tao <ettao@microsoft.com>
* add build inbox flag
* remove raw tests and wstring for utf filenames
* enable raw tests
* use ToWideString
* create new utf8 helper
* update string helper to utf8
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
* java - adding support for custom op libraries.
* Adding support for RunOptions and additional methods for SessionOptions and OrtSession.
As a result OrtEnvironment.LoggingLevel moved to be a top level enum
called OrtLoggingLevel.
* java - adding unit tests for RunOptions and SessionOptions.
* java - removing unused releaseNamesHandle method
* java - add test for custom op library.
* java - adding log verbosity methods, and tests for the same.
* java - fixes for custom op loading test on Windows.
* Cleanup after rebase on master.