Fixing Windows builds on the ort_training branch in preparation for the merge to master.
SafeInt (included via onnxruntime/core/common/safeint.h) was recently made a dependency of onnxruntime/core/framework/bfc_arena.h. That requires consumers of bfc_arena to compile with the SafeInt include directory.
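For context, SafeInt is a library for overflow-checked integer arithmetic, which is the kind of check an allocator wants when computing sizes. The sketch below does not use SafeInt or any onnxruntime code; `CheckedMultiply` is a hypothetical stand-in that only illustrates that kind of check on a size computation.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>

// Hypothetical helper for illustration only; not part of SafeInt or onnxruntime.
// Returns a * b, or throws if the product does not fit in std::size_t.
inline std::size_t CheckedMultiply(std::size_t a, std::size_t b) {
  if (a != 0 && b > std::numeric_limits<std::size_t>::max() / a) {
    throw std::overflow_error("size computation overflowed");
  }
  return a * b;
}
```

A caller computing a buffer size as element count times element size would get an exception instead of a silently wrapped value.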
* Migrate winml to Microsoft Namespace (packaging changes are pending)
* add ns_prefix toggle
* fix packaging
* Users/sheilk/add missing raw header (#3484)
* add dualapipartition
* wrong variable for repo root
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
* remove existence check to force failures
* extra paren
* dualapipartition needs to be referenced from the source
* add microsoft.ai.machinelearning.dll to the output dir
* rename the idl file so that assembly info is correctly added into the winmd
* fix namespaces
* update namespaces
* default to microsoft, and add namespace override as build argument
* update cmakesetings.json as well
* remove from cmakelists.txt
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
Co-authored-by: Changming Sun <chasun@microsoft.com>
* Fixed corner cases for the ACL EP GEMM implementation by setting fully connected as the main layer
* Introduced versioned build for the acl ep. ACL versions supported are 1902, 1905 and 1908
* Added convolution-activation fusion optimization for acl ep. We see improvements of 12% for mobilenetv2 and 4% for resnet50
Co-authored-by: Andrei-Alexandru <andrei-alexandru.avram@nxp.com>
1. Fix static analysis warnings found by VC++
2. Add a new pipeline for static analysis
3. Merge all the Windows CI builds into a single YAML file (easier to queue them all).
4. Make DNNL build faster by disabling building the tests and examples.
5. Enable custom op unit test.
Advance ONNX commit to pick up the latest ArgMax, ArgMin,
ReduceMax/ReduceMin, MaxPool.
Declare new versions for CPU/CUDA.
Implement infrastructure support for int8/uint8.
Adjust GatherOp test for a new error.
Adjust Scan9.BadShape test.
Add exclusions for index out of bounds checks.
Rework result verification for SVDTransformer.
* added support for iOS cross-compilation under Linux
* reverted cmake generator change
* if --ios is added protoc can be compiled for host system
* accidentally reverted change to compile protoc for host system for iOS if protoc exe is not set
* wdata is now used
* accidentally pasted CMAKE_OSX_ARCHITECTURES into CmakeLists.txt, also made bad merge on build.py previously
* removed print
* fixed typo, deleted commented statements for earlier debugging
* reverted accidental delete
* added asmmacro.h for aarch64 asm
now MlasSgemmKernel**** gets underscore added if needed
no need anymore to differentiate between iOS arm64 and normal arm64 build
onnxruntime.cmake: added check if iOSCross is set to properly set RPATH
* removed 2 spaces
* fix: logical error fixed, now protoc gets compiled if not supplied with --path_to_protoc_exe
* removed unnecessarily added spaces
* removed some more spaces
1. Copy tensorflow's thread pool class to ORT, so that we can get a better implementation of thread pool based parallelfor
2. Copy Eigen's thread pool class to ORT
3. Support thread affinity
4. Remove RNN kernel’s private thread pool
5. Modify pool kernels to use the thread pool when OpenMP is disabled.
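As a rough illustration of the parallel-for idea in item 1, here is a minimal chunked parallel-for sketch. It is not the TensorFlow, Eigen, or onnxruntime API: `SimpleParallelFor` is a hypothetical name, and for brevity it spawns short-lived `std::thread` workers per call, whereas a real thread pool keeps its workers alive and reuses them across calls.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper for illustration only.
// Splits [0, total) into contiguous chunks and calls fn(begin, end) on each chunk.
// The calling thread processes the first chunk; the remaining chunks run on new threads.
// Assumes fn does not throw.
inline void SimpleParallelFor(
    std::size_t total, std::size_t num_threads,
    const std::function<void(std::size_t, std::size_t)>& fn) {
  if (total == 0) return;
  // Clamp the thread count to the range [1, total].
  num_threads = std::max<std::size_t>(1, std::min(num_threads, total));
  // Round up so that every index is covered.
  const std::size_t chunk = (total + num_threads - 1) / num_threads;

  std::vector<std::thread> workers;
  for (std::size_t begin = chunk; begin < total; begin += chunk) {
    const std::size_t end = std::min(begin + chunk, total);
    workers.emplace_back([&fn, begin, end] { fn(begin, end); });
  }
  fn(0, std::min(chunk, total));       // first chunk on the calling thread
  for (auto& w : workers) w.join();    // wait for the remaining chunks
}
```

Each chunk covers a disjoint index range, so a kernel that writes only to its own output indices needs no locking inside `fn`.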
* Enable sequence of tensor
* add tests
* small updates
* There should only be 2 elements returned
* CR feedback, and another 6->2 check update in the test.
* missing semicolon...
* Add explicit to constructor taking pointer parameter
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>