* add options to disable cpu copy back
* null check properties
* only affect gpu outputs
* change name to disabletensorcpusync
* slight refactoring
* Globally enable ms-experimental ops
* change meaning of ms_experimental to mean *all* ms_experimental ops. Some experimental ops will still be enabled globally without this flag like audio ops.
* remove changes incorrectly merged
* bad merge
* add test
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
* Remove APIs unavailable in Store in #8349, #8178, #8065
* Add UWP stubs of C runtime functions
* Remove UWP incompatible tests from UWP build
* Remove incompatible tests from Store
* Use UWP stubs in store only
* Skip partition check outside of Windows
* Remove unused WRL include
* Workaround Windows header not including what it uses
* Fix precompiled header name clash
* Workaround SDK bugs
* DXCore workaround in Win7
* Fix warning
* Fix more warnings
* Bump WinML to target Windows 8
* Fix more warnings
* Remove unnecessary workarounds
* Remove Desktop only APIs from DML adapter
Bug #31652854 also repros on Qualcomm Adreno (down to the exact same pixel). This change disables this model test for Qualcomm, in addition to the existing disablement for Intel.
* Merged PR 6093117: Fix test_DynamicQuantizedLinear_max_adjusted_expanded by allowing Identity operator to run on non-float inputs
Motivation:
As part of the OnnxConformance Backend tests, DynamicQuantizedLinear_max_adjusted_expanded is failing.
Root Cause:
- The test model has an `Identity` operator as one of its nodes. The input of this node has a non-float data type.
- In DML, the `Identity` operator is registered as an operator that requires floating-point input.
- According to `DirectMLSchema.h`, DML has added support for non-float input to the `Identity` operator, but `OperatorRegistration.cpp` does not reflect this.
Changes:
- Removed all traces of the `requiresFloatFormatsForGraph` flag, from its definition through its usage. This flag was only used for `Identity` and its related operators.
- Added a null check for the graphOutput nodeArg in `GraphDescBuilder.cpp` to stop the test from crashing.
Related work items: #33076298
* Merged PR 6103324: Remove usage of non-generic error code (FWP_E_NULL_POINTER)
Motivation:
Addressing Dwayne's comment on the previous PR. [Ref: [6093117](https://dev.azure.com/microsoft/WindowsAI/_git/onnxruntime/pullrequest/6093117?discussionId=44292162&path=%2Fonnxruntime%2Fcore%2Fproviders%2Fdml%2FDmlExecutionProvider%2Fsrc%2FGraphPartitioner.cpp)]
Changes:
Inside the DML EP, we should not use error codes that are specific to some other platform. Instead, we should use an appropriate generic error code.
Related work items: #33076298
Co-authored-by: Sumit Agarwal <sumitagarwal@microsoft.com>
Add `support64BitTensorsViaEmulation` to the internal registration info, that informs the graph partitioner that int64 is supported via emulation, even if the device doesn't support it natively.
See further description in the corresponding WindowsAI DML PR:
https://dev.azure.com/microsoft/WindowsAI/_git/WindowsAI/pullrequest/6101182
Note a later PR will most likely *delete* this newly added flag and simplify much of the existing logic, even deleting the strides hack completely ^__^...
Related work items: #28761231
* model building
* fix build
* winml adapter model building api
* model building
* make build
* make build again
* add model building with audio op
* inplace and inorder fft
* add ifft
* works!
* cleanup
* add comments
* switch to iterative rather than recursive and use parallelization
* batched parallelization
* fft->dft
* cleanup
* window functions
* add melweightmatrix op
* updates to make spectrogram test work
* push latest
* add onesided
* cleanup
* Clean up building apis and fix mel
* cleanup
* cleanup
* naive stft
* fix test output
* middle c complete
* 3 tones
* cleanup
* signal def new line
* Add save functionality
* Perf improvements, 10x improvement
* cleanup
* use bitreverse lookup table for performance
* implement constant initializers for tensors
* small changes
* add matmul tests
* merge issues
* support add attribute
* add tests for double data type windowfunctions and minor cleanup
* stft onesided/and not tests
* cleanup
* cleanup
* clean up
* cleanup
* remove threading attribute
* forward declare orttypeinfo
* warnings
* fwd declare
* fix warnings
* 1 more warning
* remove saving to e drive...
* cleanup and fix stft test
* add opset picker
* small additions
* add onnxruntime tests
* add signed/unsigned
* fix warning
* fix warning
* finish onnxruntime tests
* make windows namespace build succeed
* add experimental flag
* add experimental api into nuget package
* add experimental api build flag and add to windows ai nuget package
* turn experimental for tests
* add minimum opset version to new experimental domain
* api cleanup
* disable ms experimental ops test when --ms_experimental is not enabled
* add macro behind flag
* remove unused x
* pr feedback
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
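The "switch to iterative rather than recursive" and "use bitreverse lookup table for performance" items above describe a common FFT structure. The following is a minimal sketch of that structure, not the code from this change: `MakeBitReverseTableSketch` and `FftInPlaceSketch` are hypothetical names, the input length is assumed to be a power of two, and the batched parallelization mentioned above is left out.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: maps each index to its bit-reversed counterpart.
// Assumes n is a power of two.
std::vector<std::size_t> MakeBitReverseTableSketch(std::size_t n) {
  assert(n != 0 && (n & (n - 1)) == 0);
  std::size_t bits = 0;
  while ((std::size_t{1} << bits) < n) ++bits;
  std::vector<std::size_t> table(n, 0);
  for (std::size_t i = 1; i < n; ++i) {
    // Reuse the entry for i / 2: shift it down, then put i's low bit on top.
    table[i] = (table[i >> 1] >> 1) | ((i & 1) << (bits - 1));
  }
  return table;
}

// Hypothetical in-place, iterative radix-2 FFT (inverse when `inverse` is true).
void FftInPlaceSketch(std::vector<std::complex<double>>& data, bool inverse = false) {
  const std::size_t n = data.size();
  if (n <= 1) return;

  // Reorder the input once so the butterflies can run in place.
  const std::vector<std::size_t> table = MakeBitReverseTableSketch(n);
  for (std::size_t i = 0; i < n; ++i) {
    if (i < table[i]) std::swap(data[i], data[table[i]]);
  }

  const double pi = std::acos(-1.0);
  const double sign = inverse ? 1.0 : -1.0;
  // Each pass merges blocks of len / 2 into blocks of len; no recursion needed.
  for (std::size_t len = 2; len <= n; len <<= 1) {
    const std::complex<double> w_len =
        std::polar(1.0, sign * 2.0 * pi / static_cast<double>(len));
    for (std::size_t start = 0; start < n; start += len) {
      std::complex<double> w(1.0, 0.0);
      for (std::size_t k = 0; k < len / 2; ++k) {
        const std::complex<double> even = data[start + k];
        const std::complex<double> odd = data[start + k + len / 2] * w;
        data[start + k] = even + odd;
        data[start + k + len / 2] = even - odd;
        w *= w_len;
      }
    }
  }

  if (inverse) {
    for (auto& value : data) value /= static_cast<double>(n);
  }
}
```

In this sketch the table is rebuilt on every call for brevity; caching it per length is presumably where a lookup table pays off.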
* Checkpoint 1
* Remove global logruntime error telemetry. This isn't necessary and doesn't contain relevant information
* Make macro simpler
Co-authored-by: Ryan Lai <ryalai96@gamil.com>
* update load library code to use the fully qualified path
* make it work for syswow32
* Revert "make it work for syswow32"
This reverts commit b9f594341b7cf07241b18d0c376af905edcabae3.
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
* Implement conversion from ortvalue to Itensor for string tensors and comparing sequence of maps of strings to floats
* PR comments
* Add ability to skip gpu tests according to adapter description
* spacing
* spacing
* spacing
* Add suspend handler with new telemetry event
* Fix build warning
* Use cppwinrt from nuget
* Restore nuget packages
* add dependencies
* Add nuget_helpers
* Cleaned up
* Clean up
* Comment
* Add dependencies for the rest
* Remove unused line
* Update activation string
* PR comment to remove ALL
* switch to work PC
* back with iterable of buffers
* add raw api tests
* tensorization
* last test
* all tests pass!
* small cleanup
* whitespace
* newline
* whitespace
* refactor common code into DisjointBufferHelpers
* remove unused file
* warning
* skip gpu tests when hardware not available
* Add error condition when createreference is invoked
* add null check to createreference
* uncomment out check
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
* Model test start with float
* Clean up code and add environment variable detection
* Move into namespace
* PR comments
* Fix linker errors in latest merge to master and also fix warning
* add skipping model test mechanism
* Return std::string instead of writing to buffer
* Address case where env variable is larger than max_path
* use const static string for test reason
* Disable x86 tests and don't build if ort memory checker is enabled
* Add comment
* Add additional failing x86 tests and ifdef for checking for x86 build
* PR comments
* Nuget store packaging
* Move DNNL workaround to EP
* Fix warning as error
* Disable store tests
* Skip store tests
* msbuild target
* Cross compile protoc in Store
* Disable DML in store
* Move store builds to CPU queue
* Copy uap10 to final nuget
* Fix pep8 error
* Remove extra dml copies
* Fix argparse
* pep8
* Forward IsStoreBuild
* Apply is_store_build to duplicate generate_nuspec
* runtimes
* Refactor uap10
* Store .NET
* uap
* PR feedback
* add runtime session id to (de)tensorization events
* append start or stop to the event names and remove opcodes
* add appsessionguid to telemetry events
* Add SetLanguageProjection C Api and use it in four projections
* static cast enum languageprojection to uint32_t
* resolve comments
* fix typo and line added unintentionally
* revert unnecessary change
* reorder c# api
* add TensorAt and CreateAndRegisterAllocator in Csharp to keep the same order as C apis
* make tensorizer events measures
* throttle the events and add a new one SoftwareBitmapToGPUTensorTelemetryEvent
* factor out timing code into a class
* typo
* typo
* move eventimer class into its own header file
* add throttling to detensorization and remove variable timing
* make detensorization events measures as well
* add ConvertGPUTensorToSoftwareBitmapTelemetryEvent event
* de-duplicate event names
* fix comment
* PR feedback
* support Normalized_0_1 and Normalized_1_1
* add tests for Normalized_1_1
* fix build error
* fix imagetests failure
* support detensorization and add more tests
* fix build
* remove added models
* disable gpu tests for CPU pipeline
* refactor based on comments and moved two added models
* merge Normalizer and Denormalizer into NominalRangeConverter
* add comments
* little change
* fix build failure for amd64
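The `Normalized_0_1` and `Normalized_1_1` items above concern mapping pixel values into a nominal range and back. The sketch below illustrates one way such a converter could look. It assumes a plain linear mapping from 8-bit pixel values (0 to 255), which the commit text does not spell out, and all names ending in `Sketch` are hypothetical rather than taken from `NominalRangeConverter`.

```cpp
#include <cassert>

// Hypothetical enum mirroring the range names in the commit text.
enum class NominalRangeSketch { k0_255, kNormalized_0_1, kNormalized_1_1 };

// Maps a pixel value in [0, 255] into the requested range (assumed linear).
float NormalizeSketch(float pixel, NominalRangeSketch range) {
  assert(pixel >= 0.0f && pixel <= 255.0f);
  switch (range) {
    case NominalRangeSketch::kNormalized_0_1:
      return pixel / 255.0f;          // [0, 255] -> [0, 1]
    case NominalRangeSketch::kNormalized_1_1:
      return pixel / 127.5f - 1.0f;   // [0, 255] -> [-1, 1]
    case NominalRangeSketch::k0_255:
      break;
  }
  return pixel;                       // already in [0, 255]
}

// Inverse of NormalizeSketch: maps a value in the given range back to [0, 255].
float DenormalizeSketch(float value, NominalRangeSketch range) {
  switch (range) {
    case NominalRangeSketch::kNormalized_0_1:
      return value * 255.0f;
    case NominalRangeSketch::kNormalized_1_1:
      return (value + 1.0f) * 127.5f;
    case NominalRangeSketch::k0_255:
      break;
  }
  return value;
}
```

Keeping both directions in one place is the kind of consolidation the "merge ... into NominalRangeConverter" item suggests, since the two formulas must stay exact inverses of each other.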
While attempting to throw an error and format an error message about an incompatible binding, WinML dies via FAIL_FAST_IF_MSG because the helper `ToString` function itself croaks :b. Instead, it should just say the data type is undefined.
```
StartGroup: Test:#62; Graph:test_cast_BFLOAT16_to_FLOAT; Executor:WinMLOperatorExecutor_Cpu;
TAEF: A crash with exception code 0xC0000409 occurred in module "Windows.AI.MachineLearning.dll" in process "te.processhost.exe" (pid:15732).
Error: TAEF: [HRESULT 0x800706BE] A failure occurred while running a test operation: 'OnnxConformanceTestsTaef::OnnxBackend'. (A crash with exception code 0xC0000409 occurred in module "Windows.AI.MachineLearning.dll" in the process hosting the test code while invoking a test operation.)
```
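The behavior described above, where the error path should report an undefined data type instead of crashing, can be sketched as follows. This is an illustration under assumptions, not the WinML code: `TensorKindSketch`, `ToStringSketch`, and `FormatBindingErrorSketch` are hypothetical names, and the message format is invented for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical subset of tensor kinds; the real enumeration is larger.
enum class TensorKindSketch { Undefined = 0, Float, Int64, String };

// Total function: every input, including unknown values, yields a name.
// Nothing here throws or fail-fasts, so it is safe to call while an
// error message is being built.
std::string ToStringSketch(TensorKindSketch kind) {
  switch (kind) {
    case TensorKindSketch::Float:
      return "float";
    case TensorKindSketch::Int64:
      return "int64";
    case TensorKindSketch::String:
      return "string";
    case TensorKindSketch::Undefined:
    default:
      return "undefined";
  }
}

// Builds the text for an incompatible-binding error (format is assumed).
std::string FormatBindingErrorSketch(const std::string& feature_name,
                                     TensorKindSketch expected,
                                     TensorKindSketch actual) {
  assert(!feature_name.empty());
  return "Binding '" + feature_name + "' expected " + ToStringSketch(expected) +
         " but got " + ToStringSketch(actual);
}
```

The `default` branch is the point of the sketch: a kind the helper does not recognize, as a bfloat16 input might be, degrades to "undefined" so the original error can still be reported.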
* Add experimental winrt api idl with dummy type to satisfy the build
* remove experimental from the api_lib target
* make experimental api available on windows builds also
* remove /y /d
* revert some pathing changes
* remove experimental api call from tests
* revert cppwinrt cmake changes
* switch to stdapi
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
* Register ILearningModelSessionOptionsNative interface
* Threading options exposed
* Add interrogator for Session options
* Add test
* Polish test
* PR comments
* Set intra op threads
* Add adapter api to grab intra op threads
* Add adapter test for getting intraop num threads
* Make ILearningModelSessionNative and update winml api test
* Make it required when building engine to set the intraop num threads
* Make test more pretty
* Change naming of idl function
* Revert "Change naming of idl function"
This reverts commit c06916aa5bf94e3bf233ed281e508b935fc8638d.
* PR comment on naming
* Skip the test because its result depends on whether the build uses OpenMP
Co-authored-by: Ryan Lai <ryalai96@gamil.com>
* make dml and onnxruntime system32 only when winml and onnxruntime are loaded from system32
* use __ImageBase as that will not incur the unsupported Store API call into GetModuleHandleEx
* remove accidental comment
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
* Expose load tensor proto from protobuf file function
* Add comment
* Remove use of fstream and use parsefromzerocopystream
* Close file descriptor after finish parsing it
* Close input stream too
* Set Close on delete only, no need to close file descriptor
* Revert "Set Close on delete only, no need to close file descriptor"
This reverts commit 5ba6e3c31b.
* Revert "Close input stream too"
This reverts commit 4564776733.
* Revert "Close file descriptor after finish parsing it"
This reverts commit 846e550c4f.
* Revert "Remove use of fstream and use parsefromzerocopystream"
This reverts commit 25a3117183.
* build e2e cppwinrt tests
* add use nuget task
* make all references to package version prop/target-ified
* remove dupe props/targets reference
* work around project.assets.json error by deleting it
* powershell test invocation
* switch to batch script
* print debug info
* update x86->x64
* stdio.h
* pushd/popd
* add csharp tests
* package.config -> packages.config
* typo
* x86 -> anycpu
* debug is default
* add test path
* update csproj as well
* debug
* really replace all package versions
* debug output
* really use [PackageVersion]
* sleep instead of converting async operation to task and waiting
* dont close software bitmap
* switch to powershell script
* remove binding check
* continue on failure
* continue on error action
* continueOnError and errorActionPreference
* tabbing
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
* Move allocators to SessionState so they're decoupled from ExecutionProviders
- when looking up an allocator, the lookup is based on OrtMemoryInfo, not the EP, so SessionState is a more natural place for that information to be stored
- add device based lookup
- simplifies logic for copying feeds/fetches across devices
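As a rough illustration of the lookup described above, the sketch below keeps allocators in a single registry keyed by memory info and adds a device-based lookup on top. It is not the SessionState implementation: every type and function name ending in `Sketch` is hypothetical, and the fields are simplified stand-ins for what `OrtMemoryInfo` carries.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical, simplified stand-ins for device and memory descriptions.
struct DeviceSketch {
  int type;
  int id;
};

struct MemoryInfoSketch {
  std::string name;
  DeviceSketch device;
  int mem_type;
};

// Ordering so MemoryInfoSketch can be used as a std::map key.
inline bool operator<(const MemoryInfoSketch& a, const MemoryInfoSketch& b) {
  return std::tie(a.name, a.device.type, a.device.id, a.mem_type) <
         std::tie(b.name, b.device.type, b.device.id, b.mem_type);
}

struct AllocatorSketch {
  MemoryInfoSketch info;
};

// Owns allocators independently of whichever provider created them.
class AllocatorRegistrySketch {
 public:
  void Register(std::shared_ptr<AllocatorSketch> allocator) {
    assert(allocator != nullptr);
    const MemoryInfoSketch key = allocator->info;  // copy the key before moving
    allocators_[key] = std::move(allocator);
  }

  // Exact lookup by memory info; returns nullptr when nothing matches.
  std::shared_ptr<AllocatorSketch> GetByMemoryInfo(const MemoryInfoSketch& info) const {
    const auto it = allocators_.find(info);
    return it == allocators_.end() ? nullptr : it->second;
  }

  // Device-based lookup: first allocator whose memory lives on the device.
  std::shared_ptr<AllocatorSketch> GetByDevice(const DeviceSketch& device) const {
    for (const auto& entry : allocators_) {
      if (entry.first.device.type == device.type && entry.first.device.id == device.id) {
        return entry.second;
      }
    }
    return nullptr;
  }

 private:
  std::map<MemoryInfoSketch, std::shared_ptr<AllocatorSketch>> allocators_;
};
```

With a registry like this, code that copies feeds or fetches across devices only needs the source and destination device to find an allocator, which is one plausible reading of the "simplifies logic" item.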
Cleanup SessionState and SessionStateInitializer
- provide more things to SessionState at construction time so we don't construct an instance and then immediately call a bunch of setters
- simplify SessionStateInitializer
- reduced down to FinalizeSessionState method
- Add 64-bit for those which now support true 64-bit (Gather, Scatter, OneHot, Cast), and update others (ArgMin, ArgMax, ReverseSequence, TopK, MaxPool, MaxUnpool) which take 64-bit indices but use strided 32-bit fallback.
- Stop forcibly coercing all 64-bit tensors in TensorDesc. Instead, decide in the respective kernels how to behave.
- Update graph partitioning code with enough registration information to know whether (a) 64-bit tensors are not supported at all, (b) they are supported via strided 32-bit fallback, or (c) they are supported via fallback and directly (preferred when the device is capable). Unfortunately this introduces a lot of flag parameters :/.
Related work items: #22265955
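The three cases (a), (b), and (c) in the partitioning item above amount to a small decision table. A minimal sketch of that decision follows. The enum and function names are hypothetical, and treating case (a) as "do not claim the node" is an assumption about what the partitioner does with unsupported operators.

```cpp
#include <cassert>

// Hypothetical per-operator registration info for 64-bit tensors.
enum class Int64SupportSketch {
  kNone,                 // (a) not supported at all
  kStridedFallbackOnly,  // (b) supported only via strided 32-bit fallback
  kFallbackOrNative      // (c) supported via fallback and directly
};

// Hypothetical outcome for a node that uses 64-bit tensors.
enum class Int64PlanSketch { kDoNotClaimNode, kUseStrided32Bit, kUseNative64Bit };

Int64PlanSketch PlanFor64BitTensorsSketch(Int64SupportSketch support,
                                          bool device_supports_native_int64) {
  switch (support) {
    case Int64SupportSketch::kNone:
      return Int64PlanSketch::kDoNotClaimNode;
    case Int64SupportSketch::kStridedFallbackOnly:
      return Int64PlanSketch::kUseStrided32Bit;
    case Int64SupportSketch::kFallbackOrNative:
      // Prefer the direct path when the device is capable.
      return device_supports_native_int64 ? Int64PlanSketch::kUseNative64Bit
                                          : Int64PlanSketch::kUseStrided32Bit;
  }
  assert(false && "unhandled Int64SupportSketch value");
  return Int64PlanSketch::kDoNotClaimNode;
}
```

Folding the cases into one enum is a design choice of this sketch; the commit text notes that the actual change passes this information as several flag parameters.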
* add build inbox flag
* remove raw tests and wstring for utf filenames
* enable raw tests
* use ToWideString
* create new utf8 helper
* update string helper to utf8
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
* Migrate winml to Microsoft Namespace (packaging changes are pending)
* add ns_prefix toggle
* fix packaging
* Users/sheilk/add missing raw header (#3484)
* add dualapipartition
* wrong variable for repo root
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
* remove existence check to force failures
* extra paren
* dualapipartition needs to be referenced from the source
* add microsoft.ai.machinelearning.dll to the output dir
* rename the idl file so that assembly info is correctly added into the winmd
* fix namespaces
* update namespaces
* default to microsoft, and add namespace override as build argument
* update CMakeSettings.json as well
* remove from cmakelists.txt
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
Co-authored-by: Changming Sun <chasun@microsoft.com>