* Nuget store packaging
* Move DNNL workaround to EP
* Fix warning as error
* Disable store tests
* Skip store tests
* msbuild target
* Cross compile protoc in Store
* Disable DML in store
* Move store builds to CPU queue
* Copy uap10 to final nuget
* Fix pep8 error
* Remove extra dml copies
* Fix argparse
* pep8
* Forward IsStoreBuild
* Apply is_store_build to duplicate generate_nuspec
* runtimes
* Refactor uap10
* Store .NET
* uap
* PR feedback
* add runtime session id to (de)tensorization events
* append start or stop to the event names and remove opcodes
* add appsessionguid to telemetry events
* Add SetLanguageProjection C Api and use it in four projections
* static cast enum languageprojection to uint32_t
* resolve comments
* fix typo and line added unintentionally
* revert unnecessary change
* reorder c# api
* add TensorAt and CreateAndRegisterAllocator in Csharp to keep the same order as C apis
* make tensorizer events measures
* throttle the events and add a new one SoftwareBitmapToGPUTensorTelemetryEvent
* factor out timing code into a class
* typo
* typo
* move eventimer class into its own header file
* add throttling to detensorization and remove variable timing
* make detensorization events measures as well
* add ConvertGPUTensorToSoftwareBitmapTelemetryEvent event
* de-duplicate event names
* fix comment
* PR feedback
* support Normalized_0_1 and Normalized_1_1
* add tests for Normalized_1_1
* fix build error
* fix imagetests failure
* support denormalization and add more tests
* fix build
* remove added models
* disable gpu tests for CPU pipeline
* refactor based on comments and moved two added models
* merge normalizer and Denormalizer into NominalRangeConverter
* add comments
* little change
* fix build failure for amd64
While attempting to throw an error and format an error message about an incompatible binding, WinML fail-fasts via FAIL_FAST_IF_MSG because the helper `ToString` function itself fails. Instead, `ToString` should just report the data type as undefined.
```
StartGroup: Test:#62; Graph:test_cast_BFLOAT16_to_FLOAT; Executor:WinMLOperatorExecutor_Cpu;
TAEF: A crash with exception code 0xC0000409 occurred in module "Windows.AI.MachineLearning.dll" in process "te.processhost.exe" (pid:15732).
Error: TAEF: [HRESULT 0x800706BE] A failure occurred while running a test operation: 'OnnxConformanceTestsTaef::OnnxBackend'. (A crash with exception code 0xC0000409 occurred in module "Windows.AI.MachineLearning.dll" in the process hosting the test code while invoking a test operation.)
```
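The intended behavior can be sketched roughly as follows. This is an illustration only, not the WinML code: the names `data_type_to_string` and `incompatible_binding_message` and the contents of the lookup table are hypothetical. The numeric values follow the ONNX `TensorProto.DataType` enum, and BFLOAT16 (16) is deliberately left out of the table to mirror the failing test above.

```python
# Hypothetical sketch: a name lookup that never fails while an error
# message is being built. Table contents are an assumption for illustration.
_DATA_TYPE_NAMES = {
    1: "float",
    2: "uint8",
    6: "int32",
    7: "int64",
    10: "float16",
    11: "double",
}


def data_type_to_string(data_type: int) -> str:
    """Return a printable name, or "undefined" for an unknown type."""
    return _DATA_TYPE_NAMES.get(data_type, "undefined")


def incompatible_binding_message(name: str, expected: int, actual: int) -> str:
    """Format the binding error without ever raising on an unknown type."""
    return (
        f"Incompatible binding for '{name}': expected "
        f"{data_type_to_string(expected)}, got {data_type_to_string(actual)}"
    )
```

With this shape, an unknown data type degrades to the word "undefined" in the message, so the original binding error still reaches the caller.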
* Add experimental winrt api idl with dummy type to satisfy the build
* remove experimental from the api_lib target
* make experimental api available on windows builds also
* remove /y /d
* revert some pathing changes
* remove experimental api call from tests
* revert cppwinrt cmake changes
* switch to stdapi
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
* Register ILearningModelSessionOptionsNative interface
* Threading options exposed
* Add interrogator for Session options
* Add test
* Polish test
* PR comments
* Set intra op threads
* Add adapter api to grab intra op threads
* Add adapter test for getting intraop num threads
* Make ILearningModelSessionNative and update winml api test
* Make it required when building engine to set the intraop num threads
* Make test more pretty
* Change naming of idl function
* Revert "Change naming of idl function"
This reverts commit c06916aa5bf94e3bf233ed281e508b935fc8638d.
* PR comment on naming
* Skip the test because its result depends on whether the build uses openmp
Co-authored-by: Ryan Lai <ryalai96@gamil.com>
* make dml and onnxruntime system32 only when winml and onnxruntime is loaded from system32
* use __ImageBase as that will not incur the unsupported store api call into GetModuleHandleEx
* remove accidental comment
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
* Expose load tensor proto from protobuf file function
* Add comment
* Remove use of fstream and use parsefromzerocopystream
* Close file descriptor after finish parsing it
* Close input stream too
* Set Close on delete only, no need to close file descriptor
* Revert "Set Close on delete only, no need to close file descriptor"
This reverts commit 5ba6e3c31b.
* Revert "Close input stream too"
This reverts commit 4564776733.
* Revert "Close file descriptor after finish parsing it"
This reverts commit 846e550c4f.
* Revert "Remove use of fstream and use parsefromzerocopystream"
This reverts commit 25a3117183.
* build e2e cppwinrt tests
* add use nuget task
* make all referenced to package version prop/target-ified
* remove dupe props/targets reference
* work around project.assets.json error by deleting it
* powershell test invocation
* switch to batch script
* print debug info
* update x86->x64
* stdio.h
* pushd/popd
* add csharp tests
* package.config -> packages.config
* typo
* x86 -> anycpu
* debug is default
* add test path
* update csproj as well
* debug
* really replace all package versions
* debug output
* really use [PackageVersion]
* sleep instead of converting async operation to task and waiting
* dont close software bitmap
* switch to powershell script
* remove binding check
* continue on failure
* continue on error action
* continueOnError and errorActionPreference
* tabbing
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
* Move allocators to SessionState so they're decoupled from ExecutionProviders
- when looking up an allocator it's based on OrtMemoryInfo not the EP so SessionState is a more natural place for that information to be stored
- add device based lookup
- simplifies logic for copying feeds/fetches across devices
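The lookup described above could look roughly like this. It is a minimal sketch, not the onnxruntime implementation: `Device`, `MemoryInfo`, `Allocator`, and the `SessionState` methods shown here are hypothetical stand-ins, and the "first allocator registered for a device wins" rule is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Device:
    kind: str          # e.g. "CPU" or "GPU" (hypothetical labels)
    device_id: int = 0


@dataclass(frozen=True)
class MemoryInfo:
    name: str
    device: Device
    mem_type: str = "default"


class Allocator:
    def __init__(self, info: MemoryInfo) -> None:
        self.info = info


class SessionState:
    """Owns the allocators, so lookups do not go through an execution provider."""

    def __init__(self, allocators: Iterable[Allocator]) -> None:
        self._by_info: Dict[MemoryInfo, Allocator] = {}
        self._by_device: Dict[Device, Allocator] = {}
        for allocator in allocators:
            self._by_info[allocator.info] = allocator
            # Assumption: the first allocator registered for a device wins.
            self._by_device.setdefault(allocator.info.device, allocator)

    def get_allocator(self, info: MemoryInfo) -> Optional[Allocator]:
        return self._by_info.get(info)

    def get_allocator_for_device(self, device: Device) -> Optional[Allocator]:
        return self._by_device.get(device)
```

Copying feeds or fetches across devices then only needs the source and destination device to find an allocator, without knowing which execution provider created it.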
Cleanup SessionState and SessionStateInitializer
- provide more things to SessionState at construction time so we don't construct an instance and then immediately call a bunch of setters
- simplify SessionStateInitializer
- reduced down to FinalizeSessionState method
- Add 64-bit support for operators which now support true 64-bit (Gather, Scatter, OneHot, Cast), and update others (ArgMin, ArgMax, ReverseSequence, TopK, MaxPool, MaxUnpool) which take 64-bit indices but use strided 32-bit fallback.
- Stop forcibly coercing all 64-bit tensors in TensorDesc. Instead, decide in the respective kernels how to behave.
- Update graph partitioning code with enough registration information to know whether (a) 64-bit tensors are not supported at all, (b) they are supported via strided 32-bit fallback, or (c) they are supported both via fallback and directly (preferred when the device is capable). Unfortunately this introduces a lot of flag parameters :/.
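A rough sketch of the three support levels and of one reading of "strided 32-bit fallback". The assumption here is that the fallback views each 64-bit element as two 32-bit words and reads only the low word with a stride of two; that is an interpretation, not something this log confirms. `Int64Support`, `choose_path`, and `low_dwords` are hypothetical names.

```python
import struct
from enum import Enum
from typing import List, Sequence


class Int64Support(Enum):
    NONE = "none"                                # (a) not supported at all
    FALLBACK_ONLY = "fallback_only"              # (b) strided 32-bit fallback
    FALLBACK_AND_DIRECT = "fallback_and_direct"  # (c) fallback and direct


def choose_path(support: Int64Support, device_has_native_int64: bool) -> str:
    """Pick how a kernel should treat a 64-bit tensor."""
    if support is Int64Support.NONE:
        return "unsupported"
    if support is Int64Support.FALLBACK_AND_DIRECT and device_has_native_int64:
        return "direct"  # preferred when the device is capable
    return "strided32"


def low_dwords(values: Sequence[int]) -> List[int]:
    """Read int64 values as little-endian int32 words, keeping every other one."""
    raw = struct.pack(f"<{len(values)}q", *values)
    words = struct.unpack(f"<{2 * len(values)}i", raw)
    return list(words[::2])  # stride of two selects the low 32 bits
```

Under this reading the fallback is exact only for values that fit in 32 bits: `low_dwords([2**32 + 7])` yields `[7]`, which is why a direct path is preferable where the device supports it.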
Related work items: #22265955