onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-18 21:21:17 +00:00

Author	SHA1	Message	Date
RandySheriffH	75584c5fa8	Enabling thread pool to be numa-aware (#13778 ) The PR enables ort thread pool to be numa-aware, so that threads could be evenly created and distributed among numa nodes. In addition, to facilitate performance tuning, the PR opens a new API allowing customers to attach threads to certain logical processors. Please check the API [definition](https://github.com/microsoft/onnxruntime/pull/13778/files#diff-5845a5c76fb64abdc8f0cffe21b37f8da1712674eb3abc4cd87190891be1bd48) for details. Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2022-12-12 10:33:55 -08:00
Abhishek Udupa	83c59d2594	Session-aware and thread-safe CUDA profiler (#13706 ) ### Description The existing CUDA profiler is neither session-aware, nor thread-safe. This PR ensures both. ### Motivation and Context [PR 13549](https://github.com/microsoft/onnxruntime/pull/13549) brought thread-safety and session-awareness to the ROCm profiler. This PR brings the same goodness to the CUDA profiler as well. Sample outputs of a profiling run from the StableDiffusion model (this model was chosen because it requires orchestration of multiple sessions, and verifies that the profilers are now indeed session-aware) on both CUDA and ROCm EPs are attached, along with a script that checks that the trace files generated by the profile are well-formed. Update 11/29: Updated the profile outputs. The older profile outputs exhibited an issue where some timestamps were wildly out of range, leading to problems visualizing the traces. The bug has been fixed and the profile outputs have been updated, along with an update to the check script to ensure that timestamps are monotonically increasing. [sd_profile_outputs_cuda.tar.gz](https://github.com/microsoft/onnxruntime/files/10118088/sd_profile_outputs_cuda.tar.gz) [sd_profile_outputs_rocm.tar.gz](https://github.com/microsoft/onnxruntime/files/10118089/sd_profile_outputs_rocm.tar.gz) [check_profile_output_well_formedness.zip](https://github.com/microsoft/onnxruntime/files/10118090/check_profile_output_well_formedness.zip) Co-authored-by: Abhishek Udupa <abhishek.udupa@microsoft.com>	2022-12-09 13:22:12 -08:00
Sumit Agarwal	5b16593192	[DML EP] Attention Kernel bug fix (#13879 ) ### Description - Use same data type as input for mask_index tensor which is used as DML GEMM API's C parameter. - Remove gsl header include as it already gets included transitively. ### Motivation and Context - Why is this change required? What problem does it solve? Bug found in internal conformance testing. - If it fixes an open issue, please link to the issue here. N/A	2022-12-07 15:24:27 -08:00
Yi Zhang	ae2a9373ab	reenable quant model tests (#13871 ) ### Description ### Motivation and Context Test data in the image has been fixed.	2022-12-07 23:33:22 +08:00
Numfor Tiapo	e0dcbc3832	Fix C26436 prefast errors (#13774 ) Fixes errors 9196, 9214, 9255, and 9314. Co-authored-by: Numfor Mbiziwo-Tiapo <numform@microsoft.com>	2022-12-01 09:07:44 -08:00
Numfor Tiapo	aa1390e963	Fix Prefast Errors (#13675 ) Fixes all C28204, C6031, and C26814 prefast errors. Co-authored-by: Numfor Mbiziwo-Tiapo <numform@microsoft.com>	2022-11-28 09:16:22 -08:00
Yi Zhang	a9a9c34d98	Fix WinML Test Case: create LearningModelBinding for every testcase (#13587 ) ### Description Fix #13509 ### Motivation and Context The exception was caused by the incorrect fetches, which was from the binding with last test cases. `efcbdac58e/onnxruntime/core/session/onnxruntime_c_api.cc (L809-L815)`	2022-11-09 11:20:48 +08:00
Numfor Tiapo	49e5a11ccd	Fix SDL and Prefast Errors (#13465 ) Fixes Errors 1978844, 1978870, 1978850, 1978855, and 9245 Co-authored-by: Numfor Mbiziwo-Tiapo <numform@microsoft.com>	2022-10-28 09:41:18 -07:00
Yi Zhang	e160688a9b	Skip some failed models winml and training workflows on Windows CPU (#13407 ) ### Description 1. update model name structure in model_tests.cpp with source name. To avoid `Condition test_param_names.count(param_name) == 0 failed. Duplicate parameterized test name 'BERT_Squad_opset10_CPU'` 2. skip some failed models https://github.com/onnx/models/issues/568	2022-10-25 10:05:04 +08:00
Numfor Tiapo	56387c3c31	Fix SDL Unmatched Annotation Errors (#13162 ) Fixes 3 SDL unmatched annotation errors. Co-authored-by: Numfor Mbiziwo-Tiapo <numform@microsoft.com>	2022-09-30 15:36:30 -07:00
Brian Martin	c20abcab87	User/brianma/eo (#13152 ) fixing SDL issues. One was a SAL mismatch, the other was handling an optional null pointer.	2022-09-30 09:43:56 -07:00
Edward Chen	454f77cd94	Update kernel matching logic: decouple from op schemas and remove kernel def hashes (#12791 ) # Motivation Currently, ORT minimal builds use kernel def hashes to map from nodes to kernels to execute when loading the model. As the kernel def hashes must be known ahead of time, this works for statically registered kernels. This works well for the CPU EP. For this approach to work, the kernel def hashes must also be known at ORT format model conversion time, which means the EP with statically registered kernels must also be enabled then. This is not an issue for the always-available CPU EP. However, we do not want to require that any EP which statically registers kernels is always available too. Consequently, we explore another approach to match nodes to kernels that does not rely on kernel def hashes. An added benefit of this is the possibility of moving away from kernel def hashes completely, which would eliminate the maintenance burden of keeping the hashes stable. # Approach In a full build, ORT uses some information from the ONNX op schema to match a node to a kernel. We want to avoid including the ONNX op schema in a minimal build to reduce binary size. Essentially, we take the necessary information from the ONNX op schema and make it available in a minimal build. We decouple the ONNX op schema from the kernel matching logic. The kernel matching logic instead relies on per-op information which can either be obtained from the ONNX op schema or another source. This per-op information must be available in a minimal build when there are no ONNX op schemas. We put it in the ORT format model. Existing uses of kernel def hashes to look up kernels are replaced with the updated kernel matching logic. We no longer store kernel def hashes in the ORT format model’s session state and runtime optimization representations. We no longer keep the logic to generate and ensure stability of kernel def hashes.	2022-09-20 14:24:59 -07:00
Sumit Agarwal	f78ed1388a	Fixed build break: inbox version of WindowsAI repo	2022-09-09 18:25:01 -07:00
Sumit Agarwal	bcdddb47ba	Merge remote-tracking branch 'origin/main' into WindowsAI	2022-09-09 17:34:48 -07:00
sumitsays	05c65a54b3	[DML EP] Contrib Op: FusedMatMul (#12898 ) * Contrib Op: FusedMatMul for DML EP * Added relevant comments and extra validation * Polish * More polish * Last polish * Addressed comment on the PR * Addressed comment on the PR * Removed unnecessary comments * Used c++ standard function * used std::c++ algorithms function * Removed unused code Co-authored-by: Sumit Agarwal <sumitagarwal@microsoft.com> Co-authored-by: Dwayne Robinson <fdwr@hotmail.com>	2022-09-09 09:37:38 -07:00
Sheil Kumar	535b0835f2	User/sheilk/dft fixes (#12862 ) * DirectML DFT Tests and Fixes * Dynamically allocate temporaries using the allocator... * Allocate during compute * wrong dims * CR feedback	2022-09-07 13:21:56 -07:00
Yulong Wang	1a402a3f25	replace 'master' branch ref to 'main' for onnx repo (#12678 )	2022-08-30 13:41:42 -07:00
Chun-Wei Chen	6246662b1d	[Dup] Fix SAME_UPPER/SAME_LOWER (auto_pad attribute) in ConvTranspose (#12537 ) * Fix SAME_UPPER/SAME_LOWER (auto_pad attribute) in ConvTranspose * Bump ONNX 1.10.2 globally * load ONNX_VERSION from VERSION_NUMBER * / * revert deprecate warning in ORT 1.12 * add a comment about why removing cntk_simple_seg * correct the implem in DML as well	2022-08-22 15:35:34 -07:00
Edward Chen	3efd9a73bb	Refactor InferenceSession Load member functions. (#12430 ) Fix comparison of path characters when checking for ".ort" suffix. Some clean up of InferenceSession Load functions. - Reduce duplication between std::string/std::wstring versions. - Renaming for clarity.	2022-08-03 16:28:26 -07:00
Sheil Kumar	7d712c8f8b	Fix WinML Tests are still targetting deprecated (deleted) experimental signal op definitions (#12006 ) * fix winml tests * remove legacy test * switch idft -> dft+inverse attr * upgrade opset 13->17 for signal ops tests	2022-06-27 16:35:50 -07:00
Dmitri Smirnov	267a424e52	Retry Rework execution frame to reduce memory allocations (#11897 ) * Revert "Revert "Refactor ExecutionFrame and SessionState to reduce memory all… (#11888)" This reverts commit `d2cbae3a04`. * Revert prepacked_weights to avoid indirect inclusion in CUDA and TRT code that breaks the build.	2022-06-20 10:29:43 -07:00
Yi Zhang	d2cbae3a04	Revert "Refactor ExecutionFrame and SessionState to reduce memory all… (#11888 ) Revert "Refactor ExecutionFrame and SessionState to reduce memory allocations and improve data locality (#11804)" This reverts commit `2ecba6fd25`.	2022-06-17 17:07:21 +08:00
Dmitri Smirnov	2ecba6fd25	Refactor ExecutionFrame and SessionState to reduce memory allocations and improve data locality (#11804 ) Refactor ExecutionFrame and SessionState for better data locality and less memory allocations.	2022-06-16 16:50:48 -07:00
Gary Miguel	e8b0d24071	Support per-test tolerances for ONNX tests (#11775 ) Prior to this every test shared the same tolerances. This meant that if an ONNX test failed due to a small but acceptable difference in output, the only alternative was to disable the test entirely. In op set 17, the DFT operator is being added. Without this change, the tests for that operator fail because the output is off by about 5e-5. It's better to keep test coverage for this new op rather than disable the test entirely. Also prior to this change, the global tolerances were not shared between C++, JavaScript, and Python tests. Now they are. Also fix various minor issues raised by linters. Unblocks https://github.com/microsoft/onnxruntime/issues/11640.	2022-06-14 15:12:23 -07:00
Sheil Kumar	22739137c4	Update signal op defs to match onnx17 defs, and add more tests (#11631 )	2022-05-28 16:00:09 -07:00
Sheil Kumar	6255194659	All LearningModelSessions created from a common LearningModelDevice should share the same thread pool (#11457 ) * Share thread pools between devices * make tests reuse device * Change cpu thread pool options for dml sessions to use 1 thread with no spinning * fix test failure * Update missing type constraints for dft * Add comment and rename inference session parameter * default missing causing inconsistent test behavior Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2022-05-13 11:12:43 -07:00
Sheil Kumar	85fa168dc1	Add optional dft_length input to the DFT and IDFT operators. (#11427 ) * Add optional dft_length input. * CR Feedback Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2022-05-03 16:17:43 -07:00
Sheil Kumar	027565b3b2	Add multi-dim dft test, and fix complex idft (#10947 ) * fix complex multi-dim dft * Add multi-dim dft test, and fix complex idft * remove incorrect inplace specification * Add DFT tests * update epsilon to 1000ths place Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2022-03-22 10:08:12 -07:00
Sheil Kumar	810c18e809	fix complex multi-dim dft (#10896 ) Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2022-03-17 12:45:51 -07:00
Sheil Kumar	860f28254e	Update DFT definition to more closely align with PyTorch by enabling axis attribute, and arbitrary tensor rank. (#10842 ) * Add axis attribute * fix breaks * Enable axis-specified DFT * remove static cast Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2022-03-15 15:27:12 -07:00
Jingqiao Fu	f4fd67cc2c	Revert "add load from buffer (#10162 )" (#10590 ) This reverts commit `5cd57bb726`.	2022-03-08 13:35:23 -08:00
Numfor Tiapo	9ad95bf068	Skip SetName test on inbox build (#10699 )	2022-03-02 10:28:58 -08:00
Numfor Tiapo	5fbfca3d58	Add Experimental API for setting model name (#10518 ) * Add experimental API for editing model name * Change EditModelName to 'SetName' * Change API to pass c_string * Update SetName to edit the proto * Test that the model proto gets changed * Remove comments * Skip inbox tests * Use filehelper path Co-authored-by: Numfor Mbiziwo-Tiapo <numform@microsoft.com>	2022-02-25 14:23:49 -08:00
Dwayne Robinson	ea7f773a6e	Merge pull request #10619 from microsoft/user/dwayner/DmlDev20220221 Update DirectML EP for ORT 1.11	2022-02-23 01:09:26 -08:00
Dwayne Robinson	6db6ee5710	Merged PR 6973543: ORT DML EP Opset 13 more complete Extend opset 13 support for: - Split-13 - Squeeze-13 - Unsqueeze-13 - Reshape-13 - QuantizeLinear-13 - DequantizeLinear-13 - ReduceSum-13 - Resize-13 Also: - Rename the file where all the opset versions are stored from "OperatorRegistration.h" to "OperatorVersions.h", which will make it much less confusing in the future when looking given there's another file called "OperatorRegistration.h" that corresponds to "OperatorRegistration.cpp". - Detemplatize many of the OperatorHelper.h constructors, which duplicate multiple instantiations due to the operator helper classes not sharing a common base class, by wrapping them with an adapter. Ideally there would be a common COM base interface that both IMLOperatorKernelCreationContext and IMLOperatorShapeInferenceContext implementation objects would implement, which a wrapper in MLOperatorAuthorHelper.h could QI for. - Fix style formatting issues in OperatorHelper.h (sorry for the noise). ``` Summary: Total=4679, Passed=4355, Failed=0, Blocked=0, Not Run=0, Skipped=324 ``` Corresponding WindowsAI PR: https://microsoft.visualstudio.com/WindowsAI/_git/WindowsAI/pullrequest/6973645 Related work items: #36672908, #36672926	2022-02-18 01:41:07 +00:00
Jingqiao Fu	2fa333443a	Add telemetry for device kind (#10431 ) Add telemetry for device kind	2022-02-17 13:56:22 -08:00
Dwayne Robinson	6fd7ba5b7e	Merged PR 6917440: ONNX Runtime update from GitHub master Just RI. Related work items: #38034064	2022-02-04 10:13:38 +00:00
Sheil Kumar	2dd5e75ba8	Incorrect output after GPU to GPU inference via VideoFrame and Gray8 models (#10425 ) * If the tensor is of gray8 format, we should call the gray8 shader * other check (which resolves to unknown in this case) is incorrectly being compared to constant and not DXGI_FORMAT Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2022-01-28 08:45:57 -08:00
Ryan Lai	c07e251cec	Merged PR 6835169: RI 12/9/21 - 01/12/22 Build is green https://microsoft.visualstudio.com/WindowsAI/_build/results?buildId=43713985&view=results ![image.png](https://microsoft.visualstudio.com/274e76ac-6b29-4f77-a85d-7914c77cabd5/_apis/git/repositories/853d2ddc-663c-4fe8-8036-dbf0d50db2d9/pullRequests/6835169/attachments/image.png) Related work items: #37712737	2022-01-13 00:25:51 +00:00
Jingqiao Fu	5cd57bb726	add load from buffer (#10162 ) * Add LoadFromBuffer API	2022-01-10 10:51:48 -08:00
Dwayne Robinson	0f5e82c294	DirectML EP remove stale code for int64 via int32 double strides (#9959 )	2022-01-10 02:07:22 -08:00
Dwayne Robinson	4ff78aae45	Merge pull request #9917 from microsoft/user/dwayner/FnsCandyTolerance30696168 Update WinML model tests for FNS candy and Inception float16	2021-12-02 22:45:45 -08:00
Sheil Kumar	5edaa75ef6	Fix LoadFromStream to not use wss::Buffer internally (#9918 ) Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2021-12-02 21:29:06 -08:00
Dwayne Robinson	6e4c534ce2	Relax tolerance slightly more for Intel after autopilot run	2021-12-02 19:42:31 -08:00
Dwayne Robinson	77e67a6de7	Add one more example line	2021-12-02 13:34:01 -08:00
Dwayne Robinson	ef7671b938	Comment out old lines	2021-12-02 13:30:34 -08:00
Dwayne Robinson	7a3abd863f	Update WinML model test tolerances for tiny_yolov2 and FNS_Candy	2021-12-02 00:48:54 -08:00
Ryan Lai	d8a7e1d159	Merged PR 6718335: RI 11/30 from github Pipeline green https://microsoft.visualstudio.com/WindowsAI/_build/results?buildId=42142807&view=results ![image.png](https://microsoft.visualstudio.com/274e76ac-6b29-4f77-a85d-7914c77cabd5/_apis/git/repositories/853d2ddc-663c-4fe8-8036-dbf0d50db2d9/pullRequests/6718335/attachments/image.png) Related work items: #37220320	2021-11-30 21:29:25 +00:00
Sheil Kumar	53c43e9949	WinML RT API: Add PixelRange Metadata to Bind() call PropertySet (#9827 ) * Enable Normalization Binding Metadata * copy paste error * Small fix. Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2021-11-24 13:44:25 -08:00
nums11	533b20c6ca	Merge remote-tracking branch 'upstream/master' into dmldev_temp	2021-11-18 14:21:34 -08:00


231 commits