onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-18 18:52:16 +00:00

Author	SHA1	Message	Date
Cassie Breviu	3e57cd88fc	Csharp docfx update (#12755 ) * update dest to csharp folder, update ci to remove unused files, update git ignore * add test branch to ci	2022-08-29 08:13:45 -05:00
Baiju Meswani	80c8d934b8	Add debug option to packaging pipeline (#12685 )	2022-08-26 20:25:52 -07:00
mwootton	817dc94345	Add first pass of rocm kernel profiler (#10911 ) * Add first pass of rocm kernel profiler * Clean up rocm_profiler. Format args. Demangle kernel names. Add Api EventRecords * Remove debug output * Temporarily disable profiling unit test 'api record check' for cupti * Fix compile error for non-gpu builds * Use common file for demangle and pid/tid. Namespace ThreadUtil. Fix gpu buffer clearing. * Merge demangle into profiler_common * Merge demangle into profiler_common part 2 * Style cleanup * Resolve linking issues via ProviderHost interface * Demangle cuda kernel names * Clean up comments * Fix formatting * Fix anal retentive formatting	2022-08-26 19:38:03 -07:00
Adam Louly	ee543a47f6	upgrade cuda version on ci pipelines (training CI pipelines) (#12708 ) * upgrade cuda version on ci pipelines * keeping folder name same * keeping folder name same * setting manual seed for primitive test case * resolving comments * changing atol and rtrol only for test case Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-08-26 16:51:19 -07:00
edgchen1	64e8806148	Address some static analysis warnings.	2022-08-26 15:05:53 -07:00
edgchen1	c270ea148a	Move 'using common::Status;' from common.h to status.h.	2022-08-26 15:05:53 -07:00
Dmitri Smirnov	3ff75fa05f	Address static analysis warnings (#12711 ) Address static analysis warnings	2022-08-26 14:24:14 -07:00
Baiju Meswani	34d90dd5bd	mac-objc-static-analysis-ci-pipeline increase timeout (#12737 )	2022-08-26 12:49:49 -07:00
Chi Lo	c9fd193ef6	Make TRT EP fully support control flow op and its subgraphs (#12692 ) * sync graph proto in node's attributes * Don't fuse nodes of control flow op until later in control flow op level * remove unnecessary ep funtions * remove unnecessary ep funtions * remove unnecessary ep funtions * missing 'override' keyword which makes MacOS/Web CI fail * Add one more test run for Test3LayerNestedSubgraph with disabling graph optimization * Update the comments to better understand the 4 cases	2022-08-26 12:45:47 -07:00
Yi-Hong Lyu	a972db06bf	Disable SYMMQGEMM benchmark for CPU other than ARM (#12739 ) Besides, MlasGemmPackBSize should be MlasSymmQgemmPackBSize instead	2022-08-26 01:47:21 -07:00
cloudhan	5bdb1d4146	Add Tunable GEMM composed from rocblas and composable kernels (#12599 ) * Add tunable gemm	2022-08-26 14:32:56 +08:00
cloudhan	46c074a6c8	Update composable kernel and enable experimental inter wave scheduling (#12626 ) Update ck to latest master and enable interwave scheduling	2022-08-25 22:19:41 -07:00
Adam Louly	3bb5fb0f90	moving training pipelines from cuda 11.5 to 11.6 and deprecating 11.3 (packaging pipeline) (#12688 ) * moving training pipelines from cuda 11.5 to 11.6 and deprecating cuda 11.3 * change to cuda 11.6.2 * change pytorch's & torchvision's cuda version to 11.6 * specify deps version to 11.6.2 * update pytorch and torch text version * torch 1.12.1 * change torchvision and torchtext version to be compatible with torch 1.12.1 * change cuda to 11.6 for cuda_home comaptibility Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-08-25 22:12:01 -07:00
cloudhan	f76b40aa5b	Change TunableOp to use a type erased interface (#12597 ) * Change to type erased interface, so that there is no need to implement a class for a simple kernel launch function	2022-08-25 19:46:04 -07:00
Cheng	baf141a084	Enable xnnpack EP in Android AAR package (#12720 ) * take new features to export symbols * comments to explain why	2022-08-26 10:29:23 +08:00
Scott McKay	8483b9c6e3	MacOS pipeline and MAUI CoreML fixes (#12724 ) * Add asm statement to model.mm to force linker to link against CoreML.Framework. Update targets.xml as per Rolf's suggestions * Remove explicit numpy version from macos build. We don't specify it for other CIs and the version specified doesn't have a pre-built 3.10 wheel. This leads to the CI attempting to build numpy which fails.	2022-08-26 08:51:37 +10:00
abhi-ort	ebff15d743	Pinning manual seed (#12714 )	2022-08-25 10:09:02 -07:00
Cassie Breviu	e85dce8cea	Add csharp docfx (#12596 ) * add docfx and gh action to build docs * kick off build from feature branch * Fix LGTM linting * update az pipeline to win22 & remove nuget install * remove azure ci changes * fix implicit using to support 5.0 * fix more js issues * remove resource designer changes * remove space * fix linting misspellings in autogenerated js temp * fix misspellings in generated code * delete log file	2022-08-25 09:51:32 -05:00
Vincent Wang	5104c7dbd3	Fix Prefast Warnings (#12717 ) fix prefast warnings	2022-08-25 17:09:37 +08:00
Yulong Wang	5be3e87c71	[js] upgrade minimist@1.2.6 (#12689 )	2022-08-25 01:40:42 -07:00
Hariharan Seshadri	cde504ebbf	Fix/Suppress some VC static analyzer warnings (#12713 )	2022-08-24 23:39:40 -07:00
Yi Zhang	dee2fdffb0	Remove debug build/test in Mac CPU training (#12698 ) * run mac training parallely * update jobname * remove debug build/test	2022-08-25 13:38:53 +08:00
Yi Zhang	d91f017da1	remove redundant publish unit test results (#12697 ) rm redundant publish unit test results	2022-08-25 11:18:07 +08:00
Cheng	eba4f77d00	enable xnnpack in default_full_aar_build_settings (#12682 )	2022-08-25 10:41:06 +08:00
Pranav Sharma	f1528ea50f	Fix arithmetic overflow warning. (#12712 ) Fix arithmetic overflow warning. Suggested fix by static analysis tool Arithmetic overflow: Using operator '+' on a 4 byte value and then casting the result to a 8 byte value. Cast the value to the wider type before calling operator '+' to avoid overflow (io.2).	2022-08-24 18:27:30 -07:00
Changming Sun	7927d525a7	Remove CUDNN path from CI build scripts (#12671 )	2022-08-24 18:21:50 -07:00
Dwayne Robinson	3f47119f33	DML EP Fix InstanceNormalization with 3D tensors (#12693 ) Fix InstanceNormalization with 3D tensors	2022-08-24 14:58:38 -07:00
Adam Louly	94f76b944e	nightly pipeline build using PTCA image. (#12605 ) * nightly pipeline yaml and requirements files * changed names, removed torchvision installing * delete old file Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-08-24 10:40:55 -07:00
Nat Kershaw (MSFT)	0757d51334	Fix Java api docs broken link (#12686 )	2022-08-24 09:56:51 -07:00
Vincent Wang	53ecb9e635	Update Supporting DS Version to 0.7.1 for ORTModule (#12696 ) update ds version support for fp16_optimizer	2022-08-24 14:56:12 +08:00
Yi Zhang	de3d772995	Check GCC version (#12680 ) * check gcc version	2022-08-24 12:10:08 +08:00
Edward Chen	8d657de4b2	Update Newtonsoft.Json version to 13.0.1. (#12691 )	2022-08-23 18:45:38 -07:00
abhi-ort	73e5741a9a	Enabling softmax grad and logsoftmax grad on ORT (#12614 ) * Enabling softmax grad and logsoftmax grad on ORT * formatting changes * formatting changes * reverting changes * Changing the OpType	2022-08-23 15:49:02 -07:00
Changming Sun	cb2601c5ea	Update mac-ci.yml to increase macOS build jobs' timeout value to 3 hours (#12675 )	2022-08-22 21:31:30 -07:00
Tianlei Wu	8d78f96dfe	[CUDA] Fuse add bias and transpose into one kernel in Attention (#12670 ) * fuse add bias and transpose in attention	2022-08-22 15:46:13 -07:00
Chun-Wei Chen	6246662b1d	[Dup] Fix SAME_UPPER/SAME_LOWER (auto_pad attribute) in ConvTranspose (#12537 ) * Fix SAME_UPPER/SAME_LOWER (auto_pad attribute) in ConvTranspose * Bump ONNX 1.10.2 globally * load ONNX_VERSION from VERSION_NUMBER * / * revert deprecate warning in ORT 1.12 * add a comment about why removing cntk_simple_seg * correct the implem in DML as well	2022-08-22 15:35:34 -07:00
Yulong Wang	c144acc534	Replace 'master' branch ref to 'main' in the code (#12547 )	2022-08-22 10:48:12 -07:00
Tianlei Wu	d93e6533b7	Format bert or transformers code (#12646 ) (1) Modify some lines to fit line length limit 120 (2) Adjust parameter order of LaunchAttentionKernel (3) Format code with Clang-Format in VS Code (4) Fix spelling errors	2022-08-22 10:18:52 -07:00
Wei-Sheng Chin	dc486d146b	Make ORT callable from various Pytorch compilers (LazyTensor, TorchDynamo, etc) (#10460 ) * Make ORT as Pytorch JIT backend LORT likely doesn't work with aten fallback so we only test LORT in its own CI. * Revert changes to enable external CUDA allocator. Will add it later. Revert "Revert changes to enable external CUDA allocator. Will add it later." This reverts commit d5487f2e193014c805505afae8fb577c53667658. Fix external allocator * Relax tolerance and remove commented code * Print more information in CI * Fix pointer * Address comments. 1. Reuse ORT-eager mode's environment. 2. Remove unused ctor. * Use Pytorch master branch as all PRs are merged Fix * Refine based on cpplint feedbacks * Revert changes to allow custom CUDA allocator in public APIs * Use torch.testing.assert_close * Use unittest framework * Switch docker repo * Rename .cpp to .cc * Address comments * Add comment * Use same pipeline file for eager and lort pipelines * Address comments * Add yaml comment * Fix cmake files * Address comments * Rename flags, remove printing code, remove dead comment	2022-08-22 09:40:40 -07:00
G. Ramalingam	53090f620e	Fix attribute renaming bug in function inliner (#12445 ) * Fix attribute renaming bug in function inliner Signed-off-by: Ganesan Ramalingam <grama@microsoft.com> * Fix attr name Signed-off-by: Ganesan Ramalingam <grama@microsoft.com>	2022-08-22 08:19:42 -07:00
Vincent Wang	a078c8d99b	Update Supporting Deepspeed Version of ORTModule's FP16_Optimizer (#12668 )	2022-08-22 22:22:53 +08:00
Chen Fu	8456f5fd97	qdq_util bug fix (#12647 ) bugfix: when creating a temp infer file, an existing file maybe accidentally deleted	2022-08-22 09:32:43 -04:00
Scott McKay	2102b8f67c	Avoid duplicate symbol error between ONNX and ORT for ostream operator<< with TensorShapeProto (#12651 ) * Remove ostream operator<< definitions for TensorShapeProto and TensorProto as they clash with ONNX definitions in onnx/defs/printer.h/cc. Currently printer.h (unnecessarily) pulls in a number of other ONNX headers which causes naming clashes with parts of ORT. It is also excluded in a minimal build. Instead convert the onnx::TensorShapeProto to onnxruntime::TensorShape so we use the existing ostream operator<< for TensorShape. Make GetTensorShapeFromTensorProto consistent with GetTensorShapeFromTensorShapeProto so both return a TensorShape (as the name implies).	2022-08-22 17:20:52 +10:00
Yulong Wang	f40e90c33f	[js/web] fix incorrect shader for 'Resize' (#12588 )	2022-08-21 21:47:28 -07:00
Yulong Wang	bfdd191eec	[wasm] use same export name for SIMD/NOSIMD build (#12545 )	2022-08-19 18:17:50 -07:00
Dwayne Robinson	aa85092b51	DML EP squeeze all axes when empty (#12649 ) DML EP squeeze empty axes	2022-08-19 08:56:03 -07:00
Changming Sun	b270334e1e	Update numpy version from 1.21.0 to 1.21.6 to avoid building it from source (#12644 )	2022-08-18 22:11:48 -07:00
Chen Fu	56dd0176a1	QDQ debugger - Adding Error Calculator (#12632 ) QDQ debugger - Adding Error Calculator	2022-08-18 09:30:43 -07:00
Cheng	81b128b5e9	Qlinearsoftmax take FLOAT lookup-table (#12574 ) * [loopuptable] float-type * typed y-scale * round to nearest even	2022-08-18 09:54:39 +08:00
Erick Muñoz	82b724fa5e	[oneDNN] Improve DequantizeLinear operator performance. (#12611 ) * Detect when ZeroPoint = 0 and avoid sub op. * Added tests to verify constant initializer behaviour.	2022-08-17 12:31:10 -07:00

7262 commits