onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-03 03:58:54 +00:00

Author	SHA1	Message	Date
Changming Sun	5bd192c439	Update ContribOperators.md (#7246 )	2021-04-05 17:11:33 -07:00
Pranav Prakash	3b16afc0db	Make dW optional for convgrad (#7083 )	2021-04-05 17:05:20 -07:00
Guoyu Wang	c5973fbbac	Update the build script for Android AAR package (#7229 ) * Update the build script for Android AAR package * Address CR comments	2021-04-05 16:37:22 -07:00
Suffian Khan	9f14af9809	Add BERT-L perf regression test on MI100 and re-enable batch size test (#7240 ) * restore bs test and add perf test * update perf number and fix path to results	2021-04-05 15:51:52 -07:00
Ryan Lai	10102c09b6	Add better model test error messaging (#7239 )	2021-04-05 14:59:19 -07:00
Ashwini Khade	e7c5dcd572	Fix Zip-Nuget-Java Packaging Pipeline (#7208 ) * Ignore test failures due to opset support * skip identity sequence test * plus fixes	2021-04-05 10:58:13 -07:00
Chun-Wei Chen	3ee9b0ec4d	Add detailed assertion error message (#7232 )	2021-04-05 10:05:40 -07:00
Marek Šuppa	008065aab1	Update README.md (#7043 ) * Fix the precision type (switch from nonexistent `int32` to `fp32`).	2021-04-05 10:03:14 -07:00
ashbhandare	2b8513539e	Div mul fusion (#7183 ) * Div mul fusion * Change to rewrite rule * Add to inference transformers	2021-04-05 09:35:30 -07:00
Weixing Zhang	74ee24cf7f	rename cuda_mem_limit and hip_mem_limit to gpu_mem_limit for both CUDA EP and ROCm EP (#7226 ) With this change, differentiating CUDA EP and ROCm EP is not needed in training script when mem_limit option needs to be set. Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2021-04-05 09:04:04 -07:00
baijumeswani	68b12a6179	Support for saving and loading pytorch compatible state dictionaries (#7220 ) * Override methods on torch.nn.Module to get direct access to the methods on the original module.	2021-04-05 03:40:41 -07:00
Yufeng Li	8d737f9770	handle optional input in quant topo sort (#7223 )	2021-04-02 20:42:48 -07:00
Weixing Zhang	59b57d8322	HSA_NO_SCRATCH_RECLAIM and RCCL_ALLTOALL_KERNEL_DISABLE are not needed for ROCm 4.1 (#7224 ) Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2021-04-02 18:19:11 -07:00
Ahmad Zakaria	ba5f056b09	move trt_profile to TensorrtFuncState and reuse it (#7195 ) use unordered_set instead of unordered_map to keep track of dynamic shape tensors with shape updates fix: insert input_name in the set of input_names move trt_profile to TensorrtFuncState and reuse it	2021-04-02 17:09:03 -07:00
Weixing Zhang	ef88dc912c	enable more unit tests for ROCM EP (#7222 )	2021-04-02 15:57:08 -07:00
Guoyu Wang	afbbeaa30a	[NNAPI/CoreML EP] Add Onnx opset 14 support (#7211 ) * Add opset 14 support for nnapi/coreml ep * Address CR comments	2021-04-02 13:18:47 -07:00
Sherlock	a98c2ebb8c	Enable saving optimized models in OrtModule (#7214 ) * Enable saving optimized models in OrtModule Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-04-02 12:37:05 -07:00
RandySheriffH	ebde320950	Add cupti path for python gpu packaging pipeline (#7200 ) * add cupti dll path for py3.8 * correct path * add prints * replace path join * add all path * restore pipeline * format * expand path only for python 38&39 * add all cupti path Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2021-04-02 12:12:46 -07:00
Weixing Zhang	2d352056cf	Support SkipLayerNorm for ROCm EP (#7210 ) Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2021-04-02 09:03:30 -07:00
Weixing Zhang	a3f17c8b0d	update lamb and GatherGrad kernel for ROCm EP (#7184 ) With ROCm4.1, the CUDA implementation of Lamb and GatherGrad can be utilized for ROCm EP.	2021-04-02 09:02:49 -07:00
Weixing Zhang	17f91ff410	remove un-needed header file. (#7193 ) Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2021-04-01 21:05:58 -07:00
Ryan Hill	5a6d477625	Make IDataTransfer be directly shared with shared providers (#7215 )	2021-04-01 20:39:16 -07:00
Edward Chen	0ebeaf529d	Check kernel def hashes (#7120 ) Add unit test for verifying kernel def hashes. Add way to add new types to kernel definition without changing hash.	2021-04-01 17:42:58 -07:00
Dwayne Robinson	48fcddd2c5	Merged PR 5873494: Resize support nearest_mode floor in DML EP Resize support nearest_mode floor in DML EP. Related work items: #32221069	2021-04-02 00:28:27 +00:00
ashbhandare	15c67ddbf0	Make output 1 of ConcatTraining Optional and place on CPU (#7199 ) * Optional input 1 on CPU ConcatTraining * Rename output_1	2021-04-01 16:05:17 -07:00
Jesse Benson	4543459984	MIOpen supports MIOPEN_REDUCE_TENSOR_AVG now.	2021-04-01 16:00:34 -07:00
Yufeng Li	34a8b22186	disable prepacking in training (#7201 ) * disable prepacking in training	2021-04-01 14:03:47 -07:00
sfatimar	52bcef4d4f	Openvino ep 2021.3 (#7180 ) * Integrate openvino-ep-2021.3 * operators type * changed the myriad as it is case sensitive * logging information for openvino-ep-2021.3 * Unit test fix * Resize operator added for myriad * Fixed python tests for CPU and GPU * data commit for loop tile and gatherelements failure * adding checks for Where * fixing gatherelements and loop tests * disabling instance normalization test for now as there seems to be a myriad bug, putting loop in ops supported only because all the tests fail * gather elements op test taking care of warning message * condition needs to be an intializers * Disabled python test for Myriad * Disable compilation warning for MSVC windows compiler * softmax_test, threedimaxis0 and 1 test give accuracy mismatch tensoroptest disables test gives accuracy mismatch gather test gives accuracy mismatch * Updated with ov version 2021.3 * Updated with ov version 2021.3 * Updated README * Disabling python tests for cpu * Disabling python tests with accuracy mismatch on cpu * Added fix for Linux CI Pipeline failure -> Disabled tests that were throwing segfault Co-authored-by: sfatimar <sahar.fatima@intel/com> Co-authored-by: MaajidKhan <n.maajidkhan@gmail.com> Co-authored-by: Aravind <aravindx.gunda@intel.com>	2021-04-01 11:28:54 -07:00
baijumeswani	249a2c14ef	Pin version of pytorch to 1.8.1 for ORTModule CI pipeline (#7167 ) * Pin version of pytorch to 1.8.1 for ORTModule CI pipeline * Use pytorch-lightning stable version 1.2.5 * Revert to cuda 10.1	2021-04-01 09:37:47 -07:00
George Wu	fc6ac5bfac	dnnl fixes (#7202 )	2021-04-01 07:34:18 -07:00
Scott McKay	329fd03bb4	Add int32_t as required type to some operators (#7192 ) * Updates to some operators to always support int32 and int64 based on testing of Android package build config with a minimal build. If an operator can be used for shape manipulation (int64) it is frequently used for indices manipulation (int32), so we enable both types for that set of ops. - e.g. BERT models take indices as input - Scatter/Gather ops utilize indices Misc. fix to python bindings to exclude call that fails in a minimal build.	2021-04-01 19:32:34 +10:00
Tiago Koji Castro Shibata	9a8991e9b6	Merged PR 5866671: Move onnxruntime arm64x forwarder Cherry pick "Move onnxruntime arm64x forwarder"	2021-04-01 02:17:20 +00:00
Edward Chen	04679e31ab	Specify CUDA compute capability 7.5 in Linux GPU build (#7203 ) Recently a build agent pool was changed to use T4 GPUs (CUDA compute capability 7.5). Updating some CUDA build options to accommodate that.	2021-03-31 18:51:44 -07:00
Hariharan Seshadri	0e0dd50e39	Support int32 type for TopK CPU op (#7089 )	2021-03-31 18:08:21 -07:00
Jeff Bloomfield	057de97d92	Merged PR 5866812: Decompose unsupported QLinearSigmoid operation in DML EP Related work items: #32220862	2021-04-01 00:24:38 +00:00
Jeff Bloomfield	56d2c4baa2	Merged PR 5861108: Allow nodes in DML graph partitions with empty shapes on constant CPU inputs Resize is spec'd to ignore the "roi" tensor in certain modes. For some reason, converters are specifying an arbitrary value for this tensor, even though it's optional. This makes the graph partitioner skip a check for empty shape dimensions for tensors such as this, which the DML kernel registers as consuming as CPU inputs. Otherwise, the node is not included in DML graph partitions, because the DML graph doesn't handle empty dimensions. Related work items: #32221164	2021-03-31 19:06:08 +00:00
Xavier Dupré	b370ddbf5e	Removes unnecessary transpose in operator Einsum (#7141 ) * remove one unnecessary transpose * add more unit test	2021-03-31 09:59:08 +02:00
Guoyu Wang	d500c5952b	Add Android AAR packaging script for ORT-Mobile (#7138 ) * Add Android aar packaging script for ORT-Mobile * Address CR comments	2021-03-30 18:42:18 -07:00
Yulong Wang	0fdef1bf47	[Node.js binding] upgrade y18n to v4.0.1 (#7185 )	2021-03-30 16:09:04 -07:00
Negin Raoof	45cb0cae8c	Adding TorchEmbedding contrib op (#7136 ) * Adding TorchEmbedding contrib op * Update contrib_defs.cc * Shape fix * Update shape_inference_test_helper.h * Fix typo * Fix test * Fix for test code * Merge * Fix CI * Fix for CI * Fix CI no-contrib	2021-03-30 14:33:25 -07:00
liqunfu	e545604499	. (#7165 )	2021-03-30 13:58:30 -07:00
RandySheriffH	d880578537	Exclude cpuid.h from Mac non x86 arch (#7166 ) * add ifdef to exclude inclusion from non x86 arch * exclude calling of __cpuid_count Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2021-03-30 11:50:42 -07:00
Edward Chen	0ccfe6c86a	Enable type reduction for Scatter/ScatterElements CPU kernels (#7171 ) Enable type reduction for Scatter/ScatterElements CPU kernels. Some refactoring to reduce binary size. Add MLTypeCallDispatcher methods. Minor cleanup for Pad CPU kernel.	2021-03-30 11:02:24 -07:00
Tang, Cheng	07201bac7a	expose session option and provider options (#7112 ) * expose session option and provider options * merge provider_names and provider_options * integrate into orttrainer options * fix doc string * fix a typo * Update orttraining/orttraining/python/training/orttrainer.py Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com> * Update orttraining/orttraining/python/training/orttrainer.py Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com> * Update orttraining/orttraining/python/training/orttrainer_options.py Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com> * fix the usage of provider_options * Update orttraining/orttraining/python/training/orttrainer.py Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com> * Update orttraining/orttraining/python/training/orttrainer.py Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com> * update expected result in tests * fix default provider options * minor update to trigger rebuild * minor update to trigger rebuild Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>	2021-03-30 09:49:45 -07:00
Yufeng Li	c4ebc60870	sort quantized nodes in topo logical order (#7172 )	2021-03-30 09:01:15 -07:00
Yufeng Li	4f30341253	Check the count of DequantizeLinear for matmul (#7174 )	2021-03-30 09:00:48 -07:00
Tracy Sharpe	a01334ba56	MLAS: activate udot kernel on Windows ARM64 (#7169 )	2021-03-29 17:56:48 -07:00
Changming Sun	bbcf419ac6	Move the Windows GPU machine pool of Onnxruntime packaging pipelines to a new one (#7161 )	2021-03-29 17:32:03 -07:00
Ben Niu	d1acdd4f4b	Support building ARM64EC onnxruntime.dll (#6999 )	2021-03-29 15:35:30 -07:00
Yufeng Li	77c19436c0	add a notebook for mobilenetv2 quantization (#7164 ) * add a notebook for quant mobilenetv2	2021-03-29 13:24:14 -07:00

11997 commits