onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-01 03:45:06 +00:00

Author	SHA1	Message	Date
David Medine	f723ff2285	fixed type to experimental session constructor (#6950 ) * fixed type to experimental session constructor Co-authored-by: David Medine <david.medine@brainproducts.com>	2021-03-10 10:18:27 -08:00
Tianlei Wu	4884eee642	Attention fusion detect num_heads and hidden_size automatically (#6920 )	2021-03-10 10:17:00 -08:00
Sergii Dymchenko	ce403eea98	Add *args support for ORTModule inputs (#6883 )	2021-03-10 10:15:23 -08:00
Zhang Lei	acfe7ac4ce	Implement QLinearAveragePool with unit tests. (#6896 ) Implement QLinearAveragePool with unit tests.	2021-03-10 10:02:01 -08:00
Weixing Zhang	1e13e2666e	Support ROCM EP for ORTModule (#6967 ) 1. Disable external allocator for ROCM EP since it is not supported yet. 2. For AMD GPU, the EP name is ROCMExecutionProvider	2021-03-10 10:00:35 -08:00
Tracy Sharpe	a8b897f710	MLAS: quantized GEMM update (#6916 ) Various updates to the int8_t GEMMs: 1) Add ARM64 udot kernel to take advantage of dot product instructions available in newer cores. Some models run 4x faster than the stock implementation we used before. 2) Refactor the x64 kernels to share common code for AVX2(u8u8/u8s8/avxvnni) vs AVX512(u8u8/u8s8/avx512vnni) to reduce binary size. 3) Extend kernels to support per-column zero points for matrix B. This is not currently wired to an operator.	2021-03-10 09:54:43 -08:00
Edward Chen	bc319bd7aa	Fix warning from setting multiple MSVC warning level options. (#6917 ) Fix warning from setting multiple MSVC warning level options. Replace an existing /Wn flag instead of always appending a new one.	2021-03-10 09:27:54 -08:00
Vincent Wang	8468099f93	Use DLPack for Graph Inputs and External Outputs of YieldOp (#6968 )	2021-03-10 09:13:45 -08:00
Vincent Wang	3f579facbc	Relax atol for some ORTModule UTs (#6969 )	2021-03-10 08:59:56 -08:00
Edward Chen	d5ed3e7fba	Enable type reduction in EyeLike, Mod, random.cc CPU kernels. (#6960 ) * Update EyeLike CPU kernel. * Update Mod CPU kernel. * Update Multinomial CPU kernel. * Slight improvement to Pad CPU kernel binary size. * Update RandomNormal[Like], RandomUniform[Like] CPU kernels.	2021-03-10 15:32:56 +10:00
Tianlei Wu	89916fdb05	fix stream sync issue (#6954 )	2021-03-09 20:57:18 -08:00
Wei-Sheng Chin	bdaea1d9ae	Update baseline due to loss scale fix (#6948 )	2021-03-10 09:46:15 +08:00
Raduan Al-Shedivat	743a93faf3	Fix broken link in server usage and remove absolute path from dockerfiles readme (#6926 )	2021-03-09 11:54:21 -08:00
Weixing Zhang	534adbb065	Support ORTModule on ROCm EP (#6945 )	2021-03-09 10:10:57 -08:00
ytaous	3b2847b2d8	Add UT correctness and address comments for previous symbolic shape PR (#6930 ) * address comments * disable assert * testing relaxed tolerance * testing relaxed tolerance * testing relaxed tolerance * per comments * modify UT * remove imports * remove prints Co-authored-by: Ethan Tao <ettao@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-03-09 10:10:18 -08:00
George Nash	ba51774a1f	Add GPU support for DNNL endpoint (#6741 ) * Added code for Relugrad with GPU support. Signed-off-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com> * Add GPU support for DNNL ConvGrad Signed-off-by: George Nash <george.nash@intel.com> * Add GPU support for DNNL MaxPoolGrad Updates to MaxPool for training with GPU Update oneDNN to version 1.8.1 Signed-off-by: George Nash <george.nash@intel.com> * Fixed issues found during code review - error in code comment - using auto when the direct type would have been better - removed ternary operators that were returning bool values Signed-off-by: George Nash <george.nash@intel.com> Co-authored-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com>	2021-03-09 09:40:42 -08:00
Thiago Crepaldi	5303b33f69	Clean ORTModule dev branch (#6944 )	2021-03-09 09:06:23 -08:00
satyajandhyala	48eebed869	Interchange Cast and Transpose operations to facilitate Transpose-MatMul fusion (#6924 ) * Added support to interchange Cast and Transpose operations. * Added ONNX models for the Transpose-Cast-MatMul fusion testcases. * Added python code to generate the ONNX models required for testing Transpose+Cast+Matmul fusion to Cast+FusedMatMul. * Added diagram of the Transpose+MatMul fusion documentation	2021-03-09 08:54:56 -08:00
Vincent Wang	91c6a330c0	Add UseCount for External Outputs (#6894 ) * add usecount for external outputs * ut	2021-03-09 17:06:27 +08:00
Hariharan Seshadri	c8e2e3191b	Support parsing an array of values stored as an attribute in a custom op (#6878 )	2021-03-08 23:49:58 -08:00
Guoyu Wang	e64eff1f13	Enable build with bitcode for iOS (#6905 ) * Enable build with bitcode for iOS * minor format update * Minor format update * Addressed CR comments	2021-03-08 22:56:13 -08:00
Edward Chen	73fe1f2deb	Rename op kernel type control 'supported types' to 'default types'. (#6886 ) Cleaning up some naming in the op kernel type control infrastructure. "Supported types" was a bit semantically overloaded. Renamed it to "default types". They are the types that are supported by default.	2021-03-08 18:33:27 -08:00
baijumeswani	f1ade14e44	Assert that the data is on the same device as ORTModule (#6942 )	2021-03-08 17:03:28 -08:00
Sheil Kumar	67c67408c4	Only set _native folder for Microsoft.AI.MachineLearning package (#6939 ) * only set _native folder for Microsoft.AI.MachineLearning package Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2021-03-08 15:27:11 -08:00
Tracy Sharpe	bc27652188	MLAS: workaround LLVM x86 assembler (#6922 ) Implement an alternate workaround for the LLVM x86 problem described in PR #5088. That change made the x86 assembly files build with the GNU assembler by using -fno-integrated-as	2021-03-08 14:18:49 -08:00
Tianlei Wu	b89f52c277	Add tests of Attention and QAttention for pruned model (#6914 )	2021-03-08 11:56:31 -08:00
Denny Abraham Cheriyan	f2f60eed59	Fix broken Java API link (#6826 )	2021-03-08 11:28:41 -08:00
Edward Chen	15d81fb63a	Enable type reduction for Clip, MaxPool, and Pad CPU kernels. (#6918 )	2021-03-08 08:25:43 -08:00
Edward Chen	b6c4a7ac54	Support required types when excluding typed registrations (#6871 )	2021-03-08 08:22:07 -08:00
Wei-Sheng Chin	de6e66f3d4	Fix loss scaling when running ORTTrainer with BERT under mixed-precision mode (#6932 ) * Fix missed Loss scale * not to dump	2021-03-08 21:12:33 +08:00
Vincent Wang	56c5620fd2	Disable Materializing Grads (#6822 ) * disable materialize grads * gradient builder bugfix * fix ut * fix ut * resolve comments and bugfix * add more assert * disable forward compare for now	2021-03-08 16:56:06 +08:00
Thiago Crepaldi	dfc7c18e31	Introducing TrainingAgent interface to performance training using YieldOp (#6898 )	2021-03-05 17:03:46 -08:00
George Wu	601e04fb27	update Readme (#6903 )	2021-03-05 16:29:04 -08:00
baijumeswani	79f832c682	Separate requirements.txt file for ORTModule pipelines (#6879 ) * Move all ORTModule dependency installations to ortmodule subfolder	2021-03-05 14:12:11 -08:00
ytaous	ac4d615553	Enable priority-based execution order as default to support inputs with symbolic/dynamic shape (#6892 ) * priority-based exec order * disable 1 failing test * fix UT * more comments Co-authored-by: Ethan Tao <ettao@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-03-04 22:36:25 -08:00
Funtowicz Morgan	9126faa35b	Ability to fuse non-square (pruned) attention weights for BERT-like models (#6850 )	2021-03-04 17:08:08 -08:00
RandySheriffH	f986ffcb5f	move pipeline file and change relative path (#6882 ) Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2021-03-04 15:31:42 -08:00
Reuben Zotz-Wilson	107c9672fd	No such file or directory with --use_external_data_form and int8 (#6867 ) Implemented following change to avoid the error when using both --use_external_data_form and --precision int8 with GPT2LMHeadModel, which results in line 161, in save_external_data; open(external_data_file_path, 'ab').close() FileNotFoundError: [Errno 2] No such file or directory: This may also be related to the identified bug #6047.	2021-03-04 15:14:23 -08:00
RandySheriffH	679718b12f	Configure session thread pool spinning preference (#6895 ) * add config allow_spinning * add config allow_spinning * set true as default * split configures for inter and intra ops Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2021-03-04 14:54:58 -08:00
Tianlei Wu	8f1786d5d2	Save output tensors in bert_test_data tool (#6872 )	2021-03-04 13:09:05 -08:00
Tiago Koji Castro Shibata	fa8d1b44b8	Fix app packaging in UWP (#6804 ) * Change msbuild condition for UAP * update .netcore target as well * create nuget packages with _native path * validate path under _native directory for windowsai package * pep8 * add diagnostic error message * pep8 * use baseame * lib\uap10.0 * uap10 * build\\uap10.0 * Manually binplace winmds into appx when PackageReference is used. * always binplace winmd regardless of packagereference since c# should work with packages.config also * resolve all paths to full paths to avoid some reference warnings * move winmds out of lib folder to prevent automatic component registration Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2021-03-04 11:16:25 -08:00
Suffian Khan	7915b6709a	Revert Gather Grad optimization in PR 6381 targeted for Rocm (#6880 ) * revert gather_grad_impl.cu * put stream changes back in * restrict changes to commenting launch of optimized version	2021-03-04 10:21:49 -08:00
Scott McKay	54cdb6af71	Add check that the first 2 Loop subgraph inputs have a shape (could be explicit or inferred) as we need to know the rank the subgraph expects. Other inputs to the subgraph are more opaque so we can just pass them through. (#6891 )	2021-03-04 20:42:40 +10:00
Sherlock	b429edcd45	Merge pull request #6890 from microsoft/bmeswani/merge_master_onto_ortmodule Merge master onto ortmodule dev branch	2021-03-03 23:42:50 -08:00
Baiju Meswani	aa93f2e236	move SetOutputMLValue from op_kernel.h to op_kernel_context.h	2021-03-03 20:39:34 -08:00
Baiju Meswani	d5667554e6	Merge branch 'master' of github.com:microsoft/onnxruntime into bmeswani/merge_master_onto_ortmodule	2021-03-03 20:37:29 -08:00
RandySheriffH	d01006fc22	Move constants from heap to stack to avoid randomness on cudnn function (#6869 ) * move const from heap to stack * add namespace * add base prefix * define local type	2021-03-03 20:18:21 -08:00
Sherlock	749e6a08a6	Add more asserts for ORTModule forward's correctness (#6887 ) * Add more asserts on forward outputs * Found one more failing case Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-03-03 19:57:42 -08:00
baijumeswani	ed1883a97c	Workaround for HTTP Error 403: Forbidden for MNIST dataset (#6885 )	2021-03-03 18:59:48 -08:00
Guoyu Wang	fedb68429c	[NNAPI EP] Add per-tensor u8s8 support for Qlinear[Conv/MatMul] (#6818 ) * NNAPI Add per-tensor u8s8 support * Update some comments * Address CR comments * Address CR comments	2021-03-03 15:44:49 -08:00

11997 commits