onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-21 21:52:11 +00:00

Author	SHA1	Message	Date
Yulong Wang	bec18eb3f4	[Node.js binding] support CentOS 7 in CI (#4447 )	2020-07-09 00:59:50 -07:00
Josh Bradley	ca5af9d622	Add modern C++ standards for Ort::Value (#4367 ) * add modern standards to function arguments * code cleanup * fix code formatting * add element access convenience function * change template type name to match rest of code * remove new At() convenience function * add better documentation message	2020-07-09 00:35:41 -07:00
Vincent Wang	7fb194d03d	Update convergence baseline for ci_test. (#4465 ) Co-authored-by: Vincent Wang <weicwang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-07-09 15:29:36 +08:00
Josh Bradley	3effac2990	Experimental C++ API examples (#4358 ) * Add examples * fix build instructions for linux users * fix header include * update documentation	2020-07-08 23:17:50 -07:00
Yufeng Li	5dc7339be6	Add quantization tool to python package (#4458 ) * Add quantization tool to python package	2020-07-08 21:42:53 -07:00
edgchen1	0ca4f7eb30	Update Git submodule cgmanifests. (#4461 )	2020-07-08 19:24:03 -07:00
George Wu	f24d8e4587	fix build break from PR#2850 api change (#4451 )	2020-07-08 17:02:12 -07:00
Tianlei Wu	cb5c4292b8	GPT-2 Attention Fusion without input mask (#4456 ) * Allow input mask to be optional * Add test for model without input mask and past state.	2020-07-08 15:59:57 -07:00
Wei-Sheng Chin	5222b2c6c0	Remove code which is not thread-safe. (#4454 ) Because of async access to the memory logger when using the parallel executor, ORT sometimes crashes.	2020-07-08 14:27:56 -07:00
Tianlei Wu	05757b4c3c	Transformer benchmark: add option to use raw attention mask (#4446 ) * Update benchmark and optimizer to add an option to use raw attention mask * Remove temporary model in optimizer	2020-07-08 12:34:41 -07:00
Tixxx	b156ae4448	Support training_mode flag in eval (#4324 ) * add training_mode feed for evaluation to support opset12	2020-07-08 10:38:54 -07:00
Negin Raoof	71aec2adcb	Custom op export test template (#4383 ) * Adding pytorch custom op export tests to CI * Test clean build * Fix export for intended failure * update export script * Build onnxruntime	2020-07-08 10:14:56 -07:00
Du Li	063156d98d	IOBinding docs (#4432 ) * Adding iobinding python docs. * Adding iobinding python docs. * Addressing PR comments.	2020-07-08 03:48:22 -07:00
Hariharan Seshadri	6d6b6b54a5	Support binding a graph output to a specific device via the Python binding (#4439 )	2020-07-07 21:09:37 -07:00
Tracy Sharpe	aa06d308a6	Build new AVX file with /ARCH:AVX (#4442 ) Build new file with /ARCH:AVX on Windows to ensure correct vzeroupper behavior.	2020-07-07 12:00:12 -07:00
Tiago Koji Castro Shibata	e62686c36e	Remove use of RTTI in CUDA provider (#4444 )	2020-07-07 11:38:09 -07:00
Sheil Kumar	fdb4a3a2e8	Add cppwinrt and cswinrt tests in windowsai nuget pipeline (#4381 ) * build e2e cppwinrt tests * add use nuget task * make all referenced to package version prop/target-ified * remove dupe props/targets reference * work around project.assets.json error by deleting it * powershell test invocation * switch to batch script * print debug info * update x86->x64 * stdio.h * pushd/popd * add csharp tests * package.config -> packages.config * typo * x86 -> anycpu * debug is default * add test path * update csproj as well * debug * really replace all package versions * debug output * really use [PackageVersion] * sleep instead of converting async operation to task and waiting * dont close software bitmap * switch to powershell script * remove binding check * continue on failure * continue on error action * continueOnError and errorActionPreference * tabbing Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2020-07-07 09:36:42 -07:00
Yufeng Li	612f52c975	add bias for DynamicQuantizeMatmul (#4440 )	2020-07-06 22:31:29 -07:00
Pranav Sharma	1f1384f8a9	Update dependency introduced by fuzzing change. (#4438 )	2020-07-06 21:56:40 -07:00
Tianlei Wu	eabf6dc9ee	Add Fusion for GPT Attention with both past state and attention mask (#4437 ) Add Fusion for GPT Attention with past state and attention mask	2020-07-06 19:37:37 -07:00
gwang-msft	7baf374939	Change the input to NNAPI EP ModelBuilder from ModelProto to GraphViewer (#4389 ) * init version to use graph instead of model_proto for IsOpSupported * move add to modelbuilder to use graph node * move the rest of model_builder to use graph instead of modelproto * remove redundant code * Clear some redundant code * merge master and some minor style changes * move check if an initializer is external to individual op instead of the whole graph * Addressed comments * Change the GetType and GetShape to log warning info inside to simplify the caller, remove some redundant onnxruntime namespace * add squeeze op support, some more code style clean up * fix a bug where duplicate output can be added to a subgraph, some other minor logging changes	2020-07-06 18:44:04 -07:00
EronsJ	632b2896f3	Onnxruntime fuzzing (#4341 ) * Add protobuf mutator library as a git submodule * Added files and instructions to build the protobuf mutator library in CMake * Added fuzzing flag to build system and added fuzzing dependency library. To run fuzzing test use the flags --fuzz_testing --build_shared_lib --use_full_protobuf --cmake_generator 'Visual Studio 16 2019' * Added src files and build instructions for the main fuzzing engine * Removed Random number generation test from inside the engine * Added license header to files * Removed all pep8 violations introduced by this change and other E501 violations	2020-07-06 16:34:34 -07:00
Cecilia Liu	ec35a1b514	Remove unused initializer in graph after embed fusion (#4436 )	2020-07-06 16:04:02 -07:00
Tracy Sharpe	3ef449816c	MLAS: support prepacking APIs for quantized GEMM (#4433 ) Add support for prepacking matrix B for use in the quantized GEMMs.	2020-07-06 15:20:10 -07:00
Ashwini Khade	dd73e8c016	add function initialization back to graph resolve (#4434 )	2020-07-06 15:17:27 -07:00
liqunfu	0fdb1e9f60	Liqun/roberta (#4408 ) add GLUE Roberta example, fix unused initializer issue at backend. Bert GLUE expected out updated due to graph changes between June29 to July1st	2020-07-06 09:19:30 -07:00
Christian Goll	3588484336	use system libnsync (#4377 ) * use system libnsync	2020-07-06 07:53:22 -07:00
KeDengMS	77cf51b13c	Fix symbolic_shape_infer for Resize with roi (#4426 ) Should only apply roi when coordinate_transformation_mode == tf_crop_and_resize	2020-07-05 23:37:36 -07:00
jornt-xilinx	0d4a65eede	Fix Vitis-AI EP for memory info into IAllocator move (#4404 )	2020-07-05 09:00:26 +10:00
pengwa	8bcdefc9c1	Optimize GatherND (#4097 ) * Optimize GatherND * Refine the code, Fix few comments	2020-07-03 19:42:32 +08:00
Weixing Zhang	bd11ab6816	Optimize LayernormGrad (#4156 ) * Draft for LayerNorm Optimization * Modify LayernormGrad kernel based on new backward graph. * keep two LayernormGrad implementations. One is implemented based on input X, mean. The other is based on output Y, scale, bias. The first one is enabled by default. The second one can be enabled by --use_invertible_layernorm_grad * expose use_invertible_layernorm_grad to frontend. * add fp16 tests. Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2020-07-02 22:09:30 -07:00
Weixing Zhang	33e06be4ac	optimize transpose CUDA kernel (#4233 ) * optimize transpose * optimize for the case when the tensor is 3D and the permutation is done in last two dimension. BERT-L throughput is improved ~1.4% from transpose optimization * fix UT MegatronSelfAttentionPartitionCorrectnessTest * polish code. * add test and change tile size to 16x16 for better perf. * fix UT * fix test of mask_rcnn * address code review comments. Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2020-07-02 22:05:32 -07:00
edgchen1	dba22b17b4	Update BiasGeluGradDxKernel and tests. (#4400 ) For BiasGeluGradDxKernel: - Implement optimization to first load from global memory into registers as suggested by Weixing. - Support larger bias sizes which were previously limited by the number of threads per block. - Address flaky unit test by increasing the error tolerance to the default value.	2020-07-02 18:55:44 -07:00
Tracy Sharpe	93d4964727	Use single OpKernel for u8u8 and u8s8 types (#4414 ) Combine kernels for u8u8 and u8s8 types.	2020-07-02 18:23:58 -07:00
Pranav Sharma	4df8a1e240	Use the file size while reading onnx models. Ensure models are loaded using APIs in model.h for consistency. (#4399 ) * Use the file size while reading onnx models. Ensure models are loaded using APIs in model.h for consistency. * Refactor existing GetFileLength in posix.cc and address PR comments. * Fix linux build - signed/unsigned conversion	2020-07-02 17:30:53 -07:00
Scott McKay	d22f6fddf7	Add ability to specify just the device when using IOBinding for an output (#4386 ) * Add ability to specify just the device when using IOBinding for an output. This enables keeping an output on a different device GPU when it has a dynamic size that is not known ahead of graph execution.	2020-07-03 09:26:47 +10:00
Vincent Wang	28e4c0edf5	Keep loss_scale and Whole Loss Subgraph in FP32 during Mixed Precision Training (#4268 ) * Keep loss subgraph as FP32 when mixed-p training. * Fix case where there is no white-list loss op. * Get nodes from loss_scale instead of whitelist. * rename const variables. Co-authored-by: Vincent Wang <weicwang@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-07-03 06:54:56 +08:00
suffiank	7a05b3ca87	Increase python packaging pipeline timeout (#4412 ) * increase python packaging pipeline from 90 to 110 min * change timeout to Linux GPU and do 120 min to match Win GPU	2020-07-02 15:38:39 -07:00
Yufeng Li	67a7d93b49	Fuse MatMulInteger and scale followed (#4350 ) * Fuse MatMulInteger and scale followed * Add bias	2020-07-02 13:08:21 -07:00
Tiago Koji Castro Shibata	10c25416bb	Remove use of RTTI in CUDA provider (#4410 )	2020-07-02 12:44:17 -07:00
Hariharan Seshadri	eabc1616e6	Rename variable in InferenceSession class so as to not clash with an existing var (#4391 ) * Rename variable in InferenceSession class so as to not clash with an existing var * Fix build break	2020-07-02 12:27:14 -07:00
suffiank	f6bf66c8cf	Adjustments to MPI and NCCL library discovery on build (#4407 ) * cmake edits for mpi_home and nccl_home * cmake syntax error on else	2020-07-02 12:03:42 -07:00
dependabot[bot]	f4e0070c2e	Bump mysql-connector-java from 8.0.15 to 8.0.16 in /tools/perf_util (#4401 ) Bumps [mysql-connector-java](https://github.com/mysql/mysql-connector-j) from 8.0.15 to 8.0.16. - [Release notes](https://github.com/mysql/mysql-connector-j/releases) - [Changelog](https://github.com/mysql/mysql-connector-j/blob/release/8.0/CHANGES) - [Commits](https://github.com/mysql/mysql-connector-j/compare/8.0.15...8.0.16) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2020-07-02 11:22:45 -07:00
gwang-msft	0bef9d5114	Fix the broken Android NNAPI CI (#4403 ) * Change NNAPI CI to run on new NNAPI EP * update android ci to mac 10.15 and remove in install cmake * update the android ci to target android api level 29 * remove unnecessary ndk install git submodule call	2020-07-02 10:22:18 -07:00
Ashwini Khade	ef602835b0	update getfunctionbody (#4396 )	2020-07-02 09:00:37 -07:00
Changming Sun	3bb6a865cc	Revert "remove openmp and scipy from build pipelines (#4305 )"	2020-07-02 00:30:02 -07:00
S. Manohar Karlapalem	4c0236d6c1	Update MCR container instructions with dynamic device selection info (#4371 )	2020-07-01 22:16:55 -07:00
Tracy Sharpe	5c23b17196	MLAS: more prepacking kernel changes (#4397 ) Kernel changes to support StrideK>128	2020-07-01 22:09:42 -07:00
Sherlock	2d54c89d77	Update filename and Cleanup unused cudnn kernels (#4387 ) * Update filename and Cleanup unused cudnn kernels * Cleanup unnecessary dependency	2020-07-01 17:19:49 -07:00
Yang Chen	010445fc52	handle Floor and SplitToSequence (#4384 ) * handle Floor and SplitToSequence added support to Floor and SplitToSequence ops * Address CR use sympy.floor for computation on Floor	2020-07-01 16:09:43 -07:00

2841 commits