onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-06 04:28:32 +00:00

Author	SHA1	Message	Date
Scott McKay	a3d2bc36be	Fix script name in doco (#5530 )	2020-10-20 06:42:53 +10:00
Thien Bui	6ad70d7371	[Doc] ONNX_Runtime_Server_Usage fix proto uri (#5345 ) The predict proto should be `../server/protobuf/prediction_service.proto` instead of `../onnxruntime/server/protobuf/prediction_service.proto`	2020-10-19 13:30:58 -07:00
Olivia Jain	1e4b259d28	Updating EP docs with Onnxruntime API calls (#5503 ) * updating examples with current api calls * Fixing capitalization in api calls, adding RKNPU update * Correcting nuphar and rknpu ep api calls * Include creating session in readme	2020-10-19 12:21:21 -07:00
Derek Murray	0b59004666	Add fallback function implementation for DivGrad (#5518 ) * Add fallback function implementation for DivGrad. * Add shape inference for DivGrad. * Add missing argument. Co-authored-by: Derek Murray <demurra@microsoft.com>	2020-10-19 10:47:47 -07:00
Tracy Sharpe	a355281b99	Add alternate IsSupportedOptypeVersionAndString signature (#5529 ) Add a variant of graph_utils::IsSupportedOptypeVersionAndDomain that takes const char* instead of std::string.	2020-10-18 18:14:06 -07:00
KeDengMS	e1a54c4090	Symbolic shape inference: fix a bug in shape merge (#5519 ) * Symbolic shape inference: fix a bug in shape merge OpType Where: input0: ['mt_src_tokens_batch', 1, 1, 'mt_src_tokens_len'] input1: [] input2: ['mt_prev_output_tokens_batch', 12, 'mt_prev_output_tokens_len', 'floor(mt_src_tokens_batchmt_src_tokens_len/mt_prev_output_tokens_batch)'] 1 output: [None, 12, 'mt_prev_output_tokens_len', None] Undo unintended TRT change	2020-10-16 17:54:57 -07:00
Sergii Dymchenko	eda9fd566e	Update tar-stream and prebuild-install versions (#5479 ) * Update tar-stream and prebuild-install versions Update the versions because of Component Governance alerts. * Update package-lock.json	2020-10-16 12:18:49 -07:00
Scott McKay	ad94a1dd6d	Add opset 13 registrations for Identity, IsNaN, NonZero, GatherND and Pad (#5513 )	2020-10-16 09:39:03 -07:00
Ryan Lai	f207f0bf5e	Add WinML Model testing (#5417 ) * Model test start with float * Clean up code and add environment variable detection * Move into namespace * PR comments * Fix linker errors in latest merge to master and also fix warning * add skipping model test mechanism * Return std::string instead of writing to buffer * Address case where env variable is larger than max_path * use const static string for test reason * Disable x86 tests and don't build if ort memory checker is enabled * Add comment * Add additional failing x86 tests and ifdef for checking for x86 build * PR comments	2020-10-15 19:04:12 -07:00
Guoyu Wang	b991ee4c69	Cleanup NNAPI code (#5505 ) * Cleanup NNAPI code * Check return of GetNCHWInput	2020-10-15 17:40:10 -07:00
Derek Murray	6f65e2ad2c	Mark the dX and dB outputs of ConvGrad as OpSchema::Optional. (#5462 ) * Mark the dB output of ConvGrad as OpSchema::Optional. * Also mark dX as optional Co-authored-by: Derek Murray <demurra@microsoft.com>	2020-10-15 16:54:17 -07:00
Derek Murray	64f6d856e4	Add FlattenGrad and test. (#5461 ) Co-authored-by: Derek Murray <demurra@microsoft.com>	2020-10-15 16:11:57 -07:00
Derek Murray	88f6523baf	Add type inference for BroadcastGradientArgs (#5501 ) * Add type inference for BroadcastGradientArgs This change enables the ONNX shape and type inference to work on a function body containing a BroadcastGradientArgs op. Without this change, the dummy inference function is used, and no types are inferred for the output here: `531e6dd459/onnx/shape_inference/implementation.cc (L467-L469)` * Handle optional outputs.	2020-10-15 16:11:24 -07:00
Scott McKay	7da7e07909	Cleanup some test infrastructure (#5484 ) * Created shared version of InferenceSession wrapper class and update relevant tests to use it. Include domain in the ops counting helper so it's more general and we don't need to duplicate it in the nchwc tests. Update tests to include domain in key being checked. * Fix some training tests * Fix prefixing of contrib op names in test	2020-10-16 06:44:01 +10:00
Sunghoon	645d978589	Sunghcho/denormals (#5391 ) * Add session option and global thread pool option to set denormal as zero. * Revert unnecessary changes. * Add cpuinfo submodule * Add more comments * Remove cpuinfo submodule dependency and check only SSE3 support for ftz and daz inspired by Tensorflow * Preserve API order in C api * Clean up and utilize SSE3 detection logic from existing cpuid_info.h * Keep the same order with header file * Fix build issue with Linux pipeline, which has old g++ compiler * Fix broken build on Linux and remove a duplicated unit test * Remove reformatting at eigen thread pool * Remove flatbuffers which is not intentionally added * Revert "Remove flatbuffers which is not intentionally added" This reverts commit 9f509a9aaaa3c7832d88854c82fd26b234770b7f. * Remove flatbuffers which is not intentionally added * Resolve comments - Put details on APIs - Add a log for ftz/daz initialization - Add clang - Fix typo * Remove unnecessary header include * Resolve comments	2020-10-15 12:47:42 -07:00
Guoyu Wang	915d475353	Android CI update (#5474 ) * Update Android CI * update comments	2020-10-14 16:56:50 -07:00
sfatimar	6d2a30eae3	[OPENVINO-EP] 2021.1 Release (#5431 ) * Cmake changes for 2021.1 * added new ov version 2020.1 for faster rcnn * Added missing defs * equal op modified * changes to incoroporate faster rcnn * backend util.cc * hddl_plugin_config.hpp is depreceated . instead use hddl_config.hpp * changing myriad precision bool to i32 * gather is not enabled for gpu * conv2D and pooltest auto_pad attribute should not be null * negative indices are not valid for scatter op in myriad * non max suppression op only supported in faster rcnn mode * maxpool indices output is not supported * Cleaned redundant code in backends * Added ifdefs for HDDL config * cast output dimensions check topk operator k input it seems only resolved for myriad as it is throwing issues for ask rcnn . need to verify * we are limiting the subgraph size to 3 here * taking care of review comments * Fixed minor bugs * Modified Slice op checks * Added NonZero, Upsample * Removed TopK if it's in the middle of a subgraph * incorporated upsample conditions too * Dockerfile changes for 2021.1 release * dockerfile aptkey update * Minor fixes * ceil condition added again * Fixed few gpu models * Disabled LSTM and yolov3 in ModelTests * python softmax cross entropy tests and negative log likelihood * Update Build.md Updated for openvino 2021.1 * Update OpenVINO-ExecutionProvider.md update openvino execution provider for 2021.1 * Update READMe.md updated new openvino version * Update Dockerfile.openvino added environment variable for DEBIAN Frontend * Fixed myriad models * Fixed gather condition * Fixed mask rcnn model on myriad * Modified Gather condition * set default target of MCR dockerfile to MYRIAD_FP16 * Fixed tinyolov3 on CPU * Update OpenVINO-ExecutionProvider.md update openvino execution provider documentation * Update Dockerfile.openvino Removed environment variable * Update OpenVINO-ExecutionProvider.md update image manipulation networks supported * Update 
onnx_backend_test_series_filters.jsonc removed test_upsample_nearest from cpu test cases * New InternalCI changes for 2021.1 * Full protobuf removed for OpenVINO * Protobuf added * Updated with apt installation for openvino * Revert the testing changes * Reverted testing changes * File permessions are changed to original * Deleted openvino installation and cmake change * Optimized Dockerfile Removed unnecessary cmake installation, numpy * Added missing ifdefs * delete array fix * backend_utils.cc output_shape * Revert "set default target of MCR dockerfile to MYRIAD_FP16" This reverts commit 928d3e2b71e2f589cf51dacd3a133951cf9ca18d. Co-authored-by: suryasidd <surya.siddharth.pemmaraju@intel.com> Co-authored-by: sfatimar <sahar.fatima@intel/com> Co-authored-by: suryasidd <48925384+suryasidd@users.noreply.github.com> Co-authored-by: S. Manohar Karlapalem <manohar.karlapalem@intel.com> Co-authored-by: Aravind <aravindx.gunda@intel.com> Co-authored-by: Aravind Gunda <38353114+gundaarx@users.noreply.github.com>	2020-10-14 15:56:00 -07:00
Chun-Wei Chen	2b6b3a2ee6	Add GetProfilingStartTimeNs() to Python/C# APIs (#5280 ) * add Python API for getProfilingStartTime * debug for using Python API * add in C# api * use uint intead of uint64_t to prevent warning * typo for GetProfilingStartTimeNs * remove const * Update onnxruntime/python/session.py Co-authored-by: Pranav Sharma <emailpranav@gmail.com> * remove unnecessary return * Add Python unit test * Add C# unit test and refactor Python test * use ulong in C# for uint64_t in C++ * remove time.monotonic_ns * syntax: remove public for inner function * correct the API's order * getprofilingstarttime after run * Correct the right order in NativeMethod.cs * update order * nit: remove spaces * Update csharp/src/Microsoft.ML.OnnxRuntime/InferenceSession.cs Co-authored-by: Guoyu Wang <62914304+gwang-msft@users.noreply.github.com> * use the updated function * add comment about the precision * add more comments * add session.py back * fix flake8 * remove session.py * Add comments in C, C#, Python APIs about precision Co-authored-by: Pranav Sharma <emailpranav@gmail.com> Co-authored-by: Guoyu Wang <62914304+gwang-msft@users.noreply.github.com>	2020-10-14 05:32:43 -07:00
Changming Sun	1514509fd7	Update protobuf submodule url (#5477 )	2020-10-14 02:35:38 -07:00
Ashwini Khade	44248d9646	opset13 kernel registration (Transpose, Tile, ScatterND, ScatterElements, Gather, GatherElements, Slice, DepthToSpace, SpaceToDepth) (#5454 ) * register kernels for opset 13 * fix formatting	2020-10-13 22:10:01 -07:00
Tiago Koji Castro Shibata	fabe02ddc2	Don't change global FPU state during round-half-to-even (#5376 ) * Don't change global FPU state * Handle infinity properly	2020-10-13 20:10:33 -07:00
Ye Wang	67315d8ae0	Optimize openai-gpt/albert model and add fusion test (#5466 ) * optimize openai-gpt * add huggingface model fusion test * move albert's attention fusion here * add test for albert fusion	2020-10-13 19:24:14 -07:00
Scott McKay	5544391e79	Fix linking of MLAS unit test lib on platforms where libatomic is required. (#5469 )	2020-10-14 07:25:43 +10:00
Bowen Bao	8e9afe1944	Add long type support for SplitToSequence operator (#5367 )	2020-10-13 12:57:11 -07:00
Hariharan Seshadri	e01d152464	Add OpSet kernel registrations as part of opset 13 support (#5465 )	2020-10-13 10:02:00 -07:00
S. Manohar Karlapalem	6e6147fb75	Use correct protoc tool file name for C# builds (#5429 ) In Linux builds, the protoc tool is simply named 'protoc' (without the .exe extension).	2020-10-13 09:43:03 -07:00
Xiang Zhang	b12824fa7a	add telemetry event for nodejs binding (#5463 )	2020-10-12 22:53:01 -07:00
Guoyu Wang	ce5465d5f3	[NNAPI EP] Add Resize and Clip support (#5427 ) * Add resize and clip support in NNAPI EP * Try to get around tensor rt test failure * Addressed PR comments	2020-10-12 22:29:19 -07:00
KeDengMS	c444b9d76a	Add CUDA option to run copy in default stream (#5445 ) * Add CUDA option to run copy in default stream This change fixes #4829. Thanks @maherzog for providing the repro! The bug is caused by memory reuse in BFC arena, where copy and compute stream in CUDA has a racing condition. BFC arena is an arena allocator on top of cudaMalloc/Free to reduce the cost in syncing CPU and GPU when alloc/free. It means when CPU alloc/free the memory, GPU might not finished previous work on the memory, so that CPU and GPU could run asynchronously. This is OK if there's only one stream, where the execution order in CPU and GPU are consistent. For example, if we have two kernels A and B, CPU runs allocA->computeA->freeA->allocB->computeB->freeB, A and B could shares the same memory since computeA and computeB will not have racing as long as they run in the same GPU compute stream. However, if CPU runs allocA->CopyA->freeA->allocB->computeB->freeB, the order of execution in GPU could have copyA happen after computeB, if copy and compute happens in different GPU streams. This change makes copy to run in default compute stream, while adding an option to fall back to previous behavior if there's perf hit. This is a short term fix before BFC arena could support multiple streams. User may use following options to revert to previous behavior: C API: struct OrtCUDAProviderOptions cudaProviderOpt; cudaProviderOpt.do_copy_in_default_stream = false; C++ API: CUDAExecutionProviderInfo cudaEPInfo; cudaEPInfo.do_copy_in_default_stream = false; C# API: pending... Python: import onnxruntime onnxruntime.capi._pybind_state.set_do_copy_in_default_stream(False) * Confirmed the test failes in CI when doing copy in separate stream Revert the test to get CI pass now * Fix Windows test * Address CR	2020-10-12 22:12:05 -07:00
Wenbing Li	80d36eab86	enable the onnxruntime shared library test on iOS (#5443 ) * enable the onnxruntime shared library test on iOS * fixing as commented. * add return status check.	2020-10-12 21:40:57 -07:00
RandySheriffH	913116e64e	bump ops version to opset13 (#5456 )	2020-10-12 20:47:09 -07:00
Sergii Dymchenko	05b1c02d32	Fix commands in README.md. (#5459 )	2020-10-12 17:53:09 -07:00
Sherlock	60dbd8a1e5	Update maximum batch size for UT; Include recompute modes (#5444 ) * Update MaxBatchSize and include recompute mode * Minor fix for frontend test Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-10-12 14:50:43 -07:00
Derek Murray	dbc626dcbe	Add ExpGrad registration and test. (#5438 ) Description: Add missing gradient registration for the `Exp` op. Motivation and Context * Adding support for training a model that uses the `Exp` op. Co-authored-by: Derek Murray <demurra@microsoft.com>	2020-10-12 13:56:08 -07:00
Ashwini Khade	2a018cc235	revert contrib op version bump and deprecation of TransposeMatMul (#5424 ) * revert contrib op version bump and deprecation of TransposeMatMul * update documentation	2020-10-12 13:02:15 -07:00
jingyanwangms	20c47ce91c	Simplified layer norm changes (#5028 ) * t5 layer norm changes * add t5 layer norm kernel * use template for t5 layer norm * template definition changes * no build error * add CPU cuda kernel * first unit test * other forward unit tests * add T5LayerNormGrad * Add c++ transform and test for T5 LN * fix and some debug prints * fix cuda error * rename from t5 to simplified * PR comments * revert change on invertible LM code path * remove duplicate forward computation * add GradientCheckerTest.SimplifiedLayerNormGrad * change back macro * Fix SimplifiedLayerNorm Gradient * merge with Sherlockss changes * changed cuda kernel * reapply cpu kernel changes Co-authored-by: Jingyan Wang <jingywa@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: aishwarya bhandare <aibhanda@microsoft.com> Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-10-12 11:22:12 -07:00
edgchen1	ed60e0fe39	Fix BUILD.md environment variable name typo. (#5402 )	2020-10-12 11:17:09 -07:00
Pranav Sharma	5e48c0fd6c	Register opset13 ops: Dropout, Flatten, LRN, MeanVarianceNormalization, ArgMax, ArgMin, Reshape, Shape, Concat. (#5451 )	2020-10-12 10:09:38 -07:00
stevenlix	186f0668b0	update onnx-tensorrt submodule (#5442 )	2020-10-09 21:49:40 -07:00
Hariharan Seshadri	b9f90e297e	Support sharing of initializers between session via the Python API (#5407 )	2020-10-09 20:26:28 -07:00
Ryan Hill	6132e1f6ae	Shared providers - fix logging plus cleanup (#5406 ) * Fix logging, cleanup, and implement the remainder of the not implemented functions from the shared provider interface.	2020-10-09 17:31:03 -07:00
Wei-Sheng Chin	6cba42e942	Avoid inserting other CUDA calls in-between NCCL Send's and Recv's (#5430 ) * Avoid inserting other CUDA calls in-between NCCL Send's and Recv's * Add a comment * Place CUDA EP on the right device * Fix a warning * Address a comment	2020-10-09 15:34:46 -07:00
liqunfu	dbe7e6623b	only use/import pytest if needed (by enable_training) (#5437 ) Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-10-09 12:42:19 -07:00
Dmitri Smirnov	9642f1448e	Add OpSet 13 Registrations (#5426 ) Register Sigmoid for OpSet13 Register OpSet 13 for Sum, Min, Max, Mean. Add Erf OpSet 13 registration. Register Clip for OpSet 13 Add Gemm/MatMul Opset 13 registrations Signed-off-by: Dmitri Smirnov <dmitrism@microsoft.com>	2020-10-09 12:39:22 -07:00
Sergii Dymchenko	3a9a1a4ef1	Fix registration for GatherGrad (#5382 ) * Fix registration for GatherGrad to fix GatherGradOpTest.GatherGrad_axis0_indices2d_half. * Fix GatherGrad registration for CUDA also.	2020-10-09 11:57:50 -07:00
liqunfu	1cceefc7d4	use run_orttraining_test_orttrainer_frontend_separately to work aroun… (#5408 ) * use run_orttraining_test_orttrainer_frontend_separately to work around a sporadic segfault. Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-10-09 09:16:10 -07:00
Scott McKay	a92ccbe1bc	Various armv7 related fixes (#5394 ) * - Link with libatomic if needed - Install pip differently so it doesn't clash with the system pip which may involve a wrapper script - Remove ability to specify offset when Tensor allocates the data. The data prior to offset isn't accessible by anything. - Fix use of offset in TensorOpTest to work on armv7 where it must be aligned to the type it points to. - Fix ActivationOpNoInfTest.Softsign to allow for armv7 behavior - Fix ReductionOpTest.ReduceMean_keepdims to allow for armv7 floating point inaccuracy Address PR comments	2020-10-09 22:34:32 +10:00
Yufeng Li	b99eaa99cd	Prepacking MatMulInteger (#5403 ) * prepack matmulinteger Prepacking constant matrix B for MatMulInteger to get better performance.	2020-10-09 02:37:19 -07:00
Xavier Dupré	621fdb44e5	Fixes #4688 , remove CPUAllocator in TreeEnsemble (#5375 )	2020-10-09 11:26:07 +02:00
Keizo Fujiwara	d4507e9331	Use relative path for HEADER_SEARCH_PATHS (#5412 ) Currently HEADER_SEARCH_PATHS refers a personal directory.	2020-10-08 23:06:11 -07:00

3570 commits