This PR adds infrastructure to automatically cache Docker images used in CI builds in a container registry.
Currently, build images are pulled from a container registry for some builds and rebuilt from scratch every time for others. The container registry requires maintenance to keep the images up to date, and rebuilding images on every run wastes build agent resources.
With this change, a build image is first looked up in a cache container registry: if present it is pulled; otherwise it is built and pushed. The uniqueness of a build image is determined by a hash digest of the Dockerfile, the docker build context directory, and certain "docker build" options. This digest is part of the image tag in the cache container repository.
The cache container registry will need to be cleaned up periodically. This is not automated yet.
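The digest scheme described above can be sketched as follows. This is a minimal illustration, not the actual CI implementation: the function name, the choice of SHA-256, and the tag format are all assumptions.

```python
import hashlib
import os
import tempfile

def compute_build_digest(dockerfile_path, context_dir, build_options):
    """Hash the Dockerfile, every file in the build context (walked in a
    deterministic order), and the relevant "docker build" options."""
    h = hashlib.sha256()
    with open(dockerfile_path, "rb") as f:
        h.update(f.read())
    for root, dirs, files in os.walk(context_dir):
        dirs.sort()  # make the walk order deterministic
        for name in sorted(files):
            path = os.path.join(root, name)
            h.update(os.path.relpath(path, context_dir).encode())
            with open(path, "rb") as f:
                h.update(f.read())
    for opt in sorted(build_options):
        h.update(opt.encode())
    return h.hexdigest()

# Demo on a throwaway context; the "build-image:" tag prefix is a guess.
ctx = tempfile.mkdtemp()
with open(os.path.join(ctx, "Dockerfile"), "w") as f:
    f.write("FROM ubuntu:20.04\n")
digest = compute_build_digest(
    os.path.join(ctx, "Dockerfile"), ctx, ["--build-arg", "FOO=bar"])
image_tag = "build-image:" + digest[:16]
```

Any change to the Dockerfile, the context files, or the build options produces a different digest, so the cache lookup misses and the image is rebuilt and pushed.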
* Add parallelization for Resize bilinear mode.
* Add parallelization for the Resize op.
* Use TrySimpleParallelFor instead of TryParallelFor.
TryParallelFor has an unaddressed issue with its cost model.
* Address PR comments.
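The ORT threadpool APIs above are C++; as a language-neutral illustration of the partitioning idea, one parallel task per output row of a bilinear resize looks roughly like this. The function, the plain-list tensor layout, and the half-pixel coordinate transform are assumptions for the sketch, not ORT's implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def resize_bilinear(src, out_h, out_w):
    """src: a 2-D image as a list of rows of floats. The outermost loop
    (over output rows) is what a SimpleParallelFor would partition."""
    in_h, in_w = len(src), len(src[0])
    scale_h = in_h / out_h
    scale_w = in_w / out_w

    def compute_row(oy):
        # Half-pixel source coordinate for this output row.
        fy = (oy + 0.5) * scale_h - 0.5
        y0 = min(max(int(fy), 0), in_h - 1)
        y1 = min(y0 + 1, in_h - 1)
        wy = min(max(fy - y0, 0.0), 1.0)
        row = []
        for ox in range(out_w):
            fx = (ox + 0.5) * scale_w - 0.5
            x0 = min(max(int(fx), 0), in_w - 1)
            x1 = min(x0 + 1, in_w - 1)
            wx = min(max(fx - x0, 0.0), 1.0)
            top = src[y0][x0] * (1.0 - wx) + src[y0][x1] * wx
            bot = src[y1][x0] * (1.0 - wx) + src[y1][x1] * wx
            row.append(top * (1.0 - wy) + bot * wy)
        return row

    # Each row is independent, so rows can be computed concurrently.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(compute_row, range(out_h)))
```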
Transitions from the ORT-only DML NuGet (hosted on the onnxruntime_public feed) to the new unified DirectML NuGet (Microsoft.AI.DirectML) on nuget.org. In addition, the Microsoft.AI.MachineLearning (WinML) and Microsoft.ML.OnnxRuntime.DirectML packages now take a dependency on the Microsoft.AI.DirectML package. This means we can remove the extra copy of DML binaries in these packages since they will be installed by the DML package.
* Add validation of operator registrations to the reduction script
- the script has all the logic to process the registrations, and there's a CI that uses it
Fix some operator registrations
* Fix CUDA PRelu registration
* Refactor to split out kernel registration file parsing and use in the exclude ops script and an op registration validation script.
Run op validation in minimal build CI
* Fix PEP8 error and some comments
* Add copy sparse model in minimal CI
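The registration-parsing idea can be sketched like this. The single-line macro form below is a simplified stand-in; the real ORT registration macros span multiple lines and take more arguments.

```python
import re

# Hypothetical, simplified registration line. The real macros in ORT's
# kernel registration files are more elaborate.
REGISTRATION = re.compile(
    r"ONNX_OPERATOR_KERNEL_CLASS_NAME"
    r"\(\s*(\w+)\s*,\s*(\w+)\s*,\s*(\d+)\s*,\s*(\w+)\s*\)"
)

def parse_registrations(text):
    """Extract (provider, domain, opset, op) tuples from source text."""
    ops = []
    for provider, domain, opset, op in REGISTRATION.findall(text):
        ops.append((provider, domain, int(opset), op))
    return ops

source = """
class ONNX_OPERATOR_KERNEL_CLASS_NAME(kCpuExecutionProvider, kOnnxDomain, 13, Identity);
class ONNX_OPERATOR_KERNEL_CLASS_NAME(kCpuExecutionProvider, kOnnxDomain, 13, Squeeze);
"""
registrations = parse_registrations(source)
```

Once the registrations are parsed into structured tuples, both the exclude-ops script and a validation script can consume them, which is the refactoring the bullet describes.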
* Add squeeze 13 support
* fix small typo
* Add ut for squeeze in NNAPI
* Fix some issues in the UT and code
* Modify based on the master change
* Fix build break
* Merged PR 5253310: Fix 0-sized dimension broadcasting
Tensors that contain 0-sized dimensions were being broadcast to higher ranks, which made it impossible to remove them from the graph. 0-sized dimensions represent empty tensors, so an operator that needs to broadcast one shouldn't call into DML at all.
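To see why, consider NumPy-style shape broadcasting in pure Python: a 0-sized dimension broadcasts like any other size, and any result shape containing it has zero elements, so there is no work to dispatch. The helper name is made up for illustration.

```python
import math
from itertools import zip_longest

def broadcast_shape(a, b):
    """NumPy-style broadcasting of two shapes. A 0-sized dim matches
    another 0 or broadcasts against a 1, just like any other size."""
    out = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x == y or y == 1:
            out.append(x)
        elif x == 1:
            out.append(y)
        else:
            raise ValueError(f"incompatible dims {x} and {y}")
    return tuple(reversed(out))

shape = broadcast_shape((0, 3), (2, 1, 3))
num_elements = math.prod(shape)  # the broadcast result is still empty
```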
* Merged PR 5334334: Fix asserts and failure in GraphKernelHelper.cpp
This extends a workaround needed to match node inputs with Tensors to the EP code handling constant input upload.
This was causing issues in a couple of models, including EfficientDet, although that model still fails due to this bug:
https://microsoft.visualstudio.com/OS/_workitems/edit/29970551
Related work items: #29706035
* Merged PR 5344477: Disable GPU timeouts in DML EP command queue creation
GPU timeouts were already disabled in command queues created by WinML, but not in the ones created by the DML EP within the ORT API.
* Merged PR 5380534: BatchNormalization failure in autopilot - fix output size
New validation [here](https://microsoft.visualstudio.com/DefaultCollection/WindowsAI/_git/WindowsAI/pullrequest/5354070?_a=files&path=%2Fdml%2FSharedValidation%2FDmlBatchNormalizationOperatorValidator.h) causes some BatchNorm cases to fail (e.g. OnnxConformanceTestsTaef::BatchNormalization (BatchNormalization_2x2x2)). I'm unsure how long this bug has existed, but based on Nick's investigation the affected cases apparently still worked anyway.
Related work items: #27678610
* Merged PR 5386132: Update 8D BatchNorm
Update 8D BatchNorm
Related work items: #27678610
* Merged PR 5390213: Tile allow 0 in repeats
0 is a valid value in Tile's "repeats" parameter. The CPU kernel handles it fine; so should the DML EP.
Related work items: #29970551
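Per the ONNX Tile spec, output dimension i is input_shape[i] * repeats[i], so a repeat of 0 simply yields an empty tensor rather than an error. A sketch (the helper name is made up):

```python
import math

def tile_output_shape(input_shape, repeats):
    """ONNX Tile shape rule: output dim i = input_shape[i] * repeats[i].
    A repeat of 0 is legal and yields an empty tensor."""
    if len(repeats) != len(input_shape):
        raise ValueError("repeats must have one entry per input dimension")
    if any(r < 0 for r in repeats):
        raise ValueError("repeats must be non-negative")
    return tuple(d * r for d, r in zip(input_shape, repeats))

shape = tile_output_shape((2, 3), (2, 0))  # second dim tiled zero times
```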
Co-authored-by: Justin Stoecker
Co-authored-by: Jeff Bloomfield
Co-authored-by: Patrice Vignola
Co-authored-by: Nick Feeney
* switch to work PC
* back with iterable of buffers
* add raw api tests
* tensorization
* last test
* all tests pass!
* small cleanup
* whitespace
* newline
* whitespace
* refactor common code into DisjointBufferHelpers
* remove unused file
* warning
* skip gpu tests when hardware not available
* Add error condition when createreference is invoked
* add null check to createreference
* uncomment out check
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
* add case for cpu custom op on gpu
* format doc
* restrict GPU custom op on Linux GPU CI only
* separate the .cu file into an independent project
* fix typo
* include cuda_add lib
* move lib def
* add file header
Co-authored-by: RandySheriffH <rashuai@microsoft.com>
* Make NNAPI EP build on non-Android platforms
* minor updates
* Address CR comments
* Fix build issue using Windows, address CR comments
* Fix linux build warnings
* Fix for test failure
* Fix for test failure
* Fix model_tests failure
The kernel declaration of Identity needs to be updated in the ROCm EP, since the ROCm EP shares the Identity implementation with the CUDA EP, where it was changed for opset 13 support.
* ng_supported_ops
* Remove ng_supported_ops
* Revert "Remove ng_supported_ops"
This reverts commit 3c27385b2d88c6e8cf7ac4e8c290a367ad5d0bd8.
* Revert "ng_supported_ops"
This reverts commit 650721ae2913b79739521d58838298e031abdac1.
cmake changes to ensure that debug builds on Windows link to debug builds of OpenVINO
and do not result in a bad-allocation error
Co-authored-by: sfatimar <sahar.fatima@intel.com>
* gradient builder for opset13
* code clean.
* resolve comments
* stop grad for axes input
* add split to stop grad list.
Co-authored-by: Vincent Wang <weicwang@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
This commit adds shape inference support for the following ops:
SoftmaxCrossEntropy
SoftmaxCrossEntropyLossGrad
SoftmaxCrossEntropyGrad
LayerNormalizationGrad
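As a rough illustration of what this shape inference does (heavily simplified; the real implementation operates on ONNX type/shape protos, and the reduction semantics below are an assumption): the gradient with respect to a tensor always has that tensor's shape, and the loss output is a scalar under "mean"/"sum" reduction.

```python
def grad_shape(forward_tensor_shape):
    """A gradient w.r.t. a tensor has that tensor's shape, so *Grad ops
    (e.g. SoftmaxCrossEntropyGrad, LayerNormalizationGrad) can infer
    their output shape by forwarding the corresponding input's shape."""
    return tuple(forward_tensor_shape)

def loss_output_shape(logits_shape, reduction="mean"):
    """Assumed reduction semantics: "mean"/"sum" produce a scalar loss;
    "none" keeps the batch dims (logits shape minus the class dim 1)."""
    if reduction in ("mean", "sum"):
        return ()
    return logits_shape[:1] + logits_shape[2:]

d_logits = grad_shape((32, 10))       # same shape as the logits
loss = loss_output_shape((32, 10))    # scalar under "mean" reduction
```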
* add profile caching to improve engine caching feature
* Add comments
* fix typo
* add decryption for engine caching
* Update tensorrt_execution_provider.cc
* update onnx-tensorrt submodule
* set opt profile to max value of the range
* add hash to engine/profile name
* Add calibration based INT8 quantization
* add an option to enable both FP16 and INT8
* Update tensorrt_execution_provider.cc
* add env variable to specify calibration file name
* clean up code
* Add comments and update TRT document
* enable tensorrt basic test and add EngineCachingTest
* clean up
* update environment variable in the test
* clean up
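The "add hash to engine/profile name" change can be illustrated with a sketch like the following; the filename pattern and the inputs being hashed are assumptions for illustration, not the EP's actual scheme.

```python
import hashlib
import os

def engine_cache_path(cache_dir, model_bytes, fp16, int8):
    """Key the serialized engine on a hash of the model plus the build
    flags, so a changed model or changed precision never reuses a
    stale cached engine."""
    h = hashlib.sha256(model_bytes)
    h.update(f"fp16={fp16};int8={int8}".encode())
    return os.path.join(cache_dir, f"trt_engine_{h.hexdigest()[:16]}.engine")

# Same model, different precision flags -> different cache entries.
path_fp16 = engine_cache_path("/tmp/trt_cache", b"<serialized model>", True, False)
path_int8 = engine_cache_path("/tmp/trt_cache", b"<serialized model>", True, True)
```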
* Introduce PassThrough op to wait for all gradients to be ready before the weight update
* Compute gradient norm for fp32 runs
* Update FE UT expected value
* Respect enable_grad_norm_clip
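Gradient-norm computation and clipping (what enable_grad_norm_clip gates) follow the usual global-norm rule, sketched here over plain Python lists rather than ORT tensors.

```python
import math

def global_grad_norm(grads):
    """L2 norm over all gradient tensors (here, flat lists of floats)."""
    return math.sqrt(sum(g * g for grad in grads for g in grad))

def clip_grads(grads, max_norm):
    """Scale every gradient by max_norm / total_norm when the global
    norm exceeds max_norm; otherwise leave the gradients untouched."""
    total = global_grad_norm(grads)
    if total > max_norm:
        scale = max_norm / total
        grads = [[g * scale for g in grad] for grad in grads]
    return grads

# Global norm of [[3.0], [4.0]] is 5.0, so clipping to 1.0 rescales both.
clipped = clip_grads([[3.0], [4.0]], 1.0)
```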
* Large model export and run ORT Python support
* Megatron change
refine a bit
workaround self attention issue
use partitioned name for weights when megatron model parallel is enabled
Fix Megatron Transformer issue (caused by the renaming)
Add UTs for T5 model parallel
Fix megatron seed issue
fix log a bit
checkpointing changes + rebase
Unintended reshape transform change
t5 layer norm changes
add t5 layer norm kernel
use template for t5 layer norm
template definition changes
no build error
add CPU cuda kernel
first unit test
other forward unit tests
add T5LayerNormGrad
Add c++ transform and test for T5 LN
minor fix
BART MLP Megatron transform
Add concat slice transform + test
Cosmetic improvements in concat slice transform
Constant folding bug fix + megatron attention transform for BART
Undo unnecessary changes
* Cleanup
* Remove unnecessary changes
* Cleanup megatron
* Windows build
* Add self attention test graph
* Correcting transforms + cleanup
* review comments
* review comments
* fix build and test failures
* Fix CI
* fix windows CI
Co-authored-by: Peng Wang <pengwa@microsoft.com>
Co-authored-by: Aishwarya <aibhanda@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
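For context on the "t5 layer norm" commits above: T5 uses a simplified layer norm that scales by the root-mean-square of the input, with no mean subtraction and no bias. A minimal sketch assuming that definition (the function name and eps value are illustrative):

```python
import math

def t5_layer_norm(x, weight, eps=1e-6):
    """T5-style layer norm: divide by the RMS of x and apply a learned
    scale; unlike standard LayerNorm there is no mean subtraction and
    no bias term."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for w, v in zip(weight, x)]

# An input that already has unit RMS is (almost) unchanged.
out = t5_layer_norm([1.0, -1.0, 1.0, -1.0], [1.0] * 4)
```

The absence of the mean/bias terms is what makes it worth a dedicated kernel and gradient (T5LayerNormGrad) instead of reusing the standard LayerNorm path.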
* Enabling Multi Device support for UEP
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Minor fix added
* Added a simple fix to determine the OpenVINO version for the Arm build as well
Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>
* Move GetCapability independent of ModelBuilder
* minor code style fix
* Move ort_enforce for same number of op_builders and op_support_checkers
* minor code fix
* cpu send/recv
* clean up send/recv
* remove unused code
* assert and nccl option for mnist
* add a build option to enable CPU-only builds. Without it, NCCL is always enabled, which breaks the build on machines that only have a CPU
* Add USE_MPI distinct from USE_NCCL/USE_HOROVOD
* fix
* fix
* exclude cpu send/recv for machines without mpi
Co-authored-by: Tim Harris <tiharr@microsoft.com>