onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-10 17:37:14 +00:00

Author	SHA1	Message	Date
liqunfu	af3988198c	Liqun/e2e transformer test (#3540 ) * initial change to transformer.py * prepare e2e transformer tests * refactor transformer tests * put test python files in a flat folder * fix typo pip install transform(s) * python 3.6 * python version to 3.6 in install_ubuntu.sh * remove argparser * to use opset ver 12 * workaround loss_scale naming patch in case of loss_fn_ * assign self.loss_fn_ so it can be checked * skip a few un-needed post-process steps * fix loss_scale_input_name, clean up post process steps * skip non-frontend tests * move cpu/cuda related files to coresponding cpu/cuda folder (#3668) Co-authored-by: Weixing Zhang <wezhan@microsoft.com> * type cast for ratio is not necessary for dropout (#3682) Co-authored-by: Weixing Zhang <wezhan@microsoft.com> * thrustallocator is not needed since cub is used directly for gather now. (#3683) Co-authored-by: Weixing Zhang <wezhan@microsoft.com> * GatherND-12 Implementation (#3645) * Renamed, UT passing * Move GatherND CUDA Kerenl into onnxruntime * Merge GatherNDOpTest * Refactor Test code * Merge CPU Kernel Impl * Handle Negative Indice, Fix UT * Improve CUDA kernel to handle negative index * Minor Fixes * Preserve GatherND-1 Cuda kernel * Fix Mac build * fix UT * Fix Build * fix GatherNDOpTest.double > CUDA error cudaErrorInvalidDeviceFunction:invalid device function Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: Peng Wang (pengwa) <pengwa@microsoft.com> * update with reviewers' comments * testBertTrainingGradientAccumulation was not using rtol and may fail occasionally with small (e-06) difference * fix merge mistakes Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: Weixing Zhang <weixingzhang@users.noreply.github.com> Co-authored-by: Weixing Zhang <wezhan@microsoft.com> Co-authored-by: Sherlock <baihan.huang@gmail.com> Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: Peng Wang (pengwa) <pengwa@microsoft.com>	2020-04-30 12:26:38 -07:00
pengwa	177c1357f4	Use cublasHgemm "back" for fp16 computation with Volta GPU (#3765 ) * Use cublasHgemm for fp16 computation with Volta GPU	2020-05-01 00:36:07 +08:00
Scott McKay	3421ec1110	Add Threadpool::TrySimpleParallelFor (#3759 ) * Add TrySimpleParallerFor so that there's a path with OpenMP awareness for SimpleParallelFor. Makes it consistent with [Try]BatchParallelFor and [Try]ParallelFor. Update TopK to check for the number of threads better, and to use TrySimpleParallelFor. * Update doco to mention TrySimpleParallelFor	2020-04-30 20:03:33 +10:00
M. Zeeshan Siddiqui	b9a5ed1fe2	Add SoftmaxCrossEntropyLoss to mixed-precision-transformer. (#3760 )	2020-04-30 02:48:21 -07:00
Scott McKay	9f72752397	Fix 'Install ONNX' CI failure (#3761 ) * Disable flaky test temporarily * turn off pip upgrade warning Co-authored-by: Aishwarya Bhandare <aibhanda@microsoft.com> Co-authored-by: Zeeshan Siddiqui <mzs@microsoft.com>	2020-04-30 18:18:58 +10:00
pengwa	0531acccc5	Refine GatherND CPU/CUDA Kernels & Add UTs (#3688 ) * Refactor GatherND CPU Kernel (Renaming & Simplify) * Add batch_dim=1 or 2, negative slices tests * Rename gather_nd_gard_impl.cu * Use dispatcher to refactor CUDA GatherND/GatherNDGrad * Change GatherNDBase::CommonComputeKernel --> GatherNDBase::PrepareCompute * Use HasCudaEnvironment instead of __CUDA_ARCH__ for some double type tests	2020-04-30 10:17:54 +08:00
ashbhandare	58f53966d3	Add Distributed Checkpointing support (#3639 ) * Change naming of moments to Moment_x_<weight_name> * Add checkpointing code and zero checkpoint aggregation * Correct aggregation for LAMB, cleanup * Add simple checkpointing test * Add test for zero checkpoint aggregation * Fix tests * fix test * Review changes * Fix test after review comment fix * Fix API, test * Fix test after API change * Decouple save load from ORTTrainer * Add flag to not break checkpointing with ORTModel' Co-authored-by: aishwarya bhandare <aibhanda@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-04-29 14:52:21 -07:00
David Brownell	7296e06dd5	Properly creating arguments to pass to setup.py (#3744 )	2020-04-29 09:47:51 -07:00
suffiank	ea0e2d1dde	fix warning treated as error due to ignoring return status (#3739 ) Co-authored-by: suffian khan <sukha@microsoft.com>	2020-04-29 02:38:53 -07:00
suryasidd	e529464a12	Limit the number of models run on OpenVINO (#3742 ) * Removed NMS from supported list	2020-04-29 02:23:09 -07:00
Changming Sun	7ff06056bd	Fix the test coverage pipeline (#3710 )	2020-04-28 21:21:19 -07:00
Tixxx	0638565fe0	Fix evaluation issues (#3538 ) * allow switching between eval and training modes dynamically Co-authored-by: Tixxx <root@525204a066204ea794f942530b05ae7f000000.axlncovkyjne5caro2tmz3zryb.xx.internal.cloudapp.net>	2020-04-28 21:03:37 -07:00
M. Zeeshan Siddiqui	939589c265	Fix flaky test and avoid divide by zero in SoftmaxCrossEntropyLoss-CPU. (#3734 ) * Fix flaky test and avoid divide by zero in SoftmaxCrossEntropyLoss-CPU. * fix gather test? * PR feedback.	2020-04-28 19:35:14 -07:00
Pranav Sharma	bad90d7a53	Fix a perf regression by providing a better estimate for the cost in LSTM's TryParallelFor call.	2020-04-28 19:25:20 -07:00
gwang-msft	12d7c2f6e4	iOS cross build on MacOS (#3699 ) * Enable iOS cross build on MacOS (step#1) * Changed parallel option * fixed style issues * Enable ios arm64 crossbuild on MacOS * Enable ios arm64 crossbuild on MacOS * Enable parallel build for xcode * Fix arm64 function not 4-byte aligned warning * Rename onnxruntime_ios.cmake to onnxruntime_ios.toolchain.cmake * change build.py to use the new ios toolchain file name	2020-04-28 17:09:31 -07:00
Scott McKay	29c12c0f07	Handle dim with value of zero in ConvTranspose (#3728 ) * Handle dim with value of zero in ConvTranspose * Update CUDA implementation and disable zero dim test for some EPs that don't support that yet.	2020-04-29 09:58:36 +10:00
Jeff Bloomfield	9a4d1c7720	Merge pull request #3708 from microsoft/jeffbloo/MergeDmlDev Merge DML Execution Provider updates	2020-04-28 15:19:51 -07:00
Sheil Kumar	f1a948fd62	Enable telemetry on windows zip packages (#3738 ) Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2020-04-28 14:07:11 -07:00
Ori Levari	78fde2c4cb	add downlevel test artifact to windowsai-nuget build (#3711 )	2020-04-28 10:05:32 -07:00
S. Manohar Karlapalem	f7cf703d10	[OpenVINO-EP] Optimize MCR Docker image size (#3732 ) * updated dockerfile.openvino * Group all RUN commands and add a 'cd WORKDIR' betwen each * Update doc with installer and build info Highlight usage of Online installer package. Specify --rm option during docker build to avoid caching layer. Co-authored-by: avidiyal <akhila.vidiyala@intel.com>	2020-04-29 00:08:15 +08:00
edgchen1	1356215bd0	Fix build issues in the Python Packaging pipelines. (#3725 )	2020-04-28 08:41:37 -07:00
edgchen1	1bcfd49918	Merge pull request #3731 from microsoft/ettao/ort-2-master Merge from ort_training to master	2020-04-28 07:56:05 -07:00
George Wu	6b3b4fe43e	remove warning message (#3730 )	2020-04-28 03:02:34 -07:00
Jeff Bloomfield	1a11ba8a7e	Merge remote-tracking branch 'upstream/master' into jeffbloo/MergeDmlDev	2020-04-28 00:45:22 -07:00
Tianlei Wu	f487cc0b28	Fix Reshape Fusion with graph inputs (#3729 ) Use NodeArg to check root input; Add a check on constant initializer	2020-04-28 00:03:16 -07:00
ytaous	75c24a5fac	Revert "Merge from ort_training to master (#3719 )" (#3726 ) This reverts commit `b990ba0059`.	2020-04-27 20:42:43 -07:00
ytaous	b990ba0059	Merge from ort_training to master (#3719 ) * move cpu/cuda related files to coresponding cpu/cuda folder (#3668) Co-authored-by: Weixing Zhang <wezhan@microsoft.com> * type cast for ratio is not necessary for dropout (#3682) Co-authored-by: Weixing Zhang <wezhan@microsoft.com> * thrustallocator is not needed since cub is used directly for gather now. (#3683) Co-authored-by: Weixing Zhang <wezhan@microsoft.com> * GatherND-12 Implementation (#3645) * Renamed, UT passing * Move GatherND CUDA Kerenl into onnxruntime * Merge GatherNDOpTest * Refactor Test code * Merge CPU Kernel Impl * Handle Negative Indice, Fix UT * Improve CUDA kernel to handle negative index * Minor Fixes * Preserve GatherND-1 Cuda kernel * Fix Mac build * fix UT * Fix Build * fix GatherNDOpTest.double > CUDA error cudaErrorInvalidDeviceFunction:invalid device function Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: Peng Wang (pengwa) <pengwa@microsoft.com> * Set gradient as output only for easy mode (#3694) * Support GPU Event Operators (#3653) * Add GPU event operators to support in-place updates in gradient accumulator and optimizer for modifying the tensors passing through those event operators. * Address comment and polish code * Merge shared code between CPU and GPU kernels * Move event test to a new file * Address comments * Update onnxruntime/core/providers/cuda/gpu_data_transfer.cc * fix path of cpu_featurizers_kernels.cc and cpu_featurizers_kernels.h Co-authored-by: Weixing Zhang <weixingzhang@users.noreply.github.com> Co-authored-by: Weixing Zhang <wezhan@microsoft.com> Co-authored-by: Sherlock <baihan.huang@gmail.com> Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: Peng Wang (pengwa) <pengwa@microsoft.com> Co-authored-by: ashbhandare <ash.bhandare@gmail.com> Co-authored-by: Wei-Sheng Chin <wschin@outlook.com> Co-authored-by: Ethan Tao <ettao@microsoft.com>	2020-04-27 16:45:21 -07:00
Dmitri Smirnov	4f887b465a	Uncomment celu test. (#3717 )	2020-04-27 14:24:54 -07:00
Wei-Sheng Chin	7627e6bcc2	Improve node and node argument name generation (#3649 )	2020-04-27 13:57:24 -07:00
Jeff Bloomfield	407492472f	Fix build warnings and address PR comment	2020-04-27 13:21:45 -07:00
Weixing Zhang	d03c552992	fix path of cpu_featurizers_kernels.cc and cpu_featurizers_kernels.h	2020-04-27 19:39:42 +00:00
Sherlock	635bc9cd04	Fix graph transformers to support opset 12 ops (#3715 )	2020-04-27 11:53:45 -07:00
Ethan Tao	0516e7d22e	Merge branch 'ort_public_ort_training' into ettao/ort-2-master	2020-04-27 18:17:17 +00:00
Prabhat	d901640817	Call optimised version of depthwise ConvLayer (#3664 ) * Call optimised version of depthwise ConvLayer * Update if statements	2020-04-27 17:41:33 +05:30
George Wu	c23b484275	add missing deps in Dockerfile.openvino	2020-04-26 22:02:48 -07:00
Jeff Bloomfield	1621a1fef1	Merge remote-tracking branch 'upstream/master' into jeffbloo/MergeDmlDev	2020-04-26 17:20:21 -07:00
Jeff Bloomfield	e51c6c0b3b	Fix build warning in DmlOperatorResize.cpp and ReadbackHeap.cpp	2020-04-26 17:20:15 -07:00
Changming Sun	805ffc01e5	Temp remove --enable_wcos --use_winml from CI build (#3707 ) The flags "--enable_wcos --use_winml" don't work with the latest VC++ and CMake. I don't know which caused the failure. But it doesn't work. Remove it to make the pipelines work first. Will add them back before 1.3 release.	2020-04-26 16:10:25 -07:00
Jeff Bloomfield	735caecfe1	Copy disabled ONNX backend tests from WindowsAI	2020-04-26 14:47:41 -07:00
David Brownell	a917023f94	Support for country-specific holidays in the DateTimeTransformer (#3701 ) * Support for country-specific holidays in the DateTimeTransformer Updates the DateTimeTransformer featurizer to support holidays, where holiday information is read from country-specific json files. * Addressed build breaks * Enhanced Windows strategies for scenarios when tests run from root dir * Skipping test for nuget installations	2020-04-26 11:12:26 -07:00
Jeff Bloomfield	02e8d10f3a	Fix AdapterSessionTest	2020-04-25 20:49:51 -07:00
Tracy Sharpe	bf1caba2b2	Port MLAS to Power architecture (#3703 ) Updates to MLAS to support building for the Power architecture.	2020-04-25 19:31:55 -07:00
Jeff Bloomfield	f1c19f8495	merge master	2020-04-25 19:04:58 -07:00
Jeff Bloomfield	99a0bdf271	Upgrade nuget version in dml.cmake	2020-04-25 18:48:32 -07:00
Jeff Bloomfield	8cc161aec6	Remove problematic change for dxcore.lib	2020-04-25 18:48:07 -07:00
Jeff Bloomfield	c49cc0c937	Increase DML nuget version to 0.0.2	2020-04-25 16:28:19 -07:00
edgchen1	e22d97ba56	Merge pull request #3643 from microsoft/ort_training_for_merge_to_master Introduce ORT training implementation	2020-04-25 07:15:22 -07:00
Sheil Kumar	a475f2824d	Create the Nuget WindowsAI Pipeline (#3684 ) * add windowsai.yml for new Microsoft.AI.MachineLearning nuget * temporarily add windowsai.yml to gpu.yml * pass in build arch * remove install onnx task * no dml for arm or arm64 * refactor nuget pipeline defs * update package creation * pass in build and sources path * missing hyphens * copy license file * fix parameter variable * disable arm builds for now * remove commented script block * download pipeline atifcat name update * set working dir * Add bundling nuget script * path combine * null path * combine needs parentheses * binplace microsoft.* dlls in new nuget package * update artifact name * move merged nuget to artifacts directory * move to merged subfolder in artifacts staging dir * forward slash to back * enable arm * vcvarsall needs x64 vars setup * Run Tests * fix tests * move global variables * update yml to not have global variable in template * removed parameters * fixes * Add build arch as an env variable * ne not neq * %Var% for batch script * dont pass argument for x64 * disable arm tests * skip csharp/cxx tests for microsoft nuget package * remove test-win as it tests only c# cxx and capi * test build for store apps * dont build for store * tools/nuget/generate_nuspec_for_native_nuget.py * remove args. * add new props and targets for microsoft.ai * make windowsai props/targets static * add dependency * dont ship dot net props * Remove c# fom windowsai nuget * copy license file * native packages must have win10 as the platform, not win * cuda header in wrong if branch * no dml for arm builds * only build dml for x64/ x86 * User/sheilk/props update (#3616) * prelim store work * props * Fix desktop nuget props/targets * clean up targets and make store apps work Co-authored-by: Sheil Kumar <sheilk@microsoft.com> * update windowsai.yml with latest * remove extra dloadhelpers * Add abi headers to abi dir, and reference native includes * update windowsai.yml * minor update * remove parameters * add doesrp param * hard code esrp to true * add directml for x86/x64 * revert gpu yml changes * add store builds * add store builds * add checks again in old way * dup job names for store and desktop builds * move all of the runtime binaries to win10 folder * only set safeseh on x86 * disable the store builds for now... missing msvcprt.lib * copy paste deletion... * switch back to win- (#3646) Co-authored-by: Sheil Kumar <sheilk@microsoft.com> * use stahlworks * & not supported in ado * add cuda to cpu nuget(???) and EnableDelayedExpansion to enable x86 dml package * revert nocontribops * add underscore... * extra win/win10 change * merged nuget... still not being bundled... * files in merged directory * missing parens causing dml to be included in cpu package * more diagnostic info * switch dir to get-childitem * wait for compression to complete * add winml_adapter to mkml and gpu packages * enable_wcos * add mklml binaries * props and targets missing from mklml Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2020-04-24 20:20:04 -07:00
ytaous	1c484ce33f	fix test (#3700 ) Co-authored-by: Ethan Tao <ettao@microsoft.com>	2020-04-24 18:09:46 -07:00
Wei-Sheng Chin	72b38f0a8b	Support GPU Event Operators (#3653 ) * Add GPU event operators to support in-place updates in gradient accumulator and optimizer for modifying the tensors passing through those event operators. * Address comment and polish code * Merge shared code between CPU and GPU kernels * Move event test to a new file * Address comments * Update onnxruntime/core/providers/cuda/gpu_data_transfer.cc	2020-04-24 17:43:04 -07:00


2409 commits