onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-17 18:40:28 +00:00

Author	SHA1	Message	Date
Chih-Hsuan Yen	a37e2e33b4	Add compatibility with Protobuf 3.12 (#4291 ) In Protobuf 3.12, classes generated from protobuf files are declared as `final`, so use those classes as members rather than base classes. Ref: https://github.com/protocolbuffers/protobuf/releases/tag/v3.12.0	2020-06-25 20:34:08 -07:00
Changming Sun	5db67ec000	Fix python package issue and upgrade the linux image to 2010 (#4342 ) 1. Increase job timeout, while we are investigating why the tests take much longer 2. Upgrade the linux docker image to manylinux2010, by request from Tianlei. (We had an offline discussion with Pranav and Tracy) 3. Remove the installation of "devtoolset-7" in the CUDA image. It was added for CUDA 10.0, it is not needed for CUDA 10.1. We have moved to CUDA 10.1.	2020-06-25 20:22:39 -07:00
Shucai Xiao	bfc888613f	Migraphx improvements (#4328 ) * Add amd migraphx execution provider to onnx runtime * rename MiGraphX to MIGraphX * add migraphx EP to tests * support multiple program output * disable more tests * backup changes related to program multiple outputs * remove logging code * remove unnecessary changes in migraphx_execution_provider.cc * add migraphx EP to tests * add input requests of the batchnorm operator * add to support an onnx operator PRelu * update migraphx dockerfile and removed one unused line * changes related to support dynamic input shape * fix build error * code backup * code backup * version that has 106 models run correctly * code backup * code backup * remove unnecessary print info * code backup * code backup * code backup * code backup * code backup * code backup * changes corresponding to migraphx change * fix merge conflict * minor code cleanup * code cleanup * remove unnecessary code * remove unnecessary code * add to support more constant folding analysis * more constant folding checking for shape input * add env var to control whether fp16 is enabled. Modify docker file to use ROCM3.3 * fix function name to avoid build error * add build and execution instruction for migraphx execution provider * added more build instructions * fixed a small format error * a minor change * fix review comments * another minor change * additional refinement of the documents * additional changes * remove unnecessary changes in the dockerfile * additional changes for the dockerfile * code change backup * fix errors related to a few unit tests * fix a build error related to api change * fix unit test errors by either disabling the test or fixing related issues * remove unnecessary log info * sync submodule tvm with master * remove unnecessary changes * remove an unnecessary code line * refine documents for addition example	2020-06-25 19:22:57 -07:00
edgchen1	0b450dcd9f	Enable BiasGelu fusion for training (#4146 ) Add gradient for BiasGelu and FastGelu with bias. Enable BiasGeluFusion and GeluApproximation transformers in TrainingSession.	2020-06-25 17:48:12 -07:00
Faith Xu	b544f5c83c	Sample updates (#4303 ) * Add section for product integrations * Wording updates	2020-06-25 16:09:17 -07:00
Du Li	645a988c04	Support binding input only for IOBinding in python api. (#4079 ) * Support binding input only in python api. * Addressing PR comments. * fixing build issues	2020-06-25 12:20:02 -07:00
Dmitri Smirnov	a08805daf9	Fix a minor typo in POM file name (#4250 ) Co-authored-by: Changming Sun <chasun@microsoft.com>	2020-06-25 11:15:14 -07:00
Tim Harris	3fc68cb150	Remove non-trivially-destructible thread-local from thread pool state, blocking ARM64 builds (#4336 ) - Move thread hint vectors from thread-local struct - Add static_assert that the per-thread state in the thread pool is trivially-destructible - Rename "thread_data" to "worker_data" (only allocated for workers in the pool, not threads calling into the pool)	2020-06-25 19:04:31 +01:00
George Wu	a3b466cdf1	fix python ep default ordering. (#4335 ) * fix python ep default ordering. cpu provider should be last. * add comment. * add test case to ensure no regressions for get_all_providers(). * expand on get_all_providers() api documentation	2020-06-25 04:25:43 -07:00
Prabhat	151ef1c8a5	Add C++ wrapper for GetAvailableProviders() C API (#4313 )	2020-06-25 13:11:55 +05:30
edgchen1	a6d10376df	Fix build error when USE_NCCL is defined. (#4334 )	2020-06-24 23:32:31 -07:00
Josh Bradley	0d9db2b28d	add informative error message regarding symbolic dimensions (#4297 ) * add informative error message regarding symbolic dimensions * fix code format and move negative value check in for loop	2020-06-25 11:56:14 +10:00
Aaron Bockover	64264c3846	Allow --cmake_generator to work on macOS (#4278 )	2020-06-24 16:30:33 -07:00
S. Manohar Karlapalem	15c07c75f8	[OpenVINO-EP] Upgrade version info to 2020.3 in docs (#4304 ) * Upgrade version to 2020.3 in docs * update online installer size for 2020.3 * update OV 2020.3 install dir path	2020-06-24 15:01:55 -07:00
Tim Harris	a241eb0bbe	Renaming --partition_optimizer to --deepspeed_zero_stage (#4312 ) * Rename partition_optimizer -> deepspeed_zero * Use ZeROConfig in orttraining_pybind_state.cc * deepspeed_zero -> deepspeed_zero_stage for clarity * Expose as deepspeed_zero_stage in pybind	2020-06-24 22:05:03 +01:00
Cecilia Liu	7e71ff2a1f	Match Reshape Subgraph Pattern For GPT2 (#4279 ) Reshape fusion for one element subgraph patterns.	2020-06-24 10:07:30 -07:00
Tim Harris	5c6a27408a	Remove signed/unsigned compiler warnings, add additional pipeline test case (#4314 ) * Avoid signed/unsigned warning on loops * Report sizes when distributed world configuration is inconsistent * Add DistributedRunContextTest for pipeline stage configuration	2020-06-24 11:36:18 +01:00
Pranav Sharma	44f06ec480	Fix memory usage when loading a model + some other minor fixes to avoid unnecessary heap allocations. (#4318 )	2020-06-24 00:23:11 -07:00
Scott McKay	5dd3ebb3b1	Tune setting for when to use MlasComputeSoftmax due to changes in #3906 . (#4170 )	2020-06-24 16:58:43 +10:00
Vincent Wang	f26c149d7d	Set NonZero Output Shape for Gradient Building. (#4246 ) * Set NonZero output shape for gradient building. * Resolve comments. Co-authored-by: Vincent Wang <weicwang@AiFramework2080ti2.corp.microsoft.com>	2020-06-24 13:43:22 +08:00
suryasidd	20e205aa0a	[OpenVINO-EP] Changed the default scheduler for VAD-M (#4295 ) * Changed the scheduler for VAD-M to bypass scheduler and modified logic * Added extra configuration step to documentation for VAD-M * Removed cout statement * Fixed documentation * Removed softmax restriction * Added VPU config setting for graphs with dynamic shape * Set VPU config only for MYRIAD * Added log statement	2020-06-23 21:21:58 -07:00
Vincent Wang	3374733783	Refactor ReduceMean/Sum Gradient without Shape Dependency. (#4261 ) * ReduceMean/Sum gradient without shape dependency. * optimize expand and use it to replace add. * Adjust test. Co-authored-by: Vincent Wang <weicwang@microsoft.com>	2020-06-24 11:36:53 +08:00
Changming Sun	deea945f80	Remove openmp and scipy from build pipelines (#4305 ) 1. Remove openmp because the default thread pool is already good enough. 2. Remove scipy from build pipelines because it stops support python 3.5.	2020-06-23 20:18:16 -07:00
Yufeng Li	867ba846f7	Implement MinMax with SIMD (#4285 ) * Implement MinMax with SIMD	2020-06-23 20:07:53 -07:00
edgchen1	4e39fda06a	Fix version of torch and torchvision in install_deps.sh. (#4316 )	2020-06-23 14:55:18 -07:00
Bowen Bao	15cb4b3023	Fix session load state & run extra_postpasses only once (#4255 ) * Fix session load state & run extra_postpasses only once * add testcase for onnx model as well	2020-06-23 11:45:26 -07:00
Prabhat	d3c5cb6349	Use providers_available array from constants.h to avoid code duplication (#4300 )	2020-06-23 11:52:51 +05:30
edgchen1	737c22a911	Refactor Python packaging builds (#4283 ) Reuse the same template file for all Python packaging builds.	2020-06-22 17:13:22 -07:00
Tim Harris	9e3b5c62fb	Use OpenMP-like synchronization patterns in Eigen thread pool (#4236 ) Updates the thread pool implementation to make work distribution over the Eigen thread pool more closely resemble techniques used in OpenMP. In particular: (1) A thread entering a parallel loop works on the iterations itself, rather than requiring a thread switch to/from a thread in the pool, if called from outside the thread pool. (2) To support this, work items pushed to the thread pool run a loop to claim iterations from a shared counter via atomic-fetch-and-add, as opposed to having work items themselves represent individual batches of iterations. This means that any thread working on the loop can execute any batch of iterations, including having the main thread run through all of the batches itself if the loop turns out to be short-running. (3) As with OpenMP active scheduling, the worker loop spins waiting for work prior to blocking. This avoids OS blocking / wake-up paths in workloads with series of short-running parallel sections.	2020-06-22 10:04:53 +01:00
Prabhat	57fabfba7a	Added GetAvailableProviders() to C API (#4247 ) * Added GetAvailableProviders to C API * Fix API version and Windows build error * Changed function name * Changed ORT_API_VERSION to 4 * Moved all_providers array to constants.h * Move check for providers to constants.h * Changed name of array to avoid warning * Address review comment * Added unit test	2020-06-22 10:10:25 +08:00
Scott McKay	175983c082	Move memory info into IAllocator (#2850 ) - Update IAllocator setup to move the OrtMemoryInfo to the base class instead of requiring derived classes to have that as a member and override a virtual method to return it. - Cleanup CreateAllocator setup to take an argument as to whether to wrap the device allocator in an arena allocator. The choice to do that isn't a property of the underlying device allocator. - Minor cleanups in the various EPs to adjust to the change to IAllocator and CreateAllocator, and to use the create_arena flag consistently when available.	2020-06-22 11:18:52 +10:00
Yang Chen	064afa0f93	define dim_idx before use it (#4290 )	2020-06-20 21:05:13 -07:00
Pranav Sharma	2204d39a06	Add build option to disable traditional ML ops from the binary. (#4272 ) * Add build option to disable traditional ML ops from the binary. * Fix python tests by splitting tests for ML ops to a separate file. Exclude ML tests from onnx_test_runner and C# tests. Exclude ML op sources. * Update Edge pkg pipelines with new MLops env variable and fix C# packaging pipeline tests to skip ML ops.	2020-06-20 06:36:06 -07:00
alkoumpa	3c633384c2	Fix TensorRT memory leaks (#4227 ) * fix tensorrt memory leaks * wrap unique_pointer in a namespace to avoid conflicts Co-authored-by: alex <act@act.com>	2020-06-20 03:37:38 -07:00
Derek Murray	a541d28fb4	Lazily get allocator when allocating an MLValue (#4276 ) According to profiling in #4267, getting the allocator can account for a large fraction of overhead when accessing a kernel output, due to STL container operations. The allocator isn't used when (i) we're not creating a fence, and (ii) we have a memory pattern and a pre-allocated buffer, so we can avoid this overhead.	2020-06-19 15:55:43 -07:00
Yang Chen	a490beedf1	update tvm submodule (#4284 )	2020-06-19 14:51:18 -07:00
Tianlei Wu	e08181f74d	Update Bert Notebooks for ORT 1.3.0 (#4274 ) * update keras notebook * re-run pytorch bert notebook	2020-06-19 14:02:16 -07:00
Tianlei Wu	466511c1c3	Update gpt2 benchmark with position_ids and fp16 (#4275 ) * support position_ids input * support fp16 conversion for gpt2 past state * output results to csv file * Remove the useless check that output of matmul is in cuda	2020-06-19 14:01:37 -07:00
Changming Sun	0349479b19	Fix component governance and codesign validation errors (#4277 ) Adjust the job steps so that these security tasks run before the build directory clean up.	2020-06-18 15:54:18 -07:00
Hariharan Seshadri	d5610e666b	Support CUDA kernel for Einsum op (#4095 )	2020-06-18 15:03:23 -07:00
goloskokovic	478b923e19	Expose ACL/ARMNN providers to Python (#4260 ) * expose ACL/ARMNN providers to python * add -acl / -armnn to package name when use_acl / use_armnn is specified * build python wheel for ARMNN EP * link ACL/ARMNN EPs into onnxruntime_pybind11_state * wrong argument order in build_python_wheel for wheel_name_suffix	2020-06-18 20:24:14 +05:30
Changming Sun	e505faa022	Fix two compiler warnings (#4263 )	2020-06-17 20:47:01 -07:00
Tracy Sharpe	5d773ee57b	MLAS: add sgemv path for aarch64 builds (#4254 ) Implement a fast path for GEMMs where M=1 and TransB=CblasNoTrans.	2020-06-17 20:10:35 -07:00
Chih-Hsuan Yen	5da849b414	Fix detection of protobuf with onnxruntime_PREFER_SYSTEM_LIB on Linux (#4230 ) The CMake module is FindProtobuf.cmake [1]. Thus the name should be capitalized so that protobuf can be found on case-sensitive file systems. [1] https://github.com/Kitware/CMake/blob/v3.17.3/Modules/FindProtobuf.cmake	2020-06-17 17:34:47 -07:00
Changming Sun	43deec2174	Temporarily remove dnnl from Linux CI build to unblock the whole team (#4266 )	2020-06-17 16:25:24 -07:00
Vincent Wang	b41fcf1570	Bugfix for shape inference and GetShape. (#4243 ) Co-authored-by: Vincent Wang <weicwang@microsoft.com>	2020-06-17 15:11:02 +08:00
Yulong Wang	12367a6b11	[C#] enable string-typed FixedBufferOnnxValue in input (#4178 )	2020-06-16 11:06:11 -07:00
Wei-Sheng Chin	189fb60ef9	Fix a bug and add code to profile memory (#4241 ) * Fix a bug and add code to profile memory 1. Compile Send/Recv again (currently broken because of HOROVOD refactor). 2. Add code to print out initializer allocation size and activation memory size. * Address comments * Split memory counts per locations * Fix a metric	2020-06-16 10:17:27 -07:00
edgchen1	63bf587623	Use azcopy to download test data (#4221 ) Use azcopy from download_e2e_test_data.py, add helper function for downloading azcopy. Update download_test_data.py to use helper function.	2020-06-16 10:14:34 -07:00
Tianlei Wu	61fa5476d5	Update PyTorch Bert notebooks (#4239 ) update PyTorch Bert SQuAD notebooks to use onnxruntime-tools and update usage of intra_op_num_threads. rename python files according to coding style Fix change_input_to_int32. update keras notebook to copy script from rel-1.3.0 branch (Will update them later)	2020-06-16 09:36:51 -07:00
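
Commit 9e3b5c62fb above describes a parallel loop in which every participating thread, including the caller, claims batches of iterations from a shared counter via atomic fetch-and-add. The following is a minimal Python sketch of that idea only, not the onnxruntime implementation. The names `SharedCounter`, `parallel_for`, and `square_all` are hypothetical, and because Python has no atomic fetch-and-add primitive, a lock stands in for it. The spin-before-blocking behaviour from point (3) of the commit message is not modelled.

```python
import threading


class SharedCounter:
    """Lock-based stand-in for an atomic fetch-and-add counter (assumption)."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def fetch_add(self, n):
        # Return the old value and advance the counter by n in one step.
        with self._lock:
            old = self._value
            self._value += n
            return old


def parallel_for(total, batch_size, fn, num_workers):
    """Run fn(i) for i in range(total); any thread may take any batch."""
    counter = SharedCounter()

    def claim_and_run():
        while True:
            start = counter.fetch_add(batch_size)
            if start >= total:
                return  # no iterations left to claim
            for i in range(start, min(start + batch_size, total)):
                fn(i)

    workers = [threading.Thread(target=claim_and_run) for _ in range(num_workers)]
    for w in workers:
        w.start()
    # The calling thread works on the loop itself instead of only waiting.
    claim_and_run()
    for w in workers:
        w.join()


def square_all(values, num_workers=3, batch_size=4):
    """Example loop body: square each element into a preallocated list."""
    out = [None] * len(values)

    def body(i):
        out[i] = values[i] * values[i]

    parallel_for(len(values), batch_size, body, num_workers)
    return out
```

With `num_workers=0` the caller simply runs every batch itself, which mirrors the commit's note that the main thread can finish a short loop alone.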


2760 commits