onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-26 03:00:54 +00:00

Author	SHA1	Message	Date
Sergii Dymchenko	c5176087bf	Get onnxruntime/contrib_ops/cuda/bert/fast_gelu.cc from ort_training.	2020-04-09 17:55:52 -07:00
Sergii Dymchenko	6bbc80951d	Get onnxruntime/core/providers/cuda/tensor/slice.h from ort_training.	2020-04-09 17:03:58 -07:00
Sergii Dymchenko	0e4080f1d6	Get cuda_common.h from master.	2020-04-09 16:56:52 -07:00
Sergii Dymchenko	84773c61c6	Rename ONNX OPTIONAL to OPTIONAL_VALUE.	2020-04-09 16:22:30 -07:00
Yufeng Li	a443b1b6b9	Revert "Use IMMA for int8 matmul to leverage Turing Tensor Core (#3413 )" (#3472 ) This reverts commit `4d71958ccf`. Revert the PR. Looks like it triggers a bug in nvcc and fails the GPU pipeline.	2020-04-09 15:59:52 -07:00
liqunfu	e7297e6c9d	create pipeline for ci frontend tests (#3422 ) create pipeline for nightly python front-end e2e tests	2020-04-09 15:31:22 -07:00
Sergii Dymchenko	eaa3f652df	Fix dynamicslice.cc after merge.	2020-04-09 15:17:21 -07:00
Sergii Dymchenko	8ea0e596ec	Fix onnxruntime_unittests.cmake after merge.	2020-04-09 13:14:15 -07:00
Sergii Dymchenko	6ba7c99e50	Merge branch 'master' into ort_training	2020-04-09 12:42:04 -07:00
Jeff Bloomfield	b04e48333d	Merge branch 'master' into DmlDev	2020-04-09 11:30:01 -07:00
Scott McKay	40d80cde8f	Rework CDist (#3393 ) * Make CDist faster via Eigen squaredNorm and GEMM. * Add call to abs() as the GEMM output may differ slightly due to floating point accuracy and result in a negative distance which returns NaN if sqrt() is applied to it. * Update math::Gemm to use the type for alpha and beta instead of hardcoding to float. Matches the GemmEx definition. * Provide Eigen based replication of the GEMM call on x86 if T=double. * Make test model data deterministic. * Do the GEMM first so we can avoid potentially subtracting two numbers that are very close to each other.	2020-04-09 14:05:25 +10:00
ytaous	a08f16471a	Address comments around bfc arena (#3460 ) * rename setting * todo comments * fix build Co-authored-by: Ethan Tao <ettao@microsoft.com>	2020-04-08 19:35:32 -07:00
liqunfu	a298556600	raise rtol to unblock CI (#3457 ) raise rtol to avoid expected CI test failure in onnxruntime_test_ort_trainer.py	2020-04-08 17:17:44 -07:00
Jeff Bloomfield	31bb1182e6	Merged PR 4530161: ORT changes for conversion of DML graph to public API This updates ORT for new API signatures and removal of preview header Related work items: #24822151	2020-04-09 00:04:03 +00:00
ytaous	f73008483a	safeint for region bytes in bfc arena and code clean up (#3447 ) * PR comments * remove build issue workaround * SafeInt for region bytes * fix build * fix build Co-authored-by: Ethan Tao <ettao@microsoft.com>	2020-04-08 13:54:42 -07:00
Yulong Wang	718068f020	update C# API to optimize inference latency (#3171 ) * update C# API to optimize inference latency * rename PinnedOnnxValue to fixedBufferOnnxValue and fix build break * add more test cases * add conditions on string tensors for pre-allocated outputs * change to random inputs * fix word spell * resolve comments * resolve comments * remove FixedBufferOnnxValueTests.cs * fix trivial typos in doc	2020-04-08 11:57:40 -07:00
Pranav Sharma	cdac74b3c3	Use Eigen threadpool for ReduceSum and ReduceMean. (#3441 ) * Use Eigen threadpool for ReduceSum and ReduceMean. * Fix mac build	2020-04-08 11:50:22 -07:00
liqunfu	1ddfe1249b	frontend test to use random seed (#3209 ) frontend test to use random seed	2020-04-08 10:03:07 -07:00
Ye Wang	f8fa1dde55	Add a list of Featurizers kernels (#3435 ) * wangye/pivot (#3432) * check in * work version * add ForecastingPivot kernel * fix mac os and linux build error * update FeaturizerLibrary Version * resolve comments * remove changes * Add Kernel for LagLeadOperator & RollingWindowFeaturizer (#3434) * update * update todo * resolve comments * relax eps for TruncatedSVD transformer * mute TruncatedSVD_transformer due to undeterministic test result * resolve comments * update * test * update * fix	2020-04-07 17:00:45 -07:00
Yufeng Li	4d71958ccf	Use IMMA for int8 matmul to leverage Turing Tensor Core (#3413 ) Use IMMA for int8 matmul to leverage Turing Tensor Core Format files under onnxruntime/core/providers/cuda	2020-04-07 15:22:04 -07:00
Tracy Sharpe	de60a14c16	Fix output range for int8_t QuantizeLinear op (#3445 )	2020-04-07 15:01:20 -07:00
Yulong Wang	aabf47b107	Fix Split CUDA implementation for zero sized input (#2942 ) * Fix Split CUDA implementation for zero sized input * resolve comments * add case * test case update: split into 2 tensors	2020-04-07 14:44:20 -07:00
Scott McKay	48e96ea65f	Reduce binary size of Slice implementation (#3238 ) * Make the Slice implementation based on type sizes and reduce templatized code to a minimum. * Remove using 'dynamic' as a template param to Slice as well.	2020-04-08 07:19:29 +10:00
Tiago Koji Castro Shibata	ff51d752d1	Merged PR 4527632: Fix race condition creating resource store Port https://github.com/microsoft/onnxruntime/pull/3419 Fixes #25822488 and #25880544	2020-04-07 20:31:24 +00:00
ytaous	b35468289a	View Op - new unit tests and add support for tensor memcpy by offset/size (#3439 ) * view ops UTs * update per comments * PR comments - code clean up * code clean up per comments Co-authored-by: Ethan Tao <ettao@microsoft.com>	2020-04-07 13:07:11 -07:00
Tiago Koji Castro Shibata	1fd21c109e	Fix race condition creating ConverterResourceStore (#3419 )	2020-04-07 11:57:27 -07:00
Dmitri Smirnov	53b9d52fc6	Rework TensorToTensorProto. Do not put string data to raw_data. Eliminate redundant argument. (#3438 ) Rework TensorToTensorProto. Eliminate redundant argument. Do not put string data into raw_data.	2020-04-07 11:42:10 -07:00
Andrews548	43d6c464fc	Fix ACL EP pooling build breakage (#3429 ) The commit `06fc9506fd` which refactored cpu Pool class broke ACL EP build. Also worked on the commit `a4fe60c4d3` as it also affects the new class. Move the declaration of the new MaxPoolV8 cpu class in the header file. Implement MaxPool 8-11 in ACL EP. Co-authored-by: Andrei-Alexandru <andrei-alexandru.avram@nxp.com>	2020-04-07 07:03:52 -07:00
Tianlei Wu	4bdb5cc8e2	Add CPU implementation for FastGelu operator (#3398 ) * Add CPU implementation for FastGelu operator * Update optimization script to fuse Gelu or FastGelu according to whether Erf or Tanh is used in graph. * Merge BiasGelu and FastGelu into one class * Enable FastGelu Fusion optimizer for CPU Execution Provider.	2020-04-07 00:19:30 -07:00
Changming Sun	9e65298d7a	Re-enable tests (#3437 ) Re-enable some tests that were recently fixed.	2020-04-06 20:13:34 -07:00
Thiago Crepaldi	15e32b44fd	Merge pull request #3383 Merge from master into ort_training	2020-04-06 19:05:01 -07:00
Tianlei Wu	8ab09186b7	Bert Optimization Script Improvements (#3387 ) Add opt_level option for graph optimization level in bert perf test. Support BERT models that output each layer, where SkipLayerNormalization has more than 4 children. Check weight and bias are 1D for layer norm fusion. Add a dummy class Gpt2OnnxModel for further changes of GPT2 model.	2020-04-06 16:55:40 -07:00
Edward Chen	95707d22a5	Disable gradient clipping for E2E test.	2020-04-06 23:07:28 +00:00
Dmitri Smirnov	c8f5e6e632	Implement Min/Max/Clip(12) (#3410 ) Implement Max/Min for opset 12. Add Clip(12) CPU impl. Implement Clip(12) for CPU and CUDA add tests	2020-04-06 14:24:59 -07:00
Yang Chen	7c69b1703b	Fixed a typo (no functional change) (#3433 ) s/initailizer/initializer/	2020-04-06 13:46:17 -07:00
Ye Wang	4ebad8805b	change (#3431 )	2020-04-06 11:30:21 -07:00
Changming Sun	0dcc6035b1	Disable strong inline (#3399 ) To bypass a MSVC bug. Without this change, people can't use VS2017 to build onnxruntime in Release or RelWithDebInfo mode.	2020-04-06 11:19:09 -07:00
Sherlock	a3ab2ba036	Reapply commit 131c65d; Fix memory regression issue. (#3423 ) * Reapply commit `131c65d` * fix merge error	2020-04-06 10:29:31 -07:00
Yang Chen	d361121d98	Do not inline ExternOp's scalar tensor inputs (#3426 ) An ExternOp's input needs buffers, so we cannot add compute_inline schedule on it even if it's a scalar tensor. Instead, we need to schedule it as compute_root.	2020-04-05 18:35:09 -07:00
Tiago Koji Castro Shibata	517693a507	Fix race condition creating ConverterResourceStore (#3419 )	2020-04-04 20:10:07 -07:00
Changming Sun	33006f48c0	Update onnx submodule to 1.7.0 release candidate (#3405 ) Update onnx submodule to 1.7.0 release candidate. This isn't a release tag, but it will be released soon, in 1-2 weeks.	2020-04-04 16:23:42 -07:00
Tracy Sharpe	d4d19a75ba	Use MlasConv for 1D convolutions (#3425 ) Use the existing 2D convolution code in MlasConv to also handle 1D convolutions.	2020-04-04 09:43:10 -07:00
Jesse Benson	5835349614	Add #pragma once to providers.h, so avoid 'struct' redefinition error when including the header from multiple places.	2020-04-03 16:25:18 -07:00
edgchen1	82c1e1b3db	Enable loss scale input from Python frontend (#3327 ) Made some fixes to enable loss scale to be wired up to ORT from the Python frontend. In particular, now addition of loss scaling is done unconditionally if mixed precision is enabled. The generated loss scale input name is passed back to the frontend. Also fixed how inputs were added during the training graph configuration. Graph::SetInputs() was causing some issues - it seems to not be working correctly. Also added some mixed precision Python frontend tests. Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-04-03 16:02:14 -07:00
Sherlock	f437665360	Revert "Addressing PR comments (#3334 )" (#3412 ) This reverts commit `131c65d23d`.	2020-04-03 11:59:47 -07:00
Pranav Sharma	14f4c3e25f	Fix issue in construction of DummyArena. (#3416 )	2020-04-03 08:28:05 -07:00
Scott McKay	85131e760c	Enable upsample2x optimization for opset 11 Resize (#3388 ) * Enable use_nearest2x_optimization for opset 11 of Resize when possible	2020-04-03 17:36:11 +10:00
Thiago Crepaldi	d89e5d91a6	Disable GradientCheckerTest tests for GPU/Debug build (#3407 )	2020-04-03 01:01:58 +00:00
Thiago Crepaldi	675035b1a8	Disable GradientCheckerTest tests for GPU/Debug build (#3407 )	2020-04-02 18:00:54 -07:00
Pranav Sharma	3568f8d186	Allow a custom op with the same name to be registered for several providers. (#3400 )	2020-04-02 15:38:51 -07:00

11997 commits