onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-02 03:55:34 +00:00

Author	SHA1	Message	Date
David Brownell	72cd61baae	Removed use of parameters in python wheel build scripts (#3524 )	2020-04-15 10:31:14 -07:00
Yulong Wang	cf2fddf760	fix nuget build (#3532 )	2020-04-15 10:30:11 -07:00
Changming Sun	b63349c8d6	Fix custom op test failure (#3525 )	2020-04-14 20:36:42 -07:00
Adam Pocock	bc9a199b16	Renaming deviceNum to deviceId.	2020-04-14 20:35:03 -07:00
Adam Pocock	e9dc8954ac	Adding support for ACL and DML to the Java API.	2020-04-14 20:35:03 -07:00
Changming Sun	a2feb29b0d	Fix build break (#3528 ) Ignore some known test failures Install ONNX package before running Windows CI builds	2020-04-14 18:07:56 -07:00
Negin Raoof	e303f458e4	Add int64 input type for ReduceProd (#3507 ) * Add int64 input type * Fix for cuda * Fix linking * Cuda * Fixed missing registration * Fix registeration for opsets 1-11 * Adding reduce_matrix_rows for int64 * Update reduction_functions.cu * Revert cuda	2020-04-14 15:09:28 -07:00
Ori Levari	f564569a80	Adapter Model and Environment tests (#3469 ) Adapter Model and Environment tests winml test macro clean up and extension	2020-04-14 13:36:31 -07:00
Tiago Koji Castro Shibata	560f4c5b16	Make GPUTEST macro consistent among TAEF/googletest (#3518 )	2020-04-14 10:55:16 -07:00
Du Li	621b3ac03a	FFT contrib ops (#3381 ) * add custom op skeleton * Adding Rfft, Irfft kernels. * Fix a few errors: 1. make kernel stateless to avoid race condition 2. reclaim cufft plan * Adding MLFloat16 support * Adding fp16 support for fft ops. * Adding cufft plan cache. * adding a util func * adding copyright info. * Accommodating PR comments.	2020-04-14 10:12:04 -07:00
Yufeng Li	baa86f181f	Handle the case that initializers are in graph input (#3449 ) warn that initializers are in graph input provide a tool to move initializer out of graph input Motivation and Context ONNX model from IR_VERSION 4 only treats initializers that appear in graph input as non-constant. This may fail some of the graph optimizations, like const folding, operator fusion and etc. Warn the case and provide a tool.	2020-04-14 09:06:04 -07:00
David Brownell	006c5be1b1	Optionally produce a python wheel that includes featurizers (#3491 )	2020-04-14 09:00:13 -07:00
Changming Sun	040c28ff39	Remove dead code from HandleNegativeAxis	2020-04-14 01:01:15 -07:00
Colin Jermain	06db89cf13	Using logic for finding README.rst to find requirements.txt	2020-04-13 18:59:44 -07:00
Colin Jermain	43d9f9190e	Removing unused six package	2020-04-13 18:59:44 -07:00
Colin Jermain	c2c3102aba	Tying install_requires to requirements.txt	2020-04-13 18:59:44 -07:00
Ye Wang	66a79d2c9f	fix (#3512 )	2020-04-13 18:30:58 -07:00
Dmitri Smirnov	efd9b92482	Handle Scalars in TernaryOps and Where. (#3509 ) Handle Scalars in TernaryOps and Where.	2020-04-13 16:24:35 -07:00
Ye Wang	cbe30f3e19	update FeaturizersLibrary (#3511 )	2020-04-13 15:47:51 -07:00
Tracy Sharpe	5aab2671f8	Fix crash in DequantizeLinear with scalar tensor (#3508 )	2020-04-13 14:52:52 -07:00
Ye Wang	438353abcd	Fix TruncatedSVDFeaturizer's test failure and re-enable it's kernel test (#3458 ) * checkin * fix linux & macos build * fix test * revert the changes for a single-aimed PR * fix	2020-04-13 13:59:38 -07:00
Tianlei Wu	54bbbb78ae	Change mask_index input of Attention op to be optional (#3459 ) Change Mask Index to optional	2020-04-12 22:55:37 -07:00
George Wu	7f6e407e09	fix python packaging manylinux1 build break. (#3482 )	2020-04-11 06:58:22 +08:00
Ryan Lai	4223591043	Add automatic generation of tensors for Onnxruntime Perf Runner (#3448 ) * Add flag to enable automatic generation of input for models with tensor inputs * change wording of variable * Naming convention changes to variables * Handle free dimensions * Comment with default allocator * variable rename * Remove input_count * Cast to size_t to avoid warning Co-authored-by: Ryan Lai <ryalai96@gamil.com>	2020-04-10 11:54:17 -07:00
stevenlix	56e85484ba	Handle optional inputs and remove more empty shape nodes in TensorRT EP (#3455 ) * check optional inputs and remove more empty shape affected nodes * fix some minor issues * update code according to feedback	2020-04-10 11:13:38 -07:00
Tiago Koji Castro Shibata	d09d4a6b0d	Fix OS build (#3481 )	2020-04-09 21:46:01 -07:00
Pranav Prakash	95ade8f47b	Add check to prevent storing nullptr in value_info_ when proto has unused value info (#3461 ) * Add unit test for serialization of unused value_info * Do not add non-existent (nullptr) value_info_ when loading a model. Fixes #3430	2020-04-09 19:25:10 -07:00
Pranav Sharma	2ccedb7b4d	Improve error logging when a kernel cannot be found. (#3473 ) * Improve error logging when a kernel cannot be found. * Fix mac build	2020-04-09 19:24:46 -07:00
KeDengMS	739c9d4875	Always call cudaSetDevice at the beginning of session::Run (#3475 ) This is required for running multithreaded with multi-GPUs. Without it, when running in a work thread it would default to GPU 0, while CUDAExecutionProvider is assigned on other GPUs. That might cause CUDA crash when some CUDA resources is from GPU 0, while being used in GPU N>0.	2020-04-09 18:54:58 -07:00
Yufeng Li	a443b1b6b9	Revert "Use IMMA for int8 matmul to leverage Turing Tensor Core (#3413 )" (#3472 ) This reverts commit `4d71958ccf`. Revert the PR. Looks like it triggers a bug in nvcc and failes the GPU pipeline.	2020-04-09 15:59:52 -07:00
Scott McKay	40d80cde8f	Rework CDist (#3393 ) * Make CDist faster via Eigen squaredNorma and GEMM. * Add call to abs() as the GEMM output may differ slightly due to floating point accuracy and result in a negative distance which returns NaN if sqrt() is applied to it. * Update math::Gemm to use the type for alpha and beta instead of hardcoding to float. Matches the GemmEx definition. * Provide Eigen based replication of the GEMM call on x86 if T=double. * Make test model data deterministic. * Do the GEMM first so we can avoid potentially subtracting two numbers that are very close to each other.	2020-04-09 14:05:25 +10:00
Yulong Wang	718068f020	update C# API to optimize inference latency (#3171 ) * update C# API to optimize inference latency * rename PinnedOnnxValue to fixedBufferOnnxValue and fix build break * add more test cases * add conditions on string tensors for pre-allocated outputs * change to random inputs * fix word spell * resolve comments * resolve comments * remove FixedBufferOnnxValueTests.cs * fix trivial typos in doc	2020-04-08 11:57:40 -07:00
Pranav Sharma	cdac74b3c3	Use Eigen threadpool for ReduceSum and ReduceMean. (#3441 ) * Use Eigen threadpool for ReduceSum and ReduceMean. * Fix mac build	2020-04-08 11:50:22 -07:00
Ye Wang	f8fa1dde55	Add a list of Featurizers kernels (#3435 ) * wangye/pivot (#3432) * check in * work version * add ForecastingPivot kernel * fix mac os and linux build error * update FeaturizerLibrary Version * resolve comments * remove changes * Add Kernel for LagLeadOperator & RollingWindowFeaturizer (#3434) * update * update todo * resolve comments * relax eps for TruncatedSVD transformer * mute TruncatedSVD_transformer due to undeterministic test result * resolve comments * update * test * update * fix	2020-04-07 17:00:45 -07:00
Yufeng Li	4d71958ccf	Use IMMA for int8 matmul to leverage Turing Tensor Core (#3413 ) Use IMMA for int8 matmul to leverage Turing Tensor Core Format files under onnxruntime/core/providers/cude	2020-04-07 15:22:04 -07:00
Tracy Sharpe	de60a14c16	Fix output range for int8_t QuantizeLinear op (#3445 )	2020-04-07 15:01:20 -07:00
Yulong Wang	aabf47b107	Fix Split CUDA implementation for zero sized input (#2942 ) * Fix Split CUDA implementation for zero sized input * resolve comments * add case * test case update: split into 2 tensors	2020-04-07 14:44:20 -07:00
Scott McKay	48e96ea65f	Reduce binary size of Slice implementation (#3238 ) * Make the Slice implementation based on type sizes and reduce templatized code to a minimum. * Remove using 'dynamic' as a template param to Slice as well.	2020-04-08 07:19:29 +10:00
Dmitri Smirnov	53b9d52fc6	Rework TensorToTensorProto. Do not put string data to raw_string. Eliminate redundant argument. (#3438 ) Rework TensorToTensorProto. Eliminate redundant argument. Do not put string data into raw_data.	2020-04-07 11:42:10 -07:00
Andrews548	43d6c464fc	Fix ACL EP pooling build breakage (#3429 ) The commit `06fc9506fd` which refactored cpu Pool class broke ACL EP build. Also worked on the commit `a4fe60c4d3` as it also affects the new class. Move the declaration of the new MaxPoolV8 cpu class in the header file. Implement MaxPool 8-11 in ACL EP. Co-authored-by: Andrei-Alexandru <andrei-alexandru.avram@nxp.com>	2020-04-07 07:03:52 -07:00
Tianlei Wu	4bdb5cc8e2	Add CPU implementation for FastGelu operator (#3398 ) * Add CPU implementation for FastGelu operator * Update optimization script to fuse Gelu or FastGelu according to Elf or Tanh is used in graph. * Merge BiasGelu and FastGelu into one class * Enable FastGelu Fusion optimizer for CPU Execution Provider.	2020-04-07 00:19:30 -07:00
Changming Sun	9e65298d7a	Re-enable tests (#3437 ) Re-enable some tests that was recently fixed.	2020-04-06 20:13:34 -07:00
Tianlei Wu	8ab09186b7	Bert Optimization Script Improvements (#3387 ) Add opt_level option for graph optimization level in bert perf test. Support BERT models that output each layer, where SkipLayerNormalization has more than 4 children. Check weight and bias are 1D for layer norm fusion. Add a dummy class Gpt2OnnxModel for further changes of GPT2 model.	2020-04-06 16:55:40 -07:00
Dmitri Smirnov	c8f5e6e632	Implement Min/Max/Clip(12) (#3410 ) Implement Max/Min for opset 12. Add CLip(12) CPU impl. Implement Clip(12) for CPU and CUDA add tests	2020-04-06 14:24:59 -07:00
Yang Chen	7c69b1703b	Fixed a typo (no functional change) (#3433 ) s/initailizer/initializer/	2020-04-06 13:46:17 -07:00
Ye Wang	4ebad8805b	change (#3431 )	2020-04-06 11:30:21 -07:00
Changming Sun	0dcc6035b1	Disable strong inline (#3399 ) To bypass a MSVC bug. Without this change, people can't use VS2017 to build onnxruntime in Release or RelWithDebInfo mode.	2020-04-06 11:19:09 -07:00
Yang Chen	d361121d98	Do not inline ExternOp's scalar tensor inputs (#3426 ) An ExternOp's input needs buffers, so we cannot add compute_inline schedule on it even if it's a scalar tensor. Instead, we need to schedule it as compute_root.	2020-04-05 18:35:09 -07:00
Tiago Koji Castro Shibata	517693a507	Fix race condition creating ConverterResourceStore (#3419 )	2020-04-04 20:10:07 -07:00
Changming Sun	33006f48c0	Update onnx submodule to 1.7.0 release candidate (#3405 ) Update onnx submodule to 1.7.0 release candidate. This isn't a release tag, but it will be released soon, in 1-2 weeks.	2020-04-04 16:23:42 -07:00

2077 commits