onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-04 23:59:56 +00:00

Author	SHA1	Message	Date
Changming Sun	c24d7a8a0a	Update eigen to the latest version (#1910)	2019-10-11 10:44:19 -07:00
Scott McKay	bdfff800ea	Move access to intra-op threadpool into OpKernelContext. (#2091)	2019-10-11 10:36:20 -07:00
Changming Sun	368bdfd936	Update README.md (#2070). Update the vcredist package link. Note: Visual C++ 2015, 2017 and 2019 all share the same redistributable files.	2019-10-11 10:06:50 -07:00
Hector Li	3b335c933f	Fix issue that TRT does not work for devices other than device id 0, because the allocation planner needs to get the default allocator to allocate memory for graph input data. (#2094)	2019-10-11 09:22:25 -07:00
Scott McKay	ffb94fd170	Fix bug with delayed allocation of If and Scan outputs. (#2024) If the subgraph is producing output on a non-CPU device the delayed allocation was incorrectly providing a CPU allocated tensor. Check for the required location, and update 'fetches' instead if a device copy is needed. The utils::ExecuteGraph logic will handle the device copy in this case.	2019-10-11 19:49:21 +10:00
Yang Chen	ca1b88c069	Added support to infer Pad11 (#2085) * address CR	2019-10-10 23:18:49 -07:00
shahasad	8803f6fff4	C# end to end test fix, and make end to end tests mandatory (#2079)	2019-10-10 19:23:43 -07:00
Changming Sun	a314402097	Downgrade python gpu package to CUDA 10.0 (#2086)	2019-10-10 18:31:24 -07:00
Dmitri Smirnov	af9dbb70f2	Introduce a separate check and conditional for AVX512BW build (#2083) Separate checks for AVX512F and AVX512BW. Make AVX512BW cmake instructions nested within AVX512F support.	2019-10-10 16:14:00 -07:00
Hariharan Seshadri	2ba705ed99	Handle nodes with subgraphs in ORT function handling implementation (#2053) * Initial commit * Update * Update * Nits * More updates * to be reverted * Update * Update * More changes * Updates * Update Function * Nits * Fix build break * Comment	2019-10-10 16:07:42 -07:00
Pranav Sharma	2d4d0abd36	Add support for output seq(tensor) in python binding and test framework. Implement SequenceConstruct, SequenceEmpty, SequenceInsert and SequenceErase ops. (#2040)	2019-10-10 15:58:49 -07:00
Scott McKay	ddbc2086e4	Add support for opset 11 Clip in optimizers. (#2059)	2019-10-10 10:47:29 -07:00
Yulong Wang	a41c71cbf2	check and fix CUDA kernel launch errors in several OPs (#2047)	2019-10-10 23:47:00 +08:00
baowenlei	b4a98aab78	change MatMulInteger/MatMulInteger16 fallback option (#2064) * change MatMulInteger/MatMulInteger16 fallback option when no initializer exists * add AVX option * fix condition for old machines	2019-10-09 22:03:21 -07:00
Hariharan Seshadri	d186c19c45	Add opset-11 TopK CPU kernel (#1912) * initial commit * Update * Update top_k.cc * PR comments * Add more tests * Update * Add another test case * Update * Resolve conflicts * Update * Nits * Nits * Nits * Pick sorted content using 2 different approaches * Update to logic * PR comments * PR feedback * Update * Fix build * Fix build * Update	2019-10-09 19:09:30 -07:00
Colin Versteeg	8fda6593fe	Update failing tests (#2038) * Fix failing tests from when they were not enabled * split into two * fix failing test	2019-10-09 15:17:21 -07:00
Tracy Sharpe	57e0099425	MLAS: Implement U8S8 GEMV kernels (#2069) This implements an optimization for U8S8 MlasGemm when M=1, aka GEMV.	2019-10-09 11:54:16 -07:00
Changming Sun	eee9c55030	C++11 fix for memcpy_transformer_test.cc (#2061)	2019-10-09 10:52:10 -07:00
Changming Sun	cefae93305	Add a test case for linearregressor (#1962)	2019-10-09 10:17:08 -07:00
Changming Sun	ccaf692ff2	Run auditwheel for manylinux1 (#2063)	2019-10-09 09:23:00 -07:00
Dmitri Smirnov	cae571c713	Add a test for AVX512 compilation before compiling 512 asm (#2055)	2019-10-08 21:18:04 -07:00
Changming Sun	af8fe0f980	Replace make_unique in cuda_utils.cu (#2052)	2019-10-08 18:32:08 -07:00
Scott McKay	db0dd09ded	Cleanup some aspects of the Initializer class used by optimizers (#2005) * Move check on data type outside of the Initializer class as it's specific to Conv processing. Use references for arguments that can't be null.	2019-10-09 10:37:44 +10:00
Changming Sun	a00ca56ae1	Remove gcc from manylinux1 docker image (#2048)	2019-10-08 13:49:15 -07:00
baowenlei	b82de794d5	Weba/update nuphar doc (#2026) * update nuphar xp doc * address comments * address CR * update doc	2019-10-08 12:41:25 -07:00
RandySheriffH	f501b6e234	pack pyop in nightly build (#2018) * correct logic * add comment * exclude debug build * add dependency * reset postbuild rule * remove dep	2019-10-08 12:02:45 -07:00
Changming Sun	e9bed8b23b	Change python packaging pipeline to use manylinux1 (#2035) 1. Change the python packaging pipeline to use manylinux1 2. Temporarily disable model test in the python pipeline.	2019-10-08 10:03:54 -07:00
Changming Sun	3053af812c	Fix a crash in deep_cpu_gru_op_test.cc (#2028)	2019-10-08 10:03:07 -07:00
Zhang Lei	71b389322e	Implement cuda scatter op. (#1991) * Disable Invalid Index of Scatter op only for cuda provider. * Fix type-narrowing warnings treated as errors in some pipelines.	2019-10-08 09:53:33 -07:00
Yang Chen	a94c9bd88d	throw exception using dmlc::LogMessageFatal (#2033) * On Windows, ORT_THROW couldn't be caught if the exception was thrown from a jitted function. Let's call dmlc::LogMessageFatal instead. * address CR: use LOG(FATAL)	2019-10-08 09:31:35 -07:00
Yang Chen	19b0d0af87	Enabled bool input type for Equal for op_ver 11 (#2034) This change enabled bool type for Equal-11's inputs	2019-10-08 01:50:37 -07:00
Yang Chen	203c2f5b59	updated reduce_ops for op_ver 11 (#2039) After enabling op_ver 11 for reduce ops, we need to check axes to make sure it's not empty.	2019-10-08 01:05:05 -07:00
Pranav Sharma	f13b66768a	Fix build for gcc 4.8.5. (#2036)	2019-10-08 00:50:53 -07:00
shahasad	b70fc34fae	Fix C# end to end tests in NuGet pipeline, failing for missing test data file	2019-10-07 20:14:20 -07:00
shahasad	b0feaef9de	Update the C# pretrained model test to include opset9 and 10 models (#2003)	2019-10-07 19:14:34 -07:00
George Wu	0bd807f3b3	trt provider status return cleanup (#2032) * status and code cleanup. * revert change. seems like a bug in TRT causes intermittent failure return?	2019-10-07 18:34:48 -07:00
Tianlei Wu	b2c1937523	Add EmbedLayerNormalization and SkipLayerNormalization ops for bert optimization (#2012) * add float16 test for skiplayernorm * Add test for EmbedLayerNormalization op * fix cpu build error * fix build warning * update HasCudaEnvironment function * handle cuda error	2019-10-07 17:29:43 -07:00
Changming Sun	8f7657fa32	Ignore some gcc warnings (#1996)	2019-10-07 16:32:34 -07:00
Pranav Sharma	ea60469af5	Support seq(tensor), implement 2 sequence ops that use the new type. (#1983 ) * Mention OrtCreateSessionFromArray in C API doc * fix seq of tensors * changes on 9/30 * All tests passing * Add SequenceAt op * Fix shared_lib non_tensor_types test * Address some PR comments * Address PR comments * Add support in python bindings to accept seq(tensor) * Change data type from vector<Tensor> to TensorSeq * Change data type from vector<Tensor> to TensorSeq * Added some documentation * Added missing test model * Fix Linux build * Fix Mac build * Fix Mac build	2019-10-07 15:35:09 -07:00
Hector Li	00e24ae4fe	refactor Cuda Ops Sum, Max, Min, remove dup code (#1946)	2019-10-07 13:17:49 -07:00
Tianlei Wu	7b39f5090c	Add Attention op for multi-head self attention in BERT (#1984) * Add test cases * Move op from kOnnxDomain to kMSDomain. Limit test to run by CUDA provider only. * fix test * Add float16 test * fix cpu build error * handle cuda error * get last cuda error when failed	2019-10-07 12:22:54 -07:00
Yang Chen	7d2f0c79bd	Bumped up to op_ver 11 for a bunch of Nuphar Ops (#2025) This change enabled op_ver 11 for a dozen Nuphar ops	2019-10-07 10:34:05 -07:00
Changming Sun	3c26ae5b6d	ThreadPool fix for roialign and CropAndResize (#2020)	2019-10-06 22:43:59 -07:00
Pranav Sharma	4cdb95e436	Resort to sequential execution if the inter op thread pool ptr is nullptr (#2023)	2019-10-06 16:08:41 -07:00
stevenlix	544e53e24e	Update TensorRT to version 6.0.1.5 (#1966) * remove onnx-tensorrt submodule * add new onnx-tensorrt submodule (experiment) for trt6 * update engine build for trt6 * update compile and compute for tensorrt6.0 * Update tensorrt_execution_provider.cc * Update tensorrt_execution_provider.cc * Update tensorrt_execution_provider.cc * Update tensorrt_execution_provider.cc * switch to onnx-tensorrt master for TensorRT6 * Update tensorrt_execution_provider.cc * Handle dynamic batch size and add memcpy in TensorRT EP * update test cases * Update tensorrt_execution_provider.cc * update onnx-tensorrt submodule * Update Dockerfile.ubuntu_tensorrt * Update Dockerfile.ubuntu_tensorrt * Update run_dockerbuild.sh * Update run_dockerbuild.sh * Update install_ubuntu.sh * Update concat_op_test.cc * Update tensorrt_execution_provider.cc * Upgrade TensorRT to version 6.0.1.5 * Update onnxruntime_providers.cmake * Update CMakeLists.txt * Update reduction_ops_test.cc * Update install_ubuntu.sh * Update Dockerfile.ubuntu_tensorrt * Update Dockerfile.tensorrt * Update BUILD.md * Update run_dockerbuild.sh * Update install_ubuntu.sh * Update onnxruntime_providers.cmake * Update install_ubuntu.sh * Update install_ubuntu.sh * Update gemm_test.cc * Update gather_op_test.cc * Update CMakeLists.txt * Removed submodule * update onnx-tensorrt submodule * Add Ubuntu18.04 build option * Add Ubuntu18.04 build option * Add Ubuntu18.04 build option * Add Ubuntu18.04 build option * Remove redundancy * Fix issue that it does not add memcopy node correctly if some nodes fall back to CUDA EP. e.g. after partition, there's TRT_Node -> Cuda_node (with CPU memory expected), we still need to add memcpy node between them. * update for Trt Windows build * Update onnxruntime_providers.cmake * Disable opset11 tests on TensorRT * Update pad_test.cc * Update build.py * update scripts for ubuntu18.04 * Disable warning for Windows build	2019-10-06 10:40:53 -07:00
baowenlei	4bb6385dca	Weba/merge ngemm (#2021) * save status: add tiling layout; add avx512 skylake cpuid info * unit tests and matmul integer model passed on skylake, need to verify model * save commit before update master * fix check * address comments	2019-10-05 12:09:22 -07:00
Xavier Dupré	0b5aac0a2e	fix python setup (#2022)	2019-10-05 09:46:41 -07:00
Yang Chen	e8285a7996	Added GatherElements to Nuphar (#2016) * This change added GatherElements (op_ver 11) to the Nuphar provider. * address CR feedback * create a utility function for accessing index safely * address more CR * SafeIndex -> ClampIndex	2019-10-04 23:53:02 -07:00
Colin Versteeg	1ba76c5f74	add support for empty version and score route (#1995)	2019-10-04 22:53:11 -07:00
Changming Sun	a9e04a29b3	Ignore a test: ParallelExecutor.StatusPropagation (#2019)	2019-10-04 22:51:47 -07:00


1388 commits