onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-18 18:52:16 +00:00

Author	SHA1	Message	Date
nietras	1dd920fa7c	Fix TensorRT unnecessary file cache operations (#6601 ) * Fix TensorRT unnecessary file cache operations * fix compile	2021-02-07 20:09:30 -08:00
Edward Chen	19c130f561	Reduce CastMLFloat16ThroughFloat size (Scott's suggested changes), fix unused function warning. (#6597 )	2021-02-08 07:20:53 +10:00
Scott McKay	190b90a682	Fix some coding conventions issues (#6583 ) Fix some coding conventions issues Use #define for types that Cast supports	2021-02-08 07:11:26 +10:00
Weixing Zhang	c86c21e002	Generate error when an explicit stream argument is not provided in the <<<...>>> kernel launch syntax (#6599 ) * Generate error when an explicit stream argument is not provided in the <<<...>>> kernel launch syntax Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2021-02-06 15:54:29 -08:00
Jesse Benson	d18aa45b46	Enable more ROCM ops that are sharing CUDA code. Some are needed for Turing NLG models.	2021-02-06 14:40:34 -08:00
Adam Pocock	dbe31361bc	Fix build.gradle so it always targets Java 8 class files.	2021-02-05 22:26:17 -08:00
Ryan Lai	b57a7f4de3	Delay load dxcore in winml model tests	2021-02-05 21:08:11 -08:00
George Nash	b50b0a89aa	Fix build failure when building with --build_wheel on Windows This resolves issue #6536 Signed-off-by: George Nash <george.nash@intel.com>	2021-02-05 18:59:01 -08:00
Nat Kershaw (MSFT)	af9dfa7a4d	Remove docs that have been migrated to https://onnxruntime.ai/docs (#6225 )	2021-02-05 18:09:27 -08:00
Dmitri Smirnov	dda5a62072	Fix updated Doxygen errors. (#6588 )	2021-02-05 18:07:03 -08:00
Chun-Wei Chen	115e16b37b	ort_test_utils: skip creating input if it is an initializer (#6544 )	2021-02-05 17:34:08 -08:00
Scott McKay	ccfd90291b	Remove condition from ORT_RETURN_IF[_NOT] macro output. (#6563 ) Remove condition from ORT_RETURN_IF[_NOT] macro output as repeating the condition doesn't add much value compared to the explicit error message, and the error message includes the file and line anyway so it's easy enough to find the condition if needed. Update the few places where the macros were used without an explicit error message to provide an explicit error message. Saves 12.5KB in a minimal MinSizeRel build with all DNN ops, 16KB in full release build.	2021-02-05 17:33:29 -08:00
Changming Sun	b5bd14fc9f	Update GPU packaging pipelines to cuda11 and fix the other build break issues (#6585 ) Update gpu packaging pipelines to CUDA11 In the next release we will use CUDA 11. And our CUDA 11 build suddenly became broken because recently CentOS 7 posted an update of glibc. The version of glibc was changed from 2.17-317.el7 to 2.17-322.el7_9. But the newer one isn't compatible with CUDA 11. We have to downgrade it.	2021-02-05 16:58:37 -08:00
Ye Wang	82229c8e61	Support no bias in layernorm and skiplayernorm op (#6554) * add noBias attribute in layernorm * skip bias in skiplayernorm * fix * fix cuda tests * add tests * fix windows build * fix win build issue * review comments	2021-02-05 16:48:22 -08:00
Weixing Zhang	299ace0759	Support to allow user to specify compute stream per session (#3723) * Support to allow user to specify compute stream per session Create computation cuda stream explicitly rather than use default legacy stream or per-thread default stream. remove some redundant cudaStreamSynchronize fix gpt2 model test failures don't use default stream in nccl either. add stream synchronization in OnRunEnd() using cub::DeviceScan::InclusiveSum which can be called with stream specified. fix topK failure due to latest rebase fix tensorrt support user specified stream add user_stream support in tensorrt EP use same stream for both tensorrt and CUDA EP. fix ScatterND specify stream for adasum and p2p kernels. fix loop fix CApiTest.custom_op_handler fix CApiTest.varied_input_custom_op_handler change for cudaMemcpyFromSymbol improve provider options for user specified compute stream * add changes for ROCM EP * fix GatherGrad UT for ROCM EP * clean code and fix NonMaxSuppression * use default stream for ROCM now * fix CApiTest.custom_op_handler:OrtFormatCustomOpTests.ConvertOnnxModelToOrt * fix tensorrt ut: CApiTest.io_binding_cuda Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2021-02-05 15:48:18 -08:00
sfatimar	973c3917a6	OpenVino add build_shared_lib flag in the build command (#6560) * Dockerfile changes to add build_shared_lib 2021_1 indentation changes * csharp shared library Co-authored-by: sfatimar <sahar.fatima@intel/com>	2021-02-05 12:18:02 -08:00
Guoyu Wang	68193e28de	Let execution fall back to CPU EP if Compile of a partition on current EP fails (#6580) * Let execution fall back to CPU EP if compile of a partition fails * Removed debugging logs * Addressed CR comments	2021-02-05 12:14:55 -08:00
Chun-Wei Chen	f2ce3aae13	add set_model_dir and update ONNX (#6119 )	2021-02-05 09:30:49 -08:00
Edward Chen	3b376da37c	Enable type reduction for Gather CPU kernel. (#6579 ) * Enable type reduction in Gather.	2021-02-05 17:22:22 +10:00
Scott McKay	c5d2538314	Add more kernels that have typed registrations to the operators we track type usage for. (#6565 )	2021-02-05 15:10:54 +10:00
Hariharan Seshadri	f14c621c10	Tile perf enhancements - continued (#6561 )	2021-02-04 20:14:27 -08:00
Scott McKay	c49d1dbc4b	Add type reduction support to Slice and Transpose (#6547 ) * Add type reduction support to Slice and Transpose	2021-02-05 11:08:23 +10:00
Yulong Wang	89627a8178	[Node.js binding] support NPM v7+ (#6559 )	2021-02-04 17:07:06 -08:00
Xavier Dupré	615acf156c	remove keras example from python documentation (#6574 )	2021-02-05 01:10:11 +01:00
Prasanth Pulavarthi	4e61e254ec	Update link in readme (#6537 )	2021-02-04 15:28:39 -08:00
Jesse Benson	d914e29fe1	Reuse reduction_functions.cu	2021-02-04 15:00:05 -08:00
Jesse Benson	3c44184963	Pick up changes from: https://github.com/microsoft/onnxruntime/pull/6490	2021-02-04 15:00:05 -08:00
Jesse Benson	a9e4d70b50	Fix merge conflict.	2021-02-04 15:00:05 -08:00
Jesse Benson	76fcebd0a4	Fix scratch buffer early free.	2021-02-04 15:00:05 -08:00
Jesse Benson	86ac11af1a	Delete ROCM-specific reduction code that is identical to CUDA reduction code.	2021-02-04 15:00:05 -08:00
Jesse Benson	5d8792705b	Code formatting.	2021-02-04 15:00:05 -08:00
Jesse Benson	21a47ec8d9	Disable a couple more unsupported tests.	2021-02-04 15:00:05 -08:00
Jesse Benson	0b147702af	Update remaining reduction ops to use MIOpen. double datatype is not supported, so disable those typed kernels.	2021-02-04 15:00:05 -08:00
Jesse Benson	a28ddb85b6	Reduction ops.	2021-02-04 15:00:05 -08:00
Jesse Benson	196132925e	Reuse CUDA's reduction_functions.cc	2021-02-04 15:00:05 -08:00
Jesse Benson	4c1db50df5	miopen common	2021-02-04 15:00:05 -08:00
Jesse Benson	554184bcc4	Add reduce template parameters.	2021-02-04 15:00:05 -08:00
Jesse Benson	c4b6559be9	Update reduction_all.cu	2021-02-04 15:00:05 -08:00
Jesse Benson	5fc377f21e	Partial updating of ROCM reduction code.	2021-02-04 15:00:05 -08:00
Edward Chen	318b82ca7e	Cast Op performance fix. (#6509 ) Update CPU Cast implementation to fix performance regressions. Update Cast unit tests for more coverage.	2021-02-04 14:52:37 -08:00
Edward Chen	2ef792ae6e	Don't resolve symlink in resolve_executable_path(). (#6540 )	2021-02-04 12:32:03 -08:00
Changming Sun	aa31ba5774	Merge CPU packaging pipelines (#6480 ) 1. Merge Nuget CPU pipeline, Java CPU pipeline, C-API pipeline into a single one. 2. Enable compile warnings for cuda files(*.cu) on Windows. 3. Enable static code analyze for the Windows builds in these jobs. For example, this is our first time scanning the JNI code. 4. Fix some warnings in the training code. 5. Enable code sign for Java. Previously we forgot it. 6. Update TPN.txt to remove Jemalloc.	2021-02-04 08:38:56 -08:00
Guoyu Wang	0d35f0e2c0	[CoreML EP] Add support of Conv operator (#6510) * [CoreML EP] Add support of Conv operator * Ignore a corner case setting empty padding * Add handle autopadding * Addressed CR comments	2021-02-04 00:30:10 -08:00
Guoyu Wang	6cf54ff296	Switch Android CI java build to JDK 11 (#6552 ) * switch to jdk11 * fix java * Update	2021-02-03 17:49:23 -08:00
Ryan Lai	c7feb48083	Don't send out Runtime error telemetry when can't create LearningModelDevice on machine without hardware adapters (#6535 ) * Checkoutpoint 1 * Remove global logruntime error telemetry. This isn't necessary and doesn't contain relevant information * Make macro simpler Co-authored-by: Ryan Lai <ryalai96@gamil.com>	2021-02-03 14:27:29 -08:00
Guoyu Wang	464dbef143	[NNAPI EP] add uint8 support for Transpose/Concat/Maxpool, add support of QLinearSigmoid (#6534 ) * Init change * Add QlinearSigmoid support * Update tests * Add resize int8 support * Add version check for resize linear uint8 and add scale/zero point check for concat uint8 * Address CR comments * minor fix and add test for uint8 handling * Address CR comments * Fixed an existing bug * Fix the new UT break, due to different rounding of 0.5 in device and emulator	2021-02-03 13:45:49 -08:00
Scott McKay	6cb8f8c812	Support disabling a typed kernel registration that uses the output type (#6530 ) * Update infrastructure to support disabling a typed kernel registration that uses output 0 for the type (vs. the normal use case of input 0).	2021-02-03 14:22:32 +10:00
Scott McKay	8d53ef69e5	Add type reduction support to Min, Max and Pow (#6519 ) * Add type reduction support to Min, Max and Pow Update the C++ type reduction infrastructure to allow specifying an opset for the supported types list, as those can change across opset versions. Minor updates to the type usage tracking script * Add 'all opsets' macros and constant	2021-02-03 06:51:35 +10:00
Thiago Crepaldi	fbb24b57d0	Update code owners for pytorch frontend team (#6329 )	2021-02-02 11:09:10 -08:00
ashbhandare	85434273ff	Fix CUDA Reduction kernel for ArgMax/ArgMin for when reduction dim=1 (#6490) * Fix for when reduction dim=1 * Disable test for AMD GPUs * Specify Async	2021-02-02 09:50:16 -08:00

4179 commits