onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-18 18:52:16 +00:00

Author	SHA1	Message	Date
Yufeng Li	56e4e47f66	Quantize model with QDQ format (#6541 ) * implement qdq format in quant tool * refactor code	2021-02-08 18:46:07 -08:00
Scott McKay	c02ae61cab	Make kernel hash stable in type reduced build (#6603 ) * Add infrastructure so that a kernel definition has the full list of supported types and a list of types enabled in this build. We need to use the full list when calculating the kernel hash so that the hash value in an ORT format model is stable across builds with and without type reduction enabled.	2021-02-09 12:00:08 +10:00
Cian Hayes	16eed68a1e	Fix layer_norm.cc on x86 (#6556 ) * Fix LayerNormGrad on x86 * PR feedback	2021-02-08 17:36:14 -08:00
Scott McKay	13d7db9a98	Don't update the excluded ops/types unless args.update is true. Updating the exclusion info triggers rebuilding of all kernels using type reduction. (#6604 )	2021-02-09 07:15:31 +10:00
Scott McKay	0b1e21c638	Fix bug with ORT format serialization of tensor attributes. (#6602 ) The model path needs to come from the Node not the potentially nullptr subgraph.	2021-02-09 07:15:21 +10:00
Pranav Sharma	67ef6b1aa6	[Multi-GPU inferencing] Add new API to get/set device id. Set correct device id in cuda allocator. (#6592 )	2021-02-08 08:59:18 -08:00
nietras	1dd920fa7c	Fix TensorRT unnecessary file cache operations (#6601 ) * Fix TensorRT unnecessary file cache operations * fix compile	2021-02-07 20:09:30 -08:00
Edward Chen	19c130f561	Reduce CastMLFloat16ThroughFloat size (Scott's suggested changes), fix unused function warning. (#6597 )	2021-02-08 07:20:53 +10:00
Scott McKay	190b90a682	Fix some coding conventions issues (#6583 ) Fix some coding conventions issues Use #define for types that Cast supports	2021-02-08 07:11:26 +10:00
Weixing Zhang	c86c21e002	Generate error when an explicit stream argument is not provided in the <<<...>>> kernel launch syntax (#6599 ) * Generate error when an explicit stream argument is not provided in the <<<...>>> kernel launch syntax Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2021-02-06 15:54:29 -08:00
Jesse Benson	d18aa45b46	Enable more ROCM ops that are sharing CUDA code. Some are needed for Turing NLG models.	2021-02-06 14:40:34 -08:00
Adam Pocock	dbe31361bc	Fix build.gradle so it always targets Java 8 class files.	2021-02-05 22:26:17 -08:00
Ryan Lai	b57a7f4de3	Delay load dxcore in winml model tests	2021-02-05 21:08:11 -08:00
George Nash	b50b0a89aa	Fix build failure when building with --build_wheel on Windows This resolves issue #6536 Signed-off-by: George Nash <george.nash@intel.com>	2021-02-05 18:59:01 -08:00
Nat Kershaw (MSFT)	af9dfa7a4d	Remove docs that have been migrated to https://onnxruntime.ai/docs (#6225 )	2021-02-05 18:09:27 -08:00
Dmitri Smirnov	dda5a62072	Fix updated Doxygen errors. (#6588 )	2021-02-05 18:07:03 -08:00
Chun-Wei Chen	115e16b37b	ort_test_utils: skip creating input if it is an initializer (#6544 )	2021-02-05 17:34:08 -08:00
Scott McKay	ccfd90291b	Remove condition from ORT_RETURN_IF[_NOT] macro output. (#6563 ) Remove condition from ORT_RETURN_IF[_NOT] macro output as repeating the condition doesn't add much value compared to the explicit error message, and the error message includes the file and line anyway so it's easy enough to find the condition if needed. Update the few places where the macros were used without an explicit error message to provide an explicit error message. Saves 12.5KB in a minimal MinSizeRel build with all DNN ops, 16KB in full release build.	2021-02-05 17:33:29 -08:00
Changming Sun	b5bd14fc9f	Update GPU packaging pipelines to cuda11 and fix the other build break issues (#6585 ) Update gpu packaging pipelines to CUDA11 In the next release we will use CUDA 11. And our CUDA 11 build suddenly became broken because recently CentOS 7 posted an update of glibc. The version of glibc was changed from 2.17-317.el7 to 2.17-322.el7_9. But the newer one isn't compatible with CUDA 11. We have to downgrade it.	2021-02-05 16:58:37 -08:00
Ye Wang	82229c8e61	Support no bias in layernorm and skiplayernorm op (#6554 ) * add noBias attribute in layernorm * skip bias in skiplayernorm * fix * fix cuda tests * add tests * fix windows build * fix win build issue * review comments	2021-02-05 16:48:22 -08:00
Weixing Zhang	299ace0759	Support to allow user to specify compute stream per session (#3723 ) * Support to allow user to specify compute stream per session Create computation cuda stream explicitly rather than use default legacy stream or per-thread default stream. remove some redundant cudaStreamSynchronize fix gpt2 model test failures don't use default stream in nccl either. add stream synchronization in OnRunEnd() using cub::DeviceScan::InclusiveSum which can be called with stream specified. fix topK failure due to latest rebase fix tensorrt support user specified stream add user_stream support in tensorrt EP use same stream for both tensorrt and CUDA EP. fix ScatterND specify stream for adasum and p2p kernels. fix loop fix CApiTest.custom_op_handler fix CApiTest.varied_input_custom_op_handler change for cudaMemcpyFromSymbol improve provider options for user specified compute stream * add changes for ROCM EP * fix GatherGrad UT for ROCM EP * clean code and fix NonMaxSuppression * use default stream for ROCM now * fix CApiTest.custom_op_handler:OrtFormatCustomOpTests.ConvertOnnxModelToOrt * fix tensorrt ut: CApiTest.io_binding_cuda Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2021-02-05 15:48:18 -08:00
sfatimar	973c3917a6	OpenVino add build_shared_lib flag in the build command (#6560 ) * Dockerfile changes to add build_shared_lib 2021_1 indentation changes * csharp shared library Co-authored-by: sfatimar <sahar.fatima@intel/com>	2021-02-05 12:18:02 -08:00
Guoyu Wang	68193e28de	Let execution fall back to CPU EP if Compile of a partition on current EP fails (#6580 ) * Let execution fall back to CPU EP if compile of a partition fails * Removed debugging logs * Addressed CR comments	2021-02-05 12:14:55 -08:00
Chun-Wei Chen	f2ce3aae13	add set_model_dir and update ONNX (#6119 )	2021-02-05 09:30:49 -08:00
Edward Chen	3b376da37c	Enable type reduction for Gather CPU kernel. (#6579 ) * Enable type reduction in Gather.	2021-02-05 17:22:22 +10:00
Scott McKay	c5d2538314	Add more kernels that have typed registrations to the operators we track type usage for. (#6565 )	2021-02-05 15:10:54 +10:00
Hariharan Seshadri	f14c621c10	Tile perf enhancements - continued (#6561 )	2021-02-04 20:14:27 -08:00
Scott McKay	c49d1dbc4b	Add type reduction support to Slice and Transpose (#6547 ) * Add type reduction support to Slice and Transpose	2021-02-05 11:08:23 +10:00
Yulong Wang	89627a8178	[Node.js binding] support NPM v7+ (#6559 )	2021-02-04 17:07:06 -08:00
Xavier Dupré	615acf156c	remove keras example from python documentation (#6574 )	2021-02-05 01:10:11 +01:00
Prasanth Pulavarthi	4e61e254ec	Update link in readme (#6537 )	2021-02-04 15:28:39 -08:00
Jesse Benson	d914e29fe1	Reuse reduction_functions.cu	2021-02-04 15:00:05 -08:00
Jesse Benson	3c44184963	Pick up changes from: https://github.com/microsoft/onnxruntime/pull/6490	2021-02-04 15:00:05 -08:00
Jesse Benson	a9e4d70b50	Fix merge conflict.	2021-02-04 15:00:05 -08:00
Jesse Benson	76fcebd0a4	Fix scratch buffer early free.	2021-02-04 15:00:05 -08:00
Jesse Benson	86ac11af1a	Delete ROCM-specific reduction code that is identical to CUDA reduction code.	2021-02-04 15:00:05 -08:00
Jesse Benson	5d8792705b	Code formatting.	2021-02-04 15:00:05 -08:00
Jesse Benson	21a47ec8d9	Disable a couple more unsupported tests.	2021-02-04 15:00:05 -08:00
Jesse Benson	0b147702af	Update remaining reduction ops to use MIOpen. double datatype is not supported, so disable those typed kernels.	2021-02-04 15:00:05 -08:00
Jesse Benson	a28ddb85b6	Reduction ops.	2021-02-04 15:00:05 -08:00
Jesse Benson	196132925e	Reuse CUDA's reduction_functions.cc	2021-02-04 15:00:05 -08:00
Jesse Benson	4c1db50df5	miopen common	2021-02-04 15:00:05 -08:00
Jesse Benson	554184bcc4	Add reduce template parameters.	2021-02-04 15:00:05 -08:00
Jesse Benson	c4b6559be9	Update reduction_all.cu	2021-02-04 15:00:05 -08:00
Jesse Benson	5fc377f21e	Partial updating of ROCM reduction code.	2021-02-04 15:00:05 -08:00
Edward Chen	318b82ca7e	Cast Op performance fix. (#6509 ) Update CPU Cast implementation to fix performance regressions. Update Cast unit tests for more coverage.	2021-02-04 14:52:37 -08:00
Edward Chen	2ef792ae6e	Don't resolve symlink in resolve_executable_path(). (#6540 )	2021-02-04 12:32:03 -08:00
Changming Sun	aa31ba5774	Merge CPU packaging pipelines (#6480 ) 1. Merge Nuget CPU pipeline, Java CPU pipeline, C-API pipeline into a single one. 2. Enable compile warnings for cuda files(*.cu) on Windows. 3. Enable static code analyze for the Windows builds in these jobs. For example, this is our first time scanning the JNI code. 4. Fix some warnings in the training code. 5. Enable code sign for Java. Previously we forgot it. 6. Update TPN.txt to remove Jemalloc.	2021-02-04 08:38:56 -08:00
Guoyu Wang	0d35f0e2c0	[CoreML EP] Add support of Conv operator (#6510 ) * [CoreML EP] Add support of Conv operator * Ignore a corner case setting empty padding * Add handle autopadding * Addressed CR comments	2021-02-04 00:30:10 -08:00
Guoyu Wang	6cf54ff296	Switch Android CI java build to JDK 11 (#6552 ) * switch to jdk11 * fix java * Update	2021-02-03 17:49:23 -08:00

4185 commits