onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-19 19:00:47 +00:00

Author	SHA1	Message	Date
Jesse Benson	d18aa45b46	Enable more ROCM ops that are sharing CUDA code. Some are needed for Turing NLG models.	2021-02-06 14:40:34 -08:00
Chun-Wei Chen	115e16b37b	ort_test_utils: skip creating input if it is an initializer (#6544 )	2021-02-05 17:34:08 -08:00
Changming Sun	b5bd14fc9f	Update GPU packaging pipelines to cuda11 and fix the other build break issues (#6585 ) Update gpu packaging pipelines to CUDA11 In the next release we will use CUDA 11. And our CUDA 11 build suddenly became broken because recently CentOS 7 posted an update of glibc. The version of glibc was changed from 2.17-317.el7 to 2.17-322.el7_9. But the newer one isn't compatible with CUDA 11. We have to downgrade it.	2021-02-05 16:58:37 -08:00
Chun-Wei Chen	f2ce3aae13	add set_model_dir and update ONNX (#6119 )	2021-02-05 09:30:49 -08:00
Scott McKay	c5d2538314	Add more kernels that have typed registrations to the operators we track type usage for. (#6565 )	2021-02-05 15:10:54 +10:00
Scott McKay	c49d1dbc4b	Add type reduction support to Slice and Transpose (#6547 ) * Add type reduction support to Slice and Transpose	2021-02-05 11:08:23 +10:00
Xavier Dupré	615acf156c	remove keras example from python documentation (#6574 )	2021-02-05 01:10:11 +01:00
Jesse Benson	d914e29fe1	Reuse reduction_functions.cu	2021-02-04 15:00:05 -08:00
Jesse Benson	86ac11af1a	Delete ROCM-specific reduction code that is identical to CUDA reduction code.	2021-02-04 15:00:05 -08:00
Jesse Benson	21a47ec8d9	Disable a couple more unsupported tests.	2021-02-04 15:00:05 -08:00
Jesse Benson	0b147702af	Update remaining reduction ops to use MIOpen. double datatype is not supported, so disable those typed kernels.	2021-02-04 15:00:05 -08:00
Jesse Benson	a28ddb85b6	Reduction ops.	2021-02-04 15:00:05 -08:00
Jesse Benson	196132925e	Reuse CUDA's reduction_functions.cc	2021-02-04 15:00:05 -08:00
Edward Chen	2ef792ae6e	Don't resolve symlink in resolve_executable_path(). (#6540 )	2021-02-04 12:32:03 -08:00
Changming Sun	aa31ba5774	Merge CPU packaging pipelines (#6480 ) 1. Merge Nuget CPU pipeline, Java CPU pipeline, C-API pipeline into a single one. 2. Enable compile warnings for cuda files(*.cu) on Windows. 3. Enable static code analyze for the Windows builds in these jobs. For example, this is our first time scanning the JNI code. 4. Fix some warnings in the training code. 5. Enable code sign for Java. Previously we forgot it. 6. Update TPN.txt to remove Jemalloc.	2021-02-04 08:38:56 -08:00
Guoyu Wang	6cf54ff296	Switch Android CI java build to JDK 11 (#6552 ) * switch to jdk11 * fix java * Update	2021-02-03 17:49:23 -08:00
Scott McKay	6cb8f8c812	Support disabling a typed kernel registration that uses the output type (#6530 ) * Update infrastructure to support disabling a typed kernel registration that uses output 0 for the type (vs. the normal use case of input 0).	2021-02-03 14:22:32 +10:00
Scott McKay	8d53ef69e5	Add type reduction support to Min, Max and Pow (#6519 ) * Add type reduction support to Min, Max and Pow Update the C++ type reduction infrastructure to allow specifying an opset for the supported types list, as those can change across opset versions. Minor updates to the type usage tracking script * Add 'all opsets' macros and constant	2021-02-03 06:51:35 +10:00
ashbhandare	85434273ff	Fix CUDA Reduction kernel for ArgMax/ArgMin for when reduction dim=1 (#6490 ) * Fix for when reduction dim=1 * Disable test for AMD GPUs * Specify Async	2021-02-02 09:50:16 -08:00
Cian Hayes	6fc5237d9e	Introduce --enable_training_ops build flag (#6523 ) * minimal_build with training ops * Removing redundant comment from an earlier attempt at a fix * Fixing a bad merge conflict resolution * Responding to PR feedback * tweaking the makefiles based on feedback * combining two enable_training blocks in CMakeLists.txt	2021-02-01 21:54:16 -08:00
Suffian Khan	76bc0e479c	Enable dense sequence optimized version of Pytorch exported BERT-L on AMD GPU (#6504 ) * Permit dense seq optimization on BERT-L pytorch export by enabling ReduceSumTraining, Equal, and NonZero on AMD * enable Equal tests * enable fast_matrix_reduction test case	2021-01-29 13:12:34 -08:00
Scott McKay	8c6d76a4c0	Update to match new test setup. (#6496 ) * Update to match new test setup. * Add Gemm(7) manually for now. Will fix properly on Monday. It's used by mnist.ort as that is created by optimizing mnist.onnx to level 1 causing 2 nodes to be replaced by a Gemm and the op to be missing from the required list as that is created using the original onnx model.	2021-01-30 06:27:19 +10:00
RandySheriffH	a19c48f5cb	Fuse cuda conv with activation (#6351 ) * optimize cuda conv by fused activation * remove needless print out * exclude test from cpu * handle status error from cudnn 8.x * add reference to base class * add hipify	2021-01-29 10:58:10 -08:00
suryasidd	1a5b75a554	[OpenVINO-EP] Remove support for OpenVINO 2020.2 (#6493 ) * Removed OpenVINO 2020.2 support * Updated documentation and build.py * Removed unnecessary libraries from setup.py	2021-01-28 23:00:41 -08:00
Guoyu Wang	3f60b27703	Speed up the Mac CI runs (#6483 )	2021-01-28 15:13:44 -08:00
liqunfu	00afd00059	merge e2e with distributed pipeline (#6443 ) merge e2e with distributed pipeline	2021-01-28 14:17:47 -08:00
Scott McKay	c84bb9df9f	Add ability to track per operator types in reduced build config. (#6428 ) * Add ability to generate configuration that includes required types for individual operators, to allow build size reduction based on that. - Add python bindings for ORT format models - Add script to update bindings and help info - Add parsing of ORT format models - Add ability to enable type reduction to config generation - Update build.py to only allow operator/type reduction via config - simpler to require config to be generated first - can't mix a type aware (ORT format model only) and non-type aware config as that may result in insufficient types being enabled - Add script to create reduced build config - Update CIs	2021-01-29 07:59:51 +10:00
Guoyu Wang	752627c5bb	[CoreML EP] Add CI for CoreML EP (macOS) and add coreml_flags for EP options (#6481 ) * Add macos coreml CI and coreml_flags * Move save debugging model to use environment var * Move pipeline off from macos CI template * Fix an issue building using unix make, add parallel to build script * Fixed build break for shared_lib and compile warning * Fix a compile warning * test * Revert the accidental push from another branch This reverts commit 472029ba25d50f9508474c9eeceb3454cead7877.	2021-01-28 12:25:46 -08:00
baijumeswani	2e228d74d0	Increase the distributed tests pipeline timeout to 120 minutes (#6479 )	2021-01-28 12:04:26 -08:00
Guoyu Wang	c05adb1147	Initial version of CoreML EP (#6392 )	2021-01-27 10:43:17 -08:00
liqunfu	6ed12402a4	Liqun/liqun/enable pipeline parallel test2 (#6399 ) * enable data and pipeline parallelism test Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-01-25 15:15:26 -08:00
Yufeng Li	c20965f9b2	enable pipeline to run quantization tests (#6416 ) * enable pipeline to run quantization tests setup test pipeline for quantization	2021-01-25 09:33:08 -08:00
ashbhandare	60c772e2bc	Megatron checkpointing (#6293 ) * Add bart fairseq run script * Add frontend change to enable megatron * Initial changes for checkpointing * Megatron optim state loading, checkpoint aggregation, frontend distributed tests for H, D+H * Add load_checkpoint changes * Fix CI * Cleanup * Fix CI * review comments * review comments * review comments:	2021-01-22 11:26:47 -08:00
Guoyu Wang	eb946c4177	Unblock Android CI code coverage failure (#6393 )	2021-01-20 21:26:10 -08:00
pengwa	453431f7bb	Add max_norm for gradient clipping. (#6289 ) * add max_norm as user option for gradient clipping * add adam and lamb test cases for clip norm * add frontend tests	2021-01-21 01:01:11 +08:00
Hariharan Seshadri	d7bdd96425	Refine auto_pad based pad computation in ConvTranspose (#6305 )	2021-01-19 19:01:49 -08:00
wezuo	5b6753ce27	Wezuo/memory analysis (#5658 ) * merged alloc_plan * pass compilation * Start running, incorrect allocation memory info * add in comments * fix a bug of recording pattern too early. * debugging lifetime * fix lifetime * passed mnist * in process of visualization * Add code to generate chrome trace for allocations. * in process of collecting fragmentation * before rebuild * passed mnist * passed bert tiny * fix the inplace reuse * fix the exception of weight in pinned memory * add guards to ensure the tensor is in AllocPlan * add customized profiling * debugging * debugging * fix the reuse of differnt location type * add rank * add the rank * add fragmentation * add time_step_trace * Add summary for each execution step (total bytes, used/free bytes). * add top k * change type of top k parameter * remove prints * change heap to set{ * add the name pattern * add the useage for pattern * add partition * change to static class * add custom group * remove const * update memory_info * in process of adding it as runtime config * change the memory profiling to be an argument * add some comments * add checks to recored meomry_info in traaining session * set the "local rank setting" to correct argument. * addressing comments * format adjustment * formatting * remove alloc_interval * update memory_info.cc to skip session when there is no tensor for a particular memory type * fix memory_info multiple iteration seg-fault * consolidate mainz changes * fixed some minor errors * guard by ORT_MINIMAL_BUILD * add ORT_MEMORY_PROFILE flag * added compiler flag to turn on/off memory profiling related code * clean up the code regarding comments * add comments * revoke the onnx version * clean up the code to match master * clean up the code to match master * clean up the code to match master Co-authored-by: Jesse Benson <benson.jesse@gmail.com> Co-authored-by: Wei Zuo <wezuo@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: wezuo <wezuo@az-eus-v100-32gb-5-worker-mgtbby.eastus.cloudapp.azure.com> Co-authored-by: wezuo <wezuo@az-eus-v100-32gb-5-worker-yclzsf.eastus.cloudapp.azure.com>	2021-01-19 08:30:55 -08:00
Wei-Sheng Chin	8ce252caa9	Pipeline Parallel Experimental Python API (#5815 )	2021-01-15 12:07:28 +08:00
Scott McKay	e54e2f969d	Use readelf for minimal build binary size checks. (#6338 ) * Use readelf for minimal build binary size checks. The on-disk size grows in 4KB chunks which makes it hard to see how much growth an individual checkin causes. Only downside is that the sum of the sections is larger than the on-disk size (assumably things get packed smaller on disk and some of the section alignment constraints can be ignored) * Remove unused function	2021-01-15 07:46:02 +10:00
Changming Sun	ea6789b754	Add PREfast to python packaging pipeline (#6343 ) * Add PREfast to python packaging pipeline	2021-01-14 10:39:24 -08:00
Guoyu Wang	e35db194e3	fix the pipeline failure (#6346 )	2021-01-14 00:33:22 -08:00
Edward Chen	042053c55e	Add support for running Android emulator from build.py on Windows. (#6317 )	2021-01-13 19:21:49 -08:00
Alberto Magni	5623cc6d17	Use onnxruntime_USE_FULL_PROTOBUF=OFF for the cuda execution provider (#6340 ) This removes a special case of the cuda EP.	2021-01-13 18:27:13 +00:00
liqunfu	aeca96caba	Liqun/enable pipeline parallel test (#6331 ) enable pipeline parallel test Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-01-13 10:24:04 -08:00
Dmitri Smirnov	6b73bae035	Java: add Semmle to Java publishing pipelines (#6326 ) Add Semmle to Java API pipeline Add security results publishing and add Java GPU.	2021-01-12 15:12:13 -08:00
Ashwini Khade	0ed56d491a	fix opset imports for function body (#6287 ) * fix function opsets * add tests and update onnx * changes per review comments * add comments * plus updates * build fix	2021-01-12 13:44:36 -08:00
Changming Sun	c43ca45c4f	Force reinstall onnx python package on Windows (#6309 )	2021-01-11 22:12:56 -08:00
Chun-Wei Chen	84024bdfa9	Enable ONNX backend test of SequenceProto input/output (#6043 ) * assert sequence tensor and remove skips * update testdata json * use ONNX 1.8 in cgmanifest.json * use previous commit to workaround * update ONNX commit ID in docker * skip test_maxpool_2d_dilations test for now * update function name	2021-01-11 11:30:33 -08:00
Changming Sun	5084ce0969	Update nuget build (#6297 ) 1. Update the ProtoSrc path. The old one is not used anymore. 2. Regenerate OnnxMl.cs 3. Delete some unused code in tools/ci_build/build.py 4. Avoid set intra_op_param.thread_pool_size in ModelTests in OpenMP build. 5. Fix a typo in the C API pipeline.	2021-01-11 10:49:05 -08:00
Jesse Benson	fa851bff66	Add workaround to remove ROCm-specific binary-elementwise files.	2021-01-11 10:00:18 -08:00

830 commits