onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-02 03:55:34 +00:00

Author	SHA1	Message	Date
RandySheriffH	aeca7c2940	Cuda Profiler (#7110 ) * implement cuda profiler * add counters * downgrade cupti kernel version * move mutex * add cupti to path * fix win gpu build err * add path for cuda10 * fix linux com err * extend include path * add init flag * fix test case * fix tensorrt pipeline * add UT Co-authored-by: Ubuntu <randysheriff@rashuai-linux-gpu-3.3cfnmjowvu4e5bidlsmcxsmzwg.xx.internal.cloudapp.net>	2021-03-29 12:04:36 -07:00
Ashwini Khade	b22e60bd44	pull onnx latest commit (#7102 ) * update onnx commit * fix test scripts to remove deprecated call * update filters * add registration for relu and cumsum ver 14 * add promote trilu to onnx domain * update onnx-tensorrt submodule * update flag * update flag * update dependencies * fix android ci failure	2021-03-29 11:00:38 -07:00
Scott McKay	9297527b7a	Enable NHWC transformer when generating ORT format model (#7126 ) * Allow specific optimizers to be disabled. - replace unused ability to specify just the optimizers to run - never used so not needed Allow the disabled list to be specified via the python bindings - expected usage is internal, so using kwargs for that so as not to pollute the documentation with stuff no user is likely to need Update the ORT format model conversion script to disable NCHWc transformer when level is 'all' - currently there aren't any known use cases where we'd want the NCHWc transformations to run as they create a device specific model and aren't used on ARM - the ORT format model is not expected to be generated on the target device (e.g. generate on Windows/Linux/macOS to deploy to Android/iOS so there's a good chance we'd generate a useless/invalid model - default to 'all' as ARM and MLAS prefer NHWC and the NHWC transformer runs at that level * Add matching changes to optimizer generation in training code	2021-03-29 18:39:48 +10:00
satyajandhyala	90294b9c43	Fix Transpose and MatMul fusion code to check the input datatypes as … (#7147 ) * Fix Transpose and MatMul fusion code to check the input datatypes as FusedMatMul only supports floating point datatypes. * Added testcases to make sure that the int32/int64 datatypes prevent Transpose-MatMul fusion.	2021-03-28 09:24:12 -07:00
Jeff Daily	65ce5f07b3	add Dockerfile.rocm4.1.pytorch (#7152 )	2021-03-26 21:40:10 -07:00
Suffian Khan	f27835c4de	Disable batch size test for AMD CI pipeline after agent upgrade to Rocm 4.1 (#7153 ) * disable batch size test for rocm 4.1 until resolved * Update orttraining-pai-ci-pipeline.yml Forgot to modify both pipelines	2021-03-26 22:32:39 -05:00
Changming Sun	f365f1d967	Resize_impl.cu: Change _Round to roundf (#7140 ) This is to keep the change minimal, make it work exactly like what it worked before.	2021-03-26 18:29:21 -07:00
Edward Chen	63d9d5afd3	Fix Pad and Gather incorrect usage of HasType helpers. (#7146 )	2021-03-26 17:36:31 -07:00
Sherlock	ab86634c36	Address comments from ORTModule master merge (#7101 ) * Address ortmodule merge master comments Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-03-26 16:26:42 -07:00
Adrian Tsai	a8f0ab9c5f	Merged PR 5846998: Fix warnings level for DML EP Apparently ORT has a new, rather unusual way of setting the warning level. This change resets our warning level back to W3 for the DML EP.	2021-03-26 22:55:33 +00:00
Thiago Crepaldi	a01f15198c	Add support for large models (#7113 ) * Add support for large models * Handle models with registered buffers	2021-03-26 14:08:46 -07:00
Suffian Khan	2b31b80b1f	increase timeout (#7145 )	2021-03-26 11:26:18 -07:00
Yufeng Li	3771e0bf10	update bert quantization notebook (#7137 )	2021-03-25 18:12:53 -07:00
KeDengMS	c9b29fbd06	Disable MatmulTransposeFusion for CPU EP (#7135 ) It causes convergence issue in BERT on CPU	2021-03-25 17:16:58 -07:00
Dmitri Smirnov	2bf54bcaa2	Fix bugs in sparsify script (#7134 ) Fix type and check.	2021-03-25 14:53:52 -07:00
G. Ramalingam	cc0e7bee76	Add function-body to SoftmaxGrad (#6988 ) * Add function body to SoftmaxGrad schema * Add type context and cleanup * Add test case with symbolic dimensions * Add opset specification to function * handle opset dependence * Exclude from minimal build	2021-03-25 11:34:06 -07:00
Tianlei Wu	53c123dcee	Add session option configuration to enable GeluApproximation (#7131 )	2021-03-25 11:32:36 -07:00
Adrian Tsai	39bd192d33	Merged PR 5837692: Merge latest from upstream	2021-03-25 16:21:56 +00:00
Yufeng Li	8e54b76e2d	QDQ implementation (#7033 ) * Add QDQ basic implementation	2021-03-25 09:17:23 -07:00
RandySheriffH	865c67611c	Exclude profiler from minimal build (#7115 ) * Exclude TP profiler from minimum build * fix typo * remove Clock * fix comments Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2021-03-25 09:06:14 -07:00
Vincent Wang	fda0470683	Add New AllocKind for YieldOp Outputs, Run YieldOp with InferenceSession in UT (#7125 ) * new allockind, add ut * change macro * fix win build * rename alloc kind * fix mem leak	2021-03-25 15:18:51 +08:00
Sherlock	1c8d874412	Promote BiasDropout from orttraining to onnxruntime (#7116 ) * Promote BiasDropout from orttraining to onnxruntime Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-03-24 20:42:42 -07:00
Adrian Tsai	293774fbeb	Merge remote-tracking branch 'upstream/master' into p/adtsai/merge # Conflicts: # onnxruntime/contrib_ops/cpu/quantization/dynamic_quantize_matmul.cc	2021-03-24 19:48:01 -07:00
jingyanwangms	cd67f12add	Move IOBinding and RunOptions to ctx (#7028 ) * Liqun/ort module perf1 (#6806) add mysql script to log perf data Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> * Resolve HTTP Error 503: Service Unavailable for MNIST dataset (#6989) * Reduce logging for ORTModule for the end user (#6982) * Support none types in forward output (#7001) * Missed test case for none type output (#7014) * save iobinding to ctx * save run_options to ctx * remove debug tests * PR comments and clean up * add RunStateInfo * remove whitespace edits * PR comments * remove test changes * fix test failure * Fit unit test test_nesting_forward_backward_calls Co-authored-by: liqunfu <liqfu@microsoft.com> Co-authored-by: baijumeswani <bmeswani@microsoft.com> Co-authored-by: Jingyan Wang <jingywa@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-03-24 17:51:00 -07:00
Changming Sun	2e3bbad19f	Move TensorRT Windows CI build to the machine pool (#7127 )	2021-03-24 14:28:25 -07:00
Guoyu Wang	1c04eec2b1	[NNAPI EP] Fix error for QLinearAdd with an initializer as input (#7093 ) * Fix the issue where input to qlinearadd is an initializer * Add UT * Address CR comments	2021-03-24 11:56:53 -07:00
harshithapv	540eac253e	Deepspeed pipeline parallel and fairscale sharded optimizer test samples with ORTModule (#7078 ) * adding samples for Deepspeed pipeline parallel and fairscale sharded optimizer with ortmodule * fixed typo in args * addressed Thiago's comments * Update orttraining/orttraining/test/python/orttraining_test_ortmodule_deepspeed_pipeline_parallel.py Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>	2021-03-24 09:43:05 -07:00
KeDengMS	6987106bf5	Add missing Python dependencies for ORT training (#7104 ) * Add missing Python dependencies for training cerberus - option parsing h5py - checkpoint onnx - model proto packaging/sympy - symbolic shape inference * Separate requirements.txt for inference and training Python packages.	2021-03-23 18:43:19 -07:00
Yufeng Li	fffe16cb43	Fix a bug in quant GEMM and add an unit test (#7111 )	2021-03-23 16:39:35 -07:00
Changming Sun	b07e168a2b	Delete an unused file: download_test_data.py (#7109 )	2021-03-23 14:49:26 -07:00
Suffian Khan	5cb8934459	update Dockerfile for workaround for issue in RCCL for rocm4.0 (#7108 )	2021-03-23 13:36:04 -07:00
Suffian Khan	c0994fdfbb	Update ORTTrainer to permit Rocm and permit export of opset 13 (#7059 ) * update orttrainer to permit rocm and allow export for opset 13 * wrap rocm check in try-except block	2021-03-23 11:09:48 -07:00
Edward Chen	53392664d3	Enable type reduction for Shrink, Sign, SplitToSequence CPU kernels (#7090 ) Enable type reduction for Shrink, Sign, SplitToSequence CPU kernels. Some other type reduction changes including refactoring to specify element types in a single place.	2021-03-23 09:57:33 -07:00
baijumeswani	c3310efdcd	Support for models having partially non trainable parameters (#7058 ) * Support for models having partially non trainable parameters	2021-03-23 09:41:16 -07:00
baijumeswani	a7a2a16edd	Pass arguments to azure_scale_set_vm_mount_test_data from perf test ci pipeline (#7094 )	2021-03-22 21:48:32 -07:00
Yufeng Li	c965878a69	fix a bug in global average pool and add unit test (#6913 ) * fix bug in QGlobalAveragePool * add unit test for quant GlobalAveragePool * not run quantization tests if disable_contrib_ops enabled	2021-03-22 20:01:27 -07:00
Aaron Boxer	230c137460	cmake: support install target with generated pkg-config file (#7076 )	2021-03-22 19:36:31 -07:00
liqunfu	309885b08d	upload ort-gpu-training python nightly package to azure feed (#6998 )	2021-03-22 18:44:54 -07:00
Tracy Sharpe	416ee3c4d2	MLAS: add 32-bit transpose support (#7092 )	2021-03-22 16:20:31 -07:00
Sherlock	5ec0e71542	ORTModule support non-differentiable module output (#7048 ) * Handle non-differentiable module output Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-03-22 15:46:11 -07:00
Changming Sun	be45a59d99	Make our CUDA code be compatible with the latest VS2019 update (#7062 )	2021-03-22 14:39:45 -07:00
Thiago Crepaldi	df6a68f59c	Fix fallback providers for InferenceSession (#7091 )	2021-03-22 13:38:58 -07:00
RandySheriffH	529da3b003	Thread pool profiler (#6748 ) * add profiler * add thread id * refactoring * switch to vector * add override keyword * fix comments * renaming * add revoke time * restore statics * restore enable flag * fix end error * fix comments * add comment * add comments * make profiler thread-safe * switch to shared_lock * switch to shared_timed_mutex * switch to OrtMutex * add per child thread counters * switch to vector * refactor LogCore * fix comments * cancel spin and block counter to reduce overhead * fix minor format issue Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2021-03-22 10:49:57 -07:00
Thiago Crepaldi	867804bea1	Add auto doc gen for ORTModule API during CI build (#7046 ) In addition to ORTModule auto documentation during packaging, this PR also update golden numbers to fix CI	2021-03-22 10:20:33 -07:00
Dmitri Smirnov	3b58fc7b97	Add types support for Sparse Initializer in Onnxruntime (#7004 ) Add types support for DenseToSparse and SparseToDense conversions Address the case of empty sparse values and indices when the initializer does not contain any NNZ. Add sparsify script.	2021-03-22 10:06:11 -07:00
Olivia Jain	4a3d1176d7	adding ngraph_DIR to fix build (#6975 )	2021-03-22 09:43:02 -07:00
Edward Chen	4cbb8e166a	Update kernel def hashing (#7019 ) Update the kernel def hashing in ORT format models. The new hashing logic ignores the ordering of type constraint types. This is a backward compatibility breaking change, but we don't guarantee backward compatibility yet.	2021-03-22 09:28:27 -07:00
Brian Martin	06df28748f	Change tabs to spaces in Windows.AI.MachineLearning.idl (#7088 ) noticed this in a recent PR, this file has some tabs that should be spaces.	2021-03-22 09:23:18 -07:00
raviskolli	79ba045d74	Enabled rocm support for graph transformations (#7057 )	2021-03-22 09:02:10 -07:00
Scott McKay	b2c6617b0f	Use 'as_scalar' when checking the 'cond' value of 'If' (#7063 ) #6884	2021-03-22 18:04:38 +10:00

11997 commits