onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-03 23:49:44 +00:00

Author	SHA1	Message	Date
Yufeng Li	61ba5b501a	Fix bug in the back to back quantization of matmul and conv (#5264 ) * fix bug in the back to back quantization of matmul and conv * fix bug in back to back gather	2020-09-23 08:47:20 -07:00
George Wu	b5a6a8e847	remove implicit linking of tensorrt and dnnl ep shared libs (#5262 ) * remove trt and dnnl from link command * add comment	2020-09-23 05:47:18 -07:00
Dwayne Robinson	6ea66b43db	ORT DirectML EP for Iron release, ONNX 1.5 (part 2) (#5263 ) * Merged PR 5195856: Fix broken cases of zero size tensors in Cast/Reduce MaskRCNN failed when `Cast` tried to execute `Xor` with emptiness (zero in dimensions). This is perfectly legal and should be treated as a nop. Ultimately DML itself should treat this case as a nop, just like how C's `memcpy` treats 0 count as a nop, but I'm just addressing it in ORT now, as enabling it in DML would impact more operators to be consistent (probably should incrementally add a flag to tensor validation so operators can be opted in gradually). Corresponding WindowsAI PR: https://microsoft.visualstudio.com/WindowsAI/_git/WindowsAI/pullrequest/5195850 Related work items: #27469839, #28761382 * Merged PR 5201369: Remove copy of initializers added in DMLXP refactor When used in ORT, a common method shouldn't copy and return initializer data Related work items: #29514403 Co-authored-by: Justin Stoecker <justoeck@microsoft.com> Co-authored-by: Jeff Bloomfield <jeffbloo@microsoft.com>	2020-09-23 01:56:19 -07:00
Hariharan Seshadri	75d994f194	Handle zero norm values in LpNormalization CPU kernel (#5251 )	2020-09-22 22:01:09 -07:00
Adam Pocock	d26c71f55c	[java] Fixing the buffer semantics. (#5223 ) * [java] Fixing the buffer semantics. * Renaming bufferCapacity to bufferRemaining. * Adding a cast to char* so the pointer arithmetic works on Windows.	2020-09-22 21:29:01 -07:00
Scott McKay	c52561d044	Rework broadcasting setup to decrease binary size. (#5227 ) * Rework broadcasting setup to decrease binary size. Push all the type specific down and separate out the broadcasting/parallelization. Reductions: element_wise_ops: 521.0KB -> 268.8KB where: 25.8 KB -> 17.3 KB qlinear_binary_op: 28.1 -> 12.8	2020-09-23 14:15:40 +10:00
Changming Sun	43faf9e388	Disable a few tests that run too long(1 hour) in debug mode (#5257 )	2020-09-22 21:06:24 -07:00
Tianlei Wu	3bbce69185	bump version to 1.5.1 (#5258 )	2020-09-22 20:57:34 -07:00
Jeff Bloomfield	59e69bf35b	Handle missing initializers in allocation planner to fix crashes with DML provider (#5244 ) * Fix memory planning bug with DML EP * Address PR comments * Fix typo	2020-09-22 19:37:07 -07:00
Ye Wang	898531f502	Fix reshape fusion crash (#5252 ) * fix reshape fusion crash * handling start_node statelessly * fix	2020-09-22 15:04:13 -07:00
Guoyu Wang	e30530d9ea	Add java API for AddSessionConfigEntry (#5241 ) * Add session option config entry API for java * Java format * Add extra test verification * Address PR comments * Update comments Co-authored-by: gwang0000 <62914304+gwang0000@users.noreply.github.com>	2020-09-22 14:51:39 -07:00
KeDengMS	8dceebda0e	[Training/Python] Add option to enable symbolic shape inference (#5107 ) This change adds symbolic shape inference to ORT training which helps static memory planning for model like BART.	2020-09-22 10:49:07 -07:00
edgchen1	14f250a4d0	Update BUILD.md training dependency info. (#5240 ) Update training dependency versions based on Dockerfile.training.	2020-09-22 10:36:04 -07:00
Guoyu Wang	d957dbebea	Fix possible ios build break after update to Xcode 12 (#5246 ) * Fix possible ios build break after update to Xcode 12 * Address comments	2020-09-22 07:42:54 -07:00
suffian khan	417929b049	jobs timeout ..	2020-09-21 21:51:59 -07:00
suffian khan	a6eb90472c	try fix error on code coverage ci build	2020-09-21 21:51:59 -07:00
Sherlock	1478643215	Place Shape's output in CPU memory (#5245 ) Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-09-21 20:21:59 -07:00
Sherlock	038192bdb2	Place shape related compute nodes in CPU (#4940 ) * Place shape related nodes in CPU * visit candidates by topological order * Make CPU node placement a utility function * skip placing on CPU if the data typs is float16 or bfloat16	2020-09-21 17:10:39 -07:00
Changming Sun	0cb09374c6	Update BUILD.md for CUDA versions (#5239 )	2020-09-21 15:28:53 -07:00
George Wu	3147bc00c3	update TensorRT docs (#5238 ) * doc updates TensorRT * update * update * fix warning * newline * format	2020-09-21 15:24:20 -07:00
Xueyun Zhu	55e4b5d302	add pipeline distributed training test (#5222 ) * add pipeline distributed training test * fix max line length error in windows build * function header indent * fix * fix flake8 error	2020-09-21 14:35:01 -07:00
liqunfu	84c222126c	Deprecate testMNISTTrainingAndTestingOpset10 (#4927 ) Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-09-21 14:17:08 -07:00
Pranav Sharma	974b9bfc09	Allow sharing of initializers between sessions. (#5092 ) * Allow sharing of initializers between sessions. * Allow sharing of initializers between sessions (2). * Add test for C# * Add test for C#; address PR comments * Address PR comments Moved AddInitializer logic to internal session options Added tests for owned buffer Clarified documentation Fix bug where memory info and not device was getting compared * Fix test * Fix training build * Add ver 5 end marker and ver 6 starter, add scenario and usage examples.	2020-09-21 14:09:37 -07:00
Scott McKay	e0719a1073	Revert to using release SafeInt repo now that it supports a build with exceptions disabled. (#5233 )	2020-09-22 06:29:28 +10:00
edgchen1	e9671e93f0	Fix TransposeScaleMatMul and MatMulScaleFusion issues (#5230 ) - Rename TransposeScaleMatMul back to TransposeMatMul for backwards compatibility - Fix MatMulScaleFusion issues: - Add check for supported execution providers - Add check for supported MatMul input types	2020-09-21 12:34:01 -07:00
Ye Wang	65740deb10	Fix a bug in EmbedLayerNorm fusion (#5150 ) * fix embedlayernorm bug * review comments * interim checkin * review comments * Fix core dump in MacOS * remove unnecessary lines * update document * Update graph_utils.cc * Update onnx_exporter.py * resolve comments	2020-09-21 12:26:14 -07:00
stevenlix	aefb2cc49b	Create profile for all dynamic shape input tensors (#5229 )	2020-09-20 05:55:21 -07:00
Tiago Koji Castro Shibata	cd663d58f5	Fix WinML warnings (#5228 )	2020-09-19 12:41:42 -07:00
Guoyu Wang	78a29aebbc	[ORT Mobile] ORT Minimal E2E CI (#5200 ) * Modify the ort minimal CI to ort minimal e2e ci	2020-09-19 18:43:22 +10:00
Dmitri Smirnov	8ee4e8226e	Preserve relative order of the results and the tests. (#5225 )	2020-09-19 00:45:44 -07:00
Weixing Zhang	b49f6a5e2c	using GPU_WARP_SIZE to make kernel portable between AMD and Nvidia GPU (#5173 )	2020-09-18 14:56:16 -07:00
Suffian Khan	84589c7e05	Fuse softmax(a + b) in case of simple broadcast (#4937 ) * bias softmax kernel * bias softmax kernel * remove debug comments * remove debug comment * windows build doesnt handle unary minus on unsigned type * int64 => int treated as error * only support cuda * add bias softmax fusion tests * PR comments * more PR comments * use MLTypeCallDispatcher * break function into pieces * add loop unroll and add to list for inference as well * use std::min and move operator== * revert std::min (doesnt work ci pipeline) and fix int to size_t error * pr comments * fixes for windows ci * fix for windows ci * pr comments on consistency * p_model_ * fix formatting and add anonymous namespace Co-authored-by: suffian khan <sukha@OrtTrainingDev1.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-09-18 14:15:55 -07:00
Tang, Cheng	e0b49844e9	Provide option to let layernorm stash mean/var as fp32 or bfloat16 (#5215 ) * add option to set layernorm stash type * bug fix * fix merge error * fix win build error	2020-09-18 13:42:01 -07:00
Dmitri Smirnov	a90ab12589	Refactor onnx_test_runner (#5169 ) Refactor onnx_test_runner for better object ownership, code readability and maintainability.	2020-09-18 13:19:35 -07:00
Ryan Hill	13318ab0d4	Remove invalid install line (#5219 )	2020-09-18 11:58:40 -07:00
Shucai Xiao	a632dd2d3b	Amdmigraphx improvements (#5158 ) * code backup * remove unnecessary log info * code backup * code backup * merge changes from master branch * code backup * code backup * merge changes from master branch * code backup * code backup for constant folding enhancement * code backup * include more scenarios for constant folding * code backup * remove unnecessary code * remove unnecessary log information * fix an error in comments * update algorithm to do graph partition * code backup * remove unnecessary log information * remove an unused function * remove unnecessary changes	2020-09-18 11:56:50 -07:00
Weixing Zhang	f91248e0cc	remove curand_generator_ related code since it is not used. (#5220 )	2020-09-18 11:50:35 -07:00
KeDengMS	ce3b67e0cd	[Python] Move symbolic_shape_infer from nuphar to tools (#5162 ) * [Python] Move symbolic shape inference from nuphar to tools * Fix PEP8 ERROR	2020-09-18 09:31:06 -07:00
RRRachelllll555	f7c1e51810	Remove shape inference and fix save large model(>2g) issue (#5210 ) * remove shape inference and fix save large model problem * remove unnecessary import * refine code and add external format for quantize_qat * remove initializers in tensors_to_calibrate * small refine Co-authored-by: t-yguo <t-yguo@microsoft.com>	2020-09-18 08:46:31 -07:00
Scott McKay	c46a480306	Update conversion script and process to simplify creating ORT format models and a minimal build (#5217 ) * Update conversion script and process to simplify creating ORT format models and a minimal build.	2020-09-18 18:49:54 +10:00
George Wu	1b61dfaf69	fix _WIN32 (#5218 )	2020-09-18 00:23:17 -07:00
Pranav Prakash	f5df96256c	Fix order of returned values in quantize_weight_per_channel (#5205 ) Must match returned order of `quantize_inputs`	2020-09-17 17:57:46 -07:00
liqunfu	f37e1292a1	--shm-size=1024m to fix nccl shared memory issue (#5214 ) * --shm-size=256m to fix nccl shared memory issue Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-09-17 17:21:47 -07:00
Guoyu Wang	8156e0dd10	[ORT Mobile] Some updates to iOS/Android build settings (#5184 ) * Update android CI and build settings * add build_java to arm64 also * Add ios signing param * fix a small build warning * address pr comments	2020-09-17 15:53:14 -07:00
Tracy Sharpe	8698157112	NCHWc optimizer fixes for quantized models (#5203 ) This updates the NCHWc transformer to not interfere with quantized convolution models, based on observations from internal models. The tensor type for MaxPool must be float. The input to GlobalAveragePool/GlobalMaxPool must be in NCHWc format.	2020-09-17 09:52:21 -07:00
Pranav Sharma	d535894297	Add API to allow configuration of the global thread pools. (#5199 )	2020-09-17 09:19:18 -07:00
Suffian Khan	e01e0b2e40	Fix softmax_warp_backward math when is_log_softmax = True and register LogSoftmax CUDA kernel (#5160 ) * register logsoftmax cuda kernel; fix logsoftmaxgrad cuda kernal; fix tests to invoke dispatch_softmax_* * forgot to remove axis check * add tests all axis Co-authored-by: suffian khan <sukha@OrtTrainingDev1.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-09-17 07:15:25 -07:00
S. Manohar Karlapalem	584638e5d3	Corrects doc typos and formatting (#5201 )	2020-09-17 01:25:19 -07:00
Zhang Lei	cd0386b649	MaxPool versioning in quantization tools. (#5194 ) MaxPool versioning in quantization tools.	2020-09-16 22:52:24 -07:00
Ryan Hill	b11c106346	Remove almost all of the reinterpret_casts from the provider shared API (#5190 )	2020-09-16 17:00:15 -07:00

3433 commits