onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-04 04:07:22 +00:00

Author	SHA1	Message	Date
edgchen1	4aa033b99e	Addressing review comments (#3690) - https://github.com/microsoft/onnxruntime/pull/3681#discussion_r414359326 - https://github.com/microsoft/onnxruntime/pull/3681#discussion_r414359463 - https://github.com/microsoft/onnxruntime/pull/3681#discussion_r414360023 - https://github.com/microsoft/onnxruntime/pull/3681#discussion_r414361667 - https://github.com/microsoft/onnxruntime/pull/3681#discussion_r414368707 - https://github.com/microsoft/onnxruntime/pull/3681#discussion_r414371480 - https://github.com/microsoft/onnxruntime/pull/3681#discussion_r414379362 - https://github.com/microsoft/onnxruntime/pull/3681#discussion_r414374516 - https://github.com/microsoft/onnxruntime/pull/3681#discussion_r414801087	2020-04-24 14:57:18 -07:00
edgchen1	7347c73139	Revert "resolving conflicts from master (#3691)" (#3696) This reverts commit `c38a60a450`.	2020-04-24 14:49:00 -07:00
ytaous	c38a60a450	resolving conflicts from master (#3691) * resolving conflicts * resolving conflicts * resolving conflicts * resolve conflicts Co-authored-by: Ethan Tao <ettao@microsoft.com>	2020-04-24 14:38:30 -07:00
Edward Chen	3863bd6f74	Revert "Try not to modify base name (#3638)" This reverts commit `d9641f292d`. Reverting to fix onnx_test_runner test failures.	2020-04-24 04:26:59 +00:00
Edward Chen	5a790a4b42	Merge remote-tracking branch 'origin/master' into ort_training_for_merge_to_master	2020-04-24 02:27:27 +00:00
Pranav Sharma	939d036660	Add omp impl for tryparallelfor and modify gelu to use fastgelu impl. (#3667) * Add omp impl for tryparallelfor and modify gelu to use fastgelu impl. * Address PR comments.	2020-04-23 18:24:46 -07:00
edgchen1	6ca44e216a	Merge pull request #3675 from microsoft/edgchen1/merge_from_ort_training Merge from ort_training to ort_training_for_merge_to_master	2020-04-23 17:30:26 -07:00
Du Li	2659f205cc	Complex multiplication and conjugate contrib ops (#3384) * adding ComplexMulConj * Adding fp16 support. * adding a util func	2020-04-23 17:21:48 -07:00
Edward Chen	4416d41874	Merge remote-tracking branch 'origin/ort_training' into edgchen1/merge_from_ort_training	2020-04-24 00:19:05 +00:00
Ori Levari	bae1dd7f04	add test for LearningModel creation from missing model path (#3661)	2020-04-23 15:37:32 -07:00
edgchen1	b4e82913d1	Merge pull request #3670 from microsoft/edgchen1/merge_from_master Merge from master to ort_training_for_merge_to_master	2020-04-23 15:17:42 -07:00
Sheil Kumar	2d2375aa23	swap float16/float (#3663) Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2020-04-23 14:27:18 -07:00
Yufeng Li	c0e817ff16	Fix a bug in skiplayernorm fusion pattern 2 (#3660) For skiplayernorm fusion pattern 2, its input[0] should be equal to the input[0] of Add_1, but is overridden by the input[0] of Add_2.	2020-04-23 14:18:59 -07:00
Edward Chen	deac467683	Merge remote-tracking branch 'origin/master' into edgchen1/merge_from_master	2020-04-23 20:50:33 +00:00
David Brownell	3ce31933bb	Wheel file updates for FeaturizerLibrary data (#3640)	2020-04-23 13:27:22 -07:00
ytaous	ae7da23460	disable broken test in DML (#3666) * temporary disable LSTM_Seq_lens_unpacked for dml test * temporary disable LSTM_Seq_lens_unpacked for dml test * temporary disable LSTM_Seq_lens_unpacked Co-authored-by: Ethan Tao <ettao@microsoft.com>	2020-04-23 13:23:50 -07:00
edgchen1	49a1c5e546	Change CentOS build to use agent pool because builds on hosted agents run out of disk space. (#3662)	2020-04-23 12:19:19 -07:00
Weixing Zhang	336624806e	Simplify and clean code (#3655) 1. It is not necessary to include cudnn_common.h for kernels which are not implemented with CUDNN. 2. Minor change in layer norm kernel to simplify the code and resolve building warning. Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2020-04-23 10:12:55 -07:00
XiaocenDong	125f68f305	fixed mnist bug (#3569) * fixed mnist bug * fixed train_step param	2020-04-23 23:22:38 +08:00
Xavier Dupré	5777fc18c3	Removes omp for ThreadPool in TreeEnsemble* (#3596) * Removes omp to use ThreadPool * removes unnecessary old OMP code * rename compute_agg, use ThreadPool::NumThreads Co-authored-by: xavier dupré <xavier.dupre@gmail.com>	2020-04-22 23:48:31 -07:00
Xueyun Zhu	f1ba9aaf34	Add pipeline transformer for wait/record node (#3513) * pipeline transformer * clean up * address feedback * add record/wait for first stage and updated split script * address feedback * make recv/send signal as initializer * merge * address feedback * unify input and initializer * address feedback and bug fix * minor fix * windows build * fix	2020-04-22 23:28:01 -07:00
pengwa	6136fd0789	GatherElementsGrad Kernels (#3627) * GatherElementsGrad cuda kernel & tests * Fix comments * Fix include path	2020-04-23 14:02:34 +08:00
Wei-Sheng Chin	d9641f292d	Try not to modify base name (#3638)	2020-04-22 22:24:43 -07:00
Vincent Wang	ffe19ae49b	Expand elimination and Expand gradient. (#3610) * Expand elimination and Expand gradient. * Resolve comments. * Fix test break. * Check if graph can remove the node. * Resolve comment. Co-authored-by: Vincent Wang <weicwang@microsoft.com>	2020-04-23 13:17:15 +08:00
Tang, Cheng	37f4f74308	expose training session so the training app could register custom kernel and transformers (#3642) Co-authored-by: Cheng Tang <chenta@microsoft.com>	2020-04-22 21:35:41 -07:00
gwang-msft	02bae6bd06	Not use OpenMP for android build (#3636)	2020-04-22 21:17:05 -07:00
edgchen1	2dd4f7e96b	Add check for nullptr in PlannerImpl::FindReusableTensor(). (#3619)	2020-04-22 20:18:29 -07:00
suffiank	0e12d05cd2	fixes for ort_trainer.py to resume from checkpoint (#3510) * fixes for ort_trainer.py to resume from checkpoint * define self.state_dict_ during init * add comment of explanation * add unit test for restore from checkpoint * fix file not found Co-authored-by: suffian khan <sukha@microsoft.com>	2020-04-22 16:33:58 -07:00
Changming Sun	00917917d6	Downgrade numpy requirement to 1.16.6 (#3635)	2020-04-22 16:11:33 -07:00
Weixing Zhang	e4fc83252d	Refactoring code related to WARP_SIZE. (#3623) 1. Centralize its definition in common.cuh. 2. Rename it to GPU_WARP_SIZE which can be extended to AMD GPU later. 3. Centralize warp shuffle functions. Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2020-04-22 15:19:06 -07:00
Mikhail Kuznetsov	3cf3595579	Replaced spaces on tabs (#3555)	2020-04-22 15:16:19 -07:00
Ye Wang	7837c7efc3	Add Features to ShortGrainDropper for ONNX export (#3628) * add features to short_grain_dropper for ONNX export * update FeaturizersLibrary * fix warnings	2020-04-22 14:09:39 -07:00
edgchen1	bb9b0ba5b3	Merge pull request #3607 from microsoft/edgchen1/merge_from_master Merge from master to ort_training	2020-04-22 13:22:32 -07:00
Ye Wang	70b554cc85	Add Features to ForecastingPivot Transformer for ONNX Export (#3608) * checkin * fix MSVC build error * test changes * split pivot output into multiple tensors * add horizon tensor * Support multiple types for non-pivot tensor * limit horizon tensor type to int32_t as max_horizon type * work around some conversion warnings for local machine * support variadic shape for non-pivot input * dropping all rows is an exception * fix a bug * fix the way that generates horizon tensor * more tests added * add TypeConstraint() in ONNX_OPERATOR_KERNEL_EX * update Featurizerslibrary	2020-04-22 13:09:31 -07:00
Wei-Sheng Chin	ab70625b29	Add Lamb shape inference (#3634)	2020-04-22 11:32:28 -07:00
Paul McDaniel	2c74766ad1	Add new docs around how to bind to the onnxruntime.dll (#3539)	2020-04-22 11:24:36 -07:00
Edward Chen	8df5076d96	Merge remote-tracking branch 'origin/master' into edgchen1/merge_from_master	2020-04-22 17:16:00 +00:00
Edward Chen	8d09cefafc	Merge remote-tracking branch 'origin/ort_training' into edgchen1/merge_from_master	2020-04-22 16:56:15 +00:00
edgchen1	b518cb2a7a	Clean up OPTIONAL name conflict workarounds in ort_training. (#3622) * Clean up OPTIONAL name conflict workarounds. * Cleanup unnecessary header files onnx_protobuf.h Co-authored-by: Sherlock Huang	2020-04-22 09:07:55 -07:00
Vincent Wang	d3a2ac5c5c	Eliminate Useless Cast during Transformer. (#3606) * Remove Useless Cast during Transformer. * Resolve comments. * Check if graph can remove the node. Co-authored-by: Vincent Wang <weicwang@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-04-22 16:36:46 +08:00
Tianlei Wu	d69bc31309	Refine BERT optimization script options (#3618) * Remove parameters like --gpu_only --sequence_length. Update bert GPU notebook accordingly. * Remove input_int32 and float16 parameters from constructors of BertOnnxModel class and other classes derived from it. * Update gpt2 benchmark. Add comments in gpt2 notebook to indicate work in progress. Clear notebook output before official 1.3.0 release is ready.	2020-04-21 21:28:06 -07:00
Scott McKay	b4508dbdc6	Improve TopK performance. (#3612) * Update TopK implementation. - add faster heap - special case k=1 - update selector for when to use heap and when to use nth_element based on performance testing - parallelize if enough work to do - reduce templatized code - add some extra unit tests. Perf tested vs. master. Average speedup is 3.75x using this combination of input sizes: `batches = [10, 25, 50]`, `batch_size = [8, 16, 32, 64, 128, 256, 512, 1024, 2048]`, `k = [1, 2, 4, 6, 8, 16, 24, 32, 48, 64, 128]`. For larger batches (e.g. 50x2048) the speedup is over 20x.	2020-04-22 10:05:13 +10:00
edgchen1	5492d02c4e	Remove Windows CUDA 9 build definition and helper scripts. (#3615)	2020-04-21 15:22:27 -07:00
Sherlock	d66d5bb86a	Update Optimizer Domain and Opset (#3602) * Update Domain and Opset for SGD * Update Adam Domain and Opset * Update Lamb Domain and Opset	2020-04-21 15:06:02 -07:00
Edward Chen	47f1758fdc	Add --skip_onnx_tests to orttraining Windows builds.	2020-04-21 21:50:35 +00:00
Edward Chen	297ab43b0c	Add --enable_onnx_tests to Windows builds to allow set up of test data directory.	2020-04-21 20:34:55 +00:00
Edward Chen	2e4b9b1d0e	Disable CudaKernelTest.SoftmaxCrossEntropyLoss_LargeSizeTensor because it's flaky.	2020-04-21 20:30:45 +00:00
Edward Chen	28a0c863b1	Revert "Convert Gelu to use TryParallelFor (#3599)" This reverts commit `2579a72a88`.	2020-04-21 18:45:20 +00:00
Edward Chen	d50c3e7a71	Fix GraphTransformationTests tests.	2020-04-21 18:43:49 +00:00
Pranav Sharma	9636da3951	Threadpool related changes. (#3564) Threadpool related changes. Don't create ORT threadpool if openmp is enabled (except for inter op threadpool). Created a new static function ThreadPool::NumThreads to account for openmp settings and null threadpool ptr. Log a warning when using SetIntraOpNumThreads when openmp is enabled. Added a document for ORT devs. Fix LSTM to use the new threadpool abstractions. Rename GetNumCpuCores to GetThreadAffinityMasks and move it to the Env class. Co-authored-by: Tracy Sharpe <tracysh@microsoft.com>	2020-04-21 09:57:39 -07:00


2269 commits