Commit graph

2651 commits

Author SHA1 Message Date
Miguel de Icaza
ea368f69db Add Swift/macOS sample, a port of the Windows MNist sample 2020-06-05 21:16:41 -07:00
Yulong Wang
2e58097f8f
fix build: pipeline Node.js version to 12.16.3 (#4145) 2020-06-05 17:56:03 -07:00
Bowen Bao
1e5307d458
Bug fix for parameter names of models not using wrapper (#4061)
* bug fix for models not using wrapper

* add test case for no wrapper case

* update test case to use internal learning rate

* fix bug with frozen weight update
2020-06-05 12:03:38 -07:00
Scott McKay
9790e19424
Handle mem pattern allocation failure better. Make BFCArena behavior more consistent (#4062)
* Fixes from investigating an issue running the BERT-Squad model with larger batch sizes. When the batch size gets large enough, the initial run will be successful (no memory pattern in use) but the second will fail to allocate the memory pattern block.

The cause of this failure is that we still have the smaller blocks from the first run allocated, as BFCArena has no logic to free those. This essentially results in 2x the memory being required to run the model.

There was inconsistency in BFCArena::Extend which on one path threw an exception if it couldn't do the allocation, and on another just returned false (resulting in Alloc returning a nullptr). Make the behavior consistent by always throwing if BFCArena fails to find a buffer to return. There are a huge number of places in the code where we assume Alloc returns a valid pointer so throwing will result in more correct behavior as a whole. It's also consistent with what happens when CUDA or the standard library fails to allocate memory.

Next, update ExecutionFrame to check for this failure and not insert a memory block entry if it happens. With the existing code if BFCArena Alloc returned a nullptr we happily inserted that in the blocks, delaying detection of the failure to when we attempted to use the block in AllocateMLValueTensorSelfOwnBufferHelper.

Finally, update AllocateMLValueTensorSelfOwnBufferHelper to expect that a location may not have a block. A log message is provided when the block allocation fails, so it's not necessary to log again on each individual allocation that would have used the block; those allocations fall through to the default behavior of doing a normal allocation.
2020-06-05 18:54:01 +10:00
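The consistency fix described above can be sketched in toy form (a hypothetical `ArenaSketch` class, not ONNX Runtime's actual BFCArena API): every allocation-failure path raises, instead of one path throwing and another returning a null result.

```python
class ArenaSketch:
    """Toy bump allocator illustrating the consistency fix: every
    allocation-failure path raises, rather than one path throwing an
    exception and another quietly returning None."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0

    def alloc(self, size):
        if self.used + size > self.capacity:
            # Before the fix, one code path could hand back a null result
            # here, which callers rarely checked; now failure is always
            # signalled with an exception, matching what happens when the
            # standard library fails to allocate.
            raise MemoryError(
                f"arena exhausted: requested {size}, "
                f"free {self.capacity - self.used}")
        offset = self.used  # an offset standing in for a pointer
        self.used += size
        return offset
```

With this shape, callers that assume a valid pointer fail at the allocation site rather than later when the block is dereferenced, which mirrors the ExecutionFrame change in the same commit.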
Thiago Crepaldi
81101c9efd
Fix DropoutGrad op (#4052)
Dropout op was recently changed to accept a new input named
'training_mode', which is passed in to DropoutGrad automatically.

This PR updates the DropoutGrad schema to accommodate the new input.
Tests were also updated to reflect the API change.

Co-authored-by: Thiago Crepaldi <thiag.crepaldi@microsoft.com>
2020-06-04 15:00:02 -07:00
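The `training_mode` input mentioned above can be illustrated with a stdlib-only sketch (hypothetical helpers, not the actual ONNX Runtime kernels): Dropout is the identity when `training_mode` is false, and zeroes/rescales elements when it is true, producing the mask that the gradient op consumes — which is why DropoutGrad's schema needs the same flag.

```python
import random

def dropout(x, ratio=0.5, training_mode=False, seed=0):
    """Return (output, mask); identity with an all-True mask in inference."""
    if not training_mode or ratio == 0.0:
        return list(x), [True] * len(x)
    rng = random.Random(seed)
    mask = [rng.random() >= ratio for _ in x]
    scale = 1.0 / (1.0 - ratio)  # inverted-dropout rescaling
    return [v * scale if keep else 0.0 for v, keep in zip(x, mask)], mask

def dropout_grad(dy, mask, ratio=0.5, training_mode=False):
    """Gradient applies the same mask and scale, so it needs training_mode too."""
    if not training_mode or ratio == 0.0:
        return list(dy)
    scale = 1.0 / (1.0 - ratio)
    return [g * scale if keep else 0.0 for g, keep in zip(dy, mask)]
```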
Dmitri Smirnov
6199ef1375 Change group id to com.microsoft.onnxruntime per requirements. 2020-06-03 22:30:13 -07:00
Scott McKay
16cef90e29
General enhancements/cleanups to test exes (#4109)
* General enhancements/cleanups to test exes
  - Support running onnxruntime_perf_test with no output file
    - if you're profiling the output file is often unused and can be very large
  - Allow failure to override early success when doing multiple runs of a test using onnx_test_runner
    - e.g. if the second run fails that's more important as a final status
  - Clarify ownership semantics
  - Cleanup naming, line lengths, usage of references for required parameters etc.
2020-06-04 07:01:39 +10:00
Yufeng Li
197da135eb
Implement quantized Attention on cpu (#4111)
* Implement QAttention on CPU
* support QAttention in quantization tool
* refine attention code
* add more unit tests
2020-06-03 13:42:00 -07:00
Andrews548
62b44527e5
Add ArmNN Execution Provider (#3714)
* Add ArmNN Execution Provider

Add a new execution provider targeting Arm architecture based on ArmNN.
Validated on NXP i.MX8QM CPU with ResNet50, MobileNetv2 and VGG models.

reviewed-by: mike.caraman@nxp.com

* Minor fixes

- renamed onnxruntime_ARMNN_RELU_USECPU to onnxruntime_ARMNN_RELU_USE_CPU
- fixed acl typo

* remove extra includes. added exception for ArmNN in test

* fix indentation

* Separated the activation implementation from the CPU one and fixed the blockage from the endif

Co-authored-by: Andrei-Alexandru <andrei-alexandru.avram@nxp.com>
2020-06-03 22:57:51 +05:30
Scott McKay
62af8da3f6 Use OrtMutex and OrtCondVar everywhere instead of std::mutex/std::condition_variable for consistency.
Needed to change the MissingTrack enum naming due to ort_mutex.h including Windows.h which #defines TRUE and FALSE (via inclusion of fdi_fci_types.h), breaking usage of MissingTrack::TRUE and MissingTrack::FALSE.
2020-06-03 08:42:16 -07:00
liqunfu
905c535626
Still need to make the test stable. Lower the accuracy number a bit to make the test pass for now (#4117)
Co-authored-by: liqun fu <liqun@OrtTrainingDev1.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-06-02 21:37:48 -07:00
KeDengMS
d63b90538e
Symbolic shape inference exit on models without onnx opset used (#4090)
* Symbolic shape inference exit on models without onnx opset used

* Temporary fix for ConvTranspose with symbolic input dims

Co-authored-by: Changming Sun <me@sunchangming.com>
2020-06-02 19:39:46 -07:00
KeDengMS
6f8a4f4cad Fix Nuphar test failure 2020-06-02 18:03:38 -07:00
KeDengMS
32d8a76f2f Fix Nuphar build in gcc 7 (Ubuntu 18.04) 2020-06-02 18:03:38 -07:00
ashbhandare
f18a99b245
Exclude non-trainable torch buffers from trainable weights (#4099)
* Initial changes

* Removed redundant fix

* Revert unintended formatting change.

* Add unit test
2020-06-02 14:05:44 -07:00
Faith Xu
e5cec7237d
Clarify telemetry collection (#4102) 2020-06-02 13:12:27 -07:00
S. Manohar Karlapalem
baa0697982
[OpenVINO-EP] Add missing dependency libs in Dockerfile (#4064)
* Fixed libjson-c_dev_fix and Updated Readme

* Fix VAD-M naming inconsistency in docs

* Avoid removal of sudo in install_common_deps

* Remove 'sudo' for wget in install_common_deps.sh for dockerfiles

'sudo' is not required, and hinders running script from within
proxy environments. Removing it also makes lines consistent with
each other (there are other wget lines without sudo).

Co-authored-by: gundaarx <mayax.vijayan@intel.com>
2020-06-02 02:42:58 -07:00
Yulong Wang
647a886587
[Nodejs binding] create a new pipeline to generate signed binaries (#4104)
* add yml files

* update pipeline

* fix yaml syntax

* yaml pop BuildCSharp

* update yaml

* do not stage codesign summary
2020-06-02 01:28:05 -07:00
Tracy Sharpe
3f7b97a63d
MLAS: more code cleanup (#4101)
Cleanup vector intrinsics, optimized SSE quantized GEMM.
2020-06-01 21:19:42 -07:00
Changming Sun
08e5f89b37
Fix the nuget gpu pipeline (#4106) 2020-06-01 20:42:15 -07:00
Dmitri Smirnov
afca0d15ee
Create Java publishing pipeline (#3944)
Create CPU and GPU Java publishing pipelines. Final jars are tested on all platforms. However, signing and publishing to Maven are manual steps.
2020-06-01 18:18:57 -07:00
Dwayne Robinson
51d78bc5e6
Fix DML EP doc link to C API (#4105)
Path used "\" instead of "/".
2020-06-01 16:49:17 -07:00
Pranav Sharma
6c1b2f33b7
Fix crash reported in #4070. (#4091)
* Fix crash reported in #4070.

* Add newline to warning message

* Add comment for using cout instead of the logger
2020-06-01 15:27:14 -07:00
Cecilia Liu
8813d205cc
Update GPT2 Model Benchmark Script to Support IO Binding (#4088)
GPT2 benchmark: support IO binding
2020-06-01 15:07:48 -07:00
edgchen1
ba74914c5a
Remove evaluation output from training e2e test baseline data. (#4092) 2020-06-01 15:06:21 -07:00
Changming Sun
3eaec57c38
Fix the daily pipeline failures (#4084)
1. Fix the nuget cpu pipeline and put code coverage pipeline back.
2. Reduce onnx_test_runner's default logging level from WARNING to ERROR, because there are too many log messages now.
3. Enlarge the protobuf read buffer size for onnx_test_runner. It was missed from PR #4020.
2020-06-01 14:44:49 -07:00
Derek Murray
f54518bae9
Actually switch the spdlog submodule to the master branch. (#4100)
This is a follow-up to #4087, which did not fix the whole problem.

Fixes #4077.

Co-authored-by: Derek Murray <demurra@microsoft.com>
2020-06-01 14:32:16 -07:00
ytaous
72d508b7a0
New perf metric - e2e throughput (#4085)
* new metric

* on comments

* tab to spaces

Co-authored-by: Ethan Tao <ettao@microsoft.com>
2020-06-01 12:11:34 -07:00
Ashwini Khade
70d91a8550
re-enable graph optimizations during build phase (#4044)
* re-enable graph optimizations during build phase

* fix

* re-enable optimizers for all provider tests
2020-06-01 10:32:42 -07:00
Changming Sun
ff16ca54e1
Fix the flake8 warning in generate_nuspec_for_native_nuget.py (#4089) 2020-06-01 10:32:22 -07:00
edgchen1
a715d55bcc
Training Python package fixes (#4063)
- Add support for ENABLE_LANGUAGE_INTEROP_OPS in training build which is enabled for nightly builds
- Fix passing of environment variables to `sudo docker run` in build definitions
- Fix setup.py package naming logic
2020-06-01 09:30:56 -07:00
Derek Murray
9d748afff1
Set spdlog submodule branch to "master" explicitly. (#4087)
The default branch for the spdlog repository on GitHub recently changed from "master"
to "v1.x", which has a different API for `syslog_sink::syslog_sink()`. This breaks
builds of the server for anyone who has checked out the submodules since that change.

Fixes #4077.

Co-authored-by: Derek Murray <demurra@microsoft.com>
2020-05-29 17:53:40 -07:00
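The submodule fix above amounts to pinning an explicit `branch` in `.gitmodules`. A hedged sketch of what such an entry looks like (the submodule path here is an assumption, not copied from the repository):

```ini
[submodule "cmake/external/spdlog"]
	path = cmake/external/spdlog
	url = https://github.com/gabime/spdlog.git
	branch = master
```

Without an explicit `branch` setting, `git submodule update --remote` tracks the remote's default branch, which is why the upstream default changing from "master" to "v1.x" broke fresh checkouts.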
Scott McKay
1d441f89ac
Re-enable PEP8 check in Win CI build (#4075)
* Add flake8 to Win CI build so it's re-enabled. It was in the static analysis build, which is currently disabled, so the checks were not running.
Fix build.py to be compliant again.
Add prefix to flake8 output so it's (hopefully) easier to identify the errors in build output.

* Add to all builds in Windows CPU CI so they all fail quickly if there's an issue.
2020-05-30 09:10:05 +10:00
Scott McKay
b85805ed01
Handle edge case with implicit input and multiple levels of subgraphs (#4031)
* Handle edge case where an implicit input for a subgraph may not get wired in correctly.

Conditions required:
  - two or more levels of nested subgraph
  - an implicit input from above the bottom two levels is used in both levels of subgraph
    - this creates a NodeArg for the implicit input at both levels
  - something changes to the first level subgraph to no longer use the implicit input
    - could be constant folding, or partitioning of nodes resulting in a copy of the implicit input being made to a different device

When that occurs we lose the wiring through to the second level of nested subgraph as there's a NodeArg in the first level but the implicit input is no longer used there. Fix that by doing a final check for outer scope values once we know all the outputs produced by the current graph.

Found by commenting out the CUDA implementations of the control flow nodes and running ssd_mobilenet_300 from the mlperf models.

* Add test case.
2020-05-30 07:08:21 +10:00
Sheil Kumar
c331d8cffc
WinML custom operator header is missing from nuget package. (#4083)
* publish mloperatorauthor.h in the nuget

* build dmlep into arm/arm64 builds

* update to not use --use_dml everywhere, but enable custom ops everywhere

* always download directml nuget in winml builds

* always build with dml

* don't build dml for arm

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2020-05-29 13:24:22 -07:00
Linnea May
6c7eaff676
fixed typo in readme (#4076) 2020-05-29 12:39:28 -07:00
Tixxx
6404aba5ae
Orttraining rc1 master merge (#4080)
* fixed seg fault when using concrete shape
disable gradient as output

* fix evaluation hang issue for multiple gpu run

* Remove dead code, ORTModel and improve docstrings (#3814)

* Refine ORTTrainer docstring descriptions (#3907)
2020-05-29 12:28:12 -07:00
Wei-Sheng Chin
e951b29a0b
Fix a macro and memory regression (#4068)
onnxruntime_training_bert can be run with the following command again.

./onnxruntime_training_bert --model_name=bert-large-uncased_L_24_H_1024_A_16_V_30528_S_512_Dp_0.1_optimized_layer_norm --num_train_steps=16 --train_batch_size=52 --mode=train --train_data_dir=/bert_data/128/books_wiki_en_corpus/train --test_data_dir=/bert_data/128/books_wiki_en_corpus/test --gradient_accumulation_steps=16 --optimizer=Lamb --learning_rate=3e-3 --max_seq_length=128 --max_predictions_per_seq=20 --warmup_ratio=0.2843 --warmup_mode=Poly --display_loss_steps=100  --use_mixed_precision=True --allreduce_in_fp16 --use_nccl
2020-05-29 09:24:40 -07:00
edgchen1
38d76cc904
Clean up training E2E test (#4078)
Update training E2E build to not go through CTest and call test scripts directly.
2020-05-29 09:20:47 -07:00
Prabhat
dd43623da2
Remove ONNX from requirements.txt (#4073)
* Avoid installing ONNX package on aarch64

* Removed onnx from requirements

* Add note in backend.py
2020-05-29 21:44:20 +05:30
KeDengMS
348ed698ec
Add more symbolic compute support in symbolic shape inference (#4057)
* Add more symbolic compute support in symbolic shape inference

* Refinements
2020-05-29 02:00:30 -07:00
Scott McKay
2a96be83f6
skottmckay/bugfix/SubgraphInput (#4004)
Description:
Fix 2 edge cases as described here: #3755 (comment)

Create a NodeArg for subgraph inputs even if they have no type. If they are only used as an implicit input to another level of nested subgraph, we will not create a NodeArg via any other path.

Allow an If output to have no shape. Obscure edge case where a loop carried dependency to a Loop node is passed through a nested If node subgraph (i.e. the Loop subgraph contains an If node with a nested subgraph for the else_branch/then_branch). We can't infer a shape for a loop carried dependency (they may change across iterations), which means we can't infer a shape for the nested If subgraph output either. We have delayed allocation support for If outputs so use that.

Motivation and Context
#3755
2020-05-29 14:48:07 +10:00
Hariharan Seshadri
c55634d2e6
Fix initial value of loop variable in RNN op (#4055) 2020-05-28 19:19:39 -07:00
pengwa
6d03470587
Add e2e measurement for training (#4049)
* add e2e measurement
2020-05-29 10:08:29 +08:00
Yufeng Li
26be762b35
Make CPU QuantizeLinear support optional zero point (#4065)
* Disable DequantizeLinear_Without_Zero_Point test for nGraph
* make quantizelinear support optional zero point
2020-05-28 14:33:26 -07:00
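An omitted zero point behaves as if it were 0. A minimal stdlib sketch of uint8 linear quantization with an optional zero point (a hypothetical helper, not the actual kernel):

```python
def quantize_linear(x, scale, zero_point=None):
    """Quantize floats to uint8: q = saturate(round(v / scale) + zero_point).

    A missing zero_point is treated as 0, matching the optional input."""
    zp = 0 if zero_point is None else zero_point
    return [max(0, min(255, round(v / scale) + zp)) for v in x]
```

For example, `quantize_linear([0.0, 1.0], 0.01)` produces the same result as passing an explicit zero point of 0, and values outside the representable range saturate to 0 or 255.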
Tianlei Wu
60fa4b1f90
Update benchmark of gpt2 model with past state (#4043)
* update benchmark_gpt2 to use past state only
* update dynamic axes of input/output tensors
* Remove --use_openmp option since it is default for onnxruntime 1.3 cpu.
* Use same option names as benchmark.py
2020-05-28 13:55:43 -07:00
Ryan Lai
ed0a8e5b5c
Enable disabled tests and add fixed model (#4059)
Co-authored-by: Ryan Lai <ryalai96@gamil.com>
2020-05-28 13:24:12 -07:00
Brian Martin
279f9aa865
Update WinRT_API.md to reflect 1.3 release (#4074)
fix broken link, add new release to the release table, and point to the 1.3 nuget package
2020-05-28 11:01:49 -07:00
Changming Sun
c94d9685b6
Fix a problem in StacktraceTests::BasicTests (#4069)
result.size() could be zero; in that case, we shouldn't access result[0].
2020-05-28 10:06:16 -07:00
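The pattern behind the fix above — check for emptiness before indexing the first element — in a minimal sketch (a hypothetical `first_frame` helper, not the actual test code):

```python
def first_frame(frames):
    """Return the top captured stack frame, or None when capture yielded nothing."""
    if not frames:  # the guard that was missing before accessing result[0]
        return None
    return frames[0]
```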
Changming Sun
a859dc422c
Delete google::protobuf::io::FileInputStream class from our source code (#4067)
This class is already part of the protobuf-lite library. We don't need a copy here.
And if we do, we must ensure the signature of every function is exactly the same as the original. However, the upstream code may change over time. For example, protobuf recently added a "const" modifier to FileInputStream::GetErrno(), which may break the build if a user wants to use the latest protobuf.
2020-05-28 10:05:47 -07:00