onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-04 04:07:22 +00:00

Author	SHA1	Message	Date
Tracy Sharpe	3f7b97a63d	MLAS: more code cleanup (#4101 ) Cleanup vector intrinsics, optimized SSE quantized GEMM.	2020-06-01 21:19:42 -07:00
Changming Sun	08e5f89b37	Fix the nuget gpu pipeline (#4106 )	2020-06-01 20:42:15 -07:00
Dmitri Smirnov	afca0d15ee	Create Java publishing pipeline (#3944 ) Create CPU and GPU Java publishing pipelines. Final jars are tested on all platforms. However, signing and publishing to maven are manual steps.	2020-06-01 18:18:57 -07:00
Dwayne Robinson	51d78bc5e6	Fix DML EP doc link to C API (#4105 ) Path used "\" instead of "/".	2020-06-01 16:49:17 -07:00
Pranav Sharma	6c1b2f33b7	Fix crash reported in #4070 . (#4091 ) * Fix crash reported in #4070. * Add newline to warning message * Add comment for using cout instead of the logger	2020-06-01 15:27:14 -07:00
Cecilia Liu	8813d205cc	Update GPT2 Model Benchmark Script to Support IO Binding (#4088 ) GPT2 benchmark support io binding	2020-06-01 15:07:48 -07:00
edgchen1	ba74914c5a	Remove evaluation output from training e2e test baseline data. (#4092 )	2020-06-01 15:06:21 -07:00
Changming Sun	3eaec57c38	Fix the daily pipeline failures (#4084 ) 1. Fix the nuget cpu pipeline and put code coverage pipeline back. 2. Reduce onnx_test_runner's default logging level from WARNING to ERROR. Because there are too many log messages now. 3. Enlarge the protobuf read buffer size for onnx_test_runner. It was missed from PR #4020.	2020-06-01 14:44:49 -07:00
Derek Murray	f54518bae9	Actually switch the spdlog submodule to the master branch. (#4100 ) This is a follow-up to #4087, which did not fix the whole problem. Fixes #4077. Co-authored-by: Derek Murray <demurra@microsoft.com>	2020-06-01 14:32:16 -07:00
ytaous	72d508b7a0	New perf metric - e2e throughput (#4085 ) * new metric * on comments * tab to spaces Co-authored-by: Ethan Tao <ettao@microsoft.com>	2020-06-01 12:11:34 -07:00
Ashwini Khade	70d91a8550	re-enable graph optimizations during build phase (#4044 ) * re-enable graph optimizations during build phase * fix * re-enable optimizers for all provider tests	2020-06-01 10:32:42 -07:00
Changming Sun	ff16ca54e1	Fix the flake8 warning in generate_nuspec_for_native_nuget.py (#4089 )	2020-06-01 10:32:22 -07:00
edgchen1	a715d55bcc	Training Python package fixes (#4063 ) - Add support for ENABLE_LANGUAGE_INTEROP_OPS in training build which is enabled for nightly builds - Fix passing of environment variables to `sudo docker run` in build definitions - Fix setup.py package naming logic	2020-06-01 09:30:56 -07:00
Derek Murray	9d748afff1	Set spdlog submodule branch to "master" explicitly. (#4087 ) The default branch for the spdlog repository on GitHub recently changed from "master" to "v1.x", which has a different API for `syslog_sink::syslog_sink()`. This breaks builds of the server for anyone who has checked out the submodules since that change. Fixes #4077. Co-authored-by: Derek Murray <demurra@microsoft.com>	2020-05-29 17:53:40 -07:00
Scott McKay	1d441f89ac	Re-enable PEP8 check in Win CI build (#4075 ) * Add flake8 to Win CI build so it's re-enabled. It was in the static analysis build that is currently disabled so checks are not running. Fix build.py to be compliant again. Add prefix to flake8 output so it's (hopefully) easier to identify the errors in build output. * Add to all builds in Windows CPU CI so they all fail quickly if there's an issue.	2020-05-30 09:10:05 +10:00
Scott McKay	b85805ed01	Handle edge case with implicit input and multiple levels of subgraphs (#4031 ) * Handle edge case where an implicit input for a subgraph may not get wired in correctly. Conditions required: - two or more levels of nested subgraph - an implicit input from above the bottom two levels is used in both levels of subgraph - this creates a NodeArg for the implicit input at both levels - something changes to the first level subgraph to no longer use the implicit input - could be constant folding, could be partitioning of nodes results in a copy of the implicit input being made to a different device When that occurs we lose the wiring through to the second level of nested subgraph as there's a NodeArg in the first level but the implicit input is no longer used there. Fix that by doing a final check for outer scope values once we know all the outputs produced by the current graph. Found by commenting out the CUDA implementations of the control flow nodes and running ssd_mobilenet_300 from the mlperf models. * Add test case.	2020-05-30 07:08:21 +10:00
Sheil Kumar	c331d8cffc	WinML custom operator header is missing from nuget package. (#4083 ) * publish mloperatorauthor.h in the nuget * build dmlep into arm/arm64 builds * update to not use --use_dml everywhere, but enable custom ops everywhere * always download directml nuget in winml builds * always build with dml * dont build dml for arm Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2020-05-29 13:24:22 -07:00
Linnea May	6c7eaff676	fixed typo in readme (#4076 )	2020-05-29 12:39:28 -07:00
Tixxx	6404aba5ae	Orttraining rc1 master merge (#4080 ) * fixed seg fault when using concrete shape disable gradient as output * fix evaluation hang issue for multiple gpu run * Remove dead code, ORTModel and improve docstrings (#3814) * Refine ORTTrainer docstring descriptions (#3907)	2020-05-29 12:28:12 -07:00
Wei-Sheng Chin	e951b29a0b	Fix a macro and memory regression (#4068 ) onnxruntime_training_bert can run the following command again. ./onnxruntime_training_bert --model_name=bert-large-uncased_L_24_H_1024_A_16_V_30528_S_512_Dp_0.1_optimized_layer_norm --num_train_steps=16 --train_batch_size=52 --mode=train --train_data_dir=/bert_data/128/books_wiki_en_corpus/train --test_data_dir=/bert_data/128/books_wiki_en_corpus/test --gradient_accumulation_steps=16 --optimizer=Lamb --learning_rate=3e-3 --max_seq_length=128 --max_predictions_per_seq=20 --warmup_ratio=0.2843 --warmup_mode=Poly --display_loss_steps=100 --use_mixed_precision=True --allreduce_in_fp16 --use_nccl	2020-05-29 09:24:40 -07:00
edgchen1	38d76cc904	Clean up training E2E test (#4078 ) Update training E2E build to not go through CTest and call test scripts directly.	2020-05-29 09:20:47 -07:00
Prabhat	dd43623da2	Remove ONNX from requirements.txt (#4073 ) * Avoid installing ONNX package on aarch64 * Removed onnx from requirements * Add note in backend.py	2020-05-29 21:44:20 +05:30
KeDengMS	348ed698ec	Add more symbolic compute support in symbolic shape inference (#4057 ) * Add more symbolic compute support in symbolic shape inference * Refinements	2020-05-29 02:00:30 -07:00
Scott McKay	2a96be83f6	skottmckay/bugfix/SubgraphInput (#4004 ) Description: Fix 2 edge cases as described here: #3755 (comment) Create a NodeArg for subgraph inputs even if they have no type. If they are only used as an implicit input to another level of nested subgraph we will not create a NodeArg via any other path Allow an If output to have no shape. Obscure edge case where a loop carried dependency to a Loop node is passed through a nested If node subgraph (i.e. the Loop subgraph contains an If node with a nested subgraph for the else_branch/then_branch). We can't infer a shape for a loop carried dependency (they may change across iterations), which means we can't infer a shape for the nested If subgraph output either. We have delayed allocation support for If outputs so use that. Motivation and Context #3755	2020-05-29 14:48:07 +10:00
Hariharan Seshadri	c55634d2e6	Fix initial value of loop variable in RNN op (#4055 )	2020-05-28 19:19:39 -07:00
pengwa	6d03470587	Add e2e measurement for training (#4049 ) * add e2e measurement	2020-05-29 10:08:29 +08:00
Yufeng Li	26be762b35	Make CPU QuantizeLinear support optional zero point (#4065 ) * Disable DequantizeLinear_Without_Zero_Point test for nGraph * make quantizelinear support optional zero point	2020-05-28 14:33:26 -07:00
Tianlei Wu	60fa4b1f90	Update benchmark of gpt2 model with past state (#4043 ) * update benchmark_gpt2 to use past state only * update dynamic axes of input/output tensors * Remove --use_openmp option since it is default for onnxruntime 1.3 cpu. * Use same option names as benchmark.py	2020-05-28 13:55:43 -07:00
Ryan Lai	ed0a8e5b5c	Enable disabled tests and add fixed model (#4059 ) Co-authored-by: Ryan Lai <ryalai96@gamil.com>	2020-05-28 13:24:12 -07:00
Brian Martin	279f9aa865	Update WinRT_API.md to reflect 1.3 release (#4074 ) fix broken link, add new release to the release table, and point to the 1.3 nuget package	2020-05-28 11:01:49 -07:00
Changming Sun	c94d9685b6	Fix a problem in StacktraceTests::BasicTests (#4069 ) result.size() could be zero, in this case, we shouldn't access result[0]	2020-05-28 10:06:16 -07:00
Changming Sun	a859dc422c	Delete google::protobuf::io::FileInputStream class from our source code (#4067 ) This class is already part of the protobuf-lite library. We don't need a copy here. And if we do, we must ensure the signature of every function is exactly the same as the original. However, the upstream code may get changed over time. For example, recently protobuf added a "const" modifier to the FileInputStream::GetErrno(), which may break the build if a user wants to use the latest protobuf.	2020-05-28 10:05:47 -07:00
Faith Xu	1e82ecfd5c	Fix link in readme (#4058 )	2020-05-28 06:57:58 -07:00
Tianlei Wu	7f750b65ce	support model > 2GB in transformer optimizer (#4038 ) * Enable optimizer on models with external data (>2GB) * Refactoring optimizer: move fusion to separate file * Update benchmark: (1) output datetime to csv (2) Add option --onnx_dir to benchmark.py for onnx model directory path (3) add gpt2-large (4) loosen thresholds for fp16 validation * update optimizer (1) Add attribute of ConstantOfShape in fp16 conversion (2) Use OnnxRuntime level 1 optimization * update bert_perf_test.py: rename --input_ids to --input_ids_name	2020-05-28 01:16:41 -07:00
edgchen1	9f7d245446	Add noexcept to various OrtCallback utility class methods to fix warnings. (#4056 )	2020-05-27 18:03:58 -07:00
Yufeng Li	23c313cb73	fix crash in dequantizelinear/quantizelinear for optional zero point (#4047 ) fix the issue #4032 and #3802 in OnnxRuntime side. For the quantizeLinear, there also needs a fix in ONNX type inference. Will do that in ONNX repo.	2020-05-27 17:11:55 -07:00
liqunfu	6665d5e2bc	Liqun/a transformer example (#3845 ) Add transformer glue test example to show how to use ORTTrainer to fine-tune a transformer model Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-05-27 15:21:35 -07:00
Matthieu Darbois	a983509ed3	Pad: Add support for all datatypes in opset-11 spec (#4021 ) * Pad: Add support for all datatypes in opset-11 spec Pad opset-11 implementation supports: int32, int64, float & double Per specification, Pad opset-11 also supports: uint8, uint16, uint32, uint64, int8, int16 & float16 This commit add support for those types to get full coverage of Pad opset-11 operator. * Pad: Remove 16-bit datatypes support These types are unused at the moment and binary size is impacted. Remove support for those type to lower binary size.	2020-05-28 08:05:13 +10:00
Tianlei Wu	930c6a59da	Allow the cast in embed layer norm to be optional. (#4040 )	2020-05-27 14:55:03 -07:00
Yulong Wang	b3ec8035ee	[Node.js binding] add build flag for node.js binding (#3948 )	2020-05-27 13:30:22 -07:00
edgchen1	ee6371d0a8	Clean up CUDAExecutionProvider's associated PerThreadContexts on destruction (#4017 ) Clean up a CUDAExecutionProvider's associated PerThreadContext instances when that CUDAExecutionProvider is destroyed. Revert workaround (introduced in #3767) to lazily initialize CUDA handles to avoid segmentation fault. For that case, the CUDA handle cleanup was happening quite a bit later than the CUDAExecutionProvider destructor. This should be a cleaner way to fix that.	2020-05-27 11:01:43 -07:00
Xueyun Zhu	633008b5ef	Add pipeline online partition logic for pipeline (#3996 ) * online partition * fix when multiple consumer nodes is in cut info * fix windows build * address feedback * adding test * feedback * address feedback * add parser for cut edge * windows build	2020-05-26 17:44:09 -07:00
Tracy Sharpe	0d8abc1a99	MLAS: qgemm refactoring (#4030 ) Treat U8U8 as U8S8 for VNNI for performance and optimize SSE2 kernel.	2020-05-26 17:27:32 -07:00
Tianlei Wu	abcd1576c9	Add Linux bash and Windows batch scripts for running transformers benchmarks (#3997 )	2020-05-26 16:42:12 -07:00
Cecilia Liu	212efb6cde	Match New Pattern for Reshape Fusion (#3931 ) Fuse reshape subgraph.	2020-05-26 14:10:42 -07:00
Paul Fultz II	7759136610	Add amd migraphx execution provider to onnx runtime (#2929 ) * Add amd migraphx execution provider to onnx runtime * rename MiGraphX to MIGraphX * remove unnecessary changes in migraphx_execution_provider.cc * add migraphx EP to tests * add input requests of the batchnorm operator * add to support an onnx operator PRelu * update migraphx dockerfile and removed one unused line * sync submodules with master branch * fixed a small bug * fix various bugs to run msft real models correctly * some code cleanup * fix python file format * fixed a code style issue * add default provider for migraphx execution provider Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>	2020-05-27 04:24:59 +08:00
Vincent Wang	9d0534c0eb	Optimize OneHot CUDA Kernel (#4012 ) * Optimize for OneHot with zero off value. * Add test cases for indices out of range. Co-authored-by: Vincent Wang <weicwang@microsoft.com> Co-authored-by: Vincent Wang <weicwang@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-05-26 18:12:11 +08:00
Changming Sun	0a6d9dd301	Remove Openmp from the GPU docker files	2020-05-25 14:17:48 -07:00
Changming Sun	30efe65e95	Add use_openmp back to the docker files	2020-05-25 14:17:48 -07:00
Wenhao Hu	bd8993cb15	remove --use_openmp in build.sh	2020-05-25 14:17:48 -07:00

2633 commits