onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-05 04:17:53 +00:00

Author	SHA1	Message	Date
Sheil Kumar	c331d8cffc	WinML custom operator header is missing from nuget package. (#4083 ) * publish mloperatorauthor.h in the nuget * build dmlep into arm/arm64 builds * update to not use --use_dml everywhere, but enable custom ops everywhere * always download directml nuget in winml builds * always build with dml * dont build dml for arm Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2020-05-29 13:24:22 -07:00
Linnea May	6c7eaff676	fixed typo in readme (#4076 )	2020-05-29 12:39:28 -07:00
Tixxx	6404aba5ae	Orttraining rc1 master merge (#4080 ) * fixed seg fault when using concrete shape disable gradient as output * fix evaluation hang issue for multiple gpu run * Remove dead code, ORTModel and improve docstrings (#3814) * Refine ORTTrainer docstring descriptions (#3907)	2020-05-29 12:28:12 -07:00
Wei-Sheng Chin	e951b29a0b	Fix a macro and memory regression (#4068 ) onnxruntime_training_bert can run the following command again. ./onnxruntime_training_bert --model_name=bert-large-uncased_L_24_H_1024_A_16_V_30528_S_512_Dp_0.1_optimized_layer_norm --num_train_steps=16 --train_batch_size=52 --mode=train --train_data_dir=/bert_data/128/books_wiki_en_corpus/train --test_data_dir=/bert_data/128/books_wiki_en_corpus/test --gradient_accumulation_steps=16 --optimizer=Lamb --learning_rate=3e-3 --max_seq_length=128 --max_predictions_per_seq=20 --warmup_ratio=0.2843 --warmup_mode=Poly --display_loss_steps=100 --use_mixed_precision=True --allreduce_in_fp16 --use_nccl	2020-05-29 09:24:40 -07:00
edgchen1	38d76cc904	Clean up training E2E test (#4078 ) Update training E2E build to not go through CTest and call test scripts directly.	2020-05-29 09:20:47 -07:00
Prabhat	dd43623da2	Remove ONNX from requirements.txt (#4073 ) * Avoid installing ONNX package on aarch64 * Removed onnx from requirements * Add note in backend.py	2020-05-29 21:44:20 +05:30
KeDengMS	348ed698ec	Add more symbolic compute support in symbolic shape inference (#4057 ) * Add more symbolic compute support in symbolic shape inference * Refinements	2020-05-29 02:00:30 -07:00
Scott McKay	2a96be83f6	skottmckay/bugfix/SubgraphInput (#4004 ) Description: Fix 2 edge cases as described here: #3755 (comment) Create a NodeArg for subgraph inputs even if they have no type. If they are only used as an implicit input to another level of nested subgraph we will not create a NodeArg via any other path Allow an If output to have no shape. Obscure edge case where a loop carried dependency to a Loop node is passed through a nested If node subgraph (i.e. the Loop subgraph contains an If node with a nested subgraph for the else_branch/then_branch). We can't infer a shape for a loop carried dependency (they may change across iterations), which means we can't infer a shape for the nested If subgraph output either. We have delayed allocation support for If outputs so use that. Motivation and Context #3755	2020-05-29 14:48:07 +10:00
Hariharan Seshadri	c55634d2e6	Fix initial value of loop variable in RNN op (#4055 )	2020-05-28 19:19:39 -07:00
pengwa	6d03470587	Add e2e measurement for training (#4049 ) * add e2e measurement	2020-05-29 10:08:29 +08:00
Yufeng Li	26be762b35	Make CPU QuantizeLinear support optional zero point (#4065 ) * Disable DequantizeLinear_Without_Zero_Point test for nGraph * make quantizelinear support optional zero point	2020-05-28 14:33:26 -07:00
Tianlei Wu	60fa4b1f90	Update benchmark of gpt2 model with past state (#4043 ) * update benchmark_gpt2 to use past state only * update dynamic axes of input/output tensors * Remove --use_openmp option since it is default for onnxruntime 1.3 cpu. * Use same option names as benchmark.py	2020-05-28 13:55:43 -07:00
Ryan Lai	ed0a8e5b5c	Enable disabled tests and add fixed model (#4059 ) Co-authored-by: Ryan Lai <ryalai96@gamil.com>	2020-05-28 13:24:12 -07:00
Brian Martin	279f9aa865	Update WinRT_API.md to reflect 1.3 release (#4074 ) fix broken link, add new release to the release table, and point to the 1.3 nuget package	2020-05-28 11:01:49 -07:00
Changming Sun	c94d9685b6	Fix a problem in StacktraceTests::BasicTests (#4069 ) result.size() could be zero, in this case, we shouldn't access result[0]	2020-05-28 10:06:16 -07:00
Changming Sun	a859dc422c	Delete google::protobuf::io::FileInputStream class from our source code (#4067 ) This class is already part of the protobuf-lite library. We don't need a copy here. And if we do, we must ensure the signature of every function is exactly the same as the original. However, the upstream code may get changed over time. For example, protobuf recently added a "const" modifier to FileInputStream::GetErrno(), which may break the build if a user wants to use the latest protobuf.	2020-05-28 10:05:47 -07:00
Faith Xu	1e82ecfd5c	Fix link in readme (#4058 )	2020-05-28 06:57:58 -07:00
Tianlei Wu	7f750b65ce	support model > 2GB in transformer optimizer (#4038 ) * Enable optimizer on models with external data (>2GB) * Refactoring optimizer: move fusion to separate file * Update benchmark: (1) output datetime to csv (2) Add option --onnx_dir to benchmark.py for onnx model directory path (3) add gpt2-large (4) loosen thresholds for fp16 validation * update optimizer (1) Add attribute of ConstantOfShape in fp16 conversion (2) Use OnnxRuntime level 1 optimization * update bert_perf_test.py: rename --input_ids to --input_ids_name	2020-05-28 01:16:41 -07:00
edgchen1	9f7d245446	Add noexcept to various OrtCallback utility class methods to fix warnings. (#4056 )	2020-05-27 18:03:58 -07:00
Yufeng Li	23c313cb73	fix crash in dequantizelinear/quantizelinear for optional zero point (#4047 ) Fix issues #4032 and #3802 on the OnnxRuntime side. QuantizeLinear also needs a fix in ONNX type inference; that will be done in the ONNX repo.	2020-05-27 17:11:55 -07:00
liqunfu	6665d5e2bc	Liqun/a transformer example (#3845 ) Add transformer glue test example to show how to use ORTTrainer to fine-tune a transformer model Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-05-27 15:21:35 -07:00
Matthieu Darbois	a983509ed3	Pad: Add support for all datatypes in opset-11 spec (#4021 ) * Pad: Add support for all datatypes in opset-11 spec Pad opset-11 implementation supports: int32, int64, float & double. Per specification, Pad opset-11 also supports: uint8, uint16, uint32, uint64, int8, int16 & float16. This commit adds support for those types to get full coverage of the Pad opset-11 operator. * Pad: Remove 16-bit datatypes support These types are unused at the moment and binary size is impacted. Remove support for those types to lower binary size.	2020-05-28 08:05:13 +10:00
Tianlei Wu	930c6a59da	Allow the cast in embed layer norm to be optional. (#4040 )	2020-05-27 14:55:03 -07:00
Yulong Wang	b3ec8035ee	[Node.js binding] add build flag for node.js binding (#3948 )	2020-05-27 13:30:22 -07:00
edgchen1	ee6371d0a8	Clean up CUDAExecutionProvider's associated PerThreadContexts on destruction (#4017 ) Clean up a CUDAExecutionProvider's associated PerThreadContext instances when that CUDAExecutionProvider is destroyed. Revert workaround (introduced in #3767) to lazily initialize CUDA handles to avoid segmentation fault. For that case, the CUDA handle cleanup was happening quite a bit later than the CUDAExecutionProvider destructor. This should be a cleaner way to fix that.	2020-05-27 11:01:43 -07:00
Xueyun Zhu	633008b5ef	Add pipeline online partition logic for pipeline (#3996 ) * online partition * fix when multiple consumer nodes is in cut info * fix windows build * address feedback * adding test * feedback * address feedback * add parser for cut edge * windows build	2020-05-26 17:44:09 -07:00
Tracy Sharpe	0d8abc1a99	MLAS: qgemm refactoring (#4030 ) Treat U8U8 as U8S8 for VNNI for performance and optimize SSE2 kernel.	2020-05-26 17:27:32 -07:00
Tianlei Wu	abcd1576c9	Add Linux bash and Windows batch scripts for running transformers benchmarks (#3997 )	2020-05-26 16:42:12 -07:00
Cecilia Liu	212efb6cde	Match New Pattern for Reshape Fusion (#3931 ) Fuse reshape subgraph.	2020-05-26 14:10:42 -07:00
Paul Fultz II	7759136610	Add amd migraphx execution provider to onnx runtime (#2929 ) * Add amd migraphx execution provider to onnx runtime * rename MiGraphX to MIGraphX * remove unnecessary changes in migraphx_execution_provider.cc * add migraphx EP to tests * add input requests of the batchnorm operator * add to support an onnx operator PRelu * update migraphx dockerfile and removed one unused line * sync submodules with master branch * fixed a small bug * fix various bugs to run msft real models correctly * some code cleanup * fix python file format * fixed a code style issue * add default provider for migraphx execution provider Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>	2020-05-27 04:24:59 +08:00
Vincent Wang	9d0534c0eb	Optimize OneHot CUDA Kernel (#4012 ) * Optimize for OneHot with zero off value. * Add test cases for indices out of range. Co-authored-by: Vincent Wang <weicwang@microsoft.com> Co-authored-by: Vincent Wang <weicwang@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-05-26 18:12:11 +08:00
Changming Sun	0a6d9dd301	Remove Openmp from the GPU docker files	2020-05-25 14:17:48 -07:00
Changming Sun	30efe65e95	Add use_openmp back to the docker files	2020-05-25 14:17:48 -07:00
Wenhao Hu	bd8993cb15	remove --use_openmp in build.sh	2020-05-25 14:17:48 -07:00
Tiago Koji Castro Shibata	faf65e960f	Refactor delayloading (#4019 ) * Refactor delayloading * Remove explicit linking to windowsapp.lib	2020-05-24 23:26:30 -07:00
Wei-Sheng Chin	24eda3df33	Create Utils for Adding Range and Marker (#4013 ) In this PR, we 1. create some APIs for creating NVTX objects 2. apply those APIs in pipeline-related operators and sequential executor. As a result, we can explicitly see how a pipeline schedule is run by GPUs in Nvidia's visual profiler. Note that these APIs are Linux only due to Nvidia's limited support.	2020-05-24 22:55:24 -07:00
Changming Sun	aafe988a11	Temporarily disable windows static analysis CI job	2020-05-24 16:31:09 -07:00
Changming Sun	7c83118364	Enlarge protobuf read buffer size	2020-05-24 16:31:09 -07:00
Ryan Hill	eb3aaa70d6	Fix compiler warning for openvino (#4010 )	2020-05-21 20:22:07 -07:00
Jeff Bloomfield	59af3ea278	Add missing D3D12 resource barriers and fences to Winml (#3941 ) * Add missing D3D12 resource barriers to Winml * Fix unsafe descriptor usage in Winml tensorization	2020-05-20 23:19:44 -07:00
Ori Levari	ce4d05862a	add bm_fish_720 to collateral for scenario 22 test (#3998 ) Co-authored-by: Ori Levari <orlevari@microsoft.com>	2020-05-20 23:19:27 -07:00
Ryan Lai	357bffe47c	Fix deprecated CentOS link for Linux CI pipeline (#4000 ) * Fix Linux_CI_GPU_Dev * centos6	2020-05-20 16:14:48 -07:00
Bowen Bao	0a5395bb78	Remove 'model_.' prefix from onnx model initializers in training (#3881 ) * Remove 'model_.' prefix for onnx model initializers in training * fix test case remove redundant device test * rename * Fix state_dict/load_state_dict with frozen_weight * nit * Add monkey patch for pt opset 10 * remove pt patch in CI * nit: newline	2020-05-20 10:06:31 -07:00
Prabhat	08763e80e0	Fix permission denied while creating directory in azure pipelines (#4001 ) * Fix permission denied while creating directory * Run tar with sudo	2020-05-20 09:47:12 -07:00
jji2019	dbd5aab6d2	Update OnnxRuntime.java for OS X environment. (#3985 ) onnxruntime init failure due to wrong path of reading native libraries. In OS X 64 system, the arch name is detected as x86 which generates invalid path to read native libraries. Exception java.lang.UnsatisfiedLinkError: no onnxruntime in java.library.path at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867) at java.lang.Runtime.loadLibrary0(Runtime.java:870) at java.lang.System.loadLibrary(System.java:1122) at ai.onnxruntime.OnnxRuntime.load(OnnxRuntime.java:174) at ai.onnxruntime.OnnxRuntime.init(OnnxRuntime.java:81) at ai.onnxruntime.OrtEnvironment.<clinit>(OrtEnvironment.java:24)	2020-05-20 09:15:03 -07:00
edgchen1	989fe2498f	Change training perf test build to use "docker" instead of "sudo docker" (#3995 ) Change training perf test build to use "docker" instead of "sudo docker". The training perf test build runs in an environment that supports calling "docker" and not "sudo docker".	2020-05-19 16:54:35 -07:00
Ryan Lai	354e571277	Miscounted the number of characters in package version of DirectML nuget (#3993 ) Co-authored-by: Ryan Lai <ryalai96@gamil.com>	2020-05-19 15:28:30 -07:00
Scott McKay	475e7e43e6	Older flake8 versions report false positives and don't handle the same things in the config file. (#3983 ) Require the current flake8 version or later so we get consistent results.	2020-05-20 07:29:22 +10:00
ytaous	fb4efafc8e	GPT-2 training perf scripts (#3974 ) * gpt2 training perf * gpt2 training perf * debug * debug * debug * fix bug * minor * on comments * dynamic sql * fix build * minor * linked hash * on comments * minor * mem * minor Co-authored-by: Ethan Tao <ettao@microsoft.com>	2020-05-19 10:21:40 -07:00
Charles Lien	36bcb28238	Add NNAPI in the exclude list (#3921 )	2020-05-19 09:39:41 -07:00


2617 commits