Commit graph

2597 commits

Author SHA1 Message Date
liqunfu
6665d5e2bc
Liqun/a transformer example (#3845)
Add transformer glue test example to show how to use ORTTrainer to fine-tune a transformer model

Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-05-27 15:21:35 -07:00
Matthieu Darbois
a983509ed3
Pad: Add support for all datatypes in opset-11 spec (#4021)
* Pad: Add support for all datatypes in opset-11 spec

Pad opset-11 implementation supports:
int32, int64, float & double

Per specification, Pad opset-11 also supports:
uint8, uint16, uint32, uint64, int8, int16 & float16

This commit adds support for those types to get full coverage of the Pad opset-11 operator.

* Pad: Remove 16-bit datatypes support

These types are unused at the moment and they increase binary size. Remove support for those types to reduce binary size.
2020-05-28 08:05:13 +10:00
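The two steps of the Pad commit above (#4021) can be worked through as set arithmetic. This is only a sketch: the type names are copied from the commit message, and reading "16-bit datatypes" as exactly `uint16`, `int16` and `float16` is an assumption, not something the message spells out. The function name is hypothetical.

```python
# Type names copied from the commit message above.
ORIGINALLY_SUPPORTED = {"int32", "int64", "float", "double"}
ADDED_BY_FIRST_CHANGE = {
    "uint8", "uint16", "uint32", "uint64", "int8", "int16", "float16",
}

# Assumption: "16-bit datatypes" means exactly these three names.
SIXTEEN_BIT = {"uint16", "int16", "float16"}


def pad_types_after_commit():
    """Return the Pad opset-11 types left after both changes are applied."""
    full_spec = ORIGINALLY_SUPPORTED | ADDED_BY_FIRST_CHANGE
    return full_spec - SIXTEEN_BIT


print(sorted(pad_types_after_commit()))
```

Under that assumption, the commit leaves eight of the eleven spec types supported.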
Tianlei Wu
930c6a59da
Allow cast in embed layer norm to be optional. (#4040) 2020-05-27 14:55:03 -07:00
Yulong Wang
b3ec8035ee
[Node.js binding] add build flag for node.js binding (#3948) 2020-05-27 13:30:22 -07:00
edgchen1
ee6371d0a8
Clean up CUDAExecutionProvider's associated PerThreadContexts on destruction (#4017)
Clean up a CUDAExecutionProvider's associated PerThreadContext instances when that CUDAExecutionProvider is destroyed.

Revert workaround (introduced in #3767) to lazily initialize CUDA handles to avoid segmentation fault. For that case, the CUDA handle cleanup was happening quite a bit later than the CUDAExecutionProvider destructor. This should be a cleaner way to fix that.
2020-05-27 11:01:43 -07:00
Xueyun Zhu
633008b5ef
Add online partition logic for pipeline (#3996)
* online partition

* fix case where multiple consumer nodes are in cut info

* fix windows build

* address feedback

* adding test

* feedback

* address feedback

* add parser for cut edge

* windows build
2020-05-26 17:44:09 -07:00
Tracy Sharpe
0d8abc1a99
MLAS: qgemm refactoring (#4030)
Treat U8U8 as U8S8 for VNNI for performance and optimize SSE2 kernel.
2020-05-26 17:27:32 -07:00
Tianlei Wu
abcd1576c9
Add Linux bash and Windows batch scripts for running transformers benchmarks (#3997) 2020-05-26 16:42:12 -07:00
Cecilia Liu
212efb6cde
Match New Pattern for Reshape Fusion (#3931)
Fuse reshape subgraph.
2020-05-26 14:10:42 -07:00
Paul Fultz II
7759136610
Add amd migraphx execution provider to onnx runtime (#2929)
* Add amd migraphx execution provider to onnx runtime

* rename MiGraphX to MIGraphX

* remove unnecessary changes in migraphx_execution_provider.cc

* add migraphx EP to tests

* add input requests of the batchnorm operator

* add support for the onnx operator PRelu

* update migraphx dockerfile and remove one unused line

* sync submodules with master branch

* fixed a small bug

* fix various bugs to run msft real models correctly

* some code cleanup

* fix python file format

* fixed a code style issue

* add default provider for migraphx execution provider

Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>
2020-05-27 04:24:59 +08:00
Vincent Wang
9d0534c0eb
Optimize OneHot CUDA Kernel (#4012)
* Optimize for OneHot with zero off value.

* Add test cases for indices out of range.

Co-authored-by: Vincent Wang <weicwang@microsoft.com>
Co-authored-by: Vincent Wang <weicwang@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-05-26 18:12:11 +08:00
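The OneHot commit above (#4012) does not describe its kernel. As a CPU-side illustration only, the sketch below assumes the zero-off-value case can be handled by zero-filling the output and then writing only the "on" entries. The out-of-range behavior follows the ONNX OneHot specification (indices outside `[-depth, depth-1]` yield an all-off row). The function name and signature are hypothetical.

```python
def one_hot_zero_off(indices, depth, on_value=1.0):
    """One-hot encode `indices` along a new last axis, with off value 0.

    Assumed approach (not taken from the commit): zero-fill first, then
    scatter `on_value`, so the off value is never written per element.
    """
    # Step 1: zero-fill the whole output.
    rows = [[0.0] * depth for _ in indices]
    # Step 2: write only the "on" entries.
    for row, idx in zip(rows, indices):
        # Indices outside [-depth, depth - 1] leave the row all zeros.
        if -depth <= idx < depth:
            row[idx % depth] = on_value  # negative indices wrap around
    return rows
```

For example, `one_hot_zero_off([0, -1, 5], 3)` leaves the last row all zeros because index 5 is out of range for depth 3.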
Changming Sun
0a6d9dd301 Remove Openmp from the GPU docker files 2020-05-25 14:17:48 -07:00
Changming Sun
30efe65e95 Add use_openmp back to the docker files 2020-05-25 14:17:48 -07:00
Wenhao Hu
bd8993cb15 remove --use_openmp in build.sh 2020-05-25 14:17:48 -07:00
Tiago Koji Castro Shibata
faf65e960f
Refactor delayloading (#4019)
* Refactor delayloading

* Remove explicit linking to windowsapp.lib
2020-05-24 23:26:30 -07:00
Wei-Sheng Chin
24eda3df33
Create Utils for Adding Range and Marker (#4013)
In this PR, we
  1. create some APIs for creating NVTX objects
  2. apply those APIs in pipeline-related operators and sequential executor.
As a result, we can explicitly see how a pipeline schedule is run by GPUs in 
Nvidia's visual profiler. Note that these APIs are Linux only due to Nvidia's
limited support.
2020-05-24 22:55:24 -07:00
Changming Sun
aafe988a11 Temporarily disable windows static analysis CI job 2020-05-24 16:31:09 -07:00
Changming Sun
7c83118364 Enlarge protobuf read buffer size 2020-05-24 16:31:09 -07:00
Ryan Hill
eb3aaa70d6
Fix compiler warning for openvino (#4010) 2020-05-21 20:22:07 -07:00
Jeff Bloomfield
59af3ea278
Add missing D3D12 resource barriers and fences to Winml (#3941)
* Add missing D3D12 resource barriers to Winml

* Fix unsafe descriptor usage in Winml tensorization
2020-05-20 23:19:44 -07:00
Ori Levari
ce4d05862a
add bm_fish_720 to collateral for scenario 22 test (#3998)
Co-authored-by: Ori Levari <orlevari@microsoft.com>
2020-05-20 23:19:27 -07:00
Ryan Lai
357bffe47c
Fix deprecated CentOS link for Linux CI pipeline (#4000)
* Fix Linux_CI_GPU_Dev

* centos6
2020-05-20 16:14:48 -07:00
Bowen Bao
0a5395bb78
Remove 'model_.' prefix from onnx model initializers in training (#3881)
* Remove 'model_.' prefix for onnx model initializers in training

* fix test case; remove redundant device test

* rename

* Fix state_dict/load_state_dict with frozen_weight

* nit

* Add monkey patch for pt opset 10

* remove pt patch in CI

* nit: newline
2020-05-20 10:06:31 -07:00
Prabhat
08763e80e0
Fix permission denied while creating directory in azure pipelines (#4001)
* Fix permission denied while creating directory

* Run tar with sudo
2020-05-20 09:47:12 -07:00
jji2019
dbd5aab6d2
Update OnnxRuntime.java for OS X environment. (#3985)
onnxruntime fails to initialize because it reads native libraries from the wrong path. On a 64-bit OS X system, the arch name is detected as x86, which generates an invalid path for reading native libraries.

Exception java.lang.UnsatisfiedLinkError: no onnxruntime in java.library.path
	at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
	at java.lang.Runtime.loadLibrary0(Runtime.java:870)
	at java.lang.System.loadLibrary(System.java:1122)
	at ai.onnxruntime.OnnxRuntime.load(OnnxRuntime.java:174)
	at ai.onnxruntime.OnnxRuntime.init(OnnxRuntime.java:81)
	at ai.onnxruntime.OrtEnvironment.<clinit>(OrtEnvironment.java:24)
2020-05-20 09:15:03 -07:00
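The commit above (#3985) concerns mapping a reported architecture name to a native-library path. The sketch below shows one way to normalize arch names so that a 64-bit name is never treated as `x86`. It is written in Python rather than Java, and the function names, the accepted arch strings and the `native/<os>-<arch>` layout are all hypothetical, not taken from `OnnxRuntime.java`.

```python
def normalize_arch(raw_arch):
    """Map a reported architecture name to 'x64' or 'x86'.

    Exact matching avoids treating 'x86_64' as 'x86', which a naive
    prefix check such as raw_arch.startswith('x86') would do.
    """
    arch = raw_arch.lower()
    if arch in ("x86_64", "amd64"):
        return "x64"
    if arch in ("x86", "i386", "i686"):
        return "x86"
    raise ValueError(f"unsupported architecture: {raw_arch}")


def native_library_dir(os_name, raw_arch):
    # Hypothetical directory layout, for illustration only.
    return f"native/{os_name}-{normalize_arch(raw_arch)}"
```

Raising on unknown names, instead of falling back to a default, makes a misdetected architecture fail early rather than surface later as an `UnsatisfiedLinkError`.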
edgchen1
989fe2498f
Change training perf test build to use "docker" instead of "sudo docker" (#3995)
Change training perf test build to use "docker" instead of "sudo docker". The training perf test build runs in an environment that supports calling "docker" and not "sudo docker".
2020-05-19 16:54:35 -07:00
Ryan Lai
354e571277
Miscounted the number of characters in package version of DirectML nuget (#3993)
Co-authored-by: Ryan Lai <ryalai96@gamil.com>
2020-05-19 15:28:30 -07:00
Scott McKay
475e7e43e6
Older flake8 versions report false positives and don't handle the same things in the config file. (#3983)
Require the current flake8 version or later so we get consistent results.
2020-05-20 07:29:22 +10:00
ytaous
fb4efafc8e
GPT-2 training perf scripts (#3974)
* gpt2 training perf

* gpt2 training perf

* debug

* debug

* debug

* fix bug

* minor

* on comments

* dynamic sql

* fix build

* minor

* linked hash

* on comments

* minor

* mem

* minor

Co-authored-by: Ethan Tao <ettao@microsoft.com>
2020-05-19 10:21:40 -07:00
Charles Lien
36bcb28238
Add NNAPI in the exclude list (#3921) 2020-05-19 09:39:41 -07:00
edelaye
64b5f7edf6
Initial release of Vitis-AI Execution Provider (#3771)
* Initial release of Vitis-AI Execution Provider

* Add documentation, fix for onnxruntime::Model changes and use stringstream instead of file dump for model passing

* - Add Vitis-AI docker file
- Add online quantization flow Vitis-AI execution provider
- Fix remarks

* - Add fatal error build message for Vitis-AI cmake build on Windows
- Fix pep8 issue in build.py
- Add Vitis-AI execution provider example in docs

Co-authored-by: Elliott Delaye <elliott@xilinx.com>
Co-authored-by: Jorn Tuyls <jornt@xilinx.com>
Co-authored-by: Jorn Tuyls <jtuyls@users.noreply.github.com>
2020-05-19 05:32:32 -07:00
Gani Nazirov
c42867c016
Update data_frame_tool to latest (#3919)
* update data_frame_tool to latest:
Handle datetime and categorical dataframe column types.
Handle ML.NET / Featurizers metadata outputs.
Input and Output are pandas dataframes.

* remove whitespaces

* reformat comment
remove whitespaces

* reformat comment
remove whitespaces

Co-authored-by: Gani Nazirov <ganaziro@microsoft.com>
2020-05-18 21:13:56 -07:00
Ryan Hill
672c42b396
Fix std::_Exit workaround for Dnnl crash at unload (#3978) 2020-05-18 21:03:32 -07:00
Faith Xu
b8a255e1b5
Doc Updates for Build (#3976)
* Initial update of readme

* Readme updates

* Review of consolidated README (#3930)

* Proposed updates for readme (#3953)

I found some of the information was duplicated within the doc, so attempted to streamline

* Fix links

* More updates

- fix build instructions
- nodejs doc reorganization
- roadmap update
- version fixes

* Update ORT Server build instructions

* More doc cleanup

* fix python dev notes name

* Update nodejs and some links

* sync eigen version back to master

* Minor fixes

* add nodejs to sample table of contents

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* address PR feedback

* address PR feedback

* nodejs build instruction

* Update Java instructions to include gradle

* Roadmap refresh

Reformat some data, fix link, minor rewording

* Clarify Visual C++ runtime req

Co-authored-by: Nat Kershaw (MSFT) <nakersha@microsoft.com>
Co-authored-by: Prasanth Pulavarthi <prasantp@microsoft.com>
Co-authored-by: manashgoswami <magoswam@microsoft.com>
2020-05-18 20:08:36 -07:00
Hariharan Seshadri
1168f4e85a
Support session EndProfiling() in the CSharp API (#3934) 2020-05-18 19:47:52 -07:00
Yufeng Li
b253f0b0f6
Do not fuse skiplayernorm if any output of add is a graph output (#3981) 2020-05-18 19:46:51 -07:00
gwang-msft
a75a83b41a
Minor android build fix (#3980) 2020-05-18 19:20:23 -07:00
Changming Sun
2fa2019daf
Run docker commands with sudo (#3979) 2020-05-18 17:35:09 -07:00
Yufeng Li
29c39f867c
enable tensor core for rnn (#3937) 2020-05-18 16:24:18 -07:00
Scott McKay
c6a94f95cf
Update Android instructions (#3971)
Update Android build instructions to provide more information.
Add info on testing directly on Android
Update build.py to better support using Ninja generator to build Android on Windows.
2020-05-19 07:30:45 +10:00
edgchen1
024b92a970
Use path relative to script location to refer to symbolic_opset10.py from install_deps.sh. (#3975)
Update install_deps.sh to use relative path from script directory to symbolic_opset10.py. This allows install_deps.sh to be called from different working directories.
2020-05-18 13:36:06 -07:00
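The idea in the commit above (#3975), locating a sibling file relative to the script's own directory instead of the caller's working directory, can be sketched as follows. The real change is in a shell script; this Python version illustrates the same technique, and the helper name is hypothetical.

```python
from pathlib import Path


def path_beside_script(script_path, file_name):
    """Return the path of `file_name` in the same directory as the script.

    The result depends only on where the script lives, not on the
    directory the caller happened to run it from.
    """
    script_dir = Path(script_path).resolve().parent
    return script_dir / file_name
```

Inside a script, this would typically be called as `path_beside_script(__file__, "symbolic_opset10.py")`.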
Adam Pocock
9d2d1eb6f6
[java] Adds a CUDA test (#3956)
* [java] - adding a cuda enabled test.

* Adding --build_java to the windows gpu ci pipeline.

* Removing a stray line from the unit tests that always enabled CUDA for Java.
2020-05-18 12:05:51 -07:00
Hariharan Seshadri
1a183784a8
Fix C# layer in the way it handles sequences (#3965)
* Fix C# layer in the way it handles sequence of tensors

* Revert comment
2020-05-18 11:10:13 -07:00
edgchen1
e259a13f8e
Initial training Python packaging pipeline (#3767)
Add a pipeline to produce training-enabled ORT wheels.
2020-05-18 09:41:00 -07:00
edgchen1
e55f24364a
Disable LTO on Windows training CPU build (#3960)
Disable LTO on Windows training CPU build. Add a parameter to the win-ci-2019.yml build template for enabling LTO with a default value of true.
2020-05-18 09:24:10 -07:00
M. Zeeshan Siddiqui
44731e88bb
Add comments for zero valued normalization factor in SoftmaxCrossEntropyLossGrad CUDA kernel. (#3972) 2020-05-18 09:08:09 -07:00
Scott McKay
fd8ea4e466
Improve handling of symbolic dimensions in the onnxruntime_test.py script. (#3959)
If a symbolic dimension is found allow the user to provide a value, or default to 1.

`python .\onnxruntime_test.py --symbolic_dims batch=1,seqlen=4 onnxruntime\test\testdata\transform\fusion\fast_gelu_use_graph_input.onnx`
2020-05-18 16:51:09 +10:00
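The symbolic-dimension handling described in the commit above (#3959) can be sketched as follows: parse a `name=value` list like the one passed to `--symbolic_dims`, then substitute those values into a shape, defaulting to 1 for any symbolic dimension the user did not supply. The function names are hypothetical and this is not the code in `onnxruntime_test.py`.

```python
def parse_symbolic_dims(spec):
    """Parse a string such as 'batch=1,seqlen=4' into a dict of ints."""
    overrides = {}
    for item in spec.split(","):
        if not item.strip():
            continue  # tolerate empty segments such as a trailing comma
        name, _, value = item.partition("=")
        overrides[name.strip()] = int(value)
    return overrides


def resolve_shape(shape, overrides, default=1):
    """Replace symbolic (non-int) dims with user values, else `default`."""
    resolved = []
    for dim in shape:
        if isinstance(dim, int):
            resolved.append(dim)  # concrete dimension, keep as-is
        else:
            resolved.append(overrides.get(dim, default))
    return resolved
```

For example, with overrides parsed from `batch=2`, the shape `["batch", "seqlen", 768]` resolves to `[2, 1, 768]` because `seqlen` falls back to the default.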
Tianlei Wu
523d70f667
Improve Transformer Benchmark for FP16 (#3970)
Disable ORT in offline optimization script (ORT could generate some fused ops (like FusedGemm) which cannot be converted to fp16).
Remove some models from benchmark until we have optimizations for them.
2020-05-17 21:50:45 -07:00
Wei-Sheng Chin
0d11649bb3
Address comments from #3823 and polish code (#3964)
* Address comments from #3823 and polish code

* One line
2020-05-17 14:08:33 -07:00
Prabhat
4ff73d00b0
Fix python pkg permission issue (#3957)
* Fix python pkg permission issue

* Run chown with sudo

* Add workspace clean to arm pipeline

* Run docker as current user
2020-05-17 14:06:55 +05:30