Commit graph

2877 commits

Author SHA1 Message Date
Sheil Kumar
02aea5d2d4
rename telemetry provider back to Microsoft.Windows.AI.MachineLearning (#4533)
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2020-07-16 19:51:06 -07:00
Yulong Wang
5086e55a35
Fix condition of running tests in win CI (#4459) 2020-07-16 16:33:30 -07:00
Tiago Koji Castro Shibata
2189c77e5b
static_typename (#4520)
* Use static_typename

* Disable RTTI outside of Release

* Fix unused var

* Add test types

* PR feedback
2020-07-16 16:31:02 -07:00
M. Zeeshan Siddiqui
b43ce2d7ad
Replace loss function in BERT_LOSS with SoftmaxCrossEntropyLoss. (#4509)
* Replace loss function in BERT_LOSS with SoftmaxCrossEntropyLoss.

* Update BERT loss function with correct logit shapes for softmax cross entropy loss.

* fix test and PR comments.
2020-07-16 15:28:24 -07:00
RandySheriffH
76b31d6ce2
fix xcode alerts (#4470)
* fix xcode alerts

* fix comment

* fix comments

* update text

* fix comments

* fix comments

* remove checks on context

Co-authored-by: Randy <Randy@randysmac.attlocal.net>
Co-authored-by: Randy <Randy@randysmac.local>
Co-authored-by: Tracy Sharpe <tracysh@microsoft.com>
2020-07-16 10:20:34 -07:00
Changming Sun
8ada440961
Move model tests to onnxruntime_test_all (#4521)
1. Move model tests to onnxruntime_test_all
2. Publish TestResults of Windows CI build.
2020-07-15 16:46:18 -07:00
Xueyun Zhu
5f188f4cf4
ci fix (#4519) 2020-07-15 12:05:24 -07:00
stevenlix
0ebe2fab51
Refactor TensorRT EP code to better handle dynamic shape subgraphs (#4504)
* build engine in runtime for dynamic shape subgraphs

* Update TensorRT-ExecutionProvider.md

* Update TensorRT-ExecutionProvider.md

* fix build issue

* Add more instructions on how to use engine caching

* add precision to trt node name

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.cc
2020-07-15 02:35:42 -07:00
gwang-msft
cf92497c16
Nnapi, add auto_pad support for Conv/GlobalAveragePool/AveragePool/GlobalMaxPool/MaxPool operators (#4499)
* Split ComputePadAndOutputShape into ComputePad and ComputeOutputShape

* update NNAPI conv output shape compute to use shared ComputeOutputShape

* move use ptr to use reference for ComputePadAndOutputShape

* nnapi conv support auto_pad

* add logging operator support by target devices

* update InferOutputShape/ComputePadAndOutputShape/ComputePad to use force_symmetric_auto_padding as param instead of template

* make log op support for target devices optional

* add auto_pad support to pool operators

* ignore GetTargetDevices if using all devices

* fix some typos in padding calculation

* fix a bug of compute padding difference between conv and pool ops

* addressed CR comments, removed NNAPI device logging and move nnapi ep autopad handling into a shared function

* change helper functions to static
2020-07-15 00:21:42 -07:00
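The split into a padding step and an output-shape step named in the commit above (#4499) can be illustrated for a single spatial dimension. This is a minimal sketch that follows the ONNX operator definition of `auto_pad`; the function names `compute_pad` and `compute_output_size` and their parameters are hypothetical and do not reproduce the signatures of onnxruntime's `ComputePad` and `ComputeOutputShape`.

```python
import math


def compute_pad(in_size, stride, kernel, dilation, auto_pad, explicit_pads=(0, 0)):
    """Return (pad_head, pad_tail) for one spatial dimension.

    Hypothetical helper: follows the ONNX auto_pad rules, not ORT's actual code.
    """
    if auto_pad == "NOTSET":
        return explicit_pads  # use the pads given on the node as-is
    if auto_pad == "VALID":
        return (0, 0)  # no padding at all
    if auto_pad not in ("SAME_UPPER", "SAME_LOWER"):
        raise ValueError(f"unknown auto_pad mode: {auto_pad}")

    effective_kernel = (kernel - 1) * dilation + 1
    target_out = math.ceil(in_size / stride)
    total = max((target_out - 1) * stride + effective_kernel - in_size, 0)
    if auto_pad == "SAME_UPPER":
        head = total // 2  # any odd leftover goes at the end
    else:
        head = (total + 1) // 2  # any odd leftover goes at the beginning
    return (head, total - head)


def compute_output_size(in_size, stride, kernel, dilation, pad_head, pad_tail):
    """Return the output size for one spatial dimension given resolved pads."""
    effective_kernel = (kernel - 1) * dilation + 1
    return (in_size + pad_head + pad_tail - effective_kernel) // stride + 1


pads = compute_pad(in_size=4, stride=2, kernel=3, dilation=1, auto_pad="SAME_UPPER")
out = compute_output_size(4, 2, 3, 1, *pads)
print(pads, out)
```

Keeping the two steps separate lets a caller that already has explicit pads skip straight to the output-size computation.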
edgchen1
34f73fa1aa
Add sudo --preserve-env option to allow environment to go through to docker commands. (#4512) 2020-07-14 18:12:31 -07:00
liqunfu
f721f5f1cd
Liqun/multiple choice (#4480)
* multiple choice runner

* add docker cleanup task to frontend pipeline
2020-07-14 17:57:58 -07:00
Xueyun Zhu
7d96960ec8
support pipeline partition with shared initializer (#4321)
* support bert partition with shared initializer

* address feedback

* address feedback

* address feedback

* add more test

* remove bert-tiny model

* address feedback

* address function comment

* move CreateNodeArg to graph_utils

* rename function name

* rename function name

* fix windows build

* fix windows type conversion warning

* add function comment
2020-07-14 17:21:40 -07:00
edgchen1
1ebe598286
Conditionally compile without std::is_trivially_copyable to satisfy old GCC versions. (#4510) 2020-07-14 16:47:40 -07:00
Sheil Kumar
ee5ca27ae2
Split Microsoft.AI.MachineLearning.nupkg into a NuGet package and a symbol NuGet package (#4503)
* add threadpool interface

* generate snupkgs

* include_pdb check

* fix snupkg generation

* Add task to merge snupkgs

* folder exists

* check dir

* revert thread pool stuff

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2020-07-14 14:52:39 -07:00
Tianlei Wu
25885cf7d0
Add option --torchscript in benchmark_gpt2.py (#4500)
* support TorchScript
* change onnx filename format
* change output name prediction_scores to logits
2020-07-14 11:53:23 -07:00
Tim Harris
a95ae164f7
Create N-1 threads in intra-op pool, given main thread now active (#4493)
Create N-1 threads in a thread pool when configured with intra-op parallelism of N. This ensures we have N active threads, given that the main thread also runs work. To avoid ambiguity on the value returned, rename ThreadPool::NumThreads method to ThreadPool::DegreeOfParallelism, and make corresponding updates in MLAS and operators.
2020-07-14 09:48:50 +01:00
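The idea behind the commit above (#4493) is that a pool configured for parallelism N only needs N-1 helper threads when the calling thread also executes work. The following is a minimal sketch of that idea using Python's standard library; the function name `run_with_degree_of_parallelism` and the queue-based scheduling are assumptions made for illustration and are not how onnxruntime's `ThreadPool` is implemented.

```python
import queue
import threading


def run_with_degree_of_parallelism(tasks, degree_of_parallelism):
    """Run callables with N active threads: N-1 helpers plus the caller.

    Returns (results, number_of_helper_threads). Illustrative only.
    """
    work = queue.Queue()
    for task in tasks:
        work.put(task)

    results = []
    lock = threading.Lock()

    def drain():
        # Pull tasks until the queue is empty, then return.
        while True:
            try:
                task = work.get_nowait()
            except queue.Empty:
                return
            value = task()
            with lock:
                results.append(value)

    # Only N-1 threads are created, because the caller is the Nth worker.
    helpers = [threading.Thread(target=drain)
               for _ in range(degree_of_parallelism - 1)]
    for helper in helpers:
        helper.start()
    drain()  # the calling thread participates instead of just waiting
    for helper in helpers:
        helper.join()
    return results, len(helpers)


squares, helper_count = run_with_degree_of_parallelism(
    [lambda i=i: i * i for i in range(8)], degree_of_parallelism=4)
print(sorted(squares), helper_count)
```

With a degree of parallelism of 1 no helper threads are created and the caller runs everything itself, which is why reporting a "degree of parallelism" is less ambiguous than reporting a thread count.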
liqunfu
0bff55512e
updated expected values for frontend test to pass frontend e2e pipeline. raise tolerance to reduce future risk of failure (#4497)
* updated expected values for frontend test, raise tol
2020-07-13 19:25:54 -07:00
Dmitri Smirnov
e0eddf502c
Bump version to 1.4.0 (#4496) 2020-07-13 17:09:18 -07:00
Yufeng Li
3d4ac85124
Add quantization benchmark for transformer based model (#4482)
* add support of quantization benchmark
2020-07-13 15:46:23 -07:00
gwang-msft
a3c358fd29
Split the shared ComputePadAndOutputShape into 2 separated functions ComputePad and ComputeOutputShape (#4487)
* Split ComputePadAndOutputShape into ComputePad and ComputeOutputShape

* update NNAPI conv output shape compute to use shared ComputeOutputShape

* move use ptr to use reference for ComputePadAndOutputShape
2020-07-13 15:07:34 -07:00
Tiago Koji Castro Shibata
3441c687b7
Revert "Remove docstrings if __ONNX_NO_DOC_STRINGS" (#4495)
This reverts commit bb4d331fa7bf1fe8d68b1527dda56e4739c80800.
2020-07-13 14:55:37 -07:00
gwang-msft
5f8f443ac4
Android CI build, test copy, emulator boot improvement (#4481)
* Enable onnxruntime_test_all for NNAPI EP

* switch to use ninja for Android CI

* make android emulator boot faster in android ci

* simplify adb push

* more style change

* more tweaking on android ci

* build.py style update
2020-07-13 14:18:34 -07:00
Dmitri Smirnov
35ee00d888
Pin typing version. (#4490) 2020-07-13 11:48:30 -07:00
Bowen Bao
07455cff28
Support double type for Greater CPU (#4373)
* Add double for Greater

* add double type for Greater

* update test according to dtype
2020-07-13 11:25:14 -07:00
Tiago Koji Castro Shibata
f18dee84c2
Remove docstrings if __ONNX_NO_DOC_STRINGS (#4494) 2020-07-13 11:08:46 -07:00
edgchen1
c71c49aaa0
Make TArray safer to use and update method name for consistency. (#4483)
- make size_ and data_ data members private
- rename GetCapacity() to Capacity() to be consistent (e.g., with Size())
- add static_assert for trivially copyable T because it is copied with memcpy
2020-07-13 09:59:56 -07:00
Sheil Kumar
00706e1502
don't add deps for uwp apps (#4485)
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2020-07-10 22:11:32 -07:00
Derek Murray
3e48ffd21c
Move AutoPadType to common.h (#4474)
Extracting some common code related to "AutoPadType" from the cpu execution provider into "common.h".

Motivation and Context
* Sharing code with authors of other execution providers that need the same functionality.
* I didn't modify the code in shared_library or dnnl EP to avoid changing their dependency structure, so there is still a redundant copy of the AutoPadType code in there.
2020-07-10 16:40:32 -07:00
Tianlei Wu
e96a829e84
Handle multiple embed nodes in transformer optimizer (#4471)
Handle model with multiple embed nodes:
* update embed layer norm fusion in onnxruntime
* Fix temp model path in optimizer
* Add unit test for model with multiple embed nodes.
* Add unit test for gpt2 fusion with past state and mask
* Add unit test for change input to int32
2020-07-10 15:28:27 -07:00
Ashwini Khade
6a9a9a35be
fix crashes caused by test runner (#4475)
* Fix crashes in test runner

* plus some fixes

* changes per review
2020-07-10 14:04:22 -07:00
Hariharan Seshadri
26ebcfab88
Fix Nuget GPU pipeline (#4462) 2020-07-10 14:02:28 -07:00
gwang-msft
9b4c54bcef
Enable onnxruntime_test_all for NNAPI EP (#4476) 2020-07-10 13:34:44 -07:00
edgchen1
6c7da5e9d3
Optimize CUDA Sum op kernel and refactor CUDA elementwise variadic input op kernels (#4418)
For the special case where all variadic inputs of a kernel are the same shape (i.e. no broadcasting is required) and there are few enough of them, we perform the entire computation in a single kernel. The general implementation (which was previously used for this special case) handles broadcasting by repeatedly invoking a binary kernel on successive inputs.
2020-07-10 10:20:23 -07:00
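The two code paths described in the commit above (#4418) can be contrasted on the CPU with plain Python lists. This is only a sketch of the dispatch logic: the names `variadic_sum`, `binary_add`, and the `MAX_FUSED_INPUTS` threshold are hypothetical, the data is restricted to 1-D lists, and broadcasting is simplified to length-1 inputs. It does not reflect the actual CUDA kernels.

```python
MAX_FUSED_INPUTS = 8  # hypothetical limit standing in for "few enough" inputs


def binary_add(a, b):
    """Add two 1-D lists, broadcasting a length-1 list against the other."""
    n = max(len(a), len(b))
    if len(a) not in (1, n) or len(b) not in (1, n):
        raise ValueError("shapes are not broadcastable")
    return [a[i if len(a) == n else 0] + b[i if len(b) == n else 0]
            for i in range(n)]


def variadic_sum(inputs):
    """Sum any number of 1-D inputs, choosing between the two strategies."""
    if not inputs:
        raise ValueError("at least one input is required")

    same_shape = all(len(x) == len(inputs[0]) for x in inputs)
    if same_shape and len(inputs) <= MAX_FUSED_INPUTS:
        # Special case: no broadcasting needed, so every output element is
        # computed in a single pass over all inputs.
        return [sum(column) for column in zip(*inputs)]

    # General case: fold the inputs pairwise with the broadcasting binary op,
    # producing an intermediate result after each step.
    result = inputs[0]
    for x in inputs[1:]:
        result = binary_add(result, x)
    return result


print(variadic_sum([[1, 2, 3], [10, 20, 30], [100, 200, 300]]))
print(variadic_sum([[1, 2, 3], [10]]))
```

The single-pass path avoids materializing an intermediate result per input, which is the saving the commit message attributes to performing the computation in one kernel.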
Prabhat
04586fc09d
Fix segmentation fault caused by invalid tensor type (#4467)
* Fix segmentation fault caused by invalid tensor type

* Addressed review comment
2020-07-10 11:23:12 +01:00
Zhang Lei
ccbf49e59f
Fix avx2 load 32 bytes buffer overrun. (#4455)
* Fix avx2 load 32 bytes buffer overrun.

* Fix qladd buffer overrun for sse2 code.

* Fix QLinearAdd buffer overrun for arm.

* Add mlas test for qladd to cover overrun and more.

* Change API to save binary space. Add more test in mlas to cover different zeropoints.
2020-07-09 15:54:31 -07:00
Yufeng Li
d4db83858b
Only quantize gather with initializer (#4469) 2020-07-09 13:33:43 -07:00
Yulong Wang
bec18eb3f4
[Node.js binding] support CentOS 7 in CI (#4447) 2020-07-09 00:59:50 -07:00
Josh Bradley
ca5af9d622
Add modern C++ standards for Ort::Value (#4367)
* add modern standards to function arguments

* code cleanup

* fix code formatting

* add element access convenience function

* change template type name to match rest of code

* remove new At() convenience function

* add better documentation message
2020-07-09 00:35:41 -07:00
Vincent Wang
7fb194d03d
Update convergence baseline for ci_test. (#4465)
Co-authored-by: Vincent Wang <weicwang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-07-09 15:29:36 +08:00
Josh Bradley
3effac2990
Experimental C++ API examples (#4358)
* Add examples

* fix build instructions for linux users

* fix header include

* update documentation
2020-07-08 23:17:50 -07:00
Yufeng Li
5dc7339be6
Add quantization tool to python package (#4458)
* Add quantization tool to python package
2020-07-08 21:42:53 -07:00
edgchen1
0ca4f7eb30
Update Git submodule cgmanifests. (#4461) 2020-07-08 19:24:03 -07:00
George Wu
f24d8e4587
fix build break from PR#2850 api change (#4451) 2020-07-08 17:02:12 -07:00
Tianlei Wu
cb5c4292b8
GPT-2 Attention Fusion without input mask (#4456)
* Allow input mask to be optional
* Add test for model without input mask and past state.
2020-07-08 15:59:57 -07:00
Wei-Sheng Chin
5222b2c6c0
Remove code which is not thread-safe. (#4454)
Because of async access to the memory logger when using the parallel executor,
ORT sometimes crashes.
2020-07-08 14:27:56 -07:00
Tianlei Wu
05757b4c3c
Transformer benchmark: add option to use raw attention mask (#4446)
* Update benchmark and optimizer to add an option to use raw attention mask
* Remove temporary model in optimizer
2020-07-08 12:34:41 -07:00
Tixxx
b156ae4448
Support training_mode flag in eval (#4324)
* add training_mode feed for evaluation to support opset12
2020-07-08 10:38:54 -07:00
Negin Raoof
71aec2adcb
Custom op export test template (#4383)
* Adding pytorch custom op export tests to CI

* Test clean build

* Fix export for intended failure

* update export script

* Build onnxruntime
2020-07-08 10:14:56 -07:00
Du Li
063156d98d
IOBinding docs (#4432)
* Adding iobinding python docs.

* Adding iobinding python docs.

* Addressing PR comments.
2020-07-08 03:48:22 -07:00
Hariharan Seshadri
6d6b6b54a5
Support binding a graph output to a specific device via the Python binding (#4439) 2020-07-07 21:09:37 -07:00