Commit graph

2864 commits

Author SHA1 Message Date
Sheil Kumar
ee5ca27ae2
Split Microsoft.AI.MachineLearning.nupkg in a NuGet package and symbol NuGet package (#4503)
* add threadpool interface

* generate snupkgs

* include_pdb check

* fix snupkg generation

* Add task to merge snupkgs

* folder exists

* check dir

* revert thread pool stuff

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2020-07-14 14:52:39 -07:00
Tianlei Wu
25885cf7d0
Add option --torchscript in benchmark_gpt2.py (#4500)
* support TorchScript
* change onnx filename format
* change output name prediction_scores to logits
2020-07-14 11:53:23 -07:00
Tim Harris
a95ae164f7
Create N-1 threads in intra-op pool, given main thread now active (#4493)
Create N-1 threads in a thread pool when configured with intra-op parallelism of N. This ensures we have N active threads, given that the main thread also runs work. To avoid ambiguity on the value returned, rename ThreadPool::NumThreads method to ThreadPool::DegreeOfParallelism, and make corresponding updates in MLAS and operators.
2020-07-14 09:48:50 +01:00
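The N-1 thread arrangement described in the commit above can be illustrated with a small sketch. This is not the ONNX Runtime implementation; `parallel_for` and its parameters are hypothetical names chosen for illustration. It assumes work is split into N contiguous shards, with N-1 helper threads created and the calling thread processing one shard itself.

```python
import threading


def parallel_for(total, fn, degree_of_parallelism):
    """Run fn(i) for i in range(total) across `degree_of_parallelism` threads.

    The calling thread counts as one of them, so only N-1 helpers are spawned.
    Returns the number of helper threads created.
    """
    n = max(1, degree_of_parallelism)
    # One contiguous shard per active thread (helpers plus the caller).
    bounds = [(total * k // n, total * (k + 1) // n) for k in range(n)]

    def run(lo, hi):
        for i in range(lo, hi):
            fn(i)

    # Create N-1 helper threads for shards 1..N-1.
    workers = [threading.Thread(target=run, args=b) for b in bounds[1:]]
    for w in workers:
        w.start()
    # The caller is the Nth active thread and handles shard 0 itself.
    run(*bounds[0])
    for w in workers:
        w.join()
    return len(workers)


results = [0] * 10


def square(i):
    results[i] = i * i


helpers = parallel_for(10, square, degree_of_parallelism=4)
print(helpers, results)
```

With a degree of parallelism of 4, three helper threads are created and four threads are active in total, which is why a name like DegreeOfParallelism is less ambiguous than NumThreads.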
liqunfu
0bff55512e
updated expected values for frontend test to pass frontend e2e pipeline. raise tolerance to reduce future risk of failure (#4497)
* updated expected values for frontend test, raise tol
2020-07-13 19:25:54 -07:00
Dmitri Smirnov
e0eddf502c
Bump version to 1.4.0 (#4496)
2020-07-13 17:09:18 -07:00
Yufeng Li
3d4ac85124
Add quantization benchmark for transformer based model (#4482)
* add support of quantization benchmark
2020-07-13 15:46:23 -07:00
gwang-msft
a3c358fd29
Split the shared ComputePadAndOutputShape into 2 separated functions ComputePad and ComputeOutputShape (#4487)
* Split ComputePadAndOutputShape into ComputePad and ComputeOutputShape

* update NNAPI conv output shape compute to use shared ComputeOutputShape

* switch from pointer to reference for ComputePadAndOutputShape
2020-07-13 15:07:34 -07:00
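As a rough illustration of the split described in the commit above, the two steps can be sketched for a single spatial dimension. The function names, parameters, and return values here are assumptions for illustration and do not reproduce the actual C++ signatures; the arithmetic follows the standard ONNX convolution padding and output-size rules.

```python
def compute_pad(in_size, stride, kernel, dilation, auto_pad,
                pad_head=0, pad_tail=0):
    """Return (pad_head, pad_tail) for one spatial dimension."""
    if auto_pad == "NOTSET":
        # Explicit pads supplied by the caller are used unchanged.
        return pad_head, pad_tail
    if auto_pad == "VALID":
        return 0, 0
    if auto_pad in ("SAME_UPPER", "SAME_LOWER"):
        effective_kernel = (kernel - 1) * dilation + 1
        out_size = -(-in_size // stride)  # ceil(in_size / stride)
        total = max((out_size - 1) * stride + effective_kernel - in_size, 0)
        small, large = total // 2, total - total // 2
        # SAME_UPPER puts the extra element at the end, SAME_LOWER at the start.
        return (small, large) if auto_pad == "SAME_UPPER" else (large, small)
    raise ValueError(f"unknown auto_pad: {auto_pad}")


def compute_output_shape(in_size, stride, kernel, dilation, pad_head, pad_tail):
    """Return the output size for one spatial dimension."""
    effective_kernel = (kernel - 1) * dilation + 1
    return (in_size + pad_head + pad_tail - effective_kernel) // stride + 1


pads = compute_pad(8, stride=2, kernel=3, dilation=1, auto_pad="SAME_UPPER")
out = compute_output_shape(8, 2, 3, 1, *pads)
print(pads, out)
```

Keeping the two steps separate lets a caller that already knows its pads compute only the output size.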
Tiago Koji Castro Shibata
3441c687b7
Revert "Remove docstrings if __ONNX_NO_DOC_STRINGS" (#4495)
This reverts commit bb4d331fa7bf1fe8d68b1527dda56e4739c80800.
2020-07-13 14:55:37 -07:00
gwang-msft
5f8f443ac4
Android CI build, test copy, emulator boot improvement (#4481)
* Enable onnxruntime_test_all for NNAPI EP

* switch to Ninja for Android CI

* make Android emulator boot faster in Android CI

* simplify adb push

* more style change

* more tweaking on android ci

* build.py style update
2020-07-13 14:18:34 -07:00
Dmitri Smirnov
35ee00d888
Pin typing version. (#4490)
2020-07-13 11:48:30 -07:00
Bowen Bao
07455cff28
Support double type for Greater CPU (#4373)
* Add double for Greater

* add double type for Greater

* update test according to dtype
2020-07-13 11:25:14 -07:00
Tiago Koji Castro Shibata
f18dee84c2
Remove docstrings if __ONNX_NO_DOC_STRINGS (#4494)
2020-07-13 11:08:46 -07:00
edgchen1
c71c49aaa0
Make TArray safer to use and update method name for consistency. (#4483)
- make size_ and data_ data members private
- rename GetCapacity() to Capacity() to be consistent (e.g., with Size())
- add static_assert for trivially copyable T because it is copied with memcpy
2020-07-13 09:59:56 -07:00
Sheil Kumar
00706e1502
Don't add deps for UWP apps (#4485)
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2020-07-10 22:11:32 -07:00
Derek Murray
3e48ffd21c
Move AutoPadType to common.h (#4474)
Extracting some common code related to "AutoPadType" from the cpu execution provider into "common.h".

Motivation and Context
* Sharing code with authors of other execution providers that need the same functionality.
* I didn't modify the code in shared_library or dnnl EP to avoid changing their dependency structure, so there is still a redundant copy of the AutoPadType code in there.
2020-07-10 16:40:32 -07:00
Tianlei Wu
e96a829e84
Handle multiple embed nodes in transformer optimizer (#4471)
Handle model with multiple embed nodes:
* update embed layer norm fusion in onnxruntime
* Fix temp model path in optimizer
* Add unit test for model with multiple embed nodes.
* Add unit test for gpt2 fusion with past state and mask
* Add unit test for change input to int32
2020-07-10 15:28:27 -07:00
Ashwini Khade
6a9a9a35be
fix crashes caused by test runner (#4475)
* Fix crashes in test runner

* plus some fixes

* changes per review
2020-07-10 14:04:22 -07:00
Hariharan Seshadri
26ebcfab88
Fix Nuget GPU pipeline (#4462)
2020-07-10 14:02:28 -07:00
gwang-msft
9b4c54bcef
Enable onnxruntime_test_all for NNAPI EP (#4476)
2020-07-10 13:34:44 -07:00
edgchen1
6c7da5e9d3
Optimize CUDA Sum op kernel and refactor CUDA elementwise variadic input op kernels (#4418)
For the special case where all variadic inputs of a kernel are the same shape (i.e. no broadcasting is required) and there are few enough of them, we perform the entire computation in a single kernel. The general implementation (which was previously used for this special case) handles broadcasting by repeatedly invoking a binary kernel on successive inputs.
2020-07-10 10:20:23 -07:00
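The two code paths described in the commit above can be contrasted with a CPU-side sketch. This is not the CUDA kernel code: tensors are modelled as 1-D Python lists, a length-1 list stands in for a broadcastable input, and `MAX_FUSED_INPUTS` is a hypothetical threshold, since the commit only says "few enough".

```python
from functools import reduce

MAX_FUSED_INPUTS = 8  # hypothetical limit; the real threshold is not stated


def binary_add(a, b):
    """Elementwise add of two 1-D lists; a length-1 list broadcasts."""
    n = max(len(a), len(b))
    if len(a) not in (1, n) or len(b) not in (1, n):
        raise ValueError("shapes are not broadcastable")
    return [a[i % len(a)] + b[i % len(b)] for i in range(n)]


def variadic_sum(inputs):
    """Return (result, path), where path names the strategy that was used."""
    same_shape = all(len(x) == len(inputs[0]) for x in inputs)
    if same_shape and len(inputs) <= MAX_FUSED_INPUTS:
        # Special case: one pass over all inputs, no intermediate results.
        return [sum(vals) for vals in zip(*inputs)], "fused"
    # General case: one binary step per additional input, with broadcasting.
    return reduce(binary_add, inputs), "pairwise"


print(variadic_sum([[1, 2], [3, 4], [5, 6]]))
print(variadic_sum([[1, 2], [10]]))
```

The fused path reads each input once and writes the output once, whereas the pairwise path materialises an intermediate result for every additional input.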
Prabhat
04586fc09d
Fix segmentation fault caused by invalid tensor type (#4467)
* Fix segmentation fault caused by invalid tensor type

* Addressed review comment
2020-07-10 11:23:12 +01:00
Zhang Lei
ccbf49e59f
Fix avx2 load 32 bytes buffer overrun. (#4455)
* Fix avx2 load 32 bytes buffer overrun.

* Fix qladd buffer overrun for sse2 code.

* Fix QLinearAdd buffer overrun for arm.

* Add mlas test for qladd to cover overrun and more.

* Change API to save binary space. Add more MLAS tests to cover different zero points.
2020-07-09 15:54:31 -07:00
Yufeng Li
d4db83858b
Only quantize gather with initializer (#4469)
2020-07-09 13:33:43 -07:00
Yulong Wang
bec18eb3f4
[Node.js binding] support CentOS 7 in CI (#4447)
2020-07-09 00:59:50 -07:00
Josh Bradley
ca5af9d622
Add modern C++ standards for Ort::Value (#4367)
* add modern standards to function arguments

* code cleanup

* fix code formatting

* add element access convenience function

* change template type name to match rest of code

* remove new At() convenience function

* add better documentation message
2020-07-09 00:35:41 -07:00
Vincent Wang
7fb194d03d
Update convergence baseline for ci_test. (#4465)
Co-authored-by: Vincent Wang <weicwang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-07-09 15:29:36 +08:00
Josh Bradley
3effac2990
Experimental C++ API examples (#4358)
* Add examples

* fix build instructions for linux users

* fix header include

* update documentation
2020-07-08 23:17:50 -07:00
Yufeng Li
5dc7339be6
Add quantization tool to python package (#4458)
* Add quantization tool to python package
2020-07-08 21:42:53 -07:00
edgchen1
0ca4f7eb30
Update Git submodule cgmanifests. (#4461)
2020-07-08 19:24:03 -07:00
George Wu
f24d8e4587
fix build break from PR#2850 api change (#4451)
2020-07-08 17:02:12 -07:00
Tianlei Wu
cb5c4292b8
GPT-2 Attention Fusion without input mask (#4456)
* Allow input mask to be optional
* Add test for model without input mask and past state.
2020-07-08 15:59:57 -07:00
Wei-Sheng Chin
5222b2c6c0
Remove code which is not thread-safe. (#4454)
Because of async access to the memory logger when using the parallel executor,
ORT sometimes crashes.
2020-07-08 14:27:56 -07:00
Tianlei Wu
05757b4c3c
Transformer benchmark: add option to use raw attention mask (#4446)
* Update benchmark and optimizer to add an option to use raw attention mask
* Remove temporary model in optimizer
2020-07-08 12:34:41 -07:00
Tixxx
b156ae4448
Support training_mode flag in eval (#4324)
* add training_mode feed for evaluation to support opset12
2020-07-08 10:38:54 -07:00
Negin Raoof
71aec2adcb
Custom op export test template (#4383)
* Adding pytorch custom op export tests to CI

* Test clean build

* Fix export for intended failure

* update export script

* Build onnxruntime
2020-07-08 10:14:56 -07:00
Du Li
063156d98d
IOBinding docs (#4432)
* Adding iobinding Python docs.

* Adding iobinding Python docs.

* Addressing PR comments.
2020-07-08 03:48:22 -07:00
Hariharan Seshadri
6d6b6b54a5
Support binding a graph output to a specific device via the Python binding (#4439)
2020-07-07 21:09:37 -07:00
Tracy Sharpe
aa06d308a6
Build new AVX file with /ARCH:AVX (#4442)
Build new file with /ARCH:AVX on Windows to ensure correct vzeroupper behavior.
2020-07-07 12:00:12 -07:00
Tiago Koji Castro Shibata
e62686c36e
Remove use of RTTI in CUDA provider (#4444)
2020-07-07 11:38:09 -07:00
Sheil Kumar
fdb4a3a2e8
Add cppwinrt and cswinrt tests in windowsai nuget pipeline (#4381)
* build e2e cppwinrt tests

* add use nuget task

* make all references to package version prop/target-ified

* remove dupe props/targets reference

* work around project.assets.json error by deleting it

* powershell test invocation

* switch to batch script

* print debug info

* update x86->x64

* stdio.h

* pushd/popd

* add csharp tests

* package.config -> packages.config

* typo

* x86 -> anycpu

* debug is default

* add test path

* update csproj as well

* debug

* really replace all package versions

* debug output

* really use [PackageVersion]

* sleep instead of converting async operation to task and waiting

* dont close software bitmap

* switch to powershell script

* remove binding check

* continue on failure

* continue on error action

* continueOnError and errorActionPreference

* tabbing

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2020-07-07 09:36:42 -07:00
Yufeng Li
612f52c975
add bias for DynamicQuantizeMatmul (#4440)
2020-07-06 22:31:29 -07:00
Pranav Sharma
1f1384f8a9
Update dependency introduced by fuzzing change. (#4438)
2020-07-06 21:56:40 -07:00
Tianlei Wu
eabf6dc9ee
Add Fusion for GPT Attention with both past state and attention mask (#4437)
Add Fusion for GPT Attention with past state and attention mask
2020-07-06 19:37:37 -07:00
gwang-msft
7baf374939
Change the input to NNAPI EP ModelBuilder from ModelProto to GraphViewer (#4389)
* init version to use graph instead of model_proto for IsOpSupported

* move add to modelbuilder to use graph node

* move the rest of model_builder to use graph instead of modelproto

* remove redundant code

* Clear some redundant code

* merge master and some minor style changes

* move the check for whether an initializer is external to the individual op instead of the whole graph

* Addressed comments

* Change GetType and GetShape to log warning info internally to simplify the caller, remove some redundant onnxruntime namespace qualifiers

* add squeeze op support, some more code style clean up

* fix a bug where duplicate output can be added to a subgraph, some other minor logging changes
2020-07-06 18:44:04 -07:00
EronsJ
632b2896f3
Onnxruntime fuzzing (#4341)
* Add protobuf mutator library as a git submodule

* Added files and instructions to build the protobuf mutator library in CMake

* Added fuzzing flag to build system and added fuzzing dependency library. To run fuzzing test use the flags --fuzz_testing --build_shared_lib --use_full_protobuf --cmake_generator 'Visual Studio 16 2019'

* Added src files and build instructions for the main fuzzing engine

* Removed Random number generation test from inside the engine

* Added license header to files

* Removed all pep8 violations introduced by this change and other E501 violations
2020-07-06 16:34:34 -07:00
Cecilia Liu
ec35a1b514
Remove unused initializer in graph after embed fusion (#4436)
2020-07-06 16:04:02 -07:00
Tracy Sharpe
3ef449816c
MLAS: support prepacking APIs for quantized GEMM (#4433)
Add support for prepacking matrix B for use in the quantized GEMMs.
2020-07-06 15:20:10 -07:00
Ashwini Khade
dd73e8c016
add function initialization back to graph resolve (#4434)
2020-07-06 15:17:27 -07:00
liqunfu
0fdb1e9f60
Liqun/roberta (#4408)
Add GLUE RoBERTa example and fix the unused initializer issue in the backend. BERT GLUE expected output updated due to graph changes between June 29 and July 1.
2020-07-06 09:19:30 -07:00
Christian Goll
3588484336
use system libnsync (#4377)
* use system libnsync
2020-07-06 07:53:22 -07:00