Commit graph

11997 commits

Author SHA1 Message Date
Hariharan Seshadri
eabc1616e6
Rename variable in InferenceSession class so as to not clash with an existing var (#4391)
* Rename variable in InferenceSession class so as to not clash with an existing var

* Fix build break
2020-07-02 12:27:14 -07:00
suffiank
f6bf66c8cf
Adjustments to MPI and NCCL library discovery on build (#4407)
* cmake edits for mpi_home and nccl_home

* cmake syntax error on else
2020-07-02 12:03:42 -07:00
ISS Build Account
ff64371367 Merge remote-tracking branch 'upstream/master' into DmlDev 2020-07-02 19:01:54 +00:00
dependabot[bot]
f4e0070c2e
Bump mysql-connector-java from 8.0.15 to 8.0.16 in /tools/perf_util (#4401)
Bumps [mysql-connector-java](https://github.com/mysql/mysql-connector-j) from 8.0.15 to 8.0.16.
- [Release notes](https://github.com/mysql/mysql-connector-j/releases)
- [Changelog](https://github.com/mysql/mysql-connector-j/blob/release/8.0/CHANGES)
- [Commits](https://github.com/mysql/mysql-connector-j/compare/8.0.15...8.0.16)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2020-07-02 11:22:45 -07:00
ISS Build Account
84a34a38d0 Merge remote-tracking branch 'upstream/master' into DmlDev 2020-07-02 17:53:13 +00:00
gwang-msft
0bef9d5114
Fix the broken Android NNAPI CI (#4403)
* Change NNAPI CI to run on new NNAPI EP

* update android ci to mac 10.15 and remove in install cmake

* update the android ci to target android api level 29

* remove unnecessary ndk install git submodule call
2020-07-02 10:22:18 -07:00
Ashwini Khade
ef602835b0
update getfunctionbody (#4396) 2020-07-02 09:00:37 -07:00
Changming Sun
3bb6a865cc Revert "remove openmp and scipy from build pipelines (#4305)" 2020-07-02 00:30:02 -07:00
ISS Build Account
5cb5081c8f Merge remote-tracking branch 'upstream/master' into DmlDev 2020-07-02 05:58:34 +00:00
S. Manohar Karlapalem
4c0236d6c1
Update MCR container instructions with dynamic device selection info (#4371) 2020-07-01 22:16:55 -07:00
Tracy Sharpe
5c23b17196
MLAS: more prepacking kernel changes (#4397)
Kernel changes to support StrideK>128
2020-07-01 22:09:42 -07:00
Xiang Zhang
d4341ea2de Merged PR 4870266: Refactor fused graph kernel so dmlxp and ort share the same code
Related work items: #26719246
2020-07-02 01:24:12 +00:00
Sherlock
2d54c89d77
Update filename and Cleanup unused cudnn kernels (#4387)
* Update filename and Cleanup unused cudnn kernels

* Cleanup unnecessary dependency
2020-07-01 17:19:49 -07:00
Yang Chen
010445fc52
handle Floor and SplitToSequence (#4384)
* handle Floor and SplitToSequence

added support to Floor and SplitToSequence ops

* Address CR

use sympy.floor for computation on Floor
2020-07-01 16:09:43 -07:00
ISS Build Account
6ee6fdbbba Merge remote-tracking branch 'upstream/master' into DmlDev 2020-07-01 22:50:58 +00:00
Yufeng Li
473cd5545f
Simple support of MatMul U8S8 on ARM to pass tests (#4392) 2020-07-01 15:18:02 -07:00
ISS Build Account
bffbae3398 Merge remote-tracking branch 'upstream/master' into DmlDev 2020-07-01 21:43:30 +00:00
Bowen Bao
7ec9a73202
deprecate frontend layernorm postpass (#4372) 2020-07-01 13:06:03 -07:00
Tiago Koji Castro Shibata
7fea332f93
Support builds without RTTI (#4333)
* Support builds without RTTI

* Disable RTTI in all builds
2020-07-01 13:05:35 -07:00
liqunfu
5dcb9b4858
Liqun/backprop deterministic graph (#4315)
make gradient graph deterministic
add to session option use_deterministic_compute.
2020-07-01 12:39:10 -07:00
Zhang Lei
94c98aa0a7
qlinaradd for arm/sse2/avx2 using intrinsic, enable binary broadcasting parallel (#4216)
* Support quantization linear binary element wise math ops, implement QLinearAdd.
Support tests for quantization linear binary element wise math ops, implement test for QLinearAdd.
Add QlinearAdd with SSE2 intrinsic implementation, Avx2 assembly implementation, Neon intrinsic support.
QLinearAdd support VectorOnVector, VectorOnScalar, ScalarOnVector.
Generalized QlinearBinaryOp parallel related with broadcasting.

* Modify according to PR feedbacks. Mainly:
    * template helper for generalize the qladd logic on v2v, s2v, v2s
    * remove GetKernel related.
    * change mixed legacy MM/SSE code in the AVX code
    * formatter, typos, conventions, etc.

* Utilize MlasSubtractInt32x4 in MlasDequantizeLinearVector().

* Some format fix.

* More natural parallel parameter type.

* Fix build break for x86.

* Comment goes to 80 before wrap.

* Many changes on assembly, macro related.
Use vminps rather than vpminsd to handle NaN.
Tested on Windows.

* Using CLang Format to format the file.

* Fix arm32 build error.

* Remove some duplicate in different #if defined

* working add.u8.vector to vector

* Fix runtime bus error on real arm32 linux.

* fix typo in store last one lane.

* arm32 qlinearadd handle scalar.

* Move qladd to separate c++ file

* Add neon64 qladd.

* refactor some, enhance two instructions on arm64 only instructions

* Fix typo for arm64

* use strict op in pure c++ (min/max on float value)

* sse2 new version.

* merge arm/sse2/avx2

* pass arm/sse/avx2 linux test

* remove non-used assembly file.

* Remove unused data definition and trailing spaces.

* Fix broadcasting parallel issue.

* Enhance broadcasting scenarios. Allow testing result diff due to round
on half.

* Add Mlas or MLAS_ prefix for namespace safety.

* Handle alignment issue for arm32 for GCC/MSVC. remove some unused
signed/unsigned int ops.

* Specify /arch:AVX2 for qladd_avx2.cpp

* Fix type during copy/paste when unrolling. Better one GreatEqual
condition. Better formatting by splitting two statements on a single line.

* Arm neon alignment parameter is bits rather than bytes, change it.

* Move qladd_avx2.cpp to intrinsics/avx2/ folder

* Formatting using mlas style.

* Double check mlas style for these files.

* change indent 2 to 4 for qladd_avx2.cpp

* Fix windows x86 build error due to sse2 no _mm_cvtsi128_si64

* To re-trigger all as old failed pipeline updated.

Co-authored-by: Lei Zhang <phill.zhang@gmail.com>
2020-07-01 11:54:44 -07:00
Dmitri Smirnov
49268c42da
Change the way java home is set on Mac OS for CI and Java publishing pipeline (#4385)
* Change the way java_home is set on Mac.

* Change the way JAVA_HOME is set on Mac OS
2020-07-01 07:37:14 -07:00
Ryan Lai
1b26dfc8ac Merged PR 4868876: Merge Onnxruntime github with dmldev
Related work items: #27289842
2020-07-01 01:28:32 +00:00
Ryan Lai
531126916b Merge remote-tracking branch 'upstream/master' into HEAD 2020-06-30 18:22:57 -07:00
Ryan Lai
4051667d01 Merged PR 4868833: Fix merge conflict
No files need to be changed but still showing up as merge conflict.

Related work items: #27289842
2020-07-01 01:19:05 +00:00
Ryan Lai
f422bcb372 Merged PR 4868758: Fix merge conflict in merge of Onnxruntime to DmlDev
Merge conflict in OperatorRegistration.cpp

Related work items: #27289842
2020-07-01 01:01:11 +00:00
Ryan Lai
018269f29f Merged PR 4868304: Manual merge of Onnxruntime github into DmlDev
There was a merge conflict in OperatorRegistration.cpp

Related work items: #27289842
2020-06-30 23:42:51 +00:00
Sherlock
6365760906
BiasDropoutFusion (#4167)
* Implement BiasDropout Fusion and Kernel

Dropout kernel for residual input

BiasDropout Fusion to take residual input

Fix BiasDropout Kernel

Optimize DropoutGrad with 4 elements per thread

* Add graph transformer UT

* MLTypeCallDispatcher for RatioData

* Use MLTypeDispatcher for ratio tensor

* Handle training_mode input for BiasDropout fusion

* Add test case for missing ratio input

* Replace using FinalizeNodeFusion

* Make BiasDropout kernel template-less

* Make DropoutGrad template-less

* Make Dropout and TrainableDropout template-less

* Regenerate onnx file for UT

* Minor fix on divmod in BiasDropoutKernel

* Adjust pt frontend test due to dropout randomness

* Make dropout kernel operation in fp32

Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-06-30 15:43:14 -07:00
Ashwini Khade
0404763f23
Update function body initialization for ONNX functions (#4332)
* Update function body initialization

* minor fix

* changes per review comments

* minor fix

* format fix

* add function initialization in mixed precision transformer

* more updates

* more fixes
2020-06-30 14:30:59 -07:00
Dwayne Robinson
7cb2c3f025 Merged PR 4852260: DML EP remove redundant rank checks for higher dimension support
Remove redundant checks in the DML EP which should instead rely on DML's validation. At least one of these checks wrongly prevents legitimate execution (5D Concat is supported in DML, but the DML EP blocks it 🤦‍♀️). Note this is a small aspect of the larger work (not sufficient to make the models fully work) that I thought I'd flush now while I had the change ready anyway due to investigation.

Related work items: #23232293, #25707941
2020-06-30 21:26:16 +00:00
Cecilia Liu
37b624b688
Match More EmbedLayerNormalization Patterns for Bert Model Graph Fusion (#4354)
match more embed patterns for bert base cased
2020-06-30 13:12:50 -07:00
Tracy Sharpe
755675541a
NCHWc + Sigmoid optimization (#4360)
Add support to avoid reordering NCHWc tensors due to the Swish activation (x * sigmoid(x)) in EfficientNet/EfficientDet models.
2020-06-30 10:50:58 -07:00
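The commit message for 755675541a above defines the Swish activation as x * sigmoid(x). A minimal scalar sketch of that formula follows; it only illustrates the math and says nothing about how the NCHWc optimization is implemented. The function names `sigmoid` and `swish` are illustrative, not taken from the repository.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # safe: x < 0, so e is in (0, 1)
    return e / (1.0 + e)


def swish(x: float) -> float:
    """Swish activation as described in the commit message: x * sigmoid(x)."""
    return x * sigmoid(x)


if __name__ == "__main__":
    for value in (-2.0, 0.0, 1.0, 3.0):
        print(value, swish(value))
```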
ytaous
4380b8ba68
adjust bs size (#4375)
Co-authored-by: Ethan Tao <ettao@microsoft.com>
2020-06-30 10:29:48 -07:00
Ashwini Khade
89c6da99b5
fix output shape calc for matmul (#4362) 2020-06-30 08:21:20 -07:00
Faith Xu
a4127fc185
Add stale bot (#4323)
* Add stalebot

* Update exemptLabels
2020-06-30 01:51:09 -07:00
Tianlei Wu
55f25a4bbf
Update Attention op to support attention mask for GPT-2 (#4330)
* Support two more formats of mask_index input: 2D attention mask, or 1D mask index with end and start positions.
* Update dynamic axes of gpt2 with past state
* Update script to fuse model with attention mask
2020-06-29 23:26:23 -07:00
Weixing Zhang
2601f8e1b4
Support to build CUDA EP for NV Ampere GPU (#4345)
Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
2020-06-29 21:46:13 -07:00
Hariharan Seshadri
465140b384
Misc fixes to Conv and ConvTranspose CUDA kernels (#4281) 2020-06-29 16:07:42 -07:00
Changming Sun
35a048ef9b
Ignore one failed test in DML (#4366)
2020-06-29 08:51:32.9157882 [E:onnxruntime:Default, runner.cc:452 DataRunner::RunTaskImpl] keras2coreml_Dense_ImageNet:output=output1:expected 0.233292 (3e6ee400), got 0.231783 (3e6d587b), diff: 0.00150879, tol=0.00123329 idx=52. 1 of 255 differ
2020-06-29 14:27:06 -07:00
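The log line quoted in commit 35a048ef9b above prints each value alongside an eight-digit hex string. The sketch below assumes those strings are the IEEE 754 float32 bit patterns of the values and rechecks the reported difference against the reported tolerance. The helper name `bits_to_float32` is hypothetical. The last check is an inference from the numbers only: they happen to be consistent with a tolerance of 1e-3 + 1e-3 * |expected|, but the log does not state how the runner computes `tol`.

```python
import struct


def bits_to_float32(hex_bits: str) -> float:
    """Interpret an 8-digit hex string as a big-endian IEEE 754 float32."""
    return struct.unpack(">f", bytes.fromhex(hex_bits))[0]


# Values copied from the log line in the commit message.
expected = bits_to_float32("3e6ee400")  # logged as 0.233292
got = bits_to_float32("3e6d587b")       # logged as 0.231783
logged_tol = 0.00123329

diff = abs(expected - got)

# Assumption (not stated in the log): absolute plus relative tolerance of 1e-3 each.
assumed_tol = 1e-3 + 1e-3 * abs(expected)

if __name__ == "__main__":
    print(f"expected={expected:.6f} got={got:.6f} diff={diff:.8f}")
    print(f"logged tol={logged_tol} assumed tol={assumed_tol:.8f}")
    print("exceeds tolerance:", diff > logged_tol)
```

Under these assumptions the decoded values reproduce the logged ones, and the difference is larger than the tolerance, which matches the reported failure.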
gwang-msft
5f4e63ede6
Add nhwc support for NNAPI EP, add concat op, handle concurrent calls to NNAPI model (#4356)
* add support to internally transpose nchw input to nhwc and only transpose back if it is necessary

* more changes in nchw<->nhwc, fixed small issue in concat

* Add option for NNAPI to run on [all device]s/[cpu onl]y/[non-cpu only]

* minor code style changes
2020-06-29 11:55:45 -07:00
Tiago Koji Castro Shibata
88402f5293
Make DML operator registration constexpr (#4219)
* Make DML operator registration constexpr

* Refactor requiredConstantCpuInputs template

* Revert "Refactor requiredConstantCpuInputs template"

MSVC crashes compiling the new constexpr with "Internal compiler error"

* Fix braces style
2020-06-29 00:54:43 -07:00
ISS Build Account
ee48b89350 Merge remote-tracking branch 'upstream/master' into DmlDev 2020-06-28 12:27:21 +00:00
ISS Build Account
9d1c86f7a5 Merge branch 'DmlDev' of https://microsoft.visualstudio.com/DefaultCollection/WindowsAI/_git/onnxruntime into DmlDev 2020-06-28 12:27:21 +00:00
Hariharan Seshadri
012aaa6491
Minor optimization in CUDA Reduction ops (#4353) 2020-06-28 01:14:28 -07:00
Scott McKay
274e6b4153
Cleanup SessionState. Move allocator lookup to SessionState. (#4194)
* Move allocators to SessionState so they're decoupled from ExecutionProviders
  - when looking up an allocator it's based on OrtMemoryInfo not the EP so SessionState is a more natural place for that information to be stored
  - add device based lookup
    - simplifies logic for copying feeds/fetches across devices
Cleanup SessionState and SessionStateInitializer
  - provide more things to SessionState at construction time so we don't construct an instance and immediately afterwards call a bunch of setters
  - simplify SessionStateInitializer
    - reduced down to FinalizeSessionState method
2020-06-28 14:55:42 +10:00
ISS Build Account
386046426b Merge remote-tracking branch 'upstream/master' into DmlDev 2020-06-27 12:25:37 +00:00
ISS Build Account
8b0968b622 Merge remote-tracking branch 'upstream/master' into DmlDev 2020-06-27 11:07:46 +00:00
S. Manohar Karlapalem
4a1ecd9879
[OpenVINO] Upgrade OpenVINO docker base to Ubuntu 18.04 (#4346)
* update deps installer to ov 2020.3

* Upgrade docker base to Ubuntu 18.04
2020-06-27 01:57:47 -07:00
Du Li
d1777910a8
fix onnx server build failure. (#4347) 2020-06-26 15:15:58 -07:00
liqunfu
c3c4ce5ceb
refactor prototypes into headers (#4337)
* refactor prototypes into headers
2020-06-26 12:02:14 -07:00