3007 commits

Author SHA1 Message Date
Changming Sun
cddddc4d55
Add missing header file to MNIST.cpp (#4773)
Resolve #4766
2020-08-12 21:46:11 -07:00
Tianlei Wu
a69ca63895
add --no_attention_mask option (#4750)
output producer name and version in optimized model.
avoid removing initializer that existed in graph output
2020-08-12 15:56:25 -07:00
jingyanwangms
adda8c66d9
Docker image release pipeline (#4682)
* create orttraining-1p-linux-gpu-ci-pipeline.yml

* fix syntax

* fix file path

* fix template path

* publish docker image to test acr

* use right task name

* change parameter list

* use variables

* use python.version

* remove --enable_onnx_tests due to segfault

* add back --enable_onnx_tests

* fix docker push command line

* change docker login command

* login differently

* fix docker tag script

* create password.txt

* add ortrelease docker image

* enable test in build.sh

* add pipeline parameter

* add pipeline parameter

* change timeout

* change timeout

* fix run_dockerbuild.sh

* use PR checkin build docker

* fix strategy syntax

* fix strategy syntax

* change dockerfile

* change run_dockerbuild.sh

* change tag name

* build with root user

* use build id for docker image tag

* remove all user lines

* change docker tag

* add mpi, mellanox

* add missing args

* use release dockerfile for ci build

* remove install wheel

* use release docker image

* fix syntax

* use different pool

* add Dockerfile.training

* remove sudo to run on Linux-Multi-GPU-V100

* change docker file path

* update dockerfile

* use latest dockerfile

* change agent pool

* remove --preserve-env

* add back parameter

* Add test_flag

* use azuredevops docker

* change repository

* use cmd for docker login

* echo build script

* use ortrelrease ACR

* change key vault connection

* Move --build flag

* change build command

* add parameter for image tag

* clean up for PR

* remove unnecessary changes

* whitespace changes

* whitespace changes

* change build flag

* change flag name

* change flag

* use latest dockerfile

* enable build tests

* build builder stage and run test

* Add back python.version

* change build directory

* always run build entire dockerfile

* fix yml syntax

* fix syntax

* add en-UTF8 locale

* rename

* remove unused template

* Update orttraining-linux-gpu-docker-release-pipeline.yml for Azure Pipelines

* Update orttraining-linux-gpu-docker-release-pipeline.yml for Azure Pipelines

* Test commit sha1 in pipeline

* fix parameter

* update docker file

* fix --from=build

* remove commented blocks

* PR comments

* fix syntax

* fix syntax

* use timestamp as build number

* remove latest tag

* add build_timestamp variable

* remove wrong property

* fix docker run command

* test build id

* Use datestamp build id

* change build tags

* add no-cache to docker build

* rename BUILD_VERSION -> BUILD_CONFIG

Co-authored-by: Jingyan Wang <jingywa@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: Jingyan Wang <jingywa@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-08-12 13:29:37 -07:00
Sheil Kumar
8a66ad79a6
Add Experimental WinRT API IDL as placeholder for adding new winrt features (#4736)
* Add experimental winrt api idl with dummy type to satisfy the build

* remove experimental from the api_lib target

* make experimental api available on windows builds also

* remove /y /d

* revert some pathing changes

* remove experimental api call from tests

* revert cppwinrt cmake changes

* switch to stdapi

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2020-08-12 12:45:19 -07:00
Vincent Wang
7e955960f1
Optimize Slice Kernel by Removing If-statement (#4753)
* Slice kernel optimization.

* remove space

Co-authored-by: Vincent Wang <weicwang@AiFramework2080ti2.corp.microsoft.com>
2020-08-12 16:36:03 +08:00
Josh Bradley
b7254551f0
Add new api function At() (#4457)
* add modern standards to function arguments
* add first version of At for better tensor element access
2020-08-11 18:34:03 -07:00
Scott Bonebrake
38c804a048
Fix broken link to ScoreMNIST.java in Java_API.md (#4213)
2020-08-11 17:36:19 -07:00
Ryan Hill
ac725b53f6
Convert TensorRT provider into a shared library (#4721)
Lots of changes to shared library interfaces, new lighter weight design.
2020-08-10 21:17:16 -07:00
Dmitri Smirnov
ac4997665a
Make Java Publishing and Java GPU pipelines to run nightly (#4749)
Schedule Java daily
  Bump up Linux GPU build timeout
2020-08-10 17:38:45 -07:00
Yang Chen
f51385fd1e
Yanchen/nuphar/clip 11 (#4737)
* [WIP] log unsupported ops in Nuphar

* [Nuphar] added support for clip-11

also added some log information for unsupported ops in Nuphar
2020-08-10 15:45:21 -07:00
Dmitri Smirnov
3530ce541c
Expose IOBinding features via C/C++/C# language bindings. (#4646)
Expose I/O Binding in C/C++/C#
  Expose OrtAllocator, OrtMemoryAllocation, OrtMemoryInfo and OrtIoBinding
2020-08-10 13:33:49 -07:00
Scott McKay
6c33d7f5df
Fix bug in Loop optimization (#4210)
* Fix bug where an optimization to avoid a copy resulted in the iteration num for a Loop subgraph

* Update comments to clarify
2020-08-11 06:31:29 +10:00
Tiago Koji Castro Shibata
082a741636
Move DNNL workaround to EP (#4738)
2020-08-10 13:06:22 -07:00
edgchen1
487665c21f
Transpose MatMul fusion fixes (#4728)
Fix Transpose MatMul fusion handling of existing TransposeScaleMatMul node's attributes and enable support for missing Transpose perm attribute.
Update expected test data to account for floating point calculation differences resulting from the fusion.
2020-08-10 13:00:22 -07:00
Tianlei Wu
316d1a9e69
Update benchmark for large model or model name with non-alphanumeric. (#4743)
* Export model > 2GB using external data format
2020-08-10 12:58:01 -07:00
Vagif
6499a38b7d
Add the missing onnx_proto import (#4705)
* add missing onnx_proto import
* Fix TensorProto usage in calibrate.py
* remove unused imports
2020-08-10 12:46:21 -07:00
Scott McKay
2e3ccc7518
Change order of some checks to workaround a linker issue when /LTCG:incremental is set. (#4713)
2020-08-10 17:54:11 +10:00
Nat Kershaw (MSFT)
24d4f76436
Added explicit instructions to build for Jetson (#4714)
* Added explicit instructions to build for Jetson.

* Update after review
2020-08-09 20:28:20 -07:00
Bowen Bao
abbb7f6f5c
Avoid duplicated calls of postprocess in training frontend (#4579)
2020-08-07 21:34:11 -07:00
stevenlix
77c69a0325
Upgrade TensorRT to v7.1.3.4 (#4704)
* upgrade to TensorRT 7.1.3.4

* Upgrade onnx-tensorrt parser for TensorRT 7.1.3.4

* fix format issue

* fix format issue

* fix format issue

* Update tensorrt_execution_provider.cc

* change cmake version to 3.14

* Remove --msvc_toolset 14.16

* change to onnxruntime::make_unique

* use onnxruntime::make_unique

* disable some tests for TensorRT

* disable some tests for TensorRT

* Update upsample_op_test.cc

* Update tile_op_test.cc

* disable some tests for TensorRT

* Update constant_of_shape_test.cc

* update parser

* Update Dockerfile.ubuntu_tensorrt
2020-08-07 17:43:56 -07:00
Oliver Rausch
9c3153acd6
Improve shape inference for OneHot (#4452)
* Improve shape inference for OneHot

Attempt to get the depth parameter before adding a new symbolic dimension.

* Update symbolic shape infer

* Nit
2020-08-07 14:05:20 -07:00
Tianlei Wu
9c729d1719
Update notebook for mac since onnxruntime 1.3 or 1.4 in mac does not have openmp (#4732)
2020-08-07 14:01:48 -07:00
Marcus Turewicz
37c45c3d6b
C# ResNet50 v2 sample/tutorial (#4722)
C# ResNet50 v2 sample
  Update samples README
2020-08-07 13:36:36 -07:00
Ye Wang
61726e58f0
fix (#4697)
2020-08-07 13:08:41 -07:00
Sergii Dymchenko
c334b5738e
Remove docstring for removed parameter (#4734)
2020-08-07 11:43:36 -07:00
Yufeng Li
b22091dc91
Add the framework to support prepack (#4413)
* add support of prepack
* add support for QAttention and DynamicQuantizeMatMul
* add a use_prepacking option
* add use_prepacking in c_sharp api
2020-08-07 09:39:19 -07:00
zhijxu-MS
33fe770037
Support log sigmoid gradient (#4719)
* add log's gradient op and its related gradient test

* support sigmoid's gradient op

* resolve review comments
2020-08-07 11:21:36 +08:00
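The commit above adds gradient ops for Log and Sigmoid so that log(sigmoid(x)) can be differentiated by chaining them. The sketch below illustrates that chain rule in plain Python and checks it against a finite difference. It is not the ONNX Runtime kernel code; the function names (`log_grad`, `sigmoid_grad`, `log_sigmoid_grad`) are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def log_grad(x, dy):
    # Gradient of y = log(x) with respect to x, given upstream gradient dy.
    return dy / x


def sigmoid_grad(y, dy):
    # Gradient of y = sigmoid(x) with respect to x, written in terms of the output y.
    return dy * y * (1.0 - y)


def log_sigmoid_grad(x, dy=1.0):
    # Chain the two gradients: first through log, then through sigmoid.
    s = sigmoid(x)
    return sigmoid_grad(s, log_grad(s, dy))


def numeric_grad(f, x, eps=1e-6):
    # Central finite difference, used only as an independent check.
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


if __name__ == "__main__":
    for x in (-2.0, 0.0, 1.5):
        analytic = log_sigmoid_grad(x)
        numeric = numeric_grad(lambda v: math.log(sigmoid(v)), x)
        print(x, analytic, numeric)
```

Composing the two gradients simplifies algebraically to 1 - sigmoid(x), which is what the finite-difference check should approximate.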
Wei-Sheng Chin
7905c57f43
Revert "Remove code which is not thread-safe. (#4454)" (#4712)
* Revert "Remove code which is not thread-safe. (#4454)"

This reverts commit 5222b2c6c0.

* Resolve race condition

* More thread-safe changes

* Remove unused lock

Polish comments
2020-08-06 18:42:05 -07:00
ashbhandare
fc2f36c608
Shape independent gradient builder for Concat (#4675)
* Add gradient for ConcatTraining

* Graph rewriter changes for concat

* Add generated onnx graph, minor fixes

* Revert unintended change

* Fix for MaxPoolGradTest

* Fix UT

* Review comments, windows tests

* Review comments
2020-08-06 14:39:33 -07:00
gwang-msft
8507bc1f48
[Android NNAPI EP] Enable test for BatchNormalization, enable dev_mode for Android, fix some issues in concat (#4715)
* update batch_norm test, enable dev_mode for nnapi, ignore onnx protobuf warning for nnapi ep

* fix some issues in concat and mark input without shape as not supported for now

* address review comments

* addressed comments
2020-08-06 14:11:59 -07:00
suffiank
4d39c6a6cb
Wire log(softmax) grad cuda kernel and add log(softmax) grad cpu kernel (#4726)
* logsoftmax cuda kernel

* add cpu logsoftmaxgrad

* revert debug printout

* revert disable for debug builds

* use /alpha x + y instead

* remove misleading log_softmax_ bool

Co-authored-by: suffian khan <sukha@OrtTrainingDev1.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-08-06 10:49:08 -07:00
KeDengMS
9a73c8f448
ReshapeGrad optimization (#4708)
* Reshape optimization

* Refactor the Reshape optimization to be more generic

Co-authored-by: Ke Deng <kedeng@OrtTrainingDev1.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-08-05 23:26:02 -07:00
suffiank
005fa5c3ae
Add initial Dockerfile for distributed training targets (#4578)
* add training dockerfile tested for examples repo

* forgot pytorch patch for build from source

* make apt-get update -y adjacent apt-get install -y due to Docker caching rules

* comment for mellanox libraries

* mpi4py comment as I forgot where it came from

* apparently curl not included anymore

* grr.. nvidia change nccl location

* dont need findnccl.patch after nvidia changed nccl location

* pr comment /opt/ompi4 => /opt/openmpi-xxx

* switch to pip install pytorch

* use Release instead of RelWithDebInfo

* comment wording

* wording

* missed RelWithDebInfo => Release

* replace Mellanox with libibverbs

* stale comment

* ordering

* no more ninja

* add / at end of copy

* update cgmanifest.json

* pr comments

Co-authored-by: suffian khan <sukha@OrtTrainingDev1.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-08-05 18:54:54 -07:00
RandySheriffH
e802b0498f
EnrichPyOpUT (#4681)
* cancel night build on pyop

* enrich PyOp UTs

* init script only once

* remove space

* update models

* Show usage of kwargs in doc
2020-08-05 14:11:56 -07:00
Yang Chen
43142a8225
[Nuphar] added Gemm-to-MatMul conversion in model editor (#4691)
* [Nuphar] added Gemm-to-MatMul conversion in model editor

* added a mode gemm_to_matmul that turns Gemm Ops into MatMul Ops

* enabled model_quantizer to quantize MatMul inside a Loop op

* this PR also included Gemm-11 support from Ke Deng

* Fixed a couple of existing bugs

Fixed a couple of old bugs exposed by the newly-added tests and the support
of Gemm-11, including:

* correctly handle aliasing among states and outputs in Scan

* fixed a transpose issue in building tvm IR for MatMul

* fixed an issue related to generating IR for computing Gemm alpha

* disabled several tests that triggered some deep issue (likely) in
  the graph partitioner. I think it might be better to have a separate
  PR to address the issue.
2020-08-05 13:31:30 -07:00
Sheil Kumar
5c5efa900d
Add .NET Core 3.0 nuget e2e pipeline tests (#4695)
* bump cswinrt version

* add cswinrt

* test dotnetcore 3.0

* rename buildpackage source

* set folder path to the package source and not the version

* refactor .netframework tests

* build .net core anycpu

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2020-08-05 13:02:24 -07:00
Sherlock
eb0f57f0e4
Localized Recompute for Gelu and AttentionDropout (#4402)
* Gelu Activation Recompute Draft

* Prototype for localized recompute

* Introduce localized_recompute rewriter

* Command line args for enabling recompute

* Add logger to Gradient Graph Builder

* use const when possible
2020-08-04 21:48:15 -07:00
gwang-msft
0933148fc3
[Android NNAPI EP] change most of the exceptions to return Status (#4701)
* change some function from throw to return status

* move more functions to return status

* move most of the exception to return status

* move logsink on android from CLogSink to AndroidLogSink

* addressed comments

* add type attr to check if the return status is used in compile
2020-08-04 21:37:27 -07:00
Changming Sun
1e054739b8
Remove the requirement of CUDA's version.txt (#4706)
Sometimes there is a file named "version.txt" in your CUDA installation dir, but sometimes there isn't one. I couldn't figure out why, but the latest CUDA 11 on our CI build machines doesn't have this file. As the file is not needed for building onnxruntime, I removed the check.
2020-08-04 17:03:40 -07:00
edgchen1
9d7284fc3b
Enable MatMul + Scale fusion (#4669)
Update TransposeMatMul to support scaling of the matrix product by a constant scalar value (analogous to the GEMM alpha parameter). Rename TransposeMatMul to TransposeScaleMatMul.
Fuse MatMul with surrounding Mul/Div with constant scalar into TransposeScaleMatMul.
2020-08-04 16:27:22 -07:00
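The fusion described in the commit above relies on a constant scalar commuting with the matrix product, so a Mul or Div by a scalar before or after MatMul can be folded into a single scaled product. The sketch below demonstrates that identity with plain Python lists. It is an illustration only, not the TransposeScaleMatMul kernel; `fused_scale_matmul` and the other helper names are hypothetical.

```python
def matmul(a, b):
    """Naive matrix product of two row-major nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def scale(m, alpha):
    """Multiply every element of m by the scalar alpha."""
    return [[alpha * v for v in row] for row in m]


def fused_scale_matmul(a, b, alpha):
    """Hypothetical fused form: apply alpha once per output element."""
    return [[alpha * sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
ALPHA = 0.5

scaled_after = scale(matmul(A, B), ALPHA)   # MatMul followed by Mul
scaled_before = matmul(scale(A, ALPHA), B)  # Mul followed by MatMul
fused = fused_scale_matmul(A, B, ALPHA)     # single fused op

if __name__ == "__main__":
    print(scaled_after, scaled_before, fused)
```

The values here are chosen so that all three forms agree exactly; with arbitrary floating-point inputs the results can differ in the last bits, which is consistent with a later commit in this log (#4728) updating expected test data for the fusion.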
Ryan Lai
f9bd52f852
Log telemetry for WinML Native API for setting intra op num usage (#4700)
Co-authored-by: Ryan Lai <ryalai96@gmail.com>
2020-08-04 09:44:23 -07:00
Tim Harris
4bd9e8d05c
Stress-test and fix thread pool when work queues are full (#4690)
While investigating an unrelated issue, I noticed that the thread pool may drop tasks when a burst of 1024+ tasks is submitted by a thread from inside the pool. Today, in general, we execute work synchronously in this case. However, there is a bug where work submitted by a thread already inside the pool will be discarded instead of executed. Currently the only scenario where I can see this occurring is when the parallel executor is used with a model in which such a large number of nodes become eligible to run all at once. This PR fixes the underlying issue and adds a test case for burst-submission of work.
2020-08-04 10:19:49 +01:00
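The behaviour the commit above describes, running work synchronously when the queue is full rather than dropping it, can be sketched with a small bounded pool. This is a minimal illustration using Python's standard library, not the ONNX Runtime thread pool; `BoundedPool`, `run_burst`, and the capacities used are assumptions made for the example.

```python
import queue
import threading


class BoundedPool:
    """Toy thread pool with a bounded work queue (illustrative only)."""

    def __init__(self, num_threads, capacity):
        self._q = queue.Queue(maxsize=capacity)
        for _ in range(num_threads):
            threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        while True:
            task = self._q.get()
            try:
                task()
            finally:
                self._q.task_done()

    def submit(self, task):
        try:
            self._q.put_nowait(task)
        except queue.Full:
            # Queue is full: run the task in the submitting thread
            # instead of discarding it, even if that thread is a worker.
            task()

    def wait(self):
        # Block until every queued task has been processed.
        self._q.join()


def run_burst(n_tasks=2000, num_threads=2, capacity=16):
    """Submit a burst of tasks from inside the pool and count completions."""
    pool = BoundedPool(num_threads, capacity)
    lock = threading.Lock()
    done = [0]

    def unit():
        with lock:
            done[0] += 1

    def burst():
        for _ in range(n_tasks):
            pool.submit(unit)

    pool.submit(burst)
    pool.wait()
    return done[0]


if __name__ == "__main__":
    print(run_burst())
```

Because `submit` never blocks, a worker that floods its own pool cannot deadlock waiting for queue space, and every task is either queued or executed inline.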
Changming Sun
d0297f8d24
Add 'Install ONNX' step to Windows GPU pipeline (#4696)
Add 'Install ONNX' step to Windows GPU pipeline

Previously this was not a problem because the onnxruntime python package explicitly declared a dependency on ONNX, so ONNX was installed whenever we tested onnxruntime. However, that dependency was removed in #4073.
2020-08-03 18:51:24 -07:00
Hariharan Seshadri
49febba3c2
Support int32 and int64 types for Tile CUDA kernel (#4684)
* Support int32 and int64 types for Tile CUDA kernel

* Fix build
2020-08-03 17:47:37 -07:00
Ori Levari
e6ef3653a7
Add Named Dimension Override API to LearningModelSessionOptions (WinML) (#4606)
Co-authored-by: Ori Levari <orlevari@microsoft.com>
2020-08-03 16:04:21 -07:00
Dmitri Smirnov
bb9b452a88
resolves #3101 - fix nuget package restore for sdk-style projects (#4680)
Co-authored-by: Christof Senn <christof.senn@gmail.com>
2020-08-03 15:27:48 -07:00
Hariharan Seshadri
0828a900e1
Support easy way to request verbose logging in test runner (#4676)
2020-08-03 15:25:55 -07:00
Changming Sun
01ca6392cb
Avoid building ONNX of every history ONNX versions in our CI (#4678)
1. Avoid building every historical ONNX version in our CI; it is costly and prone to failure.
2. Run the docker command without sudo. Previously the user was not in the docker group; Azure DevOps Service has now added it.
2020-08-03 10:18:10 -07:00
Tianlei Wu
e70e9e2f67
refine machine_info and output onnxruntime_tools version (#4679)
* output onnxruntime_tools version
* change get_machine_info return data type to string
2020-08-02 18:20:59 -07:00
Boris Fomitchev
6958f49dae
Added Dockerfile and build instructions for Jetson. Also set CUDA arch set automatically. (#4637)
* Revert "Remove docstrigs if __ONNX_NO_DOC_STRINGS" (#4495)

This reverts commit bb4d331fa7bf1fe8d68b1527dda56e4739c80800.

* Bump version to 1.4.0 (#4496)

* Create N-1 threads in intra-op pool, given main thread now active (#4493)

Create N-1 threads in a thread pool when configured with intra-op parallelism of N. This ensures we have N active threads, given that the main thread also runs work. To avoid ambiguity on the value returned, rename ThreadPool::NumThreads method to ThreadPool::DegreeOfParallelism, and make corresponding updates in MLAS and operators.

* Conditionally compile without std::is_trivially_copyable to satisfy old GCC versions. (#4510)

* Adding CUDA arch flags for NVIDIA Jetson

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Added Dockerfile for Jetson and instructions to build wheel and image

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Removing guess about nvcc location

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Restoring pip3 setuptools install order

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Updated README with links and notes re NVIDIA Docker runtime

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Added mention of nvidia-docker

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Addressing code review comments

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Addressing code review comments

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

Co-authored-by: Tiago Koji Castro Shibata <ticastro@microsoft.com>
Co-authored-by: Dmitri Smirnov <yuslepukhin@users.noreply.github.com>
Co-authored-by: Tim Harris <tiharr@microsoft.com>
Co-authored-by: edgchen1 <18449977+edgchen1@users.noreply.github.com>
2020-07-31 23:49:23 -07:00
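The intra-op pool change quoted in the commit above (#4493) creates N-1 worker threads for a configured parallelism of N, because the calling thread also runs work. The sketch below shows that arrangement in plain Python. It is not the ONNX Runtime ThreadPool; `parallel_for` and its striding scheme are hypothetical choices made for the example.

```python
import threading


def parallel_for(n_items, degree_of_parallelism, fn):
    """Apply fn to 0..n_items-1 using degree_of_parallelism active threads.

    Only degree_of_parallelism - 1 threads are spawned; the caller
    processes one shard itself and so counts as the Nth active thread.
    """
    results = [None] * n_items

    def run_shard(shard):
        # Each shard handles indices shard, shard + N, shard + 2N, ...
        for i in range(shard, n_items, degree_of_parallelism):
            results[i] = fn(i)

    workers = [
        threading.Thread(target=run_shard, args=(s,))
        for s in range(1, degree_of_parallelism)
    ]
    for w in workers:
        w.start()
    run_shard(0)  # the calling thread does its share of the work
    for w in workers:
        w.join()
    return results, len(workers)


if __name__ == "__main__":
    squares, spawned = parallel_for(10, 4, lambda i: i * i)
    print(spawned, squares)
```

Returning the number of spawned workers separately from the degree of parallelism mirrors the ambiguity the commit mentions when it renames ThreadPool::NumThreads to ThreadPool::DegreeOfParallelism.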