Commit graph

5133 commits

Author SHA1 Message Date
Edward Chen
665ecdf9ce
[CoreML EP] Use partitioning utils in CoreMLExecutionProvider::GetCapability(). (#8179)
Use partitioning utils in CoreMLExecutionProvider::GetCapability().
2021-06-30 09:57:36 -07:00
Scott McKay
4993680e56
Graph::GetNodeProvidesGraphOutput -> NodeProducesGraphOutput (#8243)
'GetNode' is a little confusing as it returns a bool.

Update a couple more places where GetNodeOutputsInGraphOutputs was being used unnecessarily.
2021-06-30 20:43:33 +10:00
Scott McKay
b3479367cf
Add helper to check if node provides a graph output. (#8186)
* Add helper to check if node provides a graph output. The current approach unnecessarily creates a vector when most of the optimizers only care about a true/false response.

* Undo accidental change

* Fix a couple of issues due to copying from larger set of changes.
2021-06-30 12:15:42 +10:00
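The motivation for the helper above can be sketched outside the codebase. The snippet below is only an illustration in Python (the real change is in the C++ Graph class); the `Node` class and both function names are hypothetical stand-ins, not onnxruntime APIs. It contrasts building a list of matching outputs with a short-circuiting true/false check.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a graph node; not the onnxruntime type."""
    name: str
    outputs: list = field(default_factory=list)


def outputs_in_graph_outputs(node, graph_outputs):
    # List-building style: always allocates a list of matching output
    # indices, even when the caller only wants to know if any exist.
    return [i for i, out in enumerate(node.outputs) if out in graph_outputs]


def node_produces_graph_output(node, graph_outputs):
    # Boolean helper style: stops at the first match and allocates nothing.
    return any(out in graph_outputs for out in node.outputs)


graph_outputs = {"logits"}
relu = Node("relu", ["relu_out"])
final = Node("matmul", ["logits"])
```

An optimizer that only needs a yes/no answer would call `node_produces_graph_output(final, graph_outputs)` and skip the intermediate list entirely.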
Scott McKay
17d4545ccb
Improve readability of Graph::PerformTopologicalSortAndCheckIsAcyclic. (#8187) 2021-06-30 12:15:17 +10:00
Guoyu Wang
9b19241b27
Disable update database for Android code coverage (#8182) 2021-06-29 18:50:16 -07:00
Ankur Verma
fa8768723a
Allow custom loaders for testing (#8150) 2021-06-29 16:54:36 -07:00
Nick Kreeger
507d97b200
Add initializer for embed layer norm unit tests. (#8196) 2021-06-29 17:57:06 -05:00
Pranav Sharma
9ec0fd6a1c
Revert the cuda algo finding change as this causes a significant memory bloat. (#8181)
* Revert the cuda algo finding change as this causes a significant memory bloat.

* Address PR comment
2021-06-28 22:49:36 -07:00
Thiago Crepaldi
83be3759bc
Add post-install command to build PyTorch CPP extensions from within onnxruntime package (#8027)
ORTModule requires two PyTorch CPP extensions that are currently JIT compiled. The runtime compilation can cause issues in some environments without all build requirements or in environments with multiple instances of ORTModule running in parallel

This PR creates a custom command to compile such extensions that must be manually executed before ORTModule is executed for the first time. When users try to use ORTModule before the extensions are compiled, an error with instructions is raised

PyTorch CPP Extensions for ORTModule can be compiled by running:
python -m onnxruntime.training.ortmodule.torch_cpp_extensions.install

Full build environment is needed for this
2021-06-28 18:11:58 -07:00
Changming Sun
25db5706bb
Change "Export PyTorch CustomOp" build pipeline to use Ubuntu 20.04 (#8158)
Change "Export PyTorch CustomOp" build pipeline to use Ubuntu 20.04
2021-06-28 16:13:55 -07:00
RajalakshmiSR
32ceaf4532
POWER10: Optimized SGEMM in MLAS (#8121)
* POWER10: Optimized SGEMM in MLAS

This patch introduces new optimized version of SGEMM in MLAS
using power10 Matrix-Multiply Assist (MMA) feature introduced in
POWER ISA v3.1. This patch makes use of new POWER10 compute instructions
for matrix multiplication operation.

* Adjust tabs in cmake

Changing tabs to spaces as per review comment.

* Adjust tabs in new sgemm file

Changing tabs to spaces in SgemmKernelPOWER10.cpp.

* Reusing functions using common header

Co-authored-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>
2021-06-28 14:41:08 -07:00
Changming Sun
9b75be3d3e
Fix a warning in pool.cc (#8168)
The warning is:
"Potential comparison of a constant with another constant. at D:\a_work\1\s\onnxruntime\core\providers\cuda\nn\pool.cc@167,21".

It was found by VS static code analyzer in our CUDA EP.
2021-06-28 07:58:02 -07:00
Nick Kreeger
821492f6f5
Drop std::count_if() in *EmbedLayerNorm Ops. (#8161)
* Drop std::count_if() in *EmbedLayerNorm Ops.

Profiling has shown that summing up the vector using the std function
can be 2x slower than just a simple plain vector sum loop.

* try and revert submodule commits

* ensure mask is 1.
2021-06-28 08:36:02 -05:00
Pranav Sharma
523db6ef44
Check for null runoptions in Run (#8163) 2021-06-25 21:34:31 -07:00
Nick Kreeger
588511d6da
Rename embedlayernorm_op_test.cc to embed_layer_norm_op_test.cc (#8160)
* Rename embedlayernorm_op_test.cc to embed_layer_norm_op_test.cc

* cleanup
2021-06-25 21:53:50 -05:00
Nick Kreeger
800b62a139
Create a quantized EmbedLayerNorm for ORT. (#8124)
Create a quantized EmbedLayerNorm Op for ORT
2021-06-25 17:51:43 -05:00
liqunfu
9366114028
make pipelines to support torch1.8.1 and torch1.9.0 (#8084) 2021-06-25 14:55:49 -07:00
Changming Sun
c716b56f26
Update C++ Standard from 14 to 17 (#8041)
Switched the code to C++17. To build ONNX Runtime on old distros like CentOS 7, you need to install a newer GCC from additional repos. If you build onnxruntime with the newer GCC, the resulting binary typically can't be distributed elsewhere because it depends on the new GCC's runtime libraries, which the stock OS doesn't have. On RHEL/CentOS the situation is better. We build our code with Red Hat devtoolset 8/9/10 on CentOS 7. New library features (like std::filesystem) that do not exist in the old C++ runtime are statically linked into the applications, with some restrictions:

1. GCC has a dual ABI, but we can only use the old one. This means std::string is still copy-on-write and std::list::size() is still O(n). Also, if you build onnxruntime on CentOS 7 and link it with binaries that were built on CentOS 8 or Ubuntu with the new ABI and that export C++ symbols directly (instead of using a C API), then it won't work.

2. We still can't use std::optional. This limitation comes from macOS. We will solve it when we get macOS 11 build machines. It won't be too long.

3. Please avoid using C++17 in CUDA files (*.cu), and in the *.h files that they include (like core/framework/float16.h). This is because CUDA 10.2 doesn't support C++17. You are welcome to use the new features in any *.cc files.
2021-06-25 14:08:01 -07:00
Guoyu Wang
9618b6ba62
Fix mac shared_provider warning (#8153) 2021-06-25 13:25:28 -07:00
Changming Sun
a41d0db43c
Enable C# GPU tests in Windows GPU CI pipeline (#8142) 2021-06-25 08:11:45 -07:00
Chi Lo
91075255a7
Enable TRT provider option configuration for C# (updated version) (#7808)
* prepare for C# to configure provider options

* add c# code

* revert modification

* Add update provider info configuration in trt ep side

* fix bugs

* fix bug for compiler error C2259

* Add c# test

* fix bug

* fix bug

* Properly deal with string

* Add c# api for accepting trt provider options

* fix bug

* Modify C# test

* add shared lib test

* Add get provider options functionality

* clean up

* clean up

* fix bug

* fix bugs for CI

* Fix bugs for CI and documentation

* Move TRT EP provider options related functions out of C API

* revert

* fix bug

* refactor

* add check for provider options string

* code refactor

* fix CI bug

* Fix CI bugs

* clean up

* fix bug

* Fix bug for Post Analysis

* fix accidental bug

* Add API_IMPL_BEGIN/API_IMPL_END

* clean up

* code refactor

* code refactor

* fix CI fail

* fix bug

* use string append

* Change the code to better handle strncpy and string append
2021-06-25 03:21:22 -07:00
Ryan Hill
49938cce77
Fix Python Cuda loading issues (#7939) 2021-06-25 02:26:50 -07:00
Changming Sun
378a98597e
Use std::make_reverse_iterator directly 2021-06-24 15:29:39 -07:00
ashbhandare
00e44861c5
Fetching frontier tensors to frontend for ORTModule (#8086)
* Fetching frontier tensors to frontend

* Move before session initialize call
2021-06-24 15:04:35 -07:00
SilvanK4t1qbit
eb36258df4
Enable signed int8 data type for activations in static quantization (#7029)
* Add support for signed int8 static activation quantization. Make symmetrization in quantization switchable
2021-06-24 14:42:22 -07:00
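As a rough illustration of what signed versus unsigned and symmetric versus asymmetric ranges mean for 8-bit linear quantization, here is a minimal sketch. It is not the onnxruntime quantization tool's code; the function name `quant_params` and its parameters are assumptions made for this example, and it uses the common textbook formulas for scale and zero point.

```python
def quant_params(rmin, rmax, signed=False, symmetric=False):
    """Return (scale, zero_point) for 8-bit linear quantization.

    Illustrative only; the name and signature are hypothetical.
    """
    # Integer range of the target type: int8 if signed, else uint8.
    qmin, qmax = (-128, 127) if signed else (0, 255)

    # The representable real range must include zero.
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)

    if symmetric:
        # Symmetrization: widen the range so it is centered on zero.
        bound = max(abs(rmin), abs(rmax))
        rmin, rmax = -bound, bound

    if rmax == rmin:
        # Degenerate all-zero range: any positive scale works.
        return 1.0, 0

    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = round(qmin - rmin / scale)
    # Clamp in case rounding pushed the zero point out of range.
    return scale, max(qmin, min(qmax, zero_point))
```

With `signed=True, symmetric=True` the zero point lands at or next to 0, while the default unsigned asymmetric mode places it wherever real zero falls inside the observed range.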
Ryan Hill
e083d207cf
Disable InitProvidersSharedLibrary when training is enabled. (#8132) 2021-06-24 13:55:56 -07:00
Adam Pocock
7ed9f5fc90
[Java] Fixing the creation of OnnxTensors from scalars, adding tests (#8023)
* Fixing the creation of OnnxTensors from scalars, adding tests.

* Documentation fixes from the review.
2021-06-24 13:21:35 -07:00
Negin Raoof
80b7b134bf
Adding optional ops in contrib ops (#7946)
* Added optional const spec
2021-06-24 13:16:31 -07:00
Sherlock
59e336040c
Ortmodule override torch.manual_seed() (#8131)
* Ortmodule override torch.manual_seed()
2021-06-24 11:51:25 -07:00
Viswanath Boga
b478086bc1
Fuse attention node even in case of different Q,K hidden dimensions (#8106)
* changes to fuse attention node and create varied dimensions

* added an option to optimizer to only do offline fusion

* fixing a typo

* merge with master

* removing extra changes

* added new unit test - test_attention_fusion_for_varied_qkv_dimensions()

* Unit test successful for q,k,v paths with varied dimensions

* adding test model for unit test case

* optimizing attention tests

* removing debugs

* minor change

* addressing comments

* addressing comments

* changed the new option to disable_onnxruntime

* replacing asserts with debugs

* make attn fusion backward compatible for head_size, hidden_size

* preserving behavior for shape_modified_tensor

* adding new option as the last parameter

* cleaning up

* line breaks and spaces

* formatting according to python

* making the changes to fuse attention node without user input

* changes to fusion_attention.py updated

* bringing the code up to python standard
2021-06-24 08:03:21 -07:00
Hariharan Seshadri
4fd7efcf0d
Update logic in props.xml to account for shared provider library changes (#8138) 2021-06-23 20:41:44 -07:00
Changming Sun
f000dfddbe
Update run_dockerbuild.sh: set default python version based on OS version (#8136) 2021-06-23 15:50:03 -07:00
Changming Sun
1fa6986656
Change how numpy version is handled. (#8130)
Numpy has binary compatibility, which means "binaries compiled against a given version of NumPy will still run correctly with newer NumPy versions, but not with older versions." So, if an onnx runtime package was built with numpy version A, then at run time it requires numpy version >=A. In this change, we read numpy version from the installed packages at build time, to avoid manually keeping the build time/runtime consistency.
2021-06-23 14:08:37 -07:00
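The rule described above (built against NumPy version A, so require NumPy >= A at run time) can be sketched as follows. This is a minimal sketch, not the actual build script: the function names are hypothetical, and the use of `importlib.metadata` to look up the installed version is an assumption about how one might do it.

```python
def numpy_requirement(build_version):
    """Turn the NumPy version seen at build time into a runtime requirement.

    Hypothetical helper: binaries built against version A keep working
    with newer NumPy releases, so the package requires numpy>=A.
    """
    version = build_version.strip()
    if not version:
        raise ValueError("empty NumPy version string")
    return f"numpy>={version}"


def installed_numpy_requirement():
    # Assumption: read the version from installed package metadata at
    # build time instead of hard-coding it in a requirements file.
    from importlib.metadata import version
    return numpy_requirement(version("numpy"))
```

A packaging script could pass the result of `installed_numpy_requirement()` into its list of install requirements, so the build-time and run-time versions cannot drift apart.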
Tixxx
db88f3059c
[js] fixing broadcast issues in pack mode (#8090)
* fixing broadcast issues in pack mode

* improved bcast logic for matmul

* removed TODO

* rebased from master
2021-06-23 09:55:19 -07:00
Tracy Sharpe
cbdd59dae9
MLAS: enable SSE 4.1 path for x86 build (#8127) 2021-06-23 09:38:58 -07:00
Xiaoyu Liu
45ce239929
Use dynamic axes in one step beam search output (#8092) 2021-06-23 01:41:32 -07:00
Scott McKay
cccd61e3bc
Add int64 as a required type to ConstantOfShape as it's used by the pytorch converter for Pad. (#8128)
It's also used pointlessly for torch.tensor.repeat (although that usage should always be able to be constant folded).
2021-06-23 14:53:06 +10:00
Edward Chen
b1e21312b5
[Mobile package] Update required operator config with additional ops for newer version of Wav2Vec 2. (#8123)
This is an update to https://github.com/microsoft/onnxruntime/pull/8079
The sample application motivating the original update changed to use an updated version of the model. Now, fewer ops are required. This change removes the previously added ops which are no longer needed.
2021-06-22 19:19:46 -07:00
Evgenii Indenbom
664e548e31
Col2im optimization by eliminating integer multiplications:
1. No padding branch performance is improved 8 times
2. Symmetric padding branch is generalized for asymmetric padding case (padding symmetry was not actually used) and further optimized by eliminating integer multiplications.
2021-06-22 18:44:20 -07:00
Changming Sun
6e2b064aec
Delete some unused code in run_dockerbuild.sh and Enable Nuget CUDA tests (#8089)
1. Remove some unused code and simplify tools/ci_build/github/linux/run_dockerbuild.sh.
2. Enable Nuget CUDA tests. The original design was we could leverage Directory.Build.props and let cmake generate the required properties(USE_CUDA/...) there. However, in nuget packaging pipeline we test the package on a different host that doesn't run cmake command and doesn't have the auto-generated Directory.Build.props file.
2021-06-22 18:43:33 -07:00
Guoyu Wang
f6292d9b38
[Android] Output error message to android log instead of stderr (#8114)
* Output error message to android log instead of stderr

* Address CR comments, move macro to a helper function

* Address CR comments

* Fix ort minimal build break
2021-06-22 17:50:06 -07:00
Guoyu Wang
9003df5d87
Fix 32bit Android java API crash (#8122)
* Fix 32bit Android java API crash

* fix code formatting
2021-06-22 17:41:11 -07:00
Yufeng Li
4bb0e29d0e
initialize generated_value_names with graph input (#8085)
* initialize generated_value_names with graph input
* use set for following usage
2021-06-22 15:08:54 -07:00
Ryan Lai
839f69d249
Implement WINRT_IMPL_LoadLibraryW to avoid calling LoadLibraryW directly (#8065)
* Override load library w in cppwinrt

* Add comment
2021-06-22 14:31:20 -07:00
Shucai Xiao
e7d7fa8fa2
Update migraphx to rocm4.2 (#7994)
* update dockerfile for migraphx ep

* update to rocm4.2

* code cleanup

* fix error related to onnx unit tests
2021-06-22 13:39:51 -07:00
Changming Sun
5809890ba2
Fix a compile error in InferenceTest.cs (#8119) 2021-06-22 13:01:35 -07:00
Sunghoon
8cacb26946
remove debug.keystore from repository due to a credential issue report (#8113) 2021-06-22 10:15:10 -07:00
Chi Lo
27d1784d44
Add TRT 7.1 Pipeline (#8073)
* Revert for testing TensorRT 7.1

* change to original googletest version

* change machine

* remove build arg

* change back machine

* revert back googletest version

* Make it ready to merge to master

* revert onnx-tensorrt to v7.1

* rename yml

* use [[ ]] in bash command

* add sudo

* add chmod

* add correct path

* change another way to revert onnx-tensorrt

* change docker image to manylinux build
2021-06-21 20:57:04 -07:00
chethanpk
3cd06cb38c
Added support for ReduceMean on DNNL EP for CPU and GPU (#7902)
* Added support for ReduceMean on DNNL EP for CPU and GPU

Signed-off-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com>

* Added fix for a resnet model failure where it was failing to create dst shape for reducemean when it was part of a subgraph with other ops

Signed-off-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com>

* Removing the DNNL EP from these unit tests. This is in anticipation of two changes:
- DNNL EP unit tests would be added in a different location later on, so addition of EP individually to these tests will not be necessary
- This was causing a memory leak fail in debug build. The bug is in the EP itself and not in the code added for reducemean. The fix for this is in the i/o handling overhaul which will be added later.

* Update reduction_ops_test.cc

Had accidentally deleted a new line. Making sure there are no unnecessary changes in this file
2021-06-21 17:15:46 -07:00
Du Li
352d560fd5
Adding Conv+Clip fusion (#8102) 2021-06-21 16:30:12 -07:00