6181 commits

Author SHA1 Message Date
Edward Chen
edd1a2cf61
Add more Java test logging. (#10221) 2022-01-10 09:42:46 -08:00
Dwayne Robinson
0f5e82c294
DirectML EP remove stale code for int64 via int32 double strides (#9959) 2022-01-10 02:07:22 -08:00
Dwayne Robinson
1f5b073508
Minor DirectML EP provider factory comments (#9965) 2022-01-10 02:06:31 -08:00
PeixuanZuo
7d93498e0e
[FIX] register softmaxgrad_13/logsoftmaxgrad_13 for rocm (#10177)
* [FIX] register  softmaxgrad_13/logsoftmaxgrad_13 for rocm
* [FIX] update softmaxgrad_13/logsoftmaxgrad_13 implementation for rocm
2022-01-10 11:33:46 +08:00
Scott McKay
6e88c11cae
Refactor QDQ node group selection infrastructure (#10195)
* Separate out the QDQ node group selection from the SAT specific NodeSelector to make re-use in NNAPI etc. cleaner.

* Make MatMulIntegerToFloat matching optional.
Add move ctor to BaseSelector. Required now that it has a unique_ptr member.

* Avoid Guardian warning by using rvalue unique_ptr created with make_unique
2022-01-10 10:57:50 +10:00
Nat Kershaw (MSFT)
d52d3c0052
Update C/C++ API docs automation to create a PR (instead of push to publish branch) (#10093) 2022-01-07 16:16:47 -08:00
Ye Wang
5ebb857501
Update onnxruntime_unittests.cmake (#10215) 2022-01-07 16:14:15 -08:00
vade
bacae967a2
Update Cuda to 11.4.2, update architectures, support Ubuntu 20.04 (#10169) 2022-01-07 13:00:44 -08:00
Zhang Lei
2bbf1ac1e0
Using better words. (#10210) 2022-01-07 09:17:23 -08:00
Jeff Daily
e7efcc93fe
[ROCm] update hipify-perl location (#10102)
* [ROCm] update hipify-perl location

Depending on the ROCm version installed, hipify-perl might not always
live in the hard-coded path of /opt/rocm/bin. Use python 3.3's
shutil.which to locate the script.

* provide alternative locations for hipify-perl if not in PATH

* implement hipify-perl search as a function

This avoids running the logic during module import since all builds
import the amd_hipify module.

* fix flake8 errors
2022-01-06 17:21:02 -08:00
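The lookup described in the commit above can be sketched roughly as follows. This is an illustration only, not the code from the commit: the function name `find_hipify_perl`, its parameters, and the fallback directory list are assumptions. Only `shutil.which` and the `/opt/rocm/bin` location come from the message.

```python
import os
import shutil


def find_hipify_perl(search_path=None, extra_dirs=("/opt/rocm/bin",)):
    """Return a path to hipify-perl, or None if it cannot be found.

    Hypothetical helper: the name, parameters, and fallback list are
    assumptions made for this sketch.
    """
    # First choice: look on PATH (or on the explicit search_path, if given).
    found = shutil.which("hipify-perl", path=search_path)
    if found:
        return found
    # Fallback: probe alternative install directories.
    for directory in extra_dirs:
        candidate = os.path.join(directory, "hipify-perl")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

Keeping the search inside a function means nothing runs at import time, which matches the motivation given in the message: every build imports the module, but only ROCm builds need the script.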
Abhishek Jindal
4ac3277743
adding definition of concat operator for mapping it to onnx (#10062)
* adding definition of concat operator for mapping it to onnx

* adding the opgen generator file to include tensorlist type for eager mode
2022-01-06 14:56:35 -08:00
Chris Hua
cab4579b83
remove six references (#9941)
Python 2 compatibility is no longer necessary and helps unblock upgrades to mypy and others.
2022-01-06 13:52:20 -08:00
Hariharan Seshadri
0552a47ec2
Enable CUDA provider option configuration for C# (#10188) 2022-01-06 11:03:14 -08:00
Ye Wang
08f512b25e
Fix a Win GPU reduced ops pipeline (#10202) 2022-01-06 09:46:34 -08:00
ashari4
4ab891999a
fix hardcoded type (#10205) 2022-01-06 09:28:22 -08:00
ashari4
7b5464ed7b
aten add_ op supports bf16 (#10084)
* hand implemented add_
2022-01-05 09:33:28 -08:00
Edward Chen
34c025109c
Exclude graph_runtime_optimization_test.cc from reduced ops build. (#10191) 2022-01-05 09:22:38 -08:00
Ye Wang
2803a9465d
Add example of registering custom cuda op as shared lib (#10025) 2022-01-05 09:22:15 -08:00
yz
2078210a1c
Improve logging for symbolic shape inference 2022-01-04 13:17:07 -08:00
Edward Chen
792db33f01
Enable loading of ORT format model graph runtime optimizations (#9901)
Initial implementation of load/replay of runtime optimizations in an ORT format model.
2022-01-04 12:09:07 -08:00
Tang, Cheng
97659495d9
fix aten view op (#10050)
* fix aten view op

* add test case

* fix signature

* fix the build

Co-authored-by: Cheng Tang <chenta@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-01-04 08:29:30 -08:00
George Wu
91f85dfdad
update Dockerfile.manylinux2014_cuda11_4_tensorrt8_2 to TensorRT 8.2.2.1 (#10167) 2022-01-03 20:38:37 -08:00
Chi Lo
c29397ad4f
Modify the code to get correct range for symmetric quantization (#10170) 2022-01-03 19:13:37 -08:00
Nat Kershaw (MSFT)
0c517112c4
Automate Python API docs generation (#10116) 2022-01-03 18:22:22 -08:00
Yufeng Li
230f323600
add qdq support for LeakyRelu (#10077)
* add qdq support for LeakyRelu
2022-01-03 14:48:49 -08:00
Tongliang Liao
1d3b34cc92
Add .git suffix to github URL.
Although github works with both, this is more precise.
Having an extension also makes it easy to match with regex, when we want to inject code to reroute traffic to our own git mirror.
2022-01-03 14:38:35 -08:00
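As a rough illustration of the regex matching mentioned in the commit above, the sketch below rewrites GitHub clone URLs that end in `.git` to point at a mirror. The mirror host `git.example.internal`, the pattern, and the function name `reroute` are all hypothetical. The commit does not say how the rerouting is actually done.

```python
import re

# Matches https://github.com/<owner>/<repo>.git and captures owner and repo.
# The pattern is an assumption for this sketch, not taken from the commit.
GITHUB_URL = re.compile(r"https://github\.com/([\w.-]+)/([\w.-]+?)\.git\b")


def reroute(text, mirror="https://git.example.internal"):
    """Rewrite GitHub .git URLs in text to a (hypothetical) mirror host."""
    return GITHUB_URL.sub(
        lambda m: f"{mirror}/{m.group(1)}/{m.group(2)}.git", text
    )
```

URLs without the `.git` suffix are left alone, which is why a consistent suffix makes the match unambiguous.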
Yufeng Li
7208fcbe1c
use wasmscalar as default kernel (#9988)
* use wasmscalar as default kernel
2022-01-03 10:55:08 -08:00
Dmitri Smirnov
28ce2a5a78
Re-work hierarchy, fix virtual method overload/hiding (#10160)
Re-work hierarchy, fix virtual method overload/hiding
Use std::optional with a clear comment on the member thread-safety.
2022-01-03 10:24:49 -08:00
Abhishek Jindal
d5742f3a43
moving from torch nightly build to stable build (#10150)
* moving from torch nightly build to stable build

* using torch cpu version

* using torch cpu version from link
2021-12-29 19:35:10 -08:00
Edward Chen
3bc91c2151
Move reduced ops files into build directory (#10030)
In a reduced ops build, some source files get updated. This change moves the updated files into the build directory. This way, it is easier to simultaneously manage different build directories (with possibly different reduced ops configurations) based on a single source directory.
2021-12-28 19:04:20 -08:00
Scott McKay
a367f0664d
From Python 3.8 and on you need to explicitly add the current directory for libraries to be loaded from it. Update onnxruntime_test_python.py with that handling. (#10129) 2021-12-28 16:10:26 +10:00
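A minimal sketch of the kind of handling described above, assuming the issue is the Windows DLL search change in Python 3.8 and that `os.add_dll_directory` is the remedy. The commit message does not name the call it uses, and the helper name here is hypothetical.

```python
import os
import sys


def add_cwd_to_dll_search_path():
    """Register the current directory for DLL lookup where that is needed.

    Hypothetical helper. Returns the handle from os.add_dll_directory on
    Windows with Python 3.8+, and None everywhere else.
    """
    # os.add_dll_directory exists only on Windows, from Python 3.8 onward.
    if sys.platform == "win32" and sys.version_info >= (3, 8):
        return os.add_dll_directory(os.getcwd())
    return None
```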
George Wu
3d6786c92e
update tensorrt multi gpu pipeline to tensorrt 8.2 (#10141) 2021-12-27 15:43:27 -08:00
Vincent Wang
ceb17f82ff
Use FusedMatMul When Transpose is Between First Dim and Contiguous Batch Dims (#9734)
* fusedmatmul support transpose batches

* fix win build

* fix contrib op md

* more comments
2021-12-27 10:49:46 +08:00
Vincent Wang
f780f06240
ConcatGrad for OpSet13 (#10109) 2021-12-24 10:02:52 +08:00
stevenlix
05d20343ee
Remove duplicated constant initializer copies for TensorRT nodes (#10105)
* add new field constant_initializers in metadef and remove constant initializers from trt node inputs

* remove redundancy

* use GetConstantInitializer() to get constant initializers

* add ORT_ENFORCE check

Co-authored-by: Ubuntu <azureuser@orteplinuxdev.bxgbzpva45kedp3rhbsbit4phb.jx.internal.cloudapp.net>
2021-12-22 12:19:56 -08:00
Sheil Kumar
ce1a9ca618
Fix Microsoft.AI.MachineLearning NuGet App failure with multiple binaries copied to same destination (#10076)
* Include onnxruntime binary when not using package reference or uap app.

* Remove the lib\uap10.0 build from the nuget package - causing conflicts

* Add UWP test

* remove build files

* remove local change

* reset mimalloc and onnx-tensorrt

* change username to Microsoft

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2021-12-21 12:34:03 -08:00
Ye Wang
7a1bdc2052
Don't check cache shape when using dynamic axis (#10090)
Co-authored-by: Ubuntu <wy@linux-v100.aidmrjtolptuzevavgwhrapqcd.jx.internal.cloudapp.net>
2021-12-20 21:19:29 -08:00
Changming Sun
4e9e01cb3c
Fix SDL warnings in CPU EP (#9975) 2021-12-19 20:54:29 -08:00
satyajandhyala
bd4fb4c5da
Coding style fix. (#10080) 2021-12-18 12:05:48 -08:00
ashari4
cdbd678192
Check kMSDomain already exists before registering it (#10078)
* Check domain before registration
2021-12-17 17:55:15 -08:00
Yufeng Li
12ee2e942f
add int8_t for Resize (#10067)
As we support quantization for format s8s8, we need Resize to support int8_t.
2021-12-17 15:36:09 -08:00
Moshe David
4fd85cd97a
Fix broken link to TRT doc in exception details (#9496)
Co-authored-by: Moshe <modav@microsoft.com>
2021-12-17 09:00:33 -08:00
Faith Xu
d42feae042
Add citation file (#10061)
* Add citation file

* Fix typos
2021-12-16 19:56:21 -08:00
Guoyu Wang
f3c72de718
[QDQ] Add shared NodeUnit class (#10052)
* initial change

* move more function to node_unit

* Remove commented code

* Minor update

* Update onnxruntime/core/providers/nnapi/nnapi_builtin/builders/op_builder.cc

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>

* address CR comments

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
2021-12-16 17:37:51 -08:00
Tianlei Wu
ef36488df0
Add BeamSearch operator for GPT-2 decoding (#9680)
* Add BeamSearch operator and CPU implementation
* Add ONNX conversion script
2021-12-16 16:08:05 -08:00
Yufeng Li
fab39b4704
Update optimization level message in perf_test tool (#9972) 2021-12-16 13:49:18 -08:00
Bowen Bao
102f9b05e1
Support new symbolic function api from PyTorch with PythonOp (#9880)
* Support new symbolic function api from PyTorch with PythonOp

* Specify exact exception

* add comments

* move comments and arg
2021-12-16 11:08:06 -05:00
George Nash
93636cbd20
Reduce ops for DNNL ep (#10056)
* Add Reduce Ops to DNNL ep

Combine the Reduction ops into one class

Add ReduceL1, ReduceL2, ReduceSum, ReduceMax, ReduceMin, and ReduceProd,
ReduceSumSquare, ReduceLogSum, and ReduceLogSumExp

Reduce code now also handles the keepdims attribute

Also updated code to use HandleNegativeAxis function from
the providers/common.h code instead of manually calculating.

In-code documentation exists to help explain complex reduction op code

Add elementwise ops to Reduction op capability code; removed keepdims check
from the Reduction op capability code.

Updated the error_tolerance for LogGrad(DNNL EP only) after finding a few
instances that the tests were a little out of tolerance.

Signed-off-by: George Nash <george.nash@intel.com>

* Documentation cleanup in dnnl_qattention

Cleaned up the Comments documenting the QAttention operator
For some reason a bunch of new lines were introduced to the
comment making it harder to read.

Signed-off-by: George Nash <george.nash@intel.com>
2021-12-16 07:31:16 -08:00
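The negative-axis and keepdims handling mentioned in the commit above can be illustrated in Python. This is a sketch of the general ONNX reduction semantics, not the DNNL EP code: the names `handle_negative_axis` and `reduced_shape` are hypothetical stand-ins, and the C++ `HandleNegativeAxis` helper may differ in detail.

```python
def handle_negative_axis(axis, rank):
    """Map an axis in [-rank, rank) to its non-negative equivalent."""
    if not -rank <= axis < rank:
        raise ValueError(f"axis {axis} is out of range for rank {rank}")
    return axis + rank if axis < 0 else axis


def reduced_shape(shape, axes, keepdims=True):
    """Output shape of a reduction over axes, following ONNX keepdims rules."""
    rank = len(shape)
    normalized = {handle_negative_axis(a, rank) for a in axes}
    if keepdims:
        # Reduced dimensions are kept with size 1.
        return [1 if i in normalized else d for i, d in enumerate(shape)]
    # Reduced dimensions are dropped entirely.
    return [d for i, d in enumerate(shape) if i not in normalized]
```

For example, reducing a tensor of shape `[2, 3, 4]` over axis `-1` normalizes the axis to `2` first, then either keeps that dimension with size 1 or drops it, depending on `keepdims`.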
Changming Sun
44c701192b
Revert a bad change in bfc_arena.cc (#10057) 2021-12-15 23:38:45 -08:00
Tang, Cheng
6357c12977
use inplace reshape (#9991)
Co-authored-by: Cheng Tang <chenta@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2021-12-15 21:17:29 -08:00