Commit graph

6675 commits

Author SHA1 Message Date
Olivia Jain
86cdabbcfd
Add OpenVINO Pipeline Status to README (#11299)
* update ov pipeline definition ID

* Update ov build status
2022-04-21 15:59:50 -07:00
Edward Chen
4d0214f851
Move Contains() helper function to a higher common.h. (#11289) 2022-04-21 09:31:48 -07:00
Gary Miguel
7aa4af238a
Add strict_shape_type_inference config option (#11081)
Prior to this, certain shape and type errors were surfaced only when
the model was using the latest known op set version.

Providing users an explicit option allows for better testing of code
that produces models, which includes unit tests within this repo and
other repos such as the TF-ONNX and PT-ONNX converters.

Remove the previous behavior, which seems quite counter-intuitive:
an otherwise identical model with a later op set version should be treated
identically in this regard.

The option defaults to false to avoid causing errors for users that
rely on the previous permissive behavior.

Turned on the strict enforcement by default in OpTester, which revealed a few
disagreements between ORT and ONNX on what the correct output shape should
be.

Fix a shape inference bug in ReduceSumTraining with noop_with_empty_axes=1
that this revealed.

Fix TensorOpTest.Unsqueeze_scalar, which was testing negative axes on an
op set version where the op did not actually support negative axes.

Fixes #9506.
2022-04-21 08:32:40 -07:00
Scott McKay
c5de493a8a
Exclude EPs that aren't available on mobile to try and fix Xamarin build error on M1. (#11267) 2022-04-21 07:01:46 +10:00
Edward Chen
4854a09340
Consolidate utils::ToTensorProtoElementType, TypeToDataType, and data_types_internal::ToTensorDataType. (#9824) 2022-04-20 12:45:53 -07:00
Changming Sun
2cacd18d51 Fix an SAL annotation error 2022-04-20 12:02:30 -07:00
Tianlei Wu
1d96cbec73
Move gpt2 script to models\gpt2 sub-directory (#11256)
* move gpt-2 scripts to models\gpt2
* change gpt2 beam search helper to make test_gpt2 pass
2022-04-20 11:09:26 -07:00
Chi Lo
cb46d79108
Model tests refactor (#11194)
* Update model test

* update comment

* create map to hold OnnxModelInfo so test doesn't need to reload the model again

* revert the code and use GTEST_SKIP() to skip test

* fix bug

* revert LATEST_ONNX_OPSET_SUPPORTED_BY_TENSORRT
2022-04-20 10:14:28 -07:00
Scott McKay
af249943a1 Increase the timeout so the packaging pipeline stops failing.
TODO: Someone should investigate why the AARCH64 build takes 3+ hours and reduce it if possible. Assuming it's using an emulator given the x64 build with the same arguments takes 13 minutes.
2022-04-20 09:36:37 -07:00
cloudhan
013306c940
[MinBuild] 132KB minimal build binary size reduction via dummy __cxa_demangle (#11071)
Minimal build binary size reduction via dummy __cxa_demangle
2022-04-21 00:11:10 +08:00
Edward Chen
180b3f7cc2
Update QDQFinalCleanup transformer to also handle removing DQ/Q node pairs. (#11219)
A ` -> DQ -> Q -> ` sequence where DQ and Q have the same scale and zero point is not necessary.
2022-04-20 09:03:12 -07:00
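As a minimal numeric sketch of why this cleanup is safe: a DQ -> Q pair with identical scale and zero point is an identity on the quantized values, so removing it cannot change model output. The function names below are illustrative (standard ONNX-style uint8 affine quantization formulas, not code from this repo):

```python
def quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Affine uint8 quantization: q = clamp(round(x / scale) + zero_point)."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))

def dequantize(q, scale, zero_point):
    """Affine dequantization: x = (q - zero_point) * scale."""
    return (q - zero_point) * scale

scale, zp = 0.1, 128
for q in (0, 17, 128, 255):
    # DQ followed by Q with the same scale/zero point maps q back to itself,
    # which is why the transformer can drop the node pair.
    assert quantize(dequantize(q, scale, zp), scale, zp) == q
```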
Edward Chen
e3ff4a6bfa
Fix NNAPI EP error when handling external node adjacent to partition. (#11233)
Move a check for a graph output (for the partition) prior to iterating the downstream nodes to avoid trying to get a NodeUnit for a node that is outside of the partition.
2022-04-20 08:53:29 -07:00
Zhang Lei
70d97bdf53
Support only one input in QLinearConcat (#11265) 2022-04-19 20:55:51 -07:00
Yufeng Li
2e6c2177af
remove deprecated quantize api (#11263) 2022-04-19 19:41:55 -07:00
Maxiwell
acb555c4c7
ppc64le: Optimizing the MlasMaximumPool() to use VSX instructions (#11216)
It runs on Power8, Power9, and Power10
2022-04-19 15:13:55 -07:00
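For reference, max pooling is a sliding-window maximum; the VSX work above vectorizes that inner loop. A scalar 1-D Python sketch of the semantics (illustrative only — the MLAS kernel handles 2-D/3-D pools, padding, and vector registers; the function name is my own):

```python
def max_pool_1d(x, kernel, stride):
    """Naive 1-D max pooling: the scalar semantics a SIMD kernel vectorizes."""
    out = []
    for start in range(0, len(x) - kernel + 1, stride):
        # Each output element is the maximum over one kernel-sized window.
        out.append(max(x[start:start + kernel]))
    return out

# kernel=2, stride=2 over 6 elements -> 3 outputs
print(max_pool_1d([1, 3, 2, 5, 4, 0], 2, 2))  # -> [3, 5, 4]
```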
Tianlei Wu
bab9b80f1f
auto mixed precision for t5 (#11252) 2022-04-19 12:42:11 -07:00
Yulong Wang
5ee8e2e491
[js] use NPM and yarn to upgrade package version (#11059) 2022-04-19 12:28:13 -07:00
Vincent Wang
06026fe8e6
SizeInBytes Fix for Strided Tensor (#11224)
* SizeInBytes Fix for Strided Tensor

* resolve comments
2022-04-19 15:13:00 +08:00
Edward Chen
3dac66698b
Add option to specify onnxruntime repo URL in tools/android_custom_build/build_custom_android_package.py. (#11250) 2022-04-18 19:29:41 -07:00
Lukas Berbuer
efb0928e2b Fix find_package for benchmark 2022-04-18 15:25:43 -07:00
Dmitri Smirnov
98faaa7e2f
Scoped GIL release in run_with_iobinding (#11248) 2022-04-18 13:07:45 -07:00
Yufeng Li
dec99657a1
Improve onnx shape inference in quant tool (#11106)
onnx.shape_inference.infer_shapes only works for model size < 2GB, while onnx.shape_inference.infer_shapes_path works for all models. This PR replaces infer_shapes with infer_shapes_path.
2022-04-18 08:07:31 -07:00
pengwa
9765ef8b4e
fix build warnings (#11213)
* fix build warning
2022-04-18 21:09:09 +08:00
Vincent Wang
0bad5b1b5a
[CUDA] Rollback TileMemcpy and TileBatchedMemcpy when Block Size is Small (#11187) 2022-04-16 07:46:43 +08:00
George Nash
d9eeb48393
One dnn v2.6 update (#11220)
* Disable training code in DNNL LayerNorm code

The capability code already does not claim the LayerNorm and
SkipLayerNorm nodes that require more than one output. However,
building with training enabled was causing issues.

The training-specific code has been removed, even when building with
training enabled.

Signed-off-by: George Nash <george.nash@intel.com>

* Fix for DNNL FusedMatMul op.
The bug was in the transpose code.

Signed-off-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com>

* Use agreed-upon memory format type when running Pooling Gradient in dnnl ep

The dnnl ep does not currently have a way to pass memory_format information
from the forward pooling primitive to the backward pooling primitive.

This change explicitly sets the memory_format to match that of ONNX Runtime
for both the forward and backward pooling code. This prevents using a mismatched
memory format that could result in an `unimplemented` error from the dnnl ep.

Signed-off-by: George Nash <george.nash@intel.com>

* Update dnnl ep to use OneDNN v2.6

Do not run ReduceInfLogSum on the kDnnlExecutionProvider due to a
calculation bug when handling Log or infinity values. The fix for this
issue will be part of the next OneDNN release.

Signed-off-by: George Nash <george.nash@intel.com>

* Update PrintMemory function in dnnl ep

This modification can be used to enable/disable memory printing
for dnnl ep developers. This is considered a developer-only feature
and is disabled by default. It must be enabled and the code recompiled
to use it.

Even when enabled, it will not actually print any memory until the
developer takes the extra step of specifying the memory
that will be printed to the screen.

Signed-off-by: George Nash <george.nash@intel.com>

* Update binary ops to run on intel GPU when using dnnl ep

The binary ops (i.e. Add, Div, Mul, and Sub) were updated to no longer
call GetMemoryAndReshape; in the past this call would move the memory from
the CPU to the GPU. The extra call is no longer needed since the move is taken
care of by the GetMemoryInOrtFormat call. Removing GetMemoryAndReshape
prevents copying the memory to the GPU twice.

Signed-off-by: George Nash <george.nash@intel.com>

Co-authored-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com>
2022-04-15 12:51:11 -07:00
sumitsays
227bc7264e
Fixed compilation error for ARM architecture (#11223)
Co-authored-by: Sumit Agarwal <sumitagarwal@microsoft.com>
2022-04-15 09:24:21 -07:00
ytaous
bc296c706e
MatMulScaleFusion - handling scale input (#11121)
* scale input

* more condition check

* alternative

* per comments

* fix comments

Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-04-14 21:54:04 -07:00
Yi Zhang
94032357e2
use int storage (#11185) 2022-04-15 09:56:36 +08:00
Ahmad Zakaria
63ff391b16
add AppendExecutionProvider_CUDA_V2 to the C++ api (#11153) 2022-04-14 17:33:27 -07:00
chausner
c2b4054c74 Fix typos 2022-04-14 13:53:50 -07:00
stevenlix
5216a43c9d
Consolidate TensorRT subgraphs to reduce inference overhead (#11211)
* add trt node list consolidation

* add more log

* fix typo

* separate cycle detection and removal

* update

* change function name

Co-authored-by: Ubuntu <azureuser@orttrtlinuxdev.bxgbzpva45kedp3rhbsbit4phb.jx.internal.cloudapp.net>
2022-04-14 11:05:27 -07:00
Faruk D
a00d24066a
Fix CITATION.cff and add automatic validation of your citation metadata (#10478)
* Add cffconvert.yml to validate CITATION.cff

* Fix CITATION.cff by removing duplicate title and correcting the license

Co-authored-by: Abel Soares Siqueira <abel.s.siqueira@gmail.com>
2022-04-13 10:03:52 -07:00
Vincent Wang
9707181257
fix build error (#11199) 2022-04-13 13:09:19 +08:00
Scott McKay
3b3b23bcf9
Add new python helper dirs to wheel. (#11196) 2022-04-13 13:34:07 +10:00
Chen Fu
0d0edc071f
Detecting ARM64 CPU core micro-architectures in Windows (#11145)
Some micro-architectures of power-efficient cores in ARMv8 systems have narrow 64-bit load/store resources, which require specialized compute kernels in MLAS. We leverage the PyTorch CPUinfo package for detecting these cores. Unfortunately, the CPUinfo package does not work on Windows.

This commit implements ARM64 micro-architecture detection.
2022-04-12 16:47:11 -07:00
ashbhandare
ddb17294b2
Fix gradient builder for Cast (#11008)
* fix grad builder for cast

* review comments

Co-authored-by: Aishwarya Bhandare <aibhanda@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: Aishwarya Bhandare <aibhanda@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-04-12 16:08:21 -07:00
Gary Miguel
e84c338989
minor improvements to CONTRIBUTING doc (#11080) 2022-04-12 15:22:34 -07:00
Faith Xu
5337972f92
Update to use teams instead of individual GH handles (#11163)
* Update to use teams instead of individual GH handles

* Fix typo

* Update CODEOWNERS

* Update CODEOWNERS

* Update team name
2022-04-12 12:06:12 -07:00
Edward Chen
38e67e66a2
Add script and Dockerfile to build custom Android package (#11144)
* Handle relative paths in --include_ops_by_config.

* Add dockerfile.

* update comments

* refine

* update perms

* refine

* wording

* Change readme to md file, add link to docs site.
2022-04-12 10:16:10 -07:00
RajalakshmiSR
e397d8e63e
POWER: Optimize MlasTranspose functions (#11172)
This patch makes use of POWER vector intrinsics to improve performance
of MlasTranspose functions.

Co-authored-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>
2022-04-12 09:51:20 -07:00
Xavier Dupré
833f5d5604
Remove dependency on EP TVM in unit test project (#11170) 2022-04-12 09:03:57 +02:00
Ryan Hill
625cc0ab99
Add Initialize() to shared providers to allow for reload (#11066) 2022-04-11 22:58:50 -07:00
Changming Sun
8237568b65
Fix the rocm packaging pipeline package upload problem (#11174)
In #11114, I changed the script to use azcopy instead of azure blob storage's python APIs. However, it doesn't work for the AMD rocm pipeline, because:

1. The machines do not have azcopy installed
2. The machines are not in Azure, so they don't have Azure managed identity. So they still need to use SAS.

Therefore in this PR I get the old python file back, but only use it in the AMD pipeline.
2022-04-11 13:59:44 -07:00
dependabot[bot]
04fe1bd2ed
Bump electron from 12.2.3 to 13.6.6 in /js/web (#10978)
Bumps [electron](https://github.com/electron/electron) from 12.2.3 to 13.6.6.
- [Release notes](https://github.com/electron/electron/releases)
- [Changelog](https://github.com/electron/electron/blob/main/docs/breaking-changes.md)
- [Commits](https://github.com/electron/electron/compare/v12.2.3...v13.6.6)

---
updated-dependencies:
- dependency-name: electron
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-11 12:51:56 -07:00
Olivia Jain
ae243c2bb5
Pull Nightly Wheel File and Cleanup Perf (#11164)
* delete unused files

* only use one dockerfile, otherwise install

* Update pipeline file

* get other changes

* minimal packages

* update pull nightly variable

* try logical boolean

* test boolean

* have build ort as boolean

* case sensitive

* use the current head not the previous commit

* add helpful note
2022-04-11 11:41:11 -07:00
Yi-Hong Lyu
749c0ddd1e
Upsample support NHWC (#10824)
This patch implements bilinear interpolation for Upsample/Resize 4-D input with
the outermost and innermost scales (usually the channel of NHWC) as 1. It is
parallelized over output_height * output_width instead of one dimension only.

Additionally, I revert HandleResize back to the original implementation for the
TransposeOptimizerTests.TestResize* tests.

Finally, I add microbenchmark BM_NhwcUpsampleBilinear.
2022-04-11 11:39:17 -07:00
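The interpolation above is standard bilinear resampling. A minimal 1-D linear-interpolation sketch in Python, assuming an "asymmetric" coordinate mapping (illustrative only — the actual kernel operates on 4-D NHWC tensors and supports other coordinate-transformation modes; the function name is hypothetical):

```python
def upsample_linear_1d(x, out_size):
    """Linear upsampling with asymmetric coords: in_x = out_x * (in_size / out_size)."""
    in_size = len(x)
    scale = in_size / out_size
    out = []
    for ox in range(out_size):
        ix = ox * scale
        x0 = min(int(ix), in_size - 1)          # left neighbor, clamped to valid range
        x1 = min(x0 + 1, in_size - 1)           # right neighbor, clamped at the border
        frac = ix - x0                          # fractional position between neighbors
        out.append(x[x0] * (1.0 - frac) + x[x1] * frac)
    return out

print(upsample_linear_1d([0.0, 1.0], 4))  # -> [0.0, 0.5, 1.0, 1.0]
```

Bilinear interpolation applies this same weighting along both the height and width axes, which is why the kernel can parallelize over output_height * output_width.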
Edward Chen
269be2fe63
Remove unnecessary option from convert_onnx_models_to_ort.py, fix old instructions. (#11088)
Remove unnecessary --nnapi_partitioning_stop_ops option from convert_onnx_models_to_ort.py, fix old instructions.
2022-04-11 11:19:21 -07:00
Tianlei Wu
00b595e389
move longformer and t5 to models subdirectory (#11161)
* move longformer scripts to models subdirectory
* Copy transformers\models\t5 to python package as well
2022-04-09 22:35:14 -07:00
Erick Muñoz
f24523e0eb
Enable LayerNorm and SkipLayerNorm in OneDNN EP (#11128) 2022-04-08 23:10:13 -07:00
liqun Fu
d96230065e
fix code error in function.cc (#11148) 2022-04-08 10:04:21 -07:00