Commit graph

6427 commits

Author SHA1 Message Date
Changming Sun
b3e96d6195
A new pipeline to replace the existing WindowsAI packaging pipeline (#10646) 2022-03-03 08:56:49 -08:00
Hubert Lu
fe8d867efa
Optimize BinaryElementWise and BiasGeluGrad kernels for AMD (#10594)
* Optimize elementwise and biasgelugrad kernels for AMD

* Clean up for BiasGeluGradDxKernel
2022-03-03 08:07:15 -08:00
cloudhan
4c20f6863d
Fix build with gcc 7.5 (#10567) 2022-03-03 18:29:02 +08:00
Fei Hu
75160d6779
Add the missing status return in beam search (#10738) 2022-03-03 01:24:44 -08:00
Rachel Guo
a9dc50ba8b
Add option to force QDQIsInt8Allowed to return true when exporting to ORT format (#10719)
* wip

* save

* minor update

* fix

* fix

* Revert "fix"

This reverts commit a76f364b2d.

* revert

* revert

* revert submodule removal

* address pr comments

* minor fix

* address cr comments

* fix format

Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
2022-03-02 23:26:14 -08:00
Ye Wang
44d08d80a0
Add restriction to first usage in allocation planner (#10724)
* Add restriction to first usage in allocation planner

* change phrases

* add UT

Co-authored-by: Ubuntu <wy@linux-v100.aidmrjtolptuzevavgwhrapqcd.jx.internal.cloudapp.net>
2022-03-02 22:03:50 -08:00
Tianlei Wu
47ab0c2006
Auto mixed precision conversion of GPT-2 onnx model (#10711)
* add auto mixed precision
* Add float_to_float16_max_diff, update fp16 constants
* remove cascaded Cast nodes
2022-03-02 21:08:51 -08:00
Olivia Jain
7ebff2b273
add missing link to openvino (#10737) 2022-03-02 15:10:59 -08:00
Baiju Meswani
f9b6eef05f
orttraining packaging pipeline for rocm 5.0.1 (#10725) 2022-03-02 12:32:14 -08:00
Yufeng Li
7ab0c607b4
add qdq support of (un)squeeze and GlobalAveragePool (#10721) 2022-03-02 10:58:35 -08:00
Numfor Tiapo
9ad95bf068
Skip SetName test on inbox build (#10699) 2022-03-02 10:28:58 -08:00
RajalakshmiSR
5d8c5409ab
POWER10: QGEMM optimization (#10642)
* POWER10: QGEMM optimization

This patch makes use of the POWER10 MMA feature for the QGEMM function.
This optimization covers both signed and unsigned cases. Tested with
gcc11 and clang-14; there are no new failures.

* Changes as per review comments

Co-authored-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>
2022-03-02 08:36:26 -08:00
Funtowicz Morgan
e5c6dc1fc8
Add ability to save calibration augmented models through external data format when model size exceeds 2 GB. (#10695) 2022-03-02 08:35:30 -08:00
Valery Chernov
62cc981599
[TVM EP] support of TVM Virtual Machine (#10341)
* add executor option (vm or graph) and support virtual machine methods

* nullptr check for compile and run methods (see also PR#10211 from microsoft:onnxruntime)

* get output shapes for VM

* remove run_with_benchmark. remove run methods from python api, get it from native side

* get outputs method for VM was implemented

* support multiple input for VM

* update python logging and exception

* small fix

* update tvm with patch for VM API

* update nhwc transformations for TVM EP

* add data alignment check and support set_input_zero_copy for GE in TVM EP

* fix logger name

* return back to apache/tvm with VM fixes instead of local dev branch

* hide customized tvm logger while issue is not resolved. fix tvm warning related to target_host

* flake8 fix

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
2022-03-02 11:02:33 +01:00
Sunghoon
a7f6442c45
[js] release pipeline for web and react native (#10656)
* skip browserstack test at release pipeline

* Update web-ci-pipeline.yml for Azure Pipelines

* pool name as a parameter to run at lotus

* Update web-ci-pipeline.yml for Azure Pipelines

* create a packaging pipeline for web

* Update web-packaging-pipeline.yml for Azure Pipelines

* make web-ci-pipeline as a template

* change a parameter name checking a pipeline

* make a pool name changeable for react native pipeline

* disable code sign validation for react native

* fix react native package.json publish

* fix indentation

* remove unnecessary comment

* test onnxruntime-common package publish

* ts and js files use lf as eol for windows

* use Linux style of ending line break

* change newLine at only tsconfig.json

* restore a commented code

* fix git restore directory for npm packaging

* fix a typo

* force eol to lf on windows for js directory in CI
2022-03-01 21:38:33 -08:00
Edward Chen
9e7d7a9e97
Convert ConvActivationFusion transformer to a selector action transformer. (#10687) 2022-03-02 13:47:55 +10:00
Tianlei Wu
fa9090f259
check gpt-2 graph in converting beam search (#10712) 2022-03-01 19:04:34 -08:00
Edward Chen
d07a2377b1
Fix race condition in CUDA, ROCm, and TensorRT EP GetKernelRegistry() implementations. (#10200)
Make GetKernelRegistry() kernel registry initialization thread-safe.
2022-03-01 17:53:58 -08:00
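The idea behind d07a2377b1, building a shared registry exactly once even when several threads ask for it at the same time, can be sketched as follows. This is only a Python analogy under stated assumptions: the commit itself changes C++ code in the execution providers, and the names `get_kernel_registry`, `_build_registry`, and `build_count` are hypothetical and exist only for this illustration.

```python
import threading

_registry = None                    # shared, lazily created object
_registry_lock = threading.Lock()   # guards the one-time initialization
build_count = 0                     # counts how often the registry is built


def _build_registry():
    """Hypothetical stand-in for the expensive registry construction."""
    global build_count
    build_count += 1
    return {"kernels": []}


def get_kernel_registry():
    """Return the shared registry, creating it at most once."""
    global _registry
    # Taking the lock before the check means two threads can never both
    # observe "not yet created" and both build the registry.
    with _registry_lock:
        if _registry is None:
            _registry = _build_registry()
    return _registry


# Several threads request the registry concurrently.
threads = [threading.Thread(target=get_kernel_registry) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the lock, two threads could both see an uninitialized registry and each build their own copy, which is the kind of race the commit title describes.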
Tianlei Wu
2fb2dae42f
Print tensor snippet in dumping node Inputs/Outputs to StdOut (#10707)
* dump tensor snippet
2022-03-01 16:59:12 -08:00
zhangyaobit
a7738b52c5
Add microbench to benchmark single operators. (#10678)
* Add microbench to benchmark single operators.

* Move to tool directory; separate data generation from io binding.

* Refactor.

* Clean up.

* Use precision instead for extensibility.

* Refactor the create_io_binding function to take in torch tensors
instead of numpy arrays; this reflects more accurately what
the function does, because it is torch tensors that are bound.
2022-03-01 16:00:16 -08:00
Guoyu Wang
19464614e7
[NNAPI QDQ] Add QDQ Concat (#10666)
* add qdq concat

Co-authored-by: Scott McKay <skottmckay@gmail.com>
Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
2022-03-02 09:08:36 +10:00
Bowen Bao
6448ca64e6
Fix reshape allowzero with unknowndim (#10665) 2022-03-01 10:47:48 -08:00
Yufeng Li
f652f70d91
set qdq as the default static quantization format (#10684)
* set qdq as the default static quantization format
2022-03-01 10:27:20 -08:00
Yi Zhang
f1b6f0becd
Update nuget icon (#10672)
Update nuget icon from url to local file because the old tag is deprecated.
2022-03-01 09:11:03 -08:00
Ryan Hill
c1cf16ed5d
Conv node bug, cached state was incoherent (#10041)
* Moved the init earlier to keep the cache coherent
* Move setting of w_desc later, and zero shape check later to catch all cacheable changes.
* Add comment
2022-03-01 01:31:57 -08:00
Yulong Wang
f4b2d3af2b
Upgrade emsdk to 3.1.3 (#10577) 2022-02-28 23:52:41 -08:00
Tianlei Wu
c51b500ca7
replace std::numeric_limits<T> by cub::FpLimits<T> (#10703) 2022-02-28 23:11:51 -08:00
Vincent Wang
9a22b5d253
Strided Tensor Support for Eager Mode (#10578)
* strided tensor for eager mode

* fix build and resolve comments

* fix win x86 build
2022-03-01 14:25:31 +08:00
Adam Pocock
f856608599
[java] Changes OrtEnvironment so it can't be closed by users (#10670)
* Changes OrtEnvironment so it can't be closed by users.

* Fix the formatting and add a same instance check.
2022-02-28 21:03:40 -08:00
Dmitri Smirnov
e23a224518
Fix CUDA 10.2 compile error due to inlined_containers.h inclusion (#10702)
Fix CUDA 10.2 compile error due to inlined_containers.h inclusion into a common CUDA header.
Use NumberOfNodes() to reserve space in a hash table.
Prefer a separate call to reserve() rather than passing the size to the hash table constructor; the two have somewhat different meanings.
2022-02-28 19:56:44 -08:00
cloudhan
3243c9579f
Fix VLOG?_DEFAULT macros usability. (#10568)
* Add `set_default_logger_verbosity` api.

* fix docs

* make flake8 happy
2022-03-01 13:16:26 +10:00
cloudhan
d1b2fb15ad
Avoid clang-tidy crashing due to readability-static-accessed-through-instance check bug (#10690)
See https://github.com/llvm/llvm-project/issues/53874 for more info.
2022-03-01 11:06:00 +08:00
Chi Lo
d2d22f2195
Add support of generating symmetric/asymmetric tensor's range for calibration (#10663)
* add support of symmetric/asymmetric range of value

* modify comment

* Update calibrate.py

* update quantize.py

* remove newline at end of file
2022-02-28 16:33:45 -08:00
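The difference between the two range modes named in d2d22f2195 can be sketched in a few lines. This is a minimal illustration, not the code in calibrate.py: the function name `tensor_range` and its `symmetric` parameter are hypothetical, and it assumes that a symmetric range is centered on zero with its bound set by the largest absolute value, while an asymmetric range is simply the observed minimum and maximum.

```python
from typing import Iterable, Tuple


def tensor_range(values: Iterable[float], symmetric: bool = False) -> Tuple[float, float]:
    """Return a (low, high) calibration range for the observed values.

    Hypothetical helper for illustration; not part of onnxruntime.
    """
    data = list(values)
    if not data:
        raise ValueError("cannot compute a range for an empty tensor")
    low, high = min(data), max(data)
    if symmetric:
        # Assumption: a symmetric range is centered on zero and wide
        # enough to cover the largest magnitude seen.
        bound = max(abs(low), abs(high))
        return -bound, bound
    # Asymmetric: use the observed extremes as they are.
    return low, high


observed = [-0.5, 0.25, 2.0]
asymmetric_range = tensor_range(observed)
symmetric_range = tensor_range(observed, symmetric=True)
```

In this sketch the symmetric range is never narrower than the asymmetric one, because it has to cover the largest magnitude on both sides of zero.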
Edward Chen
ffde44cd09
[iOS Packaging] Add full ORT build iOS package. (#10626)
Add C/C++ and Objective-C packages with full ORT builds.
2022-02-28 15:39:07 -08:00
Scott McKay
1f6d8248da
Add optional optimizer to remove leftover Q->DQ pairs after all other QDQ processing has completed (#10659)
Add an optimizer that can remove leftover Q->DQ pairs. Depending on the model, this may help performance and/or improve accuracy. It is optional because it could also make things worse, so users need to be aware of this and test what works best for their scenario. Enable it with the SessionOptions config param `session.enable_quant_qdq_cleanup`.
2022-03-01 08:05:02 +10:00
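Setting the config param named in 1f6d8248da might look like the sketch below. The key `session.enable_quant_qdq_cleanup` comes from the commit message; everything else is an assumption made for illustration: the helper `enable_qdq_cleanup` is hypothetical, the value is assumed to be the string "1" or "0", and `FakeSessionOptions` is a stand-in so the example runs without onnxruntime installed.

```python
QDQ_CLEANUP_KEY = "session.enable_quant_qdq_cleanup"  # key taken from the commit message


def enable_qdq_cleanup(session_options, enabled=True):
    """Set the Q->DQ cleanup flag on a SessionOptions-like object.

    Hypothetical helper; it only requires that the object provides
    add_session_config_entry(key, value).
    """
    # Assumption: the flag is passed as the string "1" (on) or "0" (off).
    session_options.add_session_config_entry(QDQ_CLEANUP_KEY, "1" if enabled else "0")
    return session_options


class FakeSessionOptions:
    """Stand-in that records config entries, used here instead of onnxruntime."""

    def __init__(self):
        self.entries = {}

    def add_session_config_entry(self, key, value):
        self.entries[key] = value


options = enable_qdq_cleanup(FakeSessionOptions())
```

With onnxruntime installed, an `onnxruntime.SessionOptions()` instance could be passed in place of the stand-in and then handed to `onnxruntime.InferenceSession`. Since the commit notes that the optimizer can hurt as well as help, comparing accuracy and latency with the flag on and off is the sensible way to decide.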
Thiago Crepaldi
e788cc2a23
Convert com.microsoft::ATen into org.pytorch.aten::ATen onnx op (#10060)
Signed-off-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
2022-02-28 14:14:45 -05:00
Adam Pocock
e47434ea12
[java] Adding the graph description to the exposed model metadata. (#10318) 2022-02-28 10:05:03 -08:00
harshithapv
037f08f1ff
Fix unsqueeze for opset 13 for ReduceMean Grad (#10668)
* fix unsqueeze for opset 13 for reducemean grad

* fix input for reduce mean
2022-02-28 09:55:52 -08:00
Ryan Hill
eb116595d4
Add ability to customize ORT_CXX_API_THROW (#10688) 2022-02-28 00:15:10 -08:00
Guoyu Wang
240f31ef6e
fix softplus (#10576) 2022-02-28 09:27:07 +10:00
Scott McKay
f2ca43fe0d
Enable CoreML in the macos package (#10675)
* packaging pipeline change

* Enable CoreML on macos

Co-authored-by: Guoyu Wang <wanggy@outlook.com>
2022-02-28 09:12:37 +10:00
Dmitri Smirnov
b30e0e2283
Remove inline_containers include from tensor_shape (#10682)
Hide the internals of the inlined hash set and map types behind template forward declarations.
Currently the CUDA 10.2 compiler cannot compile abseil, but provider interfaces
use those types in their signatures. InlinedVector seems to be fine.
Introduce the core/common/inlined_containers_fwd.h header.
2022-02-26 20:07:18 -08:00
Changming Sun
81831201a8
Change C# tests to use C# 5.0 (#10686)
.NET Core 2.1 reached end of support on August 21, 2021. Use C# 5.0 instead. Our CI build machines do not have C# 6.0 yet; I will move to it later.
2022-02-26 00:28:30 -08:00
Numfor Tiapo
5fbfca3d58
Add Experimental API for setting model name (#10518)
* Add experimental API for editing model name

* Change EditModelName to 'SetName'

* Change API to pass c_string

* Update SetName to edit the proto

* Test that the model proto gets changed

* Remove comments

* Skip inbox tests

* Use filehelper path

Co-authored-by: Numfor Mbiziwo-Tiapo <numform@microsoft.com>
2022-02-25 14:23:49 -08:00
Tianlei Wu
36c3271546
BeamSearch op cuda (#10556)
Add BeamSearch cuda implementation with support of fp16 GPT-2 subgraph
2022-02-25 13:08:55 -08:00
Dmitri Smirnov
957dccb379
Fix compile (#10667) 2022-02-25 10:08:30 -08:00
Chen Fu
12c44bfc4e
fix bug: getting current cpu core type (#10630)
The previously merged pull request #10521 has a bug.

It aimed to detect the current CPU core micro-architecture and select the best-suited kernel. Unfortunately, it assumes that a thread can never migrate from one core to another.

This change tries to fix that problem. It introduces about 2-5% performance degradation on symmetric quantized matmul.

Co-authored-by: Chen Fu <fuchen@microsoft.com>
2022-02-25 08:56:14 -08:00
David Fan
617474e298
Stop gradient edges for aten::argmax (#10650) 2022-02-24 21:14:53 -08:00
Dmitri Smirnov
2679711bee
Refactor transformers and other code to reduce memory allocation calls (#10523)
Work on minimizing memory management calls by reducing the number of allocations and copies.
Replace std::unordered_set with InlinedHashSet and add usage of InlinedVector.
Employ std::move() to minimize copying and memory allocations.
Remove copying of the const shared data into each of the PropagateCast transformer instances.
Move inlined_containers.h header to include/common.
Adjust AsSpan implementation for C++ < 17.
2022-02-24 16:17:14 -08:00
Scott McKay
0b19a03361
Fix the debug dump of tensor values to output int8 and uint8 values correctly. Without the change they are treated as char/unsigned char by std::cout. (#10658)
Other changes are from clang-format
2022-02-25 07:25:23 +10:00