Commit graph

7863 commits

Author SHA1 Message Date
Vincent Wang
1798698545
avgpool2d atenop (#8507) 2021-07-28 14:04:55 +08:00
Xiang Zhang
73660d78df
Fix WinML build warnings in HStringFromUTF8 (#8519) 2021-07-27 22:29:58 -07:00
Yufeng Li
ceeb1a65d6
Add quantization support of GEMM directly with QGemm (#8447)
QGemm takes quantized A, B, and C plus the quantization parameters of the output Y; C and the quantization parameters of Y are optional. The output is either quantized or full precision, depending on whether the quantization parameters of Y exist: if they are provided, the output is requantized; otherwise it is full precision.

Compared with QLinearMatMul and MatMulInteger, QGemm supports the transpose, alpha, and beta attributes.

The formula for quantized GEMM is:
Y = alpha * scale_a * scale_b * ((A_int8 - zp_a) * (B_int8 - zp_b) + C_int32), in which,
C_int32 is quantized with formula: C_int32 = (beta * C) / (alpha * scale_a * scale_b)
2021-07-27 21:21:49 -07:00
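The QGemm formula in the commit message above can be traced with a small pure-Python reference. This is only a sketch under stated assumptions, not the ORT kernel: `qgemm_reference` and its parameter names are hypothetical, C is assumed to be a per-column bias vector, the transpose attributes are left out, and the requantized output is assumed to be uint8.

```python
def qgemm_reference(a_q, b_q, scale_a, zp_a, scale_b, zp_b,
                    c=None, alpha=1.0, beta=1.0,
                    scale_y=None, zp_y=0):
    """Hypothetical reference for the formula in the commit message.

    a_q and b_q are lists of rows holding quantized integers.
    c is an optional full-precision bias with one value per output column
    (an assumption made for this sketch).
    """
    combined = alpha * scale_a * scale_b
    inner = len(b_q)
    cols = len(b_q[0])
    y = []
    for a_row in a_q:
        out_row = []
        for j in range(cols):
            # Integer accumulation of (A_int8 - zp_a) * (B_int8 - zp_b).
            acc = sum((a_row[k] - zp_a) * (b_q[k][j] - zp_b)
                      for k in range(inner))
            if c is not None:
                # C_int32 = (beta * C) / (alpha * scale_a * scale_b)
                acc += round(beta * c[j] / combined)
            value = combined * acc
            if scale_y is not None:
                # Requantize only when output quantization parameters exist
                # (uint8 range assumed here).
                value = min(255, max(0, round(value / scale_y) + zp_y))
            out_row.append(value)
        y.append(out_row)
    return y


full_precision = qgemm_reference([[130, 126]], [[3], [1]], 0.5, 128, 0.25, 0)
requantized = qgemm_reference([[130, 126]], [[3], [1]], 0.5, 128, 0.25, 0,
                              c=[1.0], scale_y=0.5, zp_y=10)
```

Leaving `scale_y` unset returns full-precision values, which mirrors the optional output quantization parameters described in the message.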
Zhang Lei
0f46b08646
improve the qlinear avg pool perf (#8514)
*) use context buffer allocator, remove init cost of vector
*) using lookup table to dequantize large input
*) fall back to global average pool if it is
2021-07-27 20:56:59 -07:00
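The lookup-table idea mentioned in the commit above can be illustrated as follows. This is a minimal sketch, assuming uint8 inputs with a single scale and zero point; the function names are hypothetical and the actual kernel may be organized differently.

```python
def build_dequant_table(scale, zero_point):
    """Precompute the dequantized value for every possible uint8 input."""
    return [scale * (q - zero_point) for q in range(256)]


def dequantize_with_table(values, table):
    """Dequantize by indexing the table instead of recomputing per element."""
    return [table[q] for q in values]


table = build_dequant_table(0.5, 128)
dequantized = dequantize_with_table([128, 130, 0], table)
```

The table costs 256 multiplications up front, so it tends to pay off only when the input is much larger than 256 elements, which matches the "large input" wording in the message.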
Tim Harris
56441dcd88
Limit work items to available threads, upgrade checks from assert to ORT_ENFORCE (#8495) 2021-07-27 19:25:12 -07:00
Sherlock
686f9b530b
ORTModule set_seed in int (#8511) 2021-07-27 15:43:13 -07:00
Tracy Sharpe
7d47175f76
cleanup NCHWc transformer (#8479) 2021-07-27 15:39:10 -07:00
ashari4
3850755feb
Fix: onnxruntime_eager library does not compile on Windows due to path string constant (#8487) 2021-07-27 15:15:18 -07:00
Ryan Lai
6b05a62584 Merge remote-tracking branch 'upstream/master' into HEAD 2021-07-27 14:27:58 -07:00
Oliver Rausch
1685ab8138
Implement Concat with Strided copy (#8336)
Adds a StridedCopy function that implements a copy from strided tensor to another.

This parallelizes the Concat operator, and can also be used in the future to parallelize many other data movement operators (e.g. Transpose, Split, etc.).
This operation is also required for the proposed data layout extensions to ORT.
2021-07-27 18:27:56 +02:00
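How a strided copy can implement Concat is sketched below in pure Python. The names `strided_copy` and `concat_axis1` are hypothetical, buffers are flat lists, and the loop is sequential; the commit's StridedCopy is a parallel C++ routine whose interface is not shown in this log.

```python
from itertools import product


def strided_copy(dst, dst_offset, dst_strides, src, src_offset, src_strides, shape):
    """Copy a block of the given shape between two flat buffers.

    Each buffer is addressed as offset + sum(index[d] * strides[d]).
    """
    for index in product(*(range(n) for n in shape)):
        s = src_offset + sum(i * st for i, st in zip(index, src_strides))
        d = dst_offset + sum(i * st for i, st in zip(index, dst_strides))
        dst[d] = src[s]


def concat_axis1(a, a_cols, b, b_cols, rows):
    """Concatenate two row-major 2-D inputs along axis 1 with two strided copies."""
    out_cols = a_cols + b_cols
    out = [None] * (rows * out_cols)
    # Each input lands in its own column range of the output, so the two
    # copies write disjoint elements and could run independently.
    strided_copy(out, 0, (out_cols, 1), a, 0, (a_cols, 1), (rows, a_cols))
    strided_copy(out, a_cols, (out_cols, 1), b, 0, (b_cols, 1), (rows, b_cols))
    return out


joined = concat_axis1([1, 2, 3, 4], 2, [5, 6], 1, rows=2)
```

Because the destination regions do not overlap, each input (or each slice of an input) can be assigned to a different worker, which is the property that makes the operator parallelizable.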
Guoyu Wang
4c939e1cb7
Add an option to use the input model bytes (ORT format only) directly without copy at session creation (#8502)
* Do not copy the model_data when session is started by CreateSessionFromArray

* Add config option for disabling copy model bytes

* Add one additional test

* Address CR comments
2021-07-27 09:11:42 -07:00
ytaous
1ae32655b3
fix t5 assert error (#8501)
Co-authored-by: Ethan Tao <ettao@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2021-07-27 09:04:01 -07:00
Edward Chen
b4baac888c
[NNAPI EP] Make partitioning stop ops configurable from Python API. (#8484) 2021-07-27 08:16:47 -07:00
Edward Chen
421c4059c0
[iOS Packaging] Update build definition (#8503)
* Add build number into version.

* Add parameter for archive upload.
2021-07-27 08:16:02 -07:00
KeDengMS
0a70c2de00
[Nuphar] Add support for opset 14 (#8483)
- For ops used in quantized LSTM
- Update nuphar model editing/quantizer scripts
2021-07-27 06:13:47 -07:00
Ankur Verma
91936864ce
Expose additional shared_provider APIs (#8478) 2021-07-26 18:03:12 -07:00
Tianlei Wu
534c22d769
use float for alpha in attention Gemm (#8477) 2021-07-26 11:04:56 -07:00
Xavier Dupré
a9fc3c448c
Improves documentation, show InferenceSession constructor attributes (#8494)
* include constructor parameters in the python documentation
* expose more classes into the documentation
2021-07-26 15:58:47 +02:00
Tianlei Wu
79097ef553
remove useless reshape node (#8419) 2021-07-23 18:12:21 -07:00
Viswanath Boga
6dee9b9d2d
attention fusion kernel refactoring (#8432)
* attention fusion kernel refactored

* consider the case of none in add_qk

* variable added to check for pre-pack weights

* added a comment to PrePack()

* Optimized prepack and try to free the weights

* making comment sound better

* fixing a bug with optimizer.py

* commented out changes to be done

* removed comments

* make the private fn() private

* fix build

* making clean up fn static

* backed out optimizer tool change, needs more looking into
2021-07-23 17:46:39 -07:00
Ryan Hill
a396c9e572
Add more safety checks to the C API (#8474) 2021-07-23 15:41:27 -07:00
Ye Wang
6a07172a93
Restore cpu affinity after loading tensorflow model from transformers (#8448)
* Update onnx_exporter.py

* update

* review comments
2021-07-23 15:20:44 -07:00
Yulong Wang
e66846da4a revise terms according to guideline 2021-07-23 13:26:15 -07:00
ytaous
ab5289f109
Performance: enable faster training with skip checks config (#8411)
* freeze/fastpath support

* more comments on _fast_path

* per comments

* minor fix

* IntFlag improve

* address comments

Co-authored-by: Ethan Tao <ettao@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2021-07-23 10:23:13 -07:00
Vincent Wang
c8d210de29
Decouple Forward and Backward of ATenOp (#8301)
* atenop for inference

* assert if dtype mismatch

* atenop config in frontend

* fix orttrainer test

* gradient def not only for ATenOp

* bugfix

* fix gradient input shape and type issue

* fix after merge master
2021-07-23 16:53:26 +08:00
Vincent Wang
619a8782a5
Improve AddValueInfo (#8451)
* change AddValueInfo

* fix after merge master
2021-07-23 16:39:55 +08:00
Tracy Sharpe
b2b9de939f
cleanup onnxruntime_mlas.cmake of old gcc workarounds (#8469) 2021-07-22 22:01:05 -07:00
Frank Liu
002e427c5b
Add UINT8 datatype support to Java (#8401)
Add UINT8 datatype support
Add inference test for UINT8 model
2021-07-22 17:11:49 -07:00
Dmitri Smirnov
950fe5e28b
Implement SparseTensor and infrastructure support and advance ONNX commit (#8038)
SparseTensor support
  Implement Builder pattern
  Fix support for 1-D and 2-D COO indices
  Implement and test CSR support.
  Handle shape inference for SparseTensors
  Implement conversion for COO, CSR and tests.
  Address the case where constant sparse initializer is the output.
  Implement test infra for SparseTensors
  Implement and test SparseDenseMatMul for CSR and COO.
  Add hash for SparseToDenseMatMul
  Finish shared provider refactor
  Refactor GetOrCreate to Create
  Working on py interface
  Expose OrtDevice and use it in allocate_numpy
  Adjust Sparse interfaces, add support for string SparseTensor. Add tests.
  Add and test to_cuda()
  Add accessors to format specific indices
  Test values and indices views, read-only flag, after GC access
  Add sparse related methods to OrtValue
  Re-work SparseTensor wrapper, add OrtValue methods
  Rework numpy_array_to_cuda/to_cpu
  Add run_with_ort_values
  Add models and test sparse_mat_mul with run_with_ort_values
  Refactor sparse tensor to use a single buffer
  Ifdef x86 Eigen CSR sparse matmul implementation
  Exclude broken test, check for string type when copying cross device
  Split pybind schema, regenerate docs, add exclusion
  Conditionally exclude schema module
  Update docs fix cuda build
  Add test to a filter and regenerate JS docs
  Add conversion and test string support for sparse tensors
  Exclude conversion utils from minimal build
  Add CUDA Memcpy and adjust provider interfaces
2021-07-22 15:24:36 -07:00
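The COO and CSR formats named in the commit above relate as in the following sketch. It is illustrative only: `coo_to_csr` and `csr_matvec` are hypothetical helpers, the COO indices are assumed to be separate row and column lists, and none of this reflects the ORT SparseTensor API.

```python
def coo_to_csr(rows, cols, values, num_rows):
    """Convert COO triplets into CSR arrays (row_ptr, col_idx, vals)."""
    # Sort entries by (row, column) so each row's entries are contiguous.
    order = sorted(range(len(values)), key=lambda i: (rows[i], cols[i]))
    # Count entries per row, then turn the counts into a prefix sum.
    row_ptr = [0] * (num_rows + 1)
    for r in rows:
        row_ptr[r + 1] += 1
    for r in range(num_rows):
        row_ptr[r + 1] += row_ptr[r]
    col_idx = [cols[i] for i in order]
    vals = [values[i] for i in order]
    return row_ptr, col_idx, vals


def csr_matvec(row_ptr, col_idx, vals, x):
    """Multiply a CSR matrix by a dense vector."""
    return [
        sum(vals[k] * x[col_idx[k]] for k in range(row_ptr[r], row_ptr[r + 1]))
        for r in range(len(row_ptr) - 1)
    ]


row_ptr, col_idx, vals = coo_to_csr([2, 0, 0], [0, 1, 0], [5.0, 2.0, 1.0], num_rows=3)
product_vector = csr_matvec(row_ptr, col_idx, vals, [1.0, 1.0, 1.0])
```

CSR stores one pointer per row instead of one row index per entry, so a row's entries can be scanned without searching, which is convenient for sparse-times-dense products.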
raviskolli
f641c0f4e8 Update requirements.txt
Updated requests version to address component governance failure
2021-07-22 14:18:21 -07:00
Thiago Crepaldi
9073c094d4 Update torch lightning and re-enable test 2021-07-22 14:18:07 -07:00
Ye Wang
e8ee31bcc3
Update onnx_model_bert_tf.py (#8457)
Fix a bug: when layernorm and skiplayernorm are not fused, the program will crash
2021-07-22 13:50:55 -07:00
Adam Pocock
9a6fa057c8
[Java] Allow extraction of multidimensional String tensors (#8452)
Fixing a bug where String tensors would always be single dimensional in Java.
2021-07-22 13:19:49 -07:00
Edward Chen
287a2a778f
Update CODEOWNERS with mobile team ownership of expected kernel def hash data files. (#8454) 2021-07-22 11:19:06 -07:00
Hariharan Seshadri
3360024a0b
Support plugging in custom user-defined allocators for sharing between sessions (#8059) 2021-07-22 10:17:35 -07:00
Edward Chen
989491c333
[NNAPI EP] Make partitioning stop ops configurable. (#8444)
Enable NNAPI EP partitioning stop ops to be overridden by a session configuration option.
2021-07-22 09:21:42 -07:00
pengwa
892ac9f55a
code structure update (rename only) (#8410) 2021-07-22 23:50:19 +08:00
DeyuHuang
4275055868
Add Gridsampler contrib op (#8372)
* add Gridsampler contrib op

* fix gridsampler_paddingmode_border test

* disable the tests until the kernel added

* fix CI failure

* change GridSampler to GridSample
2021-07-22 15:39:28 +08:00
Ryan Hill
53d5814d12
Move the wrapped types out of provider_interfaces (#8455) 2021-07-21 21:43:40 -07:00
Faith Xu
14b045ad52
Add link to sample repos (#8417)
* Update readme and add link to sample repos

* Minor updates based on PR feedback

* Add links to sample repos in former samples folder
2021-07-21 16:18:59 -07:00
Ryan Lai
83bb771e0b Merged PR 6283367: RI Onnxruntime github to internal Fork 7/21
This fixes merge conflict in onnxruntime.cmake

Related work items: #34759165
2021-07-21 21:31:04 +00:00
Ryan Lai
b0c0b087a4 Fix merge conflict 2021-07-21 14:02:18 -07:00
Edward Chen
695536a7ac
Make some common macros safer to use. (#8445) 2021-07-21 12:14:36 -07:00
Oliver Rausch
972aee8308
Fix GCC build error in quantization tests (#8449) 2021-07-21 18:15:13 +02:00
Ryan Hill
7e2ecb2eeb
Remove unnecessary line as no headers exist now (#8446) 2021-07-21 01:03:05 -07:00
Adam Pocock
55b26b6951
[Java] Adds support for DNNL, OpenVINO, TensorRT shared providers and refactors the CUDA shared provider loader (#8013) 2021-07-20 22:33:15 -07:00
Changming Sun
1cd9b47d8d
Remove all C/C++ samples from our C# dir (#8441) 2021-07-20 21:46:46 -07:00
Rajalakshmi Srinivasaraghavan
894fc82858 POWER10: Additional check in cmake
When compiling with a newer gcc and an older glibc, the new POWER10
macros may not be available in hwcap.h. This patch checks whether
the hwcap macros are available before using them in
platform.cpp.
2021-07-20 13:04:18 -07:00
Sherlock
28527b4867
Handle duplicated names for output_grads (#8431) 2021-07-20 10:17:31 -07:00
Ryan Hill
cc9f793b48
Move one function from cuda_provider_factory.h (#8407) 2021-07-19 17:55:59 -07:00