Commit graph

3986 commits

Author SHA1 Message Date
Xavier Dupré
84addcd2cf
Support double for operator ReduceMean, ReduceLogSumExp (#6217)
* Support double for operators ReduceMean, ReduceLogSumExp
2020-12-31 11:24:54 +01:00
Xavier Dupré
5968a91ea6
Support double for operator Gemm + fix bug in gemm implementation for cuda, rocm when sizeof(type) != sizeof(float) (#6223)
* Support double for operator Gemm
* fix type size while copying data in gemm operator for GPU
* fix type in gemm implementation for rocm
2020-12-31 11:24:16 +01:00
Xavier Dupré
70e2f96ef4
Support double for operator TopK + fix one bug in TopK implementation for GPU for double (#6220)
* Support double for operator TopK
* add static classes for topk/double
* fix cast issue in topk
2020-12-31 11:23:19 +01:00
Tracy Sharpe
ecb2e119e4
MLAS: handle MlasGemm(M/N/K==0) cases (#6238) 2020-12-30 23:25:10 -08:00
Hariharan Seshadri
4cc2ffef21
Support MLFloat16 type in Pow opset-12 CUDA kernel (#6233) 2020-12-30 20:41:59 -08:00
William Tambellini
39a988ce1c Upgrade build.py to assert for python 3.6+
Upgrade build.py to assert for Python 3.6+,
as Python 3.5 can no longer build today's master.
2020-12-30 20:17:09 -08:00
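The kind of guard described in 39a988ce1c can be sketched as follows. This is a minimal illustration, not the code from build.py: the helper name `check_python_version` and the error message are assumptions.

```python
import sys

# Minimum interpreter version named in the commit message.
MIN_PYTHON = (3, 6)


def check_python_version(version_info=None):
    """Raise RuntimeError if the interpreter is older than MIN_PYTHON.

    Hypothetical helper; version_info defaults to the running interpreter.
    """
    version_info = sys.version_info if version_info is None else version_info
    major_minor = tuple(version_info[:2])
    if major_minor < MIN_PYTHON:
        raise RuntimeError(
            "Python %d.%d or newer is required, found %d.%d"
            % (MIN_PYTHON + major_minor)
        )
```

Calling `check_python_version()` at the top of a script fails early with a clear message instead of a later syntax or import error.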
Changming Sun
c15a858745 Update the readme file 2020-12-30 20:16:45 -08:00
Changming Sun
3911105f09 Remove python 3.5 2020-12-30 20:16:45 -08:00
Changming Sun
1b23b28706
Remove MKLML/openblas/jemalloc build config (#6212) 2020-12-30 17:18:19 -08:00
Michael Giba
5c584b2636
Removed executor todo that looks dead. (#6234) 2020-12-30 17:17:37 -08:00
Michael Goin
bbb6b416f0
Fix ImportError in build.py (#6231)
There is a possible ImportError where build.py can import the wrong 'util' package if there are others present in `sys.path` already
2020-12-30 14:22:55 -08:00
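One common way to avoid the shadowing described in bbb6b416f0 is to put the script's own directory at the front of `sys.path` before importing. The sketch below assumes that approach; the actual change in #6231 is not shown in this log, and `prefer_local_imports` is a hypothetical helper name.

```python
import os
import sys


def prefer_local_imports(script_dir, path=None):
    """Move script_dir to the front of the import search path.

    Hypothetical helper: with script_dir first, `import util` resolves to the
    package next to the script rather than an unrelated `util` found later.
    """
    path = sys.path if path is None else path
    script_dir = os.path.abspath(script_dir)
    # Drop any existing entry so the directory appears exactly once.
    path[:] = [p for p in path if os.path.abspath(p) != script_dir]
    path.insert(0, script_dir)
    return path
```

A script would typically call `prefer_local_imports(os.path.dirname(os.path.abspath(__file__)))` before its first local import.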
Xavier Dupré
df7e2f3c1e
Support double for operators Relu, Tanh, Sigmoid (#6221) 2020-12-29 18:25:23 +01:00
Xavier Dupré
111ac299cc
Support double for operators Where, LpNormalisation (#6034) 2020-12-28 12:53:44 +01:00
Xavier Dupré
2d09db67b4
Support double for operators Log, Reciprocal, Sum (CPU) (#6032)
* Support double for operators Log, Reciprocal, Sum
* remove test erf_double
2020-12-28 12:53:18 +01:00
Xavier Dupré
8a0f5c50ab
Minor change to improve performance for operator Pad. (#5537)
* small improvement for pad
2020-12-28 12:52:41 +01:00
Jesse Benson
7ccdfed1a6 Remove most ROCm-specific element-wise code and reuse CUDA element-wise code. 2020-12-27 10:30:29 -08:00
Jesse Benson
52228a703c Use TArray in AMD element-wise kernels, rather than manually copying memory to device. 2020-12-27 10:30:29 -08:00
Changming Sun
1fc7f92f25
Fix a memory leak in test_inference.cc (#6201)
* Fix a memory leak in test_inference.cc
2020-12-25 13:02:21 -08:00
sfatimar
7347996942
Openvino ep 2021.2 (#6196)
* Enabling fasterrcnn variant and vehicle detector

* changes for 2021_2 branch

* yolov3_pytorch commit

* fixed braces in basic_backend.cc

* ci information added

* faster rcnn variant and vehicle detector changes were made in 2021.1 and not in 2021.2

* some changes to support unit tests

* disable some tests which are failing

* fix myriad tests for vehicle detector

* Did some cleanup
*cleaned up comments
*Disabled Add_Broadcast_0x1 and Add_Broadcast_1x0
tests on MYRIAD_FP16 backend due to a bug
*cleaned up capability_2021_2.cc file
*Removed extra conditions which were added
for some validation in backend_utils

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* yolov3 pytorch workaround to ensure that the output names are matched

* gemmoptest fixed on myriad

* Fixed MYRIADX CPP Test Failures

*Expand,GatherND,Range,Round op's
are only supported in model

*where op with float input data
types are not supported and fixed

*Scatter and ScatterElements op's with
negative axis are fixed

*Reshape op with 0 dim value are not
supported and fixed

*Disabled InstanceNorm_2 test on MYRIADX

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* make changes to yolov3 pytorch

* Fixed python unit tests
*Fixed failing python tests on vpu,
GPU and CPU

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Fixes POW op failures on GPU_FP16

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Clean up capability_2021_2.cc

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Updated docx for MultiThreading option
*Added extra info on setting the num_of_threads
option using the API and its actual usage

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* fixed slice and removed extra prints

* Disabled failing python tests

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Minor changes added in capability_2021_2

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* made changes to slice to avoid failures

* Disabling FP16 support for GPU_FP32
->Inferencing an FP16 model on GPU_FP32
leads to accuracy mismatches, so we would
rather use GPU_FP16 to infer an FP16 model
on GPU Device

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Updated docx for Inferencing a FP16 Model

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* fix for mask rcnn

* Script for installing openvino from source

* Updated with openvino 2021.2 online installation

* code comment fixes
fixed accuracy mismatch for div

* Update OpenvinoEP-ExecutionProvider.md

updated for 2021.2 branch

* Update README.md

updated dockerfile documentation

* Update BUILD.md

build.md update documentation

* permission change of install_openvino.sh

* made changes to align with microsoft onnxruntime changes

* Updated with ov 2021.2.200

Co-authored-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
Co-authored-by: sfatimar <sahar.fatima@intel/com>
Co-authored-by: MaajidKhan <n.maajidkhan@gmail.com>
Co-authored-by: mohdansx <mohdx.ansari@intel.com>
2020-12-23 08:47:22 -08:00
Ryan Lai
0494a0f95f
Add ability to skip GPU tests based on GPU adapter name (#6198)
* Implement conversion from ortvalue to Itensor for string tensors and comparing sequence of maps of strings to floats

* PR comments

* Add ability to skip gpu tests according to adapter description

* spacing

* spacing

* spacing
2020-12-22 15:20:23 -08:00
Jesse Benson
c562952750 Dockerfile to build onnxruntime with ROCm 4.0 2020-12-22 10:21:12 -08:00
Ryan Lai
21395f8e24
Implement comparing outputs that are sequence of maps of strings to floats (#6180)
* Implement conversion from ortvalue to Itensor for string tensors and comparing sequence of maps of strings to floats

* PR comments
2020-12-22 09:52:29 -08:00
baijumeswani
a8b482681a
Clean up checkpoint tests to use the new checkpoint functions (#6188)
* add deprecation warning for old checkpoint functions

* update all the distributed checkpoint tests to use new checkpoint functions
2020-12-22 09:15:57 -08:00
Hariharan Seshadri
04b3e0ef5e
Condition fix in Resize operator (#6193) 2020-12-22 00:05:31 -08:00
Hariharan Seshadri
fc27074bae
Implement ScatterND for CUDA EP (#6184) 2020-12-22 00:04:20 -08:00
Chi Lo
945fae8f56
Lochi/quantization tool for trt (#6103)
* Initial implementation of generating calibration dynamic range table

* Initialize validation support for Quantization

* Initialize validation support for Quantization (cont.)

* Improve validation support for Quantization

* Improve validation support for Quantization

* Rewrite/Refine for calibration and validation

* Rewrite/Refine for calibration and validation (cont.)

* Refine code

* Refine code

* Add data reader for BERT

* Add flatbuffers to serialize calibration table

* Refine code and add BERT evaluation

* Refine the code

* minor modification

* Add preprocess/postprocess of vision team yolov3 and refine the code

* Update annotation

* Make bbox coordinates more accurate

* Fix bug

* Add support of batch processing

* Batch processing for model zoo yolov3

* Add batch inference for evaluation

* Refine the code

* Add README

* Add comments

* Refine the code for PR

* Remove batch support checking in data_reader and refine the code

* Refine the code for PR

* Refine the code for PR review

Co-authored-by: Olivia Jain <oljain@microsoft.com>
2020-12-21 20:59:08 -08:00
Olivia Jain
234e94b4e1
Add Status.csv to EP Perf Tool (#6167)
* merge master, keep postprocess status commit

* download float16.py every time

* removing hardcoded values
2020-12-21 20:23:19 -08:00
Suffian Khan
67ac6ae4e0
Tune fast Gelu to use exp(x) instead of tanh(x) on Rocm platform (#6174)
* tune fast gelu to use exp(x) instead of tanh(x) on rocm

* update to use expression 2/(1+exp(-2x))-1 for stability
2020-12-21 16:25:21 -08:00
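The expression named in 67ac6ae4e0 is an exact rewrite of tanh, since tanh(x) = 2/(1+exp(-2x)) - 1. The sketch below only checks that identity in plain Python; it is not the ROCm kernel, and `tanh_via_exp` is a hypothetical name.

```python
import math


def tanh_via_exp(x):
    """Compute tanh(x) as 2 / (1 + exp(-2x)) - 1."""
    try:
        e = math.exp(-2.0 * x)
    except OverflowError:
        # Python raises where IEEE arithmetic would produce +inf;
        # the expression then tends to 2/inf - 1 = -1.
        return -1.0
    # For large positive x, e underflows to 0 and the result is exactly 1.
    return 2.0 / (1.0 + e) - 1.0
```

The form saturates cleanly at both ends: a large positive x gives 2/1 - 1, and a large negative x gives 2/inf - 1, with no inf/inf intermediate.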
Weixing Zhang
53307a5f2e
improve perf for softmax (#6128)
* improve perf for both gathergrad and softmax

* revert the change in gathergrad and will be done in another PR.

* address comments from code review.
2020-12-21 14:15:54 -08:00
S. Manohar Karlapalem
ea9cfa554a
Add usage details of unified MCR container image (#6182)
Going forward, a single unified docker image will be published in
MCR. The hardware accelerator target choice will have to be made
in the application using OpenVINO EP's runtime config options.
2020-12-21 11:48:54 -08:00
satyajandhyala
201d0dbb1a
Android coverage dashboard (#6163)
* Write the report to a file.

* Post code coverage to the Dashboard database.
2020-12-21 10:34:01 -08:00
jingyanwangms
f874260b9e
Backend APIs for checkpointing (#5803)
* Add backend API GetOptimizerState and GetModelState

* add GetPartitionInfoMap
2020-12-21 08:21:29 -08:00
Scott McKay
2da8060f34
Helper for compiling EP to generate deterministic unique ids for use in MetaDef names (#6156)
* Create a helper for generating unique ids that can be used by an EP that creates compiled nodes and needs ids to be deterministic for a model when used in multiple sessions.

Added to IExecutionProvider as this can potentially be used by all compiling EPs and is more robust than a simplistic counter (although EP implementer is free to choose either approach).

* Restructure the helper so it can be called across the EP bridge.
Add ability to call id generation helper from EP bridge
  - convert DNNL EP to use helper to validate
Address issue where a new Model may be loaded into the same address as a previous one.
  - hash the bytes in the Graph instance (1728 bytes currently) to use as the key to the full hash for the model
Add lock around id generation to ensure no issues if multiple sessions partition graphs at exactly the same time.
  - Extremely unlikely but would be hard to debug and the locking cost is not an issue as it's only incurred during graph partitioning and not execution.
2020-12-21 12:17:58 +10:00
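The idea in 2da8060f34 (ids that depend on model content rather than a global counter, guarded by a lock) can be sketched roughly as below. This is an illustration only: the class name, the use of SHA-256, and the `(hash, counter)` return shape are assumptions, not the onnxruntime implementation.

```python
import hashlib
import threading


class MetaDefIdGenerator:
    """Hypothetical sketch: ids built from a model hash plus a per-model counter."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_id = {}  # model hash -> next counter value

    def generate(self, model_bytes):
        # Hashing the content makes the id independent of memory addresses,
        # so a new model loaded at a reused address still gets its own key.
        model_hash = hashlib.sha256(model_bytes).hexdigest()[:16]
        # The lock keeps the read-increment-write step atomic if several
        # threads partition graphs at the same time.
        with self._lock:
            n = self._next_id.get(model_hash, 0)
            self._next_id[model_hash] = n + 1
        return model_hash, n
```

In this sketch, two generator instances (standing in for two sessions) yield the same id sequence for the same model bytes, which is the determinism property the commit message asks for.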
Edward Chen
cd3a5acca0
Update get_docker_image.py to enable use without image cache container registry. (#6177)
Update get_docker_image.py to enable use without image cache container registry.
2020-12-18 19:01:02 -08:00
Derek Murray
11b0a5401e
Fix typo in BERT pretraining script (#6175)
A misplaced `}` meant that the `'enable_adasum'` option was interpreted incorrectly, causing the test to fail.
2020-12-18 16:38:14 -08:00
Guoyu Wang
bbb52e9274
[NNAPI EP] Enable per-channel quantization for QlinearConv (#6155)
* Enable qlinearconv per-channel quantization

* Fix the android CI test failure

* Add Android Version Check for Per-Channel Quant

* Address PR comments

* Fix some minor issues

* Add verification of per-channel zero points

* Make the error tolerance configurable
2020-12-18 16:13:22 -08:00
baijumeswani
39aedbc97f
aggregate model states only for the case when mixed precision was true (#6176) 2020-12-18 14:09:32 -08:00
Pranav Sharma
86493e6d0c
Update documentation for contributing a PR and add deprecation notices for PyOp and ORT server. (#6172) 2020-12-18 02:00:42 -08:00
Sergii Dymchenko
824ef9a1de
Don't try to bind unused inputs in the Training frontend (#6166) 2020-12-17 21:41:28 -08:00
baijumeswani
adc2071043
save_checkpoint, load_checkpoint and aggregate_checkpoints (#6136)
* save_checkpoint and load_checkpoint implementations

* checkpoint aggregation logic

* unit tests for save_checkpoint, load_checkpoint and aggregate_checkpoints
2020-12-17 21:01:36 -08:00
Guoyu Wang
c339bb2da9
Remove ignored build warnings for pybind on Mac (#6165) 2020-12-17 19:54:28 -08:00
Yufeng Li
98d8a3e335
Revert "Fuse MatMulIntegerToFloat only when scales are scalar (#6008)" (#6169)
This reverts commit f2dcba7afe.
2020-12-17 19:53:50 -08:00
Du Li
34725ae520
Bugfix for topk cuda kernel (#6164)
* fix the issue that std::numeric_limits cannot handle half type

* adding a test

Co-authored-by: Du Li <duli@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-12-17 17:59:37 -08:00
Jay Rodge
dec703b62d
Update TensorRT-ExecutionProvider.md (#6161) 2020-12-17 17:10:40 -08:00
Tixxx
32c67c2944
Deprecating Horovod and refactored Adasum computations (#5468)
deprecated horovod submodule
refactored adasum logic to be ort-native
added tests for native kernel and e2e tests
2020-12-17 16:21:33 -08:00
Pranav Sharma
efa1b0d864
Minor fix to satisfy c++14 (#6162) 2020-12-17 13:53:24 -08:00
Juliana Franco
36c03b32e9
Using a map of ops to stages as input of partition function. (#5940)
* New partition algorithm running before AD

* Convert cut_group_info into device map. Work in progress -- works for bert-tiny with pp=2

* Removing code for partition of bwd graphs

* Remove old code

* Adding some verification code

* Handle Shared Initializer

* Renaming rank with stage

* Added first unit test

* new test

* redundant check

* undo change in bert

* Moved cut-based partition to testing utils file

Co-authored-by: xzhu1900
Co-authored-by: wschin

* New conversion function and tests

* minor

* remove test that is not needed2

* improve GetDeviceAssignment and PR comments

* minor changes

* PR comments

* improving documentation and variable naming

* add documentation

* Variable naming and docs

* more doc improvements

* more doc improvements

* missing static cast

* Fix test file for windows

* Fix test file for windows

* Fix test file for windows

* stage id is not the same as rank id

* PR comments

* PR comments

* More comments

* More comments
2020-12-17 09:03:33 -08:00
Tracy Sharpe
503b61d897
MLAS: add NEON version of int8 depthwise convolution (#6152) 2020-12-16 18:39:10 -08:00
Edward Chen
0fa04bdc50
Fix clean_docker_image_cache.py detection of image pushes. (#6151)
Fix clean_docker_image_cache.py detection of image pushes. They were being ignored because the expected HTTP status code was wrong. For pushes, it's 201 instead of 200.
2020-12-16 17:25:22 -08:00
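The status-code mix-up fixed in 0fa04bdc50 can be illustrated with a small check. The helper name `is_successful_push` is hypothetical, and the real script's log parsing is not shown in this log; only the 201-versus-200 distinction comes from the commit message.

```python
from http import HTTPStatus


def is_successful_push(status_code):
    """Return True for the status a successful push reports (201 Created).

    Hypothetical helper: comparing against 200 OK would ignore every push,
    which is the failure mode the commit message describes.
    """
    return status_code == HTTPStatus.CREATED
```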
Changming Sun
344a2a8ee5
Revert "work around of the build break in mac (#6069)" (#6150)
This reverts commit 3cae28699b.
2020-12-16 14:41:18 -08:00