Commit graph

5198 commits

Author SHA1 Message Date
Edward Chen
ef930b3ca9
[Objective-C API] Fix ORTIsCoreMLExecutionProviderAvailable link error when used from Swift. (#8350) 2021-07-14 18:38:58 -07:00
Guoyu Wang
c5038063ed
Add iOS/macOS static framework (#8357)
* Add ability to generate ios static framework

* Fix typos

* Add pod cache clean, update some comments of previous commit

* Fix CI failure with newly added cpuinfo library

* Update test model (CoreML requires node has a name)

* Addressed CR comments
2021-07-14 16:39:17 -07:00
Tianlei Wu
41f1280fc9
Fix transformer optimizer (#8392)
* fix a few issues
2021-07-14 16:00:17 -07:00
Edward Chen
88d1ffe9b8
Fix invalid access in log call. (#8389)
Fix bug that shows up when running tests (in particular, GraphTransformationTests.ConcatSliceEliminationTest) with more verbose logging level.

There is a log statement that doesn't get evaluated at the default test logging level (warning). It was accessing the first element of an empty vector. This change moves that log statement before the point where that vector is cleared.
2021-07-14 15:09:45 -07:00
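Note on the commit above: the ordering problem it describes can be sketched in a few lines of Python. This is only an illustration of the pattern, not the actual C++ change; `eliminate_nodes` and the logger name are hypothetical. The `isEnabledFor` guard mimics a log statement whose arguments are only evaluated at a verbose level.

```python
import logging

logger = logging.getLogger("concat_slice_sketch")  # hypothetical logger name


def eliminate_nodes(nodes_to_remove):
    """Illustrative stand-in for a transformer step that empties a work list."""
    removed = len(nodes_to_remove)
    # Log BEFORE clearing. At WARNING level this branch is skipped entirely,
    # which is how a misplaced log call can hide an access to an empty list.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("first removed node: %s", nodes_to_remove[0])
    nodes_to_remove.clear()
    return removed
```

If the `clear()` call came first, the indexing would fail only when debug logging is enabled, matching the symptom described in the commit message.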
Yulong Wang
0a1c00e8db
[js/node] remove unused dependency node-pre-gyp-github (#8388) 2021-07-14 14:30:44 -07:00
Tianlei Wu
5cd254aa79
update gpt2 attention fusion for past pattern (#8375) 2021-07-14 12:04:53 -07:00
Changming Sun
4e1c5f6ef4
Move the samples to a new repo (#8374)
Move the samples to a new repo https://github.com/microsoft/onnxruntime-inference-examples
2021-07-14 11:16:39 -07:00
Sherlock
4931ef666d
Update ORTModule frontend code owner file (#8335) 2021-07-14 09:26:04 -07:00
Guoyu Wang
68c5eb5414
Fix reduced ops CI failure (#8377) 2021-07-13 20:53:57 -07:00
Tianlei Wu
e340a59993
Update machine info script for transformers notebooks (#8376)
* fix constructor
* update machine_info
* refactor shape_infer_helper
2021-07-13 19:54:27 -07:00
Edward Chen
16f6904232
[iOS] Packaging pipeline improvements. (#8324)
Updates to the iOS packaging pipeline:
- Make it harder to overwrite package archives accidentally when uploading (fails if the archive already exists)
- Only upload package archives for release builds
- Some clean up
2021-07-13 18:48:28 -07:00
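Note on the commit above: the "fail if the archive already exists" behaviour can be illustrated with a local-filesystem analogue. This is a sketch under assumptions; the real pipeline uploads to remote storage, and `store_archive` is a hypothetical name.

```python
from pathlib import Path


def store_archive(dest: Path, data: bytes) -> None:
    """Write an archive, refusing to overwrite an existing file."""
    # Mode "x" requests exclusive creation: open() raises FileExistsError
    # if the destination is already present, so nothing is overwritten.
    with dest.open("xb") as f:
        f.write(data)
```

Exclusive creation makes the existence check and the write a single step, so there is no window in which another writer could create the file between the two.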
Chen Fu
0020703d00
Fix cpuinfo initialization failure in mlas test (#8366)
Fix cpuinfo initialization failure in mlas test


Co-authored-by: Chen Fu <fuchen@microsoft.com>
2021-07-13 18:39:15 -07:00
Ye Wang
04297110c3
Support int64 in ReduceMin cuda op for Opset 14 (#8307)
* reducemin int64_t support

* fix xxcuda.so load error

* testtest

* refactor

* update doc

* propagate types to opset14

* re-generate doc

* rename macro
2021-07-13 16:18:06 -07:00
Jeff Daily
8d8db7c9f0
[ROCm] clear last status if hipErrorNotReady (#8358)
* [ROCm] clear last status if hipErrorNotReady

* use hipEventDisableTiming in rocm_fence.cc

* fix syntax errors

* destroy event before handle becomes invalid
2021-07-13 15:58:40 -07:00
Nick Kreeger
178c139718
cleanup formatting in skip_layer_norm.cc (#8371) 2021-07-13 16:36:41 -05:00
Chi Lo
31f291f0af
Add TRT EP memory leak test into trt perf script (#8155)
* Add memory check for TRT perf

* Revise test app

* Add memory check for TRT perf

* Revise test app

* add test cases

* Modify script and add pipeline YAML

* remove redundant code

* temporarily change

* Change YAML

* revise test app

* fix minor bug

* code refactor

* small fix

* temporarily change for test

* prepare result log

* rm container when it exits

* code refactor
2021-07-13 09:39:08 -07:00
KeDengMS
eda1411e03
Fix symbolic shape inference regression in RoBERTa training (#8364)
* Needs to assign shape field for scalar output

* Add op test for SoftmaxCrossEntropyLoss
2021-07-13 08:15:53 -07:00
pengwa
7db4fc8c2a
Fix segmentation fault for custom function (#8331)
* unregister registered python functions upon normal interpreter termination
* atexit.register(unregister_python_functions) should be called by __init__.py
* minor fix
2021-07-13 18:01:33 +08:00
Yufeng Li
5bf862eef9
Fix build break on windows arm64 (#8361) 2021-07-12 22:35:21 -07:00
Changming Sun
530d7bb46d
Temporarily disable transformers tool test (#8360) 2021-07-12 20:31:22 -07:00
Zuwei Zhao
0a5b75f5cd
Update submodule onnxruntime-extensions. (#8282)
* Update submodule onnxruntime-extensions to latest.

* Add document for onnxruntime-extensions.

* Update cgmanifest.json for onnxruntime-extensions.

* Add example in JavaScript.

Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com>
2021-07-13 10:21:11 +08:00
Changming Sun
29ca08a729
Update Dockerfile.cuda: remove compute capability 30
30 is not supported in CUDA 11.1.

35,37,50 are deprecated.
2021-07-12 15:19:48 -07:00
Chen Fu
df4cb6f301
Adding pytorch cpuinfo as dependency (#8178)
The PyTorch cpuinfo library lets us query the current CPU's features, micro-architecture, cache size, etc. This information is needed for targeted performance optimizations.

Unfortunately, it does not work under Windows/ARM. We will need to develop our own later.
2021-07-12 14:21:12 -07:00
Sheil Kumar
eec8e1394a
Memory map files on windows to speed up model load (#8349)
* Memory map files on windows to speed up model load

* fix custom ops

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2021-07-12 11:52:08 -07:00
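Note on the commit above: the general memory-mapping technique can be sketched with Python's standard `mmap` module. This is not the Windows C++ implementation from the commit; `read_header` is a hypothetical helper that reads only the first bytes of a file through a read-only mapping.

```python
import mmap
from pathlib import Path


def read_header(path: Path, size: int = 16) -> bytes:
    """Return the first `size` bytes of a file via a read-only memory map."""
    with path.open("rb") as f:
        # Length 0 maps the whole file; pages are loaded lazily on access,
        # so only the touched region is actually read from disk.
        # Note: mapping an empty file raises ValueError.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return view[:size]
```

The potential speed-up comes from skipping an up-front copy of the whole file into a buffer and letting the operating system page data in on demand.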
Yufeng Li
f6956e0259
Refactor qgemm file (#8322)
This PR purely extracts each kernel to a standalone file. No functionality change. Specifically, it includes:

- leave the MlasGemm function and thread handling in qgemm.cc
- put dispatcher functions and the template functions (interfaces) that are required to implement a kernel into qgemm.h
- put each kernel implementation in a separate file, which implements/specializes the template functions: MlasGemmU8X8FixupZeroPointB, MlasGemmU8X8CopyPackA, MlasGemmU8X8CopyPackB, MlasGemmU8X8Kernel
- determine the files to be compiled in the cmake file
2021-07-12 10:13:20 -07:00
KeDengMS
b7c9696ac3
Symbolic_shape_infer fixes (#8280)
1. Add support for sequence ops: ConcatFromSequence, SequenceAt, SequenceInsert. Other sequence ops supported by onnx worked well after adding these ops, so there is no need to add all of them in symbolic_shape_infer.
2. For an If node, the outputs of the two branches might have different shapes. In that case, for sequence output, use None in the dimension; for tensor output, create a new symbolic dimension.
3. Fix a bug in Tile, where the input for repeats might be of unknown value.
4. Topological sort of nodes in a graph needs to consider implicit inputs in subgraphs for If/Loop/Scan ops.
5. Generate a unique prefix for new dimensions inside a subgraph.
2021-07-09 19:14:26 -07:00
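Note on the commit above: item 2 (merging the output shapes of the two If branches) can be sketched as follows. This is an illustration under assumptions, not the code in symbolic_shape_infer; `merge_branch_shapes`, the `If_dim` prefix, and the equal-rank requirement are all hypothetical.

```python
import itertools

_new_dim_ids = itertools.count()  # source of unique suffixes for new dims


def merge_branch_shapes(then_shape, else_shape, is_sequence=False,
                        prefix="If_dim"):
    """Merge per-dimension shapes from the two branches of an If node."""
    if len(then_shape) != len(else_shape):
        raise ValueError("this sketch assumes both branches have equal rank")
    merged = []
    for a, b in zip(then_shape, else_shape):
        if a == b:
            merged.append(a)       # branches agree: keep the dimension
        elif is_sequence:
            merged.append(None)    # sequence output: unknown dimension
        else:
            # tensor output: introduce a fresh symbolic dimension name
            merged.append(f"{prefix}_{next(_new_dim_ids)}")
    return merged
```

For example, merging `[2, 3]` with `[2, 4]` keeps the leading `2` and replaces the second dimension with `None` for a sequence output or with a newly generated symbolic name for a tensor output.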
Guoyu Wang
10142f9510
Add metadata_props to ORT model (#8340)
* Add metadata_props to ORT model

* Minor update

* Update python binding, and increase the minimal pipeline size threshold

* Fixed a small bug in serializing ir_version

* Remove temp ort.py.fbs and add it to .gitignore
2021-07-09 11:28:27 -07:00
Changming Sun
60641a19e4
Add "/external:templates-" to VC++ flags (#8338) 2021-07-09 11:23:53 -07:00
Tang, Cheng
e467d78a11
fix a typo (#8334)
Co-authored-by: Cheng Tang <chenta@microsoft.com>
2021-07-09 09:24:43 -07:00
Tang, Cheng
598454bb5f
Fix the mix precision handle for square case (#8333)
* handle unsqueeze change in opset13

* fix the node arguments index check for square case (x * x)

* Revert "fix the node arguments index check for square case (x * x)"

This reverts commit c66344f0a82c35d8c24d31f2264cf7e9b235ce22.

* handle the square case (x * x) for node argument search

Co-authored-by: Cheng Tang <chenta@microsoft.com>
2021-07-09 09:24:19 -07:00
Rachel Guo
187743726b
[CoreML EP] Add Int32<->Int64 handling around coreml ep (#8183)
* initial int32-int64 type handling

* initial

* clean and fix UT error

* modify code comments

* address partial pr comments

* minor update

* address pr comments

Co-authored-by: rachguo <rachguo@rachguos-Mac-mini.local>
Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
2021-07-09 09:08:05 -07:00
Hariharan Seshadri
5369821ad6
Support SpaceDepth ops in the CUDA and ROCM EPs (#7960) 2021-07-09 01:00:22 -07:00
Scott McKay
1b2e1a7e0c
Refactor QDQ optimizers to enable future usage in minimal build (#8191)
* Add new transformer that can split node selection from node modification to allow just the modifications to be applied at runtime in a minimal build. This is the first step of a few to enable a QDQ model to be optimized for the NNAPI EP and/or the CPU EP at runtime in a mobile scenario.
Add generic and QDQ specific helpers for selection and modification.
Replace existing QDQ optimizers with optimizer based on new approach.
2021-07-09 16:11:43 +10:00
Hariharan Seshadri
46e5c8d4b9
Cosmetic change in test infrastructure (#8292) 2021-07-08 21:52:02 -07:00
pengwa
5454af4b95
decouple the shared python dependency (#8294)
* remove warning message for non-training build

* move to/from dlpack for onnxruntime_python back into python project
2021-07-09 11:47:11 +08:00
Dmitry Yutkin
067759b387
Fix bad URL to huggingface onnx-export example notebook 2021-07-08 15:01:46 -07:00
satyajandhyala
84bc20fe9d
Enable cast propagation with level one by default. (#8286) 2021-07-08 14:38:09 -07:00
RandySheriffH
f40df30219
Replace functions with secured version for OSX compliance (#7586)
* replace strlen with strnlen

* replace vsnprintf with vsnprintf_l

* add macro

* switch to std::numeric_limits

* apply uint16 max

* fix build err

* fix mac build

* define MAX_STR_LEN

* define MAX_STR_LEN

* fix typo

* trim empty lines

* apply constexpr

* fix typo

* add namespace

* fix build err

* rename global constant

Co-authored-by: Randy <Randy@randysmac.attlocal.net>
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
Co-authored-by: Randy <Randy@randysmac.local>
2021-07-08 11:02:36 -07:00
pengwa
6dbfb8db0e
autograd function fallback perf (#8312)
* fix known issues

* Update orttraining/orttraining/test/python/orttraining_test_ortmodule_autograd.py

Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
2021-07-09 00:29:40 +08:00
Edward Chen
c254c3c355
Fix issue with ONNX to ORT format model conversion script when given single model file as input. (#8323) 2021-07-07 14:08:47 -07:00
baijumeswani
6652d17dcd
Support lists as inputs to ORTModule (#8311) 2021-07-07 13:04:19 -07:00
Thiago Crepaldi
9a855fe9e7
Make Torch CPP extension build optional for packaging pipelines (#8305) 2021-07-07 07:24:58 -07:00
Tang, Cheng
d7c3703371
handle unsqueeze change in opset13 (#8308)
Co-authored-by: Cheng Tang <chenta@microsoft.com>
2021-07-06 22:30:24 -07:00
pengwa
2347a0aca8
Autograd Function Fallback bug fix - moe support (#8105)
* Support forward input orders like "Non_tensor/Tensor/Non_tensor". Correspondingly, support "None/Tensor_Grad/None" for backward outputs.

* Report RuntimeError when PythonOp detected but _enable_custom_autograd_function is enabled.

* Fix "PoliCheck ] - Defect : Term "hang", Component : orttraining\orttraining\python\training\ortmodule\__init__.py (1 issue)"

* rename call_convention->input_convention, input_tensor_requires_grads->input_requires_grads

* fix minor comment

* revert PoliCheck fix in case of conflict

* Update orttraining/orttraining/core/graph/training_op_defs.cc

Co-authored-by: Tim Harris <tiharr@microsoft.com>

* Apply suggestions from code review

Refine the schema description

Co-authored-by: Tim Harris <tiharr@microsoft.com>

* Resolve review comments

Co-authored-by: Tim Harris <tiharr@microsoft.com>
2021-07-07 08:58:01 +08:00
Nick Kreeger
40e5279f8f
Drop unused functions from math.h (#8304)
* Drop unused functions from math.h

* fix dnnl_conv.h
2021-07-06 19:18:18 -05:00
Nick Kreeger
62d1458ea8
Move kernel implementations outside of lookup table utility functions. (#8306) 2021-07-06 18:31:05 -05:00
baijumeswani
090bae21ab
Pinning pillow version to 8.2.0 to circumvent regression introduced by 8.3.0 (#8303) 2021-07-06 13:02:39 -07:00
Suffian Khan
008c5f7640
Use single builder image across Python versions for ROCm wheels (#8302)
* first attempt to share docker image across python and torch versions

* set dependency between jobs

* fix yaml grammar

* remove python version from first stage

* clean deepspeed directory

* split into two images according to torch version

* fix yaml syntax

* invalidate cache

* remove DS to prevent torch 1.9.0 upgrade
2021-07-06 11:56:00 -07:00
RandySheriffH
56e4dd1d3e
Fix optimizer crash (#8274) 2021-07-02 17:19:15 -07:00
Suffian Khan
e71846b029
fix ld_preload for rocm (#8290) 2021-07-02 17:15:28 -07:00