11997 commits

Author SHA1 Message Date
Changming Sun
a5fea26cb4 Disable model tests for Mac OS X builds 2020-04-02 15:14:32 -07:00
Thiago Crepaldi
e2afe5e054 Revert Session and InferenceSession implementation 2020-04-02 11:47:44 -07:00
Changming Sun
aefa466334
Allow zero in split op (#3389)
Allow zero in split op (A change in onnx 1.7 without bumping up the op version)
2020-04-01 16:20:14 -07:00
Tiago Koji Castro Shibata
1671072b6b
[WIP] Port image tests from WAI (#3365)
* Copy image tests from ADO

* wip

* Port tests to googletest

* Add FNS-Candy license

* Add missing collaterals

* Remove brand images

* Fix typos

* Use PrepareModelSessionBinding in MnistImageTest

* Fix typos
2020-04-01 15:38:44 -07:00
Thiago Crepaldi
0b1e3f1e10 Revert _SliceKernel cuda implementation 2020-04-01 14:28:17 -07:00
Thiago Crepaldi
28ff88ce52 Disable tests (temporary) 2020-04-01 14:28:07 -07:00
Tiago Koji Castro Shibata
1c334ed0f1
Add Ninja generator to build.py (#3331) 2020-04-01 14:19:22 -07:00
Xavier Dupré
edec8043d4
Fix python examples in documentation (#3379) 2020-04-01 22:48:32 +02:00
ytaous
2ce90cff4c
PR comments (#3374)
* PR comments

* PR comments

* PR comments

* PR comments

* PR comments

* PR comments

* PR comments

Co-authored-by: Ethan Tao <ettao@microsoft.com>
2020-04-01 10:36:16 -07:00
Sherlock
614eb438ae
Update Op's Domain and Version (#3356)
* Update Nccl ops domain opset

* Update ZeroGradient Domain OpSet

* Update InPlaceAccumulator Domain OpSet

* Update SoftmaxGrad Domain and OpSet

* Update LayerNormalizationGrad Domain and OpSet

* Update BatchNormGrad Domain and Opset

* Update IsAllFinite Domain and Opset

* Update DivGrad Domain and Opset

* Update GatherGrad Domain and Opset

* Update IsFinite Domain and OpSet

* Update ReduceAllL2 Domain and Opset

* Update MixedPrecisionScale Domain and Opset

* Update AllOp Domain and Opset

* Update GroupOp Domain and OpSet

* Update ViewOp Domain and OpSet
2020-04-01 10:10:38 -07:00
Changming Sun
accffded5d
Build options for enabling AVX/AVX2/AVX512 (#3373)
1. Add build options for enabling AVX/AVX2/AVX512
2. Update eigen to a newer version, because the current one doesn't work with VC and AVX512.
2020-04-01 10:07:22 -07:00
Brian Martin
77c7d09ced
ERROR_NOT_SUPPORTED doesn't trigger Failed Hresult. Need E_NOTIMPL (#3396) 2020-04-01 10:06:00 -07:00
Brian Martin
052c1fda44
fix some warnings in concurrency tests (#3395) 2020-04-01 10:05:24 -07:00
Scott McKay
33d3239b67
Rework SVMClassifier to improve performance (#3363)
* Rework SVMClassifier
 - use GEMM for initial scoring
 - minimize data allocations and copies
 - parallelize the second half of the scoring for larger batches
2020-04-01 22:00:01 +10:00
Thiago Crepaldi
6d769d47c4 Fix InferenceSession API 2020-03-31 20:10:06 -07:00
Xueyun Zhu
efc8bd738f
add pipeline graph split script (#3275)
* pipeline graph cut

* add element type

* add input wait event and shape info

* shape inference

* support multiple cuts

* format script

* address feedback

* address feedback
2020-03-31 19:30:18 -07:00
Thiago Crepaldi
83c3da3fc0 Fix code-base after breaking API changes 2020-03-31 17:59:20 -07:00
Tiago Koji Castro Shibata
a61400de01
Fix ARM cross compilation (related to #3378, #3298) (#3385) 2020-03-31 17:10:48 -07:00
Dwayne Robinson
ae15c36687 Merged PR 4500055: Fix autopilot WinML::Engine::Test::BenchmarkProtobuf#onnxzoo_ssd#1.5#GPU
Model node "Slice-505" specifies a 64-bit INTMAX value (9223372036854775807) to mean essentially unbounded. The proper response in this case is to clamp it to 32-bit INTMAX, not bitwise truncate it, which yields -1.

Related work items: #24672220
2020-04-01 00:10:45 +00:00
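
The difference between clamping and bitwise truncation described in commit ae15c36687 can be illustrated with a minimal sketch. The helper names below are hypothetical and are not taken from the DML execution provider; the sketch only reproduces the arithmetic stated in the commit message.

```python
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1
INT64_MAX = 2**63 - 1  # 9223372036854775807, used by the model to mean "unbounded"


def truncate_to_int32(value: int) -> int:
    """Keep only the low 32 bits and reinterpret them as a signed integer."""
    low = value & 0xFFFFFFFF
    return low - 2**32 if low >= 2**31 else low


def clamp_to_int32(value: int) -> int:
    """Saturate the value to the representable 32-bit signed range."""
    return max(INT32_MIN, min(INT32_MAX, value))


slice_end = INT64_MAX
print(truncate_to_int32(slice_end))  # -1: the "unbounded" end becomes a negative index
print(clamp_to_int32(slice_end))     # 2147483647: still effectively unbounded
```

Truncation turns the sentinel into -1, which Slice would read as "one before the last element", whereas clamping preserves the intended meaning.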
Changming Sun
55fd283d20
Fix a bug in FunctionImpl::FunctionImpl (#3376)
1. Fix a bug in FunctionImpl::FunctionImpl. It set wrong name for the new attribute.
2. Set error code to NOT_IMPLEMENTED if a function contains a not implemented op.
2020-03-31 15:54:47 -07:00
Dmitri Smirnov
a4fe60c4d3
OpSet 12 ops (#3341)
Advance ONNX commit to pickup the latest ArgMax, ArgMin,
  ReduceMax/ReduceMin, MaxPool
  Declare new versions for CPU/CUDA.
  Implement infrastructure support for int8/uint8.
  Adust GatherOp test for a new error.
  Adjust Scan9.BadShape test.
  Add exclusions for index out of bounds checks.
  Rework result verification for SVDTransformer.
2020-03-31 15:31:06 -07:00
manashgoswami
044c466158
Updated tags for v1.2.0 release (#3386)
Updated the tags in the table to reflect the new images for Release v1.2
2020-03-31 14:54:56 -07:00
Tianlei Wu
ecbacd7d79
Add Benchmark of GPT2 CPU inference (#3351)
* Add benchmark script and notebook for GPT2
* Update Reshape fusion for GPT2 model
* Add opt_level option for bert_model_optimization to disable onnxruntime by setting --opt_level 0
* Fix keras optimization
2020-03-31 13:43:09 -07:00
Jeff Bloomfield
e6e4339f0b Merged PR 4408731: Perf: Command lists should be preemptively reset in DML when flushing
Noticing this while analyzing perf of small models, e.g. emotionferplus.  Resetting command lists right after flushing rather than when they're used next ensures that the CPU work occurs when the GPU is busy.

Related work items: #25512194
2020-03-31 20:30:22 +00:00
Thiago Crepaldi
759818f2c1 Merge remote-tracking branch 'origin/master' into thiagofc/ort_training_merge_from_master 2020-03-31 10:53:22 -07:00
Scott McKay
ace741680d
Constant-12 support (#3304)
1. Support the new fields for Constant in opset 12
2. Support SparseTensor in the Constant node by converting to dense tensor when lifting the Constant to an initializer. Will make a model with a sparse tensor in a Constant work but isn't an overly efficient approach.
2020-03-30 23:13:52 -07:00
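
The sparse-to-dense conversion mentioned in commit ace741680d can be sketched roughly as follows. This is an illustration only, assuming linearized (flat, row-major) indices and a plain Python list as the dense buffer; the function name is hypothetical and this is not the ONNX Runtime implementation.

```python
from typing import List, Sequence


def sparse_to_dense(values: Sequence[float],
                    linear_indices: Sequence[int],
                    dims: Sequence[int]) -> List[float]:
    """Expand a sparse tensor into a flat, row-major dense buffer."""
    size = 1
    for dim in dims:
        size *= dim
    dense = [0.0] * size  # every position not listed in the indices stays zero
    for value, index in zip(values, linear_indices):
        dense[index] = value
    return dense


# A 2x3 tensor with two non-zero entries, at flat positions 1 and 5.
print(sparse_to_dense([1.0, 2.0], [1, 5], [2, 3]))
```

The dense buffer always holds the full element count, which is why the commit message notes that the approach works but is not especially efficient.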
stevenlix
2332a93db0
Update onnx-tensorrt parser (#3369)
* sync onnx-tensorrt parser and update TensorRT doc

* remove --msvc_toolset 14.16 in tensorrt ci pipeline
2020-03-30 20:31:59 -07:00
Jan Scholz
ce9acf0c21
iOS crosscompilation under linux (#3298)
* added support for ios crosscompilation under linux

* reverted cmake generator change

* if --ios is added protoc can be compiled for host system

* accidentally reverted change to compile protoc for host system for ios if protoc exe is not set

* wdata is now used

* accidentally pasted CMAKE_OSX_ARCHITECTURES into CmakeLists.txt, also made bad merge on build.py previously

* removed print

* fixed typo, deleted commented statements for earlier debugging

* reverted accidental delete

* added asmmacro.h for aarch64 asm
now MlasSgemmKernel**** gets underscore added if needed
no need anymore to differentiate between iOS arm64 and normal arm64 build
onnxruntime.cmake: added check if iOSCross is set to properly set RPATH

* removed 2 spaces

* fix: logical error fixed, now protoc gets compiled if not supplied with --path_to_protoc_exe

* removed unnecessarily added spaces

* removed some more spaces
2020-03-30 19:39:17 -07:00
edgchen1
fb2f97a002
Address master merge PR comments (#3348)
Address some comments from https://github.com/microsoft/onnxruntime/pull/3174.

- https://github.com/microsoft/onnxruntime/pull/3174#discussion_r396855459
- https://github.com/microsoft/onnxruntime/pull/3174#discussion_r396855630
- https://github.com/microsoft/onnxruntime/pull/3174#discussion_r396857140
- https://github.com/microsoft/onnxruntime/pull/3174#discussion_r398094858
- https://github.com/microsoft/onnxruntime/pull/3174#issuecomment-599024924
2020-03-30 18:52:48 -07:00
Yufeng Li
af618278f6
fix bugs in quantization and calibration tools (#3329)
Fix 3 bugs:
1. Node names are duplicated in calibration augment_graph if the name of the node to quantize is empty.
2. If output nodes are quantized, output values are quantized and not dequantized back.
3. Gather with data type int64 should not be quantized.
2020-03-30 17:50:25 -07:00
Maxim Kalinin
f2ca2b2981
Avoid "infinite" loop in optimizer (#3321)
* Avoid "infinite" loop in optimizer

When symbolic dimensions are present and can be overridden,
FreeDimensionOverrideTransformer always sets modified flag to true. As a
consequence, the optimizer loops until the iteration limit is reached.
2020-03-31 08:37:00 +10:00
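
The looping behaviour described in commit f2ca2b2981 can be reproduced with a small sketch. All names here are hypothetical, the graph is reduced to a dictionary of symbolic dimension values, and reporting "modified" only on an actual change is one plausible remedy, not necessarily the one the commit applies.

```python
from typing import Callable, Dict

Graph = Dict[str, int]            # hypothetical: symbolic dimension name -> value
Transformer = Callable[[Graph], bool]


def make_always_modified(overrides: Graph) -> Transformer:
    """Bug pattern: reports 'modified' on every pass, even when nothing changes."""
    def apply(graph: Graph) -> bool:
        graph.update(overrides)
        return True
    return apply


def make_modified_on_change(overrides: Graph) -> Transformer:
    """Reports 'modified' only when a dimension value actually changed."""
    def apply(graph: Graph) -> bool:
        modified = False
        for name, value in overrides.items():
            if graph.get(name) != value:
                graph[name] = value
                modified = True
        return modified
    return apply


def run_until_stable(graph: Graph, transformer: Transformer, max_steps: int = 10) -> int:
    """Apply the transformer until it reports no change or the limit is hit."""
    steps = 0
    while steps < max_steps:
        steps += 1
        if not transformer(graph):
            break
    return steps


overrides = {"batch": 1}
print(run_until_stable({"batch": 0}, make_always_modified(overrides)))     # 10
print(run_until_stable({"batch": 0}, make_modified_on_change(overrides)))  # 2
```

The first run stops only because of the iteration limit; the second converges after one real change and one confirming pass.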
Changming Sun
06fc9506fd
Thread pool changes (#3153)
1. Copy tensorflow's thread pool class to ORT, so that we can get a better implementation of thread pool based parallelfor
2. Copy Eigen's thread pool class to ORT
3. Support thread affinity
4. Remove RNN kernel’s private thread pool
5. Modify pool kernels to use the thread pool when openmp is disabled.
2020-03-30 12:18:40 -07:00
Yulong Wang
0494036006
fix tensor location mismatch in allocation planner (#3249) 2020-03-30 11:20:43 -07:00
Cassie
2b10e625f9
added public value variable to NamedOnnxValue (#3347)
Co-authored-by: cassieview <cassie.siljander@microsoft.com>
2020-03-30 10:45:39 -07:00
George Wu
355f39ddee
fix cuda build for cmake >= 3.17.0 (#3362) 2020-03-30 00:38:57 -07:00
ytaous
d8f0a0f223
Address PR comments (#3352)
* PR comments

* revert code for a couple comments

* add negative test case

Co-authored-by: Ethan Tao <ettao@microsoft.com>
2020-03-29 12:34:54 -07:00
Weixing Zhang
1bbc421884
Don't cast to fp16 in LayernormGrad (#3328)
Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
2020-03-28 19:07:32 -07:00
Yang Chen
33b5010e62
skip optional inputs for scan subgraphs (#3349)
* skip optional inputs for scan subgraphs

We may have cases where the subgraph has optional inputs that appear
in both subgraph's input and initializer, but not in the node's input.
In such cases, the input model might be invalid, but let's not choke
on it. Instead, let's issue a warning, skip the optional inputs,
and keep going forward.

* address CR feedback
2020-03-28 16:15:45 -07:00
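
The "warn and skip" policy described in commit 33b5010e62 might look roughly like the sketch below. It is a simplified, name-based model with hypothetical function and parameter names; the real Scan handling in ONNX Runtime is not shown in this log and may differ.

```python
import warnings
from typing import List, Sequence, Set


def inputs_to_bind(subgraph_inputs: Sequence[str],
                   subgraph_initializers: Set[str],
                   node_inputs: Set[str]) -> List[str]:
    """Return the subgraph inputs that the Scan node must supply."""
    bound = []
    for name in subgraph_inputs:
        if name not in node_inputs:
            if name in subgraph_initializers:
                # Optional input backed by an initializer: warn and keep going.
                warnings.warn(f"skipping optional subgraph input '{name}'")
                continue
            raise ValueError(f"subgraph input '{name}' has no value")
        bound.append(name)
    return bound


# 'w' is listed as a subgraph input and an initializer, but the node does not feed it.
print(inputs_to_bind(["x", "w"], {"w"}, {"x"}))
```

Only an input that has neither a node-supplied value nor an initializer is treated as a hard error in this sketch.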
Dwayne Robinson
324166b9cd Merged PR 4471419: DML EP kernels in ORT
- Add updated registrations for v11 of existing kernels (ScatterElements, Clip...).
- Add new kernels for GatherElements, GatherND, ScatterND, IsInf, Round, BitShift, CumSum, ReverseSequence, Mod.

Related work items: #23106898, #24655666
2020-03-28 03:00:19 +00:00
Dwayne Robinson
5bc81b7ae9 Fix bad merge (caused Slice to fail). 2020-03-27 19:58:41 -07:00
Dwayne Robinson
5972ce5566 PR feedback 2020-03-27 19:32:47 -07:00
Dwayne Robinson
f1a062c292 PR feedback 2020-03-27 19:22:25 -07:00
Sherlock
ffb2a3359e
Implement WhereGrad (#3343) 2020-03-27 19:10:40 -07:00
Dwayne Robinson
351c3c30fb Merge branch 'DmlDev' into user/dwayner/DmlEpGatherScatterReverseRangeInfModRoundBitshiftCumSumClip 2020-03-27 18:54:30 -07:00
Dwayne Robinson
8ff351ecc4 Merged PR 4482575: DML EP Slice in ORT
Related work items: #24672220
2020-03-28 01:24:44 +00:00
Dwayne Robinson
6c960a9417 PR feedback. 2020-03-27 18:10:46 -07:00
Tiago Koji Castro Shibata
c3cea486d0
Port ConcurrencyTests from TAEF (#3086)
* Add ConcurrencyTests

* Make ConcurrencyTests compatible with TAEF

* Use test PCH in concurrency tests

* Fix include header

* Ignore unused code warnings on WINML_SKIP_TEST

* Remove BOM

* Remove conflicting namespace in older SDK

* Refactor duplicate code

* Fix unused DELAYLOAD

* Fix unused DELAYLOAD

* Remove link to internal bug

* Address code style fixes

* Add new concurrency tests
2020-03-27 17:39:22 -07:00
Tixxx
49e6043d07
support Huggingface's adamw (#3318)
* add weight decay mode to support both pytorch and huggingface's adamw
2020-03-27 08:04:27 -07:00
Dwayne Robinson
031647635b Delete another litter file. 2020-03-27 02:43:11 -07:00
Dwayne Robinson
5feb3c0f19 Delete litter backup files. 2020-03-27 02:42:09 -07:00