7128 commits

Author SHA1 Message Date
Chi Lo
de3a91d85d
Revert TRT EP cache refactoring (#12376)
* revert cache refactor

* fix conflicts when reverting
2022-08-01 23:57:05 -07:00
Yi Zhang
5d1173fe68
Run IOS pipeline concurrently (#12400)
split ios pipelines
2022-08-02 11:07:17 +08:00
Yi Zhang
63d64636f6
Add the comment linking to wiki (#12398)
add the comment
2022-08-02 10:09:16 +08:00
LironKesem
315e006532
adding a comment on nll_loss_forward.output that can not be implemented (#12406)
adding a comment on nll_loss_forward.output that can not be implemented
2022-08-01 19:12:35 -04:00
msftlincoln
62922f4c3c
Eager Mode generator: add comments, rename functions (#12385)
* eager generator: add comments, rename functions

* lint
2022-08-01 15:52:47 -04:00
Edward Chen
f77ab4fea6
Manually add optimization flag for Android Release builds. (#12390)
With recent versions of NDK (since 23), the `-O` optimization level compile flag is not being passed when building in the "Release" configuration.
More details here: https://github.com/android/ndk/issues/1740

Our "Release" Android builds have been built without the optimization flag since we upgraded from NDK 21.

This change is a workaround to manually add `-O3` for "Release" Android builds.
2022-08-01 12:49:03 -07:00
George Wu
6bb807ef74
add cuda compute 8.7 to Cmakelists.txt to support Nvidia Orin devices (#12377)
* add cuda arch 8.7 to cmakelists.txt to support Nvidia Orin devices

* add cuda version >= 11 check for orin support
2022-08-01 09:45:58 -07:00
Cheng
3f66297499
code clean (#12392)
* code clean

* misspelling fix
2022-08-01 14:12:35 +08:00
Valery Chernov
1a4868e5c4
[TVM EP] Hot fix of build on Windows of TVM EP with ipp-crypto (#12381)
fix of build on Windows with ipp-crypto. cmake warnings fix

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
2022-07-31 14:36:54 +02:00
Yi Zhang
8b4ad77ea2
pipeline can use last run's artifacts (#12379)
* first step

* depends on stage

* temp change

* specific

* runId

* parameters

* fix typo

* fix typo

* add nnapi

* add nnapi

* fix typo

* minor fix

* condition on stage

* format

* format
2022-07-30 21:34:57 +08:00
pengwa
6d1eb9509e
Refine gradient accumulation (on device training) (#12363)
* a

(cherry picked from commit 43909cdd6e3daf30a82d584292286806d1172a0b)

* optimize inplace accumulator a bit

* fix inputs

* revert logging

* minor fix

* tune perf and resolve comments

* typo

* fix

* fix tests

* move threshold to constexpr.
2022-07-30 10:24:01 +08:00
Changming Sun
7b4ce0c1e1
Delete the build scripts that were copied from manylinux project (#12358)
1. Delete the build scripts that were copied from the manylinux project. Use "git checkout" instead.
2. Update the manylinux version to get Python 3.11. Related issue: Python 3.11 support #12343
3. Change the CUDA version of the Linux GPU build job in the NuGet packaging pipeline from CUDA 11.4 to CUDA 11.6 to match the TRT job within the same pipeline. (A lot of other places need to be updated as well, but I'd prefer to put them in another PR.)
4. Make Dockerfile names static. For example, replace tools/ci_build/github/linux/docker/$(DockerFile) with tools/ci_build/github/linux/docker/Dockerfile.manylinux2014_cpu. The former relies on a runtime variable, $(DockerFile), but template parameters are expanded early in processing a pipeline run, when most variables are not available. It is like C++ macros vs. variables.
2022-07-29 18:24:19 -07:00
Hariharan Seshadri
d5a1c01b38
Add C++ Session ctor taking model bytes and OrtPrepackedWeightsContainer (#12333)
2022-07-29 12:32:43 -07:00
Nat Kershaw (MSFT)
df8dd41a8e
Automatically run workflows to generate API docs PRs (#11749)
2022-07-29 10:24:59 -07:00
msftlincoln
9559d25da9
ORT Eager Mode Generator - make smaller functions (#12371)
These changes resulted in no change to the generated outputs ort_aten.g.cpp and ort_customops.g.cpp.
2022-07-29 10:12:34 -04:00
Changming Sun
f1a04078e9
Update CODEOWNERS to add a line for yaml files (#12378)
2022-07-29 07:05:41 -07:00
Yateng Hong
c579497134
Fix TRT custom op issue (#12283)
* Pass schema registry on CreateModel.

* Fix ORT_MINIMAL_BUILD.

* Fix build issue.
2022-07-29 03:39:56 -07:00
pengwa
6514069749
Make memory profiler work with multiple session runs. (#12317)
* make memory profiler work with multiple session runs.

(cherry picked from commit 5b636b4dd6fe91b75c063696dc73eda33ec36c8d)

* minor fix

* fix build

* fix Windows build

* 1. fix cpplint issues;
2. give a unique filename to each session's profiler result.
2022-07-29 18:36:31 +08:00
ytaous
e4bd41fb3b
[ROCm] Enable Einsum for inferencing perf (#12360)
* enable einsum

* address comments

* comments

Co-authored-by: Ethan Tao <ettao@microsoft.com>
2022-07-28 20:26:25 -07:00
Justin Stoecker
9c0fa65110
Scope CreateFileMapping2 to valid API partitions (#12374)
2022-07-28 16:47:36 -07:00
Scott McKay
0e85af6990
Add MAUI csharp\sample\InferenceSample\ project (#12356)
Add csharp\sample\InferenceSample\Microsoft.ML.OnnxRuntime.InferenceSample.Maui so we have an equivalent setup for MAUI as for the other platforms.

This provides a setup to do some basic local testing of using an InferenceSession in a MAUI app.
2022-07-29 07:22:36 +10:00
sumitsays
805aa297fc
Remove preview keyword from DirectML package (#12368)
Remove preview keyword

Co-authored-by: Sumit Agarwal <sumitagarwal@microsoft.com>
2022-07-28 14:18:58 -07:00
Jian Chen
7a7e372b9f
Remove training cuda 10.2 pipeline (#12347)
* update to 2022

* Update the VS version

* Rolling back to gcc 10

* Rolling back

* Update cuda home

* remove "CMAKE_CUDA_ARCHITECTURES=52"

* update CUDA architecture to 70

* Delete cuda 10.2 training pipeline

* rolling back a mistake

* Update win-gpu-reduce-op-ci-pipeline.yml

* Update win-gpu-reduce-op-ci-pipeline.yml

* Update win-gpu-reduce-op-ci-pipeline.yml

* Delete tools/ci_build/github/linux/docker/scripts/training/ortmodule/stage1/requirements_torch1.10.0_cu10.2 directory

* Delete tools/ci_build/github/linux/docker/scripts/training/ortmodule/stage1/requirements_torch1.11.0_cu10.2 directory
2022-07-28 14:58:17 -04:00
Edward Chen
6e892a95b4
Use specific Android NDK version in CI builds. (#12350)
Current builds use whichever NDK version happens to be on the build machine. The build machine environment may change in ways that are outside of our control.
This change installs a specific version of the NDK (the current LTS version, 25.0.8775105) and uses it.
2022-07-28 11:01:04 -07:00
Chen Fu
73919a6756
Convert DQ node with const weight tensor int8 to uint8 (#12331)
Convert DQ node with const weight tensor int8 to uint8

This is a follow-up to #12088, which converted weight tensors from int8 to uint8. Here we do the same thing in the DequantizeLinear node, so that we don't have to make the same change for every future operator.
2022-07-28 09:22:04 -07:00
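The int8-to-uint8 conversion described in #12331 above can be illustrated with a minimal sketch. It assumes the common offset-by-128 mapping applied to both the quantized weights and the zero point; the commit message does not spell out the mechanism, and the function names here are hypothetical, not ONNX Runtime APIs. Because DequantizeLinear computes (q - zero_point) * scale, shifting q and zero_point by the same amount leaves the result unchanged.

```python
def dequantize(q, scale, zero_point):
    """Reference DequantizeLinear: (q - zero_point) * scale, per element."""
    return [(v - zero_point) * scale for v in q]


def int8_to_uint8(q, zero_point):
    """Hypothetical helper: shift int8 values and the zero point by 128.

    Assumption: this offset mapping stands in for whatever the real
    transformer does; it is only meant to show why the result is preserved.
    """
    return [v + 128 for v in q], zero_point + 128


# Illustrative constant weight tensor and quantization parameters.
weights_i8 = [-128, -3, 0, 5, 127]
scale, zp_i8 = 0.05, -2

weights_u8, zp_u8 = int8_to_uint8(weights_i8, zp_i8)

# Every shifted value fits in the uint8 range, and dequantization agrees,
# because (v + 128) - (zp + 128) == v - zp for every element.
same = dequantize(weights_i8, scale, zp_i8) == dequantize(weights_u8, scale, zp_u8)
print(weights_u8, zp_u8, same)
```

Doing the shift once on the DequantizeLinear input means downstream operators see the same real-valued weights, which matches the motivation given in the commit message.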
Valery Chernov
e2423bb55c
[TVM EP] Build on Windows with ipp-crypto support (#12336)
* update TVM EP docs for ipp-crypto build conditions

* add ipp-crypto by ExternalProject_Add

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
2022-07-28 15:40:19 +02:00
Vincent Wang
7a298c916a
Add Pow-15 for LayerNorm Fusion and FastGelu Fusion (#12314)
* op version check during fusion

* Revert "op version check during fusion"

This reverts commit cacc8f50ea36b08b73a98a81ca0ae3f84782435f.

* add pow-15 for fusion
2022-07-28 10:23:35 +08:00
Yulong Wang
186ba6e9f2
[js/rn] upgrade package react-native@^0.69.1 (#12155)
* [js/rn] upgrade package react-native@^0.69.1

* upgrade compile sdk to v31

* update ios version requirement

* update pod path for onnxruntime-react-native
2022-07-27 15:15:45 -07:00
Changming Sun
e6bb447101
Change native folder name for java macos arm64 (#12335)
2022-07-27 15:13:07 -07:00
101arrowz
148b1efe5e
[js/web] add ConvTranspose2D to WebGL backend (#11990)
* Add ConvTranspose

* Update docs + tests

* fix lint

* fix output shape calculations

* Revert "fix output shape calculations"

This reverts commit 8014fa9b33115f1d6a677fe2270a6da1b510ff67.

* fix format

* remove broken output_shape test
2022-07-27 13:57:12 -07:00
Justin Chu
d2b25a7c1c
Reduce CI noise from Python lint (#12270)
Description: Reduce CI noise from Python lint

Motivation and Context

Disable "missing-docstring" in pylint; it is usually noisy in tests.
For pyright, show only newly added lint messages.
2022-07-27 13:42:29 -07:00
msftlincoln
9cf6912bba
Fix ORT Eager Mode to work with Pytorch 1.12 (#12323)
2022-07-27 16:24:46 -04:00
Hariharan Seshadri
e2eeffeafb
Cosmetic fix to AttentionFusion (#12329)
2022-07-27 12:43:50 -07:00
Wil Brady
1163294699
Fixing up some python warnings. (#12319)
2022-07-27 07:24:37 -04:00
kiennguyen94
1dd65b9ae3
Use memory mapped data for external data initializers with Cuda fix (#11789)
Description: Reinstate #11127 with Cuda fix.

Motivation and Context

Fixes Inference on Onnx with external data not working since PR 11320 (location planning logic) #11511
2022-07-27 19:13:04 +10:00
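As a rough illustration of the idea behind #11789 above (reading external initializer data through a memory mapping rather than loading the whole file), here is a minimal Python sketch. The file name, layout, and helper functions are assumptions made for the example; they are not ONNX Runtime's external data format or API.

```python
import mmap
import os
import struct
import tempfile

FLOAT_SIZE = 4  # bytes per little-endian float32


def write_external_data(path, values):
    """Write values as raw float32 bytes, standing in for an external data file."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(values)}f", *values))


def read_floats_mapped(path, offset, count):
    """Read `count` float32 values starting at byte `offset` via a read-only mapping.

    Only the pages that are actually touched need to be brought into memory;
    the rest of the file is never read explicitly.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return list(struct.unpack_from(f"<{count}f", mapped, offset))


with tempfile.TemporaryDirectory() as tmp:
    data_path = os.path.join(tmp, "weights.bin")  # hypothetical file name
    write_external_data(data_path, [0.5, 1.5, -2.0, 4.25])
    # Fetch the second and third values without reading the first.
    second_and_third = read_floats_mapped(data_path, 1 * FLOAT_SIZE, 2)
    print(second_and_third)
```

The offset-plus-length access pattern is what makes a mapping attractive for large models: each initializer can be located inside one big file without copying the file into a separate buffer first.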
Yi Zhang
4df4471d5e
add missing build_java in Android testing stage. (#12187)
add missing build_java in testing
2022-07-27 14:13:08 +08:00
pengwa
2b2367efbf
Fix orttraining-linux-gpu-ci-pipeline (fairscale dependency) (#12320)
authored by: @pengwa
2022-07-26 15:11:04 -07:00
Tianlei Wu
51a799802a
Move initializers from subgraph to the main graph to reduce memory (#12310)
* move initializers from gpt2 subgraph to the beamsearch main graph
2022-07-26 11:30:17 -07:00
Adam Louly
f3dcbf539a
Checkpoint load inference (#12168)
* LoadCheckPoint to tensor cpp functions (draft)

* Load Checkpoint into inference model

* fix python lint

* fix python lint

* Fixing lint and some unused imports

* added assert for zero weights model, resolved other issues

* resolved issues

* Solved issues

* changed variable names for get_models

* fix mismatched parameter names

Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-07-26 11:08:50 -05:00
Wil Brady
de57daaab0
Eager mode: binary ops more complete behavior and testing. (#12293)
* Remove hand written add_.Tensor as it can now be generated.

* Generate .out for tensor version of basic math ops. Add.out testing added too.

* Remove sin tests as they are covered by parameterized tests. Also, moved all parameterized tests to the end in their own section.

* Add binary ops tests for tensors. Scalar tests are calling the aten .out which is for tensor.

* Add support for scalar input to add, div, mul, and sub.
2022-07-26 09:14:57 -04:00
Ryan Hill
3e014a5e5d
Fix C header to stop people accidentally copying the OrtApi by value (#12297)
* Fix C header to stop people accidentally copying the OrtApi by value
* Remove api_ from KernelTwo
2022-07-25 19:19:40 -07:00
Vincent Wang
c40f73ae0c
Remove aten::binary_cross_entropy_with_logits from ATen Fallback (#12301)
2022-07-26 07:29:56 +08:00
Dmitri Smirnov
3bf614fd47
Eliminate memory allocations per recent profiling (#12225)
* Alloc begin

FeedsFetches refactoring
Refactor Tensor class
Fix buffer deletor
Remove new/delete deleted
Adjust alloc move
Fix up xnnpack provider
Clarifying the comment on Create()
2022-07-25 14:14:38 -07:00
dependabot[bot]
972bb9676c
Bump terser from 5.7.0 to 5.14.2 in /js/common (#12248)
Bumps [terser](https://github.com/terser/terser) from 5.7.0 to 5.14.2.
- [Release notes](https://github.com/terser/terser/releases)
- [Changelog](https://github.com/terser/terser/blob/master/CHANGELOG.md)
- [Commits](https://github.com/terser/terser/commits)

---
updated-dependencies:
- dependency-name: terser
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-07-25 14:09:05 -07:00
Baiju Meswani
ddb45e9126
On device training CI pipeline (#11987)
2022-07-25 10:07:17 -07:00
Jameson Miller
8d0e86dec8
Apply project formatting rules to ort_aten.cpp (#12294)
* Apply project formatting rules to ort_aten.cpp

Formatting applied by formatting the file in VS Code.

This file is under active development and the inconsistent formatting
was causing friction due to:

  1. cpplint job on Pipeline was flagging a lot of style issues,
     resulting in a lot of noisy annotations.

  2. local edits would result in changes that are not part of the core change.

While there are other files in this part of the source tree with
inconsistent formatting, this file was causing the most friction. We can
come back and address the other files later, which would be a much
larger change.

* Apply consistent pattern for invoker.Invoke(...)
2022-07-25 07:26:35 -04:00
Vincent Wang
0fa3aeb65c
[CUDA] Add Strided Tensor Support for Expand->GatherElements for Training (#11976)
* strided tensor for expand and gather_elements

* bugfix

* simplify CoalesceDimensions

* resolve comments

* resolve more comments.
2022-07-25 16:05:26 +08:00
pengwa
75bda9f267
CPU AdamW implementation (#11978)
* cpu adamwoptimizer implementation

* unit tests for cpu kernel pass

* refine based on comments

* parallelize the weights loop in PrepareForCompute.

* fix wrong test data path

* fix kernel hash

* fix rocm ci pipeline
2022-07-25 09:43:52 +08:00
Edward Chen
564dc32304
Initialize generated tensor data in onnxruntime_perf_test. (#12275)
Initialize generated tensor data in onnxruntime_perf_test to zeroes instead of leaving it uninitialized. String tensors were already being initialized.
2022-07-22 16:26:13 -07:00
Ye Wang
89ac61f4d4
support gpt2 model with greedy search (#12068)
* greedy search gpt2 cpu checkin

* add cuda support

* add test

* provider

* update

* fix some bugs

* refactor impl class

* refactor test

* remove unused func

* refactor parameters class

* simplify padding

* fix lint warnings

* python format

* Revert "python format"

This reverts commit f25fe1017fa33d960b2418ebbb5dba6a4bd043cf.

* python format

* fix pipelines

* fix pipeline

* move bufferallocater to generate_impl_base

* review comments(alignment, filename/namespace change)

* rebase2

* python reformat

* reformat

* fix rocm build

* review comment

* review comments

* review comments

* fix a bug

* rebase test files

* python format

* format import order

* review comments

* fix build
2022-07-22 15:45:16 -07:00