Commit graph

7096 commits

Author SHA1 Message Date
Hariharan Seshadri
e2eeffeafb
Cosmetic fix to AttentionFusion (#12329) 2022-07-27 12:43:50 -07:00
Wil Brady
1163294699
Fixing up some python warnings. (#12319) 2022-07-27 07:24:37 -04:00
kiennguyen94
1dd65b9ae3
Use memory mapped data for external data initializers with Cuda fix (#11789)
Description: Reinstate #11127 with Cuda fix.

Motivation and Context

Fixes Inference on Onnx with external data not working since PR 11320 (location planning logic) #11511
2022-07-27 19:13:04 +10:00
Yi Zhang
4df4471d5e
add missing build_java in Android testing stage. (#12187)
add missing build_java in testing
2022-07-27 14:13:08 +08:00
pengwa
2b2367efbf
Fix orttraining-linux-gpu-ci-pipeline (fairscale dependency) (#12320)
authored by: @pengwa
2022-07-26 15:11:04 -07:00
Tianlei Wu
51a799802a
Move initializers from subgraph to the main graph to reduce memory (#12310)
* move initializers from gpt2 subgraph to the beamsearch main graph
2022-07-26 11:30:17 -07:00
Adam Louly
f3dcbf539a
Checkpoint load inference (#12168)
* LoadCheckPoint to tensor cpp functions (draft)

* Load Checkpoint into inference model

* fix python lint

* fix python lint

* Fixing lint and some unused imports

* added assert for zero weights model, resolved other issues

* resolved issues

* Solved issues

* changed variable names for get_models

* parameter names mismatch fix

Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-07-26 11:08:50 -05:00
Wil Brady
de57daaab0
Eager mode: binary ops more complete behavior and testing. (#12293)
* Remove hand written add_.Tensor as it can now be generated.

* Generate .out for tensor version of basic math ops. Add.out testing added too.

* Remove sin tests as they are covered by parameterized tests. Also, moved all parameterized tests to the end in their own section.

* Add binary ops tests for tensors. Scalar tests are calling the aten .out which is for tensor.

* Add support for scalar input to add, div, mul, and sub.
2022-07-26 09:14:57 -04:00
Ryan Hill
3e014a5e5d
Fix C header to stop people accidentally copying the OrtApi by value (#12297)
* Fix C header to stop people accidentally copying the OrtApi by value
* Remove api_ from KernelTwo
2022-07-25 19:19:40 -07:00
Vincent Wang
c40f73ae0c
Remove aten::binary_cross_entropy_with_logits from ATen Fallback (#12301) 2022-07-26 07:29:56 +08:00
Dmitri Smirnov
3bf614fd47
Eliminate memory allocations per recent profiling (#12225)
* Alloc begin

FeedsFetches refactoring
Refactor Tensor class
Fix buffer deletor
Remove new/delete deleted
Adjust alloc move
Fix up xnnpack provider
Clarifying the comment on Create()
2022-07-25 14:14:38 -07:00
dependabot[bot]
972bb9676c
Bump terser from 5.7.0 to 5.14.2 in /js/common (#12248)
Bumps [terser](https://github.com/terser/terser) from 5.7.0 to 5.14.2.
- [Release notes](https://github.com/terser/terser/releases)
- [Changelog](https://github.com/terser/terser/blob/master/CHANGELOG.md)
- [Commits](https://github.com/terser/terser/commits)

---
updated-dependencies:
- dependency-name: terser
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-07-25 14:09:05 -07:00
Baiju Meswani
ddb45e9126
On device training CI pipeline (#11987) 2022-07-25 10:07:17 -07:00
Jameson Miller
8d0e86dec8
Apply project formatting rules to ort_aten.cpp (#12294)
* Apply project formatting rules to ort_aten.cpp

Formatting applied by formatting the file in VS Code.

This file is under active development and the inconsistent formatting
was causing friction due to:

  1. cpplint job on Pipeline was flagging a lot of style issues,
     resulting in a lot of noisy annotations.

  2. local edits would result in changes that are not part of the core change.

While there are other files in this part of the source tree with
inconsistent formatting, this file was causing the most friction. We can
come back and address the other files later, which would be a much
larger change.

* Apply consistent pattern for invoker.Invoke(...)
2022-07-25 07:26:35 -04:00
Vincent Wang
0fa3aeb65c
[CUDA] Add Strided Tensor Support for Expand->GatherElements for Training (#11976)
* strided tensor for expand and gather_elements

* bugfix

* simplify CoalesceDimensions

* resolve comments

* resolve more comments.
2022-07-25 16:05:26 +08:00
pengwa
75bda9f267
CPU AdamW implementation (#11978)
* cpu adamwoptimizer implementation

* unit tests for cpu kernel pass

* refine based on comments

* parallelize the weights loop in PrepareForCompute.

* fix wrong test data path

* fix kernel hash

* fix rocm ci pipeline
2022-07-25 09:43:52 +08:00
Edward Chen
564dc32304
Initialize generated tensor data in onnxruntime_perf_test. (#12275)
Initialize generated tensor data in onnxruntime_perf_test to zeroes instead of leaving it uninitialized. String tensors were already being initialized.
2022-07-22 16:26:13 -07:00
Ye Wang
89ac61f4d4
support gpt2 model with greedy search (#12068)
* greedy search gpt2 cpu checkin

* add cuda support

* add test

* provider

* update

* fix some bugs

* refactor impl class

* refactor test

* remove unused func

* refactor parameters class

* simplify padding

* fix lint warnings

* python format

* Revert "python format"

This reverts commit f25fe1017fa33d960b2418ebbb5dba6a4bd043cf.

* python format

* fix pipelines

* fix pipeline

* move bufferallocater to generate_impl_base

* review comments(alignment, filename/namespace change)

* rebase2

* python reformat

* reformat

* fix rocm build

* review comment

* review comments

* review comments

* fix a bug

* rebase test files

* python format

* format import order

* review comments

* fix build
2022-07-22 15:45:16 -07:00
Edward Chen
cb351388d0
Minor fixes (#12276)
* Log message for each node placement so the output won't get truncated.

* Use correct variable in ReshapeOpBuilder::CanSkipReshape().
2022-07-22 14:33:49 -07:00
RandySheriffH
0264a9c29b
Bump ort version number (#11948)
* bump ort version number

* update link and note url

* update version to silence assert

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2022-07-22 12:55:53 -07:00
Edward Chen
a71f7d3339
[NNAPI EP] Add some support for MatMul with batch inputs (#12261)
MatMul allows multiplying batches of matrices. This change enables limited support of batch inputs in the NNAPI EP.

Some limitations:
- Broadcasting is not supported. A and B must have the same leading dimensions.
- Only float inputs are supported. QDQ MatMul or QLinearMatMul with batch inputs is not supported yet.

Note that NNAPI's ANEURALNETWORKS_BATCH_MATMUL is pretty much what we need, but it is only available from NNAPI feature level 6. This change composes a bunch of NNAPI operations to achieve a similar result but this is not ideal.
2022-07-22 11:21:45 -07:00
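The two limitations listed for #12261 (no broadcasting, matching leading dimensions) can be illustrated with a small shape check. This is a minimal sketch, not the NNAPI EP's actual logic: the function name `batch_matmul_output_shape` is hypothetical, and it assumes "same leading dimensions" also means A and B have the same rank.

```python
def batch_matmul_output_shape(a_shape, b_shape):
    """Return the output shape of a batched MatMul, or None if unsupported.

    Hypothetical helper: mirrors the restrictions described in the commit
    message, under the assumption that A and B must have equal rank.
    """
    # Only the batch case (rank >= 3) with equal ranks is considered here.
    if len(a_shape) < 3 or len(a_shape) != len(b_shape):
        return None
    # No broadcasting: every leading (batch) dimension must match exactly.
    if a_shape[:-2] != b_shape[:-2]:
        return None
    # Standard matrix-multiply rule: inner dimensions must agree.
    if a_shape[-1] != b_shape[-2]:
        return None
    return a_shape[:-2] + (a_shape[-2], b_shape[-1])


supported = batch_matmul_output_shape((2, 3, 4, 5), (2, 3, 5, 6))
needs_broadcast = batch_matmul_output_shape((2, 3, 4, 5), (1, 3, 5, 6))
```

Under these assumptions the first call yields a shape and the second yields `None`, because the leading dimension `1` would have to be broadcast to `2`.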
Juan Paez
4f57da78cf
OrtModule fix pytorch version comparison (#12280)
* fix torch version comparison

* remove patchfile

Co-authored-by: Juan Paez <juanpaez@microsoft.com>
2022-07-22 09:11:28 -07:00
pengwa
feabafe58b
Fix memory consumption discrepancy (#12266)
* release cached cuda memory after temp model_copy run

* op schema change only: remove PythonOp forward output from PythonOpGrad inputs.

* always export model using torch.no_grad

* 1. update PythonOP's "input_requires_grads" attribute according to ORT gradient graph.
2. remove PythonOp's "output_tensor_requires_grads" attribute because in torch.no_grad mode, the exported value is not correct.
3. [related to 2] remove PythonOPGrad's "input_tensor_requires_grads" because it comes from corresponding PythonOP's "output_tensor_requires_grads".

* fix uts

* refine based on wschin's comments && fix pylint

* fix comments

* fix unused variable
2022-07-22 16:55:50 +08:00
Changming Sun
f2533d3084
Remove two lines in the Dockerfile for Github Codespace (#12278)
We are not allowed to use "--extra-index-url" anywhere in our code
2022-07-21 20:52:17 -07:00
Ashwini Khade
ceb76429db
Merge pull request #12056 from microsoft/bmeswani/merge-training_dev/on_device_poc
Merge On-Device-Training Offline Tooling and C/C++ APIs
2022-07-21 15:09:48 -07:00
Yufeng Li
7194ec1894
fix bug: output of Concat is quantized twice in qdq format (#12254) 2022-07-21 14:55:47 -07:00
Yufeng Li
a18b080513
clean up calibration model (#12255) 2022-07-21 14:50:28 -07:00
Wil Brady
45c0be8a25
Modify generator for eager to use all inputs for determining promote type. (#12268)
* Sort supported types order so we get a consistently generated order of types.
* Fix promote type to include all the input types and not just the first one.
2022-07-21 17:21:10 -04:00
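The second bullet of #12268 describes computing the promoted type from all inputs instead of only the first one. The difference can be sketched as below. The `TYPE_RANK` table and the function names are assumptions made for illustration; they are not the eager-mode generator's actual tables or code, and real promotion rules are richer than a single linear ranking.

```python
from functools import reduce

# Hypothetical ranking from "narrowest" to "widest"; illustration only.
TYPE_RANK = {
    "bool": 0,
    "int32": 1,
    "int64": 2,
    "float16": 3,
    "float32": 4,
    "float64": 5,
}


def promote_pair(a, b):
    """Pick the higher-ranked of two type names."""
    return a if TYPE_RANK[a] >= TYPE_RANK[b] else b


def promote_first_only(input_types):
    """The behaviour being fixed: only the first input is consulted."""
    return input_types[0]


def promote_all(input_types):
    """The fixed behaviour: fold the promotion over every input."""
    return reduce(promote_pair, input_types)


inputs = ["int32", "float32", "int64"]
before_fix = promote_first_only(inputs)
after_fix = promote_all(inputs)
```

With these inputs the first-only variant settles on `int32`, while folding over all inputs selects `float32`.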
dependabot[bot]
30ac6e87fa
Bump terser from 5.10.0 to 5.14.2 in /js/web (#12253)
Bumps [terser](https://github.com/terser/terser) from 5.10.0 to 5.14.2.
- [Release notes](https://github.com/terser/terser/releases)
- [Changelog](https://github.com/terser/terser/blob/master/CHANGELOG.md)
- [Commits](https://github.com/terser/terser/commits)

---
updated-dependencies:
- dependency-name: terser
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-07-21 14:04:08 -07:00
Rachel Guo
eb3b49b6a1
[CoreML EP] Remove batch=1 restriction in depthtospace op support (#12258)
* remove batch restriction in depthtospace op support

* update input rank check

Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>
2022-07-21 12:47:26 -07:00
Jameson Miller
108b860dc1
Add dev container / codespace configuration (#12256)
Dev containers[1] provide a self-contained development environment that
can be tailored for a project. GitHub Codespaces[2] provide a cloud
hosted environment to run these containers in. This makes it easy to
provision a consistent development environment with developer tooling
already installed and configured that provide the following benefits:

1. Developer onboarding is simplified.
    1. Easy to get environment setup and running
    2. Reference environment is available, if developer is having issues
       with local environment
2. Developer tooling is provided and automatically configured.
    1. Python / C++ build tooling
    2. Python / C++ code formatters / linters
3. Easy to provision cloud hosted environment via GitHub Codespace.
4. Easy to create ephemeral development environments to test new changes
     1. Can be used to provision environments to test changes
        and Pull Requests

This can ease several pain points that developers on-boarding to the
project can encounter. One of the problems I have seen with developers
new to the project (I am one of these) is having the baseline
development environment (Python / C++) and recommended tools (e.g. VS
Code Python / C++ extensions, linters, and autoformatters) installed and
configured to efficiently get started in the repository. For all
developers, this makes it easy to leverage ephemeral cloud hosted
development environments via GitHub Codespaces.

**Notes:**

  - Compiling the project can run into trouble if the codespace has < 32
    GB of RAM

1) https://docs.github.com/en/codespaces/setting-up-your-project-for-codespaces/introduction-to-dev-containers
2) https://docs.github.com/en/codespaces/overview
2022-07-21 15:29:15 -04:00
Xinya Zhang
03dfcb0e87
[ROCm] Enable int8 for MatMulInteger Op (#11776) 2022-07-21 11:20:48 -07:00
Baiju Meswani
cbf08c7a7b
Make GetTrainingApi as a part of the OrtApis, add Training API documentation and address other pull request review comments 2022-07-21 18:11:48 +00:00
Justin Chu
3d2bcb3386
Use unregister_custom_op_symbolic to unregister torch symbolics (#12146)
Description: Use unregister_custom_op_symbolic to unregister torch symbolics

Motivation and Context

Fixes #11305
2022-07-21 10:47:53 -07:00
Rachel Guo
496618594f
Update supported ops md for NNAPI/CoreML EP (#12245)
* update supported ops md

* address pr comments

* address pr comments

* wording
2022-07-21 10:23:08 -07:00
LironKesem
7dc45bc311
Implementing aten::gt.Scalar_out and aten::lt.Scalar_out (#12181)
* Implementing aten::gt.Scalar_out and aten::lt.Scalar_out

* modified the code according to code review
2022-07-21 10:36:43 -04:00
Yi Zhang
007ef42749
Fix: Test coverage is undercounting and profiling errors (#12260)
add data relocation for onnx_test_runner
2022-07-21 16:19:24 +08:00
Ye Wang
5066ef1185
Fix a bug in beam search custom attention mask allocation (#12240) 2022-07-20 23:42:54 -07:00
Yulong Wang
0c78b71352
prepare test folder from GitHub (#12220)
* consume onnx test data from github

* ensure tests

* update script and allow opset specification

* fix python format

* fix python format

* consume new filter format

* fix linting error
2022-07-20 22:01:08 -07:00
Tianlei Wu
568d08994f
fix test_optimizer.py (#12219)
* fix optimizer test
* update message and skip test instead of uncomment
* fix deprecated warning
2022-07-20 19:21:26 -07:00
101arrowz
c72bb8aaa9
[js/web] add OffscreenCanvas support to WebGL backend (#12159)
* Add OffscreenCanvas support to WebGL backend

* fix format

* fix lint
2022-07-20 14:06:03 -07:00
Rachel Guo
471dbfc250
[NNAPI] Add int32_t as supported input data type and other minor gather op updates (#12171)
* update (including commented out code for gather)

* update tests etc.

* update

* minor updates

* fix typo

* fix build

* minor update

* address pr comment

* refine comments

* address pr comment

* update condition check and UTs

* refine code comments

* address lint warning
2022-07-20 12:07:46 -07:00
Tianlei Wu
5651d91c32
Fix onnx version comparison (#12223)
use version.parse to compare version
2022-07-20 11:14:06 -07:00
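The reason #12223 compares parsed versions rather than raw strings can be shown with a short example. The sketch below uses only the standard library; `parse_version` is a hypothetical stand-in and not the `version.parse` call the commit refers to. It handles only the dotted numeric release part and, unlike a full version parser, treats `1.10` and `1.10.0` as different tuples.

```python
import re


def parse_version(text):
    """Hypothetical helper: turn '1.10.2' into (1, 10, 2).

    Only the leading dotted-number part is kept; suffixes such as
    '+cu113' or 'rc1' are ignored in this simplified sketch.
    """
    match = re.match(r"\d+(\.\d+)*", text)
    if match is None:
        raise ValueError(f"not a version string: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


# String comparison is character by character, so "1.10.0" sorts
# before "1.9.0" because '1' < '9' at the third character.
string_says_older = "1.10.0" < "1.9.0"

# Comparing parsed tuples compares the numeric components instead.
parsed_says_older = parse_version("1.10.0") < parse_version("1.9.0")
```

Here the string comparison wrongly reports 1.10.0 as older than 1.9.0, while the parsed comparison does not.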
Jian Chen
43e1e89453
Update aarch64 building pool to aiinfra-linux-ARM64-CPU-2019 (#12243)
* Setting new pool for arm64

* Setting default pool name

* adding DockerInstaller stage

* try to install docker from apt-get

* change to specific

* adding chmod to docker.sock

* install dotnet sdk

* specific dotnet 3.1.x

* add manual step to install dotnet

* typo bass

* remove inputs

* change dotnet installation dir

* skipComponentGovernanceDetection on arm64 linux

* variables typo

* variables:
    - name: skipComponentGovernanceDetection
      value: true

* update variables

* skipComponentGovernanceDetection set to true

* moving variables

* moving the variables again

* setting condition on cgd

* indentation

* indentation again

* conditional variable

* if

* remove cgd

* conditional on cgd

* condition

* parameters

* clean up
2022-07-20 12:08:02 -04:00
msftlincoln
424120d0fa
cpplint & Eager mode: refactor and add comments to empty_* functions, general lint cleanup in ort_aten (#12238)
* empty* comments and code reuse

* lint

* more cpplint

* add cpplint settings

* test empty
2022-07-20 11:47:57 -04:00
Vincent Wang
72c689a502
[CUDA] Use dim3.z to Handle Large Input For GatherGrad (#12250)
* use dim3.z to handle large input size

* less blocks
2022-07-20 18:42:52 +08:00
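The commit message for #12250 does not spell out the indexing scheme, so the following is only a sketch of the general idea of spilling work into `dim3.z` once one grid dimension is exhausted. It assumes the CUDA limit of 65535 blocks for the y and z grid dimensions; the names `split_rows` and `row_index` are hypothetical and the arithmetic is written in Python purely for illustration.

```python
MAX_GRID_Y = 65535  # CUDA limit for gridDim.y
MAX_GRID_Z = 65535  # CUDA limit for gridDim.z


def split_rows(num_rows):
    """Choose (grid_y, grid_z) with grid_y * grid_z >= num_rows.

    Hypothetical helper: uses as few z-slices as possible, then sizes
    grid_y so that few blocks are left idle.
    """
    if num_rows <= 0:
        raise ValueError("num_rows must be positive")
    grid_z = -(-num_rows // MAX_GRID_Y)  # ceiling division
    if grid_z > MAX_GRID_Z:
        raise ValueError("input too large for a y/z split")
    grid_y = -(-num_rows // grid_z)  # ceiling division
    return grid_y, grid_z


def row_index(block_y, block_z, grid_y):
    """Recover the flat row handled by block (block_y, block_z).

    A kernel using this mapping would still need to skip indices that
    are >= num_rows, since grid_y * grid_z may overshoot.
    """
    return block_z * grid_y + block_y


small = split_rows(100)
large = split_rows(100000)
```

For 100 rows a single z-slice suffices, while 100000 rows exceed the y limit and are split into two z-slices of 50000 blocks each.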
pengwa
ebfd81e67e
Fix BiasGeluGrad bug (#12200)
* use 3D grid to avoid the upper limit of grid dimension

* enrich tests

* Revert "use 3D grid to avoid the upper limit of grid dimension"

This reverts commit 2d5badf2fe8cd985f3f29ee2cb18fff13d07c2ab.

* change to a fix: switch the 1st and 2nd dim
2022-07-20 17:59:29 +08:00
Vincent Wang
3cdc6d7775
[ORTModule] Bugfix of torch.chunk's Custom Symbolic when chunks==1 (#12249)
handle custom chunk with chunks==1
2022-07-20 17:00:41 +08:00
cloudhan
a0074ba9bc
Add baseline gemm for kernel explorer (#12050)
Use rocblasGemmHelper gemm wrapper from ORT and profile for bert param size only.
2022-07-20 13:49:26 +08:00
mindest
add631410a
[ROCm] Re-enable ReduceL1, L2 and related tests (#12209)
Re-enable ReduceL1,L2 and related tests
2022-07-20 13:13:02 +08:00